File Operations & CSV Handling – Automating File Processing with Python

In programming, working with files is essential. Whether it’s reading log files, processing CSV data, or automating repetitive tasks, Python makes file handling easy and efficient.

In this guide, you’ll learn:

✔ How to read and write files in Python
✔ How to handle CSV files for data processing
✔ Automating file processing tasks
✔ Real-world examples and best practices

Part 1: File Operations in Python

Python's built-in open() function lets you read, write, and append to files, and the os module adds operations such as deleting and renaming them.

1️⃣ Opening and Reading Files

# Open a file in read mode
file = open("sample.txt", "r")

# Read the entire file
content = file.read()
print(content)

# Close the file
file.close()

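Tip: Always close files so that buffered writes are flushed and the operating system's file handle is released. Alternatively, use a with statement, which closes the file automatically even if an error occurs inside the block: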

with open("sample.txt", "r") as file:
    content = file.read()
    print(content)

2️⃣ Writing to Files

with open("output.txt", "w") as file:
    file.write("Hello PyCraftStudio!\n")
    file.write("Automating file handling with Python.")

  • "w" → Write mode (overwrites existing content)

  • "a" → Append mode (adds new content at the end)
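The difference between the two modes is easiest to see side by side. The following is a minimal sketch that writes to a temporary file; the file name notes.txt and the variable names are made up for illustration.

```python
import os
import tempfile

# Work in a throwaway directory so no real files are touched.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "notes.txt")  # hypothetical file name

# "w" creates the file, or empties it first if it already exists.
with open(path, "w", encoding="utf-8") as file:
    file.write("first line\n")

# "a" keeps the existing content and writes at the end.
with open(path, "a", encoding="utf-8") as file:
    file.write("second line\n")

with open(path, "r", encoding="utf-8") as file:
    after_append = file.read()

# Opening with "w" again discards both earlier lines.
with open(path, "w", encoding="utf-8") as file:
    file.write("replaced\n")

with open(path, "r", encoding="utf-8") as file:
    after_overwrite = file.read()

print(repr(after_append))
print(repr(after_overwrite))
```

If you are unsure whether a file already holds data you need, "a" is the safer default.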

3️⃣ Reading Line by Line

with open("sample.txt", "r") as file:
    for line in file:
        print(line.strip())  # Strip surrounding whitespace, including the trailing newline
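Reading line by line keeps only one line in memory at a time, which is useful for large log files. Here is a rough sketch that counts log levels; the log format, the file name app.log, and the function count_log_levels are assumptions made for this example, not a standard.

```python
import os
import tempfile
from collections import Counter

def count_log_levels(path, levels=("INFO", "WARNING", "ERROR")):
    """Count lines per level, reading one line at a time."""
    counts = Counter()
    with open(path, "r", encoding="utf-8") as file:
        for line in file:
            for level in levels:
                # Assumes the level appears as a separate word in the line.
                if f" {level} " in line:
                    counts[level] += 1
                    break
    return counts

# Build a small sample log in a temporary directory (made-up entries).
log_path = os.path.join(tempfile.mkdtemp(), "app.log")
with open(log_path, "w", encoding="utf-8") as file:
    file.write("2024-01-01 10:00:00 INFO Service started\n")
    file.write("2024-01-01 10:00:05 ERROR Disk full\n")
    file.write("2024-01-01 10:00:09 INFO Retrying\n")

level_counts = count_log_levels(log_path)
print(dict(level_counts))
```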

Part 2: Handling CSV Files

CSV (Comma-Separated Values) is a common plain-text format for storing tabular data, with one record per line.

Python makes CSV processing simple using the csv module or Pandas.

1️⃣ Using csv Module

Reading a CSV File

import csv

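# newline="" lets the csv module handle line endings itself
with open("data.csv", "r", newline="") as csvfile:
    reader = csv.reader(csvfile)
    header = next(reader)  # Read the header row so the loop sees only data
    for row in reader:
        print(row)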

Writing to a CSV File

import csv

data = [
    ["Name", "Age", "City"],
    ["Alice", 25, "London"],
    ["Bob", 30, "New York"],
]

with open("output.csv", "w", newline="") as csvfile:
    writer = csv.writer(csvfile)
    writer.writerows(data)
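The csv module also includes DictWriter and DictReader, which address columns by header name instead of position. Below is a small sketch that reuses the sample rows from above; the file name people.csv is made up, and the file is written to a temporary directory.

```python
import csv
import os
import tempfile

rows = [
    {"Name": "Alice", "Age": 25, "City": "London"},
    {"Name": "Bob", "Age": 30, "City": "New York"},
]

path = os.path.join(tempfile.mkdtemp(), "people.csv")  # hypothetical file name

# DictWriter needs the column order up front via fieldnames.
with open(path, "w", newline="", encoding="utf-8") as csvfile:
    writer = csv.DictWriter(csvfile, fieldnames=["Name", "Age", "City"])
    writer.writeheader()
    writer.writerows(rows)

# DictReader uses the first row as keys for every following row.
with open(path, "r", newline="", encoding="utf-8") as csvfile:
    reader = csv.DictReader(csvfile)
    people = list(reader)

# Every value comes back as a string, so convert numbers explicitly.
ages = [int(person["Age"]) for person in people]
print(people[0]["Name"], ages)
```

Accessing person["Age"] keeps working even if the column order in the file changes, which positional indexing does not.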

2️⃣ Using Pandas for CSV Handling

Pandas provides a high-level interface for CSV operations and is ideal for data analysis.

import pandas as pd

# Read CSV
df = pd.read_csv("data.csv")
print(df.head())

# Write CSV
df.to_csv("processed_data.csv", index=False)

Part 3: Automating File Processing

Automation saves time when handling multiple files or repetitive tasks.

1️⃣ Processing Multiple Files in a Folder

import os

folder_path = "data_files"
for filename in os.listdir(folder_path):
    if filename.endswith(".txt"):
        with open(os.path.join(folder_path, filename), "r") as file:
            content = file.read()
        print(f"{filename} content:\n{content}\n")

2️⃣ Combining Multiple CSV Files

import pandas as pd
import os

folder_path = "csv_folder"
all_files = [f for f in os.listdir(folder_path) if f.endswith(".csv")]

combined_df = pd.concat([pd.read_csv(os.path.join(folder_path, f)) for f in all_files])
combined_df.to_csv("combined.csv", index=False)
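If Pandas is not installed, a similar merge can be sketched with the csv module alone. This is a different technique from pd.concat: it assumes every file has the same header row in the same column order, and the function name combine_csv_files and the sample files are made up for this example.

```python
import csv
import os
import tempfile

def combine_csv_files(folder_path, output_path):
    """Concatenate all .csv files in folder_path; return the data row count."""
    # The file list is built before the output file is created.
    names = sorted(f for f in os.listdir(folder_path) if f.endswith(".csv"))
    header_written = False
    row_count = 0
    with open(output_path, "w", newline="", encoding="utf-8") as out_file:
        writer = csv.writer(out_file)
        for name in names:
            in_path = os.path.join(folder_path, name)
            with open(in_path, "r", newline="", encoding="utf-8") as in_file:
                reader = csv.reader(in_file)
                header = next(reader, None)
                if header is None:
                    continue  # Skip completely empty files
                if not header_written:
                    writer.writerow(header)  # Keep only the first header
                    header_written = True
                for row in reader:
                    writer.writerow(row)
                    row_count += 1
    return row_count

# Demo with two small files in a temporary folder (made-up data).
source_dir = tempfile.mkdtemp()
for name, line in [("a.csv", "Alice,25"), ("b.csv", "Bob,30")]:
    with open(os.path.join(source_dir, name), "w", newline="", encoding="utf-8") as f:
        f.write("Name,Age\n" + line + "\n")

combined_path = os.path.join(tempfile.mkdtemp(), "combined.csv")
total = combine_csv_files(source_dir, combined_path)

with open(combined_path, "r", newline="", encoding="utf-8") as f:
    combined_rows = list(csv.reader(f))
print(total, combined_rows)
```

Writing the result to a different folder than the inputs avoids picking up an old combined file on the next run.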

3️⃣ Renaming Multiple Files

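import os

folder_path = "documents"
# Rename only .txt files (every new name ends in .txt) and sort for a stable order
txt_files = sorted(f for f in os.listdir(folder_path) if f.endswith(".txt"))
for i, filename in enumerate(txt_files, start=1):
    new_name = f"file_{i}.txt"
    # Caution: on POSIX systems os.rename silently replaces an existing target file
    os.rename(os.path.join(folder_path, filename), os.path.join(folder_path, new_name))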
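Renaming is hard to undo, so it can help to build the list of renames first, check it, and only then apply it. The sketch below shows one way to do that; plan_renames and apply_renames are hypothetical helper names, and the demo runs in a temporary folder with made-up files.

```python
import os
import tempfile

def plan_renames(folder_path, prefix="file_", extension=".txt"):
    """Return (old_name, new_name) pairs without touching the disk."""
    names = sorted(
        name for name in os.listdir(folder_path)
        if name.endswith(extension)
        and os.path.isfile(os.path.join(folder_path, name))
    )
    return [
        (name, f"{prefix}{i}{extension}")
        for i, name in enumerate(names, start=1)
    ]

def apply_renames(folder_path, plan):
    """Rename files, refusing to overwrite anything that already exists."""
    for old_name, new_name in plan:
        if old_name == new_name:
            continue  # Already has the right name
        target = os.path.join(folder_path, new_name)
        if os.path.exists(target):
            raise FileExistsError(f"{target} already exists")
        os.rename(os.path.join(folder_path, old_name), target)

# Demo: two text files and one image that must be left alone.
demo_dir = tempfile.mkdtemp()
for name in ["notes.txt", "draft.txt", "image.png"]:
    with open(os.path.join(demo_dir, name), "w", encoding="utf-8") as f:
        f.write("demo")

plan = plan_renames(demo_dir)
print(plan)  # Review the plan before applying it
apply_renames(demo_dir, plan)
result = sorted(os.listdir(demo_dir))
print(result)
```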


Tips for Efficient File Handling

  • Always use with open() to avoid manual closing

  • Check if files exist using os.path.exists()

  • Use try-except blocks to handle errors

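  • For large CSVs, Pandas is usually faster than row-by-row Python loops; if a file does not fit in memory, read it in pieces with the chunksize parameter of pd.read_csv()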

  • Automate repetitive tasks with loops
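Several of these tips can be combined in one small helper. This is a minimal sketch, assuming UTF-8 text files; read_text_or_none is a made-up function name. The os.path.exists() check covers the common case, and the try-except block covers files that vanish or cannot be read after the check.

```python
import os
import tempfile

def read_text_or_none(path):
    """Return the file's text, or None if it is missing or unreadable."""
    if not os.path.exists(path):
        return None
    try:
        with open(path, "r", encoding="utf-8") as file:
            return file.read()
    except (OSError, UnicodeDecodeError) as error:
        # OSError covers permission problems, directories, and deleted files.
        print(f"Could not read {path}: {error}")
        return None

# Demo with one real temporary file and one path that does not exist.
tmp_path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(tmp_path, "w", encoding="utf-8") as file:
    file.write("hello")

existing = read_text_or_none(tmp_path)
missing = read_text_or_none(tmp_path + ".missing")
print(existing, missing)
```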


Real-World Use Cases

  • Batch renaming files

  • Merging multiple CSV datasets

  • Log file processing

  • Data cleaning and preprocessing for ML

  • Report generation and automation

Python makes file operations and CSV handling easy and automatable.

Mastering these skills allows you to:

  • Process and clean data efficiently

  • Automate repetitive file tasks

  • Combine multiple datasets for analysis

  • Build real-world automation scripts

Start small, automate simple tasks, and gradually handle larger datasets for real productivity gains.


