Python File Handling: CSV and JSON
This document covers reading and writing CSV and JSON files in Python.
1. CSV (Comma Separated Values)
CSV is a simple file format used to store tabular data (numbers and text) in a plain text format. Each line of the file is a data record, and each field within a record is separated by a comma (or another delimiter).
1.1 Reading CSV Files
The csv module provides functionality to read and write CSV files.
import csv
# Reading a CSV file
with open('data.csv', 'r') as file:
    reader = csv.reader(file)  # Creates a reader object
    # Iterate through each row in the CSV file
    for row in reader:
        print(row)  # Each row is a list of strings
Explanation:
- import csv: Imports the csv module.
- with open('data.csv', 'r') as file:: Opens the CSV file in read mode ('r'). The with statement ensures the file is automatically closed even if errors occur.
- reader = csv.reader(file): Creates a reader object. This object iterates over lines in the file, splitting each line into a list of strings based on the delimiter (comma by default).
- for row in reader:: Loops through each row in the CSV file.
- print(row): Prints each row, which is a list of strings representing the values in that row.
Example data.csv:
Name,Age,City
Alice,30,New York
Bob,25,London
Charlie,35,Paris
Output:
['Name', 'Age', 'City']
['Alice', '30', 'New York']
['Bob', '25', 'London']
['Charlie', '35', 'Paris']
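Because every field comes back as a string, numeric columns need an explicit conversion before they can be used as numbers. The following is a minimal sketch, assuming the sample data is held in an in-memory string (csv_text is a hypothetical stand-in for data.csv) so that it runs without a file on disk:

```python
import csv
import io

# Hypothetical stand-in for data.csv so the example runs without a file
csv_text = "Name,Age,City\nAlice,30,New York\nBob,25,London\nCharlie,35,Paris\n"

with io.StringIO(csv_text) as file:
    reader = csv.reader(file)
    header = next(reader)  # Consume the header row before the data rows
    # Convert the Age column (index 1) from str to int
    ages = [int(row[1]) for row in reader]

print(header)  # ['Name', 'Age', 'City']
print(ages)    # [30, 25, 35]
```

Calling next(reader) once is a simple way to skip the header; the remaining iterations then see only data rows.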
1.2 Writing CSV Files
import csv
# Data to write
data = [
    ['Name', 'Age', 'City'],
    ['Alice', '30', 'New York'],
    ['Bob', '25', 'London'],
    ['Charlie', '35', 'Paris']
]
# Writing to a CSV file
with open('output.csv', 'w', newline='') as file:
    writer = csv.writer(file)  # Creates a writer object
    # Write multiple rows at once
    writer.writerows(data)
Explanation:
- with open('output.csv', 'w', newline='') as file:: Opens the CSV file in write mode ('w'). newline='' is important to prevent extra blank rows when writing on some platforms (especially Windows).
- writer = csv.writer(file): Creates a writer object.
- writer.writerows(data): Writes all rows from the data list to the CSV file. You can also use writer.writerow(row) to write one row at a time.
Output output.csv:
Name,Age,City
Alice,30,New York
Bob,25,London
Charlie,35,Paris
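Writing one row at a time with writer.writerow() can be sketched as follows. The example assumes an in-memory io.StringIO buffer in place of a real file (an illustrative choice, not part of the original example) so the exact characters written can be inspected:

```python
import csv
import io

buffer = io.StringIO()  # In-memory stand-in for a file opened with newline=''
writer = csv.writer(buffer)

writer.writerow(['Name', 'Age', 'City'])     # Header row
writer.writerow(['Alice', 30, 'New York'])   # Non-string values are converted with str()

# The writer ends each row with '\r\n' by default
print(repr(buffer.getvalue()))
```

The repr output shows that the writer emits its own '\r\n' line endings. This is why the file should be opened with newline='': otherwise the platform may translate the line ending a second time.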
1.3 Using a Different Delimiter
You can specify a different delimiter using the delimiter parameter in csv.reader() or csv.writer().
import csv
# Reading a CSV file with a semicolon delimiter
with open('data.csv', 'r') as file:
    reader = csv.reader(file, delimiter=';')
    for row in reader:
        print(row)
# Writing a CSV file with a semicolon delimiter
data = [
    ['Name', 'Age', 'City'],
    ['Alice', '30', 'New York'],
    ['Bob', '25', 'London'],
    ['Charlie', '35', 'Paris']
]
with open('output.csv', 'w', newline='') as file:
    writer = csv.writer(file, delimiter=';')
    writer.writerows(data)
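A field that itself contains the delimiter is still handled correctly, because the writer quotes it and the reader removes the quotes again. Here is a small round-trip sketch, assuming in-memory buffers instead of files and an invented sample row:

```python
import csv
import io

rows = [
    ['Name', 'City'],
    ['Smith; John', 'Paris'],  # Invented sample: the first field contains the delimiter
]

# Write with a semicolon delimiter into an in-memory buffer
buffer = io.StringIO()
csv.writer(buffer, delimiter=';').writerows(rows)
text = buffer.getvalue()
print(text)

# Read the same text back using the same delimiter
read_back = list(csv.reader(io.StringIO(text, newline=''), delimiter=';'))
print(read_back)
```

The reader and writer must use the same delimiter. Reading semicolon-separated text with the default comma delimiter would return each line as a single field.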
2. JSON (JavaScript Object Notation)
JSON is a lightweight data-interchange format that is easy for humans to read and write, and easy for machines to parse and generate. It is based on a subset of JavaScript syntax, but is language-independent.
2.1 Reading JSON Files
The json module provides functionality to work with JSON data.
import json
# Reading a JSON file
with open('data.json', 'r') as file:
    data = json.load(file)  # Loads the JSON data from the file
print(data)
Explanation:
- import json: Imports the json module.
- with open('data.json', 'r') as file:: Opens the JSON file in read mode ('r').
- data = json.load(file): Loads the JSON data from the file and parses it into a Python dictionary or list (depending on the structure of the JSON data).
- print(data): Prints the loaded data.
Example data.json:
{
    "name": "Alice",
    "age": 30,
    "city": "New York",
    "skills": ["Python", "Data Analysis", "Machine Learning"]
}
Output:
{'name': 'Alice', 'age': 30, 'city': 'New York', 'skills': ['Python', 'Data Analysis', 'Machine Learning']}
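Once loaded, the result is ordinary Python data that can be indexed like any dictionary or list. The sketch below assumes the JSON is available as a string (json_text is a hypothetical stand-in for the contents of data.json) and uses json.loads(), the string counterpart of json.load():

```python
import json

# Hypothetical stand-in for the contents of data.json
json_text = (
    '{"name": "Alice", "age": 30, "city": "New York", '
    '"skills": ["Python", "Data Analysis", "Machine Learning"]}'
)

data = json.loads(json_text)  # json.loads parses a string; json.load reads a file object

print(type(data).__name__)  # dict
print(data['age'] + 1)      # 31 (JSON numbers become int or float)
print(data['skills'][0])    # Python (JSON arrays become lists)
```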
2.2 Writing JSON Files
import json
# Data to write
data = {
    "name": "Alice",
    "age": 30,
    "city": "New York",
    "skills": ["Python", "Data Analysis", "Machine Learning"]
}
# Writing to a JSON file
with open('output.json', 'w') as file:
    json.dump(data, file, indent=4)  # Dumps the data to the file
Explanation:
- with open('output.json', 'w') as file:: Opens the JSON file in write mode ('w').
- json.dump(data, file, indent=4): Serializes the Python data object into a JSON formatted string and writes it to the file.
  - data: The Python object to serialize.
  - file: The file object to write to.
  - indent=4: Optional parameter that adds indentation to the JSON output, making it more readable. A value of 4 adds 4 spaces for each level of indentation.
Output output.json:
{
    "name": "Alice",
    "age": 30,
    "city": "New York",
    "skills": [
        "Python",
        "Data Analysis",
        "Machine Learning"
    ]
}
2.3 Pretty Printing JSON
The indent parameter of json.dump() controls pretty printing. Without it, the entire JSON document is written on a single line, which is compact but hard to read.
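The difference is easiest to see side by side. This sketch uses json.dumps(), which returns a string and accepts the same indent parameter as json.dump(); the sample dictionary is shortened for illustration:

```python
import json

data = {"name": "Alice", "skills": ["Python", "Data Analysis"]}

compact = json.dumps(data)           # No indent: everything on one line
pretty = json.dumps(data, indent=4)  # Four spaces per nesting level

print(compact)
print(pretty)
```

Both strings parse back to the same Python object; only the whitespace differs.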
Key Considerations:
- Error Handling: Wrap file operations in try...except blocks to handle errors such as FileNotFoundError or json.JSONDecodeError.
- Data Types: JSON supports a limited set of data types (strings, numbers, booleans, arrays, objects, and null). Python lists and dictionaries map to JSON arrays and objects; other types, such as sets or datetime objects, must be converted before serializing.
- Encoding: Be mindful of character encoding, especially when dealing with non-ASCII characters. The default encoding can vary by platform, so pass encoding='utf-8' explicitly when opening files. For example: with open('data.json', 'r', encoding='utf-8') as file:
- File Paths: Use relative or absolute file paths correctly to ensure your program can find the files.
- newline='' for CSV: Include newline='' when opening CSV files for writing, especially on Windows, to prevent extra blank rows. The csv module documentation recommends it when opening files for reading as well.
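Several of these points can be combined in one short sketch. The helper load_json below is hypothetical (it is not part of the json module), and the file names are invented; a temporary directory is used so the example leaves nothing behind:

```python
import json
import os
import tempfile


def load_json(path):
    """Hypothetical helper: return parsed JSON, or None if the file is missing or invalid."""
    try:
        with open(path, 'r', encoding='utf-8') as file:
            return json.load(file)
    except FileNotFoundError:
        print(f"File not found: {path}")
    except json.JSONDecodeError as error:
        print(f"Invalid JSON in {path}: {error}")
    return None


with tempfile.TemporaryDirectory() as folder:
    good_path = os.path.join(folder, 'good.json')
    bad_path = os.path.join(folder, 'bad.json')

    # Non-ASCII text written as UTF-8; ensure_ascii=False keeps the character unescaped
    with open(good_path, 'w', encoding='utf-8') as file:
        json.dump({"city": "Zürich"}, file, ensure_ascii=False)

    # Deliberately truncated JSON to trigger json.JSONDecodeError
    with open(bad_path, 'w', encoding='utf-8') as file:
        file.write('{"city": ')

    good = load_json(good_path)
    bad = load_json(bad_path)
    missing = load_json(os.path.join(folder, 'missing.json'))

# Values outside JSON's type set raise TypeError when serialized
try:
    json.dumps({"tags": {"a", "b"}})  # A set has no JSON equivalent
    serialized = True
except TypeError:
    serialized = False

print(good, bad, missing, serialized)
```

Returning None on failure is only one possible design; re-raising the exception or returning a default value may suit other programs better.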