Module: File Handling

Python File Handling: CSV and JSON

This document covers reading and writing CSV and JSON files in Python.

1. CSV (Comma Separated Values)

CSV is a simple file format used to store tabular data (numbers and text) in a plain text format. Each line of the file is a data record, and each field within a record is separated by a comma (or another delimiter).

1.1 Reading CSV Files

The csv module provides functionality to read and write CSV files.

import csv

# Reading a CSV file
with open('data.csv', 'r', newline='') as file:
    reader = csv.reader(file)  # Creates a reader object

    # Iterate through each row in the CSV file
    for row in reader:
        print(row)  # Each row is a list of strings

Explanation:

  • import csv: Imports the csv module.
  • with open('data.csv', 'r', newline='') as file:: Opens the CSV file in read mode ('r'). Passing newline='' lets the csv module handle line endings itself, so newlines embedded in quoted fields are read correctly. The with statement ensures the file is closed automatically, even if an error occurs.
  • reader = csv.reader(file): Creates a reader object that parses the file one record at a time. Each record is split into a list of strings on the delimiter (a comma by default), and quoted fields are handled for you.
  • for row in reader:: Loops through each row in the CSV file.
  • print(row): Prints each row, which is a list of strings representing the values in that row.

Example data.csv:

Name,Age,City
Alice,30,New York
Bob,25,London
Charlie,35,Paris

Output:

['Name', 'Age', 'City']
['Alice', '30', 'New York']
['Bob', '25', 'London']
['Charlie', '35', 'Paris']
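
Because every field arrives as a string, numeric columns usually need converting before use. The sketch below shows one way to skip the header row and convert Age to an integer. It assumes the sample data shown above and reads it from an in-memory io.StringIO buffer instead of a file on disk, so it runs on its own; the names sample and people are illustrative only.

```python
import csv
import io

# In-memory stand-in for data.csv so the example is self-contained
sample = io.StringIO(
    "Name,Age,City\nAlice,30,New York\nBob,25,London\nCharlie,35,Paris\n",
    newline='',
)

reader = csv.reader(sample)
header = next(reader)  # Consume the first row (the column names)

people = []
for name, age, city in reader:
    # Fields are strings, so convert Age to int explicitly
    people.append((name, int(age), city))

print(header)
print(people)
```

If a row might have a missing or non-numeric Age, int() raises ValueError, so real data usually needs a check around the conversion.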

1.2 Writing CSV Files

import csv

# Data to write
data = [
    ['Name', 'Age', 'City'],
    ['Alice', '30', 'New York'],
    ['Bob', '25', 'London'],
    ['Charlie', '35', 'Paris']
]

# Writing to a CSV file
with open('output.csv', 'w', newline='') as file:
    writer = csv.writer(file)  # Creates a writer object
    
    # Write multiple rows at once
    writer.writerows(data)

Explanation:

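  • with open('output.csv', 'w', newline='') as file:: Opens the CSV file in write mode ('w'), which creates the file or overwrites an existing one. The csv writer emits its own line endings (\r\n by default), so newline='' stops Python from translating them a second time. Without it, Windows output contains \r\r\n, which shows up as a blank row after every record.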
  • writer = csv.writer(file): Creates a writer object.
  • writer.writerows(data): Writes all rows from the data list to the CSV file. You can also use writer.writerow(row) to write one row at a time.

Output output.csv:

Name,Age,City
Alice,30,New York
Bob,25,London
Charlie,35,Paris
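
When rows are produced one at a time, for instance inside a loop, writer.writerow() can be used instead of writer.writerows(). The following is a minimal sketch under some assumptions: the records list is made-up sample data, and the file is written to a temporary directory so that no existing output.csv is overwritten.

```python
import csv
import os
import tempfile

# Hypothetical records produced one at a time (e.g. inside a loop)
records = [('Alice', 30, 'New York'), ('Bob', 25, 'London')]

# Write into a temporary directory so no existing file is overwritten
path = os.path.join(tempfile.mkdtemp(), 'output.csv')

with open(path, 'w', newline='') as file:
    writer = csv.writer(file)
    writer.writerow(['Name', 'Age', 'City'])  # Header row
    for record in records:
        writer.writerow(record)  # Non-string values are converted with str()

# Read the file back to confirm what was written
with open(path, 'r', newline='') as file:
    rows = list(csv.reader(file))

print(rows)
```

Note that the integer ages come back as strings when the file is read again; CSV does not record data types.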

1.3 Using a Different Delimiter

You can specify a different delimiter using the delimiter parameter in csv.reader() or csv.writer().

import csv

# Reading a semicolon-delimited CSV file (data.csv must use ';' as its separator here)
with open('data.csv', 'r', newline='') as file:
    reader = csv.reader(file, delimiter=';')
    for row in reader:
        print(row)

# Writing a CSV file with a semicolon delimiter
data = [
    ['Name', 'Age', 'City'],
    ['Alice', '30', 'New York'],
    ['Bob', '25', 'London'],
    ['Charlie', '35', 'Paris']
]

with open('output.csv', 'w', newline='') as file:
    writer = csv.writer(file, delimiter=';')
    writer.writerows(data)
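
A common question is what happens when a field itself contains the delimiter. The sketch below suggests the answer: with the default settings, the writer wraps such a field in double quotes, and a reader using the same delimiter recovers the original value. It uses an in-memory io.StringIO buffer and a made-up Note column purely for illustration.

```python
import csv
import io

# Write to an in-memory buffer instead of a file on disk
buffer = io.StringIO(newline='')
writer = csv.writer(buffer, delimiter=';')
writer.writerow(['Name', 'Note'])
writer.writerow(['Alice', 'likes tea; dislikes coffee'])  # Field contains the delimiter

text = buffer.getvalue()
print(text)  # The second field of the Alice row is wrapped in double quotes

# Reading with the same delimiter recovers the original fields
rows = list(csv.reader(io.StringIO(text, newline=''), delimiter=';'))
print(rows)
```

The reader and writer must agree on the delimiter; reading this text with the default comma would return each line as a single field.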

2. JSON (JavaScript Object Notation)

JSON is a lightweight data-interchange format that is easy for humans to read and write, and easy for machines to parse and generate. It is based on a subset of JavaScript syntax, but is language-independent.

2.1 Reading JSON Files

The json module provides functionality to work with JSON data.

import json

# Reading a JSON file
with open('data.json', 'r') as file:
    data = json.load(file)  # Loads the JSON data from the file

    print(data)

Explanation:

  • import json: Imports the json module.
  • with open('data.json', 'r') as file:: Opens the JSON file in read mode ('r').
  • data = json.load(file): Reads the file and parses it into the corresponding Python object. A JSON object becomes a dict and an array becomes a list; strings, numbers, true/false, and null become str, int or float, bool, and None.
  • print(data): Prints the loaded data.

Example data.json:

{
  "name": "Alice",
  "age": 30,
  "city": "New York",
  "skills": ["Python", "Data Analysis", "Machine Learning"]
}

Output:

{'name': 'Alice', 'age': 30, 'city': 'New York', 'skills': ['Python', 'Data Analysis', 'Machine Learning']}
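
Once loaded, the data behaves like any other Python dictionary or list. The sketch below illustrates accessing individual values. To stay self-contained it parses the same JSON from a string with json.loads(), the string counterpart of json.load(); the email key is a hypothetical example of a key that is absent.

```python
import json

text = '''
{
  "name": "Alice",
  "age": 30,
  "city": "New York",
  "skills": ["Python", "Data Analysis", "Machine Learning"]
}
'''

data = json.loads(text)  # Same parsing as json.load(), but from a string

print(data['name'])        # JSON string -> str
print(data['age'] + 1)     # JSON number -> int, so arithmetic works
print(data['skills'][0])   # JSON array  -> list, indexed from 0
print(data.get('email'))   # Missing key: .get() returns None instead of raising KeyError
```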

2.2 Writing JSON Files

import json

# Data to write
data = {
    "name": "Alice",
    "age": 30,
    "city": "New York",
    "skills": ["Python", "Data Analysis", "Machine Learning"]
}

# Writing to a JSON file
with open('output.json', 'w') as file:
    json.dump(data, file, indent=4)  # Dumps the data to the file

Explanation:

  • with open('output.json', 'w') as file:: Opens the JSON file in write mode ('w').
  • json.dump(data, file, indent=4): Serializes the Python object to JSON-formatted text and writes it to the file. To get the JSON as a string instead of writing it to a file, use json.dumps().
    • data: The Python object to serialize.
    • file: The file object to write to.
    • indent=4: Optional parameter that adds indentation to the JSON output, making it more readable. A value of 4 adds 4 spaces for each level of indentation.

Output output.json:

{
    "name": "Alice",
    "age": 30,
    "city": "New York",
    "skills": [
        "Python",
        "Data Analysis",
        "Machine Learning"
    ]
}
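
A quick way to check that a file was written correctly is to read it back and compare the result with the original object. This is a minimal sketch of such a round trip, assuming a shortened version of the data above and a temporary directory in place of the working directory.

```python
import json
import os
import tempfile

data = {"name": "Alice", "age": 30, "skills": ["Python", "Data Analysis"]}

# Temporary location so no existing output.json is overwritten
path = os.path.join(tempfile.mkdtemp(), 'output.json')

with open(path, 'w', encoding='utf-8') as file:
    json.dump(data, file, indent=4)

with open(path, 'r', encoding='utf-8') as file:
    restored = json.load(file)

print(restored == data)  # True: dicts, lists, strings, and ints survive the round trip
```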

2.3 Pretty Printing JSON

The indent parameter of json.dump() (and json.dumps()) controls pretty printing. Without it, the whole document is written on a single line, which is compact but hard to read when the data is nested.
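
The difference is easiest to see side by side. The sketch below uses json.dumps() so that both forms can be printed directly; the data dictionary is a shortened, illustrative version of the earlier example.

```python
import json

data = {"name": "Alice", "skills": ["Python", "Data Analysis"]}

compact = json.dumps(data)           # Default: everything on one line
pretty = json.dumps(data, indent=2)  # Two spaces per nesting level

print(compact)
print(pretty)
```

Both strings parse back to the same object; indent changes only the whitespace, not the data.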

3. Key Considerations

  • Error Handling: Wrap file operations in try...except blocks for the errors you expect, such as FileNotFoundError when the file is missing or json.JSONDecodeError when its contents are not valid JSON.
  • Data Types: JSON supports only strings, numbers, booleans, arrays, objects, and null. Python lists and tuples are written as arrays and dicts as objects. Types with no JSON equivalent, such as set or datetime, raise TypeError unless you convert them first.
  • Encoding: When no encoding is given, open() uses a default that can depend on the platform. Pass encoding='utf-8' explicitly when files may contain non-ASCII characters. For example: with open('data.json', 'r', encoding='utf-8') as file:
  • File Paths: Relative paths are resolved against the current working directory, which is not necessarily the directory that contains your script. Use an absolute path if the program may be run from elsewhere.
  • newline='' for CSV: Open CSV files with newline='' for both reading and writing. When writing, this prevents extra blank rows on Windows.
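
The error-handling, encoding, and data-type points above can be sketched together as follows. The helper load_json is hypothetical (it is not part of the json module), and the file names and sample values are made up for illustration; treat this as one possible approach, not the only one.

```python
import json
import os
import tempfile


def load_json(path, default=None):
    """Return parsed JSON from path, or default if the file is missing or invalid.

    Hypothetical helper for illustration; not part of the json module.
    """
    try:
        with open(path, 'r', encoding='utf-8') as file:
            return json.load(file)
    except FileNotFoundError:
        print(f"File not found: {path}")
    except json.JSONDecodeError as error:
        print(f"Invalid JSON in {path}: {error}")
    return default


folder = tempfile.mkdtemp()

# Case 1: the file does not exist
missing = load_json(os.path.join(folder, 'missing.json'), default={})

# Case 2: the file exists but is not valid JSON
bad_path = os.path.join(folder, 'bad.json')
with open(bad_path, 'w', encoding='utf-8') as file:
    file.write('{"name": "Alice",}')  # A trailing comma is not allowed in JSON
invalid = load_json(bad_path, default={})

# Case 3: a value JSON cannot represent (a set) raises TypeError when serialized
try:
    json.dumps({"skills": {"Python"}})
    serialized = True
except TypeError:
    serialized = False

# Case 4: a tuple is written as a JSON array and comes back as a list
round_tripped = json.loads(json.dumps({"point": (1, 2)}))

print(missing, invalid, serialized, round_tripped)
```

Returning a default keeps the calling code simple, but it also hides the failure; in a larger program you may prefer to log the error or re-raise it.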