JSON to CSV in Python: DictWriter + pandas Examples

Backend Developer · Reviewed by Priya Sharma · Published

Use the free online JSON to CSV converter directly in your browser, no install required.

Try JSON to CSV Online →

Almost every data pipeline eventually hits the same step: an API returns JSON, but the next consumer (a spreadsheet, an import script, a Redshift COPY command) needs CSV. Converting JSON to CSV in Python sounds trivial until you hit nested objects, inconsistent keys, or datetime values that need special handling. Python gives you two solid paths: the built-in json + csv modules for zero-dependency scripts, and pandas for nested flattening and larger datasets. For quick one-off conversions without any code, there is also the online JSON to CSV converter. This guide covers both approaches end to end, with runnable Python 3.8+ examples.

  • ✓ csv.DictWriter converts a list of dicts to CSV with zero dependencies: use json.load() to parse, then writeheader() + writerows().
  • ✓ Always open CSV files with newline="" on Windows to prevent blank rows between data rows.
  • ✓ pd.json_normalize() flattens nested JSON into a flat DataFrame before calling to_csv(), and handles multi-level nesting automatically.
  • ✓ Pass index=False to DataFrame.to_csv(); without it, pandas writes an unwanted row-number column.
  • ✓ For files over 500 MB, use ijson for streaming JSON parsing combined with csv.DictWriter for constant memory usage.

What is JSON to CSV Conversion?

JSON to CSV conversion transforms an array of JSON objects into a tabular format where each object becomes a row and each key becomes a column header. JSON is hierarchical: objects can nest arbitrarily deep. CSV is flat: every value sits in a row-column grid. The conversion works cleanly when every object shares the same set of top-level keys. Nested objects, arrays, and inconsistent keys are where things get interesting. The raw data stays identical; only the structure changes.

Before · JSON
After · CSV
[{"order_id":"ord_91a3","total":149.99,"status":"shipped"},
 {"order_id":"ord_b7f2","total":34.50,"status":"pending"}]
order_id,total,status
ord_91a3,149.99,shipped
ord_b7f2,34.50,pending

csv.DictWriter: Convert JSON to CSV Without pandas

The csv module ships with every Python installation. No pip install, no virtual environment gymnastics. csv.DictWriter takes a list of dictionaries and writes each one as a CSV row, mapping dict keys to column headers. The fieldnames parameter controls both the column order and which keys get included.

Python 3.8+ · minimal JSON to CSV example
import json
import csv

# Sample JSON data: an array of order objects
json_string = """
[
  {"order_id": "ord_91a3", "product": "Wireless Keyboard", "quantity": 2, "unit_price": 74.99},
  {"order_id": "ord_b7f2", "product": "USB-C Hub", "quantity": 1, "unit_price": 34.50},
  {"order_id": "ord_c4e8", "product": "Monitor Stand", "quantity": 3, "unit_price": 29.95}
]
"""

records = json.loads(json_string)

with open("orders.csv", "w", newline="", encoding="utf-8") as csvfile:
    writer = csv.DictWriter(csvfile, fieldnames=records[0].keys())
    writer.writeheader()
    writer.writerows(records)

# orders.csv:
# order_id,product,quantity,unit_price
# ord_91a3,Wireless Keyboard,2,74.99
# ord_b7f2,USB-C Hub,1,34.50
# ord_c4e8,Monitor Stand,3,29.95

That newline="" argument on open() is not optional on Windows. Without it, you get double carriage returns, which show up as blank rows between every data row in Excel. On macOS and Linux it is harmless, so just always include it.

The code above uses json.loads() for a string. Use json.load() (no trailing s) when reading from a file handle. This trips people up constantly: one reads a string, the other reads a file object.

Python 3.8+ · read JSON file, write CSV file
import json
import csv

with open("server_metrics.json", encoding="utf-8") as jf:
    metrics = json.load(jf)  # json.load() for file objects

# Explicit fieldnames control column order
columns = ["timestamp", "hostname", "cpu_percent", "memory_mb", "disk_io_ops"]

with open("server_metrics.csv", "w", newline="", encoding="utf-8") as cf:
    writer = csv.DictWriter(cf, fieldnames=columns, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(metrics)

# Only the five specified columns appear, in exactly that order

Setting extrasaction="ignore" silently drops any keys in the dicts that are not in your fieldnames list. The default is "raise", which throws a ValueError if any dict has an unexpected key. Pick whichever matches your tolerance for surprises.
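A quick sketch of the difference between the two modes, using a hypothetical metrics list where one record carries an extra debug_flag key that is not a declared column:

```python
import csv
import io

records = [
    {"host": "web-1", "cpu": 72.3, "debug_flag": True},  # debug_flag is not in fieldnames
    {"host": "web-2", "cpu": 45.1},
]

# extrasaction="ignore" silently drops the unexpected debug_flag key
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["host", "cpu"], extrasaction="ignore")
writer.writeheader()
writer.writerows(records)
print(buf.getvalue())
# host,cpu
# web-1,72.3
# web-2,45.1

# The default extrasaction="raise" throws ValueError on the same data
strict = csv.DictWriter(io.StringIO(), fieldnames=["host", "cpu"])
try:
    strict.writerows(records)
except ValueError as exc:
    print(f"raised: {exc}")
```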

Note: csv.DictWriter vs csv.writer: DictWriter maps dict keys to column positions automatically. csv.writer writes raw lists as rows, leaving the column ordering to you. DictWriter is almost always the right choice for JSON-to-CSV because JSON records are already dictionaries.

Python's csv module ships with three named dialects: excel (comma delimiter, CRLF line endings; the default), excel-tab (tab delimiter, CRLF endings), and unix (LF line endings, quotes every field). Pass the dialect name as the dialect argument to csv.DictWriter. You can also define a custom dialect with csv.register_dialect() when your target system has unusual quoting or delimiter rules. For most JSON-to-CSV workflows the excel dialect is correct, but switch to unix when writing files that will be processed by POSIX tools like awk or sort.
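As a sketch, registering a hypothetical pipe-delimited dialect (the name "pipes" and its settings are illustrative, not a built-in) and writing one row with it:

```python
import csv
import io

# Hypothetical dialect for a target system that expects pipes and Unix line endings
csv.register_dialect(
    "pipes",
    delimiter="|",
    quoting=csv.QUOTE_MINIMAL,
    lineterminator="\n",
)

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["order_id", "status"], dialect="pipes")
writer.writeheader()
writer.writerow({"order_id": "ord_91a3", "status": "shipped"})
print(buf.getvalue())
# order_id|status
# ord_91a3|shipped
```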

Handling Non-Standard Types: datetime, UUID, and Decimal

JSON from APIs often contains dates as ISO strings, UUIDs as hyphenated strings, and monetary values as floats. When you parse these into Python objects for processing before writing CSV, you need to convert them back to strings. The csv module calls str() on every value, so most types just work. But datetime objects produce messy default string representations, and Decimal values need explicit formatting to avoid scientific notation.

Python 3.8+ · pre-process datetime and Decimal before the CSV write
import json
import csv
from datetime import datetime, timezone
from decimal import Decimal
from uuid import UUID

# Simulating parsed API response with Python types
transactions = [
    {
        "txn_id": UUID("a1b2c3d4-e5f6-7890-abcd-ef1234567890"),
        "created_at": datetime(2026, 3, 15, 9, 30, 0, tzinfo=timezone.utc),
        "amount": Decimal("1249.99"),
        "currency": "USD",
        "merchant": "CloudHost Inc.",
    },
    {
        "txn_id": UUID("b2c3d4e5-f6a7-8901-bcde-f12345678901"),
        "created_at": datetime(2026, 3, 15, 14, 12, 0, tzinfo=timezone.utc),
        "amount": Decimal("87.50"),
        "currency": "EUR",
        "merchant": "DataSync GmbH",
    },
]

def prepare_row(record: dict) -> dict:
    """Convert non-string types to CSV-friendly strings."""
    return {
        "txn_id": str(record["txn_id"]),
        "created_at": record["created_at"].isoformat(),
        "amount": f"{record['amount']:.2f}",
        "currency": record["currency"],
        "merchant": record["merchant"],
    }

with open("transactions.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["txn_id", "created_at", "amount", "currency", "merchant"])
    writer.writeheader()
    for txn in transactions:
        writer.writerow(prepare_row(txn))

# transactions.csv:
# txn_id,created_at,amount,currency,merchant
# a1b2c3d4-e5f6-7890-abcd-ef1234567890,2026-03-15T09:30:00+00:00,1249.99,USD,CloudHost Inc.
# b2c3d4e5-f6a7-8901-bcde-f12345678901,2026-03-15T14:12:00+00:00,87.50,EUR,DataSync GmbH

The prepare_row() function is the right approach here. Rather than trying to teach csv.DictWriter about custom types, you normalize each record to strings before writing. I prefer calling .isoformat() explicitly on datetime objects rather than relying on str(): the output format is more predictable, and downstream parsers handle ISO 8601 reliably.

Warning: If you let Decimal values pass through without formatting, very small or very large numbers may render in scientific notation (e.g., 1.5E+7). Always format Decimal with an explicit f-string like f"{value:.2f}" when writing financial data to CSV.

An alternative pattern for pipelines with many custom types is to extend json.JSONEncoder. Subclass it, override the default() method to return a JSON-serializable value for each custom type, then pass the subclass as the cls argument to json.dumps(). Re-encoding through the custom encoder before writing to CSV normalizes all types in one step without a per-row prepare_row() call. The prepare_row() pattern shown above is simpler for one-off scripts; the JSONEncoder subclass approach scales better when the same domain model with custom types is shared across many pipeline stages or microservices.
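A minimal sketch of that encoder pattern, reusing the datetime/Decimal/UUID types from the transactions example above (the class name PipelineEncoder is illustrative):

```python
import json
from datetime import datetime, timezone
from decimal import Decimal
from uuid import UUID

class PipelineEncoder(json.JSONEncoder):
    """Serialize this pipeline's custom types; defer everything else to the base class."""
    def default(self, o):
        if isinstance(o, datetime):
            return o.isoformat()
        if isinstance(o, Decimal):
            return f"{o:.2f}"
        if isinstance(o, UUID):
            return str(o)
        return super().default(o)

record = {
    "txn_id": UUID("a1b2c3d4-e5f6-7890-abcd-ef1234567890"),
    "created_at": datetime(2026, 3, 15, 9, 30, tzinfo=timezone.utc),
    "amount": Decimal("1249.99"),
}

# Round-tripping through the encoder yields a plain dict of strings,
# ready for csv.DictWriter without a per-row prepare_row() call
normalized = json.loads(json.dumps(record, cls=PipelineEncoder))
print(normalized)
# {'txn_id': 'a1b2c3d4-e5f6-7890-abcd-ef1234567890',
#  'created_at': '2026-03-15T09:30:00+00:00', 'amount': '1249.99'}
```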

csv.DictWriter Parameters Reference

The full constructor signature is csv.DictWriter(f, fieldnames, restval="", extrasaction="raise", dialect="excel", **fmtparams). Most of these have sensible defaults. The ones you will actually change are fieldnames, delimiter, and extrasaction.

Parameter
Type
Default
Description
f
file object
(required)
Any object with a write() method; typically from open()
fieldnames
sequence
(required)
List of keys that defines column order in the CSV output
restval
str
""
Value written when a dict is missing a key from fieldnames
extrasaction
str
"raise"
"raise" throws ValueError for extra keys; "ignore" silently drops them
dialect
str / Dialect
"excel"
Predefined formatting rules: "excel", "excel-tab", or "unix"
delimiter
str
","
Single character separating fields; use "\t" for TSV output
quotechar
str
"
Character used to quote fields containing the delimiter
quoting
int
csv.QUOTE_MINIMAL
Controls when quoting is applied: MINIMAL, ALL, NONNUMERIC, NONE
lineterminator
str
"\r\n"
String appended after each row; override to "\n" for Unix-style output
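A sketch combining a few of these fmtparams overrides in one writer (tab delimiter, Unix line endings, full quoting); the field values are illustrative:

```python
import csv
import io

buf = io.StringIO()
writer = csv.DictWriter(
    buf,
    fieldnames=["host", "status"],
    delimiter="\t",               # TSV output
    lineterminator="\n",          # Unix-style line endings
    quoting=csv.QUOTE_ALL,        # quote every field, including the header
)
writer.writeheader()
writer.writerow({"host": "web-1", "status": "ok"})
print(buf.getvalue())
# "host"	"status"
# "web-1"	"ok"
```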

pandas: Convert JSON to CSV with DataFrames

If you are already working in a pandas-heavy codebase, or your JSON has nested objects that you need to flatten, the pandas approach is significantly less code than the stdlib version. The tradeoff: pandas (plus its numpy dependency) adds tens of megabytes to your environment. For a throwaway script, that is fine. For a Docker image you ship to production, the stdlib approach keeps things lighter.

Python 3.8+ · pandas read_json then to_csv
import pandas as pd

# Read JSON array directly into a DataFrame
df = pd.read_json("warehouse_inventory.json")

# Write to CSV; index=False prevents the auto-generated row numbers
df.to_csv("warehouse_inventory.csv", index=False)

# That's it. Two lines. pandas infers column types automatically.

The index=False flag is one of those things you look up every single time. Without it, pandas writes a 0, 1, 2, ... column as the first column of your CSV. Nobody wants that.

Flattening Nested JSON with json_normalize

Real API responses are rarely flat. Orders contain shipping addresses, users contain nested preferences, telemetry events contain nested metadata. pd.json_normalize() walks nested dictionaries and flattens them into columns with dot-separated names.

Python 3.8+ · flatten nested JSON using json_normalize
import json
import pandas as pd

api_response = """
[
  {
    "order_id": "ord_91a3",
    "placed_at": "2026-03-15T09:30:00Z",
    "customer": {
      "name": "Sarah Chen",
      "email": "s.chen@example.com",
      "tier": "premium"
    },
    "shipping": {
      "method": "express",
      "address": {
        "city": "Portland",
        "state": "OR",
        "zip": "97201"
      }
    },
    "total": 299.95
  },
  {
    "order_id": "ord_b7f2",
    "placed_at": "2026-03-15T14:12:00Z",
    "customer": {
      "name": "James Park",
      "email": "j.park@example.com",
      "tier": "standard"
    },
    "shipping": {
      "method": "standard",
      "address": {
        "city": "Austin",
        "state": "TX",
        "zip": "73301"
      }
    },
    "total": 87.50
  }
]
"""

orders = json.loads(api_response)

# json_normalize flattens nested dicts; sep controls the delimiter
df = pd.json_normalize(orders, sep="_")
df.to_csv("flat_orders.csv", index=False)

# Resulting columns:
# order_id, placed_at, customer_name, customer_email, customer_tier,
# shipping_method, shipping_address_city, shipping_address_state,
# shipping_address_zip, total

The sep="_" parameter controls how nested key names are joined. The default is ".", which produces columns like customer.name. I prefer underscores because dots in column names cause trouble with SQL imports and some spreadsheet formulas.

For API responses that wrap the records array under a nested key, use the record_path parameter. If the response looks like {"data": {"orders": [...]}}, pass record_path=["data", "orders"] to navigate to the right list. The optional meta parameter lets you pull parent-level fields alongside the nested records, which is useful when the response includes top-level pagination info (page number, total count) that you want as a column in every row. Together, record_path and meta handle most real-world nested API response shapes without custom preprocessing.
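A sketch of record_path and meta together, on a hypothetical paginated response (the data/orders wrapper and page field are illustrative, not from the examples above):

```python
import pandas as pd

# Hypothetical paginated API response with records nested under data.orders
response = {
    "page": 1,
    "data": {
        "orders": [
            {"order_id": "ord_91a3", "total": 299.95},
            {"order_id": "ord_b7f2", "total": 87.50},
        ]
    },
}

df = pd.json_normalize(
    response,
    record_path=["data", "orders"],  # navigate down to the nested list
    meta=["page"],                   # repeat the top-level page number on every row
)
print(df.columns.tolist())
# ['order_id', 'total', 'page']
df.to_csv("wrapped_orders.csv", index=False)
```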

DataFrame.to_csv() Parameters Reference

DataFrame.to_csv() has over 20 parameters. These are the ones that matter for JSON-to-CSV workflows.

Parameter
Type
Default
Description
path_or_buf
str / Path / None
None
File path or buffer; None returns CSV as a string
sep
str
","
Field delimiter; use "\t" for TSV
index
bool
True
Write row index as the first column; almost always set to False
columns
list
None
Subset and reorder columns in the output
header
bool / list
True
Write column names; set False when appending to an existing file
encoding
str
"utf-8"
Output encoding; use "utf-8-sig" for Excel compatibility on Windows
na_rep
str
""
String representation for missing values (NaN, None)
quoting
int
csv.QUOTE_MINIMAL
Controls when fields get quoted
Python 3.8+ · to_csv with common parameter overrides
import pandas as pd

df = pd.read_json("telemetry_events.json")

# TSV output with explicit encoding and missing value handling
df.to_csv(
    "telemetry_events.tsv",
    sep="\t",
    index=False,
    encoding="utf-8",
    na_rep="NULL",
    columns=["event_id", "timestamp", "source", "severity", "message"],
)

# Write to stdout for piping in shell scripts
print(df.to_csv(index=False))

# Return as string (no file written)
csv_string = df.to_csv(index=False)
print(len(csv_string), "characters")

Convert JSON to CSV from a File and API Response

The two most common real-world scenarios: reading JSON from a file on disk and converting it, or fetching JSON from an HTTP API and saving the result as CSV. In development you can get away with no error handling. In production, that choice becomes a 2 a.m. alert. Files might not exist, APIs might return 4xx or 5xx status codes instead of JSON, the response body might be an error object rather than an array, or the JSON could be truncated due to a network timeout. The patterns below handle all of these cases explicitly, log errors to stderr, and return a row count so callers can detect zero-row outputs and alert accordingly.

File on Disk: Read, Convert, Save

Python 3.8+ · convert JSON file to CSV with error handling
import json
import csv
import sys

def json_file_to_csv(input_path: str, output_path: str) -> int:
    """Convert a JSON file containing an array of objects to CSV.
    Returns the number of rows written.
    """
    try:
        with open(input_path, encoding="utf-8") as jf:
            data = json.load(jf)
    except FileNotFoundError:
        print(f"Error: {input_path} not found", file=sys.stderr)
        return 0
    except json.JSONDecodeError as exc:
        print(f"Error: invalid JSON in {input_path}: {exc.msg} at line {exc.lineno}", file=sys.stderr)
        return 0

    if not isinstance(data, list) or not data:
        print(f"Error: expected a non-empty JSON array in {input_path}", file=sys.stderr)
        return 0

    # Collect all unique keys across all records; handles inconsistent schemas
    all_keys: list[str] = []
    seen: set[str] = set()
    for record in data:
        for key in record:
            if key not in seen:
                all_keys.append(key)
                seen.add(key)

    with open(output_path, "w", newline="", encoding="utf-8") as cf:
        writer = csv.DictWriter(cf, fieldnames=all_keys, restval="", extrasaction="ignore")
        writer.writeheader()
        writer.writerows(data)

    return len(data)

rows = json_file_to_csv("deploy_logs.json", "deploy_logs.csv")
print(f"Wrote {rows} rows to deploy_logs.csv")

HTTP API Response: Fetch and Convert

Python 3.8+ · fetch JSON from API and save as CSV
import json
import csv
import urllib.request
import urllib.error

def api_response_to_csv(url: str, output_path: str) -> int:
    """Fetch JSON from a REST API endpoint and write it as CSV."""
    try:
        req = urllib.request.Request(url, headers={"Accept": "application/json"})
        with urllib.request.urlopen(req, timeout=30) as resp:
            if resp.status != 200:
                print(f"Error: API returned status {resp.status}")
                return 0
            body = resp.read().decode("utf-8")
    except urllib.error.URLError as exc:
        print(f"Error: could not reach {url}: {exc.reason}")
        return 0

    try:
        records = json.loads(body)
    except json.JSONDecodeError as exc:
        print(f"Error: API returned invalid JSON: {exc.msg}")
        return 0

    if not isinstance(records, list) or not records:
        print("Error: expected a non-empty JSON array from the API")
        return 0

    with open(output_path, "w", newline="", encoding="utf-8") as cf:
        writer = csv.DictWriter(cf, fieldnames=records[0].keys())
        writer.writeheader()
        writer.writerows(records)

    return len(records)

rows = api_response_to_csv(
    "https://api.internal.example.com/v2/deployments?status=completed",
    "completed_deployments.csv",
)
print(f"Exported {rows} deployments to CSV")
Note: The example above uses urllib from the standard library to keep the script dependency-free. If you have requests installed, replace the urllib section with resp = requests.get(url, timeout=30); records = resp.json(), and the rest of the CSV writing code stays identical.

Command-Line JSON to CSV Conversion

Sometimes you just need a one-liner in the terminal. Python's -c flag lets you run a quick conversion without creating a script file. For more complex transformations, pipe through jq first to reshape the data, then convert.

bash · one-liner JSON to CSV conversion
# Python one-liner: reads JSON from stdin, writes CSV to stdout
cat orders.json | python3 -c "
import json, csv, sys
data = json.load(sys.stdin)
w = csv.DictWriter(sys.stdout, fieldnames=data[0].keys())
w.writeheader()
w.writerows(data)
"

# Save output to a file
cat orders.json | python3 -c "
import json, csv, sys
data = json.load(sys.stdin)
w = csv.DictWriter(sys.stdout, fieldnames=data[0].keys())
w.writeheader()
w.writerows(data)
" > orders.csv
Python 3.8+ · self-contained CLI script with argparse
# Save as json2csv.py and run: python3 json2csv.py input.json -o output.csv
import json, csv, argparse, sys

parser = argparse.ArgumentParser(description='Convert JSON array to CSV')
parser.add_argument('input', help='Path to JSON file')
parser.add_argument('-o', '--output', default=None, help='Output CSV path (default: stdout)')
parser.add_argument('-d', '--delimiter', default=',', help='CSV delimiter')
args = parser.parse_args()

with open(args.input, encoding='utf-8') as f:
    data = json.load(f)

out = open(args.output, 'w', newline='', encoding='utf-8') if args.output else sys.stdout
writer = csv.DictWriter(out, fieldnames=data[0].keys(), delimiter=args.delimiter)
writer.writeheader()
writer.writerows(data)
if args.output:
    out.close()
    print(f'Wrote {len(data)} rows to {args.output}', file=sys.stderr)
bash · using jq + csvkit for complex transformations
# Install csvkit: pip install csvkit

# jq flattens and selects fields, in2csv handles the CSV formatting
cat api_response.json | jq '[.[] | {id: .order_id, customer: .customer.name, total}]' | in2csv -f json > orders.csv

# Miller (mlr) is another option for JSON-to-CSV
mlr --ijson --ocsv cat orders.json > orders.csv

Miller (mlr) is a standalone binary that treats JSON, CSV, and TSV as first-class formats with no Python runtime required. The --ijson --ocsv flag pair converts JSON input to CSV in a single pass, and you can chain Miller verbs to filter, sort, or rename columns in the same command before writing output. Install via Homebrew on macOS (brew install miller) or your Linux package manager. It is particularly useful in CI pipelines where you want fast JSON-to-CSV conversion without spinning up a Python environment.

High-Performance Alternative: pandas with pyarrow

For datasets in the tens-of-millions-of-rows range, pandas with the pyarrow backend parses input significantly faster than the default engine. The C-backed Arrow engine processes columnar data more efficiently than Python's row-by-row csv module. The API stays almost the same: you set the engine parameter on read_json(), available in pandas 2.0+.

bash · install pyarrow
pip install pyarrow
Python 3.8+ with pandas 2.0+ · pandas with pyarrow for faster JSON parsing
import pandas as pd

# Read JSON with the pyarrow engine (pandas 2.0+).
# The pyarrow engine only supports line-delimited JSON, hence lines=True.
df = pd.read_json("sensor_readings.ndjson", engine="pyarrow", lines=True)

# to_csv doesn't have an engine parameter, but the DataFrame operations
# between read and write benefit from pyarrow's columnar layout
df.to_csv("sensor_readings.csv", index=False)

# For truly large exports, consider writing to Parquet instead of CSV:
# binary format, 5-10x smaller, preserves types
df.to_parquet("sensor_readings.parquet", engine="pyarrow")

If you are processing more than a few hundred MB of JSON and the final consumer accepts Parquet, skip CSV entirely. Parquet is smaller, preserves column types, and both Redshift and BigQuery load it natively. CSV is a lossy format: every value becomes a string.

Terminal Output with Syntax Highlighting

The rich library renders tables with borders, alignment, and color in the terminal, which is useful for previewing a conversion during development without opening the output file.

bash · install rich
pip install rich
Python 3.8+ · preview CSV output in terminal with rich
import json
from rich.console import Console
from rich.table import Table

json_string = """
[
  {"hostname": "web-prod-1", "cpu_percent": 72.3, "memory_mb": 3840, "uptime_hours": 720},
  {"hostname": "web-prod-2", "cpu_percent": 45.1, "memory_mb": 2560, "uptime_hours": 168},
  {"hostname": "db-replica-1", "cpu_percent": 91.7, "memory_mb": 7680, "uptime_hours": 2160}
]
"""

records = json.loads(json_string)
console = Console()

table = Table(title="Server Metrics Preview", show_lines=True)
for key in records[0]:
    table.add_column(key, style="cyan" if key == "hostname" else "white")

for row in records:
    table.add_row(*[str(v) for v in row.values()])

console.print(table)
# Renders a color-highlighted table with borders in the terminal
Warning: Rich is for terminal display only. Do not use it to generate CSV files; it adds ANSI escape codes that will corrupt the output. Write to files with csv.DictWriter or DataFrame.to_csv(), and use rich only for previewing.

Working with Large JSON Files

json.load() reads the entire file into memory. For a 200 MB JSON file, that means ~200 MB of raw text plus the Python object overhead, easily 500 MB+ of heap usage. For files over 100 MB, stream the input with ijson and write CSV rows as you go.

bash · install ijson
pip install ijson

Streaming JSON Array to CSV with ijson

Python 3.8+ · stream large JSON array to CSV with constant memory
import ijson
import csv

def stream_json_to_csv(json_path: str, csv_path: str) -> int:
    """Convert a large JSON array to CSV without loading it all into memory."""
    with open(json_path, "rb") as jf, open(csv_path, "w", newline="", encoding="utf-8") as cf:
        # ijson.items yields each element of the top-level array one at a time
        records = ijson.items(jf, "item")

        first_record = next(records)
        fieldnames = list(first_record.keys())

        writer = csv.DictWriter(cf, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerow(first_record)

        count = 1
        for record in records:
            writer.writerow(record)
            count += 1

    return count

rows = stream_json_to_csv("clickstream_2026_03.json", "clickstream_2026_03.csv")
print(f"Streamed {rows} records to CSV")

NDJSON / JSON Lines: One Object Per Line

NDJSON (Newline-Delimited JSON), also called JSON Lines or .jsonl, stores one valid JSON object per line with no wrapping array. This format is common in log pipelines, event streams (Kafka, Kinesis), and bulk exports from services like Elasticsearch and BigQuery. Because each line is a self-contained JSON object, you can process an NDJSON file with a plain Python for loop over the file handle, with no need for the ijson library. Memory stays constant regardless of file size, making this the simplest streaming approach when your source data is already in JSON Lines format.

Python 3.8+ · convert NDJSON to CSV line by line
import json
import csv

def ndjson_to_csv(ndjson_path: str, csv_path: str) -> int:
    """Convert a newline-delimited JSON file to CSV, one line at a time."""
    with open(ndjson_path, encoding="utf-8") as nf:
        first_line = nf.readline()
        first_record = json.loads(first_line)
        fieldnames = list(first_record.keys())

        with open(csv_path, "w", newline="", encoding="utf-8") as cf:
            writer = csv.DictWriter(cf, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerow(first_record)

            count = 1
            for line in nf:
                line = line.strip()
                if not line:
                    continue
                try:
                    record = json.loads(line)
                    writer.writerow(record)
                    count += 1
                except json.JSONDecodeError:
                    continue  # skip malformed lines

    return count

rows = ndjson_to_csv("access_log.ndjson", "access_log.csv")
print(f"Converted {rows} log entries to CSV")
Note: Switch to streaming when the JSON file exceeds 100 MB. A 1 GB JSON array loaded with json.load() can consume 3-5 GB of RAM due to Python object overhead. With ijson, memory stays flat regardless of file size. If you just need a quick conversion of a small file, paste it into the JSON to CSV converter instead.

Common Mistakes

❌ Missing newline='' in open(): blank rows on Windows

Problem: The csv module writes line endings. Without newline='', Python's text mode adds another on Windows, producing double-spaced output.

Fix: Always pass newline='' when opening a file for CSV writing. It is harmless on macOS/Linux.

Before · Python
After · Python
with open("output.csv", "w") as f:
    writer = csv.DictWriter(f, fieldnames=columns)
    writer.writeheader()
    writer.writerows(data)
# Blank rows between every data row on Windows
with open("output.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=columns)
    writer.writeheader()
    writer.writerows(data)
# Clean output on all platforms
❌ Forgetting index=False in pandas to_csv()

Problem: Without index=False, pandas prepends an auto-incrementing row number column (0, 1, 2, ...) that pollutes the CSV with data that was never in the original JSON.

Fix: Pass index=False to to_csv(). If you actually need an index column, name it explicitly with df.index.name = 'row_num'.

Before · Python
After · Python
df = pd.read_json("events.json")
df.to_csv("events.csv")
# CSV gets an extra unnamed column: ,event_id,timestamp,...
# The leading comma breaks many CSV parsers
df = pd.read_json("events.json")
df.to_csv("events.csv", index=False)
# Clean CSV: event_id,timestamp,...
❌ Using records[0].keys() when records have inconsistent keys

Problem: If JSON objects have different keys (some records have optional fields), using the first record's keys as fieldnames silently drops columns that only appear in later records.

Fix: Collect all unique keys across all records before creating the DictWriter.

Before · Python
After · Python
records = json.load(f)
writer = csv.DictWriter(out, fieldnames=records[0].keys())
# Misses "discount" field that only appears in records[2]
records = json.load(f)
all_keys = list(dict.fromkeys(k for r in records for k in r))
writer = csv.DictWriter(out, fieldnames=all_keys, restval="")
# Every key from every record is included as a column
❌ Writing nested dicts directly to CSV without flattening

Problem: csv.DictWriter calls str() on nested dicts, producing columns with values like "{'city': 'Portland'}" (raw Python repr, not actual data).

Fix: Flatten nested objects first using pd.json_normalize() or a custom flattening function.

Before · Python
After · Python
records = [{"id": "evt_1", "meta": {"source": "web", "region": "us-west"}}]
writer = csv.DictWriter(f, fieldnames=["id", "meta"])
writer.writerows(records)
# meta column contains: {'source': 'web', 'region': 'us-west'}
import pandas as pd
records = [{"id": "evt_1", "meta": {"source": "web", "region": "us-west"}}]
df = pd.json_normalize(records, sep="_")
df.to_csv("events.csv", index=False)
# Columns: id, meta_source, meta_region

csv.DictWriter vs pandas: Quick Comparison

Method
Nested JSON
Custom Types
Streaming
Dependencies
Requires Install
csv.DictWriter
✗ (manual flatten)
✗
✓ (row by row)
None
No (stdlib)
csv.writer
✗
✗
✓ (row by row)
None
No (stdlib)
pd.DataFrame.to_csv()
✗ (flat only)
✓ (via dtypes)
✗
pandas + numpy
pip install
pd.json_normalize() + to_csv()
✓
✓ (via dtypes)
✗
pandas + numpy
pip install
csv.writer + json_flatten
✓
✗
✓
flatten_json
pip install
jq + csvkit (CLI)
✓ (via jq)
N/A
✓
jq, csvkit
System install

Use csv.DictWriter when you need zero dependencies, your JSON is flat, and the script runs in a restricted environment (CI containers, Lambda functions, embedded Python). Use pd.json_normalize() + to_csv() when the JSON is nested, you need to transform or filter data before export, or you are already in a pandas workflow. For files that do not fit in memory, combine ijson with csv.DictWriter for constant-memory streaming.

For quick, no-code conversions, the JSON to CSV converter on ToolDeck handles it without any Python setup.

Frequently Asked Questions

How do I convert JSON to CSV in Python without pandas?

Use the built-in json and csv modules. Call json.load() to parse the JSON file into a list of dicts, extract fieldnames from the first dict's keys, create a csv.DictWriter, call writeheader(), then writerows(). This approach has zero external dependencies and works in any Python 3.x environment. It also runs faster than pandas for small files since there is no DataFrame allocation overhead. If your JSON objects have inconsistent keys across records, collect all unique keys first with dict.fromkeys(k for r in records for k in r) before passing them as fieldnames to avoid missing columns.

Python
import json
import csv

with open("orders.json") as f:
    records = json.load(f)

with open("orders.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=records[0].keys())
    writer.writeheader()
    writer.writerows(records)

How do I handle nested JSON when converting to CSV?

Flat JSON arrays map directly to CSV rows, but nested objects need flattening first. With pandas, pd.json_normalize() handles this automatically: it joins nested keys with a dot separator (e.g., "address.city"). Without pandas, write a recursive function that walks the dict and concatenates keys with a delimiter. For deeply nested structures with multiple levels, json_normalize handles them all in one pass. The sep parameter controls the joining character between key segments; underscore is usually safer than the default dot for SQL imports and spreadsheet formula compatibility.

Python
import pandas as pd

nested_data = [
    {"id": "ord_91a3", "customer": {"name": "Sarah Chen", "email": "s.chen@example.com"}},
]
df = pd.json_normalize(nested_data, sep="_")
# Columns: id, customer_name, customer_email
df.to_csv("flat_orders.csv", index=False)
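Without pandas, the recursive flattening mentioned above could look like this sketch (the flatten helper is illustrative, not a standard library API):

Python

```python
def flatten(record, parent_key="", sep="_"):
    """Recursively flatten nested dicts, joining keys with sep."""
    items = {}
    for key, value in record.items():
        new_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            items.update(flatten(value, new_key, sep))
        else:
            items[new_key] = value
    return items

nested = {"id": "ord_91a3", "customer": {"name": "Sarah Chen", "address": {"city": "Austin"}}}
flat = flatten(nested)
# {'id': 'ord_91a3', 'customer_name': 'Sarah Chen', 'customer_address_city': 'Austin'}
```

Apply it to each record in the list, then feed the flattened dicts to csv.DictWriter as usual. Note this sketch does not handle arrays inside the JSON, which need a separate policy (join, explode, or serialize).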

Why does my CSV have blank rows between data rows on Windows?

The csv module writes \r\n line endings by default. On Windows, opening the file in text mode adds another \r, producing \r\r\n β€” which displays as a blank row. The fix is to always pass newline="" to open(). This tells Python not to translate line endings, letting the csv module handle them. This pattern is required regardless of operating system β€” it is harmless on macOS and Linux, and critical on Windows. The Python documentation explicitly calls this out in the csv module section as the correct way to open files for CSV writing.

Python
import csv

# Wrong β€” blank rows on Windows
with open("output.csv", "w") as f:
    writer = csv.writer(f)

# Correct β€” newline="" prevents double \r
with open("output.csv", "w", newline="") as f:
    writer = csv.writer(f)

How do I append JSON records to an existing CSV file?

Open the file in append mode ("a") and create a DictWriter with the same fieldnames. Skip writeheader() since the header row already exists. With pandas, use to_csv(mode="a", header=False). Make sure the column order matches the existing file, or the data will land in the wrong columns. If you are unsure about the column order in the existing file, open it first with csv.DictReader and read fieldnames from its fieldnames attribute before creating the writer for appending.

Python
import csv

new_records = [
    {"order_id": "ord_f4c1", "total": 89.50, "status": "shipped"},
]

with open("orders.csv", "a", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["order_id", "total", "status"])
    writer.writerows(new_records)
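If the column order in the existing file is unknown, one way to recover it before appending, as described above, is to read the header with csv.DictReader first (file name and data are illustrative; the setup block just creates a file to append to):

Python

```python
import csv

# Setup: an existing CSV with a header row (illustrative data)
with open("orders.csv", "w", newline="", encoding="utf-8") as f:
    w = csv.DictWriter(f, fieldnames=["order_id", "total", "status"])
    w.writeheader()
    w.writerow({"order_id": "ord_f4c1", "total": 89.50, "status": "shipped"})

# Recover the column order from the existing header before appending
with open("orders.csv", newline="", encoding="utf-8") as f:
    existing_fields = csv.DictReader(f).fieldnames

with open("orders.csv", "a", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=existing_fields)
    writer.writerow({"order_id": "ord_a9d2", "total": 14.25, "status": "pending"})
```

Because the writer's fieldnames come from the file itself, the appended values land in the right columns even if the script's dict keys are in a different order.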

What is the fastest way to convert a large JSON file to CSV in Python?

For files under 500 MB, pd.read_json() followed by to_csv() is the fastest single-call approach β€” pandas uses optimized C code internally. For files above 500 MB, use ijson to stream JSON records and write them to CSV with csv.DictWriter row by row. This keeps memory usage constant regardless of file size. For NDJSON files (one JSON object per line), you do not need ijson at all β€” a plain Python for loop over the file handle processes each line independently and achieves constant memory without any third-party library.

Python
# Fast for files that fit in memory
import pandas as pd
df = pd.read_json("large_dataset.json")
df.to_csv("large_dataset.csv", index=False)

# Streaming for files that don't fit in memory
import csv
import ijson

with open("huge.json", "rb") as jf, open("huge.csv", "w", newline="") as cf:
    # ijson.items yields one top-level array element at a time
    records = ijson.items(jf, "item")
    first = next(records)  # assumes at least one record
    writer = csv.DictWriter(cf, fieldnames=first.keys())
    writer.writeheader()
    writer.writerow(first)
    writer.writerows(records)
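For NDJSON input, the plain loop mentioned above might look like this (file names and data are illustrative; the setup block just writes a small NDJSON file to read back):

Python

```python
import csv
import json

# Setup: a small NDJSON file, one JSON object per line (illustrative data)
with open("events.ndjson", "w", encoding="utf-8") as f:
    f.write('{"host": "web-1", "cpu": 72.3}\n')
    f.write('{"host": "web-2", "cpu": 45.1}\n')

# Stream line by line: only one record is in memory at a time
with open("events.ndjson", encoding="utf-8") as jf, \
     open("events.csv", "w", newline="", encoding="utf-8") as cf:
    first = json.loads(next(jf))  # assumes at least one line
    writer = csv.DictWriter(cf, fieldnames=first.keys())
    writer.writeheader()
    writer.writerow(first)
    for line in jf:
        writer.writerow(json.loads(line))
```

Each line is parsed and written independently, so memory stays constant no matter how many lines the file has.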

Can I write CSV output to stdout instead of a file in Python?

Yes. Pass sys.stdout as the file object to csv.writer() or csv.DictWriter(). This is useful for piping output in shell scripts or quick debugging. With pandas, call to_csv(sys.stdout, index=False) or to_csv(None) to get a string you can print. No temporary file needed. When writing to stdout on Windows, call sys.stdout.reconfigure(newline="") first to avoid the double carriage-return issue, since stdout opens in text mode by default.

Python
import csv
import sys
import json

data = json.loads('[{"host":"web-1","cpu":72.3},{"host":"web-2","cpu":45.1}]')
writer = csv.DictWriter(sys.stdout, fieldnames=data[0].keys())
writer.writeheader()
writer.writerows(data)
# host,cpu
# web-1,72.3
# web-2,45.1
Maria Santos Β· Backend Developer

Maria is a backend developer specialising in Python and API integration. She has broad experience with data pipelines, serialisation formats, and building reliable server-side services. She is an active member of the Python community and enjoys writing practical, example-driven guides that help developers solve real problems without unnecessary theory.

Priya Sharma Β· Technical Reviewer

Priya is a data scientist and machine learning engineer who has worked across the full Python data stack β€” from raw data ingestion and cleaning to model deployment and monitoring. She is passionate about reproducible research, Jupyter-based workflows, and the practical engineering side of ML. She writes about NumPy, Pandas, data serialisation, and the Python patterns that make data pipelines reliable at scale.