JSON to CSV in Python: DictWriter + pandas Examples

Maria Santos · Backend Developer · Reviewed by Priya Sharma · Published 2026-03-20

Almost every data pipeline hits the same step: an API returns JSON, but the next consumer (a spreadsheet, an import script, a Redshift COPY command) wants CSV. Converting JSON to CSV in Python looks simple until you meet nested objects, inconsistent keys, or datetime values that need special handling. Python gives you two solid routes: the built-in json + csv modules for zero-dependency scripts, and pandas for nested flattening and larger datasets. For quick one-off conversions without any code, there is also the online JSON to CSV converter. This guide covers both approaches end to end, with runnable Python 3.8+ examples.

✓ csv.DictWriter converts a list of dicts to CSV with zero dependencies: parse with json.load(), then call writeheader() + writerows().
✓ Always open CSV files with newline="" to avoid blank rows between data rows on Windows.
✓ pd.json_normalize() flattens nested JSON into a flat DataFrame before you call to_csv(); it handles multi-level nested objects automatically.
✓ Pass index=False to DataFrame.to_csv(); without it pandas writes an unwanted row-number column.
✓ For files larger than about 100 MB, combine ijson streaming JSON parsing with csv.DictWriter to keep memory usage constant.

What Is JSON to CSV Conversion?

JSON to CSV conversion turns an array of JSON objects into a tabular format where each object becomes a row and each key becomes a column header. JSON is hierarchical: objects can nest arbitrarily deep. CSV is flat: every value sits in a row-column grid. The conversion works cleanly when every object shares the same top-level keys. Nested objects, arrays, and inconsistent keys are where things get interesting. The raw data stays the same; only the structure changes.

Before · json

[{"order_id":"ord_91a3","total":149.99,"status":"shipped"},
 {"order_id":"ord_b7f2","total":34.50,"status":"pending"}]

After · csv

order_id,total,status
ord_91a3,149.99,shipped
ord_b7f2,34.50,pending

csv.DictWriter: Convert JSON to CSV Without Pandas

The csv module ships with every Python installation. No pip install, no virtual environment hassle. csv.DictWriter takes a list of dictionaries and writes each one as a CSV row, mapping dict keys to column headers. The fieldnames parameter controls both the column order and which keys are included.

Python 3.8+ — minimal json to csv example

import json
import csv

# Sample JSON data: an array of order objects
json_string = """
[
  {"order_id": "ord_91a3", "product": "Wireless Keyboard", "quantity": 2, "unit_price": 74.99},
  {"order_id": "ord_b7f2", "product": "USB-C Hub", "quantity": 1, "unit_price": 34.50},
  {"order_id": "ord_c4e8", "product": "Monitor Stand", "quantity": 3, "unit_price": 29.95}
]
"""

records = json.loads(json_string)

with open("orders.csv", "w", newline="", encoding="utf-8") as csvfile:
    writer = csv.DictWriter(csvfile, fieldnames=records[0].keys())
    writer.writeheader()
    writer.writerows(records)

# orders.csv:
# order_id,product,quantity,unit_price
# ord_91a3,Wireless Keyboard,2,74.99
# ord_b7f2,USB-C Hub,1,34.5
# ord_c4e8,Monitor Stand,3,29.95

The newline="" argument to open() is not optional on Windows. Without it, text mode turns each \r\n written by the csv module into \r\r\n, which Excel shows as a blank row between every data row. On macOS and Linux the argument is harmless, so just always include it.

The code above uses json.loads() because the input is a string. When reading from a file handle, use json.load() (without the trailing s). This trips people up constantly: one reads a string, the other reads a file object.

Python 3.8+ — read JSON file, write CSV file

import json
import csv

with open("server_metrics.json", encoding="utf-8") as jf:
    metrics = json.load(jf)  # json.load() for file objects

# Explicit fieldnames control the column order
columns = ["timestamp", "hostname", "cpu_percent", "memory_mb", "disk_io_ops"]

with open("server_metrics.csv", "w", newline="", encoding="utf-8") as cf:
    writer = csv.DictWriter(cf, fieldnames=columns, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(metrics)

# Only the five listed columns appear, in that order

Setting extrasaction="ignore" silently drops any keys in the dicts that are not in your fieldnames list. The default is "raise", which throws a ValueError when a dict contains an unexpected key. Pick whichever suits your use case.
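To make the difference concrete, here is a minimal sketch that writes the same records twice, once with the default and once with extrasaction="ignore" plus a restval placeholder. The records, the column list, and the to_csv_text() helper are made up for illustration; only the csv.DictWriter arguments come from the standard library.

```python
import csv
import io

# Hypothetical records: the second one has an extra key and lacks "region".
records = [
    {"host": "web-1", "region": "us-west", "cpu": 72.3},
    {"host": "web-2", "cpu": 45.1, "debug_flag": True},
]
columns = ["host", "region", "cpu"]


def to_csv_text(rows, **writer_options):
    """Write rows to an in-memory buffer and return the CSV text."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=columns, **writer_options)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Default extrasaction="raise": the unexpected "debug_flag" key is an error.
try:
    to_csv_text(records)
    strict_failed = False
except ValueError:
    strict_failed = True

# extrasaction="ignore" drops the extra key; restval fills the missing one.
lenient_text = to_csv_text(records, extrasaction="ignore", restval="N/A")
print(lenient_text)
```

In the lenient run the second row comes out as web-2,N/A,45.1, so a missing value stays visible instead of becoming an empty cell.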

Note: csv.DictWriter vs. csv.writer: DictWriter maps dict keys to column positions automatically. csv.writer writes raw lists as rows, so you handle column ordering yourself. For JSON-to-CSV, DictWriter is almost always the right choice because JSON records are already dictionaries.

Python's csv module has three named dialects: excel (comma delimiter, CRLF line endings; the default), excel-tab (tab delimiter, CRLF endings), and unix (LF line endings, quotes all fields). Pass the dialect name to csv.DictWriter as the dialect argument. When your target system has unusual quoting or delimiter rules, you can also define a custom dialect with csv.register_dialect(). The excel dialect is right for most JSON-to-CSV workflows, but switch to unix when writing files that POSIX tools such as awk or sort will process.
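The dialect differences are easiest to see side by side. The sketch below renders one record with the excel and unix dialects and with a custom one; the dialect name "pipe_lf" and the sample record are assumptions for illustration, not part of the csv module.

```python
import csv
import io

rows = [{"order_id": "ord_91a3", "product": "USB-C Hub", "quantity": 1}]
fields = ["order_id", "product", "quantity"]


def render(dialect_name):
    """Return the CSV text produced by the given named dialect."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fields, dialect=dialect_name)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# "pipe_lf" is a hypothetical custom dialect: pipe delimiter, LF line endings.
csv.register_dialect(
    "pipe_lf", delimiter="|", lineterminator="\n", quoting=csv.QUOTE_MINIMAL
)

excel_text = render("excel")    # comma-separated, \r\n endings
unix_text = render("unix")      # every field quoted, \n endings
pipe_text = render("pipe_lf")   # pipe-separated, \n endings

print(repr(excel_text))
print(repr(unix_text))
print(repr(pipe_text))
```

Printing with repr() makes the line terminators visible, which is usually the part that matters to downstream tools.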

Handling Non-Standard Types: datetime, UUID, and Decimal

JSON from APIs usually carries dates as ISO strings, UUIDs as hyphenated strings, and monetary values as floats. If you parse these into Python objects for processing before writing CSV, you have to turn them back into strings. The csv module calls str() on every non-string value, so most types just work. But str() on a datetime gives a space-separated representation rather than strict ISO 8601, and Decimal values need explicit formatting to avoid scientific notation.

Python 3.8+ — pre-process datetime and Decimal before CSV write

import json
import csv
from datetime import datetime, timezone
from decimal import Decimal
from uuid import UUID

# Simulating a parsed API response that holds Python types
transactions = [
    {
        "txn_id": UUID("a1b2c3d4-e5f6-7890-abcd-ef1234567890"),
        "created_at": datetime(2026, 3, 15, 9, 30, 0, tzinfo=timezone.utc),
        "amount": Decimal("1249.99"),
        "currency": "USD",
        "merchant": "CloudHost Inc.",
    },
    {
        "txn_id": UUID("b2c3d4e5-f6a7-8901-bcde-f12345678901"),
        "created_at": datetime(2026, 3, 15, 14, 12, 0, tzinfo=timezone.utc),
        "amount": Decimal("87.50"),
        "currency": "EUR",
        "merchant": "DataSync GmbH",
    },
]

def prepare_row(record: dict) -> dict:
    """Convert non-string types to CSV-friendly strings."""
    return {
        "txn_id": str(record["txn_id"]),
        "created_at": record["created_at"].isoformat(),
        "amount": f"{record['amount']:.2f}",
        "currency": record["currency"],
        "merchant": record["merchant"],
    }

with open("transactions.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["txn_id", "created_at", "amount", "currency", "merchant"])
    writer.writeheader()
    for txn in transactions:
        writer.writerow(prepare_row(txn))

# transactions.csv:
# txn_id,created_at,amount,currency,merchant
# a1b2c3d4-e5f6-7890-abcd-ef1234567890,2026-03-15T09:30:00+00:00,1249.99,USD,CloudHost Inc.
# b2c3d4e5-f6a7-8901-bcde-f12345678901,2026-03-15T14:12:00+00:00,87.50,EUR,DataSync GmbH

The prepare_row() function is the right approach here. Instead of trying to teach csv.DictWriter about custom types, you normalize each record to strings before writing. Calling .isoformat() explicitly is better than relying on str() for datetime objects: the output format is more predictable, and downstream parsers handle ISO 8601 reliably.

Warning: If you pass Decimal values through without formatting, very small or very large numbers can render in scientific notation (for example 1.5E+7). When writing financial data to CSV, always format Decimal with an explicit f-string such as f"{value:.2f}".
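A short sketch of the behaviour behind this warning, using two made-up values: str() keeps whatever exponent the Decimal carries, while an explicit format spec always produces fixed-point text.

```python
from decimal import Decimal

# Hypothetical amounts chosen to trigger scientific notation.
large = Decimal("1.5E+7")
tiny = Decimal("0.0000001")

# What csv.DictWriter would write if the values were passed through as-is.
large_raw = str(large)   # '1.5E+7'
tiny_raw = str(tiny)     # '1E-7'

# Explicit fixed-point formatting avoids the exponent.
large_fixed = f"{large:.2f}"   # '15000000.00'
tiny_fixed = f"{tiny:.7f}"     # '0.0000001'

print(large_raw, tiny_raw, large_fixed, tiny_fixed)
```

Pick the number of decimal places per column; two places suits currency amounts, but it would round the tiny value above to 0.00.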

An alternative pattern for pipelines with many custom types is to extend json.JSONEncoder. Subclass it, override the default() method to return a JSON-serializable value for each custom type, then pass the subclass as the cls argument to json.dumps(). Re-encoding through the custom encoder (and parsing the result back with json.loads()) before writing CSV normalizes all types in one step, without a per-row prepare_row() call. The prepare_row() pattern shown above is simpler for one-off scripts; the JSONEncoder subclass approach scales better when the same domain model with custom types is shared across several pipeline stages or microservices.
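The encoder-based pattern could look roughly like the sketch below. The class name CsvFriendlyEncoder and the sample transaction are illustrative assumptions; the round trip through json.dumps() and json.loads() is one way to apply the encoder to every record at once, not the only one.

```python
import csv
import io
import json
from datetime import datetime, timezone
from decimal import Decimal
from uuid import UUID


class CsvFriendlyEncoder(json.JSONEncoder):
    """Hypothetical encoder that turns custom types into plain strings."""

    def default(self, obj):
        if isinstance(obj, datetime):
            return obj.isoformat()
        if isinstance(obj, Decimal):
            return f"{obj:.2f}"
        if isinstance(obj, UUID):
            return str(obj)
        return super().default(obj)  # raises TypeError for unknown types


transactions = [
    {
        "txn_id": UUID("a1b2c3d4-e5f6-7890-abcd-ef1234567890"),
        "created_at": datetime(2026, 3, 15, 9, 30, 0, tzinfo=timezone.utc),
        "amount": Decimal("1249.99"),
        "currency": "USD",
    },
]

# Encode with the custom encoder, then parse back to plain dicts of strings.
normalized = json.loads(json.dumps(transactions, cls=CsvFriendlyEncoder))

buffer = io.StringIO(newline="")
writer = csv.DictWriter(buffer, fieldnames=list(normalized[0]))
writer.writeheader()
writer.writerows(normalized)
csv_text = buffer.getvalue()
print(csv_text)
```

The round trip serializes the whole list in memory, so for very large inputs the per-row prepare_row() approach is likely the cheaper option.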

csv.DictWriter Parameter Reference

The full constructor signature is csv.DictWriter(f, fieldnames, restval="", extrasaction="raise", dialect="excel", **fmtparams). Most of these have sensible defaults. The ones you will actually change are fieldnames, delimiter, and extrasaction.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| f | file object | (required) | Any object with a write() method, usually from open() |
| fieldnames | sequence | (required) | List of keys that sets the column order in the CSV output |
| restval | str | "" | Value written when a dict is missing a key from fieldnames |
| extrasaction | str | "raise" | "raise" throws ValueError on extra keys; "ignore" silently drops them |
| dialect | str / Dialect | "excel" | Predefined formatting rules: "excel", "excel-tab", or "unix" |
| delimiter | str | "," | Single character that separates fields; use "\t" for TSV output |
| quotechar | str | '"' | Character used to quote fields that contain the delimiter |
| quoting | int | csv.QUOTE_MINIMAL | When quoting is applied: MINIMAL, ALL, NONNUMERIC, NONE |
| lineterminator | str | "\r\n" | String appended after each row; override with "\n" for Unix-style output |
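As a rough illustration of the formatting parameters in this reference, the sketch below overrides delimiter, quoting, and lineterminator in one call. The sample row is made up; the keyword arguments are the standard csv formatting parameters.

```python
import csv
import io

# Hypothetical row with a comma inside a text field and one numeric field.
rows = [{"host": "web-1", "note": "disk, almost full", "cpu": 72.3}]

buffer = io.StringIO(newline="")
writer = csv.DictWriter(
    buffer,
    fieldnames=["host", "note", "cpu"],
    delimiter="\t",                  # tab-separated output (TSV)
    quoting=csv.QUOTE_NONNUMERIC,    # quote strings, leave numbers bare
    lineterminator="\n",             # Unix-style line endings
)
writer.writeheader()
writer.writerows(rows)

tsv_text = buffer.getvalue()
print(repr(tsv_text))
```

With QUOTE_NONNUMERIC the header cells are quoted as well, because they are strings.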

pandas: Convert JSON to CSV with DataFrames

If you already work in a pandas-heavy codebase, or your JSON has nested objects that need flattening, the pandas approach takes considerably less code than the stdlib version. The tradeoff: pandas is a dependency of roughly 30 MB. For a throwaway script that is fine. For a Docker image you ship to production, the stdlib approach keeps things light.

Python 3.8+ — pandas read_json then to_csv

import pandas as pd

# Read the JSON array straight into a DataFrame
df = pd.read_json("warehouse_inventory.json")

# Write to CSV; index=False suppresses the auto-generated row numbers
df.to_csv("warehouse_inventory.csv", index=False)

# That's it. Two lines. pandas infers the column types automatically.

The index=False flag is one of those things you look up every time. Without it, pandas writes a 0, 1, 2, ... column as the first column of your CSV. Nobody wants that.

Flattening Nested JSON with json_normalize

Real API responses are rarely flat. Orders contain shipping addresses, users have nested preferences, telemetry events carry nested metadata. pd.json_normalize() traverses nested dictionaries and flattens them into columns with dot-separated names.

Python 3.8+ — flatten nested JSON using json_normalize

import json
import pandas as pd

api_response = """
[
  {
    "order_id": "ord_91a3",
    "placed_at": "2026-03-15T09:30:00Z",
    "customer": {
      "name": "Sarah Chen",
      "email": "s.chen@example.com",
      "tier": "premium"
    },
    "shipping": {
      "method": "express",
      "address": {
        "city": "Portland",
        "state": "OR",
        "zip": "97201"
      }
    },
    "total": 299.95
  },
  {
    "order_id": "ord_b7f2",
    "placed_at": "2026-03-15T14:12:00Z",
    "customer": {
      "name": "James Park",
      "email": "j.park@example.com",
      "tier": "standard"
    },
    "shipping": {
      "method": "standard",
      "address": {
        "city": "Austin",
        "state": "TX",
        "zip": "73301"
      }
    },
    "total": 87.50
  }
]
"""

orders = json.loads(api_response)

# json_normalize flattens nested dicts; sep controls the joining delimiter
df = pd.json_normalize(orders, sep="_")
df.to_csv("flat_orders.csv", index=False)

# Resulting columns:
# order_id, placed_at, customer_name, customer_email, customer_tier,
# shipping_method, shipping_address_city, shipping_address_state,
# shipping_address_zip, total

The sep="_" parameter controls how nested key names are joined. The default is ".", which produces columns like customer.name. Underscores are the better choice because dots in column names cause trouble in SQL imports and in some spreadsheet formulas.

For API responses that wrap the records array under a nested key, use the record_path parameter. If the response looks like {"data": {"orders": [...]}}, pass record_path=["data", "orders"] to navigate to the right list. The optional meta parameter lets you pull parent-level fields in alongside the nested records, which is useful when the response has top-level pagination info (page number, total count) that you want as a column in every row. Together, record_path and meta handle most real-world nested API response shapes without custom preprocessing.
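A small sketch of record_path and meta working together, assuming a made-up response shape with pagination fields at the top level and the records under data.orders:

```python
import pandas as pd

# Hypothetical API response: pagination info on top, records nested below.
response = {
    "page": 1,
    "total_count": 2,
    "data": {
        "orders": [
            {"order_id": "ord_91a3", "total": 299.95},
            {"order_id": "ord_b7f2", "total": 87.50},
        ]
    },
}

df = pd.json_normalize(
    response,
    record_path=["data", "orders"],   # where the list of records lives
    meta=["page", "total_count"],     # parent fields repeated on every row
)

columns = list(df.columns)
csv_text = df.to_csv(index=False)
print(csv_text)
```

Each order becomes one row, and the page and total_count values are copied onto every row as extra columns.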

DataFrame.to_csv() Parameter Reference

DataFrame.to_csv() has more than 20 parameters. These are the ones that matter for JSON-to-CSV workflows.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| path_or_buf | str / Path / None | None | File path or buffer; None returns the CSV as a string |
| sep | str | "," | Field delimiter; use "\t" for TSV |
| index | bool | True | Write the row index as the first column; almost always set this to False |
| columns | list | None | Subset and order of columns in the output |
| header | bool / list | True | Write the column names; set to False when appending to an existing file |
| encoding | str | "utf-8" | Output encoding; use "utf-8-sig" for Excel compatibility on Windows |
| na_rep | str | "" | String representation for missing values (NaN, None) |
| quoting | int | csv.QUOTE_MINIMAL | Controls how fields are quoted |

Python 3.8+ — to_csv with common parameter overrides

import pandas as pd

df = pd.read_json("telemetry_events.json")

# TSV output with explicit encoding and missing-value handling
df.to_csv(
    "telemetry_events.tsv",
    sep="\t",
    index=False,
    encoding="utf-8",
    na_rep="NULL",
    columns=["event_id", "timestamp", "source", "severity", "message"],
)

# Write to stdout for piping in shell scripts
print(df.to_csv(index=False))

# Return as a string (no file is written)
csv_string = df.to_csv(index=False)
print(len(csv_string), "characters")

Convert JSON to CSV from a File and an API Response

The two most common real-world scenarios: reading JSON from a file on disk and converting it, or fetching JSON from an HTTP API and saving the result as CSV. In development you can get by without error handling. In production, that same shortcut becomes a 2 a.m. alert. Files can be missing, APIs can return 4xx or 5xx status codes instead of JSON, the response body can be an error object rather than an array, or the JSON can be truncated by a network timeout. The patterns below handle these cases explicitly, log errors to stderr, and return the row count so callers can detect zero-row outputs and alert accordingly.

File on Disk: Read, Convert, Save

Python 3.8+ — convert JSON file to CSV with error handling

import json
import csv
import sys

def json_file_to_csv(input_path: str, output_path: str) -> int:
    """Convert a JSON file containing an array of objects to CSV.
    Returns the number of rows written.
    """
    try:
        with open(input_path, encoding="utf-8") as jf:
            data = json.load(jf)
    except FileNotFoundError:
        print(f"Error: {input_path} not found", file=sys.stderr)
        return 0
    except json.JSONDecodeError as exc:
        print(f"Error: invalid JSON in {input_path}: {exc.msg} at line {exc.lineno}", file=sys.stderr)
        return 0

    if not isinstance(data, list) or not data:
        print(f"Error: expected a non-empty JSON array in {input_path}", file=sys.stderr)
        return 0

    # Collect every unique key across all records; handles inconsistent schemas
    all_keys: list[str] = []
    seen: set[str] = set()
    for record in data:
        for key in record:
            if key not in seen:
                all_keys.append(key)
                seen.add(key)

    with open(output_path, "w", newline="", encoding="utf-8") as cf:
        writer = csv.DictWriter(cf, fieldnames=all_keys, restval="", extrasaction="ignore")
        writer.writeheader()
        writer.writerows(data)

    return len(data)

rows = json_file_to_csv("deploy_logs.json", "deploy_logs.csv")
print(f"Wrote {rows} rows to deploy_logs.csv")

HTTP API Response: Fetch and Convert

Python 3.8+ — fetch JSON from API and save as CSV

import json
import csv
import sys
import urllib.request
import urllib.error

def api_response_to_csv(url: str, output_path: str) -> int:
    """Fetch JSON from a REST API endpoint and write it as CSV."""
    try:
        req = urllib.request.Request(url, headers={"Accept": "application/json"})
        with urllib.request.urlopen(req, timeout=30) as resp:
            if resp.status != 200:
                print(f"Error: API returned status {resp.status}", file=sys.stderr)
                return 0
            body = resp.read().decode("utf-8")
    except urllib.error.URLError as exc:
        # HTTPError (raised for 4xx/5xx) is a subclass of URLError, so it lands here too
        print(f"Error: could not reach {url}: {exc.reason}", file=sys.stderr)
        return 0

    try:
        records = json.loads(body)
    except json.JSONDecodeError as exc:
        print(f"Error: API returned invalid JSON: {exc.msg}", file=sys.stderr)
        return 0

    if not isinstance(records, list) or not records:
        print("Error: expected a non-empty JSON array from the API", file=sys.stderr)
        return 0

    with open(output_path, "w", newline="", encoding="utf-8") as cf:
        writer = csv.DictWriter(cf, fieldnames=records[0].keys())
        writer.writeheader()
        writer.writerows(records)

    return len(records)

rows = api_response_to_csv(
    "https://api.internal.example.com/v2/deployments?status=completed",
    "completed_deployments.csv",
)
print(f"Exported {rows} deployments to CSV")

Note: The example above uses urllib from the standard library to keep the script dependency-free. If you have requests installed, replace the urllib section with resp = requests.get(url, timeout=30); records = resp.json(); the rest of the CSV writing code stays the same.

Command-Line JSON to CSV Conversion

Sometimes all you need is a one-liner in the terminal. Python's -c flag lets you run a quick conversion without creating a script file. For more complex transformations, reshape the data with jq first, then convert.

bash — one-liner json to csv conversion

# Python one-liner: reads JSON from stdin, writes CSV to stdout
cat orders.json | python3 -c "
import json, csv, sys
data = json.load(sys.stdin)
w = csv.DictWriter(sys.stdout, fieldnames=data[0].keys())
w.writeheader()
w.writerows(data)
"

# Save the output to a file
cat orders.json | python3 -c "
import json, csv, sys
data = json.load(sys.stdin)
w = csv.DictWriter(sys.stdout, fieldnames=data[0].keys())
w.writeheader()
w.writerows(data)
" > orders.csv

bash — self-contained CLI script with argparse

# Save as json2csv.sh and run: bash json2csv.sh input.json -o output.csv
python3 -c "
import json, csv, argparse, sys

parser = argparse.ArgumentParser(description='Convert JSON array to CSV')
parser.add_argument('input', help='Path to JSON file')
parser.add_argument('-o', '--output', default=None, help='Output CSV path (default: stdout)')
parser.add_argument('-d', '--delimiter', default=',', help='CSV delimiter')
args = parser.parse_args()

with open(args.input) as f:
    data = json.load(f)

out = open(args.output, 'w', newline='') if args.output else sys.stdout
writer = csv.DictWriter(out, fieldnames=data[0].keys(), delimiter=args.delimiter)
writer.writeheader()
writer.writerows(data)
if args.output:
    out.close()
    print(f'Wrote {len(data)} rows to {args.output}', file=sys.stderr)
" "$@"

bash — using jq + csvkit for complex transformations

# Install csvkit: pip install csvkit

# jq flattens and selects fields, in2csv handles the CSV formatting
cat api_response.json | jq '[.[] | {id: .order_id, customer: .customer.name, total}]' | in2csv -f json > orders.csv

# Miller (mlr) is another option for JSON-to-CSV
mlr --json2csv cat orders.json > orders.csv

Miller (mlr) is a standalone binary that treats JSON, CSV, and TSV as first-class formats without any Python runtime. The --json2csv flag converts JSON input to CSV in a single pass, and you can chain Miller verbs in the same command to filter, sort, or rename columns before the output is written. Install it with Homebrew on macOS (brew install miller) or from your Linux package manager. It is especially useful in CI pipelines where you want fast JSON-to-CSV conversion without spinning up a Python environment.

High-Performance Alternative — pandas with pyarrow

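For datasets in the tens-of-millions-of-rows range, pandas can parse input considerably faster with the pyarrow engine than with its default parser. The native Arrow engine processes columnar data more efficiently than Python's row-by-row csv module. The API stays the same apart from the engine parameter, with two caveats: read_json accepts engine="pyarrow" only in pandas 2.0 or later, and only together with lines=True, so the input must be newline-delimited JSON.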

bash — install pyarrow

pip install pyarrow

Python 3.8+ — pandas with pyarrow for faster CSV writing

import pandas as pd

# engine="pyarrow" needs pandas 2.0+ and only works with line-delimited JSON (lines=True)
df = pd.read_json("sensor_readings.jsonl", lines=True, engine="pyarrow")

# to_csv has no engine parameter; the speedup here comes from faster parsing on read
df.to_csv("sensor_readings.csv", index=False)

# For really large exports, consider writing Parquet instead of CSV:
# a compressed binary format that is usually much smaller and preserves column types
df.to_parquet("sensor_readings.parquet", engine="pyarrow")

If you are processing more than a few hundred MB of JSON and the final consumer accepts Parquet, skip CSV entirely. Parquet is smaller, preserves column types, and both Redshift and BigQuery load it natively. CSV is a lossy format: every value becomes a string.

Terminal Output with Syntax Highlighting

The rich library renders tables in the terminal with borders, alignment, and color, which is handy for previewing a conversion during development without opening the output file.

bash — install rich

pip install rich

Python 3.8+ — preview CSV output in terminal with rich

import json
from rich.console import Console
from rich.table import Table

json_string = """
[
  {"hostname": "web-prod-1", "cpu_percent": 72.3, "memory_mb": 3840, "uptime_hours": 720},
  {"hostname": "web-prod-2", "cpu_percent": 45.1, "memory_mb": 2560, "uptime_hours": 168},
  {"hostname": "db-replica-1", "cpu_percent": 91.7, "memory_mb": 7680, "uptime_hours": 2160}
]
"""

records = json.loads(json_string)
console = Console()

table = Table(title="Server Metrics Preview", show_lines=True)
for key in records[0]:
    table.add_column(key, style="cyan" if key == "hostname" else "white")

for row in records:
    table.add_row(*[str(v) for v in row.values()])

console.print(table)
# Renders a color-highlighted table with borders in the terminal

Warning: Rich is for terminal display only. Do not use it to generate CSV files; it adds ANSI escape codes that will corrupt the output. Write files with csv.DictWriter or DataFrame.to_csv(), and use rich only for previewing.

Working with Large JSON Files

json.load() reads the entire file into memory. For a 200 MB JSON file that means about 200 MB of raw text plus Python object overhead, easily 500 MB+ of heap usage. For files larger than 100 MB, stream the input with ijson and write CSV rows as you go.

bash — install ijson

pip install ijson

Streaming a JSON Array to CSV with ijson

Python 3.8+ — stream large JSON array to CSV with constant memory

import ijson
import csv

def stream_json_to_csv(json_path: str, csv_path: str) -> int:
    """Convert a large JSON array to CSV without loading it all into memory."""
    with open(json_path, "rb") as jf, open(csv_path, "w", newline="", encoding="utf-8") as cf:
        # ijson.items yields each element of the top-level array one at a time
        records = ijson.items(jf, "item")

        first_record = next(records, None)
        if first_record is None:
            return 0  # empty array: nothing to write

        # Columns come from the first record; later records must use the same keys
        fieldnames = list(first_record.keys())

        writer = csv.DictWriter(cf, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerow(first_record)

        count = 1
        for record in records:
            writer.writerow(record)
            count += 1

    return count

rows = stream_json_to_csv("clickstream_2026_03.json", "clickstream_2026_03.csv")
print(f"Streamed {rows} records to CSV")

NDJSON / JSON Lines: One Object per Line

NDJSON (Newline-Delimited JSON), also called JSON Lines or .jsonl, stores one valid JSON object per line with no wrapping array. The format is common in log pipelines, event streams (Kafka, Kinesis), and bulk exports from services such as Elasticsearch and BigQuery. Because each line is a self-contained JSON object, you can process an NDJSON file with a plain Python for loop over the file handle; no ijson library is needed. Memory stays constant regardless of file size, which makes this the simplest streaming approach when the source data is already in JSON Lines format.

Python 3.8+ — convert NDJSON to CSV line by line

import json
import csv

def ndjson_to_csv(ndjson_path: str, csv_path: str) -> int:
    """Convert a newline-delimited JSON file to CSV one line at a time."""
    with open(ndjson_path, encoding="utf-8") as nf:
        first_line = nf.readline()
        first_record = json.loads(first_line)
        fieldnames = list(first_record.keys())

        with open(csv_path, "w", newline="", encoding="utf-8") as cf:
            writer = csv.DictWriter(cf, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerow(first_record)

            count = 1
            for line in nf:
                line = line.strip()
                if not line:
                    continue
                try:
                    record = json.loads(line)
                    writer.writerow(record)
                    count += 1
                except json.JSONDecodeError:
                    continue  # skip malformed lines

    return count

rows = ndjson_to_csv("access_log.ndjson", "access_log.csv")
print(f"Converted {rows} log entries to CSV")

Note: Switch to streaming once a JSON file exceeds 100 MB. Loading a 1 GB JSON array with json.load() can consume 3 to 5 GB of RAM because of Python object overhead. With ijson, memory stays flat regardless of file size. If you just need a quick conversion of a small file, paste it into the JSON to CSV converter.

Common Mistakes

❌ Missing newline='' in open(): blank rows on Windows

Problem: The csv module writes \r\n line endings. Without newline='', Python's text mode on Windows translates the \n into \r\n again, producing \r\r\n and double-spaced output.

Fix: Always pass newline='' when opening a file for CSV writing. It is harmless on macOS/Linux.

Before · Python

with open("output.csv", "w") as f:
    writer = csv.DictWriter(f, fieldnames=columns)
    writer.writeheader()
    writer.writerows(data)
# Blank rows between every data row on Windows

After · Python

with open("output.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=columns)
    writer.writeheader()
    writer.writerows(data)
# Clean output on all platforms

❌ Forgetting index=False in pandas to_csv()

Problem: Without index=False, pandas prepends an auto-incrementing row-number column (0, 1, 2, ...) that was never in the original JSON.

Fix: Pass index=False to to_csv(). If you really do want the index column, give it an explicit name with df.index.name = 'row_num'.

Before · Python

df = pd.read_json("events.json")
df.to_csv("events.csv")
# The CSV gets an extra unnamed column: ,event_id,timestamp,...
# The leading comma (empty header name) trips up some CSV importers

After · Python

df = pd.read_json("events.json")
df.to_csv("events.csv", index=False)
# Clean CSV: event_id,timestamp,...

❌ Using records[0].keys() for records with inconsistent keys

Problem: If the JSON objects have different keys (some records carry optional fields), using the first record's keys as fieldnames means a key that appears only in a later record is not in fieldnames. With the default extrasaction="raise" that triggers a ValueError; with extrasaction="ignore" the column is silently dropped.

Fix: Collect every unique key across all records before creating the DictWriter.

Before · Python

records = json.load(f)
writer = csv.DictWriter(out, fieldnames=records[0].keys())
# Misses the "discount" field that exists only in records[2]

After · Python

records = json.load(f)
all_keys = list(dict.fromkeys(k for r in records for k in r))
writer = csv.DictWriter(out, fieldnames=all_keys, restval="")
# Every key from every record is included as a column

❌ Writing nested dicts straight to CSV without flattening

Problem: csv.DictWriter calls str() on nested dicts, which produces columns holding values like "{'city': 'Portland'}": the raw Python repr, not usable data.

Fix: Flatten nested objects first, with pd.json_normalize() or a custom flattening function.

Before · Python

records = [{"id": "evt_1", "meta": {"source": "web", "region": "us-west"}}]
writer = csv.DictWriter(f, fieldnames=["id", "meta"])
writer.writerows(records)
# The meta column contains: {'source': 'web', 'region': 'us-west'}

After · Python

import pandas as pd
records = [{"id": "evt_1", "meta": {"source": "web", "region": "us-west"}}]
df = pd.json_normalize(records, sep="_")
df.to_csv("events.csv", index=False)
# Columns: id, meta_source, meta_region

csv.DictWriter vs. pandas: Quick Comparison

| Method | Nested JSON | Custom types | Streaming | Dependencies | Install required |
| --- | --- | --- | --- | --- | --- |
| csv.DictWriter | ✗ (manual flatten) | ✗ | ✓ (row by row) | None | No (stdlib) |
| csv.writer | ✗ | ✗ | ✓ (row by row) | None | No (stdlib) |
| pd.DataFrame.to_csv() | ✗ (flat only) | ✓ (via dtypes) | ✗ | pandas + numpy | pip install |
| pd.json_normalize() + to_csv() | ✓ | ✓ (via dtypes) | ✗ | pandas + numpy | pip install |
| csv.writer + flatten_json | ✓ | ✗ | ✓ | flatten_json | pip install |
| jq + csvkit (CLI) | ✓ (via jq) | N/A | ✓ | jq, csvkit | System install |

Use csv.DictWriter when you need zero dependencies, your JSON is flat, and the script runs in a restricted environment (CI containers, Lambda functions, embedded Python). Use pd.json_normalize() + to_csv() when the JSON is nested, you need to transform or filter the data before export, or you are already in a pandas workflow. For files that do not fit in memory, combine ijson with csv.DictWriter for constant-memory streaming.

For quick, no-code conversions, the JSON to CSV converter on ToolDeck handles this without any Python setup.

Frequently Asked Questions

How do I convert JSON to CSV in Python without pandas?

Use the built-in json and csv modules. Call json.load() to parse the JSON file into a list of dicts, take the fieldnames from the first dict's keys, create a csv.DictWriter, then call writeheader() followed by writerows(). This approach has no external dependencies and works in any Python 3.x environment. For small files it also runs faster than pandas because there is no DataFrame allocation overhead. If your JSON objects have different keys across records, collect all unique keys with list(dict.fromkeys(k for r in records for k in r)) before passing them as fieldnames.

Python

import json
import csv

# Parse the JSON array into a list of dicts
with open("orders.json", encoding="utf-8") as f:
    records = json.load(f)

with open("orders.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=records[0].keys())
    writer.writeheader()
    writer.writerows(records)
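
When records do not all share the same keys, the header can be built from the union of keys, as the answer above suggests. The following is a minimal sketch under assumptions: the sample records are hypothetical, and an in-memory buffer stands in for a real output file. `restval` supplies the value for cells a record lacks.

```python
import csv
import io

# Hypothetical sample records with inconsistent keys
records = [
    {"order_id": "ord_1", "total": 10.0},
    {"order_id": "ord_2", "status": "shipped"},
]

# dict.fromkeys keeps first-seen order while dropping duplicate keys
fieldnames = list(dict.fromkeys(k for r in records for k in r))

# Stand-in for open("orders.csv", "w", newline="", encoding="utf-8")
buffer = io.StringIO(newline="")
writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
writer.writeheader()
writer.writerows(records)

csv_text = buffer.getvalue()
```

Missing values end up as empty cells, so every row has the same number of columns.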

How do I handle nested JSON when converting to CSV?

Flat JSON arrays map directly to CSV rows, but nested objects must be flattened first. With pandas, pd.json_normalize() does this automatically: it joins nested keys with a dot separator (for example "address.city"). Without pandas, write a recursive function that traverses the dict and joins keys with a delimiter. For deeply nested structures, json_normalize handles all levels in a single pass. The sep parameter controls the character placed between key segments; for SQL imports and spreadsheet formula compatibility, an underscore is usually a better choice than the default dot.

Python

import pandas as pd

nested_data = [
    {"id": "ord_91a3", "customer": {"name": "प्रिया शर्मा", "email": "priya.sharma@example.com"}},
]
df = pd.json_normalize(nested_data, sep="_")
# Columns: id, customer_name, customer_email
df.to_csv("flat_orders.csv", index=False)
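
For the pandas-free route, the recursive function described above could look roughly like this. It is a sketch, not a drop-in replacement for json_normalize: the `flatten` helper and its parameters are hypothetical names, and lists are left untouched as plain values.

```python
def flatten(record, parent_key="", sep="_"):
    """Recursively flatten nested dicts, joining key segments with `sep`."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Descend into nested objects, carrying the key prefix along
            flat.update(flatten(value, full_key, sep))
        else:
            # Scalars (and lists) are kept as-is
            flat[full_key] = value
    return flat


# Hypothetical sample record with two levels of nesting
nested = {
    "id": "ord_91a3",
    "customer": {"name": "Priya Sharma", "address": {"city": "Pune"}},
}
flat = flatten(nested)
```

The flattened dicts can then be passed to csv.DictWriter like any other flat records.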

Why does my CSV have blank rows between data rows on Windows?

The csv module writes \r\n line endings by default. On Windows, a file opened in text mode without newline="" translates each \n to \r\n on write, so the output becomes \r\r\n, which shows up as a blank row. The fix is to always pass newline="" to open(). This tells Python not to translate line endings and leaves them to the csv module. The pattern is safe on every operating system: it is harmless on macOS and Linux, and mandatory on Windows. The csv module section of the Python documentation explicitly describes this as the correct way to open files for writing CSV.

Python

import csv

# Wrong: produces blank rows on Windows
with open("output.csv", "w") as f:
    writer = csv.writer(f)

# Right: newline="" prevents the doubled \r
with open("output.csv", "w", newline="") as f:
    writer = csv.writer(f)
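
One way to confirm the line endings is to read the file back as raw bytes. This is a small sketch using a temporary directory; the file name and sample rows are hypothetical.

```python
import csv
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "output.csv")

# newline="" leaves line endings entirely to the csv module
with open(path, "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["host", "cpu"])
    writer.writerow(["web-1", 72.3])

# Read raw bytes to see the exact terminators written to disk
with open(path, "rb") as f:
    raw = f.read()
```

With newline="", each row ends in a single \r\n on every platform, and the \r\r\n sequence never appears.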

How do I append JSON records to an existing CSV file?

Open the file in append mode ("a") and create a DictWriter with the same fieldnames. Skip writeheader() because the header row already exists. With pandas, use to_csv(mode="a", header=False). Make sure the column order matches the existing file, otherwise data will land in the wrong columns. If you are unsure about the column order in the existing file, read its fieldnames attribute with csv.DictReader before creating the writer for the append.

Python

import csv

new_records = [
    {"order_id": "ord_f4c1", "total": 89.50, "status": "shipped"},
]

with open("orders.csv", "a", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["order_id", "total", "status"])
    writer.writerows(new_records)
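
When the column order of the existing file is unknown, it can be read from the header row first. The sketch below creates a small stand-in file in a temporary directory so that it is self-contained; the file contents are hypothetical.

```python
import csv
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "orders.csv")

# Create a small existing file (stand-in for a real orders.csv)
with open(path, "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["order_id", "status", "total"])
    writer.writeheader()
    writer.writerow({"order_id": "ord_a1", "status": "paid", "total": 12.0})

new_records = [{"order_id": "ord_f4c1", "total": 89.50, "status": "shipped"}]

# Read the header row to learn the existing column order
with open(path, newline="", encoding="utf-8") as f:
    existing_fields = csv.DictReader(f).fieldnames

# Append using that order; no writeheader() because the header exists
with open(path, "a", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=existing_fields)
    writer.writerows(new_records)

with open(path, newline="", encoding="utf-8") as f:
    rows = list(csv.reader(f))
```

Because DictWriter places values by key, the new record lands in the right columns even though its dict lists the keys in a different order.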

What is the fastest way to convert a large JSON file to CSV in Python?

For files under 500 MB, pd.read_json() followed by to_csv() is the fastest single-call approach, since pandas uses optimized C code internally. For files over 500 MB, stream the JSON records with ijson and write them to CSV row by row with csv.DictWriter. This keeps memory usage constant regardless of file size. For NDJSON files (one JSON object per line) you do not need ijson at all: a plain Python for loop over the file handle processes each line independently and achieves constant memory without any third-party library.

Python

# Fast path for files that fit in memory
import pandas as pd
df = pd.read_json("large_dataset.json")
df.to_csv("large_dataset.csv", index=False)

# Streaming for files that do not fit in memory
import ijson, csv
with open("huge.json", "rb") as jf, open("huge.csv", "w", newline="", encoding="utf-8") as cf:
    # "item" matches each element of a top-level JSON array
    records = ijson.items(jf, "item")
    # The first record supplies the header
    first = next(records)
    writer = csv.DictWriter(cf, fieldnames=first.keys())
    writer.writeheader()
    writer.writerow(first)
    for record in records:
        writer.writerow(record)
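
The NDJSON case mentioned above needs only the standard library. Here is a sketch in which in-memory buffers stand in for real file handles and the sample lines are hypothetical. It assumes the first record defines the columns.

```python
import csv
import io
import json

# Hypothetical NDJSON input; in practice this would be an open file handle
ndjson_source = io.StringIO(
    '{"host": "web-1", "cpu": 72.3}\n'
    '{"host": "web-2", "cpu": 45.1}\n'
)
out = io.StringIO(newline="")  # stand-in for open(..., "w", newline="")

writer = None
for line in ndjson_source:
    line = line.strip()
    if not line:
        continue  # skip blank lines
    record = json.loads(line)
    if writer is None:
        # Assumption: the first record defines the columns
        writer = csv.DictWriter(out, fieldnames=list(record.keys()))
        writer.writeheader()
    writer.writerow(record)

csv_text = out.getvalue()
```

Only one line is held in memory at a time, so the same loop works for inputs of any size.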

Can I write CSV output to stdout instead of a file in Python?

Yes. Pass sys.stdout as the file object to csv.writer() or csv.DictWriter(). This is useful for piping output in shell scripts or for quick debugging. With pandas, call to_csv(sys.stdout, index=False), or call to_csv(None) to get a string you can print. No temporary file is needed. When writing to stdout on Windows, call sys.stdout.reconfigure(newline="") first to avoid the double carriage-return problem, because stdout is opened in text mode by default.

Python

import csv
import sys
import json

data = json.loads('[{"host":"web-1","cpu":72.3},{"host":"web-2","cpu":45.1}]')
writer = csv.DictWriter(sys.stdout, fieldnames=data[0].keys())
writer.writeheader()
writer.writerows(data)
# host,cpu
# web-1,72.3
# web-2,45.1
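
The Windows note above can be folded into a small script. This is a sketch: `write_csv` and `main` are hypothetical helper names, and the hasattr guard is an assumption to cover environments where stdout has been replaced by an object without reconfigure().

```python
import csv
import sys


def write_csv(records, stream):
    """Write a list of flat dicts as CSV to any text stream."""
    writer = csv.DictWriter(stream, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)


def main():
    data = [{"host": "web-1", "cpu": 72.3}, {"host": "web-2", "cpu": 45.1}]
    # Leave line endings to the csv module (matters on Windows)
    if hasattr(sys.stdout, "reconfigure"):
        sys.stdout.reconfigure(newline="")
    write_csv(data, sys.stdout)


if __name__ == "__main__":
    main()
```

Keeping the writer logic in a function that accepts any stream also makes it easy to point the same code at a file or an in-memory buffer.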

Related Tools