Convert CSV to JSON with Python

Maria Santos · Backend Developer · Reviewed by Priya Sharma · Published 2026-03-26

CSV files are everywhere: exported reports, database dumps, log files. Sooner or later you will need to convert CSV to JSON with Python. The standard library handles this with two modules: csv.DictReader turns each row into a Python dict, and json.dumps() serializes those dicts to a JSON string. For a quick one-off conversion without writing code, the CSV to JSON converter does it instantly in the browser. This guide covers the whole programmatic route: json.dump() versus json.dumps(), writing JSON to a file, serializing dataclasses, coercing CSV data types, handling datetime and Decimal, and high-performance alternatives such as orjson. All examples target Python 3.10+.

✓ csv.DictReader yields one dict per row; collect the rows with list() and serialize the whole list with json.dump(rows, f, indent=2) to write a JSON file.
✓ json.dump() writes directly to a file object; json.dumps() returns a string. Pick the right one to avoid unnecessary copies.
✓ CSV values are always strings. Convert numeric columns explicitly (int(), float()) before serializing to JSON.
✓ Pass ensure_ascii=False to json.dumps() to keep Unicode characters (accented names, CJK text) in the output.
✓ For datetime, UUID or Decimal values derived from CSV, use the default= parameter with a custom fallback function.

Before · CSV

order_id,product,quantity,price
ORD-7291,Wireless Keyboard,2,49.99
ORD-7292,USB-C Hub,1,34.50

After · JSON

[
  {
    "order_id": "ORD-7291",
    "product": "Wireless Keyboard",
    "quantity": "2",
    "price": "49.99"
  },
  {
    "order_id": "ORD-7292",
    "product": "USB-C Hub",
    "quantity": "1",
    "price": "34.50"
  }
]

Note: quantity and price appear as JSON strings ("2", "49.99") in the raw output. CSV has no type system, so every value is a string. The fix is covered in the type coercion section below.

json.dumps(): Serialize a Python Dict to a JSON String

The json module ships with every Python installation, so no pip install is needed. json.dumps(obj) takes a Python object (dict, list, string, number, bool or None) and returns a str containing valid JSON. A Python dictionary looks like a JSON object, but the two are fundamentally different: a dict is an in-memory Python data structure, while a JSON string is serialized text. Calling json.dumps() bridges that gap.

Basic example: a single CSV row as JSON

Python 3.10+

import json

# A single CSV row represented as a Python dict
server_entry = {
    "hostname": "web-prod-03",
    "ip_address": "10.0.12.47",
    "port": 8080,
    "region": "eu-west-1"
}

# Convert the dict to a JSON string
json_string = json.dumps(server_entry)
print(json_string)
# {"hostname": "web-prod-03", "ip_address": "10.0.12.47", "port": 8080, "region": "eu-west-1"}
print(type(json_string))
# <class 'str'>

That output is compact, single-line JSON: good for payloads and storage, but hard to read. Add indent=2 for readable output:

Python 3.10+ — pretty-printed output

import json

server_entry = {
    "hostname": "web-prod-03",
    "ip_address": "10.0.12.47",
    "port": 8080,
    "region": "eu-west-1"
}

pretty_json = json.dumps(server_entry, indent=2)
print(pretty_json)
# {
#   "hostname": "web-prod-03",
#   "ip_address": "10.0.12.47",
#   "port": 8080,
#   "region": "eu-west-1"
# }

Two more parameters come up almost every time: sort_keys=True sorts dictionary keys alphabetically (very handy when diffing JSON files between versions), and ensure_ascii=False keeps non-ASCII characters instead of converting them to \uXXXX escape sequences.

Python 3.10+ — sort_keys and ensure_ascii

import json

warehouse_record = {
    "sku": "WH-9031",
    "location": "คลังสินค้ากรุงเทพ 3",
    "quantity": 240,
    "last_audit": "2026-03-10"
}

output = json.dumps(warehouse_record, indent=2, sort_keys=True, ensure_ascii=False)
print(output)
# {
#   "last_audit": "2026-03-10",
#   "location": "คลังสินค้ากรุงเทพ 3",
#   "quantity": 240,
#   "sku": "WH-9031"
# }

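A quick note on the separators parameter: when indent is None the default is (", ", ": "), which adds a space after each comma and colon (with indent set, the item separator defaults to "," so lines carry no trailing spaces). For the most compact output (useful when embedding JSON in a URL parameter or trimming an API response), pass separators=(",", ":").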
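To make the effect of separators concrete, here is a minimal sketch comparing the default output with the compact form. The record is a made-up sample used only for illustration.

```python
import json

# Hypothetical record, used only to compare the two outputs
record = {"sku": "WH-9031", "quantity": 240, "tags": ["fragile", "bulk"]}

default_output = json.dumps(record)
compact_output = json.dumps(record, separators=(",", ":"))

print(default_output)
# {"sku": "WH-9031", "quantity": 240, "tags": ["fragile", "bulk"]}
print(compact_output)
# {"sku":"WH-9031","quantity":240,"tags":["fragile","bulk"]}
```

The saving is one character per comma and colon, so it only becomes noticeable on large payloads.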

Note: A Python dict and a JSON object look almost identical when printed. The differences: json.dumps() converts Python True to JSON true, None to null, and wraps strings in double quotes (Python allows single quotes, JSON does not). Always use json.dumps() to produce valid JSON; do not rely on str() or repr().

From csv.DictReader to a JSON File: The Complete Pipeline

The most common real-world task is reading an entire CSV file and saving it as JSON. Here is the complete script in under 10 lines. csv.DictReader produces an iterator of dict objects, one per row, using the first line as the keys. Wrapping it in list() collects every row into a Python list, which serializes to a JSON array.

Python 3.10+ — full CSV to JSON conversion

import csv
import json

# Step 1: read the CSV rows into a list of dicts
with open("inventory.csv", "r", encoding="utf-8") as csv_file:
    rows = list(csv.DictReader(csv_file))

# Step 2: write the list out as a JSON file
with open("inventory.json", "w", encoding="utf-8") as json_file:
    json.dump(rows, json_file, indent=2, ensure_ascii=False)

print(f"Converted {len(rows)} rows to inventory.json")

Two open() calls: one to read the CSV, one to write the JSON. That is the whole pattern. Note the use of json.dump() (no s): it writes directly to the file handle. json.dumps() would return a string that you then have to write separately with f.write(). json.dump() uses less memory because it writes the output in chunks instead of building the whole string in memory first.
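The relationship between the two functions can be checked without touching the disk. In this small sketch, io.StringIO stands in for a real file and the row is a made-up sample; both calls produce the same text.

```python
import io
import json

rows = [{"sku": "WH-9031", "quantity": "240"}]  # hypothetical sample row

# json.dumps() returns the whole document as one str
as_string = json.dumps(rows, indent=2)

# json.dump() writes to any object with a .write() method;
# io.StringIO plays the role of an open file here
buffer = io.StringIO()
json.dump(rows, buffer, indent=2)

print(type(as_string).__name__)        # str
print(buffer.getvalue() == as_string)  # True
```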

When you need the JSON as a string rather than a file (to embed in an API payload, print to stdout, or insert into a database column), use json.dumps():

Python 3.10+ — CSV rows as JSON string

import csv
import json

with open("sensors.csv", "r", encoding="utf-8") as f:
    rows = list(csv.DictReader(f))

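# Get the JSON as a string instead of writing it to a file.
# ensure_ascii=False keeps non-ASCII text readable instead of \uXXXX escapes.
json_payload = json.dumps(rows, indent=2, ensure_ascii=False)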
print(json_payload)
# [
#   {
#     "sensor_id": "TMP-4401",
#     "location": "อาคาร 7 - ชั้น 2",
#     "reading": "22.4",
#     "unit": "celsius"
#   },
#   ...
# ]

Single row versus whole dataset: json.dumps(single_dict) gives a JSON object ({...}); json.dumps(list_of_dicts) gives a JSON array ([{...}, {...}]). The shape of the outer container depends on what you pass in. Most downstream consumers expect an array for tabular data.
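A short sketch of that difference, using a made-up row: the same dict serializes to an object on its own and to an array once it is wrapped in a list.

```python
import json

row = {"sensor_id": "TMP-4401", "reading": "22.4"}  # hypothetical sample row

as_object = json.dumps(row)    # outer container is {...}
as_array = json.dumps([row])   # outer container is [...]

print(as_object)
# {"sensor_id": "TMP-4401", "reading": "22.4"}
print(as_array)
# [{"sensor_id": "TMP-4401", "reading": "22.4"}]
```

If a consumer expects an array, wrapping a single row in a list keeps the output shape consistent with multi-row files.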

Handling Non-String Values: Coercing CSV Data Types

This is the trap everyone falls into the first time: csv.DictReader returns everything as a string. The number 42 in the CSV becomes the string "42" in the dict. If you serialize it directly with json.dumps(), the JSON contains "quantity": "42" instead of "quantity": 42. APIs that validate types will reject this, so you have to convert types explicitly.

Python 3.10+ — type coercion for CSV rows

import csv
import json

def coerce_types(row: dict) -> dict:
    """Convert string values to the appropriate Python types."""
    return {
        "sensor_id": row["sensor_id"],
        "location": row["location"],
        "temperature": float(row["temperature"]),
        "humidity": float(row["humidity"]),
        "battery_pct": int(row["battery_pct"]),
        "active": row["active"].lower() == "true",
    }

with open("sensor_readings.csv", "r", encoding="utf-8") as f:
    rows = [coerce_types(row) for row in csv.DictReader(f)]

print(json.dumps(rows[0], indent=2, ensure_ascii=False))
# {
#   "sensor_id": "TMP-4401",
#   "location": "อาคาร 7 - ชั้น 2",
#   "temperature": 22.4,
#   "humidity": 58.3,
#   "battery_pct": 87,
#   "active": true
# }

Now temperature is a float, battery_pct is an integer and active is a boolean in the JSON output. The coercion function is specific to your CSV schema: the csv module does not infer types from the data, so you write one function per CSV layout.
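One way to keep per-layout functions short is to describe the layout as a mapping from column name to converter. This is only a sketch: SENSOR_SCHEMA, parse_bool and coerce_row are hypothetical names introduced here, and the inline CSV is sample data.

```python
import csv
import io
import json

def parse_bool(value: str) -> bool:
    """Treat the text 'true' (any case) as True, everything else as False."""
    return value.strip().lower() == "true"

# Hypothetical schema: column name -> converter. Unlisted columns stay strings.
SENSOR_SCHEMA = {
    "temperature": float,
    "humidity": float,
    "battery_pct": int,
    "active": parse_bool,
}

def coerce_row(row: dict, schema: dict) -> dict:
    """Apply the converter for each column named in schema; pass others through."""
    return {key: schema.get(key, str)(value) for key, value in row.items()}

sample_csv = io.StringIO(
    "sensor_id,temperature,humidity,battery_pct,active\n"
    "TMP-4401,22.4,58.3,87,true\n"
)
rows = [coerce_row(row, SENSOR_SCHEMA) for row in csv.DictReader(sample_csv)]
print(json.dumps(rows[0]))
# {"sensor_id": "TMP-4401", "temperature": 22.4, "humidity": 58.3, "battery_pct": 87, "active": true}
```

An empty cell would make int() or float() raise ValueError, so real data usually needs a decision about blanks (skip, default, or null) before this runs.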

Serializing Custom Objects and Non-Standard Types

Python's json module cannot serialize datetime, UUID, Decimal or custom classes by default. Calling json.dumps() on any of them raises TypeError. There are two ways to handle this.

Approach 1: the default= parameter

Pass a function to default= that converts unknown types into something serializable. The function is only called for objects the JSON encoder does not know how to handle.

Python 3.10+ — default= for datetime, UUID, Decimal

import json
from datetime import datetime
from decimal import Decimal
from uuid import UUID

def json_serial(obj):
    """Fallback serializer for non-standard types."""
    if isinstance(obj, datetime):
        return obj.isoformat()
    if isinstance(obj, UUID):
        return str(obj)
    if isinstance(obj, Decimal):
        return float(obj)  # may lose precision; return str(obj) if exact digits matter
    raise TypeError(f"Type {type(obj).__name__} is not JSON serializable")

transaction = {
    "txn_id": UUID("a1b2c3d4-e5f6-7890-abcd-ef1234567890"),
    "amount": Decimal("149.99"),
    "currency": "EUR",
    "processed_at": datetime(2026, 3, 15, 14, 30, 0),
    "gateway": "stripe",
}

print(json.dumps(transaction, indent=2, default=json_serial))
# {
#   "txn_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
#   "amount": 149.99,
#   "currency": "EUR",
#   "processed_at": "2026-03-15T14:30:00",
#   "gateway": "stripe"
# }

Warning: Always raise TypeError at the end of a default= function for types it does not recognize. If it returns None or falls through silently, you get null in the output with no sign that data was lost.
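The failure mode in that warning is easy to reproduce. In this sketch, lenient_default and strict_default are hypothetical names; the first forgets the final raise and silently turns a Decimal into null, while the second fails loudly.

```python
import json
from datetime import date
from decimal import Decimal

def lenient_default(obj):
    # Anti-pattern: handles date, then falls through and returns None
    if isinstance(obj, date):
        return obj.isoformat()

def strict_default(obj):
    if isinstance(obj, date):
        return obj.isoformat()
    raise TypeError(f"Type {type(obj).__name__} is not JSON serializable")

invoice = {"issued": date(2026, 3, 15), "amount": Decimal("149.99")}

lenient_output = json.dumps(invoice, default=lenient_default)
print(lenient_output)
# {"issued": "2026-03-15", "amount": null}

try:
    json.dumps(invoice, default=strict_default)
    strict_error = None
except TypeError as exc:
    strict_error = str(exc)
print(strict_error)
# Type Decimal is not JSON serializable
```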

Approach 2: Dataclasses with asdict()

A Python dataclass gives CSV rows a proper type definition. Use dataclasses.asdict() to convert a dataclass instance to a plain dict, then pass it to json.dumps().

Python 3.10+ — dataclass serialization

import json
from dataclasses import dataclass, asdict
from datetime import datetime

@dataclass
class ShipmentRecord:
    tracking_id: str
    origin: str
    destination: str
    weight_kg: float
    shipped_at: datetime

def json_serial(obj):
    if isinstance(obj, datetime):
        return obj.isoformat()
    raise TypeError(f"Not serializable: {type(obj).__name__}")

shipment = ShipmentRecord(
    tracking_id="SHP-9827",
    origin="กรุงเทพฯ",
    destination="เชียงใหม่",
    weight_kg=1240.5,
    shipped_at=datetime(2026, 3, 12, 8, 0, 0),
)

print(json.dumps(asdict(shipment), indent=2, default=json_serial, ensure_ascii=False))
# {
#   "tracking_id": "SHP-9827",
#   "origin": "กรุงเทพฯ",
#   "destination": "เชียงใหม่",
#   "weight_kg": 1240.5,
#   "shipped_at": "2026-03-12T08:00:00"
# }

Note: asdict() converts nested dataclasses to dicts recursively. If a dataclass contains a list of other dataclasses, the whole structure is converted with no extra code.
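A minimal sketch of that recursion, with two hypothetical dataclasses (Order and LineItem) that are not part of the examples above:

```python
import json
from dataclasses import dataclass, asdict, field

@dataclass
class LineItem:
    sku: str
    quantity: int

@dataclass
class Order:
    order_id: str
    items: list[LineItem] = field(default_factory=list)

order = Order("ORD-7291", [LineItem("KB-100", 2), LineItem("HUB-7", 1)])

# asdict() walks into the list and converts each LineItem as well
order_dict = asdict(order)
print(json.dumps(order_dict))
# {"order_id": "ORD-7291", "items": [{"sku": "KB-100", "quantity": 2}, {"sku": "HUB-7", "quantity": 1}]}
```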

json.dumps() Parameter Reference

The main keyword parameters accepted by json.dumps() and json.dump(). Both functions take the same keyword parameters; json.dump() additionally takes the file object as its second positional argument.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| obj | Any | (required) | The Python object to serialize: dict, list, str, int, float, bool, None |
| indent | int, str or None | None | Number of spaces (or a string) per indentation level. None = compact single-line output |
| sort_keys | bool | False | Sort dictionary keys alphabetically in the output |
| ensure_ascii | bool | True | Escape non-ASCII characters as \uXXXX. Set to False to emit the characters directly |
| default | Callable or None | None | Function called for objects that cannot be serialized natively; return a serializable value or raise TypeError |
| separators | tuple[str, str] or None | None | Custom (item_separator, key_separator). Use (",", ":") for compact output with no spaces |
| skipkeys | bool | False | Skip dict keys that are not str, int, float, bool or None instead of raising TypeError |
| allow_nan | bool | True | Allow float("nan"), float("inf"), float("-inf"). Set to False to raise ValueError for these values |
| cls | Type[JSONEncoder] or None | None | Custom JSONEncoder subclass to use instead of the default |
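Two of the less familiar parameters, allow_nan and skipkeys, are easier to understand from their output. The dictionaries in this sketch are made-up samples.

```python
import json

# allow_nan=True (the default) emits NaN, which strict JSON parsers reject
nan_output = json.dumps({"reading": float("nan")})
print(nan_output)
# {"reading": NaN}

# allow_nan=False refuses to produce that output
try:
    json.dumps({"reading": float("nan")}, allow_nan=False)
    nan_rejected = False
except ValueError:
    nan_rejected = True
print(nan_rejected)  # True

# skipkeys=True drops keys that are not str, int, float, bool or None
mixed_keys = {"sku": "WH-9031", ("row", 3): "tuple key"}
skipped_output = json.dumps(mixed_keys, skipkeys=True)
print(skipped_output)
# {"sku": "WH-9031"}
```

For CSV-derived data, allow_nan=False is a cheap guard when float() has been applied to cells that might contain the text "nan" or "inf".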

csv.DictReader: Reading CSV into Python Dicts

csv.DictReader is the other half of the CSV-to-JSON pipeline. It wraps a file object and yields one dict per row, using the first line as the field names. Compared with csv.reader (which yields plain lists), DictReader gives you access to columns by name, with no magic indexes like row[3].

Python 3.10+ — DictReader with custom delimiter

import csv
import json

# Tab-delimited file from a database export
with open("user_sessions.tsv", "r", encoding="utf-8") as f:
    reader = csv.DictReader(f, delimiter="\t")
    sessions = list(reader)

print(json.dumps(sessions[:2], indent=2))
# [
#   {
#     "session_id": "sess_8f2a91bc",
#     "user_id": "usr_4421",
#     "started_at": "2026-03-15T09:12:00Z",
#     "duration_sec": "342",
#     "pages_viewed": "7"
#   },
#   {
#     "session_id": "sess_3c7d44ef",
#     "user_id": "usr_1187",
#     "started_at": "2026-03-15T09:14:22Z",
#     "duration_sec": "128",
#     "pages_viewed": "3"
#   }
# ]

Warning: csv.DictReader reads the file lazily, yielding one row at a time. Calling list(reader) loads every row into memory. For files with millions of rows, process rows as a stream instead of collecting them all.
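The contrast with csv.reader, and the row-at-a-time behaviour, can be seen on a tiny in-memory sample. The tab-separated text below is illustrative data, not a real export.

```python
import csv
import io

sample = "session_id\tuser_id\tduration_sec\nsess_8f2a91bc\tusr_4421\t342\n"

# csv.reader yields lists; the header is simply the first row
plain_rows = list(csv.reader(io.StringIO(sample), delimiter="\t"))
print(plain_rows[1][2])        # 342

# csv.DictReader consumes the header and yields dicts keyed by column name
reader = csv.DictReader(io.StringIO(sample), delimiter="\t")
first = next(reader)           # rows are produced one at a time
print(first["duration_sec"])   # 342
print(reader.fieldnames)
# ['session_id', 'user_id', 'duration_sec']
```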

Converting CSV from Files and API Responses

Two real-world scenarios: reading a CSV file from disk and converting it, and fetching CSV data from an API endpoint (many reporting services return CSV). Both need proper error handling.

Read a CSV file → convert → write JSON

Python 3.10+ — file conversion with error handling

import csv
import json
import sys

def csv_to_json_file(csv_path: str, json_path: str) -> int:
    """Convert a CSV file to JSON and return the number of rows written."""
    try:
        with open(csv_path, "r", encoding="utf-8") as f:
            rows = list(csv.DictReader(f))
    except FileNotFoundError:
        print(f"Error: {csv_path} not found", file=sys.stderr)
        sys.exit(1)
    except csv.Error as e:
        print(f"CSV parse error in {csv_path}: {e}", file=sys.stderr)
        sys.exit(1)

    with open(json_path, "w", encoding="utf-8") as f:
        json.dump(rows, f, indent=2, ensure_ascii=False)

    return len(rows)

count = csv_to_json_file("fleet_vehicles.csv", "fleet_vehicles.json")
print(f"Wrote {count} records to fleet_vehicles.json")

Fetch CSV from an API → convert → JSON

Python 3.10+ — API CSV response to JSON

import csv
import io
import json
import urllib.error
import urllib.request

def fetch_csv_as_json(url: str) -> str:
    """Fetch CSV from a URL and return it as a JSON string."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            raw = resp.read().decode("utf-8")
    except urllib.error.URLError as e:
        # Chain the original exception so the traceback keeps the root cause
        raise RuntimeError(f"Failed to fetch {url}: {e}") from e

    reader = csv.DictReader(io.StringIO(raw))
    rows = list(reader)

    if not rows:
        raise ValueError("CSV response was empty or had no data rows")

    return json.dumps(rows, indent=2, ensure_ascii=False)

# Example: an endpoint that returns CSV
try:
    result = fetch_csv_as_json("https://reports.internal/api/v2/daily-metrics.csv")
    print(result)
except (RuntimeError, ValueError) as e:
    print(f"Error: {e}")

The file example passes encoding="utf-8" explicitly every time it opens a file, and the API example decodes the response bytes as UTF-8. This matters for CSV data containing non-ASCII characters: accented names, addresses with special characters, CJK text. If you do not specify the encoding, Python falls back to the system default, which on Windows is often cp1252 and will silently corrupt multi-byte characters.

Validating JSON Output with json.loads()

After converting CSV to a JSON string, you can verify the result by parsing it back with json.loads(). This round trip catches encoding problems, broken escape sequences, or accidental string concatenation that would produce invalid JSON. Wrap the call in a try/except block.

Python 3.10+ — round-trip validation

import json

json_string = json.dumps({"order_id": "ORD-7291", "total": 129.99})

# Confirm the string is valid JSON by parsing it back
try:
    parsed = json.loads(json_string)
    print(f"Valid JSON with {len(parsed)} keys")
except json.JSONDecodeError as e:
    print(f"Invalid JSON: {e}")
# Valid JSON with 2 keys

Converting CSV to JSON from the Command Line

Quick conversions from the terminal, no script file required. Python's -c flag runs inline code, and you can pipe the result through python3 -m json.tool to pretty-print it.

bash — one-liner CSV to JSON

python3 -c "
import csv, json, sys
rows = list(csv.DictReader(sys.stdin))
json.dump(rows, sys.stdout, indent=2)
" < inventory.csv > inventory.json

bash — pipe CSV file and format with json.tool

python3 -c "import csv,json,sys; print(json.dumps(list(csv.DictReader(sys.stdin))))" < data.csv | python3 -m json.tool

bash — convert and validate with jq

python3 -c "import csv,json,sys; json.dump(list(csv.DictReader(sys.stdin)),sys.stdout)" < report.csv | jq .

Note: python3 -m json.tool is the built-in JSON formatter. It reads JSON from stdin, validates it, and prints it with 4-space indentation by default. It is useful for checking that your CSV-to-JSON conversion produces valid output. For 2-space indentation pass --indent 2 (Python 3.9+); for filtering, use jq instead.

A High-Performance Alternative: orjson

The built-in json module works well for most CSV files, but if you are processing datasets with tens of thousands of rows in a loop, or your API has to serialize CSV-derived data on every request, orjson is considerably faster (figures of 5–10x are commonly quoted; measure on your own data). It is written in Rust, returns bytes instead of str, and serializes datetime and UUID without a custom default= function. numpy arrays are supported too when you pass option=orjson.OPT_SERIALIZE_NUMPY.

bash — install orjson

pip install orjson

Python 3.10+ — CSV to JSON with orjson

import csv
import orjson

with open("telemetry_events.csv", "r", encoding="utf-8") as f:
    rows = list(csv.DictReader(f))

# orjson.dumps() returns bytes, not str
json_bytes = orjson.dumps(rows, option=orjson.OPT_INDENT_2)

with open("telemetry_events.json", "wb") as f:  # note: "wb" for bytes
    f.write(json_bytes)

print(f"Wrote {len(rows)} events ({len(json_bytes)} bytes)")

The API differs slightly: orjson.dumps() returns bytes and uses option= flags instead of keyword arguments. Open the file in binary write mode ("wb") when writing orjson output. If you need a string, call .decode("utf-8") on the result.

Terminal Output with Syntax Highlighting: rich

Debugging a CSV-to-JSON conversion in the terminal is easier with colored output. The rich library renders JSON with syntax highlighting: keys, strings, numbers and booleans each get their own color.

bash — install rich

pip install rich

Python 3.10+ — rich JSON output

import csv
import json
from rich.console import Console
from rich.syntax import Syntax

console = Console()

with open("deployment_log.csv", "r", encoding="utf-8") as f:
    rows = list(csv.DictReader(f))

json_output = json.dumps(rows[:3], indent=2, ensure_ascii=False)
syntax = Syntax(json_output, "json", theme="monokai", line_numbers=True)
console.print(syntax)

Warning: rich adds ANSI escape codes to its output. Do not write rich-formatted output to a file or an API response, because it will contain invisible control characters. Use rich for terminal display only.

Working with Large CSV Files

Loading a 500 MB CSV file with list(csv.DictReader(f)) allocates the entire dataset in memory, and json.dumps() then builds the complete JSON string on top of that (json.dump() writes in chunks, but the list of rows still has to fit in memory). For files larger than 50–100 MB, switch to a streaming approach or write NDJSON (newline-delimited JSON): one JSON object per line.

NDJSON: One JSON Object per Line

Python 3.10+ — streaming CSV to NDJSON

import csv
import json

def csv_to_ndjson(csv_path: str, ndjson_path: str) -> int:
    """Convert CSV to NDJSON, processing one row at a time."""
    count = 0
    with open(csv_path, "r", encoding="utf-8") as infile, \
         open(ndjson_path, "w", encoding="utf-8") as outfile:
        for row in csv.DictReader(infile):
            outfile.write(json.dumps(row, ensure_ascii=False) + "\n")
            count += 1
    return count

rows_written = csv_to_ndjson("access_log.csv", "access_log.ndjson")
print(f"Wrote {rows_written} lines to access_log.ndjson")
# Each line is a standalone JSON object:
# {"timestamp": "2026-03-15T09:12:00Z", "method": "GET", "path": "/api/v2/orders", "status": "200"}
# {"timestamp": "2026-03-15T09:12:01Z", "method": "POST", "path": "/api/v2/payments", "status": "201"}

Streaming with ijson for Large JSON Input

Python 3.10+ — ijson for reading large JSON

import ijson  # pip install ijson

def count_high_value_orders(json_path: str, threshold: float) -> int:
    """Count orders above a threshold without loading the whole file."""
    count = 0
    with open(json_path, "rb") as f:
        for item in ijson.items(f, "item"):
            if float(item.get("total", 0)) > threshold:
                count += 1
    return count

# Process a 2 GB JSON file with constant memory use
high_value = count_high_value_orders("all_orders.json", 500.0)
print(f"Found {high_value} orders above $500")

Note: Switch to NDJSON or streaming once the CSV exceeds 50–100 MB. ijson is for reading large JSON files back; on the writing side, the NDJSON pattern above keeps memory use constant regardless of file size.
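If the consumer needs a regular JSON array rather than NDJSON, the same row-at-a-time idea still applies: write the brackets and commas by hand and serialize each row separately. This is a sketch under that assumption; stream_csv_to_json_array is a hypothetical helper, and io.StringIO objects stand in for real files.

```python
import csv
import io
import json

def stream_csv_to_json_array(infile, outfile) -> int:
    """Write rows as one JSON array without holding all rows in memory."""
    count = 0
    outfile.write("[")
    for row in csv.DictReader(infile):
        # A comma goes before every element except the first
        outfile.write(",\n" if count else "\n")
        outfile.write(json.dumps(row, ensure_ascii=False))
        count += 1
    outfile.write("\n]\n")
    return count

source = io.StringIO("method,status\nGET,200\nPOST,201\n")
target = io.StringIO()
written = stream_csv_to_json_array(source, target)

print(written)  # 2
print(json.loads(target.getvalue()))
# [{'method': 'GET', 'status': '200'}, {'method': 'POST', 'status': '201'}]
```

With real files, pass the handles returned by open(..., encoding="utf-8") in place of the StringIO objects.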

Common Pitfalls

❌ Using json.dumps() and then writing to the file separately

Problem: json.dumps() returns a string. Writing it with f.write() works, but it creates an unnecessary intermediate string in memory, which is wasteful for large datasets.

Fix: Use json.dump(data, f) to write directly to the file object. It writes the output in chunks without building the full string first.

Before · Python

json_string = json.dumps(rows, indent=2)
with open("output.json", "w") as f:
    f.write(json_string)  # unnecessary intermediate string

After · Python

with open("output.json", "w", encoding="utf-8") as f:
    json.dump(rows, f, indent=2, ensure_ascii=False)  # direct write

❌ Forgetting to convert CSV string values to numbers

Problem: csv.DictReader returns every value as a string. The JSON output contains "quantity": "5" instead of "quantity": 5, which API consumers that validate types will reject.

Fix: Convert numeric columns explicitly with int() or float() before serializing.

Before · Python

rows = list(csv.DictReader(f))
json.dumps(rows)
# [{"port": "8080", "workers": "4"}]  ← strings, not numbers

After · Python

rows = list(csv.DictReader(f))
for row in rows:
    row["port"] = int(row["port"])
    row["workers"] = int(row["workers"])
json.dumps(rows)
# [{"port": 8080, "workers": 4}]  ← proper integers

❌ Not specifying encoding='utf-8' when opening files

Problem: On Windows the default encoding is often cp1252. Non-ASCII characters (accented names, CJK text) are silently corrupted or raise UnicodeDecodeError.

Fix: Always pass encoding='utf-8' to open(), both when reading the CSV and when writing the JSON.

Before · Python

with open("locations.csv", "r") as f:  # uses system default encoding
    rows = list(csv.DictReader(f))

After · Python

with open("locations.csv", "r", encoding="utf-8") as f:
    rows = list(csv.DictReader(f))

❌ Using str() or repr() instead of json.dumps()

Problem: str(my_dict) produces Python syntax (single quotes, True, None), which is not valid JSON. APIs and JSON parsers reject it.

Fix: Always use json.dumps() to produce valid JSON. It converts True to true, None to null, and uses double quotes.

Before · Python

output = str({"active": True, "note": None})
# "{'active': True, 'note': None}"  ← NOT valid JSON

After · Python

output = json.dumps({"active": True, "note": None})
# '{"active": true, "note": null}'  ← valid JSON

json.dumps() Versus the Alternatives: Comparison Table

| Method | Output | Valid JSON | Custom types | Speed | Extra install |
| --- | --- | --- | --- | --- | --- |
| json.dumps() | str | ✓ | Via the default= parameter | Baseline | No (stdlib) |
| json.dump() | Writes to a file | ✓ | Via the default= parameter | Baseline | No (stdlib) |
| csv.DictReader + json | str or file | ✓ | Via the default= parameter | Baseline | No (stdlib) |
| pandas to_json() | str or file | ✓ | ✓ native datetime | ~2x faster on large data | pip install pandas |
| orjson.dumps() | bytes | ✓ | ✓ native datetime/UUID | 5–10x faster | pip install orjson |
| dataclasses.asdict() + json | str | ✓ | Via the default= parameter | Baseline | No (stdlib) |
| polars write_json() | str or file | ✓ | ✓ native datetime | ~3x faster on large data | pip install polars |

For most CSV-to-JSON conversions, the csv + json combination from the standard library is the right choice: no dependencies, ships with Python, works everywhere. Reach for orjson when profiling shows that serialization is the bottleneck; the speed difference is real at scale. Use pandas when you need to clean, filter or aggregate the data before converting it to JSON. For a quick conversion without writing code, the online CSV to JSON converter handles it instantly.

Frequently Asked Questions

What is the difference between json.dump() and json.dumps() in Python?

json.dump(obj, file) writes the JSON output directly to a file object (anything with a .write() method), while json.dumps(obj) returns a JSON-formatted string. Use json.dump() when writing to a file; use json.dumps() when you need the JSON as a Python string for logging, embedding in a payload, or sending over a socket. Both accept the same keyword parameters (indent, sort_keys, ensure_ascii, default).

How do I convert a Python dictionary to a JSON string?

Call json.dumps(your_dict). The return value is a str containing valid JSON. Add indent=2 for readability. If the dict contains non-ASCII values, pass ensure_ascii=False to keep characters such as accented letters or CJK text.

Python 3.10+

import json

server_config = {"host": "api.internal", "port": 8443, "debug": False}
json_string = json.dumps(server_config, indent=2)
print(json_string)
# {
#   "host": "api.internal",
#   "port": 8443,
#   "debug": false
# }

How do I save a Python list of dicts as a JSON file?

Open the file in write mode with UTF-8 encoding, then call json.dump(your_list, f, indent=2, ensure_ascii=False). Prefer json.dump() (not json.dumps()) for file output: it writes directly to the file handle without building an intermediate string in memory.

Python 3.10+

import json

records = [
    {"order_id": "ORD-4821", "total": 129.99, "currency": "USD"},
    {"order_id": "ORD-4822", "total": 89.50, "currency": "EUR"},
]

with open("orders.json", "w", encoding="utf-8") as f:
    json.dump(records, f, indent=2, ensure_ascii=False)

Why does json.dumps() convert True to true and None to null?

Python's booleans (True, False) and None are not valid JSON tokens. The JSON specification uses lowercase true, false and null. json.dumps() handles the mapping automatically: True becomes true, False becomes false, None becomes null. You never need to convert these by hand. In the other direction, json.loads() maps them back to the Python types.
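The mapping in both directions fits in a few lines. The flags dict here is a made-up sample.

```python
import json

flags = {"active": True, "archived": False, "note": None}

encoded = json.dumps(flags)
print(encoded)
# {"active": true, "archived": false, "note": null}

# json.loads() maps true/false/null back to True/False/None
decoded = json.loads(encoded)
print(decoded == flags)  # True
```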

How do I handle datetime objects when converting CSV data to JSON?

Pass a default= function to json.dumps() that converts datetime objects to ISO 8601 strings. The default function is called for any object json cannot serialize natively. Return obj.isoformat() for datetime instances and raise TypeError for anything else.

Python 3.10+

import json
from datetime import datetime

def json_default(obj):
    if isinstance(obj, datetime):
        return obj.isoformat()
    raise TypeError(f"Not serializable: {type(obj)}")

event = {"action": "login", "timestamp": datetime(2026, 3, 15, 9, 30, 0)}
print(json.dumps(event, default=json_default))
# {"action": "login", "timestamp": "2026-03-15T09:30:00"}

Can I convert CSV to JSON without pandas?

Yes. Python's standard library has everything you need. Use csv.DictReader to read each row as a dictionary, collect the rows into a list, and serialize with json.dump() or json.dumps(). No third-party library is required. Add pandas only when you need data cleaning or type inference, or already use pandas elsewhere in the project.

Python 3.10+

import csv
import json

with open("inventory.csv", "r", encoding="utf-8") as csv_file:
    rows = list(csv.DictReader(csv_file))

with open("inventory.json", "w", encoding="utf-8") as json_file:
    json.dump(rows, json_file, indent=2, ensure_ascii=False)

For a one-click alternative that needs no Python at all, try the CSV to JSON converter: paste your CSV data and get formatted JSON output immediately.

Maria Santos · Backend Developer

Maria is a backend developer specialising in Python and API integration. She has broad experience with data pipelines, serialisation formats, and building reliable server-side services. She is an active member of the Python community and enjoys writing practical, example-driven guides that help developers solve real problems without unnecessary theory.

Priya Sharma · Technical Reviewer

Priya is a data scientist and machine learning engineer who has worked across the full Python data stack — from raw data ingestion and cleaning to model deployment and monitoring. She is passionate about reproducible research, Jupyter-based workflows, and the practical engineering side of ML. She writes about NumPy, Pandas, data serialisation, and the Python patterns that make data pipelines reliable at scale.