JSON to Python Dataclass Generator
Generate Python dataclass definitions from JSON
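What Is JSON to Python Dataclass Conversion?
JSON to Python dataclass conversion turns a raw JSON object into a set of Python dataclass definitions with precise type annotations. Python's dataclasses module, introduced by PEP 557 (Python 3.7), generates the __init__, __repr__, and __eq__ methods automatically from annotated class fields. When you work with JSON APIs, configuration files, or message queues, dataclasses give the data a typed structure that editors and type checkers such as mypy can verify during development.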
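Python's json.loads() returns plain dict and list objects. They work, but they carry no type information: a misspelled key fails only at runtime (a KeyError with data["key"], or a silent None with data.get("key")), and the editor cannot autocomplete field names. Dataclasses solve this by mapping each JSON key to a named, typed field. Nested JSON objects produce separate dataclass definitions, arrays become List[T] annotations, and null values become Optional[T] with a default of None.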
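The difference is easiest to see side by side. The following is a minimal sketch, assuming a hypothetical `User` record whose field names are invented for illustration; it compares a misspelled key on a plain dict with the same typo on a dataclass instance.

```python
import json
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class User:
    # Field names and types are illustrative, not taken from any real API.
    name: str
    tags: List[str]                 # JSON array -> List[T]
    nickname: Optional[str] = None  # JSON null -> Optional[T] with a None default


raw = '{"name": "Alice", "tags": ["admin"], "nickname": null}'
data = json.loads(raw)

# Plain dict: a misspelled key passed to .get() silently yields None.
typo_from_dict = data.get("nmae")

# Dataclass: the same typo raises AttributeError at runtime,
# and a type checker can flag it before the code runs.
user = User(**data)
try:
    user.nmae
    typo_raised = False
except AttributeError:
    typo_raised = True

print(typo_from_dict, typo_raised)
```

Note that `User(**data)` only works here because the JSON keys already match the field names and there are no nested objects.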
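Writing these definitions by hand is mechanical work: read the JSON, infer each field's type from its value, convert camelCase and other key styles to Python's snake_case convention, and handle edge cases such as nullable fields and mixed-type arrays. A converter does all of this in milliseconds. Paste the JSON and get correct dataclass code.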
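The key-renaming step can be sketched with two regular expressions. The helper name `to_snake_case` is hypothetical, and this is only one possible approach; a real converter may treat acronyms or digits differently.

```python
import re


def to_snake_case(key: str) -> str:
    """Convert a camelCase, PascalCase, or kebab-case JSON key to snake_case."""
    key = key.replace("-", "_").replace(" ", "_")
    # Insert an underscore between a lowercase letter or digit and an uppercase letter.
    key = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", key)
    # Split a run of capitals from a following capitalized word: "HTTPResponse" -> "HTTP_Response".
    key = re.sub(r"([A-Z]+)([A-Z][a-z])", r"\1_\2", key)
    return key.lower()


for original in ["firstName", "userID", "HTTPResponse", "created-at", "already_snake"]:
    print(original, "->", to_snake_case(original))
```

Because the renamed field no longer matches the JSON key, loading code has to map keys back to field names when it constructs the dataclass.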
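Why Use a JSON to Python Converter?
Converting a JSON structure to Python class definitions by hand means guessing types from sample data, ordering required fields before optional ones, and updating everything whenever the API changes. A converter removes that busywork.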
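The ordering requirement comes from dataclasses themselves: a field without a default cannot follow a field that has one. A minimal sketch of the reordering step, assuming a hypothetical list of inferred fields given as `(name, type, is_optional)` tuples:

```python
from dataclasses import field, make_dataclass
from typing import Optional

# Inferred fields for a hypothetical payload, in JSON key order.
inferred = [
    ("nickname", Optional[str], True),
    ("id", int, False),
    ("name", str, False),
]


def to_specs(fields):
    """Turn (name, type, is_optional) tuples into make_dataclass field specs."""
    return [
        (name, tp, field(default=None)) if optional else (name, tp)
        for name, tp, optional in fields
    ]


# Keeping JSON key order puts a defaulted field first, which dataclasses reject.
try:
    make_dataclass("User", to_specs(inferred))
    rejected = False
except TypeError:
    rejected = True

# A stable sort on the optional flag moves required fields first
# while preserving the relative order within each group.
ordered = sorted(inferred, key=lambda f: f[2])
User = make_dataclass("User", to_specs(ordered))
user = User(id=1, name="Alice")
print(rejected, user)
```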
JSON to Python Use Cases
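JSON to Python Type Mapping
Each JSON value type corresponds to a specific Python type annotation. The table below shows how the converter translates each JSON type, in both typing-module syntax (Python 3.7+) and the built-in syntax available from Python 3.10.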
| JSON type | Example | Python (typing) | Python 3.10+ |
|---|---|---|---|
| string | "hello" | str | str |
| number (integer) | 42 | int | int |
| number (float) | 3.14 | float | float |
| boolean | true | bool | bool |
| null | null | Optional[str] | str \| None |
| object | {"k": "v"} | @dataclass class | @dataclass class |
| array of strings | ["a", "b"] | List[str] | list[str] |
| array of objects | [{"id": 1}] | List[Item] | list[Item] |
| mixed array | [1, "a"] | List[Any] | list[Any] |
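The mapping in the table can be sketched as a small recursive function over values decoded by `json.loads()`. The function name `infer_annotation` and the `class_name` parameter are hypothetical, and `Optional[str]` for null is only a placeholder that follows the table, since a lone null carries no type information.

```python
import json
from typing import Any


def infer_annotation(value: Any, class_name: str = "Item") -> str:
    """Return a typing-style annotation string for a decoded JSON value."""
    if value is None:
        return "Optional[str]"  # placeholder: null alone does not reveal the real type
    # bool must be checked before int because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "bool"
    if isinstance(value, int):
        return "int"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "str"
    if isinstance(value, dict):
        return class_name  # a nested object becomes its own dataclass
    if isinstance(value, list):
        kinds = {infer_annotation(v, class_name) for v in value}
        # One consistent element type gives List[T]; empty or mixed arrays fall back to Any.
        if len(kinds) == 1:
            return f"List[{kinds.pop()}]"
        return "List[Any]"
    return "Any"


sample = json.loads(
    '{"id": 42, "ratio": 3.14, "ok": true, "tags": ["a", "b"],'
    ' "mixed": [1, "a"], "items": [{"id": 1}], "note": null}'
)
annotations = {key: infer_annotation(val) for key, val in sample.items()}
print(annotations)
```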
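Dataclass Decorator Reference
The @dataclass decorator accepts several parameters that change how the generated class behaves. This reference covers the options used most often with JSON-derived data.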
| Decorator / field | Behavior | Use case |
|---|---|---|
| @dataclass | Generates __init__, __repr__, __eq__ from field annotations | Standard dataclasses |
| @dataclass(frozen=True) | Makes instances immutable (hashable, no attribute reassignment) | Config objects, dict keys |
| @dataclass(slots=True) | Uses __slots__ for lower memory and faster attribute access | Python 3.10+, large datasets |
| @dataclass(kw_only=True) | All fields require keyword arguments in __init__ | Python 3.10+, many fields |
| field(default_factory=list) | Sets a mutable default without sharing state between instances | List/dict/set defaults |
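A short sketch of two of these options in use. The class names `Endpoint` and `Job` are invented for illustration, and `slots=True` and `kw_only=True` are left out so that the example also runs on Python versions older than 3.10.

```python
from dataclasses import FrozenInstanceError, dataclass, field
from typing import List


@dataclass(frozen=True)
class Endpoint:
    host: str
    port: int


@dataclass
class Job:
    name: str
    tags: List[str] = field(default_factory=list)  # a fresh list per instance


endpoint = Endpoint("localhost", 8080)

# Frozen instances reject attribute reassignment.
try:
    endpoint.port = 9090
    mutated = True
except FrozenInstanceError:
    mutated = False

# Frozen dataclasses with the default eq=True are hashable, so they can be dict keys.
timeouts = {endpoint: 30}

# default_factory avoids sharing one list between instances.
first, second = Job("first"), Job("second")
first.tags.append("urgent")
print(mutated, timeouts, first.tags, second.tags)
```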
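dataclass vs. Pydantic vs. TypedDict
Python has three common ways to define a typed structure for JSON, each suited to a different situation. Dataclasses are the zero-dependency standard-library option, Pydantic provides runtime validation, and TypedDict adds type annotations to plain dicts without creating a new class.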
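A minimal sketch of the two standard-library options, using a hypothetical `Point` shape. Pydantic is omitted here only because it is a third-party install; unlike the two options shown, it would reject the badly typed value at runtime.

```python
import json
from dataclasses import dataclass
from typing import TypedDict


class PointDict(TypedDict):
    x: int
    y: int


@dataclass
class Point:
    x: int
    y: int


data = json.loads('{"x": 1, "y": 2}')

# TypedDict: the annotation is for type checkers only; the object stays a plain dict.
as_typed_dict: PointDict = data

# Dataclass: a real class instance with attribute access, __repr__, and __eq__.
as_dataclass = Point(**data)

# Neither option validates at runtime, so a wrong type is accepted silently.
unchecked = Point(x="not an int", y=2)

print(type(as_typed_dict).__name__, as_dataclass, unchecked)
```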
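Code Examples
The following examples show how to use generated dataclasses in Python, how to generate them programmatically with JavaScript, and how to use alternatives such as Pydantic and command-line tools.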
```python
from dataclasses import dataclass
from typing import List, Optional
import json


@dataclass
class Address:
    street: str
    city: str
    zip: str


@dataclass
class User:
    id: int
    name: str
    email: str
    active: bool
    score: float
    address: Address
    tags: List[str]
    metadata: Optional[str] = None  # fields with defaults must come last


raw = '{"id":1,"name":"Alice","email":"alice@example.com","active":true,"score":98.5,"address":{"street":"123 Main St","city":"Springfield","zip":"12345"},"tags":["admin","user"],"metadata":null}'
data = json.loads(raw)

# Reconstruct nested objects manually
addr = Address(**data["address"])
user = User(**{**data, "address": addr})

print(user.name)     # -> Alice
print(user.address)  # -> Address(street='123 Main St', city='Springfield', zip='12345')
```

```javascript
// Minimal JSON-to-Python-dataclass generator in JS
function jsonToPython(obj, name = "Root") {
  const classes = [];

  function infer(val, fieldName) {
    if (val === null) return "Optional[str]";
    if (typeof val === "string") return "str";
    if (typeof val === "number") return Number.isInteger(val) ? "int" : "float";
    if (typeof val === "boolean") return "bool";
    if (Array.isArray(val)) {
      // Use the first non-null element as the sample; compare against undefined
      // so that falsy samples such as 0, "" and false still count.
      const first = val.find(v => v !== null);
      return first !== undefined ? `List[${infer(first, fieldName + "Item")}]` : "List[Any]";
    }
    if (typeof val === "object") {
      const clsName = fieldName.charAt(0).toUpperCase() + fieldName.slice(1);
      build(val, clsName);
      return clsName;
    }
    return "Any";
  }

  function build(obj, cls) {
    // Nested classes are pushed during infer(), so they appear before their parent.
    const fields = Object.entries(obj).map(([k, v]) => `    ${k}: ${infer(v, k)}`);
    classes.push(`@dataclass\nclass ${cls}:\n${fields.join("\n")}`);
  }

  build(obj, name);
  return classes.join("\n\n");
}

const data = { id: 1, name: "Alice", scores: [98, 85] };
console.log(jsonToPython(data, "User"));
// @dataclass
// class User:
//     id: int
//     name: str
//     scores: List[int]
```

```python
from pydantic import BaseModel
from typing import List, Optional


class Address(BaseModel):
    street: str
    city: str
    zip: str


class User(BaseModel):
    id: int
    name: str
    email: str
    active: bool
    score: float
    address: Address
    tags: List[str]
    metadata: Optional[str] = None


# Pydantic parses and validates JSON in one step (Pydantic v2 API)
raw = '{"id":1,"name":"Alice","email":"alice@example.com","active":true,"score":98.5,"address":{"street":"123 Main St","city":"Springfield","zip":"12345"},"tags":["admin","user"],"metadata":null}'
user = User.model_validate_json(raw)

print(user.name)               # -> Alice
print(user.model_dump_json())  # -> re-serializes to JSON
```

```bash
# Install the generator
pip install datamodel-code-generator

# Generate dataclasses from a JSON file
datamodel-codegen --input data.json --output models.py --output-model-type dataclasses.dataclass

# Generate Pydantic models instead
datamodel-codegen --input data.json --output models.py

# From a JSON string via stdin
echo '{"id": 1, "name": "Alice", "tags": ["admin"]}' | \
  datamodel-codegen --output-model-type dataclasses.dataclass

# Output (abridged):
# @dataclass
# class Model:
#     id: int
#     name: str
#     tags: List[str]
```
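The first Python example above rebuilds `Address` by hand before constructing `User`. With deeper nesting that gets tedious, so the following sketch automates it using the type hints on the dataclass. The helpers `from_dict` and `_convert` are hypothetical names; they handle only nested dataclasses, `List[T]`, and `Optional[T]` written with the typing module, and they do not validate primitive types.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import List, Optional, Union, get_args, get_origin, get_type_hints


def _convert(tp, value):
    """Convert one decoded JSON value according to its annotated type."""
    if value is None:
        return None
    if is_dataclass(tp):
        return from_dict(tp, value)
    origin, args = get_origin(tp), get_args(tp)
    if origin is list and args:
        return [_convert(args[0], item) for item in value]
    if origin is Union:
        # Optional[T] is Union[T, None]; unwrap it when exactly one real type remains.
        non_none = [a for a in args if a is not type(None)]
        if len(non_none) == 1:
            return _convert(non_none[0], value)
    return value  # primitives and unsupported annotations pass through unchanged


def from_dict(cls, data):
    """Recursively build the dataclass `cls` from a decoded JSON dict."""
    hints = get_type_hints(cls)
    kwargs = {
        f.name: _convert(hints[f.name], data[f.name])
        for f in fields(cls)
        if f.name in data  # missing keys fall back to the field default
    }
    return cls(**kwargs)


@dataclass
class Address:
    city: str


@dataclass
class User:
    name: str
    address: Address
    previous: List[Address]
    nickname: Optional[str] = None


payload = {
    "name": "Alice",
    "address": {"city": "Springfield"},
    "previous": [{"city": "Shelbyville"}],
}
user = from_dict(User, payload)
print(user)
```

A missing required key still surfaces as a `TypeError` from the dataclass constructor, which is usually the desired behavior for malformed input.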