YAML

Data Serialization Formats

Data serialization is the process of converting structured data into a format that can be stored or transmitted and later reconstructed. Different formats make different tradeoffs between human readability, machine parseability, expressiveness, and file size.

JSON, YAML, TOML, and XML are the four dominant text-based serialization formats in software development. Each has strengths that make it the best choice for specific contexts — understanding those tradeoffs lets you pick the right format for each job.

JSON and YAML Side by Side

JSON and YAML share the same basic data model of mappings, sequences, and scalars. YAML 1.2 is designed as a superset of JSON, so a valid JSON document is also valid YAML 1.2. Below is the same configuration expressed in both formats:

JSON
{
  "server": {
    "host": "localhost",
    "port": 8080,
    "debug": true
  },
  "database": {
    "url": "postgres://localhost/mydb",
    "pool": 10
  }
}

YAML
server:
  host: localhost
  port: 8080
  debug: true
database:
  url: postgres://localhost/mydb
  pool: 10

YAML uses indentation instead of braces and brackets, and omits quotes from most string values. This makes it more compact and readable for human-edited files, but introduces indentation sensitivity.
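The mapping between the two syntaxes is mechanical enough to sketch in a few lines of Python. This is only an illustration under stated assumptions, not a substitute for a real YAML library: `to_yaml_lines` and `CONFIG_JSON` are hypothetical names, keys are assumed to be plain strings that are safe to write unquoted, and string values are always double-quoted (which is valid YAML, just more conservative than the hand-written version above).

```python
import json

CONFIG_JSON = """
{
  "server": {"host": "localhost", "port": 8080, "debug": true},
  "database": {"url": "postgres://localhost/mydb", "pool": 10}
}
"""

def to_yaml_lines(data, indent=0):
    """Render nested dicts as block-style YAML lines (hypothetical helper).

    Assumes keys are plain strings that are safe to write unquoted.
    """
    pad = " " * indent
    lines = []
    for key, value in data.items():
        if isinstance(value, dict) and value:
            # A nested mapping becomes an indented block under its key
            lines.append(f"{pad}{key}:")
            lines.extend(to_yaml_lines(value, indent + 2))
        else:
            # JSON scalars (and JSON-style lists) are written as-is,
            # relying on YAML accepting JSON syntax for these values
            lines.append(f"{pad}{key}: {json.dumps(value)}")
    return lines

print("\n".join(to_yaml_lines(json.loads(CONFIG_JSON))))
```

Quoting every string costs some readability but sidesteps the implicit typing issues that unquoted YAML values can run into.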

Format Comparison

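| Format | Readability | Comments | Arrays | Best for |
|--------|-------------|----------|--------|----------|
| JSON | ★★★☆☆ | No | Native | APIs, data exchange |
| YAML | ★★★★★ | Yes (#) | Native | Config files, IaC |
| TOML | ★★★★☆ | Yes (#) | Native | App config (Rust, Python) |
| XML | ★★☆☆☆ | Yes (<!-- -->) | Repeated elements | Documents, SOAP, SVG |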

YAML Gotchas

YAML is powerful but has well-known edge cases that catch developers by surprise. These are the most common:

The Norway Problem

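In YAML 1.1, the bare value NO is resolved as the boolean false, so the country code for Norway silently turns into false. The same applies to other unquoted words: no, n, off, and false resolve to false, while yes, y, on, and true resolve to true. The YAML 1.2 core schema recognizes only true and false as booleans, but many parsers still follow 1.1 rules. Always quote ambiguous strings.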
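When generating YAML by hand or from templates, one defensive option is to quote any string that YAML 1.1 would resolve as a boolean. The sketch below assumes the spellings listed in the YAML 1.1 boolean type definition; individual parsers may recognize a smaller set, and `quote_if_bool_like` is a hypothetical helper name.

```python
import re

# Boolean spellings from the YAML 1.1 bool type (assumed list; parsers vary)
YAML11_BOOL = re.compile(
    r"y|Y|yes|Yes|YES|n|N|no|No|NO"
    r"|true|True|TRUE|false|False|FALSE"
    r"|on|On|ON|off|Off|OFF"
)

def quote_if_bool_like(value: str) -> str:
    """Wrap the value in double quotes if YAML 1.1 would read it as a boolean."""
    if YAML11_BOOL.fullmatch(value):
        return f'"{value}"'
    return value

countries = ["SE", "NO", "DK"]
print([quote_if_bool_like(code) for code in countries])
```

Only the country code for Norway gets quoted here; SE and DK are left as plain scalars.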

Indentation sensitivity

YAML uses indentation to define structure. A single extra or missing space can move a key to a different nesting level or make the document invalid. Indentation must use spaces; tab characters are not allowed.

Implicit type coercion

Bare values that look like numbers, booleans, or null are automatically coerced. 1.0 becomes a float, so a version string such as 1.10 is read as 1.1, and 2024-01-01 becomes a date object in some parsers. Quote values you want to keep as strings.
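A rough way to spot values at risk is to test them against patterns for the common implicit types. The patterns below are deliberately simplified assumptions, not the full resolver rules of any YAML version, and `implicit_type` is a hypothetical helper.

```python
import re

# Simplified patterns (assumption: real YAML resolvers accept more forms,
# such as hex and octal integers or timestamps with a time part)
IMPLICIT_PATTERNS = {
    "null": re.compile(r"(?:~|null|Null|NULL)?"),
    "int": re.compile(r"[-+]?[0-9]+"),
    "float": re.compile(r"[-+]?(?:[0-9]+\.[0-9]*|\.[0-9]+)(?:[eE][-+]?[0-9]+)?"),
    "date": re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}"),
}

def implicit_type(text: str):
    """Return the type an unquoted value would likely resolve to, or None."""
    for name, pattern in IMPLICIT_PATTERNS.items():
        if pattern.fullmatch(text):
            return name
    return None

for candidate in ["1.10", "2024-01-01", "8080", "v1.10"]:
    print(candidate, "->", implicit_type(candidate))

# Why it matters: once read as a float, the trailing zero is gone
print(float("1.10"))
```

Any value for which `implicit_type` returns something other than None is a candidate for quoting if it is meant to stay a string.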

Tabs are forbidden

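The YAML specification explicitly forbids tab characters for indentation. An editor that inserts tabs when you press Tab, or converts leading spaces to tabs, will break your YAML file, usually with a parse error that is hard to spot because tabs and spaces look alike. Configure your editor to use spaces in YAML files.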
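A small pre-commit style check can catch this before a parser does. The following is a minimal sketch; `find_tab_indentation` is a hypothetical helper, and it is intentionally conservative: it also flags tabs that appear in the leading whitespace of block scalar content, where they may be legitimate.

```python
def find_tab_indentation(text: str):
    """Return 1-based line numbers whose leading whitespace contains a tab."""
    flagged = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Slice off everything from the first non-space, non-tab character
        leading = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in leading:
            flagged.append(number)
    return flagged

sample = "server:\n\thost: localhost\n  port: 8080\n"
print(find_tab_indentation(sample))
```

For the sample above, only the second line is reported, because it is indented with a tab.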

Multiline string modes

YAML has two multiline string indicators: | (literal block, preserves newlines) and > (folded block, converts newlines to spaces). Mixing them up silently produces wrong output.

Anchor/alias loops

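YAML anchors (&) and aliases (*) allow node reuse, which is powerful but risky with untrusted input. An alias can point back to one of its own ancestors and create a circular structure, and nested aliases can expand exponentially (the "billion laughs" pattern), causing infinite loops or memory exhaustion in code that expands them naively. For YAML from untrusted sources, limit alias expansion or reject aliases altogether.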
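One way to guard against both problems is to measure a loaded document before walking it. The sketch below assumes the parser represents aliases as shared references to the same Python list or dict (an assumption about the loader, not a guarantee), and `expanded_size` with its `limit` parameter is a hypothetical helper.

```python
def expanded_size(node, limit=10_000, _stack=None):
    """Count nodes as if every alias were copied out in full.

    Raises ValueError on a circular reference or when the count exceeds limit.
    """
    stack = _stack if _stack is not None else set()
    if isinstance(node, dict):
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        return 1  # scalars count as one node
    if id(node) in stack:
        raise ValueError("recursive alias detected")
    stack.add(id(node))
    total = 1
    for child in children:
        total += expanded_size(child, limit, stack)
        if total > limit:
            raise ValueError("alias expansion exceeds limit")
    stack.discard(id(node))
    return total

# Shared references stand in for aliases: each level reuses the previous one
level = ["lol"] * 9
print(expanded_size(level))        # small document
for _ in range(4):
    level = [level] * 9            # nine aliases to the level below
try:
    expanded_size(level)
except ValueError as error:
    print("rejected:", error)
```

The nested structure holds only a handful of distinct lists in memory, yet its fully expanded form would contain tens of thousands of nodes, which is why the check counts expanded nodes rather than distinct objects.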

Features Unique to YAML

Comments

YAML supports comments with the # character. This is one of the biggest practical advantages over JSON for configuration files — you can document your settings inline. Comments can appear on their own line or after a value.

Anchors and Aliases

YAML lets you define a node once with &anchor-name and reuse it anywhere later in the same document with *anchor-name. This eliminates repetition in complex configs. Docker Compose files commonly use anchors for shared service settings; Kubernetes manifests are plain YAML as well, though anchors are seen there less often.

Multiline Strings

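YAML supports block scalars with two modes. The literal block (|) preserves newlines exactly as written, which is useful for scripts and templates. The folded block (>) joins consecutive lines with spaces, so a paragraph wrapped across several lines becomes a single line, while blank lines still produce line breaks; this is useful for long prose descriptions.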
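The difference is easiest to see by comparing the strings each mode produces from the same lines. The sketch below imitates the two modes in plain Python under simplifying assumptions: default chomping (one trailing newline), no leading or trailing blank lines, and no more-indented lines, which folding treats specially. Both function names are hypothetical.

```python
def literal_block(lines):
    """Approximate the result of `|`: every line break is kept."""
    return "\n".join(lines) + "\n"

def folded_block(lines):
    """Approximate the result of `>`: single breaks become spaces,
    and each blank line becomes one newline."""
    paragraphs, current = [], []
    for line in lines:
        if line == "":
            paragraphs.append(" ".join(current))
            current = []
        else:
            current.append(line)
    paragraphs.append(" ".join(current))
    return "\n".join(paragraphs) + "\n"

body = ["echo build", "echo test", "", "echo deploy"]
print(repr(literal_block(body)))
print(repr(folded_block(body)))
```

A shell script written under `>` instead of `|` would have its first two commands merged into one line, which is the kind of silent breakage that mixing up the two indicators causes.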

When You Need to Convert

CI/CD pipeline configuration

GitHub Actions, GitLab CI, and CircleCI use YAML natively. When generating pipeline configs programmatically, it is often easier to build a JSON object and convert to YAML for final output.

Kubernetes manifests

Kubernetes resources are usually written in YAML, but the Kubernetes API itself speaks JSON, and tools such as kubectl can output JSON (for example with -o json). Converting between formats lets you inspect API responses and use standard JSON tooling.

API response processing

Most REST APIs return JSON. When feeding that data into YAML-based config systems (Ansible, Salt, Kubernetes), a converter bridges the gap without manual reformatting.

Configuration migration

Migrating from one tool to another often means converting its config format. Converting between JSON and YAML lets you move between ecosystems (Node.js → Python, Docker → Kubernetes) quickly.

Schema validation workflow

JSON Schema tooling is mature, while YAML has no equally established schema language of its own. A common pattern is to author config in YAML for readability, convert it to JSON for validation against a JSON Schema, and then deploy the YAML once validation passes.

Data interchange with APIs

Internal tooling may consume YAML configs while external APIs require JSON. A converter in the build pipeline keeps both representations in sync without maintaining duplicate files.

Frequently Asked Questions

Is YAML a superset of JSON?

Yes. YAML 1.2 is a strict superset of JSON — any valid JSON document is also valid YAML. However, YAML 1.1 (still used by many parsers) has a few edge cases where valid JSON is not valid YAML 1.1.

Why would I choose YAML over JSON for config files?

YAML supports comments, which JSON does not. YAML is also more compact for deeply nested structures. These two properties make it the preferred choice for human-edited configuration files (Docker, Kubernetes, GitHub Actions).

What is TOML and when should I use it?

TOML (Tom's Obvious Minimal Language) is designed for config files. It has a clear, INI-like syntax with explicit types and no ambiguity. It is the standard for Rust (Cargo.toml) and Python (pyproject.toml) projects.

Does converting JSON to YAML ever lose information?

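Rarely in that direction, but the reverse direction can. Every JSON value has a direct YAML equivalent, so JSON to YAML is normally lossless; key order is usually preserved too, although the JSON spec treats objects as unordered. Going from YAML to JSON drops comments, anchors, and tags, and non-string keys have no JSON equivalent. For round-trip fidelity, stick to the common subset.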
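One concrete case of loss is mapping keys that are not strings. The sketch below uses a plain Python dict to stand in for what a YAML parser might return for a mapping with integer keys (an assumption for illustration); passing it through the standard `json` module turns the keys into strings.

```python
import json

# Stand-in for a parsed YAML mapping such as "8080: http" (assumed parser output)
ports = {8080: "http", 8443: "https"}

as_json = json.dumps(ports)
round_tripped = json.loads(as_json)

print(as_json)
print(round_tripped == ports)   # keys are now strings, so the dicts differ
print(list(round_tripped))
```

The values survive, but anything that looked up a port by its integer key no longer finds it after the round trip.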

What causes 'undefined' to appear in JSON output?

JSON.stringify converts undefined values to null in arrays and omits them from objects. If you see unexpected nulls or missing keys in JSON output, the source JavaScript object likely contained undefined properties.

Can I use JSON in a YAML file?

Yes. Because YAML is a superset of JSON, you can embed JSON syntax directly inside a YAML document. This is sometimes done for complex nested structures where YAML's indentation would be confusing.