YAMLRocks vs PyYAML

PyYAML is the de facto standard YAML library for Python. It is mature and widely used, but it supports only YAML 1.1, discards comments, has no round-trip mode, and even its fastest safe path (the libyaml-backed C loader) trails a modern Rust implementation. YAMLRocks is a drop-in-friendly alternative that is faster in every benchmark on this page, safe by default, and far more capable.

| Feature | PyYAML | YAMLRocks |
| --- | --- | --- |
| YAML 1.2 | No (1.1 only) | Yes (1.2 default, 1.1 mode) |
| Speed (parse) | C loader | ~5-10x faster than the C loader |
| Speed (dump) | C dumper | ~15-19x faster than the C dumper |
| Comment preservation | No | Yes |
| Round-trip (byte-for-byte) | No | Yes |
| Anchors/aliases preserved | No | Yes |
| Merge keys (<<) | Yes | Yes |
| Native !include | No | Yes (with write-back) |
| Source line/column | No | Yes (annotated mode) |
| JSON Schema validation | No | Yes (line-numbered errors) |
| Arbitrary object construction | Yes (yaml.load) | None by design |
| Output type | str | bytes (no extra encode) |
| datetime/date/time dump | partial | Yes (with offset control) |
| Free-threaded (nogil) safe | No | Yes |

PyYAML’s headline footgun is yaml.load with an unsafe loader (historically the default), which constructs arbitrary Python objects from tags like !!python/object/apply. That is remote code execution waiting to happen, which is why PyYAML provides safe_load, began warning in 5.1 when load was called without an explicit Loader, and has required the Loader argument since 6.0.
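For reference, the PyYAML side of this can be sketched as follows. This is a minimal illustration that assumes PyYAML is installed; the payload and the helper name `safe_load_rejects` are made up for the example. It shows safe_load refusing a Python-specific tag instead of calling the function it names.

```python
try:
    import yaml  # PyYAML; assumed to be installed
except ImportError:  # keeps the sketch importable when PyYAML is absent
    yaml = None

# A harmless stand-in for a malicious document: the tag asks the loader
# to call os.getcwd() while constructing the value.
PAYLOAD = "!!python/object/apply:os.getcwd []"


def safe_load_rejects(payload):
    """Return True if safe_load refuses the tag, None if PyYAML is missing."""
    if yaml is None:
        return None
    try:
        yaml.safe_load(payload)
    except yaml.YAMLError:
        # SafeLoader has no constructor for python/* tags, so it raises
        # a ConstructorError (a YAMLError subclass) instead of calling anything.
        return True
    return False


print(safe_load_rejects(PAYLOAD))
```

The protection depends on every call site choosing the safe loader, which is the step that is easy to forget.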

YAMLRocks has no unsafe path to forget. Tags never construct Python objects. An unrecognized tag keeps its underlying scalar, or you opt in to handle it yourself with a tag_handler or OPT_PASSTHROUGH_TAG:

import yamlrocks

# A tag never executes anything. The value is just the scalar underneath.
yamlrocks.loads(b"value: !something 42")  # {'value': '42'}

# Opt in to interpret a tag, on your terms.
yamlrocks.loads(
    b"value: !double 5",
    tag_handler=lambda tag, value: int(value) * 2 if tag == "!double" else value,
)  # {'value': 10}

There is no yamlrocks.load that behaves like yaml.load. The safe behavior is the only behavior. See security and custom tags.

PyYAML follows YAML 1.1, where yes, no, on, off, and the country code NO parse as booleans. This famously corrupts configuration files. YAMLRocks defaults to YAML 1.2, where those are plain strings:

import yamlrocks
yamlrocks.loads(b"country: NO") # {'country': 'NO'}
yamlrocks.loads(b"enabled: yes") # {'enabled': 'yes'}
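The difference comes down to which plain scalars the boolean resolver accepts. The sketch below uses only the standard library and the boolean patterns listed in the YAML 1.1 bool type and the YAML 1.2 core schema. It illustrates the rule and is not the resolver of either library; the helper name `is_bool` is hypothetical.

```python
import re

# Boolean patterns from the YAML 1.1 bool type and the YAML 1.2 core schema.
# (PyYAML implements the 1.1 list without the single-letter y/Y/n/N forms.)
YAML_1_1_BOOL = re.compile(
    r"y|Y|yes|Yes|YES|n|N|no|No|NO"
    r"|true|True|TRUE|false|False|FALSE"
    r"|on|On|ON|off|Off|OFF"
)
YAML_1_2_BOOL = re.compile(r"true|True|TRUE|false|False|FALSE")


def is_bool(scalar, version="1.2"):
    """Return True if a plain (unquoted) scalar resolves to a boolean."""
    pattern = YAML_1_1_BOOL if version == "1.1" else YAML_1_2_BOOL
    return pattern.fullmatch(scalar) is not None


# Print how each word is classified under 1.1 and under 1.2.
for word in ("NO", "yes", "on", "true"):
    print(word, is_bool(word, "1.1"), is_bool(word, "1.2"))
```

Quoting a value ("NO") sidesteps the resolver under either version, which is the usual workaround in 1.1-only tools.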

If you need the old behavior for a specific document, opt in with OPT_YAML_1_1, and use yamlrocks.upgrade() to migrate legacy 1.1 files to canonical 1.2:

import yamlrocks
yamlrocks.loads(b"enabled: yes", option=yamlrocks.OPT_YAML_1_1) # {'enabled': True}
yamlrocks.upgrade(b"enabled: yes\nmode: on\n")
# b'%YAML 1.2\n---\nenabled: true\nmode: true\n'

See YAML 1.1 vs 1.2 for the full list of differences.

PyYAML throws comments away on load and cannot reproduce a file. There is no supported way to load config.yaml, change one value, and write it back without losing the comments and reflowing everything.

YAMLRocks’s round-trip mode preserves comments, anchors, and layout, and re-emits only what changed:

import yamlrocks
source = b"# service config\nname: app # the app name\nport: 8080\n"
doc = yamlrocks.loads(source, option=yamlrocks.OPT_ROUND_TRIP)
doc["port"] = 9090
doc.to_yaml()
# b'# service config\nname: app # the app name\nport: 9090\n'

The comment survives, and the unchanged lines are reproduced verbatim. See round-trip editing and the config editor recipe.

yaml.safe_dump returns a str, so writing to a binary file or socket requires a separate UTF-8 encode step. yamlrocks.dumps returns bytes directly, ready to write:

import yamlrocks
yamlrocks.dumps({"key": "value", "list": [1, 2]})
# b'key: value\nlist:\n - 1\n - 2\n'

PyYAML has no !include. The common workaround is a custom constructor that you wire up yourself and that cannot write changes back. YAMLRocks resolves !include and the !include_dir_* family natively, and round-trip mode can save an edited value back into the exact file it came from. See includes and the Home Assistant recipe.
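The PyYAML workaround mentioned above typically looks something like the following. This is a minimal sketch that assumes PyYAML is installed; `load_with_includes` and `IncludeLoader` are illustrative names, not part of PyYAML. Note that it only reads: nothing records which file a value came from, so there is no way to write an edit back.

```python
import os

try:
    import yaml  # PyYAML; assumed to be installed
except ImportError:  # keeps the sketch importable when PyYAML is absent
    yaml = None


def load_with_includes(path):
    """Load `path`, resolving `!include other.yaml` relative to the including file."""
    if yaml is None:
        raise RuntimeError("PyYAML is not installed")

    base_dir = os.path.dirname(os.path.abspath(path))

    class IncludeLoader(yaml.SafeLoader):
        """SafeLoader subclass, so !include is not registered globally."""

    def construct_include(loader, node):
        # The tagged scalar is the relative path of the file to include.
        target = os.path.join(base_dir, loader.construct_scalar(node))
        return load_with_includes(target)

    IncludeLoader.add_constructor("!include", construct_include)

    with open(path, "r", encoding="utf-8") as fh:
        return yaml.load(fh, Loader=IncludeLoader)
```

With a main.yaml containing `db: !include db.yaml`, calling `load_with_includes("main.yaml")` returns the merged data as plain dicts and lists.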

Beyond the headline differences above, YAMLRocks quietly fixes a long list of recurring PyYAML frustrations:

| Common PyYAML frustration | YAMLRocks |
| --- | --- |
| Install fails to build (Cython errors, no wheel for your platform or Python) | Pure Rust via maturin: prebuilt wheels, no C toolchain, and it builds and runs on free-threaded CPython and 3.14 |
| An integer beyond 64 bits raises OverflowError on dump, or loads as a string | Arbitrary-precision integers load, dump, and round-trip exactly |
| Large or small floats dump as long digit strings (1e308 becomes 309 digits) | Scientific notation matching Python’s repr (1.0e+308, 6.022e+23) |
| Multi-line strings dump as one ugly "a\nb\n" line | A readable literal \| block by default, with the right chomping |
| OrderedDict, Decimal, Enum, UUID, or pathlib.Path raise RepresenterError | Serialized natively, no custom representer needed |
| An unknown or custom tag raises ConstructorError | Kept as its underlying value, or surfaced as a YAMLRocksTag; never an error |
| Characters above U+FFFF (emoji, rare scripts) are mishandled | Full Unicode, including in escapes and as mapping keys |
| yaml.dump output differs across platforms or appends a stray blank line | Deterministic output, exactly one trailing newline |

Each of these is exercised by the test suite, and the emitter’s output is checked to be yamllint-clean by default.

Indicative figures from python bench/bench.py (release build), showing how many times faster YAMLRocks is than PyYAML. The C loader (libyaml) is the harder target; the pure-Python loader is what yaml.safe_load uses unless you pass CSafeLoader explicitly.

Parsing (loads)

| Payload | vs PyYAML (C) | vs PyYAML (pure) |
| --- | --- | --- |
| small (10 lines) | ~7x faster | ~56x faster |
| medium (k8s manifest) | ~8x faster | ~70x faster |
| large (500 items) | ~10x faster | ~83x faster |
| deep (30 levels) | ~5x faster | ~56x faster |

Serializing (dumps)

| Payload | vs PyYAML (C) | vs PyYAML (pure) |
| --- | --- | --- |
| small | ~17x faster | ~86x faster |
| medium | ~16x faster | ~84x faster |
| large | ~16x faster | ~82x faster |
| deep | ~17x faster | ~71x faster |

Split configuration with !include (Home Assistant style): YAMLRocks’s native resolver versus a PyYAML !include constructor.

| Files | YAMLRocks is |
| --- | --- |
| 50 | ~20x faster |
| 200 | ~20x faster |
| 500 | ~20x faster |
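The "Nx faster" figures in these tables are ratios of wall-clock times. The sketch below shows one way such a ratio could be measured with the standard library's timeit. It is not the project's bench/bench.py, and the `speedup` helper and the stand-in workloads are assumptions made for illustration.

```python
import timeit


def speedup(baseline, candidate, number=50, repeat=5):
    """Return how many times faster `candidate` runs than `baseline`."""
    # Take the best of several repeats to reduce noise from other processes.
    base = min(timeit.repeat(baseline, number=number, repeat=repeat))
    cand = min(timeit.repeat(candidate, number=number, repeat=repeat))
    return base / cand


# Stand-in workloads. In a real comparison these would be, for example,
# lambda: yaml.safe_load(text) and lambda: yamlrocks.loads(data).
slow = lambda: sum(range(50_000))
fast = lambda: sum(range(500))

print(f"~{speedup(slow, fast):.0f}x faster")
```

Results vary with hardware, payload, and library versions, so ratios measured this way are indicative rather than exact.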

Use the compatibility shim for a near drop-in switch:

import yamlrocks.compat as yaml
yaml.safe_load(b"a: 1") # {'a': 1}
yaml.safe_dump({"a": 1}) # 'a: 1\n' (a str, matching PyYAML)

safe_load, safe_load_all, safe_dump, and safe_dump_all map straight across, with sort_keys defaulting to True as it does in PyYAML. The shim’s load and dump map onto the safe variants, because YAMLRocks never executes code from tags. For new code, prefer the native loads/dumps API and its bytes output.

PyYAML is a reasonable choice when you only need basic 1.1 loading, want zero native dependencies, or rely on its custom Loader/Dumper extension points and add_constructor/add_representer hooks. YAMLRocks does not expose those constructor hooks; it provides tag_handler, OPT_PASSTHROUGH_TAG, and the default callback instead. For everything else (speed, round-trip, includes, validation, YAML 1.2, and safety) YAMLRocks is the upgrade.