Loading YAML

Loading turns YAML text into native Python objects. YAMLRocks gives you three kinds of entry point: loads for a string or bytes you already hold, load for a file on disk, and loads_all / load_all for streams that contain more than one document. They share the same type rules and, apart from the exceptions noted under Multiple documents, the same options, so once you know one you know them all.

`loads`: parse a string or bytes

loads parses the first document in its input and returns native Python objects. The input may be str, bytes, bytearray, or any object that supports the buffer protocol (such as memoryview):

import yamlrocks

yamlrocks.loads(b"key: value")          # {'key': 'value'}
yamlrocks.loads("count: 42")            # {'count': 42}
yamlrocks.loads(bytearray(b"x: 1"))     # {'x': 1}
yamlrocks.loads(memoryview(b"x: 1"))    # {'x': 1}

An empty document (or input that is only comments) returns None:

import yamlrocks

print(yamlrocks.loads(b""))             # None
print(yamlrocks.loads(b"# just a comment"))  # None

Type resolution

By default YAMLRocks follows the YAML 1.2 core schema. Scalars resolve to Python types as follows:

YAML	Python	Examples
`null`, `~`, (empty)	`None`	`key:`
`true` / `false`	`bool`	`enabled: true`
integers	`int`	`42`, `0xFF`, `0o17`, `-5`
floats	`float`	`3.14`, `1e3`, `.inf`, `.nan`
everything else	`str`	`hello`, `2026-01-02`, `yes`

import yamlrocks

source = """
n: null
b: true
i: 42
x: 0xFF
f: 3.14
s: hello
"""

yamlrocks.loads(source)
# {'n': None, 'b': True, 'i': 42, 'x': 255, 'f': 3.14, 's': 'hello'}

The most common surprise for people coming from PyYAML is that yes, no, on, and off are plain strings in YAML 1.2, not booleans:

import yamlrocks

yamlrocks.loads(b"a: yes")    # {'a': 'yes'}
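To make the table above concrete, here is a minimal pure-Python sketch of core-schema scalar resolution. It is not YAMLRocks' implementation: resolve_scalar is a hypothetical helper, and the patterns are transcribed from the YAML 1.2 core schema's tag-resolution rules for plain (unquoted) scalars only.

```python
import math
import re

# Patterns transcribed from the YAML 1.2 core schema (plain scalars only).
_NULL = re.compile(r"null|Null|NULL|~|")
_BOOL = re.compile(r"true|True|TRUE|false|False|FALSE")
_INT_DEC = re.compile(r"[-+]?[0-9]+")
_INT_OCT = re.compile(r"0o[0-7]+")
_INT_HEX = re.compile(r"0x[0-9a-fA-F]+")
_FLOAT = re.compile(r"[-+]?(?:\.[0-9]+|[0-9]+(?:\.[0-9]*)?)(?:[eE][-+]?[0-9]+)?")
_INF = re.compile(r"[-+]?\.(?:inf|Inf|INF)")
_NAN = re.compile(r"\.(?:nan|NaN|NAN)")


def resolve_scalar(text: str):
    """Hypothetical helper: map one plain scalar to a Python value."""
    if _NULL.fullmatch(text):
        return None
    if _BOOL.fullmatch(text):
        return text.lower() == "true"
    if _INT_DEC.fullmatch(text):
        return int(text)
    if _INT_OCT.fullmatch(text):
        return int(text[2:], 8)
    if _INT_HEX.fullmatch(text):
        return int(text[2:], 16)
    if _FLOAT.fullmatch(text):
        return float(text)
    if _INF.fullmatch(text):
        return -math.inf if text.startswith("-") else math.inf
    if _NAN.fullmatch(text):
        return math.nan
    # Anything else, including yes/no/on/off and dates, stays a string.
    return text
```

Because none of the patterns mention yes, no, on, or off, those words fall through to the final return and stay strings, which is the behavior described above.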

`load`: parse a file

load is the file-oriented counterpart to loads. Pass it a path (a str or any os.PathLike) or an open file object:

import yamlrocks

with open("config.yaml", "w") as f:
    f.write("name: app\nport: 8080\n")

yamlrocks.load("config.yaml")           # {'name': 'app', 'port': 8080}

with open("config.yaml") as f:
    yamlrocks.load(f)                   # {'name': 'app', 'port': 8080}

load shines with split configurations: when you set OPT_INCLUDES and do not pass an include_dir, includes resolve relative to the file’s own directory, which is almost always what you want. See includes.
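If you wrap load in your own helper, you may want the same "path or open file" flexibility and the same default base directory. The sketch below uses only the standard library. read_source is a hypothetical name, and deriving the base directory from a file object's .name attribute is an assumption made for illustration, not documented YAMLRocks behavior.

```python
import os
import tempfile
from pathlib import Path


def read_source(source):
    """Hypothetical helper: return (text, base_dir) for a path or open file.

    base_dir is the directory relative includes would resolve against,
    or None when it cannot be determined (for example an in-memory stream).
    """
    if isinstance(source, (str, os.PathLike)):
        path = Path(os.fspath(source))
        return path.read_text(encoding="utf-8"), path.resolve().parent
    # Otherwise treat it as a file object; real files expose a path in .name.
    text = source.read()
    name = getattr(source, "name", None)
    if isinstance(name, (str, os.PathLike)):
        return text, Path(os.fspath(name)).resolve().parent
    return text, None


# Usage: both call styles yield the same text and the same base directory.
with tempfile.TemporaryDirectory() as tmp:
    config = Path(tmp) / "config.yaml"
    config.write_text("name: app\nport: 8080\n", encoding="utf-8")
    text_from_path, dir_from_path = read_source(config)
    with open(config, encoding="utf-8") as f:
        text_from_file, dir_from_file = read_source(f)
```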

Multiple documents

A single YAML stream can hold several documents separated by ---. Use loads_all (or load_all for a file) to get them all as a list:

import yamlrocks

source = """
---
a: 1
---
b: 2
"""

yamlrocks.loads_all(source)
# [{'a': 1}, {'b': 2}]

loads_all and load_all accept option, tag_handler, and tags, the same as their single-document twins. They do not take schema= or include_dir: schema validation and !include resolution are single-document operations, so apply them per document instead. Iterate the result and call loads with a schema on each, or split the stream and resolve includes one document at a time.
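For the "split the stream" route, a rough standard-library sketch follows. split_documents is a hypothetical helper, not part of YAMLRocks, and it deliberately simplifies: it only recognizes a --- marker that sits alone on its line, ignores the ... end marker and directives, and drops empty documents.

```python
def split_documents(stream: str) -> list[str]:
    """Hypothetical helper: split a YAML stream on stand-alone '---' lines."""
    documents = []
    current = []

    def flush():
        # Keep the buffered lines only if they contain something non-blank.
        if any(line.strip() for line in current):
            documents.append("".join(current))
        current.clear()

    for line in stream.splitlines(keepends=True):
        if line.rstrip() == "---":   # marker at column 0, nothing after it
            flush()
        else:
            current.append(line)
    flush()
    return documents
```

Each returned piece is a single-document string, so it can be handed to loads on its own with whatever per-document arguments you need.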

Block scalars

Literal (|) and folded (>) block scalars are fully supported, including the chomping indicators (- strip, + keep):

import yamlrocks

literal = """
text: |
  line 1
  line 2
"""

yamlrocks.loads(literal)["text"]
# 'line 1\nline 2\n'

folded = """
text: >
  one
  long
  paragraph
"""

yamlrocks.loads(folded)["text"]
# 'one long paragraph\n'

A literal block keeps newlines verbatim; a folded block joins lines with spaces.
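The chomping indicators only affect the trailing line breaks of a block scalar. As a small illustration of the three modes the YAML spec defines (strip, clip, keep), here is a pure-Python sketch. chomp is a hypothetical helper that operates on already-extracted block text; it is not a YAMLRocks function.

```python
def chomp(block: str, indicator: str = "") -> str:
    """Hypothetical helper: apply YAML chomping to extracted block text.

    indicator is "-" (strip), "+" (keep), or "" (clip, the default).
    """
    body = block.rstrip("\n")
    trailing = block[len(body):]       # the run of final line breaks
    if indicator == "-":
        return body                    # strip: drop every trailing newline
    if indicator == "+":
        return body + trailing         # keep: preserve them all
    if indicator == "":
        return body + ("\n" if trailing else "")   # clip: keep exactly one
    raise ValueError(f"unknown chomping indicator: {indicator!r}")
```

With no indicator the result ends in exactly one newline, which is why both examples above end in \n.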

Anchors, aliases, and merge keys

Anchors (&name) mark a node, aliases (*name) reuse it, and the merge key (<<) folds one mapping into another. YAMLRocks resolves all three while parsing:

import yamlrocks

alias = """
base: &b
  x: 1
use: *b
"""

yamlrocks.loads(alias)
# {'base': {'x': 1}, 'use': {'x': 1}}

merge = """
base: &b {x: 1}
use:
  <<: *b
  y: 2
"""

yamlrocks.loads(merge)
# {'base': {'x': 1}, 'use': {'x': 1, 'y': 2}}

Explicit keys win over merged ones, and earlier merges win over later ones, matching PyYAML and ruamel.yaml.
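The precedence rule can be expressed in a few lines of plain Python. This is a sketch of the semantics described above, not the library's code; merge_mapping and its parameters are hypothetical names.

```python
def merge_mapping(explicit: dict, merged: list[dict]) -> dict:
    """Hypothetical helper: combine explicit keys with merged mappings.

    merged lists the mappings named under '<<', in source order.
    """
    result = {}
    for source in merged:
        for key, value in source.items():
            # setdefault keeps the first value seen: earlier merges win.
            result.setdefault(key, value)
    # Explicit keys are applied last, so they override anything merged in.
    result.update(explicit)
    return result
```

For example, merge_mapping({"y": 2}, [{"x": 1, "y": 0}, {"x": 9, "z": 3}]) gives x from the first merged mapping, y from the explicit key, and z from the second merged mapping.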

Duplicate keys

By default a repeated mapping key keeps the last value, as PyYAML does:

import yamlrocks

source = """
a: 1
a: 2
"""

yamlrocks.loads(source)                 # {'a': 2}

Pass OPT_DUPLICATE_KEYS_ERROR to reject duplicates instead. The error reports the line and column of the offending key:

import yamlrocks

source = """
a: 1
b: 2
a: 3
"""

yamlrocks.loads(source, option=yamlrocks.OPT_DUPLICATE_KEYS_ERROR)
# yamlrocks.YAMLRocksDuplicateKeyError: duplicate mapping key: a at line 4, column 1

The merge key << is exempt, since repeating it is how multiple mappings are merged.
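The two policies differ only in what happens when a key is seen a second time. The sketch below shows that difference on a list of already-parsed entries. Everything in it (build_mapping, DuplicateKeyError, the entry tuple layout) is hypothetical and stands in for the library's internals.

```python
class DuplicateKeyError(ValueError):
    """Stand-in for a located duplicate-key error (hypothetical)."""


def build_mapping(entries, reject_duplicates=False):
    """Build a dict from (key, value, line, column) tuples.

    By default a repeated key overwrites the earlier value (last one wins).
    With reject_duplicates=True the second occurrence raises instead.
    """
    result = {}
    for key, value, line, column in entries:
        if reject_duplicates and key in result:
            raise DuplicateKeyError(
                f"duplicate mapping key: {key} at line {line}, column {column}"
            )
        result[key] = value
    return result
```

The location reported is that of the repeated key, not the first occurrence, because that is the entry being processed when the clash is detected.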

Complex keys

YAML lets a mapping key be any node, including a sequence or another mapping (a “complex key”). Example 2.11 of the spec, “Mapping between Sequences,” is built on exactly this. A Python dict, however, needs hashable keys, and a list or dict is unhashable. Rather than reject valid YAML, YAMLRocks renders a complex key as its hashable counterpart: a sequence becomes a tuple, and a mapping becomes a tuple of its (key, value) pairs (in order). A tuple is used (rather than a frozenset) so the key survives a dumps/loads round-trip unchanged: a frozenset re-serializes as a sequence and would reload as a different type.

import yamlrocks

# A sequence key becomes a tuple.
data = yamlrocks.loads(b"[a, b]: paired\n")
assert data == {("a", "b"): "paired"}

# A mapping key becomes a tuple of its (key, value) pairs.
source = """
? {x: 1}
: nested
"""

data = yamlrocks.loads(source)
assert data == {(("x", 1),): "nested"}

The key may also be a compact block collection written after the ?, the form used in spec example 8.19:

import yamlrocks

source = """
? earth: blue
: moon: white
"""

data = yamlrocks.loads(source)
assert data == {(("earth", "blue"),): {"moon": "white"}}

The conversion is recursive, so nested collections inside a key are made hashable too. It applies on every load path that builds Python values (plain loads, annotated mode, and custom-tag resolution), so they all produce the same key.
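The recursive conversion is easy to picture in plain Python. The function below is a sketch of the rule as stated in this section (sequence to tuple, mapping to a tuple of (key, value) pairs, applied recursively); to_hashable is a hypothetical name, not a YAMLRocks API.

```python
def to_hashable(node):
    """Hypothetical helper: make a parsed key usable as a dict key."""
    if isinstance(node, (list, tuple)):
        # A sequence becomes a tuple of converted items.
        return tuple(to_hashable(item) for item in node)
    if isinstance(node, dict):
        # A mapping becomes a tuple of (key, value) pairs, in order.
        return tuple(
            (to_hashable(key), to_hashable(value)) for key, value in node.items()
        )
    # Scalars (str, int, float, bool, None) are already hashable.
    return node
```

Applied to the examples above, to_hashable(["a", "b"]) yields ("a", "b") and to_hashable({"x": 1}) yields (("x", 1),), matching the keys shown in the assertions.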

Rejecting complex keys: `OPT_REJECT_COMPLEX_KEYS`

Accept-and-convert is the right default, but some consumers have a strictly scalar-keyed data model (a config loader, say) where a complex key is always a mistake, and would rather catch it early with a precise location than convert it and fail vaguely later. OPT_REJECT_COMPLEX_KEYS switches to that behavior: a collection used as a mapping key raises YAMLRocksComplexKeyError instead of converting.

import yamlrocks

try:
    yamlrocks.loads(b"{a: 1}: b\n", option=yamlrocks.OPT_REJECT_COMPLEX_KEYS)
except yamlrocks.YAMLRocksComplexKeyError as err:
    print(err.line, err.column)
# 1 1

YAMLRocksComplexKeyError is a YAMLRocksDecodeError (so except YAMLRocksError and except ValueError still catch it) and carries .file/.line/.column pointing at the offending key, including when the key is inside an !included file. The flag rejects any complex key (both sequence and mapping keys), applies on the plain, annotated, and tag-resolving paths, and leaves scalar keys untouched. OPT_ROUND_TRIP is unaffected, since a YAMLRocksDocument models source bytes rather than Python containers.

The most common way to hit this by accident is an unquoted template that occupies a whole value:

state: {{ states('sensor.x') }} # YAML sees a mapping used as a key

Because the value starts with {, YAML reads it as a flow mapping whose first key is itself a mapping, not as text. Quoting it (state: "{{ states('sensor.x') }}") makes it an ordinary string. OPT_REJECT_COMPLEX_KEYS turns this mistake into an immediate, located error rather than a value that fails later. (An embedded template like name: app_{{ env }} starts with a normal character, so it is already a plain scalar and is unaffected.)
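If you want to catch this pattern before parsing at all, a line-based heuristic can flag values that begin with an unquoted {{. The sketch below is only that, a heuristic: find_unquoted_templates is a hypothetical helper, it looks at one line at a time, and it knows nothing about block scalars or multi-line flow collections.

```python
import re

# key, colon, whitespace, then a value that starts with "{{" (no quote first).
_TEMPLATE_VALUE = re.compile(r"^\s*[^\s#][^:#]*:\s+\{\{")


def find_unquoted_templates(text: str) -> list[int]:
    """Return 1-based line numbers whose value starts with an unquoted '{{'."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if _TEMPLATE_VALUE.match(line)
    ]
```

Quoted templates and embedded ones such as app_{{ env }} are not reported, because their value does not start with {{.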

Custom tags

By default an unrecognized tag like !mytag is dropped and its underlying value kept. To intercept tags, pass a tag_handler callback, or use OPT_PASSTHROUGH_TAG to receive YAMLRocksTag objects. See custom tags:

import yamlrocks

yamlrocks.loads(
    b"value: !double 5",
    tag_handler=lambda tag, value: int(value) * 2 if tag == "!double" else value,
)
# {'value': 10}
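Once you handle more than one tag, a lambda with chained conditionals gets unwieldy. One option is a small dispatch table that builds a callback with the same (tag, value) signature used above. make_tag_handler and the !upper tag are hypothetical, chosen only for illustration.

```python
def make_tag_handler(converters):
    """Hypothetical helper: build a (tag, value) callback from a dict.

    converters maps a tag string such as "!double" to a one-argument function.
    Tags without an entry leave the value unchanged.
    """
    def handler(tag, value):
        convert = converters.get(tag)
        return value if convert is None else convert(value)

    return handler


handler = make_tag_handler({
    "!double": lambda value: int(value) * 2,
    "!upper": lambda value: str(value).upper(),   # illustrative tag
})
```

The resulting handler can be passed as tag_handler=handler in place of the lambda in the example above.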

Async loading: off the event loop

Each loader has an async counterpart: async_loads, async_load, and async_load_all. They take the same arguments as their synchronous twins and return the same values, but run the work in a worker thread so an asyncio application never blocks its loop while parsing:

import asyncio
import yamlrocks

source = """
name: app
port: 8080
"""

async def main():
    data = await yamlrocks.async_loads(source)
    return data

asyncio.run(main())
# {'name': 'app', 'port': 8080}

async_load and async_load_all move the file read off the loop as well, so a slow disk does not stall it either:

import asyncio
import yamlrocks

with open("config.yaml", "w") as f:
    f.write("name: app\nport: 8080\n")

async def main():
    return await yamlrocks.async_load("config.yaml")

asyncio.run(main())
# {'name': 'app', 'port': 8080}

What makes this more than a convenience wrapper is that the native scan and parse release the GIL on byte input. The worker thread does the heavy parsing while the event loop keeps running, so other coroutines genuinely make progress during a large parse rather than waiting behind it. You can asyncio.gather several loads and let them overlap:

import asyncio
import yamlrocks

async def main():
    docs = [b"a: %d" % i for i in range(3)]
    return await asyncio.gather(*(yamlrocks.async_loads(d) for d in docs))

asyncio.run(main())
# [{'a': 0}, {'a': 1}, {'a': 2}]
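The general shape of such a wrapper can be sketched with the standard library alone. The code below is not how YAMLRocks implements its async functions: it assumes asyncio.to_thread as the offloading mechanism and uses json.loads purely as a stand-in for a blocking parser. The stand-in does not release the GIL, so it shows the structure of the pattern, not the overlap described above.

```python
import asyncio
import json


def blocking_parse(data: bytes):
    # Stand-in for a synchronous parser; any blocking callable works here.
    return json.loads(data)


async def async_parse(data: bytes):
    # Run the blocking call in a worker thread so the event loop stays free.
    return await asyncio.to_thread(blocking_parse, data)


async def main():
    docs = [b'{"a": %d}' % i for i in range(3)]
    # gather preserves argument order, regardless of which thread finishes first.
    return await asyncio.gather(*(async_parse(d) for d in docs))


results = asyncio.run(main())
```

asyncio.to_thread requires Python 3.9 or later; on older versions, loop.run_in_executor(None, ...) plays the same role.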

On the dump side there is deliberately no async counterpart beyond file I/O; see async dumping for why and the recommended workaround.

When parsing fails

A genuinely malformed document raises YAMLRocksDecodeError, a subclass of ValueError. The message carries the source location:

import yamlrocks

yamlrocks.loads(b"a: 'unterminated")
# yamlrocks.YAMLRocksParseError: unterminated single-quoted scalar at line 1, column 4

See exceptions for the full error model.