Schema validation
YAMLRocks can validate a document against a JSON Schema.
Pass the schema as a Python dict to loads (or load) through the schema=
keyword. If the document conforms, you get the parsed value back exactly as you
would without a schema. If it does not, YAMLRocks raises YAMLRocksDecodeError with a
precise source location and a JSON path to the offending node.
Validation runs against the rich syntax tree (the same structure that powers
round-trip mode), so every node still knows its source line and column. That is
how a schema failure can point at an exact line and column rather than just
“somewhere in your data”.
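
To see why positions on nodes matter, here is a minimal sketch of a validator that walks a position-annotated tree and reports both a JSON path and a source location. It is not YAMLRocks's implementation: Node, SchemaError, and check are hypothetical names invented for this illustration, the tree is built by hand, and only maximum and properties are handled.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Node:
    """Hypothetical syntax-tree node: a value plus where it came from."""
    value: Any   # a scalar, or dict[str, Node] for a mapping
    line: int    # 1-based source line
    column: int  # 1-based source column


class SchemaError(ValueError):
    """Stand-in for a decode error that carries a path and a position."""


def check(node: Node, schema: dict, path: str = "$") -> None:
    """Check `maximum` and nested `properties`, stopping at the first failure."""
    maximum = schema.get("maximum")
    if maximum is not None and isinstance(node.value, (int, float)) and node.value > maximum:
        raise SchemaError(
            f"value {node.value} is greater than maximum {maximum} "
            f"at {path} (line {node.line}, column {node.column})"
        )
    if isinstance(node.value, dict):
        for key, subschema in schema.get("properties", {}).items():
            if key in node.value:
                # Extend the JSON path as we descend; the child node
                # already knows its own line and column.
                check(node.value[key], subschema, f"{path}.{key}")


# Hand-built tree for the two-line document "name: app\nport: 70000\n".
tree = Node(
    {"name": Node("app", 1, 7), "port": Node(70000, 2, 7)},
    line=1,
    column=1,
)

message = None
try:
    check(tree, {"properties": {"port": {"maximum": 65535}}})
except SchemaError as exc:
    message = str(exc)
# message: value 70000 is greater than maximum 65535 at $.port (line 2, column 7)
```

Because the position travels with the node, the validator never has to search the source text again to locate a failure.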
```python
import yamlrocks

schema = {
    "type": "object",
    "required": ["name", "port"],
    "properties": {
        "name": {"type": "string", "minLength": 1},
        "port": {"type": "integer", "minimum": 1, "maximum": 65535},
        "tags": {"type": "array", "items": {"type": "string"}},
    },
    "additionalProperties": False,
}

source = """name: app
port: 8080
"""

yamlrocks.loads(source, schema=schema)
# {'name': 'app', 'port': 8080}
```

When a value is out of range, the error names both the JSON path ($.port) and
the line and column in the original YAML:
```python
import yamlrocks

schema = {
    "type": "object",
    "required": ["name", "port"],
    "properties": {
        "name": {"type": "string", "minLength": 1},
        "port": {"type": "integer", "minimum": 1, "maximum": 65535},
    },
    "additionalProperties": False,
}

source = """name: app
port: 70000
"""

yamlrocks.loads(source, schema=schema)
# yamlrocks.YAMLRocksDecodeError: schema validation failed: value 70000 is greater than
# maximum 65535 at $.port (line 2, column 7)
```

Nested objects

Schemas nest the same way your data does. A properties entry can itself be an
object schema with its own required and properties:
```python
import yamlrocks

schema = {
    "type": "object",
    "properties": {
        "server": {
            "type": "object",
            "required": ["host"],
            "properties": {
                "host": {"type": "string"},
                "port": {"type": "integer", "minimum": 1, "maximum": 65535},
            },
        },
    },
}

source = """server:
  host: db
  port: 5432
"""

yamlrocks.loads(source, schema=schema)
# {'server': {'host': 'db', 'port': 5432}}
```

A violation deep in the tree reports the full path to it:
```python
import yamlrocks

schema = {
    "type": "object",
    "properties": {
        "server": {
            "type": "object",
            "properties": {
                "port": {"type": "integer", "minimum": 1},
            },
        },
    },
}

source = """server:
  port: 0
"""

yamlrocks.loads(source, schema=schema)
# yamlrocks.YAMLRocksDecodeError: schema validation failed: value 0 is less than
# minimum 1 at $.server.port (line 2, column 9)
```

Arrays

Use items to validate every element of a sequence against one schema, and
minItems / maxItems to bound its length:
```python
import yamlrocks

schema = {
    "type": "array",
    "items": {"type": "integer", "minimum": 0},
    "minItems": 1,
    "maxItems": 3,
}

source = """- 1
- 2
"""

yamlrocks.loads(source, schema=schema)
# [1, 2]
```

When an element fails, the path uses array index notation ($[1]):
```python
import yamlrocks

schema = {"type": "array", "items": {"type": "integer", "minimum": 0}}

source = """- 1
- -5
"""

yamlrocks.loads(source, schema=schema)
# yamlrocks.YAMLRocksDecodeError: schema validation failed: value -5 is less than
# minimum 0 at $[1] (line 2, column 3)
```

Enums and constants

enum restricts a value to a fixed set; const pins it to exactly one value:
```python
import yamlrocks

schema = {
    "type": "object",
    "properties": {
        "level": {"enum": ["debug", "info", "warning", "error"]},
        "version": {"const": 1},
    },
}

source = """level: info
version: 1
"""

yamlrocks.loads(source, schema=schema)
# {'level': 'info', 'version': 1}
```

A value outside the enum is rejected at its exact location:
```python
import yamlrocks

schema = {
    "type": "object",
    "properties": {"level": {"enum": ["debug", "info", "warning", "error"]}},
}

yamlrocks.loads(b"level: verbose\n", schema=schema)
# yamlrocks.YAMLRocksDecodeError: schema validation failed: value is not one of the
# allowed enum values at $.level (line 1, column 8)
```

Combinators

allOf, anyOf, oneOf, and not compose smaller schemas. A common pattern is
“this field is either a string or an integer”:
```python
import yamlrocks

schema = {
    "type": "object",
    "properties": {
        "id": {"anyOf": [{"type": "string"}, {"type": "integer"}]},
    },
}

yamlrocks.loads(b"id: 7\n", schema=schema)  # {'id': 7}
yamlrocks.loads(b"id: abc123\n", schema=schema)  # {'id': 'abc123'}
```

If the value matches none of the branches, validation fails:
```python
import yamlrocks

schema = {"anyOf": [{"type": "string"}, {"type": "integer"}]}

yamlrocks.loads(b"3.14", schema=schema)
# yamlrocks.YAMLRocksDecodeError: schema validation failed: value does not match any of
# the anyOf schemas at $ (line 1, column 1)
```

Supported keywords

YAMLRocks implements a practical, draft-7-ish subset of JSON Schema, enough to express the constraints configuration files actually need, without pulling in a full validator. The supported keywords are:

| Group | Keywords |
|---|---|
| Types | type (null, boolean, integer, number, string, array, object) |
| Values | enum, const |
| Objects | properties, required, additionalProperties (boolean or schema) |
| Arrays | items, minItems, maxItems |
| Numbers | minimum, maximum, exclusiveMinimum, exclusiveMaximum |
| Strings | minLength, maxLength |
| Combinators | allOf, anyOf, oneOf, not |
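
When adapting a schema written for a full validator, it can help to list the keywords that fall outside this table before relying on it. The following is a minimal sketch of such a check, not part of YAMLRocks: unsupported_keywords is a hypothetical helper, and the SUPPORTED set is simply copied from the table above. This page does not say how unlisted keywords are treated, so the helper only reports them; annotation keywords such as $schema or title are reported too.

```python
# Keywords copied from the table above.
SUPPORTED = {
    "type", "enum", "const",
    "properties", "required", "additionalProperties",
    "items", "minItems", "maxItems",
    "minimum", "maximum", "exclusiveMinimum", "exclusiveMaximum",
    "minLength", "maxLength",
    "allOf", "anyOf", "oneOf", "not",
}


def unsupported_keywords(schema, path="$"):
    """Return schema paths of keywords that are not in SUPPORTED."""
    found = []
    if not isinstance(schema, dict):
        return found  # e.g. additionalProperties: False
    for key, value in schema.items():
        if key not in SUPPORTED:
            found.append(f"{path}.{key}")
        elif key == "properties" and isinstance(value, dict):
            for name, sub in value.items():
                found += unsupported_keywords(sub, f"{path}.properties.{name}")
        elif key in ("items", "additionalProperties", "not"):
            found += unsupported_keywords(value, f"{path}.{key}")
        elif key in ("allOf", "anyOf", "oneOf") and isinstance(value, list):
            for index, sub in enumerate(value):
                found += unsupported_keywords(sub, f"{path}.{key}[{index}]")
    return found


schema = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "type": "object",
    "properties": {
        "name": {"type": "string", "pattern": "^[a-z]+$"},
        "ports": {"type": "array", "items": {"type": "integer", "multipleOf": 1}},
    },
}

report = unsupported_keywords(schema)
# ['$.$schema', '$.properties.name.pattern', '$.properties.ports.items.multipleOf']
```

The paths in the report are locations inside the schema itself, not in any YAML document.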
Known limits of the built-in validator
The validator is tuned for the scalar-and-shape constraints configuration files actually use. Three boundaries are worth knowing, and all three are reasons to reach for a dedicated JSON Schema library when you need them:

- enum and const compare scalars. They are reliable for strings, numbers, booleans, and null. Using them to pin an array or object value is not supported and may reject an otherwise-matching value, so do not rely on structural const/enum.
- Object rules apply to scalar keys. properties, required, and additionalProperties match string keys. A YAML collection key ([a, b]: ...) is not a JSON object key and is not subject to these rules, so it neither satisfies required nor trips additionalProperties: false.
- The first error is reported. Validation stops at the first violation and raises it with its path, line, and column. It does not accumulate every problem in one pass, so fixing one error may reveal the next on the following run.
In-file schema references
Editors such as VS Code (through the yaml-language-server extension) let a document declare its own schema with a comment, conventionally on the first line:

```yaml
# yaml-language-server: $schema=https://example.com/config.schema.json
name: app
port: 8080
```

YAMLRocks recognizes this directive, but treats detecting it and acting on it as two separate steps, on purpose.
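
Because the directive is an ordinary comment, finding it needs no YAML parsing at all. The sketch below shows one way a scan of the leading comment block might work. It is not the library's schema_ref implementation: find_schema_ref and the DIRECTIVE pattern are hypothetical, and the handling of blank lines and of anything that is not a comment (including a --- marker) is an assumption of this sketch.

```python
import re

# Matches the yaml-language-server directive at the start of a comment line.
DIRECTIVE = re.compile(rb"^#\s*yaml-language-server:\s*\$schema=(\S+)")


def find_schema_ref(data: bytes):
    """Return the declared schema reference, or None if there is none."""
    for line in data.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # assumption: blank lines do not end the leading block
        if not stripped.startswith(b"#"):
            return None  # first non-comment line: stop without reading the body
        match = DIRECTIVE.match(stripped)
        if match:
            return match.group(1).decode("utf-8")
    return None


declared = find_schema_ref(
    b"# yaml-language-server: $schema=https://example.com/config.schema.json\nport: 8080\n"
)
# declared: 'https://example.com/config.schema.json'
undeclared = find_schema_ref(b"port: 8080\n")
# undeclared: None
```

The scan touches only bytes already in memory, which is what makes this kind of detection cheap.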
Detecting the reference
schema_ref reads the leading comment block and returns the declared reference,
or None if the document does not declare one. It only inspects comments at the
top of the file; it never parses the body and never performs any I/O, so it is
always cheap and safe to call:
```python
import yamlrocks

doc = b"# yaml-language-server: $schema=https://example.com/config.schema.json\nport: 8080\n"

yamlrocks.schema_ref(doc)
# 'https://example.com/config.schema.json'

yamlrocks.schema_ref(b"port: 8080\n")
# None
```

Validating against the declared schema

To validate against the in-file reference, pass schema="auto" together with a
schema_resolver, a callable that receives the reference string and returns a
schema dict (or None to decline). YAMLRocks detects the directive, calls
your resolver, and validates against whatever it returns. If there is no
directive, or the resolver returns None, validation is skipped and the parsed
value is returned as usual.
```python
import yamlrocks

# A real resolver might read from a local cache, a bundled file, or an
# allow-listed fetch. Here we just map known references to schemas.
SCHEMAS = {
    "https://example.com/config.schema.json": {
        "type": "object",
        "required": ["name", "port"],
        "properties": {
            "name": {"type": "string"},
            "port": {"type": "integer", "minimum": 1, "maximum": 65535},
        },
    },
}


def resolve(ref):
    return SCHEMAS.get(ref)


doc = b"# yaml-language-server: $schema=https://example.com/config.schema.json\nname: app\nport: 8080\n"

yamlrocks.loads(doc, schema="auto", schema_resolver=resolve)
# {'name': 'app', 'port': 8080}
```

A document that declares a schema and violates it fails exactly like the
explicit schema= path, with a line-accurate error:
```python
import yamlrocks

SCHEMAS = {
    "https://example.com/config.schema.json": {
        "type": "object",
        "properties": {"port": {"type": "integer"}},
    },
}

doc = b"# yaml-language-server: $schema=https://example.com/config.schema.json\nport: not-a-number\n"

yamlrocks.loads(doc, schema="auto", schema_resolver=SCHEMAS.get)
# yamlrocks.YAMLRocksDecodeError: schema validation failed: expected type integer,
# found string at $.port (line 2, column 7)
```

This keeps the network decision where it belongs: in your hands. A resolver can consult a local cache, load a schema bundled with your application, or perform a fetch restricted to hosts you trust.
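
As one possible shape for such a resolver, the sketch below serves schemas from a directory bundled with the application and declines any reference whose host is not on an allow-list. It performs no network access. make_resolver, the directory layout, and the choice to match on the final URL path segment are assumptions made for illustration; only the contract (take a reference string, return a schema dict or None) comes from this page.

```python
import json
import tempfile
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse


def make_resolver(schema_dir, allowed_hosts):
    """Build a resolver that serves schemas bundled under schema_dir."""
    root = Path(schema_dir)

    def resolve(ref):
        url = urlparse(ref)
        if url.scheme != "https" or url.hostname not in allowed_hosts:
            return None  # decline: not an origin we trust
        # Use only the final path segment, so a crafted reference
        # cannot point outside schema_dir.
        candidate = root / PurePosixPath(url.path).name
        if not candidate.is_file():
            return None  # decline: nothing bundled under that name
        with candidate.open(encoding="utf-8") as handle:
            return json.load(handle)

    return resolve


# Demo with a throwaway directory standing in for bundled schemas.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "config.schema.json").write_text('{"type": "object"}', encoding="utf-8")
    resolve = make_resolver(tmp, {"example.com"})
    bundled = resolve("https://example.com/config.schema.json")
    untrusted = resolve("https://evil.example.net/config.schema.json")
    missing = resolve("https://example.com/other.schema.json")
# bundled: {'type': 'object'}; untrusted and missing: None
```

A resolver built this way would be passed as schema_resolver alongside schema="auto"; returning None for anything unrecognized means an unknown reference skips validation rather than triggering a fetch.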
See also
- Loading YAML: the parsing entry points schema= plugs into.
- Exceptions: the full YAMLRocksDecodeError model.
- Annotated mode: keep source locations on every node.
- API reference and options.