Real-world verification

YAMLRocks is tested against YAML that people actually write, review, and keep in version control. The real-world corpus pulls public configuration repositories in as pinned git submodules and runs them through the same parser and round-trip emitter users get from the Python package.

The promise is deliberately narrow and testable: every standalone YAML file in the corpus must parse and re-emit byte-for-byte in OPT_ROUND_TRIP mode. For Home Assistant, selected repositories are also loaded through configuration.yaml with native !include resolution enabled, and the test then checks that every unmodified source file writes back byte-for-byte.

The corpus currently spans 95 public repositories across 25 ecosystems and roughly 22,700 YAML files. Each submodule is pinned to a specific commit by this repository, so failures are reproducible until the corpus is deliberately refreshed. The Files column counts *.yaml and *.yml files after excluding Helm chart templates.

| Ecosystem | Repos | Files | Sources |
| --- | --- | --- | --- |
| Home Assistant | 15 | 1756 | Bahnburner/Home-Assistant-Config, DubhAd/Home-AssistantConfig, arsaboo/homeassistant-config, bachya/smart-home, basnijholt/home-assistant-config, benct/home-assistant-config, bieniu/home-assistant-config, dshokouhi/Home-AssistantConfig, frenck/home-assistant-config, hmmbob/HomeAssistantConfig, jcallaghan/home-assistant-config, nagyrobi/home-assistant-configuration-examples, renemarc/home-assistant-config, shortbloke/home_assistant_config, thomasloven/hass-config |
| Ansible | 7 | 674 | dev-sec/ansible-collection-hardening, geerlingguy/ansible-for-devops, geerlingguy/ansible-role-docker, geerlingguy/ansible-role-mysql, geerlingguy/ansible-role-nginx, geerlingguy/mac-dev-playbook, prometheus-community/ansible |
| ESPHome | 7 | 3264 | AlexMekkering/esphome-config, athom-tech/esp32-configs, esphome/esphome, esphome/firmware, jesserockz/esphome-configs, landonr/lilygo-tdisplays3-esphome, nrandell/esphome |
| Kubernetes | 6 | 1175 | GoogleCloudPlatform/microservices-demo, dockersamples/example-voting-app, kelseyhightower/kubernetes-the-hard-way, kubernetes-sigs/kubespray, kubernetes-sigs/kustomize, kubernetes/examples |
| Docker Compose | 5 | 493 | Haxxnet/Compose-Examples, compose-spec/compose-spec, docker/awesome-compose, docker/compose, vegasbrianc/prometheus |
| GitOps | 5 | 569 | argoproj/argocd-example-apps, fluxcd/flux2, fluxcd/flux2-kustomize-helm-example, rancher/fleet-examples, stefanprodan/podinfo |
| CircleCI | 4 | 91 | CircleCI-Public/circleci-demo-go, CircleCI-Public/circleci-demo-javascript-express, CircleCI-Public/circleci-demo-python-django, circleci/circleci-docs |
| Helm | 4 | 2002 | grafana/helm-charts, helm/helm, jenkinsci/helm-charts, prometheus-community/helm-charts |
| OpenAPI | 4 | 112 | OAI/OpenAPI-Specification, readmeio/oas-examples, stripe/openapi, swagger-api/swagger-petstore |
| Prometheus | 4 | 713 | prometheus-operator/kube-prometheus, prometheus-operator/prometheus-operator, prometheus/alertmanager, prometheus/prometheus |
| Argo | 3 | 3398 | argoproj/argo-cd, argoproj/argo-rollouts, argoproj/argo-workflows |
| CloudFormation | 3 | 397 | aws-cloudformation/aws-cloudformation-templates, awslabs/aws-cloudformation-templates, widdix/aws-cf-templates |
| dbt | 3 | 134 | dbt-labs/dbt-core, dbt-labs/dbt-utils, dbt-labs/jaffle-shop-classic |
| GitHub Actions | 3 | 277 | actions/setup-node, actions/starter-workflows, actions/toolkit |
| OpenTelemetry | 3 | 3526 | open-telemetry/opentelemetry-collector, open-telemetry/opentelemetry-collector-contrib, open-telemetry/opentelemetry-demo |
| Tekton | 3 | 1787 | tektoncd/catalog, tektoncd/pipeline, tektoncd/triggers |
| Azure Pipelines | 2 | 50 | MicrosoftDocs/pipelines-java, microsoft/azure-pipelines-yaml |
| Cloud Foundry | 2 | 354 | cloudfoundry/bosh-deployment, cloudfoundry/cf-deployment |
| Concourse | 2 | 234 | concourse/concourse, concourse/concourse-docker |
| Crossplane | 2 | 371 | crossplane/crossplane, upbound/platform-ref-aws |
| MkDocs | 2 | 31 | mkdocstrings/mkdocstrings, squidfunk/mkdocs-material |
| Serverless | 2 | 816 | aws-samples/serverless-patterns, serverless/examples |
| Woodpecker | 2 | 130 | woodpecker-ci/plugin-git, woodpecker-ci/woodpecker |
| cloud-init | 1 | 262 | canonical/cloud-init |
| goss | 1 | 95 | goss-org/goss |

See the test-suite notes in tests/realworld/README.md for the exact layout and update workflow.

Every *.yaml and *.yml file discovered in the checked-out corpus is parsed in round-trip mode and immediately emitted again. The emitted bytes must exactly match the original file.
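The check described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's test harness: `find_yaml_files` and `round_trip_failures` are hypothetical helper names, and the `round_trip` callable stands in for whatever parse-and-emit call the package exposes in OPT_ROUND_TRIP mode.

```python
from pathlib import Path
from typing import Callable, Iterable

YAML_SUFFIXES = {".yaml", ".yml"}


def find_yaml_files(root: Path) -> list[Path]:
    """Return every *.yaml / *.yml file under root, in a stable order."""
    return sorted(
        p for p in root.rglob("*") if p.is_file() and p.suffix in YAML_SUFFIXES
    )


def round_trip_failures(
    files: Iterable[Path], round_trip: Callable[[bytes], bytes]
) -> list[Path]:
    """Return the files whose re-emitted bytes differ from the original bytes.

    `round_trip` is an assumption: any callable that parses the input and
    emits it again, returning the emitted bytes.
    """
    failures = []
    for path in files:
        original = path.read_bytes()
        # Byte-for-byte comparison: any change to comments, quoting,
        # indentation, or trailing newlines counts as a failure.
        if round_trip(original) != original:
            failures.append(path)
    return failures
```

Comparing raw bytes rather than parsed values is what makes the check strict: two documents that load to the same data still fail if a comment or scalar style was lost on the way out.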

Custom tags such as !include, !secret, !vault, and other application tags are preserved in this mode rather than resolved, so the test checks whether the file is valid YAML and whether YAMLRocks can keep its comments, anchors, scalar styles, layout, and tags intact.

Home Assistant configurations get an additional test because split configuration is one of YAMLRocks’s core use cases. Repositories with a configuration.yaml are loaded with OPT_INCLUDES | OPT_ROUND_TRIP, which resolves their !include tree. The test then verifies two write-back properties:

  • the root document re-emits with its include directives restored;
  • every resolved source file re-emits byte-for-byte when it was not modified.

Some public Home Assistant repositories reference files that were intentionally not committed, such as secrets or generated credentials. Those include graphs are marked as strict expected failures, so they cannot hide a parser regression.
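To see why an include graph can fail to load, it helps to look for !include targets that are absent from the checkout. The sketch below is a rough, regex-based scan and not how YAMLRocks resolves includes: `missing_includes` is a hypothetical helper, it only follows plain `!include <file>` references (not the `!include_dir_*` variants), and it assumes paths are relative to the file that contains the directive.

```python
import re
from pathlib import Path

# Matches `!include some/file.yaml`; the required whitespace after the tag
# keeps `!include_dir_*` variants from matching.
INCLUDE_RE = re.compile(r"!include\s+(\S+)")


def missing_includes(root_file: Path) -> list[Path]:
    """Walk plain !include references from root_file and list absent targets."""
    seen: set[Path] = set()
    missing: list[Path] = []
    stack = [root_file.resolve()]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guard against include cycles and repeated targets
        seen.add(current)
        if not current.is_file():
            missing.append(current)
            continue
        text = current.read_text(encoding="utf-8")
        for match in INCLUDE_RE.finditer(text):
            name = match.group(1).strip("'\"")
            # Assumption: include paths are relative to the including file.
            stack.append((current.parent / name).resolve())
    return sorted(missing)
```

A repository whose scan returns a non-empty list is a candidate for the expected-failure treatment, since the missing files were never published.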

The corpus proves that YAMLRocks handles the YAML shapes in these public repositories. It does not claim to replace each ecosystem’s own validation, rendering, or runtime semantics.

| Ecosystem | Verified | Not claimed |
| --- | --- | --- |
| Home Assistant | Standalone files round-trip; selected include graphs load and write back byte-for-byte. | Validation of Home Assistant integrations or unavailable private secrets. |
| ESPHome | Device configuration YAML parses and preserves application tags. | Execution of ESPHome substitutions, code generation, or !lambda semantics. |
| Ansible | Playbooks, roles, inventories, and custom tags such as !vault are preserved. | Replacement of Ansible’s loader, inventory plugins, or task execution semantics. |
| Kubernetes | Public manifests and examples parse and round-trip byte-for-byte. | Kubernetes API schema validation or kubectl behavior. |
| Docker Compose | Compose examples parse and preserve layout. | Compose model validation or container runtime behavior. |
| CloudFormation | Template YAML parses and round-trips. | AWS resource validation or CloudFormation intrinsic execution. |
| GitOps | Argo CD, Flux, Fleet, and related GitOps examples parse and round-trip. | Controller reconciliation or rendered cluster state. |
| Helm | Non-template YAML in charts and examples parses and round-trips. | Raw Go template files under chart templates/, which are not standalone YAML until Helm renders them. |
| OpenAPI | Specification and example YAML parses and round-trips. | OpenAPI semantic validation. |
| dbt | Project and package YAML parses and round-trips. | dbt model compilation or project validation. |
| CircleCI | Pipeline configuration YAML parses and round-trips. | CircleCI configuration validation or job execution. |
| GitHub Actions | Workflow examples parse and round-trip. | GitHub Actions workflow validation or runner behavior. |
| Serverless | Framework examples parse and round-trip. | Provider-specific deployment validation. |
| Tekton | Task and pipeline YAML parses and round-trips. | Kubernetes admission or Tekton controller behavior. |
| Prometheus | Prometheus, Alertmanager, and operator configuration YAML parses and round-trips. | Prometheus rule validation, query validation, or operator behavior. |
| Argo | Argo CD, Argo Workflows, and Argo Rollouts YAML parses and round-trips. | Argo controller behavior or workflow execution. |
| OpenTelemetry | Collector, contrib, and demo configuration YAML parses and round-trips. | Collector component validation or telemetry processing behavior. |
| Azure Pipelines | Pipeline example YAML parses and round-trips. | Azure DevOps pipeline validation or job execution. |
| Cloud Foundry | BOSH and Cloud Foundry deployment YAML parses and round-trips. | BOSH deployment semantics or Cloud Foundry runtime behavior. |
| Concourse | Concourse pipeline and deployment YAML parses and round-trips. | Concourse pipeline validation or worker behavior. |
| Crossplane | Crossplane package and platform YAML parses and round-trips. | Crossplane schema validation, composition rendering, or controller behavior. |
| MkDocs | MkDocs project configuration YAML parses and round-trips. | MkDocs plugin loading or site build behavior. |
| Woodpecker | Woodpecker pipeline and plugin YAML parses and round-trips. | Woodpecker CI validation or job execution. |
| cloud-init | cloud-init YAML examples parse and round-trip. | cloud-init module validation or boot-time behavior. |
| goss | goss YAML tests parse and round-trip. | goss assertion execution. |

Fetch the corpus once, then run the real-world category:

git submodule update --init
uv run pytest tests/realworld -m realworld

Run a single ecosystem by filtering the test ids:

uv run pytest tests/realworld -k ansible
uv run pytest tests/realworld -k kubernetes

The category auto-skips when the submodules are not checked out, so everyday contributors can still run the normal test suite without downloading the full corpus.
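One way such an auto-skip can work is to test whether a submodule directory has any content, because an uninitialized git submodule is left as an empty directory in the working tree. The helper below is a sketch under that assumption; `submodule_checked_out` is a hypothetical name, and the real harness may detect the corpus differently.

```python
from pathlib import Path


def submodule_checked_out(path: Path) -> bool:
    """Return True when the submodule directory exists and is non-empty.

    After a clone without `git submodule update --init`, each submodule
    path is an empty directory, so "has at least one entry" is a cheap
    proxy for "has been fetched".
    """
    return path.is_dir() and any(path.iterdir())
```

A test or fixture could call `pytest.skip(...)` when this returns False, which keeps the normal suite green on a checkout that never fetched the corpus.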

A small number of third-party files use a .yaml or .yml extension but are not valid standalone YAML, usually because they are templates that another tool must render first. These files are recorded as strict expected failures in the test harness. If YAMLRocks ever starts accepting one unexpectedly, the suite fails so the behavior change is visible.
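The "strict" part is what makes an unexpected success visible. The function below spells out that decision table in plain Python; it mirrors how pytest treats a test marked `xfail(strict=True)`, though whether the harness uses that exact marker is an assumption, and `strict_outcome` is a hypothetical name used only for illustration.

```python
def strict_outcome(parsed_ok: bool, expected_failure: bool) -> str:
    """Classify one file under strict expected-failure semantics."""
    if expected_failure:
        # A known-bad file that suddenly parses is reported as a failure,
        # so a change in parser behavior cannot slip through unnoticed.
        return "failed (unexpected pass)" if parsed_ok else "xfailed"
    return "passed" if parsed_ok else "failed"
```

With a non-strict expected failure, the unexpected pass would only be reported as informational and the suite would stay green, which is the gap the strict mode closes.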

Helm chart templates are excluded for the same reason: files under a chart’s templates/ directory are Go text/template source and are not YAML documents until Helm renders them.
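The exclusion can be pictured as a path filter. The sketch below treats a file as a Helm template when it sits under a templates/ directory whose parent contains a Chart.yaml; that detection rule and the name `is_helm_template` are assumptions made for illustration, and the harness may apply a different rule.

```python
from pathlib import Path


def is_helm_template(path: Path, corpus_root: Path) -> bool:
    """Return True if path is under <chart>/templates/ for some chart.

    A directory counts as a chart here when it contains Chart.yaml.
    The search stops at corpus_root so nothing outside the corpus is read.
    """
    for parent in path.parents:
        if parent.name == "templates" and (parent.parent / "Chart.yaml").is_file():
            return True
        if parent == corpus_root:
            break
    return False
```

Requiring a sibling Chart.yaml keeps unrelated directories that happen to be named templates/ in the corpus, while a chart's values.yaml and Chart.yaml remain ordinary round-trip candidates.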