Schema versions: manifest evolution from v6 to v12, backward compatibility
dbt-core ships a major release (a new 1.x version) roughly every six months, and each one can bring a schema migration. The manifest.json schema has evolved from v1 (dbt 0.x) to v12 (1.8 through 1.11), and v13 is planned for 1.12. For a senior engineer it is critical to know which fields appeared or disappeared in which version, in order to keep tools backward-compatible or to plan a migration.
This lesson covers the full timeline of schema versions, the key changes between versions, migration patterns, and how to use schemas.getdbt.com for validation.
Model versions: v1/v2, latest_version, deprecation_date (dbt II)
schemas.getdbt.com
dbt Labs publishes a formal JSON Schema for each artifact version:
https://schemas.getdbt.com/dbt/manifest/v12.json
https://schemas.getdbt.com/dbt/run-results/v6.json
https://schemas.getdbt.com/dbt/catalog/v1.json
https://schemas.getdbt.com/dbt/sources/v3.json
https://schemas.getdbt.com/dbt/semantic_manifest/v1.json
The metadata.dbt_schema_version field of every artifact holds this URL:
{
"metadata": {
"dbt_schema_version": "https://schemas.getdbt.com/dbt/manifest/v12.json",
"dbt_version": "1.11.0"
}
}
Use cases:
- Validation: a tool reads the schema URL, fetches the JSON Schema, and validates the manifest.
- Version detection: parse the URL -> extract v12 -> branch the logic.
- Documentation: the schema file is documentation in itself: fields, types, required/optional, descriptions.
- Code generation: pydantic/TypeScript types from the schema.
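import json
import requests
from jsonschema import validate
# Load the manifest and read the schema URL it declares
with open('target/manifest.json') as f:
    manifest = json.load(f)
schema_url = manifest['metadata']['dbt_schema_version']
# Fetch the published JSON Schema; fail on HTTP errors instead of parsing an error page
response = requests.get(schema_url, timeout=30)
response.raise_for_status()
schema = response.json()
# Raises jsonschema.ValidationError if the manifest does not conform
validate(instance=manifest, schema=schema)
print("Manifest is valid for", schema_url)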
Cache schema files locally. `schemas.getdbt.com` rate-limits, and the files themselves are static. For CI/CD, pre-fetch them and commit them to the repo.
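The caching advice above can be sketched as a small on-disk cache. This is a minimal sketch, not part of any dbt tooling: `load_schema_cached`, the cache directory layout, and the injected `fetch` callable are hypothetical names chosen for illustration.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable


def load_schema_cached(url: str, cache_dir: Path, fetch: Callable[[str], dict]) -> dict:
    """Return the JSON Schema for `url`, reading from `cache_dir` when possible."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Hash the URL so the cache file name is filesystem-safe.
    cache_file = cache_dir / (hashlib.sha256(url.encode()).hexdigest() + ".json")
    if cache_file.exists():
        return json.loads(cache_file.read_text(encoding="utf-8"))
    schema = fetch(url)  # only reached on a cache miss
    cache_file.write_text(json.dumps(schema), encoding="utf-8")
    return schema
```

In real use, `fetch` could be a thin wrapper around `requests.get(url, timeout=30).json()`. Injecting it keeps the cache testable offline, and the cache directory can be committed to the repo for CI.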
Timeline schema versions
manifest schema | dbt-core version | release date | major changes
----------------|------------------|--------------|----------------
v1              | 0.19.x           | 2021-01      | initial
v2              | 0.20.x           | 2021-07      | refinements
v3              | 0.21.x           | 2021-10      | resource_type enum
v4              | 1.0.x            | 2021-12      | 1.0 GA, generic tests
v5              | 1.1.x            | 2022-04      | small additions
v6              | 1.2.x            | 2022-07      | grants config
v7              | 1.3.x            | 2022-10      | Python models, metrics spec revision
v8              | 1.4.x            | 2023-01      | snapshots cleanup
v9              | 1.5.x            | 2023-04      | versioned models, model contracts
v10             | 1.6.x            | 2023-07      | semantic_models, saved_queries (MetricFlow)
v11             | 1.7.x            | 2023-11      | unit_tests in nodes, constraints expansion
v12             | 1.8.x - 1.11.x   | 2024-05+     | unit_tests top-level, microbatch
v13             | 1.12.x           | 2026-04      | semantic layer v2 YAML, planned
In every version backward compatibility is partial: some fields are added safely, others change semantics.
v6 -> v7 (dbt 1.2 -> 1.3)
Major additions: Python models (language: 'python') and semantic metrics v1.
Added in v7
{
"language": "python",
"raw_code": "def model(dbt, session): ...",
"config": {
"packages": ["pandas", "numpy"]
}
}
The new language: 'python' field distinguishes Python models from SQL models. Tools should check it:
# v6 nodes carry no language field, so default to 'sql'
language = node.get('language', 'sql')
if language == 'python':
    pass  # Python model logic
elif language == 'sql':
    pass  # SQL model logic
In v6 nodes had no language field: every model was SQL.
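Semantic metrics v1
{
  "metrics": {
    "metric.jaffle_shop.revenue": {
      "name": "revenue",
      "model": "ref('fct_orders')",
      "calculation_method": "sum",
      "expression": "amount",
      "timestamp": "order_date",
      "time_grains": ["day", "week", "month"]
    }
  }
}
The first generation of the dbt semantic layer (the predecessor of MetricFlow). Metrics first appeared as an experimental feature in dbt 1.0; in 1.3 (v7) the spec was revised, renaming type to calculation_method and sql to expression. From v10 on it is deprecated in favor of semantic_models + saved_queries.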
v8 -> v9 (dbt 1.4 -> 1.5)
Major addition: Versioned models + model contracts.
Versioned models
In v9 refs became dicts (instead of lists of strings):
// v8
"refs": [["fct_orders"], ["dim_customers"]]
// v9
"refs": [
{"name": "fct_orders", "package": null, "version": null},
{"name": "dim_customers", "package": null, "version": 2}
]
unique_id also changed for versioned models:
model.jaffle_shop.fct_orders -> unversioned
model.jaffle_shop.fct_orders.v1 -> version 1
model.jaffle_shop.fct_orders.v2 -> version 2
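A tool that groups versions of the same model needs to split these identifiers back into parts. The following is a minimal sketch under the assumption that package and model names contain no dots; `ModelId` and `split_model_id` are hypothetical helpers, not dbt APIs.

```python
import re
from typing import NamedTuple, Optional


class ModelId(NamedTuple):
    package: str
    name: str
    version: Optional[str]  # None for unversioned models


# Matches the unique_id shapes listed above; assumes names contain no dots.
_MODEL_ID = re.compile(
    r"^model\.(?P<package>[^.]+)\.(?P<name>[^.]+)(?:\.v(?P<version>.+))?$"
)


def split_model_id(unique_id: str) -> ModelId:
    """Split a model unique_id into package, name, and optional version."""
    match = _MODEL_ID.match(unique_id)
    if match is None:
        raise ValueError(f"Not a model unique_id: {unique_id}")
    return ModelId(match["package"], match["name"], match["version"])
```

The version is kept as a string because nothing in the identifier says whether it was declared as a number or as text.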
Model contracts
{
"config": {
"contract": {
"enforced": true
}
},
"columns": {
"order_id": {
"data_type": "BIGINT",
"constraints": [
{"type": "not_null"},
{"type": "primary_key"}
]
}
}
}
contract.enforced: true means that dbt validates column types and constraints in the warehouse at materialization time.
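An enforced contract is only useful when every column declares a data_type, so a linting tool might flag the gaps. This is a rough sketch that reads the config.contract.enforced and columns fields shown above; `columns_missing_data_type` is a hypothetical helper name.

```python
from typing import Dict, List


def columns_missing_data_type(manifest: dict) -> Dict[str, List[str]]:
    """Map unique_id -> columns without data_type, for contract-enforced nodes only."""
    problems: Dict[str, List[str]] = {}
    for unique_id, node in manifest.get("nodes", {}).items():
        # Tolerate older manifests where config or contract is absent or null.
        contract = (node.get("config") or {}).get("contract") or {}
        if not contract.get("enforced", False):
            continue
        missing = [
            name
            for name, column in (node.get("columns") or {}).items()
            if not column.get("data_type")
        ]
        if missing:
            problems[unique_id] = missing
    return problems
```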
Access modifiers
{
"config": {
"access": "protected",
"group": "finance"
}
}
access: public / private / protected. It is used for cross-project ref governance.
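One governance check that follows from these fields: a private model should only be referenced from inside its own group. Below is a minimal sketch, assuming access and group sit under config as in the example above and that parents are listed in each node's depends_on.nodes; `private_access_violations` is a hypothetical name.

```python
from typing import List, Tuple


def private_access_violations(manifest: dict) -> List[Tuple[str, str]]:
    """Return (consumer_id, parent_id) pairs where a private model is used outside its group."""
    nodes = manifest.get("nodes", {})
    violations = []
    for unique_id, node in nodes.items():
        consumer_group = (node.get("config") or {}).get("group")
        for parent_id in (node.get("depends_on") or {}).get("nodes", []):
            parent = nodes.get(parent_id)
            if parent is None:
                continue  # e.g. a source, which is stored outside "nodes"
            parent_config = parent.get("config") or {}
            if (
                parent_config.get("access") == "private"
                and parent_config.get("group") != consumer_group
            ):
                violations.append((unique_id, parent_id))
    return violations
```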
v8 -> v9 is the most breaking change. Tools written for v8 that do not handle dict refs break on v9. Before 1.5 dbt wrote refs as arrays of arrays (a very legacy form). Tools should handle all three forms: a bare string, a list, and a dict.
v9 -> v10 (dbt 1.5 -> 1.6)
Major addition: MetricFlow integration.
semantic_models и saved_queries
{
"semantic_models": {
"semantic_model.jaffle_shop.orders": {
"name": "orders",
"node_relation": {
"alias": "fct_orders",
"schema_name": "marts",
"database": "jaffle_shop"
},
"entities": [...],
"dimensions": [...],
"measures": [...]
}
},
"saved_queries": {
"saved_query.jaffle_shop.weekly_revenue": {
"name": "weekly_revenue",
"query_params": {
"metrics": ["revenue"],
"group_by": ["TimeDimension('order__order_date', 'WEEK')"]
}
}
}
}
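Lineage and catalog tools usually want the physical relation behind each semantic model. As an illustrative sketch, the node_relation fields from the example above can be joined into a dotted name; `semantic_model_relations` is a hypothetical helper and the field names are taken from that example only.

```python
from typing import Dict


def semantic_model_relations(manifest: dict) -> Dict[str, str]:
    """Map each semantic model's unique_id to 'database.schema.alias'."""
    relations = {}
    for unique_id, semantic_model in manifest.get("semantic_models", {}).items():
        relation = semantic_model.get("node_relation") or {}
        parts = [
            relation.get("database"),
            relation.get("schema_name"),
            relation.get("alias"),
        ]
        # Skip parts that are missing (some warehouses have no database level).
        relations[unique_id] = ".".join(part for part in parts if part)
    return relations
```

Because the function uses `manifest.get("semantic_models", {})`, it returns an empty mapping on manifests older than v10 instead of raising.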
The legacy metrics field stayed for backward compatibility (deprecated since 1.6, to be removed completely in 1.12).
deprecation_date
{
"config": {
"deprecation_date": "2026-12-31"
}
}
For models, this is a warning that they will soon be retired (often related to versioned models).
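A CI check could list the models whose deprecation date has already passed. The sketch below assumes the date is stored under config.deprecation_date as in the example above and starts with a YYYY-MM-DD prefix; `past_deprecation` is a hypothetical name.

```python
from datetime import date
from typing import List


def past_deprecation(manifest: dict, today: date) -> List[str]:
    """Return unique_ids of nodes whose deprecation_date is on or before `today`."""
    expired = []
    for unique_id, node in manifest.get("nodes", {}).items():
        raw = (node.get("config") or {}).get("deprecation_date")
        if not raw:
            continue
        # Keep only the YYYY-MM-DD part in case a time component is present.
        deadline = date.fromisoformat(str(raw)[:10])
        if deadline <= today:
            expired.append(unique_id)
    return sorted(expired)
```

Passing `today` explicitly, rather than calling `date.today()` inside, keeps the check deterministic in tests.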
v10 -> v11 (dbt 1.6 -> 1.7)
Major addition: Unit tests (in nodes, with resource_type='unit_test').
Unit tests in nodes
{
"nodes": {
"unit_test.jaffle_shop.test_revenue_logic": {
"resource_type": "unit_test",
"name": "test_revenue_logic",
"model": "fct_revenue",
"given": [
{"input": "ref('stg_orders')", "rows": [...]}
],
"expect": {"rows": [...]}
}
}
}
In v11 they went into the main nodes dictionary with resource_type='unit_test'. This meant that tools filtering by resource_type had to handle a new type.
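A cheap way to notice a new resource type is to count what the manifest contains and compare it with what the tool handles. This is a sketch only: `count_resource_types` is a hypothetical helper, and the set of handled types is supplied by the caller rather than assumed here.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def count_resource_types(manifest: dict, handled: Iterable[str]) -> Tuple[Counter, List[str]]:
    """Count nodes per resource_type and list the types the tool does not handle."""
    counts = Counter(
        node.get("resource_type", "<missing>")
        for node in manifest.get("nodes", {}).values()
    )
    unknown = sorted(set(counts) - set(handled))
    return counts, unknown
```

A tool can log the `unknown` list as a warning and skip those nodes instead of crashing on them.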
Constraints expansion
In v11 column constraints were expanded with more types (foreign_key, check with an expression).
{
"columns": {
"customer_id": {
"constraints": [
{"type": "foreign_key", "expression": "REFERENCES dim_customers(id)"}
]
}
}
}
Group governance
Groups became more detailed:
{
"groups": {
"group.jaffle_shop.finance": {
"owner": {
"name": "Finance Team",
"email": "[email protected]",
"slack": "@finance"
}
}
}
}
v11 -> v12 (dbt 1.7 -> 1.8+)
Major additions: unit tests moved to the top level, microbatch incremental strategy.
Unit tests top-level
{
"nodes": {
// unit_tests removed from here
},
"unit_tests": {
"unit_test.jaffle_shop.test_revenue_logic": {
"resource_type": "unit_test",
...
}
}
}
A tool that filters nodes by resource_type='unit_test' gets an empty result on v12. Migration:
def get_unit_tests(manifest):
    schema_version = manifest['metadata']['dbt_schema_version']
    if schema_version.endswith('/v11.json'):
        # v11: unit tests live in the main nodes dictionary
        return [
            n for n in manifest['nodes'].values()
            if n.get('resource_type') == 'unit_test'
        ]
    elif schema_version.endswith('/v12.json'):
        # v12: unit tests have their own top-level key
        return list(manifest.get('unit_tests', {}).values())
    else:
        raise ValueError(f"Unsupported schema: {schema_version}")
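Microbatch incremental strategy
{
  "config": {
    "materialized": "incremental",
    "incremental_strategy": "microbatch",
    "event_time": "order_date",
    "batch_size": "day",
    "lookback": 3,
    "begin": "2024-01-01"
  }
}
A new incremental strategy for time-partitioned incremental processing. It is configured on an incremental model through incremental_strategy rather than as a separate materialization type.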
Saved queries: extension
{
"saved_queries": {
"saved_query.jaffle_shop.weekly_revenue": {
"exports": [
{
"name": "weekly_revenue_export",
"config": {
"export_as": "table",
"schema": "exports"
}
}
]
}
}
}
exports declares what to export and where.
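To audit what a project exports, a tool can flatten these entries into rows. This is a minimal sketch that relies only on the field names shown in the example above (name, config.export_as, config.schema); `list_exports` is a hypothetical helper.

```python
from typing import List, Optional, Tuple

ExportRow = Tuple[str, Optional[str], Optional[str], Optional[str]]


def list_exports(manifest: dict) -> List[ExportRow]:
    """Return (saved_query_id, export_name, export_as, schema) for every declared export."""
    rows = []
    for unique_id, saved_query in manifest.get("saved_queries", {}).items():
        # "exports" may be absent on saved queries written by older dbt versions.
        for export in saved_query.get("exports") or []:
            config = export.get("config") or {}
            rows.append(
                (unique_id, export.get("name"), config.get("export_as"), config.get("schema"))
            )
    return rows
```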
Legacy metrics: deprecated, removal planned
The legacy metrics field remains in the manifest for backward compatibility, but is not used in production. v13 (1.12) is planned to remove it completely.
v12 -> v13 (dbt 1.11 -> 1.12, planned)
Planned addition: Semantic Layer v2 YAML, dbt Cloud-native integrations.
Semantic Layer v2
New YAML spec (1.12):
# models/_semantic/_metrics.yml (v2)
metrics:
  - name: revenue
    label: "Revenue"
    type: simple
    measure: amount
    semantic_model: orders
    description: "Total revenue from orders"
    filter: "{{ Dimension('order__status') }} = 'completed'"
    fill_nulls_with: 0
In the manifest this will show up as new fields in semantic_models. The legacy v1 YAML is still supported but deprecated.
Other planned changes
- Improved error tracking (each node may carry a parsing_error field)
- Cross-project versioning enhancements
- dbt MCP server metadata in the manifest
v13 is planned for dbt 1.12 (Q2 2026). The exact changes may still shift. Production tools should subscribe to the dbt-core changelog and test compatibility early.
Backward compatibility approach
dbt-core's official policy:
- Forward-compat: new dbt versions read old manifests (for state:modified comparisons across versions).
- Backward-compat: old tools built for old schema versions may fail on a new manifest.
- Schema version bump only on breaking change: adding an optional field does not bump the version; removing or renaming a field does.
In practice:
- Tools should explicitly check dbt_schema_version.
- The dbt CLI writes the manifest in the current schema. If dbt 1.6 reads a v12 manifest from 1.11, it may work as long as only compatible fields are read, but full deserialization will fail.
- state:modified compares manifests with possibly different schemas; dbt handles the migration internally.
Migration patterns for tools
Pattern 1: branched logic per major version
import json
import re
import warnings

# _parse_v10, _parse_v11 and _parse_v12 are the tool's own per-version parsers (not shown)
def parse_manifest(path):
    with open(path) as f:
        manifest = json.load(f)
    schema_url = manifest['metadata']['dbt_schema_version']
    # Extract version: v10, v11, v12, ...
    match = re.search(r'/v(\d+)\.json$', schema_url)
    if not match:
        raise ValueError(f"Can't parse version from {schema_url}")
    version = int(match.group(1))
    if version <= 9:
        raise ValueError(f"Schema v{version} too old; minimum supported is v10")
    elif version == 10:
        return _parse_v10(manifest)
    elif version == 11:
        return _parse_v11(manifest)
    elif version == 12:
        return _parse_v12(manifest)
    else:
        # Forward compat (v13+): attempt v12 logic
        warnings.warn(f"Schema v{version} newer than tool supports; using v12 logic")
        return _parse_v12(manifest)
Pattern 2: feature detection
Instead of the schema version, check for the fields themselves:
def has_unit_tests_top_level(manifest):
    return 'unit_tests' in manifest

def get_unit_tests(manifest):
    if has_unit_tests_top_level(manifest):
        return list(manifest['unit_tests'].values())
    else:
        # Old schema: look in nodes
        return [
            n for n in manifest['nodes'].values()
            if n.get('resource_type') == 'unit_test'
        ]
Feature detection is robust against minor changes. A schema version check is explicit, but fragile across minor releases.
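The same idea extends to several features at once: probe the manifest a single time and let the rest of the tool branch on the result. The following is a minimal sketch; `detect_features` and the feature labels are hypothetical, and the probes rely only on the structures described in this lesson.

```python
from typing import Set


def detect_features(manifest: dict) -> Set[str]:
    """Return labels for the manifest features this document describes."""
    features = set()
    # Top-level keys that only exist in newer schemas.
    for key in ("unit_tests", "semantic_models", "saved_queries"):
        if key in manifest:
            features.add(key)
    # Inspect the first non-empty refs list to learn the ref shape.
    for node in manifest.get("nodes", {}).values():
        refs = node.get("refs") or []
        if refs:
            features.add("dict_refs" if isinstance(refs[0], dict) else "list_refs")
            break
    return features
```

A caller might then write `if "dict_refs" in features:` once, instead of repeating `isinstance` checks throughout the codebase.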
Pattern 3: pydantic models with migration
from typing import Optional, Union

from pydantic import BaseModel

class Ref(BaseModel):
    name: str
    package: Optional[str] = None
    # Versions can be numbers or strings in a manifest (e.g. 2 or "2")
    version: Optional[Union[int, float, str]] = None

    @classmethod
    def parse_from_manifest(cls, raw):
        # Handle three forms
        if isinstance(raw, str):
            return cls(name=raw)
        elif isinstance(raw, list):
            # Legacy list form: ["name"] or ["package", "name"]
            if len(raw) == 1:
                return cls(name=raw[0])
            return cls(package=raw[0], name=raw[1])
        elif isinstance(raw, dict):
            return cls(**raw)
        else:
            raise ValueError(f"Unknown ref format: {raw}")

# Usage: node is any entry from manifest['nodes']
for raw_ref in node['refs']:
    ref = Ref.parse_from_manifest(raw_ref)
    print(ref.name, ref.version)
Pydantic provides defaults and type safety; the migration logic lives in parse_from_manifest.
Pattern 4: dbt-core API (when stable)
import json

from dbt.contracts.graph.manifest import Manifest

with open('target/manifest.json') as f:
    manifest_dict = json.load(f)
manifest = Manifest.from_dict(manifest_dict)
# dbt-core handles schema migration internally
for node in manifest.nodes.values():
    print(node.unique_id, node.config.materialized)
The details stay hidden behind the API, but the API itself changes between dbt versions.
The dbt-core Python API is not considered stable for external tools (only for plugins). It may change between major versions. If you use it, pin the dbt-core version and test carefully on upgrade.
Validation workflow
Production-grade validation:
import json
import sys
from functools import lru_cache

import requests
from jsonschema.validators import validator_for

@lru_cache(maxsize=10)
def fetch_schema(url):
    """Cache schema fetches."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return response.json()

def validate_manifest(manifest_path):
    with open(manifest_path) as f:
        manifest = json.load(f)
    schema_url = manifest['metadata']['dbt_schema_version']
    schema = fetch_schema(schema_url)
    # Pick the validator class that matches the draft declared in the schema's $schema
    validator = validator_for(schema)(schema)
    errors = list(validator.iter_errors(manifest))
    if not errors:
        return {"valid": True, "schema": schema_url}
    return {
        "valid": False,
        "schema": schema_url,
        "errors": [
            {
                "path": list(e.absolute_path),
                "message": e.message
            }
            for e in errors[:10]  # first 10
        ]
    }

# Take the manifest path from the command line so the CI step below can pass it
path = sys.argv[1] if len(sys.argv) > 1 else 'target/manifest.json'
result = validate_manifest(path)
if not result['valid']:
    for err in result['errors']:
        print(f"  {' -> '.join(map(str, err['path']))}: {err['message']}")
    sys.exit(1)  # non-zero exit code fails the CI job
In CI:
- name: Validate manifest
  run: |
    dbt parse
    python scripts/validate_manifest.py target/manifest.json
Tool compatibility matrix
Popular tools and the schema versions they support:
Tool | min dbt | max dbt | notes
----------------------|---------|---------|---------------------------
dbt-osmosis | 1.5 | 1.10 | handles v9-v12
dbt-coverage | 1.4 | 1.10 | feature detection
Elementary | 1.5 | 1.12 | regular updates
re_data | 1.3 | 1.8 | sometimes lags
dbt-checkpoint | 1.5 | 1.10 | pre-commit hooks
dbt-meshify | 1.5 | 1.11 | mesh-specific tooling
Datafold | 1.5 | 1.12 | enterprise
Recce | 1.5 | 1.11 | PR review
Use feature detection and graceful degradation so that a tool does not break on minor dbt updates.
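When a tool declares a supported dbt range like the min/max columns above, the comparison has to be numeric: as plain strings, "1.10" sorts before "1.5". Here is a small sketch that compares the metadata.dbt_version value against such a range; `minor_version` and `within_supported_range` are hypothetical helper names.

```python
from typing import Tuple


def minor_version(version: str) -> Tuple[int, int]:
    """Turn '1.11.0' or '1.10' into a comparable (major, minor) tuple."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def within_supported_range(dbt_version: str, min_dbt: str, max_dbt: str) -> bool:
    """Check that dbt_version falls inside [min_dbt, max_dbt] at minor-version granularity."""
    return minor_version(min_dbt) <= minor_version(dbt_version) <= minor_version(max_dbt)
```

A tool could call `within_supported_range(manifest["metadata"]["dbt_version"], "1.5", "1.10")` at startup and emit a warning, rather than an error, when the check fails.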
Real-world migration story
In 2024 the dbt 1.8 release brought the v12 schema with top-level unit_tests. Tools that filtered nodes by resource_type='unit_test' started missing unit tests.
# v11-era code (broken on v12)
unit_tests_count = sum(
    1 for n in manifest['nodes'].values()
    if n.get('resource_type') == 'unit_test'
)
# v12: returns 0, because unit_tests moved out of nodes
Fix:
# Handle both v11 and v12
def get_unit_tests(manifest):
    if 'unit_tests' in manifest:  # v12+
        return manifest['unit_tests']
    # v11 and earlier
    return {
        uid: node
        for uid, node in manifest.get('nodes', {}).items()
        if node.get('resource_type') == 'unit_test'
    }
The lesson 04-parsing-manifest-python.mdx covers more patterns for robust parsing.
When the schema version matters
- CI/CD: pin the dbt version; no surprise upgrades.
- Tooling integrations: feature detection + validation.
- State comparison: state:modified across major versions can break.
- Custom adapters: the adapter API is tied to the dbt-core version.
- Multi-project mesh: all projects should preferably run the same major dbt version.
Key takeaways
- dbt-core schema versioning is formal, published at schemas.getdbt.com/dbt/manifest/v{N}.json.
- Timeline: v6 (1.2, 2022-07) -> v12 (1.8+, 2024+). v13 is planned for 1.12.
- Breaking changes: v8->v9 (refs became dicts, versioned models, contracts), v11->v12 (unit_tests top-level).
- Changes that are usually additive: Python models, semantic_models, microbatch, constraints expansion.
- Forward-compat: newer dbt-core versions read older manifests; the reverse does not hold.
- Tools should handle multiple versions: branched logic per major version, feature detection, or pydantic with migration.
- Validation: the jsonschema package + cached schema fetches from schemas.getdbt.com.
- The dbt-core Python API is not stable: external tools use it at their own risk.
- state:modified across versions is an internal dbt-core mechanism, not meant for external tools.
- Production tools should pin dbt versions, test against several of them, and log the schema version in their outputs.