Relation API: Relation.create() и quoting policy
Relation — это представление таблицы / view в warehouse на уровне Python. Когда dbt-core говорит «модель A зависит от модели B», в коде это Relation-объект для A и Relation-объект для B. Quoting, case-sensitivity, schema isolation — всё это политики, привязанные к Relation.
В этом уроке — полная анатомия Relation API: какие поля имеет, как создавать через Relation.create(), как настраивать quoting policy под warehouse.
Что такое Relation
Relation — это 3-part identifier: (database, schema, identifier). Каждая таблица или view имеет такое имя. Например:
analytics.marts.fct_ordersmy_warehouse.public.usersprod_db.snapshots.customer_snapshot
В коде dbt:
from dbt.adapters.base import BaseRelation
rel = BaseRelation.create(
database='analytics',
schema='marts',
identifier='fct_orders',
type='table',
)
print(rel) # analytics.marts.fct_orders
print(rel.database) # analytics
print(rel.schema) # marts
print(rel.identifier) # fct_orders
print(rel.type) # table
При rendering в SQL — Relation становится правильно-quoted identifier:
print(f'SELECT * FROM {rel}')
# SELECT * FROM "analytics"."marts"."fct_orders"
# (или analytics.marts.fct_orders — depends на quoting policy)
Структура BaseRelation
# dbt-adapters/dbt/adapters/base/relation.py (упрощённо)
@dataclass(frozen=True)
class BaseRelation:
path: Path # Path(database, schema, identifier)
type: Optional[RelationType] = None # 'table', 'view', 'cte', 'external'
quote_character: str = '"'
include_policy: Policy = field(default_factory=Policy)
quote_policy: Policy = field(default_factory=Policy)
dbt_created: bool = False
metadata: Optional[Dict] = None
Полей:
path— namedtuple(database, schema, identifier).type— RelationType:table,view,cte,external,materialized_view.quote_character— символ для quoting. Default"(ANSI). MySQL использует backtick`.include_policy— какие parts включать в rendered name (database/schema/identifier).quote_policy— какие parts quoting.dbt_created— boolean. True если dbt создал этот relation (в текущем run или раньше).
include_policy: что включать в name
Policy — namedtuple of booleans для каждого component:
@dataclass
class Policy:
database: bool = True
schema: bool = True
identifier: bool = True
include_policy определяет, какие parts появляются в rendered Relation:
rel = BaseRelation.create(
database='analytics',
schema='marts',
identifier='fct_orders',
include_policy=Policy(database=True, schema=True, identifier=True),
)
print(rel)
# analytics.marts.fct_orders
rel = BaseRelation.create(
database='analytics',
schema='marts',
identifier='fct_orders',
include_policy=Policy(database=False, schema=True, identifier=True),
)
print(rel)
# marts.fct_orders ← database не включён
rel = BaseRelation.create(
database='analytics',
schema='marts',
identifier='fct_orders',
include_policy=Policy(database=False, schema=False, identifier=True),
)
print(rel)
# fct_orders ← только identifier
Use case: некоторые warehouses не используют database tier. Например, MySQL только schema (DB) + table. BigQuery наоборот — DB (project) + dataset + table, все три обязательны.
Defaults differ between adapters:
# dbt-bigquery (project.dataset.table)
class BigQueryRelation(BaseRelation):
include_policy: Policy = field(
default_factory=lambda: Policy(database=True, schema=True, identifier=True)
)
# dbt-mysql (no database tier, just schema.table)
class MySQLRelation(BaseRelation):
include_policy: Policy = field(
default_factory=lambda: Policy(database=False, schema=True, identifier=True)
)
quote_policy: что quote
quote_policy определяет, какие parts оборачивать в quotes при rendering:
rel = BaseRelation.create(
database='analytics',
schema='marts',
identifier='fct_orders',
quote_policy=Policy(database=True, schema=True, identifier=True),
)
print(rel)
# "analytics"."marts"."fct_orders"
rel = BaseRelation.create(
database='analytics',
schema='marts',
identifier='fct_orders',
quote_policy=Policy(database=False, schema=False, identifier=False),
)
print(rel)
# analytics.marts.fct_orders ← no quotes
rel = BaseRelation.create(
database='analytics',
schema='marts',
identifier='fct_orders',
quote_policy=Policy(database=False, schema=False, identifier=True),
)
print(rel)
# analytics.marts."fct_orders" ← только identifier quoted
Quoting policy и case-sensitivity
Это критическая тема. Каждый warehouse имеет свои правила quoting и case-sensitivity:
Adapter-specific defaults:
# dbt-postgres
class PostgresRelation(BaseRelation):
quote_policy: Policy = field(default_factory=lambda: Policy(database=True, schema=True, identifier=True))
# Quote everything для case preservation
# dbt-snowflake
class SnowflakeRelation(BaseRelation):
quote_policy: Policy = field(default_factory=lambda: Policy(database=False, schema=False, identifier=False))
# No quoting by default — let Snowflake auto-uppercase
# dbt-bigquery
class BigQueryRelation(BaseRelation):
quote_character: str = '`' # BigQuery uses backtick
quote_policy: Policy = field(default_factory=lambda: Policy(database=True, schema=True, identifier=True))
Snowflake quoting trap:
Самая распространённая ошибка — quoting на Snowflake.
-- Создаём с quoted identifier
CREATE TABLE "MyTable" (...);
-- Теперь queries требуют quotes:
SELECT * FROM MyTable; -- [X] ERROR (looks for MYTABLE, which doesn't exist)
SELECT * FROM "MyTable"; -- [OK] OK
SELECT * FROM mytable; -- [X] ERROR
-- В adapter код:
{{ ref('my_table') }} -> "MyTable" (если quoted) или MYTABLE (если unquoted)
Если materialization создаёт quoted, но downstream queries unquoted — silent failure. Lookup для wrong-cased identifier.
Production rule для Snowflake:
В dbt_project.yml:
quoting:
database: false
schema: false
identifier: false
Все unquoted -> Snowflake auto-uppercases. Consistent.
For dbt-snowflake это default:
class SnowflakeRelation(BaseRelation):
quote_policy: Policy = field(default_factory=lambda: Policy(database=False, schema=False, identifier=False))
Relation.create() — main constructor
rel = BaseRelation.create(
database='analytics',
schema='marts',
identifier='fct_orders',
type='table',
)
Arguments:
database,schema,identifier— обязательны для full pathtype—'table','view','cte','external', etc.quote_policy— optional, defaults к adapter’sinclude_policy— optional, defaults к adapter’smetadata— optional dict с warehouse-specific metadata
Returns: immutable BaseRelation instance.
Used everywhere в adapter code:
# В list_relations_without_caching macro
{% for row in result %}
{%- set relation = api.Relation.create(
database=schema_relation.database,
schema=schema_relation.schema,
identifier=row.name,
type=row.type,
) -%}
{% do relations.append(relation) %}
{% endfor %}
api.Relation.create() в Jinja is equivalent of BaseRelation.create() в Python.
incorporate(): обновление полей
Relation immutable — нельзя rel.identifier = 'new_name'. Use .incorporate():
rel = BaseRelation.create(database='analytics', schema='marts', identifier='fct_orders', type='view')
# Incorporate (returns new instance)
new_rel = rel.incorporate(path={'identifier': 'fct_orders_v2'})
print(new_rel) # analytics.marts.fct_orders_v2
# Change type
new_rel = rel.incorporate(type='table')
print(new_rel.type) # table
# Combined
new_rel = rel.incorporate(
path={'identifier': 'fct_orders_backup'},
type='table',
)
В materialization:
{%- set target_relation = this.incorporate(type='view') -%}
{# this is current node's Relation, type defaulted to identifier's actual type #}
{# .incorporate(type='view') creates new Relation with type='view' #}
Comparison: equals и identity
Relations comparable:
rel_a = BaseRelation.create(database='db', schema='sch', identifier='t1', type='table')
rel_b = BaseRelation.create(database='db', schema='sch', identifier='t1', type='view')
# Equal by path (ignoring type)
print(rel_a == rel_b) # False (different type)
print(rel_a.path == rel_b.path) # True (same path)
# Hash для use в sets/dicts
print(hash(rel_a)) # 12345
print(hash(rel_b)) # different (different type)
Used в caching, deduplication.
Idempotent quoting
Idempotent quoting = одинаковый результат при повторных применениях.
rel = BaseRelation.create(
database='analytics',
schema='marts',
identifier='fct_orders',
quote_policy=Policy(database=True, schema=True, identifier=True),
)
print(rel)
# "analytics"."marts"."fct_orders"
# Render again
print(rel)
# "analytics"."marts"."fct_orders" ← same, not "\"analytics\""...
# Through f-string
sql = f'SELECT * FROM {rel}'
print(sql)
# SELECT * FROM "analytics"."marts"."fct_orders"
Relation’s __str__ is idempotent — won’t double-quote.
Why important:
В macros вы used Relation many times. If quoting wasn’t idempotent — each rendering double-quotes, breaking SQL.
-- Without idempotent — broken:
CREATE TABLE {{ relation }} AS ...
-- becomes: CREATE TABLE """db""."sch"."t1""" AS ... (broken!)
-- With idempotent — works:
CREATE TABLE {{ relation }} AS ...
-- becomes: CREATE TABLE "db"."sch"."t1" AS ...
Implementation в BaseRelation:
def __str__(self):
parts = []
if self.include_policy.database:
parts.append(self.quoted_if_required(self.database, 'database'))
if self.include_policy.schema:
parts.append(self.quoted_if_required(self.schema, 'schema'))
if self.include_policy.identifier:
parts.append(self.quoted_if_required(self.identifier, 'identifier'))
return '.'.join(parts)
def quoted_if_required(self, name, part):
if getattr(self.quote_policy, part):
return f'{self.quote_character}{name}{self.quote_character}'
return name
Adapter-specific Relation subclass
В каждом adapter — <Adapter>Relation. Override defaults:
# dbt-snowflake/dbt/adapters/snowflake/relation.py
@dataclass(frozen=True, eq=False, repr=False)
class SnowflakeRelation(BaseRelation):
quote_character: str = '"'
# Snowflake uses no quoting by default (auto-uppercase)
quote_policy: Policy = field(default_factory=lambda: Policy(
database=False, schema=False, identifier=False
))
# All parts included
include_policy: Policy = field(default_factory=lambda: Policy(
database=True, schema=True, identifier=True
))
# Snowflake-specific properties
transient: Optional[bool] = None
transient — Snowflake-specific (transient tables don’t have time travel, cheaper).
dbt-bigquery:
@dataclass(frozen=True, eq=False, repr=False)
class BigQueryRelation(BaseRelation):
quote_character: str = '`' # backtick, not double-quote!
quote_policy: Policy = field(default_factory=lambda: Policy(
database=True, schema=True, identifier=True
))
include_policy: Policy = field(default_factory=lambda: Policy(
database=True, schema=True, identifier=True
))
Custom adapter — inherit и override:
# dbt-myadapter/dbt/adapters/myadapter/relation.py
from dbt.adapters.base import BaseRelation, Policy
from dataclasses import dataclass, field
@dataclass(frozen=True, eq=False, repr=False)
class MyAdapterRelation(BaseRelation):
quote_character: str = '"'
# Adjust based on your warehouse
quote_policy: Policy = field(default_factory=lambda: Policy(
database=True, schema=True, identifier=True
))
Best practices
-
Use api.Relation.create() в macros, not direct dict manipulation.
-
Document quoting policy в README — explain why your warehouse needs specific config.
-
Configure project-level quoting в
dbt_project.ymlдля consistency:quoting: database: false schema: false identifier: false -
Test quoting cross-warehouse — same Jinja must produce valid SQL на all targets.
-
Avoid case sensitivity reliance — use lowercase or uppercase consistently.
Попробуй сам
- В Python REPL:
from dbt.adapters.base import BaseRelation, Policy
rel = BaseRelation.create(
database='analytics',
schema='marts',
identifier='fct_orders',
type='table',
)
print(rel)
print(rel.path)
print(rel.type)
- Try different quote policies:
rel_quoted = rel.incorporate() # uses defaults
# Override quoting
from dataclasses import replace
rel_no_quote = replace(rel, quote_policy=Policy(database=False, schema=False, identifier=False))
print(rel_quoted) # "analytics"."marts"."fct_orders"
print(rel_no_quote) # analytics.marts.fct_orders
- Compare relations:
rel_a = BaseRelation.create(database='db', schema='sch', identifier='t1', type='table')
rel_b = BaseRelation.create(database='db', schema='sch', identifier='t1', type='view')
print(rel_a == rel_b) # False
print(rel_a.path == rel_b.path) # True
- В dbt project — test reference:
-- models/test_quoting.sql
SELECT * FROM {{ ref('upstream_model') }}
Run dbt compile. Check target/compiled/test_quoting.sql — see actual rendered Relation.
Ключевые выводы
-
Relation = (database, schema, identifier, type) — 3-part identifier + type.
-
include_policy — какие parts включать в rendered name.
Policy(database=False, schema=True, identifier=True)— для warehouses без DB tier. -
quote_policy — какие parts quoting. Snowflake default
Policy(False, False, False)для auto-uppercasing. -
quote_character —
"(most),`(BigQuery),[](SQL Server). -
Case-sensitivity варьируется: Postgres (lowercase unquoted), Snowflake (uppercase unquoted, locked-case quoted), DuckDB (case-insensitive), BigQuery (case-sensitive).
-
Snowflake quoting trap — once quoted, locked permanently. Use unquoted in adapter for consistency.
-
Idempotent quoting —
__str__doesn’t double-quote. Safe to render Relation many times. -
api.Relation.create() в Jinja,
BaseRelation.create()в Python — canonical constructor.