Conditional materialization: table в prod, view в dev
В предыдущих уроках мы прошли target.name, env_var, generate_schema_name, grants. Этот урок — про materialization через target conditional: как использовать view в dev (быстрый rebuild), table в prod (query performance), incremental только в prod (избежать complexity в dev).
Хорошее multi-env materialization config экономит часы developer time (faster iteration в dev) и optimizes production query performance. Это last but critical piece of multi-env puzzle.
dbt-iii: анатомия materialization — как работает table vs view vs incrementalБазовый pattern: table prod, view dev
# dbt_project.yml
models:
my_project:
+materialized: "{{ 'table' if target.name == 'prod' else 'view' }}"
Что happens:
- Dev / CI (
target.name != 'prod'): all models materialized as views - Prod: all materialized as tables
Or per-layer:
models:
my_project:
staging:
+materialized: view # views везде (light layer)
intermediate:
+materialized: "{{ 'ephemeral' if target.name != 'prod' else 'view' }}"
# CTE inlining в dev, real views в prod
marts:
+materialized: "{{ 'table' if target.name == 'prod' else 'view' }}"
Why this pattern
Math example:
Model takes 10 minutes для full rebuild.
Dev (view):
dbt run --select my_model= compile + create view = ~5 seconds- Tradeoff: each ref query reruns the SQL — slow for analytical queries
- BUT — dev usually uses sample data -> small impact
Prod (table):
dbt run --select my_model= full rebuild = 10 minutes- Read queries fast (just SELECT from table)
- Daily scheduled run amortizes cost
Prod (incremental):
- Daily run = INSERT new rows = 1 minute (10x faster than full)
- Initial backfill = 10 minutes (one-time)
Conditional incremental только в prod
Incremental adds complexity:
is_incremental()macro logic- Need to handle
--full-refreshproperly - Test setup more involved
- In dev: backfilling each run = ужас
Solution — incremental только в prod, table в dev:
- name: large_fact_table
config:
materialized: "{{ 'incremental' if target.name == 'prod' else 'table' }}"
unique_key: order_id
incremental_strategy: 'merge'
SELECT
order_id,
customer_id,
order_total
FROM {{ ref('stg_orders') }}
{% if is_incremental() and target.name == 'prod' %}
WHERE order_date >= (SELECT MAX(order_date) FROM {{ this }}) - INTERVAL '3 day'
{% endif %}
In dev: target.name=‘dev’, materialized=‘table’, is_incremental() evaluates to false (table is new on every run). Full SQL runs. Fast for small dev data.
In prod: materialized=‘incremental’, is_incremental() true after first run. Only new orders processed. Fast.
Conditional unique_key / partition / cluster
Sometimes config differs:
- name: orders_partitioned
config:
materialized: "{{ 'incremental' if target.name == 'prod' else 'table' }}"
incremental_strategy: 'merge'
unique_key: order_id
# Snowflake / BigQuery: partition by date в prod (faster queries)
cluster_by: "{{ ['order_date', 'customer_id'] if target.name == 'prod' else none }}"
In dev — no clustering (overhead не оправдан на маленьких данных). In prod — cluster для query performance.
Ephemeral materialization
materialized: ephemeral — model not physically materialized. Just CTE inlined в downstream models.
- name: int_user_demographics
config:
materialized: "{{ 'ephemeral' if target.name != 'prod' else 'view' }}"
Dev: ephemeral — no DDL, no physical object. SQL copy-pasted in downstream as CTE. Faster iteration.
Prod: view — real DDL. Downstream uses standard FROM int_user_demographics. Standard production behavior.
Use case: intermediate models in dev pipelines где не нужны permanent objects. Saves cleanup, faster runs.
Limits:
- Ephemeral models cannot have data tests (no table to query)
- Cannot have grants
- Cannot ref from non-dbt code
—target overrides
Sometimes need different behavior. Use --target flag:
# Default: dev config (views)
dbt run
# Specific target: prod config (tables)
dbt run --target prod
# Override default vars
dbt run --target prod --vars '{full_refresh: true}'
In dbt_project.yml conditional logic auto-applies based on target.
For one-off overrides — set env_var и use в conditional:
+materialized: >
{{
'table' if env_var('FORCE_TABLE', 'false') | as_bool
else 'table' if target.name == 'prod'
else 'view'
}}
# Force tables in dev (rare)
FORCE_TABLE=true dbt run --target dev
Per-model conditional materialization
Some models need different patterns:
1. Large fact table — always incremental в prod:
- name: large_fact_table
config:
materialized: "{{ 'incremental' if target.name == 'prod' else 'table' }}"
unique_key: 'id'
incremental_strategy: 'merge'
2. Snapshot — always table (snapshots have their own materialization, but tables similar):
- name: customer_snapshot
config:
materialized: table # always — snapshots need persistence
3. Aggregates — always view (small, cheap to recompute):
- name: daily_revenue_summary
config:
materialized: view # small summary, view is fine even prod
4. ML feature tables — always table:
- name: customer_features
config:
materialized: table # ML pipelines need stable storage
Mix and match based on model purpose, not blanket policy.
Полный example
# dbt_project.yml
name: my_project
version: '1.0.0'
models:
my_project:
# Default: table prod, view dev
+materialized: "{{ 'table' if target.name == 'prod' else 'view' }}"
staging:
+schema: staging
+materialized: view # always view (light layer)
intermediate:
+schema: intermediate
+materialized: "{{ 'ephemeral' if target.name == 'dev' else 'view' }}"
# ephemeral в dev для speed, view в prod для observability
marts:
+schema: marts
+materialized: "{{ 'table' if target.name == 'prod' else 'view' }}"
# Specific overrides
fct_orders:
+materialized: "{{ 'incremental' if target.name == 'prod' else 'table' }}"
+unique_key: 'order_id'
+incremental_strategy: 'merge'
daily_revenue:
+materialized: view # always view (small aggregate)
Use:
dbt run --target dev # views, ephemeral intermediate
dbt run --target ci # views (CI = dev pattern)
dbt run --target prod # tables, incremental for fct_orders, view for daily_revenue
Performance impact
Real numbers (example):
Dev cycle (developer iterating on model):
- View pattern: 5-30 sec per dbt run (just compile + DDL)
- Table pattern: 5-20 min per dbt run (full materialize)
- Savings: 80-95% per iteration
For developer making 20-50 changes per day, view pattern saves hours.
Production runs:
- Tables: stable, fast reads, predictable cost
- Incremental: 10x faster runs for huge tables (TB scale)
- Views в prod: slower reads (DDL re-evaluated each query), but no storage cost
BI tool query performance (consuming models):
- Table: sub-second to 10s typically
- View: 10s to minutes (depends on complexity)
- For dashboards refreshed often, tables crucial
Anti-patterns
1. Same materialization everywhere
+materialized: table
In dev — slow iteration. In prod — sometimes view enough (small aggregates).
Use target-conditional.
2. Incremental in dev
- name: orders
config:
materialized: incremental
unique_key: order_id
Without target conditional. In dev: pain — incremental logic complexity for no benefit (small data).
Use:
materialized: "{{ 'incremental' if target.name == 'prod' else 'table' }}"
3. View materialization для широкой mart
- name: revenue_dashboard_mart
config:
materialized: view # always
BI tool query — every refresh recomputes view (joins, aggregates). Slow dashboards.
Use table или incremental в prod.
4. Tables for trivial transformations
- name: stg_users
config:
materialized: table # SELECT * FROM raw with renames
No benefit. View suffices. Save storage.
5. Hardcoded test config
- name: my_model
config:
materialized: view
full_refresh: false
full_refresh: false — never accept —full-refresh. If model becomes incremental в future, refresh impossible. Be careful.
6. Conditional based on user input
materialized: "{{ 'table' if var('use_tables') else 'view' }}"
Confusing — config depends on runtime vars. Hard to reason. Use target.name (stable).
7. Different materialization in CI vs prod
materialized: "{{ 'incremental' if target.name == 'prod' else 'table' }}"
CI uses table — tests pass. Prod uses incremental — может expose incremental-specific bugs not seen in CI.
Better: CI mimics prod materialization:
materialized: "{{ 'incremental' if target.name in ('prod', 'ci') else 'table' }}"
Or have separate CI target that mirrors prod config.
CI gate для materialization
Скрипт ниже — иллюстрация, как target-aware materialization-config можно проверить через manifest. Полная механика CI gates, prod manifest, state:modified+ — в модуле 13-ci-cd-github.
# scripts/check_materialization.py
import json
with open('target/manifest.json') as f:
manifest = json.load(f)
errors = []
for node in manifest['nodes'].values():
if node['resource_type'] != 'model':
continue
fqn = node['fqn']
materialized = node['config'].get('materialized', '')
# Marts should not be views in prod (assumes building prod manifest)
is_mart = any(p == 'marts' for p in fqn)
if is_mart and materialized == 'view':
# check if conditional
# In manifest, it's resolved value — so 'view' means prod produces views
errors.append(f'Mart {node[\"name\"]} is view in this target — performance concern')
Hint: build manifest for --target prod specifically.
Попробуй сам
-
Set conditional materialization:
models: my_project: +materialized: "`{{ 'table' if target.name == 'prod' else 'view' }}`" -
Run with each target:
dbt run --target dev # views dbt run --target prod # tables -
Verify materialization:
SHOW TABLES IN dbt_alice_dev_marts; -- views SHOW TABLES IN marts; -- tables -
Try incremental conditional:
- name: my_fact config: materialized: "`{{ 'incremental' if target.name == 'prod' else 'table' }}`" unique_key: 'id'Test:
dbt run --target dev --select my_fact # table dbt run --target prod --select my_fact # incremental dbt run --target prod --select my_fact # incremental (only new rows) -
Time the runs — observe speedup в dev with views.
Ключевые выводы
- Conditional materialization через
target.name: table в prod (performance), view в dev (fast iteration), incremental только prod (TB scale optimization). - dbt_project.yml configs:
materialized: "{{ 'table' if target.name == 'prod' else 'view' }}". Per-folder, per-model. - Ephemeral в dev для intermediate models — CTE inlining, нет DDL, fastest iteration. View или table в prod.
- Incremental only в prod — simplifies dev (no is_incremental logic), preserves prod performance.
- —target flag для override defaults. Combined with env_vars для one-off needs.
- Per-model variation — some always view (small aggregates), always table (ML features), always incremental (huge facts). Mix based on purpose.
- CI should mirror prod — same materialization to expose prod-specific bugs (incremental edge cases) before deploy.
- Anti-patterns: same materialization everywhere, incremental in dev (pain), views для wide marts (slow dashboards), tables для trivial staging (waste).