Column API: mapping warehouse types to dbt types

Column is the Python representation of a single table column. When dbt-core introspects a table (get_columns_in_relation), it gets back a list of Column objects. Customizing Column is how an adapter handles warehouse-specific types.

This lesson covers the anatomy of the Column class, type mapping, and how to customize both for your warehouse.

Column structure

# dbt-adapters/dbt/adapters/base/column.py (abridged)
from dataclasses import dataclass
from typing import Optional

@dataclass
class Column:
    column: str           # column name
    dtype: str            # data type name ('VARCHAR', 'INTEGER', etc.)
    char_size: Optional[int] = None           # 255 for VARCHAR(255)
    numeric_precision: Optional[int] = None   # 15 for DECIMAL(15, 2)
    numeric_scale: Optional[int] = None       # 2 for DECIMAL(15, 2)

Basic instances:

col = Column(
    column='customer_id',
    dtype='INTEGER',
)

col_varchar = Column(
    column='name',
    dtype='VARCHAR',
    char_size=255,
)

col_decimal = Column(
    column='price',
    dtype='DECIMAL',
    numeric_precision=15,
    numeric_scale=2,
)

data_type vs translate_type

The Column class has two main entry points for type rendering.

The data_type property returns the full type string:

col = Column(column='price', dtype='DECIMAL', numeric_precision=15, numeric_scale=2)
print(col.data_type)
# DECIMAL(15,2)

col = Column(column='name', dtype='VARCHAR', char_size=255)
print(col.data_type)
# character varying(255)  (the base Column renders every string type this way)

It includes precision, scale, and char_size when they are set.

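The translate_type(dtype) classmethod converts a type name from one type system to another:

class Column:
    @classmethod
    def translate_type(cls, dtype: str) -> str:
        # TYPE_LABELS is a class-level dict of generic label -> type name
        return cls.TYPE_LABELS.get(dtype.upper(), dtype)

By default only a handful of generic labels are mapped (for example STRING to TEXT); every other name passes through unchanged. Override this in the adapter-specific Column.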
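The lookup-with-fallback pattern can be sketched without dbt installed. This is a minimal illustration, assuming a stand-in class: MiniColumn, WideTextColumn, and the contents of their label maps are hypothetical names chosen for the example, not the dbt implementation.

```python
from typing import ClassVar, Dict


class MiniColumn:
    """Stand-in for a Column class; not the dbt implementation."""

    # Hypothetical label map: generic label -> warehouse type name.
    TYPE_LABELS: ClassVar[Dict[str, str]] = {"STRING": "TEXT", "INTEGER": "INT"}

    @classmethod
    def translate_type(cls, dtype: str) -> str:
        # Case-insensitive lookup; unknown names fall through unchanged.
        return cls.TYPE_LABELS.get(dtype.upper(), dtype)


class WideTextColumn(MiniColumn):
    # A subclass only needs to swap the table, not the method.
    TYPE_LABELS = {**MiniColumn.TYPE_LABELS, "STRING": "VARCHAR(16777216)"}


print(MiniColumn.translate_type("string"))      # TEXT
print(WideTextColumn.translate_type("string"))  # VARCHAR(16777216)
print(WideTextColumn.translate_type("jsonb"))   # jsonb
```

Keeping the map at class level means a warehouse-specific subclass can change the mapping without re-implementing the lookup.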

Adapter-specific Column

# dbt-postgres/dbt/adapters/postgres/column.py
class PostgresColumn(Column):
    @classmethod
    def translate_type(cls, dtype: str) -> str:
        # Convert ANSI -> Postgres-specific
        TYPE_MAP = {
            'TEXT': 'text',
            'INTEGER': 'integer',
            'BIGINT': 'bigint',
            'BOOLEAN': 'boolean',
            'DOUBLE': 'double precision',
            'TIMESTAMP': 'timestamp without time zone',
            'TIMESTAMPTZ': 'timestamp with time zone',
        }
        return TYPE_MAP.get(dtype.upper(), dtype)

# dbt-snowflake/dbt/adapters/snowflake/column.py
class SnowflakeColumn(Column):
    @classmethod
    def translate_type(cls, dtype: str) -> str:
        TYPE_MAP = {
            'TEXT': 'VARCHAR(16777216)',   # Snowflake max VARCHAR
            'STRING': 'VARCHAR(16777216)',
            'INTEGER': 'NUMBER(38, 0)',     # Snowflake INTEGER is alias to NUMBER
            'BIGINT': 'NUMBER(38, 0)',
            'DOUBLE': 'FLOAT',
            'TIMESTAMP': 'TIMESTAMP_NTZ',
            'TIMESTAMPTZ': 'TIMESTAMP_TZ',  # TIMESTAMPTZ is Snowflake's alias for TIMESTAMP_TZ
        }
        return TYPE_MAP.get(dtype.upper(), dtype)

# dbt-bigquery/dbt/adapters/bigquery/column.py
class BigQueryColumn(Column):
    @classmethod
    def translate_type(cls, dtype: str) -> str:
        TYPE_MAP = {
            'TEXT': 'STRING',
            'INTEGER': 'INT64',
            'BIGINT': 'INT64',
            'DOUBLE': 'FLOAT64',
            'BOOLEAN': 'BOOL',
            'TIMESTAMP': 'TIMESTAMP',
        }
        return TYPE_MAP.get(dtype.upper(), dtype)

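Each warehouse has its own type system. The maps above are simplified illustrations; the column.py shipped with each adapter differs in detail.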

Type rendering in materializations

When dbt builds a CREATE TABLE statement:

{# Default materialization table #}
CREATE TABLE {{ relation }} (
  {%- for col in columns -%}
    {{ col.column }} {{ col.dtype }}
    {%- if not loop.last %}, {% endif %}
  {%- endfor -%}
)

col.dtype is the type name stored on the Column. It is already warehouse-specific when the Column was built through translate_type (for example via Column.create), so a Column class with proper translations yields valid SQL.

Without an override:

-- On Snowflake:
CREATE TABLE x (id INTEGER, name TEXT);
-- [X] INTEGER is not a native Snowflake type (it is an alias for NUMBER), and TEXT is not standard

With an override:

-- Via SnowflakeColumn.translate_type:
CREATE TABLE x (id NUMBER(38, 0), name VARCHAR(16777216));
-- [OK] Snowflake-native types
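The two statements above can be reproduced in plain Python. The sketch below is an assumption-laden stand-in for the Jinja loop: Col, SNOWFLAKE_TYPES, and render_create_table are hypothetical names, and the map simply copies two entries from the SnowflakeColumn example in this lesson.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Col:
    """Hypothetical stand-in with just a name and a generic type."""
    column: str
    dtype: str


# Assumed mapping, copied from the SnowflakeColumn sketch in this lesson.
SNOWFLAKE_TYPES: Dict[str, str] = {
    "TEXT": "VARCHAR(16777216)",
    "INTEGER": "NUMBER(38, 0)",
}


def render_create_table(name: str, columns: List[Col], type_map: Dict[str, str]) -> str:
    # Translate each generic type, falling back to the original name.
    parts = [f"{c.column} {type_map.get(c.dtype.upper(), c.dtype)}" for c in columns]
    return f"CREATE TABLE {name} ({', '.join(parts)});"


cols = [Col("id", "INTEGER"), Col("name", "TEXT")]
print(render_create_table("x", cols, {}))
# CREATE TABLE x (id INTEGER, name TEXT);
print(render_create_table("x", cols, SNOWFLAKE_TYPES))
# CREATE TABLE x (id NUMBER(38, 0), name VARCHAR(16777216));
```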

Adapter type conversions (impl.py)

In addition to the Column class, the adapter has convert_*_type classmethods. They map CSV data to SQL types when dbt seed runs:

import agate
from dbt.adapters.sql import SQLAdapter

class MyAdapter(SQLAdapter):
    @classmethod
    def convert_text_type(cls, agate_table, col_idx):
        return 'TEXT'

    @classmethod
    def convert_number_type(cls, agate_table, col_idx):
        # Check whether the column has decimal places
        decimals = agate_table.aggregate(agate.MaxPrecision(col_idx))
        if decimals:
            return 'DOUBLE'
        return 'BIGINT'
    
    @classmethod
    def convert_boolean_type(cls, agate_table, col_idx):
        return 'BOOLEAN'
    
    @classmethod
    def convert_datetime_type(cls, agate_table, col_idx):
        return 'TIMESTAMP'
    
    @classmethod
    def convert_date_type(cls, agate_table, col_idx):
        return 'DATE'
    
    @classmethod
    def convert_time_type(cls, agate_table, col_idx):
        return 'TIME'

agate — Python data analysis library used by dbt for CSV processing. agate_table.aggregate(...) computes column stats.

Example flow:

1. dbt seed reads the CSV: countries.csv
2. An agate.Table is created from the CSV
3. For each column, agate infers a type (text, number, boolean, date, datetime)
4. dbt maps it to a SQL type through the matching convert_*_type method
5. CREATE TABLE is issued with those SQL types
6. The CSV values are inserted
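The flow above can be approximated with the standard library alone. This is a simplified sketch, not what dbt or agate actually do: infer_sql_type and infer_schema are hypothetical helpers, type inference is reduced to booleans, numbers, and text, and the returned type names follow the MyAdapter example.

```python
import csv
import io
from typing import Dict, List

CSV_TEXT = "id,name,price,is_active\n1,Alice,99.99,true\n2,Bob,49.99,false\n"


def infer_sql_type(values: List[str]) -> str:
    """Pick a SQL type for one column from its string values (simplified)."""
    lowered = [v.lower() for v in values]
    if all(v in ("true", "false") for v in lowered):
        return "BOOLEAN"
    try:
        numbers = [float(v) for v in values]
    except ValueError:
        return "TEXT"
    # Mirror convert_number_type: decimals -> DOUBLE, otherwise BIGINT.
    return "BIGINT" if all(n.is_integer() for n in numbers) else "DOUBLE"


def infer_schema(csv_text: str) -> Dict[str, str]:
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return {name: infer_sql_type([r[name] for r in rows]) for name in rows[0]}


schema = infer_schema(CSV_TEXT)
print(schema)
# {'id': 'BIGINT', 'name': 'TEXT', 'price': 'DOUBLE', 'is_active': 'BOOLEAN'}
```

The point of the sketch is that the decision depends on the data in the column, which is why the convert_*_type methods receive the whole table rather than a type name.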

Adapter-specific examples:

dbt-postgres:

@classmethod
def convert_text_type(cls, agate_table, col_idx):
    return 'text'

@classmethod
def convert_number_type(cls, agate_table, col_idx):
    decimals = agate_table.aggregate(agate.MaxPrecision(col_idx))
    return 'numeric' if decimals else 'integer'

dbt-snowflake:

@classmethod
def convert_text_type(cls, agate_table, col_idx):
    return 'TEXT'

dbt-bigquery:

@classmethod
def convert_text_type(cls, agate_table, col_idx):
    return 'STRING'

@classmethod
def convert_number_type(cls, agate_table, col_idx):
    decimals = agate_table.aggregate(agate.MaxPrecision(col_idx))
    return 'FLOAT64' if decimals else 'INT64'

Column-level information in introspection

When get_columns_in_relation returns columns, each Column instance carries the full type information:

columns = adapter.get_columns_in_relation(relation)

for col in columns:
    print(f'{col.column}: {col.dtype}')
    print(f'  char_size: {col.char_size}')
    print(f'  numeric_precision: {col.numeric_precision}')
    print(f'  numeric_scale: {col.numeric_scale}')

Used in:

Contracts: verify that column types match data_type in schema.yml
Catalog generation: dbt docs generate includes the types
Schema sync: incremental on_schema_change handles new columns
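As a rough illustration of the contract use case, the comparison boils down to matching declared types against introspected ones. The helper below is a hypothetical sketch (contract_mismatches is not a dbt function) and compares plain (name, dtype) pairs rather than real Column objects.

```python
from typing import Dict, List, Tuple


def contract_mismatches(
    declared: Dict[str, str], actual: List[Tuple[str, str]]
) -> List[str]:
    """Compare declared data_type values with introspected (name, dtype) pairs."""
    actual_types = {name.lower(): dtype.lower() for name, dtype in actual}
    problems = []
    for name, want in declared.items():
        got = actual_types.get(name.lower())
        if got is None:
            problems.append(f"{name}: missing in relation")
        elif got != want.lower():
            problems.append(f"{name}: declared {want}, found {got}")
    return problems


declared = {"id": "integer", "name": "text", "price": "numeric"}
introspected = [("id", "integer"), ("name", "character varying")]
print(contract_mismatches(declared, introspected))
# ['name: declared text, found character varying', 'price: missing in relation']
```

The mismatch on name shows why consistent type naming matters: text and character varying may be interchangeable in the warehouse, but a string comparison treats them as different.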

Quote column names

Some warehouses require quoting for column names that contain special characters:

class Column:
    @property
    def quoted(self) -> str:
        return f'"{self.column}"'

# Or override per adapter:
class SnowflakeColumn(Column):
    @property
    def quoted(self) -> str:
        # Snowflake quoting trap: quote only names with uppercase letters or embedded quotes
        if self.column.lower() != self.column or '"' in self.column:
            return f'"{self.column}"'
        return self.column

In a materialization:

{{ col.quoted }}    {# yields quoted column name #}
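One detail the snippets above leave out is a name that itself contains a double quote. A minimal sketch, assuming standard SQL identifier escaping (doubling the quote) and a hypothetical "plain name" rule; quote_identifier and quote_if_needed are illustrative helpers, and reserved keywords are not handled.

```python
import re

# Hypothetical rule: lowercase letters, digits, underscore, not starting with a digit.
_PLAIN = re.compile(r"^[a-z_][a-z0-9_]*$")


def quote_identifier(name: str) -> str:
    """Wrap a name in double quotes, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_if_needed(name: str) -> str:
    """Leave plain names alone; quote everything else."""
    return name if _PLAIN.match(name) else quote_identifier(name)


print(quote_if_needed("customer_id"))   # customer_id
print(quote_if_needed("Order Total"))   # "Order Total"
print(quote_identifier('say "hi"'))     # "say ""hi"""
```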

Capabilities and special types

Some warehouses have non-standard types:

Postgres:

JSON, JSONB, UUID, INET, CIDR, arrays (INTEGER[])
geometry, geography (PostGIS)

Snowflake:

VARIANT — JSON-like
OBJECT, ARRAY — semi-structured
GEOGRAPHY, GEOMETRY

BigQuery:

STRUCT<...>, ARRAY<...>
GEOGRAPHY

DuckDB:

LIST<...>, STRUCT<...>
MAP<key, value>
INTERVAL

If your adapter needs these, extend the Column class:

class MyAdapterColumn(Column):
    # Fill in: standard ANSI -> MyAdapter mappings, plus special types
    TYPE_MAP = {}

    @property
    def is_json(self) -> bool:
        return self.dtype.upper() in ('JSON', 'JSONB', 'VARIANT')

    @property
    def is_array(self) -> bool:
        dtype = self.dtype.upper()
        return dtype.endswith('[]') or dtype.startswith('ARRAY<')

    @classmethod
    def translate_type(cls, dtype):
        return cls.TYPE_MAP.get(dtype.upper(), dtype)

This custom logic is then available in macros via {% if col.is_json %}.

Production-grade Column class

A full example for a hypothetical warehouse, OceanBase:

# dbt-oceanbase/dbt/adapters/oceanbase/column.py
from dataclasses import dataclass
from typing import Optional
from dbt.adapters.base import Column


@dataclass
class OceanBaseColumn(Column):
    @classmethod
    def translate_type(cls, dtype: str) -> str:
        TYPE_MAP = {
            'TEXT': 'TEXT',
            'STRING': 'TEXT',
            'INTEGER': 'BIGINT',
            'BIGINT': 'BIGINT',
            'DOUBLE': 'DOUBLE',
            'BOOLEAN': 'BOOLEAN',
            'TIMESTAMP': 'TIMESTAMP',
            'TIMESTAMPTZ': 'TIMESTAMP',  # OceanBase doesn't distinguish
            'DATE': 'DATE',
            'TIME': 'TIME',
            'JSON': 'JSON',
        }
        return TYPE_MAP.get(dtype.upper(), dtype)
    
    # Plain methods, not properties: the base Column defines these as methods
    # and its data_type property calls them as self.is_string(), etc.
    def is_string(self) -> bool:
        return self.dtype.upper() in ('TEXT', 'VARCHAR', 'CHAR', 'STRING')

    def is_numeric(self) -> bool:
        return self.dtype.upper() in (
            'INT', 'BIGINT', 'SMALLINT', 'INTEGER',
            'DOUBLE', 'FLOAT', 'NUMERIC', 'DECIMAL'
        )

    def is_integer(self) -> bool:
        return self.dtype.upper() in ('INT', 'BIGINT', 'SMALLINT', 'INTEGER')

    def can_expand_to(self, other_column):
        # Can my type expand to accommodate other_column's type?
        # Used for incremental schema sync
        if self.dtype == other_column.dtype:
            return True
        # OceanBase-specific compatibility rules
        ...

is_string, is_numeric, and the rest are convenience methods used in macros (for example {% if col.is_string() %}).
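The compatibility rules are left open in the example above. One possible rule, sketched here with a stand-in class, is to allow only string columns to grow to a strictly larger size. StrColumn, its accepted type names, and the fallback width of 256 are assumptions for illustration, not OceanBase behaviour.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StrColumn:
    """Hypothetical stand-in for a string-aware Column."""
    column: str
    dtype: str
    char_size: Optional[int] = None

    def is_string(self) -> bool:
        return self.dtype.lower() in ("text", "varchar", "char", "string")

    def string_size(self) -> int:
        # Assumed fallback width when no size was reported.
        return self.char_size if self.char_size is not None else 256

    def can_expand_to(self, other: "StrColumn") -> bool:
        # Only string columns can grow, and only to a strictly larger size.
        if not (self.is_string() and other.is_string()):
            return False
        return other.string_size() > self.string_size()


old = StrColumn("name", "VARCHAR", 50)
new = StrColumn("name", "VARCHAR", 255)
print(old.can_expand_to(new))  # True
print(new.can_expand_to(old))  # False
print(old.can_expand_to(StrColumn("name", "INTEGER")))  # False
```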

Try it yourself

In a Python REPL:

from dbt.adapters.base import Column

col = Column(column='price', dtype='DECIMAL', numeric_precision=15, numeric_scale=2)
print(col.data_type)
# DECIMAL(15,2)

col2 = Column(column='name', dtype='VARCHAR', char_size=255)
print(col2.data_type)
# character varying(255)  (the base Column renders every string type this way)

Try translate_type:

print(Column.translate_type('TEXT'))
# TEXT  (not in the default label map, so it passes through unchanged)

# Subclass for specific warehouse
class SnowflakeColumn(Column):
    @classmethod
    def translate_type(cls, dtype):
        TYPE_MAP = {'TEXT': 'VARCHAR(16777216)', 'INTEGER': 'NUMBER(38, 0)'}
        return TYPE_MAP.get(dtype.upper(), dtype)

print(SnowflakeColumn.translate_type('TEXT'))
# VARCHAR(16777216)

print(SnowflakeColumn.translate_type('INTEGER'))
# NUMBER(38, 0)

In a dbt project, test a seed:

mkdir -p seeds/
cat > seeds/test_seed.csv << EOF
id,name,price,is_active
1,Alice,99.99,true
2,Bob,49.99,false
EOF

dbt seed --select test_seed
# Check created table types — should be appropriate per adapter

Inspect the result via dbt docs generate and catalog.json:

dbt docs generate
# Then view target/catalog.json — see column types per relation
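Reading the types out of the catalog takes only a few lines. The sketch below works on an inline sample shaped like a catalog.json fragment; the project name my_project and the helper column_types are hypothetical, and real files carry more keys per node.

```python
import json
from typing import Dict


def column_types(catalog: dict, unique_id: str) -> Dict[str, str]:
    """Return {column name: type} for one node of a parsed catalog.json."""
    columns = catalog["nodes"][unique_id]["columns"]
    return {name: meta["type"] for name, meta in columns.items()}


# Inline sample shaped like a catalog.json fragment; real files have more keys.
sample = json.loads("""
{"nodes": {"seed.my_project.test_seed": {"columns": {
    "id":    {"name": "id", "type": "integer", "index": 1},
    "price": {"name": "price", "type": "numeric", "index": 3}
}}}}
""")

print(column_types(sample, "seed.my_project.test_seed"))
# {'id': 'integer', 'price': 'numeric'}
```

Against a real project, the same function can be fed the result of json.load on target/catalog.json.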


Key takeaways

Column = (column, dtype, char_size, numeric_precision, numeric_scale): represents a single column.
data_type property: returns the rendered type with precision and scale, e.g. DECIMAL(15,2).
translate_type classmethod: converts ANSI/dbt type names to warehouse-specific ones. Override per adapter.
convert_*_type methods on the Adapter: map CSV data to SQL types for dbt seed.
Special types (JSON, ARRAY, STRUCT, geography): extend the Column class with convenience properties.
Used in:
- Materializations (CREATE TABLE column types)
- Contracts (verify data_type matches)
- Catalog generation (dbt docs)
- Schema sync (incremental on_schema_change)
Production-grade: implement all standard types plus warehouse-specific properties.

Knowledge check

A senior engineer is writing an adapter for an Apache Iceberg-style warehouse. Iceberg has a rich type system (STRUCT, MAP, LIST, DECIMAL with arbitrary precision). How should the Column class be set up?

Answer

Iceberg's rich type system calls for an extended Column class.

**Iceberg types**:

- Primitive: BOOLEAN, INT, LONG, FLOAT, DOUBLE, DECIMAL(P, S), DATE, TIME, TIMESTAMP, TIMESTAMP_NS, STRING, UUID, FIXED(L), BINARY
- Nested: STRUCT, LIST, MAP

**Custom Column class**:

```python
from dataclasses import dataclass
from typing import Optional, List
from dbt.adapters.base import Column


@dataclass
class IcebergColumn(Column):
    # Standard fields from the base Column
    column: str
    dtype: str
    char_size: Optional[int] = None
    numeric_precision: Optional[int] = None
    numeric_scale: Optional[int] = None

    # Iceberg-specific
    nested_fields: Optional[List['IcebergColumn']] = None  # for STRUCT
    element_type: Optional['IcebergColumn'] = None         # for LIST<element_type>
    key_type: Optional['IcebergColumn'] = None             # for MAP<key, value>
    value_type: Optional['IcebergColumn'] = None
    is_required: bool = False                              # nullability

    @classmethod
    def translate_type(cls, dtype: str) -> str:
        # ANSI / dbt -> Iceberg
        TYPE_MAP = {
            'TEXT': 'STRING',
            'VARCHAR': 'STRING',
            'CHAR': 'STRING',
            'INTEGER': 'INT',
            'BIGINT': 'LONG',
            'SMALLINT': 'INT',
            'DOUBLE': 'DOUBLE',
            'FLOAT': 'FLOAT',
            'BOOLEAN': 'BOOLEAN',
            'DATE': 'DATE',
            'TIMESTAMP': 'TIMESTAMP',
            'TIMESTAMPTZ': 'TIMESTAMP',
        }

        # Complex types are already in Iceberg syntax
        if dtype.startswith(('LIST<', 'STRUCT<', 'MAP<')):
            return dtype

        return TYPE_MAP.get(dtype.upper(), dtype)

    @property
    def data_type(self) -> str:
        # Render with full Iceberg syntax
        if self.dtype == 'STRUCT' and self.nested_fields:
            fields = ', '.join(f'{f.column}: {f.data_type}' for f in self.nested_fields)
            return f'STRUCT<{fields}>'

        if self.dtype == 'LIST' and self.element_type:
            return f'LIST<{self.element_type.data_type}>'

        if self.dtype == 'MAP' and self.key_type and self.value_type:
            return f'MAP<{self.key_type.data_type}, {self.value_type.data_type}>'

        # Standard precision/scale ("is not None" so that a scale of 0 still renders)
        if self.dtype == 'DECIMAL':
            if self.numeric_precision is not None and self.numeric_scale is not None:
                return f'DECIMAL({self.numeric_precision}, {self.numeric_scale})'
            return 'DECIMAL'

        # Standard with char_size
        if self.char_size:
            return f'{self.dtype}({self.char_size})'

        return self.dtype

    @property
    def is_complex(self) -> bool:
        """True if STRUCT, LIST, MAP"""
        return self.dtype in ('STRUCT', 'LIST', 'MAP')

    # is_string / is_numeric stay plain methods, as in the base Column
    def is_string(self) -> bool:
        return self.dtype == 'STRING'

    def is_numeric(self) -> bool:
        return self.dtype in ('INT', 'LONG', 'FLOAT', 'DOUBLE', 'DECIMAL')

    @property
    def is_temporal(self) -> bool:
        return self.dtype in ('DATE', 'TIME', 'TIMESTAMP', 'TIMESTAMP_NS')

    def to_iceberg_schema_dict(self) -> dict:
        """Convert to an Iceberg schema dict for metadata"""
        result = {
            'name': self.column,
            'type': self.data_type,
            'required': self.is_required,
        }

        if self.is_complex:
            if self.dtype == 'STRUCT':
                result['fields'] = [f.to_iceberg_schema_dict() for f in self.nested_fields]
            elif self.dtype == 'LIST':
                result['element'] = self.element_type.to_iceberg_schema_dict()
            elif self.dtype == 'MAP':
                result['key'] = self.key_type.to_iceberg_schema_dict()
                result['value'] = self.value_type.to_iceberg_schema_dict()

        return result
```

**Usage**:

```python
# Simple
col_id = IcebergColumn(column='id', dtype='LONG', is_required=True)
print(col_id.data_type)
# LONG

# Decimal
col_price = IcebergColumn(column='price', dtype='DECIMAL', numeric_precision=15, numeric_scale=2)
print(col_price.data_type)
# DECIMAL(15, 2)

# Struct
col_address = IcebergColumn(
    column='address',
    dtype='STRUCT',
    nested_fields=[
        IcebergColumn(column='street', dtype='STRING'),
        IcebergColumn(column='city', dtype='STRING'),
        IcebergColumn(column='zip', dtype='STRING', char_size=10),
    ],
)
print(col_address.data_type)
# STRUCT<street: STRING, city: STRING, zip: STRING(10)>

# List
col_tags = IcebergColumn(
    column='tags',
    dtype='LIST',
    element_type=IcebergColumn(column='', dtype='STRING'),
)
print(col_tags.data_type)
# LIST<STRING>

# Map
col_attrs = IcebergColumn(
    column='attributes',
    dtype='MAP',
    key_type=IcebergColumn(column='', dtype='STRING'),
    value_type=IcebergColumn(column='', dtype='STRING'),
)
print(col_attrs.data_type)
# MAP<STRING, STRING>
```

**Adapter integration**:

```python
# impl.py
import agate
from dbt.adapters.sql import SQLAdapter


class IcebergAdapter(SQLAdapter):
    Column = IcebergColumn  # use the custom Column

    @classmethod
    def convert_text_type(cls, agate_table, col_idx):
        return 'STRING'

    @classmethod
    def convert_number_type(cls, agate_table, col_idx):
        decimals = agate_table.aggregate(agate.MaxPrecision(col_idx))
        if decimals:
            return 'DECIMAL(15, 2)'
        return 'LONG'

    @classmethod
    def convert_boolean_type(cls, agate_table, col_idx):
        return 'BOOLEAN'

    @classmethod
    def convert_datetime_type(cls, agate_table, col_idx):
        return 'TIMESTAMP'
```

**Macros for introspection**:

```jinja
{% macro iceberg__get_columns_in_relation(relation) %}
  {% call statement('get_columns_in_relation', fetch_result=True) %}
    SELECT * FROM iceberg_catalog.metadata.columns(...)
  {% endcall %}
  {# Parse Iceberg metadata response, construct IcebergColumn instances #}
  ...
{% endmacro %}
```

**Why the complexity is worth it**:

1. **Data modeling**: lakehouse projects often have nested data (events, JSON-like).
2. **Schema evolution**: Iceberg supports adding fields to a STRUCT, and the adapter must understand that.
3. **Type compatibility**: `can_expand_to` for incremental updates with schema changes.
4. **Documentation**: rich types make a model's schema easier to understand.
5. **Validation**: contract checks need a proper type representation.

This is **enterprise-grade adapter** territory. It is not for a PoC, but it is essential for a production data lakehouse.

Reference: Iceberg spec, dbt-snowflake's similar handling for VARIANT/OBJECT/ARRAY.

Knowledge check

What is the difference between `Column.translate_type()` (a classmethod) and `Adapter.convert_*_type()` (on the Adapter class)?

Answer

These are **two different type-conversion mechanisms** in dbt-adapters, used in different contexts.

**Column.translate_type(dtype): type mapping**

**Purpose**: convert an ANSI/dbt type name -> warehouse-specific type.

**When it is used**:

1. **Materialization SQL**: building CREATE TABLE / ALTER TABLE statements
2. **Schema sync**: comparing column types for incremental on_schema_change
3. **Contract validation**: verifying that schema.yml types match the warehouse

**Signature**:

```python
class Column:
    @classmethod
    def translate_type(cls, dtype: str) -> str:
        # input: ANSI/dbt type ('TEXT', 'INTEGER')
        # output: warehouse type ('VARCHAR(16777216)', 'NUMBER(38, 0)')
        return TYPE_MAP.get(dtype.upper(), dtype)
```

**Use**:

```jinja
-- In a materialization
{% for col in columns %}
  {{ col.column }} {{ api.Column.translate_type(col.dtype) }}
  -- col.dtype = 'TEXT' -> translates to 'VARCHAR(16777216)' on Snowflake
{% endfor %}
```

**Adapter.convert_*_type(agate_table, col_idx): CSV -> SQL**

**Purpose**: convert an inferred type (from CSV via agate) -> warehouse SQL type.

**When it is used**:

1. **dbt seed**: loading a CSV as a table
2. **Custom seed materializations**
3. **Some test setups**

**Signature**:

```python
class MyAdapter(SQLAdapter):
    @classmethod
    def convert_text_type(cls, agate_table, col_idx):
        # input: agate table + column index
        # output: SQL type string
        return 'TEXT'

    @classmethod
    def convert_number_type(cls, agate_table, col_idx):
        decimals = agate_table.aggregate(agate.MaxPrecision(col_idx))
        return 'DOUBLE' if decimals else 'BIGINT'

    @classmethod
    def convert_boolean_type(cls, agate_table, col_idx):
        return 'BOOLEAN'

    @classmethod
    def convert_datetime_type(cls, agate_table, col_idx):
        return 'TIMESTAMP'

    @classmethod
    def convert_date_type(cls, agate_table, col_idx):
        return 'DATE'

    @classmethod
    def convert_time_type(cls, agate_table, col_idx):
        return 'TIME'
```

**Use**:

```python
# Simplified sketch of the dbt seed flow:
import agate

# Step 1: Read the CSV (agate infers a type per column)
table = agate.Table.from_csv('seeds/countries.csv')

# Step 2: Map each inferred type to a SQL type
for i, col_type in enumerate(table.column_types):
    if isinstance(col_type, agate.Text):
        sql_type = MyAdapter.convert_text_type(table, i)
    elif isinstance(col_type, agate.Number):
        sql_type = MyAdapter.convert_number_type(table, i)
    elif isinstance(col_type, agate.Boolean):
        sql_type = MyAdapter.convert_boolean_type(table, i)
    elif isinstance(col_type, agate.Date):
        sql_type = MyAdapter.convert_date_type(table, i)
    elif isinstance(col_type, agate.DateTime):
        sql_type = MyAdapter.convert_datetime_type(table, i)
    elif isinstance(col_type, agate.TimeDelta):
        sql_type = MyAdapter.convert_time_type(table, i)

# Step 3: CREATE TABLE with the inferred types
# Step 4: INSERT the values
```

**Key difference**:

| Aspect | Column.translate_type | Adapter.convert_*_type |
|--------|------------------------|------------------------|
| **Input** | String (type name) | agate Table + column index |
| **Output** | Warehouse SQL type | Warehouse SQL type |
| **Use case** | Map between type systems | Infer SQL type from the data |
| **Used by** | Materializations, contracts | dbt seed primarily |
| **Considers data** | No | Yes (precision, max value, etc.) |

**Example workflow**:

**Scenario 1: dbt run on an existing model**:

```yaml
# schema.yml
models:
  - name: customers
    columns:
      - name: id
        data_type: INTEGER  # contract
```

When dbt builds CREATE TABLE:

```jinja
CREATE TABLE customers (
  id {{ api.Column.translate_type('INTEGER') }}
  -- uses the adapter's Column.translate_type
  -- result: 'NUMBER(38, 0)' on Snowflake, 'integer' on Postgres
)
```

Uses **translate_type**.

**Scenario 2: dbt seed on a CSV**:

```csv
# seeds/products.csv
id,name,price,launched_at,is_active
1,Widget,99.99,2024-01-15,true
```

dbt seed inspects the values:

```python
# Internal (illustrative): inferred agate types per column
#   id -> Number, name -> Text, price -> Number (with decimals),
#   launched_at -> Date, is_active -> Boolean

# Apply convert_*
types = [
    MyAdapter.convert_number_type(table, 0),   # 'BIGINT'
    MyAdapter.convert_text_type(table, 1),     # 'TEXT'
    MyAdapter.convert_number_type(table, 2),   # 'DOUBLE' (has decimals)
    MyAdapter.convert_date_type(table, 3),     # 'DATE'
    MyAdapter.convert_boolean_type(table, 4),  # 'BOOLEAN'
]

# Generated SQL:
# CREATE TABLE products (
#   id BIGINT,
#   name TEXT,
#   price DOUBLE,
#   launched_at DATE,
#   is_active BOOLEAN
# );
```

Uses **convert_*_type**.

**Customization**:

**For a new adapter, override both**:

```python
# Column class
class MyColumn(Column):
    @classmethod
    def translate_type(cls, dtype):
        TYPE_MAP = {'TEXT': 'STRING', 'INTEGER': 'INT64'}  # ...
        return TYPE_MAP.get(dtype.upper(), dtype)

# Adapter class
class MyAdapter(SQLAdapter):
    Column = MyColumn

    @classmethod
    def convert_text_type(cls, agate_table, col_idx):
        return 'STRING'

    @classmethod
    def convert_number_type(cls, agate_table, col_idx):
        ...
```

Both are needed for full functionality.

**Production tip**:

Keep the type names consistent between translate_type and convert_*_type:

```python
# Wrong: inconsistent
#   MyColumn.translate_type('TEXT')   -> 'VARCHAR'
#   MyAdapter.convert_text_type(...)  -> 'TEXT'
# CREATE TABLE then varies by path:
#   - from a contract: VARCHAR
#   - from a seed: TEXT

# Right: consistent
#   MyColumn.translate_type('TEXT')   -> 'STRING'
#   MyAdapter.convert_text_type(...)  -> 'STRING'
```

**Test conformance**:

```python
import agate

def test_consistent_type_mappings():
    # Both paths should produce the same SQL type for text
    table = agate.Table([['a']], ['name'], [agate.Text()])
    sql_from_translate = MyColumn.translate_type('TEXT')
    sql_from_convert = MyAdapter.convert_text_type(table, 0)
    assert sql_from_translate == sql_from_convert
```

This is **subtle but important**. Without consistency, the same logical type ends up with different SQL types depending on how the table was created.
