Metric types: simple, ratio, cumulative
В предыдущем уроке мы видели simple metric (direct alias на measure) и derived metric (formula). В этом уроке детально разберём остальные типы: simple (revisited), ratio (для conversion rate, CTR), cumulative (для running totals, MTD/YTD, retention windows).
Каждый тип решает свой класс задач. Понимая их, вы можете описать любую бизнес-метрику.
Fact Tables: мера как основа для MetricFlow simple metricSimple metric — direct measure
Самый простой тип. Прямое отображение measure -> metric.
# В semantic_model:
measures:
- name: order_count
agg: count
expr: order_id
# В metrics:
metrics:
- name: total_orders
description: "Total number of orders"
type: simple
type_params:
measure: order_count
Когда total_orders запрашивается:
client.query(metrics=['total_orders'], group_by=['metric_time__month'])
MetricFlow генерирует:
SELECT
DATE_TRUNC('month', order_date) AS metric_time__month,
COUNT(order_id) AS total_orders
FROM analytics.fct_orders
GROUP BY 1
Простая агрегация. Применяется когда метрика = одно измерение без сложных вычислений.
Use cases:
total_revenue = SUM(amount)customer_count = COUNT(DISTINCT customer_id)avg_order_value = AVG(amount)
Ratio metric — numerator / denominator
Когда метрика — это отношение. Conversion rate, click-through rate, refund rate.
# В semantic_model orders:
measures:
- name: order_count
agg: count
expr: order_id
- name: refunded_order_count
agg: count
expr: order_id
agg_params:
filter: "refund_amount > 0" # boolean filter
# В metrics:
metrics:
- name: refund_rate
description: "Percent of orders that were refunded"
type: ratio
type_params:
numerator: refunded_order_count
denominator: order_count
Когда refund_rate запрашивается by month:
SELECT
DATE_TRUNC('month', order_date) AS month,
COUNT(CASE WHEN refund_amount > 0 THEN order_id END) * 1.0 / COUNT(order_id) AS refund_rate
FROM analytics.fct_orders
GROUP BY 1
* 1.0 важно — без него на integer denominator получим integer division (0 вместо 0.05).
Ratio с фильтрами
metrics:
- name: paid_customer_conversion
description: "% of trial users who became paid customers"
type: ratio
type_params:
numerator:
name: paid_customer_count
filter: "subscription_status = 'paid'"
denominator:
name: customer_count
filter: "signed_up_at IS NOT NULL"
Можно фильтровать numerator и denominator независимо. Это для сложных conversion funnels.
Ratio с different semantic_models
metrics:
- name: revenue_per_customer
description: "Average revenue per customer (LTV proxy)"
type: ratio
type_params:
numerator: order_amount # measure on `orders` semantic_model
denominator: customer_count # measure on `customers` semantic_model
MetricFlow знает что измерения на разных таблицах. Делает JOIN через entities -> агрегирует.
SELECT
SUM(o.amount) AS numerator,
COUNT(DISTINCT c.customer_id) AS denominator,
SUM(o.amount) * 1.0 / COUNT(DISTINCT c.customer_id) AS revenue_per_customer
FROM analytics.fct_orders o
RIGHT JOIN analytics.dim_customers c
ON o.customer_id = c.customer_id
Cumulative metric — running aggregate
Когда метрика — это накопительная сумма за окно. MTD revenue, YTD revenue, rolling 7-day active users, 30-day cohort retention.
# В metrics:
metrics:
- name: revenue_mtd
description: "Month-to-date revenue (cumulative within month)"
type: cumulative
type_params:
measure: order_amount
window: 1 month # окно
grain_to_date: month # restart accumulation each month
При запросе:
client.query(
metrics=['revenue_mtd'],
group_by=['metric_time__day']
)
MetricFlow генерирует:
SELECT
d.day,
SUM(o.amount) OVER (
PARTITION BY DATE_TRUNC('month', d.day)
ORDER BY d.day
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS revenue_mtd
FROM (
SELECT DISTINCT DATE_TRUNC('day', order_date) AS day
FROM analytics.fct_orders
) d
LEFT JOIN analytics.fct_orders o
ON DATE_TRUNC('day', o.order_date) <= d.day
AND DATE_TRUNC('month', o.order_date) = DATE_TRUNC('month', d.day)
ORDER BY 1
Window function с restart по началу месяца.
Cumulative без grain_to_date — всё время
metrics:
- name: total_lifetime_revenue
description: "Lifetime revenue (cumulative from start)"
type: cumulative
type_params:
measure: order_amount
# no window, no grain_to_date — cumulate over all history
SQL:
SELECT
day,
SUM(amount) OVER (ORDER BY day ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS total_lifetime_revenue
FROM ...
Cumulative с window — rolling
metrics:
- name: revenue_rolling_30_day
description: "Rolling 30-day revenue"
type: cumulative
type_params:
measure: order_amount
window: 30 day # окно 30 дней
SQL:
SELECT
day,
SUM(amount) OVER (
ORDER BY day
RANGE BETWEEN INTERVAL '30 day' PRECEDING AND CURRENT ROW
) AS revenue_rolling_30_day
FROM ...
Каждый день — sum последних 30 дней. Полезно для tendency analysis (тренды без шума weekend dips).
Derived metric — формула из других metrics
Уже встречали. Композиция из других metrics через expr.
metrics:
- name: profit_margin
description: "Profit / Revenue percentage"
type: derived
type_params:
expr: "(revenue - costs) / revenue * 100"
metrics:
- name: revenue
- name: costs
Compiled SQL aggregates revenue и costs, потом считает (revenue - costs) / revenue * 100.
Derived с filter
metrics:
- name: revenue_paid_tier
type: derived
type_params:
expr: "revenue"
metrics:
- name: revenue
filter: "{'{{ Dimension(\'customer__tier\') }}'} = 'paid'"
{'{{ Dimension(...) }}'} — Jinja-ссылка на dimension в filter context. Применит фильтр внутри метрики.
Conversion metric — funnel’ы
Специальный тип для conversion funnels:
metrics:
- name: trial_to_paid_conversion
description: "% of trial users who converted to paid within 30 days"
type: conversion
type_params:
base_measure:
name: trial_signups
conversion_measure:
name: paid_signups
entity: customer_id
window: 30 day
conversion_type_params:
calculation: conversion_rate
Compiled SQL — два аггрегата с window-based JOIN:
WITH base AS (
SELECT customer_id, signup_at FROM ... WHERE type = 'trial'
),
conversion AS (
SELECT customer_id, conversion_at FROM ... WHERE type = 'paid'
),
joined AS (
SELECT
b.customer_id,
b.signup_at,
c.conversion_at,
CASE WHEN c.conversion_at IS NOT NULL AND
c.conversion_at <= b.signup_at + INTERVAL '30 days'
THEN 1 ELSE 0 END AS converted
FROM base b
LEFT JOIN conversion c USING (customer_id)
)
SELECT
COUNT(CASE WHEN converted = 1 THEN customer_id END) * 1.0 / COUNT(customer_id) AS conversion_rate
FROM joined
Conversion metric — мощный для analytics. Available в MetricFlow 1.10+.
Полный пример: e-commerce SaaS
# models/semantic_models/mt_metrics.yml
metrics:
# === Simple ===
- name: total_revenue
description: "Sum of order amounts"
type: simple
type_params:
measure: order_amount
- name: total_orders
description: "Count of orders"
type: simple
type_params:
measure: order_count
- name: active_customer_count
description: "Distinct customers in period"
type: simple
type_params:
measure: customer_count_distinct
# === Ratio ===
- name: avg_order_value
description: "Revenue / orders"
type: ratio
type_params:
numerator: order_amount
denominator: order_count
- name: refund_rate
description: "Refunded orders / total orders"
type: ratio
type_params:
numerator: refunded_order_count
denominator: order_count
- name: revenue_per_customer
description: "Revenue / unique customers"
type: ratio
type_params:
numerator: order_amount
denominator: customer_count_distinct
# === Cumulative ===
- name: revenue_mtd
description: "Month-to-date revenue"
type: cumulative
type_params:
measure: order_amount
grain_to_date: month
- name: revenue_ytd
description: "Year-to-date revenue"
type: cumulative
type_params:
measure: order_amount
grain_to_date: year
- name: revenue_rolling_30d
description: "Rolling 30-day revenue"
type: cumulative
type_params:
measure: order_amount
window: 30 day
# === Derived ===
- name: net_revenue
description: "Revenue minus refunds"
type: derived
type_params:
expr: "total_revenue - refund_amount"
metrics:
- name: total_revenue
- name: refund_amount
- name: gross_margin_pct
description: "(Revenue - Cost) / Revenue * 100"
type: derived
type_params:
expr: "(total_revenue - total_costs) / total_revenue * 100"
metrics:
- name: total_revenue
- name: total_costs
Этот набор покрывает ~80% типичных e-commerce SaaS метрик. На основе одного semantic_model orders + связанный customers.
Сравнение типов метрик
| Type | Что делает | SQL pattern | Use case |
|---|---|---|---|
simple | Direct measure | SUM(x) FROM t GROUP BY | total revenue, total orders |
ratio | num / denom | SUM(num) / SUM(denom) | conversion rate, refund rate, AOV |
cumulative | Running sum в window | SUM() OVER (...) | MTD/YTD revenue, rolling 30d |
derived | Formula из metrics | (a - b) / a * 100 | margin, NRR, growth rate |
conversion | Funnel rate с window | window-based JOIN | trial->paid, signup->purchase |
В реальной project’е используются все 5 типов в comparable пропорциях. Понимая каждый, можно описать почти любую бизнес-метрику.
Performance considerations
| Type | Performance characteristic |
|---|---|
simple | Простой SUM/COUNT. Быстро. |
ratio | Два aggregate’а. Чуть медленнее. |
cumulative | Window function. Может быть тяжёлым на большом периоде. |
derived | Зависит от underlying metrics. Если они быстрые — derived быстрый. |
conversion | Самый тяжёлый. Window JOIN на entity + time -> большой query. |
Оптимизация:
- cumulative: если запрашивается MTD каждый день для dashboard — материализуй через saved query.
- conversion: всегда материализуй для production usage. Чистый ad-hoc — слишком медленно.
- Pre-aggregations — топовая оптимизация. dbt Cloud saved queries или Cube pre-aggregations.
Попробуй сам
Расширьте семантические модели из прошлого урока. Добавьте все 4 типа метрик:
# models/semantic_models/mt_metrics.yml
metrics:
- name: total_orders
type: simple
type_params:
measure: order_count
- name: aov # average order value
type: ratio
type_params:
numerator: order_amount
denominator: order_count
- name: revenue_mtd
type: cumulative
type_params:
measure: order_amount
grain_to_date: month
- name: net_orders
type: derived
type_params:
expr: "total_orders - refunded_orders"
metrics:
- name: total_orders
- name: refunded_orders
Запустите:
dbt parse
dbt sl validate
dbt sl query --metrics aov --group_by metric_time__month
dbt sl query --metrics revenue_mtd --group_by metric_time__day
Получите SQL и результаты. Посмотрите как MetricFlow генерирует разный SQL для каждого типа.
Бонус: создайте conversion metric customer_to_repeat_purchase — % новых покупателей которые сделали второй заказ в 90 дней. Это типичная customer retention метрика.
Ключевые выводы
- 5 типов метрик в MetricFlow: simple, ratio, cumulative, derived, conversion.
- simple — direct alias на measure. SUM/COUNT/AVG в GROUP BY.
- ratio — numerator/denominator. Для conversion rate, refund rate, AOV.
* 1.0важно для float division. - cumulative — running sum в window.
grain_to_daterestartит accumulation (MTD, YTD).windowдля rolling (30d). - derived — формула из других metrics через
expr. Composition без duplicating logic. - conversion (1.10+) — funnel rate с time window. Самый сложный, тяжёлый на performance.
- Performance: simple -> ratio -> cumulative -> conversion в порядке возрастания cost. Для production cumulative/conversion — материализуйте через saved queries.
- На реальном проекте используются все типы, mix зависит от бизнеса (E-commerce — много ratio, SaaS — много cumulative и conversion).