Lesson 19.10 · 18 min
Advanced

What’s next: resources, alternatives, and the path forward

Congratulations: you have gone from scheduler internals to production deployment, a capstone project, and a migration playbook. This final lesson covers what to do next: where to deepen your knowledge, which community resources are actively maintained, and which Airflow alternatives are worth studying for a broader architectural view.

This course covered Airflow 2.10/2.11 LTS in depth, but the industry also offers excellent Airflow alternatives. Knowing them is the mark of a senior engineer.

Orchestration ecosystem map: where Airflow and its alternatives fit

Airflow 2.x/3.x
General-purpose batch ETL orchestrator built on the tasks → DAG pattern. Python-based, with PostgreSQL as the state store. Largest community and most mature (since 2014). Best for: data engineering pipelines, scheduled ETL, cross-tool orchestration. The industry standard in 2026. Cloud SaaS: Astronomer.

When to choose something else?

  • Dagster (asset-first): software-defined assets instead of tasks. Each asset is a Python function that produces data, with type signatures, schema validation, and lineage. Strengths: type-safe data flow, clear materializations in the UI, strong testing story, IO Managers. Weaknesses: smaller ecosystem, less mature on the operations side. When: analytics engineering (dbt + Python), ML pipelines, strong typing culture.
  • Prefect 3 (dynamic, Pythonic): workflows are Python functions with the @flow decorator. The workflow structure can be decided at runtime (early exits, conditional branches). Strengths: Pythonic, no parse-time quirks, dynamic shape, Prefect Cloud SaaS with hybrid execution. Weaknesses: fewer enterprise features (RBAC, multi-tenancy), smaller provider ecosystem. When: dynamic workflows whose structure depends on runtime data.
  • Temporal (durable execution): not a data orchestrator but a durable execution framework. Workflows survive crashes, retries, and weeks-long durations. Multi-language (Python, Go, Java, TS). Strengths: truly durable, long-running workflows (days, weeks, months), strong typing. Used by Stripe and Snap; grew out of Uber's Cadence project. Weaknesses: no operators for Spark, Postgres, etc.; a different mental model. When: business processes (loan approval, multi-day sagas) that need durability guarantees.
  • Argo Workflows (K8s-native): a workflow is a Kubernetes CRD, written in YAML or with the Hera Python SDK. Strengths: K8s-native (no separate scheduler infrastructure), highly parallel (millions of short tasks), tight K8s integration (volumes, secrets). Weaknesses: YAML-heavy, fewer batch ETL features, smaller data engineering community. When: K8s-first organizations, ML pipelines (Kubeflow uses Argo), highly parallel scientific computing.
  • Kestra (declarative YAML): declarative YAML workflows plus a plugin ecosystem, with a focus on developer experience. Strengths: YAML-first (friendly to non-Python users), strong UI and DAG visualization, fast growth in 2024-2026. Weaknesses: newer (less battle-tested), smaller ecosystem. When: mixed-language teams that prefer declarative YAML and are willing to bet on a growing tool.

Senior engineer mindset: the right tool for the right job
The best Airflow engineer is not the one who knows every operator, but the one who knows when not to use Airflow. Senior architecture decisions: Airflow for general batch ETL, Dagster for asset-first analytics/ML, Temporal for long-running business processes, Argo for K8s-native parallel jobs. Knowing all five alternatives gives you the architectural breadth expected of a senior data engineer.

What you learned in this course

Recap of modules 00 to 18:

Module | Topic | What you learned
------ | ----- | ----------------
00-03 | Foundations | DAGs, operators, sensors
04 | Scheduler internals | Critical section, HA via row locks
05 | Executors | Celery, Kubernetes, Multiple Executors (AIP-61)
06 | XCom | Custom backends, S3 storage
07 | Dynamic Task Mapping | .expand() patterns
08 | Datasets | Data-aware scheduling
09 | Triggerer / Deferrable | asyncio, long-waiting sensors
10 | Secrets | Vault, Connection caching
11 | Pools, concurrency | Resource control
12 | Plugins, Listeners | Extension points
13 | REST API, CLI | Automation
14 | Observability | OpenTelemetry, OpenLineage, Marquez
15 | Production deployment | HA, Helm, PgBouncer, security
16 | Testing | DagBag, unit, integration, CI
17 | Design patterns | Idempotency, factory, error handling
18 | Capstone + Migration | Full pipeline + 3.x migration

This is comprehensive coverage. With this knowledge you can work as a senior-level Airflow engineer.


Resources for going deeper

Apache Airflow official

Astronomer ecosystem

  • Astronomer Academy: https://academy.astronomer.io — free courses, including certification (Astronomer Certification for Apache Airflow — industry-respected)
  • Astronomer blog: https://www.astronomer.io/blog — quality articles on advanced topics
  • Astronomer Webinars: live monthly events, recordings available
  • The Airflow Newsletter: weekly digest by Astronomer team

Conferences

  • Airflow Summit: annual conference (ASF event), talks available on YouTube
    • 2024 — virtual + Boston
    • 2025 — Bay Area
    • 2026 — likely hybrid
  • DataConf / Data Council — broader data engineering, Airflow tracks
  • Big Data LDN / Sweden / NL — European conferences

Books

  • Data Pipelines with Apache Airflow by Bas Harenslak & Julian de Ruiter (Manning, 2021): comprehensive, with a 2.x focus. An updated edition covering 3.x is expected in 2026
  • Fundamentals of Data Engineering by Joe Reis & Matt Housley (O’Reilly, 2022): broader context; treats orchestration (including Airflow) as one of the undercurrents of the data engineering lifecycle
  • The Data Engineering Cookbook by Andreas Kretz — free GitBook, practical

Blogs / Newsletters

  • Astronomer blog (mentioned above)
  • Maxime Beauchemin’s blog (Airflow creator, ex-Airbnb) — strategic perspective
  • Tobias Macey’s Data Engineering Podcast — interviews with Airflow committers
  • The Pragmatic Engineer (Gergely Orosz): newsletter with occasional Airflow content

Alternatives — architectural breadth

A senior data engineer should know the alternatives. Each tool solves a slightly different problem, and knowing when to use which one is a sign of maturity.

Dagster — asset-first orchestration

Concept: instead of thinking tasks-first (as Airflow does), Dagster thinks in terms of software-defined assets. Each asset is a Python function that produces specific data, with type signatures, schema validation, and lineage.

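# Dagster assets: each function produces one named data asset
import pandas as pd
from dagster import AssetExecutionContext, asset

@asset
def raw_orders(context: AssetExecutionContext) -> pd.DataFrame:
    # `conn` is assumed to be a database connection created elsewhere
    return pd.read_sql("SELECT * FROM source.orders", conn)

# The `raw_orders` parameter name declares the upstream dependency,
# so no separate `deps=` argument is needed (declaring both raises an error).
@asset
def cleaned_orders(raw_orders: pd.DataFrame) -> pd.DataFrame:
    return raw_orders.dropna()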

Strengths:

  • Type-safe data flow (Python typing enforced)
  • Better suited to analytics engineering / ML pipelines
  • Materializations clear in UI (asset → asset)
  • Strong testing story
  • IO Managers abstract storage

Weaknesses:

  • Smaller ecosystem than Airflow (fewer integrations)
  • Less mature on the operations side
  • Smaller community

When to choose: a data team focused on analytics engineering (dbt + Python), with a strong typing culture and asset-first thinking.
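To make the asset-first idea concrete, here is a minimal plain-Python sketch. It is not Dagster's implementation: it only shows how a dependency graph can be inferred from function parameter names (as in the `cleaned_orders(raw_orders)` signature) and how assets can then be computed in dependency order. The `ASSETS` registry, the `asset` decorator, and `materialize` are hypothetical names invented for this illustration, and the sample rows are made up.

```python
import inspect
from typing import Any, Callable, Optional

# Hypothetical mini-registry; real Dagster adds IO managers, type checks, lineage, etc.
ASSETS: dict[str, Callable[..., Any]] = {}


def asset(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Register a function as an asset keyed by its function name."""
    ASSETS[fn.__name__] = fn
    return fn


def materialize(name: str, cache: Optional[dict[str, Any]] = None) -> Any:
    """Compute an asset after recursively computing its upstream assets."""
    cache = {} if cache is None else cache
    if name not in cache:
        fn = ASSETS[name]
        # Each parameter name is treated as the name of an upstream asset.
        upstream = {p: materialize(p, cache) for p in inspect.signature(fn).parameters}
        cache[name] = fn(**upstream)
    return cache[name]


@asset
def raw_orders() -> list[dict]:
    # Made-up sample data standing in for a database query.
    return [{"id": 1, "amount": 10.0}, {"id": 2, "amount": None}]


@asset
def cleaned_orders(raw_orders: list[dict]) -> list[dict]:
    return [row for row in raw_orders if row["amount"] is not None]


result = materialize("cleaned_orders")
```

The sketch deliberately omits cycle detection and persistence; the point is only that the graph comes from the data definitions rather than from explicitly wired tasks.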

Prefect 3 — dynamic Pythonic workflow

Concept: workflows are Python functions with the @flow decorator. They are dynamic: the workflow structure can be decided at runtime.

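from prefect import flow, task

@task
def fetch_data(date: str) -> list[dict]:
    ...  # placeholder: load rows for the given date

@task
def process(rows: list[dict]) -> None:
    ...  # placeholder: transform and store the rows

@flow
def my_pipeline(date: str):
    data = fetch_data(date)
    if not data:
        return  # Early exit: decided at runtime, not at parse time
    process(data)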

Strengths:

  • Pythonic — no parse-time quirks
  • Dynamic — workflow can change shape
  • Prefect Cloud SaaS — managed control plane
  • Hybrid execution: workers run in your own cloud

Weaknesses:

  • Less mature in enterprise features (RBAC, multi-tenancy)
  • Smaller provider ecosystem

When to choose: dynamic workflows where structure depends on runtime data, Python-heavy team, want SaaS control plane.
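The difference from a parse-time DAG can be illustrated without Prefect at all. The following is a plain-Python sketch under stated assumptions: the `task` decorator here is a hypothetical stand-in that only records which functions ran, and the dates and rows are made up. It shows that the same flow function produces a different set of executed tasks depending on the data it sees at runtime.

```python
from typing import Callable

RUN_LOG: list[str] = []  # names of the tasks that actually executed, in order


def task(fn: Callable) -> Callable:
    """Hypothetical stand-in for a task decorator: records each call."""
    def wrapper(*args, **kwargs):
        RUN_LOG.append(fn.__name__)
        return fn(*args, **kwargs)
    return wrapper


@task
def fetch_data(date: str) -> list[int]:
    # Made-up data source: only one date has rows.
    return [1, 2, 3] if date == "2026-01-01" else []


@task
def process(rows: list[int]) -> int:
    return sum(rows)


def my_pipeline(date: str):
    rows = fetch_data(date)
    if not rows:
        return None  # early exit: `process` never becomes part of this run
    return process(rows)


def executed_tasks(date: str) -> list[str]:
    """Run the pipeline once and return the task names that ran."""
    RUN_LOG.clear()
    my_pipeline(date)
    return list(RUN_LOG)
```

Calling `executed_tasks` with a date that has data and with one that does not yields two different run shapes from a single definition, which is the property the "dynamic" label refers to.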

Temporal — durable execution

Concept: Temporal is not a data orchestrator but a durable execution framework. Workflows survive crashes, retries, and weeks-long durations. It is used for business processes, not batch ETL.

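import asyncio
from datetime import timedelta
from temporalio import workflow

# reserve_inventory, charge_payment, ship_order: activities (@activity.defn) assumed to be defined elsewhere
TIMEOUT = timedelta(minutes=5)  # illustrative value; execute_activity requires a timeout

@workflow.defn
class OrderProcessingWorkflow:
    @workflow.run
    async def run(self, order_id: str) -> None:
        await workflow.execute_activity(reserve_inventory, order_id, start_to_close_timeout=TIMEOUT)
        await workflow.execute_activity(charge_payment, order_id, start_to_close_timeout=TIMEOUT)
        await asyncio.sleep(timedelta(days=7).total_seconds())  # durable 7-day waiting period
        await workflow.execute_activity(ship_order, order_id, start_to_close_timeout=TIMEOUT)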

Strengths:

  • Truly durable — survives node crashes, restarts
  • Long-running workflows (days, weeks, months)
  • Strong typing
  • Used by Stripe and Snap; grew out of Uber's Cadence project

Weaknesses:

  • Not designed for data orchestration (no operators for Spark, Postgres, etc.)
  • Different mental model
  • Operations overhead

When to choose: business processes (loan approval workflow, multi-day sagas), need durability guarantees, can build own integrations.
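The core idea behind durable execution, recording completed steps and replaying them after a crash instead of re-running them, can be sketched in plain Python. This is a conceptual illustration only, not how the Temporal SDK is implemented: `Journal`, `WorkflowRun`, and the simulated crash are all hypothetical, and a real system would persist the journal outside the process.

```python
from typing import Any, Callable


class Journal:
    """Append-only record of completed activity results (stands in for durable storage)."""
    def __init__(self) -> None:
        self.results: list[Any] = []


class WorkflowRun:
    """One execution attempt of a workflow against a shared journal."""
    def __init__(self, journal: Journal) -> None:
        self.journal = journal
        self.position = 0  # index of the next activity in this attempt

    def execute_activity(self, fn: Callable[..., Any], *args: Any) -> Any:
        if self.position < len(self.journal.results):
            result = self.journal.results[self.position]  # replay: do not re-run
        else:
            result = fn(*args)
            self.journal.results.append(result)
        self.position += 1
        return result


CALLS: list[str] = []          # which activities really executed
CRASH_ONCE = {"armed": True}   # makes the first payment attempt fail


def reserve_inventory(order_id: str) -> str:
    CALLS.append("reserve")
    return f"reserved:{order_id}"


def charge_payment(order_id: str) -> str:
    CALLS.append("charge")
    if CRASH_ONCE["armed"]:
        CRASH_ONCE["armed"] = False
        raise RuntimeError("worker crashed")  # simulated crash
    return f"charged:{order_id}"


def order_workflow(run: WorkflowRun, order_id: str) -> list[str]:
    return [
        run.execute_activity(reserve_inventory, order_id),
        run.execute_activity(charge_payment, order_id),
    ]


journal = Journal()
try:
    order_workflow(WorkflowRun(journal), "o-1")
except RuntimeError:
    pass  # the first worker died mid-workflow
outcome = order_workflow(WorkflowRun(journal), "o-1")  # a new worker resumes
```

On the second attempt the inventory reservation is read back from the journal rather than executed again, so only the failed step is retried. This is also why workflow code in such systems has to be deterministic: replay must take the same path as the original run.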

Argo Workflows — Kubernetes-native

Concept: a workflow is a Kubernetes CRD (Custom Resource Definition). Workflows are YAML descriptions of pods, and the Argo controller runs them.

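apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: my-workflow-
spec:
  entrypoint: my-workflow
  templates:
    - name: my-workflow
      steps:
        - - name: extract
            template: run-pod
            arguments: {parameters: [{name: cmd, value: extract.py}]}
        - - name: transform
            template: run-pod
            arguments: {parameters: [{name: cmd, value: transform.py}]}
    # run-pod: illustrative template; the image is a placeholder
    - name: run-pod
      inputs:
        parameters:
          - name: cmd
      container:
        image: python:3.12
        command: [python, "{{inputs.parameters.cmd}}"]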

Strengths:

  • Kubernetes-native — no separate scheduler infra
  • Highly parallel — millions of short tasks
  • YAML or SDK (Hera Python)
  • Tight K8s integration (volumes, secrets, etc)

Weaknesses:

  • YAML-heavy (Hera SDK helps)
  • Fewer batch ETL features than Airflow
  • Smaller community in data engineering space

When to choose: K8s-first organizations, ML pipelines (Kubeflow uses Argo internally), highly parallel scientific computing.
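Because a Workflow is just a Kubernetes resource, the manifest can also be generated from code instead of being written by hand. The sketch below uses only the standard library and is not the Hera SDK: `step` and `build_workflow` are hypothetical helpers, and the `python:3.12` image is a placeholder. It builds a manifest for the extract → transform example, including a `run-pod` template for the steps to reference, and serializes it to JSON, which Kubernetes accepts alongside YAML.

```python
import json


def step(name: str, cmd: str) -> list[dict]:
    """One sequential step group containing a single step that runs `cmd`."""
    return [{
        "name": name,
        "template": "run-pod",
        "arguments": {"parameters": [{"name": "cmd", "value": cmd}]},
    }]


def build_workflow(scripts: dict[str, str]) -> dict:
    """Build a Workflow manifest with one sequential step per script."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Workflow",
        "metadata": {"generateName": "my-workflow-"},
        "spec": {
            "entrypoint": "my-workflow",
            "templates": [
                {
                    "name": "my-workflow",
                    "steps": [step(n, c) for n, c in scripts.items()],
                },
                {
                    "name": "run-pod",
                    "inputs": {"parameters": [{"name": "cmd"}]},
                    # Placeholder image; any image containing the scripts would do.
                    "container": {
                        "image": "python:3.12",
                        "command": ["python", "{{inputs.parameters.cmd}}"],
                    },
                },
            ],
        },
    }


manifest = build_workflow({"extract": "extract.py", "transform": "transform.py"})
manifest_json = json.dumps(manifest, indent=2)
```

Generating manifests this way keeps loops and parameterization in Python while the cluster still receives an ordinary declarative resource.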

Kestra — declarative YAML

Concept: declarative YAML workflows + plugins ecosystem. Targets developer experience.

id: orders-etl
namespace: production
tasks:
  - id: extract
    type: io.kestra.plugin.jdbc.postgresql.Query
    sql: SELECT * FROM orders
  - id: transform
    type: io.kestra.plugin.scripts.python.Script
    script: |
      df = ...

Strengths:

  • YAML-first (easier for non-Python users)
  • Strong UI / DAG visualization
  • Growing fast in 2024-2026

Weaknesses:

  • Newer (less battle-tested)
  • Smaller ecosystem

When to choose: organizations with mixed-language teams, prefer declarative YAML, willing to bet on growing tool.


Comparison matrix

Dimension | Airflow 2.x/3.x | Dagster | Prefect 3 | Temporal | Argo
--------- | --------------- | ------- | --------- | -------- | ----
Primary use | Batch ETL | Analytics engineering / ML | Dynamic workflows | Business processes | K8s parallel jobs
Language | Python | Python | Python | Multi (Python, Go, Java, TS) | YAML / Python SDK
Pattern | Tasks → DAG | Assets → graph | Functions → flows | Workflows + activities | Steps → Pod templates
Scheduling | Cron, datasets | Cron, sensors | Cron, dynamic | Schedules, or triggered by your code | Cron via CronWorkflow
State store | PostgreSQL | PostgreSQL / cloud | PostgreSQL / cloud | Cassandra / MySQL / PostgreSQL | etcd (via K8s)
Best for | General-purpose batch | dbt + Python + ML | Pythonic dynamic | Multi-day business processes | K8s-native
Maturity | Very high | High | Medium-high | Very high | High
Community size | Largest | Large | Large | Medium | Medium (K8s overlap)
Cloud SaaS | Astronomer | Dagster+ (formerly Dagster Cloud) | Prefect Cloud | Temporal Cloud | No official SaaS (third-party managed offerings)
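One way to put the matrix to work is to encode it as data and add a first-pass selection rule. The sketch below is a hypothetical heuristic, not an established method: the `Tool` dataclass and the `suggest` function with its flag names are invented here, and only the "Primary use" and "Pattern" rows are copied from the matrix. It checks the most specialized needs first and falls back to Airflow for general batch ETL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    primary_use: str
    pattern: str


# Values copied from the "Primary use" and "Pattern" rows of the matrix.
TOOLS = [
    Tool("Airflow", "Batch ETL", "Tasks → DAG"),
    Tool("Dagster", "Analytics engineering / ML", "Assets → graph"),
    Tool("Prefect 3", "Dynamic workflows", "Functions → flows"),
    Tool("Temporal", "Business processes", "Workflows + activities"),
    Tool("Argo", "K8s parallel jobs", "Steps → Pod templates"),
]


def suggest(long_running_business_process: bool = False,
            k8s_native_parallel: bool = False,
            shape_depends_on_runtime_data: bool = False,
            asset_first_analytics: bool = False) -> str:
    """Hypothetical first-pass heuristic: most specialized need wins."""
    if long_running_business_process:
        return "Temporal"
    if k8s_native_parallel:
        return "Argo"
    if shape_depends_on_runtime_data:
        return "Prefect 3"
    if asset_first_analytics:
        return "Dagster"
    return "Airflow"  # general-purpose batch ETL default


def describe(name: str) -> Tool:
    """Look up the matrix row for a suggested tool."""
    return next(tool for tool in TOOLS if tool.name == name)
```

For example, `describe(suggest(asset_first_analytics=True))` returns the Dagster row. A real decision would also weigh team skills, existing infrastructure, and operational cost, which a handful of boolean flags cannot capture.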

Career paths

Specialization paths for Airflow expertise:

Path 1 — Senior Data Engineer

  • Lead the Airflow implementation at your company
  • Set patterns / standards (capstone-level architecture)
  • Mentor junior engineers
  • Make tooling decisions (Airflow vs alternatives)

Path 2 — Data Platform Engineer

  • Design data platform architecture
  • Multi-tool integration (Airflow + Spark + dbt + ClickHouse)
  • Infrastructure focus — Helm, K8s, observability
  • Cross-team enablement

Path 3 — Open Source Contributor

  • Contribute to Apache Airflow (commits, PRs, AIPs)
  • Provider package maintenance
  • Become committer / PMC member
  • Conference speaking

Path 4 — Founder / Consultant

  • Specialized Airflow consulting
  • Migrate companies from custom solutions
  • Build complementary products (lineage tools, monitoring)
  • Vendor (Astronomer-like) work

Final advice

After this course:

  1. Build something real: a capstone-level project in production at your job or as a side project. Theory without practice fades.

  2. Contribute: fix one bug, add one provider, write one blog post. Every contribution brings both learning and visibility.

  3. Read the source code: Airflow is open source, and the scheduler is about 2-3k lines and readable. A few hours of reading source is worth a month of reading docs.

  4. Stay current: Airflow evolves roughly every 6 months. Read the release notes, attend the Summit, follow the Slack.

  5. Know the alternatives: even if you stay with Airflow, knowing Dagster/Prefect/Temporal gives perspective. Architecture decisions improve with a broader view.

  6. Production matters: anyone can write a DAG. Few can run Airflow reliably at scale. Master HA, monitoring, and security; those who do are valuable.

  7. Mentor: teach a junior engineer the Airflow basics. Teaching crystallizes knowledge.


Conclusion

Airflow in 2026 is the industry standard for workflow orchestration, and its relevance only grows as data platforms become more complex. The knowledge from this course is a solid foundation for a long-term career in data engineering.

The course covered Airflow 2.x in depth because 2.10/2.11 LTS is where most production deployments are in 2026. The 3.x migration playbook (module 18) prepares you for a smooth transition when the time is right.

Whether you build pipelines, lead teams, or contribute upstream, you now have the tools for professional excellence in this space.

Good luck with your production deployments.


Final thought

“The best Airflow engineer is not the one who knows every operator, but the one who knows when not to use Airflow.”

The right tool for the right job. This course gave you deep Airflow knowledge plus a sense of when another tool is the better choice. That is the senior engineer mindset.


Knowledge check

A junior engineer asks: "Almost all courses and books focus on Airflow 2.x. It is 2026 now: should I focus on 2.x knowledge or jump straight to 3.x?" What would you advise, and why?

Answer

Pragmatic answer: focus on 2.x first, then transition to 3.x. Do not skip 2.x entirely.

Reasoning:

  1. Production reality. In 2026, 90%+ of production deployments run 2.10/2.11. If you join a company in the next 2-3 years, you will work with 2.x, and job interviews test 2.x knowledge. Skipping 2.x leaves you with knowledge that does not match your first year on the job.

  2. 3.x is an evolution of 2.x. The concepts are identical (DAGs, tasks, scheduler, executors), so most of what you learn in 2.x transfers directly. The TaskFlow API decorators are the same; Datasets and Assets are the same concept; idempotency, error handling, and testing follow identical principles. Skipping 2.x to learn 3.x means learning 80% of the same material and missing the 20% that is transition context.

  3. 2.x LTS is supported until 2027. Patches and security fixes continue, so skipping 2.x means not understanding code that will be actively maintained for 2+ more years.

  4. Migration knowledge is sought after. In 2026-2028 companies will be migrating from 2.x to 3.x, and engineers who understand both versions are valuable on migration projects. Pure 3.x knowledge without 2.x context is worth less.

  5. Mental model formation. 2.x has architectural quirks (workers access the metadata database directly, FAB is bundled). Understanding why 3.x was needed is the foundation for appreciating its design decisions; skipping 2.x means missing the "why" behind the design.

Concrete learning plan:

  1. Months 1-3: 2.x foundations (modules 1-14 of this course). Build small DAGs; learn TaskFlow, Datasets, and deferrable operators.

  2. Months 4-6: 2.x production topics (modules 15-17). HA, testing, design patterns.

  3. Months 7-9: 3.x transition (module 18 and the official 3.x docs). Understand AIP-72, AIP-66, and the Assets renaming.

  4. Months 10-12: real projects. Contribute to open source, build a capstone-level pipeline at work, attend Airflow Summit.

Specific tools: a local Docker setup with 2.11 LTS for experimentation (it matches production); the AIP documents for historical perspective; Airflow Summit 2024-2026 talks (free on YouTube); the Apache Airflow Slack, the most active community resource.

What not to do: do not try to learn only 3.x from the docs alone, because the concepts build on 2.x context. Do not memorize all operators (3000+); learn patterns instead. Do not skip the testing, observability, and production modules; these distinguish senior engineers from junior ones.

Career advice: the Airflow market expects 5+ years of experience for senior roles, so 2.x mastery stays valuable through 2027 and beyond, even after most production deployments migrate to 3.x. A solid 2.x foundation plus 3.x migration knowledge is a competitive position for the next decade.
