Migration infrastructure — DAG Processor, FastAPI, FAB removal, REST API v2
The DAG code is migrated (previous lessons), but that is only half the work. Airflow 3.x fundamentally changes the infrastructure: the standalone DAG Processor becomes mandatory, the webserver becomes a FastAPI API server, FAB moves out into pluggable auth providers, and REST API v1 gives way to v2. This lesson is a practical guide to infrastructure migration in Helm-based deployments.
Change 1: Standalone DAG Processor — mandatory
In 2.x the standalone DAG Processor is optional (by default, parsing runs inside the scheduler). In 3.x it is mandatory.
Why
In 2.x the scheduler parses DAGs inline:
Scheduler main loop:
  Phase 0: DAG parsing (slow, IO-bound)
  Phase 1: Create DagRuns
  Phase 2: Schedule TIs
  Phase 3: Enqueue (critical section)
Heavy DAGs (TensorFlow imports, complex factories) slow down Phase 0, which slows the entire scheduler tick.
In 3.x the DAG Processor is a separate process with its own concurrency, dedicated to parsing. The scheduler runs pure scheduling logic and is faster.
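To make the effect concrete, here is a toy model of one scheduler tick. It is only a sketch: the phase durations are invented numbers chosen for illustration, not measurements from a real deployment, and `tick_duration` is a hypothetical helper.

```python
# Toy model: all durations are made-up numbers, not measurements.
PHASES_SECONDS = {
    "dag_parsing": 4.0,      # Phase 0: slow, IO-bound
    "create_dagruns": 0.2,   # Phase 1
    "schedule_tis": 0.5,     # Phase 2
    "enqueue": 0.3,          # Phase 3: critical section
}


def tick_duration(phases: dict[str, float], parse_inline: bool) -> float:
    """Return the length of one scheduler tick in seconds.

    With parse_inline=False, parsing is assumed to run in a separate
    DAG Processor and does not count toward the scheduler tick.
    """
    return sum(
        seconds
        for name, seconds in phases.items()
        if parse_inline or name != "dag_parsing"
    )


inline = tick_duration(PHASES_SECONDS, parse_inline=True)
standalone = tick_duration(PHASES_SECONDS, parse_inline=False)
print(f"inline parsing: {inline:.1f}s per tick")
print(f"standalone DAG Processor: {standalone:.1f}s per tick")
```

With parsing moved out of the loop, the tick length depends only on the scheduling phases, however heavy the DAG files are.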
Helm 2.x configuration
# values 2.x (uses optional dagProcessor)
dagProcessor:
  enabled: true  # was optional
  replicas: 1
# values 3.x (mandatory)
dagProcessor:
  # No 'enabled' option: always on
  replicas: 1
  # Multiple replicas for multiple DAG bundles
Migration: set dagProcessor.enabled: true while still on 2.x as a preview of the mandatory 3.x setup. The capstone deployment (module 18.04) already does this.
Change 2: Webserver Flask → FastAPI API Server
In 2.x: Flask + Flask-AppBuilder, with the UI rendered server-side from Jinja templates.
In 3.x: a FastAPI API Server plus a separate React UI. Architecture:
Client (browser)
↓ HTTPS
React UI (static files)
↓ /api/v2/* AJAX
FastAPI API Server
↓ async
PostgreSQL
What this means
- Webserver renamed to API Server in Helm values
- Custom webserver_config.py (Flask) — broken
- Custom Flask views — broken (need React + FastAPI rewrites)
- Static assets served separately
- WebSocket support better (FastAPI native)
Helm config
# values 2.x
webserver:
  replicas: 3
  webserverConfig: |
    # Flask config
    AUTH_TYPE = AUTH_OAUTH
    OAUTH_PROVIDERS = [...]
# values 3.x
apiServer:  # renamed from webserver
  replicas: 3
  # No more webserverConfig: auth via provider config
  config: {}
Migration:
- Verify there are no custom Flask views in webserver_config.py
- Custom plugins with Flask blueprints → rewrite (or live without them)
- UI customizations (CSS overrides): apply through React UI config
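The first two checks in this list can be partly automated. The sketch below walks a plugins directory and reports files that import Flask or Flask-AppBuilder. It is a rough heuristic, not an official tool: the directory name `plugins` and the helper names are assumptions, and an import hit only marks a file for manual review.

```python
import ast
from pathlib import Path

# Top-level modules that suggest a plugin depends on the 2.x Flask webserver.
FLASK_MODULES = {"flask", "flask_appbuilder"}


def flask_imports(source: str) -> set[str]:
    """Return the Flask-related top-level modules imported by the source."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names = [node.module]
        else:
            continue
        for name in names:
            top_level = name.split(".")[0]
            if top_level in FLASK_MODULES:
                found.add(top_level)
    return found


def scan_plugins(plugins_dir: str) -> dict[str, set[str]]:
    """Map each plugin file that imports Flask modules to those modules."""
    report = {}
    for path in sorted(Path(plugins_dir).rglob("*.py")):
        try:
            hits = flask_imports(path.read_text(encoding="utf-8"))
        except SyntaxError:
            continue  # skip files that do not parse
        if hits:
            report[str(path)] = hits
    return report


if __name__ == "__main__":
    # "plugins" is an assumed location; point this at your own plugins folder.
    for file, modules in scan_plugins("plugins").items():
        print(f"{file}: imports {', '.join(sorted(modules))}")
```

Every file it lists is a candidate for the rewrite-or-drop decision above.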
Change 3: FAB removed → pluggable auth providers (AIP-79)
Biggest infrastructure change. FAB (Flask-AppBuilder) was bundled in 2.x, providing built-in auth (DB-based users, OIDC, LDAP).
In 3.x FAB is removed from core. Auth goes through AuthManager providers:
Available auth providers (2026)
| Provider | Description | Maintainer |
|---|---|---|
| apache-airflow-providers-fab | Legacy FAB compat | Apache (provider package) |
| apache-airflow-providers-amazon (AwsIamAuthManager) | AWS IAM | Apache |
| Custom AuthManager classes | OIDC, SAML, JWT, custom | Community |
Migration option 1 — use FAB provider for backward compat
Simplest path: install apache-airflow-providers-fab — works almost as drop-in.
pip install apache-airflow-providers-fab
# airflow.cfg 3.x
[core]
auth_manager = airflow.providers.fab.auth_manager.fab_auth_manager.FabAuthManager
The configuration is almost identical to the 2.x webserver_config.py. Migrating to the FAB provider takes about 30 minutes for most setups.
Migration option 2 — switch to native AuthManager
For modern OIDC setups, native auth provider may be cleaner:
# plugins/auth/okta_auth_manager.py
from airflow.api_fastapi.auth.managers.base_auth_manager import BaseAuthManager

class OktaAuthManager(BaseAuthManager):
    # Implement the abstract methods; the list below is illustrative,
    # check BaseAuthManager in your Airflow version for the exact set.
    def get_user(self): ...
    def is_logged_in(self): ...
    def get_url_login(self): ...
    # etc.

[core]
auth_manager = plugins.auth.okta_auth_manager.OktaAuthManager
Major rewrite. Postpone unless really need it.
Helm config example
# values 3.x
apiServer:
  config:
    core:
      auth_manager: "airflow.providers.fab.auth_manager.fab_auth_manager.FabAuthManager"
  # FAB-specific config:
  fabConfig: |
    AUTH_TYPE = AUTH_OAUTH
    OAUTH_PROVIDERS = [{...}]
Change 4: REST API v1 → v2
In 2.x: REST API v1 (Connexion on Flask) at /api/v1/*.
In 3.x: REST API v2 (FastAPI) at /api/v2/*. Differences:
Response format
// 2.x v1 example
{
  "dag_id": "my_dag",
  "is_paused": false,
  "last_parsed_time": "2026-05-12T10:00:00+00:00"
}
// 3.x v2 example
{
  "dag_id": "my_dag",
  "is_paused": false,
  "last_parsed_time": "2026-05-12T10:00:00+00:00",
  "_links": {  // new HATEOAS links
    "self": "/api/v2/dags/my_dag"
  }
}
Most fields are identical. Some renames:
- execution_date → logical_date
- camelCase introduced in places
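Consumers that still build or parse v1-shaped payloads can translate field names at a single boundary instead of touching every call site. The sketch below covers only the `execution_date` → `logical_date` rename named above. `rename_fields` and `V1_TO_V2_FIELDS` are hypothetical names, and any further renames should be added only after checking them against the v2 API reference.

```python
# Renames named in this lesson; extend only after checking the v2 API reference.
V1_TO_V2_FIELDS = {"execution_date": "logical_date"}


def rename_fields(payload, renames=V1_TO_V2_FIELDS):
    """Recursively rename dict keys in a JSON-like payload.

    Nested dicts and lists are walked; all values are left unchanged.
    """
    if isinstance(payload, dict):
        return {
            renames.get(key, key): rename_fields(value, renames)
            for key, value in payload.items()
        }
    if isinstance(payload, list):
        return [rename_fields(item, renames) for item in payload]
    return payload


v1_run = {"dag_run_id": "manual_1", "execution_date": "2026-05-12T10:00:00+00:00"}
print(rename_fields(v1_run))
```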
Auth headers
# 2.x v1 — Basic auth common
Authorization: Basic <base64>
# 3.x v2 — Bearer token preferred
Authorization: Bearer <jwt>
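The two header styles can be built with the standard library alone. This is a minimal sketch with hypothetical helper names; how the token itself is issued depends on the configured auth manager and is not shown here.

```python
import base64


def basic_auth_header(username: str, password: str) -> dict[str, str]:
    """Build the 2.x-style Basic auth header from a username and password."""
    credentials = f"{username}:{password}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(credentials).decode("ascii")}


def bearer_auth_header(token: str) -> dict[str, str]:
    """Build the 3.x-style Bearer header from an already issued token."""
    return {"Authorization": f"Bearer {token}"}
```

Either dict can be passed as `headers=` to an HTTP client, which keeps the change of auth scheme in one place in each consumer.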
Migration consumers
API consumers (CI/CD scripts, monitoring tools, custom integrations):
# 2.x consumer
import requests

r = requests.get("https://airflow.example.com/api/v1/dags", auth=("admin", "pass"))
dags = r.json()["dags"]

# 3.x consumer
import requests

# Bearer token (obtained by POSTing credentials to /auth/token)
token = "..."
r = requests.get(
    "https://airflow.example.com/api/v2/dags",
    headers={"Authorization": f"Bearer {token}"},
)
dags = r.json()["dags"]
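For Python clients there is the apache-airflow-client package, auto-generated from the OpenAPI spec:
pip install apache-airflow-client

import airflow_client.client
from airflow_client.client.api import dag_api

configuration = airflow_client.client.Configuration(host="https://airflow.example.com")
configuration.access_token = "..."
with airflow_client.client.ApiClient(configuration) as api_client:
    dags = dag_api.DAGApi(api_client).get_dags()  # typed response model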
Change 5: gitSync → DAG Bundles (AIP-66)
In 2.x: a gitSync sidecar in every pod (module 15.03).
In 3.x: the DAG Bundles abstraction, with pluggable DAG sources.
# 2.x values
dags:
  gitSync:
    enabled: true
    repo: git@github.com:org/dags.git
    branch: main
    subPath: "dags"
# 3.x values
dagBundles:
  - name: production-dags
    classpath: airflow.dag_bundles.git.GitDagBundle
    kwargs:
      repo_url: https://github.com/org/dags
      branch: main
      subdir: dags
    refresh_interval: 60
  - name: experimental-dags
    classpath: airflow.dag_bundles.s3.S3DagBundle
    kwargs:
      bucket: airflow-experimental-dags
    refresh_interval: 300
DAG Bundles are superior:
- Multiple sources (git, S3, HTTP) in one Airflow instance
- Per-bundle refresh intervals
- Per-bundle versioning
- No sidecar overhead (DAG Processor pulls)
Migration: rewrite the dags.gitSync config as a dagBundles array. Helm chart 2.x handles the new key.
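The rewrite is mechanical enough to script. The sketch below converts a 2.x `dags.gitSync` mapping into a one-element `dagBundles` list, using the key names and classpath exactly as they appear in this lesson's example values. Treat that schema as an assumption to verify against the chart you deploy. `git_sync_to_bundles` is a hypothetical helper, and the input dict stands in for parsed values.yaml content.

```python
def git_sync_to_bundles(values: dict, name: str = "production-dags") -> list[dict]:
    """Build a dagBundles list from a 2.x dags.gitSync mapping.

    Returns an empty list when gitSync is absent or disabled.
    """
    git_sync = values.get("dags", {}).get("gitSync", {})
    if not git_sync.get("enabled"):
        return []
    kwargs = {"repo_url": git_sync["repo"]}
    if "branch" in git_sync:
        kwargs["branch"] = git_sync["branch"]
    if git_sync.get("subPath"):
        kwargs["subdir"] = git_sync["subPath"]
    return [
        {
            "name": name,
            # Classpath as written in this lesson; confirm it for your version.
            "classpath": "airflow.dag_bundles.git.GitDagBundle",
            "kwargs": kwargs,
        }
    ]


old_values = {
    "dags": {
        "gitSync": {
            "enabled": True,
            "repo": "https://github.com/org/dags",
            "branch": "main",
            "subPath": "dags",
        }
    }
}
print(git_sync_to_bundles(old_values))
```

Settings with no counterpart in the old values, such as refresh_interval, still have to be chosen by hand.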
Change 6: Helm chart 1.x → 2.x
In 2026 the official Apache Airflow Helm chart 2.x is released, aligned with Airflow 3.x. Major changes:
# Major key renames in values.yaml
# 1.x → 2.x
webserver:     → apiServer:
dags.gitSync:  → dagBundles: [...]
fernetKey:     → (unchanged)
data:          → (unchanged)
postgresql:    → (deprecated, use external)
redis:         → (deprecated, use external)
Full Helm migration:
# 1. Backup current values
helm get values airflow -n airflow > values-2x-backup.yaml
# 2. Translate manually (or use chart migration script)
# (Helm 2.x chart will provide migration script)
# 3. Upgrade (--version pins Helm chart 2.0+)
helm upgrade airflow apache-airflow/airflow \
  --version 2.0.0 \
  --values values-3x.yaml \
  --namespace airflow \
  --wait
Reading the Helm chart 2.x docs is essential: there are many key renames.
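Before translating values by hand, it can help to list which legacy keys a values file still contains. The sketch below checks a parsed values dict against the renames and deprecations listed in this lesson. The key list is taken from the text above, not from the chart itself, and the helper names are hypothetical.

```python
# (dotted path, advice) pairs taken from the renames listed in this lesson.
LEGACY_KEYS = [
    ("webserver", "rename to apiServer"),
    ("dags.gitSync", "rewrite as dagBundles"),
    ("dagProcessor.enabled", "remove; the DAG Processor is always on"),
    ("postgresql", "deprecated; use an external instance"),
    ("redis", "deprecated; use an external instance"),
]


def has_path(values: dict, dotted: str) -> bool:
    """Return True if the dotted key path exists in the nested dict."""
    node = values
    for part in dotted.split("."):
        if not isinstance(node, dict) or part not in node:
            return False
        node = node[part]
    return True


def legacy_findings(values: dict) -> list[str]:
    """List every legacy key found in values, with the suggested action."""
    return [f"{path}: {advice}" for path, advice in LEGACY_KEYS if has_path(values, path)]


values_2x = {
    "webserver": {"replicas": 3},
    "dags": {"gitSync": {"enabled": True}},
    "dagProcessor": {"enabled": True},
    "scheduler": {"replicas": 2},
}
for finding in legacy_findings(values_2x):
    print(finding)
```

In practice the dict would come from loading the backup file produced in step 1 with a YAML parser.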
Change 7: Per-component DB users (module 15.08, even more important)
In 3.x the Task SDK enforces stricter boundaries. Per-component DB users (module 15.08) become more important:
-- airflow_api_server: for the API server (replaces airflow_webserver)
-- airflow_scheduler: same as 2.x
-- airflow_dag_processor: new, for the DAG Processor
-- airflow_task_worker: for workers (Task SDK goes through the API, limited DB access)
The Task SDK enforces "tasks don't query the metadata DB directly", but workers still need a DB connection for intermediate result writes. Set up the users carefully.
Helm migration example — capstone values
Before (2.x):
# values-capstone-2x.yaml
executor: "CeleryExecutor,KubernetesExecutor"
webserver:
  replicas: 3
dags:
  gitSync:
    enabled: true
    repo: ...
scheduler:
  replicas: 2
triggerer:
  replicas: 2
dagProcessor:
  enabled: true
After (3.x):
# values-capstone-3x.yaml
executor: "CeleryExecutor,KubernetesExecutor"
apiServer:  # renamed
  replicas: 3
  config:
    core:
      auth_manager: "airflow.providers.fab.auth_manager.fab_auth_manager.FabAuthManager"
  fabConfig: |
    AUTH_TYPE = AUTH_OAUTH
    OAUTH_PROVIDERS = [...]
dagBundles:  # was dags.gitSync
  - name: production
    classpath: airflow.dag_bundles.git.GitDagBundle
    kwargs:
      repo_url: https://github.com/org/dags
      branch: main
      subdir: dags
scheduler:
  replicas: 2
triggerer:
  replicas: 2
dagProcessor:  # no 'enabled': always on
  replicas: 1
Key changes:
- webserver → apiServer
- dags.gitSync → dagBundles
- FAB config preserved via the provider
- dagProcessor.enabled removed (always on)
Migration ordering
Recommended infrastructure migration sequence:
Week 1: Pre-flight
- upgrade-check warnings = 0
- Staging environment ready
- Backup all DBs
- Document current Helm values
Week 2: Staging upgrade (parallel deployment)
- Deploy the 3.x Helm chart in a new namespace
- airflow db migrate (on a copy of the staging Postgres)
- Verify all DAGs parse
- Run sample DagRuns
- Test API consumers against the new v2 endpoints
Week 3-4: Validation
- Soak test 1-2 weeks
- Compare metrics 2.x vs 3.x
- Verify performance regression < 10%
- User acceptance testing for UI changes
Week 5: Production migration
- Saturday morning (low traffic)
- Blue/green switch
- Monitor for 48 hours
- Decommission 2.x
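The "performance regression < 10%" gate in the validation weeks is simple arithmetic and can be scripted. The sketch below assumes metrics where lower is better, such as latencies. The metric names and values are invented for illustration, and `regression_pct` and `failing_metrics` are hypothetical helpers.

```python
def regression_pct(baseline: float, candidate: float) -> float:
    """Percent by which candidate is higher (worse) than baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (candidate - baseline) / baseline * 100.0


def failing_metrics(
    baseline: dict[str, float],
    candidate: dict[str, float],
    limit_pct: float = 10.0,
) -> dict[str, float]:
    """Return metrics whose regression reaches or exceeds limit_pct.

    Only metrics present in both dicts are compared.
    """
    failures = {}
    for name, base_value in baseline.items():
        if name not in candidate:
            continue
        pct = regression_pct(base_value, candidate[name])
        if pct >= limit_pct:
            failures[name] = pct
    return failures


# Invented sample numbers (seconds); replace with values from your monitoring.
metrics_2x = {"scheduler_loop_seconds": 1.20, "task_queue_delay_seconds": 4.0}
metrics_3x = {"scheduler_loop_seconds": 1.26, "task_queue_delay_seconds": 4.8}
print(failing_metrics(metrics_2x, metrics_3x))
```

An empty result means every compared metric stayed under the limit. Throughput-style metrics, where higher is better, would need the sign flipped.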
Production gotchas
Helm chart 2.x release timing. The Apache Airflow Helm chart 2.0+ is released some time after Airflow 3.0. Use Astronomer Astro or wait for the official chart before production migration. Don't migrate just the code; wait for the chart to mature.
Custom Flask plugins: major rewrite. If you have custom webserver views (sidebar items, custom pages), they must be completely rewritten in React + FastAPI. Estimate weeks per plugin.
REST API v1 availability. Do not assume a parallel-run period: Airflow 3.0 serves /api/v2 and drops the legacy /api/v1 endpoints, so API consumers must be migrated and tested in staging before the production switch.
FAB provider stable, but limited. apache-airflow-providers-fab provides legacy compat, but not all FAB features have been carried over. If you used obscure FAB hooks, check the provider docs.
Multiple-Executors AIP-61 syntax may evolve. 3.x may add the Edge Executor (AIP-69) and updates to the executor selection syntax. Stay current with the release notes.
Backup before migration is imperative. Running migration tools on production without a backup is a career-ending mistake. Take an RDS snapshot and a pg_dump before airflow db migrate.
Test the rollback procedure. Practice in staging: delete the 3.x namespace and restore 2.x. Document the exact steps.