Choose the right Airflow scheduler settings
February 6, 2020
When you need to scale your Cloud Composer environment, you’ll want to choose the right Airflow configs as well as node and machine type settings.
The Airflow default for the scheduler's `max_threads` setting is only two, which means that even if the Airflow scheduler pod runs on a 32-core node, it can launch only two DAG parsing processes. We therefore recommend setting `max_threads` to at least the number of vCPUs per machine.
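In Cloud Composer you can apply this as an Airflow configuration override. A minimal sketch, assuming a node with 8 vCPUs; the environment name `example-environment` and location `us-central1` are placeholders:

```
# Raise the scheduler's DAG-parsing thread count to match the node's vCPU count.
# Environment name, location, and the value 8 are illustrative placeholders.
gcloud composer environments update example-environment \
  --location us-central1 \
  --update-airflow-configs=scheduler-max_threads=8
```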
If you find tasks spending a long time in the SCHEDULED state, they may be constrained by `dag_concurrency` or `non_pooled_task_slot_count`. Consider increasing the value of either option.
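If that turns out to be the bottleneck, both options can be raised the same way, with a configuration override. The values below are only illustrative, and the environment name and location are again placeholders:

```
# Allow more concurrent task instances per DAG and more non-pooled task slots.
# Values are illustrative; tune them to your workload.
gcloud composer environments update example-environment \
  --location us-central1 \
  --update-airflow-configs=core-dag_concurrency=32,core-non_pooled_task_slot_count=128
```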
If you find tasks stuck in the QUEUED state, they may be constrained by `parallelism`. They may also be limited by worker processing power, because tasks only move to the RUNNING state after a worker picks them up. Consider increasing `parallelism` or adding more worker nodes, as shown below.
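A sketch of both options, with placeholder environment name, location, and values:

```
# Raise the global cap on concurrently running task instances.
gcloud composer environments update example-environment \
  --location us-central1 \
  --update-airflow-configs=core-parallelism=64

# Or scale the environment out to more worker nodes.
gcloud composer environments update example-environment \
  --location us-central1 \
  --node-count 4
```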
Test Airflow worker performance
Cloud Composer launches a worker pod for each node in your environment. Each worker pod can launch multiple worker processes to fetch and run tasks from the Celery queue. The number of processes a worker pod can launch is limited by the Airflow `worker_concurrency` config.
To test worker performance, we ran a benchmark based on a no-op PythonOperator and found that six or seven concurrent worker processes already seem to fully utilize one vCPU with 3.75 GB of RAM (the default n1-standard-1 machine type). Adding more worker processes beyond that can introduce large context-switching overhead and can even cause out-of-memory issues in worker pods, ultimately disrupting task execution.
`worker_concurrency = 6-8 * cores_per_node` (or per 3.75 GB of RAM)
Cloud Composer uses six as the default concurrency value for environments. For environments with more cores in a single node, use the above formula to quickly get a `worker_concurrency` number that works for you. If you do want higher concurrency, we recommend monitoring worker pod stability closely after the new value takes effect: worker pod evictions caused by out-of-memory errors may indicate that the concurrency value is too high, and your real limit may vary with your worker processes' memory consumption.
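As with the other settings, the new value can be applied as a configuration override. A sketch assuming nodes with 2 vCPUs and roughly 6 processes per vCPU; the environment name, location, and value are placeholders:

```
# 2 vCPUs per node * ~6 processes per vCPU = 12 concurrent worker processes per pod.
gcloud composer environments update example-environment \
  --location us-central1 \
  --update-airflow-configs=celery-worker_concurrency=12
```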
Another consideration is long-running operations that are not CPU-intensive, such as polling status on a remote server: each one still ties up a whole Airflow worker process and its memory. We advise raising your `worker_concurrency` number slowly and monitoring closely after each adjustment.