The previous section gave a brief overview of Airflow's concepts and use cases. Today we install Airflow with Docker and get to know its features in detail as we use it.

1 Containerized deployment of Airflow

Alibaba Cloud host environment:

    Operating system: Ubuntu 20.04.3 LTS
    Kernel version:   Linux 5.4.0-91-generic

Install Docker

To install Docker, follow the official documentation [1]. This is a clean system, so there is no old version to uninstall. Because the host runs on a cloud platform, you can take a snapshot beforehand in case a configuration change breaks the environment.

    # update repo
    sudo apt-get update
    sudo apt-get install \
        ca-certificates \
        curl \
        gnupg \
        lsb-release

    # add docker gpg key
    curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg

    # set up the docker stable repository
    echo \
      "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu \
      $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

    # refresh the package index so the new repository is visible
    sudo apt-get update

    # list the docker-ce versions available for installation
    root@bigdata1:~# apt-cache madison docker-ce
    docker-ce | 5:20.10.12~3-0~ubuntu-focal | https://download.docker.com/linux/ubuntu focal/stable amd64 Packages
    docker-ce | 5:20.10.11~3-0~ubuntu-focal | https://download.docker.com/linux/ubuntu focal/stable amd64 Packages
    docker-ce | 5:20.10.10~3-0~ubuntu-focal | https://download.docker.com/linux/ubuntu focal/stable amd64 Packages
    docker-ce | 5:20.10.9~3-0~ubuntu-focal | https://download.docker.com/linux/ubuntu focal/stable amd64 Packages

    # format of the install command
    # sudo apt-get install docker-ce=<VERSION_STRING> docker-ce-cli=<VERSION_STRING> containerd.io

    # install a specific version
    sudo apt-get install docker-ce=5:20.10.12~3-0~ubuntu-focal docker-ce-cli=5:20.10.12~3-0~ubuntu-focal containerd.io

Tune the Docker configuration

The settings below belong in the Docker daemon configuration file (normally /etc/docker/daemon.json). Under registry-mirrors, list one or more accelerator addresses, for example an Alibaba Cloud mirror. JSON does not allow comments, so keep notes like this outside the file.

    {
      "data-root": "/var/lib/docker",
      "exec-opts": ["native.cgroupdriver=systemd"],
      "registry-mirrors": [
        "https://****.mirror.aliyuncs.com"
      ],
      "storage-driver": "overlay2",
      "storage-opts": ["overlay2.override_kernel_check=true"],
      "log-driver": "json-file",
      "log-opts": {
        "max-size": "100m",
        "max-file": "3"
      }
    }

Start Docker at boot

    systemctl daemon-reload
    systemctl enable --now docker.service

Containerized installation of Airflow

Choosing a database

According to the official site, the recommended databases are MySQL 8+ and PostgreSQL 9.6+. The official docker-compose script [2] uses PostgreSQL, so the contents of docker-compose.yml need to be adjusted for MySQL:

    ---
    version: '3'
    x-airflow-common:
      &airflow-common
      # In order to add custom dependencies or upgrade provider packages you can use your extended image.
      # Comment the image line, place your Dockerfile in the directory where you placed the docker-compose.yaml
      # and uncomment the "build" line below, Then run `docker-compose build` to build the images.
      image: ${AIRFLOW_IMAGE_NAME:-apache/airflow:2.2.3}
      # build: .
      environment:
        &airflow-common-env
        AIRFLOW__CORE__EXECUTOR: CeleryExecutor
        AIRFLOW__CORE__SQL_ALCHEMY_CONN: mysql+mysqldb://airflow:aaaa@mysql/airflow  # switched to the MySQL connection string
        AIRFLOW__CELERY__RESULT_BACKEND: db+mysql://airflow:aaaa@mysql/airflow  # switched to the MySQL connection string
        AIRFLOW__CELERY__BROKER_URL: redis://:xxxx@redis:6379/0  # Redis authentication is enabled for security, so replace xxxx with the Redis password
        AIRFLOW__CORE__FERNET_KEY: ''
        AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION: 'true'
        AIRFLOW__CORE__LOAD_EXAMPLES: 'true'
        AIRFLOW__API__AUTH_BACKEND: 'airflow.api.auth.backend.basic_auth'
        _PIP_ADDITIONAL_REQUIREMENTS: ${_PIP_ADDITIONAL_REQUIREMENTS:-}
      volumes:
        - ./dags:/opt/airflow/dags
        - ./logs:/opt/airflow/logs
        - ./plugins:/opt/airflow/plugins
      user: "${AIRFLOW_UID:-50000}:0"
      depends_on:
        &airflow-common-depends-on
        redis:
          condition: service_healthy
        mysql:  # changed to the mysql service name
          condition: service_healthy

    services:
      mysql:
        image: mysql:8.0.27  # changed to the latest MySQL image
        environment:
          MYSQL_ROOT_PASSWORD: bbbb  # password of the MySQL root account
          MYSQL_USER: airflow
          MYSQL_PASSWORD: aaaa  # password of the airflow user
          MYSQL_DATABASE: airflow
        command:
          - --default-authentication-plugin=mysql_native_password  # default authentication plugin
          - --collation-server=utf8mb4_general_ci  # collation, as specified by the official docs
          - --character-set-server=utf8mb4  # character set, as specified by the official docs
        volumes:
          - /apps/airflow/mysqldata8:/var/lib/mysql  # persist MySQL data
          - /apps/airflow/my.cnf:/etc/my.cnf  # persist the MySQL configuration file
        healthcheck:
          test: mysql --user=$$MYSQL_USER --password=$$MYSQL_PASSWORD -e 'SHOW DATABASES;'  # healthcheck command
          interval: 5s
          retries: 5
        restart: always

      redis:
        image: redis:6.2
        expose:
          - 6379
        command: redis-server --requirepass xxxx  # enable password authentication in redis-server
        healthcheck:
          test: ["CMD", "redis-cli", "-a", "xxxx", "ping"]  # the healthcheck authenticates with the Redis password
          interval: 5s
          timeout: 30s
          retries: 50
        restart: always

      airflow-webserver:
        <<: *airflow-common
        command: webserver
        ports:
          - 8080:8080
        healthcheck:
          test: ["CMD", "curl", "--fail", "http://localhost:8080/health"]
          interval: 10s
          timeout: 10s
          retries: 5
        restart: always
        depends_on:
          <<: *airflow-common-depends-on
          airflow-init:
            condition: service_completed_successfully

      airflow-scheduler:
        <<: *airflow-common
        command: scheduler
        healthcheck:
          test: ["CMD-SHELL", 'airflow jobs check --job-type SchedulerJob --hostname "$${HOSTNAME}"']
          interval: 10s
          timeout: 10s
          retries: 5
        restart: always
        depends_on:
          <<: *airflow-common-depends-on
          airflow-init:
            condition: service_completed_successfully

      airflow-worker:
        <<: *airflow-common
        command: celery worker
        healthcheck:
          test:
            - "CMD-SHELL"
            - 'celery --app airflow.executors.celery_executor.app inspect ping -d "celery@$${HOSTNAME}"'
          interval: 10s
          timeout: 10s
          retries: 5
        environment:
          <<: *airflow-common-env
          # Required to handle warm shutdown of the celery workers properly
          # See https://airflow.apache.org/docs/docker-stack/entrypoint.html#signal-propagation
          DUMB_INIT_SETSID: "0"
        restart: always
        depends_on:
          <<: *airflow-common-depends-on
          airflow-init:
            condition: service_completed_successfully

      airflow-triggerer:
        <<: *airflow-common
        command: triggerer
        healthcheck:
          test: ["CMD-SHELL", 'airflow jobs check --job-type TriggererJob --hostname "$${HOSTNAME}"']
          interval: 10s
          timeout: 10s
          retries: 5
        restart: always
        depends_on:
          <<: *airflow-common-depends-on
          airflow-init:
            condition: service_completed_successfully

      airflow-init:
        <<: *airflow-common
        entrypoint: /bin/bash
        # yamllint disable rule:line-length
        command:
          - -c
          - |
            function ver() {
              printf "%04d%04d%04d%04d" $${1//./ }
            }
            airflow_version=$$(gosu airflow airflow version)
            airflow_version_comparable=$$(ver $${airflow_version})
            min_airflow_version=2.2.0
            min_airflow_version_comparable=$$(ver $${min_airflow_version})
            if (( airflow_version_comparable [...]

    # Note: AIRFLOW_UID must be the UID of a regular user, and that user must have
    # permission to create these persistent directories. If it is not a regular
    # user, the container fails at startup because the airflow module cannot be found.
    [...] .env

    # initialize the database and create the tables
    docker-compose up airflow-init

If a container's status is unhealthy, run docker inspect $container_name to find the cause of the error.

At this point, the Airflow installation is complete.

References

[1] Install Docker Engine on Ubuntu: https://docs.docker.com/engine/install/ubuntu/
[2] Official docker-compose.yaml: https://airflow.apache.org/docs/apache-airflow/2.2.3/docker-compose.yaml
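
For the unhealthy-container check mentioned above, the healthcheck history that `docker inspect` reports can be pulled out with a short script. This is only a minimal sketch and not part of the original setup: the function names `summarize_health` and `docker_inspect` are hypothetical, and it assumes the `State.Health` layout (`Status`, plus `Log` entries with `ExitCode` and `Output`) that Docker reports for containers with a healthcheck.

```python
import json
import subprocess
from typing import List


def summarize_health(inspect_json: str) -> List[str]:
    """Turn `docker inspect` JSON into one summary line per healthcheck run."""
    containers = json.loads(inspect_json)  # `docker inspect` prints a JSON array
    lines = []
    for container in containers:
        # Docker stores the container name with a leading slash.
        name = container.get("Name", "?").lstrip("/")
        health = container.get("State", {}).get("Health")
        if health is None:
            lines.append(f"{name}: no healthcheck configured")
            continue
        lines.append(f"{name}: status={health.get('Status')}")
        # Each log entry records one run of the healthcheck command.
        for entry in health.get("Log") or []:
            output = (entry.get("Output") or "").strip()
            lines.append(f"  exit={entry.get('ExitCode')} output={output!r}")
    return lines


def docker_inspect(container_name: str) -> str:
    """Run `docker inspect <container_name>` and return its raw JSON output."""
    result = subprocess.run(
        ["docker", "inspect", container_name],
        capture_output=True,
        text=True,
        check=True,  # raise if docker reports an error (e.g. unknown container)
    )
    return result.stdout
```

On the Docker host, `print("\n".join(summarize_health(docker_inspect(name))))` with `name` set to one of the compose containers lists the recent healthcheck exit codes and their output, which is usually enough to see why a service such as the webserver or MySQL is marked unhealthy.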