我正在尝试使用run_as_user功能在气流中为我们的DAG和我们面临一些问题。有什么帮助或建议吗?
DAG Code:from datetime import datetime, timedelta
from airflow import DAG
from airflow.operators.bash_operator import BashOperator
current_time = datetime.now() - timedelta(days=1)
default_args = {
'start_date': datetime.strptime(current_time.strftime('%Y-%m-%d %H:%M:%S'),'%Y-%m-%d %H:%M:%S'),
'run_as_user': 'airflowaduser',
'execution_timeout': timedelta(minutes=5)
}
dag = DAG('test_run-as_user', default_args=default_args,description='Run hive Query DAG', schedule_interval='0 * * * *',)
hive_ex = BashOperator(
task_id='hive-ex',
bash_command='whoami',
dag=dag
)我有气流添加到sudoers,它可以切换到气流用户,没有密码从Linux外壳。
airflow ALL=(ALL) NOPASSWD: ALL
运行DAG时,下面的错误详细信息:
*** Reading local file: /home/airflow/logs/test_run-as_user/hive-ex/2020-06-09T16:00:00+00:00/1.log
[2020-06-09 17:00:04,602] {taskinstance.py:620} INFO - Dependencies all met for <TaskInstance: test_run-as_user.hive-ex 2020-06-09T16:00:00+00:00 [queued]>
[2020-06-09 17:00:04,613] {taskinstance.py:620} INFO - Dependencies all met for <TaskInstance: test_run-as_user.hive-ex 2020-06-09T16:00:00+00:00 [queued]>
[2020-06-09 17:00:04,613] {taskinstance.py:838} INFO -
--------------------------------------------------------------------------------
[2020-06-09 17:00:04,613] {taskinstance.py:839} INFO - Starting attempt 1 of 1
[2020-06-09 17:00:04,613] {taskinstance.py:840} INFO -
--------------------------------------------------------------------------------
[2020-06-09 17:00:04,651] {taskinstance.py:859} INFO - Executing <Task(BashOperator): hive-ex> on 2020-06-09T16:00:00+00:00
[2020-06-09 17:00:04,651] {base_task_runner.py:133} INFO - Running: ['sudo', '-E', '-H', '-u', 'airflowaduser', 'airflow', 'run', 'test_run-as_user', 'hive-ex', '2020-06-09T16:00:00+00:00', '--job_id', '2314', '--pool', 'default_pool', '--raw', '-sd', 'DAGS_FOLDER/test_run-as_user/testscript.py', '--cfg_path', '/tmp/tmpbinlgw54']
[2020-06-09 17:00:04,664] {base_task_runner.py:115} INFO - Job 2314: Subtask hive-ex sudo: airflow: command not found
[2020-06-09 17:00:09,576] {logging_mixin.py:95} INFO - [[34m2020-06-09 17:00:09,575[0m] {[34mlocal_task_job.py:[0m105} INFO[0m - Task exited with return code 1[0m我们的气流在虚拟环境中运行。
发布于 2022-10-14 13:04:33
在虚拟环境中运行气流时,只有用户“气流”被配置为运行airflow命令。如果希望以另一个用户的身份运行,则需要将主目录设置为与气流用户(/home/airflow)相同的目录,并使其属于0组。请参阅https://airflow.apache.org/docs/docker-stack/entrypoint.html#allowing-arbitrary-user-to-run-the-container
此外,run_as_user特性调用sudo,只允许使用安全路径。airflow命令的位置不是安全路径的一部分,但可以添加到sudoers文件中。您可以使用whereis airflow检查气流目录在哪里,在我的容器中它是/home/airflow/.local/bin。
为了解决这个问题,我需要在我的Dockerfile中添加4行:
RUN useradd -u [airflowaduser UID] -g 0 -d /home/airflow kettle && \
# create airflowaduser
usermod -u [airflow UID] -aG sudo airflow && \
# add airflow to sudo group
echo "airflow ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers && \
# allow airflow to run sudo without a password
sed -i 's#/.venv/bin#/home/airflow/.local/bin:/.venv/bin#' /etc/sudoers
# update secure path to include the airflow directoryhttps://stackoverflow.com/questions/62303665
复制相似问题