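I am running Airflow 1.8 on CentOS 7 in Docker, and I cannot reach the webserver from my browser. I installed airflow with pip2.7. The Flower UI displays fine, initdb ran and connected to the Postgres and Redis backends, I am using the CeleryExecutor, running on ECS, and I run as root. The webserver is started with airflow webserver on the default port 8080.
Does anyone know the cause of, or a fix for, the gunicorn workers exiting? Specifically, it seems to come down to this line: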
ERROR - [0 / 0] some workers seem to have died and gunicorndid not restart them as expected
Full log:
[2018-04-13 20:05:01,161] {db.py:287} INFO - Creating tables
INFO [alembic.runtime.migration] Context impl PostgresqlImpl.
INFO [alembic.runtime.migration] Will assume transactional DDL.
Done.
[2018-04-13 20:05:02,358] {__init__.py:57} INFO - Using executor CeleryExecutor
/usr/local/lib/python2.7/site-packages/flask/exthook.py:71: ExtDeprecationWarning: Importing flask.ext.cache is deprecated, use flask_cache instead.
.format(x=modname), ExtDeprecationWarning
____________ _____________
____ |__( )_________ __/__ /________ __
____ /| |_ /__ ___/_ /_ __ /_ __ \_ | /| / /
___ ___ | / _ / _ __/ _ / / /_/ /_ |/ |/ /
_/_/ |_/_/ /_/ /_/ /_/ \____/____/|__/
[2018-04-13 20:05:03,363] [1] {models.py:167} INFO - Filling up the DagBag from /usr/local/airflow/dags
[2018-04-13 20:05:04,488] {__init__.py:57} INFO - Using executor CeleryExecutor
[2018-04-13 20:05:04 +0000] [18] [INFO] Starting gunicorn 19.3.0
[2018-04-13 20:05:04 +0000] [18] [INFO] Listening at: http://0.0.0.0:8080 (18)
[2018-04-13 20:05:04 +0000] [18] [INFO] Using worker: sync
[2018-04-13 20:05:04 +0000] [24] [INFO] Booting worker with pid: 24
[2018-04-13 20:05:05 +0000] [25] [INFO] Booting worker with pid: 25
/usr/local/lib/python2.7/site-packages/flask/exthook.py:71: ExtDeprecationWarning: Importing flask.ext.cache is deprecated, use flask_cache instead.
.format(x=modname), ExtDeprecationWarning
[2018-04-13 20:05:05 +0000] [26] [INFO] Booting worker with pid: 26
[2018-04-13 20:05:05 +0000] [27] [INFO] Booting worker with pid: 27
/usr/local/lib/python2.7/site-packages/flask/exthook.py:71: ExtDeprecationWarning: Importing flask.ext.cache is deprecated, use flask_cache instead.
.format(x=modname), ExtDeprecationWarning
Running the Gunicorn Server with:
Workers: 4 sync
Host: 0.0.0.0:8080
Timeout: 120
Logfiles: - -
=================================================================
/usr/local/lib/python2.7/site-packages/flask/exthook.py:71: ExtDeprecationWarning: Importing flask.ext.cache is deprecated, use flask_cache instead.
.format(x=modname), ExtDeprecationWarning
/usr/local/lib/python2.7/site-packages/flask/exthook.py:71: ExtDeprecationWarning: Importing flask.ext.cache is deprecated, use flask_cache instead.
.format(x=modname), ExtDeprecationWarning
[2018-04-13 20:05:06,461] [24] {models.py:167} INFO - Filling up the DagBag from /usr/local/airflow/dags
[2018-04-13 20:05:07,873] [1] {cli.py:723} ERROR - [0 / 0] some workers seem to have died and gunicorndid not restart them as expected
[2018-04-13 20:05:08,271] [27] {models.py:167} INFO - Filling up the DagBag from /usr/local/airflow/dags
[2018-04-13 20:05:08,271] [25] {models.py:167} INFO - Filling up the DagBag from /usr/local/airflow/dags
[2018-04-13 20:05:08,271] [26] {models.py:167} INFO - Filling up the DagBag from /usr/local/airflow/dags
[2018-04-13 20:05:09 +0000] [25] [INFO] Parent changed, shutting down:
[2018-04-13 20:05:09 +0000] [25] [INFO] Worker exiting (pid: 25)
[2018-04-13 20:05:09 +0000] [26] [INFO] Parent changed, shutting down:
[2018-04-13 20:05:09 +0000] [26] [INFO] Worker exiting (pid: 26)
[2018-04-13 20:05:09 +0000] [27] [INFO] Parent changed, shutting down:
[2018-04-13 20:05:09 +0000] [27] [INFO] Worker exiting (pid: 27)
I swear I fixed this not long ago, and I don't know what happened. Here is the list of pip packages I have installed:
airflow (1.8.0)
alembic (0.8.10)
amqp (2.2.2)
asn1crypto (0.24.0)
awscli (1.15.4)
Babel (2.5.3)
backports-abc (0.5)
billiard (3.5.0.3)
boto3 (1.7.4)
botocore (1.10.4)
celery (4.0.2)
certifi (2018.1.18)
cffi (1.11.5)
chardet (3.0.4)
click (6.7)
colorama (0.3.7)
croniter (0.3.20)
cryptography (2.2.2)
Cython (0.28.2)
dill (0.2.7.1)
docutils (0.14)
enum34 (1.1.6)
Flask (0.11.1)
Flask-Admin (1.4.1)
Flask-Cache (0.13.1)
Flask-Login (0.2.11)
flask-swagger (0.2.13)
Flask-WTF (0.12)
flower (0.9.2)
funcsigs (1.0.0)
future (0.15.2)
futures (3.2.0)
gitdb2 (2.0.3)
GitPython (2.1.9)
gunicorn (19.3.0)
idna (2.6)
ipaddress (1.0.19)
itsdangerous (0.24)
Jinja2 (2.8.1)
jmespath (0.9.3)
kombu (4.1.0)
lockfile (0.12.2)
lxml (3.8.0)
Mako (1.0.7)
Markdown (2.6.11)
MarkupSafe (1.0)
ndg-httpsclient (0.4.4)
numpy (1.14.2)
ordereddict (1.1)
pandas (0.22.0)
pip (9.0.3)
psutil (4.4.2)
psycopg2-binary (2.7.4)
pyasn1 (0.4.2)
pycparser (2.18)
Pygments (2.2.0)
pyOpenSSL (17.5.0)
python-daemon (2.1.2)
python-dateutil (2.7.2)
python-editor (1.0.3)
python-nvd3 (0.14.2)
python-slugify (1.1.4)
pytz (2018.4)
PyYAML (3.12)
redis (2.10.6)
requests (2.18.4)
rsa (3.4.2)
s3transfer (0.1.13)
setproctitle (1.1.10)
setuptools (39.0.1)
singledispatch (3.4.0.3)
six (1.11.0)
smmap2 (2.0.3)
SQLAlchemy (1.2.6)
tabulate (0.7.7)
thrift (0.9.3)
tornado (5.0.2)
Unidecode (1.0.22)
urllib3 (1.22)
vine (1.1.4)
Werkzeug (0.14.1)
wheel (0.31.0)
WTForms (2.1)
zope.deprecation (4.3.0)
Update: I installed from source, and now I get this error from the webserver.
[2018-04-14 00:20:48,594] {{cli.py:718}} ERROR - [0 / 0] some workers seem to have died and gunicorndid not restart them as expected
[2018-04-14 00:20:50,396] {{models.py:197}} INFO - Filling up the DagBag from /usr/local/airflow/dags
[2018-04-14 00:20:50,396] {{models.py:197}} INFO - Filling up the DagBag from /usr/local/airflow/dags
[2018-04-14 00:20:50,396] {{models.py:197}} INFO - Filling up the DagBag from /usr/local/airflow/dags
[2018-04-14 00:24:18,135] {{cli.py:725}} ERROR - No response from gunicorn master within 120 seconds
[2018-04-14 00:24:23,032] {{cli.py:726}} ERROR - Shutting down webserver
I think this is
https://issues.apache.org/jira/browse/AIRFLOW-1235
which shuts down the webserver when the gunicorn workers die. I think...
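Update
OK, this somehow fixed itself. I don't know how, because I tried many things, but installing gunicorn together with greenlet, eventlet, and gevent may have helped. It may also have been something in my entrypoint, perhaps the concurrency of running airflow webserver immediately after airflow initdb. I ran into this with a previous puckel install as well, so I am leaving this question open; I would love to know whether this is a bug others are facing, and what the underlying problem is.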
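If the entrypoint really was starting airflow webserver while airflow initdb was still running (that is only a guess in the update above, not a confirmed cause), one way to rule it out is to run the two commands strictly one after the other. Below is a minimal sketch of that idea; run_in_order is a hypothetical helper name, and the airflow commands in the comment are the ones mentioned in the question.

```python
import subprocess
import sys


def run_in_order(commands):
    """Run each command to completion before starting the next one.

    subprocess.check_call blocks until the command exits and raises
    subprocess.CalledProcessError on a non-zero exit code, so a later
    command never starts if an earlier one failed.
    """
    for command in commands:
        subprocess.check_call(command)


# Hypothetical entrypoint usage (assumes the airflow CLI is on PATH):
# run_in_order([
#     ["airflow", "initdb"],      # finishes before the webserver starts
#     ["airflow", "webserver"],   # blocks for as long as the webserver runs
# ])
```

This does not change anything about gunicorn itself; it only removes the overlap between database initialisation and webserver startup as a possible factor.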
Posted on 2018-04-23 17:10:44
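So when you installed from source, you picked up
https://issues.apache.org/jira/browse/AIRFLOW-1235
, which I believe restarts the master and workers after workers die. I have also seen my workers die from MySQL session/connection errors. For example, an exception from SQLAlchemy about a transaction failing due to concurrent locking and needing a retry, which the Airflow models have no logic to handle, or InvalidRequestError: This session is in 'prepared' state; no further SQL can be emitted within this transaction.
But usually not at startup.
The two times I got the error at startup were: being unable to connect to the database because of a security group in AWS, and, when our 3000+ DAGs took a long time to load into the DagBag, the workers' timeout firing so that they shut themselves down before the setup code had finished. I would love to see whether this setup code can be improved or moved out of the workers.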
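Both startup causes can be checked from inside the container before the webserver is launched. The sketch below is only an illustration under assumptions: can_connect and webserver_env are hypothetical helper names, the host, port, and timeout values in the comments are placeholders, and the environment variable relies on Airflow's AIRFLOW__SECTION__KEY override convention for the web_server_worker_timeout setting in the [webserver] section.

```python
import os
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within timeout seconds.

    A False result from inside the container suggests a network-level block
    (for example a security group) rather than an Airflow problem.
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except socket.error:  # covers refused connections, timeouts, DNS failures
        return False
    sock.close()
    return True


def webserver_env(worker_timeout_seconds, base_env=None):
    """Return a copy of the environment with a longer gunicorn worker timeout.

    Assumption: Airflow reads AIRFLOW__WEBSERVER__WEB_SERVER_WORKER_TIMEOUT
    as an override for web_server_worker_timeout in airflow.cfg.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["AIRFLOW__WEBSERVER__WEB_SERVER_WORKER_TIMEOUT"] = str(worker_timeout_seconds)
    return env


# Hypothetical usage; the hostname and the 300-second value are placeholders:
# if not can_connect("my-postgres-host", 5432):
#     raise SystemExit("database unreachable, check security groups")
# env = webserver_env(300)  # pass as env= when launching `airflow webserver`
```

Raising the timeout only buys time for a slow DagBag load; it does not make the load itself any faster.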
https://stackoverflow.com/questions/49824969