RabbitMQ (beam.smp) and high CPU/memory load issue

Stack Overflow user
Asked on 2014-08-06 14:03:49
5 answers, 106K views, 0 followers, 58 votes

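I have a Debian box that has been running tasks with Celery and RabbitMQ for about a year. Recently I noticed that tasks were not being processed, so I logged into the system and saw that Celery could not connect to RabbitMQ. I restarted rabbitmq-server, and although Celery stopped complaining, it no longer executed new tasks. The odd thing was that RabbitMQ was devouring CPU and memory like crazy. Restarting the server did not solve the problem. After spending a couple of hours looking for a solution online to no avail, I decided to rebuild the server.

I rebuilt a new server with Debian 7.5, RabbitMQ 2.8.4, and Celery 3.1.13 (Cipater). For about an hour everything worked beautifully again, until Celery started complaining once more that it could not connect to RabbitMQ!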

[2014-08-06 05:17:21,036: ERROR/MainProcess] consumer: Cannot connect to amqp://guest:**@127.0.0.1:5672//: [Errno 111] Connection refused.
Trying again in 6.00 seconds...

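I restarted RabbitMQ with service rabbitmq-server start and the same issue occurred again:

RabbitMQ started bloating again, constantly pounding the CPU and slowly taking over all RAM and swap: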

PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+  COMMAND
21823 rabbitmq  20   0  908m 488m 3900 S 731.2 49.4   9:44.74 beam.smp

Here is the output of rabbitmqctl status:

Status of node 'rabbit@li370-61' ...
[{pid,21823},
 {running_applications,[{rabbit,"RabbitMQ","2.8.4"},
                        {os_mon,"CPO  CXC 138 46","2.2.9"},
                        {sasl,"SASL  CXC 138 11","2.2.1"},
                        {mnesia,"MNESIA  CXC 138 12","4.7"},
                        {stdlib,"ERTS  CXC 138 10","1.18.1"},
                        {kernel,"ERTS  CXC 138 10","2.15.1"}]},
 {os,{unix,linux}},
 {erlang_version,"Erlang R15B01 (erts-5.9.1) [source] [64-bit] [smp:8:8] [async-threads:30] [kernel-poll:true]\n"},
 {memory,[{total,489341272},
          {processes,462841967},
          {processes_used,462685207},
          {system,26499305},
          {atom,504409},
          {atom_used,473810},
          {binary,98752},
          {code,11874771},
          {ets,6695040}]},
 {vm_memory_high_watermark,0.3999999992280962},
 {vm_memory_limit,414559436},
 {disk_free_limit,1000000000},
 {disk_free,48346546176},
 {file_descriptors,[{total_limit,924},
                    {total_used,924},
                    {sockets_limit,829},
                    {sockets_used,3}]},
 {processes,[{limit,1048576},{used,1354}]},
 {run_queue,0},

Some entries from /var/log/rabbitmq:

=WARNING REPORT==== 8-Aug-2014::00:11:35 ===
Mnesia('rabbit@li370-61'): ** WARNING ** Mnesia is overloaded: {dump_log,
                                                                write_threshold}

=WARNING REPORT==== 8-Aug-2014::00:11:35 ===
Mnesia('rabbit@li370-61'): ** WARNING ** Mnesia is overloaded: {dump_log,
                                                                write_threshold}

=WARNING REPORT==== 8-Aug-2014::00:11:35 ===
Mnesia('rabbit@li370-61'): ** WARNING ** Mnesia is overloaded: {dump_log,
                                                                write_threshold}

=WARNING REPORT==== 8-Aug-2014::00:11:35 ===
Mnesia('rabbit@li370-61'): ** WARNING ** Mnesia is overloaded: {dump_log,
                                                                write_threshold}

=WARNING REPORT==== 8-Aug-2014::00:11:36 ===
Mnesia('rabbit@li370-61'): ** WARNING ** Mnesia is overloaded: {dump_log,
                                                                write_threshold}

=INFO REPORT==== 8-Aug-2014::00:11:36 ===
vm_memory_high_watermark set. Memory used:422283840 allowed:414559436

=WARNING REPORT==== 8-Aug-2014::00:11:36 ===
memory resource limit alarm set on node 'rabbit@li370-61'.

**********************************************************
*** Publishers will be blocked until this alarm clears ***
**********************************************************

=INFO REPORT==== 8-Aug-2014::00:11:43 ===
started TCP Listener on [::]:5672

=INFO REPORT==== 8-Aug-2014::00:11:44 ===
vm_memory_high_watermark clear. Memory used:290424384 allowed:414559436

=WARNING REPORT==== 8-Aug-2014::00:11:44 ===
memory resource limit alarm cleared on node 'rabbit@li370-61'

=INFO REPORT==== 8-Aug-2014::00:11:59 ===
vm_memory_high_watermark set. Memory used:414584504 allowed:414559436

=WARNING REPORT==== 8-Aug-2014::00:11:59 ===
memory resource limit alarm set on node 'rabbit@li370-61'.

**********************************************************
*** Publishers will be blocked until this alarm clears ***
**********************************************************

=INFO REPORT==== 8-Aug-2014::00:12:00 ===
vm_memory_high_watermark clear. Memory used:411143496 allowed:414559436

=WARNING REPORT==== 8-Aug-2014::00:12:00 ===
memory resource limit alarm cleared on node 'rabbit@li370-61'

=INFO REPORT==== 8-Aug-2014::00:12:01 ===
vm_memory_high_watermark set. Memory used:415563120 allowed:414559436

=WARNING REPORT==== 8-Aug-2014::00:12:01 ===
memory resource limit alarm set on node 'rabbit@li370-61'.

**********************************************************
*** Publishers will be blocked until this alarm clears ***
**********************************************************

=INFO REPORT==== 8-Aug-2014::00:12:07 ===
Server startup complete; 0 plugins started.

=ERROR REPORT==== 8-Aug-2014::00:15:32 ===
** Generic server rabbit_disk_monitor terminating 
** Last message in was update
** When Server state == {state,"/var/lib/rabbitmq/mnesia/rabbit@li370-61",
                               50000000,46946492416,100,10000,
                               #Ref<0.0.1.79456>,false}
** Reason for termination == 
** {unparseable,[]}

=INFO REPORT==== 8-Aug-2014::00:15:37 ===
Disk free limit set to 50MB

=ERROR REPORT==== 8-Aug-2014::00:16:03 ===
** Generic server rabbit_disk_monitor terminating 
** Last message in was update
** When Server state == {state,"/var/lib/rabbitmq/mnesia/rabbit@li370-61",
                               50000000,46946426880,100,10000,
                               #Ref<0.0.1.80930>,false}
** Reason for termination == 
** {unparseable,[]}

=INFO REPORT==== 8-Aug-2014::00:16:05 ===
Disk free limit set to 50MB

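UPDATE: Installing the newest version of RabbitMQ (3.3.4-1) from the repository seems to have solved the problem. Originally I had installed version 2.8.4 from the Debian repositories. So far rabbitmq-server is running smoothly. I will update this post if the problem comes back.

UPDATE: Unfortunately, after about 24 hours the problem reappeared: RabbitMQ shut down, and restarting the process made it consume resources until it shut down again within minutes.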

5 Answers

Stack Overflow user

Posted on 2014-08-06 17:49:51

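I finally found the solution. These posts helped me figure it out: "RabbitMQ on EC2 Consuming Tons of CPU" and https://serverfault.com/questions/337982/how-do-i-restart-rabbitmq-after-switching-machines

What happened was that RabbitMQ was holding on to all the task results, which were never freed, to the point where it became overloaded. I cleared all the stale data in /var/lib/rabbitmq/mnesia/rabbit/, restarted RabbitMQ, and it is working fine now.

My solution was to disable result storage altogether by setting CELERY_IGNORE_RESULT = True in the Celery config file, to make sure this does not happen again.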
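The setting named above could look like this in a Celery 3.1-style config module. This is a minimal sketch, not the answerer's actual file: the file name celeryconfig.py and the CELERY_TASK_RESULT_EXPIRES line are assumptions added for illustration, and broker settings are omitted.

```python
# celeryconfig.py (hypothetical file name), using Celery 3.1-style
# uppercase setting names as in the answer above.

# Do not store task return values, so results no longer pile up in RabbitMQ.
CELERY_IGNORE_RESULT = True

# Assumption, not part of the answer: an expiry in seconds. It only matters
# for tasks that still store results, e.g. ones that opt back in explicitly.
CELERY_TASK_RESULT_EXPIRES = 3600
```

With results ignored, callers can no longer read a task's return value, so this fits fire-and-forget tasks only.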

Votes: 47

Stack Overflow user

Posted on 2015-09-01 14:57:30

You can also reset the queues:

Warning: this clears all data and configuration! Use with caution.

sudo service rabbitmq-server start
sudo rabbitmqctl stop_app
sudo rabbitmqctl reset
sudo rabbitmqctl start_app

If your system is unresponsive, you may need to run these commands right after rebooting.

Votes: 6

Stack Overflow user

Posted on 2014-08-08 10:13:25

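I had a similar problem where memory was exhausted because of Celery: the queues used by the Celery result backend were the issue.

You can check how many queues there are with the rabbitmqctl list_queues command. Be wary if that number keeps growing indefinitely; in that case, review how you are using Celery.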
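To make the "keeps growing" check concrete, two captures of the command output can be compared. The sketch below assumes the tab-separated name/messages output of rabbitmqctl list_queues; the helper names and the sample text are hypothetical, not real output from the system in the question.

```python
def parse_list_queues(output):
    """Parse `rabbitmqctl list_queues` text into {queue_name: message_count}."""
    queues = {}
    for line in output.splitlines():
        parts = line.rsplit("\t", 1)
        # Skip banner lines such as "Listing queues ..." and "...done."
        if len(parts) != 2 or not parts[1].strip().isdigit():
            continue
        queues[parts[0]] = int(parts[1])
    return queues


def queue_growth(before, after):
    """Return (number of new queues, change in total message count)."""
    new_queues = len(set(after) - set(before))
    delta_messages = sum(after.values()) - sum(before.values())
    return new_queues, delta_messages


# Hypothetical captures taken a few minutes apart.
sample_before = "Listing queues ...\ncelery\t0\n...done.\n"
sample_after = "Listing queues ...\ncelery\t2\nabc123\t1\n...done.\n"

growth = queue_growth(parse_list_queues(sample_before),
                      parse_list_queues(sample_after))
```

A queue count that rises on every capture, rather than fluctuating, is the pattern the answer warns about.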

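Regarding Celery: if you never fetch the results, as with asynchronous fire-and-forget events, do not configure a backend to store those unused results.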

Votes: 3

Original page content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/25162484
