如果您是通过搜索错误信息看到了此文,直接参考以下三点即可:
接下来的内容,是我对整个问题过程的复盘;
收到同事反馈,说后台服务出现异常,定位后发现是应用连接elasticsearch server失败,于是用eshead去连接,还是失败;
[admin@dev ~]$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
fbbcd0d7d57c eshead "/bin/sh -c 'cd ~/..." 10 hours ago Up 10 hours 0.0.0.0:10300->9100/tcp es-head
ef23574c0afe docker.elastic.co/elasticsearch/elasticsearch:6.1.1 "/usr/local/bin/do..." 10 hours ago Up 10 hours elasticsearch
[admin@dev ~]$ docker exec -it ef23574c0afe /bin/bash
rpc error: code = 2 desc = containerd: container not found
至此,觉得问题已经解决了,在群里给大家说了下就回家了;
刚刚坐上回家的车,收到同事消息说问题又出现了,es再次连接不上,状况和之前一样,这就尴尬了… 带着郁闷回到家,在梦中问题再次解决,还是那熟悉的systemctl restart docker命令。。。
以上就是问题的出现和第一轮处理的过程;
第二天再次面对此问题;
[admin@dev ~]$ sudo egrep -i -r 'killed process' /var/log
/var/log/messages:Mar 25 17:08:42 dev kernel: Killed process 12385 (controller) total-vm:72068kB, anon-rss:644kB, file-rss:0kB
/var/log/messages:Mar 25 17:08:42 dev kernel: Killed process 11887 (java) total-vm:44440740kB, anon-rss:10786220kB, file-rss:0kB
/var/log/messages:Mar 25 23:13:05 dev kernel: Killed process 43011 (controller) total-vm:72068kB, anon-rss:640kB, file-rss:0kB
/var/log/messages:Mar 25 23:13:12 dev kernel: Killed process 36316 (java) total-vm:44448156kB, anon-rss:11088004kB, file-rss:0kB
/var/log/secure:Mar 26 08:48:49 dev sudo: admin : TTY=pts/9 ; PWD=/home/admin ; USER=root ; COMMAND=/bin/egrep -i -r killed process /var/log
[ 17.644157] Freeing initrd memory: 19872k freed
[ 17.749669] Non-volatile memory driver v1.3
[ 17.749834] crash memory driver: version 1.1
[ 18.833237] Freeing unused kernel memory: 1620k freed
[ 19.279518] [TTM] Zone kernel: Available graphics memory: 65870004 kiB
[ 19.279520] [TTM] Zone dma32: Available graphics memory: 2097152 kiB
[245831.555024] [<ffffffff8116d616>] out_of_memory+0x4b6/0x4f0
[245831.555827] Out of memory: Kill process 11887 (java) score 99 or sacrifice child
[245831.560069] [<ffffffff8116d616>] out_of_memory+0x4b6/0x4f0
[245831.560791] Out of memory: Kill process 20406 (java) score 99 or sacrifice child
[267649.159340] [<ffffffff8116d616>] out_of_memory+0x4b6/0x4f0
[267649.160478] Out of memory: Kill process 36316 (java) score 97 or sacrifice child
[267651.140770] [<ffffffff8116d616>] out_of_memory+0x4b6/0x4f0
[267651.142104] Out of memory: Kill process 36316 (java) score 97 or sacrifice child
[admin@dev ~]$ top
top - 09:24:40 up 3 days, 12:30, 6 users, load average: 2.02, 1.83, 1.79
Tasks: 814 total, 1 running, 812 sleeping, 0 stopped, 1 zombie
%Cpu(s): 1.0 us, 0.7 sy, 0.0 ni, 98.2 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 13174000+total, 13239316 free, 11658065+used, 1920036 buff/cache
KiB Swap: 4194300 total, 2515328 free, 1678972 used. 14449396 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
19816 admin 20 0 41.774g 0.012t 6396 S 1.0 9.5 159:22.23 java
19389 admin 20 0 40.963g 0.010t 6360 S 1.0 8.6 96:36.80 java
18464 admin 20 0 41.804g 8.216g 6460 S 0.7 6.5 157:01.02 java
18751 admin 20 0 41.829g 7.692g 6364 S 1.3 6.1 98:26.24 java
24233 admin 20 0 37.290g 7.543g 6616 S 0.7 6.0 22:41.94 java
19671 admin 20 0 32.509g 7.314g 6392 S 0.7 5.8 58:14.12 java
9466 admin 20 0 32.105g 7.300g 6628 S 1.0 5.8 18:41.46 java