Checking a Tomcat Process's Memory Usage

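Recently our company's production Tomcat kept going down for no apparent reason. Below is a summary of the steps I used to locate the problem, offered for reference only. The error messages were: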

Maximum number of threads (200) created for connector with address null and port 9443
# There is insufficient memory for the Java Runtime Environment to continue.
# Cannot create GC thread. Out of system resources.

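1. Check whether the current user's thread and file-handle counts exceed their limits

(1) Show the resource limits for the current user: ulimit -a. Output: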

core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 256612
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 102400
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 1024
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

(2) Edit the environment file that applies to all Linux users:

vi /etc/profile
ulimit -u 10000
ulimit -n 4096

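After saving, run source /etc/profile to apply the change to the current shell. The new limits only reach processes started from that shell afterwards, so Tomcat has to be restarted to pick them up. Note that the sample output above already shows open files at 102400, so ulimit -n 4096 would lower that limit on such a machine; choose values that suit your own system.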
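To confirm which limits a running JVM actually received, its limits can be read back from the kernel. The following is a minimal sketch, assuming Linux, where /proc/PID/limits exists; the helper name soft_limit is made up for illustration and is not part of any standard tool.

```shell
#!/bin/sh
# soft_limit NAME: read /proc/<PID>/limits-style text on stdin and print
# the soft limit of the row whose name is NAME (e.g. "Max processes").
# The function name is illustrative only.
soft_limit() {
    awk -v name="$1" '
        # Rows look like: "Max processes   1024   1024   processes"
        index($0, name) == 1 {
            rest = substr($0, length(name) + 1)  # drop the row name
            split(rest, f, " ")                  # f[1] = soft, f[2] = hard
            print f[1]
            exit
        }'
}

# Usage (assumes 3195 is the Tomcat PID used in this article):
#   soft_limit "Max processes"  < /proc/3195/limits
#   soft_limit "Max open files" < /proc/3195/limits
```

If the value printed for "Max processes" is still 1024 after the change, the JVM was most likely started before the new limit took effect.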

2. Find the process behind a port and check its GC usage

(1) Show the PID listening on a port: lsof -i:PORT. Example: lsof -i:7074

COMMAND  PID   USER   FD   TYPE   DEVICE SIZE/OFF NODE NAME
java    3195  ligang  34u  IPv4   37416693  0t0    TCP *:7074 (LISTEN)
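When the PID is needed in a script, it can be pulled out of the lsof listing. This is a small sketch rather than a definitive approach; the helper name listen_pid is hypothetical, and it assumes the column layout shown above (PID in the second column, state in the last).

```shell
#!/bin/sh
# listen_pid: read `lsof -i:PORT` output on stdin and print the PID of
# every row in the LISTEN state, without duplicates.
# The function name is illustrative only.
listen_pid() {
    awk 'NR > 1 && $NF == "(LISTEN)" { print $2 }' | sort -u
}

# Usage:
#   lsof -i:7074 | listen_pid
#
# lsof can also do the filtering itself: -t prints bare PIDs and
# -sTCP:LISTEN keeps only listening TCP sockets.
#   lsof -t -i:7074 -sTCP:LISTEN
```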

(2) GC statistics: jstat -gcutil PID. Example: jstat -gcutil 3195

 S0     S1     E      O      P     YGC     YGCT    FGC    FGCT     GCT   
12.63   0.00  52.03  78.63  99.13   4148   24.274   200   40.246   64.520
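In the sample above, P (permanent generation, reported by Java 7 and earlier) is at 99.13% and O (old generation) at 78.63%, which is the kind of reading worth flagging. The sketch below prints every utilisation column at or above a threshold. The helper name gc_hot and the threshold are assumptions for illustration; columns are looked up by header name, so the Java 8+ layout (M and CCS instead of P) should work as well.

```shell
#!/bin/sh
# gc_hot THRESHOLD: read `jstat -gcutil PID` output (one header line and
# one data line) on stdin and print each space-utilisation column whose
# value is at or above THRESHOLD percent.
# The function name is illustrative only.
gc_hot() {
    awk -v limit="$1" '
        NR == 1 { for (i = 1; i <= NF; i++) name[i] = $i; next }
        {
            for (i = 1; i <= NF; i++) {
                # Only these columns are percentages; the rest are
                # GC counts and accumulated times.
                if (name[i] ~ /^(S0|S1|E|O|P|M|CCS)$/ && $i + 0 >= limit + 0)
                    print name[i], $i
            }
        }'
}

# Usage:
#   jstat -gcutil 3195 | gc_hot 90
```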

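(3) Count the threads: ps -mp PID -o THREAD,tid,time | wc -l. Example: ps -mp 3195 -o THREAD,tid,time | wc -l, which printed 43. The count includes the header line and the per-process summary line, so the number of threads is two less.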
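To count only the rows that belong to threads, the ps output can be filtered on its TID column. A minimal sketch, assuming the THREAD,tid,time layout used in this article; the helper name count_threads is made up for illustration.

```shell
#!/bin/sh
# count_threads: read `ps -mp PID -o THREAD,tid,time` output on stdin
# and count the rows whose TID column is a number. The header row and
# the per-process summary row (TID shown as "-") are skipped.
# The function name is illustrative only.
count_threads() {
    awk '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == "TID") col = i; next }
        col && $col ~ /^[0-9]+$/ { n++ }
        END { print n + 0 }'
}

# Usage:
#   ps -mp 3195 -o THREAD,tid,time | count_threads
#
# procps can also report the thread count directly through the nlwp
# (number of lightweight processes) column:
#   ps -o nlwp= -p 3195
```

Comparing this number with the connector's thread cap (200 in the error message above) and with max user processes from ulimit -a shows which limit is closer to being hit.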

3. Check the process's memory usage and trace it back to the code

(1) Memory usage: top -p PID. Example: top -p 3195

top - 15:29:27 up 25 days, 20:05,  2 users,  load average: 0.01, 0.05, 0.01
Tasks:   1 total,   0 running,   1 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0%us,  0.1%sy,  0.0%ni, 99.8%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   8058868k total,  6821684k used,  1237184k free,   181936k buffers
Swap:  2097144k total,   492300k used,  1604844k free,  1897320k cached

PID  USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                           
3195 ligang    20   0 4862m 196m  10m S  0.0  2.5   7:57.48 java
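For logging or alerting, the same resident and virtual sizes can be read non-interactively with ps, which reports them in KiB. The sketch below converts them to MiB; the helper name rss_vsz_mb is hypothetical.

```shell
#!/bin/sh
# rss_vsz_mb: read one line of "RSS VSZ" values in KiB on stdin (the
# shape printed by `ps -o rss= -o vsz= -p PID`) and print them in MiB,
# truncated to whole numbers.
# The function name is illustrative only.
rss_vsz_mb() {
    awk '{ printf "RES=%dMiB VIRT=%dMiB\n", $1 / 1024, $2 / 1024 }'
}

# Usage:
#   ps -o rss= -o vsz= -p 3195 | rss_vsz_mb
```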

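(2) Once the process is identified, how do you narrow the problem down to a specific thread or piece of code? First list its threads, sorted so that the ones using the most CPU come first: ps -mp PID -o THREAD,tid,time | sort -rn | head -10. Example: ps -mp 3195 -o THREAD,tid,time | sort -rn | head -10. Because the first column is the user name, sorting on the %CPU column explicitly with sort -k2 -rn is more robust.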

USER     %CPU PRI SCNT WCHAN  USER SYSTEM   TID     TIME
ligang    0.6   -    - -         -      -     - 00:07:58
ligang    0.2  19    - futex_    -      -  3270 00:02:49
ligang    0.0  19    - inet_c    -      -  3277 00:00:00
ligang    0.0  19    - inet_c    -      -  3273 00:00:00
ligang    0.0  19    - inet_c    -      -  3271 00:00:00
ligang    0.0  19    - inet_c    -      -  3203 00:00:05
ligang    0.0  19    - futex_    -      -  7644 00:00:00
ligang    0.0  19    - futex_    -      -  3420 00:00:00
ligang    0.0  19    - futex_    -      -  3288 00:00:06
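Picking the busiest thread out of this listing can be automated. A minimal sketch, assuming the column layout shown above (%CPU in column 2, TID in column 8); the helper name busiest_tid is made up for illustration.

```shell
#!/bin/sh
# busiest_tid: read `ps -mp PID -o THREAD,tid,time` output on stdin and
# print the TID of the thread with the highest %CPU value.
# Assumes %CPU is column 2 and TID is column 8, as in the sample output.
# The function name is illustrative only.
busiest_tid() {
    awk '
        NR == 1 { next }                       # skip the header row
        # Rows with a numeric TID are threads; the summary row has "-".
        $8 ~ /^[0-9]+$/ && ($2 + 0 > max || tid == "") {
            max = $2 + 0
            tid = $8
        }
        END { if (tid != "") print tid }'
}

# Usage:
#   ps -mp 3195 -o THREAD,tid,time | busiest_tid
```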

(3) Convert the thread ID you are interested in to hexadecimal: printf "%x\n" TID. Example: printf "%x\n" 3270 prints cc6.

(4) Finally, print that thread's stack trace: jstack PID | grep cc6 -A 30. Example: jstack 3195 | grep cc6 -A 30
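The conversion and the lookup can be combined into one step. The sketch below matches on the nid=0x... field that jstack prints in each thread header, which avoids accidental hits on other text containing the same hex digits. It assumes thread entries in the dump are separated by blank lines; the helper names tid_to_nid and thread_stack are hypothetical.

```shell
#!/bin/sh
# tid_to_nid TID: print a decimal thread ID in the nid=0x... form that
# jstack uses in thread header lines.
# The function name is illustrative only.
tid_to_nid() {
    printf 'nid=0x%x\n' "$1"
}

# thread_stack TID: read a jstack dump on stdin and print the entry for
# that thread, from its header line up to the next blank line.
# The function name is illustrative only.
thread_stack() {
    awk -v nid="$(tid_to_nid "$1")" '
        # The trailing space keeps nid=0xcc6 from matching nid=0xcc60.
        index($0, nid " ") { show = 1 }
        show && NF == 0 { exit }
        show { print }'
}

# Usage:
#   jstack 3195 | thread_stack 3270
```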

The output shows which code is causing the problem.
