开始了:我有一个运行在RedHat服务器上的Rails应用程序。我使用Passenger Standalone v3.0.19为其提供服务。数据库为MySQL。最近,我发现一些Passenger Application实例每分钟都在消耗越来越多的内存,而这些实例实际上并没有什么用处。(当我请求一个页面时,Passenger会产生一个新的实例,而不是使用这个实例)我试图手动杀死这些实例(我知道这不是处理问题的正确方法),但几个小时后它又发生了。
这是我通过passenger-memory-stats得到的:
------- Apache processes --------
### Processes: 0
### Total private dirty RSS: 0.00 MB
---------- Nginx processes ----------
PID PPID VMSize Private Name
-------------------------------------
48881 1 27.6 MB 0.1 MB nginx: master process /home/myuser/.passenger/standalone/3.0.19-x86_64-ruby1.9.3-linux-gcc4.4.7-1002/nginx-1.2.6/sbin/nginx -c /tmp/passenger-standalone.48333/config -p /tmp/passenger-standalone.48333/
48882 48881 28.5 MB 0.7 MB nginx: worker process
### Processes: 2
### Total private dirty RSS: 0.78 MB
------- Passenger processes -------
PID VMSize Private Name
-----------------------------------
48841 207.3 MB 0.1 MB PassengerWatchdog
48844 1822.7 MB 1.7 MB PassengerHelperAgent
48846 247.2 MB 14.3 MB Passenger spawn server
48850 143.0 MB 0.5 MB PassengerLoggingAgent
63729 10006.3 MB 9097.2 MB Rack: /home/myuser/webapp/current
76631 999.0 MB 92.3 MB Rack: /home/myuser/webapp/current这是乘客状态输出:
----------- General information -----------
max = 6
count = 1
active = 0
inactive = 1
Waiting on global queue: 0
----------- Application groups -----------
/home/myuser/webapp/current:
App root: /home/myuser/webapp/current
* PID: 76631 Sessions: 0 Processed: 15 Uptime: 51m 7s显然,进程63729不再由Passenger管理。
我还使用了strace,方法是运行:
strace -f -p 63729我得到了:
[pid 63729] read(14, "\1\0\0\1\1\36\0\0\2\3def\0\0\0\10COUNT(*)\0\f?\0\25\0\0"..., 16384) = 63
[pid 63729] brk(0x234f2c000) = 0x234f2c000
[pid 63729] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0
[pid 63729] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0
[pid 63729] poll([{fd=14, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout)
[pid 63729] write(14, "\221\0\0\0\3SELECT COUNT(*) FROM `test_"..., 149) = 149
[pid 63729] ppoll([{fd=14, events=POLLIN}], 1, NULL, NULL, 8) = 1 ([{fd=14, revents=POLLIN}])
[pid 63729] read(14, "\1\0\0\1\1\36\0\0\2\3def\0\0\0\10COUNT(*)\0\f?\0\25\0\0"..., 16384) = 63
[pid 63729] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0
[pid 63729] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0
[pid 63729] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0
[pid 63729] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0
[pid 63729] poll([{fd=14, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout)
[pid 63729] write(14, "\221\0\0\0\3SELECT COUNT(*) FROM `test_"..., 149) = 149
[pid 63729] ppoll([{fd=14, events=POLLIN}], 1, NULL, NULL, 8) = 1 ([{fd=14, revents=POLLIN}])
[pid 63729] read(14, "\1\0\0\1\1\36\0\0\2\3def\0\0\0\10COUNT(*)\0\f?\0\25\0\0"..., 16384) = 63
[pid 63729] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0
[pid 63729] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0
[pid 63729] poll([{fd=14, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout)
[pid 63729] write(14, "\221\0\0\0\3SELECT COUNT(*) FROM `test_"..., 149) = 149同样的模式还在继续。我用代码检查了fd 14
ls -l /proc/63729/fd这就是我得到的:
lr-x------ 1 myuser myuser 64 May 9 05:38 0 -> /dev/null
l-wx------ 1 myuser myuser 64 May 9 05:38 1 -> /home/myuser/webapp/shared/log/passenger.log
lrwx------ 1 myuser myuser 64 May 9 05:38 10 -> socket:[430845495]
lrwx------ 1 myuser myuser 64 May 9 05:38 11 -> socket:[430860531]
lrwx------ 1 myuser myuser 64 May 9 05:38 12 -> socket:[430845847]
lrwx------ 1 myuser myuser 64 May 9 05:38 13 -> socket:[430845850]
lrwx------ 1 myuser myuser 64 May 9 05:38 14 -> socket:[430845864]
l-wx------ 1 myuser myuser 64 May 9 05:38 2 -> /home/myuser/webapp/shared/log/passenger.log
l-wx------ 1 myuser myuser 64 May 9 05:38 3 -> /home/myuser/webapp/shared/log/production.log
lr-x------ 1 myuser myuser 64 May 9 05:38 4 -> pipe:[430845856]
lr-x------ 1 myuser myuser 64 May 9 05:38 5 -> pipe:[430845845]
l-wx------ 1 myuser myuser 64 May 9 05:38 6 -> pipe:[430845845]
lr-x------ 1 myuser myuser 64 May 9 05:38 7 -> pipe:[430845846]
l-wx------ 1 myuser myuser 64 May 9 05:38 8 -> pipe:[430845856]
lrwx------ 1 myuser myuser 64 May 9 05:38 9 -> socket:[430845494]这就是我现在难受的地方。我不知道如何从套接字id获取请求,也不知道如何从strace获取SQL命令,但我猜问题可能是滥用SQL或ActiveRecord。有人能在这方面帮我吗?我不知道是否有任何信息丢失,但我会提供它的要求。任何事情都会受到感谢。谢谢!
发布于 2013-05-10 14:40:31
我已经通过检查mysql日志解决了这个问题。slow_query_log没有任何错误。但general_log显示,一系列查询一直在寻找较旧的记录。这是因为我为active record写了一个愚蠢的代码。我想我现在要结束这个问题了。
https://stackoverflow.com/questions/16458975
复制相似问题