注意:这个问题最初被问到这里,但是赏金时间已经过期了,尽管实际上还没有找到一个可以接受的答案。我现在再问这个问题,包括原质询所提供的所有细节。
python脚本每60秒使用赛德模块运行一组类函数:
# sc is a sched.scheduler instance
sc.enter(60, 1, self.doChecks, (sc, False))
该脚本是使用代码这里作为守护进程运行的。
作为doChecks的一部分调用的许多类方法使用子过程模块调用系统函数,以获取系统统计信息:
ps = subprocess.Popen(['ps', 'aux'], stdout=subprocess.PIPE).communicate()[0]
在整个脚本因以下错误崩溃之前,这段时间运行良好:
File "/home/admin/sd-agent/checks.py", line 436, in getProcesses
File "/usr/lib/python2.4/subprocess.py", line 533, in __init__
File "/usr/lib/python2.4/subprocess.py", line 835, in _get_handles
OSError: [Errno 12] Cannot allocate memory
脚本崩溃后,服务器上的空闲-m的输出是:
$ free -m
total used free shared buffers cached
Mem: 894 345 549 0 0 0
-/+ buffers/cache: 345 549
Swap: 0 0 0
服务器正在运行CentOS 5.3。我无法在自己的CentOS框上复制,也无法与任何其他用户报告相同的问题。
我尝试了一些方法来调试这个问题,正如在最初的问题中所建议的那样:
整个检查可以在GitHub在这里上找到,getProcesses函数从第442行中定义。这是doChecks()从第520行开始调用的。
在崩溃之前,脚本使用以下输出与strace一起运行:
recv(4, "Total Accesses: 516662\nTotal kBy"..., 234, 0) = 234
gettimeofday({1250893252, 887805}, NULL) = 0
write(3, "2009-08-21 17:20:52,887 - checks"..., 91) = 91
gettimeofday({1250893252, 888362}, NULL) = 0
write(3, "2009-08-21 17:20:52,888 - checks"..., 74) = 74
gettimeofday({1250893252, 888897}, NULL) = 0
write(3, "2009-08-21 17:20:52,888 - checks"..., 67) = 67
gettimeofday({1250893252, 889184}, NULL) = 0
write(3, "2009-08-21 17:20:52,889 - checks"..., 81) = 81
close(4) = 0
gettimeofday({1250893252, 889591}, NULL) = 0
write(3, "2009-08-21 17:20:52,889 - checks"..., 63) = 63
pipe([4, 5]) = 0
pipe([6, 7]) = 0
fcntl64(7, F_GETFD) = 0
fcntl64(7, F_SETFD, FD_CLOEXEC) = 0
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb7f12708) = -1 ENOMEM (Cannot allocate memory)
write(2, "Traceback (most recent call last"..., 35) = 35
open("/usr/bin/sd-agent/agent.py", O_RDONLY|O_LARGEFILE) = -1 ENOMEM (Cannot allocate memory)
open("/usr/bin/sd-agent/agent.py", O_RDONLY|O_LARGEFILE) = -1 ENOMEM (Cannot allocate memory)
open("/usr/lib/python24.zip/agent.py", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
open("/usr/lib/python2.4/agent.py", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
open("/usr/lib/python2.4/plat-linux2/agent.py", O_RDONLY|O_LARGEFILE) = -1 ENOMEM (Cannot allocate memory)
open("/usr/lib/python2.4/lib-tk/agent.py", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
open("/usr/lib/python2.4/lib-dynload/agent.py", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
open("/usr/lib/python2.4/site-packages/agent.py", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
write(2, " File \"/usr/bin/sd-agent/agent."..., 52) = 52
open("/home/admin/sd-agent/daemon.py", O_RDONLY|O_LARGEFILE) = -1 ENOMEM (Cannot allocate memory)
open("/usr/bin/sd-agent/daemon.py", O_RDONLY|O_LARGEFILE) = -1 ENOMEM (Cannot allocate memory)
open("/usr/lib/python24.zip/daemon.py", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
open("/usr/lib/python2.4/daemon.py", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
open("/usr/lib/python2.4/plat-linux2/daemon.py", O_RDONLY|O_LARGEFILE) = -1 ENOMEM (Cannot allocate memory)
open("/usr/lib/python2.4/lib-tk/daemon.py", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
open("/usr/lib/python2.4/lib-dynload/daemon.py", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
open("/usr/lib/python2.4/site-packages/daemon.py", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
write(2, " File \"/home/admin/sd-agent/dae"..., 60) = 60
open("/usr/bin/sd-agent/agent.py", O_RDONLY|O_LARGEFILE) = -1 ENOMEM (Cannot allocate memory)
open("/usr/bin/sd-agent/agent.py", O_RDONLY|O_LARGEFILE) = -1 ENOMEM (Cannot allocate memory)
open("/usr/lib/python24.zip/agent.py", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
open("/usr/lib/python2.4/agent.py", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
open("/usr/lib/python2.4/plat-linux2/agent.py", O_RDONLY|O_LARGEFILE) = -1 ENOMEM (Cannot allocate memory)
open("/usr/lib/python2.4/lib-tk/agent.py", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
open("/usr/lib/python2.4/lib-dynload/agent.py", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
open("/usr/lib/python2.4/site-packages/agent.py", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
write(2, " File \"/usr/bin/sd-agent/agent."..., 54) = 54
open("/usr/lib/python2.4/sched.py", O_RDONLY|O_LARGEFILE) = 8
write(2, " File \"/usr/lib/python2.4/sched"..., 55) = 55
fstat64(8, {st_mode=S_IFREG|0644, st_size=4054, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7d28000
read(8, "\"\"\"A generally useful event sche"..., 4096) = 4054
write(2, " ", 4) = 4
write(2, "void = action(*argument)\n", 25) = 25
close(8) = 0
munmap(0xb7d28000, 4096) = 0
open("/usr/bin/sd-agent/checks.py", O_RDONLY|O_LARGEFILE) = -1 ENOMEM (Cannot allocate memory)
open("/usr/bin/sd-agent/checks.py", O_RDONLY|O_LARGEFILE) = -1 ENOMEM (Cannot allocate memory)
open("/usr/lib/python24.zip/checks.py", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
open("/usr/lib/python2.4/checks.py", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
open("/usr/lib/python2.4/plat-linux2/checks.py", O_RDONLY|O_LARGEFILE) = -1 ENOMEM (Cannot allocate memory)
open("/usr/lib/python2.4/lib-tk/checks.py", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
open("/usr/lib/python2.4/lib-dynload/checks.py", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
open("/usr/lib/python2.4/site-packages/checks.py", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
write(2, " File \"/usr/bin/sd-agent/checks"..., 60) = 60
open("/usr/bin/sd-agent/checks.py", O_RDONLY|O_LARGEFILE) = -1 ENOMEM (Cannot allocate memory)
open("/usr/bin/sd-agent/checks.py", O_RDONLY|O_LARGEFILE) = -1 ENOMEM (Cannot allocate memory)
open("/usr/lib/python24.zip/checks.py", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
open("/usr/lib/python2.4/checks.py", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
open("/usr/lib/python2.4/plat-linux2/checks.py", O_RDONLY|O_LARGEFILE) = -1 ENOMEM (Cannot allocate memory)
open("/usr/lib/python2.4/lib-tk/checks.py", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
open("/usr/lib/python2.4/lib-dynload/checks.py", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
open("/usr/lib/python2.4/site-packages/checks.py", O_RDONLY|O_LARGEFILE) = -1 ENOENT (No such file or directory)
write(2, " File \"/usr/bin/sd-agent/checks"..., 64) = 64
open("/usr/lib/python2.4/subprocess.py", O_RDONLY|O_LARGEFILE) = 8
write(2, " File \"/usr/lib/python2.4/subpr"..., 65) = 65
fstat64(8, {st_mode=S_IFREG|0644, st_size=39931, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7d28000
read(8, "# subprocess - Subprocesses with"..., 4096) = 4096
read(8, "lso, the newlines attribute of t"..., 4096) = 4096
read(8, "code < 0:\n print >>sys.st"..., 4096) = 4096
read(8, "alse does not exist on 2.2.0\ntry"..., 4096) = 4096
read(8, " p2cread\n # c2pread <-"..., 4096) = 4096
write(2, " ", 4) = 4
write(2, "errread, errwrite)\n", 19) = 19
close(8) = 0
munmap(0xb7d28000, 4096) = 0
open("/usr/lib/python2.4/subprocess.py", O_RDONLY|O_LARGEFILE) = 8
write(2, " File \"/usr/lib/python2.4/subpr"..., 71) = 71
fstat64(8, {st_mode=S_IFREG|0644, st_size=39931, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7d28000
read(8, "# subprocess - Subprocesses with"..., 4096) = 4096
read(8, "lso, the newlines attribute of t"..., 4096) = 4096
read(8, "code < 0:\n print >>sys.st"..., 4096) = 4096
read(8, "alse does not exist on 2.2.0\ntry"..., 4096) = 4096
read(8, " p2cread\n # c2pread <-"..., 4096) = 4096
read(8, "table(self, handle):\n "..., 4096) = 4096
read(8, "rrno using _sys_errlist (or siml"..., 4096) = 4096
read(8, " p2cwrite = None, None\n "..., 4096) = 4096
write(2, " ", 4) = 4
write(2, "self.pid = os.fork()\n", 21) = 21
close(8) = 0
munmap(0xb7d28000, 4096) = 0
write(2, "OSError", 7) = 7
write(2, ": ", 2) = 2
write(2, "[Errno 12] Cannot allocate memor"..., 33) = 33
write(2, "\n", 1) = 1
unlink("/var/run/sd-agent.pid") = 0
close(3) = 0
munmap(0xb7e0d000, 4096) = 0
rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0x589978}, {0xb89a60, [], SA_RESTORER, 0x589978}, 8) = 0
brk(0xa022000) = 0xa022000
exit_group(1) = ?
发布于 2012-11-11 07:30:08
作为一般规则(例如,在香草内核中),fork
/clone
与ENOMEM
具体发生的失败是由于以下原因造成的:ENOMEM
具体发生: (dup_mm
、dup_task_struct
、alloc_pid
、mpol_dup
、d10等),或者是由于d11导致e 112而E 213
强制执行
E 115E 216clone>
首先检查分叉尝试时未能分叉的进程的vmsize,然后比较与过度提交策略相关的空闲内存量(物理和交换)(插入数字)。
在您的特殊情况下,请注意Virtuozzo在
超额执行
中有
超额执行
。此外,我不确定从到容器中,您真正对
交换和过度提交配置
有多大的控制(以影响执行的结果)。
现在,为了更好地前进,我想说你是,还有两个选择--
切换到更大的实例,或
在中进行一些编码工作,更有效地控制脚本的内存占用
注意到,如果事实证明不是你,而是其他人和你在同一台服务器上配置了一个不同的实例,那么编写代码的努力可能是徒劳的。
就内存而言,我们已经知道subprocess.Popen
使用
fork**/**clone
在引擎盖下面
,这意味着每当您调用它时,再次请求的内存与Python已经占用的内存一样多,即在数百个额外的MB中,所有这些都是为了在exec上使用一个微不足道的10 MB的可执行文件,比如free或ps。如果出现不利的超额承诺政策,您很快就会看到ENOMEM。
没有此父页表等复制问题的fork的替代方案是
vfork
和
posix_spawn
。但是,如果您不想用free**/**ps**/**sleep /posix_spawn重写大块subprocess.Popen,可以考虑只使用suprocess.Popen一次,在脚本开始时(当Python的内存占用最小时),生成一个shell脚本,然后运行和与脚本并行的循环中的其他任何东西;轮询脚本的输出或同步读取它,如果需要异步处理其他内容,则可能从单独的线程中进行--用Python处理数据,但将分叉留给次要进程处理。
不过,在您的特殊情况下,您可以完全跳过调用ps和free;无论您选择自己访问还是通过
现有库和/或包
访问,信息在中都可以直接从
procfs
获得。如果ps和free是您正在运行的唯一实用程序,那么您可以完全取消完全。
最后,无论您做什么,就subprocess.Popen而言,如果您的脚本泄漏内存,您最终还是会碰壁的。盯着它,
检查内存泄漏
。
发布于 2014-10-17 01:58:12
看一下free -m
的输出,在我看来,您实际上没有可用的交换内存。我不确定在Linux中,交换是否总是按需自动提供,但我也遇到了同样的问题,这里的答案对我没有真正的帮助。但是,添加一些交换内存,解决了我的情况下的问题,因为这可能会帮助其他面临相同问题的人,所以我在文章中给出了如何添加1GB交换空间的答案(在Ubuntu12.04上,但是对于其他发行版,它也应该同样有效)。
您可以首先检查是否启用了任何交换内存。
$sudo swapon -s
如果它是空的,就意味着您没有启用任何交换。若要添加1GB交换空间,请执行以下操作:
$sudo dd if=/dev/zero of=/swapfile bs=1024 count=1024k
$sudo mkswap /swapfile
$sudo swapon /swapfile
将下面的行添加到fstab
以使交换永久化。
$sudo vim /etc/fstab
/swapfile none swap sw 0 0
来源和更多信息可以找到这里。
发布于 2018-09-13 10:26:15
https://stackoverflow.com/questions/1367373
复制相似问题