
Error : No space left on device

Author: PedroQin
Published: 2020-03-31 10:25:39
Column: WriteSimpleDemo

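Notes on a server fault and how it was fixed

Background

A colleague recently noticed that restarting services on a production-line server failed with the error below.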

[root@server ~]# service sshd restart
Redirecting to /bin/systemctl restart sshd.service
Error: No space left on device
[root@server ~]#  systemctl restart  dhcpd.service
Error: No space left on device
[root@server ~]# df -h
Filesystem               Size  Used Avail Use% Mounted on
/dev/mapper/centos-root  472G  195G  278G  42% /
devtmpfs                 126G     0  126G   0% /dev
tmpfs                    126G     0  126G   0% /dev/shm
tmpfs                    126G  4.1G  122G   4% /run
tmpfs                    126G     0  126G   0% /sys/fs/cgroup
/dev/sda1               1014M  166M  849M  17% /boot
tmpfs                     26G     0   26G   0% /run/user/0

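Debugging

Taken literally, the error says the device has run out of space. This message usually has one of two causes:

  1. The disk is out of space.
  2. The filesystem is out of inodes.

So, full of confidence, I typed the commands to confirm my guess...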

[root@server bin]# df -h
Filesystem               Size  Used Avail Use% Mounted on
/dev/mapper/centos-root  472G  195G  278G  42% /
devtmpfs                 126G     0  126G   0% /dev
tmpfs                    126G     0  126G   0% /dev/shm
tmpfs                    126G  4.1G  122G   4% /run
tmpfs                    126G     0  126G   0% /sys/fs/cgroup
/dev/sda1               1014M  166M  849M  17% /boot
tmpfs                     26G     0   26G   0% /run/user/0
[root@server bin]# df -i
Filesystem                 Inodes   IUsed     IFree IUse% Mounted on
/dev/mapper/centos-root 247431168 1169533 246261635    1% /
devtmpfs                 33000182     748  32999434    1% /dev
tmpfs                    33004461       1  33004460    1% /dev/shm
tmpfs                    33004461    1514  33002947    1% /run
tmpfs                    33004461      16  33004445    1% /sys/fs/cgroup
/dev/sda1                  524288     340    523948    1% /boot
tmpfs                    33004461       1  33004460    1% /run/user/0
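
The two checks above can be folded into one small helper that flags any filesystem at or above a usage threshold. This is only a sketch and not part of the original session: the function name full_filesystems and the threshold are hypothetical, and it assumes mount points without spaces. It reads `df -P` (blocks) or `df -Pi` (inodes) output on standard input, since both use the same column layout.

```shell
# Hypothetical helper: print "<mount point> <use%>" for every filesystem
# whose usage percentage is at or above the threshold given as $1.
# Input: output of `df -P` or `df -Pi` on stdin (header line is skipped).
full_filesystems() {
    awk -v t="$1" 'NR > 1 {
        pct = $5
        sub(/%/, "", pct)            # "42%" -> "42"
        if (pct + 0 >= t) print $6, $5
    }'
}

# Typical use on a live system:
#   df -P  | full_filesystems 90    # block usage
#   df -Pi | full_filesystems 90    # inode usage
```

On the server in this article both calls would print nothing, which is exactly what makes the error message misleading.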

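Hmm? That is not what I expected: both disk space and inodes are plentiful. The messages log, dmesg, and the SEL showed no disk errors either, so this does not look like a disk problem.

Finding the root cause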

By default, Linux only allocates 8192 watches for inotify, which is ridiculously low. And when it runs out, the error is also No space left on device, which may be confusing if you aren't explicitly looking for this issue.

Run man 7 inotify for details on inotify (an excerpt is in the appendix under "man page for inotify").

[root@server ~]# sysctl fs.inotify
fs.inotify.max_queued_events = 16384
fs.inotify.max_user_instances = 128
fs.inotify.max_user_watches = 8192
[root@server ~]# cat /proc/sys/fs/inotify/max_user_watches
8192

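The query confirms that the upper limit on the number of watches that can be created per real user ID is still the default, 8192.

The number of watches actually in use is counted below. It already exceeds the configured maximum, hence the error.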

[root@server ~]# find /proc/*/fd -user "$USER" -lname anon_inode:inotify \
 -printf '%hinfo/%f\n' 2>/dev/null \
 | xargs cat | grep -c '^inotify'
8557

How the command works:

This will first find all open file descriptors created by inotify_init*(2), and will then look into the corresponding /proc/PID/fdinfo/FD file for the info about the watch descriptors added with inotify_add_watch(2) to each of them (look into the proc(5) manpage under /proc/[pid]/fdinfo/ for a description of the inotify-specific entries).
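
To make the counting step concrete, here is a minimal sketch that applies the same `grep -c '^inotify'` to a single fdinfo text. The helper name count_watches is hypothetical, and the sample content is illustrative: its layout follows the proc(5) description of inotify entries, but the field values are made up.

```shell
# Hypothetical helper: count inotify watches in fdinfo text.
# With a file argument it reads that file; with none it reads stdin.
# Each watch is one line starting with "inotify".
count_watches() {
    grep -c '^inotify' "$@"
}

# Illustrative fdinfo content for one inotify file descriptor.
sample_fdinfo='pos:    0
flags:  02004000
mnt_id: 11
inotify wd:2 ino:7ef82a sdev:800001 mask:800afff ignored_mask:0 fhandle-bytes:8 fhandle-type:1 f_handle:2af87e00220ffd73
inotify wd:1 ino:192627 sdev:800001 mask:800afff ignored_mask:0 fhandle-bytes:8 fhandle-type:1 f_handle:27261900802dfd73'

printf '%s\n' "$sample_fdinfo" | count_watches   # prints 2
```

On a live system the same function can be pointed at a real file, for example count_watches /proc/PID/fdinfo/FD.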

In the same way, the number of watch descriptors behind each /proc/PID/fdinfo/FD can be listed, which leads to the responsible command and file.

[root@server ~]# for i in $(find /proc/*/fd -user "$USER" -lname \
anon_inode:inotify -printf '%hinfo/%f\n' 2>/dev/null); \
do echo -e "$i \t $(grep -c '^inotify' "$i")"; done
/proc/17810/fdinfo/11 	 2
/proc/17825/fdinfo/3 	 3
/proc/17825/fdinfo/8 	 4
/proc/17847/fdinfo/6 	 2
/proc/17873/fdinfo/6 	 1
/proc/17879/fdinfo/3 	 1
/proc/17880/fdinfo/3 	 1
/proc/18341/fdinfo/5 	 3
/proc/18882/fdinfo/7 	 1
/proc/19235/fdinfo/9 	 5
/proc/1/fdinfo/10 	 1
/proc/1/fdinfo/14 	 4
/proc/1/fdinfo/15 	 4
/proc/1/fdinfo/17 	 4
/proc/57300/fdinfo/4 	 8630
/proc/7143/fdinfo/3 	 2
/proc/9380/fdinfo/7 	 11
[root@server ~]# cat /proc/57300/cmdline
python xxx.py

PID 57300 is clearly the culprit, and its command line has been identified as well.
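
When one process holds several inotify descriptors (PID 1 in the listing above has four), it can help to rank processes rather than individual descriptors. The following sketch sums the per-descriptor counts by PID. The function name sum_watches_by_pid is hypothetical, and it assumes input lines in the same "/proc/PID/fdinfo/FD count" form as the listing.

```shell
# Hypothetical helper: sum watch counts per PID from lines of the form
# "/proc/PID/fdinfo/FD <count>" and list the heaviest users first.
sum_watches_by_pid() {
    awk '{ split($1, parts, "/"); total[parts[3]] += $2 }
         END { for (pid in total) print pid, total[pid] }' |
        sort -k2,2nr
}

# A few lines taken from the listing above.
printf '%s\n' \
    '/proc/17825/fdinfo/3 	 3' \
    '/proc/17825/fdinfo/8 	 4' \
    '/proc/1/fdinfo/10 	 1' \
    '/proc/1/fdinfo/14 	 4' \
    '/proc/57300/fdinfo/4 	 8630' |
    sum_watches_by_pid
```

With this sample input, PID 57300 is printed first with a total of 8630.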

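Fix

The script found in the previous step runs a critical task and cannot be stopped for now, so the fix is to raise fs.inotify.max_user_watches.

Edit /etc/sysctl.conf, add the line fs.inotify.max_user_watches = 81920, and run the following command: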

[root@server ~]# sysctl -p
fs.inotify.max_user_watches = 81920
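
As an alternative to editing /etc/sysctl.conf directly, the same setting can live in a drop-in file under /etc/sysctl.d, which `sysctl --system` loads. The sketch below is not from the original session: the function name write_inotify_dropin and the file name 90-inotify-watches.conf are hypothetical, and 81920 is simply the value chosen above.

```shell
# Hypothetical helper: write the setting to a sysctl drop-in file.
# $1 = new limit, $2 = target directory (defaults to /etc/sysctl.d).
write_inotify_dropin() {
    dropin_limit="$1"
    dropin_dir="${2:-/etc/sysctl.d}"
    printf 'fs.inotify.max_user_watches = %s\n' "$dropin_limit" \
        > "$dropin_dir/90-inotify-watches.conf"
}

# As root, write the file and reload all sysctl configuration:
#   write_inotify_dropin 81920 && sysctl --system
# To change only the running kernel (the value is lost on reboot):
#   sysctl -w fs.inotify.max_user_watches=81920
```

Keeping the setting in its own file makes it easier to find and revert later than a line buried in /etc/sysctl.conf.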

Query the inotify settings again:

[root@server ~]# sysctl fs.inotify
fs.inotify.max_queued_events = 16384
fs.inotify.max_user_instances = 128
fs.inotify.max_user_watches = 81920
[root@server ~]# cat /proc/sys/fs/inotify/max_user_watches
81920

Restarting the service through systemctl now succeeds, as shown below; the problem is solved.

[root@server etc]# service sshd restart
Redirecting to /bin/systemctl restart sshd.service

Appendix

References

https://serverfault.com/questions/708001/error-no-space-left-on-device-when-starting-stopping-services-only

https://unix.stackexchange.com/questions/498393/how-to-get-the-number-of-inotify-watches-in-use

man page for inotify
NAME
       inotify - monitoring file system events

DESCRIPTION
       The  inotify  API  provides  a  mechanism  for monitoring file system events.  Inotify can be used to monitor individual files, or to monitor directories.  When a
       directory is monitored, inotify will return events for the directory itself, and for files inside the directory.
...
   /proc interfaces
       The following interfaces can be used to limit the amount of kernel memory consumed by inotify:

       /proc/sys/fs/inotify/max_queued_events
              The  value  in  this  file is used when an application calls inotify_init(2) to set an upper limit on the number of events that can be queued to the
              corresponding inotify instance.  Events in excess of this limit are dropped, but an IN_Q_OVERFLOW event is always generated.

       /proc/sys/fs/inotify/max_user_instances
              This specifies an upper limit on the number of inotify instances that can be created per real user ID.

       /proc/sys/fs/inotify/max_user_watches
              This specifies an upper limit on the number of watches that can be created per real user ID.
...
Originally published on 2020-03-21 on the WriteSimpleDemo WeChat official account.
