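I'm seeing some odd I/O activity on a server, and I can't figure out where it comes from.
For context, I had to replace an NVMe drive (Samsung PM81) in this server. I hadn't noticed any performance problems, but SMART reported that it was time to find a replacement. I did notice some unusual I/O activity on the device, but I assumed it was caused by wear, so I didn't pay much attention to it.
Now, with a brand-new NVMe drive (Samsung 980 Pro) and the operating system installed from scratch (Debian 10), the I/O activity is still there.
Here is the content of /proc/diskstats, sampled twice one minute apart: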
$ cat /proc/diskstats; sleep 1m; cat /proc/diskstats
259 0 nvme0n1 2323590 271 213032732 285413 43708052 69809516 16770577066 269903507 0 901057472 1159862364 0 0 0 0
259 1 nvme0n1p1 2006 0 7264 3665 2 0 2 0 0 44 3080 0 0 0 0
259 2 nvme0n1p2 74879 0 5283682 9424 2001773 386508 28620456 971285 0 455348 825152 0 0 0 0
259 3 nvme0n1p3 2246597 271 207737634 272318 40382341 69423008 16741956608 266611966 0 12038708 266043996 0 0 0 0
259 0 nvme0n1 2323590 271 213032732 285413 43710868 69817259 16771166530 269907653 0 901114568 1159920624 0 0 0 0
259 1 nvme0n1p1 2006 0 7264 3665 2 0 2 0 0 44 3080 0 0 0 0
259 2 nvme0n1p2 74879 0 5283682 9424 2002019 386548 28623272 971330 0 455376 825180 0 0 0 0
259 3 nvme0n1p3 2246597 271 207737634 272318 40384852 69430711 16742543256 266615967 0 12041324 266047732 0 0 0 0
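As you can see, nvme0n1 reports more than 95% I/O utilization ((901114568-901057472)/60000*100), yet I/O utilization on the partitions is close to zero. So where is the I/O being done? On the partition table? Also, the time spent reading (0 ms) plus the time spent writing (4146 ms) does not add up to the time spent doing I/O (57096 ms). What else is there to do besides reading and writing?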
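The arithmetic above can be reproduced with a short script. This is only a sketch: it hard-codes the two snapshots shown earlier and assumes the usual /proc/diskstats layout, where, after the major, minor and name columns, field 4 is the time spent reading, field 8 the time spent writing and field 10 the time spent doing I/O (all in ms). The helper names `parse` and `deltas` are made up for illustration.

```python
INTERVAL_MS = 60_000  # the snapshots were taken `sleep 1m` apart

BEFORE = """\
259 0 nvme0n1 2323590 271 213032732 285413 43708052 69809516 16770577066 269903507 0 901057472 1159862364 0 0 0 0
259 1 nvme0n1p1 2006 0 7264 3665 2 0 2 0 0 44 3080 0 0 0 0
259 2 nvme0n1p2 74879 0 5283682 9424 2001773 386508 28620456 971285 0 455348 825152 0 0 0 0
259 3 nvme0n1p3 2246597 271 207737634 272318 40382341 69423008 16741956608 266611966 0 12038708 266043996 0 0 0 0
"""

AFTER = """\
259 0 nvme0n1 2323590 271 213032732 285413 43710868 69817259 16771166530 269907653 0 901114568 1159920624 0 0 0 0
259 1 nvme0n1p1 2006 0 7264 3665 2 0 2 0 0 44 3080 0 0 0 0
259 2 nvme0n1p2 74879 0 5283682 9424 2002019 386548 28623272 971330 0 455376 825180 0 0 0 0
259 3 nvme0n1p3 2246597 271 207737634 272318 40384852 69430711 16742543256 266615967 0 12041324 266047732 0 0 0 0
"""


def parse(snapshot):
    """Map device name -> (read_ms, write_ms, busy_ms) for one snapshot."""
    stats = {}
    for line in snapshot.splitlines():
        cols = line.split()
        if not cols:
            continue
        # cols[2] = device name, cols[6] = ms spent reading,
        # cols[10] = ms spent writing, cols[12] = ms spent doing I/O
        stats[cols[2]] = (int(cols[6]), int(cols[10]), int(cols[12]))
    return stats


def deltas(before, after):
    """Per-device counter differences between two snapshots."""
    old, new = parse(before), parse(after)
    return {dev: tuple(n - o for n, o in zip(new[dev], old[dev])) for dev in new}


result = deltas(BEFORE, AFTER)
for dev, (read_ms, write_ms, busy_ms) in result.items():
    util = busy_ms / INTERVAL_MS * 100
    print(f"{dev}: read={read_ms} ms  write={write_ms} ms  "
          f"busy={busy_ms} ms  util={util:.1f}%")
```

With these snapshots, nvme0n1 comes out at 57096 ms busy (about 95% of the interval) while nvme0n1p3 shows only 2616 ms, which is the mismatch described above.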
There are no other partitions and no unallocated space on the device:
$ echo p | sudo fdisk /dev/nvme0n1
Welcome to fdisk (util-linux 2.33.1).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.
Command (m for help): Disk /dev/nvme0n1: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors
Disk model: Samsung SSD 980 PRO 2TB
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 0B698EB9-DD2E-4131-9730-4193DD9D5FB5
Device Start End Sectors Size Type
/dev/nvme0n1p1 2048 1953791 1951744 953M EFI System
/dev/nvme0n1p2 1953792 197265407 195311616 93.1G Linux filesystem
/dev/nvme0n1p3 197265408 3907028991 3709763584 1.7T Linux filesystem
Command (m for help):
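SMART also reports an error but, if I understand it correctly, it simply indicates a feature the device does not support rather than a functional problem: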
$ sudo smartctl -a /dev/nvme0n1
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.19.0-21-amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Number: Samsung SSD 980 PRO 2TB
Serial Number: S69ENL0T610188X
Firmware Version: 5B2QGXA7
PCI Vendor/Subsystem ID: 0x144d
IEEE OUI Identifier: 0x002538
Total NVM Capacity: 2,000,398,934,016 [2.00 TB]
Unallocated NVM Capacity: 0
Controller ID: 6
Number of Namespaces: 1
Namespace 1 Size/Capacity: 2,000,398,934,016 [2.00 TB]
Namespace 1 Utilization: 1,736,883,855,360 [1.73 TB]
Namespace 1 Formatted LBA Size: 512
Namespace 1 IEEE EUI-64: 002538 b621a0ae58
Local Time is: Tue Sep 27 10:47:54 2022 CEST
Firmware Updates (0x16): 3 Slots, no Reset required
Optional Admin Commands (0x0017): Security Format Frmw_DL Self_Test
Optional NVM Commands (0x0057): Comp Wr_Unc DS_Mngmt Sav/Sel_Feat Timestmp
Maximum Data Transfer Size: 128 Pages
Warning Comp. Temp. Threshold: 82 Celsius
Critical Comp. Temp. Threshold: 85 Celsius
Supported Power States
St Op Max Active Idle RL RT WL WT Ent_Lat Ex_Lat
0 + 8.49W - - 0 0 0 0 0 0
1 + 4.48W - - 1 1 1 1 0 200
2 + 3.18W - - 2 2 2 2 0 1000
3 - 0.0400W - - 3 3 3 3 2000 1200
4 - 0.0050W - - 4 4 4 4 500 9500
Supported LBA Sizes (NSID 0x1)
Id Fmt Data Metadt Rel_Perf
0 + 512 0 0
=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
SMART/Health Information (NVMe Log 0x02, NSID 0x1)
Critical Warning: 0x00
Temperature: 40 Celsius
Available Spare: 100%
Available Spare Threshold: 10%
Percentage Used: 0%
Data Units Read: 214,200 [109 GB]
Data Units Written: 16,891,230 [8.64 TB]
Host Read Commands: 2,350,427
Host Write Commands: 42,643,472
Controller Busy Time: 238
Power Cycles: 1
Power On Hours: 262
Unsafe Shutdowns: 0
Media and Data Integrity Errors: 0
Error Information Log Entries: 0
Warning Comp. Temperature Time: 0
Critical Comp. Temperature Time: 0
Temperature Sensor 1: 40 Celsius
Temperature Sensor 2: 55 Celsius
Read Error Information Log failed: NVMe Status 0x02
I also checked iotop, but didn't see anything relevant:
$ sudo iotop -aoPb -n 2 -d 60
unable to set locale, falling back to the default locale
Total DISK READ: 0.00 B/s | Total DISK WRITE: 0.00 B/s
Current DISK READ: 0.00 B/s | Current DISK WRITE: 0.00 B/s
PID PRIO USER DISK READ DISK WRITE SWAPIN IO COMMAND
Total DISK READ: 0.00 B/s | Total DISK WRITE: 36.82 K/s
Current DISK READ: 0.00 B/s | Current DISK WRITE: 46.88 K/s
PID PRIO USER DISK READ DISK WRITE SWAPIN IO COMMAND
649 be/3 root 0.00 B 84.00 K 0.00 % 0.07 % [jbd2/nvme0n1p3-]
396 be/3 root 0.00 B 40.00 K 0.00 % 0.05 % [jbd2/nvme0n1p2-]
31590 be/4 root 0.00 B 0.00 B 0.00 % 0.00 % [kworker/u48:0-flush-259:0]
4761 be/4 root 0.00 B 2.02 M 0.00 % 0.00 % minio server /data
733 be/4 root 0.00 B 12.00 K 0.00 % 0.00 % dcgm-exporter
737 be/4 root 0.00 B 8.00 K 0.00 % 0.00 % nscd
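I guess this means the I/O is being done by the kernel itself?
Can anyone help me find out what is causing this I/O activity and how to avoid it? I don't want this NVMe to wear out quickly and need replacing yet again.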
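To put the wear concern into numbers, the write counters already shown can be converted to bytes. The sketch below assumes the NVMe convention that one SMART "data unit" is 1000 blocks of 512 bytes, and that /proc/diskstats counts sectors of 512 bytes. The two values are copied from the smartctl output and the first diskstats snapshot, which were not taken at the same moment, so this is only a rough cross-check.

```python
DATA_UNIT_BYTES = 1000 * 512   # NVMe SMART data unit (assumed per the NVMe convention)
SECTOR_BYTES = 512             # /proc/diskstats sector size

data_units_written = 16_891_230      # "Data Units Written" from smartctl
sectors_written = 16_770_577_066     # sectors written for nvme0n1, first snapshot

smart_bytes = data_units_written * DATA_UNIT_BYTES
diskstats_bytes = sectors_written * SECTOR_BYTES

print(f"SMART:     {smart_bytes / 1e12:.2f} TB written")
print(f"diskstats: {diskstats_bytes / 1e12:.2f} TB written")
```

Both counters land at roughly 8.6 TB, so the amount of data written looks consistent between the kernel and the drive; it is the time-based utilization figure that stands out.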
Posted on 2022-10-10 07:00:17
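Mystery finally solved!
There appears to be a bug in the Linux kernel that makes diskstats report wrong metrics for some storage devices.
I upgraded the kernel to 5.10.0 (the kernel available in buster-backports), and the metrics are now correct.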
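A quick way to check whether a machine is still on the older kernel is to compare its release string against the version that worked here. This is a sketch only: the 5.10.0 threshold is an assumption based solely on this answer (the exact release that fixed the problem is not known), and `parse_release` and `at_least_known_good` are hypothetical helper names.

```python
import platform
import re

# Assumption: 5.10.0 is simply the version reported to work in this answer.
KNOWN_GOOD = (5, 10, 0)


def parse_release(release):
    """Turn '4.19.0-21-amd64' into (4, 19, 0); return None if unparseable."""
    match = re.match(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?", release)
    if match is None:
        return None
    return tuple(int(part or 0) for part in match.groups())


def at_least_known_good(release):
    """True if the release is at or above the version that worked here."""
    version = parse_release(release)
    return version is not None and version >= KNOWN_GOOD


if __name__ == "__main__":
    running = platform.release()
    verdict = "at or above" if at_least_known_good(running) else "below"
    print(f"kernel {running} is {verdict} 5.10.0")
```

The release string from the smartctl banner in the question, 4.19.0-21-amd64, falls below the threshold, which matches the setup where the misleading numbers were seen.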
If anyone is interested, the issue can be reproduced on AWS using t2.micro and t3.micro instances running Debian 10.
https://serverfault.com/questions/1111657