
Troubleshooting SSD Failures under Bluestore

Author: 用户1260683 | Published 2019-05-09

In production, one OSD on a Luminous (L) cluster was found to be down, and it was not clear whether the disk had failed. Troubleshooting under FileStore is familiar territory, but after the switch to BlueStore some of the details are different, and the disk in question is an SSD, hence this troubleshooting write-up.

Troubleshooting process

Locate the failed node

[root@demo-host ceph]# ceph osd tree|grep down
  20        1.00000                 osd.20              down        0 1.00000
[root@demo-host ceph]# ceph osd find 20
{
    "osd": 20,
    "ip": "192.168.8.124:6800/1298894",
    "osd_fsid": "a99bc25c-4cf4-5429-9171-4084555af14b",
    "crush_location": {
        "host": "demo-host-ssd",
        "media": "site1-rack1-ssd",
        "mediagroup": "site1-ssd",
        "root": "default"
    }
}
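
Another quick way to see which block device an OSD claims to be using is the OSD metadata kept by the monitors. A small sketch, not run during this session; the exact field names may differ slightly between releases:

# query the OSD's own record of its host and backing device
ceph osd metadata 20 | grep -E 'hostname|devices|bluestore_bdev'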

Log in to 192.168.8.124 found above and run "dmesg -T"; the output shows I/O errors on device dm-0:

[Wed Feb 27 16:24:02 2019] hpsa 0000:03:00.0: Acknowledging event: 0x40000032 (HP SSD Smart Path state change)
[Wed Feb 27 16:24:02 2019] hpsa 0000:03:00.0: hpsa_update_device_info: LV failed, device will be skipped.
[Wed Feb 27 16:24:02 2019] hpsa 0000:03:00.0: scsi 0:1:0:0: updated Direct-Access     HP       LOGICAL VOLUME   RAID-1(+0) SSDSmartPathCap+ En+ Exp=1
[Wed Feb 27 16:24:02 2019] hpsa 0000:03:00.0: scsi 0:1:0:2: updated Direct-Access     HP       LOGICAL VOLUME   RAID-0 SSDSmartPathCap+ En+ Exp=1
[Wed Feb 27 16:24:21 2019] buffer_io_error: 1 callbacks suppressed
[Wed Feb 27 16:24:21 2019] Buffer I/O error on dev dm-0, logical block 0, async page read
[Wed Feb 27 16:24:21 2019] Buffer I/O error on dev dm-0, logical block 0, async page read
[Wed Feb 27 16:24:21 2019] Buffer I/O error on dev dm-0, logical block 0, async page read
[Wed Feb 27 16:24:21 2019] Buffer I/O error on dev dm-0, logical block 0, async page read
[Wed Feb 27 16:24:21 2019] Buffer I/O error on dev dm-0, logical block 0, async page read
[Wed Feb 27 16:24:21 2019] Buffer I/O error on dev dm-0, logical block 0, async page read
[Wed Feb 27 16:24:21 2019] Buffer I/O error on dev dm-0, logical block 0, async page read
[Wed Feb 27 16:24:21 2019] Buffer I/O error on dev dm-0, logical block 468834288, async page read
[Wed Feb 27 16:24:21 2019] Buffer I/O error on dev dm-0, logical block 468834288, async page read
[Wed Feb 27 16:24:22 2019] Buffer I/O error on dev dm-0, logical block 468834288, async page read
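
To translate the dm-0 name from dmesg into something more readable, the device-mapper tools or lsblk can be queried directly. A small sketch, not part of the original session:

# show the device-mapper (VG/LV) name behind dm-0
dmsetup info /dev/dm-0
lsblk -o NAME,SIZE,TYPE /dev/dm-0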

Checking the OSD log shows "ERROR: osd init failed: (5) Input/output error":

[root@demo-host ceph]# tail -100 /var/log/ceph/ceph-osd.20.log
2019-02-27 16:31:34.492858 7fc0f33aed80  1 bdev(0x55d1a3c16000 /var/lib/ceph/osd/ceph-20/block) open size 1920345309184 (0x1bf1d800000, 1.75TiB) block_size 4096 (4KiB) non-rotational
2019-02-27 16:31:34.492906 7fc0f33aed80 -1 bluestore(/var/lib/ceph/osd/ceph-20/block) _read_bdev_label failed to read from /var/lib/ceph/osd/ceph-20/block: (5) Input/output error
2019-02-27 16:31:34.492917 7fc0f33aed80  1 bdev(0x55d1a3c16000 /var/lib/ceph/osd/ceph-20/block) close
2019-02-27 16:31:34.751175 7fc0f33aed80  1 bluestore(/var/lib/ceph/osd/ceph-20) _mount path /var/lib/ceph/osd/ceph-20
2019-02-27 16:31:34.751738 7fc0f33aed80 -1 bluestore(/var/lib/ceph/osd/ceph-20/block) _read_bdev_label failed to read from /var/lib/ceph/osd/ceph-20/block: (5) Input/output error
2019-02-27 16:31:34.751776 7fc0f33aed80  1 bdev create path /var/lib/ceph/osd/ceph-20/block type kernel
2019-02-27 16:31:34.751779 7fc0f33aed80  1 bdev(0x55d1a3c16200 /var/lib/ceph/osd/ceph-20/block) open path /var/lib/ceph/osd/ceph-20/block
2019-02-27 16:31:34.751978 7fc0f33aed80  1 bdev(0x55d1a3c16200 /var/lib/ceph/osd/ceph-20/block) open size 1920345309184 (0x1bf1d800000, 1.75TiB) block_size 4096 (4KiB) non-rotational
2019-02-27 16:31:34.752485 7fc0f33aed80 -1 bluestore(/var/lib/ceph/osd/ceph-20/block) _read_bdev_label failed to read from /var/lib/ceph/osd/ceph-20/block: (5) Input/output error
2019-02-27 16:31:34.752495 7fc0f33aed80  1 bdev(0x55d1a3c16200 /var/lib/ceph/osd/ceph-20/block) close
2019-02-27 16:31:35.009776 7fc0f33aed80 -1 osd.20 0 OSD:init: unable to mount object store
2019-02-27 16:31:35.009796 7fc0f33aed80 -1  ** ERROR: osd init failed: (5) Input/output error
2019-02-27 16:31:55.220715 7ff4da40cd80  0 set uid:gid to 167:167 (ceph:ceph)
2019-02-27 16:31:55.220746 7ff4da40cd80  0 ceph version 12.2.11 (26dc3775efc7bb286a1d6d66faee0ba30ea23eee) luminous (stable), process ceph-osd, pid 1564222
2019-02-27 16:31:55.221547 7ff4da40cd80 -1 bluestore(/var/lib/ceph/osd/ceph-20/block) _read_bdev_label failed to read from /var/lib/ceph/osd/ceph-20/block: (5) Input/output error
2019-02-27 16:31:55.221977 7ff4da40cd80 -1 bluestore(/var/lib/ceph/osd/ceph-20/block) _read_bdev_label failed to read from /var/lib/ceph/osd/ceph-20/block: (5) Input/output error
2019-02-27 16:31:55.222331 7ff4da40cd80 -1 bluestore(/var/lib/ceph/osd/ceph-20/block) _read_bdev_label failed to read from /var/lib/ceph/osd/ceph-20/block: (5) Input/output error
2019-02-27 16:31:55.222747 7ff4da40cd80 -1 bluestore(/var/lib/ceph/osd/ceph-20/block) _read_bdev_label failed to read from /var/lib/ceph/osd/ceph-20/block: (5) Input/output error
2019-02-27 16:31:55.226811 7ff4da40cd80  0 pidfile_write: ignore empty --pid-file
2019-02-27 16:31:55.235463 7ff4da40cd80  0 load: jerasure load: lrc load: isa
2019-02-27 16:31:55.235531 7ff4da40cd80  1 bdev create path /var/lib/ceph/osd/ceph-20/block type kernel
2019-02-27 16:31:55.235538 7ff4da40cd80  1 bdev(0x5608d71b6000 /var/lib/ceph/osd/ceph-20/block) open path /var/lib/ceph/osd/ceph-20/block
2019-02-27 16:31:55.236101 7ff4da40cd80  1 bdev(0x5608d71b6000 /var/lib/ceph/osd/ceph-20/block) open size 1920345309184 (0x1bf1d800000, 1.75TiB) block_size 4096 (4KiB) non-rotational
2019-02-27 16:31:55.236467 7ff4da40cd80 -1 bluestore(/var/lib/ceph/osd/ceph-20/block) _read_bdev_label failed to read from /var/lib/ceph/osd/ceph-20/block: (5) Input/output error
2019-02-27 16:31:55.236478 7ff4da40cd80  1 bdev(0x5608d71b6000 /var/lib/ceph/osd/ceph-20/block) close
2019-02-27 16:31:55.494201 7ff4da40cd80  1 bluestore(/var/lib/ceph/osd/ceph-20) _mount path /var/lib/ceph/osd/ceph-20
2019-02-27 16:31:55.494686 7ff4da40cd80 -1 bluestore(/var/lib/ceph/osd/ceph-20/block) _read_bdev_label failed to read from /var/lib/ceph/osd/ceph-20/block: (5) Input/output error
2019-02-27 16:31:55.494724 7ff4da40cd80  1 bdev create path /var/lib/ceph/osd/ceph-20/block type kernel
2019-02-27 16:31:55.494727 7ff4da40cd80  1 bdev(0x5608d71b6200 /var/lib/ceph/osd/ceph-20/block) open path /var/lib/ceph/osd/ceph-20/block
2019-02-27 16:31:55.494921 7ff4da40cd80  1 bdev(0x5608d71b6200 /var/lib/ceph/osd/ceph-20/block) open size 1920345309184 (0x1bf1d800000, 1.75TiB) block_size 4096 (4KiB) non-rotational
2019-02-27 16:31:55.495323 7ff4da40cd80 -1 bluestore(/var/lib/ceph/osd/ceph-20/block) _read_bdev_label failed to read from /var/lib/ceph/osd/ceph-20/block: (5) Input/output error
2019-02-27 16:31:55.495335 7ff4da40cd80  1 bdev(0x5608d71b6200 /var/lib/ceph/osd/ceph-20/block) close
2019-02-27 16:31:55.758790 7ff4da40cd80 -1 osd.20 0 OSD:init: unable to mount object store
2019-02-27 16:31:55.758804 7ff4da40cd80 -1  ** ERROR: osd init failed: (5) Input/output error
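
The repeated _read_bdev_label failures mean BlueStore cannot even read the label at the start of its block device. This can be double-checked outside the OSD process with ceph-bluestore-tool; on a healthy OSD it prints the label JSON, while here it should fail with the same EIO (a sketch, not run in this session):

# try to read the BlueStore label directly from the OSD's block symlink
ceph-bluestore-tool show-label --dev /var/lib/ceph/osd/ceph-20/block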

Next, confirm whether dm-0 is actually the device behind osd.20. The familiar follow-the-trail steps are shown below; note the warning about a missing PV.

[root@demo-host ceph]# ls -l  /var/lib/ceph/osd/ceph-20/
total 48
-rw-r--r-- 1 ceph ceph 456 Feb 25 19:56 activate.monmap
lrwxrwxrwx 1 ceph ceph  93 Feb 25 19:56 block -> /dev/ceph-72de7913-115e-4df5-868d-7f4cf7ea2b37/osd-block-a99bc25c-4cf4-5429-9171-4084555af14b # note the corresponding LV and VG
-rw-r--r-- 1 ceph ceph   2 Feb 25 19:56 bluefs
-rw-r--r-- 1 ceph ceph  37 Feb 25 19:56 ceph_fsid
-rw-r--r-- 1 ceph ceph  37 Feb 25 19:56 fsid
-rw------- 1 ceph ceph  56 Feb 25 19:56 keyring
-rw-r--r-- 1 ceph ceph   8 Feb 25 19:56 kv_backend
-rw-r--r-- 1 ceph ceph  21 Feb 25 19:56 magic
-rw-r--r-- 1 ceph ceph   4 Feb 25 19:56 mkfs_done
-rw-r--r-- 1 ceph ceph  41 Feb 25 19:56 osd_key
-rw-r--r-- 1 ceph ceph   6 Feb 25 19:56 ready
-rw-r--r-- 1 ceph ceph  10 Feb 25 19:56 type
-rw-r--r-- 1 ceph ceph   3 Feb 25 19:56 whoami

[root@demo-host ceph]# vgs
  WARNING: Device for PV BLh8zq-EFYG-Yoy9-75uU-b7VM-p66e-o0C6Fu not found or rejected by a filter.
  VG                                        #PV #LV #SN Attr   VSize  VFree
  ceph-04507dae-fa43-4a5f-910e-9a7f8fd64fbd   1   1   0 wz--n- <5.46t    0
  ceph-2d626a29-6409-4edd-b3e0-df6dc0259629   1   1   0 wz--n- <5.46t    0
  ceph-72de7913-115e-4df5-868d-7f4cf7ea2b37   1   1   0 wz-pn- <1.75t    0 # note
  ceph-782b8301-ed74-4809-b39c-755bebd86a81   1   1   0 wz--n- <1.75t    0
  ceph-94096bc2-4a6a-4ac5-922a-9129e3b96311   1   1   0 wz--n- <5.46t    0
  ceph-957c14e6-c45e-4794-a6c4-92e55b267fd6   1   1   0 wz--n- <5.46t    0
  ceph-9a7d0101-f451-4e86-b1c0-3a4e4fced4bd   1   1   0 wz--n- <5.46t    0
  ceph-a150e02b-ec98-48ac-92b9-4f754aea19c2   1   1   0 wz--n- <5.46t    0
  ceph-d3c92af2-9aee-4141-a693-9d21c329bec6   1   1   0 wz--n- <5.46t    0
  ceph-dfe4f8f2-880f-414d-af58-5b3c77ed2628   1   1   0 wz--n- <5.46t    0
[root@demo-host ceph]# lvs
  WARNING: Device for PV BLh8zq-EFYG-Yoy9-75uU-b7VM-p66e-o0C6Fu not found or rejected by a filter.
  LV                                             VG                                        Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  osd-block-737138bb-53f8-5f20-b131-d776fec5e62e ceph-04507dae-fa43-4a5f-910e-9a7f8fd64fbd -wi-ao---- <5.46t
  osd-block-31724c12-5cab-54ba-a0ea-f7bd0c5bdb39 ceph-2d626a29-6409-4edd-b3e0-df6dc0259629 -wi-ao---- <5.46t
  osd-block-a99bc25c-4cf4-5429-9171-4084555af14b ceph-72de7913-115e-4df5-868d-7f4cf7ea2b37 -wi-a---p- <1.75t # note
  osd-block-8505d8f5-4ea3-59d0-870e-59d360f5015c ceph-782b8301-ed74-4809-b39c-755bebd86a81 -wi-ao---- <1.75t
  osd-block-e9a70833-590b-5993-9638-179baaa782a5 ceph-94096bc2-4a6a-4ac5-922a-9129e3b96311 -wi-ao---- <5.46t
  osd-block-31541688-fb32-5337-af90-09d185613075 ceph-957c14e6-c45e-4794-a6c4-92e55b267fd6 -wi-ao---- <5.46t
  osd-block-df6cd15a-1b5c-5443-a062-50fa64fa9d07 ceph-9a7d0101-f451-4e86-b1c0-3a4e4fced4bd -wi-ao---- <5.46t
  osd-block-b28a126d-0a7b-503d-80c5-7cbaa04d0a9b ceph-a150e02b-ec98-48ac-92b9-4f754aea19c2 -wi-ao---- <5.46t
  osd-block-377ff375-d2bf-5ad9-94b4-2127b6dcf9e7 ceph-d3c92af2-9aee-4141-a693-9d21c329bec6 -wi-ao---- <5.46t
  osd-block-4f147edf-9cb7-5263-bec0-3fa34dc0373f ceph-dfe4f8f2-880f-414d-af58-5b3c77ed2628 -wi-ao---- <5.46t

[root@demo-host ceph]# ls -l  /dev/ceph-72de7913-115e-4df5-868d-7f4cf7ea2b37/osd-block-a99bc25c-4cf4-5429-9171-4084555af14b
lrwxrwxrwx 1 ceph ceph 7 Feb 27 16:32 /dev/ceph-72de7913-115e-4df5-868d-7f4cf7ea2b37/osd-block-a99bc25c-4cf4-5429-9171-4084555af14b -> ../dm-0 # confirmed: this LV is dm-0
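
ceph-volume records the same OSD-to-LV mapping in its own metadata, so the chain can also be confirmed in a single step. A hedged alternative to the manual symlink walk above, output omitted:

# list the LV, VG, and osd_fsid that ceph-volume recorded for each OSD on this host
ceph-volume lvm list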

Next, check the RAID controller to determine the state of the physical disk.

[root@demo-host ceph]# hpssacli ctrl slot=0  show  config detail


  Array: B
      Interface Type: Solid State SATA
      Unused Space: 0  MB (0.0%)
      Used Space: 1.7 TB (100.0%)
      Status: Failed Physical Drive # the drive is gone; the warning below confirms it
      MultiDomain Status: OK
      Array Type: Data       HPE SSD Smart Path: enable

      Warning: One of the drives on this array have failed or has been removed.




      Logical Drive: 2
         Size: 1.7 TB
         Fault Tolerance: 0
         Heads: 255
         Sectors Per Track: 32
         Cylinders: 65535
         Strip Size: 256 KB
         Full Stripe Size: 256 KB
         Status: Failed # dead
         MultiDomain Status: OK
         Caching:  Disabled
         Unique Identifier: 600508B1001C3FBF225890CDE3612E98
         Logical Drive Label: 0606978APVYKH0BRH9507N6082
         Drive Type: Data
         LD Acceleration Method: HPE SSD Smart Path

      physicaldrive 2I:4:1 # note the slot address for later use
         Port: 2I
         Box: 4
         Bay: 1
         Status: Failed
         Last Failure Reason: Hot removed
         Drive Type: Data Drive
         Interface Type: Solid State SATA
         Size: 1920.3 GB
         Drive exposed to OS: False
         Native Block Size: 4096
         Firmware Revision: 4IYVHPG1
         Serial Number: BTYS802201ZJ1P9DGN
         Model: ATA     VK001920GWJPH
         SATA NCQ Capable: True
         SATA NCQ Enabled: True
         Maximum Temperature (C): 41
         Usage remaining: 99.80%
         Power On Hours: 4868 # a short-lived SSD indeed
         Estimated Life Remaining based on workload to date: 101213 days
         SSD Smart Trip Wearout: False
         PHY Count: 1
         PHY Transfer Rate: Unknown
         Drive Authentication Status: Not Applicable
         Sanitize Erase Supported: False
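
When only a quick health check is needed, the per-drive status view is much less verbose than the full config detail. A sketch using the same controller slot as above, output omitted:

# one-line status per physical drive on controller slot 0
hpssacli ctrl slot=0 pd all show status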

At this point it is essentially confirmed that the SSD has failed. Next, clean up the residual LVM information on the system, starting by unmounting the corresponding directory.

[root@demo-host ceph]# mount -l|grep ceph
tmpfs on /var/lib/ceph/osd/ceph-20 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-21 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-22 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-23 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-24 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-25 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-26 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-27 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-28 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-29 type tmpfs (rw,relatime)
[root@demo-host ceph]# umount  /var/lib/ceph/osd/ceph-20
[root@demo-host ceph]# mount -l|grep ceph
tmpfs on /var/lib/ceph/osd/ceph-21 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-22 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-23 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-24 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-25 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-26 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-27 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-28 type tmpfs (rw,relatime)
tmpfs on /var/lib/ceph/osd/ceph-29 type tmpfs (rw,relatime)
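
Alongside the unmount, it is also worth making sure the crashed OSD daemon is stopped so systemd does not keep restarting it against the dead device. A precautionary sketch, assuming systemd-managed OSDs:

# stop the failed OSD service before pulling the disk
systemctl stop ceph-osd@20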

Check the VG and LV information; note the PV warning indicating a missing disk.

[root@demo-host ceph]# vgs
  WARNING: Device for PV BLh8zq-EFYG-Yoy9-75uU-b7VM-p66e-o0C6Fu not found or rejected by a filter.
  VG                                        #PV #LV #SN Attr   VSize  VFree
  ceph-04507dae-fa43-4a5f-910e-9a7f8fd64fbd   1   1   0 wz--n- <5.46t    0
  ceph-2d626a29-6409-4edd-b3e0-df6dc0259629   1   1   0 wz--n- <5.46t    0
  ceph-72de7913-115e-4df5-868d-7f4cf7ea2b37   1   1   0 wz-pn- <1.75t    0
  ceph-782b8301-ed74-4809-b39c-755bebd86a81   1   1   0 wz--n- <1.75t    0
  ceph-94096bc2-4a6a-4ac5-922a-9129e3b96311   1   1   0 wz--n- <5.46t    0
  ceph-957c14e6-c45e-4794-a6c4-92e55b267fd6   1   1   0 wz--n- <5.46t    0
  ceph-9a7d0101-f451-4e86-b1c0-3a4e4fced4bd   1   1   0 wz--n- <5.46t    0
  ceph-a150e02b-ec98-48ac-92b9-4f754aea19c2   1   1   0 wz--n- <5.46t    0
  ceph-d3c92af2-9aee-4141-a693-9d21c329bec6   1   1   0 wz--n- <5.46t    0
  ceph-dfe4f8f2-880f-414d-af58-5b3c77ed2628   1   1   0 wz--n- <5.46t    0
[root@demo-host ceph]# lvs
  WARNING: Device for PV BLh8zq-EFYG-Yoy9-75uU-b7VM-p66e-o0C6Fu not found or rejected by a filter.
  LV                                             VG                                        Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  osd-block-737138bb-53f8-5f20-b131-d776fec5e62e ceph-04507dae-fa43-4a5f-910e-9a7f8fd64fbd -wi-ao---- <5.46t
  osd-block-31724c12-5cab-54ba-a0ea-f7bd0c5bdb39 ceph-2d626a29-6409-4edd-b3e0-df6dc0259629 -wi-ao---- <5.46t
  osd-block-a99bc25c-4cf4-5429-9171-4084555af14b ceph-72de7913-115e-4df5-868d-7f4cf7ea2b37 -wi-a---p- <1.75t
  osd-block-8505d8f5-4ea3-59d0-870e-59d360f5015c ceph-782b8301-ed74-4809-b39c-755bebd86a81 -wi-ao---- <1.75t
  osd-block-e9a70833-590b-5993-9638-179baaa782a5 ceph-94096bc2-4a6a-4ac5-922a-9129e3b96311 -wi-ao---- <5.46t
  osd-block-31541688-fb32-5337-af90-09d185613075 ceph-957c14e6-c45e-4794-a6c4-92e55b267fd6 -wi-ao---- <5.46t
  osd-block-df6cd15a-1b5c-5443-a062-50fa64fa9d07 ceph-9a7d0101-f451-4e86-b1c0-3a4e4fced4bd -wi-ao---- <5.46t
  osd-block-b28a126d-0a7b-503d-80c5-7cbaa04d0a9b ceph-a150e02b-ec98-48ac-92b9-4f754aea19c2 -wi-ao---- <5.46t
  osd-block-377ff375-d2bf-5ad9-94b4-2127b6dcf9e7 ceph-d3c92af2-9aee-4141-a693-9d21c329bec6 -wi-ao---- <5.46t
  osd-block-4f147edf-9cb7-5263-bec0-3fa34dc0373f ceph-dfe4f8f2-880f-414d-af58-5b3c77ed2628 -wi-ao---- <5.46t

Following the three-level LVM structure LV -> VG -> PV, try to remove the LV and VG first; residual entries remain.

[root@demo-host ceph]# vgremove ceph-72de7913-115e-4df5-868d-7f4cf7ea2b37
  WARNING: Device for PV BLh8zq-EFYG-Yoy9-75uU-b7VM-p66e-o0C6Fu not found or rejected by a filter.
  WARNING: 1 physical volumes are currently missing from the system.
Do you really want to remove volume group "ceph-72de7913-115e-4df5-868d-7f4cf7ea2b37" containing 1 logical volumes? [y/n]: y
Do you really want to remove active logical volume ceph-72de7913-115e-4df5-868d-7f4cf7ea2b37/osd-block-a99bc25c-4cf4-5429-9171-4084555af14b? [y/n]: y
  Aborting vg_write: No metadata areas to write to!
[root@demo-host ceph]# lvs
  WARNING: Device for PV BLh8zq-EFYG-Yoy9-75uU-b7VM-p66e-o0C6Fu not found or rejected by a filter.
  LV                                             VG                                        Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  osd-block-737138bb-53f8-5f20-b131-d776fec5e62e ceph-04507dae-fa43-4a5f-910e-9a7f8fd64fbd -wi-ao---- <5.46t
  osd-block-31724c12-5cab-54ba-a0ea-f7bd0c5bdb39 ceph-2d626a29-6409-4edd-b3e0-df6dc0259629 -wi-ao---- <5.46t
  osd-block-a99bc25c-4cf4-5429-9171-4084555af14b ceph-72de7913-115e-4df5-868d-7f4cf7ea2b37 -wi-----p- <1.75t
  osd-block-8505d8f5-4ea3-59d0-870e-59d360f5015c ceph-782b8301-ed74-4809-b39c-755bebd86a81 -wi-ao---- <1.75t
  osd-block-e9a70833-590b-5993-9638-179baaa782a5 ceph-94096bc2-4a6a-4ac5-922a-9129e3b96311 -wi-ao---- <5.46t
  osd-block-31541688-fb32-5337-af90-09d185613075 ceph-957c14e6-c45e-4794-a6c4-92e55b267fd6 -wi-ao---- <5.46t
  osd-block-df6cd15a-1b5c-5443-a062-50fa64fa9d07 ceph-9a7d0101-f451-4e86-b1c0-3a4e4fced4bd -wi-ao---- <5.46t
  osd-block-b28a126d-0a7b-503d-80c5-7cbaa04d0a9b ceph-a150e02b-ec98-48ac-92b9-4f754aea19c2 -wi-ao---- <5.46t
  osd-block-377ff375-d2bf-5ad9-94b4-2127b6dcf9e7 ceph-d3c92af2-9aee-4141-a693-9d21c329bec6 -wi-ao---- <5.46t
  osd-block-4f147edf-9cb7-5263-bec0-3fa34dc0373f ceph-dfe4f8f2-880f-414d-af58-5b3c77ed2628 -wi-ao---- <5.46t
[root@demo-host ceph]# vgs
  WARNING: Device for PV BLh8zq-EFYG-Yoy9-75uU-b7VM-p66e-o0C6Fu not found or rejected by a filter.
  VG                                        #PV #LV #SN Attr   VSize  VFree
  ceph-04507dae-fa43-4a5f-910e-9a7f8fd64fbd   1   1   0 wz--n- <5.46t    0
  ceph-2d626a29-6409-4edd-b3e0-df6dc0259629   1   1   0 wz--n- <5.46t    0
  ceph-72de7913-115e-4df5-868d-7f4cf7ea2b37   1   1   0 wz-pn- <1.75t    0
  ceph-782b8301-ed74-4809-b39c-755bebd86a81   1   1   0 wz--n- <1.75t    0
  ceph-94096bc2-4a6a-4ac5-922a-9129e3b96311   1   1   0 wz--n- <5.46t    0
  ceph-957c14e6-c45e-4794-a6c4-92e55b267fd6   1   1   0 wz--n- <5.46t    0
  ceph-9a7d0101-f451-4e86-b1c0-3a4e4fced4bd   1   1   0 wz--n- <5.46t    0
  ceph-a150e02b-ec98-48ac-92b9-4f754aea19c2   1   1   0 wz--n- <5.46t    0
  ceph-d3c92af2-9aee-4141-a693-9d21c329bec6   1   1   0 wz--n- <5.46t    0
  ceph-dfe4f8f2-880f-414d-af58-5b3c77ed2628   1   1   0 wz--n- <5.46t    0

Check the PV state: there is an unknown device, and PV "BLh8zq-EFYG-Yoy9-75uU-b7VM-p66e-o0C6Fu" is reported as missing.

[root@demo-host ceph]# pvs
  WARNING: Device for PV BLh8zq-EFYG-Yoy9-75uU-b7VM-p66e-o0C6Fu not found or rejected by a filter.
  WARNING: Device for PV BLh8zq-EFYG-Yoy9-75uU-b7VM-p66e-o0C6Fu not found or rejected by a filter.
  PV         VG                                        Fmt  Attr PSize  PFree
  /dev/sdc   ceph-782b8301-ed74-4809-b39c-755bebd86a81 lvm2 a--  <1.75t    0
  /dev/sdd   ceph-9a7d0101-f451-4e86-b1c0-3a4e4fced4bd lvm2 a--  <5.46t    0
  /dev/sde   ceph-04507dae-fa43-4a5f-910e-9a7f8fd64fbd lvm2 a--  <5.46t    0
  /dev/sdf   ceph-d3c92af2-9aee-4141-a693-9d21c329bec6 lvm2 a--  <5.46t    0
  /dev/sdg   ceph-a150e02b-ec98-48ac-92b9-4f754aea19c2 lvm2 a--  <5.46t    0
  /dev/sdh   ceph-2d626a29-6409-4edd-b3e0-df6dc0259629 lvm2 a--  <5.46t    0
  /dev/sdi   ceph-957c14e6-c45e-4794-a6c4-92e55b267fd6 lvm2 a--  <5.46t    0
  /dev/sdj   ceph-dfe4f8f2-880f-414d-af58-5b3c77ed2628 lvm2 a--  <5.46t    0
  /dev/sdk   ceph-94096bc2-4a6a-4ac5-922a-9129e3b96311 lvm2 a--  <5.46t    0
  [unknown]  ceph-72de7913-115e-4df5-868d-7f4cf7ea2b37 lvm2 a-m  <1.75t    0

Removing the PV by hand does not work here; instead, run pvscan --cache to refresh the LVM cache. Afterwards the stale PV, VG, and LV entries are all gone.

[root@demo-host ceph]# pvscan --cache
[root@demo-host ceph]# pvs
  PV         VG                                        Fmt  Attr PSize  PFree
  /dev/sdc   ceph-782b8301-ed74-4809-b39c-755bebd86a81 lvm2 a--  <1.75t    0
  /dev/sdd   ceph-9a7d0101-f451-4e86-b1c0-3a4e4fced4bd lvm2 a--  <5.46t    0
  /dev/sde   ceph-04507dae-fa43-4a5f-910e-9a7f8fd64fbd lvm2 a--  <5.46t    0
  /dev/sdf   ceph-d3c92af2-9aee-4141-a693-9d21c329bec6 lvm2 a--  <5.46t    0
  /dev/sdg   ceph-a150e02b-ec98-48ac-92b9-4f754aea19c2 lvm2 a--  <5.46t    0
  /dev/sdh   ceph-2d626a29-6409-4edd-b3e0-df6dc0259629 lvm2 a--  <5.46t    0
  /dev/sdi   ceph-957c14e6-c45e-4794-a6c4-92e55b267fd6 lvm2 a--  <5.46t    0
  /dev/sdj   ceph-dfe4f8f2-880f-414d-af58-5b3c77ed2628 lvm2 a--  <5.46t    0
  /dev/sdk   ceph-94096bc2-4a6a-4ac5-922a-9129e3b96311 lvm2 a--  <5.46t    0
[root@demo-host ceph]# vgs
  VG                                        #PV #LV #SN Attr   VSize  VFree
  ceph-04507dae-fa43-4a5f-910e-9a7f8fd64fbd   1   1   0 wz--n- <5.46t    0
  ceph-2d626a29-6409-4edd-b3e0-df6dc0259629   1   1   0 wz--n- <5.46t    0
  ceph-782b8301-ed74-4809-b39c-755bebd86a81   1   1   0 wz--n- <1.75t    0
  ceph-94096bc2-4a6a-4ac5-922a-9129e3b96311   1   1   0 wz--n- <5.46t    0
  ceph-957c14e6-c45e-4794-a6c4-92e55b267fd6   1   1   0 wz--n- <5.46t    0
  ceph-9a7d0101-f451-4e86-b1c0-3a4e4fced4bd   1   1   0 wz--n- <5.46t    0
  ceph-a150e02b-ec98-48ac-92b9-4f754aea19c2   1   1   0 wz--n- <5.46t    0
  ceph-d3c92af2-9aee-4141-a693-9d21c329bec6   1   1   0 wz--n- <5.46t    0
  ceph-dfe4f8f2-880f-414d-af58-5b3c77ed2628   1   1   0 wz--n- <5.46t    0
[root@demo-host ceph]# lvs
  LV                                             VG                                        Attr       LSize  Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  osd-block-737138bb-53f8-5f20-b131-d776fec5e62e ceph-04507dae-fa43-4a5f-910e-9a7f8fd64fbd -wi-ao---- <5.46t
  osd-block-31724c12-5cab-54ba-a0ea-f7bd0c5bdb39 ceph-2d626a29-6409-4edd-b3e0-df6dc0259629 -wi-ao---- <5.46t
  osd-block-8505d8f5-4ea3-59d0-870e-59d360f5015c ceph-782b8301-ed74-4809-b39c-755bebd86a81 -wi-ao---- <1.75t
  osd-block-e9a70833-590b-5993-9638-179baaa782a5 ceph-94096bc2-4a6a-4ac5-922a-9129e3b96311 -wi-ao---- <5.46t
  osd-block-31541688-fb32-5337-af90-09d185613075 ceph-957c14e6-c45e-4794-a6c4-92e55b267fd6 -wi-ao---- <5.46t
  osd-block-df6cd15a-1b5c-5443-a062-50fa64fa9d07 ceph-9a7d0101-f451-4e86-b1c0-3a4e4fced4bd -wi-ao---- <5.46t
  osd-block-b28a126d-0a7b-503d-80c5-7cbaa04d0a9b ceph-a150e02b-ec98-48ac-92b9-4f754aea19c2 -wi-ao---- <5.46t
  osd-block-377ff375-d2bf-5ad9-94b4-2127b6dcf9e7 ceph-d3c92af2-9aee-4141-a693-9d21c329bec6 -wi-ao---- <5.46t
  osd-block-4f147edf-9cb7-5263-bec0-3fa34dc0373f ceph-dfe4f8f2-880f-414d-af58-5b3c77ed2628 -wi-ao---- <5.46t
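
Had pvscan --cache not cleaned things up, LVM's standard way of dropping a missing PV from a volume group's metadata is vgreduce --removemissing. A hedged alternative, not used in this session:

# strip references to missing PVs from the orphaned volume group
vgreduce --removemissing ceph-72de7913-115e-4df5-868d-7f4cf7ea2b37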

Finally, to be on the safe side, manually light the drive's fault LED and ask the data center to replace the disk.

[root@demo-host ceph]# hpssacli ctrl slot=0 pd 2I:4:1 modify led=on
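
Once the disk has been swapped, the same slot address can be used to turn the LED back off. A sketch mirroring the command above:

# switch the locator LED off after the replacement is in place
hpssacli ctrl slot=0 pd 2I:4:1 modify led=off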

Summary

BlueStore relies on LVM, so before pulling a physical disk, be sure to follow LVM conventions and clean up the residual metadata; otherwise the replacement disk will cause management confusion later on.
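
After the hardware is replaced, the usual rebuild flow is to purge the dead OSD id from the cluster, wipe the new disk, and recreate the OSD with ceph-volume. A minimal sketch only; /dev/sdX is a placeholder and the exact flags can vary slightly across Luminous point releases:

# remove osd.20 from crush, auth, and the osd map in one step (Luminous and later)
ceph osd purge 20 --yes-i-really-mean-it
# wipe any leftover signatures on the replacement disk
ceph-volume lvm zap /dev/sdX
# recreate the OSD as a BlueStore LVM volume on the new disk
ceph-volume lvm create --bluestore --data /dev/sdX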
