
Upgrading Ceph straight from v12 to v14

This is a walkthrough of upgrading an RGW environment from Ceph 12.2.12 to 14.2.4, skipping the v13 (Mimic) release in between. Warning: upgrades are dangerous, so proceed with care. There is no undo for an upgrade, and I accept no responsibility for any data loss caused by the upgrade or related operations.

Preparing yum

In the yum repo configuration, replace the old

https://mirrors.aliyun.com/ceph/rpm-luminous/el7/x86_64/

with

https://mirrors.aliyun.com/ceph/rpm-nautilus/el7/x86_64/
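If the repo is defined in /etc/yum.repos.d/ceph.repo (a common location, but adjust the path to wherever your Ceph repo file actually lives), a one-line sed handles the swap:

sed -i 's/rpm-luminous/rpm-nautilus/g' /etc/yum.repos.d/ceph.repo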

Then refresh the yum metadata; after that, a plain install is all it takes to upgrade the binary packages.

yum clean all
yum makecache
yum install ceph ceph-radosgw
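
Before touching any daemons, it is worth confirming on each node that the new packages really landed:

rpm -q ceph ceph-radosgw
ceph -v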

Restarting the services

Once the packages are upgraded, restart the daemons with the commands below, in this order: MON, then MGR, then OSD, and finally RGW.

systemctl restart ceph-mon@*
systemctl restart ceph-mgr@*
systemctl restart ceph-osd@*
systemctl restart ceph-radosgw@*
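
A glob restart like this hits every instance on the node at once. After each round, "ceph versions" (available since Luminous) reports the version each running daemon is actually on, which makes it easy to confirm a node is fully on 14.2.4 before moving to the next one:

ceph versions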

Fixing the post-upgrade warnings

After the upgrade, two new warnings appear: "Legacy BlueStore stats reporting" and "3 monitors have not enabled msgr2". The first shows up because Nautilus changed BlueStore's on-disk statfs accounting (it is now tracked per pool), so OSDs created under older releases still report stats the legacy way. The second shows up because Nautilus introduces the msgr2 wire protocol and expects the monitors to have it enabled.

Fixing the OSD warning

[root@demohost-229 supdev]# ceph -s
  cluster:
    id:     a293ad23-f310-480b-ab2a-5629f2aeef45
    health: HEALTH_WARN
            Legacy BlueStore stats reporting detected on 6 OSD(s)
            3 monitors have not enabled msgr2

  services:
    mon: 3 daemons, quorum demohost-227,demohost-228,demohost-229 (age 4m)
    mgr: demohost-229(active, since 4m), standbys: demohost-227, demohost-228
    osd: 6 osds: 6 up, 6 in
    rgw: 3 daemons active (demohost-227, demohost-228, demohost-229)

  data:
    pools:   7 pools, 184 pgs
    objects: 279.96k objects, 92 GiB
    usage:   295 GiB used, 3.0 TiB / 3.3 TiB avail
    pgs:     184 active+clean

  io:
    client:   55 KiB/s rd, 0 B/s wr, 55 op/s rd, 37 op/s wr

[root@demohost-229 supdev]# ceph -v
ceph version 14.2.4 (75f4de193b3ea58512f204623e6c5a16e6c1e1ba) nautilus (stable)


[root@demohost-227 supdev]# ceph health detail
HEALTH_WARN Legacy BlueStore stats reporting detected on 6 OSD(s); 3 monitors have not enabled msgr2
BLUESTORE_LEGACY_STATFS Legacy BlueStore stats reporting detected on 6 OSD(s)
     osd.0 legacy statfs reporting detected, suggest to run store repair to get consistent statistic reports
     osd.1 legacy statfs reporting detected, suggest to run store repair to get consistent statistic reports
     osd.2 legacy statfs reporting detected, suggest to run store repair to get consistent statistic reports
     osd.3 legacy statfs reporting detected, suggest to run store repair to get consistent statistic reports
     osd.4 legacy statfs reporting detected, suggest to run store repair to get consistent statistic reports
     osd.5 legacy statfs reporting detected, suggest to run store repair to get consistent statistic reports
MON_MSGR2_NOT_ENABLED 3 monitors have not enabled msgr2
    mon.demohost-227 is not bound to a msgr2 port, only v1:172.17.61.227:6789/0
    mon.demohost-228 is not bound to a msgr2 port, only v1:172.17.61.228:6789/0
    mon.demohost-229 is not bound to a msgr2 port, only v1:172.17.61.229:6789/0

Fix the OSD warning first. The procedure is: stop the OSD service, run "ceph-bluestore-tool repair" against its data directory, then start the OSD service again, and repeat for every OSD in turn. Taking osd.1 as an example:

[root@demohost-227 supdev]# systemctl stop ceph-osd@1
[root@demohost-227 supdev]# ls /var/lib/ceph/osd/ceph-1
activate.monmap  block  bluefs  ceph_fsid  fsid  keyring  kv_backend  magic  mkfs_done  osd_key  ready  require_osd_release  type  whoami
[root@demohost-227 supdev]# ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-1
2019-12-02 14:41:06.607 7faf98bfcf80 -1 bluestore(/var/lib/ceph/osd/ceph-1) fsck error: legacy statfs record found, removing
2019-12-02 14:41:06.607 7faf98bfcf80 -1 bluestore(/var/lib/ceph/osd/ceph-1) fsck error: missing Pool StatFS record for pool 8
2019-12-02 14:41:06.607 7faf98bfcf80 -1 bluestore(/var/lib/ceph/osd/ceph-1) fsck error: missing Pool StatFS record for pool a
2019-12-02 14:41:06.607 7faf98bfcf80 -1 bluestore(/var/lib/ceph/osd/ceph-1) fsck error: missing Pool StatFS record for pool c
2019-12-02 14:41:06.607 7faf98bfcf80 -1 bluestore(/var/lib/ceph/osd/ceph-1) fsck error: missing Pool StatFS record for pool d
2019-12-02 14:41:06.607 7faf98bfcf80 -1 bluestore(/var/lib/ceph/osd/ceph-1) fsck error: missing Pool StatFS record for pool ffffffffffffffff
repair success

[root@demohost-227 supdev]# systemctl start ceph-osd@1

[root@demohost-227 supdev]# ceph -s
  cluster:
    id:     a293ad23-f310-480b-ab2a-5629f2aeef45
    health: HEALTH_WARN
            Legacy BlueStore stats reporting detected on 5 OSD(s)
            3 monitors have not enabled msgr2

  services:
    mon: 3 daemons, quorum demohost-227,demohost-228,demohost-229 (age 11m)
    mgr: demohost-229(active, since 11m), standbys: demohost-227, demohost-228
    osd: 6 osds: 6 up, 6 in
    rgw: 3 daemons active (demohost-227, demohost-228, demohost-229)

  data:
    pools:   7 pools, 184 pgs
    objects: 279.96k objects, 92 GiB
    usage:   294 GiB used, 3.0 TiB / 3.3 TiB avail
    pgs:     184 active+clean

  io:
    recovery: 367 B/s, 5 objects/s
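
The remaining OSDs get the same three steps. On a node that hosts several OSDs, a small loop saves some typing (a sketch only; it assumes the OSD data directories live under /var/lib/ceph/osd, and on a busy cluster you should wait for the PGs to return to active+clean between OSDs rather than repairing back to back):

for id in $(ls /var/lib/ceph/osd | sed 's/^ceph-//'); do
    systemctl stop ceph-osd@$id
    ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-$id
    systemctl start ceph-osd@$id
done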

Fixing the msgr2 warning

Next, fix the msgr2 warning. Running the enable command on any one node is enough.

[root@demohost-229 supdev]# ceph -s
  cluster:
    id:     a293ad23-f310-480b-ab2a-5629f2aeef45
    health: HEALTH_WARN
            3 monitors have not enabled msgr2

  services:
    mon: 3 daemons, quorum demohost-227,demohost-228,demohost-229 (age 19m)
    mgr: demohost-229(active, since 19m), standbys: demohost-227, demohost-228
    osd: 6 osds: 6 up, 6 in
    rgw: 3 daemons active (demohost-227, demohost-228, demohost-229)

  data:
    pools:   7 pools, 184 pgs
    objects: 279.96k objects, 92 GiB
    usage:   293 GiB used, 3.0 TiB / 3.3 TiB avail
    pgs:     184 active+clean

  io:
    client:   7.1 KiB/s rd, 7 op/s rd, 0 op/s wr
    recovery: 156 B/s, 2 objects/s
[root@demohost-227 tools]# ceph mon enable-msgr2
[root@demohost-227 tools]# ceph  -s
  cluster:
    id:     a293ad23-f310-480b-ab2a-5629f2aeef45
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum demohost-227,demohost-228,demohost-229 (age 13s)
    mgr: demohost-229(active, since 22m), standbys: demohost-227, demohost-228
    osd: 6 osds: 6 up, 6 in
    rgw: 3 daemons active (demohost-227, demohost-228, demohost-229)

  data:
    pools:   7 pools, 184 pgs
    objects: 279.96k objects, 92 GiB
    usage:   293 GiB used, 3.0 TiB / 3.3 TiB avail
    pgs:     184 active+clean

  io:
    client:   14 KiB/s rd, 0 B/s wr, 13 op/s rd, 10 op/s wr
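
With msgr2 enabled, every monitor should now be listening on both the v2 port (3300) and the legacy v1 port (6789), which can be confirmed from the monmap:

ceph mon dump

If you are following the official Nautilus upgrade notes, this is also the point to disallow pre-Nautilus OSDs from joining the cluster:

ceph osd require-osd-release nautilus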

Summary

The upgrade itself is not complicated, but all sorts of odd problems can crop up along the way. Try to keep upgrades within minor versions; a jump across major releases like this one can trip up even seasoned operators, so be very careful.

