OpenStack HA Cluster 3: Pacemaker

Hostnames must resolve between all nodes:
[root@controller1 ~]# cat /etc/hosts
192.168.17.149  controller1
192.168.17.141  controller2
192.168.17.166  controller3
192.168.17.111  demo.open-stack.cn
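As a quick sanity check (a minimal sketch, not part of the original procedure), confirm that every hostname resolves on each node through the same resolver path the services will use:
# run on every node
for h in controller1 controller2 controller3 demo.open-stack.cn; do getent hosts $h; done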
The nodes must trust each other and allow passwordless SSH logins:
[root@controller1 ~]# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
20:79:d4:a4:9f:8b:75:cf:12:58:f4:47:a4:c1:29:f3 root@controller1
The key's randomart image is:
+--[ RSA 2048]----+
|      .o. ...oo  |
|     o ...o.o+   |
|    o +   .+o .  |
|     o o +  E.   |
|        S o      |
|       o o +     |
|      . . . o    |
|           .     |
|                 |
+-----------------+
[root@controller1 ~]# ssh-copy-id controller2
[root@controller1 ~]# ssh-copy-id controller3
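Before moving on, it is worth confirming that the passwordless trust actually works; a quick check (hypothetical, BatchMode makes ssh fail instead of prompting for a password):
[root@controller1 ~]# for h in controller2 controller3; do ssh -o BatchMode=yes $h hostname; done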
Configure the YUM repository
# vim /etc/yum.repos.d/ha-clustering.repo
[network_ha-clustering_Stable]
name=Stable High Availability/Clustering packages (CentOS-7)
type=rpm-md
baseurl=http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-7/
gpgcheck=0
gpgkey=http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-7/repodata/repomd.xml.key
enabled=1
This repository can conflict with the standard repos. Leave enabled=0 at first; if the only package still missing is crmsh, set enabled=1 (or enable the repo temporarily, as shown below) and install it.
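One way to follow that advice without editing the repo file back and forth is to keep enabled=0 and enable the repository only for the crmsh install (a sketch, assuming the repo id defined above):
# with enabled=0 the repo is ignored, so the base packages come from the standard repos
# yum install -y pacemaker pcs resource-agents
# enable it only for the one remaining package
# yum --enablerepo=network_ha-clustering_Stable install -y crmsh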
Corosync download location (the latest version at the time of writing is 2.4.2):
http://build.clusterlabs.org/corosync/releases/
http://build.clusterlabs.org/corosync/releases/corosync-2.4.2.tar.gz
[root@controller1 ~]# ansible controller -m copy -a "src=/etc/yum.repos.d/ha-clustering.repo dest=/etc/yum.repos.d/"
Install the packages
# yum install -y pacemaker pcs resource-agents cifs-utils quota psmisc corosync fence-agents-all lvm2
# yum install -y crmsh
Enable and start pcsd, then confirm it is running
# systemctl enable pcsd
# systemctl enable corosync
# systemctl start pcsd
# systemctl status pcsd
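To confirm pcsd is active on every node at once, a hedged example using the same ansible inventory group as the rest of this article:
[root@controller1 ~]# ansible controller -m command -a "systemctl is-active pcsd"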
[root@controller2 ~]# pacemakerd -$
Pacemaker 1.1.15-11.el7_3.2
Written by Andrew Beekhof
[root@controller1 ~]# ansible controller -m command -a "pacemakerd -$"
Set the hacluster password on every node (use the ansible shell module rather than command so the pipe is interpreted):
[all nodes]# echo zoomtech | passwd --stdin hacluster
[root@controller1 ~]# ansible controller -m shell -a "echo zoomtech | passwd --stdin hacluster"
# passwd hacluster
Edit corosync.conf
[root@controller3 ~]# vim /etc/corosync/corosync.conf
totem {
        version: 2
        secauth: off
        cluster_name: openstack-cluster
        transport: udpu
}
nodelist {
        node {
                ring0_addr: controller1
                nodeid: 1
        }
        node {
                ring0_addr: controller2
                nodeid: 2
        }
        node {
                ring0_addr: controller3
                nodeid: 3
        }
}
logging {
        to_logfile: yes
        logfile: /var/log/cluster/corosync.log
        to_syslog: yes
}
quorum {
        provider: corosync_votequorum
}
[root@controller1 ~]# scp /etc/corosync/corosync.conf controller2:/etc/corosync/
[root@controller1 ~]# scp /etc/corosync/corosync.conf controller3:/etc/corosync/
[root@controller1 corosync]# ansible controller -m copy -a "src=corosync.conf dest=/etc/corosync"
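A quick way to confirm that all three nodes ended up with an identical file (not part of the original transcript):
[root@controller1 corosync]# ansible controller -m command -a "md5sum /etc/corosync/corosync.conf"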
Create the cluster
Use pcs to authenticate the cluster nodes:
[root@controller1 ~]# pcs cluster auth controller1 controller2 controller3 -u hacluster -p zoomtech --force
controller3: Authorized
controller2: Authorized
controller1: Authorized
Now create the cluster and add the nodes. Note that the cluster name must not exceed 15 characters.
[root@controller1 ~]# pcs cluster setup --force --name openstack-cluster controller1 controller2 controller3
Destroying cluster on nodes: controller1, controller2, controller3...
controller3: Stopping Cluster (pacemaker)...
controller2: Stopping Cluster (pacemaker)...
controller1: Stopping Cluster (pacemaker)...
controller2: Successfully destroyed cluster
controller1: Successfully destroyed cluster
controller3: Successfully destroyed cluster
Sending cluster config files to the nodes...
controller1: Succeeded
controller2: Succeeded
controller3: Succeeded
Synchronizing pcsd certificates on nodes controller1, controller2, controller3...
controller3: Success
controller2: Success
controller1: Success
Restarting pcsd on the nodes in order to reload the certificates...
controller3: Success
controller2: Success
controller1: Success
Enable and start the cluster
[root@controller1 ~]# pcs cluster enable --all
controller1: Cluster Enabled
controller2: Cluster Enabled
controller3: Cluster Enabled
[root@controller1 ~]# pcs cluster start --all
controller2: Starting Cluster...
controller1: Starting Cluster...
controller3: Starting Cluster...
Check the cluster status
[root@controller1 corosync]# ansible controller -m command -a "pcs cluster status"
[root@controller1 ~]# pcs cluster status
Cluster Status:
 Stack: corosync
 Current DC: controller3 (version 1.1.15-11.el7_3.2-e174ec8) - partition with quorum
 Last updated: Fri Feb 17 10:39:38 2017        Last change: Fri Feb 17 10:39:29 2017 by hacluster via crmd on controller3
 3 nodes and 0 resources configured
PCSD Status:
  controller2: Online
  controller3: Online
  controller1: Online
[root@controller1 corosync]# ansible controller -m command -a "pcs status"
[root@controller1 ~]# pcs status
Cluster name: openstack-cluster
Stack: corosync
Current DC: controller2 (version 1.1.15-11.el7_3.2-e174ec8) - partition with quorum
Last updated: Thu Mar  2 17:07:34 2017        Last change: Thu Mar  2 01:44:44 2017 by root via cibadmin on controller1
3 nodes and 1 resource configured
Online: [ controller1 controller2 controller3 ]
Full list of resources:
 vip    (ocf::heartbeat:IPaddr2):    Started controller2
Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
Check the cluster status with crm_mon
[root@controller1 corosync]# ansible controller -m command -a "crm_mon -1"
[root@controller1 ~]# crm_mon -1
Stack: corosync
Current DC: controller2 (version 1.1.15-11.el7_3.2-e174ec8) - partition with quorum
Last updated: Wed Mar  1 17:54:04 2017          Last change: Wed Mar  1 17:44:38 2017 by root via cibadmin on controller1
3 nodes and 1 resource configured
Online: [ controller1 controller2 controller3 ]
Active resources:
vip     (ocf::heartbeat:IPaddr2):    Started controller1
Check the Pacemaker processes
[root@controller1 ~]# ps aux | grep pacemaker
root      75900  0.2  0.5 132632  9216 ?        Ss   10:39   0:00 /usr/sbin/pacemakerd -f
haclust+  75901  0.3  0.8 135268 15376 ?        Ss   10:39   0:00 /usr/libexec/pacemaker/cib
root      75902  0.1  0.4 135608  7920 ?        Ss   10:39   0:00 /usr/libexec/pacemaker/stonithd
root      75903  0.0  0.2 105092  5020 ?        Ss   10:39   0:00 /usr/libexec/pacemaker/lrmd
haclust+  75904  0.0  0.4 126924  7636 ?        Ss   10:39   0:00 /usr/libexec/pacemaker/attrd
haclust+  75905  0.0  0.2 117040  4560 ?        Ss   10:39   0:00 /usr/libexec/pacemaker/pengine
haclust+  75906  0.1  0.5 145328  8988 ?        Ss   10:39   0:00 /usr/libexec/pacemaker/crmd
root      75997  0.0  0.0 112648   948 pts/0    R+   10:40   0:00 grep --color=auto pacemaker
Check the Corosync ring status on each node
[root@controller1 ~]# corosync-cfgtool -s
Printing ring status.
Local node ID 1
RING ID 0
    id    = 192.168.17.132
    status    = ring 0 active with no faults
[root@controller2 corosync]# corosync-cfgtool -s
Printing ring status.
Local node ID 2
RING ID 0
    id    = 192.168.17.146
    status    = ring 0 active with no faults
[root@controller3 ~]# corosync-cfgtool -s
Printing ring status.
Local node ID 3
RING ID 0
    id    = 192.168.17.138
    status    = ring 0 active with no faults
[root@controller1 ~]# corosync-cmapctl | grep members
runtime.totem.pg.mrp.srp.members.1.config_version (u64) = 0
runtime.totem.pg.mrp.srp.members.1.ip (str) = r(0) ip(192.168.17.132)
runtime.totem.pg.mrp.srp.members.1.join_count (u32) = 1
runtime.totem.pg.mrp.srp.members.1.status (str) = joined
runtime.totem.pg.mrp.srp.members.2.config_version (u64) = 0
runtime.totem.pg.mrp.srp.members.2.ip (str) = r(0) ip(192.168.17.146)
runtime.totem.pg.mrp.srp.members.2.join_count (u32) = 1
runtime.totem.pg.mrp.srp.members.2.status (str) = joined
runtime.totem.pg.mrp.srp.members.3.config_version (u64) = 0
runtime.totem.pg.mrp.srp.members.3.ip (str) = r(0) ip(192.168.17.138)
runtime.totem.pg.mrp.srp.members.3.join_count (u32) = 1
runtime.totem.pg.mrp.srp.members.3.status (str) = joined
Check Corosync membership on each node
[root@controller1 ~]# pcs status corosync
Membership information
----------------------
    Nodeid      Votes Name
         1          1 controller1 (local)
         3          1 controller3
         2          1 controller2
[root@controller2 corosync]# pcs status corosync
Membership information
----------------------
    Nodeid      Votes Name
         1          1 controller1
         3          1 controller3
         2          1 controller2 (local)
[root@controller3 ~]# pcs status corosync
Membership information
----------------------
    Nodeid      Votes Name
         1          1 controller1
         3          1 controller3 (local)
         2          1 controller2
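corosync-quorumtool shows the same membership together with the quorum calculation, which is handy when debugging vote counts (not part of the original transcript):
[root@controller1 ~]# corosync-quorumtool -s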
[root@controller1 ~]# crm_verify -L -V
   error: unpack_resources:    Resource start-up disabled since no STONITH resources have been defined
   error: unpack_resources:    Either configure some or disable STONITH with the stonith-enabled option
   error: unpack_resources:    NOTE: Clusters with shared data need STONITH to ensure data integrity
Errors found during check: config not valid
No STONITH devices are defined, so for this test environment disable STONITH and tell Pacemaker to ignore loss of quorum:
[root@controller1 ~]# pcs property set stonith-enabled=false
[root@controller1 ~]# pcs property set no-quorum-policy=ignore
[root@controller1 ~]# crm_verify -L -V
[root@controller1 corosync]# ansible controller -m command -a "pcs property set stonith-enabled=false"
[root@controller1 corosync]# ansible controller -m command -a "pcs property set no-quorum-policy=ignore"
[root@controller1 corosync]# ansible controller -m command -a "crm_verify -L -V"
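Cluster properties live in the CIB, which Pacemaker replicates to every node, so running the set once on any node is sufficient; the result can be verified from any node (a sketch):
[root@controller1 ~]# pcs property list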
Configure the VIP
[root@controller1 ~]# crm
crm(live)# configure
crm(live)configure# show
node 1: controller1
node 2: controller2
node 3: controller3
property cib-bootstrap-options: \
    have-watchdog=false \
    dc-version=1.1.15-11.el7_3.2-e174ec8 \
    cluster-infrastructure=corosync \
    cluster-name=openstack-cluster \
    stonith-enabled=false \
    no-quorum-policy=ignore
crm(live)configure# primitive vip ocf:heartbeat:IPaddr2 params ip=192.168.17.111 cidr_netmask=24 nic=ens37 op start interval=0s timeout=20s op stop interval=0s timeout=20s op monitor interval=30s meta priority=100
crm(live)configure# show
node 1: controller1
node 2: controller2
node 3: controller3
primitive vip IPaddr2 \
    params ip=192.168.17.111 cidr_netmask=24 nic=ens37 \
    op start interval=0s timeout=20s \
    op stop interval=0s timeout=20s \
    op monitor interval=30s \
    meta priority=100
property cib-bootstrap-options: \
    have-watchdog=false \
    dc-version=1.1.15-11.el7_3.2-e174ec8 \
    cluster-infrastructure=corosync \
    cluster-name=openstack-cluster \
    stonith-enabled=false \
    no-quorum-policy=ignore
crm(live)configure# commit
crm(live)configure# exit
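The same resource can be created with pcs instead of crmsh if you prefer to stay with one tool; a rough equivalent of the primitive above (an illustration, not part of the original) would be:
# pcs resource create vip ocf:heartbeat:IPaddr2 ip=192.168.17.111 cidr_netmask=24 nic=ens37 op monitor interval=30s meta priority=100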
Verify that the VIP is bound to the ens37 interface
[root@controller1 ~]# ip a
4: ens37: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:ff:8b:4b brd ff:ff:ff:ff:ff:ff
    inet 192.168.17.141/24 brd 192.168.17.255 scope global dynamic ens37
       valid_lft 2388741sec preferred_lft 2388741sec
    inet 192.168.17.111/24 brd 192.168.17.255 scope global secondary ens37
       valid_lft forever preferred_lft forever
Note: the NIC name specified above must be identical on all three nodes; otherwise the VIP cannot fail over between them.
[root@controller1 ~]# crm status
Stack: corosync
Current DC: controller1 (version 1.1.15-11.el7_3.2-e174ec8) - partition with quorum
Last updated: Wed Feb 22 11:42:07 2017        Last change: Wed Feb 22 11:22:56 2017 by root via cibadmin on controller1
3 nodes and 1 resource configured
Online: [ controller1 controller2 controller3 ]
Full list of resources:
 vip    (ocf::heartbeat:IPaddr2):    Started controller1
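To exercise failover, one hedged test is to put the node currently holding the VIP into standby, confirm the address moves, then bring the node back:
[root@controller1 ~]# pcs cluster standby controller1
[root@controller1 ~]# crm_mon -1 | grep vip        # vip should now be Started on another node
[root@controller1 ~]# pcs cluster unstandby controller1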
Verify that the Corosync Cluster Engine started correctly
[root@controller1 ~]# grep -e "Corosync Cluster Engine" -e "configuration file" /var/log/cluster/corosync.log
[51405] controller1 corosyncnotice  [MAIN  ] Corosync Cluster Engine ('2.4.0'): started and ready to provide service.
Mar 01 17:35:20 [51425] controller1        cib:     info: retrieveCib:    Reading cluster configuration file /var/lib/pacemaker/cib/cib.xml (digest: /var/lib/pacemaker/cib/cib.xml.sig)
Mar 01 17:35:20 [51425] controller1        cib:  warning: cib_file_read_and_verify:    Could not verify cluster configuration file /var/lib/pacemaker/cib/cib.xml: No such file or directory (2)
Mar 01 17:35:20 [51425] controller1        cib:  warning: cib_file_read_and_verify:    Could not verify cluster configuration file /var/lib/pacemaker/cib/cib.xml: No such file or directory (2)
Mar 01 17:35:20 [51425] controller1        cib:     info: cib_file_write_with_digest:    Reading cluster configuration file /var/lib/pacemaker/cib/cib.Apziws (digest: /var/lib/pacemaker/cib/cib.0ZxsVW)
Mar 01 17:35:21 [51425] controller1        cib:     info: cib_file_write_with_digest:    Reading cluster configuration file /var/lib/pacemaker/cib/cib.ObYehI (digest: /var/lib/pacemaker/cib/cib.O8Rntg)
Mar 01 17:35:42 [51425] controller1        cib:     info: cib_file_write_with_digest:    Reading cluster configuration file /var/lib/pacemaker/cib/cib.eqrhsF (digest: /var/lib/pacemaker/cib/cib.6BCfNj)
Mar 01 17:35:42 [51425] controller1        cib:     info: cib_file_write_with_digest:    Reading cluster configuration file /var/lib/pacemaker/cib/cib.riot2E (digest: /var/lib/pacemaker/cib/cib.SAqtzj)
Mar 01 17:35:42 [51425] controller1        cib:     info: cib_file_write_with_digest:    Reading cluster configuration file /var/lib/pacemaker/cib/cib.Q8H9BL (digest: /var/lib/pacemaker/cib/cib.MBljlq)
Mar 01 17:38:29 [51425] controller1        cib:     info: cib_file_write_with_digest:    Reading cluster configuration file /var/lib/pacemaker/cib/cib.OTIiU4 (digest: /var/lib/pacemaker/cib/cib.JnHr1v)
Mar 01 17:38:36 [51425] controller1        cib:     info: cib_file_write_with_digest:    Reading cluster configuration file /var/lib/pacemaker/cib/cib.2cK9Yk (digest: /var/lib/pacemaker/cib/cib.JSqEH8)
Mar 01 17:44:38 [51425] controller1        cib:     info: cib_file_write_with_digest:    Reading cluster configuration file /var/lib/pacemaker/cib/cib.aPFtr3 (digest: /var/lib/pacemaker/cib/cib.E3Ve7X)
[root@controller1 ~]#
Verify that the initial membership notifications were sent correctly
[root@controller1 ~]# grep  TOTEM /var/log/cluster/corosync.log 
[51405] controller1 corosyncnotice  [TOTEM ] Initializing transport (UDP/IP Unicast).
[51405] controller1 corosyncnotice  [TOTEM ] Initializing transmit/receive security (NSS) crypto: none hash: none
[51405] controller1 corosyncnotice  [TOTEM ] The network interface [192.168.17.149] is now up.
[51405] controller1 corosyncnotice  [TOTEM ] adding new UDPU member {192.168.17.149}
[51405] controller1 corosyncnotice  [TOTEM ] adding new UDPU member {192.168.17.141}
[51405] controller1 corosyncnotice  [TOTEM ] adding new UDPU member {192.168.17.166}
[51405] controller1 corosyncnotice  [TOTEM ] A new membership (192.168.17.149:4) was formed. Members joined: 1
[51405] controller1 corosyncnotice  [TOTEM ] A new membership (192.168.17.141:12) was formed. Members joined: 2 3
Check whether any errors occurred during startup
[root@controller1 ~]# grep ERROR: /var/log/cluster/corosync.log