Greenplum fails to start because the GPCC parameter metrics_collector is misconfigured

小麦苗DBA宝典
Published 2023-04-27 13:24:53

Symptom

[gpadmin@mdw1 ~]$ gpstart -a
20230116:12:58:42:008927 gpstart:mdw1:gpadmin-[INFO]:-Starting gpstart with args: -a
20230116:12:58:42:008927 gpstart:mdw1:gpadmin-[INFO]:-Gathering information and validating the environment...
20230116:12:58:42:008927 gpstart:mdw1:gpadmin-[INFO]:-Greenplum Binary Version: 'postgres (Greenplum Database) 6.19.1 build commit:0e314744a460630073b46cea7b7cf20a81e3da63 Open Source'
20230116:12:58:42:008927 gpstart:mdw1:gpadmin-[INFO]:-Greenplum Catalog Version: '301908232'
20230116:12:58:42:008927 gpstart:mdw1:gpadmin-[INFO]:-Starting Master instance in admin mode
20230116:12:58:42:008927 gpstart:mdw1:gpadmin-[CRITICAL]:-Failed to start Master instance in admin mode
20230116:12:58:42:008927 gpstart:mdw1:gpadmin-[CRITICAL]:-Error occurred: non-zero rc: 1
 Command was: 'env GPSESSID=0000000000 GPERA=None $GPHOME/bin/pg_ctl -D /data/gpdb/master/gpseg-1/ -l /data/gpdb/master/gpseg-1//pg_log/startup.log -w -t 600 -o " -p 5432 -c gp_role=utility " start'
rc=1, stdout='waiting for server to start.... stopped waiting
', stderr='pg_ctl: could not start server
Examine the log output.
'
[gpadmin@mdw1 ~]$ tailf /data/gpdb/master/gpseg-1//pg_log/startup.log
2023-01-16 12:58:59.464993 CST,,,p8992,th834783360,,,,0,,,seg-1,,,,,"LOG","00000","registering background worker ""sweeper process""",,,,,,,,"RegisterBackgroundWorker","bgworker.c",774,
2023-01-16 12:58:59.465304 CST,,,p8992,th834783360,,,,0,,,seg-1,,,,,"FATAL","58P01","could not access file ""metrics_collector"": No such file or directory",,,,,,,,"internal_load_library","dfmgr.c",202,1    0xbef3fc postgres errstart (elog.c:557)
2    0xbf456d postgres <symbol not found> (dfmgr.c:199)
3    0xbf4f54 postgres load_file (dfmgr.c:156)
4    0xc083a4 postgres process_shared_preload_libraries (miscinit.c:1378)
5    0xa0d6e3 postgres PostmasterMain (postmaster.c:1151)
6    0x6b0871 postgres main (main.c:205)
7    0x7f522e7ed3d5 libc.so.6 __libc_start_main + 0xf5
8    0x6bc58c postgres <symbol not found> + 0x6bc58c

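Analysis

The startup log line “2023-01-16 12:58:59.465304 CST,,,p8992,th834783360,,,,0,,,seg-1,,,,,"FATAL","58P01","could not access file ""metrics_collector"": No such file or directory",,,,,,,,"internal_load_library","dfmgr.c",202,1 0xbef3fc postgres errstart (elog.c:557)” points to metrics_collector. This name is the value of shared_preload_libraries in the parameter file postgresql.conf, where it enables GPCC metrics monitoring.

The error most likely means that the GPCC installation went wrong and the database was restarted afterwards: the parameter names the library, but the server cannot find the library file at startup.

If GPCC is installed successfully, the library files exist in the locations shown below. If they are missing, do not restart Greenplum casually, because the startup will fail: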

[root@lhrgp40 /]# find /usr/local -name 'metrics_collector*'
/usr/local/greenplum-db-6.19.3/share/postgresql/extension/metrics_collector--1.0.sql
/usr/local/greenplum-db-6.19.3/share/postgresql/extension/metrics_collector.control
/usr/local/greenplum-db-6.19.3/lib/postgresql/metrics_collector.so
[root@lhrgp40 /]# 
[gpadmin@lhrgp40 ~]$ ll $GPHOME/share/postgresql/extension/gp_wlm*
-rw-r--r-- 1 gpadmin gpadmin 856 Dec  6 12:27 /usr/local/greenplum-db-6.19.3/share/postgresql/extension/gp_wlm--0.1.sql
-rw-r--r-- 1 gpadmin gpadmin 232 Dec  6 12:27 /usr/local/greenplum-db-6.19.3/share/postgresql/extension/gp_wlm.control
[gpadmin@lhrgp40 ~]$ ll $GPHOME/share/postgresql/extension/metrics_collector*
-rw-r--r-- 1 gpadmin gpadmin 846 Dec  6 12:27 /usr/local/greenplum-db-6.19.3/share/postgresql/extension/metrics_collector--1.0.sql
-rw-r--r-- 1 gpadmin gpadmin 233 Dec  6 12:27 /usr/local/greenplum-db-6.19.3/share/postgresql/extension/metrics_collector.control
[gpadmin@lhrgp40 ~]$ ll $GPHOME/lib/postgresql/metrics_collector.so
-rwxr-xr-x 1 gpadmin gpadmin 3357064 Dec  6 12:27 /usr/local/greenplum-db-6.19.3/lib/postgresql/metrics_collector.so
[gpadmin@lhrgp40 ~]$ 
[gpadmin@lhrgp40 ~]$ gppkg -q --all
20230116:14:58:39:020317 gppkg:lhrgp40:gpadmin-[INFO]:-Starting gppkg with args: -q --all
MetricsCollector-6.8.3_gp_6.19.3
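
Before restarting, it may help to confirm that every library named in shared_preload_libraries really exists on disk. The following is a minimal bash sketch and was not part of the original session. The helper name check_preload_libs is hypothetical, and the library directory is passed in by the caller, on the assumption that it is $GPHOME/lib/postgresql as in the listing above.

```shell
# Sketch only: check_preload_libs is a hypothetical helper, not a Greenplum tool.
# Usage: check_preload_libs "<comma-separated library names>" "<library dir>"
# Prints one line per library and returns 1 if any .so file is missing.
check_preload_libs() {
    local libs="$1" libdir="$2" missing=0 lib
    local IFS=','                      # split the list on commas
    for lib in $libs; do
        # Trim leading and trailing whitespace around each name.
        lib="${lib#"${lib%%[![:space:]]*}"}"
        lib="${lib%"${lib##*[![:space:]]}"}"
        [ -z "$lib" ] && continue
        if [ -f "$libdir/$lib.so" ]; then
            echo "OK      $lib"
        else
            echo "MISSING $lib"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, check_preload_libs "metrics_collector" "$GPHOME/lib/postgresql" would report MISSING on a host where the GPCC installation did not place metrics_collector.so.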

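Solution

1. First fix the master instance: clear the value of shared_preload_libraries in its postgresql.conf.

2. Then fix the segment (primary) instances: clear the value of shared_preload_libraries in each postgresql.conf.

3. Start the Greenplum cluster as soon as possible with the command gpstart -a.

4. Then fix the mirror instances: clear the value of shared_preload_libraries in each postgresql.conf.

5. Finally, start each mirror instance individually, as follows: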

nohup  /usr/local/greenplum-db-6.19.1/bin/postgres -D /data/gpdb/mirror/gpseg5 -p 7002 &
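
The "clear the value" edit in steps 1, 2 and 4 can be scripted. Below is a rough bash sketch, assuming the setting is written with a single-quoted value as in this article. The helper name clear_preload and the .bak backup suffix are illustrative choices, not something the original procedure prescribes.

```shell
# Sketch only: clear_preload is a hypothetical helper.
# Usage: clear_preload /path/to/postgresql.conf
# Keeps a backup at <file>.bak and sets shared_preload_libraries to ''.
clear_preload() {
    local conf="$1"
    cp -p "$conf" "$conf.bak" || return 1
    # Only uncommented assignments are rewritten; the commented default stays.
    # Writing through a redirect keeps the original file's owner and mode.
    sed -E "s/^([[:space:]]*shared_preload_libraries[[:space:]]*=?[[:space:]]*)'[^']*'/\1''/" \
        "$conf.bak" > "$conf"
}
```

Note that this empties the whole list, which matches the steps above; if other libraries were listed next to metrics_collector they would be dropped too, so the backup is worth reviewing before restarting.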

The segment configuration can be queried on the master instance:

 select * from gp_segment_configuration order by 2,1 ;
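
To turn that catalog into a list of files to edit, the hostname and datadir columns can be combined. This is only a sketch: conf_paths is a hypothetical helper, and it assumes the input is unaligned, tuples-only psql output (psql -At), which separates columns with "|".

```shell
# Sketch only: conf_paths is a hypothetical helper.
# Reads "hostname|datadir" rows on stdin and prints
# "hostname:<datadir>/postgresql.conf", one per instance.
conf_paths() {
    awk -F'|' 'NF >= 2 {
        sub(/\/+$/, "", $2)            # drop trailing slashes from datadir
        print $1 ":" $2 "/postgresql.conf"
    }'
}

# Example, run on the master once it accepts connections:
#   psql -d postgres -Atc "select hostname, datadir from gp_segment_configuration order by content, dbid" | conf_paths
```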

Finally, reinstall GPCC. For details, see: https://www.xmmup.com/greenplumguanfangjiankonggongjugpcc-6deanzhuanghexiezai.html

Location of the postgresql.conf parameter file

[gpadmin@lhrgp40 ~]$ ps -ef|grep green
gpadmin    520     1  0 14:28 pts/0    00:00:07 /usr/local/greenplum-cc-6.8.3/bin/gpccws -W masterport5432e
gpadmin    672     1  0 14:28 ?        00:00:02 /usr/local/greenplum-cc-6.8.3/bin/ccagent -udpport 9898 -rpcaddr lhrgp40:8899 masterport5432e
gpadmin   1845     1  0 14:33 ?        00:00:21 /usr/local/greenplum-db-6.19.3/bin/postgres -D /opt/greenplum/data/master/gpseg-1 -p 5432 -E
gpadmin  15037 15036  0 15:28 ?        00:00:00 addr2line -s -e /usr/local/greenplum-db-6.19.3/bin/postgres 0xbefe0c 0xbf2e08 0xa12c84 0x9fd127 0xa08dd0 0x6ac32e 0xa0e592 0x6b09e1 0x7f969816e555 0x6bc6fc
gpadmin  15039 15724  0 15:28 pts/0    00:00:00 grep --color=auto green
[gpadmin@lhrgp40 ~]$ ll /opt/greenplum/data/master/gpseg-1/postgresql.conf
-rw------- 1 gpadmin gpadmin 23762 Jan 16 14:31 /opt/greenplum/data/master/gpseg-1/postgresql.conf
[gpadmin@lhrgp40 ~]$ more postgresql.conf^C
[gpadmin@lhrgp40 ~]$ more /opt/greenplum/data/master/gpseg-1/postgresql.conf | grep shared_preload_libraries
#shared_preload_libraries = ''          # (change requires restart)
shared_preload_libraries='metrics_collector'
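
The grep output shows two lines that mention the parameter: a commented-out default and the active setting. When a parameter appears more than once in postgresql.conf, the last uncommented occurrence takes effect. A small bash sketch for printing that effective value follows. The name effective_preload is hypothetical, and the sketch handles only single-quoted values and does not follow include directives.

```shell
# Sketch only: effective_preload is a hypothetical helper.
# Usage: effective_preload /path/to/postgresql.conf
# Prints the value of the last uncommented shared_preload_libraries line.
effective_preload() {
    local conf="$1"
    sed -n -E "s/^[[:space:]]*shared_preload_libraries[[:space:]]*=?[[:space:]]*'([^']*)'.*/\1/p" "$conf" \
        | tail -n 1
}
```

Run against the file above, it would print metrics_collector. After the fix it should print an empty line.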

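A single host may run several primary and mirror instances, and the parameter file of every instance must be modified. In the example below, six parameter files have to be changed: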

[root@hdw ~]# ps -ef|grep green
gpadmin   3120     1  0 13:47 ?        00:00:00 /usr/local/greenplum-db-6.19.1/bin/postgres -D /data/gpdb/mirror/gpseg3 -p 7000
gpadmin   3138     1  0 13:47 ?        00:00:00 /usr/local/greenplum-db-6.19.1/bin/postgres -D /data/gpdb/mirror/gpseg4 -p 7001
gpadmin   7256     1  0 13:53 ?        00:00:00 /usr/local/greenplum-db-6.19.1/bin/postgres -D /data/gpdb/mirror/gpseg5 -p 7002
gpadmin  27039     1  0 13:19 ?        00:00:30 /usr/local/greenplum-db-6.19.1/bin/postgres -D /data/gpdb/primary/gpseg7 -p 6001
gpadmin  27041     1  0 13:19 ?        00:00:30 /usr/local/greenplum-db-6.19.1/bin/postgres -D /data/gpdb/primary/gpseg8 -p 6002
gpadmin  27042     1  0 13:19 ?        00:00:30 /usr/local/greenplum-db-6.19.1/bin/postgres -D /data/gpdb/primary/gpseg6 -p 6000
[root@hdw5 ~]# 
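
The data directories can also be pulled out of the ps output instead of being read by eye. The sketch below assumes ps -ef style lines, like the ones above, in which the postgres binary is followed by -D and the data directory. list_datadirs is a hypothetical helper name.

```shell
# Sketch only: list_datadirs is a hypothetical helper.
# Reads `ps -ef` style lines on stdin and prints the argument after -D
# for every line whose command is a .../bin/postgres binary.
list_datadirs() {
    awk '{
        is_pg = 0
        for (i = 1; i <= NF; i++) {
            if ($i ~ /\/bin\/postgres$/) is_pg = 1
            if (is_pg && $i == "-D" && i < NF) { print $(i + 1); break }
        }
    }'
}

# Example: list the parameter files on the current host.
#   ps -ef | list_datadirs | while read -r d; do echo "$d/postgresql.conf"; done
```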
Originally published 2023-01-27. This article is shared from the DB宝 WeChat official account.
