//
利用MySQL二进制包进行版本升级
//
业务场景介绍
线上有个数据库主从环境的MySQL版本是5.5.19版本的,由于5.5.19环境的MySQL在运维侧的支持不太好,例如:不能动态修改buffer_pool的值,alter table增加列的操作会长时间锁表等等。所以经过商量,需要对它进行升级,这次我采用的是在线升级的办法。我总结了一下在线升级过程中的总体步骤:
0、备份旧的数据库 1、关闭旧的MySQL服务器 2、用新的MySQL服务器二进制文件替换旧的MySQL二进制文件或软件包, 3、在现有的数据目录上重新启动MySQL 4、运行mysql_upgrade 5、重启新的MySQL服务器
具体的步骤如下:
0、先对旧的数据库进行备份,以免在执行完数据库更新之后,导致数据不可用。
1、关闭旧的MySQL服务器,这里使用mysqladmin命令来进行关闭。为了保证安全,可以在关闭之前将参数 innodb_fast_shutdown设置成为0,保证安全,这个参数的取值一般有以下几个: 0表示在innodb关闭的时候,需要purge all, merge insert buffer,flush dirty pages。这是最慢的一种关闭方式,但是restart的时候也是最快的。后面将介绍purge all,merge insert buffer,flush dirty pages这三者的含义。 1表示在innodb关闭的时候,它不需要purge all,merge insert buffer,只需要flush dirty page。 2表示在innodb关闭的时候,它不需要purge all,merge insert buffer,也不进行flush dirty page,只将log buffer里面的日志flush到log files。因此等下进行恢复的时候它是最耗时的。
2、替换软件包,这里我使用的是Percona-server的5.7.16代替旧版本5.5.19:
[root local]# ll
lrwxrwxrwx 1 mysql mysql 39 Apr 23 2014 mysql -> /usr/local/mysql-5.5.19-linux2.6-x86_64
drwxr-xr-x 13 mysql mysql 4096 Jul 7 2016 mysql-5.5.19-linux2.6-x86_64
drwxr-xr-x 12 root mysql 4096 Apr 23 2014 mysql-5.5.19-linux2.6-x86_64_old
[root local]# mv /usr/local/Percona-Server-5.7.16-10-Linux.x86_64.ssl101 /usr/local/mysql
[root local]# sed -i 's/Percona-Server-5.7.16-10-Linux.x86_64.ssl101/mysql/g' /usr/local/mysql/bin/mysqld_safe
[root local]# ll
drwxr-xr-x 10 mysql mysql 4096 Nov 28 2016 mysql
drwxr-xr-x 13 mysql mysql 4096 Jul 7 2016 mysql-5.5.19-linux2.6-x86_64
drwxr-xr-x 12 root mysql 4096 Apr 23 2014 mysql-5.5.19-linux2.6-x86_64_old
3、启动新的MySQL实例,注意在启动的时候,有一些参数需要调整,因为5.5和5.7之间配置文件里面有些参数不一致。启动之后可以在错误日志中看到如下的信息:
2019-10-29T06:03:31.968868Z 0 [ERROR] Native table 'performance_schema'.'variables_by_thread' has the wrong structure
2019-10-29T06:03:31.968885Z 0 [ERROR] Native table 'performance_schema'.'global_variables' has the wrong structure
2019-10-29T06:03:31.968902Z 0 [ERROR] Native table 'performance_schema'.'session_variables' has the wrong structure
2019-10-29T06:03:31.968963Z 0 [ERROR] Incorrect definition of table mysql.db: expected column 'User' at position 2 to have type char(32), found type char(16).
2019-10-29T06:03:31.968978Z 0 [ERROR] mysql.user has no `Event_priv` column at position 28
2019-10-29T06:03:31.969103Z 0 [ERROR] Event Scheduler: An error occurred when initializing system tables. Disabling the Event Scheduler.
2019-10-29T06:03:31.969257Z 0 [Note] /usr/local/mysql/bin/mysqld: ready for connections.
2019-10-29T06:03:32.048562Z 2 [ERROR] InnoDB: Table `mysql`.`innodb_table_stats` not found.
2019-10-29T06:03:32.048580Z 2 [ERROR] InnoDB: Fetch of persistent statistics requested for table `infra`.`heartbeat` but the required system tables mysql.innodb_table_stats and mysql.innodb_index_stats are not present or have unexpected structure. Using transient stats instead.
2019-10-29T06:10:56.315588Z 74 [ERROR] InnoDB: Table `mysql`.`innodb_table_stats` not found.
2019-10-29T06:10:56.315616Z 74 [ERROR] InnoDB: Fetch of persistent statistics requested for table `ops_csm`.`operation_log` but the required system tables mysql.innodb_table_stats and mysql.innodb_index_stats are not present or have unexpected structure. Using transient stats instead.
可以看到很多报错信息,大部分都是在说数据字典有问题,还有下面的这两句提示:
2019-10-29T06:00:18.160362Z mysqld_safe Starting mysqld daemon with databases from /data/mysql_4306/data
2019-10-29T06:00:18.327548Z 0 [Warning] The syntax '--log_warnings/-W' is deprecated and will be removed in a future release. Please use '--log_error_verbosity' instead.
2019-10-29T06:03:31.966705Z 0 [ERROR] Column count of performance_schema.events_waits_current is wrong. Expected 19, found 16. Created with MySQL 50519, now running 50716. Please use mysql_upgrade to fix this error.
2019-10-29T06:03:31.966765Z 0 [ERROR] Column count of performance_schema.events_waits_history is wrong. Expected 19, found 16. Created with MySQL 50519, now running 50716. Please use mysql_upgrade to fix this error.
提示使用mysql_upgrade方法来进行升级。
4、使用mysql_upgrade方法进行升级,升级的过程中,可以看到日志中有如下信息:
[root log]# mysql_upgrade -uroot -p -hlocalhost -P3306
Enter password:
Checking if update is needed.
Checking server version.
Running queries to upgrade MySQL server.
Checking system database.
mysql.columns_priv OK
mysql.db OK
mysql.engine_cost OK
mysql.event OK
.........
mysql.time_zone_transition_type OK
mysql.user OK
Upgrading the sys schema.
Checking databases.
infra.chk_masterha OK
infra.heartbeat OK
ops_csm.monitor_alert
error : Table rebuild required. Please do "ALTER TABLE `monitor_alert` FORCE" or dump/reload to fix it!
ops_csm.monitor_alert_indicator OK
ops_csm.monitor_shift OK
ops_csm.monitor_threshold OK
ops_csm.operation_log
error : Table rebuild required. Please do "ALTER TABLE `operation_log` FORCE" or dump/reload to fix it!
ops_csm.qc_apeal
error : Table rebuild required. Please do "ALTER TABLE `qc_apeal` FORCE" or dump/reload to fix it!
ops_csm.qc_apeal_history
error : Table rebuild required. Please do "ALTER TABLE `qc_apeal_history` FORCE" or dump/reload to fix it!
ops_csm.qc_audit
error : Table rebuild required. Please do "ALTER TABLE `qc_audit` FORCE" or dump/reload to fix it!
ops_csm.qc_audit_detail
error : Table rebuild required. Please do "ALTER TABLE `qc_audit_detail` FORCE" or dump/reload to fix it!
从日志中可以看出,带有OK字样的表是成功升级的,还有一部分提示需要手工的使用alter的语法进行修复,否则无法直接使用。最后输出了两行日志:
Upgrade process completed successfully.
Checking if update is needed.
5、重启服务器。这里为什么要重启?因为upgrade修复表结构之后,需要重启来让MySQL重新加载表,否则会出现下面的情况:
mysql--dba_admin@127.0.0.1:(none) 22:00:28>>show variables like '%%';
ERROR 1682 (HY000): Native table 'performance_schema'.'session_variables'
has the wrong structure
想要查看系统变量的时候,报这个错误,找了找网上的解决办法:
mysql--dba_admin@127.0.0.1:(none) 22:01:37>>set @@global.show_compatibility_56=ON;
Query OK, 0 rows affected (0.00 sec)
mysql--dba_admin@127.0.0.1:(none) 22:02:51>>show variables like '%show_compatibility_56%';
+-----------------------+-------+
| Variable_name | Value |
+-----------------------+-------+
| show_compatibility_56 | ON |
+-----------------------+-------+
1 row in set (0.00 sec)
mysql--dba_admin@127.0.0.1:(none) 22:02:55>>show variables like '%fast%';
+----------------------------+-------+
| Variable_name | Value |
+----------------------------+-------+
| expand_fast_index_creation | OFF |
| innodb_fast_shutdown | 1 |
+----------------------------+-------+
2 rows in set (0.00 sec)
设置全局参数show_compatibility_56 为on之后,发现已经可以查看当前session的变量了,暂时解决了问题。这个变量我之前没有使用过,不过从字面意思看,是为了兼容MySQL5.6版本而存在的,翻了翻官方文档:
上面说到,当这个值设置为off的时候,会从performance_schema库中的表session_variable表去读取相关的参数,而此时表session_variable有问题,使用mysql_upgrade修复之后还没有重启,所以会报错。又从淘宝数据库内核月报中看到下面的信息:
到这里,已经可以大概理解了,如果设置为on,则不从performance_schema库中的表去读,它会兼容5.6的方法,也就是说,及时这个表session_variable存在问题,也不会影响那个读取系统变量,自然也就不会报错了。
但是,在官方文档中还有一句话:
意思是这个参数在未来的版本中将会去掉,所以不建议使用,细细一想,这个参数还是需要改为OFF,如果设置为ON,相当于没有使用到MySQL5.7的特性,还是在使用兼容5.6的方法来使用5.7,这跟我们的初衷相违背。于是我修改了这个参数。
MySQL 5.7.6之后,在Performance_schema添加了如下的表: performance_schema.global_variables performance_schema.session_variables performance_schema.variables_by_thread performance_schema.global_status performance_schema.session_status performance_schema.status_by_thread performance_schema.status_by_account performance_schema.status_by_host performance_schema.status_by_user
将参数show_compatibility_56设置为OFF关闭之后,我查看了表session_variables,结果是无法查看:
mysql--dba_admin@127.0.0.1:performance_schema 22:11:29>>select * from session_variables limit 1;
ERROR 1682 (HY000): Native table 'performance_schema'.'session_variables' has the wrong structure
于是我重启了一下新的MySQL服务器,重启之后,再来查看变量:
mysql--dba_admin@127.0.0.1:(none) 22:12:32>>show variables like '%show_compatibility_56%';
+-----------------------+-------+
| Variable_name | Value |
+-----------------------+-------+
| show_compatibility_56 | OFF |
+-----------------------+-------+
1 row in set (0.00 sec)
mysql--dba_admin@127.0.0.1:(none) 22:36:32>>show variables like '%buffer_pool_size%';
+-------------------------+------------+
| Variable_name | Value |
+-------------------------+------------+
| innodb_buffer_pool_size | 2147483648 |
+-------------------------+------------+
1 row in set (0.00 sec)
发现已经可以正常查看了。到这里,这个在线升级也算是告一段落了,里面其实还是有很多细的点需要注意,但是鉴于时间原因,这里不做展开,简单提一下:
1、在主从复制的过程中,如果进行升级,mysql_upgrade进行alter table操作会长时间锁表,这个问题如何修复?修复完成之后能否保证主从数据一致?
2、如果在线升级,会对业务方的表结构产生影响么?
3、除了查看变量的操作,还有其他的操作有问题么?数据字典还需不需要额外的手动修复。