专栏首页乐沙弥的世界MySQL字符集乱码

MySQL字符集乱码

版权声明:本文为博主原创文章,欢迎扩散,扩散请务必注明出处。 https://blog.csdn.net/robinson_0612/article/details/91175314

MySQL数据库查询结果乱码,这是大家比较常见的情形。到底是什么原因导致出现查询结果为乱码呢,本文主要通过演示来理解乱码产生的原因,以及如何解决字符集乱码,供大家参考。

一、字符编码对比

	SELECT hex(convert('love' USING latin1)) latin_value,
	       hex(convert('love' USING gb2312)) gb2312_value,
	       hex(convert('love' USING gbk))    gbk_value,
	       hex(convert('love' USING utf8))   utf8_value;
	+-------------+--------------+-----------+------------+
	| latin_value | gb2312_value | gbk_value | utf8_value |
	+-------------+--------------+-----------+------------+
	| 6C6F7665    | 6C6F7665     | 6C6F7665  | 6C6F7665   |
	+-------------+--------------+-----------+------------+
	
	SELECT hex(convert('爱' USING latin1)) latin_value,
	       hex(convert('爱' USING gb2312)) gb2312_value,
	       hex(convert('爱' USING gbk))    gbk_value,
	       hex(convert('爱' USING utf8))   utf8_value;
	+-------------+--------------+-----------+------------+
	| latin_value | gb2312_value | gbk_value | utf8_value |
	+-------------+--------------+-----------+------------+
	| 3F          | B0AE         | B0AE      | E788B1     |
	+-------------+--------------+-----------+------------+  
	
	SELECT convert(0x3F USING latin1)   latin_value,
	       convert(0xB0AE USING gb2312) gb2312_value,
	       convert(0xB0AE USING gbk)    gbk_value,
	       convert(0xE788B1 USING utf8) utf8_value;
   
	+-------------+--------------+-----------+------------+
	| latin_value | gb2312_value | gbk_value | utf8_value |
	+-------------+--------------+-----------+------------+
	| ?           | 爱           | 爱        | 爱         |
	+-------------+--------------+-----------+------------+

二、乱码测试

1、环境准备

	# grep -Ev "^#|^$" /etc/my.cnf   -- 查看当前my.cnf配置
	
	mysql> show variables like 'version';
	+---------------+------------+
	| Variable_name | Value      |
	+---------------+------------+
	| version       | 5.7.23-log |
	+---------------+------------+

	mysql> show variables like '%character%';
	
	+--------------------------+----------------------------+
	| Variable_name            | Value                      |
	+--------------------------+----------------------------+
	| character_set_client     | gbk                        |
	| character_set_connection | gbk                        |
	| character_set_database   | latin1                     |
	| character_set_filesystem | binary                     |
	| character_set_results    | gbk                        |
	| character_set_server     | latin1                     |
	| character_set_system     | utf8                       |
	| character_sets_dir       | /usr/share/mysql/charsets/ |
	+--------------------------+----------------------------+

	DROP TABLE IF EXISTS sakila.colum_charset;
	
	CREATE TABLE sakila.colum_charset
	(
	   id int not null auto_increment primary key,
	   c1 varchar(20),
	   c2 char(20) CHAR SET gbk,
	   c3 varchar(20) CHARSET gb2312,
	   c4 char(20) CHARACTER SET utf8,
	   c5 varchar(20) CHARSET utf8mb4
	);
	
	mysql> show create table sakila.colum_charset\G
	*************************** 1. row ***************************
	       Table: colum_charset
	Create Table: CREATE TABLE `colum_charset` (
	  `id` int(11) NOT NULL AUTO_INCREMENT,
	  `c1` varchar(20) DEFAULT NULL,
	  `c2` char(20) CHARACTER SET gbk DEFAULT NULL,
	  `c3` varchar(20) CHARACTER SET gb2312 DEFAULT NULL,
	  `c4` char(20) CHARACTER SET utf8 DEFAULT NULL,
	  `c5` varchar(20) CHARACTER SET utf8mb4 DEFAULT NULL,
	  PRIMARY KEY (`id`)
	) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=latin1

2、基于默认字符插入数据(gbk) – character_set_client gbk – character_set_connection gbk – character_set_results gbk

	INSERT INTO sakila.colum_charset(id,
	                                 c1,
	                                 c2,
	                                 c3,
	                                 c4,
	                                 c5)
	VALUES (NULL,'爱','爱','爱','爱','爱');
	        
	ERROR 1366 (HY000): Incorrect string value: '\xB0\xAE' for column 'c1' at row 1

	INSERT INTO sakila.colum_charset(id,
	                                 c1,
	                                 c2,
	                                 c3,
	                                 c4,
	                                 c5)
	VALUES (NULL,'love','爱','爱','爱','爱');
	Query OK, 1 row affected (0.00 sec)
	
	mysql> select * from sakila.colum_charset;
	+----+------+------+------+------+------+
	| id | c1   | c2   | c3   | c4   | c5   |
	+----+------+------+------+------+------+
	|  1 | love | 爱   | 爱   | 爱   | 爱   |
	+----+------+------+------+------+------+

3、三个变量全部设置为utf8插入数据 mysql> set names ‘utf8’; Query OK, 0 rows affected (0.00 sec)

	mysql> show variables like '%character%';
	+--------------------------+----------------------------+
	| Variable_name            | Value                      |
	+--------------------------+----------------------------+
	| character_set_client     | utf8                       |
	| character_set_connection | utf8                       |
	| character_set_database   | latin1                     |
	| character_set_filesystem | binary                     |
	| character_set_results    | utf8                       |
	| character_set_server     | latin1                     |
	| character_set_system     | utf8                       |
	| character_sets_dir       | /usr/share/mysql/charsets/ |
	+--------------------------+----------------------------+
	
	INSERT INTO sakila.colum_charset(id,
	                                 c1,
	                                 c2,
	                                 c3,
	                                 c4,
	                                 c5)
	VALUES (NULL,'heart','心','心','心','心');        
	Query OK, 1 row affected (0.00 sec)
	
	mysql> select * from sakila.colum_charset;
	+----+-------+------+------+------+------+
	| id | c1    | c2   | c3   | c4   | c5   |
	+----+-------+------+------+------+------+
	|  1 | love  | 爱   | 爱   | 爱   | 爱   |
	|  2 | heart | 心   | 心   | 心   | 心   |
	+----+-------+------+------+------+------+

	INSERT INTO sakila.colum_charset(id,
	                                 c1,
	                                 c2,
	                                 c3,
	                                 c4,
	                                 c5)
	VALUES (NULL,'heart','屌','屌','屌','屌');  -- c3列为gb2312编码
	        
	ERROR 1366 (HY000): Incorrect string value: '\xE5\xB1\x8C' for column 'c3' at row 1

4、单个变量character_set_connection设置为latin1插入数据 mysql> set character_set_connection=latin1;

	INSERT INTO sakila.colum_charset(id,
	                                 c1,
	                                 c2,
	                                 c3,
	                                 c4,
	                                 c5)
	VALUES (NULL,'heart','情','情','情','情');
	Query OK, 1 row affected, 4 warnings (0.00 sec)
	
	mysql> show warnings \G
	*************************** 1. row ***************************
	  Level: Warning
	   Code: 1300
	Message: Invalid utf8 character string: '\xE6\x83\x85'
	
	-- 乱码出现
	mysql> select * from sakila.colum_charset;
	+----+-------+------+------+------+------+
	| id | c1    | c2   | c3   | c4   | c5   |
	+----+-------+------+------+------+------+
	|  1 | love  | 爱   | 爱   | 爱   | 爱   |
	|  2 | heart | 心   | 心   | 心   | 心   |
	|  3 | heart | ?    | ?    | ?    | ?    |
	+----+-------+------+------+------+------+        

5、单个变量character_set_connection设置为gb2312插入数据 mysql> set character_set_connection=gb2312;

	INSERT INTO sakila.colum_charset(id,
	                                 c1,
	                                 c2,
	                                 c3,
	                                 c4,
	                                 c5)
	VALUES (NULL,'heart','屌','屌','屌','屌');
	
	Query OK, 1 row affected, 4 warnings (0.00 sec)
    
	mysql> select * from sakila.colum_charset;
	+----+-------+------+------+------+------+
	| id | c1    | c2   | c3   | c4   | c5   |
	+----+-------+------+------+------+------+
	|  1 | love  | 爱   | 爱   | 爱   | 爱   |
	|  2 | heart | 心   | 心   | 心   | 心   |
	|  3 | heart | ?    | ?    | ?    | ?    |
	|  4 | heart | ?    | ?    | ?    | ?    |
	+----+-------+------+------+------+------+

6、单个变量character_set_results设置为latin1 – 测试返回数据

	mysql> set character_set_results=latin1;
	Query OK, 0 rows affected (0.00 sec)
	
	mysql> select * from sakila.colum_charset;
	+----+-------+------+------+------+------+
	| id | c1    | c2   | c3   | c4   | c5   |
	+----+-------+------+------+------+------+
	|  1 | love  | ?    | ?    | ?    | ?    |
	|  2 | heart | ?    | ?    | ?    | ?    |
	|  3 | heart | ?    | ?    | ?    | ?    |
	|  4 | heart | ?    | ?    | ?    | ?    |
	+----+-------+------+------+------+------+
	4 rows in set (0.00 sec)

6、单个变量character_set_results设置为gb2312 – 测试返回数据

	mysql> set character_set_results=gb2312;
	Query OK, 0 rows affected (0.00 sec)
	
	mysql> select * from sakila.colum_charset;
	+----+-------+------+------+------+------+
	| id | c1    | c2   | c3   | c4   | c5   |
	+----+-------+------+------+------+------+
	|  1 | love  |    |    |    |    |
	|  2 | heart |    |    |    |    |
	|  3 | heart | ?    | ?    | ?    | ?    |
	|  4 | heart | ?    | ?    | ?    | ?    |
	+----+-------+------+------+------+------+
	4 rows in set (0.00 sec)

6、单个变量character_set_results设置为gbk – 测试返回数据 mysql> set character_set_results=gbk; Query OK, 0 rows affected (0.00 sec)

	mysql> select * from sakila.colum_charset;
	+----+-------+------+------+------+------+
	| id | c1    | c2   | c3   | c4   | c5   |
	+----+-------+------+------+------+------+
	|  1 | love  |    |    |    |    |
	|  2 | heart |    |    |    |    |
	|  3 | heart | ?    | ?    | ?    | ?    |
	|  4 | heart | ?    | ?    | ?    | ?    |
	+----+-------+------+------+------+------+
	4 rows in set (0.00 sec)

7、单个变量character_set_results设置为utf8 – 测试返回数据

	mysql> set character_set_results=utf8;
	Query OK, 0 rows affected (0.00 sec)
	
	mysql> select * from sakila.colum_charset;
	+----+-------+------+------+------+------+
	| id | c1    | c2   | c3   | c4   | c5   |
	+----+-------+------+------+------+------+
	|  1 | love  | 爱   | 爱   | 爱   | 爱   |
	|  2 | heart | 心   | 心   | 心   | 心   |
	|  3 | heart | ?    | ?    | ?    | ?    |
	|  4 | heart | ?    | ?    | ?    | ?    |
	+----+-------+------+------+------+------+
	4 rows in set (0.00 sec)             

8、本地环境变量影响客户端字符集设定 – 在my.cnf中未配置客户端字符集,如果配置后,则使用配置文件中设定的字符集

	[root@centos7 ~]# export LANG=en_US.UTF-8
	[root@centos7 ~]# mysql -e "show variables like 'character%'"
	+--------------------------+----------------------------+
	| Variable_name            | Value                      |
	+--------------------------+----------------------------+
	| character_set_client     | utf8                       |
	| character_set_connection | utf8                       |
	| character_set_database   | utf8                       |
	| character_set_filesystem | binary                     |
	| character_set_results    | utf8                       |
	| character_set_server     | utf8                       |
	| character_set_system     | utf8                       |
	| character_sets_dir       | /usr/share/mysql/charsets/ |
	+--------------------------+----------------------------+
	
	[root@centos7 ~]# export LANG=zh_CN.GBK
	[root@centos7 ~]# mysql -e "show variables like 'character%'"
	+--------------------------+----------------------------+
	| Variable_name            | Value                      |
	+--------------------------+----------------------------+
	| character_set_client     | gbk                        |
	| character_set_connection | gbk                        |
	| character_set_database   | utf8                       |
	| character_set_filesystem | binary                     |
	| character_set_results    | gbk                        |
	| character_set_server     | utf8                       |
	| character_set_system     | utf8                       |
	| character_sets_dir       | /usr/share/mysql/charsets/ |
	+--------------------------+----------------------------+
	
	[root@centos7 ~]# export LANG=zh_CN.GB2312
	[root@centos7 ~]# mysql -e "show variables like 'character%'"
	+--------------------------+----------------------------+
	| Variable_name            | Value                      |
	+--------------------------+----------------------------+
	| character_set_client     | gb2312                     |
	| character_set_connection | gb2312                     |
	| character_set_database   | utf8                       |
	| character_set_filesystem | binary                     |
	| character_set_results    | gb2312                     |
	| character_set_server     | utf8                       |
	| character_set_system     | utf8                       |
	| character_sets_dir       | /usr/share/mysql/charsets/ |
	+--------------------------+----------------------------+

========================================================== 结论: character_set_client: 客户端发送的数据是什么编码? character_set_connection: 告诉字符集转换器,转换成什么编码? character_set_results: 查询的结果用什么编码? 如果以上三者都为字符集N,可简写为set names ‘N’;

乱码产生的原因如下: a、插入或读取时对应编码环节发生转换导致数据丢失。 b、如果两个字符集之间无法进行无损编码转换,一定会出现乱码。

解决方案: 1、一定要保证character_set_connection字符集大于等于client字符集,否则会丢失数据 比如: latin1 < gb2312 < gbk < utf8, 若设置set character_set_client = gb2312, 那么至少connection的字符集要大于等于gb2312,否则就会丢失数据 2、一定要保证character_set_results大于等于数据存入的字符集,否则会丢失数据 比如:如存储的字符为utf8,而返回character_set_results为gbk,数据被截断

3、所有变量使用统一的字符编码,如utf8或者utf8mb4

本文参与腾讯云自媒体分享计划,欢迎正在阅读的你也加入,一起分享。

我来说两句

0 条评论
登录 后参与评论

相关文章

  • [转]你应该定期更新 Homebrew

    这篇文章是关于定期更新 Homebrew 的话题。它会告诉你定期更新的好处,常用的命令,以及用 brew pin 尽可能无痛地更新。

    FungLeo
  • 7.2 Sqoop2安装

    版权声明:本文为王小雷原创文章,未经博主允许不得转载 https://blog.csdn.n...

    王小雷
  • zabbix-server 自动安装脚本

    版权声明:本文为木偶人shaon原创文章,转载请注明原文地址,非常感谢。 https://b...

    shaonbean
  • 架构师的必备素质和成长途径

    2016年11月18-20日SDCC 2016中国软件开发者大会,易宝支付CTO陈斌给我们带来了“架构师的成长之路”的演讲。主要谈及了架构师的必备素质和成长途...

    魏晓蕾
  • Pipeline大数据架构

    Pipeline大数据架构,面向大数据仓库和大数据处理平台。是基于lambda的大数据架构的变种,增加了企业级服务,而并非只是大数据组件的对切,是一种更落地的方...

    王小雷
  • Eclipse连接SQL2008与连接SQL2000代码的不同及连接2008过程中遇到的问题

    版权声明:本文为博主原创文章,转载请注明出处。 https://blog.csdn.net/gongxifacai_believe/artic...

    魏晓蕾
  • Sqlite基本命令集合(linux/fedora/ubuntu)

    版权声明:本文为王小雷原创文章,未经博主允许不得转载 https://blog.csdn.net/...

    王小雷
  • 登录Oracle过程中遇到的问题及解决方案

    IOException in Sending Request::Connection refused: connect

    魏晓蕾
  • Windows Server 2003 组策略报错:提示不能确定应用到此机器的组策略安全性设置

    错误提示信息: 不能确定应用到此机器的组策略安全性设置。在尝试从本地安全策略数据库(%windir%\security\database\secedit.sdb...

    魏晓蕾
  • laravel5.1的用户权限管理的实现 原

    修改database/migrations/2016_12_02_070731_create_blogs_table.php文件

    lilugirl

扫码关注云+社区

领取腾讯云代金券