前往小程序,Get更优阅读体验!
立即前往
首页
学习
活动
专区
工具
TVP
发布
社区首页 >专栏 >MySQL字符集乱码

MySQL字符集乱码

作者头像
Leshami
发布2019-06-14 20:46:08
6.6K0
发布2019-06-14 20:46:08
举报
文章被收录于专栏:乐沙弥的世界乐沙弥的世界

版权声明:本文为博主原创文章,欢迎扩散,扩散请务必注明出处。 https://blog.csdn.net/robinson_0612/article/details/91175314

MySQL数据库查询结果乱码,这是大家比较常见的情形。到底是什么原因导致出现查询结果为乱码呢,本文主要通过演示来理解乱码产生的原因,以及如何解决字符集乱码,供大家参考。

一、字符编码对比

代码语言:javascript
复制
	SELECT hex(convert('love' USING latin1)) latin_value,
	       hex(convert('love' USING gb2312)) gb2312_value,
	       hex(convert('love' USING gbk))    gbk_value,
	       hex(convert('love' USING utf8))   utf8_value;
	+-------------+--------------+-----------+------------+
	| latin_value | gb2312_value | gbk_value | utf8_value |
	+-------------+--------------+-----------+------------+
	| 6C6F7665    | 6C6F7665     | 6C6F7665  | 6C6F7665   |
	+-------------+--------------+-----------+------------+
	
	SELECT hex(convert('爱' USING latin1)) latin_value,
	       hex(convert('爱' USING gb2312)) gb2312_value,
	       hex(convert('爱' USING gbk))    gbk_value,
	       hex(convert('爱' USING utf8))   utf8_value;
	+-------------+--------------+-----------+------------+
	| latin_value | gb2312_value | gbk_value | utf8_value |
	+-------------+--------------+-----------+------------+
	| 3F          | B0AE         | B0AE      | E788B1     |
	+-------------+--------------+-----------+------------+  
	
	SELECT convert(0x3F USING latin1)   latin_value,
	       convert(0xB0AE USING gb2312) gb2312_value,
	       convert(0xB0AE USING gbk)    gbk_value,
	       convert(0xE788B1 USING utf8) utf8_value;
   
	+-------------+--------------+-----------+------------+
	| latin_value | gb2312_value | gbk_value | utf8_value |
	+-------------+--------------+-----------+------------+
	| ?           | 爱           | 爱        | 爱         |
	+-------------+--------------+-----------+------------+

二、乱码测试

1、环境准备

代码语言:javascript
复制
	# grep -Ev "^#|^$" /etc/my.cnf   -- 查看当前my.cnf配置
	
	mysql> show variables like 'version';
	+---------------+------------+
	| Variable_name | Value      |
	+---------------+------------+
	| version       | 5.7.23-log |
	+---------------+------------+

	mysql> show variables like '%character%';
	
	+--------------------------+----------------------------+
	| Variable_name            | Value                      |
	+--------------------------+----------------------------+
	| character_set_client     | gbk                        |
	| character_set_connection | gbk                        |
	| character_set_database   | latin1                     |
	| character_set_filesystem | binary                     |
	| character_set_results    | gbk                        |
	| character_set_server     | latin1                     |
	| character_set_system     | utf8                       |
	| character_sets_dir       | /usr/share/mysql/charsets/ |
	+--------------------------+----------------------------+

	DROP TABLE IF EXISTS sakila.colum_charset;
	
	CREATE TABLE sakila.colum_charset
	(
	   id int not null auto_increment primary key,
	   c1 varchar(20),
	   c2 char(20) CHAR SET gbk,
	   c3 varchar(20) CHARSET gb2312,
	   c4 char(20) CHARACTER SET utf8,
	   c5 varchar(20) CHARSET utf8mb4
	);
	
	mysql> show create table sakila.colum_charset\G
	*************************** 1. row ***************************
	       Table: colum_charset
	Create Table: CREATE TABLE `colum_charset` (
	  `id` int(11) NOT NULL AUTO_INCREMENT,
	  `c1` varchar(20) DEFAULT NULL,
	  `c2` char(20) CHARACTER SET gbk DEFAULT NULL,
	  `c3` varchar(20) CHARACTER SET gb2312 DEFAULT NULL,
	  `c4` char(20) CHARACTER SET utf8 DEFAULT NULL,
	  `c5` varchar(20) CHARACTER SET utf8mb4 DEFAULT NULL,
	  PRIMARY KEY (`id`)
	) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=latin1

2、基于默认字符插入数据(gbk) – character_set_client gbk – character_set_connection gbk – character_set_results gbk

代码语言:javascript
复制
	INSERT INTO sakila.colum_charset(id,
	                                 c1,
	                                 c2,
	                                 c3,
	                                 c4,
	                                 c5)
	VALUES (NULL,'爱','爱','爱','爱','爱');
	        
	ERROR 1366 (HY000): Incorrect string value: '\xB0\xAE' for column 'c1' at row 1

	INSERT INTO sakila.colum_charset(id,
	                                 c1,
	                                 c2,
	                                 c3,
	                                 c4,
	                                 c5)
	VALUES (NULL,'love','爱','爱','爱','爱');
	Query OK, 1 row affected (0.00 sec)
	
	mysql> select * from sakila.colum_charset;
	+----+------+------+------+------+------+
	| id | c1   | c2   | c3   | c4   | c5   |
	+----+------+------+------+------+------+
	|  1 | love | 爱   | 爱   | 爱   | 爱   |
	+----+------+------+------+------+------+

3、三个变量全部设置为utf8插入数据 mysql> set names ‘utf8’; Query OK, 0 rows affected (0.00 sec)

代码语言:javascript
复制
	mysql> show variables like '%character%';
	+--------------------------+----------------------------+
	| Variable_name            | Value                      |
	+--------------------------+----------------------------+
	| character_set_client     | utf8                       |
	| character_set_connection | utf8                       |
	| character_set_database   | latin1                     |
	| character_set_filesystem | binary                     |
	| character_set_results    | utf8                       |
	| character_set_server     | latin1                     |
	| character_set_system     | utf8                       |
	| character_sets_dir       | /usr/share/mysql/charsets/ |
	+--------------------------+----------------------------+
	
	INSERT INTO sakila.colum_charset(id,
	                                 c1,
	                                 c2,
	                                 c3,
	                                 c4,
	                                 c5)
	VALUES (NULL,'heart','心','心','心','心');        
	Query OK, 1 row affected (0.00 sec)
	
	mysql> select * from sakila.colum_charset;
	+----+-------+------+------+------+------+
	| id | c1    | c2   | c3   | c4   | c5   |
	+----+-------+------+------+------+------+
	|  1 | love  | 爱   | 爱   | 爱   | 爱   |
	|  2 | heart | 心   | 心   | 心   | 心   |
	+----+-------+------+------+------+------+

	INSERT INTO sakila.colum_charset(id,
	                                 c1,
	                                 c2,
	                                 c3,
	                                 c4,
	                                 c5)
	VALUES (NULL,'heart','屌','屌','屌','屌');  -- c3列为gb2312编码
	        
	ERROR 1366 (HY000): Incorrect string value: '\xE5\xB1\x8C' for column 'c3' at row 1

4、单个变量character_set_connection设置为latin1插入数据 mysql> set character_set_connection=latin1;

代码语言:javascript
复制
	INSERT INTO sakila.colum_charset(id,
	                                 c1,
	                                 c2,
	                                 c3,
	                                 c4,
	                                 c5)
	VALUES (NULL,'heart','情','情','情','情');
	Query OK, 1 row affected, 4 warnings (0.00 sec)
	
	mysql> show warnings \G
	*************************** 1. row ***************************
	  Level: Warning
	   Code: 1300
	Message: Invalid utf8 character string: '\xE6\x83\x85'
	
	-- 乱码出现
	mysql> select * from sakila.colum_charset;
	+----+-------+------+------+------+------+
	| id | c1    | c2   | c3   | c4   | c5   |
	+----+-------+------+------+------+------+
	|  1 | love  | 爱   | 爱   | 爱   | 爱   |
	|  2 | heart | 心   | 心   | 心   | 心   |
	|  3 | heart | ?    | ?    | ?    | ?    |
	+----+-------+------+------+------+------+        

5、单个变量character_set_connection设置为gb2312插入数据 mysql> set character_set_connection=gb2312;

代码语言:javascript
复制
	INSERT INTO sakila.colum_charset(id,
	                                 c1,
	                                 c2,
	                                 c3,
	                                 c4,
	                                 c5)
	VALUES (NULL,'heart','屌','屌','屌','屌');
	
	Query OK, 1 row affected, 4 warnings (0.00 sec)
    
	mysql> select * from sakila.colum_charset;
	+----+-------+------+------+------+------+
	| id | c1    | c2   | c3   | c4   | c5   |
	+----+-------+------+------+------+------+
	|  1 | love  | 爱   | 爱   | 爱   | 爱   |
	|  2 | heart | 心   | 心   | 心   | 心   |
	|  3 | heart | ?    | ?    | ?    | ?    |
	|  4 | heart | ?    | ?    | ?    | ?    |
	+----+-------+------+------+------+------+

6、单个变量character_set_results设置为latin1 – 测试返回数据

代码语言:javascript
复制
	mysql> set character_set_results=latin1;
	Query OK, 0 rows affected (0.00 sec)
	
	mysql> select * from sakila.colum_charset;
	+----+-------+------+------+------+------+
	| id | c1    | c2   | c3   | c4   | c5   |
	+----+-------+------+------+------+------+
	|  1 | love  | ?    | ?    | ?    | ?    |
	|  2 | heart | ?    | ?    | ?    | ?    |
	|  3 | heart | ?    | ?    | ?    | ?    |
	|  4 | heart | ?    | ?    | ?    | ?    |
	+----+-------+------+------+------+------+
	4 rows in set (0.00 sec)

6、单个变量character_set_results设置为gb2312 – 测试返回数据

代码语言:javascript
复制
	mysql> set character_set_results=gb2312;
	Query OK, 0 rows affected (0.00 sec)
	
	mysql> select * from sakila.colum_charset;
	+----+-------+------+------+------+------+
	| id | c1    | c2   | c3   | c4   | c5   |
	+----+-------+------+------+------+------+
	|  1 | love  |    |    |    |    |
	|  2 | heart |    |    |    |    |
	|  3 | heart | ?    | ?    | ?    | ?    |
	|  4 | heart | ?    | ?    | ?    | ?    |
	+----+-------+------+------+------+------+
	4 rows in set (0.00 sec)

6、单个变量character_set_results设置为gbk – 测试返回数据 mysql> set character_set_results=gbk; Query OK, 0 rows affected (0.00 sec)

代码语言:javascript
复制
	mysql> select * from sakila.colum_charset;
	+----+-------+------+------+------+------+
	| id | c1    | c2   | c3   | c4   | c5   |
	+----+-------+------+------+------+------+
	|  1 | love  |    |    |    |    |
	|  2 | heart |    |    |    |    |
	|  3 | heart | ?    | ?    | ?    | ?    |
	|  4 | heart | ?    | ?    | ?    | ?    |
	+----+-------+------+------+------+------+
	4 rows in set (0.00 sec)

7、单个变量character_set_results设置为utf8 – 测试返回数据

代码语言:javascript
复制
	mysql> set character_set_results=utf8;
	Query OK, 0 rows affected (0.00 sec)
	
	mysql> select * from sakila.colum_charset;
	+----+-------+------+------+------+------+
	| id | c1    | c2   | c3   | c4   | c5   |
	+----+-------+------+------+------+------+
	|  1 | love  | 爱   | 爱   | 爱   | 爱   |
	|  2 | heart | 心   | 心   | 心   | 心   |
	|  3 | heart | ?    | ?    | ?    | ?    |
	|  4 | heart | ?    | ?    | ?    | ?    |
	+----+-------+------+------+------+------+
	4 rows in set (0.00 sec)             

8、本地环境变量影响客户端字符集设定 – 在my.cnf中未配置客户端字符集,如果配置后,则使用配置文件中设定的字符集

代码语言:javascript
复制
	[root@centos7 ~]# export LANG=en_US.UTF-8
	[root@centos7 ~]# mysql -e "show variables like 'character%'"
	+--------------------------+----------------------------+
	| Variable_name            | Value                      |
	+--------------------------+----------------------------+
	| character_set_client     | utf8                       |
	| character_set_connection | utf8                       |
	| character_set_database   | utf8                       |
	| character_set_filesystem | binary                     |
	| character_set_results    | utf8                       |
	| character_set_server     | utf8                       |
	| character_set_system     | utf8                       |
	| character_sets_dir       | /usr/share/mysql/charsets/ |
	+--------------------------+----------------------------+
	
	[root@centos7 ~]# export LANG=zh_CN.GBK
	[root@centos7 ~]# mysql -e "show variables like 'character%'"
	+--------------------------+----------------------------+
	| Variable_name            | Value                      |
	+--------------------------+----------------------------+
	| character_set_client     | gbk                        |
	| character_set_connection | gbk                        |
	| character_set_database   | utf8                       |
	| character_set_filesystem | binary                     |
	| character_set_results    | gbk                        |
	| character_set_server     | utf8                       |
	| character_set_system     | utf8                       |
	| character_sets_dir       | /usr/share/mysql/charsets/ |
	+--------------------------+----------------------------+
	
	[root@centos7 ~]# export LANG=zh_CN.GB2312
	[root@centos7 ~]# mysql -e "show variables like 'character%'"
	+--------------------------+----------------------------+
	| Variable_name            | Value                      |
	+--------------------------+----------------------------+
	| character_set_client     | gb2312                     |
	| character_set_connection | gb2312                     |
	| character_set_database   | utf8                       |
	| character_set_filesystem | binary                     |
	| character_set_results    | gb2312                     |
	| character_set_server     | utf8                       |
	| character_set_system     | utf8                       |
	| character_sets_dir       | /usr/share/mysql/charsets/ |
	+--------------------------+----------------------------+

========================================================== 结论: character_set_client: 客户端发送的数据是什么编码? character_set_connection: 告诉字符集转换器,转换成什么编码? character_set_results: 查询的结果用什么编码? 如果以上三者都为字符集N,可简写为set names ‘N’;

乱码产生的原因如下: a、插入或读取时对应编码环节发生转换导致数据丢失。 b、如果两个字符集之间无法进行无损编码转换,一定会出现乱码。

解决方案: 1、一定要保证character_set_connection字符集大于等于client字符集,否则会丢失数据 比如: latin1 < gb2312 < gbk < utf8, 若设置set character_set_client = gb2312, 那么至少connection的字符集要大于等于gb2312,否则就会丢失数据 2、一定要保证character_set_results大于等于数据存入的字符集,否则会丢失数据 比如:如存储的字符为utf8,而返回character_set_results为gbk,数据被截断

3、所有变量使用统一的字符编码,如utf8或者utf8mb4

本文参与 腾讯云自媒体分享计划,分享自作者个人站点/博客。
原始发表:2019年06月07日,如有侵权请联系 cloudcommunity@tencent.com 删除

本文分享自 作者个人站点/博客 前往查看

如有侵权,请联系 cloudcommunity@tencent.com 删除。

本文参与 腾讯云自媒体分享计划  ,欢迎热爱写作的你一起参与!

评论
登录后参与评论
0 条评论
热度
最新
推荐阅读
相关产品与服务
云数据库 SQL Server
腾讯云数据库 SQL Server (TencentDB for SQL Server)是业界最常用的商用数据库之一,对基于 Windows 架构的应用程序具有完美的支持。TencentDB for SQL Server 拥有微软正版授权,可持续为用户提供最新的功能,避免未授权使用软件的风险。具有即开即用、稳定可靠、安全运行、弹性扩缩等特点。
领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档