专栏首页乐沙弥的世界MySQL字符集乱码

MySQL字符集乱码

版权声明:本文为博主原创文章,欢迎扩散,扩散请务必注明出处。 https://blog.csdn.net/robinson_0612/article/details/91175314

MySQL数据库查询结果乱码,这是大家比较常见的情形。到底是什么原因导致出现查询结果为乱码呢,本文主要通过演示来理解乱码产生的原因,以及如何解决字符集乱码,供大家参考。

一、字符编码对比

	SELECT hex(convert('love' USING latin1)) latin_value,
	       hex(convert('love' USING gb2312)) gb2312_value,
	       hex(convert('love' USING gbk))    gbk_value,
	       hex(convert('love' USING utf8))   utf8_value;
	+-------------+--------------+-----------+------------+
	| latin_value | gb2312_value | gbk_value | utf8_value |
	+-------------+--------------+-----------+------------+
	| 6C6F7665    | 6C6F7665     | 6C6F7665  | 6C6F7665   |
	+-------------+--------------+-----------+------------+
	
	SELECT hex(convert('爱' USING latin1)) latin_value,
	       hex(convert('爱' USING gb2312)) gb2312_value,
	       hex(convert('爱' USING gbk))    gbk_value,
	       hex(convert('爱' USING utf8))   utf8_value;
	+-------------+--------------+-----------+------------+
	| latin_value | gb2312_value | gbk_value | utf8_value |
	+-------------+--------------+-----------+------------+
	| 3F          | B0AE         | B0AE      | E788B1     |
	+-------------+--------------+-----------+------------+  
	
	SELECT convert(0x3F USING latin1)   latin_value,
	       convert(0xB0AE USING gb2312) gb2312_value,
	       convert(0xB0AE USING gbk)    gbk_value,
	       convert(0xE788B1 USING utf8) utf8_value;
   
	+-------------+--------------+-----------+------------+
	| latin_value | gb2312_value | gbk_value | utf8_value |
	+-------------+--------------+-----------+------------+
	| ?           | 爱           | 爱        | 爱         |
	+-------------+--------------+-----------+------------+

二、乱码测试

1、环境准备

	# grep -Ev "^#|^$" /etc/my.cnf   -- 查看当前my.cnf配置
	
	mysql> show variables like 'version';
	+---------------+------------+
	| Variable_name | Value      |
	+---------------+------------+
	| version       | 5.7.23-log |
	+---------------+------------+

	mysql> show variables like '%character%';
	
	+--------------------------+----------------------------+
	| Variable_name            | Value                      |
	+--------------------------+----------------------------+
	| character_set_client     | gbk                        |
	| character_set_connection | gbk                        |
	| character_set_database   | latin1                     |
	| character_set_filesystem | binary                     |
	| character_set_results    | gbk                        |
	| character_set_server     | latin1                     |
	| character_set_system     | utf8                       |
	| character_sets_dir       | /usr/share/mysql/charsets/ |
	+--------------------------+----------------------------+

	DROP TABLE IF EXISTS sakila.colum_charset;
	
	CREATE TABLE sakila.colum_charset
	(
	   id int not null auto_increment primary key,
	   c1 varchar(20),
	   c2 char(20) CHAR SET gbk,
	   c3 varchar(20) CHARSET gb2312,
	   c4 char(20) CHARACTER SET utf8,
	   c5 varchar(20) CHARSET utf8mb4
	);
	
	mysql> show create table sakila.colum_charset\G
	*************************** 1. row ***************************
	       Table: colum_charset
	Create Table: CREATE TABLE `colum_charset` (
	  `id` int(11) NOT NULL AUTO_INCREMENT,
	  `c1` varchar(20) DEFAULT NULL,
	  `c2` char(20) CHARACTER SET gbk DEFAULT NULL,
	  `c3` varchar(20) CHARACTER SET gb2312 DEFAULT NULL,
	  `c4` char(20) CHARACTER SET utf8 DEFAULT NULL,
	  `c5` varchar(20) CHARACTER SET utf8mb4 DEFAULT NULL,
	  PRIMARY KEY (`id`)
	) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=latin1

2、基于默认字符插入数据(gbk) – character_set_client gbk – character_set_connection gbk – character_set_results gbk

	INSERT INTO sakila.colum_charset(id,
	                                 c1,
	                                 c2,
	                                 c3,
	                                 c4,
	                                 c5)
	VALUES (NULL,'爱','爱','爱','爱','爱');
	        
	ERROR 1366 (HY000): Incorrect string value: '\xB0\xAE' for column 'c1' at row 1

	INSERT INTO sakila.colum_charset(id,
	                                 c1,
	                                 c2,
	                                 c3,
	                                 c4,
	                                 c5)
	VALUES (NULL,'love','爱','爱','爱','爱');
	Query OK, 1 row affected (0.00 sec)
	
	mysql> select * from sakila.colum_charset;
	+----+------+------+------+------+------+
	| id | c1   | c2   | c3   | c4   | c5   |
	+----+------+------+------+------+------+
	|  1 | love | 爱   | 爱   | 爱   | 爱   |
	+----+------+------+------+------+------+

3、三个变量全部设置为utf8插入数据 mysql> set names ‘utf8’; Query OK, 0 rows affected (0.00 sec)

	mysql> show variables like '%character%';
	+--------------------------+----------------------------+
	| Variable_name            | Value                      |
	+--------------------------+----------------------------+
	| character_set_client     | utf8                       |
	| character_set_connection | utf8                       |
	| character_set_database   | latin1                     |
	| character_set_filesystem | binary                     |
	| character_set_results    | utf8                       |
	| character_set_server     | latin1                     |
	| character_set_system     | utf8                       |
	| character_sets_dir       | /usr/share/mysql/charsets/ |
	+--------------------------+----------------------------+
	
	INSERT INTO sakila.colum_charset(id,
	                                 c1,
	                                 c2,
	                                 c3,
	                                 c4,
	                                 c5)
	VALUES (NULL,'heart','心','心','心','心');        
	Query OK, 1 row affected (0.00 sec)
	
	mysql> select * from sakila.colum_charset;
	+----+-------+------+------+------+------+
	| id | c1    | c2   | c3   | c4   | c5   |
	+----+-------+------+------+------+------+
	|  1 | love  | 爱   | 爱   | 爱   | 爱   |
	|  2 | heart | 心   | 心   | 心   | 心   |
	+----+-------+------+------+------+------+

	INSERT INTO sakila.colum_charset(id,
	                                 c1,
	                                 c2,
	                                 c3,
	                                 c4,
	                                 c5)
	VALUES (NULL,'heart','屌','屌','屌','屌');  -- c3列为gb2312编码
	        
	ERROR 1366 (HY000): Incorrect string value: '\xE5\xB1\x8C' for column 'c3' at row 1

4、单个变量character_set_connection设置为latin1插入数据 mysql> set character_set_connection=latin1;

	INSERT INTO sakila.colum_charset(id,
	                                 c1,
	                                 c2,
	                                 c3,
	                                 c4,
	                                 c5)
	VALUES (NULL,'heart','情','情','情','情');
	Query OK, 1 row affected, 4 warnings (0.00 sec)
	
	mysql> show warnings \G
	*************************** 1. row ***************************
	  Level: Warning
	   Code: 1300
	Message: Invalid utf8 character string: '\xE6\x83\x85'
	
	-- 乱码出现
	mysql> select * from sakila.colum_charset;
	+----+-------+------+------+------+------+
	| id | c1    | c2   | c3   | c4   | c5   |
	+----+-------+------+------+------+------+
	|  1 | love  | 爱   | 爱   | 爱   | 爱   |
	|  2 | heart | 心   | 心   | 心   | 心   |
	|  3 | heart | ?    | ?    | ?    | ?    |
	+----+-------+------+------+------+------+        

5、单个变量character_set_connection设置为gb2312插入数据 mysql> set character_set_connection=gb2312;

	INSERT INTO sakila.colum_charset(id,
	                                 c1,
	                                 c2,
	                                 c3,
	                                 c4,
	                                 c5)
	VALUES (NULL,'heart','屌','屌','屌','屌');
	
	Query OK, 1 row affected, 4 warnings (0.00 sec)
    
	mysql> select * from sakila.colum_charset;
	+----+-------+------+------+------+------+
	| id | c1    | c2   | c3   | c4   | c5   |
	+----+-------+------+------+------+------+
	|  1 | love  | 爱   | 爱   | 爱   | 爱   |
	|  2 | heart | 心   | 心   | 心   | 心   |
	|  3 | heart | ?    | ?    | ?    | ?    |
	|  4 | heart | ?    | ?    | ?    | ?    |
	+----+-------+------+------+------+------+

6、单个变量character_set_results设置为latin1 – 测试返回数据

	mysql> set character_set_results=latin1;
	Query OK, 0 rows affected (0.00 sec)
	
	mysql> select * from sakila.colum_charset;
	+----+-------+------+------+------+------+
	| id | c1    | c2   | c3   | c4   | c5   |
	+----+-------+------+------+------+------+
	|  1 | love  | ?    | ?    | ?    | ?    |
	|  2 | heart | ?    | ?    | ?    | ?    |
	|  3 | heart | ?    | ?    | ?    | ?    |
	|  4 | heart | ?    | ?    | ?    | ?    |
	+----+-------+------+------+------+------+
	4 rows in set (0.00 sec)

6、单个变量character_set_results设置为gb2312 – 测试返回数据

	mysql> set character_set_results=gb2312;
	Query OK, 0 rows affected (0.00 sec)
	
	mysql> select * from sakila.colum_charset;
	+----+-------+------+------+------+------+
	| id | c1    | c2   | c3   | c4   | c5   |
	+----+-------+------+------+------+------+
	|  1 | love  |    |    |    |    |
	|  2 | heart |    |    |    |    |
	|  3 | heart | ?    | ?    | ?    | ?    |
	|  4 | heart | ?    | ?    | ?    | ?    |
	+----+-------+------+------+------+------+
	4 rows in set (0.00 sec)

6、单个变量character_set_results设置为gbk – 测试返回数据 mysql> set character_set_results=gbk; Query OK, 0 rows affected (0.00 sec)

	mysql> select * from sakila.colum_charset;
	+----+-------+------+------+------+------+
	| id | c1    | c2   | c3   | c4   | c5   |
	+----+-------+------+------+------+------+
	|  1 | love  |    |    |    |    |
	|  2 | heart |    |    |    |    |
	|  3 | heart | ?    | ?    | ?    | ?    |
	|  4 | heart | ?    | ?    | ?    | ?    |
	+----+-------+------+------+------+------+
	4 rows in set (0.00 sec)

7、单个变量character_set_results设置为utf8 – 测试返回数据

	mysql> set character_set_results=utf8;
	Query OK, 0 rows affected (0.00 sec)
	
	mysql> select * from sakila.colum_charset;
	+----+-------+------+------+------+------+
	| id | c1    | c2   | c3   | c4   | c5   |
	+----+-------+------+------+------+------+
	|  1 | love  | 爱   | 爱   | 爱   | 爱   |
	|  2 | heart | 心   | 心   | 心   | 心   |
	|  3 | heart | ?    | ?    | ?    | ?    |
	|  4 | heart | ?    | ?    | ?    | ?    |
	+----+-------+------+------+------+------+
	4 rows in set (0.00 sec)             

8、本地环境变量影响客户端字符集设定 – 在my.cnf中未配置客户端字符集,如果配置后,则使用配置文件中设定的字符集

	[root@centos7 ~]# export LANG=en_US.UTF-8
	[root@centos7 ~]# mysql -e "show variables like 'character%'"
	+--------------------------+----------------------------+
	| Variable_name            | Value                      |
	+--------------------------+----------------------------+
	| character_set_client     | utf8                       |
	| character_set_connection | utf8                       |
	| character_set_database   | utf8                       |
	| character_set_filesystem | binary                     |
	| character_set_results    | utf8                       |
	| character_set_server     | utf8                       |
	| character_set_system     | utf8                       |
	| character_sets_dir       | /usr/share/mysql/charsets/ |
	+--------------------------+----------------------------+
	
	[root@centos7 ~]# export LANG=zh_CN.GBK
	[root@centos7 ~]# mysql -e "show variables like 'character%'"
	+--------------------------+----------------------------+
	| Variable_name            | Value                      |
	+--------------------------+----------------------------+
	| character_set_client     | gbk                        |
	| character_set_connection | gbk                        |
	| character_set_database   | utf8                       |
	| character_set_filesystem | binary                     |
	| character_set_results    | gbk                        |
	| character_set_server     | utf8                       |
	| character_set_system     | utf8                       |
	| character_sets_dir       | /usr/share/mysql/charsets/ |
	+--------------------------+----------------------------+
	
	[root@centos7 ~]# export LANG=zh_CN.GB2312
	[root@centos7 ~]# mysql -e "show variables like 'character%'"
	+--------------------------+----------------------------+
	| Variable_name            | Value                      |
	+--------------------------+----------------------------+
	| character_set_client     | gb2312                     |
	| character_set_connection | gb2312                     |
	| character_set_database   | utf8                       |
	| character_set_filesystem | binary                     |
	| character_set_results    | gb2312                     |
	| character_set_server     | utf8                       |
	| character_set_system     | utf8                       |
	| character_sets_dir       | /usr/share/mysql/charsets/ |
	+--------------------------+----------------------------+

========================================================== 结论: character_set_client: 客户端发送的数据是什么编码? character_set_connection: 告诉字符集转换器,转换成什么编码? character_set_results: 查询的结果用什么编码? 如果以上三者都为字符集N,可简写为set names ‘N’;

乱码产生的原因如下: a、插入或读取时对应编码环节发生转换导致数据丢失。 b、如果两个字符集之间无法进行无损编码转换,一定会出现乱码。

解决方案: 1、一定要保证character_set_connection字符集大于等于client字符集,否则会丢失数据 比如: latin1 < gb2312 < gbk < utf8, 若设置set character_set_client = gb2312, 那么至少connection的字符集要大于等于gb2312,否则就会丢失数据 2、一定要保证character_set_results大于等于数据存入的字符集,否则会丢失数据 比如:如存储的字符为utf8,而返回character_set_results为gbk,数据被截断

3、所有变量使用统一的字符编码,如utf8或者utf8mb4

本文参与腾讯云自媒体分享计划,欢迎正在阅读的你也加入,一起分享。

我来说两句

0 条评论
登录 后参与评论

相关文章

  • MySQL字符集乱码总结

    查看数据库编码: show create database db_name; 查看表编码: show create table tbl_name; 查看字段...

    阳光岛主
  • 第09期:有关 MySQL 字符集的乱码问题

    相信大家通过前几篇文章,已经了解了 MySQL 字符集使用相关注意事项。那么数据乱码问题在这儿显得就非常简单了,或许说可能不会出现这样的问题。

    爱可生开源社区
  • MySQL字符集中文乱码剖析

    MySQL创建数据表时,分别采用gbk、utf8、latin1的编码方式,当不同编码方式存储相同字符时,是否会显示乱码呢?本文通过案例描述,探讨数据库中的字符串...

    serena
  • MySQL的字符集和乱码问题

    #字符编码:就是人类使用的英文字母、汉字、特殊符号等信息,通过转换规则,将其转换为计算机可以识别的二进制数字的一种编码方式

    老油条IT记
  • MySQL从删库到跑路(二)——MySQL字符集与乱码解析

    字符(Character)是各种文字和符号的总称,包括各国家文字、标点符号、图形符号、数字等。 字符集(Character set)是多个字符的集合,字符集种类...

    良月柒
  • mysql字符集配置&mysql中文乱码

    这两天重置了下自己的电脑系统,一个ubuntu,另外一个当然就是windows。 不过在运行程序的时候发现,出现了很多的“????”,也就是乱码字符。毫无疑...

    roobtyan
  • MySQL 案例:乱码,字符集与错入错出的 MySQL

    “数据库的数据变成乱码了!”---想必不少 DBA 们对类似的“呼救”不算太陌生。一般来说这类问题都是字符集的设置有关,同时在 MySQL 中也存在“错入错出”...

    王文安@DBA
  • 再见乱码:5 分钟读懂 MySQL 字符集设置

    作者: 程序猿小卡_casper 原文:https://segmentfault.com/a/1190000012775484 一、内容概述 在MySQL的使...

    程序员宝库
  • 再见乱码:5 分钟读懂 MySQL 字符集设置

    作者: 程序猿小卡_casper 原文:https://segmentfault.com/a/1190000012775484 一、内容概述 在MySQL的使用...

    企鹅号小编

扫码关注云+社区

领取腾讯云代金券