Learning MySQL Character Sets

heidsoft
Published 2023-03-18 17:24:51
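  1. Mapping characters to binary data is called encoding; mapping binary data back to characters is called decoding.
  2. ASCII character set: 128 characters, including the space, punctuation marks, digits, uppercase and lowercase letters, and non-printable control characters.
  3. ISO 8859-1 character set: 256 code points. It extends ASCII with 128 more positions that cover characters commonly used in Western Europe (including German and French letters). Every character is encoded in a single byte. It is also known as Latin1.
  4. GB2312 character set: contains Chinese characters as well as Latin letters, Greek letters, Japanese kana, Russian Cyrillic letters, and so on. A character that is also in ASCII is encoded in one byte; any other character is encoded in two bytes.
  5. GBK character set: extends GB2312 with more characters. Its encoding is compatible with GB2312.
  6. UTF-8: covers the characters used in countries and regions around the world and is still being extended (strictly speaking, the character set is Unicode and UTF-8 is one of its encodings). It is compatible with ASCII and uses a variable-length encoding of 1 to 4 bytes per character.
  7. MySQL does not distinguish between the concepts of character set and encoding scheme.
  8. MySQL utf8mb3: a cut-down UTF-8 that uses only 1 to 3 bytes per character, so it cannot store characters that need 4 bytes.
  9. MySQL utf8mb4: the full UTF-8, using 1 to 4 bytes per character.
  10. In MySQL, utf8 is an alias for utf8mb3 (in the output below, the utf8 row has a Maxlen of 3).
  11. To store emoji in MySQL, use utf8mb4.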
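
The byte counts in the notes above can be checked outside MySQL. What follows is a minimal Python sketch, not part of the original notes. It assumes the standard-library codecs `ascii`, `latin-1`, `gb2312`, `gbk`, and `utf-8`, and the helper names `byte_length` and `needs_utf8mb4` are made up for illustration. It reports how many bytes each encoding needs for a few sample characters, and flags characters that take 4 bytes in UTF-8 and therefore would not fit in utf8mb3.

```python
from typing import Optional

ENCODINGS = ["ascii", "latin-1", "gb2312", "gbk", "utf-8"]

# Sample characters written as escapes so the file encoding does not matter:
# "A", e with acute accent, the Chinese character zhong, and a smiley emoji.
SAMPLES = ["A", "\u00e9", "\u4e2d", "\U0001F600"]


def byte_length(text: str, encoding: str) -> Optional[int]:
    """Return the encoded size in bytes, or None if the encoding cannot represent the text."""
    try:
        return len(text.encode(encoding))
    except UnicodeEncodeError:
        return None


def needs_utf8mb4(text: str) -> bool:
    """Return True if any character takes 4 bytes in UTF-8 (code point above U+FFFF)."""
    return any(ord(ch) > 0xFFFF for ch in text)


if __name__ == "__main__":
    # Encoding maps characters to bytes; decoding maps the bytes back.
    raw = "\u4e2d".encode("utf-8")
    assert raw.decode("utf-8") == "\u4e2d"

    for sample in SAMPLES:
        sizes = {enc: byte_length(sample, enc) for enc in ENCODINGS}
        print(f"U+{ord(sample):04X}", sizes, "needs utf8mb4:", needs_utf8mb4(sample))
```

In this sketch the Chinese character takes 2 bytes in GB2312 and GBK but 3 bytes in UTF-8, and the emoji takes 4 bytes in UTF-8, which is why it needs utf8mb4 rather than utf8mb3.
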
mysql> SHOW CHARSET;
+----------+---------------------------------+---------------------+--------+
| Charset  | Description                     | Default collation   | Maxlen |
+----------+---------------------------------+---------------------+--------+
| armscii8 | ARMSCII-8 Armenian              | armscii8_general_ci |      1 |
| ascii    | US ASCII                        | ascii_general_ci    |      1 |
| big5     | Big5 Traditional Chinese        | big5_chinese_ci     |      2 |
| binary   | Binary pseudo charset           | binary              |      1 |
| cp1250   | Windows Central European        | cp1250_general_ci   |      1 |
| cp1251   | Windows Cyrillic                | cp1251_general_ci   |      1 |
| cp1256   | Windows Arabic                  | cp1256_general_ci   |      1 |
| cp1257   | Windows Baltic                  | cp1257_general_ci   |      1 |
| cp850    | DOS West European               | cp850_general_ci    |      1 |
| cp852    | DOS Central European            | cp852_general_ci    |      1 |
| cp866    | DOS Russian                     | cp866_general_ci    |      1 |
| cp932    | SJIS for Windows Japanese       | cp932_japanese_ci   |      2 |
| dec8     | DEC West European               | dec8_swedish_ci     |      1 |
| eucjpms  | UJIS for Windows Japanese       | eucjpms_japanese_ci |      3 |
| euckr    | EUC-KR Korean                   | euckr_korean_ci     |      2 |
| gb18030  | China National Standard GB18030 | gb18030_chinese_ci  |      4 |
| gb2312   | GB2312 Simplified Chinese       | gb2312_chinese_ci   |      2 |
| gbk      | GBK Simplified Chinese          | gbk_chinese_ci      |      2 |
| geostd8  | GEOSTD8 Georgian                | geostd8_general_ci  |      1 |
| greek    | ISO 8859-7 Greek                | greek_general_ci    |      1 |
| hebrew   | ISO 8859-8 Hebrew               | hebrew_general_ci   |      1 |
| hp8      | HP West European                | hp8_english_ci      |      1 |
| keybcs2  | DOS Kamenicky Czech-Slovak      | keybcs2_general_ci  |      1 |
| koi8r    | KOI8-R Relcom Russian           | koi8r_general_ci    |      1 |
| koi8u    | KOI8-U Ukrainian                | koi8u_general_ci    |      1 |
| latin1   | cp1252 West European            | latin1_swedish_ci   |      1 |
| latin2   | ISO 8859-2 Central European     | latin2_general_ci   |      1 |
| latin5   | ISO 8859-9 Turkish              | latin5_turkish_ci   |      1 |
| latin7   | ISO 8859-13 Baltic              | latin7_general_ci   |      1 |
| macce    | Mac Central European            | macce_general_ci    |      1 |
| macroman | Mac West European               | macroman_general_ci |      1 |
| sjis     | Shift-JIS Japanese              | sjis_japanese_ci    |      2 |
| swe7     | 7bit Swedish                    | swe7_swedish_ci     |      1 |
| tis620   | TIS620 Thai                     | tis620_thai_ci      |      1 |
| ucs2     | UCS-2 Unicode                   | ucs2_general_ci     |      2 |
| ujis     | EUC-JP Japanese                 | ujis_japanese_ci    |      3 |
| utf16    | UTF-16 Unicode                  | utf16_general_ci    |      4 |
| utf16le  | UTF-16LE Unicode                | utf16le_general_ci  |      4 |
| utf32    | UTF-32 Unicode                  | utf32_general_ci    |      4 |
| utf8     | UTF-8 Unicode                   | utf8_general_ci     |      3 |
| utf8mb4  | UTF-8 Unicode                   | utf8mb4_0900_ai_ci  |      4 |
+----------+---------------------------------+---------------------+--------+
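
The Maxlen column is the maximum number of bytes that one character can occupy in that character set. As a rough illustration, the Python sketch below multiplies a column length in characters by Maxlen to get an upper bound on the size of the character data. The helper `worst_case_bytes` is hypothetical, the values are copied from a few rows of the output above, and the sketch ignores length prefixes and any other storage overhead.

```python
# Maxlen values copied from a subset of the SHOW CHARSET rows above.
MAXLEN = {
    "ascii": 1,
    "latin1": 1,
    "gb2312": 2,
    "gbk": 2,
    "utf8": 3,
    "utf8mb4": 4,
    "gb18030": 4,
}


def worst_case_bytes(n_chars: int, charset: str) -> int:
    """Upper bound on the bytes needed to hold n_chars characters in charset."""
    if n_chars < 0:
        raise ValueError("n_chars must be non-negative")
    return n_chars * MAXLEN[charset]


if __name__ == "__main__":
    # Compare the same 255-character column under different character sets.
    for charset in ("latin1", "gbk", "utf8", "utf8mb4"):
        print(charset, worst_case_bytes(255, charset))
```
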

Character set collations (comparison rules)

mysql> SHOW COLLATION;
+----------------------------+----------+-----+---------+----------+---------+---------------+
| Collation                  | Charset  | Id  | Default | Compiled | Sortlen | Pad_attribute |
+----------------------------+----------+-----+---------+----------+---------+---------------+
| armscii8_bin               | armscii8 |  64 |         | Yes      |       1 | PAD SPACE     |
| armscii8_general_ci        | armscii8 |  32 | Yes     | Yes      |       1 | PAD SPACE     |
| ascii_bin                  | ascii    |  65 |         | Yes      |       1 | PAD SPACE     |
| ascii_general_ci           | ascii    |  11 | Yes     | Yes      |       1 | PAD SPACE     |
| big5_bin                   | big5     |  84 |         | Yes      |       1 | PAD SPACE     |
| big5_chinese_ci            | big5     |   1 | Yes     | Yes      |       1 | PAD SPACE     |
| binary                     | binary   |  63 | Yes     | Yes      |       1 | NO PAD        |
| cp1250_bin                 | cp1250   |  66 |         | Yes      |       1 | PAD SPACE     |
| cp1250_croatian_ci         | cp1250   |  44 |         | Yes      |       1 | PAD SPACE     |
| cp1250_czech_cs            | cp1250   |  34 |         | Yes      |       2 | PAD SPACE     |
| cp1250_general_ci          | cp1250   |  26 | Yes     | Yes      |       1 | PAD SPACE     |
| cp1250_polish_ci           | cp1250   |  99 |         | Yes      |       1 | PAD SPACE     |
| cp1251_bin                 | cp1251   |  50 |         | Yes      |       1 | PAD SPACE     |
| cp1251_bulgarian_ci        | cp1251   |  14 |         | Yes      |       1 | PAD SPACE     |
| cp1251_general_ci          | cp1251   |  51 | Yes     | Yes      |       1 | PAD SPACE     |
| cp1251_general_cs          | cp1251   |  52 |         | Yes      |       1 | PAD SPACE     |
| cp1251_ukrainian_ci        | cp1251   |  23 |         | Yes      |       1 | PAD SPACE     |
| cp1256_bin                 | cp1256   |  67 |         | Yes      |       1 | PAD SPACE     |
| cp1256_general_ci          | cp1256   |  57 | Yes     | Yes      |       1 | PAD SPACE     |
| cp1257_bin                 | cp1257   |  58 |         | Yes      |       1 | PAD SPACE     |
| cp1257_general_ci          | cp1257   |  59 | Yes     | Yes      |       1 | PAD SPACE     |
| cp1257_lithuanian_ci       | cp1257   |  29 |         | Yes      |       1 | PAD SPACE     |
| cp850_bin                  | cp850    |  80 |         | Yes      |       1 | PAD SPACE     |
| cp850_general_ci           | cp850    |   4 | Yes     | Yes      |       1 | PAD SPACE     |
| cp852_bin                  | cp852    |  81 |         | Yes      |       1 | PAD SPACE     |
| cp852_general_ci           | cp852    |  40 | Yes     | Yes      |       1 | PAD SPACE     |
| cp866_bin                  | cp866    |  68 |         | Yes      |       1 | PAD SPACE     |
| cp866_general_ci           | cp866    |  36 | Yes     | Yes      |       1 | PAD SPACE     |
| cp932_bin                  | cp932    |  96 |         | Yes      |       1 | PAD SPACE     |
| cp932_japanese_ci          | cp932    |  95 | Yes     | Yes      |       1 | PAD SPACE     |
| dec8_bin                   | dec8     |  69 |         | Yes      |       1 | PAD SPACE     |
| dec8_swedish_ci            | dec8     |   3 | Yes     | Yes      |       1 | PAD SPACE     |
| eucjpms_bin                | eucjpms  |  98 |         | Yes      |       1 | PAD SPACE     |
| eucjpms_japanese_ci        | eucjpms  |  97 | Yes     | Yes      |       1 | PAD SPACE     |
| euckr_bin                  | euckr    |  85 |         | Yes      |       1 | PAD SPACE     |
| euckr_korean_ci            | euckr    |  19 | Yes     | Yes      |       1 | PAD SPACE     |
| gb18030_bin                | gb18030  | 249 |         | Yes      |       1 | PAD SPACE     |
| gb18030_chinese_ci         | gb18030  | 248 | Yes     | Yes      |       2 | PAD SPACE     |
| gb18030_unicode_520_ci     | gb18030  | 250 |         | Yes      |       8 | PAD SPACE     |
| gb2312_bin                 | gb2312   |  86 |         | Yes      |       1 | PAD SPACE     |
| gb2312_chinese_ci          | gb2312   |  24 | Yes     | Yes      |       1 | PAD SPACE     |
| gbk_bin                    | gbk      |  87 |         | Yes      |       1 | PAD SPACE     |
| gbk_chinese_ci             | gbk      |  28 | Yes     | Yes      |       1 | PAD SPACE     |
| geostd8_bin                | geostd8  |  93 |         | Yes      |       1 | PAD SPACE     |
| geostd8_general_ci         | geostd8  |  92 | Yes     | Yes      |       1 | PAD SPACE     |
| greek_bin                  | greek    |  70 |         | Yes      |       1 | PAD SPACE     |
| greek_general_ci           | greek    |  25 | Yes     | Yes      |       1 | PAD SPACE     |
| hebrew_bin                 | hebrew   |  71 |         | Yes      |       1 | PAD SPACE     |
| hebrew_general_ci          | hebrew   |  16 | Yes     | Yes      |       1 | PAD SPACE     |
| hp8_bin                    | hp8      |  72 |         | Yes      |       1 | PAD SPACE     |
| hp8_english_ci             | hp8      |   6 | Yes     | Yes      |       1 | PAD SPACE     |
| keybcs2_bin                | keybcs2  |  73 |         | Yes      |       1 | PAD SPACE     |
| keybcs2_general_ci         | keybcs2  |  37 | Yes     | Yes      |       1 | PAD SPACE     |
| koi8r_bin                  | koi8r    |  74 |         | Yes      |       1 | PAD SPACE     |
| koi8r_general_ci           | koi8r    |   7 | Yes     | Yes      |       1 | PAD SPACE     |
| koi8u_bin                  | koi8u    |  75 |         | Yes      |       1 | PAD SPACE     |
| koi8u_general_ci           | koi8u    |  22 | Yes     | Yes      |       1 | PAD SPACE     |
| latin1_bin                 | latin1   |  47 |         | Yes      |       1 | PAD SPACE     |
| latin1_danish_ci           | latin1   |  15 |         | Yes      |       1 | PAD SPACE     |
| latin1_general_ci          | latin1   |  48 |         | Yes      |       1 | PAD SPACE     |
| latin1_general_cs          | latin1   |  49 |         | Yes      |       1 | PAD SPACE     |
| latin1_german1_ci          | latin1   |   5 |         | Yes      |       1 | PAD SPACE     |
| latin1_german2_ci          | latin1   |  31 |         | Yes      |       2 | PAD SPACE     |
| latin1_spanish_ci          | latin1   |  94 |         | Yes      |       1 | PAD SPACE     |
| latin1_swedish_ci          | latin1   |   8 | Yes     | Yes      |       1 | PAD SPACE     |
| latin2_bin                 | latin2   |  77 |         | Yes      |       1 | PAD SPACE     |
| latin2_croatian_ci         | latin2   |  27 |         | Yes      |       1 | PAD SPACE     |
| latin2_czech_cs            | latin2   |   2 |         | Yes      |       4 | PAD SPACE     |
| latin2_general_ci          | latin2   |   9 | Yes     | Yes      |       1 | PAD SPACE     |
| latin2_hungarian_ci        | latin2   |  21 |         | Yes      |       1 | PAD SPACE     |
| latin5_bin                 | latin5   |  78 |         | Yes      |       1 | PAD SPACE     |
| latin5_turkish_ci          | latin5   |  30 | Yes     | Yes      |       1 | PAD SPACE     |
| latin7_bin                 | latin7   |  79 |         | Yes      |       1 | PAD SPACE     |
| latin7_estonian_cs         | latin7   |  20 |         | Yes      |       1 | PAD SPACE     |
| latin7_general_ci          | latin7   |  41 | Yes     | Yes      |       1 | PAD SPACE     |
| latin7_general_cs          | latin7   |  42 |         | Yes      |       1 | PAD SPACE     |
| macce_bin                  | macce    |  43 |         | Yes      |       1 | PAD SPACE     |
| macce_general_ci           | macce    |  38 | Yes     | Yes      |       1 | PAD SPACE     |
| macroman_bin               | macroman |  53 |         | Yes      |       1 | PAD SPACE     |
| macroman_general_ci        | macroman |  39 | Yes     | Yes      |       1 | PAD SPACE     |
| sjis_bin                   | sjis     |  88 |         | Yes      |       1 | PAD SPACE     |
| sjis_japanese_ci           | sjis     |  13 | Yes     | Yes      |       1 | PAD SPACE     |
| swe7_bin                   | swe7     |  82 |         | Yes      |       1 | PAD SPACE     |
| swe7_swedish_ci            | swe7     |  10 | Yes     | Yes      |       1 | PAD SPACE     |
| tis620_bin                 | tis620   |  89 |         | Yes      |       1 | PAD SPACE     |
| tis620_thai_ci             | tis620   |  18 | Yes     | Yes      |       4 | PAD SPACE     |
| ucs2_bin                   | ucs2     |  90 |         | Yes      |       1 | PAD SPACE     |
| ucs2_croatian_ci           | ucs2     | 149 |         | Yes      |       8 | PAD SPACE     |
| ucs2_czech_ci              | ucs2     | 138 |         | Yes      |       8 | PAD SPACE     |
| ucs2_danish_ci             | ucs2     | 139 |         | Yes      |       8 | PAD SPACE     |
| ucs2_esperanto_ci          | ucs2     | 145 |         | Yes      |       8 | PAD SPACE     |
| ucs2_estonian_ci           | ucs2     | 134 |         | Yes      |       8 | PAD SPACE     |
| ucs2_general_ci            | ucs2     |  35 | Yes     | Yes      |       1 | PAD SPACE     |
| ucs2_general_mysql500_ci   | ucs2     | 159 |         | Yes      |       1 | PAD SPACE     |
| ucs2_german2_ci            | ucs2     | 148 |         | Yes      |       8 | PAD SPACE     |
| ucs2_hungarian_ci          | ucs2     | 146 |         | Yes      |       8 | PAD SPACE     |
| ucs2_icelandic_ci          | ucs2     | 129 |         | Yes      |       8 | PAD SPACE     |
| ucs2_latvian_ci            | ucs2     | 130 |         | Yes      |       8 | PAD SPACE     |
| ucs2_lithuanian_ci         | ucs2     | 140 |         | Yes      |       8 | PAD SPACE     |
| ucs2_persian_ci            | ucs2     | 144 |         | Yes      |       8 | PAD SPACE     |
| ucs2_polish_ci             | ucs2     | 133 |         | Yes      |       8 | PAD SPACE     |
| ucs2_romanian_ci           | ucs2     | 131 |         | Yes      |       8 | PAD SPACE     |
| ucs2_roman_ci              | ucs2     | 143 |         | Yes      |       8 | PAD SPACE     |
| ucs2_sinhala_ci            | ucs2     | 147 |         | Yes      |       8 | PAD SPACE     |
| ucs2_slovak_ci             | ucs2     | 141 |         | Yes      |       8 | PAD SPACE     |
| ucs2_slovenian_ci          | ucs2     | 132 |         | Yes      |       8 | PAD SPACE     |
| ucs2_spanish2_ci           | ucs2     | 142 |         | Yes      |       8 | PAD SPACE     |
| ucs2_spanish_ci            | ucs2     | 135 |         | Yes      |       8 | PAD SPACE     |
| ucs2_swedish_ci            | ucs2     | 136 |         | Yes      |       8 | PAD SPACE     |
| ucs2_turkish_ci            | ucs2     | 137 |         | Yes      |       8 | PAD SPACE     |
| ucs2_unicode_520_ci        | ucs2     | 150 |         | Yes      |       8 | PAD SPACE     |
| ucs2_unicode_ci            | ucs2     | 128 |         | Yes      |       8 | PAD SPACE     |
| ucs2_vietnamese_ci         | ucs2     | 151 |         | Yes      |       8 | PAD SPACE     |
| ujis_bin                   | ujis     |  91 |         | Yes      |       1 | PAD SPACE     |
| ujis_japanese_ci           | ujis     |  12 | Yes     | Yes      |       1 | PAD SPACE     |
| utf16le_bin                | utf16le  |  62 |         | Yes      |       1 | PAD SPACE     |
| utf16le_general_ci         | utf16le  |  56 | Yes     | Yes      |       1 | PAD SPACE     |
| utf16_bin                  | utf16    |  55 |         | Yes      |       1 | PAD SPACE     |
| utf16_croatian_ci          | utf16    | 122 |         | Yes      |       8 | PAD SPACE     |
| utf16_czech_ci             | utf16    | 111 |         | Yes      |       8 | PAD SPACE     |
| utf16_danish_ci            | utf16    | 112 |         | Yes      |       8 | PAD SPACE     |
| utf16_esperanto_ci         | utf16    | 118 |         | Yes      |       8 | PAD SPACE     |
| utf16_estonian_ci          | utf16    | 107 |         | Yes      |       8 | PAD SPACE     |
| utf16_general_ci           | utf16    |  54 | Yes     | Yes      |       1 | PAD SPACE     |
| utf16_german2_ci           | utf16    | 121 |         | Yes      |       8 | PAD SPACE     |
| utf16_hungarian_ci         | utf16    | 119 |         | Yes      |       8 | PAD SPACE     |
| utf16_icelandic_ci         | utf16    | 102 |         | Yes      |       8 | PAD SPACE     |
| utf16_latvian_ci           | utf16    | 103 |         | Yes      |       8 | PAD SPACE     |
| utf16_lithuanian_ci        | utf16    | 113 |         | Yes      |       8 | PAD SPACE     |
| utf16_persian_ci           | utf16    | 117 |         | Yes      |       8 | PAD SPACE     |
| utf16_polish_ci            | utf16    | 106 |         | Yes      |       8 | PAD SPACE     |
| utf16_romanian_ci          | utf16    | 104 |         | Yes      |       8 | PAD SPACE     |
| utf16_roman_ci             | utf16    | 116 |         | Yes      |       8 | PAD SPACE     |
| utf16_sinhala_ci           | utf16    | 120 |         | Yes      |       8 | PAD SPACE     |
| utf16_slovak_ci            | utf16    | 114 |         | Yes      |       8 | PAD SPACE     |
| utf16_slovenian_ci         | utf16    | 105 |         | Yes      |       8 | PAD SPACE     |
| utf16_spanish2_ci          | utf16    | 115 |         | Yes      |       8 | PAD SPACE     |
| utf16_spanish_ci           | utf16    | 108 |         | Yes      |       8 | PAD SPACE     |
| utf16_swedish_ci           | utf16    | 109 |         | Yes      |       8 | PAD SPACE     |
| utf16_turkish_ci           | utf16    | 110 |         | Yes      |       8 | PAD SPACE     |
| utf16_unicode_520_ci       | utf16    | 123 |         | Yes      |       8 | PAD SPACE     |
| utf16_unicode_ci           | utf16    | 101 |         | Yes      |       8 | PAD SPACE     |
| utf16_vietnamese_ci        | utf16    | 124 |         | Yes      |       8 | PAD SPACE     |
| utf32_bin                  | utf32    |  61 |         | Yes      |       1 | PAD SPACE     |
| utf32_croatian_ci          | utf32    | 181 |         | Yes      |       8 | PAD SPACE     |
| utf32_czech_ci             | utf32    | 170 |         | Yes      |       8 | PAD SPACE     |
| utf32_danish_ci            | utf32    | 171 |         | Yes      |       8 | PAD SPACE     |
| utf32_esperanto_ci         | utf32    | 177 |         | Yes      |       8 | PAD SPACE     |
| utf32_estonian_ci          | utf32    | 166 |         | Yes      |       8 | PAD SPACE     |
| utf32_general_ci           | utf32    |  60 | Yes     | Yes      |       1 | PAD SPACE     |
| utf32_german2_ci           | utf32    | 180 |         | Yes      |       8 | PAD SPACE     |
| utf32_hungarian_ci         | utf32    | 178 |         | Yes      |       8 | PAD SPACE     |
| utf32_icelandic_ci         | utf32    | 161 |         | Yes      |       8 | PAD SPACE     |
| utf32_latvian_ci           | utf32    | 162 |         | Yes      |       8 | PAD SPACE     |
| utf32_lithuanian_ci        | utf32    | 172 |         | Yes      |       8 | PAD SPACE     |
| utf32_persian_ci           | utf32    | 176 |         | Yes      |       8 | PAD SPACE     |
| utf32_polish_ci            | utf32    | 165 |         | Yes      |       8 | PAD SPACE     |
| utf32_romanian_ci          | utf32    | 163 |         | Yes      |       8 | PAD SPACE     |
| utf32_roman_ci             | utf32    | 175 |         | Yes      |       8 | PAD SPACE     |
| utf32_sinhala_ci           | utf32    | 179 |         | Yes      |       8 | PAD SPACE     |
| utf32_slovak_ci            | utf32    | 173 |         | Yes      |       8 | PAD SPACE     |
| utf32_slovenian_ci         | utf32    | 164 |         | Yes      |       8 | PAD SPACE     |
| utf32_spanish2_ci          | utf32    | 174 |         | Yes      |       8 | PAD SPACE     |
| utf32_spanish_ci           | utf32    | 167 |         | Yes      |       8 | PAD SPACE     |
| utf32_swedish_ci           | utf32    | 168 |         | Yes      |       8 | PAD SPACE     |
| utf32_turkish_ci           | utf32    | 169 |         | Yes      |       8 | PAD SPACE     |
| utf32_unicode_520_ci       | utf32    | 182 |         | Yes      |       8 | PAD SPACE     |
| utf32_unicode_ci           | utf32    | 160 |         | Yes      |       8 | PAD SPACE     |
| utf32_vietnamese_ci        | utf32    | 183 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_0900_ai_ci         | utf8mb4  | 255 | Yes     | Yes      |       0 | NO PAD        |
| utf8mb4_0900_as_ci         | utf8mb4  | 305 |         | Yes      |       0 | NO PAD        |
| utf8mb4_0900_as_cs         | utf8mb4  | 278 |         | Yes      |       0 | NO PAD        |
| utf8mb4_0900_bin           | utf8mb4  | 309 |         | Yes      |       1 | NO PAD        |
| utf8mb4_bin                | utf8mb4  |  46 |         | Yes      |       1 | PAD SPACE     |
| utf8mb4_croatian_ci        | utf8mb4  | 245 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_cs_0900_ai_ci      | utf8mb4  | 266 |         | Yes      |       0 | NO PAD        |
| utf8mb4_cs_0900_as_cs      | utf8mb4  | 289 |         | Yes      |       0 | NO PAD        |
| utf8mb4_czech_ci           | utf8mb4  | 234 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_danish_ci          | utf8mb4  | 235 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_da_0900_ai_ci      | utf8mb4  | 267 |         | Yes      |       0 | NO PAD        |
| utf8mb4_da_0900_as_cs      | utf8mb4  | 290 |         | Yes      |       0 | NO PAD        |
| utf8mb4_de_pb_0900_ai_ci   | utf8mb4  | 256 |         | Yes      |       0 | NO PAD        |
| utf8mb4_de_pb_0900_as_cs   | utf8mb4  | 279 |         | Yes      |       0 | NO PAD        |
| utf8mb4_eo_0900_ai_ci      | utf8mb4  | 273 |         | Yes      |       0 | NO PAD        |
| utf8mb4_eo_0900_as_cs      | utf8mb4  | 296 |         | Yes      |       0 | NO PAD        |
| utf8mb4_esperanto_ci       | utf8mb4  | 241 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_estonian_ci        | utf8mb4  | 230 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_es_0900_ai_ci      | utf8mb4  | 263 |         | Yes      |       0 | NO PAD        |
| utf8mb4_es_0900_as_cs      | utf8mb4  | 286 |         | Yes      |       0 | NO PAD        |
| utf8mb4_es_trad_0900_ai_ci | utf8mb4  | 270 |         | Yes      |       0 | NO PAD        |
| utf8mb4_es_trad_0900_as_cs | utf8mb4  | 293 |         | Yes      |       0 | NO PAD        |
| utf8mb4_et_0900_ai_ci      | utf8mb4  | 262 |         | Yes      |       0 | NO PAD        |
| utf8mb4_et_0900_as_cs      | utf8mb4  | 285 |         | Yes      |       0 | NO PAD        |
| utf8mb4_general_ci         | utf8mb4  |  45 |         | Yes      |       1 | PAD SPACE     |
| utf8mb4_german2_ci         | utf8mb4  | 244 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_hr_0900_ai_ci      | utf8mb4  | 275 |         | Yes      |       0 | NO PAD        |
| utf8mb4_hr_0900_as_cs      | utf8mb4  | 298 |         | Yes      |       0 | NO PAD        |
| utf8mb4_hungarian_ci       | utf8mb4  | 242 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_hu_0900_ai_ci      | utf8mb4  | 274 |         | Yes      |       0 | NO PAD        |
| utf8mb4_hu_0900_as_cs      | utf8mb4  | 297 |         | Yes      |       0 | NO PAD        |
| utf8mb4_icelandic_ci       | utf8mb4  | 225 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_is_0900_ai_ci      | utf8mb4  | 257 |         | Yes      |       0 | NO PAD        |
| utf8mb4_is_0900_as_cs      | utf8mb4  | 280 |         | Yes      |       0 | NO PAD        |
| utf8mb4_ja_0900_as_cs      | utf8mb4  | 303 |         | Yes      |       0 | NO PAD        |
| utf8mb4_ja_0900_as_cs_ks   | utf8mb4  | 304 |         | Yes      |      24 | NO PAD        |
| utf8mb4_latvian_ci         | utf8mb4  | 226 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_la_0900_ai_ci      | utf8mb4  | 271 |         | Yes      |       0 | NO PAD        |
| utf8mb4_la_0900_as_cs      | utf8mb4  | 294 |         | Yes      |       0 | NO PAD        |
| utf8mb4_lithuanian_ci      | utf8mb4  | 236 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_lt_0900_ai_ci      | utf8mb4  | 268 |         | Yes      |       0 | NO PAD        |
| utf8mb4_lt_0900_as_cs      | utf8mb4  | 291 |         | Yes      |       0 | NO PAD        |
| utf8mb4_lv_0900_ai_ci      | utf8mb4  | 258 |         | Yes      |       0 | NO PAD        |
| utf8mb4_lv_0900_as_cs      | utf8mb4  | 281 |         | Yes      |       0 | NO PAD        |
| utf8mb4_persian_ci         | utf8mb4  | 240 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_pl_0900_ai_ci      | utf8mb4  | 261 |         | Yes      |       0 | NO PAD        |
| utf8mb4_pl_0900_as_cs      | utf8mb4  | 284 |         | Yes      |       0 | NO PAD        |
| utf8mb4_polish_ci          | utf8mb4  | 229 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_romanian_ci        | utf8mb4  | 227 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_roman_ci           | utf8mb4  | 239 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_ro_0900_ai_ci      | utf8mb4  | 259 |         | Yes      |       0 | NO PAD        |
| utf8mb4_ro_0900_as_cs      | utf8mb4  | 282 |         | Yes      |       0 | NO PAD        |
| utf8mb4_ru_0900_ai_ci      | utf8mb4  | 306 |         | Yes      |       0 | NO PAD        |
| utf8mb4_ru_0900_as_cs      | utf8mb4  | 307 |         | Yes      |       0 | NO PAD        |
| utf8mb4_sinhala_ci         | utf8mb4  | 243 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_sk_0900_ai_ci      | utf8mb4  | 269 |         | Yes      |       0 | NO PAD        |
| utf8mb4_sk_0900_as_cs      | utf8mb4  | 292 |         | Yes      |       0 | NO PAD        |
| utf8mb4_slovak_ci          | utf8mb4  | 237 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_slovenian_ci       | utf8mb4  | 228 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_sl_0900_ai_ci      | utf8mb4  | 260 |         | Yes      |       0 | NO PAD        |
| utf8mb4_sl_0900_as_cs      | utf8mb4  | 283 |         | Yes      |       0 | NO PAD        |
| utf8mb4_spanish2_ci        | utf8mb4  | 238 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_spanish_ci         | utf8mb4  | 231 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_sv_0900_ai_ci      | utf8mb4  | 264 |         | Yes      |       0 | NO PAD        |
| utf8mb4_sv_0900_as_cs      | utf8mb4  | 287 |         | Yes      |       0 | NO PAD        |
| utf8mb4_swedish_ci         | utf8mb4  | 232 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_tr_0900_ai_ci      | utf8mb4  | 265 |         | Yes      |       0 | NO PAD        |
| utf8mb4_tr_0900_as_cs      | utf8mb4  | 288 |         | Yes      |       0 | NO PAD        |
| utf8mb4_turkish_ci         | utf8mb4  | 233 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_unicode_520_ci     | utf8mb4  | 246 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_unicode_ci         | utf8mb4  | 224 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_vietnamese_ci      | utf8mb4  | 247 |         | Yes      |       8 | PAD SPACE     |
| utf8mb4_vi_0900_ai_ci      | utf8mb4  | 277 |         | Yes      |       0 | NO PAD        |
| utf8mb4_vi_0900_as_cs      | utf8mb4  | 300 |         | Yes      |       0 | NO PAD        |
| utf8mb4_zh_0900_as_cs      | utf8mb4  | 308 |         | Yes      |       0 | NO PAD        |
| utf8_bin                   | utf8     |  83 |         | Yes      |       1 | PAD SPACE     |
| utf8_croatian_ci           | utf8     | 213 |         | Yes      |       8 | PAD SPACE     |
| utf8_czech_ci              | utf8     | 202 |         | Yes      |       8 | PAD SPACE     |
| utf8_danish_ci             | utf8     | 203 |         | Yes      |       8 | PAD SPACE     |
| utf8_esperanto_ci          | utf8     | 209 |         | Yes      |       8 | PAD SPACE     |
| utf8_estonian_ci           | utf8     | 198 |         | Yes      |       8 | PAD SPACE     |
| utf8_general_ci            | utf8     |  33 | Yes     | Yes      |       1 | PAD SPACE     |
| utf8_general_mysql500_ci   | utf8     | 223 |         | Yes      |       1 | PAD SPACE     |
| utf8_german2_ci            | utf8     | 212 |         | Yes      |       8 | PAD SPACE     |
| utf8_hungarian_ci          | utf8     | 210 |         | Yes      |       8 | PAD SPACE     |
| utf8_icelandic_ci          | utf8     | 193 |         | Yes      |       8 | PAD SPACE     |
| utf8_latvian_ci            | utf8     | 194 |         | Yes      |       8 | PAD SPACE     |
| utf8_lithuanian_ci         | utf8     | 204 |         | Yes      |       8 | PAD SPACE     |
| utf8_persian_ci            | utf8     | 208 |         | Yes      |       8 | PAD SPACE     |
| utf8_polish_ci             | utf8     | 197 |         | Yes      |       8 | PAD SPACE     |
| utf8_romanian_ci           | utf8     | 195 |         | Yes      |       8 | PAD SPACE     |
| utf8_roman_ci              | utf8     | 207 |         | Yes      |       8 | PAD SPACE     |
| utf8_sinhala_ci            | utf8     | 211 |         | Yes      |       8 | PAD SPACE     |
| utf8_slovak_ci             | utf8     | 205 |         | Yes      |       8 | PAD SPACE     |
| utf8_slovenian_ci          | utf8     | 196 |         | Yes      |       8 | PAD SPACE     |
| utf8_spanish2_ci           | utf8     | 206 |         | Yes      |       8 | PAD SPACE     |
| utf8_spanish_ci            | utf8     | 199 |         | Yes      |       8 | PAD SPACE     |
| utf8_swedish_ci            | utf8     | 200 |         | Yes      |       8 | PAD SPACE     |
| utf8_tolower_ci            | utf8     |  76 |         | Yes      |       1 | PAD SPACE     |
| utf8_turkish_ci            | utf8     | 201 |         | Yes      |       8 | PAD SPACE     |
| utf8_unicode_520_ci        | utf8     | 214 |         | Yes      |       8 | PAD SPACE     |
| utf8_unicode_ci            | utf8     | 192 |         | Yes      |       8 | PAD SPACE     |
| utf8_vietnamese_ci         | utf8     | 215 |         | Yes      |       8 | PAD SPACE     |
+----------------------------+----------+-----+---------+----------+---------+---------------+
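  1. A collation name starts with the name of the character set it belongs to.
  2. This is followed by the language the collation is designed for.

For example, utf8_polish_ci is a collation for the utf8 character set that follows Polish comparison rules.

3. The suffix (ci in the example) indicates whether the collation is sensitive to accents, letter case, and so on.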

  • _ci: case insensitive
  • _cs: case sensitive
  • _ai: accent insensitive
  • _as: accent sensitive
  • _bin: binary; strings are compared by their binary values
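
The naming convention can be sketched as a small parser. This is an illustrative Python sketch under the assumptions stated in the notes (character set name first, sensitivity suffixes last); the function `describe_collation` is hypothetical and only recognizes the five suffixes listed above, so any other token is left in the middle part of the name.

```python
# The five suffixes listed above, mapped to their meanings.
SUFFIXES = {
    "ci": "case insensitive",
    "cs": "case sensitive",
    "ai": "accent insensitive",
    "as": "accent sensitive",
    "bin": "binary",
}


def describe_collation(name: str) -> dict:
    """Split a collation name into character set, middle part, and suffix meanings."""
    parts = name.split("_")
    charset, rest = parts[0], parts[1:]
    flags = []
    # Peel known suffixes off the end, keeping them in their original order.
    while rest and rest[-1] in SUFFIXES:
        flags.insert(0, SUFFIXES[rest.pop()])
    # Whatever remains in the middle is the language or variant, if any.
    return {"charset": charset, "variant": "_".join(rest), "flags": flags}


if __name__ == "__main__":
    for name in ("utf8_polish_ci", "utf8mb4_0900_ai_ci", "latin1_bin"):
        print(name, describe_collation(name))
```

For utf8_polish_ci this yields the character set utf8, the variant polish, and the single flag "case insensitive".
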

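4. Character sets and collations can be set at four levels: server, database, table, and column. The server-level and database-level values can be inspected with SHOW VARIABLES: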

mysql> SHOW variables like "%character_set_server%";
+----------------------+---------+
| Variable_name        | Value   |
+----------------------+---------+
| character_set_server | utf8mb4 |
+----------------------+---------+

mysql> SHOW variables like "%character_set_database%";
+------------------------+---------+
| Variable_name          | Value   |
+------------------------+---------+
| character_set_database | utf8mb4 |
+------------------------+---------+

mysql> SHOW variables like "%collation_server%";
+------------------+--------------------+
| Variable_name    | Value              |
+------------------+--------------------+
| collation_server | utf8mb4_0900_ai_ci |
+------------------+--------------------+

mysql> SHOW variables like "%collation_database%";
+--------------------+--------------------+
| Variable_name      | Value              |
+--------------------+--------------------+
| collation_database | utf8mb4_0900_ai_ci |
+--------------------+--------------------+
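
One way to picture the four levels is as a lookup from the most specific level to the most general one. The Python sketch below is a simplified model, not MySQL's implementation: it assumes that a more specific level overrides a more general one and that a level with no explicit setting falls back to the level above it. As far as I understand, MySQL resolves such defaults when the database, table, or column is created, which this sketch does not model. The function `effective_setting` and the sample values are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Most specific level first; the names mirror the four levels listed above.
LEVELS = ("column", "table", "database", "server")


def effective_setting(settings: Dict[str, Optional[str]]) -> Tuple[str, str]:
    """Return (level, value) for the most specific level with an explicit value."""
    for level in LEVELS:
        value = settings.get(level)
        if value is not None:
            return level, value
    raise ValueError("no level defines a value")


if __name__ == "__main__":
    # Server default as shown in the output above; the table override is made up.
    settings = {"server": "utf8mb4", "database": None, "table": "gbk", "column": None}
    print(effective_setting(settings))
```
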

Originally published on 2023-02-05. Shared from the WeChat official account 云数智圈.
