专栏首页idba源码解析MySQL 的隐式转换

源码解析MySQL 的隐式转换

作者徐晨亮,MySQL DBA,知数堂学员。热衷于数据库优化,自动化运维及数据库周边工具开发,对MySQL源码有一定的兴趣。

一、问题描述

root@mysqldb 22:12:  [xucl]> show create table t1\G
*************************** 1. row ***************************
       Table: t1
Create Table: CREATE TABLE `t1` (
  `id` varchar(255) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8
1 row in set (0.00 sec)

root@mysqldb 22:19:  [xucl]> select * from t1;
+--------------------+
| id                 |
+--------------------+
| 204027026112927605 |
| 204027026112927603 |
| 2040270261129276   |
| 2040270261129275   |
| 100                |
| 101                |
+--------------------+
6 rows in set (0.00 sec)

奇怪的现象:

root@mysqldb 22:19:  [xucl]> select * from t1 where id=204027026112927603;
+--------------------+
| id                 |
+--------------------+
| 204027026112927605 |
| 204027026112927603 |
+--------------------+
2 rows in set (0.00 sec)

什么鬼,明明查的是204027026112927603,为什么204027026112927605也出来了

二、源码解释

堆栈调用关系如下所示

其中 JOIN::exec()是执行的入口, Arg_comparator::compare_real()是进行等值判断的函数,其定义如下

int Arg_comparator::compare_real()
{
  /*
    Fix yet another manifestation of Bug#2338. 'Volatile' will instruct
    gcc to flush double values out of 80-bit Intel FPU registers before
    performing the comparison.
  */
  volatile double val1, val2;
  val1= (*a)->val_real();
  if (!(*a)->null_value)
  {
    val2= (*b)->val_real();
    if (!(*b)->null_value)
    {
      if (set_null)
        owner->null_value= 0;
      if (val1 < val2)    return -1;
      if (val1 == val2) return 0;
      return 1;
    }
  }
  if (set_null)
    owner->null_value= 1;
  return -1;
}

比较步骤如下图所示,逐行读取t1表的id列放入val1,而常量204027026112927603存在于cache中,类型为double类型(2.0402702611292762E+17),所以到这里传值给val2后val2=2.0402702611292762E+17。

当扫描到第一行时,204027026112927605转成doule的值为2.0402702611292762e17,等式成立,判定为符合条件的行,继续往下扫描,同理204027026112927603也同样符

MySQL string转成double的定义函数如下:

{
  char buf[DTOA_BUFF_SIZE];
  double res;
  DBUG_ASSERT(end != NULL && ((str != NULL && *end != NULL) ||
                              (str == NULL && *end == NULL)) &&
              error != NULL);

  res= my_strtod_int(str, end, error, buf, sizeof(buf));
  return (*error == 0) ? res : (res < 0 ? -DBL_MAX : DBL_MAX);
}

真正转换函数 my_strtod_int位置在dtoa.c(太复杂了,简单贴个注释吧?)

/*
  strtod for IEEE--arithmetic machines.

  This strtod returns a nearest machine number to the input decimal
  string (or sets errno to EOVERFLOW). Ties are broken by the IEEE round-even
  rule.

  Inspired loosely by William D. Clinger's paper "How to Read Floating
  Point Numbers Accurately" [Proc. ACM SIGPLAN '90, pp. 92-101].

  Modifications:

   1. We only require IEEE (not IEEE double-extended).
   2. We get by with floating-point arithmetic in a case that
     Clinger missed -- when we're computing d * 10^n
     for a small integer d and the integer n is not too
     much larger than 22 (the maximum integer k for which
     we can represent 10^k exactly), we may be able to
     compute (d*10^k) * 10^(e-k) with just one roundoff.
   3. Rather than a bit-at-a-time adjustment of the binary
     result in the hard case, we use floating-point
     arithmetic to determine the adjustment to within
     one bit; only in really hard cases do we need to
     compute a second residual.
   4. Because of 3., we don't need a large table of powers of 10
     for ten-to-e (just some small tables, e.g. of 10^k
     for 0 <= k <= 22).
*/

既然是这样,我们测试下没有溢出的案例

root@mysqldb 23:30:  [xucl]> select * from t1 where id=2040270261129276;
+------------------+
| id               |
+------------------+
| 2040270261129276 |
+------------------+
1 row in set (0.00 sec)

root@mysqldb 23:30:  [xucl]> select * from t1 where id=101;
+------+
| id   |
+------+
| 101  |
+------+
1 row in set (0.00 sec)

结果符合预期,而在本例中,正确的写法应当是

root@mysqldb 22:19:  [xucl]> select * from t1 where id='204027026112927603';
+--------------------+
| id                 |
+--------------------+
| 204027026112927603 |
+--------------------+
1 row in set (0.01 sec)

三、结论

  1. 避免发生隐式类型转换,隐式转换的类型主要有字段类型不一致、in参数包含多个类型、字符集类型或校对规则不一致等
  2. 隐式类型转换可能导致无法使用索引、查询结果不准确等,因此在使用时必须仔细甄别
  3. 数字类型的建议在字段定义时就定义为int或者bigint,表关联时关联字段必须保持类型、字符集、校对规则都一致
  4. 最后贴一下官网对于隐式类型转换的说明吧
1、If one or both arguments are NULL, the result of the comparison is NULL, except for the NULL-safe
<=> equality comparison operator. For NULL <=> NULL, the result is true. No conversion is needed.

2、If both arguments in a comparison operation are strings, they are compared as strings.

3、If both arguments are integers, they are compared as integers.

4、Hexadecimal values are treated as binary strings if not compared to a number.

5、If one of the arguments is a TIMESTAMP or DATETIME column and the other argument is a constant
, the constant is converted to a timestamp before the comparison is performed. This isdone to be more ODBC-friendly. This is not done for the arguments to IN(). To be safe, always
use complete datetime, date, or time strings when doing comparisons. For example, to achieve best
results when using BETWEEN with date or time values, use CAST() to explicitly convert the values to
the desired data type.
A single-row subquery from a table or tables is not considered a constant. For example, if a subquery
returns an integer to be compared to a DATETIME value, the comparison is done as two integers.
The integer is not converted to a temporal value. To compare the operands as DATETIME values,
use CAST() to explicitly convert the subquery value to DATETIME. 

6、If one of the arguments is a decimal value, comparison depends on the other argument. The
arguments are compared as decimal values if the other argument is a decimal or integer value, or as
floating-point values if the other argument is a floating-point value.

7、In all other cases, the arguments are compared as floating-point (real) numbers.

四 参考文章

聊聊 隐式转换

Type Conversion in Expression Evaluation

本文分享自微信公众号 - yangyidba(yangyidba),作者:徐晨亮

原文出处及转载信息见文内详细说明,如有侵权,请联系 yunjia_community@tencent.com 删除。

原始发表时间:2019-09-19

本文参与腾讯云自媒体分享计划,欢迎正在阅读的你也加入,一起分享。

我来说两句

0 条评论
登录 后参与评论

相关文章

  • MySQL性能优化特性 Index Condition Pushdown

    一 概念介绍 Index Condition Pushdown (ICP)是MySQL 5.6 版本中的新特性,是一种在存储引擎层使用索引过滤数据的一种优...

    用户1278550
  • 读PolarDB论文有感 : 异构计算和数据库软硬一体化设计

    作者:熊中哲,现任才云科技工程VP,负责产品和研发工作。曾就职于阿里巴巴、沃趣科技、美团。超过12年数据库领域的工作经历,目前对云原生,机器学习和异构计算也很感...

    用户1278550
  • redis cluster 学习 实战篇(一)

    从Redis3.0开始,Redis提供了 redis cluster 集群支持,用于在多个redis节点间共享数据,以提高服务的可用性。 redis clust...

    用户1278550
  • 如何通过Chrome直接创建Opportunity的文本

    Since we need to use HTTP post for note creation, it is necessary to get a CSRF ...

    Jerry Wang
  • 如何通过Chrome直接创建Opportunity的文本

    Since we need to use HTTP post for note creation, it is necessary to get a CSRF ...

    Jerry Wang
  • Kit 3D 更新

    Kit3D is a 3D graphics engine written for Microsoft Silverlight. Kit3D was inita...

    用户1172164
  • 智能城市中的自动停车 (cs)

    这个项目的目标是最大限度地利用土地空间,减少驾驶员的压力和挫败感,同时显著减少空气污染。我们的贡献是一个由手机控制的自动停车系统。该结构是一个六边形的形状,使用...

    用户7454091
  • 黑客的价值观The Hacker’s Code

    黑客,可能在大家的眼里是那些入侵别人计算机搞破坏的人,其实并不是那样的。如果你这样认为了,只能说明你对计算机文化并不了解,真正的黑客是一种 自由的象征,他们挑战...

    一个会写诗的程序员
  • SAP CRM中间件Generic stop set的错误如何解决

    I have Q7Q/504 as ERP system and QDD/504 as CRM system.

    Jerry Wang
  • WHY CHINA’S CROSS-BORDER E-COMMERCE MARKET IS SURGING? 2020 Report on Asia-Pacific Cross-Border B...

    At $5.8 trillion last year, China has the second-largest retail market in the wo...

    一个会写诗的程序员

扫码关注云+社区

领取腾讯云代金券