一条长sql的排错过程

过程

有这样一条长sql,由于环境原因,对select.....in...... 语法限制使用,因此以left join语法代替,原来只需要统计一天的数据汇总结果,因为特殊需要,需要一次性统计各天的结果,于是首先将子查询中时间过滤的部分提取到外层。

修改前sql1:

select count(*) from (SELECT DISTINCT t.con_no AS con_no
    , CASE 
 WHEN tgua.atime IS NOT NULL THEN tgua.atime
 WHEN tcc.CTIME IS NOT NULL THEN tcc.CTIME
 WHEN bb.CTIME IS NOT NULL THEN bb.CTIME
  END AS CTIME, t.PRO_METHOD AS pro_method
 FROM my_customer t
 LEFT OUTER JOIN first_ auth tgua
 ON t.con_no = tgua.con_no
 AND tgua.stat = 1
 AND tgua.type = 'C' and 
(tgua.atime>= '2017-06-22' and tgua.atime< '2017-06-23')
 LEFT OUTER JOIN my_ contact tcc
 ON t.con_no = tcc.con_no
 AND tcc.NAME IS NOT NULL and 
(tcc.ctime>= '2017-06-22' and tcc.ctime< '2017-06-23')
 LEFT OUTER JOIN (
 SELECT tcp.con_no AS con_no, tcai.CTIME AS CTIME
 FROM my_ person tcp
 INNER JOIN my_auth_info tcai
 ON tcp.CNO = tcai.CNO and 
(tcai.ctime>= '2017-06-22' and tcai.ctime< '2017-06-23')
 ) bb
 ON t.con_no = bb.con_no 
 WHERE t.PRO_METHOD = 'uc' and (tgua.con_no IS NOT NULL
 OR tcc.con_no IS NOT NULL
 OR bb.con_no IS NOT NULL)) pp order by pp.con_no
sql2:
select count(*) from 
 (SELECT DISTINCT t.con_no AS con_no
 , CASE 
                               WHEN tgua.atime IS NOT NULL THEN tgua.atime
             WHEN tcc.CTIME IS NOT NULL THEN tcc.CTIME
             WHEN bb.CTIME IS NOT NULL THEN bb.CTIME
 END AS CTIME, t.PRO_METHOD AS pro_method

 FROM my_customer t
 LEFT OUTER JOIN first_ auth tgua
 ON t.con_no = tgua.con_no
 AND tgua.status = 1
 AND tgua.type = 'C'
 LEFT OUTER JOIN my_ contact tcc
 ON t.con_no = tcc.con_no
 AND tcc.NAME IS NOT NULL
 LEFT OUTER JOIN (
 SELECT tcp.con_no AS con_no, tcai.CTIME AS CTIME
 FROM my_ person tcp
 INNER JOIN my_auth_info tcai
 ON tcp.CNO = tcai.CNO
 ) bb
 ON t.con_no = bb.con_no
 WHERE t.pro_method = 'uc' and (tgua.con_no IS NOT NULL
 OR tcc.con_no IS NOT NULL
 OR bb.con_no IS NOT NULL)) p where CTIME>= '2017-06-22' and CTIME< '2017-06-23'
 order by con_no 

试着执行了两条sql,发现sql2查询结果数量总比sql1要少,于是用*代替count(*),打印出具体查询结果,

sql3:

select * from (SELECT DISTINCT t.con_no AS con_no
 , CASE 
 WHEN tgua.atime IS NOT NULL THEN tgua.atime
 WHEN tcc.CTIME IS NOT NULL THEN tcc.CTIME
 WHEN bb.CTIME IS NOT NULL THEN bb.CTIME
 END AS CTIME, t.PRO_METHOD AS pro_method
     FROM my_customer t
 LEFT OUTER JOIN first_ auth tgua
 ON t.con_no = tgua.con_no
 AND tgua.stat = 1
 AND tgua.type = 'C' and 
(tgua.atime>= '2017-06-22' and tgua.atime< '2017-06-23')
 LEFT OUTER JOIN my_ contact tcc
 ON t.con_no = tcc.con_no
 AND tcc.NAME IS NOT NULL and 
(tcc.ctime>= '2017-06-22' and tcc.ctime< '2017-06-23')
 LEFT OUTER JOIN (
 SELECT tcp.con_no AS con_no, tcai.CTIME AS CTIME
 FROM my_ person tcp
 INNER JOIN my_auth_info tcai
 ON tcp.CNO = tcai.CNO and 
(tcai.ctime>= '2017-06-22' and tcai.ctime< '2017-06-23')
 ) bb
 ON t.con_no = bb.con_no 
 WHERE t.PRO_METHOD = 'uc' and (tgua.con_no IS NOT NULL
 OR tcc.con_no IS NOT NULL
 OR bb.con_no IS NOT NULL)) pp order by pp.con_no

sql4:

select * from 
 (SELECT DISTINCT t.con_no AS con_no
 , CASE 
                             WHEN tgua.atime IS NOT NULL THEN tgua.atime
 WHEN tcc.CTIME IS NOT NULL THEN tcc.CTIME
 WHEN bb.CTIME IS NOT NULL THEN bb.CTIME
     END AS CTIME, t.PRO_METHOD AS pro_method
 FROM my_customer t
 LEFT OUTER JOIN first_ auth tgua
 ON t.con_no = tgua.con_no
 AND tgua.status = 1
 AND tgua.type = 'C'
 LEFT OUTER JOIN my_ contact tcc
 ON t.con_no = tcc.con_no
 AND tcc.NAME IS NOT NULL
 LEFT OUTER JOIN (
 SELECT tcp.con_no AS con_no, tcai.CTIME AS CTIME
 FROM my_ person tcp
 INNER JOIN my_auth_info tcai
 ON tcp.CNO = tcai.CNO
 ) bb
 ON t.con_no = bb.con_no
 WHERE t.pro_method = 'uc' and (tgua.con_no IS NOT NULL
 OR tcc.con_no IS NOT NULL
 OR bb.con_no IS NOT NULL)) p where CTIME>= '2017-06-22' and CTIME< '2017-06-23'
 order by con_no

找出sql3比sql4多出的执行结果记录,如编号为4534的一条记录, 这条记录在sql4中很有可能被过滤掉了,因此修改sql4外层的where查询条件,如下:

select * from 
 (SELECT DISTINCT t.con_no AS con_no
 , CASE 
  WHEN tgua.atime IS NOT NULL THEN tgua.atime
    WHEN tcc.CTIME IS NOT NULL THEN tcc.CTIME
    WHEN bb.CTIME IS NOT NULL THEN bb.CTIME
  END AS CTIME, t.PRO_METHOD AS pro_method
  FROM my_customer t
 LEFT OUTER JOIN first_ auth tgua
 ON t.con_no = tgua.con_no
 AND tgua.status = 1
 AND tgua.type = 'C'
 LEFT OUTER JOIN my_ contact tcc
 ON t.con_no = tcc.con_no
 AND tcc.NAME IS NOT NULL
 LEFT OUTER JOIN (
 SELECT tcp.con_no AS con_no, tcai.CTIME AS CTIME
 FROM my_ person tcp
 INNER JOIN my_auth_info tcai
 ON tcp.CNO = tcai.CNO
 ) bb
 ON t.con_no = bb.con_no
 WHERE t.pro_method = 'uc' and (tgua.con_no IS NOT NULL
 OR tcc.con_no IS NOT NULL
 OR bb.con_no IS NOT NULL)) p where con_no='4534'

查询结果发现CTIME对应的值竟是“2017-06-23 00:18:07“。而在sql3执行结果中,CTIME值为2017-06-22 00:16:08。问题出在这一sql片段中:

 CASE 
            WHEN tgua.atime IS NOT NULL THEN tgua.atime
   WHEN tcc.CTIME IS NOT NULL THEN tcc.CTIME
             WHEN bb.CTIME IS NOT NULL THEN bb.CTIME
          END AS CTIME, t.PRO_METHOD AS pro_method

在sql3的执行结果中,由于时间筛选在内层的left join后面,时间不是2017-06-22的记录字段值都用null代替,因此在结果集中,无论是tgua.atime,tcc.CTIME 还是bb.CTIME,时间不是null就是2017-06-22 ,而在sql4中,tgua.atime、tcc.CTIME 、bb.CTIME的时间是不一致的,因此假如case when选择一个not null的时间值,且该时间刚好不是2017-06-22,那么这条记录会在外层的where筛选中被过滤掉。而我们希望的却是tgua.atime、tcc.CTIME 、bb.CTIME只要有一个时间是在2017-06-22,那这条记录就应该出现在结果集中。使用union语法,修改如下:

select distinct * from 
 ((SELECT DISTINCT t.con_no AS con_no 
 , DATE_FORMAT(tgua.atime,"%Y-%m-%d") AS ctime, t.PRO_METHOD AS pro_method
 FROM my_customer t
 inner JOIN first_ auth tgua
 ON t.con_no = tgua.con_no
 AND tgua.status = 1
 AND tgua.type = 'C')
 union
 (SELECT DISTINCT t.con_no AS con_no 
 , DATE_FORMAT(tcc.ctime,"%Y-%m-%d") AS ctime, t.PRO_METHOD AS pro_method
 FROM my_customer t
 inner JOIN
  my_ contact tcc
 ON t.con_no = tcc.con_no
 AND tcc.NAME IS NOT NULL)
 union
 (SELECT DISTINCT t.con_no AS con_no 
 ,DATE_FORMAT(bb.ctime,"%Y-%m-%d") AS ctime, t.PRO_METHOD AS pro_method
 FROM my_customer t
 inner JOIN (
 SELECT tcp.con_no AS con_no, tcai.ctime AS ctime
 FROM my_ person tcp
 INNER JOIN my_auth_info tcai
 ON tcp.CNO = tcai.CNO
 ) bb
 ON t.con_no = bb.con_no)) p where pro_method = 'uc' and ctime>= '2017-06-22' and ctime< '2017-06-23'
 order by con_no 

查出的记录数恰好与sql1中的一致,在此基础上进行group by分组统计处理。

总结

多表关联查询排错遇到问题时,要通过select *打印出具体的明细,进行对比,分析其特征,再反观sql思考其错误的可能原因,再修改,再排查,这是一个不断从现象到本质,从思考到实践的循环上升的过程,这也是修改一切bug的要旨所在。另外,越复杂的sql越容易出错,而且不利于修改维护,因此,在应用层编写代码时,sql越简单越好。

参考文章:mysql left( right ) join使用on 与where 筛选的差异

原文发布于微信公众号 - java达人(drjava)

原文发表时间:2017-07-02

本文参与腾讯云自媒体分享计划,欢迎正在阅读的你也加入,一起分享。

发表于

我来说两句

0 条评论
登录 后参与评论

相关文章

来自专栏大数据挖掘DT机器学习

50多条实用mysql数据库优化建议

1.对查询进行优化,应尽量避免全表扫描,首先应考虑在 where 及 order by 涉及的列上建立索引。 缺省情况下建立的索引是非群集索引,但有时它并不是最...

6206
来自专栏Spark学习技巧

MySQL索引优化分析

1575
来自专栏java达人

mysql性能优化的几条重要建议

今天,数据库的操作越来越成为整个应用的性能瓶颈了,这点对于Web应用尤其明显。关于数据库的性能,这并不只是DBA才需要担心的事,而这更是我们程序员需要去关注的事...

2286
来自专栏用户2442861的专栏

MySQL性能优化的最佳20+条经验

今天,数据库的操作越来越成为整个应用的性能瓶颈了,这点对于Web应用尤其明显。关于数据库的性能,这并不只是DBA才需要担心的事,而这更是我们程序员需要去关注的...

762
来自专栏java一日一条

MySQL 性能优化的最佳 20+ 条经验

今天,数据库的操作越来越成为整个应用的性能瓶颈了,这点对于Web应用尤其明显。关于数据库的性能,这并不只是DBA才需要担心的事,而这更是我 们程序员需要去关注的...

1043
来自专栏博客园

.NET面试题解析(11)-SQL语言基础及数据库基本原理

转自:http://www.cnblogs.com/anding/p/5281558.html

1411
来自专栏杨建荣的学习笔记

一条SQL语句的执行计划变化探究(r10笔记第9天)

继续上次分析的一个问题,一个简单的SQL语句执行计划有些奇怪,明明可以走唯一性索引但是却走了另外一个索引。 当然了,最后逐步定位,发现是在直方图...

2873
来自专栏PHP在线

MySQL性能优化的21条经验

1. 为查询缓存优化你的查询 大多数的MySQL服务器都开启了查询缓存。这是提高性最有效的方法之一,而且这是被MySQL的数据库引擎处理的。当有很多相同的查询被...

3198
来自专栏*坤的Blog

mysql生成百万级数量测试数据

1.2K3
来自专栏Java后端技术栈

数据库两大必备神器:索引和锁底层原理是什么!

所以说,如果我们写select * from user where username = 'Java3y'这样没有进行任何优化的sql语句,默认会这样做:

2924

扫码关注云+社区

领取腾讯云代金券