前往小程序,Get更优阅读体验!
立即前往
首页
学习
活动
专区
工具
TVP
发布
社区首页 >专栏 >当心外部连接中的ON子句

当心外部连接中的ON子句

作者头像
Leshami
发布2018-08-14 11:13:50
2.1K0
发布2018-08-14 11:13:50
举报
文章被收录于专栏:乐沙弥的世界

       在SQL tuning中,不良写法导致SQL执行效率比比皆是。最近的SQL tuning中一个外部连接写法不当导致过SQL执行时间超过15分钟左右此即 为一例。通过修正该SQL的写法之后,过高的逻辑读呈数量级下降以及SQL语句执行时间也大幅下降。下面给出一个列子来演示该情形。

一、创建演示环境

代码语言:javascript
复制
-->当前数据库版本
  SQL> select * from v$version where rownum<2;               
  
  BANNER
  ----------------------------------------------------------------
  Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Prod

-->创建演示表并插入记录
  SQL> create table t as select empno,ename,job,sal,deptno
    2  from emp where 1=2;
  
  SQL> insert into t select empno,ename,job,sal,deptno
    2  from emp e where empno=(select max(empno) from emp where deptno=e.deptno);
  
  SQL> insert into t(empno,ename,job,sal) values(8888,'ROBINSON','DBA',2000);
  
  SQL> insert into t(empno,ename,job,sal) values(9999,'JACKSON','CLERK',2500);
  
  SQL> commit;
  
  SQL> analyze table t compute statistics;
  
  SQL> select * from t;
  
       EMPNO ENAME      JOB              SAL     DEPTNO
  ---------- ---------- --------- ---------- ----------
        7934 MILLER     CLERK           1300         10
        7902 FORD       ANALYST         3000         20
        7900 JAMES      CLERK            950         30
        8888 ROBINSON   DBA             2000
        9999 JACKSON    CLERK           2500

-->使用left join连接查看数据,此时表t中所有记录被返回       
  SQL> select empno,ename,sal,dname from t left join dept d on t.deptno=d.deptno;
  
       EMPNO ENAME             SAL DNAME
  ---------- ---------- ---------- --------------
        7934 MILLER           1300 ACCOUNTING
        7902 FORD             3000 RESEARCH
        7900 JAMES             950 SALES
        8888 ROBINSON         2000
        9999 JACKSON          2500

-->下面同样是使用left join连接,但在on子句中增加了过滤条件t.sal>=2000
-->从下面的返回结果可知,t.sal>=2000子句并没有过滤掉sal小于2000的记录 
  SQL> select empno,ename,sal,dname from t left join dept d     -->简称语句A
    2  on(t.deptno=d.deptno and t.sal>=2000);
  
       EMPNO ENAME             SAL DNAME
  ---------- ---------- ---------- --------------
        7934 MILLER           1300
        7902 FORD             3000 RESEARCH
        7900 JAMES             950
        8888 ROBINSON         2000
        9999 JACKSON          2500

-->使用left join连接,将过滤条件放到where 子句中
-->此时仅仅t.sal>=2000且符合t.deptno=d.deptno的记录被返回(结果与所期望一致)
  SQL> select empno,ename,sal,dname from t left join dept d        -->简称语句B
    2  on t.deptno=d.deptno where t.sal>=2000;
  
       EMPNO ENAME             SAL DNAME
  ---------- ---------- ---------- --------------
        7902 FORD             3000 RESEARCH
        8888 ROBINSON         2000
        9999 JACKSON          2500

-->查看执行计划
  SQL> set autotrace traceonly exp;
  
  -->语句A(过滤条件位于on 子句中的情形)的执行计划
  SQL> select empno,ename,sal,dname from t left join dept d   
    2  (on t.deptno=d.deptno and t.sal>=2000);               
    
  Execution Plan
  ----------------------------------------------------------
  Plan hash value: 2195752858
  
  -----------------------------------------------------------------------------------------
  | Id  | Operation                     | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
  -----------------------------------------------------------------------------------------
  |   0 | SELECT STATEMENT              |         |     5 |   120 |     6   (0)| 00:00:01 |
  |   1 |  NESTED LOOPS OUTER           |         |     5 |   120 |     6   (0)| 00:00:01 |
  |   2 |   TABLE ACCESS FULL           | T       |     5 |    70 |     3   (0)| 00:00:01 |
  |   3 |   VIEW                        |         |     1 |    10 |     1   (0)| 00:00:01 |
  |   4 |    TABLE ACCESS BY INDEX ROWID| DEPT    |     1 |    13 |     1   (0)| 00:00:01 |
  |*  5 |     INDEX UNIQUE SCAN         | PK_DEPT |     1 |       |     0   (0)| 00:00:01 |
  -----------------------------------------------------------------------------------------
  
  Predicate Information (identified by operation id):
  ---------------------------------------------------
  
     5 - access("T"."DEPTNO"="D"."DEPTNO")                      -->重点关注这里的谓词信息,两个过滤条件合在一起
         filter("T"."DEPTNO" IS NOT NULL AND "T"."SAL">=2000)   -->从执行计划来看位于第5步为INDEX UNIQUE SCAN
  
  Statistics
  ----------------------------------------------------------
            0  recursive calls
            0  db block gets
           11  consistent gets                                  -->此时的逻辑读为11
            0  physical reads
            0  redo size
          696  bytes sent via SQL*Net to client
          385  bytes received via SQL*Net from client
            2  SQL*Net roundtrips to/from client
            0  sorts (memory)
            0  sorts (disk)
            5  rows processed
       
    -->语句B(将谓词信息置于到where子句中的情形)的执行计划
  SQL> select empno,ename,sal,dname from t left join dept d
    2  on t.deptno=d.deptno where t.sal>=2000;
  
  Execution Plan
  ----------------------------------------------------------
  Plan hash value: 832694258
  
  ----------------------------------------------------------------------------------------
  | Id  | Operation                    | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
  ----------------------------------------------------------------------------------------
  |   0 | SELECT STATEMENT             |         |     3 |    81 |     4   (0)| 00:00:01 |
  |   1 |  NESTED LOOPS OUTER          |         |     3 |    81 |     4   (0)| 00:00:01 |
  |*  2 |   TABLE ACCESS FULL          | T       |     3 |    42 |     3   (0)| 00:00:01 |
  |   3 |   TABLE ACCESS BY INDEX ROWID| DEPT    |     1 |    13 |     1   (0)| 00:00:01 |
  |*  4 |    INDEX UNIQUE SCAN         | PK_DEPT |     1 |       |     0   (0)| 00:00:01 |
  ----------------------------------------------------------------------------------------
  
  Predicate Information (identified by operation id):
  ---------------------------------------------------
  
     2 - filter("T"."SAL">=2000)                     -->此时的谓词信息分为两部分,"T"."SAL">=2000位于第二步
     4 - access("T"."DEPTNO"="D"."DEPTNO"(+))        -->此条谓词信息用于实现表连接
  
  Statistics
  ----------------------------------------------------------
            0  recursive calls
            0  db block gets
           10  consistent gets             -->此时的逻辑读为10,由于2 - filter("T"."SAL">=2000)过滤后,内部循环少执行了一次
            0  physical reads
            0  redo size
          658  bytes sent via SQL*Net to client
          385  bytes received via SQL*Net from client
            2  SQL*Net roundtrips to/from client
            0  sorts (memory)
            0  sorts (disk)
            3  rows processed   

  /**************************************************/
  /* Author: Robinson Cheng                         */
  /* Blog:   http://blog.csdn.net/robinson_0612     */
  /* MSN:    robinson_0612@hotmail.com              */
  /* QQ:     645746311                              */
  /**************************************************/            

-->从上面的观察中发现上述两条SQL语句执行计划并非最佳,存在改良的余地
-->由于是nested loops outer,因此考虑在表t的谓词列增加索引以快速过滤记录         

  SQL> create index i_t_sal on t(sal);
  
  SQL> exec dbms_stats.gather_table_stats('SCOTT','T',cascade=>true);
  
-->增加索引后两个语句的执行情况

  -->语句A的执行计划以及统计信息没有发生任何变化
  SQL> select empno,ename,sal,dname from t left join dept d
    2  on (t.deptno=d.deptno and t.sal>=2000);
  
  Execution Plan
  ----------------------------------------------------------
  Plan hash value: 2195752858
  
  -----------------------------------------------------------------------------------------
  | Id  | Operation                     | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
  -----------------------------------------------------------------------------------------
  |   0 | SELECT STATEMENT              |         |     5 |   140 |     6   (0)| 00:00:01 |
  |   1 |  NESTED LOOPS OUTER           |         |     5 |   140 |     6   (0)| 00:00:01 |
  |   2 |   TABLE ACCESS FULL           | T       |     5 |    90 |     3   (0)| 00:00:01 |
  |   3 |   VIEW                        |         |     1 |    10 |     1   (0)| 00:00:01 |
  |   4 |    TABLE ACCESS BY INDEX ROWID| DEPT    |     1 |    13 |     1   (0)| 00:00:01 |
  |*  5 |     INDEX UNIQUE SCAN         | PK_DEPT |     1 |       |     0   (0)| 00:00:01 |
  -----------------------------------------------------------------------------------------
  
  Predicate Information (identified by operation id):
  ---------------------------------------------------
  
     5 - access("T"."DEPTNO"="D"."DEPTNO")
         filter("T"."DEPTNO" IS NOT NULL AND "T"."SAL">=2000)
  
  Statistics
  ----------------------------------------------------------
            1  recursive calls
            0  db block gets
           11  consistent gets
            0  physical reads
            0  redo size
          696  bytes sent via SQL*Net to client
          385  bytes received via SQL*Net from client
            2  SQL*Net roundtrips to/from client
            0  sorts (memory)
            0  sorts (disk)
            5  rows processed          
          
    -->语句B的执行计划发生变化,原来的全表扫描变为索引扫描      
  SQL> select empno,ename,sal,dname from t left join dept d
    2  on t.deptno=d.deptno where t.sal>=2000;
  
  Execution Plan
  ----------------------------------------------------------
  Plan hash value: 2452308905
  
  ----------------------------------------------------------------------------------------
  | Id  | Operation                    | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
  ----------------------------------------------------------------------------------------
  |   0 | SELECT STATEMENT             |         |     3 |    93 |     3   (0)| 00:00:01 |
  |   1 |  NESTED LOOPS OUTER          |         |     3 |    93 |     3   (0)| 00:00:01 |
  |   2 |   TABLE ACCESS BY INDEX ROWID| T       |     3 |    54 |     2   (0)| 00:00:01 |
  |*  3 |    INDEX RANGE SCAN          | I_T_SAL |     3 |       |     1   (0)| 00:00:01 |
  |   4 |   TABLE ACCESS BY INDEX ROWID| DEPT    |     1 |    13 |     1   (0)| 00:00:01 |
  |*  5 |    INDEX UNIQUE SCAN         | PK_DEPT |     1 |       |     0   (0)| 00:00:01 |
  ----------------------------------------------------------------------------------------
  
  Predicate Information (identified by operation id):
  ---------------------------------------------------
  
     3 - access("T"."SAL">=2000)
     5 - access("T"."DEPTNO"="D"."DEPTNO"(+))
  
  Statistics
  ----------------------------------------------------------
            1  recursive calls
            0  db block gets
            6  consistent gets              -->逻辑读也由10下降到6
            0  physical reads
            0  redo size
          658  bytes sent via SQL*Net to client
          385  bytes received via SQL*Net from client
            2  SQL*Net roundtrips to/from client
            0  sorts (memory)
            0  sorts (disk)
            3  rows processed

二、总结   1、尽可能避免SQL不良写法导致的不良后果   2、此例中由于将谓词信息放到ON子句中,在数据量庞大的表(百万行)连接中,则该写法导致过多的物理和逻辑I/O,使得中间结果集庞大   3、谓词信息放到ON子句中同时也导致索引失效   4、尽可能的在满足需求的情况下减小中间结果集

本文参与 腾讯云自媒体同步曝光计划,分享自作者个人站点/博客。
原始发表:2012年01月05日,如有侵权请联系 cloudcommunity@tencent.com 删除

本文分享自 作者个人站点/博客 前往查看

如有侵权,请联系 cloudcommunity@tencent.com 删除。

本文参与 腾讯云自媒体同步曝光计划  ,欢迎热爱写作的你一起参与!

评论
登录后参与评论
0 条评论
热度
最新
推荐阅读
相关产品与服务
数据库
云数据库为企业提供了完善的关系型数据库、非关系型数据库、分析型数据库和数据库生态工具。您可以通过产品选择和组合搭建,轻松实现高可靠、高可用性、高性能等数据库需求。云数据库服务也可大幅减少您的运维工作量,更专注于业务发展,让企业一站式享受数据上云及分布式架构的技术红利!
领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档