专栏首页乐沙弥的世界当心外部连接中的ON子句

当心外部连接中的ON子句

       在SQL tuning中,不良写法导致SQL执行效率比比皆是。最近的SQL tuning中一个外部连接写法不当导致过SQL执行时间超过15分钟左右此即 为一例。通过修正该SQL的写法之后,过高的逻辑读呈数量级下降以及SQL语句执行时间也大幅下降。下面给出一个列子来演示该情形。

一、创建演示环境

-->当前数据库版本
  SQL> select * from v$version where rownum<2;               
  
  BANNER
  ----------------------------------------------------------------
  Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Prod

-->创建演示表并插入记录
  SQL> create table t as select empno,ename,job,sal,deptno
    2  from emp where 1=2;
  
  SQL> insert into t select empno,ename,job,sal,deptno
    2  from emp e where empno=(select max(empno) from emp where deptno=e.deptno);
  
  SQL> insert into t(empno,ename,job,sal) values(8888,'ROBINSON','DBA',2000);
  
  SQL> insert into t(empno,ename,job,sal) values(9999,'JACKSON','CLERK',2500);
  
  SQL> commit;
  
  SQL> analyze table t compute statistics;
  
  SQL> select * from t;
  
       EMPNO ENAME      JOB              SAL     DEPTNO
  ---------- ---------- --------- ---------- ----------
        7934 MILLER     CLERK           1300         10
        7902 FORD       ANALYST         3000         20
        7900 JAMES      CLERK            950         30
        8888 ROBINSON   DBA             2000
        9999 JACKSON    CLERK           2500

-->使用left join连接查看数据,此时表t中所有记录被返回       
  SQL> select empno,ename,sal,dname from t left join dept d on t.deptno=d.deptno;
  
       EMPNO ENAME             SAL DNAME
  ---------- ---------- ---------- --------------
        7934 MILLER           1300 ACCOUNTING
        7902 FORD             3000 RESEARCH
        7900 JAMES             950 SALES
        8888 ROBINSON         2000
        9999 JACKSON          2500

-->下面同样是使用left join连接,但在on子句中增加了过滤条件t.sal>=2000
-->从下面的返回结果可知,t.sal>=2000子句并没有过滤掉sal小于2000的记录 
  SQL> select empno,ename,sal,dname from t left join dept d     -->简称语句A
    2  on(t.deptno=d.deptno and t.sal>=2000);
  
       EMPNO ENAME             SAL DNAME
  ---------- ---------- ---------- --------------
        7934 MILLER           1300
        7902 FORD             3000 RESEARCH
        7900 JAMES             950
        8888 ROBINSON         2000
        9999 JACKSON          2500

-->使用left join连接,将过滤条件放到where 子句中
-->此时仅仅t.sal>=2000且符合t.deptno=d.deptno的记录被返回(结果与所期望一致)
  SQL> select empno,ename,sal,dname from t left join dept d        -->简称语句B
    2  on t.deptno=d.deptno where t.sal>=2000;
  
       EMPNO ENAME             SAL DNAME
  ---------- ---------- ---------- --------------
        7902 FORD             3000 RESEARCH
        8888 ROBINSON         2000
        9999 JACKSON          2500

-->查看执行计划
  SQL> set autotrace traceonly exp;
  
  -->语句A(过滤条件位于on 子句中的情形)的执行计划
  SQL> select empno,ename,sal,dname from t left join dept d   
    2  (on t.deptno=d.deptno and t.sal>=2000);               
    
  Execution Plan
  ----------------------------------------------------------
  Plan hash value: 2195752858
  
  -----------------------------------------------------------------------------------------
  | Id  | Operation                     | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
  -----------------------------------------------------------------------------------------
  |   0 | SELECT STATEMENT              |         |     5 |   120 |     6   (0)| 00:00:01 |
  |   1 |  NESTED LOOPS OUTER           |         |     5 |   120 |     6   (0)| 00:00:01 |
  |   2 |   TABLE ACCESS FULL           | T       |     5 |    70 |     3   (0)| 00:00:01 |
  |   3 |   VIEW                        |         |     1 |    10 |     1   (0)| 00:00:01 |
  |   4 |    TABLE ACCESS BY INDEX ROWID| DEPT    |     1 |    13 |     1   (0)| 00:00:01 |
  |*  5 |     INDEX UNIQUE SCAN         | PK_DEPT |     1 |       |     0   (0)| 00:00:01 |
  -----------------------------------------------------------------------------------------
  
  Predicate Information (identified by operation id):
  ---------------------------------------------------
  
     5 - access("T"."DEPTNO"="D"."DEPTNO")                      -->重点关注这里的谓词信息,两个过滤条件合在一起
         filter("T"."DEPTNO" IS NOT NULL AND "T"."SAL">=2000)   -->从执行计划来看位于第5步为INDEX UNIQUE SCAN
  
  Statistics
  ----------------------------------------------------------
            0  recursive calls
            0  db block gets
           11  consistent gets                                  -->此时的逻辑读为11
            0  physical reads
            0  redo size
          696  bytes sent via SQL*Net to client
          385  bytes received via SQL*Net from client
            2  SQL*Net roundtrips to/from client
            0  sorts (memory)
            0  sorts (disk)
            5  rows processed
       
    -->语句B(将谓词信息置于到where子句中的情形)的执行计划
  SQL> select empno,ename,sal,dname from t left join dept d
    2  on t.deptno=d.deptno where t.sal>=2000;
  
  Execution Plan
  ----------------------------------------------------------
  Plan hash value: 832694258
  
  ----------------------------------------------------------------------------------------
  | Id  | Operation                    | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
  ----------------------------------------------------------------------------------------
  |   0 | SELECT STATEMENT             |         |     3 |    81 |     4   (0)| 00:00:01 |
  |   1 |  NESTED LOOPS OUTER          |         |     3 |    81 |     4   (0)| 00:00:01 |
  |*  2 |   TABLE ACCESS FULL          | T       |     3 |    42 |     3   (0)| 00:00:01 |
  |   3 |   TABLE ACCESS BY INDEX ROWID| DEPT    |     1 |    13 |     1   (0)| 00:00:01 |
  |*  4 |    INDEX UNIQUE SCAN         | PK_DEPT |     1 |       |     0   (0)| 00:00:01 |
  ----------------------------------------------------------------------------------------
  
  Predicate Information (identified by operation id):
  ---------------------------------------------------
  
     2 - filter("T"."SAL">=2000)                     -->此时的谓词信息分为两部分,"T"."SAL">=2000位于第二步
     4 - access("T"."DEPTNO"="D"."DEPTNO"(+))        -->此条谓词信息用于实现表连接
  
  Statistics
  ----------------------------------------------------------
            0  recursive calls
            0  db block gets
           10  consistent gets             -->此时的逻辑读为10,由于2 - filter("T"."SAL">=2000)过滤后,内部循环少执行了一次
            0  physical reads
            0  redo size
          658  bytes sent via SQL*Net to client
          385  bytes received via SQL*Net from client
            2  SQL*Net roundtrips to/from client
            0  sorts (memory)
            0  sorts (disk)
            3  rows processed   

  /**************************************************/
  /* Author: Robinson Cheng                         */
  /* Blog:   http://blog.csdn.net/robinson_0612     */
  /* MSN:    robinson_0612@hotmail.com              */
  /* QQ:     645746311                              */
  /**************************************************/            

-->从上面的观察中发现上述两条SQL语句执行计划并非最佳,存在改良的余地
-->由于是nested loops outer,因此考虑在表t的谓词列增加索引以快速过滤记录         

  SQL> create index i_t_sal on t(sal);
  
  SQL> exec dbms_stats.gather_table_stats('SCOTT','T',cascade=>true);
  
-->增加索引后两个语句的执行情况

  -->语句A的执行计划以及统计信息没有发生任何变化
  SQL> select empno,ename,sal,dname from t left join dept d
    2  on (t.deptno=d.deptno and t.sal>=2000);
  
  Execution Plan
  ----------------------------------------------------------
  Plan hash value: 2195752858
  
  -----------------------------------------------------------------------------------------
  | Id  | Operation                     | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
  -----------------------------------------------------------------------------------------
  |   0 | SELECT STATEMENT              |         |     5 |   140 |     6   (0)| 00:00:01 |
  |   1 |  NESTED LOOPS OUTER           |         |     5 |   140 |     6   (0)| 00:00:01 |
  |   2 |   TABLE ACCESS FULL           | T       |     5 |    90 |     3   (0)| 00:00:01 |
  |   3 |   VIEW                        |         |     1 |    10 |     1   (0)| 00:00:01 |
  |   4 |    TABLE ACCESS BY INDEX ROWID| DEPT    |     1 |    13 |     1   (0)| 00:00:01 |
  |*  5 |     INDEX UNIQUE SCAN         | PK_DEPT |     1 |       |     0   (0)| 00:00:01 |
  -----------------------------------------------------------------------------------------
  
  Predicate Information (identified by operation id):
  ---------------------------------------------------
  
     5 - access("T"."DEPTNO"="D"."DEPTNO")
         filter("T"."DEPTNO" IS NOT NULL AND "T"."SAL">=2000)
  
  Statistics
  ----------------------------------------------------------
            1  recursive calls
            0  db block gets
           11  consistent gets
            0  physical reads
            0  redo size
          696  bytes sent via SQL*Net to client
          385  bytes received via SQL*Net from client
            2  SQL*Net roundtrips to/from client
            0  sorts (memory)
            0  sorts (disk)
            5  rows processed          
          
    -->语句B的执行计划发生变化,原来的全表扫描变为索引扫描      
  SQL> select empno,ename,sal,dname from t left join dept d
    2  on t.deptno=d.deptno where t.sal>=2000;
  
  Execution Plan
  ----------------------------------------------------------
  Plan hash value: 2452308905
  
  ----------------------------------------------------------------------------------------
  | Id  | Operation                    | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
  ----------------------------------------------------------------------------------------
  |   0 | SELECT STATEMENT             |         |     3 |    93 |     3   (0)| 00:00:01 |
  |   1 |  NESTED LOOPS OUTER          |         |     3 |    93 |     3   (0)| 00:00:01 |
  |   2 |   TABLE ACCESS BY INDEX ROWID| T       |     3 |    54 |     2   (0)| 00:00:01 |
  |*  3 |    INDEX RANGE SCAN          | I_T_SAL |     3 |       |     1   (0)| 00:00:01 |
  |   4 |   TABLE ACCESS BY INDEX ROWID| DEPT    |     1 |    13 |     1   (0)| 00:00:01 |
  |*  5 |    INDEX UNIQUE SCAN         | PK_DEPT |     1 |       |     0   (0)| 00:00:01 |
  ----------------------------------------------------------------------------------------
  
  Predicate Information (identified by operation id):
  ---------------------------------------------------
  
     3 - access("T"."SAL">=2000)
     5 - access("T"."DEPTNO"="D"."DEPTNO"(+))
  
  Statistics
  ----------------------------------------------------------
            1  recursive calls
            0  db block gets
            6  consistent gets              -->逻辑读也由10下降到6
            0  physical reads
            0  redo size
          658  bytes sent via SQL*Net to client
          385  bytes received via SQL*Net from client
            2  SQL*Net roundtrips to/from client
            0  sorts (memory)
            0  sorts (disk)
            3  rows processed

二、总结   1、尽可能避免SQL不良写法导致的不良后果   2、此例中由于将谓词信息放到ON子句中,在数据量庞大的表(百万行)连接中,则该写法导致过多的物理和逻辑I/O,使得中间结果集庞大   3、谓词信息放到ON子句中同时也导致索引失效   4、尽可能的在满足需求的情况下减小中间结果集

本文参与腾讯云自媒体分享计划,欢迎正在阅读的你也加入,一起分享。

我来说两句

0 条评论
登录 后参与评论

相关文章

  • Oracle大表清理truncate .. reuse storage

    a、通过上述测试,当使用reuse storage与普通方式并无明显差异 b、truncate table 是ddl操作,无法回滚 c、尽管无明显性能差...

    Leshami
  • 参数job_queue_processes与Oracle jobs

    Oracle jobs为Oracle开发人员和数据库管理员提供了数据库层面维护的极大便利性。对于Oracle jobs在Oracle 9i之前,是由dbms_...

    Leshami
  • SQL*Plus break与compute的简单用法

       在SQL*Plus提示符下输出求和报表,我们可以借助break与compute两个命令来实现。这个两个命令简单易用,可满足日常需求,其实质也相当于在编写S...

    Leshami
  • Yum 报错 curl#60 – “Peer’s Certificate has expired.”解决

    Cannot retrieve metalink for repository: epel/x86_64. Please verify its path and...

    zhangdd
  • MPEG4视频中,I帧、p帧、B帧的判定(转载) By HKL,

    mpeg4的每一帧开头是固定的:00 00 01 b6,那么我们如何判断当前帧属于什么帧呢?在接下来的2bit,将会告诉我们答案。注意:是2bit,不是byte...

    hiplon
  • Python os.popen() 方法

    os.popen() 方法用于从一个命令打开一个管道。 在Unix,Windows中有效

    周小董
  • 巧用parallel极速提升数据加载速度(r2第21天)

    并行在平时工作中可能不是很注意,因为有时候即使设定了parallel 相关的hint,感觉性能也好不到哪去。这是我以前的感觉。 今天通过一个案例来分享一下通过p...

    jeanron100
  • 如何在CDH集群中部署Presto

    Fayson
  • truncate分区表的操作,会导致全局索引失效?

    truncate是删除数据操作,但他是DDL语句,不是delete这种DML语句,不会写redo和undo,不能rollback。

    bisal
  • 【MATLAB 从零到进阶】day3 矩阵 数组

    矩阵的除法包括左除(A\B)、右除(A/B)和点除(A./B)三种。一般情况下,x=A\b是方程组A*x=b的解,而x=b/A 是方程组x*A=b的解,x=A....

    统计学家

扫码关注云+社区

领取腾讯云代金券