通过执行计划中的CONCATENATION分析sql问题(r4笔记第16天)

昨天开发的一个同事找到我,说写了一条sql语句,但是执行了半个小时还没有执行完,想让我帮忙看看是怎么回事。 他大体上给我讲了下逻辑,表bl1_rc_rates是千万级数据量的表,autsu_subscriber 是个临时表,里面只有三百多条数据,bl1_activity_history 表的数据量略小,是百万级的。

   select distinct hist.entity_id, rc.* from bl1_activity_history hist, bl1_rc_rates rc, autsu_subscriber sub
where hist.entity_id = sub.subscriber_no
    and hist.customer_id = sub.customer_id
    and hist.activity_id = '48'
    and hist.entity_id = rc.service_receiver_id
    and hist.customer_id = rc.receiver_customer
    and rc.service_receiver_id=sub.subscriber_no
   and rc.receiver_customer= sub.customer_id
   and trunc(hist.activity_date,'dd') = trunc(rc.effective_date,'dd')
and rc.amount > 0
  and rc.expiration_date is null or rc.expiration_date > to_date('20141019','yyyymmdd');

先来看看执行计划吧,一看吓一跳

Plan hash value: 3128694621
--------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                               | Name                     | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                        |                          |    22G|    13T|       |  4217M  (1)|999:59:59 |       |       |
|   1 |  HASH UNIQUE                            |                          |    22G|    13T|    16T|  4217M  (1)|999:59:59 |       |       |
|   2 |   CONCATENATION                         |                          |       |       |       |            |          |       |       |
|   3 |    NESTED LOOPS                         |                          |    21G|    13T|       |    47M  (1)|159:35:59 |       |       |
|   4 |     NESTED LOOPS                        |                          |    13M|  8211M|       |  1393K  (1)| 04:38:47 |       |       |
|   5 |      PARTITION RANGE ALL                |                          |     1 |   622 |       |   980K  (1)| 03:16:02 |     1 |    11 |
|*  6 |       TABLE ACCESS FULL                 | BL1_RC_RATES             |     1 |   622 |       |   980K  (1)| 03:16:02 |     1 |    11 |
|   7 |      PARTITION RANGE ALL                |                          |    27M|   622M|       |   413K  (1)| 01:22:45 |     1 |    11 |
|   8 |       TABLE ACCESS FULL                 | BL1_ACTIVITY_HISTORY     |    27M|   622M|       |   413K  (1)| 01:22:45 |     1 |    11 |
|   9 |     TABLE ACCESS FULL                   | AUTSU_SUBSCRIBER         |  1634 | 42484 |       |     3   (0)| 00:00:01 |       |       |
|  10 |    NESTED LOOPS                         |                          |       |       |       |            |          |       |       |
|  11 |     NESTED LOOPS                        |                          |     1 |   672 |       |  2949   (1)| 00:00:36 |       |       |
|  12 |      NESTED LOOPS                       |                          |     1 |    50 |       |  2947   (1)| 00:00:36 |       |       |
|  13 |       TABLE ACCESS FULL                 | AUTSU_SUBSCRIBER         |  1634 | 42484 |       |     5   (0)| 00:00:01 |       |       |
|  14 |       PARTITION RANGE ALL               |                          |     1 |    24 |       |     2   (0)| 00:00:01 |     1 |    11 |
|* 15 |        TABLE ACCESS BY LOCAL INDEX ROWID| BL1_ACTIVITY_HISTORY     |     1 |    24 |       |     2   (0)| 00:00:01 |     1 |    11 |
|* 16 |         INDEX RANGE SCAN                | BL1_ACTIVITY_HISTORY_3IX |     1 |       |       |     2   (0)| 00:00:01 |     1 |    11 |
|  17 |      PARTITION RANGE ALL                |                          |     1 |       |       |     2   (0)| 00:00:01 |     1 |    11 |
|* 18 |       INDEX RANGE SCAN                  | BL1_RC_RATES_3IX         |     1 |       |       |     2   (0)| 00:00:01 |     1 |    11 |
|* 19 |     TABLE ACCESS BY LOCAL INDEX ROWID   | BL1_RC_RATES             |     1 |   622 |       |     2   (0)| 00:00:01 |     1 |     1 |
--------------------------------------------------------------------------------------------------------------------------------------------

这条语句的预计结果又22G rows,执行时间已经没法估量了。这种问题一看就是一个很好的案例。 首先就是查看是不是逻辑上出现了明显的问题,这个时候索引的影响已经没那么重要了。 我们来推敲一下where中的过滤条件

 hist.entity_id = sub.subscriber_no
    and hist.customer_id = sub.customer_id
    and hist.entity_id = rc.service_receiver_id
    and
    and rc.service_receiver_id=sub.subscriber_no
   and rc.receiver_customer= sub.customer_id

通过hist.entity_id = sub.subscriber_no和and rc.service_receiver_id=sub.subscriber_no可以推得hist.entity_id=rc.service_receiver_id,在过滤条件中又写了一遍, 同理hist.customer_id = sub.customer_id和rc.receiver_customer= sub.customer_id可以推得 hist.customer_id = rc.receiver_customer 所以这个条件也是冗余的。 我们可以基于表中的数据量来合理的选择列的关联。 除了这个问题,还有一个明显的问题,就是查询输出列select distinct hist.entity_id, 既然hist.entity_id和rc.subscriber_no已经是相等的了,就不需要再输出hist.entity_id然后做distinct运算了。为了突出这个问题的严重性,我先不删除冗余的过滤条件。只是删除查询输出列中的distinct hist.entity_id

select rc.* from bl1_activity_history hist, bl1_rc_rates rc, autsu_subscriber sub
where hist.entity_id = sub.subscriber_no
     and hist.customer_id = sub.customer_id
and hist.activity_id = '48'
    and hist.entity_id = rc.service_receiver_id
    and hist.customer_id = rc.receiver_customer
    and rc.service_receiver_id=sub.subscriber_no
    and rc.receiver_customer= sub.customer_id
  and trunc(hist.activity_date,'dd') = trunc(rc.effective_date,'dd')
  and rc.amount > 0
and rc.expiration_date is null or rc.expiration_date > to_date('20141019','yyyymmdd'); 

来看看执行计划

Plan hash value: 1018700604
-----------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name                     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
-----------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |                          |    21G|    13T|    47M  (1)|159:36:35 |       |       |
|   1 |  CONCATENATION                         |                          |       |       |            |          |       |       |
|   2 |   NESTED LOOPS                         |                          |    21G|    13T|    47M  (1)|159:35:59 |       |       |
|   3 |    NESTED LOOPS                        |                          |    13M|  8211M|  1393K  (1)| 04:38:47 |       |       |
|   4 |     PARTITION RANGE ALL                |                          |     1 |   622 |   980K  (1)| 03:16:02 |     1 |    11 |
|*  5 |      TABLE ACCESS FULL                 | BL1_RC_RATES             |     1 |   622 |   980K  (1)| 03:16:02 |     1 |    11 |
|   6 |     PARTITION RANGE ALL                |                          |    27M|   622M|   413K  (1)| 01:22:45 |     1 |    11 |
|   7 |      TABLE ACCESS FULL                 | BL1_ACTIVITY_HISTORY     |    27M|   622M|   413K  (1)| 01:22:45 |     1 |    11 |
|   8 |    TABLE ACCESS FULL                   | AUTSU_SUBSCRIBER         |  1634 | 42484 |     3   (0)| 00:00:01 |       |       |
|   9 |   NESTED LOOPS                         |                          |       |       |            |          |       |       |
|  10 |    NESTED LOOPS                        |                          |     1 |   672 |  2949   (1)| 00:00:36 |       |       |
|  11 |     NESTED LOOPS                       |                          |     1 |    50 |  2947   (1)| 00:00:36 |       |       |
|  12 |      TABLE ACCESS FULL                 | AUTSU_SUBSCRIBER         |  1634 | 42484 |     5   (0)| 00:00:01 |       |       |
|  13 |      PARTITION RANGE ALL               |                          |     1 |    24 |     2   (0)| 00:00:01 |     1 |    11 |
|* 14 |       TABLE ACCESS BY LOCAL INDEX ROWID| BL1_ACTIVITY_HISTORY     |     1 |    24 |     2   (0)| 00:00:01 |     1 |    11 |
|* 15 |        INDEX RANGE SCAN                | BL1_ACTIVITY_HISTORY_3IX |     1 |       |     2   (0)| 00:00:01 |     1 |    11 |
|  16 |     PARTITION RANGE ALL                |                          |     1 |       |     2   (0)| 00:00:01 |     1 |    11 |
|* 17 |      INDEX RANGE SCAN                  | BL1_RC_RATES_3IX         |     1 |       |     2   (0)| 00:00:01 |     1 |    11 |
|* 18 |    TABLE ACCESS BY LOCAL INDEX ROWID   | BL1_RC_RATES             |     1 |   622 |     2   (0)| 00:00:01 |     1 |     1 |
-----------------------------------------------------------------------------------------------------------------------------------

情况相对改善了不少,但是还是有问题的节奏。 这个时候我们来看看执行计划吧,注意到这个执行计划有些奇怪,只有3个表的关联,但是执行计划中缺出现了两个子查询,对于执行计划中的CONCATENATION自己比较陌生,就没有细究。 直接看走一个并行,效果怎么样。并行开了4个。从执行计划来看,情况好了很多,看似可以完成的样子了。

-----------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                               | Name                     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |    TQ  |IN-OUT| PQ Distrib |
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                        |                          |  1282 |   831K|   390K  (1)| 01:18:07 |       |       |        |      |        |
|   1 |  CONCATENATION                          |                          |       |       |            |          |       |       |        |      |        |
|   2 |   PX COORDINATOR                        |                          |       |       |            |          |       |       |        |      |        |
|   3 |    PX SEND QC (RANDOM)                  | :TQ10000                 |   801 |   519K|   271K  (1)| 00:54:23 |       |       |  Q1,00 | P->S | QC (RAND)       |
|   4 |     NESTED LOOPS                        |                          |   801 |   519K|   271K  (1)| 00:54:23 |       |       |  Q1,00 | PCWP |        |
|   5 |      NESTED LOOPS                       |                          |     1 |   638 |   271K  (1)| 00:54:23 |       |       |  Q1,00 | PCWP |        |
|   6 |       PX BLOCK ITERATOR                 |                          |     1 |   622 |   271K  (1)| 00:54:23 |     1 |    11 |  Q1,00 | PCWC |        |
|*  7 |        TABLE ACCESS FULL                | BL1_RC_RATES             |     1 |   622 |   271K  (1)| 00:54:23 |     1 |    11 |  Q1,00 | PCWP |        |
|   8 |       PARTITION RANGE ALL               |                          |     1 |    16 |     2   (0)| 00:00:01 |     1 |    11 |  Q1,00 | PCWP |        |
|*  9 |        TABLE ACCESS BY LOCAL INDEX ROWID| BL1_ACTIVITY_HISTORY     |     1 |    16 |     2   (0)| 00:00:01 |     1 |    11 |  Q1,00 | PCWP |        |
|* 10 |         INDEX RANGE SCAN                | BL1_ACTIVITY_HISTORY_3IX |     1 |       |     2   (0)| 00:00:01 |     1 |    11 |  Q1,00 | PCWP |        |
|  11 |      TABLE ACCESS FULL                  | AUTSU_SUBSCRIBER         |  1634 | 42484 |     5   (0)| 00:00:01 |       |       |  Q1,00 | PCWP |        |
|  12 |   NESTED LOOPS                          |                          |   481 |   311K|   118K  (1)| 00:23:45 |       |       |        |      |        |
|  13 |    NESTED LOOPS                         |                          |     1 |   648 |  3764   (1)| 00:00:46 |       |       |        |      |        |
|  14 |     TABLE ACCESS FULL                   | AUTSU_SUBSCRIBER         |  1634 | 42484 |     5   (0)| 00:00:01 |       |       |        |      |        |
|  15 |     PARTITION RANGE ALL                 |                          |     1 |   622 |     2   (0)| 00:00:01 |     1 |    11 |        |      |        |
|* 16 |      TABLE ACCESS BY LOCAL INDEX ROWID  | BL1_RC_RATES             |     1 |   622 |     2   (0)| 00:00:01 |     1 |    11 |        |      |        |
|* 17 |       INDEX RANGE SCAN                  | BL1_RC_RATES_3IX         |    13 |       |     2   (0)| 00:00:01 |     1 |    11 |        |      |        |
|  18 |    PARTITION RANGE ALL                  |                          |    27M|   415M|   413K  (1)| 01:22:45 |     1 |    11 |        |      |        |
|* 19 |     TABLE ACCESS FULL                   | BL1_ACTIVITY_HISTORY     |    27M|   415M|   413K  (1)| 01:22:45 |     1 |    11 |        |      |        |
-----------------------------------------------------------------------------------------------------------------------------------------------------------------

但是根据实际的情况,从300条左右的数据中大表中的索引,查取数据应该也没那么慢。肯定还是什么地方不对劲,就查看了下CONCATENATION 的解释 concatenation在sql级别和两个hint相关,no_expend, no_concat

no_expand提示的说明是

The NO_EXPAND hint prevents the cost-based optimizer from considering OR-expansion for queries having OR conditions or IN-lists in the WHERE clause. Usually, the optimizer considers using OR expansion and uses this method if it decides that the cost is lower than not using it.

use_concat提示的说明是

The USE_CONCAT hint forces combined OR conditions in the WHERE clause of a query to be transformed into a compound query using the UNION ALL set operator. Generally, this transformation occurs only if the cost of the query using the concatenations is cheaper than the cost without them.

当我读到第二句的时候,我就恍然明白了。 和开发确认过滤条件and rc.expiration_date is null or rc.expiration_date > to_date('20141019','yyyymmdd'); 是不是 期望是 and (rc.expiration_date is null or rc.expiration_date > to_date('20141019','yyyymmdd')); 他一愣,一想确实是这个道理。剩下的事情就简单了。我们不要并行来看看最终的执行结果。

select rc.* from  bl1_rc_rates rc, autsu_subscriber sub,bl1_activity_history hist
where rc.service_receiver_id=sub.subscriber_no
and rc.receiver_customer= sub.customer_id
and  rc.amount > 0
and (rc.expiration_date is null or rc.expiration_date > to_date('20141019','yyyymmdd'))
and rc.service_receiver_id=hist.entity_id
and rc.receiver_customer=hist.customer_id
and  hist.activity_id = '48'

Plan hash value: 3908327465
----------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name                     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |                          |  1634 |  1059K|  6706   (1)| 00:01:21 |       |       |
|   1 |  NESTED LOOPS                         |                          |       |       |            |          |       |       |
|   2 |   NESTED LOOPS                        |                          |  1634 |  1059K|  6706   (1)| 00:01:21 |       |       |
|   3 |    NESTED LOOPS                       |                          |  1634 |  1034K|  3764   (1)| 00:00:46 |       |       |
|   4 |     TABLE ACCESS FULL                 | AUTSU_SUBSCRIBER         |  1634 | 42484 |     5   (0)| 00:00:01 |       |       |
|   5 |     PARTITION RANGE ALL               |                          |     1 |   622 |     2   (0)| 00:00:01 |     1 |    11 |
|*  6 |      TABLE ACCESS BY LOCAL INDEX ROWID| BL1_RC_RATES             |     1 |   622 |     2   (0)| 00:00:01 |     1 |    11 |
|*  7 |       INDEX RANGE SCAN                | BL1_RC_RATES_3IX         |    13 |       |     2   (0)| 00:00:01 |     1 |    11 |
|   8 |    PARTITION RANGE ALL                |                          |     1 |       |     2   (0)| 00:00:01 |     1 |    11 |
|*  9 |     INDEX RANGE SCAN                  | BL1_ACTIVITY_HISTORY_3IX |     1 |       |     2   (0)| 00:00:01 |     1 |    11 |
|* 10 |   TABLE ACCESS BY LOCAL INDEX ROWID   | BL1_ACTIVITY_HISTORY     |     1 |    16 |     2   (0)| 00:00:01 |     1 |     1 |
----------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   6 - filter("RC"."AMOUNT">0 AND ("RC"."EXPIRATION_DATE" IS NULL OR "RC"."EXPIRATION_DATE">TO_DATE(' 2014-10-19
              00:00:00', 'syyyy-mm-dd hh24:mi:ss')) AND "RC"."RECEIVER_CUSTOMER"="SUB"."CUSTOMER_ID")
   7 - access("RC"."SERVICE_RECEIVER_ID"="SUB"."SUBSCRIBER_NO")
   9 - access("RC"."SERVICE_RECEIVER_ID"="HIST"."ENTITY_ID" AND "HIST"."ACTIVITY_ID"='48')
  10 - filter("RC"."RECEIVER_CUSTOMER"="HIST"."CUSTOMER_ID")

看来任何细小对的问题都会导致很严重的问题,尤其是在数据量太大的情况下,错误的放大效应就会很明显。

原文发布于微信公众号 - 杨建荣的学习笔记(jianrong-notes)

原文发表时间:2015-01-15

本文参与腾讯云自媒体分享计划,欢迎正在阅读的你也加入,一起分享。

发表于

我来说两句

0 条评论
登录 后参与评论

相关文章

来自专栏杨建荣的学习笔记

生产环境sql语句调优实战第四篇(r2笔记41天)

生产中有一条sql语句消耗了大量的cpu资源,执行时间在18秒左右, Session:PRODBUSER (1560:61133)SQL ID:1hg2wcua...

2145
来自专栏杨建荣的学习笔记

关于oracle中的反连接(r3笔记第95天)

在之前的章节中见到讨论过oracle中的半连接 http://blog.itpub.net/23718752/viewspace-1334483/ 与半连接相对...

2237
来自专栏杨建荣的学习笔记

关于索引扫描的极速调优实战(第一篇) (r3笔记第81天)

一般在生产环境中,如果某个查询中涉及一个大表,走索引扫描是显然是最值得推荐的方式,但是索引扫描有unique index scan, range scan,sk...

2424
来自专栏数据和云

深入解析:半连接与反连接的原理和等价改写方法

半连接的原理及等价改写 1. 什么是半连接 当两张表进行关联,只返回匹配上的数据并且只会返回一张的表的数据,半连接一般就是指的在子查询中出现 IN 和 EXI...

3357
来自专栏乐沙弥的世界

当心外部连接中的ON子句

       在SQL tuning中,不良写法导致SQL执行效率比比皆是。最近的SQL tuning中一个外部连接写法不当导致过SQL执行时间超过15分钟左右...

1534
来自专栏数据和云

突破常识:SQL增加DISTINCT后查询效率反而提高

杨廷琨,网名 yangtingkun 云和恩墨技术总监,Oracle ACE Director,ACOUG 核心专家 只要增加了DISTINCT关键字,Orac...

3436
来自专栏杨建荣的学习笔记

一条简单的sql在11g和12c中的不同(r5笔记第2天)

今天在查看awr报告的时候,有一句很简单的sql语句引起了我的注意,因为它排在SQL Order by Reads的第2位。 Physical ReadsExe...

3454
来自专栏杨建荣的学习笔记

关于查询转换的一些简单分析(二) (r3笔记第68天)

关于查询转换的作用,在之前的一篇博文里面也大概说了视图合并和子查询嵌套的部分。http://blog.itpub.net/23718752/viewspace-...

28211
来自专栏乐沙弥的世界

SQL 基础--> ROLLUP与CUBE运算符实现数据汇总

--=============================================

1183
来自专栏杨建荣的学习笔记

sql_profile的使用(一) (r2笔记29天)

今天看了老熊关于sql_profile的讲解,受益匪浅,自己在本机也做了一通,感觉好记性不如烂笔头还是得多总结总测试才能真正理解。 准备的数据如下,创建两个表,...

2789

扫码关注云+社区

领取腾讯云代金券