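I'm finally asking my first question (though I've been a long-time lurker).
The other day an SQL query caught my attention. The issue is the performance of an IN clause when the WHERE clause compares an indexed column against a list of possible values.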
SELECT SUM (parts.quantity) AS quantity,
concessions.concessionCode,
concessions.description AS concessionDesc,
parts.type,
activities.activityCode,
REPLACE (activities.activityCode, activities.lvl2 || '-', '') AS activityCodeDisplay,
strings.activityDesc,
strings.activityDesc2,
strings.activityDesc3
FROM tb_parts parts,
tb_activities activities,
tb_strings strings,
tb_concessions concessions
WHERE parts.activityCode = activities.activityCode
AND parts.concessionCode = activities.concessionCode
AND activities.concessionCode = concessions.concessionCode
AND activities.concessionCode = strings.concessionCode
AND activities.activityCode = strings.activityCode
AND strings.language = 'ENG'
--AND parts.concessionCode IN ('ZD', 'G9', 'TR', 'JS0')
AND parts.concessionCode IN ('ZD', 'G9')
AND parts.date >= TO_DATE ('01/01/2013 00:00:00', 'DD/MM/YYYY HH24:MI:SS')
AND parts.date <= TO_DATE ('30/04/2013 23:59:59', 'DD/MM/YYYY HH24:MI:SS')
AND parts.type IN ('U', 'M')
AND parts.value = 'E'
GROUP BY concessions.concessionCode,
concessions.description,
parts.type,
activities.activityCode,
REPLACE (activities.activityCode, activities.lvl2|| '-', ''),
strings.activityDesc,
strings.activityDesc2,
strings.activityDesc3
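ORDER BY concessions.concessionCode;
The problem I've run into: if the query is run as is (IN with two values), it takes 30 s. If it is run with four values (as in the commented-out line), it takes 5 s. I expected that comparing an indexed column against more values would take more time, but that does not seem to be the case. I repeated the "test" several times over the course of a day and the timings were always more or less the same (30 ±1 s and 5 ±1 s).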
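Any insight into why this happens is much appreciated!
I translated the table/column names, so apologies for any inconsistencies.
I rewrote the query using joins and it is much faster, but the reason behind this anomaly still bugs me :)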
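A join-based rewrite of this kind could look like the sketch below. The actual rewritten statement is not shown in this post, so this is only an illustration: it moves the join conditions of the query above into ANSI JOIN ... ON clauses and keeps the same translated table and column names (which means names such as date would have to be replaced by the real column names before running it).

```sql
-- Sketch only: same tables, columns and filters as the original query,
-- with the join conditions written as explicit ANSI joins.
-- Column names are the translated ones from the post, not the real schema.
SELECT SUM(parts.quantity) AS quantity,
       concessions.concessionCode,
       concessions.description AS concessionDesc,
       parts.type,
       activities.activityCode,
       REPLACE(activities.activityCode, activities.lvl2 || '-', '') AS activityCodeDisplay,
       strings.activityDesc,
       strings.activityDesc2,
       strings.activityDesc3
  FROM tb_parts parts
  JOIN tb_activities activities
    ON parts.activityCode = activities.activityCode
   AND parts.concessionCode = activities.concessionCode
  JOIN tb_concessions concessions
    ON activities.concessionCode = concessions.concessionCode
  JOIN tb_strings strings
    ON activities.concessionCode = strings.concessionCode
   AND activities.activityCode = strings.activityCode
 WHERE strings.language = 'ENG'
   AND parts.concessionCode IN ('ZD', 'G9')
   AND parts.date >= TO_DATE('01/01/2013 00:00:00', 'DD/MM/YYYY HH24:MI:SS')
   AND parts.date <= TO_DATE('30/04/2013 23:59:59', 'DD/MM/YYYY HH24:MI:SS')
   AND parts.type IN ('U', 'M')
   AND parts.value = 'E'
 GROUP BY concessions.concessionCode,
          concessions.description,
          parts.type,
          activities.activityCode,
          REPLACE(activities.activityCode, activities.lvl2 || '-', ''),
          strings.activityDesc,
          strings.activityDesc2,
          strings.activityDesc3
 ORDER BY concessions.concessionCode;
```

Logically this is the same query, so any speed-up from such a rewrite comes from the optimizer choosing a different plan, not from the join syntax itself.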
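EDIT: Finally got it to work! After some fiddling I was able to produce execution plans for both versions, and even for a third version of the query (using 4 and 2 values in the IN clauses, which takes around 600 ms). Also, since there were some questions about the data in the tables, here is some information:
All statistics were gathered on the day the queries were executed.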
Table parts
total rows - 3.2 M
matches for 2 values - 1.08 M (~34%)
matches for 4 values - 1.30 M (~41%)
Table activities
total rows - 3866
matches for 2 values - 321 (~ 8%)
matches for 4 values - 644 (~16%)
Table strings
total rows - 7436
matches for 2 values - 642 (~ 8%)
matches for 4 values - 1288 (~17%)
Index in_parts
codConcession
username
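date
As for dynamic sampling, I don't think it makes a big difference (apart from an extra 2 to 3 s), provided I did it correctly, i.e. by placing the hint right after the SELECT keyword.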
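For readers who have not used it before, this is roughly how a dynamic sampling hint and an execution plan can be requested in Oracle. It is a minimal sketch: the sampling level 4 is an assumed value (the level used in the tests above is not stated), and the query is shortened to a single table for brevity.

```sql
-- Sketch: the hint comment must follow the SELECT keyword directly.
-- The level (4) is an assumption, not a value taken from this post.
EXPLAIN PLAN FOR
SELECT /*+ dynamic_sampling(4) */
       COUNT(*)
  FROM tb_parts parts
 WHERE parts.concessionCode IN ('ZD', 'G9');

-- Display the plan that was just explained, including the
-- "Predicate Information" section shown in the listings below.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

If the hint is misspelled or misplaced, Oracle silently treats it as an ordinary comment, so it is worth checking the Note section of the plan output, which normally mentions dynamic sampling when it was used.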
For two values:
-----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 186 | 864 (1)| 00:00:11 |
| 1 | SORT ORDER BY | | 1 | 186 | 864 (1)| 00:00:11 |
| 2 | HASH GROUP BY | | 1 | 186 | 864 (1)| 00:00:11 |
|* 3 | TABLE ACCESS BY INDEX ROWID | tb_parts | 1 | 37 | 818 (1)| 00:00:10 |
| 4 | NESTED LOOPS | | 1 | 186 | 862 (1)| 00:00:11 |
| 5 | NESTED LOOPS | | 1 | 149 | 44 (0)| 00:00:01 |
| 6 | NESTED LOOPS | | 34 | 2108 | 10 (0)| 00:00:01 |
| 7 | INLIST ITERATOR | | | | | |
| 8 | TABLE ACCESS BY INDEX ROWID| tb_concesions | 2 | 54 | 2 (0)| 00:00:01 |
|* 9 | INDEX UNIQUE SCAN | pk_concession | 2 | | 1 (0)| 00:00:01 |
| 10 | TABLE ACCESS BY INDEX ROWID | tb_activities | 17 | 595 | 4 (0)| 00:00:01 |
|* 11 | INDEX RANGE SCAN | pk_activity | 17 | | 2 (0)| 00:00:01 |
| 12 | TABLE ACCESS BY INDEX ROWID | tb_strings | 1 | 87 | 1 (0)| 00:00:01 |
|* 13 | INDEX UNIQUE SCAN | pk_string | 1 | | 0 (0)| 00:00:01 |
|* 14 | INDEX RANGE SCAN | in_parts | 454 | | 648 (1)| 00:00:08 |
-----------------------------------------------------------------------------------------------------
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("parts"."value"='E'
AND ("parts"."type"='M' OR "parts"."type"='U')
AND "parts"."activityCode"="activities"."activityCode")
9 - access("concessions"."concessionCode"='G9'
OR "concessions"."concessionCode"='ZD')
11 - access("activities"."concessionCode"="concessions"."concessionCode")
filter("activities"."concessionCode"='G9'
OR "activities"."concessionCode"='ZD')
13 - access("activities"."concessionCode"="strings"."concessionCode"
AND "activities"."activityCode"="strings"."activityCode"
AND "strings"."language"='ENG')
filter("strings"."concessionCode"='G9'
OR "strings"."concessionCode"='ZD')
14 - access("parts"."concessionCode"="activities"."concessionCode"
AND "parts"."date">=TO_DATE('2013-01-01 00:00:00',
'syyyy-mm-dd hh24:mi:ss')
AND "parts"."date"<=TO_DATE(' 2013-04-30 23:59:59',
'syyyy-mm-dd hh24:mi:ss'))
filter("parts"."date">=TO_DATE('2013-01-01 00:00:00',
'syyyy-mm-dd hh24:mi:ss')
AND ("parts"."concessionCode"='G9'
OR "parts"."concessionCode"='ZD')
AND "parts"."date"<=TO_DATE(' 2013-04-30 23:59:59',
'syyyy-mm-dd hh24:mi:ss'))
For four values:
----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 186 | 7412 (2)| 00:01:29 |
| 1 | SORT ORDER BY | | 1 | 186 | 7412 (2)| 00:01:29 |
| 2 | HASH GROUP BY | | 1 | 186 | 7412 (2)| 00:01:29 |
| 3 | NESTED LOOPS | | 1 | 186 | 7410 (2)| 00:01:29 |
|* 4 | HASH JOIN | | 17 | 1683 | 7393 (2)| 00:01:29 |
|* 5 | HASH JOIN | | 136 | 8432 | 21 (5)| 00:00:01 |
| 7 | TABLE ACCESS BY INDEX ROWID| tb_concesions | 4 | 108 | 2 (0)| 00:00:01 |
|* 8 | INDEX UNIQUE SCAN | pk_concession | 4 | | 1 (0)| 00:00:01 |
|* 9 | TABLE ACCESS FULL | tb_activities | 644 | 22540 | 18 (0)| 00:00:01 |
|* 10 | TABLE ACCESS FULL | tb_parts | 4310 | 155K| 7372 (2)| 00:01:29 |
| 11 | TABLE ACCESS BY INDEX ROWID | tb_strings | 1 | 87 | 1 (0)| 00:00:01 |
|* 12 | INDEX UNIQUE SCAN | pk_string | 1 | | 0 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("parts"."activityCode"="activities"."activityCode"
AND "parts"."concessionCode"="activities"."concessionCode")
5 - access("activities"."concessionCode"="concessions"."concessionCode")
8 - access("concessions"."concessionCode"='G9'
OR "concessions"."concessionCode"='JS0'
OR "concessions"."concessionCode"='TR'
OR "concessions"."concessionCode"='ZD')
9 - filter("activities"."concessionCode"='G9'
OR "activities"."concessionCode"='JS0'
OR "activities"."concessionCode"='TR'
OR "activities"."concessionCode"='ZD')
10 - filter("parts"."date">=TO_DATE(' 2013-01-01 00:00:00',
'syyyy-mm-dd hh24:mi:ss')
AND "parts"."value"='E'
AND ("parts"."type"='M' OR "parts"."type"='U')
AND ("parts"."concessionCode"='G9'
OR "parts"."concessionCode"='JS0'
OR "parts"."concessionCode"='TR'
OR "parts"."concessionCode"='ZD')
AND "parts"."date"<=TO_DATE(' 2013-04-30 23:59:59',
'syyyy-mm-dd hh24:mi:ss'))
12 - access("activities"."concessionCode"="strings"."concessionCode"
AND "activities"."activityCode"="strings"."activityCode"
AND "strings"."language"='ENG')
filter("strings"."concessionCode"='G9'
OR "strings"."concessionCode"='JS0'
OR "strings"."concessionCode"='TR'
OR "strings"."concessionCode"='ZD')
And finally, for six values:
----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 186 | 4525 (1)| 00:00:55 |
| 1 | SORT ORDER BY | | 1 | 186 | 4525 (1)| 00:00:55 |
| 2 | HASH GROUP BY | | 1 | 186 | 4525 (1)| 00:00:55 |
| 3 | NESTED LOOPS | | 1 | 186 | 4523 (1)| 00:00:55 |
|* 4 | HASH JOIN | | 9 | 891 | 4514 (1)| 00:00:55 |
|* 5 | HASH JOIN | | 136 | 8432 | 21 (5)| 00:00:01 |
| 6 | INLIST ITERATOR | | | | | |
| 7 | TABLE ACCESS BY INDEX ROWID| tb_concesions | 4 | 108 | 2 (0)| 00:00:01 |
|* 8 | INDEX UNIQUE SCAN | pk_concession | 4 | | 1 (0)| 00:00:01 |
|* 9 | TABLE ACCESS FULL | tb_activities | 644 | 22540 | 18 (0)| 00:00:01 |
| 10 | INLIST ITERATOR | | | | | |
|* 11 | TABLE ACCESS BY INDEX ROWID | tb_parts | 2155 | 79735 | 4493 (1)| 00:00:54 |
|* 12 | INDEX RANGE SCAN | in_parts | 8620 | | 1277 (1)| 00:00:16 |
| 13 | TABLE ACCESS BY INDEX ROWID | tb_strings | 1 | 87 | 1 (0)| 00:00:01 |
|* 14 | INDEX UNIQUE SCAN | pk_string | 1 | | 0 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("parts"."activityCode"="activities"."activityCode"
AND "parts"."concessionCode"="activities"."concessionCode")
5 - access("activities"."concessionCode"="concessions"."concessionCode")
8 - access("concessions"."concessionCode"='G9'
OR "concessions"."concessionCode"='JS0'
OR "concessions"."concessionCode"='TR'
OR "concessions"."concessionCode"='ZD')
9 - filter("activities"."concessionCode"='G9'
OR "activities"."concessionCode"='JS0'
OR "activities"."concessionCode"='TR'
OR "activities"."concessionCode"='ZD')
11 - filter("parts"."value"='E'
AND ("parts"."type"='M' OR "parts"."type"='U'))
12 - access(("parts"."concessionCode"='G9'
OR "parts"."concessionCode"='ZD')
AND "parts"."date">=TO_DATE(' 2013-01-01 00:00:00',
'syyyy-mm-dd hh24:mi:ss')
AND "parts"."date"<=TO_DATE(' 2013-04-30 23:59:59',
'syyyy-mm-dd hh24:mi:ss'))
filter("parts"."date">=TO_DATE(' 2013-01-01 00:00:00',
'syyyy-mm-dd hh24:mi:ss')
AND "parts"."date"<=TO_DATE(' 2013-04-30 23:59:59',
'syyyy-mm-dd hh24:mi:ss'))
14 - access("activities"."concessionCode"="strings"."concessionCode"
AND "activities"."activityCode"="strings"."activityCode"
AND "strings"."language"='ENG')
filter("strings"."concessionCode"='G9'
OR "strings"."concessionCode"='JS0'
OR "strings"."concessionCode"='TR'
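OR "strings"."concessionCode"='ZD')
Since this is my first encounter with execution plans, I can only guess at what causes the delay. Between the four-value and six-value versions, my guess is that it is the change from a full table scan to index access on tb_parts. Also, when accessing the table, the filter for four values (Id 10) contains all four concession values, whereas for six values the two concession values are in the access part and the filters contain only the date, type and value.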
Answered 2013-04-26 05:52:54
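Typically, anomalies like this arise because the query optimizer cannot predict costs accurately. The only way to know the true cost would be to actually run the statement several times with different execution plans. Instead, the optimizer uses statistics to estimate the cost, and sometimes the estimate is wrong.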
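One way to see where the estimates go wrong is to run the statement with runtime statistics enabled and compare the estimated row counts (E-Rows) with the actual ones (A-Rows). The sketch below uses a shortened single-table version of the query from the question; the column names are the translated ones and are assumptions about the real schema.

```sql
-- Sketch: collect row-source statistics for this one execution.
SELECT /*+ gather_plan_statistics */
       COUNT(*)
  FROM tb_parts parts
 WHERE parts.concessionCode IN ('ZD', 'G9')
   AND parts.type IN ('U', 'M')
   AND parts.value = 'E';

-- Show the plan of the last statement executed in this session,
-- with estimated (E-Rows) and actual (A-Rows) counts side by side.
-- In SQL*Plus this needs SET SERVEROUTPUT OFF beforehand.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));
```

Plan steps where E-Rows and A-Rows differ by orders of magnitude are usually the ones that led the optimizer to the wrong join method or access path.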
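When you compare the plans chosen for "two values" and "four values", you can see that the latter has a much higher estimated cost (7412 versus 864) and that the plans are completely different. The optimizer had a choice between these two execution plans and evidently decided that the first was better for two values and the second was better for four values. In reality, however, the second plan is better in both cases.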
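If you analyze such anomalies carefully, you often gain some insight, for example that certain combinations of values in your data are overestimated or underestimated. Using histograms in the statistics gives the optimizer more clues and lets it deal better with "skewed data", but its predictive power remains limited.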
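As an illustration, a histogram on the concession column of tb_parts could be gathered with DBMS_STATS roughly as follows. This is a sketch under assumptions: the table is taken to be in the current schema with an unquoted (upper-case) name, concessionCode is the translated column name from the question, and 254 buckets is simply a common maximum, not a tuned value.

```sql
-- Sketch: gather statistics for tb_parts with a histogram on the
-- (translated, assumed) column concessionCode.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname    => USER,
    tabname    => 'TB_PARTS',
    method_opt => 'FOR COLUMNS SIZE 254 concessionCode',
    cascade    => TRUE   -- also refresh index statistics
  );
END;
/

-- Check which columns now have a histogram.
SELECT column_name, histogram, num_buckets
  FROM user_tab_col_statistics
 WHERE table_name = 'TB_PARTS';
```

With about a third of the rows matching just two concession codes, the data looks skewed enough that a histogram on that column may change the row estimates, though it will not help with correlations between columns.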
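In practice, the solution is what you already did: rewrite the SQL until you get acceptable performance. Often "hints" (in Oracle) can also give the optimizer additional clues.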
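For example, hints could be used to push the two-value query toward the plan shape that the optimizer picked for four values (hash joins and a full scan of tb_parts). The following is a sketch only: the select list is shortened, the names are the translated ones from the question, and the optimizer is free to ignore hints it considers invalid.

```sql
-- Sketch: ask for the join order and join methods of the "four values" plan
-- while keeping the two-value IN list. Hints refer to the table aliases.
SELECT /*+ LEADING(concessions activities parts strings)
           USE_HASH(activities) USE_HASH(parts)
           FULL(parts)
           USE_NL(strings) */
       concessions.concessionCode,
       SUM(parts.quantity) AS quantity
  FROM tb_parts parts,
       tb_activities activities,
       tb_strings strings,
       tb_concessions concessions
 WHERE parts.activityCode = activities.activityCode
   AND parts.concessionCode = activities.concessionCode
   AND activities.concessionCode = concessions.concessionCode
   AND activities.concessionCode = strings.concessionCode
   AND activities.activityCode = strings.activityCode
   AND strings.language = 'ENG'
   AND parts.concessionCode IN ('ZD', 'G9')
   AND parts.type IN ('U', 'M')
   AND parts.value = 'E'
 GROUP BY concessions.concessionCode
 ORDER BY concessions.concessionCode;
```

Hints hard-code a plan shape, so they are best treated as a diagnostic tool or a last resort; if the data distribution changes, the hinted plan may become the slow one.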
https://stackoverflow.com/questions/16120485