在生产环境中有一条sql语句的性能极差,在早晨非高峰时段运行抽取数据,平均要花费40分钟,有时候竟然要跑10个多小时。
sql语句比较长,需要点耐心往下看。我对表的数据量都做了简单的说明。
首先拿到sql语句,一看到关联的表有一长串,而且都是很大的表。如果性能很差,可能和走全表扫描有关,还有可能和多表关联时,表的查取顺序也有一定的关系。
SELECT DISTINCT CA.L9_CONVERGENCE_CODE AS ATB2,
CU.CUST_SUB_TYPE AS ACCOUNT_TYPE,
CST.DESCRIPTION AS ACCOUNT_TYPE_DESC,
SS.PRIM_RESOURCE_VAL AS MSISDN,
CA.BAN AS BAN_KEY,
to_char(MO.MEMO_DATE, 'YYYYMMDD') AS MEMO_DATE,
CU.L9_IDENTIFICATION AS THAI_ID,
SS.SUBSCRIBER_NO AS SUBS_KEY,
SS.DEALER_CODE AS SHOP_CODE,
CD.DESCRIPTION AS SHOP_NAME,
MOT.SHORT_DESC,
REGEXP_SUBSTR(MO.ATTR1VALUE, '[^ ;]+', 1, 3) STAFF_ID,
MO.OPERATOR_ID AS USER_ID,
MO.MEMO_SYSTEM_TEXT,
CO2.SOC_NAME AS FIRST_SOCNAME,
CO3.SOC_NAME AS PREVIOUS_SOCNAME,
CO.SOC_NAME AS CURRENT_SOCNAME,
REGEXP_SUBSTR(MO.ATTR1VALUE, '[^ ; ]+', 1, 1) NAME,
CO.SOC_DESCRIPTION AS CURRENT_PP_DESC,
CO3.SOC_DESCRIPTION AS PREV_PP_DESC,
CO.SOC_CD AS SOC_CD,
(SELECT sum(BR.AMOUNT)
FROM BL1_RC_RATES BR, CUSTOMER CU, SUBSCRIBER SS
WHERE BR.SERVICE_RECEIVER_ID = SS.SUBSCRIBER_NO
AND BR.RECEIVER_CUSTOMER = SS.CUSTOMER_ID
AND BR.EFFECTIVE_DATE <= BR.EXPIRATION_DATE
AND((SS. SUB_STATUS <> 'C' and SS.
SUB_STATUS <> 'T' and BR.EXPIRATION_DATE is null)
OR (SS. SUB_STATUS = 'C' and
BR.EXPIRATION_DATE like SS.EFFECTIVE_DATE))
AND BR.PP_IND = 'Y'
AND BR.CYCLE_CODE = CU.BILL_CYCLE) AS PP_RATE,
CU.BILL_CYCLE AS CYCLE_CODE,
to_char(NVL(SS.L9_TMV_ACT_DATE, SS.INIT_ACT_DATE),'YYYYMMDD') AS ACTIVATED_DATE,
to_char(CD.EFFECTIVE_DATE, 'YYYYMMDD') AS SHOP_EFFECTIVE_DATE,
CD.EXPIRATION_DATE AS SHOP_EXPIRED_DATE,
CA.L9_COMPANY_CODE AS COMPANY_CODE
FROM SERVICE_DETAILS S, --大分区表,千万级数据量,存放着交易的明细信息
PRODUCT CO, --产品配置表,大概几万条左右
CSM_PAY_CHANNEL CPC, --账务相关表,百万级
ACCOUNT CA, --账务相关表,百万级
SUBSCRIBER SS, --用户相关表,百万级
CUSTOMER CU, --用户相关表,百万级
CUSTOMER_SUB_TYPE CST, --用户配置表,几千条数据
CSM_DEALER CD, --产品配置表,大概几千条左右
SERVICE_DETAILS S2,
PRODUCT CO2,
SERVICE_DETAILS S3,
PRODUCT CO3,
MEMO MO , --交易备注表,数据量过亿
MEMO_TYPE MOT, --配置表,数据量几千
LOGICAL_DATE LO, --时间配置表,数据量1千多
CHARGE_DETAILS CHD --交易表,数据量千万
WHERE SS.SUBSCRIBER_NO = CHD.AGREEMENT_NO
AND CPC.PYM_CHANNEL_NO = CHD.TARGET_PCN
AND CHD.CHG_SPLIT_TYPE = 'DR'
AND CHD.EXPIRATION_DATE IS NULL
AND S.SOC = CO.SOC_CD
AND CO.SOC_TYPE = 'P'
AND S.AGREEMENT_NO = SS.SUBSCRIBER_NO
AND SS.PRIM_RESOURCE_TP = 'C'
AND CPC.PAYMENT_CATEGORY = 'POST'
AND CA.BAN = CPC.BAN
AND (CA.L9_COMPANY_CODE = 'RF' OR CA.L9_COMPANY_CODE = 'RM' OR
CA.L9_COMPANY_CODE = 'TM')
AND SS.CUSTOMER_ID = CU.CUSTOMER_ID
AND CU.CUST_SUB_TYPE = CST.CUST_SUB_TYPE
AND CU.CUSTOMER_TYPE = CST.CUSTOMER_TYPE
AND SS.DEALER_CODE = CD.DEALER
AND S2.EFFECTIVE_DATE= (SELECT MAX(SA1.EFFECTIVE_DATE)
FROM SERVICE_DETAILS SA1, PRODUCT o1
WHERE SA1.AGREEMENT_NO = SS.SUBSCRIBER_NO
AND co.soc_cd = sa1.soc
and co.soc_type = 'P'
)
AND S2.AGREEMENT_NO = S.AGREEMENT_NO
AND S2.SOC = CO2.SOC_CD
AND CO2.SOC_TYPE = 'P'
AND S2.EFFECTIVE_DATE = (SELECT MIN(SA1.EFFECTIVE_DATE)
FROM SERVICE_DETAILS SA1, PRODUCT o1
WHERE SA1.AGREEMENT_NO = SS.SUBSCRIBER_NO
AND co2.soc_cd = sa1.soc
and co.soc_type = 'P'
)
AND S3.AGREEMENT_NO = S.AGREEMENT_NO
AND S3.SOC = CO3.SOC_CD
AND CO3.SOC_TYPE = 'P'
AND S3.EFFECTIVE_DATE =
(SELECT MAX(SA1.EFFECTIVE_DATE)
FROM SERVICE_DETAILS SA1, PRODUCT o1
WHERE SA1.AGREEMENT_NO = SS.SUBSCRIBER_NO
AND SA1.EFFECTIVE_DATE <
(SELECT MAX(SA1.EFFECTIVE_DATE)
FROM SERVICE_DETAILS SA1, PRODUCT o1
WHERE SA1.AGREEMENT_NO = SS.SUBSCRIBER_NO
and co3.soc_cd = sa1.soc
and co3.soc_type = 'P'
)
and co3.soc_cd = sa1.soc
and o1.soc_type = 'P'
)
AND MO.ENTITY_ID = SS.SUBSCRIBER_NO
AND MO.ENTITY_TYPE_ID = 6
AND MO.MEMO_TYPE_ID = MOT.MEMO_TYPE_ID
AND TRUNC(MO.SYS_CREATION_DATE) = (select TRUNC(LO.LOGICAL_DATE - 1) from lo)
TRUNC(LO.LOGICAL_DATE - 1)
AND LO.EXPIRATION_DATE IS NULL
AND LO.LOGICAL_DATE_TYPE = 'B'
AND LO.EXPIRATION_DATE IS NULL
AND (MOT.SHORT_DESC = 'BCN' OR MOT.SHORT_DESC = 'BCNM' OR
............
)
sql语句的执行计划如下,可以看到基本没有性能可言。在几个大表上都做了全表扫描,而且连接的消耗极大。庆幸的是这条sql语句貌似已经使用sql profile调优过了,性能才保持在40左右。
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 3445667740
----------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 551 | | 235K (1)| 00:47:03 | | |
| 1 | SORT AGGREGATE | | 1 | 56 | | | | | |
|* 2 | HASH JOIN | | 10G| 541G| 4840K| 305K (13)| 01:01:04 | | |
|* 3 | HASH JOIN | | 76218 | 3944K| 24M| 259K (1)| 00:51:59 | | |
| 4 | PARTITION RANGE ALL | | 601K| 17M| | 248K (1)| 00:49:38 | 1 | 11 |
|* 5 | TABLE ACCESS FULL | RC_RATES | 601K| 17M| | 248K (1)| 00:49:38 | 1 | 11 |
| 6 | TABLE ACCESS FULL | SUBSCRIBER | 1249K| 26M| | 8495 (1)| 00:01:42 | | |
| 7 | TABLE ACCESS FULL | CUSTOMER | 1226K| 3591K| | 7441 (1)| 00:01:30 | | |
| 8 | HASH UNIQUE | | 1 | 551 | | 235K (1)| 00:47:03 | | |
|* 9 | FILTER | | | | | | | | |
| 10 | NESTED LOOPS | | | | | | | | |
| 11 | NESTED LOOPS | | 1 | 551 | | 235K (1)| 00:47:03 | | |
| 12 | NESTED LOOPS | | 5 | 2485 | | 235K (1)| 00:47:03 | | |
| 13 | NESTED LOOPS | | 1 | 476 | | 235K (1)| 00:47:03 | | |
|* 14 | HASH JOIN | | 2 | 930 | | 235K (1)| 00:47:03 | | |
|* 15 | TABLE ACCESS FULL | LOGICAL_DATE | 1 | 18 | | 12 (0)| 00:00:01 | | |
| 16 | NESTED LOOPS | | 209 | 93423 | | 235K (1)| 00:47:02 | | |
| 17 | NESTED LOOPS | | 5 | 1690 | | 234K (1)| 00:46:59 | | |
| 18 | NESTED LOOPS | | 17 | 5406 | | 234K (1)| 00:46:59 | | |
| 19 | NESTED LOOPS | | 2 | 594 | | 234K (1)| 00:46:59 | | |
| 20 | NESTED LOOPS | | 3 | 858 | | 234K (1)| 00:46:59 | | |
| 21 | NESTED LOOPS | | 3 | 807 | | 234K (1)| 00:46:59 | | |
| 22 | NESTED LOOPS | | 8 | 1992 | | 234K (1)| 00:46:59 | | |
|* 23 | HASH JOIN | | 30 | 5850 | 187M| 234K (1)| 00:46:59 | | |
|* 24 | HASH JOIN | | 1059K| 175M| 33M| 119K (1)| 00:23:56 | | |
| 25 | VIEW | VW_SQ_1 | 1318K| 18M| | 86070 (1)| 00:17:13 | | |
| 26 | HASH GROUP BY | | 1318K| 37M| 50M| 86070 (1)| 00:17:13 | | |
|* 27 | HASH JOIN | | 1318K| 37M| | 75316 (1)| 00:15:04 | | |
|* 28 | MAT_VIEW ACCESS FULL | PRODUCT | 666 | 5994 | | 48 (0)| 00:00:01 | | |
| 29 | TABLE ACCESS FULL | SERVICE_DETAILS | 19M| 392M| | 75198 (1)| 00:15:03 | | |
|* 30 | HASH JOIN | | 1059K| 160M| | 23306 (1)| 00:04:40 | | |
| 31 | MAT_VIEW ACCESS FULL | CSM_DEALER | 16895 | 841K| | 40 (0)| 00:00:01 | | |
|* 32 | HASH JOIN | | 1099K| 113M| 71M| 23262 (1)| 00:04:40 | | |
|* 33 | HASH JOIN | | 1079K| 58M| | 7473 (1)| 00:01:30 | | |
| 34 | MAT_VIEW ACCESS FULL | CUSTOMER_SUB_TYPE | 59 | 1652 | | 3 (0)| 00:00:01 | | |
| 35 | TABLE ACCESS FULL | CUSTOMER | 1226K| 33M| | 7465 (1)| 00:01:30 | | |
|* 36 | TABLE ACCESS FULL | SUBSCRIBER | 1248K| 60M| | 8534 (1)| 00:01:43 | | |
| 37 | TABLE ACCESS FULL | SERVICE_DETAILS | 19M| 392M| | 75198 (1)| 00:15:03 | | |
|* 38 | MAT_VIEW ACCESS BY INDEX ROWID| PRODUCT | 1 | 54 | | 1 (0)| 00:00:01 | | |
|* 39 | INDEX RANGE SCAN | PRODUCT_1IX | 1 | | | 1 (0)| 00:00:01 | | |
|* 40 | TABLE ACCESS BY INDEX ROWID | CHARGE_DETAILS | 1 | 20 | | 1 (0)| 00:00:01 | | |
|* 41 | INDEX RANGE SCAN | CHARGE_DETAILS_1IX | 7 | | | 1 (0)| 00:00:01 | | |
|* 42 | TABLE ACCESS BY INDEX ROWID | CSM_PAY_CHANNEL | 1 | 17 | | 1 (0)| 00:00:01 | | |
|* 43 | INDEX UNIQUE SCAN | CSM_PAY_CHANNEL_PK | 1 | | | 1 (0)| 00:00:01 | | |
|* 44 | TABLE ACCESS BY INDEX ROWID | CSM_ACCOUNT | 1 | 11 | | 1 (0)| 00:00:01 | | |
|* 45 | INDEX UNIQUE SCAN | CSM_ACCOUNT_PK | 1 | | | 1 (0)| 00:00:01 | | |
| 46 | TABLE ACCESS BY INDEX ROWID | SERVICE_DETAILS | 8 | 168 | | 1 (0)| 00:00:01 | | |
|* 47 | INDEX RANGE SCAN | SERVICE_DETAILS_PK | 8 | | | 1 (0)| 00:00:01 | | |
|* 48 | MAT_VIEW ACCESS BY INDEX ROWID | PRODUCT | 1 | 20 | | 1 (0)| 00:00:01 | | |
|* 49 | INDEX RANGE SCAN | PRODUCT_1IX | 1 | | | 1 (0)| 00:00:01 | | |
| 50 | PARTITION RANGE ALL | | 42 | 4578 | | 57 (0)| 00:00:01 | 1 | 289 |
| 51 | TABLE ACCESS BY LOCAL INDEX ROWID | MEMO | 42 | 4578 | | 57 (0)| 00:00:01 | 1 | 289 |
|* 52 | INDEX RANGE SCAN | MEMO_1IX | 1 | | | 57 (0)| 00:00:01 | 1 | 289 |
|* 53 | MAT_VIEW ACCESS BY INDEX ROWID | MEMO_TYPE | 1 | 11 | | 1 (0)| 00:00:01 | | |
|* 54 | INDEX UNIQUE SCAN | MEMO_TYPE_PK | 1 | | | 1 (0)| 00:00:01 | | |
| 55 | TABLE ACCESS BY INDEX ROWID | SERVICE_DETAILS | 8 | 168 | | 1 (0)| 00:00:01 | | |
|* 56 | INDEX RANGE SCAN | SERVICE_DETAILS_PK | 8 | | | 1 (0)| 00:00:01 | | |
|* 57 | INDEX RANGE SCAN | PRODUCT_1IX | 1 | | | 1 (0)| 00:00:01 | | |
|* 58 | MAT_VIEW ACCESS BY INDEX ROWID | PRODUCT | 1 | 54 | | 1 (0)| 00:00:01 | | |
| 59 | SORT AGGREGATE | | 1 | 30 | | | | | |
| 60 | NESTED LOOPS | | | | | | | | |
| 61 | NESTED LOOPS | | 2 | 60 | | 4 (0)| 00:00:01 | | |
| 62 | TABLE ACCESS BY INDEX ROWID | SERVICE_DETAILS | 8 | 168 | | 1 (0)| 00:00:01 | | |
|* 63 | INDEX RANGE SCAN | SERVICE_DETAILS_PK | 8 | | | 1 (0)| 00:00:01 | | |
|* 64 | INDEX RANGE SCAN | PRODUCT_1IX | 1 | | | 1 (0)| 00:00:01 | | |
|* 65 | MAT_VIEW ACCESS BY INDEX ROWID | PRODUCT | 1 | 9 | | 1 (0)| 00:00:01 | | |
| 66 | SORT AGGREGATE | | 1 | 30 | | | | | |
|* 67 | FILTER | | | | | | | | |
| 68 | NESTED LOOPS | | | | | | | | |
| 69 | NESTED LOOPS | | 2 | 60 | | 4 (0)| 00:00:01 | | |
| 70 | TABLE ACCESS BY INDEX ROWID | SERVICE_DETAILS | 8 | 168 | | 1 (0)| 00:00:01 | | |
|* 71 | INDEX RANGE SCAN | SERVICE_DETAILS_PK | 8 | | | 1 (0)| 00:00:01 | | |
|* 72 | INDEX RANGE SCAN | PRODUCT_1IX | 1 | | | 1 (0)| 00:00:01 | | |
|* 73 | MAT_VIEW ACCESS BY INDEX ROWID | PRODUCT | 1 | 9 | | 1 (0)| 00:00:01 | | |
| 74 | SORT AGGREGATE | | 1 | 30 | | | | | |
| 75 | NESTED LOOPS | | | | | | | | |
| 76 | NESTED LOOPS | | 2 | 60 | | 4 (0)| 00:00:01 | | |
| 77 | TABLE ACCESS BY INDEX ROWID | SERVICE_DETAILS | 8 | 168 | | 1 (0)| 00:00:01 | | |
|* 78 | INDEX RANGE SCAN | SERVICE_DETAILS_PK | 8 | | | 1 (0)| 00:00:01 | | |
|* 79 | INDEX RANGE SCAN | PRODUCT_1IX | 1 | | | 1 (0)| 00:00:01 | | |
|* 80 | MAT_VIEW ACCESS BY INDEX ROWID | PRODUCT | 1 | 9 | | 1 (0)| 00:00:01 | | |
----------------------------------------------------------------------------------------------------------------------------------------------
Note
-----
- SQL profile "SYS_SQLPROF_0141a69ce4f40002" used for this statement
得到了执行计划,和数据的情况。下面需要做的工作有以下几步:
--查找性能瓶颈,
根据反馈,查取的数据其实并不错,可能在几千条以内的样子。但是有很多的查询条件过滤。
如果有些大表走了索引,但是join的消耗很大,很可能就是表的查询顺序不当导致的。
有些情况下使用全表扫描的代价要比使用索引要低。
像这个例子,排查后,logical_date表中虽然有上千条记录,但是实际上使用的只有一条记录。
memo这个表是最大的表,由上亿条记录,走了索引。但是join的效率很差,根据排查,memo这个表是这个查询的关键,需要根据时间来得到前一天的数据变化。
如果根据时间来过滤,可以过滤到绝大多数的数据。
上一条记录过滤后只剩下 74811 rows selected.
如果关联配置表memo_type查询的数据就会一下子减少到1713条左右,这是对于性能极大的提升和关键。
但是问题就来了,如果按照时间来查询,这个大表上没有和时间相关的字段,查询走全表扫描会很长,大概在5分钟左右。
--without parallel
74811 rows selected.
Elapsed: 00:03:23.10
这个时候如果只能走全表扫描,但是想使得速度能够提升,可以考虑并行,加入并行后,查询速度控制在了一分钟以内。
--add table mo1_memo_type, with parllel 8
1713 rows selected.
--加上配置表的过滤条件,查取的数据更少了,速度也有了提升。
Elapsed: 00:00:41.85
如果是以Memo表作为首发,表的执行计划就有了很大的不同,关联时间时,会不停的去和Logical_date表做关联,其实Logical_date表里只需要一条记录,查看执行计划却走了笛卡尔积连接。
--去除笛卡尔积连接 -去除笛卡尔积连接可以考虑采用with的句式,把数据先缓存起来,作为后续的查询,就避免了反复全表扫描的消耗。 --简化sql 可以看到sql语句中存在着很多重复的过滤条件,需要考虑在不改变业务的情况下保证语句的简单易读。 --减少/去除全表扫描 尝试减少或者去除全表扫描,保证效率。 --子查询最大程度过滤结果集 可以考虑使用一些尽可能过滤较多数据的子查询来提高效率。 --观察执行计划中表的查取顺序。 明天继续更新更多的细节。