我正在尝试修复股票交易数据的大型数据库中的一些错误。一列(数量)具有每个节拍的交易量,其他列存储累积成交量(即当天前几个节拍的总和)。这第二列在某些情况下是错误的(不是很多,所以我们可以放心地假设对相邻的记号是错误的)。因此,理论上修复很简单:只需搜索累积音量减少的节拍(这就足够了),然后从最后一个节拍中选取累积音量,并将当前节拍的数量相加。问题是,我一直在尝试在oracle中使用一个查询来完成这项工作,但由于缺乏sql方面的专业知识,我一直在苦苦挣扎。这是我到目前为止所得到的:
update
(
select m.cumulative_volume, q.cum_volume_ant, q.quantity from
market_data_intraday_trades m
join
(
select * from
(select
product_key,
sequence_number,
lead(product_key) over (order by product_key, sequence_number) as product_key_ant,
to_char(trade_date_time, 'yyyymmdd') as fecha,
to_char(lag(trade_date_time) over (order by product_key, sequence_number), 'yyyymmdd') as fecha_ant,
cumulative_volume,
lead(cumulative_volume) over (order by product_key, sequence_number) as cum_volume_ant,
cumulative_volume - lead(cumulative_volume) over (order by product_key, sequence_number) as dif
from market_data_intraday_trades)
where product_key = product_key_ant
and fecha = fecha_ant
and dif < 0
and rownum < 10
) q
on m.sequence_number = q.sequence_number
)
set m.cumulative_volume = q.cum_volume_ant + q.quantity目前的问题是,我似乎不能在外部计算中使用内部查询中的数量。
也许使用时态表或pl/sql或游标会更清楚和/或更容易,但由于公司策略,我没有权限这样做,只有选择和更新。
如果您能为我指出解决这个问题的方向,我将不胜感激。
提前感谢!
PS。Fecha在西班牙语中是日期,以防万一:)
发布于 2010-06-28 20:14:47
只是在APC的答案中添加了一个性能比较:
SQL> update market_data_intraday_trades m
2 set m.cumulative_volume =
3 ( select inq.running_total
4 from (
5 select sum (quantity) over (partition by product_key
6 order by sequence_number) as running_total
7 , cumulative_volume
8 , rowid as row_id
9 from market_data_intraday_trades
10 ) inq
11 where m.rowid = inq.row_id
12 )
13 /
6 rows updated.
SQL> select * from table(dbms_xplan.display_cursor(null,null,'iostats last'))
2 /
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 4mgw11769k00r, child number 0
-------------------------------------
update market_data_intraday_trades m set m.cumulative_volume = ( select inq.running_total
from ( select sum (quantity) over (partition by product_key
order by sequence_number) as running_total
, cumulative_volume , rowid as row_id from
market_data_intraday_trades ) inq where m.rowid = inq.row_id )
Plan hash value: 3204855846
--------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
--------------------------------------------------------------------------------------------------------------
| 1 | UPDATE | MARKET_DATA_INTRADAY_TRADES | 1 | | 0 |00:00:00.01 | 35 |
| 2 | TABLE ACCESS FULL | MARKET_DATA_INTRADAY_TRADES | 1 | 6 | 6 |00:00:00.01 | 3 |
|* 3 | VIEW | | 6 | 6 | 6 |00:00:00.01 | 18 |
| 4 | WINDOW SORT | | 6 | 6 | 36 |00:00:00.01 | 18 |
| 5 | TABLE ACCESS FULL| MARKET_DATA_INTRADAY_TRADES | 6 | 6 | 36 |00:00:00.01 | 18 |
--------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("INQ"."ROW_ID"=:B1)
25 rows selected.看看这36个,那是O(N^2)。
SQL> update market_data_intraday_trades m
2 set m.cumulative_volume =
3 ( select inq.running_total
4 from (
5 select sum (quantity) over (partition by product_key
6 order by sequence_number) as running_total
7 , cumulative_volume
8 , rowid as row_id
9 from market_data_intraday_trades
10 ) inq
11 where m.rowid = inq.row_id
12 )
13 where m.rowid in
14 ( select inq.row_id
15 from (
16 select sum (quantity) over (partition by product_key
17 order by sequence_number) as running_total
18 , cumulative_volume
19 , rowid as row_id
20 from market_data_intraday_trades
21 ) inq
22 where m.cumulative_volume != running_total
23 )
24
SQL> /
1 row updated.
SQL> select * from table(dbms_xplan.display_cursor(null,null,'iostats last'))
2 /
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 8fg3vnav1t742, child number 0
-------------------------------------
update market_data_intraday_trades m set m.cumulative_volume = ( select inq.running_total
from ( select sum (quantity) over (partition by product_key
order by sequence_number) as running_total ,
cumulative_volume , rowid as row_id from
market_data_intraday_trades ) inq where m.rowid = inq.row_id )
where m.rowid in ( select inq.row_id from ( select sum (quantity)
over (partition by product_key order by
sequence_number) as running_total , cumulative_volume
, rowid as row_id from market_data_intraday_trades ) inq
where m.cumulative_volume != running_total )
Plan hash value: 1087408236
---------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
---------------------------------------------------------------------------------------------------------------
| 1 | UPDATE | MARKET_DATA_INTRADAY_TRADES | 1 | | 0 |00:00:00.01 | 14 |
|* 2 | HASH JOIN SEMI | | 1 | 5 | 1 |00:00:00.01 | 6 |
| 3 | TABLE ACCESS FULL | MARKET_DATA_INTRADAY_TRADES | 1 | 6 | 6 |00:00:00.01 | 3 |
| 4 | VIEW | | 1 | 6 | 6 |00:00:00.01 | 3 |
| 5 | WINDOW SORT | | 1 | 6 | 6 |00:00:00.01 | 3 |
| 6 | TABLE ACCESS FULL| MARKET_DATA_INTRADAY_TRADES | 1 | 6 | 6 |00:00:00.01 | 3 |
|* 7 | VIEW | | 1 | 6 | 1 |00:00:00.01 | 4 |
| 8 | WINDOW SORT | | 1 | 6 | 6 |00:00:00.01 | 4 |
| 9 | TABLE ACCESS FULL | MARKET_DATA_INTRADAY_TRADES | 1 | 6 | 6 |00:00:00.01 | 4 |
---------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("M".ROWID="INQ"."ROW_ID")
filter("M"."CUMULATIVE_VOLUME"<>"RUNNING_TOTAL")
7 - filter("INQ"."ROW_ID"=:B1)
36 rows selected.这就好多了。
SQL> merge into market_data_intraday_trades mdit1
2 using ( select product_key
3 , sequence_number
4 , running_total
5 from ( select product_key
6 , sequence_number
7 , cumulative_volume
8 , sum(quantity) over (partition by product_key order by sequence_number) as running_total
9 from market_data_intraday_trades
10 )
11 where cumulative_volume != running_total
12 ) mdit2
13 on ( mdit1.product_key = mdit2.product_key
14 and mdit1.sequence_number = mdit2.sequence_number
15 )
16 when matched then
17 update set mdit1.cumulative_volume = mdit2.running_total
18 /
1 row merged.
SQL> select * from table(dbms_xplan.display_cursor(null,null,'iostats last'))
2 /
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------
SQL_ID cjafdk3jg4gzz, child number 0
-------------------------------------
merge into market_data_intraday_trades mdit1 using ( select product_key , sequence_number
, running_total from ( select product_key , sequence_number
, cumulative_volume , sum(quantity) over (partition by
product_key order by sequence_number) as running_total from
market_data_intraday_trades ) where cumulative_volume != running_total )
mdit2 on ( mdit1.product_key = mdit2.product_key and mdit1.sequence_number =
mdit2.sequence_number ) when matched then update set mdit1.cumulative_volume =
mdit2.running_total
Plan hash value: 2367693855
----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
----------------------------------------------------------------------------------------------------------------
| 1 | MERGE | MARKET_DATA_INTRADAY_TRADES | 1 | | 1 |00:00:00.01 | 9 |
| 2 | VIEW | | 1 | | 1 |00:00:00.01 | 6 |
|* 3 | HASH JOIN | | 1 | 6 | 1 |00:00:00.01 | 6 |
|* 4 | VIEW | | 1 | 6 | 1 |00:00:00.01 | 3 |
| 5 | WINDOW SORT | | 1 | 6 | 6 |00:00:00.01 | 3 |
| 6 | TABLE ACCESS FULL| MARKET_DATA_INTRADAY_TRADES | 1 | 6 | 6 |00:00:00.01 | 3 |
| 7 | TABLE ACCESS FULL | MARKET_DATA_INTRADAY_TRADES | 1 | 6 | 6 |00:00:00.01 | 3 |
----------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("MDIT1"."PRODUCT_KEY"="PRODUCT_KEY" AND "MDIT1"."SEQUENCE_NUMBER"="SEQUENCE_NUMBER")
4 - filter("CUMULATIVE_VOLUME"<>"RUNNING_TOTAL")
31 rows selected.但是merge用更少的一次表扫描击败了它们。
问候你,罗伯。
https://stackoverflow.com/questions/3131657
复制相似问题