I'm stuck on this problem and would welcome some fresh ideas :)
I have a table with billions of records, structured like this:
TAB_IX (int) (PK)
TAB_ID (int) (PK)
PR_ID (int) (PK)
SP_ID (int) (PK)(IX)
....
Until now I have been retrieving the data like this:
SELECT TAB_ID, COUNT (SP_ID) as HITS FROM table t
INNER JOIN table_sp s on t.SP_ID = s.ID
WHERE TAB_IX = @tab_inx
AND PR_ID IN (SELECT PR_ID FROM @pr_id)
AND s.NAME IN (SELECT DISTINCT NAME FROM @sp_names)
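GROUP BY TAB_ID
table_sp is a small table with about 10k records (ID (int) (PK), NAME (varchar) (IX)).
@pr_id and @sp_names are table variables with a single column each.
The query is very fast (about 2-3 seconds). Now I no longer want to distinguish between records that have different PR_ID values but the same TAB_IX, TAB_ID and SP_ID.
For example, records like these: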
TAB_IX - TAB_ID - PR_ID - SP_ID
1 - 700 - 1 - 100
1 - 700 - 2 - 100
should be counted as one.
The only way seems to be an additional GROUP BY, like this:
SELECT TAB_ID, COUNT(SP_ID) as HITS FROM (
SELECT t.TAB_ID, t.SP_ID, COUNT (t.PR_ID) AS PR_COUNT FROM table t
INNER JOIN table_sp s on t.SP_ID = s.ID
WHERE t.TAB_IX = @tab_inx
AND t.PR_ID in (select PR_ID from @pr_id)
AND s.NAME IN (SELECT DISTINCT NAME FROM @sp_names)
GROUP BY t.TAB_ID, t.SP_ID) AS DUMMY
GROUP BY TAB_ID
The problem is performance: adding this extra GROUP BY looks very expensive.
Do you have any ideas for improving the query?
Thanks in advance :)
Posted on 2013-02-20 17:45:05
I think specifying in the original query that you want to count DISTINCT SP_ID values will do the trick:
SELECT TAB_ID, COUNT (DISTINCT SP_ID) as HITS FROM table t
INNER JOIN table_sp s on t.SP_ID = s.ID
WHERE TAB_IX = @tab_inx
AND PR_ID IN (SELECT PR_ID FROM @pr_id)
AND s.NAME IN (SELECT DISTINCT NAME FROM @sp_names)
GROUP BY TAB_ID
https://stackoverflow.com/questions/14976567
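To check that COUNT(DISTINCT SP_ID) returns the same counts as the nested GROUP BY, here is a small sketch on toy data. It uses Python's built-in sqlite3 module rather than SQL Server, so it is only an illustration. The table name `tab`, the ordinary tables `pr_id` and `sp_names` (standing in for the table variables @pr_id and @sp_names), and the sample rows are all assumptions made for this example.

```python
import sqlite3

# In-memory SQLite database; an assumption for illustration, not SQL Server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tab (TAB_IX INT, TAB_ID INT, PR_ID INT, SP_ID INT,
                  PRIMARY KEY (TAB_IX, TAB_ID, PR_ID, SP_ID));
CREATE TABLE table_sp (ID INT PRIMARY KEY, NAME TEXT);
CREATE TABLE pr_id (PR_ID INT);      -- stands in for @pr_id
CREATE TABLE sp_names (NAME TEXT);   -- stands in for @sp_names
-- The first two rows differ only in PR_ID and should count as one hit.
INSERT INTO tab VALUES (1, 700, 1, 100), (1, 700, 2, 100),
                       (1, 700, 1, 101), (1, 701, 2, 100);
INSERT INTO table_sp VALUES (100, 'a'), (101, 'b');
INSERT INTO pr_id VALUES (1), (2);
INSERT INTO sp_names VALUES ('a'), ('b');
""")

# Shared FROM/JOIN/WHERE part, mirroring the queries in the question.
FILTER = """
FROM tab t
INNER JOIN table_sp s ON t.SP_ID = s.ID
WHERE t.TAB_IX = ?
  AND t.PR_ID IN (SELECT PR_ID FROM pr_id)
  AND s.NAME IN (SELECT NAME FROM sp_names)
"""

# Original query: counts every matching row, including PR_ID duplicates.
plain = f"SELECT t.TAB_ID, COUNT(t.SP_ID) AS HITS {FILTER} GROUP BY t.TAB_ID ORDER BY t.TAB_ID"

# Suggested query: counts each SP_ID once per TAB_ID.
distinct = f"SELECT t.TAB_ID, COUNT(DISTINCT t.SP_ID) AS HITS {FILTER} GROUP BY t.TAB_ID ORDER BY t.TAB_ID"

# Nested GROUP BY from the question, for comparison.
nested = f"""
SELECT TAB_ID, COUNT(SP_ID) AS HITS FROM (
  SELECT t.TAB_ID, t.SP_ID {FILTER} GROUP BY t.TAB_ID, t.SP_ID
) AS DUMMY GROUP BY TAB_ID ORDER BY TAB_ID
"""

plain_rows = conn.execute(plain, (1,)).fetchall()
distinct_rows = conn.execute(distinct, (1,)).fetchall()
nested_rows = conn.execute(nested, (1,)).fetchall()

print(plain_rows)     # [(700, 3), (701, 1)]
print(distinct_rows)  # [(700, 2), (701, 1)]
print(nested_rows)    # [(700, 2), (701, 1)]
```

This only shows that the two forms agree on the result. Whether COUNT(DISTINCT ...) is actually faster than the nested GROUP BY on billions of rows depends on the plan the optimizer picks, so it is worth comparing the execution plans on the real table.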