我对SQL Server中的公用表表达式有一个性能问题。在我们的开发团队中,我们在构建查询时使用了大量的链式CTE。我目前正在处理一个性能很差的查询。但我发现,如果我在链的中间插入一个临时表中该CTE之前的所有记录,然后继续,但从该临时表中进行选择,则显着提高了性能。现在,我希望获得一些帮助,以了解这种类型的更改是否仅适用于此特定查询,以及为什么您将在下面看到的两种情况在性能上存在如此大的差异。或者我们可能在我们的团队中过度使用CTE,我们是否可以通过从这个案例中学习来获得总体性能?
请试着向我解释这里到底发生了什么。
代码已经完成,您将能够在SQL Server2008甚至2005上运行它。其中一部分被注释掉了,我的想法是你可以通过注释掉一个或另一个来切换这两个案例。您可以查看放置块注释的位置,我已经用--block comment here
和--end block comment here
标记了这些位置
性能较慢的情况是未注释的默认情况。您的位置如下:
--Declare tables to use in example.
CREATE TABLE #Preparation
(
Date DATETIME NOT NULL
,Hour INT NOT NULL
,Sales NUMERIC(9,2)
,Items INT
);
CREATE TABLE #Calendar
(
Date DATETIME NOT NULL
)
CREATE TABLE #OpenHours
(
Day INT NOT NULL,
OpenFrom TIME NOT NULL,
OpenTo TIME NOT NULL
);
--Fill tables with sample data.
INSERT INTO #OpenHours (Day, OpenFrom, OpenTo)
VALUES
(1, '10:00', '20:00'),
(2, '10:00', '20:00'),
(3, '10:00', '20:00'),
(4, '10:00', '20:00'),
(5, '10:00', '20:00'),
(6, '10:00', '20:00'),
(7, '10:00', '20:00')
DECLARE @CounterDay INT = 0, @CounterHour INT = 0, @Sales NUMERIC(9, 2), @Items INT;
WHILE @CounterDay < 365
BEGIN
SET @CounterHour = 0;
WHILE @CounterHour < 5
BEGIN
SET @Items = CAST(RAND() * 100 AS INT);
SET @Sales = CAST(RAND() * 1000 AS NUMERIC(9, 2));
IF @Items % 2 = 0
BEGIN
SET @Items = NULL;
SET @Sales = NULL;
END
INSERT INTO #Preparation (Date, Hour, Items, Sales)
VALUES (DATEADD(DAY, @CounterDay, '2011-01-01'), @CounterHour + 13, @Items, @Sales);
SET @CounterHour += 1;
END
INSERT INTO #Calendar (Date) VALUES (DATEADD(DAY, @CounterDay, '2011-01-01'));
SET @CounterDay += 1;
END
--Here the query starts.
;WITH P AS (
SELECT DATEADD(HOUR, Hour, Date) AS Hour
,Sales
,Items
FROM #Preparation
),
O AS (
SELECT DISTINCT DATEADD(HOUR, SV.number, C.Date) AS Hour
FROM #OpenHours AS O
JOIN #Calendar AS C ON O.Day = DATEPART(WEEKDAY, C.Date)
JOIN master.dbo.spt_values AS SV ON SV.number BETWEEN DATEPART(HOUR, O.OpenFrom) AND DATEPART(HOUR, O.OpenTo)
),
S AS (
SELECT O.Hour, P.Sales, P.Items
FROM O
LEFT JOIN P ON P.Hour = O.Hour
)
--block comment here case 1 (slow performing)
--With this technique it takes about 34 seconds.
,N AS (
SELECT
A.Hour
,A.Sales AS SalesOrg
,CASE WHEN COALESCE(B.Sales, C.Sales, 1) < 0
THEN 0 ELSE COALESCE(B.Sales, C.Sales, 1) END AS Sales
,A.Items AS ItemsOrg
,COALESCE(B.Items, C.Items, 1) AS Items
FROM S AS A
OUTER APPLY (SELECT TOP 1 *
FROM S
WHERE Hour <= A.Hour
AND Sales IS NOT NULL
AND DATEDIFF(DAY, Hour, A.Hour) = 0
ORDER BY Hour DESC) B
OUTER APPLY (SELECT TOP 1 *
FROM S
WHERE Sales IS NOT NULL
AND DATEDIFF(DAY, Hour, A.Hour) = 0
ORDER BY Hour) C
)
--end block comment here case 1 (slow performing)
/*--block comment here case 2 (fast performing)
--With this technique it takes about 2 seconds.
SELECT * INTO #tmpS FROM S;
WITH
N AS (
SELECT
A.Hour
,A.Sales AS SalesOrg
,CASE WHEN COALESCE(B.Sales, C.Sales, 1) < 0
THEN 0 ELSE COALESCE(B.Sales, C.Sales, 1) END AS Sales
,A.Items AS ItemsOrg
,COALESCE(B.Items, C.Items, 1) AS Items
FROM #tmpS AS A
OUTER APPLY (SELECT TOP 1 *
FROM #tmpS
WHERE Hour <= A.Hour
AND Sales IS NOT NULL
AND DATEDIFF(DAY, Hour, A.Hour) = 0
ORDER BY Hour DESC) B
OUTER APPLY (SELECT TOP 1 *
FROM #tmpS
WHERE Sales IS NOT NULL
AND DATEDIFF(DAY, Hour, A.Hour) = 0
ORDER BY Hour) C
)
--end block comment here case 2 (fast performing)*/
SELECT * FROM N ORDER BY Hour
IF OBJECT_ID('tempdb..#tmpS') IS NOT NULL DROP TABLE #tmpS;
DROP TABLE #Preparation;
DROP TABLE #Calendar;
DROP TABLE #OpenHours;
如果你想试着理解我在最后一步做了什么,我有一个关于它的SO问题,here。
对我来说,情况1大约需要34秒,情况2大约需要2秒。不同之处在于,在第二种情况下,我将S的结果存储在一个临时表中,在第一种情况下,我直接在下一个CTE中使用S。
https://stackoverflow.com/questions/10751875
复制相似问题