首页
学习
活动
专区
圈层
工具
发布
首页
学习
活动
专区
圈层
工具
MCP广场
社区首页 >问答首页 >查找表中的分区列值没有分区消除吗?

查找表中的分区列值没有分区消除吗?
EN

Database Administration用户
提问于 2019-07-10 20:11:55
回答 3查看 476关注 0票数 7

我创建了一个分区表(如下所示),并播种了4.8亿行--每个帐户大约181行。

在添加索引之前,我正在运行基线查询。我惊讶地发现,即使在添加option(recompile)之后,在分区列上执行日期查找也不会导致分区消除。分区表就是这样吗?在我看来,这更像是现实生活,而不是硬编码谓词的分区列值。

最后,如果我对此有疑问,我将添加索引(Es)并在这里发帖。我不想继续,直到我对在这篇文章中给出的答案感到满意。

代码语言:javascript
运行
复制
    --step 2 (after creating db)
    ALTER DATABASE partitionresearch
    ADD FILEGROUP January
    GO
    ALTER DATABASE partitionresearch
    ADD FILEGROUP February
    GO
    ALTER DATABASE partitionresearch
    ADD FILEGROUP March
    GO
    ALTER DATABASE partitionresearch
    ADD FILEGROUP April
    GO
    ALTER DATABASE partitionresearch
    ADD FILEGROUP May
    GO
    ALTER DATABASE partitionresearch
    ADD FILEGROUP June
    GO
    ALTER DATABASE partitionresearch
    ADD FILEGROUP July
    GO
    ALTER DATABASE partitionresearch
    ADD FILEGROUP August
    GO
    ALTER DATABASE partitionresearch
    ADD FILEGROUP September
    GO
    ALTER DATABASE partitionresearch
    ADD FILEGROUP October
    GO
    ALTER DATABASE partitionresearch
    ADD FILEGROUP November
    GO
    ALTER DATABASE partitionresearch
    ADD FILEGROUP December
    GO
    --step 3
    -- Table Partitioning in SQL Server
        ALTER DATABASE [Partitionresearch]
        ADD FILE 
        (
        NAME = [PartJan],
        FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL14.mycompany2\MSSQL\DATA\PartJan.ndf',
            SIZE = 5080 KB, 
            MAXSIZE = UNLIMITED, 
            FILEGROWTH = 2040 KB
        ) TO FILEGROUP [January]

        ALTER DATABASE [Partitionresearch]
        ADD FILE 
        (
        NAME = [PartFeb],
        FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL14.mycompany2\MSSQL\DATA\PartFeb.ndf',
            SIZE = 5080 KB, 
            MAXSIZE = UNLIMITED, 
            FILEGROWTH = 2040 KB
        ) TO FILEGROUP [February]

        ALTER DATABASE [Partitionresearch]
        ADD FILE 
        (
        NAME = [PartMar],
        FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL14.mycompany2\MSSQL\DATA\PartMar.ndf',
            SIZE = 5080 KB, 
            MAXSIZE = UNLIMITED, 
            FILEGROWTH = 2040 KB
        ) TO FILEGROUP [March]

        ALTER DATABASE [Partitionresearch]
        ADD FILE 
        (
        NAME = [PartApr],
        FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL14.mycompany2\MSSQL\DATA\PartApr.ndf',
            SIZE = 5080 KB, 
            MAXSIZE = UNLIMITED, 
            FILEGROWTH = 2040 KB
        ) TO FILEGROUP [April]

        ALTER DATABASE [Partitionresearch]
        ADD FILE 
        (
        NAME = [PartMay],
        FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL14.mycompany2\MSSQL\DATA\PartMay.ndf',
            SIZE = 5080 KB, 
            MAXSIZE = UNLIMITED, 
            FILEGROWTH = 2040 KB
        ) TO FILEGROUP [May]

        ALTER DATABASE [Partitionresearch]
        ADD FILE 
        (
        NAME = [PartJun],
        FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL14.mycompany2\MSSQL\DATA\PartJun.ndf',
            SIZE = 5080 KB, 
            MAXSIZE = UNLIMITED, 
            FILEGROWTH = 2040 KB
        ) TO FILEGROUP [June]

        ALTER DATABASE [Partitionresearch]
        ADD FILE 
        (
        NAME = [PartJul],
        FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL14.mycompany2\MSSQL\DATA\PartJul.ndf',
            SIZE = 5080 KB, 
            MAXSIZE = UNLIMITED, 
            FILEGROWTH = 2040 KB
        ) TO FILEGROUP [July]

        ALTER DATABASE [Partitionresearch]
        ADD FILE 
        (
        NAME = [PartAug],
        FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL14.mycompany2\MSSQL\DATA\PartAug.ndf',
            SIZE = 5080 KB, 
            MAXSIZE = UNLIMITED, 
            FILEGROWTH = 2040 KB
        ) TO FILEGROUP [August]

        ALTER DATABASE [Partitionresearch]
        ADD FILE 
        (
        NAME = [PartSep],
        FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL14.mycompany2\MSSQL\DATA\PartSep.ndf',
            SIZE = 5080 KB, 
            MAXSIZE = UNLIMITED, 
            FILEGROWTH = 2040 KB
        ) TO FILEGROUP [September]

        ALTER DATABASE [Partitionresearch]
        ADD FILE 
        (
        NAME = [PartOct],
        FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL14.mycompany2\MSSQL\DATA\PartOct.ndf',
            SIZE = 5080 KB, 
            MAXSIZE = UNLIMITED, 
            FILEGROWTH = 2040 KB
        ) TO FILEGROUP [October]

        ALTER DATABASE [Partitionresearch]
        ADD FILE 
        (
        NAME = [PartNov],
        FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL14.mycompany2\MSSQL\DATA\PartNov.ndf',
            SIZE = 5080 KB, 
            MAXSIZE = UNLIMITED, 
            FILEGROWTH = 2040 KB
        ) TO FILEGROUP [November]

        ALTER DATABASE [Partitionresearch]
        ADD FILE 
        (
        NAME = [PartDec],
        FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL14.mycompany2\MSSQL\DATA\PartDec.ndf',
            SIZE = 5080 KB, 
            MAXSIZE = UNLIMITED, 
            FILEGROWTH = 2040 KB
        ) TO FILEGROUP [December]

    --step 4
    -- Table Partitioning in SQL Server
    USE Partitionresearch
    GO

    CREATE PARTITION FUNCTION [MonthlyPartition] (date)
    AS RANGE RIGHT FOR VALUES ('20190201', '20190301', '20190401',
                   '20190501', '20190601', '20190701', '20190801', 
                   '20190901', '20191001', '20191101', '20191201');

    --step 5
    -- Table Partitioning in SQL Server
    USE Partitionresearch
    GO

    CREATE PARTITION SCHEME MonthWisePartition
    AS PARTITION MonthlyPartition
            TO (January, February, March, April, May, June, July, 
                August, September, October, November, December
                );
    --step 6
    create table dbo.partitionresearch 
    (
    tranid int identity(1,1),
    [Date] date,

    Account int,
    SeqNumber tinyint,
    AlertType int,
    IsFirst tinyint,
    Indicator1 int,
    [time] time
    )
    on monthwisepartition([date])

    --with partitioning help - 40 seconds (as opposed to 3 min 46 sec) , hovered over table scan and didnt see partition count, but clearly partitions (elimination) were used
    --did see scalar operators with values 5 and 10 which happens to be where these accounts are partition wise (may and october)
    use partitionresearch
    select * from dbo.partitionresearch --hoverd over and closest thing to partn help i saw were scalar operators 5 and 10
    where (date between '5/1/2019' and '5/31/2019' or date between '10/1/2019' and '10/31/2019') and
          account in (1000000,2000000) 
    ------------------------------------------------------------------------------------------------------------------------
    --with "partition help" from a lookup table--3 minutes 33 seconds
    use partitionresearch
    select a.* from dbo.partitionresearch a--hovered over and believe partns wont be used
    join [dbo].[monthlookup] b
    on a.date=b.date
    where account in (1000000,2000000) 
    ------------------------------------------------------------------------------------------------------------------------
--this is the date lookup table which isnt partitioned, thus not aligned
USE [partitionresearch]
GO

/****** Object:  Table [dbo].[monthlookup]    Script Date: 7/12/2019 6:24:35 PM ******/
SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

CREATE TABLE [dbo].[monthlookup](
    [monthid] [int] IDENTITY(1,1) NOT NULL,
    [Date] [date] NOT NULL
) ON [PRIMARY]
GO
EN

回答 3

Database Administration用户

回答已采纳

发布于 2019-07-10 21:17:50

这在产品中不能用于行存储分区堆。如果您将表更改为具有分区的集群列存储索引,则有时可以通过位图筛选器消除行组,从而消除分区,这似乎就是您所追求的。

我在博客上提到了这个这里。引用一小部分:

我们知道,基于维度表中的数据,Server只需要从事实表中读取两个分区。理论上,查询优化器能比它做得更好吗?考虑到分区表最多有15000个分区这一事实。所有分区值都不能重叠,如果没有DDL操作,它们不会更改。在构建哈希表时,查询优化器可以跟踪哪些分区中至少有一行。在散列构建结束时,我们将确切地知道哪些分区可以包含数据,因此在探测阶段可以跳过其余的分区。这可能没有实现,因为散列构建独立于探测是很重要的。也许无法保证位图运算符在正确的时间被推送到扫描,而不是重新划分流运算符。也许这不是一个常见的情况,优化也不值得付出这么大的努力。毕竟,您多长时间加入分区列,而不是通过它进行筛选?

票数 7
EN

Database Administration用户

发布于 2019-08-25 06:36:45

为了完整起见,您可以获得动态分区消除,但只有当联接类型是具有相关参数的嵌套循环时才能得到。

例如,使用提供的分区函数和方案:

代码语言:javascript
运行
复制
CREATE PARTITION FUNCTION MonthlyPartition ([date])
AS RANGE RIGHT FOR VALUES
(
    '20190201', '20190301', '20190401',
    '20190501', '20190601', '20190701', '20190801', 
    '20190901', '20191001', '20191101', '20191201'
);

CREATE PARTITION SCHEME MonthWisePartition
AS PARTITION MonthlyPartition ALL TO ([PRIMARY]);

和表格:

代码语言:javascript
运行
复制
CREATE TABLE dbo.PartitionResearch 
(
    tranid integer identity(1,1) NOT NULL,
    [Date] date NULL,
    Account integer NULL,
    SeqNumber tinyint NULL,
    AlertType integer NULL,
    IsFirst tinyint NULL,
    Indicator1 integer NULL,
    [time] time NULL
)
ON MonthWisePartition([Date]);

CREATE TABLE dbo.MonthLookup
(
    [MonthId] integer IDENTITY(1,1) NOT NULL,
    [Date] date NOT NULL
)
ON [PRIMARY];

没有使用分区消除的查询:

代码语言:javascript
运行
复制
SELECT
    a.* 
FROM dbo.PartitionResearch AS a
JOIN dbo.MonthLookup AS b
    ON a.[Date]=b.[Date]
WHERE
    a.Account IN (1000000,2000000);

...produces执行以下计划:

PartitionResearch表扫描的属性显示了在循环的每次迭代中使用MonthLookup表中的当前日期进行动态分区消除:

这个选项在这里是首选的,因为表是空的。在您的示例中,由于估计成本的原因,优化器更倾向于哈希连接计划。给定一个56行的MonthLookup表(如您的计划所示),嵌套循环替代将扫描单个分区56次。优化器(可能是正确的)评估,最好只扫描一次所有12个分区。

如果您想要使用动态分区消除测试数据,您可能会得到一个带有OPTION (LOOP JOIN)查询提示的计划。由于示例查询只访问了两个分区,至少可以在合理的时间内对两个分区进行28次扫描。

对于更健壮的一般策略,您需要编写特定的table来实现分区消除,例如使用$PARTITION函数、临时表或动态SQL。

票数 2
EN

Database Administration用户

发布于 2019-07-12 15:11:13

在更好地理解了这个社区给出的方向之后,我很想尝试一个原型,在这个原型中,SQL是根据查找表包含的内容动态生成的(即没有联接)。可能在将分区列更改为仅由年份和月份组成的新列之后。

我希望在需要时将动态SQL建立在1-12个变量的基础上,并使用一个重新编译选项。相反(请参阅下面的代码),为了节省时间,我只是向自己证明,理论上可以利用动态SQL来解决与查找表和分区消除有关的任何(合理的)限制。

我做了两根弦。一个包含2组日期范围变量声明(@low1,@high1,@low2,@high2)。一个有一个select和谓词,引用日期范围变量和两个目标帐户。我执行了这两个字符串的连接,并且确信使用了分区消除。select在结尾有一个选项(重新编译)。在比较从一个试用到next...apples到apples的运行时间之前,已经清除了缓存。部分是为了确保分区消除真的发生了。

代码语言:javascript
运行
复制
declare @sql1 varchar(500)=
'declare @low1 date ='+'''' +'5/1/19'+'''' +
'declare @low2 date ='+'''' +'10/1/19'+'''' +
'declare @high1 date ='+''''+'5/31/19' +'''' +
'declare @high2 date ='+''''+'10/31/19'+'''' 
declare @sql2 varchar(500)=
'select * from dbo.partitionresearch 
where (date between @low1 and @high1 or date between @low2 and @high2) and
      account in (1000000,2000000) 
OPTION (RECOMPILE)'
CHECKPOINT 
DBCC DROPCLEANBUFFERS
exec (@sql1+@sql2)
票数 1
EN
页面原文内容由Database Administration提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://dba.stackexchange.com/questions/242582

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档