开发者

Sql Server Primary Key With Partition Issue

I am building a table that will be partitioned and contain a FILESTREAM column. The issue I am encountering is that it appears I have to have a composite primary key (FILE_ID and FILE_UPLOADED_DATE) becau开发者_运维问答se FILE_UPLOADED_DATE is part of my partition scheme. Is that correct? I would prefer not to have this be a composite key and simply just have FILE_ID being the primary key.....could this be just an user error?

Any suggestions would be appreciated.

Version: SQL Server 2008 R2

Partition Schemes and Function:

CREATE PARTITION FUNCTION DocPartFunction (datetime)
AS RANGE RIGHT FOR VALUES ('20101220')
GO
CREATE PARTITION SCHEME DocPartScheme AS
PARTITION DocPartFunction TO (DATA_FG_20091231, DATA_FG_20101231);
GO
CREATE PARTITION SCHEME DocFSPartScheme AS
PARTITION DocPartFunction TO (FS_FG_20091231,FS_FG_20101231);
GO

Create Statement:

CREATE TABLE [dbo].[FILE](
    [FILE_ID] [int] IDENTITY(1,1) NOT NULL,
    [DOCUMENT] [varbinary](max) FILESTREAM  NULL,
    [FILE_UPLOADED_DATE] [datetime] NOT NULL,
    [FILE_INT] [int] NOT NULL,
    [FILE_EXTENSION] [varchar](10) NULL,
    [DocGUID] [uniqueidentifier] ROWGUIDCOL  NOT NULL UNIQUE ON [PRIMARY],
CONSTRAINT [PK_File] PRIMARY KEY CLUSTERED
    (   [FILE_ID] ASC
    ) ON DocPartScheme ([FILE_UPLOADED_DATE])
)ON DocPartScheme ([FILE_UPLOADED_DATE])
FILESTREAM_ON DocFSPartScheme;

Error if I don't include FILE_UPLOADED_DATE:

Msg 1908, Level 16, State 1, Line 1
Column 'FILE_UPLOADED_DATE' is partitioning column of the index 'PK_File'. Partition columns for a unique index must be a subset of the index key.
Msg 1750, Level 16, State 0, Line 1
Could not create constraint. See previous errors.

Thanks!


You are confusing the primary key and the clustered index. There is no reason for the two to be one and the same. You can have a clustered index on FILE_UPLOADED_DATE and a separate, non-clustered, primary key on FILE_ID. In fact you already do something similar for the DocGUID column:

CREATE TABLE [dbo].[FILE](
    [FILE_ID] [int] IDENTITY(1,1) NOT NULL,
    [DOCUMENT] [varbinary](max) FILESTREAM  NULL,
    [FILE_UPLOADED_DATE] [datetime] NOT NULL,
    [FILE_INT] [int] NOT NULL,
    [FILE_EXTENSION] [varchar](10) NULL,
    [DocGUID] [uniqueidentifier] ROWGUIDCOL  NOT NULL,
    constraint UniqueDocGUID UNIQUE NONCLUSTERED ([DocGUID]) 
        ON [PRIMARY])
    ON DocPartScheme ([FILE_UPLOADED_DATE])
    FILESTREAM_ON DocFSPartScheme;

CREATE CLUSTERED INDEX cdx_File 
   ON [FILE] (FILE_UPLOADED_DATE)
   ON DocPartScheme ([FILE_UPLOADED_DATE])
   FILESTREAM_ON DocFSPartScheme;

ALTER TABLE [dbo].[FILE]
    ADD CONSTRAINT PK_File PRIMARY KEY NONCLUSTERED (FILE_ID)
    ON [PRIMARY];

However such a design will lead to non-aligned indexes which can cause very serious performance problems, and also block all fast partition switch operations. See Special Guidelines for Partitioned Indexes:

Each sort table requires a minimum amount of memory to build. When you are building a partitioned index that is aligned with its base table, sort tables are built one at a time, using less memory. However, when you are building a nonaligned partitioned index, the sort tables are built at the same time.

As a result, there must be sufficient memory to handle these concurrent sorts. The larger the number of partitions, the more memory required. The minimum size for each sort table, for each partition, is 40 pages, with 8 kilobytes per page. For example, a nonaligned partitioned index with 100 partitions requires sufficient memory to serially sort 4,000 (40 * 100) pages at the same time. If this memory is available, the build operation will succeed, but performance may suffer. If this memory is not available, the build operation will fail

Your design already has a non-aligned index for DocGUID, so the performance problems are likely already present. If you must keep your indexes aligned then you have to admit one of the side effects of choosing a partition scheme: you can no longer have a logical primary key, nor unique constraints enforcement, unless the key includes the partitioning key.

And finally, one must ask: why use a partitioned table? They are always slower than a non-partitioned alternative. Unless you need fast partition switch operations for ETL (which you are already punting due to the non-aligned index on DocGUID), there is basically no incentive to use a partitioned table. (Preemptive comment: clustered index on the FILE_UPLOADED_DATE is guaranteed a better alternative than 'partition elimination').


The partitioning column must always be present in a partitioned table's clustered index. Any work-around you come up with has to factor this in.


I know, its an old question, but maybe google leads someone else to this question:

A possible solution would be not to partition by the date-column but by the File_ID. Every day / week / month (or whatever time period you use) you have to run a Agent Job at midnight that takes the Max(File_ID) where file_uploadet_date < GetDate(), adds the next filegroup to the partition scheme and does a split on the MaxID + 1.

Of course you will still have the problem with the non aligned index on the DocID, except you eighter add the file_id to this unique index too (could cause non unique DocIds) and / or check its uniqueness in an insert / update trigger.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜