开发者

Need some tSQL Wizardry: SQL Update Based on Running Total

I've got an implementation for this that uses a super hairy recursive CTE that is really hard to follow/maintain. I was hoping that one of the brains on SO could come up with some more straightforward tSQL approach code to accomplish the following:

Table Documents

DocID    SortOrder    PageCount    StartPgNum   EndPgNum
5        1            2               {1}          {2}
8        2            7               {3}          {9}
22       3            3               {10}         {12}

Fo开发者_开发知识库r the table given above, I need a query to populate StartPgNum and EndPgNum (Sample values included in the example in {} to make the intentions clearer for what I need.

Assumptions:

* DocID, SortOrder, and PageCount are pre-populated.

* StartPgNum and EndgNum need to be populated by tSQL code.

* SortOrder always starts at 1, and is continuous with no gaps.

* Documents should get a continuous page numbering scheme as ordered by SortOrder


Updated to be better :)

DECLARE @temp TABLE (DocID INT, SortOrder INT, PageCount INT)

INSERT INTO @temp VALUES (5, 1, 2)
INSERT INTO @temp VALUES (8, 2, 7)
INSERT INTO @temp VALUES (22, 3, 3)

SELECT
    *,
    StartPgNum + PageCount-1 AS EndPgNum
FROM
(SELECT
    DocID,
    SortOrder,
    PageCount,
    ISNULL((SELECT SUM(PageCount)+1 FROM @temp WHERE SortOrder < parent.SortOrder), 1) AS StartPgNum
FROM
    @temp parent) _temp


I did some testing on all of the solutions provided here in the other answers, my original "Hairy Recursive CTE" option and for the sake of completeness a simple cursor based approach. To my great surprise the cursor option performed the best by a clear margin in all my tests (1K Rows, 10KRows, 50K Rows, 500K Rows)

Here are the average times for each approach for 10K records:
Hairy Recursive CTE: 3 minutes 55 seconds
CROSS APPLY (Ben Dempsey): 21-25 seconds
SUBSELECTS (Tim Khouri): 19-21 seconds
CURSOR: 1-2 Seconds

Here is my cursor based solution:

Declare @temp TABLE(
 DocID INT PRIMARY KEY NOT NULL, 
 SortOrder INT NOT NULL, 
 PageCount INT NOT NULL,
 BegPg int,
 EndPg int
)

Insert into @temp (DocID,SortOrder,PageCount) 
SELECT top 50000 docid, ROW_NUMBER() OVER (ORDER BY DOCID),Pages FROM tblDocuments

DECLARE @PC int
SET @PC=1
DECLARE @FetchPageCount int
DECLARE @FetchDocID int

DECLARE myCursor CURSOR FOR 
SELECT DocID, PageCount FROM @temp ORDER BY SortOrder

OPEN myCursor
FETCH NEXT FROM myCursor INTO @FetchDocID,@FetchPageCount

WHILE @@FETCH_STATUS = 0
BEGIN

  UPDATE @temp SET BegPg=@PC, EndPg=@PC+ @FetchPageCount-1   
  WHERE (Docid=@fetchDocid)

     SET @PC = @PC + @FetchPageCount

    FETCH NEXT FROM myCursor INTO @FetchDocID,@FetchPageCount
END 
CLOSE myCursor
DEALLOCATE myCursor

SELECT * FROM @temp

Who would have guessed it? Maybe cursors aren't always evil.

A word of warning: Lest you be tempted to replace the update to the "WHERE CURRENT OF myCursor" syntax, it performed much slower than using the current version with a where clause, although still faster than most of the other approaches.


The fastest way to do it would be with a Quirky Update. It depends whether you fall into the 'Microsoft don't explicitly say it works so I'll avoid' it camp or not...

Otherwise you're in hairy recursive CTE (as you've already discovered) or triangular join (which could become a nightmare on a large data set) territory.


Maybe one of these three solutions can help, since this is a kind of "running total" problem: http://www.sqlteam.com/article/calculating-running-totals


SQL 2008 using cross apply (running total)

/*
DocID    SortOrder    PageCount    StartPgNum   EndPgNum
5        1            2               {1}          {2}
8        2            7               {3}          {9}
22       3            3               {10}        {12}
*/

Declare @MyTable TABLE(
DocID int,
SortOrder int,
PageCount int
)

Insert into @MyTable(DocID,SortOrder,PageCount)
values (5,1,2), (8,2,7), (22,3,3)

select 
    T1.docID, 
    T1.Sortorder, 
    T1.Pagecount, 
    (T.RunningTotal - T1.Pagecount) + 1 StartPgNum , 
    T.RunningTotal EndPgNum

FROM    @MyTable T1
        CROSS APPLY ( Select SUM(PageCount) RunningTotal FROM @MyTable where SortOrder <= T1.SortOrder) T
order by T1.sortorder


I chose to tackle these two problems by creating functions, one to get the first page the second to get the last page. Here are the functions and the query which will work.

CREATE FUNCTION dbo.GetFirstPage(@SortOrder int) 
RETURNS int
as
BEGIN
DECLARE @FirstPage int
SET @FirstPage = 1
IF(@SortOrder > 1)
BEGIN
SELECT @FirstPage = SUM(PageCount) + 1
FROM Documents
WHERE SortOrder < @SortOrder
END
RETURN @FirstPage
END

CREATE FUNCTION dbo.GetLastPage(@FirstPage int, @PageCount int)
RETURNS int 
AS
BEGIN
RETURN (@FirstPage + @PageCount -1)
END

And finally the query.

SELECT * ,  
        dbo.GetFirstPage(SortOrder) AS FirstPage,
        dbo.GetLastPage(dbo.GetFirstPage(SortOrder),Pagecount) AS LastPage 
FROM Documents
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜