开发者

Flatten date range memberships retaining only the highest priority membership (TRANSACT-SQL)

Problem statement: A table contains an item_id, a category_id and a date range (begin_date and end_date).

No item may be in more than one category on any given date (in general; during daily rebuilding it can be in an invalid state temporarily).

By default, all items are added (and re-added if removed) to a category (derived from outside data) automatically on a daily basis, and their membership in that category matches the lifespan of the item (items have their own begin and end date, and usually spend their entire lives in the same category, which is why this matches).

For items in category X, it is occasionally desirable to override the default category by adding them to category Y. Membership in category Y could entirely replace membership in category X (that is, the begin and end dates for membership in category Y would match the begin and end dates of the item itself), or it could override it for an arbitrary period of time (at the beginning, middle or end the item's lifespan, possibly overriding for short periods at multiple times). Membership in category Y is not renewed automatically and additions to that category is done by manual data entry.

Every day, when category X is rebuilt, we get an ov开发者_StackOverflowerlap, where any item in category Y will now be in category X as well (which is forbidden, as noted previously).

Goal: After each repopulation of category X (which is done in a rather complicated and fragile manner, and ideally would be left alone), I'm trying to find an efficient means of writing a stored procedure that:

  1. Identifies the overlaps

  2. Changes existing entries, adds new ones where necessary (such as in the case where an item starts in category X, switches to category Y, then eventually switches back to category X before ending), or removes entries (when an item is in category Y for its entire life) such that every item remains in category Y (and only Y) where specified, while category X membership is maintained when not overridden by category Y.

  3. Does not affect memberships of categories A, B, C, Z, etc., which do not have override categories and are built separately, by completely different rules.

  4. Note: It can be assumed that X membership covers the entire lifespan of the item before this procedure is called, so it is unnecessary to query any data outside this table.

  5. Bonus credit: If for some reason there are two adjacent or overlapping memberships in for the same item in category Y, stitching them together into a single entry is appreciated, but not necessary.

Example:

  • item_id category_id begin_date end_date
  • 1 X 20080101 20090628
  • 1 Y 20090101 20090131
  • 1 Y 20090601 20090628
  • 2 X 20080201 20080731
  • 2 Y 20080201 20080731

Should become:

  • item_id category_id begin_date end_date
  • 1 X 20080101 20081231
  • 1 Y 20090101 20090131
  • 1 X 20090201 20090531
  • 1 Y 20090601 20090628
  • 2 Y 20080201 20080731

If it matters, this needs to work on SQL Server 2005 and SQL Server 2008


Given an item that should be in category X, once the category Y rows are already there you should be able to add the X with the code below. You'll need to adapt it to however you will be using it - in a cursor, as a set-based operation for all items with a category X row, or whatever.

DECLARE
    @y_id INT,
    @x_id INT,
    @item_id INT
SELECT @y_id = category_id FROM dbo.Categories WHERE name = 'Y'
SELECT @x_id = category_id FROM dbo.Categories WHERE name = 'X'
SET @item_id = ???

;WITH Begin_Dates AS (SELECT
    BI.item_id,
    BI.begin_date AS begin_date
FROM
    Items BI
WHERE
    NOT EXISTS (SELECT * FROM Item_Categories BIC2 WHERE BIC2.item_id = BI.item_id AND BIC2.category_id = @y_id AND BIC2.begin_date = BI.begin_date)
UNION
SELECT
    BIC1.item_id,
    end_date AS begin_date
FROM
    Item_Categories BIC1
WHERE
    BIC1.category_id = @y_id AND
    NOT EXISTS (SELECT * FROM dbo.Items WHERE item_id = BIC1.item_id AND BIC1.end_date = end_date)),
End_Dates AS (SELECT
    EI.item_id,
    EI.end_date AS end_date
FROM
    Items EI
WHERE
    NOT EXISTS (SELECT * FROM Item_Categories EIC2 WHERE EIC2.item_id = EI.item_id AND EIC2.category_id = @y_id AND EIC2.begin_date = EI.begin_date)
UNION
SELECT
    EIC1.item_id,
    begin_date AS end_date
FROM
    Item_Categories EIC1
WHERE
    EIC1.category_id = @y_id AND
    NOT EXISTS (SELECT * FROM dbo.Items WHERE item_id = EIC1.item_id AND EIC1.begin_date = begin_date))
INSERT INTO dbo.Item_Categories
(
    item_id,
    category_id,
    begin_date,
    end_date
)
SELECT
    @item_id,
    @x_id,
    BD.begin_date,
    ED.end_date
FROM
    Begin_Dates BD
INNER JOIN End_Dates ED ON
    ED.end_date > BD.begin_date AND
    NOT EXISTS (SELECT * FROM End_Dates ED2 WHERE ED2.end_date > BD.begin_date AND ED2.end_date < ED.end_date)
0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜