Need to convert a recursive CTE query to an index friendly query

2023-02-07 04:13 问答作者：

After going through all the hard work of writing a recursive CTE query to meet my needs, I realize I can't use it because it doesn't work in an indexed view. So I need something else to replace the CTE below. (Yes you can use a CTE in a non-indexed view, but that's too slow for me).

The requirements:

My ultimate goal is to have a self updating indexed view (it doesn't have to be a view, but something similar)... that is, if data changes in any of the tables the view joins on, then the view needs to update itself.
The view needs to be indexed because it has to be very fast, and the data doesn't change very frequently. Unfortunately, the non-indexed view using a CTE takes 3-5 seconds to run which is way too long for my needs. I need the query to run in milliseconds. The recursive table has a few hundred thousand records in it.

As far as my research has taken me, the best solution to meet all these requirements is an indexed view, but I'm open to any solution.

The CTE can be found in the answer to my other post. Or here it is again:

DECLARE @tbl TABLE ( 
     Id INT 
    ,[Name] VARCHAR(20) 
    ,ParentId INT 
    ) 

INSERT INTO @tbl( Id, Name, ParentId ) 
VALUES 
 (1, 'Europe', NULL) 
,(2, 'Asia',   NULL) 
,(3, 'Germany', 1) 
,(4, 'UK',      1) 
,(5, 'China',   2) 
,(6, 'India',   2) 
,(7, 'Scotland', 4) 
,(8, 'Edinburgh', 7) 
,(9, 'Leith', 8) 

; 
DECLARE @tbl2 table (id int, abbreviation varchar(10), tbl_id int)
INSERT INTO @tbl2( Id, Abbreviation, tbl_id ) 
VALUES 
 (100, 'EU', 1) 
,(101, 'AS', 2) 
,(102, 'DE', 3) 
,(103, 'CN', 5)

;WITH abbr AS (
    SELECT a.*, isnull(b.abbreviation,'') abbreviation
    FROM @tbl a
    left join @tbl2 b on a.Id = b.tbl_id
), abcd AS ( 
          -- anchor 
        SELECT  id, [Name], ParentID,
                CAST(([Name]) AS VARCHAR(1000)) [Path],
                cast(abbreviation as varchar(max)) abbreviation
        FROM    abbr
        WHERE   ParentId IS NULL 
        UNION开发者_如何学JAVA ALL
          --recursive member 
        SELECT  t.id, t.[Name], t.ParentID, 
                CAST((a.path + '/' + t.Name) AS VARCHAR(1000)) [Path],
                isnull(nullif(t.abbreviation,'')+',', '') + a.abbreviation
        FROM    abbr AS t 
                JOIN abcd AS a 
                  ON t.ParentId = a.id 
       )
SELECT *, [Path] + ':' + abbreviation
FROM abcd

After hitting all the roadblocks with indexed views (self join, cte, udf accessing data etc), I propose that the below as a solution for you.

Create support function

Based on maximum depth of 4 from root (5 total). Or use a CTE

CREATE FUNCTION dbo.GetHierPath(@hier_id int) returns varchar(max)
WITH SCHEMABINDING
as
begin
return (
    select FullPath =
               isnull(H5.Name+'/','') + 
               isnull(H4.Name+'/','') +
               isnull(H3.Name+'/','') +
               isnull(H2.Name+'/','') +
               H1.Name
             +
               ':'
             +
               isnull(STUFF(
               isnull(','+A1.abbreviation,'') +
               isnull(','+A2.abbreviation,'') + 
               isnull(','+A3.abbreviation,'') +
               isnull(','+A4.abbreviation,'') +
               isnull(','+A5.abbreviation,''),1,1,''),'')
    from dbo.HIER H1
    left join dbo.ABBR A1 on A1.hier_id = H1.Id
    left join dbo.HIER H2 on H1.ParentId = H2.Id
    left join dbo.ABBR A2 on A2.hier_id = H2.Id
    left join dbo.HIER H3 on H2.ParentId = H3.Id
    left join dbo.ABBR A3 on A3.hier_id = H3.Id
    left join dbo.HIER H4 on H3.ParentId = H4.Id
    left join dbo.ABBR A4 on A4.hier_id = H4.Id
    left join dbo.HIER H5 on H4.ParentId = H5.Id
    left join dbo.ABBR A5 on A5.hier_id = H5.Id
    where H1.id = @hier_id)
end
GO

Add columns to the table itself

For example the fullpath column, if you need, add the other 2 columns in the CTE by splitting the result of dbo.GetHierPath on ':' (left=>path, right=>abbreviations)

-- index maximum key length is 900, based on your data, 400 is enough
ALTER TABLE HIER ADD FullPath VARCHAR(400)

Maintain the columns

Because of the hierarchical nature, record X could be deleted that affects a Y descendent and Z ancestor, which is quite hard to identify in either of INSTEAD OF or AFTER triggers. So the alternative approach is based on the conditions

if data changes in any of the tables the view joins on, then the view needs to update itself.
the non-indexed view using a CTE takes 3-5 seconds to run which is way too long for my needs

We maintain the data simply by running through the entire table again, taking 3-5 seconds per update (or faster if the 5-join query works out better).

CREATE TRIGGER TG_HIER
ON HIER
AFTER INSERT, UPDATE, DELETE
AS
UPDATE HIER
SET FullPath = dbo.GetHierPath(HIER.Id)

Finally, index the new column(s) on the table itself

create index ix_hier_fullpath on HIER(FullPath)

_{If you intended to access the path data via the id, then it is already in the table itself without adding an additional index.}

The above TSQL references these objects

Modify the table and column names to suit your schema.

CREATE TABLE dbo.HIER (Id INT Primary Key Clustered, [Name] VARCHAR(20) ,ParentId INT)
;
INSERT dbo.HIER( Id, Name, ParentId ) VALUES 
     (1, 'Europe', NULL) 
    ,(2, 'Asia',   NULL) 
    ,(3, 'Germany', 1) 
    ,(4, 'UK',      1) 
    ,(5, 'China',   2) 
    ,(6, 'India',   2) 
    ,(7, 'Scotland', 4) 
    ,(8, 'Edinburgh', 7) 
    ,(9, 'Leith', 8)
    ,(10, 'Antartica', NULL) 
; 
CREATE TABLE dbo.ABBR (id int primary key clustered, abbreviation varchar(10), hier_id int)
;
INSERT dbo.ABBR( Id, Abbreviation, hier_id ) VALUES 
     (100, 'EU', 1) 
    ,(101, 'AS', 2) 
    ,(102, 'DE', 3) 
    ,(103, 'CN', 5)
GO

EDIT - Possibly faster alternative

Given that all records are recalculated each time, there is no real need for a function that returns the FullPath for a single HIER.ID. The query in the support function can be used without the where H1.id = @hier_id filter at the end. Furthermore, the expression for FullPath can be broken into PathOnly and Abbreviation easily down the middle. Or just use the original CTE, whichever is faster.

继续阅读：common-table-expression hierarchical-data recursive-query sql sql-server-2008

Need to convert a recursive CTE query to an index friendly query

Create support function

Add columns to the table itself

Maintain the columns

Finally, index the new column(s) on the table itself

The above TSQL references these objects

EDIT - Possibly faster alternative

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？

Create support function

Add columns to the table itself

Maintain the columns

Finally, index the new column(s) on the table itself

The above TSQL references these objects

EDIT - Possibly faster alternative

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集 河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？