What is the most efficient way to concatenate a string from all parent rows using T-SQL?
I have a table that has a self-referencing foreign key that represents its parent row. To illustrate the problem in its simplest form we'll use this table:
CREATE TABLE Folder(
id int IDENTITY(1,1) NOT NULL, --PK
parent_id int NULL, --FK
folder_name varchar(255) NOT NULL)
I want to create a scalar-valued function that would return a concatenated string of the folder's name and all its parent folder names all the way to the root folder, which would be designated by a null parent_id value.
My current solution is a procedural approach which I assume is not ideal. Here is what I'm doing:
CREATE FUNCTION dbo.GetEntireLineage
(@folderId INT)
RETURNS VARCHAR(MAX)
AS
BEGIN
D开发者_如何学编程ECLARE @lineage VARCHAR(MAX)
DECLARE @parentFolderId INT
SELECT @lineage = folder_name, @parentFolderId = parent_id FROM Folder WHERE id = @folderId
WHILE NOT @parentFolderId IS NULL
BEGIN
SET @parentFolderId = (SELECT parent_id FROM Folder WHERE parent_id = @parentFolderId)
SET @lineage = (SELECT @lineage + '-' + (SELECT folder_name FROM Folder WHERE parent_id = @parentFolderId))
END
RETURN @lineage
END
Is there a more ideal way to do this? I'm an experienced programmer but T-SQL not a familiar world to me and I know these problems generally require a different approach due to the nature of set based data. Any help finding a solution or any other tips and tricks to deal with T-SQL would be much appreciated.
To know for sure about performance you need to test. I have done some testing using your version (slightly modified) and a recursive CTE versions suggested by others.
I used your sample table with 2048 rows all in one single folder hierarchy so when passing 2048 as parameter to the function there are 2048 concatenations done.
The loop version:
create function GetEntireLineage1 (@id int)
returns varchar(max)
as
begin
declare @ret varchar(max)
select @ret = folder_name,
@id = parent_id
from Folder
where id = @id
while @@rowcount > 0
begin
select @ret = @ret + '-' + folder_name,
@id = parent_id
from Folder
where id = @id
end
return @ret
end
Statistics:
SQL Server Execution Times:
CPU time = 125 ms, elapsed time = 122 ms.
The recursive CTE version:
create function GetEntireLineage2(@id int)
returns varchar(max)
begin
declare @ret varchar(max);
with cte(id, name) as
(
select f.parent_id,
cast(f.folder_name as varchar(max))
from Folder as f
where f.id = @id
union all
select f.parent_id,
c.name + '-' + f.folder_name
from Folder as f
inner join cte as c
on f.id = c.id
)
select @ret = name
from cte
where id is null
option (maxrecursion 0)
return @ret
end
Statistics:
SQL Server Execution Times:
CPU time = 187 ms, elapsed time = 183 ms.
So between these two it is the loop version that is more efficient, at least on my test data. You need to test on your actual data to be sure.
Edit
Recursive CTE with for xml path('')
trick.
create function [dbo].[GetEntireLineage4](@id int)
returns varchar(max)
begin
declare @ret varchar(max) = '';
with cte(id, lvl, name) as
(
select f.parent_id,
1,
f.folder_name
from Folder as f
where f.id = @id
union all
select f.parent_id,
lvl + 1,
f.folder_name
from Folder as f
inner join cte as c
on f.id = c.id
)
select @ret = (select '-'+name
from cte
order by lvl
for xml path(''), type).value('.', 'varchar(max)')
option (maxrecursion 0)
return stuff(@ret, 1, 1, '')
end
Statistics:
SQL Server Execution Times:
CPU time = 31 ms, elapsed time = 37 ms.
use a recursive query to traverse the parents and then this method for concatenating into a string.
A hierarchyid is often overkill unless you have a really deep hierarchy or very large sets of data that can take advantage of the indexing. This is as fast as you can get without changing your schema.
with recursiveCTE (parent_id,concatenated_name) as (
select parent_id,folder_name
from folder
union all
select f.parent_id,r.concatenated_name +f.folder_name
from folder f
inner join recursiveCTE r on r.parent_id = f.id
)
select folder_name from recursiveCTE
This works for you:
with cte (Parent_id, Path) as
(
select Parent_Id,Folder_Name
from folder
union all
select f.Parent_Id,r.Path + '\' + f.Folder_Name
from Folder as f
inner join cte as c on c.Parent_Id = f.Id
)
select Folder_Name from cte
精彩评论