What is the most efficient way to concatenate a string from all parent rows using T-SQL?
I have a table that has a self-referencing foreign key that represents its parent row. To illustrate the problem in its simplest form we'll use this table:
CREATE TABLE Folder(
    id int IDENTITY(1,1) NOT NULL, --PK
    parent_id int NULL,        --FK
    folder_name varchar(255) NOT NULL)
I want to create a scalar-valued function that would return a concatenated string of the folder's name and all its parent folder names all the way to the root folder, which would be designated by a null parent_id value.
My current solution is a procedural approach which I assume is not ideal. Here is what I'm doing:
CREATE FUNCTION dbo.GetEntireLineage
    (@folderId INT)
    RETURNS VARCHAR(MAX)
AS
BEGIN
    D开发者_如何学编程ECLARE @lineage VARCHAR(MAX)
    DECLARE @parentFolderId INT
    SELECT @lineage = folder_name, @parentFolderId = parent_id FROM Folder WHERE id = @folderId
WHILE NOT @parentFolderId IS NULL
    BEGIN
        SET @parentFolderId = (SELECT parent_id FROM Folder WHERE parent_id = @parentFolderId)
        SET @lineage = (SELECT @lineage + '-' + (SELECT folder_name FROM Folder WHERE parent_id = @parentFolderId))
    END
RETURN @lineage
END
Is there a more ideal way to do this? I'm an experienced programmer but T-SQL not a familiar world to me and I know these problems generally require a different approach due to the nature of set based data. Any help finding a solution or any other tips and tricks to deal with T-SQL would be much appreciated.
To know for sure about performance you need to test. I have done some testing using your version (slightly modified) and a recursive CTE versions suggested by others.
I used your sample table with 2048 rows all in one single folder hierarchy so when passing 2048 as parameter to the function there are 2048 concatenations done.
The loop version:
create function GetEntireLineage1 (@id int)
returns varchar(max)
as
begin
  declare @ret varchar(max)
  select @ret = folder_name,
         @id = parent_id
  from Folder
  where id = @id
  while @@rowcount > 0
  begin
    select @ret = @ret + '-' + folder_name,
           @id = parent_id
    from Folder
    where id = @id
  end
  return @ret
end
Statistics:
 SQL Server Execution Times:
   CPU time = 125 ms,  elapsed time = 122 ms.
The recursive CTE version:
create function GetEntireLineage2(@id int)
returns varchar(max)
begin
  declare @ret varchar(max);
  with cte(id, name) as
  (
    select f.parent_id,
           cast(f.folder_name as varchar(max))
    from Folder as f
    where f.id = @id
    union all
    select f.parent_id,
           c.name + '-' + f.folder_name
    from Folder as f
      inner join cte as c
        on f.id = c.id
  )
  select @ret = name
  from cte
  where id is null
  option (maxrecursion 0)
  return @ret
end
Statistics:
 SQL Server Execution Times:
   CPU time = 187 ms,  elapsed time = 183 ms.
So between these two it is the loop version that is more efficient, at least on my test data. You need to test on your actual data to be sure.
Edit
Recursive CTE with for xml path('') trick.
create function [dbo].[GetEntireLineage4](@id int)
returns varchar(max)
begin
  declare @ret varchar(max) = '';
  with cte(id, lvl, name) as
  (
    select f.parent_id,
           1,
           f.folder_name
    from Folder as f
    where f.id = @id
    union all
    select f.parent_id,
           lvl + 1,
           f.folder_name
    from Folder as f
      inner join cte as c
        on f.id = c.id
  )
  select @ret = (select '-'+name
                 from cte
                 order by lvl
                 for xml path(''), type).value('.', 'varchar(max)')
  option (maxrecursion 0)
  return stuff(@ret, 1, 1, '')
end
Statistics:
 SQL Server Execution Times:
   CPU time = 31 ms,  elapsed time = 37 ms.
use a recursive query to traverse the parents and then this method for concatenating into a string.
A hierarchyid is often overkill unless you have a really deep hierarchy or very large sets of data that can take advantage of the indexing. This is as fast as you can get without changing your schema.
 with recursiveCTE (parent_id,concatenated_name) as (
    select parent_id,folder_name
    from folder
    union all
    select f.parent_id,r.concatenated_name +f.folder_name
    from folder f
    inner join recursiveCTE r on r.parent_id = f.id
    )
    select folder_name from recursiveCTE 
This works for you:
 with cte (Parent_id, Path) as 
    (
    select Parent_Id,Folder_Name
    from folder
    union all
    select f.Parent_Id,r.Path + '\' + f.Folder_Name
    from Folder as f
    inner join cte as c on c.Parent_Id = f.Id
    )
    select Folder_Name from cte
 
         加载中,请稍侯......
 加载中,请稍侯......
      
精彩评论