Sql server query to flattening a hierarchy of records
I have a table that describes a hierarchy:
Name MemberName
A B
A C
B D
D E
F G
MemberName references the Name column of the same table. From this table, I can easily query to see that B and C are members within A, D is a member of B, E is a member of D and G is a member of F.
Based on this structure it's difficult to write a query that shows that D, and E are also indirectly a member of A. D and E are also indirectly a member of B, etc. So what I need to do is build up a new table that shows shows all the indirect members. So for the above table data, I'd end up with a table containing:
Name MemberName
A B
A C
A D
A E
B D
B E
D E
F G
I started by putting all the records that weren't members of other records (top level) records) into a temp table:
CREATE TABLE #TMP
(
[Name] varchar(20),
[MemberName] varchar(20)
)
DECLARE @iRowsFound INT
INSERT INTO #TMP ([Name],[MemberName])
(SELECT * FROM [HierarchyData] WHERE [Name] NOT IN
(SELECT [MemberName] FROM [HierarchyData]))
SELECT @iRowsFound = @@ROWCOUNT
Name MemberName
A B
A C
F G
Then my theory was to, in a while loop, cross join the temp table to the heirachy table and insert the applicable records from the cross join back into the temp table, and perform that while loop until there were no more applicable records in the cross join to insert:
WHILE (@iRowsFound > 0)
BEGIN
INSERT INTO #TMP ([Name],[MemberName])
(
SELECT
[NewName] = ??,
[NewMember] = ??
FROM
[HierarchyData],[#TMP]
WHERE
???
)
SELECT @iRowsFound = @@ROWCOUNT
END
I'm just not sure I'm on the right track, as I'm a little stumped as to what the cross join select should look like. Has anyone done something like this (in sql server 2000)?
Edit: I think I may have gotten it: - Although I'm pretty sure there must be a more efficient way to do this...
WHILE (@iRowsFound > 0)
BEGIN
INSERT INTO #TMP ([Name],[MemberName])
(
SELECT
--[#TMP].[Name],
--[#TMP].[MemberName],
[HierarchyData].[Name],
[HierarchyData].[MemberName]
FROM
[#TMP]
JOIN
[HierarchyData] ON [#TMP].[MemberName] = [HierarchyData].[Name]
--WHERE
-- [#TMP].[MemberName] = [HierarchyData].[Name]
AND NOT EXISTS (SELECT * FROM [#TMP] WHERE [#TMP].[Name] = [HierarchyData].[Name] AND [#TMP].[MemberName] = [HierarchyData].[MemberName])
UNION
SELECT
开发者_开发问答 [#TMP].[Name],
--[#TMP].[MemberName],
--[HierarchyData].[Name],
[HierarchyData].[MemberName]
FROM
[#TMP]
JOIN
[HierarchyData] ON [#TMP].[MemberName] = [HierarchyData].[Name]
AND NOT EXISTS (SELECT * FROM [#TMP] WHERE [#TMP].[Name] = [#TMP].[Name] AND [#TMP].[MemberName] = [HierarchyData].[MemberName])
)
SELECT @iRowsFound = @@ROWCOUNT
END
So sad you are not on sql server 2005 or later, it is easy with a recursive CTE the code is here:
WITH Members AS
(
Select Name, MemberName
FROM HierarchyData
UNION ALL
SELECT Name, Child.MemberName as [MemberName]
FROM Members
JOIN HierarchyData Child ON Members.MemberName = Child.Name
)
SELECT * FROM Members
In 2000 you can do it basically the same way (joining the results of the last select to the original table till you have no results of the last set inside a loop), but it is much harder because you have to keep track of what iteration you are on via a counter. Yuck.
Does this help, or do you want some sql 2000 pseudo code?
Better yet, just upgrade!
Here is an SQL 2000 version.
Some notes: This will work with any numbers of levels and will not have cycle errors (like the CTE versions will.)
declare @lastcount int
declare @lastcycle int
Select HierarchyData.Name, HierarchyData.MemberName, 0 as [Cycle] INTO #list
FROM HierarchyData
SET @lastcount = @@rowcount
SET @lastcycle = 0
while @lastcount > 0
BEGIN
INSERT INTO #list
SELECT Members.Name, Child.MemberName as [MemberName], @lastcycle+1 as [Cycle]
FROM #list Members
JOIN HierarchyData Child ON Members.MemberName = Child.Name
LEFT JOIN #list cycletest ON Members.Name = cycletest.Name AND Child.MemberName = cycletest.Membername
WHERE Members.Cycle = @lastcycle AND NOT (Members.Name = Child.MemberName) AND cycletest.Name is null
SET @lastcount = @@rowcount
SET @lastcycle = @lastcycle + 1
END
SELECT [Name], [MemberName] FROM #list
ORDER BY [Name], [MemberName]
DROP TABLE #list
---- Test data
--create table HierarchyData
--(
-- [Name] varchar(20),
-- [MemberName] varchar(20)
--)
--
--INSERT INTO HierarchyData (Name,MemberName) Values('A','B')
--INSERT INTO HierarchyData (Name,MemberName) Values('A','C')
--INSERT INTO HierarchyData (Name,MemberName) Values('B','D')
--INSERT INTO HierarchyData (Name,MemberName) Values('D','E')
--INSERT INTO HierarchyData (Name,MemberName) Values('F','G')
----CYCLE TEST (the CTE will not work)
--INSERT INTO HierarchyData (Name,MemberName) Values('E','D')
--
---- Test
--select * from HierarchyData
---- CTE Works (note, will fail on cycles.)
--WITH Members AS
--(
-- Select HierarchyData.Name, HierarchyData.MemberName
-- FROM HierarchyData
-- UNION ALL
-- SELECT Members.Name, Child.MemberName as [MemberName]
-- FROM Members
-- JOIN HierarchyData Child ON Members.MemberName = Child.Name
--)
--SELECT * FROM Members
--ORDER BY [Name], [MemberName]
I've used the following code pattern to follow a hierarchy in SQL Server 2000. The "magic" is adding the depth value to the temporary table so you can use that in the WHERE clause.
SET NOCOUNT ON
CREATE TABLE #super_trees
(
supervisor_uid INTEGER,
actor_uid INTEGER,
depth INTEGER
)
DECLARE
@more_users BIT,
@depth INTEGER
SET @more_users = 1
SET @depth = 0
INSERT INTO #super_trees VALUES (@supervisor_uid, @supervisor_uid, @depth)
SET @depth = @depth + 1
WHILE (@more_users = 1)
BEGIN
INSERT INTO #super_trees (supervisor_uid, actor_uid, depth)
SELECT u.supervisor_uid,
u.actor_uid,
@depth
FROM #super_trees sr
INNER JOIN
dbo.users u
ON (sr.actor_uid = u.supervisor_uid)
WHERE sr.depth = (@depth - 1)
IF @@ROWCOUNT < 1
SET @more_users = 0
SET @depth = @depth + 1
END
Using CTE above does not meet the objective of the poster. He/she wants to flatten the data. CTE only returns hierarchy information with different values under the ParentID column.
Name MemberName A B A C B D D E F G
So the above is what you get using CTE, NOT
Name MemberName A B A C A D A E B D B E D E F G
I would advise you to make a slight alteration to your data. You do not have a record that says that A is the root of the hierarchy. Adding that:
INSERT INTO #TMP(Name, MemberName) VALUES (NULL, 'A')
greatly simplifies things (also, typically, the adjacency list would be represented 'the other way around': a column Name
, and a column ParentName
which would correspond to your MemberName
, Name
columns respectively.
With that setup you can use a common table expression to do the job:
WITH Node (Name, ParentName)
AS (
SELECT Name, ParentName
FROM Tab
WHERE ParentName IS NULL
UNION ALL
SELECT Tab.Name, Tab.ParentName
FROM Tab
INNER JOIN Node
ON ParentName = Node.Name
)
SELECT Name, ParentName
FROM Node
Unfortunately, common table expressions are supported in MS SQL 2005 and up as pointed out by Hogan.
精彩评论