Locating all reachable nodes using SQL
Suppose a table with two columns: From and To. Example:
From To
1 2
2 3
2 4
4 5
I would like to know the most effective way to find all nodes that are reachable from a given node using a SQL query. For example, given 1 it would return 2, 3, 4 and 5. It is possible to chain several queries with UNION clauses, but that limits the number of levels that can be reached. Perhaps a different data structure would make the problem more tractable, but this is what is available.
I am using Firebird, but I would like to have a solution that uses only standard SQL.
You can use a recursive common table expression if you use most brands of database -- except for MySQL and SQLite and a few other obscure ones (sorry, I do consider Firebird obscure). This syntax is ANSI SQL standard, but Firebird doesn't support it yet.
Correction: Firebird 2.1 does support recursive CTEs, as @Hugues Van Landeghem comments.
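As a rough illustration of the recursive CTE approach, here is a minimal sketch run through Python's built-in sqlite3 module. It assumes a SQLite build of 3.8.3 or later (newer than this answer, and the first release with recursive CTE support), and it uses the hypothetical names edges, src and dst in place of the question's table and its From/To columns. The exact syntax accepted by Firebird or other engines may differ.

```python
import sqlite3

# In-memory database holding the sample edges from the question.
# Table and column names (edges, src, dst) are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (src INTEGER, dst INTEGER)")
conn.executemany(
    "INSERT INTO edges VALUES (?, ?)",
    [(1, 2), (2, 3), (2, 4), (4, 5)],
)

REACHABLE_SQL = """
WITH RECURSIVE reachable(node) AS (
    -- anchor member: direct successors of the start node
    SELECT dst FROM edges WHERE src = ?
    UNION
    -- recursive member: successors of nodes found so far
    SELECT e.dst FROM edges e JOIN reachable r ON e.src = r.node
)
SELECT node FROM reachable ORDER BY node
"""


def reachable_from(start):
    """Return every node reachable from `start`, at any depth."""
    return [row[0] for row in conn.execute(REACHABLE_SQL, (start,))]


print(reachable_from(1))
```

UNION is used here instead of UNION ALL so that, in SQLite, already-visited nodes are discarded and the query still terminates if the data contains a cycle.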
Otherwise see my presentation Models for Hierarchical Data with SQL for several different approaches.
For example, you could store additional rows for every path in your tree, not just the immediate parent/child paths. I call this design Closure Table.
From To Length
1 1 0
1 2 1
1 3 2
1 4 2
1 5 3
2 2 0
2 3 1
2 4 1
2 5 2
3 3 0
4 4 0
4 5 1
5 5 0
Now you can query SELECT * FROM MyTable WHERE From = 1 and get all the descendants of that node (plus the node itself, through its zero-length row).
PS: I'd avoid naming a column From, because that's an SQL reserved word.
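The closure table idea can be sketched as follows, again using Python's sqlite3 module. The names closure, ancestor, descendant, add_node and descendants are assumptions made for this example (they sidestep the reserved word From) and are not taken from the presentation. The sketch only covers adding leaf nodes and reading descendants; moving or deleting subtrees needs extra statements.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE closure (ancestor INTEGER, descendant INTEGER, length INTEGER)"
)


def add_node(node, parent=None):
    """Insert a leaf node and one closure row per path that ends at it."""
    # Every node reaches itself with a path of length 0.
    conn.execute("INSERT INTO closure VALUES (?, ?, 0)", (node, node))
    if parent is not None:
        # Every ancestor of the parent (including the parent itself)
        # also reaches the new node, one step further away.
        conn.execute(
            "INSERT INTO closure (ancestor, descendant, length) "
            "SELECT ancestor, ?, length + 1 FROM closure WHERE descendant = ?",
            (node, parent),
        )


def descendants(node):
    """Return all nodes below `node`, excluding the node itself."""
    rows = conn.execute(
        "SELECT descendant FROM closure "
        "WHERE ancestor = ? AND length > 0 ORDER BY descendant",
        (node,),
    )
    return [r[0] for r in rows]


# Rebuild the question's sample data: 1 -> 2, 2 -> 3, 2 -> 4, 4 -> 5.
add_node(1)
add_node(2, parent=1)
add_node(3, parent=2)
add_node(4, parent=2)
add_node(5, parent=4)

print(descendants(1))
```

The read is a single indexed lookup with no recursion, which is the point of the design; the cost is the extra rows written on every insert.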
Unfortunately there isn't a good generic solution to this that will work for all situations on all databases.
I recommend that you look at these resources for a MySQL solution:
- Managing Hierarchical Data in MySQL
- Models for hierarchical data - presentation by Bill Karwin which discusses this subject, demonstrates different solutions, and compares the adjacency list model you are using with other alternative models.
For PostgreSQL and SQL Server you should take a look at recursive CTEs.
If you are using Oracle you should look at CONNECT BY which is a proprietary extension to SQL that makes dealing with tree structures much easier.
With standard SQL the only way to store a tree with acceptable read performance is by using a hack such as path enumeration. Note that this is very heavy on writes.
ID PATH
1 1
2 1;2
3 1;2;3
4 1;2;4
SELECT * FROM tree WHERE ';' || path LIKE '%;2;%'
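To make the path enumeration lookup concrete, here is a small sketch using Python's sqlite3 module. It loads the four rows shown above, plus node 5 from the question's data and two hypothetical rows (12 and 13) that are added only to show why the pattern should be anchored on the ';' delimiter. The function names are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tree (id INTEGER, path TEXT)")
conn.executemany(
    "INSERT INTO tree VALUES (?, ?)",
    [
        (1, "1"),
        (2, "1;2"),
        (3, "1;2;3"),
        (4, "1;2;4"),
        (5, "1;2;4;5"),   # node 5 from the question's edge list
        (12, "12"),       # hypothetical second root
        (13, "12;13"),    # hypothetical child of 12
    ],
)


def descendants_naive(node):
    """Unanchored pattern: also matches ids that merely end in the digits."""
    pattern = f"%{node};%"
    rows = conn.execute(
        "SELECT id FROM tree WHERE path LIKE ? ORDER BY id", (pattern,)
    )
    return [r[0] for r in rows]


def descendants(node):
    """Anchored pattern: prefix the path with ';' so ids match as whole tokens."""
    pattern = f"%;{node};%"
    rows = conn.execute(
        "SELECT id FROM tree WHERE ';' || path LIKE ? ORDER BY id", (pattern,)
    )
    return [r[0] for r in rows]


print(descendants_naive(2))  # includes 13, because '12;13' contains '2;'
print(descendants(2))
```

Either way the leading wildcard generally prevents an ordinary index on path from being used, so this is a sketch of the lookup rather than a tuned query.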