开发者

representing graph using relational database

I need to represent graph information with relational database.

Let's say, a is connected to b, c, and d.

a -- b
|_ c
|_ d

I can have a node table for a, b, c, and d, and I can also have a link table (FROM, TO) -> (a,b), (a,c), (a,d). For other implementation there might be a way to store the link 开发者_开发问答info as (a,b,c,d), but the number of elements in the table is variable.

  • Q1 : Is there a way to represent variable elements in a table?
  • Q2 : Is there any general way to represent the graph structure using relational database?


Q1 : Is there a way to represent variable elements in a [database] table?

I assume you mean something like this?

 from | to_1 | to_2 | to_3 | to_4 | to_5 | etc...
 1    | 2    | 3    | 4    | NULL | NULL | etc...

This is not a good idea. It violates first normal form.

Q2 : Is there any general way to represent the graph structure using database?

For a directed graph you can use a table edges with two columns:

nodeid_from nodeid_to
1           2
1           3
1           4

If there is any extra information about each node (such as a node name) this can be stored in another table nodes.

If your graph is undirected you have two choices:

  • store both directions (i.e. store 1->2 and 2->1)
  • use a constraint that nodeid_from must be less than nodeid_to (i.e. store 1->2 but 2->1 is implied).

The former requires twice the storage space but can make querying easier and faster.


In addition to the two tables route mentioned by Mark take a look at the following link:

http://articles.sitepoint.com/article/hierarchical-data-database/2

This article basically preorders the elements in the tree assigning left and right values. You are then able to select portions or all of the tree using a single select statement.

Node | lft | rght
-----------------
  A  |  0  |  7
  B  |  1  |  2
  C  |  3  |  4
  D  |  5  |  6

EDIT: If you are going to be updating the tree heavily this is not an optimum solution as the whole tree must be re-numbered


I have stored multiple "TO" nodes in a relational representation of a graph structure. I was able to do this because my graph was directed. This meant that if I wanted to know what nodes "A" was connected to, I only needed to select a single record from my table of connections. I stored the TO nodes in an easy-to-parse string and it worked great, with a class that could manage the conversion from string to collection and back.


I recommend looking at dedicated graph databases, as nawroth suggests. One example would be the "Trinity" Database, suited for very large datasets. But there are others.

Listen to the podcast by Scott Hanselman on Hanselminutes about Trinity. Here is the text transcript.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜