Why don't the mainstream DBMSs have graph functionality?

2022-12-09 02:43 问答作者：

Relational databases are frequently used to store graphs in all their many flavors (trees, directed graphs, undirected graphs, ...).

Why then do none of the major DBMSs (Microsoft, MySql, Oracle, PostgreSQL, SqlLite, just to name a few in alphabetical order) include library support for treating relations as graphs?

Some desirable features, by way of example:

Constraint checking (connectedness, acyclicity, planarity, ...)
Commonly needed functions (shortest path, minimu开发者_开发百科m spanning tree, transitive closure, max flow/min cut, clique detection, Hamiltonian/Eulerian cycles ...)
Auxiliary data structures needed to improve performance for any of the above

Building support for some of these things outside the database is complicated because (among other reasons):

It's inherently complicated (libraries help here)
Short answers are often supported by lots of data: an external client running a shortest path algorithm would need to either be very "chatty" with the database or would need to retrieve a much-larger-than-needed amount of data; either choice is bad for the network
Maintaining integrity when integrity depends on a graph-theoretic constraint requires access to all proposed updates, hence a trigger, and access to existing graph libraries from triggers is complicated in many systems
The DBMS storage manager and optimizer are uniquely positioned to address the question of auxiliary data structures, as they do with indexes

This isn't a rhetorical question, I actually want to know if there are interesting technical (or historical) reasons.

I've worked in a research group, interested among others in delevoping a database for RDF(S) data, which is basically labeled graphs, or triples [subject, predicate, object], which are basically graph edges: [sourceNode, edgeLabel, targetNode].

The question to ask, to appreciate the hardness of the problem: What kind of indices are you going to build for a labeled graph? You have to take advantage of common "properties" (each "predicate" is a property of the subject, with the value of object), and index edges accordingly, so you can quickly find if "there is an edge called 'hasAge' on a Person with value greater than 18".

For illustration, here is a simple approach, which is schema-oblivious (and goes quite to the opposite direction of traditional database research which quite unanimously agrees that schemata are good to have). It completely ignores any schema information (this paper provides useful context). Just store everything in three big tables (s: subject, p: predicate, o: object):

[s, p, o]
[p, o, s]
[o, s, p]

These three suffice to answer any efficiently evaluate any query with (at most) a subject, (at most) a predicate, and (at most) an object (i.e. queries of the form (s, *, *), (*, p, *), (*, *, o), (s, p, *), (s, *, o), (*, p, o), (s, p, o)). Complex queries though consist of many "path expressions" (i.e. you describe data for which you can find certain paths satisfying some criteria), each of which is translated to a self-join on one of these (big!) tables, which is not all that efficient, which is a problem.

There, that's a simple graph database in a pocket. :)

Concluding, this is the field of active research. I'm not up to date with the current state of art, but I've seen products like AllegroGraph and others that claim very good results.

Oracle has support for graph functionality (Oracle Locator/Oracle Spatial) and semantic web functionality.

I suspect that your question contains the beginnings of its own answer.

The commonly needed functions you list are not, for general purpose databases, commonly needed at all. Yes they are certainly needed for graph operations, but rarely for, say, customer billing. Of course, a relational database can store graphs in tables but the graph operations are beyond the powers of any version of SQL I've seen.

You write building support for some of these things outside the database is complicated. True that and it's why we all get paid so much. But it would be just as complicated to build support for those things into the database wouldn't it ?

继续阅读：graph-theory history implementation query-optimization rdbms

Why don't the mainstream DBMSs have graph functionality?

更多精彩内容

精彩评论

最新问答

央视是哪个频道？

请问买过的朋友，舒提啦旅行箱实际使用体验如何？？

检查不孕不育需要的费用？

海信ULED电视画质有什么不同的地方?？

钉子可以挂的住画框幕布吗？

问答排行榜

河神2九牛入海钓河妖是第几集河妖什么来历可活吞牛？

性激素六项检查的最佳时间是多久？多少钱？？

Easiest way to get words of one line from istream into a vector?

《梦在燃烧 (《三国演义》动画片主题曲)》MP3歌词-汤子星？

抽烟只抽炫赫门？