
Duplicate triple in RDF, authoritative view?

If a triple store contains the same triple twice, is there an authoritative position (if any exists) on this redundancy?

Additionally, should a triple store be allowed to store the same triple twice within the same context?

I ask because in rdflib you can apparently store the same triple twice (or more). This is the reader:

import rdflib
from rdflib import store

s = rdflib.plugin.get('MySQL', store.Store)('rdfstore')

config_string = "host=localhost,password=foo,user=foo,db=foo"
rt = s.open(config_string,create=False)
if rt != store.VALID_STORE:
    s.open(config_string,create=True)

graph = rdflib.ConjunctiveGraph(s, identifier = rdflib.URIRef("urn:uuid:a19f9b78-cc43-4866-b9a1-4b009fe91f52"))
rows = graph.query("SELECT ?id ?value { ?id <http://localhost#ha> ?value . }")
for r in rows:
    print r[0], r[1]

and this is the writer:

import rdflib
from rdflib import store

s = rdflib.plugin.get('MySQL', store.Store)('rdfstore')

config_string = "host=localhost,password=foo,user=foo,db=foo"
rt = s.open(config_string,create=False)
if rt != store.VALID_STORE:
    s.open(config_string,create=True)

graph = rdflib.ConjunctiveGraph(s, identifier = rdflib.URIRef("urn:uuid:a19f9b78-cc43-4866-b9a1-4b009fe91f52"))
graph.add( ( rdflib.URIRef("http://localhost/1000"), rdflib.URIRef("http://localhost#ha"), rdflib.Literal("18")) )
graph.commit()

This is what I obtain:

sbo@dhcp-045:~/tmp/gd $ python ./reader2.py 
table kb_7b066eca61_relations Doesn't exist
table kb_7b066eca61_relations Doesn't exist
sbo@dhcp-045:~/tmp/gd $ python ./reader2.py 
sbo@dhcp-045:~/tmp/gd $ python ./reader2.py 
sbo@dhcp-045:~/tmp/gd $ python ./writer2.py 
sbo@dhcp-045:~/tmp/gd $ python ./reader2.py 
http://localhost/1000 18
sbo@dhcp-045:~/tmp/gd $ python ./writer2.py 
sbo@dhcp-045:~/tmp/gd $ python ./reader2.py 
http://localhost/1000 18
http://localhost/1000 18

This looks like a bug to me. A modified version of the reader shows that both triples belong to the same context, and that there are indeed two stored triples:

len : 2
http://localhost/1000 18
http://localhost/1000 18
(rdflib.URIRef('http://localhost/1000'), rdflib.URIRef('http://localhost#ha'), rdflib.Literal(u'18'), <Graph identifier=urn:uuid:a19f9b78-cc43-4866-b9a1-4b009fe91f52 (<class 'rdflib.Graph.Graph'>)>)
(rdflib.URIRef('http://localhost/1000'), rdflib.URIRef('http://localhost#ha'), rdflib.Literal(u'18'), <Graph identifier=urn:uuid:a19f9b78-cc43-4866-b9a1-4b009fe91f52 (<class 'rdflib.Graph.Graph'>)>)
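Such a modified reader can be sketched as follows (a sketch only; it reuses the setup above and assumes ConjunctiveGraph's quads() iterator, which yields (subject, predicate, object, context) tuples):

import rdflib
from rdflib import store

# Same store and graph setup as reader2.py above
s = rdflib.plugin.get('MySQL', store.Store)('rdfstore')
s.open("host=localhost,password=foo,user=foo,db=foo", create=False)
graph = rdflib.ConjunctiveGraph(s, identifier=rdflib.URIRef("urn:uuid:a19f9b78-cc43-4866-b9a1-4b009fe91f52"))

print("len : %d" % len(graph))                # number of stored triples (duplicates included)
for sub, pred, obj in graph:                  # plain triple iteration
    print("%s %s" % (sub, obj))
for quad in graph.quads((None, None, None)):  # each triple plus the context it belongs to
    print(quad)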


An RDF triple store is a set of triples, so by definition the same triple cannot be present twice. However, most RDF stores are actually quad stores (sets of RDF graphs, also known as datasets), and in that case the same triple may appear multiple times, once per graph. Depending on the store (e.g. mine, Redland), that extra graph dimension is sometimes called a context. There is no real authority here: it is up to the user to define what meaning a particular graph name/context name has.
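The difference is easy to see with rdflib's default in-memory store (a rough sketch against a recent rdflib; class locations differ in older releases, and the graph names urn:graph:a / urn:graph:b are purely illustrative):

import rdflib

t = (rdflib.URIRef("http://localhost/1000"),
     rdflib.URIRef("http://localhost#ha"),
     rdflib.Literal("18"))

# Within a single graph, triples form a set: adding the same triple twice stores it once.
g = rdflib.Graph()
g.add(t)
g.add(t)
print(len(g))   # 1

# In a quad/context-aware store, the same triple can live in several named graphs.
cg = rdflib.ConjunctiveGraph()
cg.get_context(rdflib.URIRef("urn:graph:a")).add(t)
cg.get_context(rdflib.URIRef("urn:graph:b")).add(t)
print(sum(1 for _ in cg.quads((None, None, None))))   # 2: one quad per context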


Keep in mind that a particular triple may carry different metadata than other, otherwise identical, triples: the original source of the triple, some measure of the strength of the connection, and so on. It may also be feasible simply to count the copies of a triple in order to judge the relative strength of a connection compared with other, possibly contradictory, connections. So, as always, it all depends on what you intend to do with your data.
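If you do want to treat multiplicity as a crude weight, one possibility (a hypothetical sketch, not part of the code above) is to count how often each (subject, predicate, object) occurs across the dataset's contexts:

from collections import Counter

def triple_counts(conjunctive_graph):
    # Count occurrences of each (subject, predicate, object) across all contexts.
    counts = Counter()
    for s, p, o, ctx in conjunctive_graph.quads((None, None, None)):
        counts[(s, p, o)] += 1
    return counts

# Usage with the `graph` opened in the reader above:
# for (s, p, o), n in triple_counts(graph).items():
#     print("%s %s %s : seen %d time(s)" % (s, p, o, n))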


RDF is a language for expressing factual claims, organized and grouped into graphs. If a graph contains "Alice is a Person" twice, that is just redundant; within a graph, triples are normalised, so there is no point in repeating them. However, applications, stores and SPARQL-queryable systems will often collect factual claims from different sources. The SPARQL language has the GRAPH keyword for when you want to take a multi-graph perspective and look for the same triple in different sources.
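For example, a query along these lines (illustrative only; it assumes the store exposes its named graphs to SPARQL) reports which graph each occurrence of the triple comes from:

rows = graph.query("""
    SELECT ?g ?id ?value
    WHERE { GRAPH ?g { ?id <http://localhost#ha> ?value } }
""")
for r in rows:
    print(r)   # one row per (graph, subject, value) combination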

