Taxonomy hierarchy representation format

We are planning to integrate a hierarchical taxonomy into our (Java-based) software solution.

Is there a standardized (and easy to use) format to represent hierarchical taxonomies? Ideally a format that serves as the common exchange format between different taxonomy editors?

I have been looking at OWL (RDF), PMML... but those are either quite complex, or do not really seem to fit this purpose.

To give a simple example: we would like to represent a tree of concepts. Attached to each concept there would be some kind of data object (shown in brackets).

Vehicles (category := 'V')
 |-> Car (code := 1)
 |    |-> Petrol (code := 2 && car_code := 'petrol')
 |    |-> Electrical (code := 2 && car_code := 'electrical')
 |-> Plane (code := 1)

We could of course develop our own XML format using a serialization library like XStream. But if there is a good standard that is well supported in Java, I would prefer to use it.


You are looking for SKOS, the Simple Knowledge Organization System.

SKOS is an RDF vocabulary for representing taxonomies, hierarchies and thesauri. It is built around the broader and narrower properties, which state the relationships between concepts. For instance:

ex:animals rdf:type skos:Concept;
  skos:prefLabel "animals"@en;
  skos:narrower ex:mammals.
ex:mammals rdf:type skos:Concept;
  skos:prefLabel "mammals"@en;
  skos:broader ex:animals.

You can represent your taxonomy with SKOS, serialize it as RDF and load it into an RDF database. To query it and retrieve hierarchy trees, use the SPARQL query language.
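To make this concrete for the vehicle tree in the question, here is a minimal sketch in plain Java (no RDF library) that walks a concept tree and writes it out as SKOS in Turtle syntax. The class names Concept and SkosTaxonomy, the ex: namespace http://example.org/taxonomy#, and the choice to store the attached data object as a skos:note literal are all assumptions made for illustration. In a real project you would more likely build the model with an RDF library than concatenate strings.

```java
import java.util.ArrayList;
import java.util.List;

/** One node of the taxonomy; 'data' stands in for the attached data object. */
final class Concept {
    final String id;     // local name used in the IRI, e.g. "car"
    final String label;  // human-readable label
    final String data;   // attached payload, written out as skos:note (an assumption)
    final List<Concept> children = new ArrayList<>();

    Concept(String id, String label, String data) {
        this.id = id;
        this.label = label;
        this.data = data;
    }

    /** Adds a child and returns this node so calls can be chained. */
    Concept add(Concept child) {
        children.add(child);
        return this;
    }
}

final class SkosTaxonomy {
    /** Serializes the tree below 'root' as SKOS concepts in Turtle. */
    static String toTurtle(Concept root) {
        StringBuilder sb = new StringBuilder();
        sb.append("@prefix skos: <http://www.w3.org/2004/02/skos/core#> .\n");
        sb.append("@prefix ex: <http://example.org/taxonomy#> .\n\n"); // hypothetical namespace
        write(root, null, sb);
        return sb.toString();
    }

    private static void write(Concept c, Concept parent, StringBuilder sb) {
        sb.append("ex:").append(c.id).append(" a skos:Concept ;\n");
        sb.append("  skos:prefLabel \"").append(escape(c.label)).append("\"@en ;\n");
        if (parent != null) {
            sb.append("  skos:broader ex:").append(parent.id).append(" ;\n");
        }
        for (Concept child : c.children) {
            sb.append("  skos:narrower ex:").append(child.id).append(" ;\n");
        }
        sb.append("  skos:note \"").append(escape(c.data)).append("\" .\n\n");
        for (Concept child : c.children) {
            write(child, c, sb); // depth-first, parents before children
        }
    }

    /** Escapes backslashes and double quotes for a Turtle string literal. */
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        Concept car = new Concept("car", "Car", "code := 1")
            .add(new Concept("petrol", "Petrol", "code := 2 && car_code := 'petrol'"))
            .add(new Concept("electrical", "Electrical", "code := 2 && car_code := 'electrical'"));
        Concept root = new Concept("vehicles", "Vehicles", "category := 'V'")
            .add(car)
            .add(new Concept("plane", "Plane", "code := 1"));
        System.out.print(toTurtle(root));
    }
}
```

Each concept carries both its skos:broader link and its skos:narrower links, mirroring the animals/mammals example above. Stating only one direction would also work if the consuming tool infers the inverse.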


[Apologies for using a reply for what should be a comment to the question. 't is just that the comment format is not suitable for this kind of "question redirect"]

While the question appears to be about a format to represent taxonomy hierarchies, the references to OWL, RDF and PMML point toward ontology solutions. The perceived complexity of these ontology formats may also be a sign that a simpler approach is warranted.

In a nutshell, you need to assess whether you really need an ontology framework rather than a taxonomy framework. It is easy to confuse these two related concepts, but in many instances a more flexible DBMS or even a simple XML-based schema descriptor is all that is required.

For example, to perform guided searches through catalogs of heterogeneous items, an EAV (entity-attribute-value) database back-end with a relatively simple hierarchical schema model can "fit the bill".
Or, to support or validate some entity extraction logic, a simple taxonomy whose leaf nodes contain the accepted texts may be enough.

On the other hand, if some reasoning on the basis of the schema is required, or for, say, fancy data mining efforts whereby the ontology drives the data-gathering bots, then you may effectively be talking about a semantic web / ontology application.


Bioinformaticians use the OBO file format ( http://www.geneontology.org/GO.format.obo-1_2.shtml ) to store some well-known ontologies such as the Gene Ontology (a directed graph ontology). It comes with a Java parser: http://www.geneontology.org/GO.java.obo.parser.shtml
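To give a feel for how lightweight the format is, here is a rough sketch that reads the is_a hierarchy out of OBO text using only the Java standard library. It is not the parser linked above: the class name OboSketch, the method parseIsA and the VEH: identifiers are made up for illustration, and the code handles only the [Term] stanza with its id and is_a tags, ignoring everything else the format defines.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class OboSketch {
    /** Returns term id -> parent ids, taken from the is_a tags of [Term] stanzas. */
    static Map<String, List<String>> parseIsA(String obo) {
        Map<String, List<String>> parents = new LinkedHashMap<>();
        boolean inTerm = false;
        String currentId = null;
        for (String raw : obo.split("\n")) {
            String line = raw.trim();
            if (line.startsWith("[")) {
                // A new stanza begins; only [Term] stanzas are of interest here.
                inTerm = line.equals("[Term]");
                currentId = null;
            } else if (inTerm && line.startsWith("id:")) {
                currentId = line.substring("id:".length()).trim();
                parents.putIfAbsent(currentId, new ArrayList<>());
            } else if (inTerm && currentId != null && line.startsWith("is_a:")) {
                String value = line.substring("is_a:".length());
                int bang = value.indexOf('!'); // text after '!' is a trailing comment
                if (bang >= 0) {
                    value = value.substring(0, bang);
                }
                parents.get(currentId).add(value.trim());
            }
        }
        return parents;
    }

    public static void main(String[] args) {
        // Hypothetical identifiers modelled on the vehicle example in the question.
        String obo = String.join("\n",
            "[Term]",
            "id: VEH:0001",
            "name: Vehicles",
            "",
            "[Term]",
            "id: VEH:0002",
            "name: Car",
            "is_a: VEH:0001 ! Vehicles",
            "",
            "[Term]",
            "id: VEH:0003",
            "name: Petrol",
            "is_a: VEH:0002 ! Car");
        parseIsA(obo).forEach((id, isA) -> System.out.println(id + " is_a " + isA));
    }
}
```

Because a term may carry several is_a lines, each id maps to a list of parents, which is what makes the structure a graph rather than a strict tree.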
