How to document the structure of XML files
When it comes to documenting the structure of XML files...
One of my co-workers does it in a Word table.
Another pastes the elements into a Word document with comments like this:
<learningobject id="{Learning Object Id (same value as the loid tag)}"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="http://www.aicpcu.org/schemas/cms_lo.xsd">
  <objectRoot>
    <v>
      <!-- Current version of the object from the repository. !-->
      <!-- (Occurrence: 1) -->
    </v>
    <label>
      <!-- Name of the object from the repository. !-->
      <!-- (Occurrence: 0 or 1 or Many) -->
    </label>
  </objectRoot>
</learningobject>
Which one of these methods is preferred? Is there a better way?
Are there other options that do not require third party Schema Documenter tools to update?
I'd write an XML Schema (XSD) file to define the structure of the XML document. xs:annotation and xs:documentation tags can be included to describe the elements. The XSD file can be transformed into documentation using XSLT stylesheets such as xs3p or tools such as XML Schema Documenter.
For an introduction to XML Schema see the XML Schools tutorial.
Here is your example, expressed as XML Schema with xs:annotation tags:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="objectRoot">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="v" type="xs:string">
          <xs:annotation>
            <xs:documentation>Current version of the object from the repository.</xs:documentation>
          </xs:annotation>
        </xs:element>
        <xs:element name="label" minOccurs="0" maxOccurs="unbounded" type="xs:string">
          <xs:annotation>
            <xs:documentation>Name of the object from the repository.</xs:documentation>
          </xs:annotation>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
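Attributes can be annotated in the same way as elements. The following is only a sketch for the id attribute of the learningobject element in your sample; the type (xs:string), use="required" and the xs:anyType placeholder for objectRoot are my assumptions, not something taken from your real schema:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="learningobject">
    <xs:complexType>
      <xs:sequence>
        <!-- Placeholder: in a real schema, objectRoot would be declared as in the earlier example. -->
        <xs:element name="objectRoot" type="xs:anyType"/>
      </xs:sequence>
      <!-- Assumption: the id is a required string. -->
      <xs:attribute name="id" type="xs:string" use="required">
        <xs:annotation>
          <xs:documentation>Learning Object Id (same value as the loid tag).</xs:documentation>
        </xs:annotation>
      </xs:attribute>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

The xs:annotation element goes inside the xs:attribute declaration, so the description stays next to the thing it describes.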
Enjoy RELAX NG compact syntax
Having experimented with various XML schema languages, I have found RELAX NG the best fit for most cases (reasoning at the end).
Requirements
- Allow documenting XML document structure
- Do it in readable form
- Keep it simple for the author
Modified sample XML (doc.xml)
I have added one attribute to illustrate how this type of structure is documented as well.
<objectRoot created="2015-05-06T20:46:56+02:00">
  <v>
    <!-- Current version of the object from the repository. !-->
    <!-- (Occurrence: 1) -->
  </v>
  <label>
    <!-- Name of the object from the repository. !-->
    <!-- (Occurrence: 0 or 1 or Many) -->
  </label>
</objectRoot>
Use RELAX NG Compact syntax with comments (schema.rnc)
RELAX NG allows describing the sample XML structure in the following way:
start =

  ## Container for one object
  element objectRoot {

    ## datetime of object creation
    attribute created { xsd:dateTime },

    ## Current version of the object from the repository
    ## Occurrence 1 is assumed by default
    element v {
      text
    },

    ## Name of the object from the repository
    ## Note: the occurrence is denoted by the "*" and means 0 or more
    element label {
      text
    }*
  }
I think it is very hard to beat this simplicity at the given level of expressiveness.
How to comment the structure
- always place the comment before the relevant element, not after it.
- for readability, use one blank line before the comment block
- use the ## prefix, which is automatically translated into a documentation element in other schema formats. A single hash # translates into an XML comment and not a documentation element. Multiple consecutive comments (as in the example) turn into a single multi-line documentation string within a single element.
- obvious fact: the inline XML comments in doc.xml are irrelevant, only what is in schema.rnc counts.
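To make the ## rule more concrete: in the RELAX NG XML syntax, a ## comment corresponds to an a:documentation element from the compatibility annotations namespace. Here is a rough sketch of how the v element from schema.rnc would look in XML form (written by hand for illustration, not actual tool output):

```xml
<element name="v"
         xmlns="http://relaxng.org/ns/structure/1.0"
         xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0">
  <!-- Two consecutive ## lines end up in one multi-line documentation element. -->
  <a:documentation>Current version of the object from the repository
Occurrence 1 is assumed by default</a:documentation>
  <text/>
</element>
```

A comment written with a single # would instead appear as a plain XML comment, which documentation generators typically ignore.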
If XML Schema 1.0 is required, generate it (schema.xsd)
Assuming you have the open source tool trang available, you can create an XML Schema file as follows:
$ trang schema.rnc schema.xsd
The resulting schema looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
  <xs:element name="objectRoot">
    <xs:annotation>
      <xs:documentation>Container for one object</xs:documentation>
    </xs:annotation>
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="v"/>
        <xs:element minOccurs="0" maxOccurs="unbounded" ref="label"/>
      </xs:sequence>
      <xs:attribute name="created" use="required" type="xs:dateTime">
        <xs:annotation>
          <xs:documentation>datetime of object creation</xs:documentation>
        </xs:annotation>
      </xs:attribute>
    </xs:complexType>
  </xs:element>
  <xs:element name="v" type="xs:string">
    <xs:annotation>
      <xs:documentation>Current version of the object from the repository
Occurrence 1 is assumed by default</xs:documentation>
    </xs:annotation>
  </xs:element>
  <xs:element name="label" type="xs:string">
    <xs:annotation>
      <xs:documentation>Name of the object from the repository
Note: the occurrence is denoted by the "*" and means 0 or more</xs:documentation>
    </xs:annotation>
  </xs:element>
</xs:schema>
Now your clients who insist on using only XML Schema 1.0 can use your XML document specification too.
Validating doc.xml against schema.rnc
There are open source tools like jing and rnv which support RELAX NG Compact syntax and work on both Linux and MS Windows.
Note: those tools are rather old, but very stable. Read it as a sign of stability, not as a sign of being obsolete.
Using jing:
$ jing -c schema.rnc doc.xml
The -c is important: jing by default assumes RELAX NG in XML form.
Using rnv to check that schema.rnc itself is valid:
$ rnv -c schema.rnc
and to validate doc.xml:
$ rnv schema.rnc doc.xml
rnv allows validating multiple documents at once:
$ rnv schema.rnc doc.xml otherdoc.xml anotherone.xml
RELAX NG Compact syntax - pros
- very readable, even a newbie should understand the text
- easy to learn (RELAX NG comes with a good tutorial, one can learn most of it within one day)
- very flexible (although it looks simple, it covers many situations, some of which cannot be handled even by XML Schema 1.0)
- there are tools for converting it into other formats (RELAX NG XML form, XML Schema 1.0, DTD) and even for generating a sample XML document
RELAX NG limitations
- multiplicity can only be "zero or one", "just one", "zero or more" or "one or more". (A small bounded number of elements can be described by "stupid repetition" of "zero or one" definitions.)
- There are XML Schema 1.0 constructs which cannot be described by RELAX NG.
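The "repetition" workaround from the first limitation can be sketched as follows. This is a hand-written illustration in RELAX NG XML syntax, assuming a hypothetical rule that label may appear at most twice (that limit is not part of the example above); the created attribute is left out for brevity:

```xml
<element name="objectRoot" xmlns="http://relaxng.org/ns/structure/1.0">
  <element name="v"><text/></element>
  <!-- Zero, one or two label elements, spelled out by nesting optionals. -->
  <optional>
    <element name="label"><text/></element>
    <optional>
      <element name="label"><text/></element>
    </optional>
  </optional>
</element>
```

In compact syntax the same idea would read roughly as (element label { text }, element label { text }?)?. It works, but it gets unwieldy quickly for larger bounds.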
Conclusions
For the requirements defined above, RELAX NG Compact syntax looks like the best fit. With RELAX NG you get both: a human-readable schema which is also usable for automated validation.
The existing limitations do not come into play very often and can in many cases be resolved by comments or other means.
You might try documenting it by creating an XSD schema which would provide a more formal specification of your XML. Many tools will generate the XSD for you from sample XML as a starting point.
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="objectRoot">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="v" minOccurs="1" type="xs:string"/> <!-- current version -->
        <xs:element name="label" type="xs:string"/> <!-- object name -->
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
Personally, I would prefer seeing it in XML (the 2nd way).
Putting the elements in a table won't tell you clearly which elements are the parents or children of which other elements. Putting it in XML is clearer, and I can see what's going on.
Showing it in a table has its limitations, e.g. multiple levels of nested children, but for a simple XML structure I think this would be fine. For anything with more than one nested level I would prefer the XML way.
An even better way would be to create an XML Schema (XSD) file. That way, you get the benefits of seeing it in XML, and you can use software to check the file against the schema after the data has been entered.
For a great series of tutorials on XSD, check out w3schools - XML Schema Tutorial
I just want to add one more thing, in case someone finds it useful.
I sometimes program in HTML and other times on Android. When I work with HTML, I document my custom XML following the same format as W3Schools, as in http://www.w3schools.com/tags/att_a_href.asp. If it is an Android project I am working on, I follow Google's standards, as in http://developer.android.com/guide/topics/manifest/activity-element.html#screen
This way the programmers I work with do not have to do any extra work to understand my documentation.