
How to achieve this in XML schema?

This is the XML I want to be able to validate:

<root>
    <A>
        <C>asd</C>
        <D>asd</D>
        <E>asd</E>
    </A>
    <B>
        <C>asd</C>
        <D>asd</D>
        <E>asd</E>
        <F>asd</F>
    </B>
</root>

Here are some more limits:

  • There can be multiple A and B elements, in any order.
  • A and B have exactly the same contents, except that B may also contain element F;
  • C, D, E and F may appear in any order.
  • E can appear multiple times;
  • C and D can appear 0 or 1 times;
  • F must appear exactly 1 time

Is this possible? And on a side note - why is XML schema so awkward in defining such simple scenarios?


The schema below satisfies most of your conditions; the hard part is allowing the elements to appear in any order. Because you are dealing with complex types, the main tool is the xs:sequence compositor. There are other compositors (xs:choice, xs:all), but they do not fit your scenario directly either. What looks simple from a pure XML perspective is not simple from a validation perspective. The main thing to note is that, the way this schema is built, you have to put all your <A> elements first and all your <B> elements second, and the children of each must appear in the declared order. For reference material on XML Schema, see w3schools.

There may be more complicated ways to do exactly what you want, but this at least gives you the basic pattern.

<xs:schema
     xmlns:xs="http://www.w3.org/2001/XMLSchema"
     elementFormDefault="qualified" attributeFormDefault="unqualified">

<xs:complexType name="Avalue">
  <xs:sequence>
    <xs:element minOccurs="0" maxOccurs="1" name="C" type="xs:string"/>
    <xs:element minOccurs="0" maxOccurs="1" name="D" type="xs:string"/>
    <xs:element minOccurs="0" maxOccurs="unbounded" name="E" type="xs:string"/>
  </xs:sequence>
</xs:complexType>


<xs:complexType name="Bvalue">
  <xs:sequence>
    <xs:element minOccurs="0" maxOccurs="1" name="C" type="xs:string"/>
    <xs:element minOccurs="0" maxOccurs="1" name="D" type="xs:string"/>
    <xs:element minOccurs="0" maxOccurs="unbounded" name="E" type="xs:string"/>
    <xs:element minOccurs="1" maxOccurs="1" name="F" type="xs:string"/>
  </xs:sequence>
</xs:complexType>

<xs:element name="root">
  <xs:complexType>
    <xs:sequence>
      <xs:element minOccurs="0" maxOccurs="unbounded" name="A" type="Avalue"/>
      <xs:element minOccurs="0" maxOccurs="unbounded" name="B" type="Bvalue"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

</xs:schema>
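
One small change lifts the restriction that every <A> must precede every <B>: declare the two elements inside a repeating xs:choice instead of an xs:sequence. This is only a sketch that reuses the Avalue and Bvalue types from the schema above; the order of C, D, E and F inside each element stays fixed.

```xml
<!-- Sketch: replaces the "root" declaration above.
     Avalue and Bvalue are the types defined in that schema. -->
<xs:element name="root">
  <xs:complexType>
    <!-- Each repetition of the choice picks one A or one B,
         so A and B elements may be mixed in any order. -->
    <xs:choice minOccurs="0" maxOccurs="unbounded">
      <xs:element name="A" type="Avalue"/>
      <xs:element name="B" type="Bvalue"/>
    </xs:choice>
  </xs:complexType>
</xs:element>
```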


If you're just looking for code-completion (as you mentioned in a comment), then just define each type as an unlimited choice of the possible children. For example, for B:

  <xs:choice minOccurs="0" maxOccurs="unbounded">
    <xs:element name="C" type="xs:string"/>
    <xs:element name="D" type="xs:string"/>
    <xs:element name="E" type="xs:string"/>
    <xs:element name="F" type="xs:string"/>
  </xs:choice>
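
For completeness, the matching definition for A might look like the sketch below. The type name Avalue is borrowed from the earlier answer and is an assumption here. Note that this pattern is deliberately permissive: it accepts any number of each child, so none of the cardinality limits from the question are enforced.

```xml
<!-- Sketch only: any mix of C, D and E, in any order and any quantity.
     Good enough for editor code completion, too loose for strict validation. -->
<xs:complexType name="Avalue">
  <xs:choice minOccurs="0" maxOccurs="unbounded">
    <xs:element name="C" type="xs:string"/>
    <xs:element name="D" type="xs:string"/>
    <xs:element name="E" type="xs:string"/>
  </xs:choice>
</xs:complexType>
```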

"And on a side note - why is XML schema so awkward in defining such simple scenarios?"

Partly because XML, with its sparse DTDs, had won out over SGML, with its richer DTDs, by keeping features to a minimum and staying friendly to simple processors. XML Schema therefore aimed to stay closer to XML than to SGML.


If the order of C, D, E, and F children conveys no information, then indeed it is simpler for many purposes to specify a fixed order for them, as suggested by the OP in a comment on another answer.

In XSD 1.0, allowing C, D, and E to appear in any order with the cardinality constraints specified requires a rather verbose content model. Using regular-expression syntax (and allowing whitespace for legibility), the content model for A would be

(C ((DE+)|(E+(DE*)?)) )
|
(D ((CE+)|(E+(CE*)?)) )
|
(E+ ((DE*(CE*)?)|(CE*(DE*)?))? )

Since the content model for B adds one bit to the amount of information which must be carried by each state in the finite state automaton for the language (namely 'have we seen an F yet?'), it not surprisingly doubles the size of the minimal FSA and makes the content model much larger:

C ( (D ((E+FE*)|(FE+)) )
  | (E+ ((DE*FE*)|(FE*(DE*)?)))
  | (F ((DE+)|(E+(DE*)?)))
  )
|
D ( (C ((E+FE*)|(FE+)))
  | (E+ ((CE*FE*)|(FE*(CE*)?)))
  | (F ((CE+)|(E+(CE*)?)))
  )
|
E+ ( (CE* ((DE*FE*)|(FE*(DE*)?)))
  |  (DE* ((CE*FE*)|(FE*(CE*)?)))
  |  (FE* ((CE*(DE*)?)|(DE*(CE*)?))?)
  )
|
F ( (C ((DE+)|(E+(DE*)?)))
  | (D ((CE+)|(E+(CE*)?)))
  | (E+ ((CE*(DE*)?)|(DE*(CE*)?))?)
  )

This is a bit tedious to work out, but it's certainly legal XSD 1.0 and it captures the constraints described by the OP; it thus demonstrates that it's an error to say that the constraints cannot be captured by XSD 1.0. The most that can be claimed is that it's not possible to capture the OP's requirements succinctly in XSD 1.0. That's a property XSD 1.0 content models share with the DTD content models, regular expressions, and conventional notations for context-free grammars on which XSD 1.0 content models are based.
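
To make the translation concrete, here is a sketch of the content model for A written out with nested xs:sequence and xs:choice particles, one top-level branch per alternative of the regular expression given above. The type name AvalueAnyOrder is hypothetical, and typing every child as xs:string is an assumption carried over from the earlier answer. The content model for B follows the same mechanical pattern, just with more branches.

```xml
<!-- Sketch: XSD 1.0 content model for A, i.e. C? D? E+ in any order.
     AvalueAnyOrder is a made-up name; xs:string children are assumed. -->
<xs:complexType name="AvalueAnyOrder">
  <xs:choice>
    <!-- C ((D E+) | (E+ (D E*)?)) -->
    <xs:sequence>
      <xs:element name="C" type="xs:string"/>
      <xs:choice>
        <xs:sequence>
          <xs:element name="D" type="xs:string"/>
          <xs:element name="E" type="xs:string" maxOccurs="unbounded"/>
        </xs:sequence>
        <xs:sequence>
          <xs:element name="E" type="xs:string" maxOccurs="unbounded"/>
          <xs:sequence minOccurs="0">
            <xs:element name="D" type="xs:string"/>
            <xs:element name="E" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
          </xs:sequence>
        </xs:sequence>
      </xs:choice>
    </xs:sequence>
    <!-- D ((C E+) | (E+ (C E*)?)) -->
    <xs:sequence>
      <xs:element name="D" type="xs:string"/>
      <xs:choice>
        <xs:sequence>
          <xs:element name="C" type="xs:string"/>
          <xs:element name="E" type="xs:string" maxOccurs="unbounded"/>
        </xs:sequence>
        <xs:sequence>
          <xs:element name="E" type="xs:string" maxOccurs="unbounded"/>
          <xs:sequence minOccurs="0">
            <xs:element name="C" type="xs:string"/>
            <xs:element name="E" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
          </xs:sequence>
        </xs:sequence>
      </xs:choice>
    </xs:sequence>
    <!-- E+ ((D E* (C E*)?) | (C E* (D E*)?))? -->
    <xs:sequence>
      <xs:element name="E" type="xs:string" maxOccurs="unbounded"/>
      <xs:choice minOccurs="0">
        <xs:sequence>
          <xs:element name="D" type="xs:string"/>
          <xs:element name="E" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
          <xs:sequence minOccurs="0">
            <xs:element name="C" type="xs:string"/>
            <xs:element name="E" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
          </xs:sequence>
        </xs:sequence>
        <xs:sequence>
          <xs:element name="C" type="xs:string"/>
          <xs:element name="E" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
          <xs:sequence minOccurs="0">
            <xs:element name="D" type="xs:string"/>
            <xs:element name="E" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
          </xs:sequence>
        </xs:sequence>
      </xs:choice>
    </xs:sequence>
  </xs:choice>
</xs:complexType>
```

Each branch begins with a different element name, and within a branch the next element always determines which particle it belongs to, so the model should satisfy the deterministic-content-model (Unique Particle Attribution) rule of XSD 1.0.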

Other schema languages can handle this more succinctly: in Relax NG the interleave operator makes it relatively straightforward to encode constraints of this kind. And in XSD 1.1 the relaxation of constraints on the all group makes it fairly straightforward to capture the requirements:

<xs:complexType name="CDE">
  <xs:all>
    <xs:element ref="C" minOccurs="0" maxOccurs="1"/>
    <xs:element ref="D" minOccurs="0" maxOccurs="1"/>
    <xs:element ref="E" minOccurs="1" maxOccurs="unbounded"/>
  </xs:all>
</xs:complexType>

<xs:complexType name="CDEF">
  <xs:complexContent>
    <xs:extension base="CDE">
      <xs:all>
        <xs:element ref="F" minOccurs="1" maxOccurs="1"/>
      </xs:all>
    </xs:extension>
  </xs:complexContent>
</xs:complexType> 
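
Because the two types above use ref="...", they rely on top-level declarations of C, D, E and F. A minimal sketch of those declarations, together with a root element that uses the types, could look like this. It assumes a schema with no target namespace and xs:string content for all four children, neither of which is stated in the question.

```xml
<!-- Sketch: global declarations that the ref="..." attributes point to. -->
<xs:element name="C" type="xs:string"/>
<xs:element name="D" type="xs:string"/>
<xs:element name="E" type="xs:string"/>
<xs:element name="F" type="xs:string"/>

<!-- Sketch: root allowing A (type CDE) and B (type CDEF) in any order. -->
<xs:element name="root">
  <xs:complexType>
    <xs:choice minOccurs="0" maxOccurs="unbounded">
      <xs:element name="A" type="CDE"/>
      <xs:element name="B" type="CDEF"/>
    </xs:choice>
  </xs:complexType>
</xs:element>
```

A schema like this needs an XSD 1.1 processor; an XSD 1.0 validator would reject the all groups because of maxOccurs="unbounded" on E and the extension of one all group by another.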

This does complicate the work of the validator (put in the simplest possible way: it means the validator author cannot use standard textbook algorithms, because standard textbooks don't cover the interleave operator), but for better or worse many vocabulary designers prefer not to constrain order even in cases where order conveys no information and thus does not need to be unconstrained.
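
For comparison, the Relax NG interleave operator mentioned above can express the same constraints roughly as follows. This is a sketch in the Relax NG XML syntax; the pattern name CDE is chosen here for illustration, and plain text content for every child is an assumption.

```xml
<grammar xmlns="http://relaxng.org/ns/structure/1.0">
  <start>
    <element name="root">
      <!-- Any number of A and B elements, in any order. -->
      <zeroOrMore>
        <choice>
          <element name="A">
            <ref name="CDE"/>
          </element>
          <element name="B">
            <!-- B is A's content interleaved with exactly one F. -->
            <interleave>
              <ref name="CDE"/>
              <element name="F"><text/></element>
            </interleave>
          </element>
        </choice>
      </zeroOrMore>
    </element>
  </start>

  <!-- C and D at most once, E at least once, in any order. -->
  <define name="CDE">
    <interleave>
      <optional><element name="C"><text/></element></optional>
      <optional><element name="D"><text/></element></optional>
      <oneOrMore><element name="E"><text/></element></oneOrMore>
    </interleave>
  </define>
</grammar>
```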
