built-in schema datatype for html / xhtml
Is there a built-in schema datatype for XHTML data? Suppose I want to specify a "boozle" element that contains two "woozles", each of which is arbitrary XHTML. I want to write something like this, using the RELAX NG compact syntax:
namespace nifty = "http://brinckerhoff.org/nifty/"
start = element nifty:boozle {woozle, woozle}
woozle = element nifty:woozle {xhtml}
Unfortunately, xmllint then signals this error:
./lab.rng:43: element ref: Relax-NG parser error : Reference xhtml has no matching definition
./lab.rng:43: element ref: Relax-NG parser error : Internal found no define for ref xhtml
So my question is this: is there something sensible that I should put in place of "xhtml" above?
Namespaces and schemas are orthogonal in RELAX NG, whereas they are tightly coupled in XML Schema. If you want to just validate that your elements are in the XHTML namespace, you can set up a rule like this:
htmlElement = element xhtml:* { (attribute * {text} | text | htmlElement)* }
along the same lines as the anyElement definition. But if you want to actually validate the content as XHTML, you should use the RELAX NG schema for XHTML: include it (there are multiple entry points, depending on whether you want XHTML 1.0 Strict, etc.) and then reference its pattern for the html element or whatever element(s) you want. When you include a full schema in your own schema, you need to write include "blahblah" { start = ... } in order to override the included schema's own start pattern.
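As a rough sketch, the include approach applied to the boozle schema from the question might look like the following. The file name "xhtml-strict.rnc" and the pattern name html are assumptions about how your copy of the XHTML schema is laid out; check the actual schema for the names it defines.

```
namespace nifty = "http://brinckerhoff.org/nifty/"

# Assumption: "xhtml-strict.rnc" is a local copy of a RELAX NG schema for
# XHTML that defines a start pattern and a pattern named "html".
include "xhtml-strict.rnc" {
  # Replace the included schema's start pattern with our own root element.
  start = element nifty:boozle { woozle, woozle }
}

# Each woozle must contain whatever the included schema's "html" pattern matches.
woozle = element nifty:woozle { html }
```

If a woozle should hold an XHTML fragment rather than a whole html document, you would reference a different pattern from the included schema instead of html; which one depends on that schema.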
Your woozles and boozles are in your namespace, while the XHTML elements are in the XHTML namespace. A schema validates a namespace: your schema validates your namespace and the XHTML schema validates the XHTML namespace. You can restrict an element to contain XHTML by mandating that all its child elements are in the XHTML namespace, but your schema should not be validating the XHTML namespace itself.
You can use the xhtml schema to validate any xhtml namespace nodes in your document. You add this schema to your processing pipeline, that is, a second validation step.
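A minimal sketch of the namespace-only restriction described above, reusing the htmlElement rule from the earlier answer. The xhtml prefix has to be declared before it can be used in a name class; the choice to allow text alongside the XHTML elements inside a woozle is an assumption about what you want.

```
namespace nifty = "http://brinckerhoff.org/nifty/"
namespace xhtml = "http://www.w3.org/1999/xhtml"

start = element nifty:boozle { woozle, woozle }

# A woozle holds text and any number of elements from the XHTML namespace.
woozle = element nifty:woozle { (text | htmlElement)* }

# Any element in the XHTML namespace, with arbitrary attributes, containing
# text and further XHTML-namespace elements. Element names are not checked.
htmlElement = element xhtml:* { (attribute * { text } | text | htmlElement)* }
```

This only checks that the children are in the XHTML namespace. Whether they form valid XHTML is left to the separate validation step.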
Ahhh..... okay, more quality time with the Relax NG documentation suggests two possible solutions to this problem.
1) Use name classes to specify an "anyElement" that matches everything, like this:
anyElement = element * { (attribute * { text } | text | anyElement)* }
This is moderately horrible, because it simply disables checking for these elements. With this definition, though, I could put "anyElement" in place of "xhtml", above.
2) It appears to me that a better solution would involve using Relax NG's "include" directive to include a full specification of xhtml, assuming I could find one.
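For reference, option 1 written out in full, with "anyElement" substituted for "xhtml" in the schema from the question:

```
namespace nifty = "http://brinckerhoff.org/nifty/"

start = element nifty:boozle { woozle, woozle }
woozle = element nifty:woozle { anyElement }

# Matches any element in any namespace, with any attributes and any content.
anyElement = element * { (attribute * { text } | text | anyElement)* }
```

As written, each woozle must contain exactly one arbitrary element. Writing (text | anyElement)* instead would allow text and any number of elements.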