Noting DTDs in speech grammar XML (SRGS)
Can someone please explain why both quoted lines are necessary in
<!DOCTYPE grammar PUBLIC "-//W3C//DTD GRAMMAR 1.0//EN"
"http://www.w3.org/TR/speech-grammar/grammar.dtd">
? This is from the official SRGS (http://www.w3.org/TR/speech-grammar/) document, Section 2.7. One would think just the latter quoted line, which gives the location of the DTD, would be enough. I suspect it has something to do with specifying the language as English, but the document does not explain this. Thanks.
Only the latter string (the system identifier) must be present if an XML document is supposed to be valid against a DTD.
The first quoted string is the public identifier, and it is optional. It is used to uniquely identify a DTD (or other external entity) by name instead of by physical address. A public identifier can often be assumed to be more stable than an http: or file: URL. It makes it possible to locate a DTD, for example through a local catalog, even if the system identifier is wrong or internet access is down.
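To make the lookup-by-name idea concrete, here is a minimal sketch of how a tool might resolve a DTD. The CATALOG dictionary, the local file path, and the resolve_dtd function are all hypothetical names made up for illustration; real XML catalogs have more rules than this.

```python
# Hypothetical catalog: maps a public identifier to a local copy of the DTD.
# The local path below is an assumption, not a standard location.
CATALOG = {
    "-//W3C//DTD GRAMMAR 1.0//EN": "/usr/share/xml/srgs/grammar.dtd",
}


def resolve_dtd(public_id, system_id):
    """Prefer a catalog entry for the public identifier.

    Falls back to the system identifier (the URL) when there is no
    public identifier or the catalog does not know it.
    """
    if public_id is not None and public_id in CATALOG:
        return CATALOG[public_id]
    return system_id
```

With a catalog entry in place, the network location in the system identifier is never consulted; without one, the system identifier is all the tool has to go on.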
It would be OK to use only
<!DOCTYPE grammar SYSTEM "http://www.w3.org/TR/speech-grammar/grammar.dtd">
Note the use of the SYSTEM keyword in this case.
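If you want to see what a parser actually records for the two forms, the following sketch uses Python's standard xml.dom.minidom module. The one-element grammar documents and the doctype_ids helper are made up for the example; minidom does not fetch or validate against the DTD here, it only reports the identifiers.

```python
from xml.dom import minidom

# Two minimal documents: one with the PUBLIC form, one with the SYSTEM form.
PUBLIC_FORM = (
    '<!DOCTYPE grammar PUBLIC "-//W3C//DTD GRAMMAR 1.0//EN" '
    '"http://www.w3.org/TR/speech-grammar/grammar.dtd">'
    "<grammar/>"
)
SYSTEM_FORM = (
    '<!DOCTYPE grammar SYSTEM "http://www.w3.org/TR/speech-grammar/grammar.dtd">'
    "<grammar/>"
)


def doctype_ids(xml_text):
    """Return (public identifier, system identifier) from the DOCTYPE."""
    doc = minidom.parseString(xml_text)
    return doc.doctype.publicId, doc.doctype.systemId


for label, text in (("PUBLIC", PUBLIC_FORM), ("SYSTEM", SYSTEM_FORM)):
    print(label, doctype_ids(text))
```

Both forms carry the same system identifier; only the PUBLIC form adds a public identifier on top of it.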
See also http://www.xml.com/axml/target.html#sec-external-ent.
Regarding usage of DTDs hosted by W3C, you might be interested in http://www.w3.org/blog/systeam/2008/02/08/w3c_s_excessive_dtd_traffic/.
Actually, in SGML (which XML is based on) the URL (the system identifier) is just as optional as the public identifier. If the keyword PUBLIC is present, the document should always be validated using a built-in DTD (from a catalog) that matches the given formal public identifier (FPI), "-//W3C//DTD GRAMMAR 1.0//EN" in your case. In SGML the public identifier could be followed by a system identifier, which should be taken as a hint about the DTD that is used. In XML this has changed so that if the public identifier is present, it must be followed by a system identifier; but this doesn't change the logic or purpose of these identifiers.
The public identifier indicates the specification this document follows, so in practice it names the grammar used in the whole document, and in a way it serves the same purpose that namespaces serve in XML nowadays. Public identifiers usually follow a common structure and therefore contain a language code. This code only indicates the (natural/human) language in which the referenced specification was written; it does not mean that the document itself uses that language. In your case, the DOCTYPE does not mean that your document should use, refer to, or be about the English language.
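The common structure is easy to see if you split the identifier on its "//" separators. This is only a sketch: parse_fpi and its field names are made up for illustration, and it assumes the usual four-field layout rather than handling every form SGML allows.

```python
def parse_fpi(fpi):
    """Split a formal public identifier into its '//'-separated fields.

    Assumes the common four-field layout:
    registration // owner // class and description // language
    """
    registration, owner, text, language = fpi.split("//")
    # The third field starts with the text class (e.g. DTD), then a description.
    text_class, _, description = text.partition(" ")
    return {
        "registration": registration,  # "-" conventionally marks an unregistered owner
        "owner": owner,
        "text_class": text_class,
        "description": description,
        "language": language,          # language of the referenced specification
    }


fields = parse_fpi("-//W3C//DTD GRAMMAR 1.0//EN")
print(fields["owner"], fields["text_class"], fields["language"])
```

The trailing EN is just the last of these fields; it describes the DTD, not the grammar document that points to it.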
Instead of PUBLIC, the DOCTYPE declaration can also contain the keyword SYSTEM, which means that the DTD should be retrieved in a system-specific fashion. SYSTEM is followed by only a URL or a file path.