What’s the purpose of the semicolon (;) in XmlAttribute.AttributeName (.NET)?

I’m trying to solve a programming mystery (to me). I have searched Google but found nothing about the use of a semicolon in the AttributeName property of XmlAttribute. I’m working with an application that serializes an object. When this object is serialized, all the attribute names have the same value as a postfix.

For example:

[XmlType(TypeName = "Foo", Namespace = Declarations.SchemaVersion), XmlRoot, Serializable]
public class Foo
{
    private string _Name;

    [XmlAttribute(AttributeName = "Name;", Form = XmlSchemaForm.Unqualified, DataType = "string", Namespace = Declarations.SchemaVersion)]
    public string Name
    {
        get
        {
            return this._Name;
        }
        set
        {
            this._Name = value;
        }
    }
}

It gets serialized as:

<Foo Name_x003B_="John" />

My question is: where does this x003B come from? (I have searched the code for a literal "x003B" but found nothing; the above is just an example, I’m working with a big code base.) Where can I change it? What is the purpose of the semicolon at the end of the AttributeName? Thanks!


The XmlSerializer encodes characters that are not valid in an XML name, such as the semicolon, by wrapping the character's four-digit hex code point in underscores, so Name; becomes Name_x003B_. If you put a question mark in there instead, it would be Name_x003F_.
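The same escaping can be reproduced outside the serializer. The sketch below assumes that `XmlConvert.EncodeLocalName` applies the same `_xHHHH_` scheme that shows up in the serialized output; the thread itself does not confirm that XmlSerializer calls this method, so treat it as an illustration of the encoding rather than a trace of the serializer's internals.

```csharp
using System;
using System.Xml;

// Escape characters that are not allowed in an XML local name.
// ';' is U+003B, so it is written as _x003B_.
string encoded = XmlConvert.EncodeLocalName("Name;");
Console.WriteLine(encoded);   // Name_x003B_

// DecodeName reverses the escaping and recovers the original string.
string decoded = XmlConvert.DecodeName(encoded);
Console.WriteLine(decoded);   // Name;
```

This also explains why searching the code base for a literal "x003B" finds nothing: the text is generated at serialization time from the ";" in the AttributeName string.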


That's the encoded form of the semicolon: ; = _x003B_ (0x3B is its character code).


I am surprised it's allowed as a single character in an XML attribute name. XML does not allow:

<Foo Name;="bar"/>

From the spec:

The first character of a Name MUST be a NameStartChar, and any other characters MUST be NameChars; this mechanism is used to prevent names from beginning with European (ASCII) digits or with basic combining characters. Almost all characters are permitted in names, except those which either are or reasonably could be used as delimiters. The intention is to be inclusive rather than exclusive, so that writing systems not yet encoded in Unicode can be used in XML names. See J Suggestions for XML Names for suggestions on the creation of names.

Document authors are encouraged to use names which are meaningful words or combinations of words in natural languages, and to avoid symbolic or white space characters in names. Note that COLON, HYPHEN-MINUS, FULL STOP (period), LOW LINE (underscore), and MIDDLE DOT are explicitly permitted.

The ASCII symbols and punctuation marks, along with a fairly large group of Unicode symbol characters, are excluded from names because they are more useful as delimiters in contexts where XML names are used outside XML documents; providing this group gives those contexts hard guarantees about what cannot be part of an XML name. The character #x037E, GREEK QUESTION MARK, is excluded because when normalized it becomes a semicolon, which could change the meaning of entity references.
Names and Tokens
[4]    NameStartChar    ::=    ":" | [A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] | [#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF] | [#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] | [#x10000-#xEFFFF]
[4a]    NameChar    ::=    NameStartChar | "-" | "." | [0-9] | #xB7 | [#x0300-#x036F] | [#x203F-#x2040]
[5]    Name    ::=    NameStartChar (NameChar)*
[6]    Names    ::=    Name (#x20 Name)*
[7]    Nmtoken    ::=    (NameChar)+
[8]    Nmtokens    ::=    Nmtoken (#x20 Nmtoken)*
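
To check a candidate name against these productions in .NET, one option is `XmlConvert.VerifyName`, which throws an `XmlException` for an invalid name. The helper `IsValidXmlName` below is a hypothetical wrapper written for this illustration; it is not part of the framework.

```csharp
using System;
using System.Xml;

// Hypothetical helper: true if the string is a valid XML Name.
bool IsValidXmlName(string name)
{
    try
    {
        XmlConvert.VerifyName(name);   // throws XmlException if invalid
        return true;
    }
    catch (XmlException)
    {
        return false;
    }
}

Console.WriteLine(IsValidXmlName("Name"));         // True
Console.WriteLine(IsValidXmlName("Name;"));        // False: ';' is not a NameChar
Console.WriteLine(IsValidXmlName("Name_x003B_"));  // True: only letters, digits and '_'
```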

What appears to have happened is that the system has translated the forbidden character ';' into an escape sequence built from its Unicode code point (U+003B). A human can read the sequence, but a generic XML parser will treat it as ordinary name characters and will not turn it back into a semicolon. I do not think this is standard behaviour across all XML implementations.

I would also suspect it is likely to be a mistake because an attribute name of "Name;" is likely to cause problems in some XML tools.


Doesn't it come from

AttributeName = "Name;"