Checking if a string is a valid tag/attribute name for an XML document

Scenario

I need to write a function that validates XML tag names (or attribute names).

E.g.:

  • "div" is valid
  • "d<iv" is not valid
  • "d\iv" is not valid

If a string is not valid, I should find the characters that make it invalid and replace them with some arbitrary character (or remove them).

E.g.:

  • "d<iv" is not valid -> I replace it with "div".

These functions will be called heavily, so I need to take code efficiency into consideration.

My problem(s)

  • What are the rules that describe a valid XML tag/attribute name? Is it safe to assume that a valid XML tag/attribute name follows the same rules as a Java variable name? Or are those rules too restrictive?
  • Should I use the Java regex package, or should I write my own specialized method? (As I said, speed is important.)
  • Do you have any suggestions?

Thank you!


The rules are defined in the XML spec (look at the definition of Name).
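As a rough illustration of those rules, here is a minimal sketch of a validator based on the NameStartChar and NameChar productions of XML 1.0 (Fifth Edition). The class name XmlNames and its method names are made up for this example and are not part of any library. It also shows why Java identifier rules are not a good stand-in: '$' is legal in a Java identifier but not in an XML name, while '-', '.' and ':' are legal in an XML name but not in a Java identifier.

```java
// Hypothetical helper class; the names are illustrative only.
final class XmlNames {
    private XmlNames() {}

    /** True if the code point matches NameStartChar (XML 1.0, Fifth Edition). */
    static boolean isNameStartChar(int c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_' || c == ':'
            || (c >= 0xC0 && c <= 0xD6) || (c >= 0xD8 && c <= 0xF6)
            || (c >= 0xF8 && c <= 0x2FF) || (c >= 0x370 && c <= 0x37D)
            || (c >= 0x37F && c <= 0x1FFF) || (c >= 0x200C && c <= 0x200D)
            || (c >= 0x2070 && c <= 0x218F) || (c >= 0x2C00 && c <= 0x2FEF)
            || (c >= 0x3001 && c <= 0xD7FF) || (c >= 0xF900 && c <= 0xFDCF)
            || (c >= 0xFDF0 && c <= 0xFFFD) || (c >= 0x10000 && c <= 0xEFFFF);
    }

    /** True if the code point matches NameChar: NameStartChar plus a few extras. */
    static boolean isNameChar(int c) {
        return isNameStartChar(c) || c == '-' || c == '.' || (c >= '0' && c <= '9')
            || c == 0xB7 || (c >= 0x300 && c <= 0x36F) || (c >= 0x203F && c <= 0x2040);
    }

    /** True if s matches Name ::= NameStartChar (NameChar)*. */
    static boolean isValidName(String s) {
        if (s == null || s.isEmpty()) return false;
        int first = s.codePointAt(0);
        if (!isNameStartChar(first)) return false;
        // Walk by code point so characters outside the BMP are handled correctly.
        for (int i = Character.charCount(first); i < s.length(); ) {
            int c = s.codePointAt(i);
            if (!isNameChar(c)) return false;
            i += Character.charCount(c);
        }
        return true;
    }
}
```

With this sketch, XmlNames.isValidName("div") is true, while "d<iv", "d\iv" and "1div" are all rejected. Note that ':' passes this check because the Name production allows it, but namespace-aware processing treats it as a prefix separator, so you may want to reject it yourself.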

If speed matters, then don't use regular expressions. Do it more like this:

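public static String correctName(String name) {
  // The result is never longer than the input, so pre-size the builder.
  StringBuilder nameBuilder = new StringBuilder(name.length());
  for (char nameChar : name.toCharArray())
     if (isValidXml(nameChar))          // some magic left to do ;) (isValidXml is not implemented here)
         nameBuilder.append(nameChar);  // keep only the characters that are allowed
  return nameBuilder.toString();
}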

Note: the code above is only a rough guideline. It does not cover the little annoyance that the first character of an XML name has a narrower range of allowed values than the following ones. If you want to correct illegal names like $%&div, it gets a bit more complicated (more magic needed).
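One possible way to deal with the first-character problem is sketched below (Java 8 or later). It is only an illustration under stated assumptions: the class XmlNameCorrector, the '_' prefix/fallback policy and the two ASCII-only demo predicates are all invented for this example. In real use you would pass predicates that implement the spec's NameStartChar and NameChar ranges, and every start character must also be accepted by the name-character predicate.

```java
import java.util.function.IntPredicate;

// Hypothetical helper; the names and the '_' policy are assumptions for illustration.
final class XmlNameCorrector {
    private XmlNameCorrector() {}

    // ASCII-only demo predicates (a deliberate simplification of the spec's ranges).
    static final IntPredicate ASCII_START =
        c -> (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_' || c == ':';
    static final IntPredicate ASCII_NAME =
        c -> ASCII_START.test(c) || (c >= '0' && c <= '9') || c == '-' || c == '.';

    /**
     * Removes characters that are never allowed in a name. If the first kept
     * character is allowed only in later positions (a digit, '-', '.'), an
     * underscore is put in front of it. An empty result becomes "_".
     */
    static String correctName(String name, IntPredicate isStartChar, IntPredicate isNameChar) {
        StringBuilder out = new StringBuilder(name.length() + 1);
        for (int i = 0; i < name.length(); ) {
            int c = name.codePointAt(i);
            i += Character.charCount(c);
            if (!isNameChar.test(c)) continue;           // illegal everywhere: drop it
            if (out.length() == 0 && !isStartChar.test(c)) {
                out.append('_');                         // legal later, but not first
            }
            out.appendCodePoint(c);
        }
        return out.length() == 0 ? "_" : out.toString();
    }
}
```

Called as XmlNameCorrector.correctName("$%&div", XmlNameCorrector.ASCII_START, XmlNameCorrector.ASCII_NAME), this returns "div"; "1div" becomes "_1div". Whether to prefix, drop, or reject in such cases is a design choice that depends on what the corrected names are used for, and two different inputs can end up with the same corrected name.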
