NSXMLParser column number is wrong

I am trying to parse an XML file using NSXMLParser, but [parser columnNumber] returns the wrong value. For example, my .xml file contains:

...
<Test><something type="great"><lol>Joy</lol> // Three elements in the same line
...
</something>
</Test>

For the element "Test", I get the correct line:

<Test><something type="great"><lol>Joy</lol>

But the column number is "6". In the same line, I get the column number "22" for the element "something":

"great"><lol>Joy</lol>

Is this expected behavior?


Edit. Two headaches ago I was still hopeful. Now I think it is much better to reformat the file to avoid strange things like elements in the same line and do some whitespace cleaning. But this is strange. What a bug.


Well, this is quite strange, but I'm writing an answer anyway.

I ran some tests with an example XML file and logged the line/column numbers reported by NSXMLParser.

<?xml version="1.0"?>
<catalog class="something">
   <book id="bk101">
      <author>Gambardella, Matthew</author>
      <title>XML Developer's Guide</title>
      <genre>Computer</genre>
      <price>44.95</price>
      <publish_date>2000-10-01</publish_date>
      <description>An in-depth look at creating applications 
      with XML.</description>
...

Some logs in the form of

column, line
OPEN/CLOS element

(I'm pretty noob at intelligent debugging)

15, 2 (A)
OPEN catalog
12, 3 (B)
OPEN book
14, 4
OPEN author
CLOS author
13, 5
OPEN title
CLOS title
13, 6
OPEN genre
CLOS genre
13, 7
OPEN price
CLOS price
20, 8
OPEN publish_date
CLOS publish_date
19, 9
OPEN description
CLOS description
CLOS book
...

There is a formula that always works [1]:

columnPosition = columnNumber - length("<element>")

For example, consider the second line and the log near (A):

<catalog class="something">

I expect columnPosition to equal 0, and indeed:

length("<catalog class>") = 15
0 = 15 - length("<catalog class>")

Note that NSXMLParser's columnNumber is still 15 whatever value I give the "class" attribute, but it is 9 when I remove the attribute entirely. With the following line:

<catalog>

I expect that columnPosition equals 0, in fact:

length("<catalog>") = 9
0 = 9 - length("<catalog>")

Now, consider the following line and the log near (B):

   <book id="bk101">

I expect columnPosition to equal 3, and indeed:

length("<book id>") = 9
3 = 12 - length("<book id>")
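
The checks above can be reproduced with a small sketch. It is written in Python purely to illustrate the arithmetic and is not NSXMLParser code. The names stripped_tag_length and column_position are hypothetical, and the rule that attribute values are not counted is inferred from the observations above, not taken from any documentation. The last two lines apply the same formula to the values reported in the question (6 for "Test", 22 for "something").

```python
def stripped_tag_length(name, attr_names=()):
    """Length of the start tag with attribute values dropped, e.g. '<book id>'."""
    return len("<" + " ".join([name, *attr_names]) + ">")


def column_position(column_number, name, attr_names=()):
    """Empirical formula: reported columnNumber minus the stripped tag length."""
    return column_number - stripped_tag_length(name, attr_names)


# Values taken from the logs marked (A) and (B), plus the bare <catalog> test.
print(column_position(15, "catalog", ["class"]))   # 0
print(column_position(9, "catalog"))               # 0
print(column_position(12, "book", ["id"]))         # 3

# Values from the question: <Test><something type="great">...
print(column_position(6, "Test"))                  # 0
print(column_position(22, "something", ["type"]))  # 6
```

The result for "something" is 6, which is where that element actually starts on the line, right after "<Test>".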

Well, this is strange. I don't think this is an excellent solution, but at least it works. I can't simply strip the whitespace at the beginning of the line, because that fails if there is a line like:

<catalog class="something"><book id="bk101">

What do you think about this? I'm feeling kinda noob but I'm going to check this one as the accepted answer if there is no other way. I'm looking forward to what you guys think.


[1] Formal proof omitted for brevity and lack of will.


Why don't you increment a level counter in the startElement method and decrement it in endElement? That way you keep track of the nesting level.
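
A minimal sketch of that idea, using Python's xml.sax as a stand-in for the NSXMLParser delegate (where the equivalent hooks would be parser:didStartElement:... and parser:didEndElement:...). The DepthHandler class and its level and seen attributes are hypothetical names chosen for illustration. The input is the line from the question with the elided part left out.

```python
import xml.sax


class DepthHandler(xml.sax.ContentHandler):
    """Tracks nesting depth instead of relying on column numbers."""

    def __init__(self):
        super().__init__()
        self.level = 0   # current nesting depth
        self.seen = []   # (element name, depth at which it opened)

    def startElement(self, name, attrs):
        self.seen.append((name, self.level))
        self.level += 1  # entering an element: one level deeper

    def endElement(self, name):
        self.level -= 1  # leaving an element: back up one level


handler = DepthHandler()
xml.sax.parseString(
    b'<Test><something type="great"><lol>Joy</lol></something></Test>',
    handler,
)
print(handler.seen)  # [('Test', 0), ('something', 1), ('lol', 2)]
```

The depth is the same whether the elements share one line or are spread over several, so it does not depend on how the file is formatted.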
