Working with strings in an XML file (C++)

I have a function that retrieves the text between the title and link tags in an XML file, but what I want is to test whether the title and link tags are between item tags. This is my code:

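    // Fragment: needs <sstream>, <string>, <list> and `using namespace std;`.
    // `content`, `titles` and `links` are declared elsewhere.
    istringstream iss(content);
    string line;
    while (getline(iss, line))
    {
        // get <title> and </title> positions (string::npos if absent)
        const string::size_type found3 = line.find("<title>");
        const string::size_type found4 = line.find("</title>");
        // get <link> and </link> positions (string::npos if absent)
        const string::size_type found5 = line.find("<link>");
        const string::size_type found6 = line.find("</link>");

        // if both tags were found, in order, add the text to the std::list
        if (found3 != string::npos && found4 != string::npos && found4 > found3)
        {
            // 7 == length of "<title>"
            string getTitleStr = line.substr(found3 + 7, found4 - found3 - 7);
            titles.push_back(getTitleStr);
        }
        if (found5 != string::npos && found6 != string::npos && found6 > found5)
        {
            // 6 == length of "<link>"
            string getLinkStr = line.substr(found5 + 6, found6 - found5 - 6);
            links.push_back(getLinkStr);
        }
    }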

Does anyone have an idea how to do this with C++ strings only, without a parser?

Thank you.


If you don't want to "parse" the XML, then you will have to know its exact structure. As other people have commented, this is painful and will break if the supplier changes the XML structure without notifying you.

Example XML:

<!-- This is not a well-formed fragment -->
<!-- The following is a title tag without
    a corresponding link tag -->
<title>My XML file</title>
<author>Me.</author>

<!-- The following is a title followed by a link -->
<title>Google</title>
<link>http://www.google.com</link>

<!-- Nasty: nested title and link tags with
    junk between them. -->
<outer_block>  
    <title>Inner Title</title>
    <junk>Junk between title and link</junk>
    <link>link text</link>
</outer_block>

Without parsing, you can't correlate title tags with link tags unless you know the exact layout of the XML. Any variable-length or optional fields make this more difficult.

In the example above, you could say that you were only interested in the third occurrence of the title tag. This is easy: just use a for loop. However, to know whether a title tag is inside a block, you will have to either search backwards for a start tag or, when searching forwards, look for two start tags in a row (in other words, parsing).
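If the layout really is fixed, the forward search described above can be sketched with nothing but std::string::find. This is only a rough illustration, not a robust solution: it assumes the blocks are written exactly as <item> and </item>, that no tag carries attributes, and that items are never nested. The names textBetween and extractItems are made up for this example. It searches the whole content rather than one line at a time, so an item may span several lines.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: returns the text between `open` and `close` inside
// the range [from, to) of `s`, or an empty string if either tag is missing
// from that range.
std::string textBetween(const std::string& s, const std::string& open,
                        const std::string& close,
                        std::string::size_type from,
                        std::string::size_type to)
{
    const std::string::size_type start = s.find(open, from);
    if (start == std::string::npos || start >= to) return "";
    const std::string::size_type bodyStart = start + open.size();
    const std::string::size_type end = s.find(close, bodyStart);
    if (end == std::string::npos || end >= to) return "";
    return s.substr(bodyStart, end - bodyStart);
}

// Hypothetical helper: collects (title, link) pairs found inside
// <item>...</item> blocks. Title and link tags outside an item are ignored.
// Assumes tags have no attributes and items are not nested.
std::vector<std::pair<std::string, std::string>>
extractItems(const std::string& content)
{
    const std::string itemOpen = "<item>";
    const std::string itemClose = "</item>";
    std::vector<std::pair<std::string, std::string>> items;

    std::string::size_type pos = 0;
    while ((pos = content.find(itemOpen, pos)) != std::string::npos) {
        const std::string::size_type bodyStart = pos + itemOpen.size();
        const std::string::size_type bodyEnd =
            content.find(itemClose, bodyStart);
        if (bodyEnd == std::string::npos) break;  // unterminated item: stop

        items.emplace_back(
            textBetween(content, "<title>", "</title>", bodyStart, bodyEnd),
            textBetween(content, "<link>", "</link>", bodyStart, bodyEnd));

        pos = bodyEnd + itemClose.size();  // continue after this item
    }
    return items;
}
```

Calling extractItems(content) gives one pair per item, with an empty string wherever a title or link is missing from that item. As soon as the feed adds attributes, CDATA sections, comments, or nested blocks, this breaks, which is the point made above about needing a real parser.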

Take a look at: http://www.w3.org/TR/REC-xml/#sec-starttags
