C++, subtract certain strings?

2023-01-21 07:16

This is homework, so please don't give me direct answers or code; guide me toward the solution instead.

My problem: I have a file, XXX.html, that contains thousands of lines of markup, but I only need to extract this portion:

<html>
...
<table>
    <thead>
        <tr>
            <th class="xxx">xxx</th>
            <th>xxx</th>                       <th>xxx</th>         </tr>
    </thead>
    <tbody>
        <tr class=xxx>
        <td class="xxx"><a href="xxx" >ZZZ ZZ ZZZ</a></td>
<td>ZZZZ</td>        <td class="xxx">ZZZZ</td>    </tr>    <tr class=xxx>
<td class="xxx"><a href="xxx" >ZZZ ZZ ZZZ</a></td>
<td>ZZZZ</td>        <td class="xxx">ZZZZ</td>    </tr>    <tr class=xxx>
<td class="xxxx"><a href="xxxx" >ZZZ ZZ ZZZ</a></td>
<td>ZZZZ</td>        <td class="xxxx">zzzz</td>    </tr>    <tr class=xxx>
<td class="xxx"><a href="xxxx" >ZZZ ZZ ZZZ</a></td>
    ... and so on

This is my code so far:

// after open the file
while(!fileOpened.eof()){
        getline(fileOpened, reader);
        if(reader.find("ZZZ")){
            cout << reader << endl;
        }
    }

"reader" is a string variable that holds each line of the HTML file. The ZZZZ values are live data, so they will change over time; what should I use instead of the "find" method in that case? (I am really sorry for not mentioning this part earlier.)

But instead of displaying the values I want, it displays other portions of the HTML file. Why? Is my method wrong? If it is, how do I extract the ZZZZ values?

std::string::find does not return a boolean value. It returns an index into the string where the substring match occurs if it is successful, else it returns std::string::npos.

So you would want to say:

    if (reader.find("ZZZ") != std::string::npos){
        cout << reader << endl;
    }
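
To see why the original test misbehaves, here is a minimal sketch. The helper names `contains` and `buggy_contains` are made up for illustration; only `std::string::find` and `std::string::npos` come from the explanation above.

```cpp
#include <string>

// Hypothetical helper: true only when `needle` occurs somewhere in `line`.
bool contains(const std::string& line, const std::string& needle) {
    return line.find(needle) != std::string::npos;
}

// What `if (reader.find("ZZZ"))` effectively tests: find() returns a
// position, and every non-zero position (including npos) counts as true.
bool buggy_contains(const std::string& line, const std::string& needle) {
    return line.find(needle) != 0;
}
```

With the buggy test, a line that does not contain the text at all is printed (npos is a large non-zero value), while a line that starts with the text is skipped (position 0 converts to false). That matches the symptom of seeing unrelated parts of the file.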

In general, plain string matching is not a robust way to extract values from an HTML file. A proper HTML parser is the reliable option; C++ has none in its standard library, but third-party ones are available.

Otherwise I'd suggest using a regex library: std::regex (standard since C++11), or boost::regex with older compilers. With it you can write more precise expressions to capture the part of the file you are interested in.

Reading line by line probably won't work, since an HTML file could be one large line; printing every matching line would then simply emit the entire file. Instead, use regexes to look for small sections of the markup and output only those. Regex libraries provide a way to iterate over all matches (std::sregex_iterator in the standard library).
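
As a rough sketch of the "match all" idea with the standard library: the function name `extract_cells` and the pattern are assumptions made for this illustration, based on the table layout shown in the question. It is not a general HTML parser and will break on nested or unusual markup.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collect the raw contents of every <td ...>...</td>
// cell found in `html`, wherever the line breaks happen to fall.
std::vector<std::string> extract_cells(const std::string& html) {
    // Assumed pattern: an opening <td> with optional attributes, the
    // shortest possible body (captured), then the closing tag. Note that
    // `.` does not match a newline, so a cell must sit on a single line.
    static const std::regex cell(R"(<td[^>]*>(.*?)</td>)");

    std::vector<std::string> cells;
    for (std::sregex_iterator it(html.begin(), html.end(), cell), end;
         it != end; ++it) {
        cells.push_back((*it)[1].str());  // capture group 1 = cell body
    }
    return cells;
}
```

Because the iterator walks over the whole string, the result does not depend on how many rows share one physical line. A cell that wraps a link still contains the `<a ...>` markup, which would need a second pass to strip.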

The skeleton code for reading lines from a file should look like this:

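// needs <fstream>, <string>, <stdexcept>; `file` is an already-opened std::ifstream
if( !file.good() )
  throw std::runtime_error("opening file failed!");

for(;;) {
  std::string line;
  // stop when the read fails; unlike testing good() afterwards, this
  // still processes a last line that has no trailing newline
  if( !std::getline(file, line) )
    break;
  // reading succeeded, process line
}

if( !file.eof() )
  throw std::runtime_error("read error before reaching EOF");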

(That funny-looking loop checks its ending condition in the middle of the body. C++ has no loop construct for that, so you have to use an endless loop with a break in the middle.)

However, as I said in a comment on your question, reading HTML line by line isn't necessarily useful, as HTML doesn't depend on specific whitespace.
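
One way around the dependence on line breaks is to load the whole file into a single string and search that. The sketch below assumes the file is small enough to hold in memory; `slurp` and `find_all` are hypothetical names introduced here, not part of any library.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: read everything left in `in` into one string.
std::string slurp(std::istream& in) {
    std::ostringstream buffer;
    buffer << in.rdbuf();  // copies the stream contents, newlines included
    return buffer.str();
}

// Hypothetical helper: positions of every non-overlapping occurrence of
// `needle` in `text`, found by restarting find() after each hit.
std::vector<std::size_t> find_all(const std::string& text,
                                  const std::string& needle) {
    std::vector<std::size_t> hits;
    if (needle.empty()) return hits;  // an empty needle would never advance
    for (std::size_t pos = text.find(needle);
         pos != std::string::npos;
         pos = text.find(needle, pos + needle.size())) {
        hits.push_back(pos);
    }
    return hits;
}
```

Passing an opened std::ifstream to `slurp` and then calling `find_all(contents, "<td")` gives the start of every cell, whether the file is one long line or thousands of short ones.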
