
Can you solve this odd problem I'm having using the String IndexOf method?

I have code which uses the StreamReader to read HTML from a file, then calls the ReadToEnd() function. The HTML is stored as a string.

Then I call this line of code:

string bookmarksBar = HTMLDoc.Substring(HTMLDoc.IndexOf(">Bookmarks bar</H3>"), HTMLDoc.IndexOf("</DL><p>"));

So what's happening here is that I want a particular section of the HTML, so I'm using the string Substring method. The first argument is the startIndex, and the second argument is the length.

I am using the IndexOf methods so that this line of code will return a section of text which should be between ">Bookmarks bar</H3>" and "</DL><p>"

And so the end of the returned string should be where "</DL><p>" is found, right?

The problem then is that the string does not end where </DL><p> is found, but ends 323 characters later, at this line (I have inserted four asterisks to illustrate where the returned string ends):

ICON="data:image/png;base64,iVBORw0KGgoAAA****ANSUhEUgAAABAAAAAQCAYAAAAf8/9hAAABbklEQVQ4je3RPWuTYQCF4fs875uYKEilOA 

I can't make sense of why it's ending here, since the string does not match "</DL><p>" at this point.

So here is a bigger section of the HTML:

jNpXrXKt4WFgn/KY1J1yBg874KWb0Vmr+BSttzgKt3LuBAAAAAElFTkSuQmCC\"></A>\r\n    </DL><p>\r\n    <DT><H3 ADD_DATE=\"1282073650\" LAST_MODIFIED=\"1301438557\">Link 1</H3>\r\n    <DL><p>\r\n        <DT><H3 ADD_DATE=\"1282073650\" LAST_MODIFIED=\"1286905747\">Link2</H3>\r\n        <DL><p>\r\n            <DT><A HREF=\"http://creators.xna.com/en-GB/create_detail#tour_four\" ADD_DATE=\"1282073650\" ICON=\"data:image/png;base64,iVBORw0KGgoAAA"

You can see the "</DL><p>" in the above HTML, so why doesn't it stop at that point, instead of stopping at "KGgoAAA"?

Any ideas?

Thanks


You answered your own question when you wrote:

"the second argument is the length"

The second argument to Substring is a length, not an endIndex.

Also, the way you're calling this, you will end up getting the text ">Bookmarks bar</H3>" in your result. Try this:

// The part after the + could be a constant. The literal's .Length is used here to show
// where the number comes from.
var startIndex = HTMLDoc.IndexOf(">Bookmarks bar</H3>") + ">Bookmarks bar</H3>".Length;
var endIndex = HTMLDoc.IndexOf("</DL><p>");
string bookmarksBar = HTMLDoc.Substring(startIndex, endIndex - startIndex);
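A slightly more defensive take on the same idea, as a sketch only: it starts looking for the closing marker after the opening one (so an earlier "</DL><p>" in the file is ignored) and checks for IndexOf returning -1. The helper name ExtractBetween and the sample string are made up for illustration; they are not from the question.

```csharp
using System;

static class BookmarkExtractor
{
    // Hypothetical helper: returns the text between startMarker and the first
    // endMarker that follows it, or null if either marker is missing.
    public static string ExtractBetween(string source, string startMarker, string endMarker)
    {
        int markerIndex = source.IndexOf(startMarker, StringComparison.Ordinal);
        if (markerIndex < 0) return null;

        // Skip past the start marker so it is not included in the result.
        int startIndex = markerIndex + startMarker.Length;

        // Begin the search at startIndex so an earlier end marker is ignored.
        int endIndex = source.IndexOf(endMarker, startIndex, StringComparison.Ordinal);
        if (endIndex < 0) return null;

        // Substring takes (startIndex, length), not (startIndex, endIndex).
        return source.Substring(startIndex, endIndex - startIndex);
    }

    static void Main()
    {
        // Made-up sample input, much shorter than a real bookmarks file.
        string html = "<DT><H3>Bookmarks bar</H3><DL><p><DT>Item</DL><p><DT>Other";
        Console.WriteLine(ExtractBetween(html, ">Bookmarks bar</H3>", "</DL><p>"));
        // prints: <DL><p><DT>Item
    }
}
```

Returning null for a missing marker is just one choice; throwing an exception would work equally well if a missing section should be treated as an error.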


Try:

string bookmarksBar = HTMLDoc.Substring(HTMLDoc.IndexOf(">Bookmarks bar</H3>"), HTMLDoc.IndexOf("</DL><p>")-HTMLDoc.IndexOf(">Bookmarks bar</H3>"));


Try this:

int start = HTMLDoc.IndexOf(">Bookmarks bar</H3>");
string bookmarksBar = HTMLDoc.Substring(start, HTMLDoc.IndexOf("</DL><p>") - start);


The second parameter is the number of characters to take, counted from the start index. So Substring(0, 4) returns the first four characters. Unlike Java's substring, Substring(4, 8) does not return the characters from index 4 up to index 8; it returns eight characters, from index 4 up to (but not including) index 12.
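A small illustration of the (startIndex, length) behaviour, using a made-up sample string. It also shows why passing an end index as the length overshoots: the result runs startIndex characters past the intended end.

```csharp
using System;

static class SubstringDemo
{
    static void Main()
    {
        // Made-up sample: each character is its own index in hex.
        string s = "0123456789ABCDEF";

        // (startIndex, length): four characters starting at index 0.
        Console.WriteLine(s.Substring(0, 4));   // 0123

        // Eight characters starting at index 4, i.e. indices 4 through 11.
        Console.WriteLine(s.Substring(4, 8));   // 456789AB

        // To stop just before index 8, pass endIndex - startIndex as the length.
        int startIndex = 4, endIndex = 8;
        Console.WriteLine(s.Substring(startIndex, endIndex - startIndex)); // 4567
    }
}
```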
