How do I search for a string with quotes?

I am searching for the string

<!--m--><li class="g w0"><h3 class=r><a href="

within the HTML source of this link:

http://www.google.com/search?sourceid=chrome&ie=UTF-8&q=Santarus+Inc‎

This is how I am searching for it:

string html_string = "http://www.google.com/search?sourceid=chrome&ie=UTF-8&q=" + biocompany;
html = new WebClient().DownloadString(html_string);

d=html.IndexOf(@"<!--m--><li class=""g w0""><h3 class=r><a href=""",1);

For some reason it is finding an occurrence of it at position 45 (in other words d=45) but this is incorrect.

Here are the first couple hundred characters of the string HTML:

<!doctype html><head><title>Santarus Inc&#8206; - Google Search</title><script>window.google={kEI:\"b6jES5nPD4rysQOokrGDDQ\",kEXPI:\"23729,24229,24249,24260,24414,24457\",kCSI:{e:\"23729,24229,24249,24260,24414,24457\",ei:\"b6jES5nPD4rysQOokrGDDQ\",expi:\"23729,24229,24249,24260,24414,24457\"},ml:function(){},kHL:\"en\",time:function(){return(new Date).getTime()},log:function(b,d,c){var a=new Image,e=google,g=e.lc,f=e.li;a.onerror=(a.onload=(a.onabort=function(){delete g[f]}));g[f]=a;c=c||\"/gen_204?atyp=i&ct=\"+b+\"&cad=\"+d+\"&zx=\"+google.time();a.src=c;e.li=f+1},lc:[],li:0,Toolbelt:{}};\nwindow.google.sn=\"web\";window.google.timers={load:{t:{start:(new Date).getTime()}}};try{}catch(u){}window.google.jsrt_kill=1;\n</script><style>body{background:#fff;color:#000;margin:3px 8px}#gbar,#guser{font-size:13px;padding-top:1px !important}#gbar{float:left;height:22px}#guser{padding-bottom:7px !important;text-align:right}.gbh,.gbd{border-top:1px solid #c9d7f1;font-size:1px}.gbh


You are not searching for the string you say you are.

If you are searching for this:

<!--m--><li class="g w0"><h3 class=r>

Use this:

@"<!--m--><li class=""g w0""><h3 class=r>"

Not:

@"<!--m--><li class=""g w0""><h3 class=r><a href="""

Update: (following comments)

I ran a Google search using the URL you posted and found no <li> tag with a quoted class attribute. Are you sure you are looking for the correct string?
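
For reference, here is a minimal sketch of the two ways to write a search string containing double quotes in C#. The class name, the Find helper, and the sample input are made up for illustration; the sample is not the real Google response. It also passes StringComparison.Ordinal, on the assumption that an exact character-by-character match is what is wanted, since IndexOf(string) without that argument uses culture-sensitive comparison.

```csharp
using System;

public static class QuotedSearchSketch
{
    // Verbatim literal: each double quote inside the string is written as two double quotes.
    public const string Verbatim = @"<!--m--><li class=""g w0""><h3 class=r><a href=""";

    // Regular literal: each double quote inside the string is written as backslash-quote.
    public const string Escaped = "<!--m--><li class=\"g w0\"><h3 class=r><a href=\"";

    // Hypothetical helper: ordinal search compares characters one by one,
    // with no culture-specific rules applied.
    public static int Find(string html, string marker)
    {
        return html.IndexOf(marker, StringComparison.Ordinal);
    }

    public static void Main()
    {
        // Hypothetical sample input, not the page Google actually returns.
        string sample = "<ol><!--m--><li class=\"g w0\"><h3 class=r><a href=\"http://example.com/\">";

        // Both literals produce the same string at run time.
        Console.WriteLine(Verbatim == Escaped);

        // Index of the marker in the sample, or -1 if it is absent.
        Console.WriteLine(Find(sample, Verbatim));
    }
}
```

If the two literals compare equal and the search still fails on the real page, the escaping is not the problem and the downloaded HTML is the thing to inspect.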


Post more code. My guess is your html variable isn't storing what you think it is. Are you reading the html line by line, by chance? And are you appending or replacing the prior contents? At any rate, put a breakpoint immediately before or after the .IndexOf call and check the contents of html.

Edit: I ran a sample of your code and did not find your string.

string biocompany = "Santarus Inc‎";
string html_string = "http://www.google.com/search?sourceid=chrome&ie=UTF-8&q=" + biocompany;
using (WebClient client = new WebClient())
{
    string html = client.DownloadString(html_string);
    int d = html.IndexOf(@"<!--m--><li class=""g w0""><h3 class=r>");
    Console.WriteLine(d);
}

I am not sure what you are doing based on the code you posted.
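
If a breakpoint is inconvenient, a small helper can show what actually sits at the reported index. This is only a sketch: the class name, the Around helper, and the sample string are hypothetical stand-ins, and the sample is not the downloaded page.

```csharp
using System;

public static class IndexOfDiagnostics
{
    // Hypothetical helper: returns the text within `radius` characters
    // on either side of `index`, clamped to the bounds of the string.
    public static string Around(string text, int index, int radius)
    {
        if (index < 0)
        {
            return "(not found)";
        }
        int start = Math.Max(0, index - radius);
        int end = Math.Min(text.Length, index + radius);
        return text.Substring(start, end - start);
    }

    public static void Main()
    {
        // Stand-in for the string returned by DownloadString.
        string html = "<!doctype html><head><title>sample</title>";
        string marker = "<title>";

        int d = html.IndexOf(marker, StringComparison.Ordinal);

        // Print the length and index, then the text surrounding the match.
        Console.WriteLine("length=" + html.Length + ", index=" + d);
        Console.WriteLine(Around(html, d, 10));
    }
}
```

Running the same check against the real html variable shows whether position 45 really contains the marker or whether the variable holds something other than the page source.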


Anthony has given the correct solution.

However, it is also true that the HTML a browser displays differs from the HTML downloaded in code.

That is why the index returned comes out as -1, and not even 45, as mentioned by "every_answer_gets_a_point" :)

Why that happens is a separate question in itself.

As far as the search is concerned, Anthony has already given the answer.
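
One possible reason for the difference, and this is an assumption rather than something confirmed in this thread, is that the server returns different markup depending on the request headers, such as User-Agent. A rough sketch of sending a browser-like header with WebClient follows; the class name, method names, and user agent value are hypothetical placeholders.

```csharp
using System;
using System.Net;

public static class BrowserLikeDownload
{
    // Builds the search URL from the question, escaping the query text.
    public static string BuildSearchUrl(string query)
    {
        return "http://www.google.com/search?sourceid=chrome&ie=UTF-8&q="
            + Uri.EscapeDataString(query);
    }

    // Creates a WebClient that sends the given User-Agent header.
    public static WebClient CreateClient(string userAgent)
    {
        WebClient client = new WebClient();
        client.Headers.Add("user-agent", userAgent);
        return client;
    }

    // Downloads the page source for a query (performs a network request).
    public static string Download(string query, string userAgent)
    {
        using (WebClient client = CreateClient(userAgent))
        {
            return client.DownloadString(BuildSearchUrl(query));
        }
    }

    public static void Main()
    {
        // Placeholder value; substitute the string your own browser sends.
        string userAgent = "Mozilla/5.0 (hypothetical example)";

        // Show the request that Download would make, without hitting the network.
        Console.WriteLine(BuildSearchUrl("Santarus Inc"));
        using (WebClient client = CreateClient(userAgent))
        {
            Console.WriteLine(client.Headers["user-agent"]);
        }
    }
}
```

Calling Download and comparing the result with the browser's "view source" output would show whether the header accounts for the difference.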
