How can I get the title of a page on another site?
Pretty long question;
How can I do the following in C#:
1. Open a web page (preferably not visible).
2. Check whether the page redirects to a different page (site is down, 404, etc.).
3. Check whether the title is not equal to a given string.
Then, separately (the user needs to click a confirm button), open their browser and go to the address of the first hyperlink on the site (it will be the only one).
I literally have been looking on Google for ages and haven't found anything similar to what I need.
Whether you give me a link to a site with a tutorial on this area of programming or actual source code doesn't make a difference to me.
Check out the WebRequest class; it can follow redirects. Then you can parse the HTML and find the title tag using XPath or something similar.
Something like this:
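using System.Linq;
using System.Net;
using System.Xml;
using System.Xml.Linq;
using System.Xml.XPath;
...
HttpWebRequest myReq = ( HttpWebRequest )WebRequest.Create( "http://www.contoso.com/" );
myReq.AllowAutoRedirect = true;
myReq.MaximumAutomaticRedirections = 5;
XElement result;
// XElement.Load throws an XmlException unless the page is well-formed XML;
// a namespaced XHTML page would also need a namespace-aware XPath query.
using( var response = myReq.GetResponse( ) )
using( var responseStream = response.GetResponseStream( ) ) {
    result = XElement.Load( responseStream );
}
// XPathSelectElement returns null when nothing matches, hence the ?. operator
var title = result.XPathSelectElement( "//title" )?.Value;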
Obviously your XPath can (and probably should) be more sophisticated. You can find out more about XPath here.
On a similar note, you can use XPath on the XML you get back to find the links and pick out the first one:
// [@href] skips anchors without an href attribute, so Attribute( "href" ) is never null
var links = result.XPathSelectElements( "//a[@href]" ).Select( linktag => linktag.Attribute( "href" ).Value );
When you eventually find the URL you want to open, you can use
System.Diagnostics.Process.Start( links.First() );
to open it in the browser. A nice aspect of this is that it opens whatever browser is the client's default. It does have security implications, though: you should make sure the string is a URL and not an .exe file or something similar.
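To make the security point concrete, here is a minimal sketch of checking the string before handing it to Process.Start. The class and method names (BrowserLauncher, IsSafeHttpUrl, OpenInDefaultBrowser) are made up for illustration, and the rule "only absolute http/https URLs are allowed" is my assumption about what counts as safe here; adjust it to your needs.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper, not part of any library.
public static class BrowserLauncher
{
    // Returns true only for absolute http:// or https:// URLs.
    // Local paths such as C:\evil.exe parse as file: URIs and are rejected.
    public static bool IsSafeHttpUrl(string candidate)
    {
        Uri uri;
        return Uri.TryCreate(candidate, UriKind.Absolute, out uri)
            && (uri.Scheme == Uri.UriSchemeHttp || uri.Scheme == Uri.UriSchemeHttps);
    }

    // Opens the URL in the default browser if it passes the check.
    // Returns false (and starts nothing) otherwise.
    public static bool OpenInDefaultBrowser(string candidate)
    {
        if (!IsSafeHttpUrl(candidate)) return false;
        // UseShellExecute = true asks the OS shell to pick the default handler.
        Process.Start(new ProcessStartInfo(candidate) { UseShellExecute = true });
        return true;
    }
}
```

You would then call BrowserLauncher.OpenInDefaultBrowser( links.First() ) instead of calling Process.Start directly.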
Also, the HTML may use different capitalization for its elements (for example TITLE or A in upper case). XPath names are case-sensitive, so you would have to deal with that when looking for links.
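One way around both the capitalization issue and pages that are not well-formed XML is to skip XML parsing and search the raw HTML string case-insensitively. The following is a rough sketch using Regex with RegexOptions.IgnoreCase; HtmlScan, ExtractTitle and FirstLink are names I made up. A regex is not a real HTML parser and will miss unusual markup (unquoted attributes, for instance), so treat it as a starting point only.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class HtmlScan
{
    // Returns the decoded, trimmed text of the first <title> element, or null if none is found.
    public static string ExtractTitle(string html)
    {
        Match m = Regex.Match(html, @"<title\b[^>]*>(.*?)</title>",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);
        return m.Success ? WebUtility.HtmlDecode(m.Groups[1].Value).Trim() : null;
    }

    // Returns the quoted href value of the first <a> tag that has one, or null if none is found.
    public static string FirstLink(string html)
    {
        Match m = Regex.Match(html, @"<a\b[^>]*?\bhref\s*=\s*[""']([^""']*)[""']",
            RegexOptions.IgnoreCase);
        return m.Success ? WebUtility.HtmlDecode(m.Groups[1].Value) : null;
    }
}
```

The title check from the question then becomes a plain string comparison against whatever ExtractTitle returns.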
You could use WebRequest or HttpWebRequest, but if you want a browser UI you will need to use the WebBrowser control: http://msdn.microsoft.com/en-us/library/system.windows.forms.webbrowser.aspx
Navigate loads the page asynchronously, so you need to handle the DocumentCompleted event, which fires once the page has finished loading:
WebBrowser myWebBrowser = new WebBrowser();
myWebBrowser.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(myWebBrowser_DocumentCompleted);
myWebBrowser.Navigate("http://myurl.com/mypage.htm");
You can then implement your handler as follows and interact with the WebBrowser UI as necessary. The DocumentText property contains the HTML of the currently loaded web page:
private void myWebBrowser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    // CheckHTMLConfirmAndRedirect is a method you write yourself
    CheckHTMLConfirmAndRedirect(((WebBrowser)sender).DocumentText);
}
Use HttpWebRequest and parse the response:
// requires: using System; using System.IO; using System.Net; using System.Text;
private static void method1()
{
    string strWORD = "pain";
    const string WORDWEBURI = "http://www.wordwebonline.com/search.pl?w=";
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(WORDWEBURI + strWORD.ToUpper());
    request.UserAgent = @"Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0)";
    request.ContentType = "text/html";
    // dispose the response and its stream once we are done with them
    using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
    using (Stream resStream = response.GetResponseStream())
    // StreamReader decodes the whole stream as UTF-8, so a multi-byte
    // character is never split across two buffer reads
    using (StreamReader reader = new StreamReader(resStream, Encoding.UTF8))
    {
        Console.Write(reader.ReadToEnd());
    }
}
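For the "does the page redirect or return a 404" part of the question, a possible sketch is below. It compares the URI that was requested with HttpWebResponse.ResponseUri (the URI that actually answered) and catches the WebException that GetResponse throws for error statuses such as 404. PageCheck, Describe and Check are made-up names, and the returned strings are only illustrative.

```csharp
using System;
using System.Net;

// Hypothetical helper, not part of any library.
public static class PageCheck
{
    // Pure helper: turns what was requested and what came back into a short description.
    public static string Describe(Uri requested, Uri final, HttpStatusCode status)
    {
        if (status != HttpStatusCode.OK) return "error " + (int)status;
        if (requested.AbsoluteUri != final.AbsoluteUri) return "redirected to " + final.AbsoluteUri;
        return "ok";
    }

    // Fetches the page, following redirects, and reports what happened.
    public static string Check(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.AllowAutoRedirect = true;
        try
        {
            using (var response = (HttpWebResponse)request.GetResponse())
            {
                // RequestUri is the original address; ResponseUri is where we ended up.
                return Describe(request.RequestUri, response.ResponseUri, response.StatusCode);
            }
        }
        catch (WebException ex)
        {
            // 4xx/5xx responses arrive as exceptions; Response is null if the site is unreachable.
            var http = ex.Response as HttpWebResponse;
            return http != null ? "error " + (int)http.StatusCode : "unreachable: " + ex.Status;
        }
    }
}
```

Calling PageCheck.Check with your page's address returns "ok" only when the page loaded from the address you asked for; anything else means a redirect, an HTTP error, or a connection failure.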