
How can I check whether the file at an entered URL exists, using .NET?

I am developing a tool that validates the links in an entered URL. Suppose I have entered a URL (e.g. http://www-review-k6.thinkcentral.com/content/hsp/science/hspscience/na/gr3/se_9780153722271_/content/nlsg3_006.html ) in textbox1, and I want to check whether the content behind each link on that page exists on the remote server. Finally, I want a log file of the broken links.


You can use HttpWebRequest.

Note four things:

1) GetResponse() throws a WebException if the link doesn't exist (for example, when the server answers 404).

2) You may want to disable automatic redirects.

3) You may also want to check that it's a valid URL. If it isn't, WebRequest.Create throws a UriFormatException.

4) (Updated) As Paige suggested, set request.Method to "HEAD" so that the whole remote file isn't downloaded.

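    // Requires: using System; using System.Net;
    static bool UrlExists(string url)
    {
        try
        {
            HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
            request.Method = "HEAD";
            request.AllowAutoRedirect = false;
            // Dispose the response so the connection is released.
            using (request.GetResponse()) { }
            return true;
        }
        catch (UriFormatException)
        {
            // The string is not a valid URL.
            return false;
        }
        catch (WebException ex)
        {
            // ex.Response is null when no HTTP reply arrived (DNS failure, timeout, ...).
            HttpWebResponse webResponse = ex.Response as HttpWebResponse;
            if (webResponse == null)
            {
                return false;
            }
            using (webResponse)
            {
                // Only 404 counts as missing; other errors (401, 500, ...) mean the URL is served.
                return webResponse.StatusCode != HttpStatusCode.NotFound;
            }
        }
    }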
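
To connect this to the broken-link log the question asks for, here is a rough sketch of the surrounding tool. It is only one way to do it: the class name `BrokenLinkLogger`, the helpers `ExtractLinks` and `Probe`, and the log file name `broken-links.log` are made up for illustration, and the regular expression is a crude stand-in for a proper HTML parser.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Text.RegularExpressions;

static class BrokenLinkLogger
{
    // Crude href extraction (assumption: good enough for simple pages).
    static readonly Regex HrefPattern =
        new Regex("href\\s*=\\s*[\"']([^\"'#]+)[\"']", RegexOptions.IgnoreCase);

    // Resolves every href in the HTML against the page's own address
    // and keeps only http/https links.
    public static List<Uri> ExtractLinks(Uri pageUri, string html)
    {
        var links = new List<Uri>();
        foreach (Match match in HrefPattern.Matches(html))
        {
            Uri absolute;
            if (Uri.TryCreate(pageUri, match.Groups[1].Value, out absolute)
                && (absolute.Scheme == Uri.UriSchemeHttp
                    || absolute.Scheme == Uri.UriSchemeHttps))
            {
                links.Add(absolute);
            }
        }
        return links;
    }

    // Returns null when the link answers, otherwise a short failure description.
    static string Probe(Uri link)
    {
        try
        {
            var request = (HttpWebRequest)WebRequest.Create(link);
            request.Method = "HEAD";
            using (request.GetResponse()) { }
            return null;
        }
        catch (WebException ex)
        {
            // ex.Response is null when no HTTP reply arrived at all.
            using (var response = ex.Response as HttpWebResponse)
            {
                return response != null
                    ? ((int)response.StatusCode).ToString()
                    : ex.Status.ToString();
            }
        }
    }

    public static void Main(string[] args)
    {
        var pageUri = new Uri(args[0]);
        string html;
        using (var client = new WebClient())
        {
            html = client.DownloadString(pageUri);
        }

        // One line per broken link: URL, tab, status code or failure kind.
        using (var log = new StreamWriter("broken-links.log"))
        {
            foreach (Uri link in ExtractLinks(pageUri, html))
            {
                string failure = Probe(link);
                if (failure != null)
                {
                    log.WriteLine("{0}\t{1}", link, failure);
                }
            }
        }
    }
}
```

Pass the page URL as the first command-line argument. Unlike UrlExists above, this sketch logs every failure (not just 404), since a 500 or a timeout is usually worth knowing about in a link report. Some servers reject HEAD requests, so a link that fails here may still deserve a retry with GET.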


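Use the HttpWebResponse class. Note that GetResponse() throws a WebException for a 404, so the status code has to be read from the exception's response:

    HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create("http://www.gooogle.com/");
    try
    {
        using (webRequest.GetResponse()) { }
    }
    catch (WebException ex)
    {
        // A 404 arrives as an exception; ex.Response is null if no HTTP reply came back.
        HttpWebResponse response = ex.Response as HttpWebResponse;
        if (response != null && response.StatusCode == HttpStatusCode.NotFound)
        {
            // do something
        }
    }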


    bool LinkExist(string link)
    {
        HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create(link);
        try
        {
            using (webRequest.GetResponse()) { }
            return true;
        }
        // GetResponse() reports a 404 (and other failures) by throwing.
        catch (WebException) { return false; }
    }


Use an HTTP HEAD request as explained in this article: http://www.eggheadcafe.com/tutorials/aspnet/2c13cafc-be1c-4dd8-9129-f82f59991517/the-lowly-http-head-reque.aspx


Make an HTTP request to the URL and check whether you get a 404 response. If you do, the resource does not exist.

Do you need a code example?
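
A minimal sketch of that check, in case it helps. It uses HttpClient rather than HttpWebRequest (a swap on my part; HttpClient does not throw on a 404, which makes the status check straightforward). The names `HeadCheck` and `IsNotFoundAsync` are invented for this example, and `async Main` assumes C# 7.1 or later.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

static class HeadCheck
{
    // One shared client for the whole process.
    static readonly HttpClient Client = new HttpClient();

    // True only when the server explicitly answers 404 Not Found.
    public static async Task<bool> IsNotFoundAsync(string url)
    {
        using (var request = new HttpRequestMessage(HttpMethod.Head, url))
        using (HttpResponseMessage response = await Client.SendAsync(request))
        {
            return response.StatusCode == HttpStatusCode.NotFound;
        }
    }

    public static async Task Main(string[] args)
    {
        try
        {
            bool missing = await IsNotFoundAsync(args[0]);
            Console.WriteLine(missing ? "404: does not exist" : "did not return 404");
        }
        catch (HttpRequestException ex)
        {
            // Network-level failure (DNS, refused connection, ...), not an HTTP status.
            Console.WriteLine("Request failed: " + ex.Message);
        }
    }
}
```

Note that "did not return 404" is not the same as "exists": a 500 or a 403 also lands in that branch, so inspect response.StatusCode further if you need finer distinctions.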


If your goal is robust validation of page source, consider using a tool that is already written, like the W3C Link Checker. It can be run as a command-line program that finds links, pictures, CSS, etc. and checks them for validity. It can also recursively check an entire website.
