python: check if url to jpg exists

2022-12-24 03:12 Q&A author:

In python, how would I check if a url ending in .jpg exists?

ex: http://www.fakedomain.com/fakeImage.jpg

thanks

The code below is equivalent to tikiboy's answer, but uses the high-level, easy-to-use requests library.

import requests

def exists(path):
    r = requests.head(path)
    return r.status_code == requests.codes.ok

print(exists('http://www.fakedomain.com/fakeImage.jpg'))

requests.codes.ok equals 200, so you can substitute the exact status code if you wish.

requests.head may throw an exception if the server doesn't respond, so you might want to add a try-except construct.

Also, if you want to include codes 301 and 302, consider code 303 too, especially if you dereference URIs that denote resources in Linked Data. A URI may represent a person, but you can't download a person, so the server will redirect you, with a 303 redirect, to a page that describes this person.
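As a rough sketch of both suggestions together (the try-except and the extra redirect codes), something like the following could work in Python 3. The ACCEPTED_CODES set and the timeout parameter are my own assumptions, not part of the answer above, so adjust them to taste.

```python
import requests

# Status codes treated as "the URL points at something": 200 plus the
# redirect codes mentioned above. This set is an assumption; adjust as needed.
ACCEPTED_CODES = {200, 301, 302, 303}


def exists(url, timeout=5):
    """Return True if a HEAD request to url yields an accepted status code."""
    try:
        # allow_redirects=False so a 301/302/303 is reported, not followed.
        r = requests.head(url, allow_redirects=False, timeout=timeout)
    except requests.RequestException:
        # Connection errors, timeouts, malformed URLs and similar failures.
        return False
    return r.status_code in ACCEPTED_CODES
```

Calling exists('http://www.fakedomain.com/fakeImage.jpg') then returns False instead of raising when the server cannot be reached.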

>>> import httplib
>>>
>>> def exists(site, path):
...     conn = httplib.HTTPConnection(site)
...     conn.request('HEAD', path)
...     response = conn.getresponse()
...     conn.close()
...     return response.status == 200
...
>>> exists('www.fakedomain.com', '/fakeImage.jpg')
False

If the status is anything other than 200, the resource doesn't exist at that URL. This doesn't mean that it's gone altogether. If the server returns a 301 or 302, the resource still exists, but at a different URL. To alter the function to handle this case, the status check line just needs to be changed to return response.status in (200, 301, 302).

Thanks for all the responses everyone, ended up using the following:

import urllib2

try:
  # urlopen raises if the URL cannot be opened (e.g. HTTP 404, DNS failure)
  f = urllib2.urlopen(urllib2.Request(url))
  deadLinkFound = False
except Exception:
  deadLinkFound = True

Looks like http://www.fakedomain.com/fakeImage.jpg automatically redirected to http://www.fakedomain.com/index.html without any error.
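One possible way to notice that kind of silent redirect, sketched here for Python 3 with urllib.request, is to compare the URL that was finally retrieved with the one that was requested. The function name is_dead_link and the timeout parameter are illustrative assumptions, and treating every redirect as a dead link is a simplification that may be too strict for some sites.

```python
import urllib.request


def is_dead_link(url, timeout=5):
    """Return True if url fails to open or ends up at a different URL."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            # response.url is the URL actually retrieved, after any redirects.
            return response.url != url
    except (OSError, ValueError):
        # URLError and HTTPError are OSError subclasses; ValueError covers
        # strings that are not usable URLs at all.
        return True
```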

Redirects for 301 and 302 responses are followed automatically, without reporting the intermediate response back to the caller.

Please take a look at HTTPRedirectHandler; you might need to subclass it to handle that.

Here is the one sample from Dive Into Python:

http://diveintopython3.ep.io/http-web-services.html#redirects
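A minimal sketch of such a subclass, written for Python 3's urllib.request, might look like this. The names NoRedirectHandler and exists_without_redirect and the timeout parameter are hypothetical; the idea is only that returning None from redirect_request makes the opener give up on the redirect, so the 301/302 surfaces as an HTTPError.

```python
import urllib.request


class NoRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Redirect handler that refuses to follow any redirect."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None means "not handled here"; the default error
        # handler then raises HTTPError carrying the 3xx status code.
        return None


def exists_without_redirect(url, timeout=5):
    """Return True only if url itself answers a HEAD request with 200."""
    opener = urllib.request.build_opener(NoRedirectHandler)
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener.open(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # HTTPError (3xx, 4xx, 5xx) and URLError are OSError subclasses.
        return False
```

With this opener, a URL that redirects to index.html is reported as missing instead of being silently followed.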

The previous answers have problems when the file is on an FTP server (ftp://url.com/file). The following code works whether the file is served over ftp, http or https:

import urllib2

def file_exists(url):
    request = urllib2.Request(url)
    request.get_method = lambda : 'HEAD'
    try:
        # Only success or failure matters, so the response is not kept.
        urllib2.urlopen(request)
        return True
    except Exception:
        return False

Try it with mechanize:

import mechanize
br = mechanize.Browser()
br.set_handle_redirect(False)
try:
    br.open_novisit('http://www.fakedomain.com/fakeImage.jpg')
    print('OK')
except Exception:
    print('KO')

This might be good enough to see if a url to a file exists.

import urllib
if urllib.urlopen('http://www.fakedomain.com/fakeImage.jpg').code == 200:
  print 'File exists'

In Python 3.6.5:

import http.client

def exists(site, path):
    connection = http.client.HTTPConnection(site)
    connection.request('HEAD', path)
    response = connection.getresponse()
    connection.close()
    return response.status == 200

exists("www.fakedomain.com", "/fakeImage.jpg")

In Python 3, the module httplib has been renamed to http.client.

You also need to remove the http:// or https:// prefix from your URL, because http.client treats whatever follows the : as a port number, and the port number must be numeric.
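If you would rather pass a full URL, one way to strip the scheme is to split the URL with urllib.parse first. The sketch below is only an illustration of that idea: the function name exists_from_url and the timeout parameter are my own, and it picks HTTPSConnection for https URLs, which the answer above does not cover.

```python
import http.client
from urllib.parse import urlsplit


def exists_from_url(url, timeout=5):
    """HEAD-check a full URL such as http://host/path using http.client."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        connection_class = http.client.HTTPSConnection
    elif parts.scheme == "http":
        connection_class = http.client.HTTPConnection
    else:
        raise ValueError("unsupported scheme: %r" % parts.scheme)

    # netloc is "host" or "host:port", the form the connection class expects.
    connection = connection_class(parts.netloc, timeout=timeout)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    try:
        connection.request("HEAD", path)
        return connection.getresponse().status == 200
    except (OSError, http.client.HTTPException):
        # DNS failures, refused connections, timeouts, malformed replies.
        return False
    finally:
        connection.close()
```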

Python 3:

import requests

def url_exists(url):
    """Check whether the resource at url exists."""
    if not url:
        raise ValueError("url is required")
    try:
        resp = requests.head(url)
        return resp.status_code == 200
    except Exception:
        # Any failure to get a response is treated as "does not exist".
        return False

The answer of @z3moon was good, but I think it is for py 2.x. For python 3.x, you may want to add request to the module call.

import urllib.request  # "import urllib" alone does not load the request submodule

def check_valid_URLs(url) -> bool:
  try:
    return urllib.request.urlopen(url).code == 200
  except Exception:
    return False

I think you can try sending an HTTP request to the URL and reading the response. If no exception is raised, it probably exists.
