Closing files properly opened with urllib2.urlopen()
I have the following code in a Python script:
try:
    # send the query request
    sf = urllib2.urlopen(search_query)
    search_soup = BeautifulSoup.BeautifulStoneSoup(sf.read())
    sf.close()
except Exception, err:
    print("Couldn't get programme information.")
    print(str(err))
    return
I'm concerned because if I encounter an error on sf.read(), then sf.close() is not called.
I tried putting sf.close() in a finally block, but if there's an exception on urlopen() then there's no file to close, and I encounter an exception in the finally block!
So then I tried:
try:
    with urllib2.urlopen(search_query) as sf:
        search_soup = BeautifulSoup.BeautifulStoneSoup(sf.read())
except Exception, err:
    print("Couldn't get programme information.")
    print(str(err))
    return
but this raised an invalid syntax error on the with... line.
How can I best handle this? I feel stupid!
As commenters have pointed out, I am using PyS60, which is Python 2.5.4.
I would use contextlib.closing (in combination with from __future__ import with_statement for old Python versions):
from contextlib import closing

with closing(urllib2.urlopen('http://blah')) as sf:
    search_soup = BeautifulSoup.BeautifulStoneSoup(sf.read())
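To see that closing() guarantees the close() call even when the body of the with block raises, here is a small self-contained sketch. It uses a hypothetical FakeResponse stand-in (with read() and close(), like the object urlopen() returns) instead of a real URL, so no network is needed, and it is written in modern Python 3 syntax so it can be run directly:

```python
from contextlib import closing

class FakeResponse:
    """Stand-in for the object returned by urlopen(): has read() and close()."""
    def __init__(self):
        self.closed = False

    def read(self):
        raise IOError("simulated read failure")

    def close(self):
        self.closed = True

resp = FakeResponse()
try:
    with closing(resp) as sf:
        sf.read()  # raises IOError inside the with block
except IOError:
    pass

# close() was still called by closing() on the way out
print(resp.closed)  # → True
```

This is exactly the guarantee the original try/finally code was after, without having to special-case a failed urlopen().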
Or, if you want to avoid the with statement:
sf = None
try:
    sf = urllib2.urlopen('http://blah')
    search_soup = BeautifulSoup.BeautifulStoneSoup(sf.read())
finally:
    if sf:
        sf.close()
Not quite as elegant though.
finally:
    if sf: sf.close()
Why not just try closing sf, and passing if it doesn't exist?
import urllib2
try:
    search_query = 'http://blah'
    sf = urllib2.urlopen(search_query)
    search_soup = BeautifulSoup.BeautifulStoneSoup(sf.read())
except urllib2.URLError, err:
    print(err.reason)
finally:
    try:
        sf.close()
    except NameError:
        pass
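A quick way to convince yourself the NameError guard works when the open itself fails: if urlopen() raises, the name sf is never bound, so sf.close() in the finally block raises NameError, which is swallowed. The sketch below uses a hypothetical failing_urlopen stand-in that always raises (so it runs without urllib2 or a network), in modern Python 3 syntax:

```python
def failing_urlopen(url):
    """Stand-in for urllib2.urlopen() that simulates a connection failure."""
    raise IOError("simulated connection failure")

log = []
try:
    sf = failing_urlopen('http://blah')  # raises before sf is bound
    data = sf.read()
except IOError:
    log.append('request failed')
finally:
    try:
        sf.close()        # sf was never bound, so this raises NameError
    except NameError:
        log.append('nothing to close')

print(log)  # → ['request failed', 'nothing to close']
```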
Given that you are trying to use 'with', you should be on Python 2.5, and then this applies too: http://docs.python.org/tutorial/errors.html#defining-clean-up-actions
If urlopen() raises an exception, catch it and call the exception's close() method, like this:
try:
    req = urllib2.urlopen(url)
    req.close()
    print 'request {0} ok'.format(url)
except urllib2.HTTPError, e:
    e.close()
    print 'request {0} failed, http code: {1}'.format(url, e.code)
except urllib2.URLError, e:
    print 'request {0} error, error reason: {1}'.format(url, e.reason)
The HTTPError exception is itself a full response object; see this issue for details: http://bugs.jython.org/issue1544
Looks like the problem runs deeper than I thought: this forum thread indicates urllib2 doesn't support the with statement until after Python 2.6, and possibly not until 3.1.
You could create your own generic URL opener:
from contextlib import contextmanager

@contextmanager
def urlopener(inURL):
    """Open a URL, yield the file handle, then close the connection when leaving the 'with' block."""
    fileHandle = urllib2.urlopen(inURL)
    try:
        yield fileHandle
    finally:
        fileHandle.close()
Then you could use the syntax from your original question:
with urlopener(theURL) as sf:
    search_soup = BeautifulSoup.BeautifulSoup(sf.read())
This solution gives you a clean separation of concerns. You get a clean generic urlopener syntax that handles the complexities of properly closing the resource regardless of errors that occur underneath your with clause.
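The same guarantee can be checked without a network connection by pointing the pattern at a stand-in handle. In this sketch the FakeHandle class and opener() function are hypothetical test doubles (mirroring urlopener() above), written in modern Python 3 syntax; the finally clause inside the generator runs even when code under the with clause raises:

```python
from contextlib import contextmanager

class FakeHandle:
    """Stand-in for the handle urlopen() would return."""
    def __init__(self):
        self.closed = False

    def read(self):
        raise ValueError("simulated parse failure")

    def close(self):
        self.closed = True

@contextmanager
def opener(handle):
    # mirrors urlopener(): yield the handle, close it on the way out
    try:
        yield handle
    finally:
        handle.close()

h = FakeHandle()
try:
    with opener(h) as sf:
        sf.read()  # raises inside the with block
except ValueError:
    pass

print(h.closed)  # → True: the generator's finally clause closed the handle
```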
Why not just use multiple try/except blocks?
import sys

try:
    # send the query request
    sf = urllib2.urlopen(search_query)
except urllib2.URLError, url_error:
    sys.stderr.write("Error requesting url: %s\n" % (search_query,))
    raise

try:
    search_soup = BeautifulSoup.BeautifulStoneSoup(sf.read())
except Exception, err:  # maybe catch more specific exceptions here
    sys.stderr.write("Couldn't get programme information from url: %s\n" % (search_query,))
    raise  # or return, as in your original code
finally:
    sf.close()
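The point of the two-block structure is that the finally clause is attached only to the second try, so sf.close() runs only once sf is guaranteed to exist. A runnable sketch of the same shape, using hypothetical fake_urlopen/FakeResponse stand-ins (modern Python 3 syntax, no network needed):

```python
import sys

class FakeResponse:
    """Stand-in response with read() and close()."""
    def __init__(self):
        self.closed = False

    def read(self):
        return "<xml/>"

    def close(self):
        self.closed = True

last = {}

def fake_urlopen(url):
    """Stand-in for urllib2.urlopen() that records the response it hands out."""
    resp = FakeResponse()
    last['r'] = resp
    return resp

def fetch_and_parse(search_query):
    try:
        sf = fake_urlopen(search_query)
    except IOError:
        sys.stderr.write("Error requesting url: %s\n" % (search_query,))
        raise
    try:
        return sf.read()
    finally:
        sf.close()  # sf is guaranteed to be bound here

result = fetch_and_parse('http://blah')
print(result)          # → <xml/>
print(last['r'].closed)  # → True: closed even though we returned from the try
```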