Python time to age, part 2: timezones [duplicate]
Following on from my previous question, Python time to age, I have now come across a problem regarding the timezone, and it turns out that it's not always going to be "+0200". So when strptime tries to parse it as such, it throws up an exception.
I thought about just chopping off the +0200 with [:-6] or whatever, but is there a real way to do this with strptime?
I am using Python 2.5.2 if it matters.
>>> from datetime import datetime
>>> fmt = "%a, %d %b %Y %H:%M:%S +0200"
>>> datetime.strptime("Tue, 22 Jul 2008 08:17:41 +0200", fmt)
datetime.datetime(2008, 7, 22, 8, 17, 41)
>>> datetime.strptime("Tue, 22 Jul 2008 08:17:41 +0300", fmt)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python2.5/_strptime.py", line 330, in strptime
(data_string, format))
ValueError: time data did not match format: data=Tue, 22 Jul 2008 08:17:41 +0300 fmt=%a, %d %b %Y %H:%M:%S +0200
is there a real way to do this with strptime?
No, but since your format appears to be an RFC822-family date, you can read it much more easily using the email library instead:
>>> import email.utils
>>> email.utils.parsedate_tz('Tue, 22 Jul 2008 08:17:41 +0200')
(2008, 7, 22, 8, 17, 41, 0, 1, 0, 7200)
(7200 = timezone offset from UTC in seconds)
New in version 2.6.
For a naive object, the %z and %Z format codes are replaced by empty strings.
It looks like this is implemented only in >= 2.6, and I think you have to manually parse it.
I can't see another solution than to remove the time zone data:
from datetime import timedelta,datetime
try:
offset = int("Tue, 22 Jul 2008 08:17:41 +0300"[-5:])
except:
print "Error"
delta = timedelta(hours = offset / 100)
fmt = "%a, %d %b %Y %H:%M:%S"
time = datetime.strptime("Tue, 22 Jul 2008 08:17:41 +0200"[:-6], fmt)
time -= delta
You can use the dateutil
library which is very useful:
from datetime import datetime
from dateutil.parser import parse
dt = parse("Tue, 22 Jul 2008 08:17:41 +0200")
## datetime.datetime(2008, 7, 22, 8, 17, 41, tzinfo=tzoffset(None, 7200)) <- dt
print dt
2008-07-22 08:17:41+02:00
As far as I know, strptime()
doesn't recognize numeric time zone codes. If you know that the string is always going to end with a time zone specification of that form (+ or - followed by 4 digits), just chopping it off and parsing it manually seems like a perfectly reasonable thing to do.
It seems that %Z corresponds to time zone names, not offsets.
For example, given:
>>> format = '%a, %d %b %Y %H:%M:%S %Z'
I can parse:
>>> datetime.datetime.strptime('Tue, 22 Jul 2008 08:17:41 GMT', format)
datetime.datetime(2008, 7, 22, 8, 17, 41)
Although it seems that it doesn't do anything with the time zone, merely observing that it exists and is valid:
>>> datetime.datetime.strptime('Tue, 22 Jul 2008 08:17:41 NZDT', format)
datetime.datetime(2008, 7, 22, 8, 17, 41)
I suppose if you wished, you could locate a mapping of offsets to names, convert your input, and then parse it. It might be simpler to just truncate your input, though.
精彩评论