Translate various inconsistent date strings into a valid 'date'; Python
I have a variety of text input files containing line-based records (csv/tsv etc.) that need to be processed; each row in each file typically has a date field, and the date formats vary wildly.
My first instinct is to (manually) try to identify each date format, and then for each row use regular expressions to identify the format, validate the data, and cast into a valid date object.
However, I would like to take this opportunity to advance my knowledge of Python and programming in general, and I hope that you can offer some techniques I might use to manage this job more efficiently, learning something in the process.
Thanks!
Use the python-dateutil module; its parse method can detect and parse many different date formats.
Note that python-dateutil follows specific rules for ambiguous dates (year, month, or day first) and lets you tweak how these are parsed. How well this works depends on what your data looks like; you may have to test some samples and hand-verify the results.
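As a rough illustration of the point about ambiguous dates, here is a minimal sketch that assumes python-dateutil is installed (pip install python-dateutil). The sample string and the parse_or_none helper are made up for this example; dayfirst is one of the keyword arguments parse accepts for steering ambiguous input.

```python
from datetime import datetime
from dateutil import parser

# Made-up sample: could be March 5th or May 3rd.
raw = "03/05/1983 10:20"

# By default, dateutil reads an ambiguous date like this month-first.
month_first = parser.parse(raw)

# dayfirst=True tells it to prefer day-first for ambiguous input.
day_first = parser.parse(raw, dayfirst=True)

print(month_first)  # 1983-03-05 10:20:00
print(day_first)    # 1983-05-03 10:20:00


def parse_or_none(text):
    """Hypothetical helper: return a datetime, or None if parsing fails."""
    try:
        return parser.parse(text)
    except (ValueError, OverflowError):
        # Unparseable rows can be collected and reviewed by hand.
        return None
```

Returning None for unparseable rows makes it easy to set them aside for the hand-verification step instead of aborting the whole file.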
I'm not sure this is even possible. Consider the following date:
83-05-03 10:20
Is this:
- May 3rd, 1983, at 10:20 AM?
- Mar 5th, 2083, at 10:20 PM?
- etc
Without specifying the format I don't see how you can resolve such ambiguities.
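To make the ambiguity concrete, here is a small sketch using only the standard library's datetime.strptime. It assumes you can find out which format each input file uses; the source names and the FORMATS_BY_SOURCE mapping are hypothetical. The same string yields two different dates depending on the format you declare.

```python
from datetime import datetime

# Hypothetical mapping: one explicit format per input source.
FORMATS_BY_SOURCE = {
    "source_a": "%y-%m-%d %H:%M",  # year-month-day
    "source_b": "%y-%d-%m %H:%M",  # year-day-month
}


def parse_date(text, source):
    """Parse text using the format declared for its source.

    Raises ValueError if the text does not match that format.
    """
    return datetime.strptime(text.strip(), FORMATS_BY_SOURCE[source])


raw = "83-05-03 10:20"
print(parse_date(raw, "source_a"))  # 1983-05-03 10:20:00
print(parse_date(raw, "source_b"))  # 1983-03-05 10:20:00
```

Note that strptime's %y maps two-digit years 69 to 99 onto 1969 to 1999 and 00 to 68 onto 2000 to 2068, so even the century is a convention rather than something the data itself tells you.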