Best way to get date strings with Python
What's the best way to get date strings from a website using Python?
The date strings can take forms such as:
- April 1st, 2011
- April 2nd, 2011
- April 23, 2011
- 4/2/2011
- 04/23/2011
Would this have to be a ton of regex? What's the most elegant solution?
Consider this lib: http://code.google.com/p/parsedatetime/
From its examples Wiki page, here are a couple of formats it can handle that look relevant to your question:
import parsedatetime
p = parsedatetime.Calendar()  # assumes a recent parsedatetime release
result = p.parseDateText("March 5th, 1980")
result = p.parseDate("4/4/80")
EDIT: I now notice this is actually a duplicate of another SO question, where the same library was recommended!
# Month name: a three-letter prefix plus any remaining letters ("oct" was missing).
month = r'(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)[a-z]{0,6}'
# Raw strings keep the backslashes intact for the regex engine.
regex_strings = [
    r'%s(\.| )\d{1,2},? \d{2,4}' % month,   # Month.Day, Year
    r'\d{1,2} %s,? \d{4}' % month,          # Day Month Year(4)
    r'%s \d{1,2}\w{2},? \d{4}' % month,     # Mon Day(th), Year
    r'\d{1,2} %s' % month,                  # Day Month
    r'\d{1,2}\.\d{1,2}\.\d{4}',             # Month.Day.Year
    r'\d{1,2}/\d{1,2}/\d{2,4}',             # Month/Day/Year{2,4}
]
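To show how a pattern list like the one above might be used, here is a minimal sketch that joins a few patterns into one case-insensitive expression and pulls matches out of a block of text. It is a simplified variant, not the exact list above: it only covers the five forms from the question, and the names MONTH, PATTERNS, and find_date_strings are made up for illustration.

```python
import re

# Month name: three-letter prefix plus up to six more letters ("sep" + "tember").
# The group is non-capturing so that each match is returned as a whole string.
MONTH = r'\b(?:jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)[a-z]{0,6}'

# Hypothetical, reduced pattern list covering only the forms in the question.
PATTERNS = [
    r'%s\.? \d{1,2}(?:st|nd|rd|th)?,? \d{2,4}' % MONTH,  # April 1st, 2011 / April 23, 2011
    r'\d{1,2}/\d{1,2}/\d{2,4}',                          # 4/2/2011 / 04/23/2011
]

# One alternation, compiled once; IGNORECASE lets "April" match "apr...".
DATE_RE = re.compile('|'.join(PATTERNS), re.IGNORECASE)


def find_date_strings(text):
    """Return every substring of text that looks like one of the date forms."""
    return [m.group(0) for m in DATE_RE.finditer(text)]


sample = "Posted April 1st, 2011. Updated 04/23/2011 and again on April 23, 2011."
print(find_date_strings(sample))
# ['April 1st, 2011', '04/23/2011', 'April 23, 2011']
```

This only finds the strings; it does not validate them (13/45/2011 would still match), so the results would normally be passed on to a real date parser.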