Python, date validation

I'm looking for the most Pythonic way to do this. Right now the only approach I can think of is brute force.

The user passes a date on the command line in one of the following forms (e.g. ./mypy.py date='20110909.00 23'):

date='20110909'
date='20110909.00 23'
date='20110909.00 20110909.23'

All three examples should give the same result. It doesn't matter whether that is a list (which I can sort) such as

['20110909.00', '20110909.23']

or two separate sorted variables, but in all cases the format is YYYYMMDD.HH, and I need to make sure each value really is a date and not arbitrary text.

Any ideas?

Thank you.

+++++ EDIT +++++ After plugging away at this, I think I need to do a lot of date checking and manipulation first, and that part seems to work fine. At the very end, though, I run the list through the date validation and it fails every time, even when it should pass.

(I launch it with) ./test.py date='20110909.00 23'

(or any variation of date - i.e. date='20 22' or date='20110909' or date='20110909.00 23' etc.)

import sys, re, time, datetime

now = datetime.datetime.now()
tempdate=[]
strfirstdate=None
strtempdate=None

temparg2 = sys.argv
del temparg2[0]
tempdate = temparg2[0].replace('date=','')
date = tempdate.split(' ');

tempdate=[]
date.sort(key=len, reverse=True)
result = None

# If no date is passed then create list according to [YYYYMMDD.HH, YYYYMMDD.HH]
if date[0] == 'None':
    tempdate.extend([now.strftime('%Y%m%d.00'), now.strftime('%Y%m%d.%H')])


# If length of date list is 1 then see if it is YYYYMMDD only or HH only, and create list according to [YYYYMMDD.HH, YYYYMMDD.HH]
elif len(date) == 1:
    if len(date[0]) == 8:
        tempdate.extend([ date[0] + '.00', date[0] + '.23'])
    elif len(date[0]) == 2:
        tempdate.extend([now.strftime('%Y%m%d') + '.' + date[0], now.strftime('%Y%m%d') + '.' + date[0]])
    else:
        tempdate.extend([date[0], date[0]])


# iterate through list, see if value is YYYYMMDD only or HH only or YYYYMMDD.HH, and create list according to [YYYYMMDD.HH, YYYYMMDD.HH] - maximum of 2 values
else:
    for _ in range(2):
        if len(date[_]) == 8:
            strfirstdate = date[0]
            tempdate.append([ date[_] + '.00'])
        elif len(date[_]) == 2:
            if _ == 0:  # both values passed could be hours only
                tempdate.append(now.strftime('%Y%m%d') + '.' + date[_])
            else:  # we must be at the 2nd value passed.
                if strfirstdate == None:
                    tempdate.append(now.strftime('%Y%m%d') + '.' + date[_])
                else:
                    tempdate.append(strfirstdate + '.' + date [_])
        else:
            strfirstdate = date[0][:8]
            tempdate.append(date[_])

tempdate.sort()


for s in tempdate:
    try:
        result = datetime.datetime.strptime(s, '%Y%m%d.%H')
    except:
        pass

if result is None:
    print 'Malformed date.'
else:
    print 'Date is fine.'

print tempdate

++++ Edit 2 ++++ If I remove the bottom part (after tempdate.sort()) and replace it with this.

strfirstdate = re.compile(r'([0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]+\.[0-9][0-9])')
for s in tempdate:
    if re.match(strfirstdate, s):
        result = "validated"
    else:
        print "#####################"
        print "#####################"
        print "##  error in date  ##"
        print "#####################"
        print "#####################"
        sys.exit(1)  # a bare 'exit' does nothing; sys is imported at the top of the script

It will validate appropriately.

This entire method just doesn't seem to be very Pythonic.


You can define a list of format masks and try to parse the string with each one, using try...except to find out whether it matches any of them. I had similar code in a project, so I've adapted it slightly:

from datetime import datetime

date = '20110909.00 20110909.23'.split(' ')[0]
result = None

# Try each mask in turn; the first one that parses wins.
for fmt in ['%Y%m%d', '%Y%m%d.%H']:
  try:
    result = datetime.strptime(date, fmt)
    break
  except ValueError:  # strptime signals a mismatch with ValueError
    pass

if result is None:
  print('Malformed date.')
else:
  print('Date is fine.')
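
The same try/except idea can be stretched to cover all three input forms from the question. What follows is only a minimal sketch, not code from this answer: the helper name normalize_range, its today parameter, and the choice of .00/.23 for a bare date are assumptions taken from the question's examples.

```python
from datetime import datetime

def normalize_range(arg, today=None):
    """Return [start, end] as 'YYYYMMDD.HH' strings, or raise ValueError.

    normalize_range is a hypothetical helper; 'today' can be injected
    (as 'YYYYMMDD') so the hour-only case is testable.
    """
    today = today or datetime.now().strftime('%Y%m%d')
    parts = arg.split()
    if not 1 <= len(parts) <= 2:
        raise ValueError('expected one or two values, got %r' % arg)

    if len(parts) == 1 and len(parts[0]) == 8:
        # Assumption from the question: a bare date covers the whole day.
        parts = [parts[0] + '.00', parts[0] + '.23']

    result = []
    day = today
    for part in parts:
        if len(part) == 2:      # hour only: reuse the most recent date seen
            part = day + '.' + part
        elif len(part) == 8:    # date only: assume hour 00
            part += '.00'
        # strptime raises ValueError for text or impossible dates
        parsed = datetime.strptime(part, '%Y%m%d.%H')
        day = parsed.strftime('%Y%m%d')
        # Re-format so every entry is zero-padded the same way
        result.append(parsed.strftime('%Y%m%d.%H'))

    if len(result) == 1:        # a single value is both start and end
        result = result * 2
    return sorted(result)

print(normalize_range('20110909.00 23'))
```

Only ValueError is allowed to escape here, so any other mistake (for example a non-string slipping into the list) shows up as its own exception instead of being reported as a malformed date.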


I ran into some problems when I tried the try...except example above in my own parsing, so here is a version with my fixes. It also handles the hour-only case:

from datetime import datetime

dates = ['20110909.00', '20110909.23', '13', '20111212', '20113131']

def dateTest(date):
  dateOk = False
  for fmt in ['%Y%m%d', '%Y%m%d.%H', '%H']:
    try:
      result = datetime.strptime(date, fmt)
    except ValueError:  # this mask does not fit, try the next one
      continue
    # this makes sure the parsed date matches the original string
    dateOk = (date == result.strftime(fmt))
    if dateOk:
      if fmt == '%H':  # this handles the hour only case
        date = '%s.%s' % (datetime.now().strftime('%Y%m%d'), date)
      break

  if dateOk:
    print('Date is fine.')
  else:
    print('Malformed date.')
  return date

for date in dates:
  print(date)
  print(dateTest(date))
  print('')
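
To see why the round-trip comparison is worth having, here is a small sketch (the function name strict_match is made up for illustration). strptime is lenient about zero padding, so a seven-digit string such as '2011111' can be accepted by '%Y%m%d'; formatting the parsed value back and comparing it with the input rejects that kind of near miss.

```python
from datetime import datetime

def strict_match(text, fmt):
    """True only if text parses with fmt AND formats back to the same text."""
    try:
        parsed = datetime.strptime(text, fmt)
    except ValueError:
        return False
    return parsed.strftime(fmt) == text

# A well-formed date, a string missing a digit, and an impossible date
for sample in ('20110909', '2011111', '20113131'):
    print('%s -> %s' % (sample, strict_match(sample, '%Y%m%d')))
```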


Take a look at the time module. Specifically, see the time.strptime() function.

There's also a pretty easy conversion between time values and datetime objects.
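
As a rough illustration of that suggestion (a sketch, with the sample string taken from the question): time.strptime() returns a struct_time, whose first six fields can be passed straight to the datetime constructor, and datetime.timetuple() goes the other way. Like datetime.strptime(), time.strptime() raises ValueError when the string does not fit the format.

```python
import time
from datetime import datetime

# Parse with the time module; the result is a time.struct_time
parsed = time.strptime('20110909.23', '%Y%m%d.%H')

# struct_time -> datetime: the first six fields are year..second
as_datetime = datetime(*parsed[:6])

# datetime -> struct_time
back_again = as_datetime.timetuple()

print(as_datetime)
```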


Does this help?

from datetime import datetime
import re

reg = re.compile(r'(\d{4})(\d\d)(\d\d)'
                 r'(?:\.(\d\d)(\d\d)?(\d\d)? *'
                 r'(?:(\d{4})(\d\d)(\d\d)\.)?(\d\d)(\d\d)?(\d\d)? *)?')

for x in ('20110909',
          '20110909.00 23',
          '20110909.00 74',
          '20110909.00 20110909.23',
          '20110909.00 19980412.23',
          '20110909.08 20110909.23',
          '20110935.08 20110909.23',
          '20110909.08 19970609.51'):
    print x

    gr = reg.match(x).groups('000')

    try:
        x1 = datetime(*map(int,gr[0:6]))

        if gr[6]=='000':

            if gr[9]=='000':
                x2 = x1

            else:
                y = map(int,gr[0:3] + gr[9:12])
                try:
                    x2 = datetime(*y)
                except:
                    x2 = "The hour in the second part isn't in range(0,24)"

        else:
            y = map(int,gr[6:12])
            try:
                x2 = datetime(*y)
            except:
                x2 = "The second part doesn't represent a real date"
    except:
        x1 = "The first part doesn't represent a real date"
        x2 = '--'

    print [str(x1),str(x2)],'\n'

Result:

20110909
['2011-09-09 00:00:00', '2011-09-09 00:00:00'] 

20110909.00 23
['2011-09-09 00:00:00', '2011-09-09 23:00:00'] 

20110909.00 74
['2011-09-09 00:00:00', "The hour in the second part isn't in range(0,24)"] 

20110909.00 20110909.23
['2011-09-09 00:00:00', '2011-09-09 23:00:00'] 

20110909.00 19980412.23
['2011-09-09 00:00:00', '1998-04-12 23:00:00'] 

20110909.08 20110909.23
['2011-09-09 08:00:00', '2011-09-09 23:00:00'] 

20110935.08 20110909.23
["The first part doesn't represent a real date", '--'] 

20110909.08 19970609.51
['2011-09-09 08:00:00', "The second part doesn't represent a real date"]  

Note that groups('000') replaces None with '000' for every group that did not take part in the match.
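
A tiny sketch of that behaviour, using a deliberately simplified pattern (not the one above) with a single optional group:

```python
import re

# Simplified pattern for illustration only: a date plus an optional hour
pattern = re.compile(r'(\d{8})(?:\.(\d\d))?')

match = pattern.match('20110909')

# Without a default, the unmatched optional group comes back as None
plain = match.groups()

# With a default, None is replaced by the value passed in
filled = match.groups('000')

print(plain)
print(filled)
```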
