
script to convert date (month_name DD, YYYY) to (YYYY-MM-DD)

I have a text file with dates in the form: "date=month_name DD, YYYY" and "date=(month_name DD, YYYY)"

How can I convert these dates so they are in the form: "date=YYYY-MM-DD"?

I also have some dates preceded by the field name "accessdate=" or no field name, that I would like to convert.

Thanks.

ADDENDUM:

  • The month names are the full English month names, e.g. January, February, etc.
  • I would only like to convert the dates inside ref tags, i.e. they would be surrounded by other text inside <ref></ref>.
  • I'm open to any language for the scripting. I've done a little bash, javascript & python. But I think awk, sed, perl, etc. would be also fine. Explanations of the code would be appreciated.


Depends on the tool you use.

E.g. with awk & sed you can do something like this:

 awk '
        # Emit one sed substitution for every input line that mentions a month.
        # Assumes two-digit days and GNU sed (\? is a GNU extension).
        /date=\(?January/  {print "s/date=(\\?January \\([0-9][0-9]\\), \\([0-9]\\{4\\}\\))\\?/date=\\2-01-\\1/g"}
        /date=\(?February/ {print "s/date=(\\?February \\([0-9][0-9]\\), \\([0-9]\\{4\\}\\))\\?/date=\\2-02-\\1/g"}
        /date=\(?March/    {print "s/date=(\\?March \\([0-9][0-9]\\), \\([0-9]\\{4\\}\\))\\?/date=\\2-03-\\1/g"}
        # ... one rule per remaining month
 ' INPUT_FILE > tmp.sed

Then apply the generated script in place (-i.ORIG keeps a backup of the original file):

sed -i.ORIG -f tmp.sed INPUT_FILE

Or you can write it in pure awk, by parsing $0.


You can begin with

echo 'date=April 13, 1985' | sed -e 's/January/01/' ... \
        -e 's/April/04/' ... -e 's/December/12/' | \
    sed 's/\([0-9]*\)[^0-9]\([0-9]*\)[^0-9] \([0-9]*\)$/\3-\1-\2/'

To handle "date=(month_name DD, YYYY)", add sed 's/date=(\([^)]*\))/date=\1/' to the pipe as well, and so on for the other variants.
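For comparison, the same two steps (map the month name to a number, then reorder the fields) can be sketched in Python. This is only a sketch, assuming full English month names and one- or two-digit days; the names MONTHS, DATE_RE and to_iso are made up for illustration and do not come from the answers above.

```python
import re

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# Group 1 is an optional "(". The conditional (?(1)\)) demands the closing ")"
# only when the opening one was matched, so unrelated parentheses are left alone.
DATE_RE = re.compile(
    r"(\()?\b(" + "|".join(MONTHS) + r") (\d{1,2}), (\d{4})(?(1)\))"
)


def to_iso(text):
    """Rewrite every 'Month DD, YYYY' or '(Month DD, YYYY)' as YYYY-MM-DD."""
    def repl(m):
        month = MONTHS.index(m.group(2)) + 1   # January -> 1, ..., December -> 12
        day = int(m.group(3))                  # zero-padded below
        return f"{m.group(4)}-{month:02d}-{day:02d}"
    return DATE_RE.sub(repl, text)


print(to_iso("date=April 13, 1985"))         # date=1985-04-13
print(to_iso("accessdate=(March 5, 2011)"))  # accessdate=2011-03-05
```

Because the pattern matches only the date itself, it works the same whether the date is preceded by "date=", "accessdate=" or no field name at all.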

Concerning your addendum: sed works line by line, so it becomes awkward once a <ref></ref> pair spans more than one line. You are better off with something more powerful, e.g. Python.

re.search() can be used to find <ref> and the matching </ref>. Then re.sub() can be used to transform what's inside, using regexps similar to those used in sed. This algorithm has to be enclosed in a while loop to traverse the whole document.
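A minimal sketch of that loop follows. It assumes the refs are not nested, that self-closing tags such as <ref name="x" /> should be skipped, and that month names are the full English ones; REF_RE, DATE_RE, iso and convert_refs are hypothetical names chosen for this example.

```python
import re

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# "Month DD, YYYY", optionally wrapped in one pair of parentheses.
DATE_RE = re.compile(
    r"(\()?\b(" + "|".join(MONTHS) + r") (\d{1,2}), (\d{4})(?(1)\))"
)

# DOTALL lets "." cross newlines, so a <ref>...</ref> spanning several lines
# still matches. The opening tag must not end in "/>" (self-closing).
REF_RE = re.compile(r"<ref\b(?:[^>]*[^>/])?>.*?</ref>", re.DOTALL)


def iso(m):
    """Build YYYY-MM-DD from a DATE_RE match."""
    return f"{m.group(4)}-{MONTHS.index(m.group(2)) + 1:02d}-{int(m.group(3)):02d}"


def convert_refs(text):
    """Convert dates inside <ref>...</ref> only; copy everything else unchanged."""
    out, pos = [], 0
    while True:
        ref = REF_RE.search(text, pos)       # next ref at or after pos
        if ref is None:
            break
        out.append(text[pos:ref.start()])    # text before the ref, untouched
        out.append(DATE_RE.sub(iso, ref.group(0)))
        pos = ref.end()
    out.append(text[pos:])                   # tail after the last ref
    return "".join(out)


sample = "Born May 1, 2000.<ref>cite\ndate=(April 13, 1985)</ref>"
print(convert_refs(sample))
```

The whole file has to be read into one string first (for example with open(path).read()), since a ref may start on one line and end on another.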
