
Python RE question - proper state initial formatting

I have a string that I need to edit, it looks something similar to this:

string = "Idaho Ave N,,Crystal,Mn,55427-1463,US,,610839124763,Expedited"

Notice that the state abbreviation "Mn" is not capitalized properly. I'm trying to use a regular expression to fix this:

re.sub("[A-Z][a-z],", "[A-Z][A-Z],", string)

However, re.sub treats the second argument as a literal string and changes Mn, to [A-Z][A-Z],. How can I use re.sub (or something similarly simple) to change Mn, to MN, in this string?

Thank you in advance!


Your re.sub might also modify parts of the string that you do not want to change. Try processing the right element of the list explicitly:

input = "Idaho Ave N,,Crystal,Mn,55427-1463,US,,610839124763,Expedited"
elems = input.split(',')
elems[3] = elems[3].upper()
output = ','.join(elems)

returns

'Idaho Ave N,,Crystal,MN,55427-1463,US,,610839124763,Expedited'
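If the records might contain quoted fields with embedded commas, a plain split(',') would cut those fields apart. A minimal sketch of the same idea using the standard csv module follows; the helper name upper_field and the assumption that the state is always at index 3 are illustrative, not something stated in the question.

```python
import csv
import io

def upper_field(line, index):
    """Return `line` with the comma-separated field at `index` upper-cased.

    Hypothetical helper: parses and re-serializes one CSV record so that
    quoted fields containing commas are kept intact.
    """
    row = next(csv.reader([line]))
    row[index] = row[index].upper()
    out = io.StringIO()
    # lineterminator="" keeps the writer from appending "\r\n"
    csv.writer(out, lineterminator="").writerow(row)
    return out.getvalue()

line = "Idaho Ave N,,Crystal,Mn,55427-1463,US,,610839124763,Expedited"
print(upper_field(line, 3))
# Idaho Ave N,,Crystal,MN,55427-1463,US,,610839124763,Expedited
```

For input that never contains quotes, this behaves the same as the split/join version above.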


You can pass a function as the replacement parameter to re.sub to generate the replacement string from the match object, e.g.:

import re

s = "Idaho Ave N,,Crystal,Mn,55427-1463,US,,610839124763,Expedited"

def upcase(match):
    return match.group().upper()

print(re.sub(r"[A-Z][a-z],", upcase, s))

(This is ignoring the concern of whether you're genuinely finding state initials with this method.)
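One way to narrow the match, sketched under the assumption that the state field is always a two-letter field immediately followed by a US ZIP code field (5 digits, optionally with a 4-digit suffix), is to anchor the pattern on the surrounding commas with lookarounds. The names STATE_FIELD and fix_state are made up for this example.

```python
import re

# Two letters that form a whole field (comma before) and are directly
# followed by a ZIP or ZIP+4 field. This layout is an assumption about the data.
STATE_FIELD = re.compile(r"(?<=,)[A-Za-z]{2}(?=,\d{5}(?:-\d{4})?,)")

def fix_state(line):
    """Upper-case only the field that looks like a state abbreviation."""
    return STATE_FIELD.sub(lambda m: m.group().upper(), line)

s = "Idaho Ave N,,Crystal,Mn,55427-1463,US,,610839124763,Expedited"
print(fix_state(s))
# Idaho Ave N,,Crystal,MN,55427-1463,US,,610839124763,Expedited
```

Because the lookarounds are not part of the match, only the two letters themselves are passed to the replacement function.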

The appropriate documentation for re.sub is here.


sub(pattern, repl, string, count=0)

Return the string obtained by replacing the leftmost non-overlapping occurrences of the pattern in string by the replacement repl. repl can be either a string or a callable; if a string, backslash escapes in it are processed. If it is a callable, it's passed the match object and must return a replacement string to be used.

re.sub(r"[A-Z][a-z],", lambda m: m.group(0).upper(), myString)

I would avoid calling your variable string, since that is also the name of a standard library module.
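The count parameter in the signature quoted above limits how many matches are replaced. A small sketch, using a made-up input string with several candidates, shows the effect of count=1 with a callable replacement:

```python
import re

# Hypothetical input with three fields that all match the pattern
text = "Mn,Wi,Ia,"

# Only the leftmost match is replaced when count=1
first_only = re.sub(r"[A-Z][a-z],", lambda m: m.group(0).upper(), text, count=1)
print(first_only)  # MN,Wi,Ia,

# The default count=0 replaces every non-overlapping match
all_of_them = re.sub(r"[A-Z][a-z],", lambda m: m.group(0).upper(), text)
print(all_of_them)  # MN,WI,IA,
```

Passing count as a keyword argument keeps the call unambiguous, since flags can also follow it in the signature.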


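You can create a group by surrounding part of your regex in parentheses, then refer to it by its group number. Note that calling .upper() on the replacement template does not work, because it runs on the template text before re.sub substitutes the group; pass a function instead and upper-case the captured group there:

re.sub(r"([A-Z][a-z]),", lambda m: m.group(1).upper() + ",", string)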
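To check how a group reference behaves when .upper() is applied to the replacement template, here is a short sketch. It compares a template of the form r"\1,".upper() with a function replacement that upper-cases the captured group; the variable names are illustrative.

```python
import re

s = "Idaho Ave N,,Crystal,Mn,55427-1463,US,,610839124763,Expedited"

# .upper() is evaluated first, on the template text itself. A backslash,
# a digit and a comma have no upper-case form, so the template is unchanged
# and re.sub simply puts the original "Mn," back.
template_upper = re.sub(r"([A-Z][a-z]),", r"\1,".upper(), s)
print(template_upper == s)  # True

# A callable receives the match object, so it can upper-case the group.
with_function = re.sub(r"([A-Z][a-z]),", lambda m: m.group(1).upper() + ",", s)
print(with_function)
# Idaho Ave N,,Crystal,MN,55427-1463,US,,610839124763,Expedited
```

Without the r prefix, "\1" is not a group reference at all: Python reads it as the single control character chr(1), which would be inserted into the result.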