Check String for / against Characters in Python

I need to be able to tell the difference between a string that can contain letters and numbers, and a string that can contain numbers, colons and hyphens.

>>> import re
>>> def checkString(s):
...   pattern = r'[-:0-9]'
...   if re.search(pattern, s):
...     print("Matches pattern.")
...   else:
...     print("Does not match pattern.")

# 3 numbers separated by colons: 12, 24 and minus 14
>>> s1 = "12:24:-14"
# String containing letters and string containing letters/numbers.
>>> s2 = "hello"
>>> s3 = "hello2"

When I run the checkString function on each of the above strings:

>>> checkString(s1)
Matches pattern.
>>> checkString(s2)
Does not match pattern.
>>> checkString(s3)
Matches pattern.

s3 is the only one that doesn't do what I want. I'd like to be able to create a regex that allows numbers, colons and hyphens, but excludes EVERYTHING else (or just alphabetical characters). Can anyone point me in the right direction?

EDIT:

Therefore, I need a regex that would accept:

229            // number
187:657        //two numbers
187:678:-765   // two pos and 1 neg numbers

and decline:

Car          //characters
Car2         //characters and numbers


You need to match the whole string, not just a single character as you do at the moment:

>>> re.search('^[-:0-9]+$', "12:24:-14")
<_sre.SRE_Match object at 0x01013758>
>>> re.search('^[-:0-9]+$', "hello")
>>> re.search('^[-:0-9]+$', "hello2")

To explain regex:

  • the square brackets define a character class: it matches exactly one character, which must be a digit 0 to 9, a hyphen or a colon.
  • + is a quantifier indicating that the preceding expression should be matched as many times as possible, but at least once.
  • ^ and $ match the start and end of the string. For one-line strings they're equivalent to \A and \Z, except that $ also matches just before a single trailing newline.

This way you restrict the content of the whole string to be at least one character long and to contain any combination of characters from the character class. What you were doing beforehand was searching for a single character from the character class anywhere within the subject string. This is why s3, which contains a digit, matched.
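
The difference between the two patterns can be checked side by side. This is only a minimal sketch, written for Python 3; the helper name is_allowed and the two constant names are hypothetical and do not come from the question.

```python
import re

# Unanchored: succeeds if ANY single character is a digit, hyphen or colon.
UNANCHORED = re.compile(r'[-:0-9]')
# Anchored: every character, from start to end, must come from the class.
ANCHORED = re.compile(r'^[-:0-9]+$')

def is_allowed(s):
    """Hypothetical helper: True if s consists only of digits, ':' and '-'."""
    return ANCHORED.search(s) is not None

for s in ["12:24:-14", "hello", "hello2"]:
    print(s, UNANCHORED.search(s) is not None, is_allowed(s))
# 12:24:-14 True True
# hello False False
# hello2 True False
```

Only the anchored pattern rejects "hello2", because the letters can no longer be skipped over.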


SilentGhost's answer is pretty good, but take note that it would also match strings like "---::::" with no digits at all.

I think you're looking for something like this:

'^(-?\d+:)*-?\d+$'
  • ^ Matches the beginning of the string.
  • (-?\d+:)* An optional - sign, at least one digit, then a colon. That whole group is repeated zero or more times.
  • -?\d+ The same pattern once more, this time without the colon.
  • $ Matches the end of the string.

This will better match the strings you describe.
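
As a quick check, the stricter pattern can be run against the strings from the question plus a few edge cases. This is a sketch for Python 3; the helper name is_number_list and the extra samples "---::::" and "12:" are assumptions added for illustration.

```python
import re

# Optional '-' plus digits, with ':' between consecutive numbers.
NUMBERS = re.compile(r'^(-?\d+:)*-?\d+$')

def is_number_list(s):
    """Hypothetical helper: True for colon-separated integers like '187:678:-765'."""
    return NUMBERS.match(s) is not None

samples = ["229", "187:657", "187:678:-765", "Car", "Car2", "---::::", "12:"]
for s in samples:
    print(repr(s), is_number_list(s))
```

The first three samples are accepted and the rest are rejected, including the digit-free "---::::" and the dangling colon in "12:". Note that in Python 3, \d on a str pattern also matches non-ASCII decimal digits unless the re.ASCII flag is passed.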


pattern = r'\A([^-:0-9]+|[A-Za-z0-9])\Z'


Your regular expression is almost fine; you just need to make it match the whole string. Also, as a commenter pointed out, you don't really need a raw string (the r prefix on the string) in this case. Voila:

import re

def checkString(s):
  # re.match anchors at the start; '$' anchors at the end.
  if re.match('[-:0-9]+$', s):
    print("Matches pattern.")
  else:
    print("Does not match pattern.")

The '+' means "match one or more of the previous expression". (This will make checkString report no match on an empty string. If you want an empty string to match, change the '+' to a '*'.) The '$' means "match the end of the string".

re.match means "the string must match the regular expression starting at the first character"; re.search means "the regular expression can match a sequence anywhere inside the string".
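
A short sketch of that difference, assuming Python 3 and reusing the pattern from the function above:

```python
import re

pattern = r'[-:0-9]+$'

# re.match only tries at index 0, so the letters at the start make it fail.
print(re.match(pattern, "hello2"))             # None
# re.search may start anywhere, so it finds the trailing "2".
print(re.search(pattern, "hello2").group())    # 2
# With a string made only of allowed characters, both succeed.
print(re.match(pattern, "12:24:-14").group())  # 12:24:-14
```

This is why the pattern without a leading ^ is still enough when it is used with re.match.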

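Also, if you like premature optimization--and who doesn't!--note that 're.match' has to look up the pattern on every call. The re module caches recently compiled patterns, so the expression is not actually recompiled each time, but compiling it yourself skips the cache lookup. This version compiles the regular expression explicitly, only once: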

import re

# Compiled once at import time; reading a module-level name needs no 'global'.
__checkString_re = re.compile('[-:0-9]+$')
def checkString(s):
  if __checkString_re.match(s):
    print("Matches pattern.")
  else:
    print("Does not match pattern.")
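
If the function should return a value instead of printing, one possible variant is sketched below. It assumes Python 3.4 or later, where compiled patterns have a fullmatch() method; the names check_string and _CHECK_STRING_RE are hypothetical and not part of the answer above.

```python
import re

# Compiled once at import time. fullmatch() requires the whole string to
# match, so no ^ or $ anchors are needed in the pattern itself.
_CHECK_STRING_RE = re.compile(r'[-:0-9]+')

def check_string(s):
    """Return True if s is non-empty and contains only digits, ':' and '-'."""
    return _CHECK_STRING_RE.fullmatch(s) is not None

print(check_string("12:24:-14"))  # True
print(check_string("hello2"))     # False
print(check_string(""))           # False
print(check_string("12\n"))       # False
```

Unlike a pattern ending in '$', fullmatch() also rejects a string with a trailing newline, which may or may not be what you want.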