
Python Regex (Search Multiple values in one string)

In Python regex, how would I match against a large string of text and flag if any one of the regex values is matched? I have tried this with "|" (or) statements and I have tried making a regex list; neither worked for me. Here is an example of what I am trying to do with the or:

I think my "or" gets commented out

patterns=re.compile(r'[\btext String1\b] | [\bText String2\b]')   

if(patterns.search(MyTextFile)):
     print ("YAY one of your text patterns is in this file")

The above code always says it matches, regardless of whether the string appears, and if I change it around a bit I get matches on the first regex but it never checks the second. I believe this is because the "Raw" is commenting out my or statement, but how would I get around this?

I also tried to get around this by taking out the "Raw" statement and putting double slashes on my \b for escaping but that didn't work either :(

patterns=re.compile(\\btext String1\\b | \\bText String2\\b)   

if(patterns.search(MyTextFile)):
     print ("YAY one of your text patterns is in this file")

I then tried to do 2 separate raw statements with the or, and the interpreter complains about unsupported str operands:

patterns=re.compile(r'\btext String1\b' | r'\bText String2\b')   

if(patterns.search(MyTextFile)):
     print ("YAY one of your text patterns is in this file")


patterns=re.compile(r'(\btext String1\b)|(\bText String2\b)')   

You want a group (optionally capturing), not a character class. Technically, you don't need a group here:

patterns=re.compile(r'\btext String1\b|\bText String2\b')   

will also work (without any capture).
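
As a quick check of the ungrouped pattern, here is a minimal sketch. The two sample strings are made up for illustration and stand in for the contents of MyTextFile:

```python
import re

# Alternation without groups: match either whole phrase.
patterns = re.compile(r'\btext String1\b|\bText String2\b')

# Hypothetical sample contents standing in for MyTextFile.
with_match = "header line\nsome Text String2 here\n"
without_match = "nothing relevant in this text\n"

found = patterns.search(with_match)
missing = patterns.search(without_match)

if found:
    print("matched:", found.group(0))
if missing is None:
    print("no match in the second sample")
```

search() returns a match object or None, so it can be used directly in an if statement as in the question.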

The way you had it, it checked for either one of the characters between the first square brackets, or one of those between the second pair. You may find a regex tutorial helpful.

It should be clear where the "unsupported str operands" error comes from. You can't OR strings, and you have to remember the | is processed before the argument even gets to compile.
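
To illustrate that point, and the "regex list" idea from the question, here is a sketch that keeps the | inside the pattern string by joining a list. The list name phrases and the helper contains_any are hypothetical, and re.escape is used on the assumption that the entries are literal phrases rather than regex fragments:

```python
import re

# Python evaluates the | between the two str objects first, so this
# raises TypeError before re.compile is ever called.
try:
    re.compile(r'\btext String1\b' | r'\bText String2\b')
except TypeError as exc:
    print("TypeError:", exc)

# Hypothetical list of literal phrases to look for.
phrases = ["text String1", "Text String2"]

# The "|" lives inside the pattern string; re.escape guards against
# regex metacharacters in the phrases.
pattern_text = "|".join(r"\b" + re.escape(p) + r"\b" for p in phrases)
patterns = re.compile(pattern_text)


def contains_any(text):
    """Return True if any of the phrases occurs in text."""
    return patterns.search(text) is not None
```

With this, contains_any(MyTextFile) could replace the patterns.search(MyTextFile) test in the question.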


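This part [\btext String1\b] means "match any single one of the characters listed". Inside square brackets, \b is a backspace character rather than a word boundary, so the class is backspace plus the letters, digit and space in "text String1". Together with the literal spaces around the |, I think that matches almost any line of ordinary text.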


In a RE pattern, square brackets [ ] indicate a "character class" (depending on what's inside them, "any one of these characters" or "any character except one of these", the latter indicated by a caret ^ as the first character after the opening [). This is what you're expressing and it has absolutely nothing to do with what you want -- just remove the brackets (and the spaces around the |, which are matched literally) and you should be fine ;-).
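
A small sketch of the difference, using a made-up sentence that contains neither phrase. The bracketed pattern from the question still matches it, because it only needs one character from the class next to a space:

```python
import re

# Pattern from the question: two character classes with literal spaces.
bracketed = re.compile(r'[\btext String1\b] | [\bText String2\b]')
# Same phrases as plain alternation.
plain = re.compile(r'\btext String1\b|\bText String2\b')

sample = "an unrelated sentence"  # hypothetical text with neither phrase

bracket_hit = bracketed.search(sample)
plain_hit = plain.search(sample)

print(repr(bracket_hit.group(0)))  # 'n ' (the n of "an" plus the space)
print(plain_hit)                   # None

# Inside a class, \b stands for the backspace character.
backspace_hit = re.search(r'[\b]', "\x08")
```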
