Anchors in Regex
- In a Python Regex, must ^ or $ appear just once?
I tried to match two lines with
^(.*\|.*)$^.*$
It does not work. How do you match several lines?
Note: I am not 开发者_如何学Goprogramming in Python, but using Python-style Regex in my editor gedit.
Thanks and regards!
As other answers have said, you are looking for re.MULTILINE
, but even with that your regex won't work.
$
matches the position before the line break, and ^
matches the start of a line, so $^
in the middle of a regex will never match. For example:
>>> re.search("^(.*)$^.*$", multiline_string, re.M) # won't match
>>> re.search("^(.*)$\n^.*$", multiline_string, re.M) # will match
<_sre.SRE_Match object at 0xb7f3e5e0>
You need something to match the end of line characters between the $
and the ^
.
You have to use re.MULTILINE ( or even re.DOTALL if you change regex and depending on what you actually want to match / do )
re.MULTILINE
When specified, the pattern character '^' matches at the beginning of the string and at the beginning of each line (immediately following each newline); and the pattern character '$' matches at the end of the string and at the end of each line (immediately preceding each newline).
By default, '^' matches only at the beginning of the string, and '$' only at the end of the string and immediately before the newline (if any) at the end of the string.
http://docs.python.org/library/re.html
BTW, what are you doing with - ^(.*\|.*)$^.*$
- that is not a very good regex! ( ignoring the fact that you have the multiple $
and ^
which is the point of the question. )
Take a look at re.MULTILINE
.
I quote:
When specified, the pattern character
'^'
matches at the beginning of the string and at the beginning of each line (immediately following each newline); and the pattern character'$'
matches at the end of the string and at the end of each line (immediately preceding each newline).By default,
'^'
matches only at the beginning of the string, and'$'
only at the end of the string and immediately before the newline (if any) at the end of the string.
To add to other answers. You can get away with putting the re.MULTILINE modifier directly into the regex:
(?m)^(.*\|.*)$\n^.*$
I would refer to the python regex manual http://docs.python.org/library/re.html#re.MULTILINE
Prefixing your regex with (?m) should do what you need (tells the regex engine that it's going to receive multiline texts, and that ^/$ match the beginning/end of a line instead of the whole text).
Edit: after looking at your regex a bit more, I think you also need to put (?s), meaning that you want dot to match newline characters. For example:
(?m)(?s)^hello.*?world$
correctly matched "helloworld" for me in a case like this:
dssdf
hello
world
sdfasdf
精彩评论