How to get this regex working?
I have a small problem. I want to find, in
<tr><td>3</td><td>foo</td><td>2</td>
the foo. I use:
$<tr><td>\d</td><td>(.*)</td>$
to find the foo, but it doesn't work: the pattern matches not the </td> right after foo but the </td> at the end of the string.
You have to make the .* lazy instead of greedy. Read more about lazy vs greedy here.
Your end-of-string anchors ($) also don't make sense. Try:
<tr><td>\d<\/td><td>(.*?)<\/td>
(As seen on rubular.)
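To see the difference concretely, here is a minimal sketch in Python. Python is an assumption, since the question does not name a language, and the variable names are only illustrative. It runs the greedy and the lazy pattern against the sample row from the question. The slashes are left unescaped because Python's re module does not use / as a delimiter.

```python
import re

# Sample row taken from the question.
row = "<tr><td>3</td><td>foo</td><td>2</td>"

# Greedy: .* runs to the end of the string, then backtracks to the LAST </td>.
greedy = re.search(r"<tr><td>\d</td><td>(.*)</td>", row)

# Lazy: .*? expands one character at a time and stops at the FIRST </td>.
lazy = re.search(r"<tr><td>\d</td><td>(.*?)</td>", row)

print(greedy.group(1))  # foo</td><td>2
print(lazy.group(1))    # foo
```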
NOTE: I don't advocate using regex to parse HTML. But sometimes the task at hand is simple enough to be handled by regex, and a full-blown XML parser would be overkill (for example: this question). Knowing how to pick the "right tool for the job" is an important skill in programming.
Your leading $ should be a ^.
If you don't want to match all of the way to the end of the string, don't use a $ at the end. However, since * is greedy, it'll grab as much as it can. Some regex implementations have a non-greedy version which would work, but you probably just want to change (.*) to ([^<]*).
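A short sketch of these points, again assuming Python's re module (the question does not say which engine is in use) and using illustrative variable names. It shows that a leading $ or a trailing $ prevents any match on the sample row, while ^ together with ([^<]*) captures only the cell text.

```python
import re

# Sample row taken from the question.
row = "<tr><td>3</td><td>foo</td><td>2</td>"

# A leading $ asserts end-of-string before any text is consumed, so nothing can follow it.
bad_start = re.search(r"$<tr><td>\d</td><td>([^<]*)</td>", row)

# A trailing $ requires the match to end at the end of the string,
# but [^<]* cannot cross the next tag to get there.
bad_end = re.search(r"^<tr><td>\d</td><td>([^<]*)</td>$", row)

# ^ anchors at the start; [^<]* stops at the first '<', so greediness is harmless.
good = re.search(r"^<tr><td>\d</td><td>([^<]*)</td>", row)

print(bad_start)      # None
print(bad_end)        # None
print(good.group(1))  # foo
```

The negated class also works in engines without a lazy quantifier, at the cost of not matching cells that contain nested tags.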
Use:
^<tr><td>\d</td><td>(.*?)</td>
(insert obligatory comment about not using regex to parse xml)