regular expression to match html tag with specific contents

2022-12-19 07:58 Q&A author:

I am trying to write a regular expression to capture this string:

<td style="white-space:nowrap;">###.##</td>

I can't even match it if I include the string as it is in the regex pattern! I am using preg_match_all(), but I am not finding the correct pattern. I suspect that "white-space:nowrap;" is throwing off the matching in some way. Any ideas? Thanks.

Why not try using DOMDocument instead? Then you do not have to worry about the HTML being well formed. Using the DOM classes also improves readability, and it should be fast, since the DOM extension is compiled into PHP rather than living in user space.
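A minimal sketch of the DOM approach, assuming the cell holds a number where the question shows ###.##. The sample markup and the value 123.45 are made up for illustration; DOMDocument, DOMXPath, and libxml_use_internal_errors() are standard PHP.

```php
<?php
// Hypothetical sample input; 123.45 stands in for the ###.## placeholder.
$html = '<table><tr><td style="white-space:nowrap;">123.45</td><td>other</td></tr></table>';

// Collect libxml parse warnings internally instead of emitting them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Select every <td> whose style attribute is exactly the one from the question.
$cells = $xpath->query('//td[@style="white-space:nowrap;"]');

$values = [];
foreach ($cells as $cell) {
    // textContent is the text inside the cell, without any markup.
    $values[] = trim($cell->textContent);
}

print_r($values);
```

The XPath test compares the whole style attribute, so a cell with extra style rules would not be selected; contains(@style, "white-space:nowrap") is a looser alternative.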

When I'm having problems with regular expressions, I like to test them in real time with one of the following websites:

preg_match Regular Expression Tester
Regular Expression Test Tool

Did you see any warnings? You have to escape some bits of that, namely the / in the closing td tag, since / is also the regex delimiter here. This seemed to work for me:

php > $string='cow cow cow    <td style="white-space:nowrap;">###.##</td> cat cat cat cat';
php > preg_match_all('/<td style="white-space:nowrap;">###\.##<\/td>/',$string,$result);
php > var_dump($result);
array(1) {
  [0]=>
  array(1) {
    [0]=>
    string(43) "<td style="white-space:nowrap;">###.##</td>"
  }
}
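If ###.## in the question is a placeholder for a number rather than literal hash signs (an assumption, since the question does not say), a variant of the pattern above could capture the value with a group. The sample values 123.45 and 7.00 are made up for illustration.

```php
<?php
// Hypothetical input: 123.45 and 7.00 stand in for the ###.## placeholder.
$string = 'cow <td style="white-space:nowrap;">123.45</td> cat '
        . '<td style="white-space:nowrap;">7.00</td>';

// (\d+\.\d{2}) captures one or more digits, a literal dot, then exactly two digits.
// The / in the closing tag is escaped because / is the regex delimiter.
$count = preg_match_all(
    '/<td style="white-space:nowrap;">(\d+\.\d{2})<\/td>/',
    $string,
    $result
);

// $result[0] holds the full matches; $result[1] holds the captured numbers.
print_r($result[1]);
```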

Are you aware that the regex argument to any of PHP's preg_ functions has to be double-delimited? For example:

preg_match_all('/foo/', $target, $results)

'...' are the string delimiters, /.../ are the regex delimiters, and the actual regex is foo. The regex delimiters don't have to be slashes, they just have to match; some popular choices are #...#, %...% and ~...~. They can also be balanced pairs of bracketing characters, like {...}, (...), [...], and <...>; those are much less popular, and for good reason.
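A small sketch of that point: the same regex, foo, wrapped in several of the delimiter styles listed above, should behave identically. The target string is made up for illustration.

```php
<?php
// Hypothetical sample text, made up for this illustration.
$target = 'foo bar foo';

// The same regex, foo, inside four different delimiter styles.
$patterns = ['/foo/', '#foo#', '~foo~', '{foo}'];

$counts = [];
foreach ($patterns as $pattern) {
    // preg_match_all() returns the number of full matches it found.
    $counts[$pattern] = preg_match_all($pattern, $target, $results);
}

print_r($counts);
```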

If you leave out the regex delimiters, the regex-compilation phase will probably fail and the error message will probably make no sense. For example, this code:

preg_match_all('<td style="white-space:nowrap;">###.##</td>', $s, $m)

...would generate this message:

 Unknown modifier '#'

It tries to use the first pair of angle brackets as the regex delimiters, and whatever follows the > as the regex modifiers (e.g., i for case-insensitive, m for multiline). To fix that, you would add real regex delimiters, like so:

preg_match_all('%<td style="white-space:nowrap;">###\.##</td>%i', $s, $m)

The choice of delimiter is a matter of personal preference and convenience. If I had used # or /, I would have had to escape those characters in the actual regex. I escaped the . because it's a regex metacharacter. Finally, I added the i modifier to demonstrate the use of modifiers and because HTML isn't case sensitive.
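To see both behaviours side by side, here is a rough sketch that runs the undelimited pattern and the delimited one against the same input. The sample string and the warning-collecting handler are illustrative assumptions; set_error_handler() and restore_error_handler() are standard PHP.

```php
<?php
// Hypothetical sample input containing the literal string from the question.
$s = 'cow <td style="white-space:nowrap;">###.##</td> cat';

// Collect warnings in an array instead of printing them.
$warnings = [];
set_error_handler(function (int $errno, string $errstr) use (&$warnings): bool {
    $warnings[] = $errstr;
    return true; // treat the warning as handled
});

// No regex delimiters: the first < ... > pair is taken as the delimiters,
// and the text after the > is read as modifiers.
$bad = preg_match_all('<td style="white-space:nowrap;">###.##</td>', $s, $m1);

restore_error_handler();

// With % as the delimiter, the pattern compiles and the literal string matches.
$good = preg_match_all('%<td style="white-space:nowrap;">###\.##</td>%i', $s, $m2);

var_dump($bad);   // result of the call whose pattern failed to compile
print_r($warnings);
var_dump($good);  // number of matches for the delimited pattern
```

Checking the return value with === false is one way to tell a failed compilation apart from a pattern that simply found no matches, which returns 0.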
