Removing newlines and tabs after Regex

2023-03-15 23:02

I am performing preg_match() on the following HTML code:

HTML Code:

<div class="phone"> 
        (123) 123-1234
    </div>

Regex Pattern:

/<div class="phone">(?<phone>.*?)<\/div>/s

Result:

[phone] => '
                    (617) 547-6670
    '

The extra lines and spaces are what I am trying to get rid of. Using the /sm modifiers does not affect the result. Using str_replace("\n", '', $string) got rid of the line break, and the spaces in front appear to be \t tabs. I got rid of the unwanted characters with str_replace("\n\t\t\t\t", '', $string), but I need a more general solution.

How can I remove the \n and \t regardless of how many there are?

Not sure if this is what you would like, but trim() will take care of spaces, tabs, and newlines on each side of the string (but not within the string).

http://php.net/manual/en/function.trim.php

string trim ( string $str [, string $charlist ] )

This function returns a string with whitespace stripped from the beginning and end of str. Without the second parameter, trim() will strip these characters:
" " (ASCII 32 (0x20)), an ordinary space.
"\t" (ASCII 9 (0x09)), a tab.
"\n" (ASCII 10 (0x0A)), a new line (line feed).
"\r" (ASCII 13 (0x0D)), a carriage return.
"\0" (ASCII 0 (0x00)), the NUL-byte.
"\x0B" (ASCII 11 (0x0B)), a vertical tab.

I do realize that this will not handle something like Hello<space><space><space>World, but it may be what you're after (outside of the regex).
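As a rough sketch of that suggestion, trim() can be applied to the captured group right after preg_match(). The sample string and variable names below are assumptions for illustration. The preg_replace() line is an extra that is not part of the answer above; it only shows one way to collapse runs of whitespace inside a string if that turns out to be needed.

```php
<?php
// Sample input, assumed to resemble the markup in the question.
$html = "<div class=\"phone\">\n\t\t\t\t(123) 123-1234\n\t</div>";

$phone = null;
if (preg_match('/<div class="phone">(?<phone>.*?)<\/div>/s', $html, $m)) {
    // trim() strips spaces, tabs, newlines, etc. from both ends only.
    $phone = trim($m['phone']);
}

// Optional extra: collapse internal whitespace runs to a single space.
$collapsed = preg_replace('/\s+/', ' ', trim("Hello   \t\n World"));
```

This leaves the original pattern untouched and does the cleanup in a separate step, which may be easier to read than a more elaborate regex.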

The simplest way is to pad the "content" part of the regex with \s*, like so:

/<div class="phone">\s*(?<phone>.*?)\s*<\/div>/s

The first \s* consumes as many whitespace characters as it can, stopping when it sees the first character in the phone number. Then the .*? starts consuming characters reluctantly, stopping at the first position where the next part of the regex (\s*<\/div>) can match, which is just after the last character in the phone number.

Be aware that the first \s* must be greedy and the .*? in the named group must be non-greedy for this to work. So if you start feeling the urge to make all quantifiers non-greedy with the /U option (which inverts the greediness of every quantifier), resist it. I mention this because some people use it in all their regexes, which I consider a poor practice. Also, the /s (single-line) modifier is what lets . match newlines, so keep it if the content can span lines; the /m (multiline) modifier only affects ^ and $ and isn't needed here.
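A minimal sketch of the padded pattern in use, assuming input shaped like the markup in the question (the sample string and variable names are illustrative). The second pattern adds /U only to show why that modifier works against this approach.

```php
<?php
// Sample input, assumed to resemble the markup in the question.
$html = "<div class=\"phone\">\n\t\t\t\t(123) 123-1234\n\t</div>";

// Greedy \s* on both sides of a lazy .*? keeps the padding out of the group.
$pattern = '/<div class="phone">\s*(?<phone>.*?)\s*<\/div>/s';
$phone = preg_match($pattern, $html, $m) ? $m['phone'] : null;
// $phone is now "(123) 123-1234"

// With /U the greediness is inverted: \s* turns lazy and .*? turns greedy,
// so the surrounding whitespace ends up inside the captured group again.
$patternU = '/<div class="phone">\s*(?<phone>.*?)\s*<\/div>/sU';
$withU = preg_match($patternU, $html, $mU) ? $mU['phone'] : null;
```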

Use \s*.

\s matches a whitespace character, and * means any number of them, including zero.

But I think you should look at an HTML parser; it is probably the better solution here.
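One possible sketch of the parser route, using PHP's bundled DOM extension (DOMDocument and DOMXPath). The sample markup and the XPath expression are assumptions for illustration; real pages may need a looser class test, since @class="phone" only matches when the attribute is exactly that value.

```php
<?php
// Sample input, assumed to resemble the markup in the question.
$html = "<div class=\"phone\">\n\t\t\t\t(123) 123-1234\n\t</div>";

// Collect libxml warnings quietly instead of printing them for imperfect HTML.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Find the div by its class attribute and take its text content.
$xpath = new DOMXPath($doc);
$nodes = $xpath->query('//div[@class="phone"]');

// trim() removes the surrounding newlines and tabs from the node text.
$phone = $nodes->length > 0 ? trim($nodes->item(0)->textContent) : null;
```

The whitespace cleanup is still a plain trim(); the parser only replaces the regex as the way of locating the element.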
