Extract text from HTML using regex or another method

I am trying to extract the text "abcdef" from the following HTML using a regex:

<a href="xyz.com" rel="bookmark" title="hello_world">abc def</a>

I am trying this pattern:

$pattern = "<a href=(.*?) rel='bookmark' title=(.*?)>(.*?)</a>"

It would be helpful if anyone could help me figure out the pattern. I am using PHP.

Thanks.


Use DOMDocument instead. Specifically, DOMDocument::loadHTML. Your life will be much easier.
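
For illustration, here is a minimal sketch of the DOMDocument approach, assuming the markup from the question and that you want the text of every link with rel="bookmark". The function name extractBookmarkText is hypothetical, not something from the question.

```php
<?php
// Hypothetical helper: returns the text of every <a rel="bookmark"> in $html.
function extractBookmarkText(string $html): array
{
    $doc = new DOMDocument();

    // Collect libxml warnings internally instead of emitting them;
    // fragments and sloppy real-world markup often trigger them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $texts = [];
    foreach ($doc->getElementsByTagName('a') as $link) {
        // Attribute quoting style (' or ") no longer matters here.
        if ($link->getAttribute('rel') === 'bookmark') {
            $texts[] = $link->textContent;
        }
    }
    return $texts;
}

$html  = '<a href="xyz.com" rel="bookmark" title="hello_world">abc def</a>';
$texts = extractBookmarkText($html);
print_r($texts);
```

Because the parser handles attributes for you, this keeps working if the attributes appear in a different order or use single quotes, which is where a hand-written pattern tends to break.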

You could use a pattern like the following, but I really don't recommend using regexes to manipulate HTML:

/<a\s+href\s*=\s*"([^"]+)"\s+rel\s*=\s*"([^"]+)"\s+title\s*=\s*"([^"]+)"\s*>([^<]+)<\/a>/

I also noticed that in your regular expression you have rel='bookmark' whereas the original string has rel="bookmark". This is probably why your original regex is not working.
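
If you do go the regex route, here is a rough sketch of how the pattern above could be used with preg_match, assuming the exact markup from the question. Note that PHP's preg_* functions expect the pattern to be wrapped in delimiters (here /), which the pattern in the question does not have. The variable names are just for illustration.

```php
<?php
$html = '<a href="xyz.com" rel="bookmark" title="hello_world">abc def</a>';

// Single-quoted so PHP passes the backslashes through to PCRE untouched.
$pattern = '/<a\s+href\s*=\s*"([^"]+)"\s+rel\s*=\s*"([^"]+)"\s+title\s*=\s*"([^"]+)"\s*>([^<]+)<\/a>/';

$href = null;
$text = null;

// preg_match returns 1 on a match, 0 on no match, false on error.
if (preg_match($pattern, $html, $matches) === 1) {
    $href = $matches[1]; // first capture group: href value
    $text = $matches[4]; // fourth capture group: link text
    echo $text, PHP_EOL;
} else {
    echo 'No match', PHP_EOL;
}
```

This only matches when the attributes appear in exactly this order with double quotes, so treat it as a quick fix for known input rather than a general HTML parser.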
