Get text between HTML tags [duplicate]

2023-02-25 07:09

This question already has answers here: extract text from tag (2 answers) Closed 9 years ago.

OK, this is a pretty basic question, I'm sure, but I'm new to PHP and haven't been able to figure it out. The input string is $data, and I want to pull out only the first match and keep working with it. Is the code below incorrect? This may not even be the best way to do it; I'm just trying to pull the contents between two HTML tags (the first set found) and discard the rest of the data. I know there are similar questions and I've read them all. My question is a mix: is there a better way to do this, and how can I make the match the new input for the rest of the code? If I change $matches to $data2 and use it from there on, it returns errors.

preg_match('/<h2>(.*?)<\/h2>/s', $data, $matches);

Don't parse HTML via preg_match, use this PHP class instead:

The DOMDocument class

Example:

<?php

$html = "<p>hi</p>
<h1>H1 title</h1>
<h2>H2 title</h2>
<h3>H3 title</h3>";
// a new DOM object
$dom = new DOMDocument('1.0', 'utf-8');
// discard whitespace (set before loading; it has no effect afterwards)
$dom->preserveWhiteSpace = false;
// load the HTML into the object
$dom->loadHTML($html);
$hTwo = $dom->getElementsByTagName('h2'); // use your desired tag here
echo $hTwo->item(0)->nodeValue;
// prints "H2 title"
?>

Reference
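
For the case in the question, where only the first <h2> matters and the rest of the input is thrown away, the same approach can be wrapped in a small helper. This is only a sketch: the function name firstTagText is made up for illustration, it assumes a non-empty input string, and it assumes that returning null is acceptable when the tag is absent.

```php
<?php
// Hypothetical helper (not part of the answer above): returns the text of
// the first <$tag> element in $html, or null if there is no such element.
// Assumes $html is a non-empty string.
function firstTagText(string $html, string $tag): ?string
{
    // Collect libxml parse warnings instead of printing them;
    // real-world HTML is often not perfectly valid.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $loaded = $dom->loadHTML($html);

    // Drop the collected warnings and restore the previous setting.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($loaded === false) {
        return null;
    }

    // item(0) is the first matching element, or null if none exists.
    $node = $dom->getElementsByTagName($tag)->item(0);
    return $node === null ? null : $node->textContent;
}

$data = "<p>hi</p><h2>H2 title</h2><h2>second</h2>";
$data = firstTagText($data, 'h2'); // $data now holds only "H2 title"
var_dump($data);
```

Overwriting $data with the result is one way to "discard the rest" so the remaining code keeps using the same variable; check for null before relying on it.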

Using regular expressions is generally a good idea for your problem.

When you look at http://php.net/preg_match you see that $matches will be an array: index 0 holds the text that matched the full pattern, and index 1 onward hold the text captured by each parenthesis-group. Try

print_r($matches);

to get an idea of how the result looks, and then pick the right index.

EDIT:

If there is a match, then you can get the text extracted between the parenthesis-group with

print($matches[1]);

If you had more than one parenthesis-group they would be numbered 2, 3 etc. You should also consider the case when there is no match, in which case preg_match returns 0 and the array will be empty.
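
Putting the pieces together for the pattern in the question, the sketch below checks the return value of preg_match before reading $matches[1], then assigns the captured text to a new variable. The sample input is invented for illustration, and $data2 is the name from the question. The likely reason for the errors mentioned there is that $matches is an array, so it cannot simply stand in for a string; the string is at index 1.

```php
<?php
// Sample input, invented for illustration.
$data = "<p>intro</p><h2>First heading</h2><h2>Second heading</h2>";

// preg_match() returns 1 on a match, 0 on no match, false on error,
// and stops searching after the first match.
if (preg_match('/<h2>(.*?)<\/h2>/s', $data, $matches) === 1) {
    // $matches[0] is the whole match including the tags;
    // $matches[1] is the text captured by the first parenthesis-group.
    $data2 = $matches[1];
} else {
    // Assumption: fall back to an empty string when no <h2> is found.
    $data2 = '';
}

echo $data2; // First heading
```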

You could do it this way:

$h1 = preg_replace('/<h1[^>]*?>([\\s\\S]*?)<\/h1>/',
'\\1', $h1);

This strips the <h1></h1> tags and leaves the text they wrapped.
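
A small sketch with made-up input shows what the replacement does. Note that preg_replace() rewrites every <h1> in the string and leaves everything outside the tags untouched, so it fits best when the string already contains just the one element; pulling the first match out of a larger document is a job for preg_match().

```php
<?php
// Sample input, invented for illustration.
$h1 = '<h1 class="title">Hello</h1>';
$text = preg_replace('/<h1[^>]*?>([\s\S]*?)<\/h1>/', '$1', $h1);
echo $text, "\n"; // Hello

// With surrounding markup, only the <h1> tags themselves are removed.
$page = '<p>intro</p><h1>Hello</h1>';
$unwrapped = preg_replace('/<h1[^>]*?>([\s\S]*?)<\/h1>/', '$1', $page);
echo $unwrapped, "\n"; // <p>intro</p>Hello
```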
