PHP strip img tags from html, returning the html and the images in an array
I need to write a function that takes some HTML, strips out the img tags and returns the html (sans images). However, I also need to retain the imgs (in an array) so I can output them to the page separately.
I barely know any PHP, so what is the best way to do this?
You'll need to familiarize yourself with the DOMDocument class. The best way to do this is to parse the HTML using DOMDocument and locate all the <img> tags using getElementsByTagName('img'). If it's the images' src attributes you're after, DOMDocument can return those so you can store them in an array.
// HTML already parsed into $dom
$imgs = $dom->getElementsByTagName('img');
$img_src = array();
// Array of nodes to remove.
$to_remove = array();
foreach ($imgs as $img) {
    // Store the img src
    $img_src[] = $img->getAttribute('src');
    // Queue the node for removal; removing it inside this loop would
    // disturb the live node list being iterated
    $to_remove[] = $img;
}
// Then remove all the nodes slated for deletion:
foreach ($to_remove as $node) {
    // removeChild() has to be called on the node's own parent
    $node->parentNode->removeChild($node);
}
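Putting the pieces above into the single function the question asks for might look roughly like the sketch below. The function name strip_images() and the 'html' / 'images' array keys are made up for illustration, and the sketch assumes the input is a UTF-8 HTML fragment. It keeps each whole <img> tag (not just the src) so the tags can be echoed elsewhere on the page.

```php
<?php
// Hypothetical helper, not part of the original answer.
// Returns ['html' => markup without images, 'images' => array of <img> tags].
function strip_images(string $html): array
{
    $dom = new DOMDocument();

    // Real-world markup is rarely valid; collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    // The XML declaration is a common hint that the input is UTF-8 (assumption).
    $dom->loadHTML('<?xml encoding="UTF-8">' . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Copy the live node list into a plain array before modifying the tree.
    $nodes = iterator_to_array($dom->getElementsByTagName('img'));

    $images = array();
    foreach ($nodes as $img) {
        // Keep the serialized tag, then detach it from its parent.
        $images[] = $dom->saveHTML($img);
        $img->parentNode->removeChild($img);
    }

    // loadHTML() wraps fragments in <html><body>; emit only the body's children.
    $clean = '';
    $body = $dom->getElementsByTagName('body')->item(0);
    if ($body !== null) {
        foreach ($body->childNodes as $child) {
            $clean .= $dom->saveHTML($child);
        }
    }

    return array('html' => $clean, 'images' => $images);
}

// Example usage:
$result = strip_images('<p>Hello <img src="a.png" alt="A"> world</p><img src="b.png">');
echo $result['html'], "\n";
foreach ($result['images'] as $tag) {
    echo $tag, "\n";
}
```

If only the URLs are needed, swapping $dom->saveHTML($img) for $img->getAttribute('src') gives the same src array as the snippet above.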
<?php
$pattern = '/<img[^>]*src="([^"]*)[^>]*>/i';
preg_match_all($pattern, $data, $matches);
// image src array
$images = $matches[1];
// HTML with the img tags removed
$html = preg_replace($pattern, '', $data);
?>