How can I fetch all image src values into an array with file_get_contents()?

How can I fetch the src of every image on a page into an array, using file_get_contents() together with preg_match or whatever else works?


You shouldn't use regex to parse HTML. Use a class like DOMDocument instead. DOMDocument has a getElementsByTagName method that retrieves all the img tags from the document you want to parse.

Here's an example that echoes the src of each image in the document:

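<?php
    // Collect parse problems inside libxml instead of emitting PHP warnings;
    // real-world HTML is rarely valid.
    libxml_use_internal_errors(true);

    $document = new DOMDocument();
    $document->loadHTML(file_get_contents('yourfilehere.html'));

    // getAttribute() returns an empty string when an <img> has no src attribute,
    // so this does not fail on such tags.
    foreach ($document->getElementsByTagName('img') as $image) {
        echo $image->getAttribute('src'), '<br />';
    }
?>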
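
Since the question asks for an array rather than echoed output, the same DOMDocument approach could be wrapped in a function roughly like the one below. This is only a sketch: the function name extractImageSources and the sample HTML are made up for illustration, and skipping images without a src is an assumption about what the asker wants.

```php
<?php
// Hypothetical helper (not part of the original answer): returns every
// non-empty src attribute found on <img> elements in the given HTML string.
function extractImageSources(string $html): array
{
    // Keep libxml parse warnings internal and restore the old setting afterwards.
    $previous = libxml_use_internal_errors(true);

    $document = new DOMDocument();
    $document->loadHTML($html);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $sources = [];
    foreach ($document->getElementsByTagName('img') as $image) {
        $src = $image->getAttribute('src'); // '' when the attribute is missing
        if ($src !== '') {
            $sources[] = $src;
        }
    }
    return $sources;
}

// Illustrative input; in practice this would come from file_get_contents().
$html = '<html><body><img src="a.png"><p><img alt="no source"></p>'
      . '<img src="/img/b.jpg"></body></html>';
print_r(extractImageSources($html));
```

With a real page, the call would look like extractImageSources(file_get_contents($url)), after checking that file_get_contents() did not return false.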


It's more reliable and simpler to use phpQuery or SimpleHTMLparser (the latter is more elaborate). But for basic extraction, where you are only looking for src= attributes, that is overkill and a regular expression is in fact sufficient:

preg_match_all('/<img[^>]+src\s*=[\'\"\s]?([^<\'\"]+)/ims', file_get_contents($url), $uu);

The captured values end up in $uu[1]. Note that they will mostly be relative path names rather than full URLs, so they need postprocessing, whereas phpQuery IIRC has a shortcut for normalizing them.
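
As a rough illustration of that postprocessing step, the sketch below runs the same regular expression and then turns each captured path into a URL. The helper resolveImageUrl, the example.com base URL and the sample HTML are all hypothetical. The resolution is deliberately simplified: it does not collapse "./" or "../" segments and ignores any base tag in the page, so treat it as a starting point rather than a complete resolver.

```php
<?php
// Hypothetical helper: resolves $src against $baseUrl in a simplified way.
function resolveImageUrl(string $src, string $baseUrl): string
{
    // Already absolute: starts with a scheme such as "http:" or "data:".
    if (preg_match('~^[a-z][a-z0-9+.-]*:~i', $src)) {
        return $src;
    }

    $base   = parse_url($baseUrl);
    $scheme = $base['scheme'] ?? 'http';
    $origin = $scheme . '://' . ($base['host'] ?? '');
    if (isset($base['port'])) {
        $origin .= ':' . $base['port'];
    }

    // Protocol-relative, e.g. //cdn.example.com/a.png
    if (strpos($src, '//') === 0) {
        return $scheme . ':' . $src;
    }
    // Root-relative, e.g. /img/a.png
    if (strpos($src, '/') === 0) {
        return $origin . $src;
    }
    // Path-relative: append to the directory part of the base path.
    $path = $base['path'] ?? '/';
    $dir  = substr($path, 0, strrpos($path, '/') + 1);
    return $origin . $dir . $src;
}

// Illustrative input; in practice this would be file_get_contents($url).
$html = '<img src="a.png"> <img class="x" src=\'/img/b.jpg\'>';
preg_match_all('/<img[^>]+src\s*=[\'\"\s]?([^<\'\"]+)/ims', $html, $uu);

$urls = [];
foreach ($uu[1] as $src) {
    $urls[] = resolveImageUrl($src, 'http://example.com/dir/page.html');
}
print_r($urls);
```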
