Extract all URLs from href attributes in PHP [duplicate]

This question already has answers here: Finding links matching given string in xpath/domdocument query (2 answers) PHP Xpath : get all href values that contain needle (1 answer) Closed 11 months ago.

I have an HTML document with many links. I am currently able to extract all of the links, but I only want the ones that contain a certain word.


$dom = new DOMDocument;
$dom->loadHTML($html);
$links = $dom->getElementsByTagName('a');
foreach ($links as $link){
    echo $link->getAttribute('href');
}

I want to list only the links that contain a certain word, for example: sendspace.com

The result would look more or less like this:

http://www.fileserve.com/file/eDpDMm9sad/

http://www.fileserve.com/file/7s83hjh347/

I would then convert these links to SHA-1 hashes.

After the conversion, I want to save the HTML with the SHA-1 hashes applied to the links that contain the word.


Using phpQuery, you can traverse the DOM and find the anchors (<a>) with the href attribute containing what you want:

$dom = phpQuery::newDocument($htmlSource);
// [href*=...] matches any href that contains the given substring
$anchors = $dom->find('a[href*="sendspace.com"]');

$urls = array();

if($anchors) {
  foreach($anchors as $anchor) {
    $anchor = pq($anchor);
    $urls[] = $anchor->attr('href');
  }
}
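
If pulling in phpQuery is not an option, a similar "href contains" filter can probably be done with the built-in DOM extension and an XPath contains() test. This is only a minimal sketch: the sample markup, the URLs in it, and the $needle variable are made up for illustration and are not from the question.

```php
<?php
// Hypothetical sample input; the markup and URLs are invented for this sketch.
$html = '<p><a href="http://www.sendspace.com/file/abc123">one</a>'
      . '<a href="http://example.org/page">two</a>'
      . '<a href="http://sendspace.com/file/def456">three</a></p>';

// Assumption: the word to look for. It must not contain a double quote,
// because it is spliced into the XPath expression below.
$needle = 'sendspace.com';

$dom = new DOMDocument;
// Collect libxml warnings internally instead of printing them;
// real-world HTML is rarely well-formed.
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

$urls = array();
// contains() is an XPath 1.0 function and is case-sensitive.
foreach ($xpath->query('//a[contains(@href, "' . $needle . '")]') as $anchor) {
    $urls[] = $anchor->getAttribute('href');
}

print_r($urls);
```

With this sample input, $urls should end up holding the two sendspace.com links and skip the example.org one. Note that the match is case-sensitive, so an href written as SendSpace.com would be missed.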


You can use a regular expression to match your word (or anything else) in the href, like so:

foreach ($links as $link) {
    if (preg_match("/example\.com/i", $link->getAttribute('href'))) {
        // do things here!
    }
}
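
For the second half of the question (hashing the matching links and saving the HTML), something along these lines might work. It is a sketch under assumptions: "convert to sha1" is read here as replacing each matching href with the SHA-1 hash of its URL, and the sample markup and the $needle variable are invented for illustration. For a fixed word, stripos() is used in place of a regex.

```php
<?php
// Hypothetical sample input; the markup and URLs are invented for this sketch.
$html = '<p><a href="http://www.sendspace.com/file/abc123">one</a> '
      . '<a href="http://example.org/page">two</a></p>';

// Assumption: the word that selects which links get hashed.
$needle = 'sendspace.com';

$dom = new DOMDocument;
// Collect libxml warnings internally instead of printing them.
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();

foreach ($dom->getElementsByTagName('a') as $link) {
    $href = $link->getAttribute('href');
    // stripos() is a case-insensitive substring search; it returns false
    // when the needle is absent, so compare strictly against false.
    if (stripos($href, $needle) !== false) {
        // Assumed interpretation: replace the URL with its SHA-1 hash.
        $link->setAttribute('href', sha1($href));
    }
}

// saveHTML() serializes the modified document back to an HTML string.
$result = $dom->saveHTML();
echo $result;
```

Because loadHTML() parses a full document, the string returned by saveHTML() includes a doctype and html/body wrappers around the original fragment. Links that do not contain the word are left unchanged.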