Extract all URLs from href attributes in PHP [duplicate]
I have an HTML document with many links. I am currently able to get all of the links, but I only want the ones that contain a certain word.
$dom = new DOMDocument;
$dom->loadHTML($html);
$links = $dom->getElementsByTagName('a');
foreach ($links as $link){
echo $link->getAttribute('href');
}
I want to list only the links that contain a certain word, for example: sendspace.com
The result would look more or less like this:
http://www.fileserve.com/file/eDpDMm9sad/
http://www.fileserve.com/file/7s83hjh347/
I would then convert these links to SHA-1 hashes. After the conversion, I want to save the HTML with the SHA-1 hashes applied to the links that contain the word.
Using phpQuery, you can traverse the DOM and find the anchors (<a>) whose href attribute contains what you want:
$dom = phpQuery::newDocument($htmlSource);
// "*=" matches a substring; "|=" only matches an exact value or a "value-" prefix
$anchors = $dom->find('a[href*=sendspace.com]');
$urls = array();
if($anchors) {
foreach($anchors as $anchor) {
$anchor = pq($anchor);
$urls[] = $anchor->attr('href');
}
}
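If phpQuery is not available, a similar substring filter can be sketched with PHP's built-in DOMXPath. This is a different technique from the phpQuery selector, and the sample $html string and the $needle variable are assumptions made only for illustration:

```php
<?php
// Hypothetical sample markup; replace with your own HTML.
$html = '<p><a href="http://www.sendspace.com/file/abc123">one</a> '
      . '<a href="http://www.example.org/page">two</a></p>';
$needle = 'sendspace.com'; // assumed search word (must not contain a double quote)

$dom = new DOMDocument;
// Collect libxml warnings internally instead of printing them;
// real-world HTML is often not well-formed.
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$urls = array();
// contains() in XPath 1.0 is a case-sensitive substring test.
foreach ($xpath->query('//a[contains(@href, "' . $needle . '")]') as $anchor) {
    $urls[] = $anchor->getAttribute('href');
}

print_r($urls);
```

Because contains() is case-sensitive, this variant would miss a link written as SendSpace.com; a case-insensitive check in PHP itself may be preferable in that situation.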
You can use regex to match your word (or whatever else) in the string like so:
foreach ($links as $link) {
if (preg_match("/example\.com/i", $link->getAttribute('href'))) {
// do things here!
}
}
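As a rough sketch of the remaining steps from the question (hash the matching links and save the HTML), the loop can rewrite each matching href and then serialize the document. It assumes that "convert to sha1" means replacing the href value with the SHA-1 hash of the original URL; the sample $html, the $needle variable, and the $hashed map are hypothetical names introduced for illustration:

```php
<?php
// Hypothetical sample markup; replace with your own HTML.
$html = '<html><body>'
      . '<a href="http://www.sendspace.com/file/abc123">keep</a>'
      . '<a href="http://www.example.org/page">skip</a>'
      . '</body></html>';
$needle = 'sendspace.com'; // assumed search word

$dom = new DOMDocument;
libxml_use_internal_errors(true); // tolerate imperfect real-world HTML
$dom->loadHTML($html);
libxml_clear_errors();

$hashed = array(); // original URL => SHA-1 hash
foreach ($dom->getElementsByTagName('a') as $link) {
    $href = $link->getAttribute('href');
    // stripos() is a case-insensitive substring check; no regex is needed
    // for a fixed word.
    if (stripos($href, $needle) !== false) {
        $hashed[$href] = sha1($href);
        // Assumption: the hash replaces the original href value.
        $link->setAttribute('href', $hashed[$href]);
    }
}

// Serialize the modified document back to an HTML string.
$newHtml = $dom->saveHTML();
echo $newHtml;
```

Keeping the $hashed map around makes it possible to look up the original URL for a given hash later, since SHA-1 cannot be reversed.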