Finding linked files with Hpricot

I've been playing around with Hpricot, but after a fair amount of searching, I've not been able to work this out.

I'm trying to parse an HTML page and find all tags with an href pointing to an mp3 file. So far I've got:

<ul>
    <% @page.search('//a[@href*=mp3]').each do |link| %>    
        <li>
            <%= link.inner_text %>
        </li>
    <% end %>
</ul>

This works fine, and so does the regex /href\s*=\s*\"([^\"]+)(.mp3)/. I'm just not sure how to combine the two.

Is there a good example, or some documentation, that someone could point me to so I can work out what the .search function can do?

Thanks


You can access the attribute href with

link.attr('href')

As a CSS3 selector, you might want to consider @href$=.mp3 (instead of *=), as it matches only attributes that end in .mp3.
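To make the difference between the two operators concrete, here is a rough sketch in plain Ruby. It does not call Hpricot at all: String#include? and String#end_with? stand in for *= and $=, and the href values and helper names are made up for illustration.

```ruby
# Hypothetical href values, chosen only to show the difference.
HREFS = ['songs/track.mp3', 'mp3-archive/index.html', 'cover.jpg'].freeze

# Roughly what *= does: keep hrefs that contain the substring anywhere.
def substring_matches(hrefs, needle)
  hrefs.select { |href| href.include?(needle) }
end

# Roughly what $= does: keep hrefs that end with the given suffix.
def suffix_matches(hrefs, suffix)
  hrefs.select { |href| href.end_with?(suffix) }
end

p substring_matches(HREFS, 'mp3')  # also picks up the mp3-archive page
p suffix_matches(HREFS, '.mp3')    # only the actual .mp3 file
```

The substring version also catches any link that merely has "mp3" somewhere in its path, which is why the suffix form is usually the safer choice here.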

Edit: You're right, sorry. I found out that attr is only an alias for set on Hpricot::Elements. The right way is indeed:

link.attributes['href']

Nevertheless, I would recommend Nokogiri as a faster substitute for Hpricot.


Found the answer. The method is attributes (not attr), and the brackets need to be square: link.attributes['href']
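For anyone landing here later, this is roughly how the pieces might fit together in a template. It is only a sketch: Link is a hypothetical stand-in for an Hpricot element, mimicking just the inner_text and attributes['href'] calls discussed above, so that it runs without the gem. In a real view the links would come from the .search call in the question, and render_links is a made-up helper name.

```ruby
require 'erb'

# Hypothetical stand-in for an Hpricot element; it only mimics the two
# calls used in this thread: inner_text and attributes['href'].
Link = Struct.new(:inner_text, :attributes)

# Same loop as in the question, but each item now links to its href.
TEMPLATE = <<~HTML
  <ul>
  <% links.each do |link| %>
    <li><a href="<%= link.attributes['href'] %>"><%= link.inner_text %></a></li>
  <% end %>
  </ul>
HTML

# Render the template with the given list of link-like objects.
def render_links(links)
  ERB.new(TEMPLATE, trim_mode: '<>').result_with_hash(links: links)
end

sample = [
  Link.new('One', { 'href' => 'songs/one.mp3' }),
  Link.new('Two', { 'href' => 'songs/two.mp3' })
]
puts render_links(sample)
```

Note that plain ERB does not escape the interpolated values, so untrusted href or link text should be escaped before it is rendered.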
