Finding linked files with Hpricot
I've been playing around with Hpricot, but after a fair amount of searching, I've not been able to work this out.
I'm trying to parse an HTML page and find all tags with an href pointing to an mp3 file. So far I've got
<ul>
<% @page.search('//a[@href*=mp3]').each do |link| %>
<li>
<%= link.inner_text %>
</li>
<% end %>
</ul>
which is working fine, and a regex, /href\s*=\s*\"([^\"]+)(.mp3)/
which also works. I'm just not sure how to combine the two.
Is there a good example, or documentation that someone could point me to, that would help me work out what I can do with the .search function?
Thanks
You can access the href attribute with
link.attr('href')
As a CSS3 selector, you might want to consider @href$=.mp3 (instead of *=), as it matches only attributes that end in .mp3.
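To make the difference between the two operators concrete, here is a minimal plain-Ruby sketch of the matching rules only. It does not call Hpricot at all; the sample hrefs and the method names contains_mp3 and ends_with_mp3 are made up for illustration.

```ruby
# Hypothetical hrefs, invented for this illustration.
HREFS = [
  'songs/track01.mp3',
  'mp3-archive/index.html',
  'covers/track01.jpg'
].freeze

# Rough plain-Ruby analogue of *= (the value contains the substring).
def contains_mp3(hrefs)
  hrefs.select { |href| href.include?('mp3') }
end

# Rough plain-Ruby analogue of $= (the value ends with the suffix).
def ends_with_mp3(hrefs)
  hrefs.select { |href| href.end_with?('.mp3') }
end

puts contains_mp3(HREFS).inspect
puts ends_with_mp3(HREFS).inspect
```

The substring version also picks up mp3-archive/index.html, which is not an mp3 file, while the suffix version keeps only the real track.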
Edit:
You're right, sorry. I found out that attr is only an alias for set on Hpricot::Elements. The right way is indeed:
link.attributes['href']
Nevertheless, I would like to recommend Nokogiri as a faster substitute for Hpricot.
Found the answer. The method is attributes (not attr), and the brackets need to be square: link.attributes['href']
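For anyone wanting to combine the search with the href check, here is a rough sketch rather than a definitive implementation. The helper name mp3_links is hypothetical. It assumes page is an Hpricot document like @page in the question, reuses the selector that already works there, and does the "ends in .mp3" check in Ruby instead of with the regex.

```ruby
# Hypothetical helper: collects the text and href of every link to an mp3 file.
# page is assumed to be an Hpricot document (anything that responds to
# #search and yields elements with #attributes and #inner_text).
def mp3_links(page, selector = '//a[@href*=mp3]')
  page.search(selector).each_with_object([]) do |link, found|
    # attributes['href'] is the accessor confirmed above; to_s guards against nil.
    href = link.attributes['href'].to_s
    # Keep only hrefs that really end in .mp3, not ones that merely contain "mp3".
    found << { text: link.inner_text, href: href } if href.end_with?('.mp3')
  end
end
```

In the view, you could then loop over mp3_links(@page) and print entry[:text] and entry[:href] inside each list item.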