Anemone: crawling only to a certain page depth
I don't understand how to use the Tentacle part of Anemone. If I am interpreting it right, I feel I could use it to crawl only pages within a certain depth of the root.
Anemone.crawl(start_url) do |anemone|
  # Tentacle.new(...)  <- I think something like this goes here, but it is not working
  anemone.on_every_page do |page|
    puts page.depth
    puts page.url
  end
end
I want it to go to a depth of 3 away from the root.
Here is what the rdoc says about Tentacle:
http://anemone.rubyforge.org/doc/index.html
Public Class methods
new(link_queue, page_queue, opts = {})
Create a new Tentacle
Public Instance methods
run()
Gets links from @link_queue, and returns the fetched Page objects into @page_queue
Thank you
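Got it :) Tentacle is not needed for this. The :depth_limit option to Anemone.crawl limits how far from the root the crawl goes. I used 1 here; use 3 for a depth of 3: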
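require 'anemone'

# :depth_limit keeps the crawl from following links beyond the given depth
Anemone.crawl(domain, :depth_limit => 1) do |anemone|
  anemone.storage = Anemone::Storage.MongoDB  # optional: store pages in MongoDB
  anemone.on_every_page do |page|
    puts page.url
    puts page.depth
  end
end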
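To see what a depth limit does during a crawl, here is a minimal sketch in plain Ruby. It is not Anemone's code: LINKS and crawl_depths are hypothetical names, and the link graph is made up for illustration. It assumes the root is at depth 0 and that links found on a page already at the limit are not followed.

```ruby
# Hypothetical in-memory link graph standing in for real pages
# (an assumption for illustration; a real crawler fetches these over HTTP).
LINKS = {
  "/"        => ["/a", "/b"],
  "/a"       => ["/a/1"],
  "/b"       => ["/a", "/b/1"],
  "/a/1"     => ["/a/1/x"],
  "/b/1"     => [],
  "/a/1/x"   => ["/a/1/x/y"],
  "/a/1/x/y" => []
}.freeze

# Breadth-first crawl that records each page's depth and does not follow
# links from pages that are already at depth_limit.
def crawl_depths(links, root, depth_limit)
  depths = { root => 0 }
  queue = [root]
  until queue.empty?
    url = queue.shift
    next if depths[url] >= depth_limit # at the limit: do not follow its links
    links.fetch(url, []).each do |link|
      next if depths.key?(link)        # already discovered at a shallower depth
      depths[link] = depths[url] + 1
      queue << link
    end
  end
  depths
end

# Print "depth url" for every page reached within 3 links of the root.
crawl_depths(LINKS, "/", 3).each { |url, depth| puts "#{depth} #{url}" }
```

With a limit of 3, "/a/1/x" (depth 3) is visited but its link to "/a/1/x/y" is never followed, which is the behaviour I wanted from the crawl.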