Anemone: crawling only to a certain page depth

I don't understand how to use the Tentacle part of Anemone. If I am interpreting it correctly, I think I could use it to crawl only pages within a certain depth of the root.

  Anemone.crawl(start_url) do |anemone|
    # tentacle.new (I think, but this is not working)
    anemone.on_every_page do |page|
      puts page.depth
      puts page.url
    end
  end

I want it to crawl to a depth of 3 from the root.

Here is what the RDoc says:

http://anemone.rubyforge.org/doc/index.html

Public Class methods
new(link_queue, page_queue, opts = {})
Create a new Tentacle

Public Instance methods
run()
Gets links from @link_queue, and returns the fetched Page objects into @page_queue

Thank you


Got it :) The :depth_limit option to Anemone.crawl does this:

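# :depth_limit caps how many links away from the start URL the crawl goes
# (1 here; use 3 for the depth asked about above).
Anemone.crawl(domain, :depth_limit => 1) do |anemone|
  # Optional: store pages in MongoDB instead of in memory.
  anemone.storage = Anemone::Storage.MongoDB
  anemone.on_every_page do |page|
    puts page.url
    puts page.depth
  end
end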
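
To make it clearer what a depth limit means, here is a minimal sketch that does not use Anemone at all. It walks a made-up in-memory link graph breadth-first and stops following links once a page sits at the limit. The LINKS hash and the crawl_depths helper are hypothetical names invented for this illustration, and this is not how Anemone is implemented internally; it only shows the idea of "depth" as the number of links followed from the root.

```ruby
# Hypothetical in-memory "site": each page maps to the pages it links to.
LINKS = {
  '/'      => ['/a', '/b'],
  '/a'     => ['/a/1'],
  '/b'     => ['/a', '/b/1'],
  '/a/1'   => ['/a/1/x'],
  '/b/1'   => [],
  '/a/1/x' => ['/']
}.freeze

# Breadth-first walk starting at +root+. Links on pages that are already
# at +depth_limit+ are not followed. Returns a hash of page => depth.
def crawl_depths(links, root, depth_limit)
  depths = { root => 0 }
  queue = [root]
  until queue.empty?
    page = queue.shift
    next if depths[page] >= depth_limit # at the limit: do not go deeper
    links.fetch(page, []).each do |target|
      next if depths.key?(target)       # already visited via a shorter path
      depths[target] = depths[page] + 1
      queue << target
    end
  end
  depths
end

# Print every page reachable within 3 links of the root, with its depth.
crawl_depths(LINKS, '/', 3).each { |url, depth| puts "#{depth} #{url}" }
```

With a limit of 1, only the root and the pages it links to directly are visited. Raising the limit to 3 also reaches pages two and three links away.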