
How can I identify and process all URLs in a text string?

I would like to enumerate all the URLs in a text string, for example:

text = "fasòls http://george.it sdafsda"

For each URL found, I want to invoke a function method(...) that transforms the string.

Right now I'm using a method like this:

require 'uri'

msg = ""
for i in text.split
  if (i =~ URI::regexp).nil?
    msg += " " + i
  else
    msg += " " + method(i)
  end
end
text = msg

This works, but it's slow for long strings. How can I speed this up?


I think "gsub" is your friend here:

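class UrlParser
  attr_accessor :text, :url_counter, :urls

  def initialize(text)
    @text = parse(text)
  end

  private

  # Replaces each URL with a numbered placeholder and collects the URLs.
  def parse(text)
    @url_counter = 0
    @urls = []
    # $1 is the leading whitespace (or start of string), $2 is the URL itself.
    text.gsub(%r{(\A|\s+)(http://[^\s]+)}) do
      @urls << $2
      "#{$1}#{replace_url($2)}"
    end
  end

  # Put whatever transformation of the URL you need here.
  def replace_url(url)
    @url_counter += 1
    "[#{@url_counter}]"
  end
end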

parsed_url = UrlParser.new("one http://x.com/url two")
puts parsed_url.text
puts parsed_url.urls
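
If you only need to run your own transformation on each URL, the same gsub idea can be reduced to a single call with a block. The following is a minimal sketch, not a definitive implementation: `URL_PATTERN` and `transform_urls` are hypothetical names, and the pattern assumes a URL is any run of non-whitespace characters starting with http:// or https://.

```ruby
# Assumption: a URL is "http://" or "https://" followed by non-whitespace.
URL_PATTERN = %r{https?://\S+}

# Hypothetical helper: replaces every URL in text with the block's result.
# The rest of the string, including its whitespace, is left untouched.
def transform_urls(text)
  text.gsub(URL_PATTERN) { |url| yield url }
end

text = "fasòls http://george.it sdafsda"

# Enumerate the URLs without changing the string.
p text.scan(URL_PATTERN)   # prints ["http://george.it"]

# Transform each URL in a single pass over the string.
puts transform_urls(text) { |url| "<#{url}>" }   # prints fasòls <http://george.it> sdafsda
```

Unlike the split-and-rejoin loop in the question, this keeps the original spacing and builds the result in one pass instead of creating a new string for every word. Note that `\S+` will also swallow punctuation that directly follows a URL, so tighten the pattern if your text contains URLs at the end of sentences.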

If you really need extra-fast parsing of long strings, you should build a Ruby C extension with Ragel.
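
Before going that far, it may be worth measuring whether gsub is already fast enough for your input. Here is a rough sketch using the standard Benchmark module; `URL_RE`, `shorten`, `with_split` and `with_gsub` are made-up names, `shorten` stands in for your own method(...), and the sample text is synthetic, so the timings only indicate a trend on your machine.

```ruby
require 'benchmark'

# Assumption: a URL is "http://" or "https://" followed by non-whitespace.
URL_RE = %r{https?://\S+}

# Stand-in for the question's method(...): any URL transformation will do.
def shorten(url)
  "[#{url.length}]"
end

# Word-by-word approach, similar to the loop in the question.
def with_split(text)
  msg = ""
  text.split.each do |word|
    msg << " " << (word =~ URL_RE ? shorten(word) : word)
  end
  msg
end

# Single-pass approach using gsub with a block.
def with_gsub(text)
  text.gsub(URL_RE) { |url| shorten(url) }
end

# Synthetic input: 10,000 repetitions of a short phrase containing one URL.
sample = ("lorem http://example.com/page ipsum " * 10_000).strip

Benchmark.bm(8) do |bm|
  bm.report("split:") { with_split(sample) }
  bm.report("gsub:")  { with_gsub(sample) }
end
```

If the gsub version is still too slow on your real data, that is the point at which a C extension starts to make sense.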
