How can I identify and process all URLs in a text string?
I would like to enumerate all the URLs in a text string, for example:
text = "fasòls http://george.it sdafsda"
For each URL found, I want to invoke a function method(...)
that transforms the string.
Right now I'm using a method like this:
msg = ""
for i in text.split
  if (i =~ URI::regexp).nil?
    msg += " " + i
  else
    msg += " " + method(i)
  end
end
text = msg
This works, but it's slow for long strings. How can I speed this up?
I think "gsub" is your friend here:
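class UrlParser
  attr_accessor :text, :url_counter, :urls

  def initialize(text)
    @text = parse(text)
  end

  private

  # Replaces every URL in one gsub pass and collects the originals in @urls.
  def parse(text)
    @url_counter = 0
    @urls = []
    # $1 is the leading whitespace (or start of string), $2 is the URL itself.
    text.gsub(%r{(\A|\s+)(http://\S+)}) do
      @urls << $2
      "#{$1}#{replace_url($2)}"
    end
  end

  # Example transformation: swap each URL for a numbered placeholder.
  def replace_url(url)
    @url_counter += 1
    "[#{@url_counter}]"
  end
end

parsed_url = UrlParser.new("one http://x.com/url two")
puts parsed_url.text   # one [1] two
puts parsed_url.urls   # http://x.com/url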
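
If you don't need the class and only want to run your own function over each URL, the same idea fits in a few lines. This is a minimal sketch, not a drop-in replacement: transform_url is a hypothetical stand-in for your method(...), and URL_PATTERN is a deliberately simple pattern I made up for illustration (it is looser than URI::regexp and only covers http and https). The point is that the regex is built once and gsub walks the string a single time, instead of splitting into words and growing msg with +=.

```ruby
# Build the pattern once, outside any loop.
# Assumption: a "URL" is http:// or https:// followed by non-whitespace.
URL_PATTERN = %r{https?://\S+}

# Hypothetical stand-in for the asker's method(...).
def transform_url(url)
  "<#{url}>"
end

# Replace every URL with the result of transform_url in a single pass.
# Unlike split plus rejoin, gsub leaves the original whitespace untouched.
def process_urls(text)
  text.gsub(URL_PATTERN) { |url| transform_url(url) }
end

puts process_urls("fasòls http://george.it sdafsda")
# prints: fasòls <http://george.it> sdafsda
```

Note that a pattern this simple will also swallow trailing punctuation such as a full stop right after the URL, so tighten it if your input needs that.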
If you really need extra-fast parsing of long strings, you should build a Ruby C extension with Ragel.