How to count the number of occurrences of URLs and hyperlinks in a Ruby string?
Say a user submits this comment on a blog:
@SO - Great community, but we've also seen some great communities at Stack Overflow. At the same time Google's Gmail (http://gmail.com) is a great example of a community with endless bounds. I'm just wondering if anyone will really go toe-to-toe with something like http://www.twitter.com. What do you think?
Note: the 3rd url was actually posted as plain text, but SO converted it to a hyperlink.
Anyways, the total url and hyperlink count should be 3.
So, from a Ruby and/or Ruby on Rails perspective: how can I count the number of occurrences of URLs and hyperlinks in a Ruby string?
This is pretty easy, albeit relatively naive:
string.scan("http://").size  # not String#count, which counts characters from a set rather than substring occurrences
Of course, it won't pick up links without a leading "http://", but that might be a reasonable assumption.
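To cover the case mentioned above, the match can be widened a little. The following is only a sketch: the constant LINK_START and the method count_link_starts are made-up names, and it assumes that links without a scheme at least start with "www.". Anything looser (for example a bare "gmail.com") needs a real URL pattern.

```ruby
# Hypothetical helper, not part of Ruby or Rails.
# Matches "http://", "https://", or a bare "www." prefix. The negative
# lookbehind keeps the "www." in "http://www.twitter.com" from being
# counted a second time.
LINK_START = %r{https?://|(?<![/\w.])www\.}i

def count_link_starts(text)
  # With no capture groups, scan returns one array entry per match.
  text.scan(LINK_START).size
end

comment = "Gmail (http://gmail.com), http://www.twitter.com and www.example.org"
puts count_link_starts(comment)  # prints 3
```

This still counts prefixes rather than validated URLs, so a stray "www." in prose would be counted as well.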
The easiest way is to scan for the "http" pattern, but it can get more complicated, because URLs sometimes lack the "http://" at the beginning:
string = "@SO - Great community, but we've also seen some great communities at <a href='http://blabla'>Stack Overflow</a>. At the same time Google's Gmail (http://gmail.com) is a great example of a community with endless bounds. I'm just wondering if anyone will really go toe-to-toe with something like http://www.twitter.com. What do you think?"
string.scan(/http/).size #=> 3
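One caveat with scanning for "http": a hyperlink whose visible text is itself a URL is counted twice. A rough way around that, sketched here under the assumption that hyperlinks appear as simple <a href=...>...</a> tags, is to count the anchors first and then count the bare URLs in whatever is left. The names ANCHOR, BARE_URL and count_links are made up for the example; for arbitrary HTML a real parser would be the safer choice.

```ruby
# Hypothetical helper, not part of Ruby or Rails.
# Counts <a href=...> hyperlinks plus bare URLs, without counting a URL
# twice when it sits inside an anchor tag.
ANCHOR   = %r{<a\s[^>]*href\s*=[^>]*>.*?</a>}im
BARE_URL = %r{https?://[^\s<>"')]+}i

def count_links(text)
  anchors   = text.scan(ANCHOR).size
  remaining = text.gsub(ANCHOR, " ")  # drop anchors so their href is not recounted
  anchors + remaining.scan(BARE_URL).size
end

sample = "See <a href='http://blabla'>Stack Overflow</a>, Gmail (http://gmail.com) and http://www.twitter.com."
puts count_links(sample)  # prints 3
```

For input such as <a href="http://a.example">http://a.example</a>, this counts one link where scanning for "http" would report two.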
Using regular expressions is a good way. Here is an example of how to do that:
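# A String has no #each in Ruby 1.9+, so split the post into words first.
# %r{...} avoids having to escape the slashes inside the pattern.
url_pattern = %r{^(((ht|f)tps?://)|~/|/)?([a-zA-Z]{1}([\w\-]+\.)+([\w]{2,5})(:[\d]{1,5})?)/?(\w+\.[\w]{3,4})?((\?\w+=\w+)?(&\w+=\w+)*)?}
yourpost.split.each do |yourword|
  if yourword =~ url_pattern
    puts "Found URL #{$&} in #{yourword}"
  end
end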
See this post for further discussion on regular expressions matching URLs.