Regex Valid Twitter Mention
I'm trying to find a regex that matches only if a tweet is a true mention. To count as a mention, the string can't start with "@", can't contain "RT" (case insensitive), and the "@" must start a word.
Some examples, with the desired output given in the comments:
function search($strings, $regexp) {
    foreach ($strings as $string) {
        echo "Sentence: \"$string\" <- " .
            (preg_match($regexp, $string) ? "MATCH" : "NO MATCH") . "\n";
    }
}
$strings = array(
    "Hi @peter, I like your car ", // <- MATCH
    "@peter I don't think so!", // <- NO MATCH: the string starts with @, so it's a reply
    "Helo!! :@ how are you!", // <- NO MATCH: it's not a word, we need @(word)
    "Yes @peter i'll eat them this evening! RT @peter: hey @you, do you want your pancakes?", // <- NO MATCH: "RT"/"rt" appears in the string, so it's a retweet
    "Helo!! ineed@aser.com how are you!", // <- NO MATCH: the @ doesn't start a word
    "@peter is the best friend you could imagine. RT @juliet: @you do you know if @peter it's awesome?" // <- NO MATCH: starts with @ (a reply) and contains RT
);
echo "Example 1:\n";
search($strings, "/(?:[[:space:]]|^)@/i");
Current output:
Example 1:
Sentence: "Hi @peter, I like your car " <- MATCH
Sentence: "@peter I don't think so!" <- MATCH
Sentence: "Helo!! :@ how are you!" <- NO MATCH
Sentence: "Yes @peter i'll eat them this evening! RT @peter: hey @you, do you want your pancakes?" <- MATCH
Sentence: "Helo!! ineed@aser.com how are you!" <- MATCH
Sentence: "@peter is the best friend you could imagine. RT @juliet: @you do you know if @peter it's awesome?" <- MATCH
EDIT:
I need it as a regex because it may also be used in MySQL and other languages. I am not looking for the usernames; I only want to know whether the string is a mention or not.
This regexp might work a bit better: /\B\@([\w\-]+)/gim
Here's a jsFiddle example of it in action: http://jsfiddle.net/2TQsx/96/
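To make the behaviour of that pattern concrete, here is a minimal sketch in JavaScript. The helper name extractUsernames is hypothetical and only for illustration. Note that this pattern extracts usernames; it does not reject replies (strings starting with "@") or retweets.

```javascript
// Pattern copied from the answer above. The g flag lets matchAll find every mention.
const mentionPattern = /\B\@([\w\-]+)/gim;

// Hypothetical helper: returns the usernames (without the "@") found in a text.
function extractUsernames(text) {
  return Array.from(text.matchAll(mentionPattern), (m) => m[1]);
}

console.log(extractUsernames("Hi @peter, I like your car "));        // → ["peter"]
console.log(extractUsernames("Helo!! ineed@aser.com how are you!")); // → [] (the "@" is inside a word)
console.log(extractUsernames("Helo!! :@ how are you!"));             // → [] (nothing follows the "@")
console.log(extractUsernames("@peter I don't think so!"));           // → ["peter"] (replies still match)
```

The \B before the "@" is what rejects e-mail-like text: it only matches where the character before the "@" is not a word character.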
Here's a regex that should work:
/^(?!.*\bRT\b)(?:.+\s)?@\w+/i
Explanation:
/^ //start of the string
(?!.*\bRT\b) //Verify that RT does not appear as a whole word anywhere in the string.
(?:.+\s)? //Optionally match any chars followed by whitespace before the @.
//Note: (?: ) makes the group non-capturing.
@\w+ //Find @ followed by one or more word chars.
/i //Make it case insensitive.
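As a quick check, the regex can be run against the six sample tweets from the question. The sketch below is in JavaScript; the helper mismatches and the second pattern noLeadingAt are assumptions added for illustration and are not part of the answer. Because the whitespace group is optional, a string that begins with "@" still matches the posted regex, so the variant adds a (?!@) lookahead at the start.

```javascript
// The six sample tweets from the question, each paired with the desired result.
const samples = [
  ["Hi @peter, I like your car ", true],
  ["@peter I don't think so!", false],
  ["Helo!! :@ how are you!", false],
  ["Yes @peter i'll eat them this evening! RT @peter: hey @you, do you want your pancakes?", false],
  ["Helo!! ineed@aser.com how are you!", false],
  ["@peter is the best friend you could imagine. RT @juliet: @you do you know if @peter it's awesome?", false],
];

// Regex exactly as posted in this answer.
const posted = /^(?!.*\bRT\b)(?:.+\s)?@\w+/i;

// Assumed variant (not from the answer): (?!@) rejects strings that begin with "@".
const noLeadingAt = /^(?!@)(?!.*\bRT\b)(?:.+\s)?@\w+/i;

// Hypothetical helper: lists the sample tweets where a regex disagrees with the desired result.
function mismatches(regex) {
  return samples
    .filter(([tweet, wanted]) => regex.test(tweet) !== wanted)
    .map(([tweet]) => tweet);
}

console.log(mismatches(posted));      // → ["@peter I don't think so!"]
console.log(mismatches(noLeadingAt)); // → []
```

Both patterns rely on lookaheads, so they only carry over to engines that support them.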
I have found that this is the best way to find mentions inside a string in JavaScript. I don't know exactly how I would handle the RTs, but I think this might help with part of the problem.
var str = "@jpotts18 what is up man? Are you hanging out with @kyle_clegg";
var pattern = /@[A-Za-z0-9_-]*/g;
str.match(pattern); // ["@jpotts18", "@kyle_clegg"]
I guess something like this will do it:
^(?!.*?RT\s).+\s@\w+
Roughly translated to:
At the beginning of the string, look ahead to check that RT followed by whitespace is not present, then match one or more characters followed by whitespace, an @, and at least one letter, digit or underscore.
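A small JavaScript sketch of how this pattern behaves. The pattern is written without flags, as posted; the case-insensitive copy is an assumption added here because the question asks for "RT" to be matched case-insensitively. Since the lookahead has no word boundary before RT, the case-insensitive version also rejects any word ending in "rt" that is followed by whitespace.

```javascript
// Pattern as posted (case sensitive).
const pattern = /^(?!.*?RT\s).+\s@\w+/;

// Assumed case-insensitive copy, to cover lowercase "rt" as the question requires.
const patternI = /^(?!.*?RT\s).+\s@\w+/i;

console.log(pattern.test("Hi @peter, I like your car "));  // → true
console.log(pattern.test("@peter I don't think so!"));     // → false (no whitespace before the "@")
console.log(pattern.test("nice one rt @peter: hello"));    // → true  (lowercase "rt" slips through)
console.log(patternI.test("nice one rt @peter: hello"));   // → false
console.log(patternI.test("Let's start @peter"));          // → false ("start " ends in "rt ")
```

Because .+\s must consume at least one character plus whitespace before the "@", a tweet that starts with "@" and has no later mention is rejected without any extra check.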
Twitter has published the regex they use in their twitter-text library. They have other language versions posted as well on GitHub.
A simple pattern that works correctly even when the scraping tool has appended special characters: (?<![\w])@[\S]*\b. This worked for me.
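A minimal JavaScript sketch of that pattern, assuming an engine with lookbehind support (ES2018 or later). The helper name findMentions is hypothetical. As with the other extraction patterns in this thread, it does not apply the "no RT" or "must not start with @" rules on its own.

```javascript
// Pattern from the answer above, with the g flag added to collect every match.
const mention = /(?<![\w])@[\S]*\b/g;

// Hypothetical helper: returns all mentions found in a text, or an empty array.
function findMentions(text) {
  return text.match(mention) || [];
}

console.log(findMentions("Hi @peter, I like your car "));        // → ["@peter"]
console.log(findMentions("hey @you, and @peter!"));              // → ["@you", "@peter"]
console.log(findMentions("Helo!! ineed@aser.com how are you!")); // → [] (a word char precedes the "@")
console.log(findMentions("Helo!! :@ how are you!"));             // → [] (no word follows the "@")
```

The trailing \b is what trims punctuation such as "," or "!" from the end of a match: [\S]* first grabs the whole run of non-space characters, then backtracks to the last word boundary.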