Regular expression for counting sentences in a text without treating the dot in an email address as a sentence end
I have a sentence like: My email address is xxxx@xxx.xxx.
My Java regExp is ".+?[\.\?\!]+". This regExp sees two sentences: "My email address is xxxx@xxx." and "xxx."
How do I modify my regExp so that it doesn't count the dots in an email address as sentence ends?
You can't do this reliably. Consider this example:
My email is someone@subdomain.com.hi!
Is the email someone@subdomain.com.hi or just someone@subdomain.com?
The only thing you can do, if your texts are written by literate people, is detect a sentence ending as one of '.', '?', '!' (or maybe others) plus some space.
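A minimal Java sketch of that rule, assuming a sentence ends at one or more of '.', '?', '!' followed by whitespace or by the end of the text. The class and method names (SentenceCounter, countSentences) are made up for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SentenceCounter {
    // One or more terminators (., ?, !) that are followed by whitespace
    // or by the end of the input. The lookahead keeps the whitespace
    // out of the match itself.
    private static final Pattern SENTENCE_END = Pattern.compile("[.?!]+(?=\\s|$)");

    // Counts sentence endings under the assumption stated above.
    static int countSentences(String text) {
        Matcher m = SENTENCE_END.matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // The dot inside the address is followed by a letter, so it is skipped.
        System.out.println(countSentences("My email address is xxxx@xxx.xxx."));
        System.out.println(countSentences("Is it you? Yes! Write to a.b@example.com. Thanks."));
    }
}
```

This is still only a heuristic: an abbreviation such as "e.g." followed by a space is counted as a sentence end, and text with no space after the punctuation is not split at all.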
And now to ruin all hope. What about this text? How many sentences does it have?
He shouted "Freeze!", and then pulled out his gun.
With proper writing, that is, with a space left after each punctuation mark, you can look for each \.\s
Decide what constitutes the end of a sentence. I'd probably use a '.' followed by a space, a tab, or the end of the line.
Actually, I am not clear about your question. If you are looking for a regular expression for detecting a valid e-mail address, use this:
pattern=/^[a-zA-Z0-9_.-]+@[a-zA-Z0-9_.-]+\.[a-zA-Z]{2,4}$/
Example:
- name@gmail.co.uk
- name1.name2@gmail.com
- name1_name2@hotmail.co.ir
- etc.
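Since the question is about Java, here is a rough sketch of how such a pattern could be used there. The names EmailCheck and looksLikeEmail are hypothetical, and the hyphen is placed last in each character class so that it is read as a literal character rather than as a range:

```java
import java.util.regex.Pattern;

class EmailCheck {
    // Local part, '@', domain, then a final dot and a 2 to 4 letter suffix.
    // The hyphen comes last in each class so it is a literal, not a range.
    private static final Pattern EMAIL =
            Pattern.compile("^[a-zA-Z0-9_.-]+@[a-zA-Z0-9_.-]+\\.[a-zA-Z]{2,4}$");

    // Returns true if the whole string matches the simplified pattern.
    static boolean looksLikeEmail(String s) {
        return EMAIL.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(looksLikeEmail("name@gmail.co.uk"));
        System.out.println(looksLikeEmail("name1_name2@hotmail.co.ir"));
        System.out.println(looksLikeEmail("not an email"));
    }
}
```

This is a simplification, not a full address validator: for example, it rejects suffixes longer than four letters and local parts containing '+'.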