Regex to match repeated consonant
How can I detect with a regular expression if the same consonant is repeated three or more times?
My idea is to match words like tttool, likkke, or likkkkke.
Try this:
([b-df-hj-np-tv-z])\1{2,}
Explanation:
[b-df-hj-np-tv-z] is the set of all consonants
\1 is the back-reference to the 1st group (i.e. the same character)
{2,} means "2 or more of the preceding term", making 3 or more in all
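As a quick check of that pattern, here is a minimal sketch in Python. The language is my choice for illustration, since the question does not name one, and the helper name has_tripled_consonant is hypothetical.

```python
import re

# Same pattern as above: one lowercase consonant, then that same
# character at least two more times (three or more in total).
TRIPLED = re.compile(r"([b-df-hj-np-tv-z])\1{2,}")

def has_tripled_consonant(word):
    """Return True if some consonant occurs three or more times in a row."""
    return TRIPLED.search(word) is not None

for word in ["tttool", "likkke", "likkkkke", "tool", "looove"]:
    print(word, has_tripled_consonant(word))
```

Note that the class only lists lowercase letters, so an input such as TTTool is not matched unless a case-insensitive flag is added.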
See live demo.
This is about the shortest regex I could think of to do it:
(?i)([b-z&&[^eiou]])\1\1+
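This uses character class intersection (&& combined with a negated class, which amounts to subtraction) to exclude vowels. Java's regex engine supports this syntax, but not every regex flavor does.
I didn't have to mention "a" because I started the range from "b".
Using (?i) makes the regex case insensitive.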
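The && class syntax is not available everywhere. Python's re module, for one, lacks it. A rough equivalent, sketched here as an assumption rather than as this answer's method, swaps the intersection for a negative lookahead. The names PATTERN and find_runs are hypothetical.

```python
import re

# Assumed stand-in for (?i)([b-z&&[^eiou]])\1\1+ :
# the lookahead (?![eiou]) rejects vowels before [b-z] consumes a letter.
PATTERN = re.compile(r"(?i)((?![eiou])[b-z])\1\1+")

def find_runs(text):
    """Return every run of three or more identical consonants, ignoring case."""
    return [m.group(0) for m in PATTERN.finditer(text)]

print(find_runs("A TTTool I likkke, sooo much"))
```

Because (?i) applies to the back-reference as well, a mixed-case run such as tTt is treated as one run.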
See a live demo.
There may be shortcuts in certain regex libraries, but you can always spell out every consonant:
b{3,}|c{3,}|d{3,}...
Some libraries, for example, let you match using a back-reference, which may be a tad cleaner:
([bcd...])\1{2,}
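The spelled-out alternation does not have to be typed by hand. Here is a minimal Python sketch that generates it, assuming ASCII letters only and treating y as a consonant. The names CONSONANTS, ALTERNATION and first_run are hypothetical.

```python
import re
import string

# Every ASCII lowercase letter that is not a, e, i, o or u
# (y is kept as a consonant here, which is an assumption).
CONSONANTS = [c for c in string.ascii_lowercase if c not in "aeiou"]

# Builds "b{3,}|c{3,}|d{3,}|..." with no back-reference anywhere.
ALTERNATION = re.compile("|".join(c + "{3,}" for c in CONSONANTS))

def first_run(word):
    """Return the first run of 3+ identical consonants, or None."""
    m = ALTERNATION.search(word)
    return m.group(0) if m else None

print(first_run("likkkkke"))
```

This form is longer than the back-reference version but should work in engines that do not support back-references.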
The regex from the answer above, ([b-df-hj-np-tv-z])\1{2,}, has a mistake: "y" is forgotten, so it is matched as a consonant.
If "y" should count as a vowel, it should be ([b-df-hj-np-tv-xz])\1{2,}
I'd personally solve this in reverse; instead of using [b-df-hj-np-tv-z], I'd go with the double-negative, [^\W\d_aeiou].
/([^\W\d_aeiou])\1\1+/i
This has a character class that uses a double negative: match anything except a non-word-character, a digit, an underscore, or a vowel. (The \d is needed because \w also covers digits; without it, a run like 111 would match.) Ignoring non-ASCII letters, only consonants can match this. Saving a match, the regex then seeks a match of that same consonant (case-insensitive), then one or more again, which brings us to 3+ consecutive identical consonants.
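One detail is worth checking with a negated class like this: \w covers digits as well as letters, so digits stay in the class unless \d is excluded too. The Python sketch below compares the two variants. The names WITH_DIGITS and LETTERS_ONLY are hypothetical, and re.ASCII is an assumption that keeps accented letters out of scope.

```python
import re

# re.ASCII restricts \w to [a-zA-Z0-9_]; re.IGNORECASE plays the role of /i.
FLAGS = re.IGNORECASE | re.ASCII

WITH_DIGITS = re.compile(r"([^\W_aeiou])\1\1+", FLAGS)     # digits still allowed
LETTERS_ONLY = re.compile(r"([^\W\d_aeiou])\1\1+", FLAGS)  # digits excluded too

for sample in ["likkke", "LiKkKe", "room 1112", "a___b"]:
    print(sample,
          bool(WITH_DIGITS.search(sample)),
          bool(LETTERS_ONLY.search(sample)))
```

Both variants accept likkke and LiKkKe and reject the run of underscores. Only the first one reports a match for the digits in room 1112.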
You can use capture groups with back-references. This will capture any word character repeated three or more times (swap \w for a consonant class to restrict it to consonants):
/(
(\w)  # second group is just one symbol
\2    # match the symbol found in the second group
\2+   # match the same symbol one or more times
)/x   # x just allows whitespace and inline comments
But not all regexp engines support back-references.
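Where back-references are unavailable, one fallback is to skip the regex entirely. The sketch below is a swapped-in technique, not something proposed in this thread: it groups adjacent equal characters with Python's itertools.groupby. The function name consonant_runs and the ASCII-only vowel set are assumptions.

```python
from itertools import groupby

VOWELS = set("aeiou")

def consonant_runs(text, min_len=3):
    """Yield runs of at least min_len identical ASCII consonants, ignoring case."""
    # groupby clusters adjacent characters that share the same lowercase form.
    for key, group in groupby(text, key=str.lower):
        run = "".join(group)
        is_consonant = key.isascii() and key.isalpha() and key not in VOWELS
        if is_consonant and len(run) >= min_len:
            yield run

print(list(consonant_runs("TtTool likkke sooo 1112")))
```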