Regex to match repeated consonant

How can I detect with a regular expression if the same consonant is repeated three or more times?

My idea is to match words like tttool, likkke, or likkkkke


Try this:

([b-df-hj-np-tv-z])\1{2,}

Explanation:

  • [b-df-hj-np-tv-z] are all the consonants
  • \1 is the back-reference to the 1st group (i.e. the same character)
  • {2,} means "2 or more of the preceding term", making 3 or more in all
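
As a rough illustration, here is how that pattern could be tried out in Python. The function name has_tripled_consonant is made up for this sketch, and the pattern is case-sensitive exactly as written above.

```python
import re

# Consonant class from the answer above (it treats "y" as a consonant).
REPEATED_CONSONANT = re.compile(r"([b-df-hj-np-tv-z])\1{2,}")

def has_tripled_consonant(word):
    """Return True if any lowercase consonant occurs 3+ times in a row."""
    return REPEATED_CONSONANT.search(word) is not None

for word in ["tttool", "likkke", "likkkkke", "tool", "like", "aaah"]:
    print(word, has_tripled_consonant(word))
```

Only the first three words should report True; "aaah" is rejected because the repeated letter is a vowel.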



This is about the shortest regex I could think of to do it:

(?i)([b-z&&[^eiou]])\1\1+

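This uses regex character class subtraction to exclude vowels: the && syntax (supported by Java, among other engines) intersects b-z with the negated class [^eiou].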
I didn't have to mention "a" because I started the range from "b".
Using (?i) makes the regex case insensitive.
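
Python's re module has no && operator, so the pattern above will not work there as written. A possible substitute, sketched below, gets the same effect with a negative lookahead instead of class subtraction; this is a swapped-in technique, not the answer's own, and find_runs is a hypothetical helper name.

```python
import re

# (?![aeiou]) rejects vowels before [b-z] consumes a letter, which mimics
# the class subtraction; (?i) makes the whole pattern case-insensitive.
PATTERN = re.compile(r"(?i)(?![aeiou])([b-z])\1\1+")

def find_runs(text):
    """Return every run of 3+ identical consonants found in text."""
    return [m.group(0) for m in PATTERN.finditer(text)]

print(find_runs("tttool LIKKKE likkkkke food"))
```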



There may be shortcuts in certain regex libraries, but you can always spell out one alternative per consonant:

b{3,}|c{3,}|d{3,}...
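
Writing that alternation out by hand is tedious, so it could be generated instead. The sketch below assumes "y" counts as a consonant; the names CONSONANTS and ALTERNATION are illustrative only.

```python
import re

# Assumption: "y" counts as a consonant here; drop it from the string if not.
CONSONANTS = "bcdfghjklmnpqrstvwxyz"

# Produces "b{3,}|c{3,}|d{3,}|..." with one alternative per consonant.
ALTERNATION = "|".join(f"{c}{{3,}}" for c in CONSONANTS)
PATTERN = re.compile(ALTERNATION)

print(ALTERNATION[:17])
print(bool(PATTERN.search("likkke")), bool(PATTERN.search("like")))
```

Because it needs no back-reference, this form should also carry over to engines that lack them.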

Some libraries, for example, let you match using a back-reference, which may be a tad cleaner:

([bcd...])\1{2,}


The regex from the answer above, ([b-df-hj-np-tv-z])\1{2,}, has a mistake ("y" is forgotten, so the range v-z still matches it as a consonant).

It should be ([b-df-hj-np-tv-xz])\1{2,}
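
To see the difference between the two classes, a quick comparison in Python might look like this (the constant names WITH_Y and WITHOUT_Y are invented for the sketch):

```python
import re

WITH_Y = re.compile(r"([b-df-hj-np-tv-z])\1{2,}")      # "y" is inside v-z
WITHOUT_Y = re.compile(r"([b-df-hj-np-tv-xz])\1{2,}")  # v-x plus z skips "y"

for word in ["heyyy", "buzzz"]:
    print(word, bool(WITH_Y.search(word)), bool(WITHOUT_Y.search(word)))
```

Only "heyyy" should be treated differently by the two patterns; whether that is desirable depends on whether "y" is meant to count as a consonant.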


I'd personally solve this in reverse; instead of using [b-df-hj-np-tv-z], I'd go with the double negative, [^\W\d_aeiou].

/([^\W\d_aeiou])\1\1+/i

This character class uses a double negative: match anything except a non-word character, a digit, an underscore, or a vowel. The \d is needed because \w covers digits as well as letters. Ignoring non-ASCII vowels, only consonants can match this. Having captured a match, the regex then seeks that same consonant again (case-insensitively), then one or more times after that, which brings us to 3+ consecutive occurrences of the same consonant.
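
A minimal Python sketch of the double-negative idea follows. It uses the class [^\W\d_aeiou], which also excludes digits (otherwise a run like "1111" would match, since \w covers digits), and it assumes re.ASCII is wanted so that accented letters are not treated as consonants. The helper name first_run is hypothetical.

```python
import re

# re.ASCII restricts \w and \d to ASCII, so letters such as "é" fall under \W
# and are excluded; re.IGNORECASE makes the vowel exclusion cover "AEIOU" too.
PATTERN = re.compile(r"([^\W\d_aeiou])\1\1+", re.IGNORECASE | re.ASCII)

def first_run(text):
    """Return the first run of 3+ identical consonants, or None."""
    match = PATTERN.search(text)
    return match.group(0) if match else None

for text in ["likkke", "TTTool", "1111", "a___b", "AAAA"]:
    print(text, first_run(text))
```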


You can use capture groups with back-references. This will capture repeating symbols:

/(
   ([\w])        ## second group is just one symbol
   \2            ## match symbol found in second group
   \2+           ## match same symbol one or more times
)/x              ## x is just to allow inner comments

But not all regexp engines support back-references (RE2 and Go's regexp package, for example, do not).
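
For reference, the commented pattern above could be written in Python roughly as follows, assuming re.VERBOSE as the counterpart of the x flag. Note that \w matches any word character, so digits and vowels count as "symbols" here too.

```python
import re

PATTERN = re.compile(
    r"""
    (            # group 1: the whole run
      (\w)       # group 2: a single word character
      \2         # the same character again
      \2+        # and one or more further repeats
    )
    """,
    re.VERBOSE,
)

match = PATTERN.search("likkke 111")
print(match.group(1), match.group(2))
```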
