Is there a more concise regular expression to accomplish this task?

2023-01-01 04:54

First off, sorry for the lame title, but I couldn't think of a better one. I need to test a password to ensure the following:

Passwords must contain at least 3 of the following:

upper case letters
lower case letters
numbers
special characters

Here's what I've come up with (it works, but I'm wondering if there is a better way to do this):

    Dim lowerCase As New Regex("[a-z]")
    Dim upperCase As New Regex("[A-Z]")
    Dim numbers As New Regex("\d")
    Dim special As New Regex("[\\\.\+\*\?\^\$\[\]\(\)\|\{\}\/\'\#]")

    Dim count As Int16 = 0

    If Not lowerCase.IsMatch(txtUpdatepass.Text) Then
        count += 1
    End If
    If Not upperCase.IsMatch(txtUpdatepass.Text) Then
        count += 1
    End If
    If Not numbers.IsMatch(txtUpdatepass.Text) Then
        count += 1
    End If
    If Not special.IsMatch(txtUpdatepass.Text) Then
        count += 1
    End If

If fewer than 3 of the criteria are met (that is, count ends up at 2 or more), I handle it. I'm not well versed in regular expressions and have been reading numerous tutorials on the web. Is there a way to combine all 4 regexes into one? But I guess doing that would not allow me to check if at least 3 of the criteria are met.

On a side note, is there a site that has an exhaustive list of all characters that would need to be escaped in the regex (those that have special meaning - eg. $, ^, etc.)?

As always, TIA. I can't express enough how awesome I think this site is.

The way you have it is about as good as it can get. You could accomplish this all in one line, but it would be terribly obfuscated and wouldn't really help at all.

Think about what you're trying to do; you want to check for four different criteria. Since each criterion is essentially a single comparison, you'll want to check for each one individually, which is what you're doing.
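As a rough illustration of checking each criterion on its own and counting how many are satisfied, here is a minimal sketch. It is written in Python rather than the VB.NET of the question, and the names `CRITERIA`, `count_classes` and `is_acceptable` are made up for the example; the special-character set mirrors the one in the question.

```python
import re

# One pattern per criterion, mirroring the four regexes in the question.
CRITERIA = {
    "lower": re.compile(r"[a-z]"),
    "upper": re.compile(r"[A-Z]"),
    "digit": re.compile(r"\d"),
    "special": re.compile(r"[\\.+*?^$\[\]()|{}/'#]"),
}


def count_classes(password):
    """Return how many of the four criteria the password satisfies."""
    return sum(1 for rx in CRITERIA.values() if rx.search(password))


def is_acceptable(password, required=3):
    """True when at least `required` of the criteria are met."""
    return count_classes(password) >= required
```

Counting the criteria that are met (instead of those that are missed) makes the "at least 3" rule read directly off the final comparison.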

I think your approach is a sensible one here. It's clear exactly what you are trying to achieve (in a way that the correct regex would not be) and it works!

Most languages have very good documentation on their regex handling, so I'd suggest looking there first. Otherwise, I find the JavaScript MDC regex documentation very good for the subset of regular expressions supported by that language (which covers most realistic usage).

One further hint, though: you don't need to escape all of those special characters when you are inside a character set (square brackets). "[{}[.?*^$|]" is perfectly valid. (You obviously do need to escape ] and your string delimiter, ".)
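A quick way to convince yourself of this is to compare an escaped and an unescaped character class. The sketch below uses Python's `re` module (an assumption for illustration; .NET should behave the same way for this particular class), and the variable names are invented for the example.

```python
import re

# Inside a character class, most metacharacters are plain literals.
unescaped = re.compile(r"[{}[.?*^$|]")
escaped = re.compile(r"[\{\}\[\.\?\*\^\$\|]")

chars = "{}[.?*^$|"

# Both classes accept every one of the listed characters...
both_accept = all(
    unescaped.fullmatch(c) and escaped.fullmatch(c) for c in chars
)

# ...and both reject characters that are not in the list.
both_reject = not any(
    unescaped.fullmatch(c) or escaped.fullmatch(c) for c in "a]("
)
```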

I believe this works but it just shows how ugly it would get. You wouldn't gain any speed or readability.

    Try
        If Regex.IsMatch(SubjectString, "(?=\S*?[a-z])(?=\S*?[0-9])(?=\S*?[\\.+*?\^$[\]()|{}/'#])\S{3,}|(?=\S*?[A-Z])(?=\S*?[0-9])(?=\S*?[\\.+*?\^$[\]()|{}/'#])\S{3,}|(?=\S*?[A-Z])(?=\S*?[a-z])(?=\S*?[\\.+*?\^$[\]()|{}/'#])\S{3,}|(?=\S*?[A-Z])(?=\S*?[a-z])(?=\S*?[0-9])\S{3,}") Then
            ' Successful match
        Else
            ' Match attempt failed
        End If
    Catch ex As ArgumentException
        ' Syntax error in the regular expression
    End Try

RegexBuddy explanation:

' (?=\S*?[a-z])(?=\S*?[0-9])(?=\S*?[\\\.\+\*\?\^\$\[\]\(\)\|\{\}\/\'\#])\S{3,}|(?=\S*?[A-Z])(?=\S*?[0-9])(?=\S*?[\\\.\+\*\?\^\$\[\]\(\)\|\{\}\/\'\#])\S{3,}|(?=\S*?[A-Z])(?=\S*?[a-z])(?=\S*?[\\\.\+\*\?\^\$\[\]\(\)\|\{\}\/\'\#])\S{3,}|(?=\S*?[A-Z])(?=\S*?[a-z])(?=\S*?[0-9])\S{3,}
' 
' Match either the regular expression below (attempting the next alternative only if this one fails) «(?=\S*?[a-z])(?=\S*?[0-9])(?=\S*?[\\\.\+\*\?\^\$\[\]\(\)\|\{\}\/\'\#])\S{3,}»
'    Assert that the regex below can be matched, starting at this position (positive lookahead) «(?=\S*?[a-z])»
'       Match a single character that is a “non-whitespace character” «\S*?»
'          Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
'       Match a single character in the range between “a” and “z” «[a-z]»
'    Assert that the regex below can be matched, starting at this position (positive lookahead) «(?=\S*?[0-9])»
'       Match a single character that is a “non-whitespace character” «\S*?»
'          Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
'       Match a single character in the range between “0” and “9” «[0-9]»
'    Assert that the regex below can be matched, starting at this position (positive lookahead) «(?=\S*?[\\\.\+\*\?\^\$\[\]\(\)\|\{\}\/\'\#])»
'       Match a single character that is a “non-whitespace character” «\S*?»
'          Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
'       Match a single character present in the list below «[\\\.\+\*\?\^\$\[\]\(\)\|\{\}\/\'\#]»
'          A \ character «\\»
'          A . character «\.»
'          A + character «\+»
'          A * character «\*»
'          A ? character «\?»
'          A ^ character «\^»
'          A $ character «\$»
'          A [ character «\[»
'          A ] character «\]»
'          A ( character «\(»
'          A ) character «\)»
'          A | character «\|»
'          A { character «\{»
'          A } character «\}»
'          A / character «\/»
'          A ' character «\'»
'          A # character «\#»
'    Match a single character that is a “non-whitespace character” «\S{3,}»
'       Between 3 and unlimited times, as many times as possible, giving back as needed (greedy) «{3,}»
' Or match regular expression number 2 below (attempting the next alternative only if this one fails) «(?=\S*?[A-Z])(?=\S*?[0-9])(?=\S*?[\\\.\+\*\?\^\$\[\]\(\)\|\{\}\/\'\#])\S{3,}»
'    Assert that the regex below can be matched, starting at this position (positive lookahead) «(?=\S*?[A-Z])»
'       Match a single character that is a “non-whitespace character” «\S*?»
'          Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
'       Match a single character in the range between “A” and “Z” «[A-Z]»
'    Assert that the regex below can be matched, starting at this position (positive lookahead) «(?=\S*?[0-9])»
'       Match a single character that is a “non-whitespace character” «\S*?»
'          Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
'       Match a single character in the range between “0” and “9” «[0-9]»
'    Assert that the regex below can be matched, starting at this position (positive lookahead) «(?=\S*?[\\\.\+\*\?\^\$\[\]\(\)\|\{\}\/\'\#])»
'       Match a single character that is a “non-whitespace character” «\S*?»
'          Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
'       Match a single character present in the list below «[\\\.\+\*\?\^\$\[\]\(\)\|\{\}\/\'\#]»
'          A \ character «\\»
'          A . character «\.»
'          A + character «\+»
'          A * character «\*»
'          A ? character «\?»
'          A ^ character «\^»
'          A $ character «\$»
'          A [ character «\[»
'          A ] character «\]»
'          A ( character «\(»
'          A ) character «\)»
'          A | character «\|»
'          A { character «\{»
'          A } character «\}»
'          A / character «\/»
'          A ' character «\'»
'          A # character «\#»
'    Match a single character that is a “non-whitespace character” «\S{3,}»
'       Between 3 and unlimited times, as many times as possible, giving back as needed (greedy) «{3,}»
' Or match regular expression number 3 below (attempting the next alternative only if this one fails) «(?=\S*?[A-Z])(?=\S*?[a-z])(?=\S*?[\\\.\+\*\?\^\$\[\]\(\)\|\{\}\/\'\#])\S{3,}»
'    Assert that the regex below can be matched, starting at this position (positive lookahead) «(?=\S*?[A-Z])»
'       Match a single character that is a “non-whitespace character” «\S*?»
'          Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
'       Match a single character in the range between “A” and “Z” «[A-Z]»
'    Assert that the regex below can be matched, starting at this position (positive lookahead) «(?=\S*?[a-z])»
'       Match a single character that is a “non-whitespace character” «\S*?»
'          Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
'       Match a single character in the range between “a” and “z” «[a-z]»
'    Assert that the regex below can be matched, starting at this position (positive lookahead) «(?=\S*?[\\\.\+\*\?\^\$\[\]\(\)\|\{\}\/\'\#])»
'       Match a single character that is a “non-whitespace character” «\S*?»
'          Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
'       Match a single character present in the list below «[\\\.\+\*\?\^\$\[\]\(\)\|\{\}\/\'\#]»
'          A \ character «\\»
'          A . character «\.»
'          A + character «\+»
'          A * character «\*»
'          A ? character «\?»
'          A ^ character «\^»
'          A $ character «\$»
'          A [ character «\[»
'          A ] character «\]»
'          A ( character «\(»
'          A ) character «\)»
'          A | character «\|»
'          A { character «\{»
'          A } character «\}»
'          A / character «\/»
'          A ' character «\'»
'          A # character «\#»
'    Match a single character that is a “non-whitespace character” «\S{3,}»
'       Between 3 and unlimited times, as many times as possible, giving back as needed (greedy) «{3,}»
' Or match regular expression number 4 below (the entire match attempt fails if this one fails to match) «(?=\S*?[A-Z])(?=\S*?[a-z])(?=\S*?[0-9])\S{3,}»
'    Assert that the regex below can be matched, starting at this position (positive lookahead) «(?=\S*?[A-Z])»
'       Match a single character that is a “non-whitespace character” «\S*?»
'          Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
'       Match a single character in the range between “A” and “Z” «[A-Z]»
'    Assert that the regex below can be matched, starting at this position (positive lookahead) «(?=\S*?[a-z])»
'       Match a single character that is a “non-whitespace character” «\S*?»
'          Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
'       Match a single character in the range between “a” and “z” «[a-z]»
'    Assert that the regex below can be matched, starting at this position (positive lookahead) «(?=\S*?[0-9])»
'       Match a single character that is a “non-whitespace character” «\S*?»
'          Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
'       Match a single character in the range between “0” and “9” «[0-9]»
'    Match a single character that is a “non-whitespace character” «\S{3,}»
'       Between 3 and unlimited times, as many times as possible, giving back as needed (greedy) «{3,}»
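The four alternatives in the pattern above are simply every way of choosing 3 of the 4 character classes, so the pattern can be generated instead of typed out by hand. Here is a minimal sketch of that idea in Python (not the VB.NET of the question); `CLASSES`, `build_pattern` and `PATTERN` are names invented for this illustration.

```python
import re
from itertools import combinations

SPECIAL = r"[\\.+*?^$\[\]()|{}/'#]"
CLASSES = [r"[A-Z]", r"[a-z]", r"[0-9]", SPECIAL]


def build_pattern(classes, required):
    """Build one alternative per combination of `required` classes.

    Each alternative is a run of lookaheads (one per class in the
    combination) followed by \\S{required,}.
    """
    alternatives = []
    for combo in combinations(classes, required):
        lookaheads = "".join(rf"(?=\S*?{c})" for c in combo)
        alternatives.append(lookaheads + rf"\S{{{required},}}")
    return "|".join(alternatives)


PATTERN = re.compile(build_pattern(CLASSES, 3))
```

The generated expression is just as unreadable as the hand-written one, which rather supports the point that separate checks are the clearer choice.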

> Is there a way to combine all 4 regexes into one? But I guess doing that would not allow me to check if at least 3 of the criteria are met.

That's the troublesome part: "3 out of 4", that's hard to translate into a (single) regex. The way you handle it now is fine, IMO.

> On a side note, is there a site that has an exhaustive list of all characters that would need to be escaped in the regex (those that have special meaning - eg. $, ^, etc.)?

Regex special characters may differ slightly from implementation to implementation, but these are generally special and therefore need to be escaped:

    .    // match any character (often not line breaks)
    \    // used to escape characters
    *    // quantifier: zero or more
    +    // quantifier: one or more
    ?    // quantifier: once or none
    (    // start of a group
    )    // end of a group
    [    // start of a character class
    {    // start of a quantifier like X{2,5} (match 'X' between 2 and 5 times)
    ^    // start of the input string (or line)
    $    // end of the input string (or line)
    |    // OR
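If you only need to match some text literally, you usually don't have to memorise this list: most regex libraries ship a helper that escapes it for you (`Regex.Escape` in .NET, `re.escape` in Python). A small sketch in Python, with an invented sample string:

```python
import re

literal = "1+1=2 (really?)"

# re.escape puts a backslash in front of every character that could be special.
pattern = re.escape(literal)

# The escaped pattern matches the original text exactly...
exact = re.fullmatch(pattern, literal) is not None

# ...while the raw text, used as a pattern, does not match itself,
# because +, ( ) and ? are interpreted as operators.
naive = re.fullmatch(literal, literal) is not None
```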

Note that inside a character class, most of the characters above lose their "special powers". A character class can be seen as a small language inside the regex language. It has its own special characters:

    ^    // when placed at the very start of the character class, it negates the class
    -    // when NOT placed at the start or end of the class, it denotes a range: [a-c] matches 'a', 'b' or 'c'
    \    // used to escape characters
    ]    // end of a character class

Some examples:

When you want to match the literal ^ inside a character class, either escape it when it comes at the very start, or don't place it at the start:

    [\^a]    // matches either 'a' or '^'
    [a^]     // matches either '^' or 'a'

More special character class chars:

    [a[\]b]    // matches either 'a', '[', ']' or 'b'

The range - character in action:

    [a-c]     // matches 'a', 'b' or 'c'
    [ac-]     // matches 'a', 'c' or '-'
    [-ac]     // matches '-', 'a' or 'c'
    [a\-c]    // matches 'a', '-' or 'c'

So these need no escaping:

    [.()]    // simply matches either '.', '(' or ')'
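The character class examples above are easy to check mechanically. The sketch below does so with Python's `re` module (an assumption; other flavours may differ in corner cases), using a hypothetical helper `members` that reports which of a fixed set of candidate characters a class accepts.

```python
import re


def members(char_class, candidates="a^[]bc-.()"):
    """Return the candidate characters that the character class matches."""
    rx = re.compile(char_class)
    return "".join(ch for ch in candidates if rx.fullmatch(ch))


caret_escaped = members(r"[\^a]")   # '^' escaped at the start: literal
caret_later = members(r"[a^]")      # '^' not at the start: literal
caret_negates = members(r"[^a]")    # '^' at the start: negation
brackets = members(r"[a[\]b]")      # only ']' needs the backslash
real_range = members(r"[a-c]")      # '-' between characters: a range
dash_last = members(r"[ac-]")       # '-' at the end: literal
dash_escaped = members(r"[a\-c]")   # escaped '-': literal
```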

As for the regex patterns, you may want to keep one per criterion. That way you can test the string against each of the four distinct patterns and increment a counter variable whenever one of them matches.

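For special characters, a Unicode category class is more readable than the long escaped list. Be careful with the name, though: in .NET, \p{IsSpecials} refers to the Unicode Specials block (U+FFF0 to U+FFFF), not to punctuation. Punctuation and symbols are matched by

    [\p{P}\p{S}]

which covers a wider set of characters than the list in the question.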

That's the only change I would make to the patterns. You could of course combine the upper- and lower-case checks into one regex, but that wouldn't make sense here, since they are distinct validation criteria.

To combine upper/lower case sets you can do this:

    [a-zA-Z]
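Coming back to the special-character test in this answer: Python's built-in `re` module has no `\p{...}` support, so a Unicode-aware version has to be sketched differently there. The example below classifies each character with `unicodedata.category` and treats punctuation (categories starting with P) and symbols (starting with S) as "special". That definition is an assumption and is wider than the list in the question; the function names are invented for the example.

```python
import unicodedata


def is_special(ch):
    """Treat Unicode punctuation (P*) and symbols (S*) as 'special'."""
    return unicodedata.category(ch)[0] in ("P", "S")


def classes_present(password):
    """Return the set of criteria names found in the password."""
    found = set()
    for ch in password:
        if ch.isupper():
            found.add("upper")
        elif ch.islower():
            found.add("lower")
        elif ch.isdigit():
            found.add("digit")
        elif is_special(ch):
            found.add("special")
    return found
```

A password would then pass when `len(classes_present(password)) >= 3`.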

Just for the fun of it, here's a neat way to do the job in one regex:

    "^(?=(?:.*[a-z]())?)(?=.*[A-Z]()|\1)(?=.*[0-9]()|\1\2)(?=\1\2\3|.*[.+*?^$\[\]()|{}/'#\\])[a-zA-Z0-9.+*?^$\[\]()|{}/'#\\]+$"

This uses a trick that was made popular by Jan Goyvaerts and Steven Levithan in their indispensable book, Regular Expressions Cookbook: using empty capturing groups as check-boxes for optional conditions.

First we look for a lowercase letter, and if we find one, set group 1.

    (?=(?:.*[a-z]())?)

If there's an uppercase letter, we set group 2. If not, and group 1 isn't set, fail. Order is important here: check the current condition in the first alternative, then do the back-assertion.

    (?=.*[A-Z]()|\1)

If there's a digit, we set group 3; if not, and groups 1 and 2 aren't set, fail.

    (?=.*[0-9]()|\1\2)

Finally, if groups 1, 2 and 3 aren't set, we look for one of the specials. We can do the back-assertions first this time; if three conditions have already been met, we don't care about the fourth.

    (?=\1\2\3|.*[.+*?^$\[\]()|{}/'#\\])
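To see the check-box trick run, here is the same expression assembled piece by piece in Python (not the VB.NET of the question). It relies on a back-reference to a group that never participated failing to match, which I believe holds in Python and .NET but not in JavaScript, so treat portability as an assumption. `SPECIALS` and `THREE_OF_FOUR` are names invented for the sketch.

```python
import re

# Body of the special-character class, shared by the last two pieces.
SPECIALS = r".+*?^$\[\]()|{}/'#\\"

THREE_OF_FOUR = re.compile(
    r"^(?=(?:.*[a-z]())?)"         # group 1 is set if a lowercase letter exists
    r"(?=.*[A-Z]()|\1)"            # group 2 if uppercase; else group 1 must be set
    r"(?=.*[0-9]()|\1\2)"          # group 3 if digit; else groups 1 and 2 must be set
    rf"(?=\1\2\3|.*[{SPECIALS}])"  # all three set, or a special must exist
    rf"[a-zA-Z0-9{SPECIALS}]+$"    # only characters from the four classes
)


def passes(password):
    """True when the password meets at least 3 of the 4 criteria."""
    return THREE_OF_FOUR.match(password) is not None
```

The final piece, which the walkthrough above does not cover, consumes the whole string and so also rejects any character outside the four classes, such as a space.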
