Regular Expression - Small change needed
I need a regular expression that will allow only a to z and 0 to 9. I came across the function below on this site, but it allows a few symbols through (#.-). How should it be done if it has to allow only a to z (both upper and lower case) and 0 to 9? I'm scared to edit it since I know nothing about regular expressions.
Also, is this regular expression a good way to check for a to z and 0 to 9, or is there a way it can be improved?
function isValid($str) {
return !preg_match('/[^A-Za-z0-9.#\\-$]/', $str);
}
Thanks
The following seems to be what you need in this case:
function isValid($str) {
return !preg_match('/[^A-Za-z0-9]/', $str);
}
The […] regex construct is called a character class. Something like [aeiou] matches any one of the vowels.
The [^…] construct is a negated character class, so [^aeiou] matches any one character other than the vowels (which includes consonants, digits, symbols, etc).
The -, depending on where and how it appears in a character class definition, is a range definition, so 0-9 is the same as 0123456789.
Thus, the regex [^A-Za-z0-9] actually matches a character that's neither a letter nor a digit. This is why the result of preg_match is negated with !.
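To see the three constructs in isolation, here is a small sketch using preg_match directly. The sample strings are made up purely for illustration.

```php
<?php
// [aeiou] matches a single vowel somewhere in the subject.
var_dump(preg_match('/[aeiou]/', 'sky'));      // int(0): no vowel present
var_dump(preg_match('/[aeiou]/', 'tree'));     // int(1)

// [^aeiou] matches a single character that is NOT a vowel.
var_dump(preg_match('/[^aeiou]/', 'aeiou'));   // int(0): only vowels
var_dump(preg_match('/[^aeiou]/', 'a1'));      // int(1): '1' is not a vowel

// 0-9 inside a class is a range, equivalent to listing every digit.
var_dump(preg_match('/^[0-9]$/', '7'));        // int(1)
var_dump(preg_match('/^[0123456789]$/', '7')); // int(1)
```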
That is, the logic of the above method uses double negation:
isValid = it's not the case that
there's something other than a letter or a digit
anywhere in the string
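As a rough side-by-side check, the sketch below runs the pattern from the question and the corrected one over a few inputs. The function names isValidOriginal and isValidStrict and the sample strings are hypothetical, chosen only for this comparison.

```php
<?php
// Pattern from the question: the class also lists '.', '#', '-' and '$',
// so strings containing those characters are accepted.
function isValidOriginal(string $str): bool {
    return !preg_match('/[^A-Za-z0-9.#\\-$]/', $str);
}

// Corrected pattern: only letters and digits are listed in the negated class.
function isValidStrict(string $str): bool {
    return !preg_match('/[^A-Za-z0-9]/', $str);
}

// Hypothetical sample inputs, for illustration only.
$samples = ['abc123', 'ABC', 'a.b', 'a#b', 'a-b', 'a b'];

foreach ($samples as $s) {
    printf(
        "%s original=%s strict=%s\n",
        var_export($s, true),
        var_export(isValidOriginal($s), true),
        var_export(isValidStrict($s), true)
    );
}
```

With these inputs, 'a.b', 'a#b' and 'a-b' pass the original check but fail the strict one, while 'a b' fails both.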
You can alternatively get rid of the double negation and use something like this:
function isValid($str) {
return preg_match('/^[A-Za-z0-9]*$/', $str);
}
Now there's no negation. The ^ and $ are the beginning and end of string anchors, and * is a zero-or-more repetition metacharacter. Now the logic is simply:
isValid = the entire string from beginning to end
is a sequence of letters and digits
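One detail worth knowing about the anchored form: in PHP's PCRE, $ by default also matches just before a single trailing newline. The sketch below shows this, with a variant using \z (absolute end of subject). The function names are hypothetical, and the (bool) cast is my own addition so both functions return a boolean instead of preg_match's integer.

```php
<?php
// Anchored version from above, cast to bool for a consistent return type.
function isValidAnchored(string $str): bool {
    return (bool) preg_match('/^[A-Za-z0-9]*$/', $str);
}

// Variant using \z, which matches only at the very end of the subject.
function isValidAnchoredStrict(string $str): bool {
    return (bool) preg_match('/^[A-Za-z0-9]*\z/', $str);
}

var_dump(isValidAnchored("abc123"));      // bool(true)
var_dump(isValidAnchored("abc\n"));       // bool(true): $ matches before the final newline
var_dump(isValidAnchoredStrict("abc\n")); // bool(false)
var_dump(isValidAnchoredStrict("abc123")); // bool(true)
```

If a trailing newline should be rejected, using \z (or adding the D modifier to the pattern) is one way to tighten the check. Whether that matters depends on where the input comes from.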
References
- regular-expressions.info/Character Class, Anchors, Repetition
Related questions
- Regex: why doesn’t [01-12] range work as expected?
- Detailed discussion, with common mistakes, etc
- Character class subtraction, converting from Java syntax to RegexBuddy
- Some flavors have rich character class arithmetics like subtraction and intersection
Non-regex alternative
Some languages have standard functions/idiomatic ways to validate that a string consists of only alphanumeric characters (among other possible string "types").
In PHP, for example, you can use ctype_alnum.
bool ctype_alnum ( string $text )
Checks if all of the characters in the provided string, text, are alphanumeric.
API links
- PHP Ctype Functions - list of the entire family of ctype functions: ctype_alpha, ctype_digit, ctype_lower, ctype_upper, ctype_space, etc
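A minimal sketch of the ctype_alnum route, assuming the default C locale (where alphanumeric means A-Z, a-z and 0-9) and string input. The wrapper name isValidCtype and the sample inputs are hypothetical.

```php
<?php
// Thin wrapper around ctype_alnum; the string type hint ensures a string is passed.
function isValidCtype(string $str): bool {
    return ctype_alnum($str);
}

// Hypothetical sample inputs, for illustration only.
$inputs = ['abc123', 'ABC', 'a-b', 'a b', ''];

foreach ($inputs as $input) {
    printf("%s => %s\n", var_export($input, true), var_export(isValidCtype($input), true));
}
```

Note that ctype_alnum returns false for the empty string, so it behaves like a pattern that requires at least one character.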
Whilst I have nothing against regular expressions, with such a simple pattern you should probably consider using ctype_alnum:
if (ctype_alnum($input)) {
    // $input consists only of letters and digits
}
http://uk3.php.net/manual/en/function.ctype-alnum.php
You can match the letter z (either case) and 0-9 with [Zz0-9], and you can match a-z and 0-9 with [a-z0-9]. If you want both upper and lower case, then you would use [A-Za-z0-9].
See regular expression character classes for more on this.
Further, the !preg_match()
isn't really necessary. Instead you could use a positive match on what you want, such as return preg_match('/^[A-Za-z0-9]+$/', $str);
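The one you have is actually a negated character class, so it matches any character that is not listed within the brackets; negating the result with ! then rejects any string that contains such a character. I may be misunderstanding your purpose, though.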
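The positive match suggested above uses + rather than *, and the two differ only on the empty string. A short sketch of that difference follows; the function names are hypothetical.

```php
<?php
// '*' allows zero or more characters, so the empty string passes.
function isValidStar(string $str): bool {
    return preg_match('/^[A-Za-z0-9]*$/', $str) === 1;
}

// '+' requires at least one character, so the empty string fails.
function isValidPlus(string $str): bool {
    return preg_match('/^[A-Za-z0-9]+$/', $str) === 1;
}

var_dump(isValidStar(''));       // bool(true)
var_dump(isValidPlus(''));       // bool(false)
var_dump(isValidStar('abc123')); // bool(true)
var_dump(isValidPlus('abc123')); // bool(true)
```

Which quantifier is appropriate depends on whether an empty value should count as valid for your input.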