Regular Expression - Small change needed

I need a regular expression that will allow only a to z and 0 to 9. I came across the function below on this site, but it lets a few symbols through (#.-). How should it be written if it has to allow only a to z (both upper and lower case) and 0 to 9? I'm scared to edit it since I know nothing about regular expressions.

Also, is this regular expression a good way to check for a to z and 0 to 9, or can it be improved?

function isValid($str) {
    return !preg_match('/[^A-Za-z0-9.#\\-$]/', $str);
}

Thanks


The following seems to be what you need in this case:

function isValid($str) {
    return !preg_match('/[^A-Za-z0-9]/', $str);
}

The […] regex construct is called a character class. Something like [aeiou] matches any one of the vowels.

The [^…] is a negated character class, so [^aeiou] matches one of anything but the vowels (which includes consonants, digits, symbols, etc).

The -, depending on where/how it appears in a character class definition, is a range definition, so 0-9 is the same as 0123456789.

Thus, the regex [^A-Za-z0-9] actually matches a character that's neither a letter nor a digit. This is why the result of preg_match is negated with !.

That is, the logic of the above method uses double negation:

isValid = it's not the case that
              there's something other than a letter or a digit
                  anywhere in the string
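
As a rough illustration of the points above (the sample strings and the helper name isAlnumNegated are made up for this sketch, not taken from the question), the following shows a plain character class, a range, the negated class, and the double negation in action:

```php
<?php
// [aeiou] matches a single vowel anywhere in the subject string.
var_dump(preg_match('/[aeiou]/', 'sky'));    // int(0): "sky" has no vowel
var_dump(preg_match('/[aeiou]/', 'skies'));  // int(1)

// 0-9 inside a class is a range, equivalent to listing 0123456789.
var_dump(preg_match('/[0-9]/', 'abc7'));     // int(1)

// [^A-Za-z0-9] matches one character that is neither a letter nor a digit.
var_dump(preg_match('/[^A-Za-z0-9]/', 'abc123'));   // int(0)
var_dump(preg_match('/[^A-Za-z0-9]/', 'abc-123'));  // int(1): "-" is outside the class

// Hypothetical helper: valid means "no offending character was found".
function isAlnumNegated(string $str): bool
{
    return !preg_match('/[^A-Za-z0-9]/', $str);
}

var_dump(isAlnumNegated('abc123'));   // bool(true)
var_dump(isAlnumNegated('abc-123'));  // bool(false)
```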

You can alternatively get rid of the double negation and use something like this:

function isValid($str) {
    return preg_match('/^[A-Za-z0-9]*$/', $str);
}

Now there's no negation. The ^ and $ are the beginning-of-string and end-of-string anchors, and * is the zero-or-more repetition metacharacter. Now the logic is simply:

isValid = the entire string from beginning to end
              is a sequence of letters and digits
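
A small sketch of how the anchored pattern behaves on a few edge cases. The helper names isAlnumAnchored and isAlnumStrict are hypothetical, and the \A...\z variant is an optional tightening that the answer itself does not use:

```php
<?php
// Hypothetical helper wrapping the anchored pattern from the answer.
function isAlnumAnchored(string $str): bool
{
    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return preg_match('/^[A-Za-z0-9]*$/', $str) === 1;
}

// Hypothetical stricter variant: \z matches only at the very end of the string.
function isAlnumStrict(string $str): bool
{
    return preg_match('/\A[A-Za-z0-9]*\z/', $str) === 1;
}

var_dump(isAlnumAnchored('abc123'));   // bool(true)
var_dump(isAlnumAnchored('abc 123'));  // bool(false): the space breaks the sequence
var_dump(isAlnumAnchored(''));         // bool(true): * allows zero characters
var_dump(isAlnumAnchored("abc\n"));    // bool(true): $ also matches before a final newline
var_dump(isAlnumStrict("abc\n"));      // bool(false)

// Without the anchors, an empty match anywhere is enough to succeed.
var_dump(preg_match('/[A-Za-z0-9]*/', '!!!'));  // int(1)
```

If a trailing newline must be rejected, the \z form (or the D pattern modifier) is one way to do it.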

References

  • regular-expressions.info/Character Class, Anchors, Repetition

Related questions

  • Regex: why doesn’t [01-12] range work as expected?
    • Detailed discussion, with common mistakes, etc
  • Character class subtraction, converting from Java syntax to RegexBuddy
    • Some flavors have rich character class arithmetics like subtraction and intersection

Non-regex alternative

Some languages have standard functions/idiomatic ways to validate that a string consists of only alphanumeric characters (among other possible string "types").

In PHP, for example, you can use ctype_alnum.

bool ctype_alnum ( string $text )

Checks if all of the characters in the provided string, text, are alphanumeric.
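
For a quick feel of how ctype_alnum behaves, here is a minimal sketch, assuming the ctype extension is available (it is enabled by default in standard PHP builds). The sample strings are made up, and only ASCII input is used because results for non-ASCII bytes may depend on the current locale:

```php
<?php
var_dump(ctype_alnum('abc123'));   // bool(true)
var_dump(ctype_alnum('ABCxyz'));   // bool(true)
var_dump(ctype_alnum('abc-123'));  // bool(false): "-" is not alphanumeric
var_dump(ctype_alnum('abc 123'));  // bool(false): neither is the space
var_dump(ctype_alnum(''));         // bool(false): the empty string is rejected
```

Note the last case: unlike a regex using *, ctype_alnum does not accept an empty string.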

API links

  • PHP Ctype Functions - list of entire family of ctype functions
    • ctype_alpha, digit, lower, upper, space, etc


Whilst I have nothing against regular expressions, with such a simple pattern you should probably consider using

if (ctype_alnum($input)) { /* $input contains only letters and digits */ }

http://uk3.php.net/manual/en/function.ctype-alnum.php


You can match the letter z (in either case) and 0-9 with [Zz0-9], and you can match a-z and 0-9 with [a-z0-9]. If you want both upper and lower case, use [A-Za-z0-9].

See regular expression character classes for more on this.

Further, the !preg_match() isn't really necessary. Instead you could use a positive match on what you want, such as return preg_match('/^[A-Za-z0-9]+$/', $str); The one you have is actually a negated character class, so it matches any character that is not listed within the brackets, which is why its result has to be negated. I may be misunderstanding your purpose, though.
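
To make the positive-match suggestion concrete, here is a hedged sketch (the name isAlnumNonEmpty is hypothetical). It compares the result with 1 so the function returns a real boolean, and it shows that + requires at least one character:

```php
<?php
// Hypothetical helper using the positive, anchored pattern with +.
function isAlnumNonEmpty(string $str): bool
{
    // + means one or more, so an empty string cannot match.
    return preg_match('/^[A-Za-z0-9]+$/', $str) === 1;
}

var_dump(isAlnumNonEmpty('abc123'));  // bool(true)
var_dump(isAlnumNonEmpty('abc#'));    // bool(false): "#" is not in the class
var_dump(isAlnumNonEmpty(''));        // bool(false): + needs at least one character
```

Whether the empty string should count as valid depends on the use case; swap + for * if it should.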
