PHP: fastest way to check for invalid characters (all but a-z, A-Z, 0-9, #, -, ., $)?
I have to check the buffer input to a PHP socket server as fast as possible. To do so, I need to know if the input message $buffer contains any other character(s) than the following: a-z, A-Z, 0-9, #, -, . and $
I'm currently using the following ereg function, but wonder if there are ways to optimize the speed. Should I maybe use a different function, or a different regex?
if (ereg("[A-Za-z0-9]\.\#\-\$", $buffer) === false)
{
echo "buffer only contains valid characters: a-z, A-Z, 0-9, #, -, ., $";
}
Try this function:
function isValid($str) {
return !preg_match('/[^A-Za-z0-9.#\\-$]/', $str);
}
[^A-Za-z0-9.#\-$] describes any character that is invalid. If preg_match finds a match (an invalid character), it returns 1, and 0 otherwise. Furthermore, !1 is false and !0 is true. Thus isValid returns false if an invalid character is found and true otherwise.
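To see that behaviour concretely, here is a small sketch that repeats the function and runs it on a few sample inputs. The inputs are made up for illustration and are not from the question.

```php
<?php
// Same check as above: true only when $str has no character
// outside a-z, A-Z, 0-9, ., #, -, $
function isValid($str) {
    return !preg_match('/[^A-Za-z0-9.#\\-$]/', $str);
}

// Hypothetical sample inputs
var_dump(isValid('abc-123.#$'));  // bool(true)
var_dump(isValid('hello world')); // bool(false): the space is not allowed
var_dump(isValid(''));            // bool(true): nothing invalid in an empty string
```

Note that an empty string passes; if empty input should be rejected, that needs a separate length check.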
To allow only the letters a-z, uppercase or lowercase:
if (preg_match("/[^A-Za-z]/", $FirstName))
{
echo "Invalid Characters!";
}
Adding digits:
if (preg_match("/[^A-Za-z0-9]/", $FirstName))
{
echo "Invalid Characters!";
}
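Adding further characters to allow (in this case the exclamation mark):
(A backslash is only required for characters that are special inside a character class, such as \, ^ and -, plus the pattern delimiter. For ! it is harmless but not needed.)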
if (preg_match("/[^A-Za-z0-9\!]/", $FirstName))
{
echo "Invalid Characters!";
}
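Strictly speaking, only a handful of characters need a backslash inside a character class. The sketch below checks that with a few made-up names: the escaped and unescaped exclamation mark behave the same, while the hyphen is the one to watch because it forms a range when it sits between two characters.

```php
<?php
// '!' is not special inside [...], so both patterns behave the same.
$escaped   = '/[^A-Za-z0-9\!]/';
$unescaped = '/[^A-Za-z0-9!]/';

// Hypothetical sample names
foreach (['Anna!', 'Anna?'] as $name) {
    // bool(true) both times: the two patterns agree
    var_dump(preg_match($escaped, $name) === preg_match($unescaped, $name));
}

// '-' between two characters forms a range, so escape it or place it last;
// ']' is safest written with a backslash as well.
var_dump(preg_match('/[^A-Za-z0-9\-\]]/', 'a-b]')); // int(0): every character is allowed
```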
The preg family of functions is quite a bit faster than ereg. To test for invalid characters, try something like:
if (preg_match('/[^a-z0-9.#$-]/i', $buffer)) print "Invalid characters found";
You'll want to shift over to using preg instead of ereg. The ereg family of functions has been deprecated: since PHP 5.3 using them raises an E_DEPRECATED notice, and they were removed from the language in PHP 7.0. Also, it's been anecdotal wisdom that the preg functions are, in general, faster than ereg.
As for speed, based on my experience and the codebases I've seen in my career, optimizing this kind of string performance would be premature at this point. Wrap the comparison in some logical function or method
// sketch based on the OP's check, rewritten with preg_match
function isValidForMyNeeds($buffer)
{
    // true when $buffer contains only a-z, A-Z, 0-9, #, -, ., $
    return preg_match('/[^A-Za-z0-9.#$-]/', $buffer) === 0;
}
and then when/if you determine this is a performance problem you can apply any needed optimization in one place.
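For instance, if profiling later showed the regex to be a bottleneck, the body of the wrapper could be swapped without touching any caller. The sketch below uses strspn as one possible non-regex implementation. It is an illustration only, not something measured in this thread, and whether it is actually faster would need benchmarking on your own traffic.

```php
<?php
// Hypothetical replacement body for isValidForMyNeeds(); callers stay unchanged.
function isValidForMyNeeds($buffer)
{
    // Every character the OP allows: a-z, A-Z, 0-9, #, -, ., $
    static $allowed = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789#-.$';

    // strspn() returns the length of the leading run made up only of
    // $allowed characters; if that run covers the whole string,
    // nothing invalid is present.
    return strspn($buffer, $allowed) === strlen($buffer);
}

// Made-up sample buffers
var_dump(isValidForMyNeeds('price-$9.99#')); // bool(true)
var_dump(isValidForMyNeeds("quit\r\n"));      // bool(false): CR and LF are not allowed
```

Because the check lives in one function, trying a different implementation like this is a one-line change at the call sites' expense of nothing.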
Use preg instead; it's faster, and ereg has been discontinued.
preg_match is both faster and more powerful than ereg:
if(preg_match('/^[a-z0-9\.#\-\$]*\z/i', $sString) > 0) //check if (consists only of legal characters) is true; \z anchors at the very end, so a trailing newline is rejected too
{
    //everything's fine: $sString does NOT contain any illegal characters
}
or turn it around:
if(preg_match('/[^a-z0-9\.#\-\$]/i', $sString) === 0) //check if (contains illegal character) is false
{
    //everything's fine: $sString does NOT contain any illegal characters
}
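The two forms (match the whole string against the allowed set, or search for a single character outside it) should give the same answer. A minimal sketch to check that, with hypothetical function names and made-up inputs:

```php
<?php
// Form 1: the whole string consists of allowed characters
// (\z = very end of input, so a trailing newline does not slip through).
function onlyAllowed($s)
{
    return preg_match('/^[a-z0-9.#\-$]*\z/i', $s) === 1;
}

// Form 2: no character outside the allowed set can be found.
function noIllegal($s)
{
    return preg_match('/[^a-z0-9.#\-$]/i', $s) === 0;
}

// Hypothetical sample inputs; both forms should agree on each one.
foreach (['Abc-1.0#$', 'a b', "abc\n", ''] as $s) {
    var_dump(onlyAllowed($s) === noIllegal($s)); // bool(true) each time
}
```

The second form can stop at the first illegal character, which is one reason it is the more common choice in the answers above.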