RegEx for cleaning up a string
This RegEx cleans up user input from a search form:
$query = preg_replace("/[^A-Za-z0-9 _.,*&-]/", ' ', $query);
I need to allow the slash as a valid character too, but if I add it, I get an error. I assume I have to escape it, but I can't find out how to do that:
$query = preg_replace("/[^A-Za-z0-9 _.,*&-/]/", ' ', $query); // doesn't work
$query = preg_replace("/[^A-Za-z0-9 _.,*&-//]/", ' ', $query); // doesn't work
$query = preg_replace("/[^A-Za-z0-9 _.,*&-\/]/", ' ', $query); // doesn't work
I'm using PHP.
You can use something other than / as your delimiter. Try something like this (with the - moved to the end of the class, so that &-/ is not read as a range):
$query = preg_replace("%[^A-Za-z0-9 _.,*&/-]%", ' ', $query);
Kobe also posted the correct way to escape in that situation, but I find the regex stays more readable when I switch the delimiter to something I'm not using in the expression, when possible.
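As a rough check that the choice of delimiter does not change what the pattern matches, here is a minimal sketch. The sample input is made up for illustration, and single-quoted strings are used so PHP passes the pattern through unchanged.

```php
<?php
// Illustrative input only; $query would normally come from the search form.
$query = "foo/bar <baz>";

// Same character class with three different delimiters. The hyphen is
// placed last so it is a literal and not part of a range.
$withSlash   = preg_replace('/[^A-Za-z0-9 _.,*&\/-]/', ' ', $query);
$withPercent = preg_replace('%[^A-Za-z0-9 _.,*&/-]%', ' ', $query);
$withHash    = preg_replace('#[^A-Za-z0-9 _.,*&/-]#', ' ', $query);

// All three calls replace only '<' and '>' and keep the slash.
var_dump($withSlash === $withPercent && $withPercent === $withHash); // bool(true)
```

Only the / delimited version needs the slash escaped; with % or # the slash is an ordinary character.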
EDIT
A bit of additional information can be found at http://www.php.net/manual/en/regexp.reference.delimiters.php (quoting it here:)
"When using the PCRE functions, it is required that the pattern is enclosed by delimiters. A delimiter can be any non-alphanumeric, non-backslash, non-whitespace character."
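You should escape twice: the regex engine needs to see \/, and in a double-quoted PHP string a literal backslash is written as \\, so the string becomes "\\/". (PHP happens to leave the backslash in place for unrecognized escapes such as "\/", but the doubled form is explicit and does not rely on that.)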
$query = preg_replace("/[^A-Za-z0-9 _.,*&\\/-]/", ' ', $query);
Also, make sure you move - to the end, or escape that as well. Between two characters in a character class it has a different meaning: it defines a range.
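To make the range behaviour concrete, here is a small sketch. It uses # as the delimiter purely so the slash needs no escaping, and the input string is an invented example: + sits between & and / in ASCII, so it shows the difference.

```php
<?php
// '+' (0x2B) lies between '&' (0x26) and '/' (0x2F) in ASCII.
$input = "a+b";

// Hyphen between '&' and '/': the class contains the whole range & to /,
// so '+' counts as allowed and is left alone.
$range = preg_replace('#[^A-Za-z0-9 _.,*&-/]#', ' ', $input);

// Hyphen at the end: it is a literal, and '+' is not in the class.
$literal = preg_replace('#[^A-Za-z0-9 _.,*&/-]#', ' ', $input);

var_dump($range);   // string(3) "a+b"
var_dump($literal); // string(3) "a b"
```

The first pattern compiles without any warning, which is why this kind of mistake is easy to miss: it silently allows ' ( ) and + as well.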
$query = preg_replace("/[^A-Za-z0-9 _.,*&-\/]/", ' ', $query);
would work if you wrote it with single quotes and moved the - to the end of the class, like this:
$query = preg_replace('/[^A-Za-z0-9 _.,*&\/-]/', ' ', $query);
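The reason to prefer single quotes is that strings enclosed in " are parsed for \n, \r, \t etc. and $vars, so a backslash or dollar sign in a regex can be changed by PHP before the pattern ever reaches the regex engine. An unrecognized sequence such as "\/" does keep its backslash, so the more important change in the line above is the position of the hyphen: in &-\/ it forms a range from & to /, while at the end of the class it is a literal.
Strings enclosed in ' are not parsed, apart from \\ and \'.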
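A quick way to see what each quoting style actually hands to preg_replace is to compare the strings directly. This is only a sketch, and the variable names are made up for the example. Note that PHP keeps the backslash for an unrecognized escape such as \/ in a double-quoted string.

```php
<?php
$double  = "/[^A-Za-z0-9 _.,*&\/-]/";   // unrecognized escape: backslash is kept
$doubled = "/[^A-Za-z0-9 _.,*&\\/-]/";  // \\ becomes a single backslash
$single  = '/[^A-Za-z0-9 _.,*&\/-]/';   // single quotes: only \\ and \' are special

var_dump($double === $single);  // bool(true)
var_dump($doubled === $single); // bool(true)

// Recognized escapes are where the two quoting styles really differ.
var_dump("\n" === '\n');        // bool(false)
```

All three pattern strings end up identical here, but the single-quoted form is the one that does not depend on PHP's escape rules.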
To escape a character, just put a backslash in front of it. But don't forget you are using a double-quoted string, which is probably what makes this harder: you may have to escape the backslash itself.
Another solution that I generally use is to work with a different regex delimiter, one that doesn't appear in the regex itself. For instance, using a # (and keeping the - at the end of the class, so it is not read as a range):
$query = preg_replace("#[^A-Za-z0-9 _.,*&/-]#", ' ', $query);
This should solve the problem :-)
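If the cleanup is needed in more than one place, it could be wrapped in a small function. This is a sketch only: the name clean_query and the sample input are hypothetical and not part of the original question.

```php
<?php
// Hypothetical helper; the name clean_query is an assumption for this example.
function clean_query(string $query): string
{
    // '#' as delimiter so '/' needs no escaping; '-' is last so it is literal.
    // Single quotes keep PHP from interpreting anything inside the pattern.
    return preg_replace('#[^A-Za-z0-9 _.,*&/-]#', ' ', $query);
}

// Every character outside the allowed set becomes a single space.
echo clean_query("php/regex <b>a-b</b>"), "\n"; // php/regex  b a-b /b
```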