PHP regex, replace all trash symbols
I can't get my head around a solid RegEx for doing this, still very new at all this RegEx magic. I had some limited success, but I feel like there is a simpler, more efficient way.
I would like to strip a string of all non-alphanumeric characters, turning each run of invalid characters into one single underscore, and trimming those underscores at the edges. For example, the string <<+ćThis?//String_..! should be converted to This_String
Any thoughts on doing this all in one RegEx? I did it with regular str_replace, then regexed the multi-underscores out of the way, and then trimmed the last underscores from the edges, but it seems like overkill and like something RegEx could do in one go. Kind of going for max speed/efficiency here, even if it is milliseconds I'm dealing with.
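$string = trim(preg_replace('<\W+>', "_", $string), "_");
The uppercase \W escape here matches "non-word" characters, meaning everything but letters, digits, and the underscore (the < and > are just the pattern delimiters). Because the underscore itself counts as a word character, an existing underscore next to other symbols is not merged into the run: a_._b would become a___b. Use [\W_]+ instead if that matters. To remove the leftover outer underscores I would still use trim.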
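In case it helps, here is a minimal sketch of that approach wrapped in a function, using [\W_]+ so that underscores already in the input are folded into the surrounding run. The function name collapse_to_underscores is made up for illustration. It assumes no u modifier and the default "C" locale, so the bytes of a character like ć count as non-word characters.

```php
<?php
// Hypothetical helper; the name is an assumption, not from the thread.
function collapse_to_underscores(string $input): string
{
    // [\W_]+ matches any run of non-word characters or underscores,
    // so an existing "_" merges with the symbols around it.
    $collapsed = preg_replace('/[\W_]+/', '_', $input);

    // Strip the single underscore that may remain at either edge.
    return trim($collapsed, '_');
}

echo collapse_to_underscores('<<+ćThis?//String_..!'), "\n"; // This_String
echo collapse_to_underscores('a_._b'), "\n";                  // a_b
```

trim() with a character list is a plain string operation, so it avoids a second regex pass for the edges.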
Yes, you could do this:
preg_replace("/[^a-zA-Z0-9]+/", "_", $myString);
Then you would trim leading and trailing underscores, maybe by doing this:
preg_replace("/^_+|_+$/", "", $myReplacedString);
It's not one regex, but it's cleaner than str_replace and a bunch of regex.
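If a single call is the goal, one option is to pass both patterns to preg_replace as arrays, since it applies them in order to the result of the previous one. This is only a sketch: it is still two regexes under the hood, and the function name to_underscored is made up for illustration.

```php
<?php
// Hypothetical helper; the name is an assumption, not from the thread.
function to_underscored(string $input): string
{
    return preg_replace(
        [
            '/[^a-zA-Z0-9]+/', // 1) each run of non-alphanumerics becomes "_"
            '/^_+|_+$/',       // 2) then drop underscores at either edge
        ],
        ['_', ''],
        $input
    );
}

echo to_underscored('<<+ćThis?//String_..!'), "\n"; // This_String
```

Whether this is any faster than preg_replace plus trim is something you would have to measure; the difference is likely to be tiny either way.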
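// Replace every non-alphanumeric byte with a space.
$output = preg_replace('/[^0-9a-z]/i', ' ', '<<+ćThis?//String_..!');
// Trim the outer spaces, then collapse each remaining run of whitespace into one underscore.
$output = preg_replace('!\s+!', '_', trim($output));
echo $output;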
This_String