PHP urlize function
I'm using this function on my website to transform user input into an acceptable URL:
function urlize($url) {
$search = array('/[^a-z0-9]/', '/--+/', '/^-+/', '/-+$/' );
$replace = array( '-', '-', '', '');
return preg_replace($search, $replace, utf2ascii($url));
}
function utf2ascii($string) {
$iso88591 = "\\xE0\\xE1\\xE2\\xE3\\xE4\\xE5\\xE6\\xE7";
$iso88591 .= "\\xE8\\xE9\\xEA\\xEB\\xEC\\xED\\xEE\\xEF";
$iso88591 .= "\\xF0\\xF1\\xF2\\xF3\\xF4\\xF5\\xF6\\xF7";
$iso88591 .= "\\xF8\\xF9\\xFA\\xFB\\xFC\\xFD\\xFE\\xFF";
$ascii = "aaaaaaaceeeeiiiidnooooooouuuuyyy";
return strtr(mb_strtolower(utf8_decode($string), 'ISO-8859-1'),$iso88591,$ascii);
}
I'm having a problem with numbers, though. For some reason, when I try:
echo urlize("test 23342");
I get "test-eiioe". Why is that and how can I fix it?
Thank you very much!
The problem is in your utf2ascii function. I suggest using the iconv() function instead:
iconv("UTF-8", "ISO-8859-1//IGNORE", $string);
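The //IGNORE suffix on the output encoding tells iconv to silently drop any character it cannot represent in the target encoding, so those characters are lost. To have them approximated by similar-looking characters instead, use //TRANSLIT. Note that ISO-8859-1 can itself represent most Western European accented letters, so reducing them to plain ASCII requires an ASCII target (ASCII//TRANSLIT), and the exact output depends on the iconv implementation.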
Then, you can use strtolower and some regexp to eliminate non-alphanumeric characters (or to replace them with -).
If you want to encode arbitrary data, there is also urlencode(), but it won't give you nice links.
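The steps described above (convert with iconv, lowercase, then replace non-alphanumerics) could be sketched roughly as follows. The function name urlize_iconv is hypothetical, and the sketch targets ASCII//TRANSLIT rather than ISO-8859-1 so that accented letters are approximated instead of kept as Latin-1 bytes. Transliteration results vary between iconv implementations, and some reject //TRANSLIT outright, so the sketch falls back to the raw input in that case.

```php
<?php
// Hypothetical helper, not part of the original answer.
function urlize_iconv(string $text): string
{
    // Ask iconv to approximate non-ASCII characters with ASCII ones.
    // Assumption: the input is UTF-8. Some iconv builds do not support
    // //TRANSLIT and return false, so that case is handled explicitly.
    $ascii = function_exists('iconv')
        ? @iconv('UTF-8', 'ASCII//TRANSLIT', $text)
        : false;
    if ($ascii === false) {
        $ascii = $text; // fall back; the regex below strips leftover bytes
    }

    $ascii = strtolower($ascii);

    // Replace every run of non-alphanumeric characters with one hyphen,
    // then trim hyphens from both ends.
    $slug = preg_replace('/[^a-z0-9]+/', '-', $ascii) ?? '';
    return trim($slug, '-');
}

echo urlize_iconv('test 23342'), "\n"; // test-23342
```

Matching runs with `+` in a single pattern avoids the separate "collapse double hyphens" step used in the question.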
Hey, it looks like you are trying to create a slug. If so, this is the function I use/suggest:
function slug( $string ) {
return strtolower( preg_replace( array( '/[^-a-zA-Z0-9\s]/', '/[\s]/' ), array( '', '-' ), $string ) );
}
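A quick usage sketch of the function above, to show what it does with the input from the question. The slug_collapsed variant is a hypothetical addition of mine, included only to illustrate one way to avoid doubled hyphens.

```php
<?php
// The slug() function from the answer, unchanged.
function slug($string) {
    return strtolower(preg_replace(array('/[^-a-zA-Z0-9\s]/', '/[\s]/'), array('', '-'), $string));
}

echo slug('test 23342'), "\n";     // test-23342
echo slug('Hello,  World!'), "\n"; // hello--world (each space becomes its own hyphen)

// Hypothetical variant: collapse runs of whitespace and hyphens into a
// single hyphen, and trim hyphens from the ends.
function slug_collapsed($string) {
    $s = preg_replace('/[^-a-zA-Z0-9\s]/', '', $string);
    $s = preg_replace('/[\s-]+/', '-', $s);
    return strtolower(trim($s, '-'));
}

echo slug_collapsed('Hello,  World!'), "\n"; // hello-world
```

Note that both versions delete accented characters outright rather than transliterating them, so they would need to be combined with one of the conversion approaches discussed in the other answers.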
What's wrong with urlencode()?
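For comparison, here is a small sketch of what urlencode() produces. The output is a perfectly valid URL component, just not a readable slug. The second input is an arbitrary example, written with \u{...} escapes (PHP 7+) so the bytes are UTF-8 regardless of how the file is saved.

```php
<?php
// Spaces become "+", and every other reserved or non-ASCII byte is
// percent-encoded.
echo urlencode('test 23342'), "\n";              // test+23342
echo urlencode("caf\u{e9} & cr\u{e8}me"), "\n";  // caf%C3%A9+%26+cr%C3%A8me
```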
Your utf2ascii function is wrong; that's what turns test 23342 into test eiioe.
Why not use iconv to do the conversion from UTF-8 to ISO-8859-1? That is, use iconv("UTF-8", "ISO-8859-1//TRANSLIT", $url);
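To see why the digits get mangled, here is a minimal sketch using the first row of the question's table (the variable names $escaped and $bytes are mine). In a double-quoted PHP string, "\\xE0" is four literal characters (a backslash, x, E and 0) rather than the single byte 0xE0. Because strtr() pairs the two strings character by character and ignores the excess of the longer one, the first 32 characters of the table end up mapping \, x, E and the digits 0 to 7 onto the replacement letters.

```php
<?php
// Replacement characters, copied from the question.
$ascii = "aaaaaaaceeeeiiiidnooooooouuuuyyy";

// First row of the question's table: each "\\xE0" is the 4 characters \ x E 0.
$escaped = "\\xE0\\xE1\\xE2\\xE3\\xE4\\xE5\\xE6\\xE7";

// With single backslashes, each "\xE0" is one ISO-8859-1 byte (0xE0..0xFF).
$bytes  = "\xE0\xE1\xE2\xE3\xE4\xE5\xE6\xE7\xE8\xE9\xEA\xEB\xEC\xED\xEE\xEF";
$bytes .= "\xF0\xF1\xF2\xF3\xF4\xF5\xF6\xF7\xF8\xF9\xFA\xFB\xFC\xFD\xFE\xFF";

echo strlen($escaped), "\n";                      // 32 (8 entries x 4 characters)
echo strlen($bytes), "\n";                        // 32 (one byte per entry)
echo strtr("test 23342", $escaped, $ascii), "\n"; // test eiioe
echo strtr("test 23342", $bytes, $ascii), "\n";   // test 23342
```

So a minimal fix to the original function, if you prefer to keep it, would be to use single backslashes in the $iso88591 strings.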
I added accented-character replacement to Maxime Michel's answer:
function urlize($url) {
$search = array('/[^a-z0-9]/', '/--+/', '/^-+/', '/-+$/' );
$replace = array( '-', '-', '', '');
$unwanted_array = array( 'Š'=>'S', 'š'=>'s', 'Ž'=>'Z', 'ž'=>'z', 'À'=>'A', 'Á'=>'A', 'Â'=>'A', 'Ã'=>'A', 'Ä'=>'A', 'Å'=>'A', 'Æ'=>'A', 'Ç'=>'C', 'È'=>'E', 'É'=>'E',
'Ê'=>'E', 'Ë'=>'E', 'Ì'=>'I', 'Í'=>'I', 'Î'=>'I', 'Ï'=>'I', 'Ñ'=>'N', 'Ò'=>'O', 'Ó'=>'O', 'Ô'=>'O', 'Õ'=>'O', 'Ö'=>'O', 'Ø'=>'O', 'Ù'=>'U',
'Ú'=>'U', 'Û'=>'U', 'Ü'=>'U', 'Ý'=>'Y', 'Þ'=>'B', 'ß'=>'Ss', 'à'=>'a', 'á'=>'a', 'â'=>'a', 'ã'=>'a', 'ä'=>'a', 'å'=>'a', 'æ'=>'a', 'ç'=>'c',
'è'=>'e', 'é'=>'e', 'ê'=>'e', 'ë'=>'e', 'ì'=>'i', 'í'=>'i', 'î'=>'i', 'ï'=>'i', 'ð'=>'o', 'ñ'=>'n', 'ò'=>'o', 'ó'=>'o', 'ô'=>'o', 'õ'=>'o',
'ö'=>'o', 'ø'=>'o', 'ù'=>'u', 'ú'=>'u', 'û'=>'u', 'ü'=>'u', 'ý'=>'y', 'þ'=>'b', 'ÿ'=>'y' );
$url = strtr( $url, $unwanted_array );
$url = strtolower(iconv("UTF-8", "ISO-8859-1//TRANSLIT", $url));
return preg_replace($search, $replace, $url);
}