PHP code explanation question.
I don't know if this is the place to ask this question, so be kind if I am wrong.
I was wondering if someone can explain to me in detail what the following three code snippets do.
Snippet 1
if($str !== mb_convert_encoding(mb_convert_encoding($str, 'UTF-32', 'UTF-8'), 'UTF-8', 'UTF-32')){
    $str = mb_convert_encoding($str, 'UTF-8');
}
Snippet 2
$str = preg_replace('`&([a-z]{1,2})(acute|uml|circ|grave|ring|cedil|slash|tilde|caron|lig);`i', '\\1', $str);
Snippet 3
$str = preg_replace(array('`[^a-z0-9]`i','`[-]+`'), '-', $str);
Here is the full code below for reference.
function to_permalink($str){
    if($str !== mb_convert_encoding(mb_convert_encoding($str, 'UTF-32', 'UTF-8'), 'UTF-8', 'UTF-32')){
        $str = mb_convert_encoding($str, 'UTF-8');
    }
    $str = htmlentities($str, ENT_NOQUOTES, 'UTF-8');
    $str = preg_replace('`&([a-z]{1,2})(acute|uml|circ|grave|ring|cedil|slash|tilde|caron|lig);`i', '\\1', $str);
    $str = html_entity_decode($str, ENT_NOQUOTES, 'UTF-8');
    $str = preg_replace(array('`[^a-z0-9]`i','`[-]+`'), '-', $str);
    $str = strtolower(trim($str, '-'));
    return $str;
}
Snippet 1 makes sure the string is in UTF-8 encoding.
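Snippet 2 converts accented characters to their base form (e.g., 'é' -> 'e'). It works on the HTML entities that the htmlentities() call just before it produces, so '&eacute;' becomes 'e'.
Snippet 3 replaces every character that is not a letter or digit (spaces, punctuation, and so on) with a hyphen (-), then collapses runs of hyphens into a single one.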
All in all, taking into account the function's name and content, I'd say it is used to make URL-friendly links, for example converting
I discovered a new french word: église
to
i-discovered-a-new-french-word-eglise
Usually used for SEO.
Many of your questions can be answered by looking up what the functions do in your code.
Go here to get started: http://php.net/docs.php
Snippet #1: Checks whether the string is valid UTF-8 by round-trip converting it from UTF-8 to UTF-32 and back to UTF-8. If the result is NOT the same as the input, it calls mb_convert_encoding() without a source encoding, which means the input is treated as being in mbstring's internal encoding and converted to UTF-8 from that. Seems like a lot of work for little gain.
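To make the round-trip idea concrete, here is a minimal sketch, assuming the mbstring extension is loaded. The helper name is_valid_utf8_roundtrip() is made up for illustration and is not part of the original function. mb_check_encoding() is shown alongside as the more direct built-in check.

```php
<?php
// Hypothetical helper: true when $str survives UTF-8 -> UTF-32 -> UTF-8 unchanged.
function is_valid_utf8_roundtrip(string $str): bool
{
    $utf32 = mb_convert_encoding($str, 'UTF-32', 'UTF-8');
    return $str === mb_convert_encoding($utf32, 'UTF-8', 'UTF-32');
}

$good = "\xC3\xA9glise"; // "église" encoded as UTF-8
$bad  = "\xE9glise";     // "église" encoded as ISO-8859-1, not valid UTF-8

var_dump(is_valid_utf8_roundtrip($good)); // bool(true)
var_dump(is_valid_utf8_roundtrip($bad));  // bool(false)

// The built-in check gives the same answers without converting twice.
var_dump(mb_check_encoding($good, 'UTF-8')); // bool(true)
var_dump(mb_check_encoding($bad, 'UTF-8'));  // bool(false)
```

The invalid byte is swapped for a substitute character during conversion, which is why the round trip no longer matches the input.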
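Snippet #2: Looks for a series of potential character entities (accented characters, in this case) and replaces each one with its first captured group, i.e. the one or two base letters between the leading & and the accent name. The '\\1' in the replacement is a backreference to that group, not a literal backslash. So &AElig; becomes AE and &eacute; becomes e.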
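A quick way to see what the replacement actually produces is to run the pattern from the question on a few entity strings. This is only a sketch; the sample inputs and the $pattern and $out names are made up for illustration.

```php
<?php
$pattern = '`&([a-z]{1,2})(acute|uml|circ|grave|ring|cedil|slash|tilde|caron|lig);`i';

// In a single-quoted string, '\\1' is the two characters \1,
// which preg_replace treats as a backreference to capture group 1.
$out = array(
    preg_replace($pattern, '\\1', '&eacute;glise'),   // eglise
    preg_replace($pattern, '\\1', '&AElig;on'),       // AEon
    preg_replace($pattern, '\\1', 'Fran&ccedil;ois'), // Francois
    // Entities whose name does not end in a listed suffix are left alone.
    preg_replace($pattern, '\\1', 'a &amp; b'),       // a &amp; b
);

echo implode("\n", $out), "\n";
```

Entities that are left alone, such as &amp;, are turned back into plain characters by the html_entity_decode() call that follows in the full function.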
Snippet #3: Converts any character which is NOT a-z or 0-9 (case-insensitively) into a -, and then converts any sequence of 1 or more - into a single -.
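The two patterns are applied in order, each to the result of the previous one. A small sketch, with made-up sample inputs and a hypothetical helper name slug_tail() that also includes the trim() and strtolower() step from the full function:

```php
<?php
// Hypothetical helper wrapping the last two lines of to_permalink().
function slug_tail(string $str): string
{
    // Pass 1: every non-alphanumeric character becomes '-'.
    // Pass 2: runs of '-' collapse into a single '-'.
    $str = preg_replace(array('`[^a-z0-9]`i', '`[-]+`'), '-', $str);
    // Drop leading/trailing hyphens and lowercase the rest.
    return strtolower(trim($str, '-'));
}

echo slug_tail('Hello, World! 2024'), "\n"; // hello-world-2024
echo slug_tail('  --Foo & Bar--  '), "\n";  // foo-bar
```

Without the second pattern, 'Hello, World!' would come out with doubled hyphens, since the comma and the space are each replaced separately.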