Reversing a regular expression in PHP
Suppose I have this function:
function f($string){
    $string = preg_replace("`\[.*\]`U","",$string);
    $string = preg_replace('`&(amp;)?#?[a-z0-9]+;`i','-',$string);
    $string = htmlentities($string, ENT_COMPAT, 'utf-8');
    $string = preg_replace( "`&([a-z])(acute|uml|circ|grave|ring|cedil|slash|tilde|caron|lig|quot|rsquo);`i","\\1", $string );
    $string = preg_replace( array("`[^a-z0-9]`i","`[-]+`") , "-", $string);
    return $string;
}
How can I reverse this function? That is, how should I write the function fReverse() so that the following holds:
$s = f("some string223---");
$reversed = fReverse($s);
echo $reversed;
and the output is: some string223---
f is lossy, so it is impossible to find an exact reverse. For example, both "some string223---" and "some string223--------" give the same output (see http://ideone.com/DtGQZ).
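The collision is easy to check locally. The sketch below copies f from the question (only the quoting style is changed) and feeds it the two inputs mentioned above; the variable names $a and $b are just for illustration.

```php
<?php
// f() as given in the question.
function f($string) {
    $string = preg_replace('`\[.*\]`U', '', $string);
    $string = preg_replace('`&(amp;)?#?[a-z0-9]+;`i', '-', $string);
    $string = htmlentities($string, ENT_COMPAT, 'utf-8');
    $string = preg_replace('`&([a-z])(acute|uml|circ|grave|ring|cedil|slash|tilde|caron|lig|quot|rsquo);`i', '\\1', $string);
    $string = preg_replace(array('`[^a-z0-9]`i', '`[-]+`'), '-', $string);
    return $string;
}

// Two different inputs...
$a = f("some string223---");
$b = f("some string223--------");

// ...collapse to the same output, so no function can tell them apart afterwards.
var_dump($a === $b); // bool(true)
echo $a, "\n";       // some-string223-
```

Since both inputs map to some-string223-, any fReverse() can return at most one of them.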
Nevertheless, we can find a pre-image of f. The five replacements that f performs are:
1. Strip everything between [ and ].
2. Replace entities like &lt; and &#123;, and encoded entities like &amp;lt;, with a hyphen (-).
3. Escape special HTML characters (< → &lt;, & → &amp;, etc.).
4. Remove the accents of accented characters (&eacute; (= é) → e, etc.).
5. Turn non-alphanumerics and runs of consecutive hyphens into a single hyphen (-).
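To see what each replacement does, the steps can be traced one at a time. This is only a sketch: traceF() is a hypothetical helper, not part of the question, that runs the same five calls as f and records the string after each one. The sample input is made up, and the "\u{e9}" escape (for é) assumes PHP 7 or later.

```php
<?php
// Hypothetical helper: same five replacements as f(), but keeps
// the intermediate string after every step.
function traceF($string) {
    $steps = array();
    $string = preg_replace('`\[.*\]`U', '', $string);
    $steps[1] = $string;
    $string = preg_replace('`&(amp;)?#?[a-z0-9]+;`i', '-', $string);
    $steps[2] = $string;
    $string = htmlentities($string, ENT_COMPAT, 'utf-8');
    $steps[3] = $string;
    $string = preg_replace('`&([a-z])(acute|uml|circ|grave|ring|cedil|slash|tilde|caron|lig|quot|rsquo);`i', '\\1', $string);
    $steps[4] = $string;
    $string = preg_replace(array('`[^a-z0-9]`i', '`[-]+`'), '-', $string);
    $steps[5] = $string;
    return $steps;
}

// Made-up input that exercises every step.
$trace = traceF("[x]caf\u{e9} &lt; 1");
foreach ($trace as $n => $s) {
    echo $n, ': ', $s, "\n";
}
// 1: café &lt; 1
// 2: café - 1
// 3: caf&eacute; - 1
// 4: cafe - 1
// 5: cafe-1
```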
For a suitable input, steps 1, 2, 4 and 5 can all be identity transforms. Therefore, one possible pre-image is obtained by reversing only step 3:
function fReverse($string) {
return html_entity_decode($string, ENT_COMPAT, 'utf-8');
}
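A quick way to check the pre-image property is to confirm that f(fReverse($s)) gives back $s whenever $s is itself an output of f. The sketch below does that for a few made-up sample inputs; $samples and $allOk are illustrative names, and f is copied from the question with only the quoting style changed.

```php
<?php
// f() as given in the question.
function f($string) {
    $string = preg_replace('`\[.*\]`U', '', $string);
    $string = preg_replace('`&(amp;)?#?[a-z0-9]+;`i', '-', $string);
    $string = htmlentities($string, ENT_COMPAT, 'utf-8');
    $string = preg_replace('`&([a-z])(acute|uml|circ|grave|ring|cedil|slash|tilde|caron|lig|quot|rsquo);`i', '\\1', $string);
    $string = preg_replace(array('`[^a-z0-9]`i', '`[-]+`'), '-', $string);
    return $string;
}

// fReverse() as given in the answer.
function fReverse($string) {
    return html_entity_decode($string, ENT_COMPAT, 'utf-8');
}

// Made-up sample inputs.
$samples = array("some string223---", "a < b & c", "[tag]Hello, World!");

$allOk = true;
foreach ($samples as $input) {
    $s = f($input);
    // Pre-image property: applying f to fReverse($s) must yield $s again.
    $ok = (f(fReverse($s)) === $s);
    $allOk = $allOk && $ok;
    echo $s, ' => ', ($ok ? 'ok' : 'mismatch'), "\n";
}
// some-string223- => ok
// a-lt-b-amp-c => ok
// Hello-World- => ok
```

Note that this recovers a string that f maps to $s, not the original input: for these samples the output of f contains only letters, digits and hyphens, so html_entity_decode() has nothing to decode and returns it unchanged.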