Special (escaped) characters in replacements array in preg_replace get escaped

I'm trying to modify a string of the following form, where the fields are delimited by a single tab, except for the first field, which is followed by two or more tabs.

"$str1      $str2   $str3   $str4   $str5   $str6"

The modified string should have each field wrapped in HTML table tags and placed on its own indented line, like so.

"<tr>
  <td class="title">$str1</td>
  <td sorttable_customkey="$str2"></td>
  <td sorttable_customkey="$str3"></td>
  <td sorttable_customkey="$str4"></td>
  <td sorttable_customkey="$str5"></td>
  <td sorttable_customkey="$str6"></td>
</tr>

"

I tried using code like the following to do it.

$patterns = array();
$patterns[0]='/^/';
$patterns[1]='/\t\t+/';
$patterns[2]='/\t/';
$patterns[3]='/$/';

$replacements = array();
$replacements[0]='\t\t<tr>\r\n\t\t\t<td class="title">';
$replacements[1]='</td>\r\n\t\t\t<td sorttable_customkey="';
$replacements[2]='"></td>\r\n\t\t\t<td sorttable_customkey="';
$replacements[3]='"></td>\r\n\t\t</tr>\r\n';

for ($i=0; $i<count($lines); $i++) {
  $lines[$i] = preg_replace($patterns, $replacements, $lines[$i]);
}

The problem is that the escape sequences (tabs and newlines) in the replacement array are not interpreted. They show up as literal backslash sequences in the destination string, and I get the following string.

"\t\t<tr>\r\n\t\t\t<td class="title">$str1</td>\r\n\t\t\t<td sorttable_customkey="$str2"></td>\r\n\t\t\t<td sorttable_customkey="$str3"></td>\r\n\t\t\t<td sorttable_customkey="$str4"></td>\r\n\t\t\t<td sorttable_customkey="$str5"></td>\r\n\t\t\t<td sorttable_customkey="$str6"></td>\r\n\t\t</tr>\r\n"

Strangely, this line I tried earlier on does work:

$data=preg_replace("/\t+/", "\t", $data);

Am I missing something? Any idea how to fix it?


You need double quotes or heredocs for the replacement string - PCRE only parses those escape characters in the search string.

In your working example preg_replace("/\t+/", "\t", $data) those are both literal tab characters because they're in double quotes.

If you change it to preg_replace('/\t+/', '\t', $data), you can observe your main problem: PCRE understands that the \t in the search string represents a tab, but the \t in the replacement string is not treated as one.

So by using double quotes for the replacement, e.g. preg_replace('/\t+/', "\t", $data), you let PHP parse the \t and you get the expected result.
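A minimal sketch of that comparison, using a made-up sample string ($data here is only an illustration, not the asker's data). Both calls use the same single-quoted pattern; only the quoting of the replacement differs.

```php
<?php
// Sample input (assumption): two fields separated by three tabs.
$data = "a\t\t\tb";

// Single-quoted replacement: PHP leaves \t as a backslash followed by 't',
// and preg_replace() inserts those two characters verbatim.
$single = preg_replace('/\t+/', '\t', $data);

// Double-quoted replacement: PHP turns \t into a real tab character
// before preg_replace() ever sees the string.
$double = preg_replace('/\t+/', "\t", $data);

var_dump($single === 'a\tb');  // bool(true): a, backslash, t, b
var_dump($double === "a\tb");  // bool(true): a, tab, b
```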

It is slightly incongruous, just something to remember.


Your $replacements array has all its strings declared as single-quoted strings. That means escape sequences in them won't be interpreted (except \' and \\).

It is not related directly to PCRE regular expressions, but to how PHP handles strings.

Basically you can type strings like these:

<?php # String test

$value = "substitution";
$str1 = 'this is a $value that does not get substituted'; # single quotes: $value stays as literal text
$str2 = "this is a $value that does not remember the variable"; # this is a substitution that does not remember the variable
$str3 = "you can also type \$value = $value"; # you can also type $value = substitution
$bigstr = <<<MARKER
you can type
very long stuff here
provided you end it with the single
value MARKER you had put earlier in the beginning of a line
just like this:
MARKER;
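The same rule applies to escape sequences such as \t, which is what matters for the question. A small sketch that checks string lengths to make the difference visible (the variable names are only for illustration):

```php
<?php
// Single quotes: the backslash and the 't' stay as two separate characters.
$single = '\t';

// Double quotes: PHP converts \t into one tab character.
$double = "\t";

// A heredoc interprets escape sequences the same way double quotes do.
$heredoc = <<<EOT
\t
EOT;

var_dump(strlen($single));    // int(2)
var_dump(strlen($double));    // int(1)
var_dump($heredoc === "\t");  // bool(true)
```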

tl;dr version: the problem is the single quotes in $replacements, which should be double quotes. $patterns can stay single-quoted, because PCRE interprets \t in the pattern itself.
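One caveat worth checking: when preg_replace() receives arrays, it applies the patterns one after another to the result of the previous step, so real tabs inserted by $replacements[0] could then be matched by the later tab patterns. A possible way around this is to split the line first and build the row by hand. The sketch below assumes no field is empty and that the values need no HTML escaping; lineToRow() is a hypothetical helper name, not something from the original post.

```php
<?php
// Hypothetical helper: builds one indented table row from a tab-delimited line.
function lineToRow(string $line): string
{
    // Split on runs of tabs, so the double tab after the first field
    // behaves like any other delimiter. Trailing line breaks are dropped.
    $fields = preg_split('/\t+/', rtrim($line, "\r\n"));

    // The first field becomes the title cell. Double quotes make PHP
    // emit real tabs and CRLF line breaks.
    $row  = "\t\t<tr>\r\n";
    $row .= "\t\t\t<td class=\"title\">" . array_shift($fields) . "</td>\r\n";

    // Every remaining field becomes a sort key on an otherwise empty cell,
    // matching the layout shown in the question.
    foreach ($fields as $field) {
        $row .= "\t\t\t<td sorttable_customkey=\"" . $field . "\"></td>\r\n";
    }

    return $row . "\t\t</tr>\r\n";
}

// Sample input (assumption): a title followed by two values.
$lines = array("Title\t\t1\t2");
$lines = array_map('lineToRow', $lines);
echo $lines[0];
```

Because the row is assembled from already-split fields, the inserted tabs are never fed back through a pattern.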
