开发者

Whitespace in a database field is not removed by trim()

I have some whitespace at the begining of a paragraph in a text field in MySQL.

Using trim($var_text_field) in PHP or TRIM(text开发者_JAVA技巧_field) in MySQL statements does absolutely nothing. What could this whitespace be and how do I remove it by code?

If I go into the database and backspace it out, it saves properly. It's just not being removed via the trim() functions.


function UberTrim($s) {
    $s = preg_replace('/\xA0/u', ' ', $s);  // strips UTF-8 NBSP: "\xC2\xA0"
    $s = trim($s);
    return $s;
}

The UTF-8 character encoding for a no-break space, Unicode (U+00A0), is the 2-byte sequence C2 A0. I tried to make use of the second parameter to trim() but that didn't do the trick. Example use:

assert("abc" === UberTrim("  \r\n  \xc2\xa0  abc  \t \xc2\xa0   "));

A MySQL replacement for TRIM(text_field) that also removes UTF no-break spaces, thanks to @RudolfRein's comment:

TRIM(REPLACE(text_field, '\xc2\xa0', ' '))

UTF-8 checklist:

(more checks here)

  1. Make sure your PHP source code editor is in

    Whitespace in a database field is not removed by trim()

    UTF-8 mode without BOM. Or set in the preferences.

  2. Make sure your MySQL client is set for UTF-8 character encoding (more here and here), e.g.

    $pdo = new PDO('mysql:host=...;dbname=...;charset=utf8',$userid,$password); $pdo->exec("SET CHARACTER SET utf8");

  3. Make sure your HTTP server is set for UTF-8, e.g. for Apache:

    AddDefaultCharset UTF-8

  4. Make sure the browser expects UTF-8.

    header('Content-Type: text/html; charset=utf-8');

    or

    <meta http-equiv="Content-Type" content="text/html;charset=utf-8" />


If the problem is with UTF-8 NBSP, another simple option is:

REPLACE(the_field, UNHEX('C2A0'), ' ')


The best solution is a combination of a few things mentioned to you already.

First run ORD() on the string in question. In my case I had to run a reverse first because my problem character was at the end of the string.

ORD(REVERSE([col name])) 

Once you discover the problematic char, run a

REPLACE([col_name], char([char_value_returned]), char(32))

Finally, call a proper

TRIM([col_name])

This will completely eradicate the problem character from all aspects of the string, and trim off the leading (in my case trailing) character.


Try using the MySQL ORD() function on the text_field to check the character code of the left-most character. It can be a non-whitespace characters that appears like whitespace.


you have to detect these "whitespace" characters first. if it's some HTML entity, like &nbsp; no trimming function would help, of course.

I'd suggest to print it out like this

echo urlenclde($row['field']);

and see what it says

Well as its A0 (or 160 decimal) non-breaking space character, you can convert it to ordinal space first:

<pre><?php
$str = urldecode("%A0")."bla";
var_dump(trim($str));
$str = str_replace(chr(160)," ",$str);
$str = trim($str);
var_dump($str);

and, ta-dam! -

string(4) " bla"
string(3) "bla"


Try to check what character each "whitespace" is by writing the charactercode out - It might be a non-visible charactertype that isn't removed by trim. Trim only removes a few characters such as whitespace, tab, newline, CR and NUL but there exist other non-visible characters that might cause this problem.


try


str_ireplace(array("\r", "\n", "\t"), $var_text_field

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜