
Why doesn't my typeset function work for non-Latin/Asian characters?

I've convinced my boss to let me do our typesetting in PHP (version 5.2.8). This is what I have so far (set your browser's character encoding to Unicode (UTF-8) if the Japanese characters look misrendered):

demo page at my personal website

Basically, if you copy and paste the Latin sample paragraph into the textarea and click the button, everything works well; you can verify that by pasting the result into Notepad (although I haven't yet done anything to hyphenate words that are split across lines).

However, when it comes to non-Latin/Asian characters, nothing gets printed out. No error message is generated; I just can't see any output at all...

The following is my code:

<?php
$words = typesetWords($_POST['words']);
echo json_encode(array('feedback' => $words));

function typesetWords($words, $lineLength = 70)
{
    try
    {
        $result = '';
        $paragraphs = explode("\n\n", $words);
        foreach($paragraphs as $paragraph)
        {
            $paragraph = str_replace("\n", "", $paragraph);
            $length = strlen($paragraph);
            $numberOfLines = intval($length / $lineLength);
            $tmp = '';
            if($numberOfLines > 0)
            {
                for($i = 0; $i < $numberOfLines; $i++)
                    $tmp .= substr($paragraph, $i * $lineLength, $lineLength)."\n";
                $tmp .= substr($paragraph, -1 * ($length % $lineLength))."\n\n";
                $result .= $tmp;
            }
            else $result .= $paragraph."\n\n";
        }
    }
    catch(Exception $e)
    {
        return $e->getMessage();
    }
    return $result;
}

?>

I tried returning what the form sent directly, and the Japanese sample paragraph came back without problems. So I reckon one of the PHP library functions must be causing the error, but I can't tell which one or how to fix it...

Many thanks in advance!


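strlen() counts bytes, not characters. That gives the right answer for single-byte ANSI/ASCII text, but in UTF-8 each Japanese character takes several bytes, so the length comes out wrong. Try mb_strlen() instead. The same applies to substr(), which cuts at byte offsets and can split a character in half; mb_substr() is its character-aware counterpart.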
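As a rough illustration of that suggestion, here is a minimal sketch of the question's function rewritten with the multibyte string functions. It assumes the mbstring extension is enabled and that the input is UTF-8; the name typesetWordsMb and the $encoding parameter are made up for this example and are not part of the original code. Stepping through the paragraph by offset also avoids the original's edge case where the length is an exact multiple of $lineLength and the final substr() call returns the whole paragraph again.

```php
<?php
// Hypothetical character-aware variant of typesetWords().
// Assumes the mbstring extension is available and the input is UTF-8.
function typesetWordsMb($words, $lineLength = 70, $encoding = 'UTF-8')
{
    $result = '';
    foreach (explode("\n\n", $words) as $paragraph) {
        // "\n" is a single ASCII byte, so str_replace() is safe on UTF-8 text.
        $paragraph = str_replace("\n", '', $paragraph);

        // Count characters rather than bytes.
        $length = mb_strlen($paragraph, $encoding);

        // Cut the paragraph into chunks of $lineLength characters each.
        for ($offset = 0; $offset < $length; $offset += $lineLength) {
            $result .= mb_substr($paragraph, $offset, $lineLength, $encoding) . "\n";
        }

        // Blank line between paragraphs.
        $result .= "\n";
    }
    return $result;
}
```

With this in place, the call in the question would become something like echo json_encode(array('feedback' => typesetWordsMb($_POST['words'])));. Because no character is cut in the middle, the string handed to json_encode() stays valid UTF-8.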
