
Non Latin Characters & ouch

I'm getting to know CakePHP, which has unearthed a general question about best practice for PHP / MySQL character sets, which I'm hoping can be answered here.

My (practice) system contains a MySQL table of movies. This list was sourced from an Excel sheet, which was exported as CSV and imported via phpMyAdmin.

I noticed that titles with more "exotic" glyphs have issues rendering in the browser, e.g. the é in Amélie. Using Cake or plain PHP, it renders as a ?, unless transformed via htmlentities into &eacute;. Links containing the special characters don't render at all.

If I use my Cake input form to enter an <alt>0233, this is rendered correctly in source, but as &Atilde;&copy; via htmlentities.

After a quick SO search, I decided maybe UTF-8 would fix stuff, hence I

  • changed the PHP source, and CSV file encoding to UTF-8
  • made sure the <meta> stuff was there (it was already via Cake's default layout).
  • made sure my browsers think the doc is UTF-8 (they do)
  • changed the collation on the MySQL DB to utf8_general_ci (as an educated stab from the available UTF-8 options)
  • deleted and reimported my data

However, I'm still stuck. I note that phpMyAdmin manages to render the characters "correctly" in its HTML source when browsing records.

I sense that document encoding is to blame. However, I'm wondering if someone can provide the best answer to:

  • what's the best way to move my data from Excel to MySQL to preserve glyphs?
  • what are the optimum settings for my tables to accommodate this?
  • I'd prefer to use UTF-8 to natively display the likes of é. What can I do in Cake to avoid making loads of calls to the likes of htmlentities, i.e. is there a configuration setting or a way of setting things up that makes this friendlier and lets Cake's native helpers like Html->link work?

Some code, just in case:

movies controller excerpt..

function index() {
        $this->set('movies' , $this->Movie->find('all'));

}

index.ctp view excerpt

<?php foreach ($movies as $movie): ?>
<tr>
    <td><?php echo $movie['Movie']['id']; ?></td>
    <td><?php echo htmlentities($movie['Movie']['title']); ?></td>
    <td><?php echo $this->Html->link($movie['Movie']['title'] , 
    array('controller' => 'movies' , 'action' => 'view' , $movie['Movie']['id'])); ?>
    </td>

    <td><?php echo $this->Html->link("Edit", 
    array('action' => 'edit' , $movie['Movie']['id'])); ?>
    </td>

    <td>
    <?php echo $this->Html->link('Delete', array('action' => 'delete', $movie['Movie']['id']), null, 'Are you sure?')?>
    </td>

</tr>
<?php endforeach; ?>

Thanks in advance for any help / tips.


Make sure the MySQL connection is set to UTF-8 while importing the data. The collation is only used for sorting and comparison, not for saving data.

You can set the charset of the connection using SET NAMES 'utf8'; at the beginning of your SQL file. Note that MySQL's name for the character set is utf8, without a hyphen.
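The same connection charset applies when the application itself talks to MySQL, not only during import. As a rough sketch, assuming a CakePHP 1.x style app/config/database.php (the host, login, password and database name are placeholders, not taken from the question), the 'encoding' key asks Cake to set the connection charset when it connects:

```php
<?php
// Sketch of app/config/database.php for a CakePHP 1.x app.
// All credentials and the database name below are hypothetical placeholders.
class DATABASE_CONFIG {
    var $default = array(
        'driver'     => 'mysql',
        'persistent' => false,
        'host'       => 'localhost',
        'login'      => 'user',       // placeholder
        'password'   => 'password',   // placeholder
        'database'   => 'movies_db',  // placeholder
        'encoding'   => 'utf8',       // MySQL's charset name, no hyphen
    );
}
```

If the key is left out, the connection may fall back to the server default (often latin1), which would explain a ? appearing even though the table data is stored correctly.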


That question comes up here often.

UTF8 should work. Make sure that:

  1. Your database collation uses utf8 (e.g. utf8_general_ci or utf8_bin)

  2. Your HTML document's encoding tag is set to UTF-8

  3. AND VERY IMPORTANT - most people forget that bit - make sure all your source files are saved as UTF-8. Use Notepad++ on PC or Coda/TextMate/TextWrangler on Mac to make sure the encoding is correct. If you don't do that, some transformation/re-interpretation of the characters may happen.
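The same check applies to the CSV exported from Excel, which is frequently saved in a Windows code page rather than UTF-8. A minimal sketch, assuming the mbstring extension is available and assuming the source is Windows-1252 when it is not already valid UTF-8 (the function name toUtf8 is made up for illustration):

```php
<?php
/**
 * Return $text as UTF-8.
 * If the bytes are already valid UTF-8 they are returned untouched;
 * otherwise they are converted from $assumedSource (a guess, not a detection).
 */
function toUtf8($text, $assumedSource = 'Windows-1252')
{
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text;
    }
    return mb_convert_encoding($text, 'UTF-8', $assumedSource);
}

// "Amélie" as Windows-1252 bytes (é is the single byte 0xE9)
$fromExcel = "Am\xE9lie";

// After conversion, é is the two-byte UTF-8 sequence 0xC3 0xA9
$clean = toUtf8($fromExcel);
echo bin2hex($clean), "\n"; // 416dc3a96c6965
```

Running each CSV field through something like this before inserting it avoids storing Windows-1252 bytes in a utf8 column.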

EDIT: And forget about htmlentities, you don't need it if you use UTF-8 encoding throughout.
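The &Atilde;&copy; from the question can be reproduced in plain PHP. The sketch below writes the title as explicit UTF-8 bytes so it doesn't depend on how the file is saved, and shows what happens when htmlentities reads those bytes as ISO-8859-1 (its default charset before PHP 5.4) versus UTF-8. Escaping of characters like < and & is still worth doing for output, and htmlspecialchars with an explicit 'UTF-8' does that while leaving é alone:

```php
<?php
// "Amélie" as UTF-8 bytes: é is 0xC3 0xA9
$title = "Am\xC3\xA9lie";

// UTF-8 bytes misread as ISO-8859-1: each byte becomes its own entity
$misread = htmlentities($title, ENT_QUOTES, 'ISO-8859-1');

// Same call with the correct charset: one entity for é
$entity = htmlentities($title, ENT_QUOTES, 'UTF-8');

// Only HTML-significant characters are escaped; é passes through as-is
$escaped = htmlspecialchars($title, ENT_QUOTES, 'UTF-8');

echo $misread, "\n"; // Am&Atilde;&copy;lie
echo $entity, "\n";  // Am&eacute;lie
echo bin2hex($escaped), "\n"; // 416dc3a96c6965 (unchanged bytes)
```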
