Non Latin Characters & ouch
I'm getting to know Cake PHP, which has unearthed a general question about best practice in terms of PHP / MySQL character set stuff, which I'm hoping can be answered here.
My (practice) system contains a mysql table of movies. This list was sourced from an Excel sheet, which was exported as CSV, and imported via phpMyAdmin.
I noticed that titles with more "exotic" glyphs have issues rendering in the browser, e.g. the é in Amélie. Using Cake or plain PHP, it renders as a ?, unless transformed via htmlentities into a é. Links with the special characters don't render at all.
If I use my Cake input form to enter an <alt>0233, this is rendered correctly in source, but as é via htmlentities.
After a quick SO search, I decided maybe UTF-8 would fix stuff, hence I
- changed the PHP source and CSV file encoding to UTF-8
- made sure the <meta> stuff was there (it was already, via Cake's default layout)
- made sure my browsers think the doc is UTF-8 (they do)
- changed the collation on the MySQL DB to utf8_general_ci (as an educated stab from the available UTF-8 options)
- deleted and reimported my data
However, I'm still stuck. I note that phpMyAdmin manages to render the characters "correctly" in its HTML source when browsing records.
I sense that document encoding is to blame; however, I'm wondering if someone can provide the best answer to:
- what's the best way to move my data from Excel to MySQL to preserve glyphs?
- what are the optimum settings for my tables to accommodate this?
- I'd prefer to use UTF-8 to natively display the likes of é. What can I do in Cake to avoid making loads of calls to the likes of htmlentities, i.e. is there a configuration setting or a way to set things up that makes this more friendly and lets Cake's native helpers like Html->link work?
Some code, just in case:
movies controller excerpt..
function index() {
$this->set('movies' , $this->Movie->find('all'));
}
index.ctp view excerpt
<?php foreach ($movies as $movie): ?>
<tr>
<td><?php echo $movie['Movie']['id']; ?></td>
<td><?php echo htmlentities($movie['Movie']['title']); ?></td>
<td><?php echo $this->Html->link($movie['Movie']['title'] ,
array('controller' => 'movies' , 'action' => 'view' , $movie['Movie']['id'])); ?>
</td>
<td><?php echo $this->Html->link("Edit",
array('action' => 'edit' , $movie['Movie']['id'])); ?>
</td>
<td>
<?php echo $this->Html->link('Delete', array('action' => 'delete', $movie['Movie']['id']), null, 'Are you sure?')?>
</td>
</tr>
<?php endforeach; ?>
Thanks in advance for any help / tips.
Make sure the MySQL connection is set to UTF-8 while importing the data. The collation is only used for sorting and comparison, not for saving data.
You can set the charset of the connection by putting SET NAMES 'utf8'; at the beginning of your SQL file. (MySQL spells the character set name utf8, without the hyphen.)
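The same connection setting matters when the app reads the data back, not only during import. As a rough sketch, assuming a CakePHP 1.x style app/config/database.php (host, login, password and database name below are placeholders, not values from the question), the 'encoding' key asks Cake's MySQL driver to set the connection charset after connecting:

```php
<?php
// app/config/database.php (CakePHP 1.x layout assumed; newer versions differ slightly)
class DATABASE_CONFIG {
    var $default = array(
        'driver'   => 'mysql',
        'host'     => 'localhost',  // placeholder
        'login'    => 'user',       // placeholder
        'password' => 'secret',     // placeholder
        'database' => 'movies_db',  // placeholder
        'encoding' => 'utf8',       // connection charset, same effect as SET NAMES utf8
    );
}
```

Without this, the table can hold correct UTF-8 and the page can still show ? or mangled characters, because the bytes are converted on their way through the connection.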
This question comes up here often.
UTF-8 should work. Make sure that:
Your database collation uses utf8 (e.g. utf8_general_ci or utf8_bin)
Your HTML document encoding tag is set to UTF-8
AND VERY IMPORTANT - most people forget that bit - make sure all your source files are saved as UTF-8. Use Notepad++ on PC or Coda/TextMate/TextWrangler on Mac to make sure the encoding is correct. If you don't do that, some transformation/re-interpretation of the characters may happen.
EDIT: And forget about htmlentities; you don't need it if you use UTF-8 encoding throughout.
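To illustrate why htmlentities gets in the way here, a small sketch in plain PHP (the sample title is made up, and the é is written as explicit UTF-8 bytes so the result does not depend on how the file is saved). When htmlentities treats UTF-8 text as ISO-8859-1, each byte of the é becomes its own entity. Escaping only the HTML-special characters with the charset declared as UTF-8 leaves the é untouched:

```php
<?php
// "Amélie & Co" with é given as its two UTF-8 bytes (0xC3 0xA9).
$title = "Am\xC3\xA9lie & Co";

// Wrong charset: the two bytes of é are escaped separately.
$mangled = htmlentities($title, ENT_QUOTES, 'ISO-8859-1');

// Only &, <, >, and quotes are escaped; the UTF-8 é passes through as-is.
$safe = htmlspecialchars($title, ENT_QUOTES, 'UTF-8');

echo $mangled, "\n"; // Am&Atilde;&copy;lie &amp; Co
echo $safe, "\n";    // Amélie &amp; Co
```

You still want to escape user-supplied text before output; the point is only that the accented characters themselves need no entity conversion on a UTF-8 page.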