Wrong encoding in a CSV generated by a PHP script

The CSV is always read by Mac users, so I guess this is a Mac problem.

I'm generating a CSV file with this piece of code (thanks, SO):

<?php
include("../include/include.php");
$file_new_export = '../temp/new_hve_full.php';
$query = "select * from mytable";
$result = mysql_query($query) or die("Sql error : " . mysql_error());

if (!$result)
    die('Couldn\'t fetch records');
$i = 0;
while ($row = mysql_fetch_assoc($result)) {

    // Variable names match the ones merged below
    $hve_a = unserialize($row['hve_a']);
    $hve_b = unserialize($row['hve_b']);
    $hve_c = unserialize($row['hve_c']);
    $hve_d = unserialize($row['hve_d']);
    $hve_e = unserialize($row['hve_e']);

    $content[] = array_merge(array_values($hve_a), array_values($hve_b), array_values($hve_c), array_values($hve_d), array_values($hve_e));
    if ($i == 0)
        $headers = array_merge(array_keys($hve_a), array_keys($hve_b), array_keys($hve_c), array_keys($hve_d), array_keys($hve_e));

    $i++;
}

$fp = fopen($file_new_export, 'w');
if ($fp && $result) {

    fwrite($fp, '<?php ');
    fwrite($fp, 'header(\'Content-Type: application/csv; charset=iso-8859-1\');');
    fwrite($fp, 'header(\'Content-Disposition: attachment; filename="export_hve.csv"\');');
    fwrite($fp, 'header(\'Pragma: no-cache\');');
    fwrite($fp, 'header(\'Expires: 0\');');
    fwrite($fp, '?>');

    fputcsv($fp, $headers, ';');
    foreach ($content as $fields) {
        fputcsv($fp, $fields, ';');
    }
    fclose($fp);
}
?>

Everything works fine, but some letters come out with the wrong encoding: typically, instead of "é" I get "È". It's close, but not right.

If the content type and the filename are switched to an HTML or a TXT file, all characters are displayed correctly, so it seems to affect only CSV files. If I switch the encoding in Excel for Mac to Western European, it still doesn't work.

I don't know what to do here. I'm looking for a simple solution, not re-encoding all files to UTF-8 or anything like that, because there is a lot of data. Everything is in ISO-8859-1 according to my settings (database/IDE/PHP encoding).

Thanks for the help.


I think this may solve your problem. Just put this line at the top of your PHP file (before any includes):

header('Content-Type: text/html; charset=iso-8859-1');

More info at: Encoding Type Header

This works because the header declares which character set the response is sent in.

gl, Paulo Bueno


A .csv file is just a plain text file that happens to contain structured data. Nothing inside the file can indicate which character set was used. You're forcing a download of the file via "Content-Disposition: attachment", so the HTTP header indicating the character set is only in effect for the duration of the download. After that, it's just another file on the hard drive.

If you're intending this data to be used in Excel exclusively, then I'd suggest using PHPExcel to generate a real Excel file, which will not have these translation problems.


erk, scary.

There is a lot of messy, redundant code in the while loop - and you should write the output within the same loop you're reading your input in. And you're not writing a csv file - you're writing a PHP file - which is extremely dangerous.

include("../include/include.php");
// "or" binds more loosely than "=", so $result holds the query result
$result = mysql_query("select * from mytable") or die(mysql_error());

header('Content-Type: application/csv; charset=iso-8859-1');
header('Content-Disposition: attachment; filename="export_hve.csv"');
while ($row = mysql_fetch_assoc($result)) {
    print mkcsv($row); // mkcsv() already appends the newline
}
exit;

// Quote every non-numeric field and join the row with commas
function mkcsv($a)
{
    foreach ($a as $k => $v) {
        if (!preg_match("/^([0-9.])*$/", $v)) {
            $a[$k] = "'" . addslashes($v) . "'";
        }
    }
    return implode(',', $a) . "\n";
}

Now, on to the problem.

What character set are you using on the database? How did you verify that the encoding was wrong? And was it wrong because of the way you extracted it? I.e., did you do a hexdump on the data and check whether byte 0xE9 (decimal 233, "é" in ISO-8859-1) had been converted to 0xC8 (decimal 200, "È")?

Try:

mysql_query('set names latin1');

before you execute your SELECT statement.
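
The hexdump check suggested above can be done from PHP itself. This is a minimal sketch, assuming the field values are available as PHP strings; the helper name hex_bytes() is hypothetical. In ISO-8859-1, "é" is the single byte 0xE9, and Mac OS Roman assigns "È" to that same byte value, so seeing e9 in the dump would suggest the data is intact and is only being read with a different character set.

```php
<?php
// Hypothetical helper: show each byte of a string as two hex digits.
function hex_bytes(string $s): string
{
    return implode(' ', str_split(bin2hex($s), 2));
}

// "é" as a single ISO-8859-1 byte, written as an escape so the
// example does not depend on the encoding of this source file.
$latin1 = "caf\xE9";
echo hex_bytes($latin1), "\n";   // 63 61 66 e9

// The same word in UTF-8 uses two bytes for "é".
$utf8 = iconv('ISO-8859-1', 'UTF-8', $latin1);
echo hex_bytes($utf8), "\n";     // 63 61 66 c3 a9
```

Running the same helper on a value straight out of mysql_fetch_assoc() and on the bytes read back from the generated file should show where, if anywhere, the bytes change.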


You seem to be setting the content type for your export file correctly through the header, but I am curious whether the problem lies in your database charset. If you are storing your data in MySQL with a charset other than iso-8859-1, it might cause some funky issues when the data is stored or displayed as iso-8859-1.


Sniffing around a little more I found the following:

Wikipedia:
http://en.wikipedia.org/wiki/ISO/IEC_8859-1

For the character encoding commonly mislabeled as "ISO-8859-1", see Windows-1252.


MySQL:
http://dev.mysql.com/doc/refman/5.0/en/charset-mysql.html
To figure out the default charset of your database, try running the query:

SHOW VARIABLES;

The list it returns should have a variable called

character_set_database 

which from the MySQL Reference is:

The character set used by the default database. The server sets this variable whenever the default database changes. If there is no default database, the variable has the same value as character_set_server.
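
Once the variables are fetched, the comparison can be scripted. This is a rough sketch, assuming the result of SHOW VARIABLES has already been read into a PHP array of name => value pairs and that the expected charset is latin1 (the name MySQL uses for its Western European charset). The helper name charset_mismatches() and the sample values are hypothetical.

```php
<?php
// Hypothetical helper: given rows from SHOW VARIABLES as name => value,
// return the charset variables that differ from the expected charset.
function charset_mismatches(array $vars, string $expected): array
{
    // Variables that influence how text travels between MySQL and PHP.
    $relevant = [
        'character_set_client',
        'character_set_connection',
        'character_set_results',
        'character_set_database',
        'character_set_server',
    ];
    $mismatches = [];
    foreach ($relevant as $name) {
        if (isset($vars[$name]) && strcasecmp($vars[$name], $expected) !== 0) {
            $mismatches[$name] = $vars[$name];
        }
    }
    return $mismatches;
}

// Sample values made up for illustration, not taken from a real server.
$vars = [
    'character_set_client'   => 'utf8',
    'character_set_database' => 'latin1',
    'character_set_results'  => 'utf8',
    'character_set_system'   => 'utf8',
];
print_r(charset_mismatches($vars, 'latin1'));
```

With these sample values the client and results charsets are reported, which is the kind of mismatch that issuing "set names latin1" on the connection is meant to remove.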


UTF-8 was designed to represent characters from a wide range of languages, including French and Spanish. Using UTF-8 is recommended.

That said, in PHP, when working on a Mac platform, you need to use the iconv() function to perform the conversions. For instance, you can do the following:

<?php
$unprocessed_string = "Éléphant";
$processed_string = iconv('MACINTOSH', 'UTF-8', $unprocessed_string);
?>

Feel free to replace UTF-8 with whatever encoding you wish to use.

The tricky part is that you may be using a file containing text copied from other software. I have personally noticed that such text may already be in UTF-8, in which case no conversion is necessary.
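
The "already UTF-8" case can be handled by checking before converting. This is a minimal sketch, assuming that input which is not valid UTF-8 is ISO-8859-1 and that the mbstring extension is available; the function name to_utf8_once() is hypothetical. The check is a heuristic, since a short ISO-8859-1 string can occasionally also be valid UTF-8.

```php
<?php
// Hypothetical helper: convert to UTF-8 only when the input is not
// already valid UTF-8. $from is an assumption about the source data.
function to_utf8_once(string $s, string $from = 'ISO-8859-1'): string
{
    if (mb_check_encoding($s, 'UTF-8')) {
        return $s;   // already UTF-8 (or plain ASCII), leave untouched
    }
    $converted = iconv($from, 'UTF-8', $s);
    // Fall back to the original bytes if the conversion fails.
    return $converted === false ? $s : $converted;
}

$latin1 = "\xC9l\xE9phant";           // "Éléphant" in ISO-8859-1
$utf8   = "\xC3\x89l\xC3\xA9phant";   // "Éléphant" in UTF-8

var_dump(to_utf8_once($latin1) === $utf8);   // converted once
var_dump(to_utf8_once($utf8) === $utf8);     // left as-is
```

Passing 'MACINTOSH' as the second argument would follow the example above, provided the local iconv build supports that charset name.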

(edited)

Additional note:

This only affects the content read from the file. If you still have trouble displaying the content, make sure to use PHP's header() function to set the content type and charset of the whole page.


This works for me. Just put the array_map line before fputcsv; utf8_decode converts each field from UTF-8 to ISO-8859-1.

foreach ($input_array as $line) {
    $line = array_map("utf8_decode", $line);
    fputcsv($temp_memory, $line, $delimiter);
}


You could try re-encoding it using mb_convert_encoding
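
A sketch of what that could look like, assuming the rows are ISO-8859-1 and the target is UTF-8; both encodings are assumptions and can be swapped for whatever the consuming application expects. The helper name write_csv_converted() and the sample rows are hypothetical, and the mbstring extension is required.

```php
<?php
// Hypothetical helper: re-encode every field of every row, then write
// the rows as semicolon-separated CSV to an already opened stream.
function write_csv_converted($fp, array $rows, string $to, string $from): void
{
    foreach ($rows as $row) {
        $converted = array_map(
            function ($field) use ($to, $from) {
                return mb_convert_encoding((string) $field, $to, $from);
            },
            $row
        );
        // Enclosure and escape are passed explicitly to keep the call unambiguous.
        fputcsv($fp, $converted, ';', '"', '\\');
    }
}

// Sample rows made up for illustration; "\xE9" is "é" in ISO-8859-1.
$rows = [
    ['name', 'qty'],
    ["caf\xE9", '12'],
];

$fp = fopen('php://memory', 'w+');
write_csv_converted($fp, $rows, 'UTF-8', 'ISO-8859-1');
rewind($fp);
$csv = stream_get_contents($fp);
fclose($fp);

echo bin2hex($csv), "\n";   // hex dump, so the re-encoded bytes are visible
```

Writing to php://memory keeps the example self-contained; in the export script the stream would be the output file or php://output instead.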
