Character encoding is lost when processing a .txt file of country names with PHP

I am processing a text file of ISO 3166 country names to extract just the country names into an array. The problem is that when I output the array, the special characters in some country names are lost or changed:

$countries = array();
$country_1 = fopen("./country_names_iso_3166.txt", "r");
while ( ($country = fgets($country_1)) !== false ) {  // go through every line of file
    if ( strstr($country, ",") )          // if country name contains a comma
        $i = strpos( $country, "," );     // index position of comma
    else
        $i = strpos( $country, ";" );     // index position of semicolon
    $country = substr( $country, 0, $i ); // extract just the country name
    $countries[] = $country;
}
fclose($country_1);

Now when I output the array, the second country name, for example, should be ÅLAND ISLANDS, but it comes out as LAND ISLANDS. Please advise on how to fix this.


Try using the multibyte-aware string functions instead: mb_strstr(), mb_strpos(), mb_substr() (basically, just prefix the function names with mb_).
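A minimal sketch of what that could look like, assuming the mbstring extension is enabled and the file is UTF-8 encoded. The helper name extract_country_name() and the sample lines are made up for illustration; they are not from the original post.

```php
<?php
// Hypothetical helper: returns the text before the first comma or,
// if there is no comma, before the first semicolon.
// Assumes $line is encoded as $encoding (UTF-8 by default).
function extract_country_name(string $line, string $encoding = 'UTF-8'): string
{
    $line = rtrim($line, "\r\n");                    // drop the line ending left by fgets()
    $pos = mb_strpos($line, ',', 0, $encoding);      // character index of the comma
    if ($pos === false) {
        $pos = mb_strpos($line, ';', 0, $encoding);  // fall back to the semicolon
    }
    // No delimiter at all: keep the whole line.
    return $pos === false ? $line : mb_substr($line, 0, $pos, $encoding);
}

// Sample lines in the assumed "NAME;CODE" layout.
$sample = [
    "AFGHANISTAN;AF",
    "ÅLAND ISLANDS;AX",
    "BOLIVIA, PLURINATIONAL STATE OF;BO",
];
$countries = array_map('extract_country_name', $sample);
```

The mb_ functions count characters rather than bytes, so the encoding passed in has to match the actual encoding of the file for the positions to be meaningful.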


Make sure the stream you are writing the data to uses the same character set as the input file.
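As an illustration only: if the file turned out to be ISO-8859-1 and the page is served as UTF-8 (both are assumptions, the question does not say which encodings are involved), the text could be converted before output and the charset declared explicitly. The helper name to_output_encoding() is hypothetical.

```php
<?php
// Hypothetical helper: converts text from the file's encoding to the
// encoding used for output. Both defaults are assumptions, not facts
// taken from the question.
function to_output_encoding(
    string $text,
    string $fileEncoding = 'ISO-8859-1',
    string $outputEncoding = 'UTF-8'
): string {
    return mb_convert_encoding($text, $outputEncoding, $fileEncoding);
}

// Declare the output charset so the browser decodes the bytes correctly.
if (PHP_SAPI !== 'cli' && !headers_sent()) {
    header('Content-Type: text/html; charset=UTF-8');
}

$raw  = "\xC5LAND ISLANDS";          // "Å" stored as a single ISO-8859-1 byte
$utf8 = to_output_encoding($raw);    // "Å" is now the two-byte UTF-8 sequence
```

If the file is already in the same encoding as the output, no conversion is needed and only the declared charset has to match.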

(Removed mistake of saying that ISO-3166 is a charset)
