Character encoding is lost when processing a .txt file of country names with PHP
I am processing a text file of country names from ISO-3166 to extract just the country names into an array. The problem is that when I output the array, the special characters in some country names are lost or changed:
$country_1 = fopen("./country_names_iso_3166.txt", "r");
while( !feof($country_1) ) { // go through every line of file
    $country = fgets( $country_1 );
    if( strstr($country, ",") ) // if country name contains a comma
        $i = strpos( $country, "," ); // index position of comma
    else
        $i = strpos( $country, ";" ); // index position of semicolon
    $country = substr( $country, 0, $i ); // extract just the country name
    $countries[] = $country;
}
Now when I output the array, the second country name, for example, should be ÅLAND ISLANDS, but it comes out as LAND ISLANDS. Please advise on how to fix this.
Try using the multibyte-aware string functions instead: mb_strstr(), mb_strpos(), and mb_substr() (basically, just prefix each function name with mb_).
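As a minimal sketch of that suggestion, the extraction step could look like the following. The helper name extractCountryName() is hypothetical, and the code assumes the file is UTF-8 encoded and that the mbstring extension is loaded. Whether this alone brings back the missing Å depends on the file's actual encoding.

```php
<?php
// Hypothetical helper (not from the original post): returns the text before
// the first comma, or before the first semicolon if the line has no comma.
// Assumption: $line is encoded in $encoding (UTF-8 by default).
function extractCountryName(string $line, string $encoding = 'UTF-8'): string
{
    // fgets() keeps the trailing newline, so strip it first.
    $line = rtrim($line, "\r\n");

    // Character-based (not byte-based) position of the delimiter.
    $i = mb_strpos($line, ',', 0, $encoding);
    if ($i === false) {
        $i = mb_strpos($line, ';', 0, $encoding);
    }

    // No delimiter at all: return the whole line unchanged.
    if ($i === false) {
        return $line;
    }

    // Character-based substring up to the delimiter.
    return mb_substr($line, 0, $i, $encoding);
}
```

Inside the reading loop, this would replace the strstr()/strpos()/substr() calls, for example $countries[] = extractCountryName($country);.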
Make sure the stream you are writing the data to uses the same character set as the input file.
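If the two character sets differ, one option is to convert each line before output and to declare the output charset explicitly. The sketch below assumes, purely for illustration, that the file is ISO-8859-1 and the page is served as UTF-8. The actual input encoding is not stated in the question, and the constant and function names are hypothetical.

```php
<?php
// Assumptions for illustration only: adjust both values to the real encodings.
const INPUT_CHARSET  = 'ISO-8859-1'; // assumed encoding of the .txt file
const OUTPUT_CHARSET = 'UTF-8';      // assumed encoding of the output page

// Hypothetical helper: converts text from the file's charset to the output charset.
function toOutputCharset(string $text): string
{
    return mb_convert_encoding($text, OUTPUT_CHARSET, INPUT_CHARSET);
}

// When serving over HTTP, tell the browser which charset the page uses.
// This must happen before any output is sent.
if (PHP_SAPI !== 'cli' && !headers_sent()) {
    header('Content-Type: text/html; charset=' . OUTPUT_CHARSET);
}
```

Each line read with fgets() would then go through toOutputCharset() before being stored or echoed. If the file is already in the output charset, no conversion is needed and only the header matters.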
(Removed mistake of saying that ISO-3166 is a charset)