Problems with characters like ÖÄÅ
My form
<form action="saveProfile.php" method="post" name="ProfileUpdate" id="ProfileUpdate" >
<input name="Smeknamn" id="Smeknamn" type="text" value="<?php echo $v["user_name"]; ?>" maxlength="16" id="ctl00_ctl00_cphContent_cphContent_cphContentLeft_tbUsername" onkeydown="return ((event.keyCode != 16) || (event.keyCode == 16 && this.value.length >= 1));" style="width: 130px;" />
</form>
When I try to echo $_POST["Smeknamn"]; on saveProfile.php, I get ��� in place of the characters Ö Ä Å.
Why is this happening? Both saveProfile and editProfile are encoded in UTF-8 without BOM, with the UTF-8 meta tag and all that.
UPDATE
$smeknamn = $data["Smeknamn"]
Sorry, I forgot to mention that I have this foreach. It is $smeknamn that I'm echoing, and I get Ã�Ã�Ã�. I just tried $_POST["Smeknamn"] and it echoes ÖÄÅ just fine. So the problem is in the foreach(), which turns the ÖÄÅ characters into Ã�Ã�Ã�. How can I fix this?
foreach($_POST as $key => $value) {
    $data[$key] = filter($value);
}
function filter($data) {
    $data = trim(htmlentities(strip_tags($data)));
    if (get_magic_quotes_gpc())
        $data = stripslashes($data);
    $data = mysql_real_escape_string($data);
    return $data;
}
Try encoding editProfile.php and saveProfile.php as UTF-8 with BOM.
This is a character encoding issue.
I guess your data is actually encoded with UTF-8, so the character Ö (U+00D6) is encoded as the byte sequence 0xC3 0x96. Now when htmlentities is called without specifying the charset parameter, it implicitly uses ISO 8859-1 (the default before PHP 5.4):
[…] optional third argument charset which defines character set used in conversion. Presently, the ISO-8859-1 character set is used as the default.
And when the byte sequence 0xC3 0x96 is interpreted as ISO 8859-1, it represents the two ISO 8859-1 characters 0xC3 and 0x96. Since there is the entity Atilde for the ISO 8859-1 character 0xC3, htmlentities replaces this character with the reference &Atilde;. But there isn’t any entity representing the second character 0x96, so it is not replaced. That means:
htmlentities("\xC3\x96") === "&Atilde;\x96"
Now when this is interpreted by the user agent, the character reference gets displayed correctly, but the remaining byte 0x96 is not a valid byte sequence for a character in UTF-8. That’s why the replacement character � is displayed instead.
So the problem is that you didn’t specify the correct character encoding for htmlentities:
htmlentities("\xC3\x96", ENT_COMPAT, "UTF-8") === "&Ouml;"
But as you’re already using UTF-8 for your output, you don’t need to replace such characters at all; using htmlspecialchars instead will suffice to replace the HTML special characters.
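As a minimal sketch of the difference, assuming a made-up input of "ÖÄÅ <b>" in UTF-8 (not a value from the question), the snippet below compares htmlentities with the wrong charset, htmlentities with UTF-8, and htmlspecialchars with UTF-8. The charset is passed explicitly each time so the result does not depend on the default of the PHP version in use.

```php
<?php
// Hypothetical input: the UTF-8 bytes for "ÖÄÅ" followed by an HTML tag.
$input = "\xC3\x96\xC3\x84\xC3\x85 <b>";

// Wrong charset: each UTF-8 byte is treated as one ISO 8859-1 character,
// so 0xC3 becomes &Atilde; and the second byte of each pair is left alone.
$wrong = htmlentities($input, ENT_COMPAT, "ISO-8859-1");

// Correct charset: every character that has a named entity is replaced.
$entities = htmlentities($input, ENT_COMPAT, "UTF-8");

// Only the HTML special characters are replaced; Ö Ä Å stay as UTF-8 bytes.
$special = htmlspecialchars($input, ENT_COMPAT, "UTF-8");

// preg_match with the /u modifier fails on subjects that are not valid UTF-8,
// which makes it usable as a quick validity check.
$wrongIsValidUtf8 = preg_match('//u', $wrong) === 1;
$specialIsValidUtf8 = preg_match('//u', $special) === 1;
```

Here $wrong contains stray bytes such as 0x96 that are not valid UTF-8 on their own, which is what the browser renders as �, while $special can be echoed into a UTF-8 page as it is.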
But besides that, you shouldn’t use such a universal filter function, as every language and context has its own special characters that need to be taken care of.
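A rough sketch of what context-specific escaping could look like, assuming the submitted value is kept as it is and escaped only at the point of output. The helper names forHtml, forUrl and forJs and the sample nickname are made up for illustration and are not from the question. For SQL, a prepared statement with bound parameters is usually a better fit than escaping by hand.

```php
<?php
// Hypothetical helpers: one escaping function per output context.

// HTML text or attribute value: escape only the HTML special characters.
function forHtml(string $value): string {
    return htmlspecialchars($value, ENT_QUOTES, "UTF-8");
}

// URL path or query component: percent-encode the value.
function forUrl(string $value): string {
    return rawurlencode($value);
}

// JavaScript/JSON string literal: json_encode adds the quotes and escapes;
// the JSON_HEX_* flags also escape <, > and & for use inside a script block.
function forJs(string $value): string {
    return json_encode($value, JSON_HEX_TAG | JSON_HEX_AMP);
}

// Made-up nickname containing Ö plus characters that are special in each context.
$smeknamn = "\xC3\x96 & <b>";

$html = forHtml($smeknamn);
$url = forUrl($smeknamn);
$js = forJs($smeknamn);
```

The same raw value ends up with three different escaped forms, which is why a single filter applied to all of $_POST cannot be right for every place the data is used later.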