How to best configure PHP to handle a UTF-8 website [duplicate]
What extensions would you recommend, and how should PHP best be configured, to create a website that uses UTF-8 encoding for everything? For example:
- Page output is UTF-8
- Forms submit data encoded in UTF-8
- Internal processing of string data (e.g. when talking to a database) is all in UTF-8 as well.
It seems that PHP does not really cope well with multibyte character sets at the moment. So far I have worked out that mbstring looks like an important extension.
Is it worth the hassle?
The supposed issues of PHP with Unicode content have been somewhat overstated. I've been building multilingual websites since 1998 and never knew there might be an issue until I read about it somewhere, many years and websites later.
This works just fine for me:
Apache configuration (in httpd.conf or .htaccess)
AddDefaultCharset utf-8
PHP (in php.ini)
default_charset = "utf-8"
mbstring.internal_encoding=utf-8
mbstring.http_output=UTF-8
mbstring.encoding_translation=On
mbstring.func_overload=6
MySQL
CREATE your database with a utf8_* collation, let the tables inherit the database collation, and start every connection with "SET NAMES utf8"
HTML (in HEAD element)
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
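To confirm that the PHP settings above actually took effect on a given machine, the values can be read back at runtime. This is only a minimal sketch: the function name charset_settings is made up for illustration, and the two set-up calls stand in for the php.ini entries so the snippet runs on its own.

```php
<?php
// Hypothetical helper (not part of the answer): collect the charset-related
// settings that PHP is actually running with.
function charset_settings(): array {
    return [
        'default_charset'   => ini_get('default_charset'),
        'internal_encoding' => mb_internal_encoding(), // no argument: returns the current value
    ];
}

// Stand-ins for the php.ini entries, so the sketch is self-contained.
ini_set('default_charset', 'UTF-8');
mb_internal_encoding('UTF-8');

print_r(charset_settings());
```

Running the same snippet on each server makes it easy to spot a machine whose configuration differs.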
I was facing the same issue with UTF-8 characters. Everything worked on the live and staging servers, but sometimes it broke on my dev machine. The behavior was strange: sometimes the characters were encoded properly, but on a random page reload they would break, showing diamond characters
'���เห็นอเวิลด์!���'
or question marks
'??�เห็นอเวิลด์!???'
or 85% of the data rendered properly, as in 'เห็นอเวิลด์!?��',
while the remaining 15% showed mismatched characters. I wanted to fix the issue, so I started with my checklist:
1 - Check that the charset declaration is present in the HTML
2 - Check that the data is saved properly in the MySQL table
3 - Check that MySQL has the proper encoding settings for UTF-8
4 - Check that Apache is set up to deal with the UTF-8 character set
5 - Check that a simple PHP script can echo "เห็นอเวิลด์" with output identical to the input
6 - Check that PHP sends the proper headers
7 - Check that the MySQL query returns the same data, "เห็นอเวิลด์"
8 - Check whether "เห็นอเวิลด์" contains any HTML characters, and deal with them properly
9 - Check whether "เห็นอเวิลด์" passes through any HTML encode/decode function
10 - Check that .htaccess is set up to deal with the UTF-8 character set
Work through the whole list above to figure out where things are breaking.
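For the checks that compare a string at different stages (items 5, 7 and 9), looking at the raw bytes is more reliable than looking at what the browser renders. The following is a rough sketch, not something from my project: describe_utf8 is a made-up helper name, and the sample strings are written with escapes so they do not depend on the encoding of the file itself.

```php
<?php
// Hypothetical helper: summarize a string so the same value can be
// compared after each stage (database read, encode/decode call, echo).
function describe_utf8(string $s): array {
    return [
        'valid_utf8' => mb_check_encoding($s, 'UTF-8'), // false if the bytes are not well-formed UTF-8
        'hex'        => bin2hex($s),                    // raw bytes, independent of rendering
    ];
}

$good      = "h\u{00E9}llo"; // "héllo", using the PHP 7+ escape syntax
$truncated = "\xE0\xB9";     // first two bytes of a three-byte sequence, i.e. a cut-off character

print_r(describe_utf8($good));
print_r(describe_utf8($truncated));
```

If the hex output changes between two stages, or valid_utf8 flips to false, the step in between is the one that breaks the string.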
Give this a try (I am using CodeIgniter):
=================================
:: PHP ini Settings::
=================================
default_charset = "utf-8"
mbstring.internal_encoding=utf-8
mbstring.http_output=UTF-8
mbstring.encoding_translation=On
mbstring.func_overload=6
=================================
:: .htaccess Settings::
=================================
DefaultLanguage en-US
AddDefaultCharset UTF-8
=================================
:: HTML Header Page::
=================================
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
=================================
:: PHP Codeigniter index.php ::
=================================
header('Content-Type: text/html; charset=UTF-8');
=================================
:: Codeigniter config.php ::
=================================
$config['charset'] = 'UTF-8';
=================================
:: Codeigniter database.php ::
=================================
$db['default']['char_set'] = 'utf8';
$db['default']['dbcollat'] = 'utf8_general_ci';
=================================
:: Codeigniter helper function (optional)
=================================
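if (!function_exists('safe_utf_string')) {
    // Escape HTML special characters, then normalize the result to UTF-8.
    function safe_utf_string($utf8string = '') {
        $utf8string = htmlspecialchars($utf8string, ENT_QUOTES, 'UTF-8');
        // With no source encoding given, mb_convert_encoding() assumes the mbstring internal encoding.
        return mb_convert_encoding($utf8string, 'UTF-8');
    }
}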
And finally, don't forget to say thanks :) for @djn's answer.
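One caveat with the optional helper: when htmlspecialchars() is called with ENT_QUOTES alone, it returns an empty string if the input contains an invalid UTF-8 sequence, so a single bad byte can blank the whole value. A possible variant is sketched below. The name safe_utf_string_lenient and the scrub-first order are assumptions for illustration, not part of the setup described above.

```php
<?php
// Hypothetical variant of the helper; name and ordering are assumptions.
function safe_utf_string_lenient(string $value = ''): string {
    // Converting UTF-8 to UTF-8 replaces malformed byte sequences
    // with mbstring's substitute character.
    $clean = mb_convert_encoding($value, 'UTF-8', 'UTF-8');
    // ENT_SUBSTITUTE keeps htmlspecialchars() from returning ''
    // if an invalid sequence is still present.
    return htmlspecialchars($clean, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

$broken = "a\xFFb"; // 0xFF can never appear in well-formed UTF-8

var_dump(htmlspecialchars($broken, ENT_QUOTES, 'UTF-8')); // the strict call gives an empty string
var_dump(safe_utf_string_lenient($broken));               // the lenient variant keeps the rest of the text
```

Whether silently replacing bad bytes is acceptable depends on the application; for debugging it can be better to let the failure show.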
PHP copes just fine!
You should set the php.ini "default_charset" parameter to 'utf-8'.
Then make sure that:
<head>
<meta http-equiv="Content-Type"
content="text/html; charset=utf-8"
/>
is at the top of every page you serve.
There are a few problem areas:
Databases -- make sure they are configured to use utf-8 by default or enter a world of pain.
IDEs/Editors -- a lot of editors don't support UTF-8 well. I normally use vim, which doesn't, but it's never been a big problem.
Documents -- I just spent a whole afternoon getting PHP to read Thai characters out of a spreadsheet. I was eventually successful but am still not sure what I did right.
2018 Update :::
Kindly note that these php.ini entries are DEPRECATED;
;mbstring.internal_encoding = utf-8
;mbstring.http_input =
;mbstring.http_output = utf-8
Next ...
PHP - Set utf8 for the following - via a config.php file for your web app
ini_set('default_charset', 'UTF-8');
mb_internal_encoding('UTF-8');
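// Note: on PHP 5.6+ these two iconv calls raise a deprecation notice; default_charset above already covers them.
iconv_set_encoding('internal_encoding', 'UTF-8');
iconv_set_encoding('output_encoding', 'UTF-8');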
MariaDB / MySQL - Set utf8 via:
$mysqli->set_charset("utf8mb4"); // $mysqli is your open mysqli connection
HTML Pages - Set via:
<meta charset="utf-8" >
If mbstring isn't already part of your PHP package, then I would definitely recommend it. You'll want to use it even for calculating the string lengths of form input ( mb_strlen($string_var, 'UTF-8') ). Otherwise you won't need anything except valid and proper HTML, a correct HTTP server config (so the server delivers pages using UTF-8) and a text editor with UTF-8 support (e.g. Notepad++).
In your php.ini, set
mbstring.internal_encoding = UTF-8
mbstring.encoding_translation = On
so that you don't need to pass an encoding parameter to the mb_ functions every time.
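The difference this makes for form input can be sketched as follows. It is a minimal example under a few assumptions: mb_internal_encoding() is used as the runtime counterpart of the php.ini entry, fits_max_length is a made-up function name, and mbstring.func_overload is off (it no longer exists in PHP 8), so strlen() still counts bytes.

```php
<?php
// Runtime counterpart of the mbstring.internal_encoding setting.
mb_internal_encoding('UTF-8');

// "héllo", written with an escape so the example does not depend
// on the encoding of this file (PHP 7+ syntax).
$input = "h\u{00E9}llo";

$byteCount        = strlen($input);             // counts bytes
$charCount        = mb_strlen($input, 'UTF-8'); // counts characters, encoding given explicitly
$charCountDefault = mb_strlen($input);          // same result, relying on the internal encoding

// Hypothetical form check: at most $max characters, not bytes.
function fits_max_length(string $value, int $max): bool {
    return mb_strlen($value, 'UTF-8') <= $max;
}

echo "$byteCount bytes, $charCount characters\n"; // prints "6 bytes, 5 characters"
```

A length limit checked with strlen() would reject this five-character input at a limit of 5, which is exactly the kind of bug that shows up only with non-ASCII users.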