Charset problem, MySQL and get_meta_tags()

I'm trying to get HTML meta tags with PHP using the get_meta_tags() function. I use UTF-8 everywhere: for the tables, the charset/collations, and the MySQL connection charset.

Unfortunately, MySQL cuts off the string when inserting it into the table. It happens when the HTML page's encoding is something other than UTF-8 (for example ISO 8859-1).

Is there any way to convert strings to UTF-8 without knowing their source encoding?


Example:

<?php 
// ------------------------------------------------------------ 

header('Content-Type:text/html; charset=utf-8');


// ------------------------------------------------------------ 

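// Returns $string as UTF-8. Anything that is not already valid UTF-8
// is assumed to be ISO-8859-1, the only encoding utf8_encode() handles.
function str_to_utf8($string) { 
    if (mb_detect_encoding($string, 'UTF-8', true) === false) { 
        $string = utf8_encode($string); 
    } 
    return $string; 
}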

// ------------------------------------------------------------ 


$url = 'http://example.org';    // ---- The URL to get Meta-Tags from --- 


// ------------------------------------------------------------ 

$meta_raw = get_meta_tags($url);

$meta_enc = array(); 

foreach($meta_raw as $mkey => $mval) { 
   $meta_enc[$mkey] = str_to_utf8($mval); 
}


// ------------------------------------------------------------ 

print "<p>the (old) raw data</p>\n";
print "<pre style=\"margin:6px; padding:6px; background:#FFFFCC; text-align:left;\">\n";
print_r($meta_raw);
print "</pre>\n";

print "<br />\n";
print "<br />\n";

// ------------------------------------------------------------ 

print "<p>the (new) utf8 encoded data</p>\n";
print "<pre style=\"margin:6px; padding:6px; background:#DEDEDE; text-align:left;\">\n";
print_r($meta_enc);
print "</pre>\n";

print "<br />\n";
print "<br />\n";

// ------------------------------------------------------------ 
?>

:)

In the function str_to_utf8($string) { ... } you can also use different ways to detect and convert the $string, such as iconv(), mb_convert_encoding(), ...


Encodes an ISO-8859-1 string to UTF-8 (PHP 3 >= 3.0.6, PHP 4, PHP 5)

string utf8_encode ( string data )

Convert string to requested character encoding (PHP 4 >= 4.0.5, PHP 5)

string iconv ( string in_charset, string out_charset, string str )
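As a minimal sketch of the iconv() route, assuming the source charset is already known to be ISO-8859-1 (the helper name latin1_to_utf8 is made up for illustration):

```php
<?php
// Hypothetical helper: convert a string that is known to be ISO-8859-1 to UTF-8.
function latin1_to_utf8(string $string): string
{
    $converted = iconv('ISO-8859-1', 'UTF-8', $string);

    // iconv() returns false on failure; fall back to an empty string here.
    return $converted === false ? '' : $converted;
}

$latin1 = "M\xFCnchen";                  // "München" encoded as ISO-8859-1
echo bin2hex(latin1_to_utf8($latin1)), "\n";
```

Note that utf8_encode() does the same ISO-8859-1 to UTF-8 conversion, but it is deprecated as of PHP 8.2, so iconv() or mb_convert_encoding() is the safer choice in new code.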

However, if you want to convert to UTF-8 regardless of the source encoding, check out:

Convert character encoding (PHP 4 >= 4.0.6, PHP 5)

string mb_convert_encoding ( string str, string to_encoding [, mixed from_encoding] )
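A rough sketch of how this could be combined with detection. The helper name to_utf8_guess and the candidate list are assumptions for illustration, not part of the original answer. Byte-level detection is only a guess: every byte sequence is valid ISO-8859-1, so the order of the candidates matters, and a charset declared by the page itself (HTTP header or meta tag) is more reliable when it is available.

```php
<?php
// Hypothetical helper: convert $string to UTF-8, guessing the source
// encoding from a caller-supplied candidate list (an assumption).
function to_utf8_guess(string $string, array $candidates = ['UTF-8', 'ISO-8859-1']): string
{
    // Already valid UTF-8: return it untouched.
    if (mb_check_encoding($string, 'UTF-8')) {
        return $string;
    }

    // Strict detection: returns a candidate the bytes are valid in, or false.
    $from = mb_detect_encoding($string, $candidates, true);
    if ($from === false) {
        $from = 'ISO-8859-1';            // assumption: fall back to Latin-1
    }

    return mb_convert_encoding($string, 'UTF-8', $from);
}

$latin1 = "caf\xE9";                     // "café" encoded as ISO-8859-1
echo bin2hex(to_utf8_guess($latin1)), "\n";
```

Running each value returned by get_meta_tags() through such a helper before the INSERT should keep invalid byte sequences away from MySQL, which is the usual cause of the string being cut off.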
