PHP: encoding problem in a CLI script reading ID3 tags
I am trying to get a PHP CLI script to go through a folder, read the ID3 tags (which are in UTF-8, in Cyrillic) and put them in the database. When I execute the script, the DB fields end up with what looks like raw UTF-8 bytes, e.g. "Àëáåíà", instead of readable Cyrillic.
Here is the script:
<?php
set_time_limit(0);
include('classes/adodb5/adodb.inc.php');
include ('classes/id3/getid3.php');
$ftpdir = "/radio/unprocessed/";
$processeddir = "/radio/music";
//set up the database
$conn = ADONewConnection('mysql');
$conn->PConnect('localhost','root', '**********','radio');
$utf = $conn->Execute("SET NAMES 'UTF8';");
$charset = $conn->Execute("CHARSET UTF8;");
//function for processing the actual file
function processmp3($fn, $folder, $conn){
$getID3 = new getID3;
$ThisFileInfo = $getID3->analyze($folder.$fn);
//this is needed to consolidate all tag formats
getid3_lib::CopyTagsToComments($ThisFileInfo);
if (array_key_exists('artist', $ThisFileInfo['comments_html']) && array_key_exists('title', $ThisFileInfo['comments_html'])){
$artist=($ThisFileInfo['comments_html']['artist'][0]);
$title=($ThisFileInfo['comments_html']['title'][0]);
}else{$artist ='not defined'; $title="not defined";}
//random name
$rand_name = md5(time()).rand(1,1000).".mp3";
//movefile
//rename($folder.$fn,'/radio/music/'.$rand_name);
//put in DB (bound parameters, so quotes inside a tag cannot break the query)
$insert = $conn->Execute('INSERT INTO unprocesseds VALUES (?, ?, ?, ?, ?)', array('', $artist, $title, $rand_name, $fn));
}
//cycle through contents
if($handle = opendir($ftpdir)){
while(false !== ($file = readdir($handle))){
$type = mime_content_type($ftpdir.$file);
if ($type=='audio/mpeg'){processmp3($file, $ftpdir, $conn);}
else {
if(is_file($ftpdir.$file)){unlink($ftpdir.$file);}
}
}
//close the handle only if opendir succeeded
closedir($handle);
}
In my particular case, two things were happening:
- PuTTY was somehow disregarding my explicit "use UTF-8" setting.
- The class I was using was not copying the ID3 tags to "comments_html" correctly. When I print_r-ed the result, I found that by accessing the actual tag directly I was able to get the uncorrupted UTF-8.
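To illustrate the second point, here is a minimal sketch of reading a tag straight from the analyzed array instead of from "comments_html". The helper name pick_tag is made up for this example, and the array layout ('tags', then the tag format, then the field) is an assumption about what print_r shows for your files, so check your own dump first. The sample array is hand-built so the snippet runs without getID3.

```php
<?php
// Hypothetical helper (not part of getID3): return the first value of a tag,
// preferring the raw per-format tags over the 'comments_html' copy.
function pick_tag(array $info, $key, $default = 'not defined')
{
    // Assumed order of preference; adjust to what print_r($info) shows.
    $formats = array('id3v2', 'id3v1');
    foreach ($formats as $format) {
        if (isset($info['tags'][$format][$key][0])) {
            return $info['tags'][$format][$key][0];
        }
    }
    return $default;
}

// Hand-built stand-in for the array returned by $getID3->analyze().
$info = array(
    'tags' => array(
        'id3v2' => array(
            'artist' => array('Албена'),
        ),
    ),
);

$artist = pick_tag($info, 'artist'); // found in the id3v2 tags
$title  = pick_tag($info, 'title');  // missing, so the default is used
echo $artist, ' / ', $title, "\n";
```

In the script from the question, the two assignments inside processmp3() would then become calls like pick_tag($ThisFileInfo, 'artist'), which also removes the need for the array_key_exists check.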
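One more thing that may help with diagnosis: the sample string "Àëáåíà" from the question is what the Windows-1251 bytes of a Cyrillic word look like when they are read as ISO-8859-1, so it is worth checking whether the tags are really stored as UTF-8. Below is a rough sketch of that check using iconv. The function name is made up, and the Windows-1251 guess is an assumption that only holds if the tags were written in that code page.

```php
<?php
// Hypothetical helper: undo "Windows-1251 bytes read as ISO-8859-1" mojibake.
// iconv normally returns false when the text has characters outside ISO-8859-1.
function undo_cp1251_mojibake($text)
{
    // Step 1: turn each Latin-1 character back into its original single byte.
    $bytes = @iconv('UTF-8', 'ISO-8859-1', $text);
    if ($bytes === false) {
        return false;
    }
    // Step 2: reinterpret those bytes as Windows-1251 and re-encode as UTF-8.
    return iconv('WINDOWS-1251', 'UTF-8', $bytes);
}

// This file must itself be saved as UTF-8 for the literal below to be correct.
$garbled = 'Àëáåíà';
echo undo_cp1251_mojibake($garbled), "\n";
```

For the string above this yields "Албена". If your own garbled values come out readable the same way, the problem is the tag encoding in the files rather than the database connection.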