PHP uploaded file name slug

I am trying to change the name of an uploaded image. The image file name is in Turkish, like Şömine.jpg, and I am trying to save it as Somine.jpg, but str_replace does not work.

Here is my test code and the results:

$img=pathinfo($_FILES['image']['name'], PATHINFO_FILENAME);
echo $img.PHP_EOL;
$turkce=array("ö","Ş");
$duzgun=array("o","S");
$img=str_replace($turkce,$duzgun,$img);
echo $img.PHP_EOL;

$img1 = "Şömine";
$turkce=array("ö","Ş");
$duzgun=array("o","S");
$img1=str_replace($turkce,$duzgun,$img1);
echo $img1.PHP_EOL;

And the output:

Şömine
Şömine
Somine

Everything is UTF-8 encoded. What can I try to fix it? As you can see, the replacement works fine when I type the text in the source, but it does not work on the uploaded file name. Any ideas?


Relying on a file system to preserve the names of uploaded files, especially names that contain non-ASCII UTF-8 characters, is a bad idea.

A much better approach would be to create a unique hash for every uploaded file and store it inside a database along with the real name of the file.

In other words, if you upload a file called Şömine.jpg, you don't store it under its original name after the upload. Instead, you generate a unique md5 hash for it (in this case ecc3a7d1bdd36b0849ab609857351cd1) and store the file under the name ecc3a7d1bdd36b0849ab609857351cd1.jpg.

After that, you simply add a record to your database indicating that ecc3a7d1bdd36b0849ab609857351cd1 represents a file named "Şömine.jpg".

When you need to retrieve the file, you look up the name in the database and fetch the file whose name contains the corresponding hash. You then send the following header to present the user with the file under its original name.

header("Content-Disposition: attachment; filename=FILENAME_FROM_THE_DATABASE");
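A rough sketch of the scheme described above. The helper names storageName and dispositionHeader are made up for illustration, and hashing the file contents is only one way to get a name (identical uploads would then share one stored file). The filename* parameter is my addition: it is the RFC 6266 way to send a non-ASCII name, with a plain ASCII fallback for older clients. The database insert itself is left out.

```php
<?php
// Hypothetical helper: derive an ASCII-only storage name from the file's bytes.
function storageName(string $originalName, string $contents): string
{
    $ext = strtolower(pathinfo($originalName, PATHINFO_EXTENSION));
    // Keep only a short alphanumeric extension; anything else is dropped.
    $ext = preg_match('/^[a-z0-9]{1,5}$/', $ext) ? '.' . $ext : '';
    return md5($contents) . $ext;
}

// Hypothetical helper: build a Content-Disposition header for a UTF-8 name.
function dispositionHeader(string $originalName): string
{
    // ASCII fallback: every character outside a conservative set becomes "_".
    // preg_replace returns null on invalid UTF-8, hence the default name.
    $fallback = preg_replace('/[^A-Za-z0-9._-]/u', '_', $originalName) ?? 'download';
    return 'Content-Disposition: attachment; filename="' . $fallback . '"; '
         . "filename*=UTF-8''" . rawurlencode($originalName);
}
```

On upload you would move the temporary file to storageName(...) and save the pair (stored name, original name) in the database. On download you would call header(dispositionHeader($nameFromDatabase)) before sending the file.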


There is a more convenient way to deal with character encoding in PHP: mb_convert_encoding. In this case, you could do something like

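$img = pathinfo($_FILES['image']['name'], PATHINFO_FILENAME);
// Note: the 'HTML-ENTITIES' encoding of mbstring is deprecated as of PHP 8.2.
$new_name = mb_convert_encoding($img, 'HTML-ENTITIES', 'UTF-8');
// Reduce each entity to its base letter(s), e.g. &ouml; becomes "oe".
$new_name = preg_replace(
    array('/ß/', '/&(..)lig;/',
          '/&([aouAOU])uml;/', '/&(.)[^;]*;/'),
    array('ss', "$1", "$1".'e', "$1"),
    $new_name); // operate on the entity-encoded string, not on $img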

echo $new_name;
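For comparison, here is a minimal slug sketch that does not go through HTML entities. It rests on an assumption that the question does not confirm: the uploaded name may arrive in decomposed form ("S" followed by a combining cedilla instead of a single "Ş"), which would explain why str_replace matches the literal in the source but not the upload. Comparing bin2hex() of both strings would show whether that is the case. The function name slugFileName and the Turkish mapping table are made up for illustration.

```php
<?php
// Hypothetical helper: reduce a UTF-8 file name (without extension) to ASCII.
function slugFileName(string $name): string
{
    // Precomposed Turkish letters mapped to their ASCII base letters.
    $map = [
        'Ş' => 'S', 'ş' => 's', 'Ö' => 'O', 'ö' => 'o',
        'Ü' => 'U', 'ü' => 'u', 'Ç' => 'C', 'ç' => 'c',
        'Ğ' => 'G', 'ğ' => 'g', 'İ' => 'I', 'ı' => 'i',
    ];
    $name = strtr($name, $map);

    // Drop combining marks left over from decomposed input such as
    // "S" + U+0327. preg_replace returns null on invalid UTF-8.
    $name = preg_replace('/\p{Mn}+/u', '', $name) ?? '';

    // Collapse everything outside a conservative ASCII set into "-".
    $name = preg_replace('/[^A-Za-z0-9._-]+/', '-', $name);
    $name = trim($name, '-.');

    // Never return an empty name.
    return $name !== '' ? $name : 'file';
}
```

With this, slugFileName(pathinfo($_FILES['image']['name'], PATHINFO_FILENAME)) should give the same result for the composed and the decomposed spelling of the name.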


You might want to use this UTF-8 fail-safe method from the Nette Framework: http://api.nette.org/2.0/source-Utils.Strings.php.html#128


  1. First, you must determine the encoding of the file name received from the form. The rule is simple: browsers use the same encoding as the page containing the form, so if the form was UTF-8 encoded, the file name will be UTF-8 encoded too.

  2. Second, if you really want to save the uploaded file to the server's file system under that name, you must convert the name to the encoding of the current locale, as given by the LC_CTYPE category (see setlocale() for details). On Windows, the locale name has the form

    language_country.codepage

where "codepage" is the number of the currently configured Windows code page. Two examples are 1252 (Western European, very similar to ISO-8859-1, aka Latin-1) and 932 (Japanese). You must then convert the file name (say, $fn) from its encoding (say, UTF-8) to the local encoding (say, 1252) before saving the file under that name. Some characters might have no equivalent in the current locale, so you must either signal an error or silently drop them; that is only one of the reasons why saving files under the original name provided by a remote user is always a very bad idea.

More details about PHP support for Unicode file names are available in my reply to PHP bug no. 47096, available at:

https://bugs.php.net/bug.php?id=47096
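The two steps above could be sketched roughly as follows, assuming the mbstring extension is available. The helper names codePageFromLocale and toLocalEncoding are hypothetical, and returning null on a lossy conversion is just one way to "signal an error". As far as I know, PHP 7.1 and later on Windows accept UTF-8 paths directly, so this mainly matters for older setups.

```php
<?php
// Hypothetical helper: map a Windows locale string to a code page name,
// e.g. "Turkish_Turkey.1254" gives "CP1254".
function codePageFromLocale(string $locale): ?string
{
    return preg_match('/\.(\d+)$/', $locale, $m) ? 'CP' . $m[1] : null;
}

// Hypothetical helper: convert a UTF-8 file name to the target encoding,
// or return null if the name is invalid or cannot be converted losslessly.
function toLocalEncoding(string $fn, string $target): ?string
{
    // Step 1: make sure the name really is UTF-8.
    if (!mb_check_encoding($fn, 'UTF-8')) {
        return null;
    }
    // Step 2: convert to the local encoding.
    $local = mb_convert_encoding($fn, $target, 'UTF-8');
    // Unmappable characters get substituted, so a round trip exposes them.
    $back = mb_convert_encoding($local, 'UTF-8', $target);
    return $back === $fn ? $local : null;
}
```

The current locale string can be read with setlocale(LC_CTYPE, '0'), which returns the setting without changing it. A null result from toLocalEncoding means the name should be rejected or replaced rather than written to disk.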
