How to upload files with Chinese names in PHP?

I have a learning portal (LMS) where I upload documents, images, videos, etc. to create content. If the file being uploaded has a Chinese name, the upload does not work: a corrupted file with a junk name ends up on the server instead.

For example, I tried to upload a file named 地球科学.jpg, but on the server it appeared as 地çƒç§‘å­¦.jpg. The uploaded file is also corrupted on the server.

I want the file to be uploaded with the same name on the server, because I want to search for these files and reuse them later when creating content.

FYI: I have an XAMPP server installed on Windows XP, with the Chinese, Korean, and Japanese language packs installed.

Thanks for your answers.


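AFAIK NTFS itself stores file names as Unicode; the trouble is that older PHP versions on Windows pass file names through the system's ANSI code page, so characters outside that code page get mangled. I would suggest storing the file under a generic name.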

For example, you could create a table with two columns, name and file: in name you save the original name, and in file you store something like md5(name).
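A minimal sketch of the md5(name) idea, assuming a hypothetical helper called storageNameFor() (the name and the choice to keep an ASCII extension are mine, not part of the answer above). It only derives the on-disk name; the original name would still go into the name column.

```php
<?php
// Hypothetical helper: derive an ASCII-only storage name from the original
// (UTF-8) file name, so the filesystem never sees non-ASCII characters.
function storageNameFor(string $originalName): string
{
    // '.' is a single ASCII byte and never occurs inside a UTF-8 multibyte
    // sequence, so byte-wise searching is safe here.
    $dot = strrpos($originalName, '.');
    $ext = $dot === false ? '' : strtolower(substr($originalName, $dot + 1));

    // Keep the extension only when it is plain ASCII letters/digits.
    if ($ext !== '' && !preg_match('/^[a-z0-9]+$/', $ext)) {
        $ext = '';
    }

    $hash = md5($originalName);
    return $ext === '' ? $hash : $hash . '.' . $ext;
}

$stored = storageNameFor('地球科学.jpg');
```

Note that md5(name) gives the same result for two uploads with the same original name, so in practice you may prefer to hash the name together with a timestamp or use the table's auto-increment id.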


If you need the name in order to search for it, use a database to store the name and the file location, and save the file using your own naming convention.

Example

// sql entry 
original name = 地球科学.jpg
path = /some/place/1.jpg

When you search, you use the db to look up a given file name and its location. Separating storage logic from naming is common when building image storage solutions, not only because of naming problems but also because of limitations and speed considerations around the number of files that accumulate in a folder.
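The lookup described above could be sketched roughly as follows. This assumes the pdo_sqlite extension and an in-memory database purely for illustration; the table name files, its columns, and the functions registerFile() and findFiles() are hypothetical, and a real LMS would use its existing database.

```php
<?php
// Assumption: pdo_sqlite is available. Table and column names are illustrative.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec(
    'CREATE TABLE files (
        id INTEGER PRIMARY KEY,
        original_name TEXT NOT NULL,
        path TEXT NOT NULL
    )'
);

// Record the original (UTF-8) name together with the generic on-disk path.
function registerFile(PDO $db, string $originalName, string $path): void
{
    $stmt = $db->prepare('INSERT INTO files (original_name, path) VALUES (?, ?)');
    $stmt->execute([$originalName, $path]);
}

// Find files whose original name contains the search term.
// Note: % and _ in $term act as LIKE wildcards; they are not escaped here.
function findFiles(PDO $db, string $term): array
{
    $stmt = $db->prepare(
        'SELECT original_name, path FROM files WHERE original_name LIKE ?'
    );
    $stmt->execute(['%' . $term . '%']);
    return $stmt->fetchAll(PDO::FETCH_ASSOC);
}

registerFile($db, '地球科学.jpg', '/some/place/1.jpg');
$hits = findFiles($db, '科学');
```

With MySQL, the connection and the table would both need a UTF-8 character set for the Chinese names to survive the round trip.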


Use iconv or mb_convert_encoding to convert the file name from UTF-8 to the encoding the server's filesystem functions expect.

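// Build the destination path from the original (UTF-8) file name
$target_path = "uploadfiles/";
$target_path .= $_FILES['fileField']['name'];

// The target encoding must match the server's code page: "big5" suits
// Traditional Chinese Windows; Simplified Chinese Windows normally uses "GBK".
// Use ONE of the two calls below, not both.

// Option 1: iconv()
move_uploaded_file($_FILES['fileField']['tmp_name'], iconv("UTF-8", "big5", $target_path));
// Option 2: mb_convert_encoding()
move_uploaded_file($_FILES['fileField']['tmp_name'], mb_convert_encoding($target_path, "big5", "UTF-8"));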
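Since a conversion like this can fail when the name contains characters the target encoding lacks, here is a hedged sketch of a guard around iconv(). The helper name toFilesystemName() is hypothetical, and "GBK" is only my assumption for a Simplified Chinese server; substitute whatever code page your server actually uses.

```php
<?php
// Hypothetical helper: convert a UTF-8 file name to the filesystem encoding,
// returning null when the name cannot be represented without loss.
function toFilesystemName(string $utf8Name, string $fsEncoding): ?string
{
    // @ suppresses the notice iconv raises on unconvertible characters.
    $converted = @iconv('UTF-8', $fsEncoding, $utf8Name);
    if ($converted === false) {
        return null;
    }

    // Round-trip check: if converting back does not reproduce the input,
    // some character was dropped or substituted along the way.
    $back = @iconv($fsEncoding, 'UTF-8', $converted);
    return $back === $utf8Name ? $converted : null;
}

$fsName = toFilesystemName('地球科学.jpg', 'GBK');
```

When the helper returns null, falling back to a generic stored name (and keeping the original name in a database) avoids writing a mangled file name.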


Make sure the page displaying the form is served as UTF-8; usually this does the job. You can also use the accept-charset attribute of the form element to indicate the charset in which the posted data should be sent.
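As a rough illustration of that setup, the sketch below builds a UTF-8 upload form with accept-charset set and checks that a received file name is valid UTF-8. The function names renderUploadForm() and isValidUtf8Name() and the upload.php target are hypothetical; the field name fileField is taken from the other answer.

```php
<?php
// Hypothetical helper: build an upload form that declares UTF-8 throughout.
function renderUploadForm(string $action): string
{
    $action = htmlspecialchars($action, ENT_QUOTES, 'UTF-8');
    return <<<HTML
<!DOCTYPE html>
<html>
<head><meta charset="UTF-8"><title>Upload</title></head>
<body>
<form action="{$action}" method="post" enctype="multipart/form-data" accept-charset="UTF-8">
  <input type="file" name="fileField">
  <input type="submit" value="Upload">
</form>
</body>
</html>
HTML;
}

// Hypothetical helper: true when the submitted name is well-formed UTF-8.
// preg_match() with the /u modifier returns false on invalid UTF-8 input.
function isValidUtf8Name(string $name): bool
{
    return preg_match('//u', $name) === 1;
}
```

To serve the page, something like header('Content-Type: text/html; charset=UTF-8'); followed by echo renderUploadForm('upload.php'); should do, and the receiving script can call isValidUtf8Name($_FILES['fileField']['name']) before using the name.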

Not sure if this all will do the job, let me know.


I think you might want to use some kind of database solution, especially since you need to search for files later on. With a database, a search is a query on the stored names rather than a scan of the upload directory, which avoids that I/O overhead.


I think you must first find out which character set the file name is in before you can work out how to handle the upload. I'm afraid I'm not too familiar with non-European character sets and don't know which are most widely used.

UTF-8 should be a safe bet to handle almost whatever you care to throw at it. There's some relevant information that could be useful in terms of configuring your application in a post I wrote recently on my blog: How to Avoid Character Encoding Problems in PHP
