Regex for UTF-8 filename validation
I have been googling for a couple of hours but couldn't find a good PHP regex for validating UTF-8 file names. I have tried many of them and can paste them here if needed. A file name may include German or other non-ASCII characters, but not invalid ones such as /. Do you have any ideas?
http://php.net/manual/en/regexp.reference.unicode.php
One alternative I've always found very elegant is urlencode()ing the file names.
That takes away the need to blacklist characters, as it creates file names that work on every file system; showing the real file name is trivial using urldecode().
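A minimal sketch of the urlencode() approach, assuming the script itself is saved as UTF-8. The helper names encodeFileName and decodeFileName are hypothetical and only there for illustration:

```php
<?php
// Hypothetical helpers, not part of the original answer.

// Turn an arbitrary user-supplied name into one that is safe to store on disk.
// urlencode() percent-encodes every byte except ASCII letters, digits, '-', '_' and '.'.
function encodeFileName(string $name): string
{
    return urlencode($name);
}

// Recover the original name for display.
function decodeFileName(string $stored): string
{
    return urldecode($stored);
}

$original = "Größe/Übersicht.txt";
$stored   = encodeFileName($original);  // contains no '/' and no raw non-ASCII bytes
$shown    = decodeFileName($stored);    // identical to $original
```

Two caveats with this sketch: the names "." and ".." pass through urlencode() unchanged and would still need to be rejected separately, and encoded names are longer than the originals, which matters on file systems with a name length limit.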
Try ruling out invalid ones? ^[^/etc]+$ or some such (replace etc with other characters you don't like).
Not sure if you actually need regex for the task.
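The negated character class suggested above could be sketched with preg_match and the u modifier, which makes PCRE treat both pattern and subject as UTF-8. The function name isValidFileName and the exact blacklist are assumptions, so adjust the list to the characters your target file system rejects:

```php
<?php
// Hypothetical helper; the blacklist below is an assumption, not a complete rule set.
function isValidFileName(string $name): bool
{
    // "." and ".." are directory references, not usable file names.
    if ($name === '.' || $name === '..') {
        return false;
    }

    // Modifiers:
    //   u: pattern and subject are UTF-8; preg_match() returns false for invalid UTF-8
    //   D: "$" matches only at the very end, not before a trailing newline
    // Blacklisted: / \ : * ? " < > | and control characters (\p{Cc}).
    return preg_match('~^[^/\\\\:*?"<>|\p{Cc}]+$~uD', $name) === 1;
}
```

Comparing against 1 treats both "no match" (0) and "invalid UTF-8" (false) as a rejected name, so a malformed byte sequence does not slip through.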
I'm not sure if you're trying to validate whether a file is UTF-8 or asking how to write a UTF-8 regex. For a UTF-8 regex, you can use the mbstring series of functions: first set mb_regex_encoding to UTF-8, then use mb_ereg to do the regular expression match. If you want to test whether a file is encoded as UTF-8, you can run mb_detect_encoding on the file contents and see if it reports UTF-8.
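Both uses of mbstring described above might look roughly like this, assuming the mbstring extension is loaded. The function names and the blacklist are hypothetical. Note that mb_ereg takes its pattern without delimiters, and that the sketch adds an mb_check_encoding call (not mentioned in the answer) to reject malformed input before matching:

```php
<?php
// Tell the mb_ereg* functions to interpret patterns and subjects as UTF-8.
mb_regex_encoding('UTF-8');

// Hypothetical helper: validate a file name with a multibyte-aware regex.
function isValidFileNameMb(string $name): bool
{
    // Reject byte sequences that are not well-formed UTF-8 up front.
    if (!mb_check_encoding($name, 'UTF-8')) {
        return false;
    }
    // \A and \z anchor to the whole string; the class blacklists / \ : * ? " < > |
    return mb_ereg('\A[^/\\\\:*?"<>|]+\z', $name) !== false;
}

// Hypothetical helper: check whether file contents are valid UTF-8.
// Strict mode makes mb_detect_encoding() return false instead of a best guess.
function looksLikeUtf8(string $contents): bool
{
    return mb_detect_encoding($contents, 'UTF-8', true) === 'UTF-8';
}
```

For real files, looksLikeUtf8 would be called on the result of file_get_contents(). Plain ASCII content also passes, since ASCII is a subset of UTF-8.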