How to write a regexp to filter out non-English characters
I have a bunch of files. Their names all follow one of these two patterns:
english words number.extension
or
english words Characters.extension (where "Characters" means Chinese, Japanese, Korean, etc.)
How can I write a regexp that removes the numbers and the non-English characters,
so that the names become
english words.extension
Thanks
For just the 26 English letters you could use /[^A-Za-z]/ or /[^a-z]/i. I don't know which programming language you're using, so I can't give a more specific example.
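Since the language isn't stated, here is a rough sketch in shell with sed (my assumption, not something the question specifies). The helper name `strip_non_letters` is made up for illustration, and `LC_ALL=C` is set so that the ranges cover only the ASCII letters.

```shell
#!/bin/sh
# Hypothetical helper: delete every byte that is not an ASCII letter.
strip_non_letters() {
    printf '%s\n' "$1" | LC_ALL=C sed 's/[^A-Za-z]//g'
}

strip_non_letters 'english words 123.txt'   # prints: englishwordstxt
```

Note that the bare letter class also strips the spaces and the dot, so for filenames those characters would have to be added to the class.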
If you don't mind being a bit verbose, you can make an explicit list of 'acceptable' characters and reject anything not on the list. For example:
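ls | while IFS= read -r old_filename; do
    # Keep ASCII letters, dots, underscores, spaces and hyphens; delete everything else.
    # The hyphen goes last so it is literal ('.-_' would be a range that includes the digits).
    new_filename=$(printf '%s\n' "$old_filename" | LC_ALL=C sed -e 's/[^a-zA-Z._ -]//g')
    # Skip names that would be unchanged or empty.
    [ -n "$new_filename" ] && [ "$new_filename" != "$old_filename" ] || continue
    mv -- "$old_filename" "$new_filename"
done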
If the 'A-Z' and similar character ranges pick up characters that you don't want (this may or may not be an issue, depending on your locale), you can always list every letter individually.
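Listing the letters individually could look something like this. It is only a sketch: the variable `allowed` and the function `keep_allowed` are names invented for the example.

```shell
#!/bin/sh
# Every allowed character is spelled out, so no range can pull in extra
# letters through the locale's collation order. The hyphen is last so
# that it is taken literally inside the bracket expression.
allowed='abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ._ -'

# Hypothetical helper: delete every character that is not in $allowed.
keep_allowed() {
    printf '%s\n' "$1" | sed "s/[^$allowed]//g"
}

keep_allowed 'my file 99.txt'   # prints: my file .txt
```

The space left in front of the dot is what remains after the digits are deleted; it can be trimmed in a second substitution if it matters.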
Adjust the 'ls' call if you only want to pick up certain files in the directory (filter by extension, etc). You will run into problems if more than one file transforms into the same 'English-only' name, but you should be able to work around that by appending an extra character to the filename.
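One way the name collisions could be worked around is sketched below, under a few assumptions of my own: a numeric suffix such as `_1` is appended before the extension, leftover spaces before the extension are trimmed, and the files are picked up with a `*` glob rather than `ls`. The function names `clean_name`, `unique_name` and `rename_all` are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: strip disallowed characters, then drop spaces left
# dangling before the extension ("words .txt" -> "words.txt") or at the end.
clean_name() {
    printf '%s\n' "$1" |
        LC_ALL=C sed -e 's/[^a-zA-Z._ -]//g' \
                     -e 's/ *\(\.[^.]*\)$/\1/' \
                     -e 's/ *$//'
}

# Hypothetical helper: if the name is already taken, append _1, _2, ...
# before the extension until an unused name is found.
unique_name() {
    candidate=$1
    case $candidate in
        *.*) stem=${candidate%.*}; ext=.${candidate##*.} ;;
        *)   stem=$candidate;      ext= ;;
    esac
    n=1
    while [ -e "$candidate" ]; do
        candidate=${stem}_${n}${ext}
        n=$((n + 1))
    done
    printf '%s\n' "$candidate"
}

# Rename every regular file in the current directory.
rename_all() {
    for old in *; do
        [ -f "$old" ] || continue
        new=$(clean_name "$old")
        # Skip when nothing but the extension would be left, or nothing changes.
        case $new in ''|.*) continue ;; esac
        [ "$new" != "$old" ] || continue
        mv -- "$old" "$(unique_name "$new")"
    done
}
```

Calling `rename_all` from the directory to clean would turn "report 1.txt" and "report 2.txt" into "report.txt" and "report_1.txt". It is worth trying on a copy of the files first.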