Problem with Cyrillic symbols in console
Sorry for my bad English. This is Ruby code:
s = "мистика"
`touch #{s}`
`cat #{s}`
`cat < #{s}`
Can anybody tell me why this code fails? It fails with:
sh: cannot open ми�тика: No such file
But this code works fine:
s = "работает"
`touch #{s}`
`cat #{s}`
`cat < #{s}`
The problem occurs only when the Russian letter 'с' is in the word and the '<' symbol is used.
woto@woto-work:/tmp$ locale
LANG=ru_RU.UTF-8
LC_CTYPE="ru_RU.UTF-8"
LC_NUMERIC="ru_RU.UTF-8"
LC_TIME="ru_RU.UTF-8"
LC_COLLATE="ru_RU.UTF-8"
LC_MONETARY="ru_RU.UTF-8"
LC_MESSAGES="ru_RU.UTF-8"
LC_PAPER="ru_RU.UTF-8"
LC_NAME="ru_RU.UTF-8"
LC_ADDRESS="ru_RU.UTF-8"
LC_TELEPHONE="ru_RU.UTF-8"
LC_MEASUREMENT="ru_RU.UTF-8"
LC_IDENTIFICATION="ru_RU.UTF-8"
LC_ALL=
woto@woto-work:/tmp$ ruby -v
ruby 1.8.7 (2010-01-10 patchlevel 249) [x86_64-linux]
woto@woto-work:/tmp$ uname -a
Linux woto-work 2.6.32-26-generic #48-Ubuntu SMP Wed Nov 24 10:14:11 UTC 2010 x86_64 GNU/Linux
woto@woto-work:/tmp$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 10.04.1 LTS
Release: 10.04
Codename: lucid
Another example; maybe it will also be useful for understanding my problem:
woto@woto-work:~/rails/avtorif$ touch мистика
woto@woto-work:~/rails/avtorif$ ruby -e "`cat < мистика`"
woto@woto-work:~/rails/avtorif$ ruby -e '`cat < мистика`'
sh: cannot open ми�тика: No such file
This is a bug in dash, the shell that Debian and Ubuntu use by default (the symlink /bin/sh leads to /bin/dash; Python's os.system uses sh, and Ruby probably uses sh too). dash cannot properly parse 8-bit text, including UTF-8. To work around your problem, replace it with bash:
sudo dpkg-reconfigure dash
and select "No". This way the system will use bash as the /bin/sh shell, which can handle UTF-8.
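If changing /bin/sh system-wide is not an option, another way around it is to keep the shell out of the picture entirely. The following is only a sketch, assuming Ruby 1.9 or later (the array form of IO.popen is not available in 1.8.7); the helper names cat_without_shell and read_directly are made up for illustration:

```ruby
# encoding: utf-8
require "tmpdir"

# Hypothetical helper: run `cat` on a file without any shell in between.
# The array form of IO.popen (Ruby 1.9 and later) executes the program
# directly, so /bin/sh never gets to parse the file name.
def cat_without_shell(path)
  IO.popen(["cat", path]) { |io| io.read }
end

# Hypothetical helper: for plain reading no external command is needed.
def read_directly(path)
  File.read(path)
end

# Demo in a throwaway directory so nothing is left behind.
Dir.mktmpdir do |dir|
  path = File.join(dir, "мистика")
  File.open(path, "w") { |f| f.write("hello\n") }
  puts cat_without_shell(path)
  puts read_directly(path)
end
```

Backticks with a single string go through /bin/sh whenever the command contains shell metacharacters such as '<', which is why the redirect form is the one that hits dash. Passing the program and its arguments separately avoids that step.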
The following works for me, have you tried it this way?
s="мистика"
touch $s
In bash you reference a variable by prepending a dollar sign.
In each of your examples, you are executing a shell command. As a first step, I would make sure that your shell command executes as you expect it to when you type it in directly:
touch мистика
cat мистика
cat < мистика
If you are getting errors in the shell, there are two possibilities: the shell doesn't understand the character encoding, or the filename needs quotes to distinguish it appropriately.
Ruby 1.9 understands character set encodings, something that Ruby 1.8 did not. You'll have to do a little research to determine what character encoding your shell environment uses. Once you do, you'll create the commands as regular strings:
touch = "touch #{s}".force_encoding("UTF-8") ## or whatever encoding you need
and then execute the command:
`#{touch}`
With a UTF-8 locale such as yours, Ruby 1.9 takes UTF-8 as its default external encoding (source files default to US-ASCII unless a magic comment says otherwise). Ruby 1.8 has no concept of encoding and a string is merely an array of bytes. Unfortunately, not every piece of software understands Unicode or the concepts of character encoding (much like Ruby 1.8). In those cases the system will use whatever the default encoding is. I suspect your shell environment may be one of those programs.
Use Ruby 1.9; its String class has a force_encoding method.
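To see what "a string is merely an array of bytes" means for the letter in question, here is a small sketch, assuming Ruby 1.9 or later and a UTF-8 source file; hex_bytes is a made-up helper name. It only shows how 'с' is encoded; it does not prove which program mishandles those bytes.

```ruby
# encoding: utf-8

# Hypothetical helper: list the bytes of a string as upper-case hex.
def hex_bytes(str)
  str.bytes.map { |b| format("%02X", b) }.join(" ")
end

s = "с"              # Cyrillic small letter es, U+0441

puts s.encoding      # UTF-8
puts s.length        # 1 (one character)
puts hex_bytes(s)    # D1 81 (two bytes in UTF-8)

# force_encoding only relabels the bytes; it does not convert them.
raw = s.dup.force_encoding("BINARY")
puts raw.length      # 2 (the same two bytes, now counted individually)
```

The '�' in the error message is consistent with one of those two bytes getting lost somewhere between Ruby and the shell, which leaves an invalid UTF-8 sequence behind.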