Why is String#index returning nil here?
On the last two lines:
$ ruby -v
ruby 1.9.1p378 (2010-01-10 revision 26273) [i386-darwin10]
$ irb
irb(main):001:0> def t(str)
irb(main):开发者_高级运维002:1> str.index str
irb(main):003:1> end
=> nil
irb(main):004:0> t 'abc'
=> 0
irb(main):005:0> t "\x01\x11\xfe"
=> nil
irb(main):006:0> t "\x01\x11\xfe".force_encoding(Encoding::UTF_8)
=> nil
Why does str.index str
return nil?
"\x01\x11\xfe" is not valid UTF8.
If you call t "\x01\x11\xfe".force_encoding(Encoding::BINARY)
, you'll get the expected 0.
The behavior is different on my older Ruby:
$ ruby -v ; irb
ruby 1.8.7 (2010-01-10 patchlevel 249) [x86_64-linux]
irb(main):001:0> def t(str)
irb(main):002:1> str.index str
irb(main):003:1> end
=> nil
irb(main):004:0> t 'abc'
=> 0
irb(main):005:0> t "\x01\x11\xfe"
=> 0
I suspect it has something to do with the difference between single-quotes and double-quotes with string literals in ruby:
ruby-1.9.1-p378 > def t(str) ; str.index(str) ; end
=> nil
ruby-1.9.1-p378 > t 'abc'
=> 0
ruby-1.9.1-p378 > t "\x01\x11\xfe"
=> nil
ruby-1.9.1-p378 > t '\x01\x11\xfe'
=> 0
The short answer is that using single-quotes does minimal text processing, but double-quotes allows interpolation, character escaping, and a few other things.
Some examples:
#interpolation
ruby-1.9.1-p378 > x = 5 ; 'number: #{x}'
=> "number: \#{x}"
ruby-1.9.1-p378 > x = 5 ; "number: #{x}"
=> "number: 5"
#character escaping
ruby-1.9.1-p378 > puts 'tab\tseparated'
tab\tseparated
=> nil
ruby-1.9.1-p378 > puts "tab\tseparated"
tab separated
=> nil
#hex characters
ruby-1.9.1-p378 > puts '\x01\x11\xfe'
\x01\x11\xfe
=> nil
ruby-1.9.1-p378 > puts "\x01\x11\xfe"
�
=> nil
I'm sure someone can explain better why this happens, this is just what I've experienced in my rubying.
精彩评论