incompatible character encodings: UTF-8 and ASCII-8BIT Ruby 1.9
I recently upgraded to Ruby 1.9.2, and one of my monkey patches now fails with an encoding error. I have the following method:
def strip_noise()
  return if (!self) || (self.size == 0)
  self.delete(160.chr+194.chr).gsub(/[,]/, "").strip
end
That now gives me the following error:
incompatible character encodings: UTF-8 and ASCII-8BIT
Has anyone else come across this?
This is working for me at the moment anyway:
class String
  def strip_noise()
    return if empty?
    self.mb_chars.normalize(:kd).gsub(/[^\x00-\x7F]/n,'')
  end
end
I need to do more testing, but this lets me make progress for now.
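For readers without ActiveSupport, a rough plain-Ruby equivalent is sketched below. It assumes Ruby 2.2 or later, where String#unicode_normalize exists, and swaps mb_chars.normalize(:kd) for unicode_normalize(:nfkd). The helper name strip_non_ascii is made up for illustration and is not part of the original answer.

```ruby
# Hypothetical helper (not from the original answer): decompose accented
# characters, then drop whatever is still outside 7-bit ASCII.
def strip_non_ascii(str)
  return str if str.empty?
  # :nfkd splits "é" into "e" followed by a combining accent (U+0301).
  decomposed = str.unicode_normalize(:nfkd)
  # Remove every character outside the ASCII range, including the accent.
  decomposed.gsub(/[^\x00-\x7F]/, "")
end

cleaned = strip_non_ascii("caf\u00E9 ol\u00E9")  # "cafe ole"
```

Because the accents are decomposed before filtering, base letters survive instead of disappearing along with their diacritics.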
class String
  def strip_noise
    return if empty?
    ActiveSupport::Inflector.transliterate self, ''
  end
end
"#{160.chr}#{197.chr} string with noises" # => "\xA0\xC5 string with noises"
"#{160.chr}#{197.chr} string with noises".strip_noise # => "A string with noises"
This might not be exactly what you want:
def strip_noise
  return if empty?
  sub = 160.chr.force_encoding(encoding) + 194.chr.force_encoding(encoding)
  delete(sub).gsub(/[,]/, "").strip
end
Read more on the topic here: http://yehudakatz.com/2010/05/17/encodings-unabridged/
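The error from the question can be reproduced with a small sketch like the one below. It assumes Encoding.default_internal is not set, so that 160.chr and 194.chr come back tagged as ASCII-8BIT; the variable names are illustrative only.

```ruby
# 160.chr and 194.chr are single raw bytes, so Ruby tags them ASCII-8BIT
# (assuming Encoding.default_internal is not set).
noise = 160.chr + 194.chr

# A UTF-8 string that contains at least one non-ASCII character.
text = "caf\u00E9, 1,000"

# Mixing non-ASCII UTF-8 with non-ASCII binary data is what raises.
error = begin
  text.delete(noise)
  nil
rescue Encoding::CompatibilityError => e
  e
end

# An ASCII-only receiver is compatible with any ASCII-compatible encoding,
# so the same call goes through without an error.
ascii_result = "1,000".delete(noise)
```

This would explain why the monkey patch only fails on some inputs: the exception appears once the receiver itself contains non-ASCII characters.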
It's not entirely clear what you're trying to do here, but 160.chr+194.chr is not valid UTF-8: 160 (0xA0) is a continuation byte, and 194 (0xC2) is the first byte of a 2-byte character. Reversed, they form the UTF-8 encoding of the Unicode character "no-break space" (U+00A0).
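The byte-order point can be checked directly. This is a minimal sketch that builds both byte sequences with Array#pack and asks Ruby whether each is valid UTF-8; the variable names are illustrative.

```ruby
# Bytes in the order used in the question: 0xA0 then 0xC2.
question_order = [160, 194].pack("C*").force_encoding(Encoding::UTF_8)
# The same bytes reversed: 0xC2 then 0xA0.
reversed_order = [194, 160].pack("C*").force_encoding(Encoding::UTF_8)

question_valid = question_order.valid_encoding?  # false
reversed_valid = reversed_order.valid_encoding?  # true
is_nbsp = (reversed_order == "\u00A0")           # true
```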
If you want to remove all non-ASCII characters (anything outside the 7-bit range), try this:
s.delete!("^\u{0000}-\u{007F}")
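As a quick illustration of that call, the sketch below uses the non-destructive String#delete with the same negated range, and also shows that delete! returns nil when nothing is removed, which matters if the result is chained. The sample strings are made up.

```ruby
text = "caf\u00E9,\u00A0noise"

# The leading ^ negates the set: delete everything NOT in U+0000..U+007F.
ascii_only = text.delete("^\u{0000}-\u{007F}")  # "caf,noise"

# delete! returns nil when the string had nothing to remove,
# so avoid chaining further calls directly onto it.
unchanged = "plain".dup.delete!("^\u{0000}-\u{007F}")  # nil
```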