How to parse a string of fullwidth integer characters to an integer in Ruby

How can I parse a string of fullwidth Unicode digit characters to an integer in Ruby?

Attempting the obvious results in:

irb(main):011:0> a = "\uff11"
=> "1"
irb(main):012:0> Integer(a)
ArgumentError: invalid value for Integer: "\xEF\xBC\x91"
      from (irb):12:in `Integer'
      from (irb):12
      from /export/home/henry/apps/bin/irb:12:in `<main>'
irb(main):013:0> a.to_i
=> 0

The equivalent in Python gives:

>>> a = u"\uff11"
>>> print a
1
>>> int(a)
1


Ruby 1.9's numeric parsing only understands ASCII digits. I don't think there is any convenient, elegant parsing method that properly handles fullwidth Unicode digit code points.

A quick filthy hack function:

def parse_utf(utf_integer_string)
  ascii_numeric_chars = "0123456789"
  utf_numeric_chars = "\uff10\uff11\uff12\uff13\uff14\uff15\uff16\uff17\uff18\uff19"
  utf_integer_string.tr(utf_numeric_chars, ascii_numeric_chars).to_i
end

Pass in a string of fullwidth numeric characters and get out an integer.
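
A slightly stricter variant of the same idea, as a sketch: it uses character ranges in tr instead of spelling out every digit, and Integer() instead of to_i so that leftover non-digit input raises rather than silently becoming 0. The name parse_fullwidth_integer is made up for this example.

```ruby
# Hypothetical helper name; a sketch of the tr approach with stricter parsing.
def parse_fullwidth_integer(str)
  # Map fullwidth digits U+FF10..U+FF19 onto ASCII "0".."9" using tr ranges.
  ascii = str.tr("\uFF10-\uFF19", "0-9")
  # Integer() with an explicit base of 10 raises ArgumentError on input that
  # is not a valid decimal number, unlike to_i, which returns 0.
  Integer(ascii, 10)
end

parse_fullwidth_integer("\uFF11\uFF12\uFF13") # => 123
```

Plain ASCII digits pass through unchanged, so mixed input such as "1\uFF12" still parses.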


Convert 'compatibility' characters such as the fullwidth forms to their normalized equivalents (plain ASCII digits in this case) before parsing as an integer, for example with Unicode::normalize_KC or UnicodeUtils::nfkc.
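
On Ruby 2.2 and later the same normalization is available in core as String#unicode_normalize, so no gem is needed. A minimal sketch, assuming a UTF-8 input string (the method name parse_nfkc_integer is made up for this example):

```ruby
# Hypothetical helper name; relies on String#unicode_normalize (Ruby 2.2+).
def parse_nfkc_integer(str)
  # NFKC folds compatibility characters, including fullwidth digits,
  # into their canonical equivalents (ASCII digits here).
  normalized = str.unicode_normalize(:nfkc)
  # Parse strictly as base 10; raises ArgumentError on non-numeric input.
  Integer(normalized, 10)
end

parse_nfkc_integer("\uFF11\uFF12\uFF13") # => 123
```

Note that NFKC folds more than the fullwidth forms (circled and superscript digits also become plain digits), which may or may not be what you want.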
