Character indices of a string containing unicode characters

I'm linkifying @mentions in status messages returned by Twitter's API.

One of the tweets has a unicode character in it. Parsing the JSON (with either the json gem's JSON.parse or ActiveSupport::JSON.decode) returns a string that displays correctly, but the indices for the start and end of the @mention specified by the entity don't match up with the parsed string.

How can I transform the unicode string in Ruby such that the indices of a character behave as expected (e.g., they treat the unicode character as a single character)?

The text of the tweet is:

Thanks! RT @Apigee Have an API? Consider adding a method for simulating errors\u2014an excellent idea from @andrewacove: http://bit.ly/aupTLp ^MG


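If you are using Ruby on Rails, you can use string.mb_chars, which wraps the string in a multibyte-aware proxy so that methods such as length and slicing count characters rather than bytes (for example, string.mb_chars.length). See: http://api.rubyonrails.org/classes/ActiveSupport/CoreExtensions/String/Multibyte.html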
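To illustrate the character-versus-byte distinction without Rails, here is a minimal sketch in plain Ruby. It assumes Ruby 2.0 or later (for String#b) and a UTF-8 string, in which case String#index and String#[] already count characters. The linkify helper and the twitter.com URL pattern are hypothetical, made up for this example, and the start/stop arguments stand in for the indices pair that the entity provides.

```ruby
# Tweet text from the question; "\u2014" is an em dash (1 character, 3 bytes in UTF-8).
text = "Thanks! RT @Apigee Have an API? Consider adding a method for " \
       "simulating errors\u2014an excellent idea from @andrewacove: " \
       "http://bit.ly/aupTLp ^MG"

mention = "@andrewacove"

# Character offset: what String#index returns for a UTF-8 string.
char_start = text.index(mention)

# Byte offset: search binary (ASCII-8BIT) copies so that positions count bytes.
byte_start = text.b.index(mention.b)

# The em dash comes before the mention, so the byte offset is 2 larger
# than the character offset (3 bytes for 1 character).
offset_gap = byte_start - char_start

# Hypothetical helper: wrap the characters in [start, stop) in a link.
# Working on an array of characters keeps the indices character-based.
def linkify(text, start, stop)
  chars = text.chars.to_a
  name  = chars[start...stop].join
  url   = "http://twitter.com/#{name.delete('@')}" # assumed URL pattern
  chars[0...start].join + %(<a href="#{url}">#{name}</a>) + chars[stop..-1].join
end

linked = linkify(text, char_start, char_start + mention.length)
puts linked
```

If the indices you get only line up after adding the offset gap, the string is probably being indexed by bytes (as Ruby 1.8 does), which is the situation mb_chars is meant to handle.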
