Character indices of a string containing unicode characters
I'm linkifying @mentions in status messages returned by Twitter's API.
One of the tweets has a unicode character in it. Parsing the JSON (with either the json gem's JSON.parse or ActiveSupport::JSON.decode) returns a string that displays correctly, but the indices for the start and end of the @mention specified by the entity don't match up with the parsed string.
How can I transform the unicode string in Ruby such that the indices of a character behave as expected (e.g., they treat the unicode character as a single character)?
The text of the tweet is:
Thanks! RT @Apigee Have an API? Consider adding a method for simulating errors\u2014an excellent idea from @andrewacove: http://bit.ly/aupTLp ^MG
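If you are using Ruby on Rails, you can use string.mb_chars, which treats each Unicode character as a single character rather than as a run of bytes. For example, string.mb_chars.length returns the number of characters, not the number of bytes. See: http://api.rubyonrails.org/classes/ActiveSupport/CoreExtensions/String/Multibyte.html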
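Outside Rails, the same idea can be sketched in plain Ruby by decoding the string into code points, working on that array, and packing it back. This is only a rough illustration: char_slice and linkify_mentions are hypothetical helper names, the shortened sample text and its indices are made up for the example, the link URL format is a guess, and it assumes the entity indices count Unicode code points and that each mention is a hash with "screen_name" and "indices" keys.

```ruby
# -*- coding: utf-8 -*-

# Hypothetical helper: slice by character (code point) index, not by byte.
def char_slice(text, first, last)
  # unpack('U*') decodes UTF-8 into an array with one integer per character.
  text.unpack('U*')[first...last].pack('U*')
end

# Hypothetical helper: replace each mention with a link.
# Assumes mentions look like { 'screen_name' => ..., 'indices' => [start, end] }.
def linkify_mentions(text, mentions)
  codepoints = text.unpack('U*')
  # Go from the last mention to the first so earlier indices stay valid.
  mentions.sort_by { |m| -m['indices'][0] }.each do |m|
    first, last = m['indices']
    name = m['screen_name']
    # Assumed URL format, no HTML escaping; adjust for your application.
    link = %(<a href="https://twitter.com/#{name}">@#{name}</a>)
    codepoints[first...last] = link.unpack('U*')
  end
  codepoints.pack('U*')
end

# Shortened sample text with an em dash (three bytes in UTF-8) before the mention.
text = "errors\u2014an idea from @andrewacove: ok"
mentions = [{ 'screen_name' => 'andrewacove', 'indices' => [20, 32] }]

puts text.length    # number of characters
puts text.bytesize  # number of bytes, larger because of the em dash
puts char_slice(text, 20, 32)
puts linkify_mentions(text, mentions)
```

On Ruby 1.9 and later, a string tagged as UTF-8 already indexes by character, so text[20...32] would give the same slice; the unpack/pack round trip is mainly useful when strings are being treated as raw bytes.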