
Finding Unicode character name with JavaScript

I need to find out the name of a Unicode character when the user enters its code point number. For example, entering 0041 should give "Latin Capital Letter A" as the result.


As far as I know, there isn't a standard way to do this. You could probably parse the UnicodeData.txt file to get this information.
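A minimal sketch of that idea, assuming you have already loaded the text of UnicodeData.txt into a string. Each line of that file is a list of semicolon-separated fields, where the first field is the code point in hex and the second is the character name. The helper names parseUnicodeData and lookupName are made up for this example, and the two sample lines stand in for the full file.

```javascript
// Build a code point -> name table from the text of UnicodeData.txt.
// parseUnicodeData and lookupName are hypothetical helper names.
function parseUnicodeData(text) {
    var table = {};
    var lines = text.split("\n");
    for (var i = 0; i < lines.length; i++) {
        var fields = lines[i].trim().split(";");
        if (fields.length < 2) continue; // skip blank or malformed lines
        table[fields[0].toUpperCase()] = fields[1]; // field 0 = hex code point, field 1 = name
    }
    return table;
}

// Normalize user input such as "41" or "0x0041" to the file's
// uppercase, at-least-four-digit form before looking it up.
function lookupName(table, input) {
    var cp = parseInt(input, 16);
    if (isNaN(cp)) return undefined;
    var key = cp.toString(16).toUpperCase();
    while (key.length < 4) key = "0" + key;
    return table[key];
}

// Two sample lines in the UnicodeData.txt format (stand-in for the whole file).
var sample =
    "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;\n" +
    "00C1;LATIN CAPITAL LETTER A WITH ACUTE;Lu;0;L;0041 0301;;;;N;LATIN CAPITAL LETTER A ACUTE;;;00E1;";

var table = parseUnicodeData(sample);
console.log(lookupName(table, "0041")); // LATIN CAPITAL LETTER A
```

Note that control characters appear in the file with the placeholder name <control>, and some large blocks are given only as First/Last range markers, so a real lookup would need extra handling for those.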


This should be what you're looking for. The first string is simply the contents of http://unicode.org/Public/UNIDATA/Index.txt with the newlines replaced by |:

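// this mess: each entry is "name<TAB>code point", entries are joined with "|"
var unc = "A WITH ACUTE, LATIN CAPITAL LETTER   00C1| /*... really big array ...*/ |zwsp    200B";
var uncs = unc.split("|");
var final_s = "";
// "for each (... in ...)" is a non-standard Mozilla extension; for...of is the standard loop
for (var item of uncs) {
    var _T = item.split("\t"); // _T[0] = name, _T[1] = code point
    // alternative: fill an object directly, e.g. final_a[_T[1]] = _T[0];
    final_s += '"' + _T[1] + '"' + ' : ' + '"' + _T[0] + '",';
}

console.log(final_s);

// yields the body of an object literal, which goes in here:

var unicode_lookup = { /*really big array*/ };

// which we can use like so ...

alert(unicode_lookup["1D01"]);
// AE, LATIN LETTER SMALL CAPITAL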

Stack Overflow doesn't preserve tabs, so the first part may not work if you simply copy and paste it. You'll also notice that some code points are listed more than once, so you may want to do some cleanup.
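One way to sidestep both issues is sketched below, under a few assumptions: the tabs are written as explicit "\t" escapes so they survive copy-pasting, and the lookup is built directly as an object that keeps every name seen for a code point instead of letting later entries overwrite earlier ones. buildLookup is a made-up helper name, and the sample entries are only illustrative, not the full Index.txt.

```javascript
// Build the lookup directly instead of printing an object literal.
// buildLookup is a hypothetical helper name.
function buildLookup(indexText) {
    var lookup = {};
    var entries = indexText.split("|");
    for (var i = 0; i < entries.length; i++) {
        var parts = entries[i].split("\t");
        if (parts.length !== 2) continue; // skip entries without exactly one tab
        var name = parts[0].trim();
        var code = parts[1].trim().toUpperCase();
        if (!lookup[code]) lookup[code] = [];
        // keep each distinct name once per code point
        if (lookup[code].indexOf(name) === -1) lookup[code].push(name);
    }
    return lookup;
}

// Illustrative entries only; tabs are explicit "\t" escapes.
var sampleIndex =
    "A WITH ACUTE, LATIN CAPITAL LETTER\t00C1|" +
    "AE, LATIN LETTER SMALL CAPITAL\t1D01|" +
    "zwsp\t200B|" +
    "ZERO WIDTH SPACE\t200B";

var lookup = buildLookup(sampleIndex);
console.log(lookup["1D01"][0]);      // AE, LATIN LETTER SMALL CAPITAL
console.log(lookup["200B"].length);  // 2
```

With an array per code point you can decide afterwards which name to show, for example the first one or the longest one.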
