Detecting individual Unicode character support with JavaScript
Is it possible to detect if the client supports a particular Unicode character or if it will be rendered as a missing glyph box?
Important: Support in as many browsers as possible
Not important: Efficiency, speed, or elegance
The only method I can think of trying is using a canvas, so I figured I'd ask before I start going down that road.
This is not intended for use on a public web site; I am just trying to compile a list of characters supported by each browser.
This is more of a wild idea than a real answer:
If you could find a character which you knew would always render as a missing glyph box, you could use the same technique as this JavaScript font detector: render the character and the missing glyph box offscreen and compare their widths. If they're different, then you know the character is not rendering as a missing glyph box. Of course, this won't work at all for fixed-width fonts, and it could have a lot of false negatives for other fonts where many of the characters are the same width.
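A minimal sketch of that width comparison, assuming U+FFFF can stand in for the character that always renders as a missing glyph box (an assumption, not something verified here). The names `sameWidthAsMissingGlyph` and `canvasMeasurer` are hypothetical; the measuring function is passed in so a canvas or an offscreen element can be used interchangeably.

```javascript
// Hypothetical helper: true when `chr` measures exactly as wide as the
// reference character, i.e. it MAY be rendering as the missing glyph box.
// `measureWidth` is any function that maps a string to a rendered width.
// Assumption: U+FFFF always renders as the missing glyph box.
function sameWidthAsMissingGlyph(chr, measureWidth, reference = "\uFFFF") {
  return measureWidth(chr) === measureWidth(reference);
}

// Browser-only measurer built on CanvasRenderingContext2D.measureText().
function canvasMeasurer(font = "100px sans-serif") {
  const ctx = document.createElement("canvas").getContext("2d");
  ctx.font = font;
  return (text) => ctx.measureText(text).width;
}

// In a browser:
// sameWidthAsMissingGlyph("\u2b61", canvasMeasurer());
```

A `true` result is only a hint: any supported character that happens to be as wide as the box produces the same answer, which is the false-negative problem described above.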
You can use a canvas to check whether or not the character is rendered identically to a character you know is not supported. U+FFFF is a good choice for a character to compare to, since it's guaranteed not to be a valid Unicode character.
So you create one canvas where you render a U+FFFF character, and another canvas where you render the character you want to test. You then compare the two canvases by comparing their data URLs, using the toDataURL method. If the canvases are identical, the test character was rendered identically to the unsupported U+FFFF character, meaning it's not supported. If the canvases are not identical, the test character was not rendered in the same way as unsupported characters, so it is supported.
The following code does that:
//The first argument is the character you want to test, and the second argument is the font you want to test it in.
//If the second argument is left out, it defaults to the font of the <body> element.
//The third argument isn't used under normal circumstances, it's just used internally to avoid infinite recursion.
function characterIsSupported(character, font = getComputedStyle(document.body).fontFamily, recursion = false){
    //Create the canvases
    let testCanvas = document.createElement("canvas");
    let referenceCanvas = document.createElement("canvas");
    testCanvas.width = referenceCanvas.width = testCanvas.height = referenceCanvas.height = 150;

    //Render the characters
    let testContext = testCanvas.getContext("2d");
    let referenceContext = referenceCanvas.getContext("2d");
    testContext.font = referenceContext.font = "100px " + font;
    testContext.fillStyle = referenceContext.fillStyle = "black";
    testContext.fillText(character, 0, 100);
    referenceContext.fillText('\uffff', 0, 100);

    //Firefox renders unsupported characters by placing their character code inside the rectangle, making each unsupported character look different.
    //As a workaround, in Firefox, we hide the inside of the character by placing a black rectangle on top of it.
    //The rectangle we use to hide the inside has an offset of 10px so part of the character stays visible, reducing the risk of false positives.
    //We check for Firefox and browsers that behave similarly by checking if U+FFFE is supported, since U+FFFE is, just like U+FFFF, guaranteed not to be supported.
    if(!recursion && characterIsSupported('\ufffe', font, true)){
        testContext.fillStyle = referenceContext.fillStyle = "black";
        testContext.fillRect(10, 10, 80, 80);
        referenceContext.fillRect(10, 10, 80, 80);
    }

    //The character is supported if the two canvases are NOT identical
    return testCanvas.toDataURL() != referenceCanvas.toDataURL();
}

//Examples
console.log("a is supported: " + characterIsSupported('a')); //Returns true, 'a' should be supported in all browsers
console.log("\ufffe is supported: " + characterIsSupported('\ufffe')); //Returns false, U+FFFE is guaranteed to be unsupported just like U+FFFF
console.log("\u2b61 is supported: " + characterIsSupported('\u2b61')); //Results vary depending on the browser. At the time of writing this, this returns true in Chrome on Windows and false in Safari on iOS.
console.log("\uf8ff is supported: " + characterIsSupported('\uf8ff')); //U+F8FF is a Private Use Area code point that Apple fonts map to the Apple logo, so this should return true on Apple devices and false on non-Apple devices.
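Since the question asks for a list of supported characters per browser, a predicate such as characterIsSupported could be run over a range of code points. The following is only a sketch: `listSupportedCodePoints` and `formatCodePoint` are hypothetical names, and the predicate is passed in as a parameter so the loop can be tried outside a browser.

```javascript
// Hypothetical helper: returns the code points in [start, end] for which
// `isSupported` (for example characterIsSupported) returns true.
function listSupportedCodePoints(start, end, isSupported) {
  const supported = [];
  for (let cp = start; cp <= end; cp++) {
    // Skip surrogates (U+D800..U+DFFF): they are not characters on their own.
    if (cp >= 0xd800 && cp <= 0xdfff) continue;
    if (isSupported(String.fromCodePoint(cp))) supported.push(cp);
  }
  return supported;
}

// Formats a code point as "U+XXXX" for the report.
function formatCodePoint(cp) {
  return "U+" + cp.toString(16).toUpperCase().padStart(4, "0");
}

// In a browser, with characterIsSupported defined as above:
// const arrows = listSupportedCodePoints(0x2b00, 0x2bff, characterIsSupported);
// console.log(arrows.map(formatCodePoint).join(" "));
```

Each call creates two canvases, so scanning large ranges will be slow, which the question states is acceptable.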
I'm not sure whether this can be relied upon going forward (browsers might change what they draw for unsupported characters), nor whether it is optimized (I don't have a good understanding of the ideal boundaries to measure). Still, the following approach, which draws the text on a canvas and inspects the result as an image, may, if reviewed, provide a more reliable and accurate check than comparing widths. All of the code at the beginning is just browser detection, which we must use since feature detection is not possible.
(function () {
// http://www.quirksmode.org/js/detect.html
var BrowserDetect = {
init: function () {
this.browser = this.searchString(this.dataBrowser) || "An unknown browser";
this.version = this.searchVersion(navigator.userAgent)
|| this.searchVersion(navigator.appVersion)
|| "an unknown version";
this.OS = this.searchString(this.dataOS) || "an unknown OS";
},
searchString: function (data) {
for (var i=0;i<data.length;i++) {
var dataString = data[i].string;
var dataProp = data[i].prop;
this.versionSearchString = data[i].versionSearch || data[i].identity;
if (dataString) {
if (dataString.indexOf(data[i].subString) != -1)
return data[i].identity;
}
else if (dataProp)
return data[i].identity;
}
},
searchVersion: function (dataString) {
var index = dataString.indexOf(this.versionSearchString);
if (index == -1) return;
return parseFloat(dataString.substring(index+this.versionSearchString.length+1));
},
dataBrowser: [
{
string: navigator.userAgent,
subString: "Chrome",
identity: "Chrome"
},
{ string: navigator.userAgent,
subString: "OmniWeb",
versionSearch: "OmniWeb/",
identity: "OmniWeb"
},
{
string: navigator.vendor,
subString: "Apple",
identity: "Safari",
versionSearch: "Version"
},
{
prop: window.opera,
identity: "Opera",
versionSearch: "Version"
},
{
string: navigator.vendor,
subString: "iCab",
identity: "iCab"
},
{
string: navigator.vendor,
subString: "KDE",
identity: "Konqueror"
},
{
string: navigator.userAgent,
subString: "Firefox",
identity: "Firefox"
},
{
string: navigator.vendor,
subString: "Camino",
identity: "Camino"
},
{ // for newer Netscapes (6+)
string: navigator.userAgent,
subString: "Netscape",
identity: "Netscape"
},
{
string: navigator.userAgent,
subString: "MSIE",
identity: "Explorer",
versionSearch: "MSIE"
},
{
string: navigator.userAgent,
subString: "Gecko",
identity: "Mozilla",
versionSearch: "rv"
},
{ // for older Netscapes (4-)
string: navigator.userAgent,
subString: "Mozilla",
identity: "Netscape",
versionSearch: "Mozilla"
}
],
dataOS : [
{
string: navigator.platform,
subString: "Win",
identity: "Windows"
},
{
string: navigator.platform,
subString: "Mac",
identity: "Mac"
},
{
string: navigator.userAgent,
subString: "iPhone",
identity: "iPhone/iPod"
},
{
string: navigator.platform,
subString: "Linux",
identity: "Linux"
}
]
};
BrowserDetect.init();
/**
* Checks whether a given character is supported in the specified font. If the
* font argument is not provided, it will default to sans-serif, the default
* of the canvas element
* @param {String} chr Character to check for support
* @param {String} [font] Font Defaults to sans-serif
* @returns {Boolean} Whether or not the character is visually distinct from characters that are not supported
*/
function characterInFont (chr, font) {
var data,
size = 10, // We use 10 to confine results (could do further?) and minimum required for 10px
x = 0,
y = size,
canvas = document.createElement('canvas'),
ctx = canvas.getContext('2d');
// Necessary?
canvas.width = size;
canvas.height = size;
if (font) { // Default of canvas is 10px sans-serif
font = size + 'px ' + font; // Fix size so we can test consistently
ctx.font = font; // Apply the font; without this assignment the font argument is ignored
/**
// Is there use to confining by this height?
var d = document.createElement("span");
d.font = font;
d.textContent = chr;
document.body.appendChild(d);
var emHeight = d.offsetHeight;
document.body.removeChild(d);
alert(emHeight); // 19 after page load on Firefox and Chrome regardless of canvas height
//*/
}
ctx.fillText(chr, x, y);
data = ctx.getImageData(0, 0, ctx.measureText(chr).width, canvas.height).data; // canvas.width
data = Array.prototype.slice.apply(data);
function compareDataToBox (data, box, filter) {
if (filter) { // We can stop making this conditional if we confirm the exact arrays will continue to work, or otherwise remove and rely on safer full arrays
data = data.filter(function (item) {
return item != 0;
});
}
return data.toString() !== box;
}
var missingCharBox;
switch (BrowserDetect.browser) {
case 'Firefox': // Draws nothing
missingCharBox = '';
break;
case 'Opera':
//missingCharBox = '0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,197,0,0,0,255,0,0,0,255,0,0,0,255,0,0,0,255,0,0,0,73,0,0,0,0,0,0,0,0,0,0,0,36,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,36,0,0,0,0,0,0,0,0,0,0,0,36,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,36,0,0,0,0,0,0,0,0,0,0,0,36,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,36,0,0,0,0,0,0,0,0,0,0,0,36,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,36,0,0,0,0,0,0,0,0,0,0,0,197,0,0,0,255,0,0,0,255,0,0,0,255,0,0,0,255,0,0,0,73,0,0,0,0';
missingCharBox = '197,255,255,255,255,73,36,36,36,36,36,36,36,36,197,255,255,255,255,73';
break;
case 'Chrome':
missingCharBox = '2,151,255,255,255,255,67,2,26,2,26,2,26,2,26,2,26,2,26,2,26,2,26,2,151,255,255,255,255,67';
break;
case 'Safari':
missingCharBox = '17,23,23,23,23,5,52,21,21,21,21,41,39,39,39,39,39,39,39,39,63,40,40,40,40,43';
break;
default:
throw new Error('characterInFont() not tested successfully for this browser');
}
return compareDataToBox(data, missingCharBox, true);
}
// EXPORTS
((typeof exports !== 'undefined') ? exports : this).characterInFont = characterInFont;
}());
var r1 = characterInFont('a', 'Arial'); // true
var r2 = characterInFont('\uFAAA', 'Arial'); // false
alert(r1);
alert(r2);
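One way the hard-coded per-browser signatures might be avoided is to compare against a live rendering of U+FFFF instead. This is a sketch only, and it assumes the browser draws every unsupported character identically, which the Firefox hex-box behaviour described elsewhere in this thread breaks. `pixelsDiffer` and `renderPixels` are hypothetical names.

```javascript
// Hypothetical helper: true when two pixel buffers (for example the `.data`
// of two ImageData objects) differ in length or in any byte.
function pixelsDiffer(a, b) {
  if (a.length !== b.length) return true;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) return true;
  }
  return false;
}

// Browser-only: rasterise `text` on a small canvas and return its RGBA bytes.
function renderPixels(text, font = "10px sans-serif", size = 20) {
  const canvas = document.createElement("canvas");
  canvas.width = canvas.height = size;
  const ctx = canvas.getContext("2d");
  ctx.font = font;
  ctx.fillText(text, 0, size / 2);
  return ctx.getImageData(0, 0, size, size).data;
}

// In a browser (assumes U+FFFF renders as the missing glyph box):
// const supported = pixelsDiffer(renderPixels("a"), renderPixels("\uFFFF"));
```

Comparing raw bytes keeps the check independent of the user agent string, at the cost of rendering the reference character on every call.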
UPDATE 1
I tried to update this for modern Firefox (checking for the expected hex digits within the canvas) and to ensure that, unlike in my code above, the canvas (and the pattern to match against it) was just large enough to accommodate the widest character per context.measureText() (U+0BCC in my testing, though presumably dependent on the font, in my case "Arial Unicode MS"). Per https://bugzilla.mozilla.org/show_bug.cgi?id=442133#c9 , however, measureText currently mistakenly responds to the zoom for only the unknown characters. Now, if only one could simulate the zoom in JavaScript canvas so as to affect these measurements (and only those measurements)...
Code available for reference at https://gist.github.com/brettz9/1f061bb2ce06368db3e5
You can always evaluate each character using the charCodeAt() method. This returns the character's UTF-16 code unit, which equals its Unicode code point for characters in the Basic Multilingual Plane (use codePointAt() for characters beyond it). Depending on what you are doing, you can restrict the range of values you want to accept as "valid" characters. If you copy the character that's in the "box", you can use a character translator on the web to see what the corresponding Unicode value is.
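A short sketch of reading those values and applying a range restriction. `describeCharacter` and `isInRange` are hypothetical names, and the range bounds in the usage comment are placeholders. Note that this identifies which character a string holds; on its own it does not reveal whether the browser has a glyph for it.

```javascript
// charCodeAt returns a UTF-16 code unit; codePointAt returns the full code point.
// The two differ for characters outside the Basic Multilingual Plane.
function describeCharacter(str) {
  return {
    codeUnit: str.charCodeAt(0),
    codePoint: str.codePointAt(0),
  };
}

// Hypothetical range check: is the first character of `str` inside [min, max]?
function isInRange(str, min, max) {
  const cp = str.codePointAt(0);
  return cp >= min && cp <= max;
}

console.log(describeCharacter("a"));         // codeUnit 97, codePoint 97
console.log(describeCharacter("\u{1F600}")); // codeUnit 55357, codePoint 128512
// Placeholder range: lowercase ASCII letters
console.log(isInRange("q", 0x61, 0x7a));     // true
```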
Here's one that I googled and found: [link missing]
If you want to maximize browser support, you probably don't want to rely on JavaScript for anything. Many mobile browsers don't even support it.
If the browser does not support a character set, what is the fallback? Displaying the content in another language? Perhaps links on the site that switch languages on demand would be more robust.