Encoding Issue in Firefox

The browser I am using is Firefox; I'm not sure whether this happens in other browsers. What is displaying on screen for me in place of quotation marks or apostrophes or something along those lines is a box, and inside that box:

00
92

How do I get rid of these? I just want to replace them with a blank.


This is a common character encoding problem and has nothing to do with Firefox per se. We often see this problem with text pasted from (for example) Microsoft Word, which likes to replace standard ASCII single and double quotation marks with curved or angled "typographical" versions where the open and close quotes are different.

The problem is that the characters don't get translated from Microsoft's code page 1252 to whatever encoding your web page is displayed in (usually UTF-8 or Latin-1). There are lots of possible reasons for this; I won't even try to guess what's going on in your particular case. (Byte 0x92 in cp1252, hexadecimal 92 rather than decimal, is the curved closing single quote, often used for an apostrophe.)
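To make the mismatch concrete, here is a small Python sketch that decodes the same byte three ways. The sample bytes and variable names are made up for illustration; they are not taken from the asker's page.

```python
# Hypothetical sample: the bytes Windows-1252 uses to store "It’s".
raw = b"It\x92s"

# Decoded with the encoding it was written in: a curly apostrophe (U+2019).
as_cp1252 = raw.decode("cp1252")

# Decoded as Latin-1: 0x92 maps to the control character U+0092,
# which has no visible glyph.
as_latin1 = raw.decode("latin-1")

# Decoded as UTF-8: a lone 0x92 is not a valid sequence, so decoding fails.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

print(repr(as_cp1252), repr(as_latin1), utf8_ok)
```

A box containing 00 and 92 is consistent with the Latin-1 case, where the browser shows the hex code of U+0092 because it has nothing else to draw, though that is only a guess about this particular page.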

It's often preferable to replace these characters with their standard ASCII equivalents (" or '). Another solution, if you're only displaying the data in web pages, would be to replace them with their equivalent HTML entities, such as &rdquo;, &ldquo;, &rsquo; and &lsquo;.
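Both replacements can be sketched in Python with str.translate, assuming the text has already been decoded correctly into a Unicode string. The table and function names below are hypothetical, and only the four common curly quotes are covered.

```python
# Curly quotes mapped to plain ASCII quotes.
SMART_TO_ASCII = str.maketrans({
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark (apostrophe)
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
})

# The same characters mapped to named HTML entities.
SMART_TO_ENTITY = str.maketrans({
    "\u2018": "&lsquo;",
    "\u2019": "&rsquo;",
    "\u201c": "&ldquo;",
    "\u201d": "&rdquo;",
})

def to_ascii_quotes(text: str) -> str:
    """Replace curly quotes with their ASCII equivalents."""
    return text.translate(SMART_TO_ASCII)

def to_html_entities(text: str) -> str:
    """Replace curly quotes with named HTML entities."""
    return text.translate(SMART_TO_ENTITY)

print(to_ascii_quotes("\u201cIt\u2019s fine\u201d"))
print(to_html_entities("It\u2019s fine"))
```

To replace the characters with a blank, as the question asks, the same approach works with " " as every value in the table.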

As for getting rid of them, that depends how they're getting in. You'll need to remove/replace them in your HTML, or your database, or wherever they're stored.


Those boxes occur in three circumstances:

  • It represents a byte that isn't valid at that position for the document's encoding. This usually occurs when the document contains non-text data, or when the document declares a multi-byte encoding (e.g. UTF-8) but is really in another encoding (e.g. Windows-1252).

  • It represents a code point that's not assigned in the document's encoding. This usually occurs when the document contains non-text data, or when the document declares one encoding (e.g. iso-8859-1) but is really in another (e.g. Windows-1252).

  • It represents a character for which the browser's font has no glyph. (e.g. A Chinese character on a machine without fonts with Chinese characters.)

In this case, I suspect the document contains RIGHT SINGLE QUOTATION MARK (U+2019, "’"). This is encoded as byte 0x92 in Windows-1252, a very common encoding on Windows. If the browser is told the encoding is UTF-8 or iso-8859-1, you would encounter the first or second problem respectively.

Changing the encoding used or the encoding specified so that they match would fix this.
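If the text has already been decoded with the wrong encoding and stored that way, one possible repair is to reverse the wrong decoding and apply the right one. This is only a sketch: the function name `redecode` is hypothetical, and it assumes the text was misread as Latin-1 when it was really Windows-1252.

```python
def redecode(text: str, wrong: str = "latin-1", right: str = "cp1252") -> str:
    """Undo a decode done with `wrong` and redo it with `right`.

    Encoding with the wrong codec recovers the original bytes, provided
    the wrong codec maps every byte to a distinct character (Latin-1 does).
    """
    return text.encode(wrong).decode(right)

broken = "It\x92s"          # what a Latin-1 decode of byte 0x92 produces
fixed = redecode(broken)    # the curly apostrophe is restored
print(repr(fixed))
```

Python's cp1252 codec leaves a few bytes (such as 0x81) undefined, so this raises UnicodeDecodeError on input that contains them. Where possible, fixing the declared encoding at the source is the cleaner option.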


Try changing your encoding (top menu -> View -> Character Encoding). If UTF-8 and ISO-8859-1 don't do it, try Auto-Detect -> Universal.

Cheers,
