Two encodings in an HTML document representation?
I'm reading a chapter of the W3C HTML Document Representation.
Section 5.1 says this:
User agents must also know the specific character encoding that was used to transform the document character stream into a byte stream.
Then section 5.2 says this:
The "charset" parameter identifies a character encoding, which is a method of converting a sequence of bytes into a sequence of characters.
5.1: characters to bytes
5.2: bytes to characters
So am I wrong, or are there two encodings involved in the representation?
A "character encoding" such as UTF-8 is, strictly speaking, a specification for representing characters as a sequence of bytes. But the mapping is reversible: for any text the encoding can represent, decoding the encoded bytes gives back the original characters. So we can speak of a (single) character encoding as going both ways.
Other character encodings used in practice are UTF-16 and UTF-32.
Each of these is a specification under which you can encode text as bytes and decode bytes into characters. The two directions are parts of the same specification.
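To make the "two parts of the same specification" point concrete, here is a minimal Python sketch. The helper name round_trip and the sample string are my own, picked only for illustration; str.encode and bytes.decode are the standard library calls for the two directions.

```python
def round_trip(text, encoding):
    """Encode text to bytes, then decode those bytes with the same encoding."""
    data = text.encode(encoding)      # characters -> bytes (the direction 5.1 describes)
    restored = data.decode(encoding)  # bytes -> characters (the direction 5.2 describes)
    return data, restored


sample = "café"  # hypothetical sample text
for name in ("utf-8", "utf-16", "utf-32"):
    data, restored = round_trip(sample, name)
    # The same encoding name is used in both directions, and the text survives.
    assert restored == sample
    print(f"{name}: {len(data)} bytes -> {restored!r}")
```

The byte counts differ from one encoding to the next, but in every case it is the same named encoding that turns the characters into bytes and the bytes back into characters.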