Do HTTP request headers have to be UTF-8 encoded?
I could not find anything in the spec which says it should be. I have seen a couple of browsers setting their user-agents to non UTF8 encoded strings. There is however a Content-Type request header which specifies the media type (and charset), and I'm not sure if that is applicable only to the body of the request or the headers too.
The HTTP RFC defines header content as type *TEXT, which is defined on or about page 15 as ISO-8859-1, except where characters outside ISO-8859-1 are encoded pursuant to RFC 2047.
The Content-Type header applies to the body, not the headers.
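To make that concrete, here is a minimal Python sketch of how a receiver might read a field value under those rules: treat the raw bytes as ISO-8859-1, then expand any RFC 2047 encoded-words. The helper name `decode_field_value` and the sample values are made up for illustration; only `email.header` from the standard library is used, and real servers and clients may well behave differently.

```python
from email.header import Header, decode_header


def decode_field_value(raw: bytes) -> str:
    """Hypothetical helper: ISO-8859-1 first, then RFC 2047 encoded-words."""
    # Every byte maps to exactly one ISO-8859-1 character, so this never fails.
    text = raw.decode("iso-8859-1")
    parts = []
    for chunk, charset in decode_header(text):
        if isinstance(chunk, bytes):
            # Encoded-words carry their own charset; plain runs fall back to ISO-8859-1.
            parts.append(chunk.decode(charset or "iso-8859-1"))
        else:
            # No encoded-word was found, so the text comes back unchanged as str.
            parts.append(chunk)
    return "".join(parts)


# A value outside ISO-8859-1 has to be wrapped in an RFC 2047 encoded-word,
# which is pure ASCII on the wire.
encoded = Header("日本語", "utf-8").encode()

latin1_value = decode_field_value(b"Jos\xe9")              # raw ISO-8859-1 byte
roundtrip = decode_field_value(encoded.encode("ascii"))    # encoded-word form
```

Note that the charset here comes from the encoded-word itself, not from the Content-Type header.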
The HTTP header field values may contain characters other than ASCII characters:
message-header = field-name ":" [ field-value ]
field-name = token
field-value = *( field-content | LWS )
field-content = <the OCTETs making up the field-value
and consisting of either *TEXT or combinations
of token, separators, and quoted-string>
See the Basic Rule for the definition of OCTET and TEXT:
OCTET = <any 8-bit sequence of data>
TEXT = <any OCTET except CTLs,
but including LWS>
In practice, though, field values generally contain only ASCII characters.
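As a rough illustration of the TEXT rule quoted above, the following Python sketch checks whether a field value consists only of octets that TEXT allows. The function name `is_text` is hypothetical, and the check is simplified: it accepts SP and HT as whitespace but does not handle header folding (a CRLF followed by SP or HT), so any CR or LF is rejected as a control character.

```python
def is_text(value: bytes) -> bool:
    """Hypothetical check: any OCTET except CTLs, but allowing SP and HT."""
    for octet in value:
        if octet in (0x09, 0x20):
            # HT and SP are linear whitespace, which TEXT permits.
            continue
        if octet < 0x20 or octet == 0x7F:
            # CTLs are octets 0-31 and DEL (127). Folding is not handled here.
            return False
    return True


ascii_ok = is_text(b"Mozilla/5.0 (X11; Linux)")
high_octet_ok = is_text(b"Jos\xe9")       # octets above 127 are not CTLs
nul_rejected = is_text(b"bad\x00value")   # NUL is a control character
```

So a non-ASCII octet such as 0xE9 passes the grammar, which is why browsers can get away with non-ASCII user-agent strings even though most senders stick to ASCII.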