How to send non-English unicode string using HTTP header?
I am a novice in HTTP-related matters. In iOS development, I would like to send a string in an HTTP header, so I'm using:
[httpRequest setValue:@"nonEnglishString" forHTTPHeaderField:@"customHeader"];
The receiving server is Python (Google App Engine), which saves the string value in the db model as a StringProperty using:
dataEntityInstance.nonEnglishString = unicode(self.request.headers.get('customHeader'))
However, the problem is when I try to send non-English string like Korean, it's saved in HTTP header like this:
Customheader = "\Uc8fc\Uba39\Uc774 \Uc6b4\Ub2e4";
and when it's received by Google App Engine and saved in the DataStore, it becomes:
??? ??
as if it can't find the proper characters for the unicode value.
Is it not POSSIBLE or ALLOWED to send non-English string using HTTP Header?
If my iOS app just uses setHTTPBody, it can transfer non-English strings and save them to App Engine's DataStore properly.
[httpRequest setHTTPBody:[httpBody dataUsingEncoding:NSUTF8StringEncoding]];
But I just can't find the right way to achieve the same goal using HTTP headers, as many APIs such as Foursquare's do, and save the strings in their proper form in the Python-based Google App Engine DataStore.
Is it not POSSIBLE or ALLOWED to send non-English string using HTTP Header?
It's not possible as per HTTP standards to put non-ISO-8859-1 characters directly in an HTTP header. That gives you ASCII ("English"?) characters plus common Western European diacriticals.
However in practice you can't even use the extended ISO-8859-1 characters, because servers and browsers don't agree on what to do with non-ASCII characters in headers. Safari takes RFC2616 at its word and treats high bytes as ISO-8859-1 characters; Mozilla takes UTF-16 code unit low bytes, which is similar but weirder; Opera and Chrome decode from UTF-8; IE uses the local system code page.
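To make the disagreement concrete, here is a small sketch (assuming Python 3, not the Python 2 runtime from the question) of what happens when the same header bytes are read under two of the interpretations above. The names `raw`, `as_utf8`, `as_latin1` and `recover` are made up for illustration.

```python
# The bytes a client might put on the wire if it simply UTF-8 encodes the value.
raw = "주먹이 운다".encode("utf-8")

# A receiver that assumes UTF-8 gets the original text back.
as_utf8 = raw.decode("utf-8")

# A receiver that follows RFC 2616 literally reads each byte as one
# ISO-8859-1 character, which produces mojibake instead of Korean.
as_latin1 = raw.decode("iso-8859-1")


def recover(mis_decoded: str) -> str:
    """Undo an ISO-8859-1 misreading of bytes that were really UTF-8."""
    return mis_decoded.encode("iso-8859-1").decode("utf-8")
```

The repair in `recover` only works when you already know which wrong decoding was applied, which is exactly what you cannot rely on across servers and clients.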
So in reality all you can put in an HTTP header is simple ASCII with no control codes. If you want anything more, you'll have to come up with an encoding scheme (eg UTF-8+base64). The RFC2616 standard suggests RFC2047 encoded-words as a standard form of encoding, but this makes no sense given the definitions of when they are allowable in RFC2047 itself, and nothing supports it.
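A minimal sketch of the UTF-8+base64 idea on the receiving side, assuming Python 3 and assuming both ends agree on this scheme for the custom header. The function names are hypothetical, not part of any framework.

```python
import base64


def encode_header_value(text: str) -> str:
    """Turn arbitrary Unicode text into a pure-ASCII header value."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def decode_header_value(value: str) -> str:
    """Reverse encode_header_value: base64 -> UTF-8 bytes -> text."""
    return base64.b64decode(value).decode("utf-8")
```

On the iOS side, the matching step would be to take the string's UTF-8 data and base64 it (for example with NSData's base64EncodedStringWithOptions:) before calling setValue:forHTTPHeaderField:. The server would then call something like `decode_header_value(self.request.headers.get('customHeader'))`.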
It is possible to use character sets other than ISO 8859-1 in HTTP headers, but they must be encoded as described in RFC 2047.
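If you do go the RFC 2047 route, Python's standard `email.header` module can produce and parse encoded-words. This is only a sketch, assuming Python 3; HTTP stacks generally won't decode these for you, so the receiver has to do it explicitly. `to_encoded_word` and `from_encoded_word` are made-up helper names.

```python
from email.header import Header, decode_header


def to_encoded_word(text: str) -> str:
    """Encode text as an RFC 2047 encoded-word (ASCII only on the wire)."""
    # maxlinelen=0 disables line folding, which is not wanted in an HTTP header.
    return Header(text, charset="utf-8").encode(maxlinelen=0)


def from_encoded_word(value: str) -> str:
    """Decode a header value that may contain RFC 2047 encoded-words."""
    parts = []
    for chunk, charset in decode_header(value):
        if isinstance(chunk, bytes):
            parts.append(chunk.decode(charset or "ascii"))
        else:
            # Unencoded input comes back as str with charset None.
            parts.append(chunk)
    return "".join(parts)
```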
RFC 8187 describes a way to pass a header parameter value in a different encoding:
Extended notation, using the Unicode character U+00A3 ("£", POUND SIGN):
foo: bar; title*=utf-8'en'%C2%A3%20rates
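The value after `title*=` in that example can be built and parsed with percent-encoding from the standard library. A rough sketch, assuming Python 3 and UTF-8 as the charset; `encode_ext_value` and `decode_ext_value` are hypothetical names, and this handles only the parameter value, not full header parsing.

```python
from urllib.parse import quote, unquote


def encode_ext_value(text: str, lang: str = "") -> str:
    """Build an RFC 8187 style value: charset'language'percent-encoded-text."""
    # safe="" percent-encodes everything except unreserved characters.
    return "utf-8'" + lang + "'" + quote(text, safe="")


def decode_ext_value(value: str) -> str:
    """Split charset'language'data and percent-decode with that charset."""
    charset, _lang, encoded = value.split("'", 2)
    return unquote(encoded, encoding=charset)


print(encode_ext_value("£ rates", "en"))  # utf-8'en'%C2%A3%20rates
```

Note that this notation is defined for header parameters (the `name*=value` form), so it fits a custom header only if you give that header a parameter-style syntax.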