HttpWebRequest and Unicode characters
I am using this code:
HttpWebRequest req = (HttpWebRequest)WebRequest.Create(url);
string result = null;
using (HttpWebResponse resp = (HttpWebResponse)req.GetResponse())
{
    StreamReader reader = new StreamReader(resp.GetResponseStream());
    result = reader.ReadToEnd();
    reader.Close();
}
In result I get text like 003cbr /003e003cbr /003e (I think this should be two line breaks instead). I tried the two- and three-parameter versions of the StreamReader constructor, but the string was the same. (The request returns a JSON string.)
Why am I getting those characters, and how can I avoid them?
It's not really clear what that text is, but you're not specifying an encoding at the moment. What content encoding is the server using? StreamReader will default to UTF-8.
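If the server does declare a charset, one way to pass an explicit encoding to StreamReader is sketched below. This is only an illustration: the ResponseText class and ReadAll method are hypothetical names, and falling back to UTF-8 for a missing or unrecognized charset is an assumption, not something the question specifies.

```csharp
using System;
using System.IO;
using System.Text;

public static class ResponseText
{
    // Hypothetical helper: decodes a stream using the named charset,
    // falling back to UTF-8 when the name is missing or unrecognized.
    public static string ReadAll(Stream stream, string charset)
    {
        Encoding encoding = Encoding.UTF8;
        if (!string.IsNullOrEmpty(charset))
        {
            try
            {
                encoding = Encoding.GetEncoding(charset);
            }
            catch (ArgumentException)
            {
                // Unknown charset name: keep the UTF-8 default.
            }
        }

        // Disposing the reader also closes the underlying stream.
        using (var reader = new StreamReader(stream, encoding))
        {
            return reader.ReadToEnd();
        }
    }
}
```

With the code from the question, this could be called as ResponseText.ReadAll(resp.GetResponseStream(), resp.CharacterSet), since HttpWebResponse exposes the declared charset through its CharacterSet property.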
It sounds like you're actually getting some sort of oddly-encoded HTML, as U+003C is < and U+003E is >, giving <br /><br /> as the content. That's not JSON...
Two tests:
- Use WebClient.DownloadString, which will detect the right encoding to use
- See what gets shown using the same URL in a browser
EDIT: Okay, now that I've seen the text, it's actually got:
\u003cbr /\u003e
The \u part is important here: it's the JSON escape stating that the next four characters form the hex representation of a UTF-16 code unit.
Any JSON API used to parse that text should perform the unescaping for you.
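As a rough illustration of that unescaping, here is a minimal sketch that uses System.Text.Json (an assumption on my part: it is not mentioned in the question, and it requires .NET Core 3.0 or later or the NuGet package). The JsonUnescapeDemo class and DecodeJsonString method are hypothetical names.

```csharp
using System;
using System.Text.Json;

public static class JsonUnescapeDemo
{
    // Parses a JSON document whose root value is a string and returns
    // that string with escapes such as \u003c already decoded.
    public static string DecodeJsonString(string json)
    {
        using (JsonDocument doc = JsonDocument.Parse(json))
        {
            return doc.RootElement.GetString();
        }
    }
}
```

Calling JsonUnescapeDemo.DecodeJsonString(@"""\u003cbr /\u003e\u003cbr /\u003e""") returns <br /><br />. The verbatim string literal keeps the backslashes intact so that the JSON parser, not the C# compiler, handles the \u escapes.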