HttpWebRequest and Unicode characters

I am using this code:

HttpWebRequest req = (HttpWebRequest)WebRequest.Create(url);
string result = null;
using (HttpWebResponse resp = (HttpWebResponse)req.GetResponse())
{
   StreamReader reader = new StreamReader(resp.GetResponseStream());
   result = reader.ReadToEnd();
   reader.Close();
}

In result I get text like 003cbr /003e003cbr /003e (I think this should be two line breaks instead). I tried the two- and three-parameter overloads of the StreamReader constructor, but the string was the same. (The request returns a JSON string.)

Why am I getting those characters, and how can I avoid them?


It's not really clear what that text is, but you're not specifying an encoding at the moment. What content encoding is the server using? StreamReader will default to UTF-8.

It sounds like actually you're getting some sort of oddly-encoded HTML, as U+003C is < and U+003E is >, giving <br /><br /> as the content. That's not JSON...

Two tests:

  • Use WebClient.DownloadString, which will detect the right encoding to use
  • See what gets shown using the same URL in a browser
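If you want to stay with HttpWebRequest, one way to make the encoding explicit is to read the charset the server reports and pass the matching Encoding to StreamReader. The following is only a sketch under assumptions: the class name ResponseText and the helpers ResolveEncoding and Download are made up for illustration, and falling back to UTF-8 for a missing or unknown charset is a choice, not something the server guarantees.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

// Hypothetical helper class, not part of the original question.
public static class ResponseText
{
    // Maps a charset name (as reported by the server) to an Encoding.
    // Assumption: fall back to UTF-8 when the name is missing or unknown.
    public static Encoding ResolveEncoding(string charset)
    {
        if (string.IsNullOrWhiteSpace(charset))
            return Encoding.UTF8;

        try
        {
            // Some servers quote the charset value, so strip quotes first.
            return Encoding.GetEncoding(charset.Trim().Trim('"'));
        }
        catch (ArgumentException)
        {
            // Unknown or unsupported charset name.
            return Encoding.UTF8;
        }
    }

    // Downloads the body of url as text, decoding it with the reported charset.
    public static string Download(string url)
    {
        var req = (HttpWebRequest)WebRequest.Create(url);
        using (var resp = (HttpWebResponse)req.GetResponse())
        using (var reader = new StreamReader(resp.GetResponseStream(),
                                             ResolveEncoding(resp.CharacterSet)))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Calling ResponseText.Download(url) then replaces the reader code from the question. If the output is still the same, the encoding is not the problem.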

EDIT: Okay, now that I've seen the text, it's actually got:

\u003cbr /\u003e

The \u part is important here: it's a JSON escape sequence, which states that the next four characters are the hex representation of a UTF-16 code unit.

Any JSON API used to parse that text should perform the unescaping for you.
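As a rough illustration of that unescaping, here is a minimal sketch using System.Text.Json (available in .NET Core 3.0 and later, or as a NuGet package; other JSON libraries should behave the same way). The class name JsonUnescapeDemo, the helper ReadStringProperty, and the property name "html" are assumptions for the example, since the actual shape of the response isn't shown in the question.

```csharp
using System;
using System.Text.Json;

// Hypothetical demo class, not part of the original question.
public static class JsonUnescapeDemo
{
    // Parses json as an object and returns the string value of propertyName.
    // The parser turns \uXXXX escapes back into the characters they stand for.
    public static string ReadStringProperty(string json, string propertyName)
    {
        using (JsonDocument doc = JsonDocument.Parse(json))
        {
            return doc.RootElement.GetProperty(propertyName).GetString();
        }
    }
}
```

For example, with the raw text {"html":"\u003cbr /\u003e\u003cbr /\u003e"} (written in C# as the verbatim string @"{""html"":""\u003cbr /\u003e\u003cbr /\u003e""}"), ReadStringProperty(json, "html") returns <br /><br />.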
