
Autodecode &#x98; encoded values in XML

I have a C# app that calls a ColdFusion web service (well, it resembles a web service). The service returns XML encoded in Windows-1252, and characters past a certain range come back encoded like this: &#x98;. That is one of the characters that actually comes back. I know the actual text value for this is "˜" in code page 1252 because I can see the value in its original format in the database.

I take the raw XML from the service and feed it into an XmlTextReader like this:

// Turn our raw XML string into a reader.
// Create is a static method defined on XmlReader, and it returns an XmlReader.
byte[] responseBytes = Encoding.UTF8.GetBytes(rawXml);
MemoryStream responseStream = new MemoryStream(responseBytes);
state.XmlResponseReader = XmlReader.Create(
    responseStream,
    new XmlReaderSettings { IgnoreWhitespace = true });

Further down I call state.XmlResponseReader.Read(). When I do, these hex-encoded values are removed from the text entirely, so the text "&#x98;hi there" shows up as "hi there". I want to get "˜hi there".

I have tried quite a few things to get these values decoded into their text equivalents, but nothing has worked.

Manually, I can get the correct value by taking the hex value (98), converting it to decimal (152), and then doing this: Encoding.GetEncoding(1252).GetString(new byte[] {152}). However, doing that entirely by hand is less than desirable. Does anyone know of a way to do this conversion with more streamlined functionality in the .NET Framework?
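The manual conversion described above can be wrapped in a small helper. The following is only a sketch under stated assumptions: it assumes the text you get from the reader still contains the raw code point (a numeric reference with hex value 98 denotes U+0098, a control character that most displays render as nothing), which may or may not be what is happening in your case. The names Cp1252Fixup, MapChar and FixC1 are hypothetical, and a hard-coded table for bytes 0x80 to 0x9F stands in for the Encoding.GetEncoding(1252) call so the sketch does not depend on that code page being available at run time.

```csharp
using System.Text;

public static class Cp1252Fixup
{
    // Windows-1252 characters for bytes 0x80..0x9F, in order. Bytes that
    // Windows-1252 leaves undefined (0x81, 0x8D, 0x8F, 0x90, 0x9D) map to themselves.
    private const string HighRange =
        "\u20AC\u0081\u201A\u0192\u201E\u2026\u2020\u2021" +
        "\u02C6\u2030\u0160\u2039\u0152\u008D\u017D\u008F" +
        "\u0090\u2018\u2019\u201C\u201D\u2022\u2013\u2014" +
        "\u02DC\u2122\u0161\u203A\u0153\u009D\u017E\u0178";

    // Maps one character in U+0080..U+009F to the Windows-1252 character
    // for the same byte value; any other character is returned unchanged.
    public static char MapChar(char c)
    {
        return (c >= '\u0080' && c <= '\u009F') ? HighRange[c - 0x80] : c;
    }

    // Applies MapChar to every character of an already-decoded string.
    public static string FixC1(string text)
    {
        var sb = new StringBuilder(text.Length);
        foreach (char c in text)
        {
            sb.Append(MapChar(c));
        }
        return sb.ToString();
    }
}
```

If the assumption holds, calling Cp1252Fixup.FixC1 on the reader's Value would turn U+0098 into "˜" (U+02DC) and leave ordinary text alone.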


Can you use Server.Decode on the value? See: http://msdn.microsoft.com/en-us/library/hwzhtkke.aspx


In the end I didn't find a way to get the XmlTextReader to autodecode the data, but I did find the next best thing. Using ReadChars (which is available on XmlTextReader but not on XmlReader), I was able to retrieve the data from the inner text of my XML node without those characters being corrupted and lost.

Here is my code:

int readCharacters = 0;
short bufferSize = 40;
char[] buffer = new char[bufferSize];
StringBuilder innerString = new StringBuilder();

do
{
    readCharacters = reader.ReadChars(buffer, 0, bufferSize);

    innerString.Append(buffer, 0, readCharacters);

} while (readCharacters != 0);

This gives me back my raw data, for example &#x97;&#x98;, at which point I can manually take 97 and 98 out of that string, convert them to decimal, and then to the corresponding code page 1252 characters. So the solution is still at least half manual, but ReadChars has saved me some up-front whole-string manipulation that would otherwise have been necessary to facilitate the manual steps.
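The remaining manual step could be automated roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes the string collected from ReadChars still contains literal hexadecimal references with a lowercase "x" and exactly two hex digits in the range 80 to 9F. The names Cp1252References and Decode are hypothetical, and the lookup table is a stand-in for Encoding.GetEncoding(1252) so the code does not rely on that code page being available.

```csharp
using System;
using System.Text.RegularExpressions;

public static class Cp1252References
{
    // Windows-1252 characters for bytes 0x80..0x9F, in order. Bytes that
    // Windows-1252 leaves undefined map to the same code point.
    private const string HighRange =
        "\u20AC\u0081\u201A\u0192\u201E\u2026\u2020\u2021" +
        "\u02C6\u2030\u0160\u2039\u0152\u008D\u017D\u008F" +
        "\u0090\u2018\u2019\u201C\u201D\u2022\u2013\u2014" +
        "\u02DC\u2122\u0161\u203A\u0153\u009D\u017E\u0178";

    // Matches references such as &#x98; whose value lies in 0x80..0x9F.
    private static readonly Regex C1Reference =
        new Regex("&#x([89][0-9A-Fa-f]);");

    // Replaces each matching reference with its Windows-1252 character.
    // All other text, including other entity references, is left untouched.
    public static string Decode(string raw)
    {
        return C1Reference.Replace(raw, m =>
        {
            int code = Convert.ToInt32(m.Groups[1].Value, 16);
            return HighRange[code - 0x80].ToString();
        });
    }
}
```

Under those assumptions, passing innerString.ToString() to Cp1252References.Decode would replace the two references in the example with "—" and "˜". References outside the 80 to 9F range are deliberately not handled here.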
