Objects not serialising to XML (UTF-8) as expected in .NET?

I have a helper method that serialises an object. It works until you try to change the encoding: when the XML is received by the consuming web service, it appears to be incorrect and contains some strange characters.

Here are the log entries from the app:

UTF-16 (this works):

2011-08-09 11:16:03,140 DEBUG SomeRestfulService *   xmlData    <?xml version="1.0" encoding="utf-8"?>
<loginRequest xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <UserName>Admin</UserName>
  <Password>Password</Password>
  <MarketCode>GB</MarketCode>
</loginRequest>

UTF-8 (notice the strange character):

2011-08-09 11:21:30,687 DEBUG SomeRestfulService *   xmlData    <?xml version="1.0" encoding="utf-8"?><loginRequest xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema"><UserName>Admin</UserName><Password>Password</Password><MarketCode>GB</MarketCode></loginRequest>

I don't know why it has lost the layout.

Helper method:

Public Shared Function SerializeObject(ByVal obj As Object, ByVal encoding As Text.Encoding) As String

    Dim serializer As New XmlSerializer(obj.GetType)

    If encoding Is Nothing Then
        Using strWriter As New IO.StringWriter()
            serializer.Serialize(strWriter, obj)
            Return strWriter.ToString
        End Using
    Else
        Using stream As New IO.MemoryStream, xtWriter As New Xml.XmlTextWriter(stream, encoding)
            serializer.Serialize(xtWriter, obj)
            Return encoding.GetString(stream.ToArray())
        End Using
    End If


End Function

Note: if I pass Nothing as the encoding, the default encoding is UTF-16 and everything is OK. Originally the method had no encoding parameter, but it is now a requirement, so it needs to be in there.

Am I doing the serialising incorrectly when encoding to UTF-8? How can I fix this?

I tried the following to omit the BOM, but still have the same problem:

Dim utf8 As New Text.UTF8Encoding(True)
Using stream As New IO.MemoryStream, xtWriter As New Xml.XmlTextWriter(stream, utf8)
    serializer.Serialize(xtWriter, obj)
    Return utf8.GetString(stream.ToArray())
End Using


What you're seeing is the byte order mark (BOM) that is often used at the start of text files or streams to indicate the byte order and the Unicode variant.
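
To make the BOM concrete, here is a minimal sketch (the module name BomSketch is made up for illustration). It prints the UTF-8 preamble bytes and shows that Encoding.GetString does not strip them when decoding, which is how the stray character ends up at the start of the string:

```vb
Imports System
Imports System.Text

Module BomSketch
    Sub Main()
        ' UTF8Encoding(True) advertises a preamble (the BOM); UTF8Encoding(False) does not.
        Dim withBom As Byte() = New UTF8Encoding(True).GetPreamble()
        Dim withoutBom As Byte() = New UTF8Encoding(False).GetPreamble()

        Console.WriteLine(BitConverter.ToString(withBom))   ' EF-BB-BF
        Console.WriteLine(withoutBom.Length)                ' 0

        ' Decoding bytes that start with the BOM keeps it as the character U+FEFF.
        Dim bytes As Byte() = {&HEF, &HBB, &HBF, &H3C}      ' BOM followed by "<"
        Dim decoded As String = New UTF8Encoding(False).GetString(bytes)
        Console.WriteLine(decoded.Length)                   ' 2
        Console.WriteLine(decoded(0) = ChrW(&HFEFF))        ' True
    End Sub
End Module
```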

Your serializer is very strange. If you encode a string with an encoding such as UTF-8, you have to return the result as an array of bytes. By first encoding the XML as UTF-8 and then decoding the UTF-8 stream back into a string, you gain nothing (except introducing the problematic BOM).

Either go with UTF-16 only or return a byte array. As the function is now, the encoding just introduces problems.

Update:

Based on the code in the comment below, I see two approaches:

Approach 1: Create a string with the serialized data and convert it to UTF-8 afterwards

Public Shared Function SerializeObject(ByVal obj As Object) As String

    Dim serializer As New XmlSerializer(obj.GetType)

    Using strWriter As New IO.StringWriter()
        serializer.Serialize(strWriter, obj)
        Return strWriter.ToString
    End Using

End Function

....

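Dim serialisedObject As String = SerializeObject(obj)
Dim postData As Byte() = New Text.UTF8Encoding(True).GetBytes(serialisedObject)

If you need a different encoding, change the last line. Note that GetBytes never prepends a byte order mark, whichever value is passed to the UTF8Encoding constructor; that flag only matters when a writer such as XmlTextWriter or StreamWriter emits the preamble.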
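
One caveat with this approach: a plain StringWriter reports UTF-16 as its encoding, so the XML declaration will say encoding="utf-16" even though the bytes sent are UTF-8. A common workaround, sketched below, is to subclass StringWriter and override its Encoding property. The names Utf8StringWriter and SerializeToUtf8Bytes are hypothetical and not part of the original code:

```vb
Imports System.IO
Imports System.Text
Imports System.Xml.Serialization

' Hypothetical helper: a StringWriter that reports UTF-8, so that the
' serializer writes encoding="utf-8" in the XML declaration.
Public Class Utf8StringWriter
    Inherits StringWriter

    Public Overrides ReadOnly Property Encoding As System.Text.Encoding
        Get
            Return System.Text.Encoding.UTF8
        End Get
    End Property
End Class

Public Module Utf8SerializationSketch
    ' Serialises obj to a string first, then converts it to UTF-8 bytes without a BOM.
    Public Function SerializeToUtf8Bytes(ByVal obj As Object) As Byte()
        Dim serializer As New XmlSerializer(obj.GetType())
        Using writer As New Utf8StringWriter()
            serializer.Serialize(writer, obj)
            Return New UTF8Encoding(False).GetBytes(writer.ToString())
        End Using
    End Function
End Module
```

The declaration and the actual byte encoding then agree, which stricter XML parsers on the receiving side may require.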

Approach 2: Create the properly encoded data in the first place and continue with a byte array

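Public Shared Function SerializeObject(ByVal obj As Object, ByVal encoding As Text.Encoding) As Byte()

    Dim serializer As New XmlSerializer(obj.GetType)

    ' Fall back to UTF-16 when no encoding is supplied.
    If encoding Is Nothing Then
        encoding = Text.Encoding.Unicode
    End If

    ' XmlTextWriter writes the encoding's preamble (BOM), if it has one, to the stream.
    Using stream As New IO.MemoryStream, xtWriter As New Xml.XmlTextWriter(stream, encoding)
        serializer.Serialize(xtWriter, obj)
        xtWriter.Flush() ' make sure buffered output has reached the stream
        Return stream.ToArray()
    End Using

End Function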


....

Dim postData As Byte() = SerializeObject(obj, New Text.UTF8Encoding(False))

In this case, the XmlTextWriter encodes the data directly with the requested encoding; passing UTF8Encoding(False) keeps the byte order mark out of the output. And since we already have a byte array, the last step is shorter: we directly have the data to send.
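
As for the lost layout mentioned in the question: XmlTextWriter does not indent its output by default, whereas serialising to a StringWriter does. A minimal sketch of Approach 2 with indentation switched on follows; the module and function names are made up for illustration:

```vb
Imports System.IO
Imports System.Text
Imports System.Xml
Imports System.Xml.Serialization

Public Module IndentedSerializationSketch
    ' Same idea as Approach 2, but asks the writer to indent child elements.
    Public Function SerializeIndented(ByVal obj As Object, ByVal enc As Encoding) As Byte()
        Dim serializer As New XmlSerializer(obj.GetType())
        Using stream As New MemoryStream(), writer As New XmlTextWriter(stream, enc)
            writer.Formatting = Formatting.Indented
            serializer.Serialize(writer, obj)
            writer.Flush()
            Return stream.ToArray()
        End Using
    End Function
End Module
```

Indentation only affects readability in logs; the consuming service should accept either form.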
