DataContractSerializer vs BinaryFormatter performance
I was reading articles to learn more about the DataContractSerializer and BinaryFormatter serializers. From what I have read so far, I was under the impression that BinaryFormatter output should have a smaller footprint than DataContractSerializer output, because DataContractSerializer serializes to an XML infoset while BinaryFormatter serializes to a proprietary binary format.
Here is the test:
    [Serializable]
    [DataContract]
    public class Packet
    {
        [DataMember]
        public DataSet Data { get; set; }

        [DataMember]
        public string Name { get; set; }

        [DataMember]
        public string Description { get; set; }
    }
The DataSet was populated with 121317 rows from the [AdventureWorks].[Sales].[SalesOrderDetail] table.
    using (var fs = new FileStream("test1.txt", FileMode.Create))
    {
        var dcs = new DataContractSerializer(typeof(Packet));
        dcs.WriteObject(fs, packet);
        Console.WriteLine("Total bytes with dcs = " + fs.Length);
    }

    using (var fs = new FileStream("test2.txt", FileMode.Create))
    {
        var bf = new BinaryFormatter();
        bf.Serialize(fs, packet);
        Console.WriteLine("Total bytes with binaryformatter = " + fs.Length);
    }
Results
Total bytes with dcs = 57133023
Total bytes with binaryformatter = 57133984
Question: Why is the byte count for BinaryFormatter larger than for DataContractSerializer? Shouldn't it be much smaller?
DataSet has a bad habit: it implements ISerializable and, by default, serializes its contents as a string of XML, even when passed to a BinaryFormatter. This is why the two streams are nearly identical in size. If you change its RemotingFormat property to Binary, it still takes over its own serialization, but instead of emitting XML it creates a new BinaryFormatter, dumps itself into a MemoryStream, and stores the resulting byte array as a value in the outer BinaryFormatter's stream.
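To see the effect of RemotingFormat in isolation, a minimal sketch along these lines could be used. It assumes .NET Framework (BinaryFormatter is obsolete in newer .NET versions), and the table name, column names, and row count are made up for illustration rather than taken from the AdventureWorks test above.

```csharp
using System;
using System.Data;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

static class RemotingFormatDemo
{
    // Serializes the DataSet with BinaryFormatter and returns the stream length.
    static long SerializedSize(DataSet ds)
    {
        using (var ms = new MemoryStream())
        {
            new BinaryFormatter().Serialize(ms, ds);
            return ms.Length;
        }
    }

    static void Main()
    {
        // Hypothetical sample data; any populated DataSet would do.
        var ds = new DataSet("Demo");
        DataTable table = ds.Tables.Add("Rows");
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Value", typeof(string));
        for (int i = 0; i < 10000; i++)
        {
            table.Rows.Add(i, "row " + i);
        }

        // Default: the DataSet hands XML to the formatter.
        ds.RemotingFormat = SerializationFormat.Xml;
        Console.WriteLine("RemotingFormat.Xml    = " + SerializedSize(ds));

        // Binary: the DataSet serializes its tables with an inner BinaryFormatter.
        ds.RemotingFormat = SerializationFormat.Binary;
        Console.WriteLine("RemotingFormat.Binary = " + SerializedSize(ds));
    }
}
```

The actual sizes depend on the data, so no expected output is shown; the point is only that the same BinaryFormatter call yields different payloads depending on how the DataSet chooses to serialize itself.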
Beyond that, BinaryFormatter carries more information about types, such as the full name of the assembly they came from, and it adds its own per-object overhead on top of the XML for the DataSet. That is why its stream ends up slightly larger.
If you're trying to compare the behavior of the two serializers, DataSet is a poor choice because it overrides too much.
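A fairer comparison might use a plain type that does not customize its own serialization. The following is only a sketch under that assumption: the Row class and its sample data are hypothetical, and on .NET Framework the project needs a reference to the System.Runtime.Serialization assembly.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

// Hypothetical plain type: no ISerializable, so each serializer uses its own format.
[Serializable]
[DataContract]
public class Row
{
    [DataMember]
    public int Id { get; set; }

    [DataMember]
    public string Name { get; set; }
}

static class PlainTypeComparison
{
    static void Main()
    {
        var rows = new List<Row>();
        for (int i = 0; i < 10000; i++)
        {
            rows.Add(new Row { Id = i, Name = "row " + i });
        }

        // DataContractSerializer writes an XML representation of the list.
        using (var ms = new MemoryStream())
        {
            var dcs = new DataContractSerializer(typeof(List<Row>));
            dcs.WriteObject(ms, rows);
            Console.WriteLine("Total bytes with dcs = " + ms.Length);
        }

        // BinaryFormatter writes its own binary format, including type metadata.
        using (var ms = new MemoryStream())
        {
            var bf = new BinaryFormatter();
            bf.Serialize(ms, rows);
            Console.WriteLine("Total bytes with binaryformatter = " + ms.Length);
        }
    }
}
```

With a type like this, any size difference reflects the serializers themselves rather than DataSet's embedded XML.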