
Decompressing a very large serialized object and managing memory

I have an object that contains tons of data used for reports. To get this object from the server to the client, I first serialize it into a MemoryStream, then compress it using .NET's GZipStream. I then send the compressed object to the client as a byte[].

The problem is that on some clients, when they receive the byte[] and try to decompress and deserialize the object, a System.OutOfMemoryException is thrown. I've read that this exception can be caused by new()-ing a large number of objects, or by holding on to a large number of strings. Both of these happen during the deserialization process.

So my question is: how do I prevent the exception (any good strategies)? The client needs all of the data, and I've already trimmed down the number of strings as much as I can.

Edit: here is the code I am using to serialize/compress (implemented as extension methods):

public static byte[] SerializeObject<T>(this object obj, T serializer) where T: XmlObjectSerializer
{
    Type t = obj.GetType();

    if (!Attribute.IsDefined(t, typeof(DataContractAttribute)))
        return null;

    byte[] initialBytes;

    using (MemoryStream stream = new MemoryStream())
    {
        serializer.WriteObject(stream, obj);
        initialBytes = stream.ToArray();
    }

    return initialBytes;
}

public static byte[] CompressObject<T>(this object obj, T serializer) where T : XmlObjectSerializer
{
    Type t = obj.GetType();

    if (!Attribute.IsDefined(t, typeof(DataContractAttribute)))
        return null;

    byte[] initialBytes = obj.SerializeObject(serializer);

    byte[] compressedBytes;

    using (MemoryStream stream = new MemoryStream(initialBytes))
    {
        using (MemoryStream output = new MemoryStream())
        {
            using (GZipStream zipper = new GZipStream(output, CompressionMode.Compress))
            {
                Pump(stream, zipper);
            }

            compressedBytes = output.ToArray();
        }
    }

    return compressedBytes;
}

internal static void Pump(Stream input, Stream output)
{
    byte[] bytes = new byte[4096];
    int n;
    while ((n = input.Read(bytes, 0, bytes.Length)) != 0)
    {
        output.Write(bytes, 0, n);
    }
}

And here is my code for decompress/deserialize:

public static T DeSerializeObject<T,TU>(this byte[] serializedObject, TU deserializer) where TU: XmlObjectSerializer
{
    using (MemoryStream stream = new MemoryStream(serializedObject))
    {
        return (T)deserializer.ReadObject(stream);
    }
}

public static T DecompressObject<T, TU>(this byte[] compressedBytes, TU deserializer) where TU: XmlObjectSerializer
{
    byte[] decompressedBytes;

    using (MemoryStream stream = new MemoryStream(compressedBytes))
    {
        using (MemoryStream output = new MemoryStream())
        {
            using (GZipStream zipper = new GZipStream(stream, CompressionMode.Decompress))
            {
                ObjectExtensions.Pump(zipper, output);
            }

            decompressedBytes = output.ToArray();
        }
    }

    return decompressedBytes.DeSerializeObject<T, TU>(deserializer);
}

The object that I am passing is a wrapper object; it just contains all the relevant objects that hold the data. The number of objects can be large (depending on the report's date range), and I've seen as many as 25k strings.

One thing I forgot to mention: I am using WCF, and since the inner objects are also passed individually through other WCF calls, I am using the DataContractSerializer, and all of my objects are marked with the DataContract attribute.


One thing you could try is pre-generating the XmlSerializer assemblies on the client side, if you haven't done this already.

.NET actually generates these at run-time unless you pre-generate and link against them.

More: see the Sgen.exe documentation and related questions on Stack Overflow.


A developer I work with ran into a similar problem: the large buffers used for serialization fragmented the large object heap (which the garbage collector does not compact), leaving it unable to find a contiguous block big enough to reallocate the memory.

If you are serializing many objects repeatedly, I would allocate a single buffer and clear it each time you finish, rather than disposing it and creating a new one. That way you only pay for the allocation once, and your app should keep working effectively (a sketch of this follows below).
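
A minimal sketch of that idea, assuming serialization always happens on one thread (the class, field, and method names here are mine, not from the question's code):

internal static class SerializationBuffer
{
    // One long-lived buffer: SetLength(0) discards the contents but keeps
    // the already-grown internal array, so the big allocation is paid once.
    private static readonly MemoryStream Buffer = new MemoryStream();

    public static byte[] Serialize(object obj, XmlObjectSerializer serializer)
    {
        Buffer.SetLength(0);                 // also resets Position to 0
        serializer.WriteObject(Buffer, obj);
        return Buffer.ToArray();             // still copies the result out
    }
}

Note that ToArray still allocates a fresh array per call; the win is that the stream's internal working buffer stops being thrown away and re-grown every time.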

I'd also second @yetapb's comment that the data could be paged and written in a streamed fashion. That way you never need an enormous in-memory buffer to hold all of the data (see the sketch after this paragraph).
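
To make the streaming point concrete: GZipStream is just a Stream, so the serializer can write into it and read from it directly, meaning neither the uncompressed XML on the server nor the decompressed XML on the client ever has to exist as one big byte[]. A rough sketch against the extension methods in the question (the method names are mine, and I've dropped the DataContract attribute check for brevity):

public static byte[] SerializeAndCompress(this object obj, XmlObjectSerializer serializer)
{
    using (MemoryStream output = new MemoryStream())
    {
        using (GZipStream zipper = new GZipStream(output, CompressionMode.Compress))
        {
            // WriteObject streams straight through the compressor; only
            // compressed bytes ever accumulate in 'output'.
            serializer.WriteObject(zipper, obj);
        }

        // Disposing the GZipStream flushes it; MemoryStream.ToArray is
        // documented to work even after the stream has been closed.
        return output.ToArray();
    }
}

public static T DecompressAndDeserialize<T>(this byte[] compressedBytes, XmlObjectSerializer deserializer)
{
    using (MemoryStream stream = new MemoryStream(compressedBytes))
    using (GZipStream zipper = new GZipStream(stream, CompressionMode.Decompress))
    {
        // ReadObject pulls from the decompressor in small chunks, so the
        // full decompressed payload never exists as a single array.
        return (T)deserializer.ReadObject(zipper);
    }
}

If even the compressed byte[] is too large to hold, the same pattern works against a FileStream, or, since you are on WCF, against a binding with transferMode set to Streamed (or StreamedResponse), so neither side ever buffers the whole payload.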
