Avoiding duplicate objects in Java deserialization
I have two lists (list1 and list2) containing references to some objects, where some of the list entries may point to the same object. Then, for various reasons, I am serializing these lists to two separate files. Finally, when I deserialize the lists, I would like to ensure that I am not re-creating more objects than needed. In other words, it should still be possible for some entry of list1 to point to the same object as some entry in list2.
MyObject obj = new MyObject();
List<MyObject> list1 = new ArrayList<MyObject>();
List<MyObject> list2 = new ArrayList<MyObject>();
list1.add(obj);
list2.add(obj);
// serialize to file1.ser
ObjectOutputStream oos = new ObjectOutputStream(...);
oos.writeObject(list1);
oos.close();
// serialize to file2.ser
oos = new ObjectOutputStream(...);
oos.writeObject(list2);
oos.close();
I think that sections 3.4 and A.2 of the spec say that deserialization strictly results in the creation of new objects, but I'm not sure. If so, some possible solutions might involve:
- Implementing equals() and hashCode() and checking references manually.
- Creating a "container class" to hold everything and then serializing the container class.
Is there an easy way to ensure that objects are not duplicated upon deserialization?
Thanks.
After deserializing the second list, you could iterate over its elements and replace each duplicate with a reference to the matching object from the first list.
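A minimal sketch of that replacement pass, assuming the element class implements equals() and hashCode() on some unique key. The id field and the shareInstances helper are hypothetical names introduced for illustration; two equal but distinct instances stand in for the copies produced by the two separate deserializations.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.ListIterator;
import java.util.Map;

class DedupAfterRead {
    // Hypothetical element type: equality is based on an assumed unique id.
    static class MyObject implements Serializable {
        private static final long serialVersionUID = 1L;
        final String id;
        MyObject(String id) { this.id = id; }
        @Override public boolean equals(Object o) {
            return o instanceof MyObject && ((MyObject) o).id.equals(id);
        }
        @Override public int hashCode() { return id.hashCode(); }
    }

    // Replaces each element of 'second' that equals an element of 'first'
    // with the instance held by 'first', so both lists share one object.
    static <T> void shareInstances(List<T> first, List<T> second) {
        Map<T, T> canonical = new HashMap<>();
        for (T item : first) {
            canonical.putIfAbsent(item, item);
        }
        ListIterator<T> it = second.listIterator();
        while (it.hasNext()) {
            T known = canonical.get(it.next());
            if (known != null) {
                it.set(known);
            }
        }
    }

    public static void main(String[] args) {
        // Two equal but distinct instances, as two separate streams would produce.
        List<MyObject> list1 = new ArrayList<>();
        List<MyObject> list2 = new ArrayList<>();
        list1.add(new MyObject("a"));
        list2.add(new MyObject("a"));
        shareInstances(list1, list2);
        System.out.println(list1.get(0) == list2.get(0)); // prints "true"
    }
}
```

This only works if equal objects are truly interchangeable; if two distinct objects can compare equal, the pass would merge them.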
According to section 3.7 of the spec, "The readResolve Method", the readResolve() method is not invoked on the object until the object is fully constructed.
I think that sections 3.4 and A.2 of the spec say that deserialization strictly results in the creation of new objects, but I'm not sure. If so, some possible solutions might involve: ...
2. Creating a "container class" to hold everything and then serializing the container class.
I read these statements as "if my understanding about deserialization always creating new objects is incorrect, then solution #2 of writing both lists wrapped in a container class to a single stream is an acceptable solution."
If I am understanding you correctly, this means you think writing both lists out through a single container won't work because it will still result in duplicate objects ("strictly results in ... new objects"). This is incorrect. When the graph of objects (your wrapper class) is written out, each object is serialized only once, no matter how many times it occurs in the graph. When the graph is read back in, that object is not duplicated.
http://java.sun.com/javase/6/docs/api/java/io/ObjectOutputStream.html
The default serialization mechanism for an object writes the class of the object, the class signature, and the values of all non-transient and non-static fields. References to other objects (except in transient or static fields) cause those objects to be written also. Multiple references to a single object are encoded using a reference sharing mechanism so that graphs of objects can be restored to the same shape as when the original was written.
So, if you can, use option #2.
Creating a "container class" to hold everything and then serializing the container class.
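A minimal sketch of option #2, assuming a hypothetical Container class that holds both lists. An in-memory byte array stands in for the file so the example is self-contained; the roundTrip helper is likewise an illustrative name, not part of the original question.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

class ContainerDemo {
    static class MyObject implements Serializable {
        private static final long serialVersionUID = 1L;
    }

    // Hypothetical wrapper that holds both lists so they travel in one stream.
    static class Container implements Serializable {
        private static final long serialVersionUID = 1L;
        final List<MyObject> list1 = new ArrayList<>();
        final List<MyObject> list2 = new ArrayList<>();
    }

    // Serializes the container to memory and reads it back.
    static Container roundTrip(Container original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
                oos.writeObject(original);
            }
            try (ObjectInputStream ois = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Container) ois.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        MyObject obj = new MyObject();
        Container container = new Container();
        container.list1.add(obj);
        container.list2.add(obj);

        Container copy = roundTrip(container);
        // Both lists in the copy point at the same deserialized instance.
        System.out.println(copy.list1.get(0) == copy.list2.get(0)); // prints "true"
        // The copy is still a new object, distinct from the original.
        System.out.println(copy.list1.get(0) == obj); // prints "false"
    }
}
```

The sharing comes from the stream, not from the wrapper as such: writing list1 and list2 one after the other to the same ObjectOutputStream (without calling reset() in between) and reading them back from one ObjectInputStream should preserve the shared reference as well.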
You can override the readResolve() method to replace what's read from the stream with anything you want.
private Object readResolve() throws ObjectStreamException {
...
}
This is typically used for enforcing singletons. Prior to Java 5 it was also used for typesafe enums. I've never seen it used for this scenario, but I guess there's no reason it couldn't be.
Now this will work with individual objects that you control, but I can't see how you'd make it work with a List itself. It could ensure that the objects returned in that list aren't duplicated (by whatever criteria you deem appropriate).
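A minimal sketch of how readResolve() could deduplicate list elements across separate streams, assuming each object carries a unique id. The id field, the CANONICAL registry, and the roundTrip helper are hypothetical names introduced for illustration. The first deserialized instance with a given id registers itself, and later copies resolve to it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamException;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class ReadResolveDemo {
    static class MyObject implements Serializable {
        private static final long serialVersionUID = 1L;
        // Hypothetical registry of canonical instances, keyed by an assumed unique id.
        private static final ConcurrentMap<String, MyObject> CANONICAL =
                new ConcurrentHashMap<>();
        private final String id;

        MyObject(String id) { this.id = id; }

        // Called after this object is fully read; the returned object replaces it.
        private Object readResolve() throws ObjectStreamException {
            MyObject existing = CANONICAL.putIfAbsent(id, this);
            return existing != null ? existing : this;
        }
    }

    // Serializes a value to its own in-memory stream and reads it back.
    @SuppressWarnings("unchecked")
    static <T> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(bytes)) {
                oos.writeObject(value);
            }
            try (ObjectInputStream ois = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) ois.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        MyObject obj = new MyObject("a");
        List<MyObject> list1 = new ArrayList<>();
        List<MyObject> list2 = new ArrayList<>();
        list1.add(obj);
        list2.add(obj);

        // Two separate streams, as with file1.ser and file2.ser.
        List<MyObject> copy1 = roundTrip(list1);
        List<MyObject> copy2 = roundTrip(list2);
        System.out.println(copy1.get(0) == copy2.get(0)); // prints "true"
    }
}
```

Note that this registry holds strong references for the lifetime of the class, and that the state of the first instance read wins over later copies; whether that is acceptable depends on the application.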