What is the best way to fully read a stream of objects from a file in Java?

I'm creating a potentially long log of objects and do not want to keep them all in memory before writing to a file, so I can't write a serialized collection of the objects to a file. I'm trying to find out the 'best' way of reading in the entire stream of objects after logging has been finished.

I have noticed that the following does not work:

FileInputStream fis = new FileInputStream(log);
ObjectInputStream in = new ObjectInputStream(fis);
while ((obj = in.readObject()) != null) {
  // do stuff with obj
}

because the stream throws an exception when it reaches the end of a file rather than returning null (presumably because one can write/read null to object streams, causing the above loop not to behave as expected).

Is there a better way to do something like what I want to accomplish with the above loop than:

FileInputStream fis = new FileInputStream(log);
ObjectInputStream in = new ObjectInputStream(fis);
try {
  while (true) {
    obj = in.readObject();
    // do stuff with obj
  }
} catch (EOFException e) {
}

This seems a little clumsy. For an end-of-file object solution, is the following the best way?

private static final class EOFObject implements Serializable {
  private static final long serialVersionUID = 1L;
}

void foo() {
  Object obj;
  while (!((obj = in.readObject()) instanceof EOFObject)) {
    BidRequest bidRequest = ((BidRequestWrapper) obj).getBidRequest();
    bidRequestList.add(bidRequest);
  }
}


Your solution seems fine. Just make sure you close the stream in a finally clause.
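A minimal sketch of that loop, assuming Java 7 or later so that try-with-resources can take the place of an explicit finally block. The class name ReadUntilEof and the method forEachObject are made up for illustration; objects are handed to a callback one at a time, so nothing forces the whole log into memory.

```java
import java.io.EOFException;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.function.Consumer;

final class ReadUntilEof {

    // Passes each object in the file to the callback and returns how many were read.
    // Hypothetical helper, not part of any library.
    static int forEachObject(File log, Consumer<Object> action)
            throws IOException, ClassNotFoundException {
        int count = 0;
        // try-with-resources closes the stream on normal exit and on any exception
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(log))) {
            while (true) {
                action.accept(in.readObject());
                count++;
            }
        } catch (EOFException endOfStream) {
            // Expected: readObject() signals the end of the stream by throwing
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        File log = File.createTempFile("objects", ".log");
        log.deleteOnExit();
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(log))) {
            out.writeObject("first");
            out.writeObject("second");
        }
        int n = forEachObject(log, obj -> System.out.println(obj));
        System.out.println(n + " objects read");
    }
}
```

One caveat with this approach: a file that was cut off in the middle of an object also ends with an EOFException, so the loop cannot tell a clean end from a truncated log.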

Alternatively, you can define your own EOF marker object and write it as the last object in the stream. While reading, check whether the object you just read is the EOF marker, and break out of the loop at that point.
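Here is a rough sketch of the marker approach covering both the writing and the reading side. SentinelLog, EndOfLog, writeAll and readAll are illustrative names I picked, not anything from the question's code base.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Arrays;
import java.util.function.Consumer;

final class SentinelLog {

    // Marker written once, after the last real entry (hypothetical class)
    private static final class EndOfLog implements Serializable {
        private static final long serialVersionUID = 1L;
    }

    static void writeAll(File log, Iterable<? extends Serializable> entries) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(log))) {
            for (Serializable entry : entries) {
                out.writeObject(entry);
            }
            out.writeObject(new EndOfLog()); // tells the reader to stop
        }
    }

    // Returns the number of real entries, not counting the marker
    static int readAll(File log, Consumer<Object> action)
            throws IOException, ClassNotFoundException {
        int count = 0;
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(log))) {
            Object obj;
            while (!((obj = in.readObject()) instanceof EndOfLog)) {
                action.accept(obj);
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        File log = File.createTempFile("sentinel", ".log");
        log.deleteOnExit();
        writeAll(log, Arrays.asList("first", "second"));
        readAll(log, obj -> System.out.println(obj));
    }
}
```

If the writer dies before it gets to write the marker, readAll still ends with an EOFException, so the caller has to be prepared for that case anyway.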


I'm creating a potentially long log of objects and do not want to keep them all in memory before writing to a file, so I can't write a serialized collection of the objects to a file

This requirement is not met when using Java serialization, because the serialization stream maintains strong references to the objects previously written, presumably in order to write back references should these objects need to be serialized again. This can be verified by running:

public static void main(String[] args) throws Exception {
    OutputStream os = new FileOutputStream("C:\\test");
    ObjectOutputStream oos = new ObjectOutputStream(os);
    for (Integer i = 0; i < 1E9; i++) {
        oos.writeObject(i);
    }
    oos.close();
}

A similar problem exists when deserializing the file. The input stream is very likely to keep all previously read objects alive so that it can resolve back references to them later in the serialization stream.

If you really need to be able to release these objects before the stream is fully written, you might wish to use a fresh ObjectOutputStream for each object (or batch of objects), or call ObjectOutputStream.reset(), of course losing the capability to resolve back references to objects written before the reset. That is, the following program will not throw an OutOfMemoryError:

public static void main(String[] args) throws Exception {
    OutputStream os = new FileOutputStream("C:\\test");
    ObjectOutputStream oos = new ObjectOutputStream(os);
    for (Integer i = 0; i < 1E9; i++) {
        oos.writeObject(i);
        oos.reset();
    }
    oos.close();
}

Note that the metadata about the classes being serialized will be written anew after each reset, which is quite wasteful (the above program writes about 80 bytes per Integer ...), so you should not reset too often, perhaps once every 100 objects?
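A sketch of the batched variant, assuming a reset interval of 100 as floated above (a guess to tune, not a measured optimum). BatchedResetWriter, RESET_INTERVAL and writeAll are names made up for this example.

```java
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class BatchedResetWriter {

    // Assumed batch size: larger values repeat less class metadata
    // but keep more written objects reachable between resets
    static final int RESET_INTERVAL = 100;

    static void writeAll(File log, Iterator<? extends Serializable> entries) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(
                new BufferedOutputStream(new FileOutputStream(log)))) {
            long written = 0;
            while (entries.hasNext()) {
                out.writeObject(entries.next());
                written++;
                if (written % RESET_INTERVAL == 0) {
                    // Drops the stream's references to everything written so far
                    out.reset();
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        File log = File.createTempFile("batched", ".log");
        log.deleteOnExit();
        List<Integer> entries = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            entries.add(i);
        }
        writeAll(log, entries.iterator());
        System.out.println("wrote " + log.length() + " bytes");
    }
}
```

The reading side needs no changes: ObjectInputStream handles the reset markers itself while reading.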

For detecting the end of stream, I find bozho's suggestion of an EOF object best.


Write a boolean after each object, with the "last" object being followed by a false. So, in your stream that you write out:

true
<object>
true
<object>
true
<object>
false

Then, when reading them back in, you check the flag (you know there will always be one after each object) to decide whether or not to read another one.

A boolean is stored compactly in a serialization stream (one byte of data plus a small block-data header when it sits between objects), so it shouldn't add much to the file size.
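As a sketch, the flag protocol could look like this, with writeBoolean and readBoolean on the object streams themselves. FlaggedLog, writeAll and readAll are names invented for the example.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Arrays;
import java.util.function.Consumer;

final class FlaggedLog {

    static void writeAll(File log, Iterable<? extends Serializable> entries) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(log))) {
            for (Serializable entry : entries) {
                out.writeBoolean(true);  // "another object follows"
                out.writeObject(entry);
            }
            out.writeBoolean(false);     // "no more objects"
        }
    }

    // Returns the number of objects read
    static int readAll(File log, Consumer<Object> action)
            throws IOException, ClassNotFoundException {
        int count = 0;
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(log))) {
            // Reads must mirror the writes: one flag, then one object
            while (in.readBoolean()) {
                action.accept(in.readObject());
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        File log = File.createTempFile("flagged", ".log");
        log.deleteOnExit();
        writeAll(log, Arrays.asList("first", "second"));
        readAll(log, obj -> System.out.println(obj));
    }
}
```

Unlike a marker object, this also works when the entries themselves may be null, because the flag rather than the value decides when to stop.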


Your code is incorrect. readObject() doesn't return null at end of stream; it throws EOFException. So catch it. Null is returned only if you wrote a null. You don't need all the booleans or marker objects suggested above.
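A small in-memory demonstration of the difference, using byte array streams instead of a file to keep it short. NullIsNotEof and roundTrip are names made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.List;

final class NullIsNotEof {

    // Serializes the values, then reads them back until EOFException
    static List<Object> roundTrip(Object... values) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            for (Object value : values) {
                out.writeObject(value); // null is a legal value here
            }
        }
        List<Object> result = new ArrayList<>();
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            while (true) {
                result.add(in.readObject()); // returns null only where null was written
            }
        } catch (EOFException endOfStream) {
            // The only end-of-stream signal readObject() gives
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // prints [first, null, third]
        System.out.println(roundTrip("first", null, "third"));
    }
}
```

A loop of the form while ((obj = in.readObject()) != null) would have stopped after "first" on this data and never seen "third".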
