How to refactor this IO code? [closed]

Closed. This question is off-topic. It is not currently accepting answers.

Closed 11 years ago.

I need to read records from a flat file in which every 128 bytes constitute one logical record. The module that calls the reader below does only the following:

while (iterator.hasNext()) {
    iterator.next();
    // do something
}

That is, every hasNext() invocation is followed by exactly one next() call.

Here is the reader:

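import java.io.FileInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.Charset;
import java.util.Iterator;
import java.util.LinkedList;

public class FlatFileiteratorReader implements Iterable<String> {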

    FileChannel fileChannel;

public FlatFileiteratorReader(FileInputStream fileInputStream) {
    fileChannel = fileInputStream.getChannel();
}

private class SampleFileIterator implements Iterator<String> {
    Charset charset = Charset.forName("ISO-8859-1");
    ByteBuffer byteBuffer = MappedByteBuffer.allocateDirect(128 * 100);
    LinkedList<String> recordCollection = new LinkedList<String>();
    String record = null;

    @Override
    public boolean hasNext() {
        if (!recordCollection.isEmpty()) {
            record = recordCollection.poll();
            return true;
        } else {
            try {
                int numberOfBytes = fileChannel.read(byteBuffer);
                if (numberOfBytes > 0) {
                    byteBuffer.rewind();
                    loadRecordsIntoCollection(charset.decode(byteBuffer)
                            .toString().substring(0, numberOfBytes),
                            numberOfBytes);
                    byteBuffer.flip();
                    record = recordCollection.poll();
                    return true;
                }
            } catch (IOException e) {
                // Report Exception. Real exception logging code in place
            }
        }
        try {
            fileChannel.close();
        } catch (IOException e) {
            // TODO Report Exception. Logging
        }
        return false;

    }

    @Override
    public String next() {
        return record;
    }

    @Override
    public void remove() {
        // NOT required

    }

    /**
     * 
     * @param records
     * @param length
     */
    private void loadRecordsIntoCollection(String records, int length) {
        int numberOfRecords = length / 128;
        for (int i = 0; i < numberOfRecords; i++) {
            recordCollection.add(records.substring(i * 128, (i + 1) * 128));
        }
    }

}

    @Override
    public Iterator<String> iterator() {
        return new SampleFileIterator();
    }
}

The code reads 80 MB of data in 1.2 seconds on a machine with a 7200 RPM HDD, running the Sun JVM on Windows XP. But I'm not satisfied with the code I have written. Is there a better way to write it? I am asking especially about decoding to the character set while taking only the bytes that were actually read, i.e. the charset.decode(byteBuffer).toString().substring(0, numberOfBytes) part. Please ignore the // TODO items.


  1. There is no particular advantage to using a direct buffer here. You have to get the data across the JNI boundary into Java-land anyway, so you may as well use a normal ByteBuffer. Direct buffers pay off when you are copying data and don't want to look at it yourself.

  2. Use a ByteBuffer whose capacity is a multiple of 512, e.g. 8192, so you aren't driving the I/O system and disk controller mad with reads across sector boundaries. In this case I would think about using 128*512 so that the capacity is also a multiple of your record length.

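  3. The .substring(0, numberOfBytes) becomes unnecessary if you call flip() instead of rewind() after the read. flip() sets the limit to the current position (the number of bytes read) and the position to zero, so charset.decode() already delivers the correct amount of data. rewind() only resets the position and leaves the limit at the capacity, which is why the substring is needed in the code as written. Call clear() on the buffer before the next read.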

  4. You're assuming that FileChannel.read() never returns a short read. You can't assume that; nothing in the Javadoc supports it. You need to keep reading until the buffer is full or you get EOF.
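
The four points above could be combined roughly as in the following sketch. It is not the original poster's code: the class name FixedRecordChannelReader, the fill() helper and the Consumer-based callback are hypothetical names chosen for illustration, and Java 8 or later is assumed. It uses a heap buffer of 128*512 bytes, keeps reading until the buffer is full or EOF is reached, and calls flip() so that decode() sees exactly the bytes that were read.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.function.Consumer;

// Hypothetical helper class, not part of the original question.
final class FixedRecordChannelReader {
    static final int RECORD_LENGTH = 128;
    // Multiple of 512 and of the record length (point 2).
    static final int BUFFER_SIZE = RECORD_LENGTH * 512;

    /**
     * Reads until the buffer is full or EOF is reached (point 4), then flips
     * the buffer so position = 0 and limit = number of bytes read (point 3).
     * Returns false when nothing at all could be read.
     */
    static boolean fill(FileChannel channel, ByteBuffer buffer) throws IOException {
        buffer.clear();
        while (buffer.hasRemaining()) {
            if (channel.read(buffer) < 0) {
                break; // EOF
            }
        }
        buffer.flip();
        return buffer.hasRemaining();
    }

    /** Passes every complete 128-byte record in the file to the handler. */
    static void forEachRecord(Path file, Consumer<String> handler) throws IOException {
        // Plain heap buffer instead of a direct buffer (point 1).
        ByteBuffer buffer = ByteBuffer.allocate(BUFFER_SIZE);
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            while (fill(channel, buffer)) {
                // ISO-8859-1 maps one byte to one char, so no substring is needed.
                String chunk = StandardCharsets.ISO_8859_1.decode(buffer).toString();
                // A trailing partial record at EOF is ignored, as in the question's code.
                for (int i = 0; i + RECORD_LENGTH <= chunk.length(); i += RECORD_LENGTH) {
                    handler.accept(chunk.substring(i, i + RECORD_LENGTH));
                }
            }
        }
    }
}
```

A caller would pass a callback, for example FixedRecordChannelReader.forEachRecord(path, record -> process(record)), where process is whatever the calling module does with each record. Wrapping the same loop in an Iterator<String> is possible too, but then hasNext() should not consume a record as it does in the question.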

Having said all that, I would also experiment with a BufferedReader around an InputStreamReader around the FileInputStream, and just read 128 chars at a time. You might get a surprise as to which is faster.
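
A minimal sketch of that alternative, for timing it against the FileChannel version, might look like this. The class name FixedRecordStreamReader, the readFully() helper and the Consumer callback are hypothetical, and Java 8 or later is assumed. Reader.read() may also return fewer chars than requested, so the helper loops until a whole record has been read.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.function.Consumer;

// Hypothetical helper class, not part of the original question.
final class FixedRecordStreamReader {
    static final int RECORD_LENGTH = 128;

    /** Passes every complete 128-char record in the file to the handler. */
    static void forEachRecord(Path file, Consumer<String> handler) throws IOException {
        char[] record = new char[RECORD_LENGTH];
        try (Reader reader = new BufferedReader(new InputStreamReader(
                new FileInputStream(file.toFile()), StandardCharsets.ISO_8859_1))) {
            while (readFully(reader, record)) {
                handler.accept(new String(record));
            }
        }
    }

    /**
     * Fills the whole array, looping over short reads.
     * Returns false if EOF arrives before the array is full,
     * so a trailing partial record is dropped.
     */
    static boolean readFully(Reader reader, char[] target) throws IOException {
        int filled = 0;
        while (filled < target.length) {
            int n = reader.read(target, filled, target.length - filled);
            if (n < 0) {
                return false; // EOF
            }
            filled += n;
        }
        return true;
    }
}
```

Which of the two variants is faster depends on the JVM, the operating system and the disk, so it is worth measuring both on the real file.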
