开发者

How to make a concurrent iterator over some source?

I would like to have an iterator that can be read by multiple threads concurrently so that I can process the data of the iterator's source in parallel. The challenge is that I can't really couple hasNext() with its logical next() as thos开发者_如何学JAVAe could go to different threads. (That is, two threads can call hasNext(), each see true, and then have the second thread fail because there was only one item.) My problem is that for some sources I don't really know if it has a next element until I try to read it. One such example is reading lines from a file; another is reading Term instances from a Lucene index.

I was thinking of setting up a queue inside the iterator and feeding the queue with a separate thread. That way, hasNext() is implemented in terms of the queue size. But I don't see how I could guarantee that the queue is filled because that thread could get starved.

Should I ignore the Iterator contract, and just call next() exhaustively until a NoSuchElementException is thrown?

Is there a more elegant way of handling the problem?


Can your threads just pull from a BlockingQueue instead of an Iterator. As you have discovered, Iterators are not well suited for concurrent access.

Pass a LinkedBlockingQueue, and have your threads do queue.poll() until nothing is left.


One workaround / escape comes to my mind, to keep (most of) the contract and avoid NoSuchElementExceptions: The iterator.next() could return a custom "End-of-iteration" marker object, that can be processed but is nothing but a dummy. So if one thread receives a true for hasNext() but another thread already grabbed the last item, then the first thread will get a dummy (instead of an exception).

You should be able to use this kind of iterator in all normal use cases and single threaded uses should even notice the difference. Should be useable with the enhanced for loop too.

It will only fail if one tries to wait for NoSuchElementException instead of checking hasNext(), because that exception will not be thrown because of the dummy items.


The chosen answer will work, but it introduces complexity and potential unnecessary buffering. Why not ignore Iterator’s contract and write your own:

public interface ConcurrentIterator<T> {

    T next() throws EndOfIterationException;

}

This will be thread-safe if your implementation is. Can even wrap an Iterator in it.


As an up-to-date answer I think you should use a ConcurrentLinkedQueue which is available since Java 1.5.


I could be missing the point, but couldn't you use a synchronized block in this situation?

synchronized(iterator)
{
    if (iterator.hasNext()) element = iterator.next();
}

Here, when one thread is using the iterator, no other threads will be able to access it.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜