Committed changes visibility in Lucene. Best practices
I'm using Lucene 3.0.3. I've written a Spring bean that encapsulates all operations on a single index:
public class IndexOperations {
    private IndexWriter writer;
    private IndexReader reader;
    private IndexSearcher searcher;

    public void init() {...}
    public void destroy() {...}
    public void save(Document d) {...}
    public void delete(Document d) {...}
    public List<Document> list() {...}
}
To allow fast updates and searches, I thought keeping the writer, reader and searcher open would be a good idea. The problem is that changes committed through the writer can't be seen by the reader until it is reopened. Reopening can be costly, so it may not be a good fit for fast searches.
What would be the best approach for this typical scenario?
You should keep the writer open at all times, but don't hold on to the reader/searcher. When you need a searcher, create one from the writer: IndexSearcher srch = new IndexSearcher(writer.getReader());
This way the searcher will get the most recent changes, even if they aren't flushed to disk yet (giving you the best of both worlds).
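As a rough sketch of that pattern against the Lucene 3.0.x API: the class name NrtIndex, the "id" field, and the save/count methods are made up for illustration, and an in-memory RAMDirectory stands in for a real index directory. Note that the reader returned by writer.getReader() should be closed once the search is done.

```java
import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

// Hypothetical helper: one long-lived writer, short-lived readers.
public class NrtIndex {
    private final IndexWriter writer;

    public NrtIndex(Directory dir) throws IOException {
        // The writer stays open for the lifetime of this object.
        writer = new IndexWriter(dir, new StandardAnalyzer(Version.LUCENE_30),
                true, IndexWriter.MaxFieldLength.UNLIMITED);
    }

    // Adds a document with a single untokenized "id" field (assumed schema).
    public void save(String id) throws IOException {
        Document d = new Document();
        d.add(new Field("id", id, Field.Store.YES, Field.Index.NOT_ANALYZED));
        writer.addDocument(d);
    }

    // Counts matches using a fresh reader obtained from the writer.
    public int count(Query q) throws IOException {
        IndexReader reader = writer.getReader();
        try {
            IndexSearcher searcher = new IndexSearcher(reader);
            return searcher.search(q, 1).totalHits;
        } finally {
            // The searcher does not own the reader, so close it ourselves.
            reader.close();
        }
    }

    public void destroy() throws IOException {
        writer.close();
    }

    public static void main(String[] args) throws IOException {
        NrtIndex index = new NrtIndex(new RAMDirectory());
        index.save("1");
        // commit() has not been called, yet the document is searchable.
        System.out.println(index.count(new TermQuery(new Term("id", "1"))));
        index.destroy();
    }
}
```

Opening a reader per search keeps the code simple; if that turns out to be too expensive under load, caching the reader and refreshing it only after writes is a possible refinement.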
For this sort of use case, I can highly recommend Compass, which is a higher-level abstraction around Lucene. Specific to your question, it provides better concurrency control, plus transactions, which removes the need to manage the reader/writer/searcher lifecycle by hand. It's pretty clever stuff, and to be honest it can be pretty baroque, but it's a good solution to the problem.
On the downside, it's based on Lucene 2.9 rather than 3.0, so it's not as fast as 3.0 can be, and it's no longer actively maintained, but it's stable, fairly well-documented, and much easier to use than raw Lucene.