What is the best Java text indexing library for Google App Engine?

So far I know that Compass can handle this job, but indexing with Compass looks pretty expensive. Are there any lighter alternatives?


To be honest, I don't know if Lucene will be lighter than Compass in terms of indexing (why would it be, doesn't Compass use Lucene for that?).

Anyway, because you asked for alternatives, there is GAELucene. I'm quoting its announcement below:

Enlightened by the discussion "Can I run Lucene in google app engine?", I implemented a google datastore based Lucene component, GAELucene, which can help you to run search applications on google app engine.

The main classes of GAELucene include:

  • GAEDirectory - a read-only Directory based on google datastore.
  • GAEFile - stands for an index file; the file's byte content is split into multiple GAEFileContent entities.
  • GAEFileContent - stands for a segment of an index file.
  • GAECategory - the identifier of different indices.
  • GAEIndexInput - a memory-resident IndexInput implementation like the RAMInputStream.
  • GAEIndexReader - a wrapper for IndexReader that is cached in GAEIndexReaderPool.
  • GAEIndexReaderPool - pool for GAEIndexReader
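To make the GAEFile / GAEFileContent relationship concrete, here is a minimal sketch of splitting an index file's bytes into fixed-size segments and joining them again. It is not GAELucene's code: the class name ChunkedFileSketch, the methods split and join, and the chunk size are all assumptions chosen for illustration. The likely motivation for such a split is that a single datastore entity is limited in size (roughly 1 MB).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of GAELucene: illustrates splitting one
// index file's bytes into fixed-size segments and joining them again.
class ChunkedFileSketch {

    // Splits content into segments of at most chunkSize bytes each.
    static List<byte[]> split(byte[] content, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<byte[]> segments = new ArrayList<>();
        for (int start = 0; start < content.length; start += chunkSize) {
            int end = Math.min(content.length, start + chunkSize);
            segments.add(Arrays.copyOfRange(content, start, end));
        }
        return segments;
    }

    // Concatenates the segments, in order, back into one byte array.
    static byte[] join(List<byte[]> segments) {
        int total = 0;
        for (byte[] segment : segments) {
            total += segment.length;
        }
        byte[] out = new byte[total];
        int pos = 0;
        for (byte[] segment : segments) {
            System.arraycopy(segment, 0, out, pos, segment.length);
            pos += segment.length;
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] file = new byte[10];
        for (int i = 0; i < file.length; i++) {
            file[i] = (byte) i;
        }
        // A tiny chunk size keeps the demo readable; a real store would
        // pick something just under its per-entity size limit.
        List<byte[]> segments = split(file, 4);
        System.out.println(segments.size() + " segments, round trip ok: "
                + Arrays.equals(file, join(segments)));
    }
}
```

In a real datastore-backed Directory each segment would be stored as its own entity keyed by file name and segment index, so that a reader can fetch the segments in order and reassemble the file.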

The following code snippet demonstrates how to use GAELucene for searching:

Query queryObject = parserQuery(request);
GAEIndexReaderPool readerPool = GAEIndexReaderPool.getInstance();
GAEIndexReader indexReader = readerPool.borrowReader(INDEX_CATEGORY_DEMO);
IndexSearcher searcher = new IndexSearcher(indexReader);
Hits hits = searcher.search(queryObject);
readerPool.returnReader(indexReader);
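One thing worth noting about the borrow/return pattern in that snippet: if the search throws, the reader is never handed back to the pool. A common way to guard against that is try/finally. The sketch below shows the idea with a stand-in pool; the Pool class, searchWith, and the string "readers" are hypothetical and are not the GAELucene API.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for a reader pool; not the GAELucene implementation.
class BorrowReturnSketch {

    static class Pool {
        private final Deque<String> idle = new ArrayDeque<>();

        Pool(String... readers) {
            for (String reader : readers) {
                idle.push(reader);
            }
        }

        synchronized String borrow() {
            if (idle.isEmpty()) {
                throw new IllegalStateException("pool exhausted");
            }
            return idle.pop();
        }

        synchronized void giveBack(String reader) {
            idle.push(reader);
        }

        synchronized int idleCount() {
            return idle.size();
        }
    }

    // Runs a pretend search that may throw; the reader is returned either way.
    static String searchWith(Pool pool, boolean fail) {
        String reader = pool.borrow();
        try {
            if (fail) {
                throw new RuntimeException("search failed");
            }
            return "hits from " + reader;
        } finally {
            pool.giveBack(reader);
        }
    }

    public static void main(String[] args) {
        Pool pool = new Pool("reader-1");
        try {
            searchWith(pool, true);
        } catch (RuntimeException e) {
            // Expected in this demo: the search failed on purpose.
        }
        System.out.println("idle readers after failure: " + pool.idleCount());
    }
}
```

Applied to the snippet above, that would mean putting the search call inside the try block and readerPool.returnReader(indexReader) inside finally.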

I warmly recommend reading the whole discussion on Nabble; it is very informative.

Just in case, regarding Compass, Shay Banon wrote a blog entry detailing how to use Compass in App Engine here: http://www.kimchy.org/searchable-google-appengine-with-compass/


Apache Lucene is the de facto choice for full-text indexing in Java. It looks like Compass Core contains "An implementation of Lucene Directory to store the index within a database (using Jdbc). It is separated from Compass code base and can be used with pure Lucene applications." plus tons of other stuff. You could try to separate out just the Lucene component, thereby stripping away several libraries and making it more lightweight. Either that, or ditch Compass altogether and use pure, unadorned Lucene.


For Google App Engine, the only indexing library I've seen is appengine-search, with a description of how to use it on this page. I haven't tried it out though.

I've used Lucene (which Compass is based on) and found it to work great with comparatively low expense. The indexing is a task that you can schedule at times that work for your app.

Some alternative indexing projects are mentioned in this SO thread, including Xapian and Minion. I haven't checked either of these out though, since Lucene did everything I needed very well.


The built-in Google App Engine search service seems better, and it even has support for synonyms:

https://developers.google.com/appengine/docs/java/search/


If you want to run Lucene on GAE you might also have a look at LuGAEne. It's an implementation of Lucene's Directory for GAE.

Usage is actually pretty simple: just replace one of Lucene's standard directories with GaeDirectory:

Directory directory = new GaeDirectory("MyIndex");
Analyzer analyzer = new StandardAnalyzer(Version.LUCENE_43);
IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_43, analyzer);
IndexWriter writer = new IndexWriter(directory, config);
...

gaelucene seems to be in "maintenance mode" (no commit since Sep 2009) and lucene-appengine does not (yet) work when you're using Objectify version 4 in your application.

Disclaimer: I'm the author of LuGAEne.
