开发者

What Linux Full Text Indexing Tool Has A Good C++ API?

I'm looking to add full text indexing to a Linux desktop application written in C++. I am thinking that the easiest way to do this would be to call an existing library or utility. This article reviews various open source utilities available for the Gnome and KDE desktops; metatracker, recoll and stigi are al开发者_如何学Cl written in C++ so they each seem reasonable. But I cannot find any notable documentation on how to use them as libraries or through an API. I could, instead, use something like Clucene or Xapian, which are generic full text indexing libraries. They seem more straightforward but if I used them, I'd have to implement my own indexing daemon, an unappealing prospect.

Also, Xesam seems to be the latest thing, does anyone have any evidence that it works?

So, does anyone have experience using any of the applications or libraries? How did you use it and what documentation was useful?


I used CLucene, which you mentioned (and also Lucene.NET), and found it to be pretty good.


There's also Strigi which AFAIK works with Xesam and is the default used in KDE.


After further looking around, I found and worked with Recol. It believe that it has the best C++ interface to a full text search engine, in this case Xapian.

It is important to realize that clucene and Xapian are both highly complex libraries designed primarily for multi-user server applications. Cutting them down to a level appropriate for a client-system is not easy. If I remember correctly, Strigi has a complex, pure C interface which isn't adapted.

Clucene also doesn't seem to be that actively maintained currently and Xapian seems to be maintained. But the thing is the existence of recol, which allows you to index particular files without the massive, massive setup that raw Xapian or clucene requires - creating your own "stemming" set is not normally desirable, etc.

0

上一篇:

下一篇:

精彩评论

暂无评论...
验证码 换一张
取 消

最新问答

问答排行榜