How do I put PowerPoint and Excel documents in a full-text search index like Sphinx or PostgreSQL text search?

I have a Rails application that accepts file uploads of arbitrary business documents, such as Word, Excel, PowerPoint, and PDF files. I need to make all these documents searchable, preferably using Sphinx or PostgreSQL full-text search. What are the best solutions?


As pointed out in the comments, this is covered pretty well by an older question.

In short: for Sphinx you're going to have to store the text extracted from those files in the database, and likely for PostgreSQL full-text search as well. Sphinx can now also index plain text files (as long as a database column points to the file), but that will still require another tool to extract the text from PDF, DOC, XLS, and similar formats.
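As a rough illustration of the extraction step, here is a minimal Ruby sketch that shells out to command-line converters and returns plain text. It assumes `pdftotext` (from poppler-utils) and `catdoc`, `xls2csv`, and `catppt` (from the catdoc package) are installed; the `EXTRACTORS` table and the method names are hypothetical, and the newer .docx/.xlsx/.pptx formats would need a different tool.

```ruby
require "open3"

# Hypothetical mapping from file extension to an external extractor command.
# Assumes pdftotext (poppler-utils) and catdoc/xls2csv/catppt (catdoc package)
# are on the PATH; swap in whatever converters you actually use.
EXTRACTORS = {
  ".pdf" => ->(path) { ["pdftotext", path, "-"] }, # "-" sends text to stdout
  ".doc" => ->(path) { ["catdoc", path] },
  ".xls" => ->(path) { ["xls2csv", path] },
  ".ppt" => ->(path) { ["catppt", path] }
}.freeze

# Returns the command (as an argument array) for a file, or nil if the
# extension is not in the table.
def extractor_command(path)
  builder = EXTRACTORS[File.extname(path).downcase]
  builder && builder.call(path)
end

# Returns the extracted text, or nil when the format is unsupported,
# the tool is missing, or the tool exits with an error.
def extract_text(path)
  return File.read(path) if File.extname(path).downcase == ".txt"

  command = extractor_command(path)
  return nil unless command

  output, status = Open3.capture2(*command)
  status.success? ? output : nil
rescue Errno::ENOENT
  nil # extractor binary (or the file itself) not found
end
```

In a Rails app, the result of `extract_text` could be saved to a text column on the upload's model (say, a hypothetical `extracted_text` column) after each upload, and that column is what Sphinx or a PostgreSQL `tsvector` index would then cover.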
