
How to index a folder using Lucene.NET

I am trying to develop a search engine in ASP.NET using Lucene.NET. I have gone through many tutorials and pages but could not get the appropriate results. I have a folder containing files of various types (doc, ppt, pdf, excel, etc.), and I want to search the contents of the files in that folder only. If no results are found in that folder, the user should be asked to search on the web.

For example, I have a folder with thousands of files at C:\test. If the user searches for "miller", it should search every document, and if results are found, it should display them like this:

Searched text    File                      No. of occurrences
miller           C:\test\1\file.doc        5
miller           C:\test\1\11\new.doc      2

Please help; I am not getting appropriate results.


Lucene / Lucene.NET is only an indexing engine; you still have to extract the text yourself from each file type you want to support. On Windows you can use the IFilter interface for many file types, and if you have Acrobat Reader 7+ installed there should be built-in IFilter support for PDF files. As for the indexing part itself, there are many samples available.

Also see this thread: "What's a good method for extracting text from a PDF using C# or classic ASP (VBScript)?"
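
The split between text extraction and indexing can be illustrated without Lucene.NET at all. What follows is only a rough Python sketch of the idea, not Lucene.NET code: the names `extract_text`, `build_index`, and `search` are hypothetical, and the extractor here just reads plain-text files, standing in for whatever IFilter or per-format parser you end up using. It walks a folder recursively, counts how often each word occurs in each file, and returns per-file counts for a query term.

```python
import os
import re
from collections import defaultdict


def extract_text(path):
    """Hypothetical extractor: reads the file as plain text only.

    A real system would call IFilter or a format-specific parser here
    for doc, ppt, pdf, and excel files.
    """
    with open(path, encoding="utf-8", errors="ignore") as f:
        return f.read()


def build_index(root):
    """Map each lowercased term to {file path: occurrence count}."""
    index = defaultdict(dict)
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            for term in re.findall(r"\w+", extract_text(path).lower()):
                index[term][path] = index[term].get(path, 0) + 1
    return index


def search(index, term):
    """Return (path, count) pairs, highest count first."""
    hits = index.get(term.lower(), {})
    return sorted(hits.items(), key=lambda item: (-item[1], item[0]))
```

An empty list from `search` is the point at which the application could offer the web search fallback. In Lucene.NET the index and the query would be handled by the library, but the extraction step would still be your own code, as in `extract_text` above.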
