Ad Hoc Reports Hadoop

I want to let people enter simple text search terms, run a Pig job (if that's best; it's what I know best), and output the results (the TSV result files?) so I can show them in a web interface.

Is there anything that approaches this problem?

Is there anything known that links together the few disjointed pieces of the flow I am going for?

Thanks


Why don't you index the docs into Lucene or Solr? Then you can do text search in real time. Hadoop is designed for batch-oriented processing, which doesn't seem like what you want in this case.
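To make the "index once, query in real time" idea concrete, here is a toy inverted index in plain Python. It is only a sketch of the principle that Lucene and Solr build on, not their API; the names `build_index` and `search` and the sample documents are made up for illustration.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercase token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every query term (AND semantics)."""
    terms = query.lower().split()
    if not terms:
        return set()
    # Copy the first posting set so the index itself is never mutated.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical sample documents, indexed once up front.
docs = {
    1: "hadoop runs batch jobs",
    2: "solr serves search queries",
    3: "pig scripts run on hadoop",
}
index = build_index(docs)
```

The expensive step (building the index) happens once, and each query afterwards is a few set lookups. A Hadoop job, by contrast, rescans the input on every run, which is why it suits batch work better than interactive search.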


Well, it depends on your project's requirements: does it need low latency, and how complex are the ad hoc searches? I think HBase + Pig might be a compromise solution. HBase can serve real-time lookups (although its query capabilities are not as powerful as an RDBMS's), and Pig can handle batch processing of large amounts of data.
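If you do stay with Pig for the batch side, the glue between a web form and the job could look roughly like the sketch below. It assumes a Pig script that reads `$term` and `$out` through Pig's parameter substitution, and that the job output has already been copied to a local directory. The function names, the parameter names, and the single-word restriction on the search term are assumptions for illustration, not part of any existing tool.

```python
import csv
import glob
import os
import re


def pig_command(script_path, term, output_dir):
    """Build the argv list for running a Pig script with a search term.

    The term is limited to one alphanumeric word (an assumption of this
    sketch), because Pig substitutes parameters textually into the script.
    """
    if not re.fullmatch(r"[A-Za-z0-9_]+", term):
        raise ValueError("search term must be a single alphanumeric word")
    return [
        "pig",
        "-param", f"term={term}",
        "-param", f"out={output_dir}",
        "-f", script_path,
    ]


def read_tsv_results(output_dir):
    """Read every part-* file in a local copy of the job output as TSV rows."""
    rows = []
    for part in sorted(glob.glob(os.path.join(output_dir, "part-*"))):
        with open(part, newline="", encoding="utf-8") as fh:
            # QUOTE_NONE: tab-separated job output is not CSV-quoted.
            rows.extend(csv.reader(fh, delimiter="\t", quoting=csv.QUOTE_NONE))
    return rows
```

The list returned by `pig_command` can be handed to `subprocess.run` without a shell, and the rows from `read_tsv_results` can be rendered by whatever web framework you use. Expect each request to take as long as a full batch job, so this only fits reports that people are willing to wait for.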
