Fast querying on HBase

2023-01-17 12:11

I am running a little test/poc here.

I need to load a few million rows into a database every day. This is not log-file data: the rows are comma-delimited columns that would fit a relational database exactly.

After loading, I need to offer a very fast search mechanism. Having looked a bit at Google's Bigtable and the ecosystem around it, I originally thought of using Hive integrated with HBase, choosing Hive for its querying capabilities. Loading works fine, with better performance than the RDBMS. However, the query bottleneck, which was the reason to look for alternatives to an RDBMS in the first place, persists with Hive too.

In my tests, Hive's query performance is far from blazing. Perhaps I need to look for alternatives.

Is there something else? Any other tool, solution or library that I can put on top of HBase, or even use without HBase? (I looked at HBase as an alternative to the RDBMS, as a move towards distributed computing.)

Suggestions please...

If you want general search capabilities, you may want to look at solutions like Solr or Elasticsearch instead. HBase works well if you prepare the data for the queries you need (row key design), not for general search. You can also look at Lily, which combines Solr and HBase.
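To make the key-design point concrete, here is a minimal sketch in plain Python. It does not talk to HBase at all; it only imitates the fact that HBase keeps rows sorted by row key, so a query that matches a key prefix becomes a short contiguous range read. The key layout (customer, day, sequence number) and the names `make_row_key` and `prefix_scan` are assumptions made up for illustration, not anything from the question.

```python
import bisect


def make_row_key(customer_id: str, day: str, seq: int) -> bytes:
    """Build a composite row key (hypothetical layout: customer|day|sequence)."""
    # Zero-pad the sequence so byte-wise order matches numeric order.
    return f"{customer_id}|{day}|{seq:08d}".encode("utf-8")


def prefix_scan(sorted_keys, prefix: bytes):
    """Return every key that starts with `prefix` from a sorted list of keys.

    Keys sharing a prefix are contiguous in sorted order, so we jump to the
    first candidate with a binary search and stop at the first mismatch.
    """
    hits = []
    i = bisect.bisect_left(sorted_keys, prefix)
    while i < len(sorted_keys) and sorted_keys[i].startswith(prefix):
        hits.append(sorted_keys[i])
        i += 1
    return hits


# Stand-in for a table: a sorted list of row keys.
keys = sorted(
    make_row_key(customer, day, seq)
    for customer in ("acme", "globex")
    for day in ("20230116", "20230117")
    for seq in range(3)
)

# "All rows for acme on 2023-01-17" touches only the matching range.
hits = prefix_scan(keys, b"acme|20230117|")
```

A query that does not start from the leading part of the key (for example "all rows on 2023-01-17, any customer") would have to walk every key in this layout, which is why the key has to be designed around the queries you expect.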

The problem you have is that Hive runs most of its queries as MapReduce jobs, which are inherently slow.

If you write your own program to run the appropriate scans and then group the results yourself, HBase can be very fast. If you want a query language, though, there are currently no solutions that I am aware of.
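As a rough sketch of what "scan and group it yourself" could look like on the client side: the function below takes an iterable of (row key, columns) pairs and sums one column per distinct value of another. The input shape, the column names `d:region` and `d:amount`, and the function name `group_scan` are all assumptions for illustration; in a real program the pairs would come from whatever scan call your HBase client provides.

```python
from collections import defaultdict


def group_scan(rows, group_col: bytes, value_col: bytes):
    """Sum `value_col` per distinct `group_col` over scanned rows.

    `rows` is any iterable of (row_key, columns) pairs, where `columns`
    maps column names to byte values (an assumed shape, see above).
    """
    totals = defaultdict(int)
    for _row_key, columns in rows:
        group = columns.get(group_col)
        if group is None:
            continue  # skip rows that lack the grouping column
        totals[group] += int(columns.get(value_col, b"0").decode("utf-8"))
    return dict(totals)


# Hand-written stand-in for the result of a scan.
sample_rows = [
    (b"acme|20230117|00000000", {b"d:region": b"eu", b"d:amount": b"10"}),
    (b"acme|20230117|00000001", {b"d:region": b"us", b"d:amount": b"5"}),
    (b"acme|20230117|00000002", {b"d:region": b"eu", b"d:amount": b"7"}),
    (b"acme|20230117|00000003", {b"d:amount": b"99"}),
]

totals = group_scan(sample_rows, b"d:region", b"d:amount")
```

Because the rows are consumed one at a time, the grouping works on a streaming scan without holding the whole result set in memory; only the running totals are kept.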

It's hard to say more than that, as your description of the data and of the kinds of queries you want to run on it is very generic.

It isn't unthinkable to use MySQL for this number of rows. You might try it with some test data and see if you can get away with it.

Have you looked at a Solr- or Lucene-type solution? It is not an SQL solution, but the query language is pretty flexible for some types of uses, and it is extremely fast. There are also ways of distributing it over a cluster of servers for improved performance, scaling either the size of the index, or the number of queries it can handle, or both.
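To give a feel for the query language, here is a small sketch that only builds the URL for a request to Solr's `select` handler, using the standard `q`, `fq`, `rows` and `wt` parameters. Nothing is sent over the network. The core name `orders`, the field names `customer` and `day`, and the helper `solr_select_url` are hypothetical, and the base URL assumes a local Solr on its default port.

```python
from urllib.parse import urlencode


def solr_select_url(base_url, core, query, filters=(), rows=10):
    """Build a URL for Solr's /select handler (sketch; nothing is sent)."""
    params = [("q", query), ("rows", str(rows)), ("wt", "json")]
    # Each filter query narrows the result set without affecting scoring.
    params += [("fq", f) for f in filters]
    return f"{base_url.rstrip('/')}/{core}/select?{urlencode(params)}"


url = solr_select_url(
    "http://localhost:8983/solr",            # assumed local Solr instance
    "orders",                                 # hypothetical core name
    "customer:acme",                          # hypothetical field
    filters=["day:[20230101 TO 20230117]"],   # range filter on a hypothetical field
    rows=20,
)
```

The main query goes in `q` and restrictions such as a date range go in `fq`; whether that split suits your data depends on the queries you actually need to run.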
