HBase wide-column scanning and fetching
Let's say I've created a table
rowkey (attrId+attr_value) //compound key
column => doc:doc1, doc:doc2, ...
When I use the scan feature, I fetch one row at a time inside the iterator. What if the column qualifiers reach millions of entries? How do you loop through that, and will there be a caching issue?
Thanks.
Scans fetch rows. You can qualify a scan so that it only fetches given qualifiers or given families, but then that is all that will be returned from the scan (and you can only filter on data that is included in a scan).
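For example, with the client-side Scan API (a minimal sketch; the "doc" family comes from the question, the variable names are mine):

import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

Scan familyScan = new Scan();
familyScan.addFamily(Bytes.toBytes("doc"));                        // every qualifier in the "doc" family

Scan columnScan = new Scan();
columnScan.addColumn(Bytes.toBytes("doc"), Bytes.toBytes("doc1")); // only the doc:doc1 column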
If you have potentially millions of columns in a single row, that could be an issue: returning that row could mean a very large network transfer. If your row size exceeds your region size, it can also cause OOM errors on your region servers, and storage becomes inefficient (one row per region).
However, ignoring all of that, you can loop through the columns and column qualifiers in the client. You can get a Map from the Result that maps from families to qualifiers to values, though that is probably not what you really want to do.
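That client-side loop could look like this (a sketch; scanner is assumed to come from table.getScanner(scan), and getNoVersionMap() is the Result variant that keeps only the latest value per column):

import java.util.Map;
import java.util.NavigableMap;
import org.apache.hadoop.hbase.client.Result;

for (Result r : scanner) {
    // family -> qualifier -> value (latest version only)
    NavigableMap<byte[], NavigableMap<byte[], byte[]>> map = r.getNoVersionMap();
    for (Map.Entry<byte[], NavigableMap<byte[], byte[]>> family : map.entrySet()) {
        for (Map.Entry<byte[], byte[]> column : family.getValue().entrySet()) {
            System.out.println(Bytes.toString(column.getKey()) + " = "
                    + Bytes.toString(column.getValue()));
        }
    }
}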
You can work around giant row fetches with a mixture of scans and column filters:
Scan s = new Scan();
s.setStartRow(Bytes.toBytes("some-row-key")); // setStartRow/setStopRow take byte[], not String
s.setStopRow(Bytes.toBytes("some-row-key"));  // start == stop restricts the scan to this single row
Filter f = new ColumnRangeFilter(Bytes.toBytes("doc0000"), true,   // lower bound, inclusive
                                 Bytes.toBytes("doc0100"), false); // upper bound, exclusive
s.setFilter(f);
Source: http://hadoop-hbase.blogspot.com/2012/01/hbase-intra-row-scanning.html
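Running that scan could look like this (a sketch; "mytable" is a placeholder, and s is the Scan built above):

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.ResultScanner;

HTable table = new HTable(HBaseConfiguration.create(), "mytable");
ResultScanner scanner = table.getScanner(s);
try {
    for (Result r : scanner) {
        // each Result carries only the doc0000..doc0100 slice of the row
    }
} finally {
    scanner.close();
}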
You can also limit the number of columns within a row returned at a time via Scan.setBatch.
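Combined with the single-row scan above, that is how you iterate over millions of qualifiers without materializing the whole row: the scanner streams the row back as a sequence of partial Results, each holding at most the batch size of cells. A sketch (process(...) is a hypothetical stand-in for your own handling):

Scan s = new Scan(Bytes.toBytes("some-row-key"), Bytes.toBytes("some-row-key"));
s.setBatch(1000);                              // at most 1000 cells per Result
ResultScanner scanner = table.getScanner(s);
for (Result partial : scanner) {
    // successive Results here can share the same row key; each carries
    // the next chunk of up to 1000 columns
    for (KeyValue kv : partial.raw()) {        // org.apache.hadoop.hbase.KeyValue
        process(kv.getQualifier(), kv.getValue()); // hypothetical callback
    }
}
scanner.close();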