
HBase Column RegEx through Thrift from C#

I'm using the Thrift interface (http://apache.mesi.com.ar//incubator/thrift/0.5.0-incubating/) to access HBase on my cluster. I can connect, get and display records, and use the start and stop dates.

The documentation (http://hbase.apache.org/docs/r0.89.20100924/apidocs/org/apache/hadoop/hbase/thrift/doc-files/Hbase.html#Fn_Hbase_scannerOpenWithStop) says,

It's also possible to pass a regex in the column qualifier.

My question is a simple one: how?

My working code:

int scannerId = client.scannerOpenWithStop("TABLE_NAME".ToByteArray(),
    "START_ROW".ToByteArray(), "STOP_ROW".ToByteArray(),
    new List<string>(){"COLUMN_FAMILY" }.ToByteArrayList());

ToByteArray() and ToByteArrayList() are extension methods; ToByteArrayList calls ToByteArray for each string and puts the results in a list. I'm including ToByteArray below in case my method of string->byte[] conversion is causing problems.

// Extension method, so it has to be declared inside a static class.
public static byte[] ToByteArray(this string s)
{
    // Encode the string as UTF-8 bytes for the Thrift call.
    return System.Text.Encoding.UTF8.GetBytes(s);
}
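
For completeness, here is a minimal sketch of how the two extension methods might fit together. The original ToByteArrayList is not shown in the question, so the version below is an assumed reconstruction from its description, and the class name ThriftStringExtensions is hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical container class; extension methods must live in a static class.
public static class ThriftStringExtensions
{
    // Same conversion as in the question: the UTF-8 bytes of the string.
    public static byte[] ToByteArray(this string s)
    {
        return Encoding.UTF8.GetBytes(s);
    }

    // Assumed reconstruction: converts each string and collects the results in a list.
    public static List<byte[]> ToByteArrayList(this IEnumerable<string> strings)
    {
        return strings.Select(s => s.ToByteArray()).ToList();
    }
}
```

Since plain ASCII column names encode to one byte per character in UTF-8, this conversion is unlikely to be the cause of a column specification not matching.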

When I run the above code, it returns data, varied by changes in the START_ROW and STOP_ROW. If I add a colon (:) to the entry in the List to be COLUMN_FAMILY: it still returns the data. If I put a full column name in COLUMN_FAMILY:http://www.myurl.com/more/goes/here it will get all values for that URL.

What I want to do is have COLUMN_FAMILY:http://www.myurl.com/.* (or other regex) and have it return the relevant data, like the documentation seems to say it can.

An example or two is all I should need. I figure there is some formatting or trick I'm missing to get the regex working.


COLUMN_FAMILY:/(?i:^http://www.myurl.com)/

It looks like the leading and trailing / tell it that the contents should be parsed as a regex.

http://blog.hypertable.com/?cat=1
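
A minimal sketch of building such a column specification in C# follows. The helper names RegexColumn and BuildColumns and the class ColumnSpecExample are hypothetical, and whether the Thrift server actually treats the slash-delimited qualifier as a regex is an assumption based only on the observation above, so it may depend on the server version.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical helper class, not part of the Thrift or HBase API.
public static class ColumnSpecExample
{
    // Wraps the regex in a leading and trailing slash after "family:".
    public static string RegexColumn(string family, string regex)
    {
        return family + ":/" + regex + "/";
    }

    // Builds the single-entry column list as UTF-8 byte arrays.
    public static List<byte[]> BuildColumns(string family, string regex)
    {
        string spec = RegexColumn(family, regex);
        return new List<byte[]> { Encoding.UTF8.GetBytes(spec) };
    }
}
```

The resulting list would take the place of the column list in the scannerOpenWithStop call from the question, for example ColumnSpecExample.BuildColumns("COLUMN_FAMILY", "(?i:^http://www.myurl.com)"). Note that an unescaped . in a regex matches any character, so escaping the dots in the URL gives a stricter match.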
