How can I fetch Images from HBase

I have around 1 GB of .png image files in my HDFS. Can anyone suggest a way to store index values for these images in HBase and retrieve an image by querying HBase? Or, how can I use HDFS/HBase to serve images? Please reply.

Urgent requirement :(

Thanks In Advance


The following code will help.

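    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Paths;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class HBaseImageExample {

        public static void main(String[] args) throws IOException {
            Configuration conf = HBaseConfiguration.create();
            byte[] row = Bytes.toBytes("row1");
            byte[] family = Bytes.toBytes("C");
            byte[] qualifier = Bytes.toBytes("image");

            // HBase 1.0+ client API. Older clients used new HTable(conf, "test") and put.add(...).
            // try-with-resources closes the table and the connection.
            try (Connection connection = ConnectionFactory.createConnection(conf);
                 Table table = connection.getTable(TableName.valueOf("test"))) {

                // Store the image file in HBase.
                Put put = new Put(row);
                put.addColumn(family, qualifier, extractBytes("/path/to/image/input.jpg"));
                table.put(put);

                // Retrieve the image.
                Result result = table.get(new Get(row));
                byte[] arr = result.getValue(family, qualifier);
                if (arr == null) {
                    throw new IOException("No image stored for row1");
                }

                // Files.write opens, writes, and closes the output file.
                Files.write(Paths.get("/path/to/image/output.jpg"), arr);
            }
        }

        // Reads the file's raw bytes. Nothing is decoded or re-encoded,
        // so the stored bytes match the file whatever its format (png, jpg, ...).
        public static byte[] extractBytes(String imageName) throws IOException {
            return Files.readAllBytes(Paths.get(imageName));
        }
    }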


There are two basic ways of serving image files: storing the image in HBase itself, or storing a path to the image. HBase has successfully been used by a large-scale commercial photo sharing site for storing and retrieving images -- although they have had to carefully tune and monitor their system (see the HBase mailing list for details).

If you store your images on HDFS and only keep a path in HBase, you must make sure you do not end up with too many images, because HDFS does not deal well with a very large number of files. The namenode keeps the metadata for every file in memory, so the limit depends on how much RAM is allocated to your namenode, but there is always an upper bound.

Unless you plan on storing metadata along with each image, you may be able to get away with a very simple schema for storing either the image data or the path to the image. I am imagining something like a single column family with two column qualifiers: data and type. The data column could store either the path or the actual image bytes. The type column would store the image type (png, jpg, tiff, etc.), which is useful for sending the correct MIME type over the wire when returning the image.
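As a rough illustration of how the type column could be used, here is a minimal sketch that maps the stored type string to a MIME type. The class name `ImageMimeTypes`, the method `mimeTypeFor`, and the fallback to `application/octet-stream` are assumptions made for this example and are not part of HBase.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: maps the value stored in the "type" qualifier to a MIME type.
final class ImageMimeTypes {
    private static final Map<String, String> MIME_BY_TYPE = Map.of(
            "png", "image/png",
            "jpg", "image/jpeg",
            "jpeg", "image/jpeg",
            "gif", "image/gif",
            "tiff", "image/tiff");

    private ImageMimeTypes() {}

    // Returns the MIME type for a stored type string,
    // or a generic binary type when the value is missing or unknown.
    static String mimeTypeFor(String storedType) {
        if (storedType == null) {
            return "application/octet-stream";
        }
        String key = storedType.trim().toLowerCase(Locale.ROOT);
        return MIME_BY_TYPE.getOrDefault(key, "application/octet-stream");
    }
}
```

The string returned by `mimeTypeFor` would go into the `Content-Type` response header when the image is sent back to the client.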

Once you have that set up, all you need is a servlet (or something equivalent in Thrift) to assemble the data and return it to the client.
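To give an idea of what such an endpoint might look like, here is a minimal sketch that uses the JDK's built-in `com.sun.net.httpserver.HttpServer` instead of a servlet container. Everything here is an assumption for illustration: the `ImageServer` and `StoredImage` classes, the `/images/` URL layout, and the in-memory map, which stands in for the HBase table. A real handler would issue a Get against HBase for each request.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for one row: the image bytes plus the MIME type derived from its type.
final class StoredImage {
    final byte[] data;
    final String mimeType;

    StoredImage(byte[] data, String mimeType) {
        this.data = data;
        this.mimeType = mimeType;
    }
}

final class ImageServer {
    // In-memory map standing in for the HBase table (assumption for this sketch).
    private final Map<String, StoredImage> store = new ConcurrentHashMap<>();
    private HttpServer server;

    void put(String rowKey, StoredImage image) {
        store.put(rowKey, image);
    }

    // Starts the server on the given port (0 picks any free port) and returns the bound port.
    int start(int port) throws IOException {
        server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/images/", this::handle);
        server.start();
        return server.getAddress().getPort();
    }

    void stop() {
        server.stop(0);
    }

    private void handle(HttpExchange exchange) throws IOException {
        try {
            // The row key is whatever follows /images/ in the request path.
            String rowKey = exchange.getRequestURI().getPath().substring("/images/".length());
            StoredImage image = store.get(rowKey);
            if (image == null) {
                exchange.sendResponseHeaders(404, -1); // -1 means no response body
                return;
            }
            exchange.getResponseHeaders().set("Content-Type", image.mimeType);
            exchange.sendResponseHeaders(200, image.data.length);
            try (OutputStream body = exchange.getResponseBody()) {
                body.write(image.data);
            }
        } finally {
            exchange.close();
        }
    }
}
```

With this sketch, a request to `/images/row1` would return the bytes stored under the key `row1` along with their content type, and an unknown key would produce a 404.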
