Hi,
I have an input of about 10M records. Each record contains a rowkey of an HBase table.
I can do a batched get as described at http://stackoverflow.com/questions/13310434/hbase-api-get-data-rows-information-by-list-of-row-ids, but it is slow because of the large input size.
I want to do it with a Pig script instead.
But how can I use batched gets in a Pig UDF?
Any insights on this?
Thanks,
Lei
leiwangouc [ at ] gmail.com