hadoop - Store nested entity in HBase and read it as rows in Hive
My requirement is to write a nested entity (an array of POJO objects) from Java to HBase, and to read the elements back as individual records in Hive.
That is, on the Java side the array is written as a single string (the serialized array; see the sketch below), so in Hive the whole array arrives as one value. Instead, the array should map onto the Hive table as a whole, with each element of the array becoming an individual record.
Any help on this is appreciated.
Thanks, gk
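For context, here is a minimal sketch of the kind of write described above, assuming the HBase Java client (1.x-style API) and a hypothetical table "entities" with column family "d"; the table, column, and serialization choices are illustrative, not from the question:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class SingleCellWriter {
    // Stores the whole array as one JSON string in a single cell,
    // which is why Hive later sees it as one value instead of N records.
    public static void write(String entityId, String arrayAsJson) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("entities"))) { // hypothetical table
            Put put = new Put(Bytes.toBytes(entityId));
            put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("payload"), Bytes.toBytes(arrayAsJson));
            table.put(put);
        }
    }
}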
Perhaps you should take a look at Hive UDTF functions, such as explode.
Depending on what you store and how you need to retrieve it, this may work, but I noticed UDTFs have some important limitations:
- No other expressions are allowed in SELECT:
SELECT pageid, explode(adid_list) AS mycol... is not supported
- UDTFs can't be nested:
SELECT explode(explode(adid_list)) AS mycol... is not supported
- GROUP BY / CLUSTER BY / DISTRIBUTE BY / SORT BY are not supported:
SELECT explode(adid_list) AS mycol ... GROUP BY mycol is not supported
If the standard UDTFs don't fit your case and you're in the mood, you can do this:
- Store each item of the array as a JSON string in a different column: i0, i1, i2 ... iN.
- Write your own UDTF function that processes the columns of each row and emits one row per column (see the sketch after this list).
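A minimal sketch of such a UDTF, assuming the i0 ... iN columns come in as strings; the class name and output column name are illustrative:

import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDTF;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;

public class ColumnsToRowsUDTF extends GenericUDTF {
    @Override
    public StructObjectInspector initialize(ObjectInspector[] args) throws UDFArgumentException {
        // Declare a single output column named "item".
        List<String> fieldNames = new ArrayList<String>();
        List<ObjectInspector> fieldOIs = new ArrayList<ObjectInspector>();
        fieldNames.add("item");
        fieldOIs.add(PrimitiveObjectInspectorFactory.javaStringObjectInspector);
        return ObjectInspectorFactory.getStandardStructObjectInspector(fieldNames, fieldOIs);
    }

    @Override
    public void process(Object[] args) throws HiveException {
        // args holds the i0, i1, i2 ... iN values of one input row;
        // emit one output row per non-null column.
        for (Object col : args) {
            if (col != null) {
                forward(new Object[] { col.toString() });
            }
        }
    }

    @Override
    public void close() throws HiveException {
        // Nothing buffered, nothing to flush.
    }
}

It would be invoked like the explode examples above, e.g. SELECT my_udtf(i0, i1, i2) AS item FROM t, and is subject to the same UDTF limitations listed earlier.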
IMHO, though, I'd just write one row per element of the array, appending the index of each array item to the rowkey (sketch below). It is faster when processing the data and you'll have a lot fewer headaches. You shouldn't worry about writing billions of rows, if that's the case: HBase handles that scale well.
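A sketch of that write path, assuming one string value per array element and the same hypothetical "entities" table as above; the zero-padded index keeps an entity's rows contiguous, since HBase sorts rowkeys lexicographically:

import java.util.List;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class RowPerElementWriter {
    // One HBase row per array element: rowkey = entityId + "_" + zero-padded index.
    public static void write(Connection conn, String entityId, List<String> items) throws Exception {
        try (Table table = conn.getTable(TableName.valueOf("entities"))) { // hypothetical table
            for (int i = 0; i < items.size(); i++) {
                String rowKey = String.format("%s_%06d", entityId, i);
                Put put = new Put(Bytes.toBytes(rowKey));
                put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("item"), Bytes.toBytes(items.get(i)));
                table.put(put);
            }
        }
    }
}

A Hive table mapped over this HBase table then sees each array element as its own row directly, with no UDTF needed.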