hadoop - How to create a nested table in Hive for a FIX file format -


I am doing research using the FIX 5.0 SP2 protocol format with CME DataMine data, and I am having problems creating a table for the data (nested tables). For example, the basic message format is given in msg#1, starting at (^a1128) and ending at (^a10). In the case of msg#2 we have the same basic message starting at (^a1128), with multiple messages starting at (^a1023) and ending at (^a346).

        msg#1                       msg#2
        ---------------------------------------------------------------------
        ^a1128=9                    ^a1128=9
        ^a9=136                     ^a9=1417
        ^a35=x                      ^a35=x
        ^a49=cme                    ^a49=cme
        ^a34=578                    ^a34=731
        ^a52=20130714180133577      ^a52=20130714190005961
        ^a75=20130715               ^a75=20130710
        ^a268=1                     ^a268=15
        ^a279=0                     ^a279=0
        ^a22=8                      ^a22=8
        ^a48=111473                 ^a48=28112
        ^a64=20130712               ^a83=2
        ^a83=1                      ^a107=esz3
        ^a107=esu4                  ^a269=0
        ^a269=6                     ^a270=166000
        ^a270=164350                ^a271=1
        ^a273=180133000             ^a273=190005000
        ^a10=248                    ^a336=0
                                    ^a346=1
                                    ^a1023=1
                                        ^a279=0
                                        ^a22=8
                                        ^a48=28112
                                        ^a83=3
                                        ^a107=esz3
                                        ^a269=0
                                        ^a270=165675
                                        ^a271=1
                                        ^a273=190005000
                                        ^a336=0
                                        ^a346=1
                                        ...
                                    ^a1023=5
                                    ^a10=029

The data comes as a sequence of lines without spaces, and at the end of each message there is a newline ( \n ) character, like this:

^a1128=9^a9=157^a35=x^a49=cme^a34=406238^a52=20130715123719934^a75=20130715^a268=1^a279=1^a22=8^a48=28112^a83=49004^a107=esz3^a269=1^a270=166775^a271=186^a273=123719000^a336=2^a346=3^a1023=1^a10=210^a 
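To make the layout concrete, here is a minimal Python sketch that splits one raw line into ordered (tag, value) pairs. It assumes that the `^a` shown in the listings stands for the SOH control character (`\x01`); the function name `parse_fix_line` is hypothetical and not part of any library.

```python
SOH = "\x01"  # assumption: "^a" in the listings denotes this control character


def parse_fix_line(line, delimiter=SOH):
    """Split one raw FIX line into an ordered list of (tag, value) pairs."""
    pairs = []
    for field in line.rstrip("\n").split(delimiter):
        if not field:
            continue  # skip the empty pieces around leading/trailing delimiters
        tag, _, value = field.partition("=")
        pairs.append((int(tag), value))
    return pairs


# Shortened version of the sample line, with the delimiter written out literally.
sample = "^a1128=9^a9=157^a35=x^a268=1^a279=1^a48=28112^a10=210^a"
fields = parse_fix_line(sample, delimiter="^a")
```

The result is kept as a list rather than a dictionary because some tags repeat within one message (for example 279 in msg#2), and a structure keyed by tag number has only one slot per tag.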

Here is the code I am using to create the table in Hive, without success.

        create external table fix_map
        (tag map<int, string>)
        row format delimited
        collection items terminated by '1'
        fields terminated by '2'
        map keys terminated by '='
        location '/user/data/';

Any suggestions would be appreciated.

I didn't find a concrete answer to my question, so I used a different approach to solve the problem. The first solution is to create two different tables, one for the main FIX message and the other for the nested part, and then join the two tables. I used this code to create the first table:

        create table futures_fix(
          tag map<int,string>
          )
        row format delimited
        fields terminated by '2'
        collection items terminated by '1'
        map keys terminated by '='
        location '/user/lcoation/data/';
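One way to picture the two-table approach is to split each message into a main row and a list of nested rows that share a key. The sketch below is in Python rather than Hive and rests on assumptions that are not stated in the post: that tag 279 opens each repeating entry, that tag 10 closes the message, and that tag 34 can serve as the join key. The name `split_message` is hypothetical.

```python
GROUP_START = 279   # assumption: this tag opens each repeating entry
TRAILER = 10        # assumption: this tag closes the message
JOIN_KEY = 34       # assumption: this tag identifies the message in both tables


def split_message(pairs):
    """Split ordered (tag, value) pairs into one main row and a list of nested rows."""
    main, entries, current = {}, [], None
    for tag, value in pairs:
        if tag == GROUP_START:
            current = {}            # a new nested entry begins
            entries.append(current)
        elif tag == TRAILER:
            current = None          # the trailer belongs to the main row
        target = main if current is None else current
        target[tag] = value
    # Copy the join key into every nested row so the two tables can be joined.
    for entry in entries:
        entry[JOIN_KEY] = main.get(JOIN_KEY)
    return main, entries


# Abbreviated message with two nested entries; values adapted from msg#2,
# with tag 268 set to match the shortened entry count.
pairs = [(1128, "9"), (35, "x"), (34, "731"), (268, "2"),
         (279, "0"), (48, "28112"), (270, "166000"),
         (279, "0"), (48, "28112"), (270, "165675"),
         (10, "029")]
main_row, nested_rows = split_message(pairs)
```

Each dictionary in `nested_rows` would correspond to one row of the second table, and `main_row` to one row of the first.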

The other approach is to use the fix2json (npm) tool from SunGard Labs. They created a great tool that decodes FIX into JSON format. Once the data is decoded, there are more tools available for working with nested tables, and the fix2json decoder also changes the tags to their correct names. Here is an example of fix2json output.

        {
           "applverid":"fix50sp2",
           "bodylength":"136",
           "msgtype":"marketdataincrementalrefresh",
           "sendercompid":"cme",
           "msgseqnum":"578",
           "sendingtime":"20130714180133577",
           "tradedate":"20130715",
           "nomdentries":"1",
           "mdentries":[
              {
                 "mdupdateaction":"new",
                 "securityidsource":"exchange symbol",
                 "securityid":"111473",
                 "settldate":"20130712",
                 "rptseq":"1",
                 "securitydesc":"esu4",
                 "mdentrytype":"settlement price",
                 "mdentrypx":"164350",
                 "mdentrytime":"180133000"
              }
           ],
           "checksum":"248"
        }
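Once the messages are in this JSON form, the nested `mdentries` array can be flattened into one row per entry before loading it into a table. This is a minimal Python sketch, assuming one JSON object per message with the lowercase key names shown above; `flatten_message` is a hypothetical helper, not part of fix2json.

```python
import json


def flatten_message(doc):
    """Return one flat dict per nested entry, repeating the message-level fields."""
    header = {key: value for key, value in doc.items() if key != "mdentries"}
    return [{**header, **entry} for entry in doc.get("mdentries", [])]


# Abbreviated version of the decoded message.
raw = """
{"msgseqnum": "578", "sendercompid": "cme", "nomdentries": "1",
 "mdentries": [{"securityid": "111473", "mdentrypx": "164350"}],
 "checksum": "248"}
"""
rows = flatten_message(json.loads(raw))
```

A message with several entries yields several rows that all carry the same message-level fields, which plays the same role as the join in the two-table approach.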
