Using Google Dataproc to import CSV data into Bigtable
I'm trying to use a Dataproc cluster instance to import large CSV files into HDFS, export them to SequenceFile format, and then import them into Bigtable as described here: https://cloud.google.com/bigtable/docs/exporting-importing
I imported the CSV files into an external table in Hive, then exported them by inserting them into a SequenceFile-backed table.
However (probably because Dataproc seems to ship with Hive 1.0?), I ran into the cast exception mentioned here: Bigtable import error
I can't seem to get the HBase shell or ZooKeeper running on the Dataproc master VM, so I can't run the simple export job from the CLI.
Is there an alternative way to export Bigtable-compatible sequence files from Dataproc?
What's the proper configuration to get HBase and ZooKeeper running on the Dataproc master VM node?
The import instructions you linked are for importing data from an existing HBase deployment.
If the input format you're working with is CSV, creating SequenceFiles is an unnecessary step. How about writing a Hadoop MapReduce job that processes the CSV files and writes directly to Cloud Bigtable? Dataflow would also be a good fit here.
Take a look at the samples here: https://github.com/googlecloudplatform/cloud-bigtable-examples/tree/master/java
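As a rough illustration of the MapReduce approach, here is a minimal, map-only sketch that parses each CSV line into an HBase Put and writes it through TableOutputFormat. It assumes the Cloud Bigtable HBase connector (bigtable-hbase) is on the classpath and configured with your project and instance IDs; the table name, the column family "cf", the "first column is the row key" convention, and the naive comma split are placeholders for illustration, not part of the linked samples.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableOutputFormat;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

public class CsvToBigtable {

  // Map-only job: each CSV line becomes one Put, written straight to the table.
  static class CsvMapper
      extends Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {

    private static final byte[] FAMILY = Bytes.toBytes("cf"); // placeholder column family

    @Override
    protected void map(LongWritable offset, Text line, Context context)
        throws IOException, InterruptedException {
      // Naive split; use a real CSV parser if your data has quoted fields.
      String[] fields = line.toString().split(",");
      if (fields.length < 2) {
        return; // skip malformed rows
      }
      byte[] rowKey = Bytes.toBytes(fields[0]); // assumption: first column is the row key
      Put put = new Put(rowKey);
      for (int i = 1; i < fields.length; i++) {
        put.addColumn(FAMILY, Bytes.toBytes("col" + i), Bytes.toBytes(fields[i]));
      }
      context.write(new ImmutableBytesWritable(rowKey), put);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Point the HBase client at Bigtable. The exact property names depend on the
    // bigtable-hbase connector version, so verify them against the examples repo.
    conf.set("google.bigtable.project.id", "my-project");   // placeholder
    conf.set("google.bigtable.instance.id", "my-instance"); // placeholder
    conf.set(TableOutputFormat.OUTPUT_TABLE, "my-table");   // placeholder table name

    Job job = Job.getInstance(conf, "csv-to-bigtable");
    job.setJarByClass(CsvToBigtable.class);
    job.setMapperClass(CsvMapper.class);
    job.setNumReduceTasks(0); // no shuffle needed; mappers write rows as they read them
    job.setInputFormatClass(TextInputFormat.class);
    job.setOutputFormatClass(TableOutputFormat.class);
    job.setOutputKeyClass(ImmutableBytesWritable.class);
    job.setOutputValueClass(Put.class);
    FileInputFormat.addInputPath(job, new Path(args[0])); // e.g. an HDFS or GCS path of CSVs

    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

Keeping the job map-only avoids a shuffle entirely, which is why it sidesteps the SequenceFile step: the CSV never has to be rewritten into an intermediate format before it reaches Bigtable.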