hadoop - GoogleHadoopFileSystem class is missing when enabling Tez


We're running Hive on Hadoop, leveraging Google Cloud Storage, using bdutil version 1.3.2. One of the options we installed is Tez:

AMBARI_SERVICES='FALCON FLUME GANGLIA HBASE HDFS KAFKA KERBEROS MAPREDUCE2 NAGIOS OOZIE SLIDER SQOOP STORM TEZ YARN ZOOKEEPER KNOX HIVE PIG'

When I run a query using MapReduce as the execution engine, it runs as expected:

set hive.execution.engine=mr;

But if I change the execution engine to Tez:

set hive.execution.engine=tez;

I get a long stack trace. The key error is:

Status: Failed
Vertex failed, vertexName=Map 4, vertexId=vertex_1443128249582_0182_1_00, diagnostics=[Vertex vertex_1443128249582_0182_1_00 [Map 4] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: t1 initializer failed, vertex=vertex_1443128249582_0182_1_00 [Map 4], java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem not found

I'm assuming this means the GCS connector jar isn't on the right path. I've added a link to the jar at each of the paths below, but I still get the same error:

/usr/hdp/2.2.6.0-2800/hadoop/lib/gcs-connector-1.4.2-hadoop2.jar
/usr/hdp/2.2.6.0-2800/tez/lib/gcs-connector-1.4.2-hadoop2.jar
/usr/hdp/2.2.6.0-2800/tez/gcs-connector-1.4.2-hadoop2.jar
/usr/lib/hadoop/lib/gcs-connector-1.4.2-hadoop2.jar
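
To rule out a dangling link or an incomplete jar, each of those paths can be checked for the class named in the exception. The following is only a minimal diagnostic sketch in Python: the helper names `jar_has_class` and `check_jars` are hypothetical, and it assumes it is run on the node where the links were created.

```python
import os
import zipfile

# Class named in the ClassNotFoundException, written as an entry path inside a jar.
CLASS_ENTRY = "com/google/cloud/hadoop/fs/gcs/GoogleHadoopFileSystem.class"

# The four locations listed above.
CANDIDATE_JARS = [
    "/usr/hdp/2.2.6.0-2800/hadoop/lib/gcs-connector-1.4.2-hadoop2.jar",
    "/usr/hdp/2.2.6.0-2800/tez/lib/gcs-connector-1.4.2-hadoop2.jar",
    "/usr/hdp/2.2.6.0-2800/tez/gcs-connector-1.4.2-hadoop2.jar",
    "/usr/lib/hadoop/lib/gcs-connector-1.4.2-hadoop2.jar",
]


def jar_has_class(jar_path, class_entry=CLASS_ENTRY):
    """Return True if jar_path is a readable jar that contains class_entry."""
    # os.path.isfile follows symlinks, so a dangling link is reported as False.
    if not os.path.isfile(jar_path):
        return False
    try:
        with zipfile.ZipFile(jar_path) as jar:
            return class_entry in jar.namelist()
    except (zipfile.BadZipFile, OSError):
        # Not a valid archive, or not readable by the current user.
        return False


def check_jars(paths=CANDIDATE_JARS):
    """Map each path to whether the connector class was found in it."""
    return {path: jar_has_class(path) for path in paths}


if __name__ == "__main__":
    for path, found in check_jars().items():
        print(("FOUND    " if found else "MISSING  ") + path)
```

This only inspects the local filesystem of one node. The frames under org.apache.tez.dag.app in the stack trace show the class being loaded inside the Tez application master, so a jar that is present locally may still be absent from that process's classpath.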

Any thoughts on what needs to be changed to enable Tez?

Thanks for the help.

The full stack trace is:

Status: Failed
Vertex failed, vertexName=Map 4, vertexId=vertex_1443128249582_0182_1_00, diagnostics=[Vertex vertex_1443128249582_0182_1_00 [Map 4] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: t1 initializer failed, vertex=vertex_1443128249582_0182_1_00 [Map 4], java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem not found
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2076)
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2601)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2614)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2653)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2635)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
    at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:256)
    at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:228)
    at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:313)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:300)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:402)
    at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:130)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:245)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:239)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:239)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:226)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: Class com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem not found
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1982)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2074)
    ... 24 more
]
Vertex failed, vertexName=Map 3, vertexId=vertex_1443128249582_0182_1_01, diagnostics=[Vertex vertex_1443128249582_0182_1_01 [Map 3] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: t2 initializer failed, vertex=vertex_1443128249582_0182_1_01 [Map 3], java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem not found
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2076)
    at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2601)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2614)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2653)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2635)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
    at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:256)
    at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:228)
    at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:313)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:300)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:402)
    at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:130)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:245)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:239)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:239)
    at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:226)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: Class com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem not found
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1982)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2074)
    ... 24 more
]
Vertex killed, vertexName=Map 1, vertexId=vertex_1443128249582_0182_1_02, diagnostics=[Vertex received Kill in INITED state., Vertex vertex_1443128249582_0182_1_02 [Map 1] killed/failed due to:null]
Vertex killed, vertexName=Reducer 2, vertexId=vertex_1443128249582_0182_1_03, diagnostics=[Vertex received Kill in INITED state., Vertex vertex_1443128249582_0182_1_03 [Reducer 2] killed/failed due to:null]
DAG failed due to vertex failure. failedVertices:2 killedVertices:2
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask

