在 spark/kafka 上的 google 中有流式传输的示例,如何在批处理模式下使用 spark(任何 api java/scala)从 kafka 中减去数据。
Nikolay Baranenko
Asked:
2020-12-17 16:56:36 +0800 CST
你好。
Ubuntu 16.10下安装Hadoop
Hadoop 3.0.0-alpha1
Source code repository https://git-wip-us.apache.org/repos/asf/hadoop.git -r a990d2ebcd6de5d7dc2d3684930759b0f0ea4dc3
Compiled by andrew on 2016-08-30T07:02Z
Compiled with protoc 2.5.0
From source with checksum f3a9644139eac17acbb91bfce7f68e2
This command was run using /home/hduser/hadoop/share/hadoop/common/hadoop-common-3.0.0-alpha1.jar
安装Hive 2.1.1后
#Hive environment
export HIVE_HOME=/home/hduser/hive
export PATH=$PATH:$HIVE_HOME/bin
export CLASSPATH=$CLASSPATH:/home/hduser/Hadoop/lib/*:.
export CLASSPATH=$CLASSPATH:/home/hduser/hive/lib/*:.
hive-env.sh
export HADOOP_HOME=/home/hduser/hadoop
和db-derby-10.13.1.1
#Derby environment
export DERBY_HOME=/home/hduser/derby
export PATH=$PATH:$DERBY_HOME/bin
export CLASSPATH=$CLASSPATH:$DERBY_HOME/lib/derby.jar:$DERBY_HOME/lib/derbytools.jar
hive-site.xml 元存储配置
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:derby://localhost:1527/metastore_db;create=true </value>
<description>
jdbc:derby:;databaseName=metastore_db;create=true
JDBC connect string for a JDBC metastore.
To use SSL to encrypt/authenticate the connection, provide database-specific SSL flag in the connection URL.
For example, jdbc:postgresql://myhost/db?ssl=true for postgres database.
</description>
</property>
当我尝试运行配置单元时出现错误
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hduser/hive/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hduser/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Logging initialized using configuration in jar:file:/home/hduser/hive/lib/hive-common-2.1.1.jar!/hive-log4j2.properties Async: true
Exception in thread "main" java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:591)
at org.apache.hadoop.hive.ql.session.SessionState.beginStart(SessionState.java:531)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:705)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
at org.apache.hadoop.hive.ql.metadata.Hive.registerAllFunctionsOnce(Hive.java:226)
at org.apache.hadoop.hive.ql.metadata.Hive.<init>(Hive.java:366)
at org.apache.hadoop.hive.ql.metadata.Hive.create(Hive.java:310)
at org.apache.hadoop.hive.ql.metadata.Hive.getInternal(Hive.java:290)
at org.apache.hadoop.hive.ql.metadata.Hive.get(Hive.java:266)
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:558)
... 9 more
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1654)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:80)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:130)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:101)
at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3367)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3406)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3386)
at org.apache.hadoop.hive.ql.metadata.Hive.getAllFunctions(Hive.java:3640)
at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:236)
at org.apache.hadoop.hive.ql.metadata.Hive.registerAllFunctionsOnce(Hive.java:221)
... 14 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1652)
... 23 more
Caused by: java.lang.IllegalArgumentException: Unrecognized Hadoop major version number: 3.0.0-alpha1
at org.apache.hadoop.hive.shims.ShimLoader.getMajorVersion(ShimLoader.java:169)
at org.apache.hadoop.hive.shims.ShimLoader.loadShims(ShimLoader.java:136)
at org.apache.hadoop.hive.shims.ShimLoader.getHadoopShims(ShimLoader.java:95)
at org.apache.hadoop.hive.metastore.ObjectStore.getDataSourceProps(ObjectStore.java:476)
at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:278)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:76)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:136)
at org.apache.hadoop.hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:58)
at org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:67)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:599)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:564)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:626)
at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:416)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:78)
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:84)
at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:6490)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:238)
at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:70)
... 28 more
哪个版本的 hive (derby) 可以与 Hadoop 3.0.0 一起使用。或者这是其他方面的问题?有哪些可能的解决方案?