Upgrading from Spark 1.x to Spark 2: How to Upgrade and What to Consider

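Guiding questions
1. What changes when you upgrade to Spark 2?
2. What stays the same across the upgrade?
3. Can Spark 1 and Spark 2 coexist on Cloudera?
4. What problems might you run into after upgrading?

Spark 2 has been out for a long time, but because Spark 1.6 is fairly stable, many people are still using it. If you want to move to Spark 2, how do you upgrade? Upgrading Windows usually means clicking an upgrade button and leaving the rest to the installer. A Spark upgrade is not like that: it amounts to a fresh installation, although you can reuse the earlier configuration. The configuration files are essentially unchanged, and if you keep the same directory, the environment variables barely change either.

If you are only learning, upgrading is no problem. In a production environment, however, you need to be careful, because the upgrade brings quite a few side effects.

For installing Spark, see http://www.aboutyun.com/forum.php?mod=viewthread&tid=20620

The upgrade procedure is described below.

1. Upgrading Spark

First, stop all services: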

./stop-all.sh

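A side note: Spark ships a stop-all.sh script, and Hadoop ships a script with the same name, although Hadoop is in the process of deprecating its start-all.sh and stop-all.sh. Because the names collide, it is best to change into the sbin directory of the product you mean and run the script from there: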

./stop-all.sh
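Because Spark and Hadoop both ship a script with this name, another way to avoid picking up the wrong one is to call it by its full path instead of relying on the current directory or PATH order. The following is only a sketch: the function name stop_spark_cluster is made up for illustration, and it assumes SPARK_HOME (or the first argument) points at the Spark installation.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run Spark's own stop-all.sh by absolute path, so a
# Hadoop script of the same name can never be picked up by accident.
stop_spark_cluster() {
    local spark_home="${1:-${SPARK_HOME:-}}"
    local script="$spark_home/sbin/stop-all.sh"
    if [ -z "$spark_home" ] || [ ! -x "$script" ]; then
        echo "stop-all.sh not found or not executable under '$spark_home/sbin'" >&2
        return 1
    fi
    "$script"
}
```

With the layout used in this article it would be called as `stop_spark_cluster /data/spark`.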

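Since this cluster is configured by hand, there are a few questions to consider before upgrading:

1. Have the configuration files changed?
According to the official documentation, fortunately they do not appear to have changed between Spark 1.x and 2.x; the set of configuration files is the same. See http://spark.apache.org/docs/latest/spark-standalone.html. That makes the upgrade much less worrying, because the original configuration files can be reused with no extra work.

2. What has changed?
With the cluster stopped, we can start on the configuration. My Spark version is 1.6, and I am upgrading to 2.2. First, rename the existing spark folder: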

sudo mv spark spark1.6

Extract the Spark 2.2 package:

sudo tar zxvf spark-2.2.0-bin-hadoop2.7.tgz -C /data

Checking the extracted directory shows its permissions as 500.

To avoid problems, change the ownership and permissions:

sudo chown -R aboutyun:aboutyun spark-2.2.0-bin-hadoop2.7/

sudo chmod -R 777 spark-2.2.0-bin-hadoop2.7/

Then rename this folder:

sudo mv spark-2.2.0-bin-hadoop2.7/ spark
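The rename, extract, and rename steps above can be sketched as one function that refuses to overwrite an existing backup. This is an illustration, not the author's script: the name swap_spark_dir is hypothetical, it leaves out sudo, chown, and chmod (so it assumes the caller can already write to the base directory), and it assumes the tarball's top-level directory matches its file name without the .tgz suffix, as is the case for the package used here.

```shell
#!/usr/bin/env bash
# Hypothetical helper that mirrors the manual steps: keep the old install as a
# backup, unpack the new tarball, and give it the plain name "spark".
swap_spark_dir() {
    local base_dir="$1" tarball="$2" backup_name="${3:-spark1.6}"
    local unpacked
    # Assumed: the top-level directory inside the tarball equals its file name.
    unpacked="$(basename "$tarball" .tgz)"

    if [ -e "$base_dir/$backup_name" ]; then
        echo "backup $base_dir/$backup_name already exists, refusing to overwrite" >&2
        return 1
    fi
    if [ -d "$base_dir/spark" ]; then
        mv "$base_dir/spark" "$base_dir/$backup_name" || return 1
    fi
    tar -xzf "$tarball" -C "$base_dir" || return 1
    mv "$base_dir/$unpacked" "$base_dir/spark"
}
```

For the paths in this article the call would be `swap_spark_dir /data spark-2.2.0-bin-hadoop2.7.tgz`. Keeping the old directory around makes it possible to roll back by renaming the folders again.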

Copy spark-env.sh, slaves, and spark-defaults.conf from the Spark 1.6 installation into the conf directory of the new spark folder. If the three files are already reasonably complete, they do not need to be modified.

slaves
As long as the machines have not changed, no modification is needed.

spark-env.sh
JAVA_HOME=/data/jdk1.8
SCALA_HOME=/data/scala2
SPARK_MASTER_HOST=192.168.1.10
HADOOP_CONF_DIR=/data/hadoop/etc/hadoop
SPARK_LOCAL_DIRS=/data/spark_data
SPARK_WORKER_DIR=/data/spark_data/spark_works

Note: Spark 1.x uses SPARK_MASTER_IP, whereas Spark 2 uses SPARK_MASTER_HOST.

spark-defaults.conf
spark.master spark://master:7077
spark.eventLog.enabled true
spark.eventLog.dir file:///data/spark_data/history/event-log
spark.serializer org.apache.spark.serializer.KryoSerializer
spark.history.fs.logDirectory file:///data/spark_data/history/spark-events

None of the above needs to change, although you can of course adjust anything as required.

Next, update the environment variables in ~/.bashrc:

export HADOOP_HOME=/data/hadoop
export SPARK_HOME=/data/spark
export ZOOKEEPER_HOME=/data/zookeeper-3.4.6
export KAFKA_HOME=/data/kafka_2.11
export HIVE_HOME=/data/hive-1.2.1
export PATH=$HIVE_HOME/bin:$KAFKA_HOME/bin:$ZOOKEEPER_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$SPARK_HOME/bin:$SPARK_HOME/sbin:$PATH
 
export FLUME_HOME=/data/flume-1.6.0
export PATH=$FLUME_HOME/bin:$PATH

source ~/.bashrc

This step is important; otherwise the shell may still pick up the old version.
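One way to confirm that the shell really resolves to the new installation is to compare the spark-submit found on PATH with SPARK_HOME. The check below is a minimal sketch; the function name spark_path_matches_home is invented for this example.

```shell
#!/usr/bin/env bash
# Hypothetical check: does the spark-submit found on PATH live under SPARK_HOME?
spark_path_matches_home() {
    local expected_home="${1:-${SPARK_HOME:-}}"
    local found
    if [ -z "$expected_home" ]; then
        echo "SPARK_HOME is not set" >&2
        return 1
    fi
    found="$(command -v spark-submit)" || {
        echo "spark-submit is not on PATH" >&2
        return 1
    }
    case "$found" in
        "$expected_home"/bin/spark-submit)
            echo "OK: $found" ;;
        *)
            echo "MISMATCH: PATH resolves to $found, expected $expected_home/bin" >&2
            return 1 ;;
    esac
}
```

After `source ~/.bashrc`, running `spark_path_matches_home` followed by `spark-submit --version` shows both where the binary comes from and which version it reports.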

Because the folder is still named spark, the variables above do not need to change. Next, copy the folder to the other nodes:

scp -r spark aboutyun@slave1:/data

scp -r spark aboutyun@slave2:/data
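With more than a couple of workers, typing one scp per host gets tedious. As a sketch, the host names can be read from the slaves file instead. The helper below is hypothetical and only prints the commands rather than running them, so the list can be reviewed first; it skips blank lines and # comments.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one scp command per worker listed in a slaves
# file. It only prints; nothing is copied.
print_scp_commands() {
    local slaves_file="$1" src_dir="$2" remote_user="$3" remote_parent="$4"
    local host
    while IFS= read -r host || [ -n "$host" ]; do
        host="${host%%#*}"                                  # drop comments
        host="$(printf '%s' "$host" | tr -d '[:space:]')"   # drop whitespace
        [ -z "$host" ] && continue
        echo "scp -r $src_dir $remote_user@$host:$remote_parent"
    done < "$slaves_file"
}
```

For this cluster, `print_scp_commands /data/spark/conf/slaves spark aboutyun /data` would print the two scp commands shown above, assuming the slaves file lists slave1 and slave2.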

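When copying to the remote machines, remember to delete the hadoop folder on slave1 and slave2 first; otherwise the Hadoop 2.7.4 and Hadoop 2.6.5 packages end up mixed together.

Note: an ordinary user generally cannot copy directly into a directory outside their home directory, so the data folder needs to be given 777 permissions for the remote copy to succeed.

Next, start Spark. Change into Spark's sbin directory and run: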

./start-all.sh

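When upgrading Spark, if you use Hadoop, make sure the Spark build matches your Hadoop version; otherwise errors may occur. The Scala version needs the same attention: Spark 2.2 supports Scala 2.11.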
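A quick way to see which Scala and Hadoop versions a Spark 2.x binary package was built with is to look at the jar file names under its jars directory. The helper below is a sketch under that assumption; bundled_jar_version is a made-up name, and it simply returns the first match.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read the version out of a bundled jar's file name,
# e.g. "scala-library" gives 2.11.8 for jars/scala-library-2.11.8.jar.
bundled_jar_version() {
    local spark_home="$1" artifact="$2"
    local jar
    for jar in "$spark_home"/jars/"$artifact"-[0-9]*.jar; do
        [ -e "$jar" ] || continue
        jar="${jar##*/"$artifact"-}"   # strip directory and artifact prefix
        echo "${jar%.jar}"             # strip the .jar suffix
        return 0
    done
    echo "no $artifact jar found under $spark_home/jars" >&2
    return 1
}
```

For example, `bundled_jar_version /data/spark scala-library` and `bundled_jar_version /data/spark hadoop-common` print the bundled Scala and Hadoop versions, which can then be compared with what the cluster runs.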

#########################

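Upgrading on Cloudera

Compared with upgrading a native Spark installation, upgrading on Cloudera is fairly simple. On Cloudera, Spark 1.6 and Spark 2 can coexist, so you can just install Spark 2.

#########################

Side effects of upgrading Spark

If you are already running in production, upgrade with caution, or unexpected things may happen. The notes below are for reference only.

Calculation accuracy

SELECT '0.1' = 0 returns true! In Spark 2.2, the string '0.1' is cast to int, so if all your columns are stored as text, numeric computations are very likely to produce wrong results. In earlier versions it was cast to double, which is correct in the vast majority of cases. So far the community has not handled this well. I submitted a PR for it, and anyone who wants to fix it themselves can merge it manually: https://github.com/apache/spark/pull/18986

Overly complex SQL statements may hit the 64KB bytecode compilation limit. This is an old problem that has existed more or less since Spark adopted Tungsten, and it is ultimately a JVM limitation. If you run into it, look for a relevant PR: https://github.com/apache/spark/search?utf8=%E2%9C%93&q=64KB&type=Issues

There is a numeric precision problem: SELECT 1 > 0.0001 raises an error. This was fixed in 2.1.2 and 2.2.0: https://issues.apache.org/jira/browse/SPARK-20211

In 2.1.0, an INNER JOIN involving constants gives incorrect results; fixed in later versions: https://issues.apache.org/jira/browse/SPARK-19766

In 2.1.0, executing GROUPING SETS(col) throws a NullPointerException if the col column contains null; fixed in later versions: https://issues.apache.org/jira/browse/SPARK-19509

In 2.1.0, nested CASE WHEN statements may fail; fixed in later versions: https://issues.apache.org/jira/browse/SPARK-19472

Behavior changes

These are issues that are not too serious and can be made compatible by changing code or configuration.

The UDAF implementation changed in Spark 2.2. If your Hive UDAF does not strictly follow the standard, it may fail or return incorrect data. It is recommended to migrate the logic to a Spark aggregate function, which also gives better performance.

Starting with Spark 2.1, a full scan of a partitioned table uses FilePartition, where a single partition can read multiple files. If the files are compressed, this can make queries slower. Consider lowering spark.sql.files.maxPartitionBytes, which defaults to 128MB (for most compressed Parquet tables this default actually causes performance problems).

Spark 2.x restricts operations on spark.sql.* properties of Hive tables. A property that clearly exists cannot be retrieved with SHOW TBLPROPERTIES tb("spark.sql.sources.schema.numParts"), and likewise it cannot be changed with ALTER TABLE tb SET TBLPROPERTIES ('spark.sql.test' = 'test').

Properties of external tables cannot be modified: ALTER TABLE tb SET TBLPROPERTIES ('test' = 'test') fails when tb is an EXTERNAL table.

DROP VIEW IF EXISTS tb fails with AnalysisException: Cannot drop a table with DROP VIEW if tb is a TABLE rather than a VIEW. Versions before 2.x do not raise an error. Since IF EXISTS was specified, the error is clearly unreasonable, and it has to be handled as an exception.

If the table you access does not exist, the exception message in Spark 2.x changes from Table not found to Table or view not found. If your code depends on this message, adjust it accordingly.

The output format of EXPLAIN has changed. In 1.6 it is multi-line text; in 2.x it is a single line, and the content differs slightly as well. Compared with Spark 1.6, the Tungsten keyword is gone, and an HDFS path that is too long is abbreviated to ... in Spark 2.x.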
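In 2.x, Cartesian products are not supported by default and must be enabled with the spark.sql.crossJoin.enabled parameter.

The GROUPING__ID function commonly used in OLAP analysis becomes GROUPING_ID() in 2.x.

Suppose you have a Hive-based UDF named abc that takes 3 arguments, and you then implement a Spark-based UDF abc that takes 2 arguments. In 2.x, the 2-argument abc overrides the 3-argument Hive abc; 1.6 does not have this problem.

Statements like SELECT 1 FROM tb GROUP BY 1 raise an error; you need to explicitly set spark.sql.groupByOrdinal to false. A similar parameter is spark.sql.orderByOrdinal, also set to false.

The default path for CREATE DATABASE has changed. It is no longer read from hive.metastore.warehouse.dir in hive-site.xml; the default database location must be specified through Spark's spark.sql.warehouse.dir setting.

CAST of a nonexistent date returns null. For example, year('2015-03-40') returns 2015 in 1.6.

Spark 2.x does not allow temporary functions (temp function) in a VIEW: https://issues.apache.org/jira/browse/SPARK-18209

From Spark 2.1 on, the window function ROW_NUMBER() requires an ORDER BY inside OVER; the previously accepted ROW_NUMBER() OVER() now raises an error.

From Spark 2.1 on, SIZE(null) returns -1; earlier versions return null.

The default compression codec for Parquet files changed from gzip to snappy. According to the official explanation snappy gives better query performance, but you should verify the performance change yourself.

The output of DESC FORMATTED tb has changed. The 1.6 format is close to Hive's; 2.x displays it in two columns.

Exception messages have changed. For an undefined function, Spark 2.x reports org.apache.spark.sql.AnalysisException: Undefined function: 'xxx'., while Spark 1.6 reports AnalysisException: undefined function xxx. For a wrong number of arguments, Spark 2.x reports Invalid number of arguments, while Spark 1.6 reports No handler for Hive udf class org.apache.hadoop.hive.ql.udf.generic.GenericUDAFXXX because: Exactly one argument is expected..

The Spark Standalone Web UI no longer provides the /api/v1/applications API: https://issues.apache.org/jira/browse/SPARK-12299, https://issues.apache.org/jira/browse/SPARK-18683

Excerpted from: http://www.jianshu.com/p/482407c88d27

###########################################

Summary of problems encountered during the Spark upgrade

After upgrading Spark, you may run into some very strange problems:
1. More than one master process is running
2. A port is occupied for no apparent reason
3. All processes look healthy, but the master cannot be reached

Starting spark-shell fails with the following error: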

To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
17/11/17 11:30:12 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/11/17 11:30:14 WARN client.StandaloneAppClient$ClientEndpoint: Failed to connect to master master:7077
org.apache.spark.SparkException: Exception thrown in awaitResult: 
        at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:205)
        at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
        at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:100)
        at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:108)
        at org.apache.spark.deploy.client.StandaloneAppClient$ClientEndpoint$$anonfun$tryRegisterAllMasters$1$$anon$1.run(StandaloneAppClient.scala:106)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: java.io.StreamCorruptedException: invalid stream header: 01000C31
        at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:806)
        at java.io.ObjectInputStream.<init>(ObjectInputStream.java:299)
        at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.<init>(JavaSerializer.scala:64)
        at org.apache.spark.serializer.JavaDeserializationStream.<init>(JavaSerializer.scala:64)
        at org.apache.spark.serializer.JavaSerializerInstance.deserializeStream(JavaSerializer.scala:123)
        at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:108)
        at org.apache.spark.rpc.netty.NettyRpcEnv$$anonfun$deserialize$1$$anonfun$apply$1.apply(NettyRpcEnv.scala:258)
        at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
        at org.apache.spark.rpc.netty.NettyRpcEnv.deserialize(NettyRpcEnv.scala:310)
        at org.apache.spark.rpc.netty.NettyRpcEnv$$anonfun$deserialize$1.apply(NettyRpcEnv.scala:257)
        at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
        at org.apache.spark.rpc.netty.NettyRpcEnv.deserialize(NettyRpcEnv.scala:256)
        at org.apache.spark.rpc.netty.NettyRpcHandler.internalReceive(NettyRpcEnv.scala:588)
        at org.apache.spark.rpc.netty.NettyRpcHandler.receive(NettyRpcEnv.scala:570)
        at org.apache.spark.network.server.TransportRequestHandler.processRpcRequest(TransportRequestHandler.java:149)
        at org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:102)
        at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:104)
        at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:51)
        at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
        at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:266)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
        at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
        at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:86)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
        at java.lang.Thread.run(Thread.java:745)
 
        at org.apache.spark.network.client.TransportResponseHandler.handle(TransportResponseHandler.java:207)
        at org.apache.spark.network.server.TransportChannelHandler.channelRead(TransportChannelHandler.java:120)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
        at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
        at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
        at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:85)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911)
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:643)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:566)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:480)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:442)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
        at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
        ... 1 more
17/11/17 11:30:33 WARN client.StandaloneAppClient$ClientEndpoint: Failed to connect to master master:7077
org.apache.spark.SparkException: Exception thrown in awaitResult: 
        at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:205)
        at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
        at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:100)
        at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:108)
        at org.apache.spark.deploy.client.StandaloneAppClient$ClientEndpoint$$anonfun$tryRegisterAllMasters$1$$anon$1.run(StandaloneAppClient.scala:106)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: java.io.StreamCorruptedException: invalid stream header: 01000C31
        at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:806)
        at java.io.ObjectInputStream.<init>(ObjectInputStream.java:299)
        at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.<init>(JavaSerializer.scala:64)
        at org.apache.spark.serializer.JavaDeserializationStream.<init>(JavaSerializer.scala:64)
        at org.apache.spark.serializer.JavaSerializerInstance.deserializeStream(JavaSerializer.scala:123)
        at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:108)
        at org.apache.spark.rpc.netty.NettyRpcEnv$$anonfun$deserialize$1$$anonfun$apply$1.apply(NettyRpcEnv.scala:258)
        at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
        at org.apache.spark.rpc.netty.NettyRpcEnv.deserialize(NettyRpcEnv.scala:310)
        at org.apache.spark.rpc.netty.NettyRpcEnv$$anonfun$deserialize$1.apply(NettyRpcEnv.scala:257)
        at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
        at org.apache.spark.rpc.netty.NettyRpcEnv.deserialize(NettyRpcEnv.scala:256)
        at org.apache.spark.rpc.netty.NettyRpcHandler.internalReceive(NettyRpcEnv.scala:588)
        at org.apache.spark.rpc.netty.NettyRpcHandler.receive(NettyRpcEnv.scala:570)
        at org.apache.spark.network.server.TransportRequestHandler.processRpcRequest(TransportRequestHandler.java:149)
        at org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:102)
        at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:104)
        at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:51)
        at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
        at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:266)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
        at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
        at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:86)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
        at java.lang.Thread.run(Thread.java:745)
 
        at org.apache.spark.network.client.TransportResponseHandler.handle(TransportResponseHandler.java:207)
        at org.apache.spark.network.server.TransportChannelHandler.channelRead(TransportChannelHandler.java:120)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
        at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
        at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
        at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:85)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911)
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:643)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:566)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:480)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:442)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
        at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
        ... 1 more
17/11/17 11:30:53 WARN client.StandaloneAppClient$ClientEndpoint: Failed to connect to master master:7077
org.apache.spark.SparkException: Exception thrown in awaitResult: 
        at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:205)
        at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
        at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:100)
        at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:108)
        at org.apache.spark.deploy.client.StandaloneAppClient$ClientEndpoint$$anonfun$tryRegisterAllMasters$1$$anon$1.run(StandaloneAppClient.scala:106)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: java.io.StreamCorruptedException: invalid stream header: 01000C31
        at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:806)
        at java.io.ObjectInputStream.<init>(ObjectInputStream.java:299)
        at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.<init>(JavaSerializer.scala:64)
        at org.apache.spark.serializer.JavaDeserializationStream.<init>(JavaSerializer.scala:64)
        at org.apache.spark.serializer.JavaSerializerInstance.deserializeStream(JavaSerializer.scala:123)
        at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:108)
        at org.apache.spark.rpc.netty.NettyRpcEnv$$anonfun$deserialize$1$$anonfun$apply$1.apply(NettyRpcEnv.scala:258)
        at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
        at org.apache.spark.rpc.netty.NettyRpcEnv.deserialize(NettyRpcEnv.scala:310)
        at org.apache.spark.rpc.netty.NettyRpcEnv$$anonfun$deserialize$1.apply(NettyRpcEnv.scala:257)
        at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
        at org.apache.spark.rpc.netty.NettyRpcEnv.deserialize(NettyRpcEnv.scala:256)
        at org.apache.spark.rpc.netty.NettyRpcHandler.internalReceive(NettyRpcEnv.scala:588)
        at org.apache.spark.rpc.netty.NettyRpcHandler.receive(NettyRpcEnv.scala:570)
        at org.apache.spark.network.server.TransportRequestHandler.processRpcRequest(TransportRequestHandler.java:149)
        at org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:102)
        at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:104)
        at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:51)
        at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
        at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:266)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
        at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
        at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:86)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
        at java.lang.Thread.run(Thread.java:745)
 
        at org.apache.spark.network.client.TransportResponseHandler.handle(TransportResponseHandler.java:207)
        at org.apache.spark.network.server.TransportChannelHandler.channelRead(TransportChannelHandler.java:120)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
        at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
        at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
        at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:85)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:336)
        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1294)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:357)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:343)
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:911)
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:643)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:566)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:480)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:442)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
        at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
        ... 1 more
17/11/17 11:31:13 ERROR cluster.StandaloneSchedulerBackend: Application has been killed. Reason: All masters are unresponsive! Giving up.
17/11/17 11:31:13 WARN cluster.StandaloneSchedulerBackend: Application ID is not initialized yet.

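This is evidently a port problem, so check port 7077 with netstat -anp | grep 7077. (The StreamCorruptedException in the trace is consistent with a process from the old version still answering on that port, although that is an inference rather than something verified here.) The port turned out to be occupied, so I killed the process holding it. That still did not help. In the end I restarted the cluster from Spark's sbin directory: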

./stop-all.sh
./start-all.sh

That resolved the problem.
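The two checks mentioned in this section, counting master processes and finding what holds port 7077, can be sketched as small filters over the output of jps and netstat. Both function names are hypothetical, and the column positions assume the usual Linux `netstat -anp` layout (local address in column 4, state in column 6, PID/program in column 7).

```shell
#!/usr/bin/env bash
# Hypothetical helpers. Both read command output on stdin, so they can be fed
# by jps / netstat directly or by saved text.

# Count lines of `jps` output whose process name is exactly "Master".
count_masters() {
    awk '$2 == "Master" { n++ } END { print n + 0 }'
}

# Print the PID/program column of `netstat -anp` lines listening on a port.
listeners_on_port() {
    local port="$1"
    awk -v port="$port" '
        $6 == "LISTEN" && $4 ~ (":" port "$") { print $7 }
    '
}
```

Typical use would be `jps | count_masters` and `netstat -anp 2>/dev/null | listeners_on_port 7077`. A count above one, or a listener that remains after stop-all.sh, points at a leftover process to stop before starting the cluster again.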

Originally published on the WeChat official account about云 (wwwaboutyuncom)

Original publication date: 2017-11-17
