Spark timeout possibly caused by binaryFiles() with more than 1 million files in HDFS

Stack Overflow user

Asked 2015-06-08 08:52:08

1 answer, 1.3K views, 0 followers, 4 votes

I am reading millions of XML files with:

val xmls = sc.binaryFiles(xmlDir)

This works fine locally, but fails on YARN:

 client token: N/A
 diagnostics: Application application_1433491939773_0012 failed 2 times due to ApplicationMaster for attempt appattempt_1433491939773_0012_000002 timed out. Failing the application.
 ApplicationMaster host: N/A
 ApplicationMaster RPC port: -1
 queue: default
 start time: 1433750951883
 final status: FAILED
 tracking URL: http://controller01:8088/cluster/app/application_1433491939773_0012
 user: ariskk
Exception in thread "main" org.apache.spark.SparkException: Application finished with failed status
at org.apache.spark.deploy.yarn.Client.run(Client.scala:622)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:647)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:569)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

In the Hadoop userlogs, I frequently see the following message:

15/06/08 09:15:38 WARN util.AkkaUtils: Error sending message [message = Heartbeat(1,[Lscala.Tuple2;@2b4f336b,BlockManagerId(1, controller01.stratified, 58510))] in 2 attempts
java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
at scala.concurrent.Await$.result(package.scala:107)
at org.apache.spark.util.AkkaUtils$.askWithReply(AkkaUtils.scala:195)
at org.apache.spark.executor.Executor$$anon$1.run(Executor.scala:427)

I run my Spark job through spark-submit, and it works for another HDFS directory that contains only 37k files. Any ideas how to resolve this?

hadoop

apache-spark

1 answer

Stack Overflow user

Answered 2015-06-08 15:15:30

OK, after getting some help on the Spark mailing list, I found out there were two issues:

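1. The source directory: if it is given as /my_dir/, Spark fails and produces the heartbeat timeouts. It should instead be given as hdfs://my_dir/*.
2. After fixing #1, an out-of-memory error appears in the logs. This is the Spark driver, running on YARN, running out of memory because of the number of files (apparently it keeps all file information in memory). I fixed this by submitting the job with --conf spark.driver.memory=8g.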
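
The two fixes can be sketched together as follows. This is only a minimal illustration, not the exact job from the question: the object name `ReadXmls`, the helper `toGlob`, and the application jar are hypothetical. The answer writes the path as `hdfs://my_dir/*`; the sketch uses three slashes (`hdfs:///my_dir/*`) on the assumption that no NameNode host is given in the URI, because in a two-slash URI the first segment is parsed as the host name.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object ReadXmls {
  // Hypothetical helper: turn a directory path into an explicit glob
  // over its children, as suggested by fix #1.
  def toGlob(dir: String): String =
    if (dir.endsWith("/*")) dir
    else if (dir.endsWith("/")) dir + "*"
    else dir + "/*"

  def main(args: Array[String]): Unit = {
    // Fix #2 is applied at submit time, not in this code, e.g.:
    //   spark-submit --conf spark.driver.memory=8g --class ReadXmls <application-jar>
    val conf = new SparkConf().setAppName("ReadXmls")
    val sc = new SparkContext(conf)

    // Assumed path; replace with the real HDFS directory.
    val xmlDir = toGlob("hdfs:///my_dir/")

    // binaryFiles yields one (path, PortableDataStream) pair per file.
    val xmls = sc.binaryFiles(xmlDir)
    println(s"Number of files: ${xmls.count()}")

    sc.stop()
  }
}
```

The driver memory is passed on the command line because the driver JVM has to be sized before the application code runs. The 8g value is the one reported in the answer and would likely need tuning for a different number of files.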

4 votes

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/30704814
