前往小程序,Get更优阅读体验!
立即前往
首页
学习
活动
专区
工具
TVP
发布
社区首页 >专栏 >[1217]org.apache.hadoop.hive.ql.exec.mr.MapRedTask. GC overhead limit exceeded

[1217]org.apache.hadoop.hive.ql.exec.mr.MapRedTask. GC overhead limit exceeded

作者头像
周小董
发布2023-10-10 08:40:11
4900
发布2023-10-10 08:40:11
举报
文章被收录于专栏:python前行者

在集群上跑的时候报了这样的错:Error: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)

然后根据job的id去yarn上面查询了一下日志,发现报错如下: FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.OutOfMemoryError: GC overhead limit exceeded

原来是内存溢出了,原因是数据量太大,导致在map的阶段内存不足。这时在SQL语句中加上设置参数的语句

代码语言:javascript
复制
set mapreduce.map.memory.mb=10150; 
set mapreduce.map.java.opts=-Xmx6144m;

当然这种情况还可能出现在reduce的阶段

代码语言:javascript
复制
set mapreduce.reduce.memory.mb=10150; 
set mapreduce.reduce.java.opts=-Xmx8120m;

参数的值自己可调,根据自己的需要设置就好。

Error while processing statement: FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. GC overhead limit exceeded

一般map读取一个片的数据不会内存不够,所以:

1、调大reduce个数 2、group by 数据倾斜 3、使用大的队列

代码语言:javascript
复制
set  mapreduce.job.queuename=hive;
set mapred.reduce.tasks=300;
set hive.optimize.skewjoin = true;

Hive执行报错org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.InterruptedException: sleep interrupted

报错日志如下:(肯定有时报错信息不准确,不能准确定位问题出现在哪里)

代码语言:javascript
复制
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.InterruptedException: sleep interrupted
	at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:348)
	at org.apache.hadoop.mapred.ClientServiceDelegate.getJobStatus(ClientServiceDelegate.java:428)
	at org.apache.hadoop.mapred.YARNRunner.getJobStatus(YARNRunner.java:568)
	at org.apache.hadoop.mapreduce.Job$1.run(Job.java:323)
	at org.apache.hadoop.mapreduce.Job$1.run(Job.java:320)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
	at org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:320)
	at org.apache.hadoop.mapreduce.Job.getJobState(Job.java:352)
	at org.apache.hadoop.mapred.JobClient$NetworkedJob.getJobState(JobClient.java:300)
	at org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:244)
	at org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:549)
	at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:438)
	at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:137)
	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88)
	at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:75)
Caused by: java.lang.InterruptedException: sleep interrupted
	at java.lang.Thread.sleep(Native Method)
	at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:345)
	... 17 more
Total MapReduce CPU Time Spent: -2 msec
Job Submission failed with exception 'org.apache.hadoop.yarn.exceptions.YarnRuntimeException(java.lang.InterruptedException: sleep interrupted)'

或者如下:

代码语言:javascript
复制
2021-10-31 09:00:11,340 [Thread-72] ERROR com.hadoop.compression.lzo.GPLNativeCodeLoader  - Could not load native gpl library
java.lang.UnsatisfiedLinkError: /home/pirate/dev/disk-5/tmp/yarn-local/usercache/pirate/appcache/application_1635150008466_34289/container_1635150008466_34289_01_000001/tmp/unpacked-3959672880919352106-libgplcompression.so: /home/pirate/dev/disk-5/tmp/yarn-local/usercache/pirate/appcache/application_1635150008466_34289/container_1635150008466_34289_01_000001/tmp/unpacked-3959672880919352106-libgplcompression.so: failed to map segment from shared object: Operation not permitted
	at java.lang.ClassLoader$NativeLibrary.load(Native Method)
	at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1941)
	at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1824)
	at java.lang.Runtime.load0(Runtime.java:809)
	at java.lang.System.load(System.java:1086)
	at com.hadoop.compression.lzo.GPLNativeCodeLoader.<clinit>(GPLNativeCodeLoader.java:51)
	at com.hadoop.compression.lzo.LzoCodec.<clinit>(LzoCodec.java:71)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:348)
	at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2134)
	at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2099)
	at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:132)
	at org.apache.hadoop.io.compress.CompressionCodecFactory.<init>(CompressionCodecFactory.java:179)
	at org.apache.hadoop.mapred.lib.CombineFileInputFormat.isSplitable(CombineFileInputFormat.java:159)
	at org.apache.hadoop.mapred.lib.CombineFileInputFormat.isSplitable(CombineFileInputFormat.java:151)
	at org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat.getMoreSplits(CombineFileInputFormat.java:283)
	at org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:239)
	at org.apache.hadoop.mapred.lib.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:75)
	at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getSplits(HadoopShimsSecure.java:309)
	at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getCombineSplits(CombineHiveInputFormat.java:470)
	at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:571)
	at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:328)
	at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:320)
	at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:196)
	at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
	at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
	at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
	at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:575)
	at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:570)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
	at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:570)
	at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:561)
	at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:432)
	at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:137)
	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:88)
	at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:75)
2021-10-31 09:00:11,341 [Thread-72] ERROR com.hadoop.compression.lzo.LzoCodec  - Cannot load native-lzo without native-hadoop

排查hive脚本发现,Hive指定优化参数如下:

代码语言:javascript
复制
set hive.exec.compress.output=true;
set mapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;
set mapred.output.compression.type=BLOCK;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.dynamic.partition=true;
set hive.auto.convert.join=true;
set mapreduce.map.memory.mb=40960;
set mapreduce.reduce.memory.mb=40960;
set mapred.child.java.opts=-Xmx1536m;
set mapreduce.job.reduce.slowstart.completedmaps=0.8;
set hive.exec.parallel=true;

考虑可能是mapreduce.map.memory.mb 或者  mapreduce.reduce.memory.mb参数配置过大引起的,这两个参数代表需要向yarn container中申请的内存大小,查找Hadoop yarn-site.xml配置文件发现如下配置:

代码语言:javascript
复制
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>30720</value>
</property>

于是将上述参数调小至此参数范围内,重新提交脚本,发现脚本执行成功;

总结: Mapper/Reducer阶段JVM堆内存溢出参数调优 目前MapReduce主要通过两个组参数去控制内存:(将如下参数调大)

代码语言:javascript
复制
Maper:
mapreduce.map.java.opts=-Xmx2048m(默认参数,表示jvm堆内存,注意是mapreduce不是mapred)
mapreduce.map.memory.mb=2304(container的内存)

Reducer:
mapreduce.reduce.java.opts=-=-Xmx2048m(默认参数,表示jvm堆内存)
mapreduce.reduce.memory.mb=2304(container的内存)

注意:因为在yarn container这种模式下,map/reduce task是运行在Container之中的,所以上面提到的mapreduce.map(reduce).memory.mb大小都大于mapreduce.map(reduce).java.opts值的大小。 mapreduce.{map|reduce}.java.opts能够通过Xmx设置JVM最大的heap的使用,一般设置为0.75倍的memory.mb,因为需要为java code等预留些空间

参考:https://blog.csdn.net/random0815/article/details/84944815 https://blog.csdn.net/qq_35896718/article/details/127783938 https://blog.51cto.com/u_15127593/4522219

本文参与 腾讯云自媒体同步曝光计划,分享自作者个人站点/博客。
原始发表:2023-08-06,如有侵权请联系 cloudcommunity@tencent.com 删除

本文分享自 作者个人站点/博客 前往查看

如有侵权,请联系 cloudcommunity@tencent.com 删除。

本文参与 腾讯云自媒体同步曝光计划  ,欢迎热爱写作的你一起参与!

评论
登录后参与评论
0 条评论
热度
最新
推荐阅读
领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档