
[1133] Flink Problem Collection


Problem 1: Could not get job jar and dependencies from JAR file: JAR file does not exist: -yn

Cause: the -yn parameter was deprecated after Flink 1.8; the ResourceManager now automatically starts as many containers as needed to satisfy the job's requested parallelism. Fix: just remove the parameter.
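For example, a submission that used to pass -yn would now look like this (a sketch; the jar name, memory sizes, and parallelism are placeholders):

# before (rejected on newer Flink): flink run -m yarn-cluster -yn 4 -yjm 1024m -ytm 2048m ./my-job.jar
# after: drop -yn and let the requested parallelism drive container allocation
flink run -m yarn-cluster -yjm 1024m -ytm 2048m -p 4 ./my-job.jar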

Problem 2: java.lang.IllegalStateException: No Executor found. Please make sure to export the HADOOP_CLASSPATH environment variable or have hadoop in your classpath.

Method 1: configure the environment variable

vi /etc/profile

# add the following line
export HADOOP_CLASSPATH=`hadoop classpath`

# make the change take effect
source /etc/profile

Method 2: download the flink-shaded-hadoop-2-uber jar matching your Hadoop version and place it in Flink's lib directory.
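For example (a sketch; the 2.7.5-7.0 build below is the one linked under Problem 7, pick whichever matches your Hadoop version):

cd $FLINK_HOME/lib
wget https://repo.maven.apache.org/maven2/org/apache/flink/flink-shaded-hadoop-2-uber/2.7.5-7.0/flink-shaded-hadoop-2-uber-2.7.5-7.0.jar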

Problem 3: Failed to execute goal net.alchim31.maven:scala-maven-plugin:3.2.2:compile (default) on project book-stream: wrap: org.apache.commons.exec.ExecuteException: Process exited with an error: 1 (Exit value: 1)


There are many possible causes for this error, so the key is to read the actual error details. In my case, Scala code was calling Java methods, but the build was configured to package only the Scala source directory, so the Java classes could not be found. The offending build line is shown below; comment it out or delete it. With no sourceDirectory specified, all source directories are included and the problem goes away.

<sourceDirectory>src/main/scala</sourceDirectory>
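After the deletion, the build section simply falls back to Maven's defaults, so both src/main/java and src/main/scala are picked up; roughly (a sketch):

<build>
    <!-- removed: <sourceDirectory>src/main/scala</sourceDirectory> -->
    <plugins>
        <!-- scala / java compiler plugins unchanged -->
    </plugins>
</build>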

Problem 4: org.apache.flink.client.program.ProgramInvocationException: The main method caused an error: Could not find a suitable table factory for 'org.apache.flink.table.planner.delegation.ParserFactory' in the classpath.

This error likewise means a required dependency was not packaged into the job jar (or the dependency needs to be placed in Flink's lib directory).

I switched the Maven build to the following plugins:

 <build>
        <plugins>
 
            <plugin>
                <groupId>org.scala-tools</groupId>
                <artifactId>maven-scala-plugin</artifactId>
                <version>2.15.2</version>
                <executions>
                    <execution>
                        <goals>
                            <goal>compile</goal>
                            <goal>testCompile</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
 
            <plugin>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.6.0</version>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                </configuration>
            </plugin>
 
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-surefire-plugin</artifactId>
                <version>2.19</version>
                <configuration>
                    <skip>true</skip>
                </configuration>
            </plugin>
 
        </plugins>
    </build>
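If you would rather ship the dependency inside the job jar, a maven-shade-plugin section along these lines builds a fat jar (a sketch, not the author's setup; the ServicesResourceTransformer matters here because table factories such as ParserFactory are discovered through META-INF/services files, which plain jar merging clobbers):

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <version>3.2.4</version>
    <executions>
        <execution>
            <phase>package</phase>
            <goals>
                <goal>shade</goal>
            </goals>
            <configuration>
                <transformers>
                    <!-- merge META-INF/services entries so Flink's SPI lookups keep working -->
                    <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
                </transformers>
            </configuration>
        </execution>
    </executions>
</plugin>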

Problem 5: Multiple versions of scala libraries detected!

Expected all dependencies to require Scala version: 2.11.12
org.apache.flink:flink-runtime_2.11:1.13.2 requires scala version: 2.11.12
org.apache.flink:flink-scala_2.11:1.13.2 requires scala version: 2.11.12
org.apache.flink:flink-scala_2.11:1.13.2 requires scala version: 2.11.12
org.scala-lang:scala-reflect:2.11.12 requires scala version: 2.11.12
org.apache.flink:flink-streaming-scala_2.11:1.13.2 requires scala version: 2.11.12
org.apache.flink:flink-streaming-scala_2.11:1.13.2 requires scala version: 2.11.12
org.scala-lang:scala-compiler:2.11.12 requires scala version: 2.11.12
org.scala-lang.modules:scala-xml_2.11:1.0.5 requires scala version: 2.11.7

This is caused by an old version of the scala-maven-plugin.

Starting from Scala 2.10, all changes within a bugfix/patch version should be backward compatible, so these warnings don't really have a point in this case. But they are still very important when, say, you somehow end up with both Scala 2.9 and 2.11 libraries. Since version 3.1.6 of the plugin, you can fix this using the scalaCompatVersion configuration.

Method 1: specify a matching scalaCompatVersion

 <configuration>
        <scalaCompatVersion>${scala.binary.version}</scalaCompatVersion>
        <scalaVersion>${scala.version}</scalaVersion> 
 </configuration>

The complete plugin section:

<plugin>
    <groupId>net.alchim31.maven</groupId>
    <artifactId>scala-maven-plugin</artifactId>
    <version>3.1.6</version>
    <configuration>
        <scalaCompatVersion>${scala.binary.version}</scalaCompatVersion>
        <scalaVersion>${scala.version}</scalaVersion>
    </configuration>
    <executions>
        <execution>
            <goals>
                <goal>compile</goal>
            </goals>
        </execution>
    </executions>
</plugin>

Method 2: upgrade the build plugin to a 4.x version

<plugin>    
    <groupId>net.alchim31.maven</groupId>    
    <artifactId>scala-maven-plugin</artifactId>    
    <version>4.2.0</version>    
    <executions>        
        <execution>            
            <goals>                
                <goal>compile</goal>            
            </goals>        
        </execution>    
    </executions>
</plugin>
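Both snippets assume the version properties are defined in the POM; a typical definition matching the 2.11.12 / Flink 1.13.2 combination from the warning (an assumption, adjust to your setup) would be:

<properties>
    <scala.version>2.11.12</scala.version>
    <scala.binary.version>2.11</scala.binary.version>
</properties>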

Problem 6: cannot be cast to com.google.protobuf.Message

Caused by: java.lang.ClassCastException: org.apache.hadoop.yarn.proto.YarnServiceProtos$RegisterApplicationMasterRequestProto cannot be cast to com.google.protobuf.Message
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227)
	at com.sun.proxy.$Proxy14.registerApplicationMaster(Unknown Source)
	at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.registerApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:106)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
	at com.sun.proxy.$Proxy15.registerApplicationMaster(Unknown Source)
	at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.registerApplicationMaster(AMRMClientImpl.java:222)
	at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.registerApplicationMaster(AMRMClientImpl.java:214)
	at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl.registerApplicationMaster(AMRMClientAsyncImpl.java:138)
	at org.apache.flink.yarn.YarnResourceManager.createAndStartResourceManagerClient(YarnResourceManager.java:205)
	at org.apache.flink.yarn.YarnResourceManager.initialize(YarnResourceManager.java:234)
	... 11 common frames omitted

This is usually caused by a conflict between the Hadoop jars in your own project and those on the Flink cluster. Fix: exclude the Hadoop-related jars from your project so they are not packaged into the job jar, e.g. by marking them as provided:

        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
            <version>${hadoop.version}</version>
            <scope>provided</scope>
        </dependency>
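With the scope set to provided, the Hadoop classes should no longer appear in the packaged jar; a quick way to confirm (a sketch, assuming the jar path):

mvn clean package
jar tf target/my-job.jar | grep 'org/apache/hadoop' || echo 'no bundled hadoop classes'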

Problem 7: Submitting a Flink application to the cluster fails with: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Hadoop is not in the classpath/dependencies


This happens when the Flink job touches the HDFS file system (for example, checkpointing to HDFS) but the Hadoop-related dependencies and configuration are missing.

Fix: 1. Add the following to the environment variables (don't forget to re-source them and then restart Flink; if refreshing the variables has no effect, reboot the machine):

HADOOP_HOME=xxx
export HADOOP_HOME
export HADOOP_CLASSPATH=`hadoop classpath`

2. If step 1 is definitely in place but the error persists, download a jar and put it in Flink's lib directory.

Download link for flink-shaded-hadoop-2-uber-2.7.5-7.0: https://repo.maven.apache.org/maven2/org/apache/flink/flink-shaded-hadoop-2-uber/2.7.5-7.0/flink-shaded-hadoop-2-uber-2.7.5-7.0.jar

All published flink-shaded-hadoop-2-uber builds: https://repo.maven.apache.org/maven2/org/apache/flink/flink-shaded-hadoop-2-uber/

Pick the build matching your Hadoop version; mine here is 2.9.2.


After dropping the jar in, restart the Flink cluster. On some systems a full OS reboot is needed; in my case the jar alone did not fix it, but everything worked after a reboot.
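For context, the kind of job that triggers this is one that writes checkpoints to HDFS; a minimal Scala sketch (host, port, path, and interval are placeholders; setCheckpointStorage is the Flink 1.13+ API):

import org.apache.flink.streaming.api.scala._

object HdfsCheckpointJob {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    env.enableCheckpointing(10000) // checkpoint every 10 seconds
    // Writing to an hdfs:// URI is exactly what requires Hadoop on the classpath
    env.getCheckpointConfig.setCheckpointStorage("hdfs://namenode:8020/flink/checkpoints")

    env.fromElements(1, 2, 3).map(_ * 2).print()
    env.execute("hdfs-checkpoint-demo")
  }
}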

Problem 8: java.lang.NoSuchMethodError: org.apache.commons.cli.Option.builder

Full error message:

Exception in thread "main" java.lang.NoSuchMethodError: org.apache.commons.cli.Option.builder(Ljava/lang/String;)Lorg/apache/commons/cli/Option$Builder;
	at org.apache.flink.runtime.entrypoint.parser.CommandLineOptions.<clinit>(CommandLineOptions.java:27)
	at org.apache.flink.runtime.entrypoint.DynamicParametersConfigurationParserFactory.options(DynamicParametersConfigurationParserFactory.java:43)
	at org.apache.flink.runtime.entrypoint.DynamicParametersConfigurationParserFactory.getOptions(DynamicParametersConfigurationParserFactory.java:50)
	at org.apache.flink.runtime.entrypoint.parser.CommandLineParser.parse(CommandLineParser.java:42)
	at org.apache.flink.runtime.entrypoint.ClusterEntrypointUtils.parseParametersOrExit(ClusterEntrypointUtils.java:63)
	at org.apache.flink.yarn.entrypoint.YarnJobClusterEntrypoint.main(YarnJobClusterEntrypoint.java:89)

Cause: the commons-cli version pulled in by the dependencies is too old, so the newer method is missing at runtime.

Fix: exclude commons-cli from the Hadoop dependency and add a newer version explicitly (Option.builder(String) was introduced in commons-cli 1.3):

代码语言:javascript
复制
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-common</artifactId>
            <version>3.2.2</version>
            <exclusions>
                <exclusion>
                    <groupId>commons-cli</groupId>
                    <artifactId>commons-cli</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
        
        <dependency>
            <groupId>commons-cli</groupId>
            <artifactId>commons-cli</artifactId>
            <version>1.3.1</version>
        </dependency>

Problem 9: Exception in thread "Thread-8" java.lang.IllegalStateException: Trying to access closed classloader. Please check if you store classloaders directly or indirectly in static fields. If the stacktrace suggests that the leak occurs in a third party library and cannot be fixed immediately, you can disable this check with the configuration 'classloader.check-leaked-classloader'.

Fix: add the following to flink-conf.yaml:

classloader.check-leaked-classloader: false

Problem 10: Could not deploy Yarn job cluster

The job fails at submission time with "Could not deploy Yarn job cluster". Reading further down the log reveals the actual cause: the configured memory request exceeded YARN's limits.

Fix: adjust the memory limits via these two settings:

yarn.scheduler.maximum-allocation-mb
yarn.nodemanager.resource.memory-mb
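Both are set in yarn-site.xml; for example (a sketch; the 8192 MB figure is arbitrary and should be sized to your nodes):

<property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>8192</value>
</property>
<property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>8192</value>
</property>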


Problem 11: org.apache.flink.client.deployment.ClusterDeploymentException: Couldn't deploy Yarn

The telltale line in this kind of failure is: Current usage: 75.1 MB of 1 GB physical memory used; 2.1 GB of 2.1 GB virtual memory used. Killing container.

Taken literally, this says the container ran out of memory; the actual trigger is YARN's virtual-memory check during Flink-on-YARN startup (the 2.1 GB limit is 1 GB of physical memory times the default vmem-pmem ratio of 2.1).

So the simplest fix is to turn the check off in the configuration.

Edit etc/hadoop/yarn-site.xml:

<property> 
    <name>yarn.nodemanager.vmem-check-enabled</name> 
    <value>false</value> 
</property>
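Alternatively, if you would rather keep the check enabled, raising the ratio instead should work; yarn.nodemanager.vmem-pmem-ratio defaults to 2.1, which is exactly where the 2.1 GB limit above came from (a sketch; the value 4 is an arbitrary example):

<property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>4</value>
</property>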

Problem 12: org.apache.hadoop.yarn.client.RMProxy - Connecting to ResourceManager at /0.0.0.0:8032

The client keeps failing with "Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s)".

2021-03-19 07:43:15,103 WARN  org.apache.flink.runtime.util.HadoopUtils                     - Could not find Hadoop configuration via any of the supported methods (Flink configuration, environment variables).
2021-03-19 07:43:15,545 WARN  org.apache.hadoop.util.NativeCodeLoader                       - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2021-03-19 07:43:15,657 INFO  org.apache.flink.runtime.security.modules.HadoopModule        - Hadoop user set to ryxiong (auth:SIMPLE), credentials check status: true
2021-03-19 07:43:15,715 INFO  org.apache.flink.runtime.security.modules.JaasModule          - Jaas file will be created as /tmp/jaas-1195372589162118065.conf.
2021-03-19 07:43:15,734 WARN  org.apache.flink.yarn.cli.FlinkYarnSessionCli                 - The configuration directory ('/opt/module/flink-1.10.1/conf') already contains a LOG4J config file.If you want to use logback, then please delete or rename the log configuration file.
2021-03-19 07:43:15,802 INFO  org.apache.hadoop.yarn.client.RMProxy                         - Connecting to ResourceManager at /0.0.0.0:8032

2021-03-19 07:43:27,189 INFO  org.apache.hadoop.ipc.Client                                  - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-03-19 07:43:28,195 INFO  org.apache.hadoop.ipc.Client                                  - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2021-03-19 07:43:29,201 INFO  org.apache.hadoop.ipc.Client                                  - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

Causes:

1. Check whether the Hadoop cluster is actually running; if it is not, nothing can connect to YARN.

2. Flink runs on YARN and must be able to locate the Hadoop configuration, because it has to reach the YARN ResourceManager and HDFS. If the cluster is up and you still cannot connect, check whether the Hadoop environment variables are set correctly.

Fix:

1. Start the Hadoop cluster.

2. Configure the Hadoop environment variables:

# HADOOP_HOME
export HADOOP_HOME=/opt/module/hadoop-2.7.2
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
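After re-sourcing the profile, a quick sanity check (a sketch) tells you whether the client now sees the real ResourceManager:

source /etc/profile
hadoop classpath   # should print the Hadoop jars rather than an error
yarn node -list    # should list NodeManagers instead of retrying 0.0.0.0:8032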

Problem 13: Fixing the Hadoop startup errors LOG4J.PROPERTIES IS NOT FOUND…CORE-SITE.XML NOT FOUND

Description: after disabling Kerberos authentication on a CDH cluster, a functional check of the services failed: listing the HDFS file system reported that core-site.xml could not be found.

[root@utility ~]# hadoop fs -ls /
WARNING: log4j.properties is not found. HADOOP_CONF_DIR may be incomplete.
Exception in thread "main" java.lang.RuntimeException: core-site.xml not found
        at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2867)
        at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2815)
        at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2692)
        at org.apache.hadoop.conf.Configuration.set(Configuration.java:1329)
        at org.apache.hadoop.conf.Configuration.set(Configuration.java:1301)
        at org.apache.hadoop.conf.Configuration.setBoolean(Configuration.java:1642)
        at org.apache.hadoop.util.GenericOptionsParser.processGeneralOptions(GenericOptionsParser.java:339)
        at org.apache.hadoop.util.GenericOptionsParser.parseGeneralOptions(GenericOptionsParser.java:569)
        at org.apache.hadoop.util.GenericOptionsParser.<init>(GenericOptionsParser.java:174)
        at org.apache.hadoop.util.GenericOptionsParser.<init>(GenericOptionsParser.java:156)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
        at org.apache.hadoop.fs.FsShell.main(FsShell.java:389)

The message says core-site.xml cannot be found, which was puzzling: core-site.xml and the other configuration files clearly existed and contained no errors. After some digging, the fix turned out to be an environment-variable issue: HADOOP_CONF_DIR must point at your own Hadoop configuration directory (the default resolves to a wrong path, hence the error).

vi /etc/profile
 
export HADOOP_HOME=/opt/cloudera/parcels/CDH/lib/hadoop
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export PATH=$PATH:$HADOOP_HOME/bin

Save, exit, and make the change take effect immediately:

source /etc/profile

Re-running the command now works normally.


If the error persists after this change, check whether the HADOOP_CONF_DIR path is also configured in hadoop-env.sh; if not, add it there and save, which should resolve the issue.


Note: when this error appears, the main places to check are the HADOOP_CONF_DIR paths configured in hadoop-env.sh, mapred-env.sh, and yarn-env.sh under etc/hadoop, as sketched below.
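The line to look for is the same in each of those files; using the CDH path from above (a sketch):

# etc/hadoop/hadoop-env.sh (and likewise mapred-env.sh / yarn-env.sh)
export HADOOP_CONF_DIR=/opt/cloudera/parcels/CDH/lib/hadoop/etc/hadoop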

Problem 14: java.lang.NoClassDefFoundError: org/apache/commons/logging/LogFactory

Fix. Method 1: add the commons-logging.jar to the project.

Method 2: in a Maven project, add the commons-logging dependency directly to pom.xml, as follows:

<dependency>
	<groupId>commons-logging</groupId>
	<artifactId>commons-logging</artifactId>
	<version>1.2</version>
</dependency>

Note: it needs to be added at the beginning of the <dependencies> tag; if it is added elsewhere, the problem may persist.
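That is, place it at the top of the block (a sketch):

<dependencies>
    <!-- commons-logging first, ahead of anything that drags in an older copy -->
    <dependency>
        <groupId>commons-logging</groupId>
        <artifactId>commons-logging</artifactId>
        <version>1.2</version>
    </dependency>
    <!-- remaining project dependencies follow -->
</dependencies>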
