我试图使用命令spark-submit --master yarn --deploy-mode cluster app.py
提交一个pyspark应用程序。
我正在获取信息客户端: application_1663517069168_0003的应用程序报告(状态:接受),无休止的
我创建了一个AWS EMR集群,其中只有一个主节点和一个核心节点,并且试图通过连接到主节点来提交应用程序。
22/09/18 18:25:16 INFO RMProxy: Connecting to ResourceManager at ip-172-31-90-73.ec2.internal/172.31.90.73:8032
22/09/18 18:25:16 INFO Client: Requesting a new application from cluster with 1 NodeManagers
22/09/18 18:25:16 INFO Configuration: resource-types.xml not found
22/09/18 18:25:16 INFO ResourceUtils: Unable to find 'resource-types.xml'.
22/09/18 18:25:16 INFO ResourceUtils: Adding resource type - name = memory-mb, units = Mi, type = COUNTABLE
22/09/18 18:25:16 INFO ResourceUtils: Adding resource type - name = vcores, units = , type = COUNTABLE
22/09/18 18:25:16 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (6144 MB per container)
22/09/18 18:25:16 INFO Client: Will allocate AM container, with 2432 MB memory including 384 MB overhead
22/09/18 18:25:16 INFO Client: Setting up container launch context for our AM
22/09/18 18:25:16 INFO Client: Setting up the launch environment for our AM container
22/09/18 18:25:16 INFO Client: Preparing resources for our AM container
22/09/18 18:25:16 WARN Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
22/09/18 18:25:19 INFO Client: Uploading resource file:/mnt/tmp/spark-fb80e661-49d6-4738-8f29-351b4efdf337/__spark_libs__2372700901935780238.zip -> hdfs://ip-172-31-90-73.ec2.internal:8020/user/hadoop/.sparkStaging/application_1663517069168_0003/__spark_libs__2372700901935780238.zip
22/09/18 18:25:21 INFO Client: Uploading resource file:/home/hadoop/app.py -> hdfs://ip-172-31-90-73.ec2.internal:8020/user/hadoop/.sparkStaging/application_1663517069168_0003/app.py
22/09/18 18:25:21 INFO Client: Uploading resource file:/usr/lib/spark/python/lib/pyspark.zip -> hdfs://ip-172-31-90-73.ec2.internal:8020/user/hadoop/.sparkStaging/application_1663517069168_0003/pyspark.zip
22/09/18 18:25:21 INFO Client: Uploading resource file:/usr/lib/spark/python/lib/py4j-0.10.7-src.zip -> hdfs://ip-172-31-90-73.ec2.internal:8020/user/hadoop/.sparkStaging/application_1663517069168_0003/py4j-0.10.7-src.zip
22/09/18 18:25:21 INFO Client: Uploading resource file:/mnt/tmp/spark-fb80e661-49d6-4738-8f29-351b4efdf337/__spark_conf__1449366969974574614.zip -> hdfs://ip-172-31-90-73.ec2.internal:8020/user/hadoop/.sparkStaging/application_1663517069168_0003/__spark_conf__.zip
22/09/18 18:25:22 INFO SecurityManager: Changing view acls to: hadoop
22/09/18 18:25:22 INFO SecurityManager: Changing modify acls to: hadoop
22/09/18 18:25:22 INFO SecurityManager: Changing view acls groups to:
22/09/18 18:25:22 INFO SecurityManager: Changing modify acls groups to:
22/09/18 18:25:22 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hadoop); groups with view permissions: Set(); users with modify permissions: Set(hadoop); groups with modify permissions: Set()
22/09/18 18:25:24 INFO Client: Submitting application application_1663517069168_0003 to ResourceManager
22/09/18 18:25:24 INFO YarnClientImpl: Submitted application application_1663517069168_0003
22/09/18 18:25:25 INFO Client: Application report for application_1663517069168_0003 (state: ACCEPTED)
22/09/18 18:25:25 INFO Client:
client token: N/A
diagnostics: [Sun Sep 18 18:25:24 +0000 2022] Application is added to the scheduler and is not yet activated. Queue's AM resource limit exceeded. Details : AM Partition = CORE; AM Resource Request = <memory:2432, max memory:6144, vCores:1, max vCores:4>; Queue Resource Limit for AM = <memory:3072, vCores:1>; User AM Resource Limit of the queue = <memory:3072, vCores:1>; Queue AM Resource Usage = <memory:2432, vCores:1>;
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1663525524352
final status: UNDEFINED
tracking URL: http://ip-172-31-90-73.ec2.internal:20888/proxy/application_1663517069168_0003/
user: hadoop
22/09/18 18:25:26 INFO Client: Application report for application_1663517069168_0003 (state: ACCEPTED)
22/09/18 18:25:27 INFO Client: Application report for application_1663517069168_0003 (state: ACCEPTED)
22/09/18 18:25:28 INFO Client: Application report .......```
发布于 2022-09-19 04:35:20
只要读一读日志,问题就很清楚了。
将
应用程序添加到调度程序中,但尚未激活。队列的AM资源限制超过了。详细信息: AM分区=核心;AM资源请求=<内存:2432,最大内存:6144,vCores:1,max vCores:4>;AM的队列资源限制为<内存:3072,vCores:1>;队列的用户AM资源限制=<内存:3072,vCores:1>;队列AM资源使用=<内存:2432,vCores:1>;
您请求的资源比集群的资源还多。你可以通过给你的电子病历集群提供更多的核心来改变这一点。
https://stackoverflow.com/questions/73765508
复制相似问题