I'm out of ideas. I've tried many configurations, but none of them work. I'm trying to run a jar file through YARN on my Hadoop cluster, and this is the result:
2020-10-07 21:27:01,960 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Created MRAppMaster for application appattempt_1602101475531_0003_000002
2020-10-07 21:27:02,145 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMast
I want to do parallel processing in a for loop using PySpark.
from pyspark.sql import SparkSession
spark = SparkSession.builder.master('yarn').appName('myAppName').getOrCreate()
spark.conf.set("mapreduce.fileoutputcommitter.marksuccessfuljobs", "false")
data = [a,b,c]
for i in data:
    try:
        df =
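For reference, here is a minimal sketch of the kind of parallelism I have in mind, assuming each item can be processed independently. Spark accepts jobs submitted from several threads within one application, so a thread pool on the driver is one possible way to replace the sequential loop. The names `run_in_parallel` and `demo_worker` are hypothetical and not part of my real code; `demo_worker` is only a stand-in for the per-item Spark work, so the sketch runs without a cluster.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_in_parallel(items, worker, max_workers=4):
    """Call worker(item) for every item on a thread pool.

    Returns (results, errors), two dicts keyed by item.
    Items must be hashable because they are used as dict keys.
    """
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each submitted future back to the item it processes.
        futures = {pool.submit(worker, item): item for item in items}
        for future in as_completed(futures):
            item = futures[future]
            try:
                results[item] = future.result()
            except Exception as exc:
                # Record the failure and keep processing the other items.
                errors[item] = exc
    return results, errors


def demo_worker(item):
    # Hypothetical stand-in for the per-item Spark work (for example,
    # building and writing a DataFrame with the shared session).
    if item == "b":
        raise ValueError("simulated failure for item b")
    return item.upper()


if __name__ == "__main__":
    results, errors = run_in_parallel(["a", "b", "c"], demo_worker)
    print(sorted(results.items()))
    print({key: type(exc).__name__ for key, exc in errors.items()})
```

In the real job, the worker would use the shared `spark` session instead of `demo_worker`. Whether the jobs actually overlap on the cluster depends on the available YARN resources and on the Spark scheduler settings, so this is a starting point rather than a guaranteed speedup.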