    localSpark.sparkContext.parallelize(pubDataIt.toList) // in reality, instead of take, there will be ML-based logic that needs to be executed on localDF
    }
    res.write.mode("overwrite").parquet(outputPath) // save mode, sink format, and path are assumptions; only "res.write.mo" is legible in the question
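A self-contained sketch of that parallelize-then-write pattern, assuming a local SparkSession named localSpark and a dummy pubDataIt iterator in place of the real data (the ML step is a placeholder):

    import org.apache.spark.sql.{DataFrame, SparkSession}

    object LocalMlSketch {
      def main(args: Array[String]): Unit = {
        val localSpark = SparkSession.builder()
          .appName("local-ml-sketch") // hypothetical app name
          .master("local[*]")
          .getOrCreate()
        import localSpark.implicits._

        // dummy stand-in for the rows pulled back to the driver in the real job
        val pubDataIt: Iterator[(String, Long)] = Iterator(("a", 1L), ("b", 2L))

        // materialize the iterator once, then redistribute it as a DataFrame;
        // in the real job the ML-based logic would run on localDF at this point
        val localDF: DataFrame = localSpark.sparkContext
          .parallelize(pubDataIt.toList)
          .toDF("key", "value")
        val res = localDF // placeholder for the ML step's output

        res.write.mode("overwrite").parquet("/tmp/res-sketch") // sink and path assumed
        localSpark.stop()
      }
    }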
I have tried setting spark.executor.memory and spark.executor.heartbeatInterval, but the error persists. I have also tried putting .cache() at the end of different lines, with no change. The relevant fragment of the stack trace:

    $WriterThread$$anonfun$run$3$$anonfun$apply$4.apply(PythonRDD.scala:344)
    at org.apache.spark
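For reference, a minimal sketch of how those two properties can be set when building the session (the values are placeholder assumptions, not recommendations; they can equally be passed as --conf flags to spark-submit):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("memory-tuning-sketch") // hypothetical app name
      .master("local[*]")
      .config("spark.executor.memory", "4g")             // example heap size per executor
      .config("spark.executor.heartbeatInterval", "60s") // must stay well below spark.network.timeout
      .getOrCreate()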
    t = spark.sql("SET").withColumn("rw", expr("row_number() over (order by key)")).collect()[0].asDict()

I am also trying this with Scala Spark. In the actual problem I have 200+ columns, so I don't want to use the case class approach.
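With 200+ columns, one way to avoid defining a huge case class in Scala is to build the schema programmatically with StructType and create the DataFrame from Rows; a minimal sketch (the column names, types, and dummy row are assumptions):

    import org.apache.spark.sql.{Row, SparkSession}
    import org.apache.spark.sql.functions.col
    import org.apache.spark.sql.types.{StringType, StructField, StructType}

    val spark = SparkSession.builder().appName("wide-schema-sketch").master("local[*]").getOrCreate()

    // hypothetical column names c1..c200; in practice these could come from a header or metadata
    val colNames = (1 to 200).map(i => s"c$i")
    val schema = StructType(colNames.map(StructField(_, StringType, nullable = true)))

    // one dummy row whose width matches the schema
    val rows = spark.sparkContext.parallelize(Seq(Row.fromSeq(Seq.fill(200)("x"))))

    val wideDF = spark.createDataFrame(rows, schema) // no case class needed
    wideDF.select(colNames.take(3).map(col): _*).show()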