I use my PC as the Spark server and, at the same time, as the Spark worker, running Spark 2.3.1.
At first I was on Ubuntu 16.04 LTS. Everything worked fine there: I tried to run the SparkPi example (with spark-submit and with spark-shell) and it ran as expected. I also tried running it through Spark's REST API, with this POST string:
curl -X POST http://192.168.1.107:6066/v1/submissions/create --header "Content-Type:application/json" --data '{
"action": "CreateSubmissionRequest",
"appResource": "file:/home/Workspace/Spark/spark-2.3.1-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.3.1.jar",
"clientSparkVersion": "2.3.1",
"appArgs": [ "10" ],
"environmentVariables" : {
"SPARK_ENV_LOADED" : "1"
},
"mainClass": "org.apache.spark.examples.SparkPi",
"sparkProperties": {
"spark.jars": "file:/home/Workspace/Spark/spark-2.3.1-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.3.1.jar",
"spark.driver.supervise":"false",
"spark.executor.memory": "512m",
"spark.driver.memory": "512m",
"spark.submit.deployMode":"cluster",
"spark.app.name": "SparkPi",
"spark.master": "spark://192.168.1.107:7077"
}
}'
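(Side note: the same REST endpoint also lets you poll the driver afterwards; the driver ID below is just a placeholder, the real one comes back as "submissionId" in the JSON response of the create call above.)

curl http://192.168.1.107:6066/v1/submissions/status/driver-20180907000000-0000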
After testing this and that, I had to move to Windows, because this will end up running on Windows anyway. I can start the server and the worker (manually), I added winutils.exe, and I can also run the SparkPi example with spark-shell and spark-submit; all of that works (a rough sketch of the manual start commands is at the end of this question). The problem is when I use the REST API, with this POST string:
curl -X POST http://192.168.1.107:6066/v1/submissions/create --header "Content-Type:application/json" --data '{
"action": "CreateSubmissionRequest",
"appResource": "file:D:/Workspace/Spark/spark-2.3.1-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.3.1.jar",
"clientSparkVersion": "2.3.1",
"appArgs": [ "10" ],
"environmentVariables" : {
"SPARK_ENV_LOADED" : "1"
},
"mainClass": "org.apache.spark.examples.SparkPi",
"sparkProperties": {
"spark.jars": "file:D:/Workspace/Spark/spark-2.3.1-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.3.1.jar",
"spark.driver.supervise":"false",
"spark.executor.memory": "512m",
"spark.driver.memory": "512m",
"spark.submit.deployMode":"cluster",
"spark.app.name": "SparkPi",
"spark.master": "spark://192.168.1.107:7077"
}
}'
Only the path is slightly different, but my worker always fails. The log says:
"Exception from the cluster: java.lang.NullPointerException
org.apache.spark.deploy.worker.DriverRunner.downloadUserJar(DriverRunner.scala:151)
org.apache.spark.deploy.worker.DriverRunner.prepareAndRunDriver(DriverRunner.scala:173)
org.apache.spark.deploy.worker.DriverRunner$$anon$1.run(DriverRunner.scala:92)"
I've searched around but haven't found a solution yet.
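For completeness, starting the master and a worker by hand on Windows boils down to something like this (standard spark-class invocations, shown here only as a sketch with the paths and IP from my setup):

cd /d D:\Workspace\Spark\spark-2.3.1-bin-hadoop2.7
bin\spark-class org.apache.spark.deploy.master.Master --host 192.168.1.107
:: in a second terminal, point the worker at the master:
bin\spark-class org.apache.spark.deploy.worker.Worker spark://192.168.1.107:7077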
Posted on 2018-09-07 16:03:33
So, I finally found the cause.
After checking, I concluded that the problem doesn't come from Spark itself, but from the parameters not being read correctly; somehow I had put them in the wrong format. (My guess is that file:D:/... gets parsed as an opaque URI whose path comes back as null, which would explain the NullPointerException in downloadUserJar, whereas on Linux file:/home/... already starts with a slash and parses fine.)
So, after trying several variations, this is the one that works. I changed:
"appResource": "file:D:/Workspace/Spark/spark-2.3.1-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.3.1.jar"
to:
"appResource": "file:///D:/Workspace/Spark/spark-2.3.1-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.3.1.jar"
I did the same thing with the spark.jars parameter.
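Putting it together, the POST string that works on Windows is the same as above with only the two file URIs changed:

curl -X POST http://192.168.1.107:6066/v1/submissions/create --header "Content-Type:application/json" --data '{
"action": "CreateSubmissionRequest",
"appResource": "file:///D:/Workspace/Spark/spark-2.3.1-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.3.1.jar",
"clientSparkVersion": "2.3.1",
"appArgs": [ "10" ],
"environmentVariables" : {
"SPARK_ENV_LOADED" : "1"
},
"mainClass": "org.apache.spark.examples.SparkPi",
"sparkProperties": {
"spark.jars": "file:///D:/Workspace/Spark/spark-2.3.1-bin-hadoop2.7/examples/jars/spark-examples_2.11-2.3.1.jar",
"spark.driver.supervise":"false",
"spark.executor.memory": "512m",
"spark.driver.memory": "512m",
"spark.submit.deployMode":"cluster",
"spark.app.name": "SparkPi",
"spark.master": "spark://192.168.1.107:7077"
}
}'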
That tiny difference cost me almost 24 hours of work...
https://stackoverflow.com/questions/52215099