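I have started using Livy. In my setup, the Livy server runs on a Unix machine, and I can curl it and execute jobs. I have built a fat jar and uploaded it to HDFS, and I simply call its main method from Livy. The JSON payload for Livy looks like this: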
{
  "file" : "hdfs:///user/data/restcheck/spark_job_2.11-3.0.0-RC1-SNAPSHOT.jar",
  "proxyUser" : "test_user",
  "className" : "com.local.test.spark.pipeline.path.LivyTest",
  "files" : ["hdfs:///user/data/restcheck/hivesite.xml", "hdfs:///user/data/restcheck/log4j.properties"],
  "driverMemory" : "5G",
  "executorMemory" : "10G",
  "executorCores" : 5,
  "numExecutors" : 10,
  "queue" : "user.queue",
  "name" : "LivySampleTest2",
  "conf" : {
    "spark.master" : "yarn",
    "spark.executor.extraClassPath" : "/etc/hbase/conf/",
    "spark.executor.extraJavaOptions" : "-Dlog4j.configuration=file:log4j.properties",
    "spark.driver.extraJavaOptions" : "-Dlog4j.configuration=file:log4j.properties",
    "spark.ui.port" : 4100,
    "spark.port.maxRetries" : 100,
    "JAVA_HOME" : "/usr/java/jdk1.8.0_60",
    "HADOOP_CONF_DIR" : "/etc/hadoop/conf:/etc/hive/conf:/etc/hbase/conf",
    "HIVE_CONF_DIR" : "/etc/hive/conf"
  }
}

Below is my curl call:
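curl -X POST --negotiate -u:"test_user" --data @/user/data/Livy/SampleFile.json -H "Content-Type: application/json" https://livyhost:8998/batches

I am trying to convert this into a REST API call, following the WordCount example provided by Cloudera, but I have not been able to translate my curl call into a REST API call. I have already put all the jars in HDFS, so I don't think I need to make the upload-jar call.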
Answered 2019-10-02 13:13:53
It should work with curl as well.
Please try the JSON below.
curl -H "Content-Type: application/json" https://livyhost:8998/batches \
  -X POST --data '{
    "name" : "LivyREST",
    "className" : "com.local.test.spark.pipeline.path.LivyTest",
    "file" : "/user/data/restcheck/spark_job_2.11-3.0.0-RC1-SNAPSHOT.jar"
  }'

In addition, I have added a reference:
http://gethue.com/how-to-use-the-livy-spark-rest-job-server-api-for-submitting-batch-jar-python-and-streaming-spark-jobs/
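If the goal is to issue the same call from code rather than from curl, the request is just an HTTP POST of the JSON to /batches. Here is a minimal sketch in Python using only the standard library. The function names build_batch_request and submit_batch are made up for illustration, the URL is copied from the question, and the sketch assumes the server accepts unauthenticated requests. The Kerberos handshake that curl performs with --negotiate is not covered here and would need an extra library.

```python
import json
import urllib.request

# Endpoint copied from the curl command in the question.
LIVY_BATCHES_URL = "https://livyhost:8998/batches"


def build_batch_request(payload, url=LIVY_BATCHES_URL):
    """Build (but do not send) the POST request that curl would issue."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit_batch(payload, url=LIVY_BATCHES_URL):
    """Send the request and return the server's JSON response as a dict.

    Assumption: no authentication is required. The setup in the question
    uses Kerberos (--negotiate), which urllib does not handle by itself.
    """
    with urllib.request.urlopen(build_batch_request(payload, url)) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Same fields as the JSON in the curl example; the jar is already in HDFS,
# so no upload step is involved.
payload = {
    "name": "LivyREST",
    "className": "com.local.test.spark.pipeline.path.LivyTest",
    "file": "hdfs:///user/data/restcheck/spark_job_2.11-3.0.0-RC1-SNAPSHOT.jar",
}
request = build_batch_request(payload)
```

Calling submit_batch(payload) would actually send the request. The extra fields from the original payload (proxyUser, conf, queue and so on) can be added to the dict in the same way.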
https://stackoverflow.com/questions/58191185