The general steps can be found at this link: http://stackoverflow.com/questions/22252534/how-to-run-a-spark-java-program-from-command-line
Below is my pom.xml:

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <!-- must match the groupId given on the command line -->
  <groupId>spark.examples</groupId>
  <!-- must match the artifactId given on the command line -->
  <artifactId>JavaWordCount</artifactId>
  <packaging>jar</packaging>
  <version>1</version>
  <name>JavaWordCount</name>
  <url>http://maven.apache.org</url>
  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-examples_2.10</artifactId>
      <version>1.1.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.10</artifactId>
      <version>1.4.1</version>
    </dependency>
  </dependencies>
</project>
```
3. cd example-java-build/JavaWordCount
mvn package
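This creates the jar file inside the target directory (target/JavaWordCount-1.jar). Because the pom.xml above configures no shade or assembly plugin, it is a regular jar holding only the project's classes, not a fat jar; the Spark classes are supplied by spark-submit at run time.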
![clipboard2](https://user-images.githubusercontent.com/5669954/28005843-f8c64808-654c-11e7-8a72-bf61d78e15bd.png)
The classes folder contains the individual .class files:
![clipboard3](https://user-images.githubusercontent.com/5669954/28005849-fd29e0e4-654c-11e7-8e25-8563f2219e18.png)
Copy the jar file to any location on the server, then go to the bin folder of your Spark installation.
Submit the Spark job: ./spark-submit --class "org.apache.spark.examples.JavaWordCount" --master local /root/devExpert/spark-1.4.1/example-java-build/JavaWordCount/target/JavaWordCount-1.jar
Use jd.exe to open the compiled Java class and make sure the value passed to --class equals the fully qualified name of the class; in my example it is org.apache.spark.examples.JavaWordCount. Otherwise you will get a java.lang.ClassNotFoundException.
![clipboard4](https://user-images.githubusercontent.com/5669954/28005858-0459ae6c-654d-11e7-9c61-d61b36d20334.png)
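If a decompiler is not at hand, the same check can be sketched in plain Java by listing the class names stored in the jar. This is only a minimal sketch: `JarClassLister` and its methods are hypothetical names of mine, not part of Spark or Maven, and it relies only on the JDK's `java.util.jar` API.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper (not part of Spark): lists the fully qualified class
// names inside a jar so the --class value can be checked before submitting.
class JarClassLister {

    // Converts a jar entry such as "org/apache/spark/examples/JavaWordCount.class"
    // into "org.apache.spark.examples.JavaWordCount"; returns null for other entries.
    static String toClassName(String entryName) {
        if (!entryName.endsWith(".class")) {
            return null;
        }
        String withoutSuffix = entryName.substring(0, entryName.length() - ".class".length());
        return withoutSuffix.replace('/', '.');
    }

    // Collects the class name of every .class entry in the given jar file.
    static List<String> listClasses(String jarPath) throws IOException {
        List<String> names = new ArrayList<String>();
        try (JarFile jar = new JarFile(jarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                String name = toClassName(entries.nextElement().getName());
                if (name != null) {
                    names.add(name);
                }
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to the jar, args[1]: the value you plan to pass to --class
        if (args.length < 2) {
            System.out.println("usage: JarClassLister <jar> <class name>");
            return;
        }
        boolean found = listClasses(args[0]).contains(args[1]);
        System.out.println(found ? "Found " + args[1] : "Not found: " + args[1]);
    }
}
```

Run it with the jar path and the intended --class value as arguments; a "Not found" result suggests spark-submit would fail with ClassNotFoundException for that name.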
4. ./spark-submit --class "org.apache.spark.examples.JavaWordCount" --master local /root/devExpert/spark-1.4.1/example-java-build/JavaWordCount/target/JavaWordCount-1.jar /root/devExpert/spark-1.4.1/bin/test.txt
To debug, trace the script with sh -x: sh -x ./spark-submit --class "org.apache.spark.examples.JavaWordCount" --master local /root/devExpert/spark-1.4.1/example-java-build/JavaWordCount/target/JavaWordCount-1.jar /root/devExpert/spark-1.4.1/bin/test.txt
This is equivalent to: /usr/jdk1.7.0_79/bin/java -cp /root/devExpert/spark-1.4.1/conf/:/root/devExpert/spark-1.4.1/assembly/target/scala-2.10/spark-assembly-1.4.1-hadoop2.4.0.jar:/root/devExpert/spark-1.4.1/lib_managed/jars/datanucleus-rdbms-3.2.9.jar:/root/devExpert/spark-1.4.1/lib_managed/jars/datanucleus-core-3.2.10.jar:/root/devExpert/spark-1.4.1/lib_managed/jars/datanucleus-api-jdo-3.2.6.jar -Xms512m -Xmx512m -XX:MaxPermSize=256m org.apache.spark.deploy.SparkSubmit --master local --class org.apache.spark.examples.JavaWordCount /root/devExpert/spark-1.4.1/example-java-build/JavaWordCount/target/JavaWordCount-1.jar /root/devExpert/spark-1.4.1/bin/test.txt
-cp is the same as -classpath: it specifies where to find the other classes that the class being run depends on, usually libraries and jar files, and each jar must be given by its full path. Entries are separated by a semicolon ";" on Windows and by a colon ":" on Linux. A pattern such as *.jar is not supported, so jars are normally listed one by one; since Java 6 an entry ending in /* is expanded to all jar files in that directory. A single dot "." stands for the current directory.
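The separator rule for -cp can be sketched in Java, where `File.pathSeparator` already holds ";" on Windows and ":" on Linux. This is a rough illustration rather than anything spark-submit itself does: `ClasspathBuilder`, `join` and `jarsIn` are hypothetical names introduced here, and the directory passed on the command line is an assumption of the example.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: builds a value for -cp from a list of entries.
class ClasspathBuilder {

    // Joins entries with the platform separator: ";" on Windows, ":" on Linux.
    static String join(List<String> entries) {
        StringBuilder sb = new StringBuilder();
        boolean first = true;
        for (String entry : entries) {
            if (!first) {
                sb.append(File.pathSeparator);
            }
            sb.append(entry);
            first = false;
        }
        return sb.toString();
    }

    // Returns the full path of every .jar file directly inside dir
    // (an empty list if dir is not a readable directory).
    static List<String> jarsIn(File dir) {
        List<String> jars = new ArrayList<String>();
        File[] files = dir.listFiles();
        if (files != null) {
            for (File f : files) {
                if (f.isFile() && f.getName().endsWith(".jar")) {
                    jars.add(f.getAbsolutePath());
                }
            }
        }
        return jars;
    }

    public static void main(String[] args) {
        // args[0] (optional): a directory containing jars; defaults to the current directory
        List<String> entries = new ArrayList<String>();
        entries.add("."); // "." stands for the current directory
        entries.addAll(jarsIn(new File(args.length > 0 ? args[0] : ".")));
        System.out.println("-cp " + join(entries));
    }
}
```

Listing the jars explicitly like this mirrors the long -cp value in the traced command, where every jar appears with its full path.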
Output:
![clipboard6](https://user-images.githubusercontent.com/5669954/28005861-095700e0-654d-11e7-86da-e3b08feda93b.png)