I'm trying to run pyspark on my MacBook Air. When I try to start it, I get this error:
Exception: Java gateway process exited before sending the driver its port number
when sc = SparkContext() is called at startup. I have already tried running the following commands:
./bin/pyspark
./bin/spark-shell
export PYSPARK_SUBMIT_ARGS="--master local[2] pyspark-shell"
to no avail. I have also looked here:
Spark + Python - Java gateway process exited before sending the driver its port number?
but that question was never answered. Please help! Thanks.
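For reference, this is roughly the call that fails; a minimal reproduction sketch, since the exact script is not shown in the question:

# Minimal sketch: creating a SparkContext with the default configuration is
# the point where the gateway error is raised if the JVM cannot be started.
from pyspark import SparkContext

sc = SparkContext()
print(sc.version)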
Posted on 2016-09-03 06:06:10
One possible cause is that JAVA_HOME is not set because Java is not installed.
I ran into the same problem. It said:
Exception in thread "main" java.lang.UnsupportedClassVersionError: org/apache/spark/launcher/Main : Unsupported major.minor version 51.0
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:643)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:277)
    at java.net.URLClassLoader.access$000(URLClassLoader.java:73)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:212)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:296)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
    at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:406)
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/opt/spark/python/pyspark/conf.py", line 104, in __init__
    SparkContext._ensure_initialized()
  File "/opt/spark/python/pyspark/context.py", line 243, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway()
  File "/opt/spark/python/pyspark/java_gateway.py", line 94, in launch_gateway
    raise Exception("Java gateway process exited before sending the driver its port number")
Exception: Java gateway process exited before sending the driver its port number
at sc = pyspark.SparkConf(). I solved it by running:
sudo add-apt-repository ppa:webupd8team/java
sudo apt-get update
sudo apt-get install oracle-java8-installer
which comes from https://www.digitalocean.com/community/tutorials/how-to-install-java-with-apt-get-on-ubuntu-16-04
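To confirm whether this is the cause on your machine (JAVA_HOME unset, or no java binary on the PATH), here is a quick check from Python; a minimal sketch that is not part of the original answer:

import os
import shutil
import subprocess

# Is JAVA_HOME set, and is a java binary reachable on the PATH?
print("JAVA_HOME =", os.environ.get("JAVA_HOME"))
java_bin = shutil.which("java")
print("java on PATH:", java_bin)

if java_bin:
    # Most JDKs print the version banner to stderr.
    result = subprocess.run([java_bin, "-version"], capture_output=True, text=True)
    print(result.stderr.strip())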
Posted on 2016-04-02 08:03:30
This should help you.
One solution is to add pyspark-shell to the shell environment variable PYSPARK_SUBMIT_ARGS:
export PYSPARK_SUBMIT_ARGS="--master local[2] pyspark-shell"
There is a change in python/pyspark/java_gateway.py which requires PYSPARK_SUBMIT_ARGS to include pyspark-shell if a PYSPARK_SUBMIT_ARGS variable is set by a user.
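If you would rather not export it in the shell, the same variable can also be set from Python before the SparkContext is created; a sketch, assuming nothing in the process has initialized Spark yet:

import os

# Must be set before the JVM gateway is launched, i.e. before SparkContext is created.
os.environ["PYSPARK_SUBMIT_ARGS"] = "--master local[2] pyspark-shell"

from pyspark import SparkContext

sc = SparkContext()
print(sc.version)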
Posted on 2018-06-26 05:38:45
I got this error message running pyspark on Ubuntu; installing the openjdk-8-jdk package got rid of it:
from pyspark import SparkConf, SparkContext
sc = SparkContext(conf=SparkConf().setAppName("MyApp").setMaster("local"))
^^^ error
Install Open JDK 8:
apt-get install openjdk-8-jdk-headless -qq
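If Java is installed but Spark still does not pick it up, you can also point PySpark at the JDK explicitly before creating the context; a sketch, where the path is just the usual location of Ubuntu's openjdk-8 package and may differ on your machine:

import os

# Hypothetical example path: adjust to wherever your JDK actually lives.
os.environ["JAVA_HOME"] = "/usr/lib/jvm/java-8-openjdk-amd64"

from pyspark import SparkConf, SparkContext

sc = SparkContext(conf=SparkConf().setAppName("MyApp").setMaster("local"))
print(sc.version)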
The same happened on MacOS; I typed in a terminal:
$ java -version
No Java runtime present, requesting install.
I was prompted to install Java from Oracle's download site; I chose the MacOS installer, clicked on jdk-13.0.2_osx-x64_bin.dmg, and afterwards checked that Java was installed:
$ java -version
java version "13.0.2" 2020-01-14编辑以安装JDK8您需要转到https://www.oracle.com/java/technologies/javase-jdk8-downloads.html (需要登录)
After that, I was able to start a Spark context with pyspark.
Checking that it works:
In Python:
from pyspark import SparkContext 
sc = SparkContext.getOrCreate() 
# check that it really works by running a job
# example from http://spark.apache.org/docs/latest/rdd-programming-guide.html#parallelized-collections
data = range(10000) 
distData = sc.parallelize(data)
distData.filter(lambda x: not x&1).take(10)
# Out: [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
Note that you might need to set the environment variables PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON, and they have to be the same Python (or IPython) version as the one you use to run pyspark (the driver).
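For example, to pin both variables to the interpreter you are currently running; a sketch, to be adjusted if your workers need a different interpreter path:

import os
import sys

# Use the interpreter running this script for both the driver and the workers,
# so their Python versions cannot drift apart.
os.environ["PYSPARK_PYTHON"] = sys.executable
os.environ["PYSPARK_DRIVER_PYTHON"] = sys.executable

from pyspark import SparkContext

sc = SparkContext.getOrCreate()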
https://stackoverflow.com/questions/31841509