
winutils spark windows installation env_variable

Stack Overflow user
Asked on 2016-05-19 00:14:29
2 answers · Viewed 25.1K times · 0 following · 5 votes

I am trying to install Spark 1.6.1 on Windows 10 and so far I have done the following...

  • Downloaded Spark 1.6.1, unzipped it to a directory, and set SPARK_HOME

  • Downloaded Scala 2.11.8, unzipped it to a directory, and set SCALA_HOME

  • Set the HADOOP_HOME environment variable from https://github.com/steveloughran/winutils.git by just downloading the zipped repository, and set the _JAVA_OPTION environment variable. (Not sure this is right, since I could not clone the repository because of a permission-denied error.)

When I go to my Spark home and run bin\spark-shell, I get:

'C:\Program' is not recognized as an internal or external command, operable program or batch file.

I am definitely missing something. I don't see how I can run bash scripts in a Windows environment anyway, but hopefully I don't need to understand that to get this working. I have been following this tutorial -- https://hernandezpaul.wordpress.com/2016/01/24/apache-spark-installation-on-windows-10/. Any help would be appreciated.


2 Answers

Stack Overflow user

Accepted answer

Answered on 2016-05-19 00:17:53

You need to download the winutils executable, not the source code.

You can download it here, or if you really want the entire Hadoop distribution you can find the 2.6.0 binaries here. Then you need to set HADOOP_HOME to the directory containing winutils.exe.

Also, make sure the directory you place Spark in does not contain spaces; this is very important, otherwise it will not work.
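The 'C:\Program' error in the question is exactly this failure mode: an unquoted path containing a space is split at the space, so only the fragment before it is treated as the command. A minimal POSIX-shell sketch (paths here are made up for illustration; the same splitting happens in Windows cmd):

```shell
# Create a fake executable whose path contains a space (hypothetical path).
mkdir -p "/tmp/fake path"
printf '#!/bin/sh\necho ok\n' > "/tmp/fake path/tool"
chmod +x "/tmp/fake path/tool"

CMD="/tmp/fake path/tool"
"$CMD"                                           # quoted: full path used, prints "ok"
$CMD 2>/dev/null || echo "unquoted run failed"   # unquoted: splits at the space, fails
```

This is why installing Spark under a path like C:\Program Files breaks the launcher scripts, while C:\Spark works.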

Once that is set up, instead of launching spark-shell.sh you launch spark-shell.cmd:

C:\Spark\bin>spark-shell
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Using Spark's repl log4j profile: org/apache/spark/log4j-defaults-repl.properties
To adjust logging level use sc.setLogLevel("INFO")
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.6.1
      /_/

Using Scala version 2.10.5 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_91)
Type in expressions to have them evaluated.
Type :help for more information.
Spark context available as sc.
16/05/18 19:31:56 WARN General: Plugin (Bundle) "org.datanucleus" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/C:/Spark/lib/datanucleus-core-3.2.10.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/C:/Spark/bin/../lib/datanucleus-core-3.2.10.jar."
16/05/18 19:31:56 WARN General: Plugin (Bundle) "org.datanucleus.api.jdo" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/C:/Spark/lib/datanucleus-api-jdo-3.2.6.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/C:/Spark/bin/../lib/datanucleus-api-jdo-3.2.6.jar."
16/05/18 19:31:56 WARN General: Plugin (Bundle) "org.datanucleus.store.rdbms" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/C:/Spark/lib/datanucleus-rdbms-3.2.9.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/C:/Spark/bin/../lib/datanucleus-rdbms-3.2.9.jar."
16/05/18 19:31:56 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/05/18 19:31:56 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/05/18 19:32:01 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
16/05/18 19:32:01 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
16/05/18 19:32:07 WARN General: Plugin (Bundle) "org.datanucleus" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/C:/Spark/lib/datanucleus-core-3.2.10.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/C:/Spark/bin/../lib/datanucleus-core-3.2.10.jar."
16/05/18 19:32:07 WARN General: Plugin (Bundle) "org.datanucleus.api.jdo" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/C:/Spark/lib/datanucleus-api-jdo-3.2.6.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/C:/Spark/bin/../lib/datanucleus-api-jdo-3.2.6.jar."
16/05/18 19:32:07 WARN General: Plugin (Bundle) "org.datanucleus.store.rdbms" is already registered. Ensure you dont have multiple JAR versions of the same plugin in the classpath. The URL "file:/C:/Spark/lib/datanucleus-rdbms-3.2.9.jar" is already registered, and you are trying to register an identical plugin located at URL "file:/C:/Spark/bin/../lib/datanucleus-rdbms-3.2.9.jar."
16/05/18 19:32:07 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/05/18 19:32:08 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/05/18 19:32:12 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
16/05/18 19:32:12 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
SQL context available as sqlContext.

scala>
Votes: 8

Stack Overflow user

Answered on 2020-07-22 03:13:30

On Windows, you need to explicitly specify the location of the hadoop binaries.

Below are the steps to set up a standalone Spark/Scala application.

  1. Download winutils.exe and place it in a bin folder under some directory, e.g. c:\hadoop\bin

The full path then looks like c:\hadoop\bin\winutils.exe

  2. Now, when creating the SparkSession, we need to specify this path. Refer to the code snippet below:

package com.test.config

import org.apache.spark.sql.SparkSession

object Spark2Config extends Serializable {
  System.setProperty("hadoop.home.dir", "C:\\hadoop")
  // The original snippet is truncated after "val spark ="; a typical completion:
  val spark = SparkSession.builder().master("local[*]").appName("Spark2Config").getOrCreate()
}
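Equivalently, instead of setting hadoop.home.dir in application code, HADOOP_HOME can be set in the environment before launching. A sketch of the directory layout Spark expects (POSIX shell for illustration, paths hypothetical; on Windows this would be `set HADOOP_HOME=C:\hadoop`):

```shell
# HADOOP_HOME must point at the directory whose bin/ subfolder holds winutils.exe.
export HADOOP_HOME="/tmp/hadoop"        # hypothetical; C:\hadoop on Windows
mkdir -p "$HADOOP_HOME/bin"
touch "$HADOOP_HOME/bin/winutils.exe"   # stand-in for the real binary

# Sanity check of the layout before launching Spark:
if [ -e "$HADOOP_HOME/bin/winutils.exe" ]; then
  echo "winutils found"
fi
```

The environment-variable approach keeps the hadoop path out of source code, which is useful when the same build runs on machines with different install locations.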

Votes: 0
The original page content is from Stack Overflow.
Original link:

https://stackoverflow.com/questions/37305001
