py3study
Published 2020-01-13 00:00:39
Included in the column: python3

Configuring IPython Notebook to Run Python Spark Programs

1.1 Install Anaconda

Download the installer that matches your platform from the official Anaconda website.

1.1.1 Download Anaconda

$ cd /opt/local/src/
$ wget -c https://repo.anaconda.com/archive/Anaconda3-5.2.0-Linux-x86_64.sh

1.1.2 Install Anaconda

# -b runs the installer in batch mode; -p sets the installation directory
$ bash Anaconda3-5.2.0-Linux-x86_64.sh -p /opt/local/anaconda -b

1.1.3 Configure the Anaconda environment variables

  • Set the environment variables
$ tail -n 8 ~/.bashrc

# Anaconda3
export ANACONDA_PATH=/opt/local/anaconda
export PATH=$ANACONDA_PATH/bin:$PATH

# PySpark
export PYSPARK_DRIVER_PYTHON=$ANACONDA_PATH/bin/ipython
export PYSPARK_PYTHON=$ANACONDA_PATH/bin/python
  • Load the environment variables
$ source ~/.bashrc
  • Verify
$ python --version
Python 3.6.5 :: Anaconda, Inc.
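
Beyond checking the Python version, it can help to confirm that the variables from ~/.bashrc are actually visible to the current shell. The following is a minimal sketch, not part of the original setup; the helper name `check_pyspark_env` is hypothetical, and it only assumes the three variable names shown in the listing above.

```python
import os

# Variable names taken from the ~/.bashrc listing above.
REQUIRED_VARS = ("ANACONDA_PATH", "PYSPARK_DRIVER_PYTHON", "PYSPARK_PYTHON")


def check_pyspark_env(env=None):
    """Return a list of problems found in the PySpark-related variables."""
    env = os.environ if env is None else env
    # Report every required variable that is missing or empty.
    problems = [f"{name} is not set" for name in REQUIRED_VARS if not env.get(name)]

    prefix = env.get("ANACONDA_PATH")
    if prefix:
        bin_dir = os.path.join(prefix, "bin")
        # The Anaconda bin directory should be one of the PATH entries.
        if bin_dir not in env.get("PATH", "").split(os.pathsep):
            problems.append(f"{bin_dir} is not on PATH")
        # Both interpreters are expected to live inside that bin directory.
        for name in ("PYSPARK_DRIVER_PYTHON", "PYSPARK_PYTHON"):
            value = env.get(name)
            if value and not value.startswith(bin_dir + os.sep):
                problems.append(f"{name} does not point into {bin_dir}")
    return problems
```

Calling `check_pyspark_env()` with no argument inspects the current process environment; an empty list means nothing suspicious was found.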

1.2 Using PySpark in IPython Notebook

1.2.1 Create a working directory

$ mkdir  ~/ipynotebook
$ cd ~/ipynotebook

1.2.2 Run PySpark in IPython Notebook

  • Start IPython Notebook
$ PYSPARK_DRIVER_PYTHON=ipython PYSPARK_DRIVER_PYTHON_OPTS="notebook" pyspark
[TerminalIPythonApp] WARNING | Subcommand `ipython notebook` is deprecated and will be removed in future versions.
[TerminalIPythonApp] WARNING | You likely want to use `jupyter notebook` in the future
[I 14:21:56.030 NotebookApp] JupyterLab beta preview extension loaded from /opt/local/anaconda/lib/python3.6/site-packages/jupyterlab
[I 14:21:56.030 NotebookApp] JupyterLab application directory is /opt/local/anaconda/share/jupyter/lab
[I 14:21:56.037 NotebookApp] Serving notebooks from local directory: /home/hadoop/ipynotebook
[I 14:21:56.037 NotebookApp] 0 active kernels
[I 14:21:56.037 NotebookApp] The Jupyter Notebook is running at:
[I 14:21:56.037 NotebookApp] http://localhost:8888/?token=5b68718fdabe4488decf07703a3bd76bf46d5dc733a6617d
[I 14:21:56.037 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 14:21:56.040 NotebookApp] 

    Copy/paste this URL into your browser when you connect for the first time,
    to login with a token:
        http://localhost:8888/?token=5b68718fdabe4488decf07703a3bd76bf46d5dc733a6617d&token=5b68718fdabe4488decf07703a3bd76bf46d5dc733a6617d
[I 14:21:56.683 NotebookApp] Accepting one-time-token-authenticated connection from 127.0.0.1

The page http://localhost:8888 then opens automatically in the default browser.
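
If the browser does not open by itself (for example on a remote machine), the login URL and token can be read from the startup log. The sketch below is only an illustration; the function name `extract_notebook_url` is hypothetical, and it assumes the log format shown above, where the URL carries a `token` query parameter.

```python
import re
from urllib.parse import parse_qs, urlsplit


def extract_notebook_url(log_text):
    """Return (base_url, token) from the first log line with a token URL."""
    for line in log_text.splitlines():
        match = re.search(r"https?://\S+", line)
        if not match:
            continue
        parts = urlsplit(match.group(0))
        # parse_qs returns a list per key; the log may repeat the token.
        tokens = parse_qs(parts.query).get("token")
        if tokens:
            return f"{parts.scheme}://{parts.netloc}/", tokens[0]
    return None
```

The function returns `None` when no line contains a URL with a token.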

  • Write the program in IPython Notebook
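
As an illustration of what such a notebook cell might contain, here is a minimal word-count sketch using the RDD API. It is not the program from the original article; the function name `word_count` is hypothetical, and it assumes that `sc` is the SparkContext that the `pyspark` launcher normally predefines in the session.

```python
from operator import add


def word_count(sc, lines):
    """Count word occurrences in `lines` using the RDD API of `sc`."""
    counts = (
        sc.parallelize(lines)                      # distribute the input lines
        .flatMap(lambda line: line.split())        # split each line into words
        .map(lambda word: (word, 1))               # pair every word with a count of 1
        .reduceByKey(add)                          # sum the counts per word
    )
    return dict(counts.collect())


# In a notebook started through `pyspark`, `sc` is assumed to exist already:
# word_count(sc, ["spark on yarn", "spark standalone"])
```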

1.2.3 Run PySpark from IPython Notebook on Hadoop YARN

  • Start IPython Notebook
$ PYSPARK_DRIVER_PYTHON=ipython PYSPARK_DRIVER_PYTHON_OPTS="notebook" HADOOP_CONF_DIR=/opt/local/hadoop/etc/hadoop MASTER=yarn-client pyspark
[TerminalIPythonApp] WARNING | Subcommand `ipython notebook` is deprecated and will be removed in future versions.
[TerminalIPythonApp] WARNING | You likely want to use `jupyter notebook` in the future
[I 14:50:48.149 NotebookApp] JupyterLab beta preview extension loaded from /opt/local/anaconda/lib/python3.6/site-packages/jupyterlab
[I 14:50:48.149 NotebookApp] JupyterLab application directory is /opt/local/anaconda/share/jupyter/lab
[I 14:50:48.157 NotebookApp] Serving notebooks from local directory: /home/hadoop/ipynotebook
[I 14:50:48.157 NotebookApp] 0 active kernels
[I 14:50:48.157 NotebookApp] The Jupyter Notebook is running at:
[I 14:50:48.157 NotebookApp] http://localhost:8888/?token=8fe2c599dc39a23104dd6a058a0e05de3d9e88cfeda71b45
[I 14:50:48.157 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 14:50:48.161 NotebookApp] 

    Copy/paste this URL into your browser when you connect for the first time,
    to login with a token:
        http://localhost:8888/?token=8fe2c599dc39a23104dd6a058a0e05de3d9e88cfeda71b45&token=8fe2c599dc39a23104dd6a058a0e05de3d9e88cfeda71b45
  • Write the program in IPython Notebook
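
Inside the notebook, one way to confirm which cluster manager the session is attached to is to inspect the SparkContext. This is a small sketch rather than the author's program; the helper name `describe_context` is hypothetical, and `sc` is assumed to be the SparkContext predefined by the `pyspark` launcher.

```python
def describe_context(sc):
    """Collect a few identifying properties of a SparkContext."""
    return {
        "master": sc.master,                      # master URL the context uses
        "app_name": sc.appName,                   # application name
        "application_id": sc.applicationId,       # ID assigned by the cluster manager
        "spark_version": sc.version,              # Spark version string
        "default_parallelism": sc.defaultParallelism,
    }


# In a notebook cell: describe_context(sc)
```

On YARN, the `application_id` value should correspond to the Application-Id column reported by `yarn application -list`.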
  • Check the job in YARN
$ yarn application -list
18/06/24 14:53:06 INFO client.RMProxy: Connecting to ResourceManager at node/192.168.20.10:8032
Total number of applications (application-types: [] and states: [SUBMITTED, ACCEPTED, RUNNING]):1
                Application-Id      Application-Name        Application-Type          User       Queue               State         Final-State         Progress                        Tracking-URL
application_1529805293111_0001          PySparkShell                   SPARK        hadoop     default             RUNNING           UNDEFINED              10%                    http://node:4040
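
For scripted checks, the listing above can be turned into records. The sketch below is an assumption-laden illustration: the function name `parse_yarn_application_list` and the dictionary keys are hypothetical, and the parser assumes that no field (in particular the application name) contains whitespace.

```python
# Keys mirror the nine columns of the `yarn application -list` header above.
YARN_COLUMNS = (
    "id", "name", "type", "user", "queue",
    "state", "final_state", "progress", "tracking_url",
)


def parse_yarn_application_list(output):
    """Parse the rows of `yarn application -list` into dictionaries."""
    apps = []
    for line in output.splitlines():
        fields = line.split()
        # Data rows start with an application ID and have one field per column.
        if len(fields) == len(YARN_COLUMNS) and fields[0].startswith("application_"):
            apps.append(dict(zip(YARN_COLUMNS, fields)))
    return apps
```

The output could be obtained with `subprocess.run(["yarn", "application", "-list"], capture_output=True, text=True).stdout` and passed to the function.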

1.2.4 Run PySpark from IPython Notebook on Spark Standalone

  • Start Spark Standalone
$ /opt/local/spark/sbin/start-master.sh

$ /opt/local/spark/sbin/start-slaves.sh

$ jps
13249 Jps
13027 Master
13188 Worker
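
The `jps` check above can also be automated. The following sketch is illustrative only; the function names `running_jvm_processes` and `missing_daemons` are hypothetical, and the code assumes the two-column `pid name` format shown in the transcript.

```python
def running_jvm_processes(jps_output):
    """Map each process name in `jps` output to its list of PIDs."""
    processes = {}
    for line in jps_output.splitlines():
        parts = line.split(maxsplit=1)
        # Keep only lines of the form "<pid> <name>".
        if len(parts) == 2 and parts[0].isdigit():
            processes.setdefault(parts[1].strip(), []).append(int(parts[0]))
    return processes


def missing_daemons(jps_output, required=("Master", "Worker")):
    """Return the required daemon names that do not appear in the output."""
    running = running_jvm_processes(jps_output)
    return [name for name in required if name not in running]
```

An empty list from `missing_daemons` indicates that both a Master and a Worker process are running on the machine.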
  • Start IPython Notebook
$ PYSPARK_DRIVER_PYTHON=ipython PYSPARK_DRIVER_PYTHON_OPTS="notebook" MASTER=spark://node:7077 pyspark --num-executors 1 --total-executor-cores 1 --executor-memory 512m 
[TerminalIPythonApp] WARNING | Subcommand `ipython notebook` is deprecated and will be removed in future versions.
[TerminalIPythonApp] WARNING | You likely want to use `jupyter notebook` in the future
[I 15:11:59.211 NotebookApp] JupyterLab beta preview extension loaded from /opt/local/anaconda/lib/python3.6/site-packages/jupyterlab
[I 15:11:59.212 NotebookApp] JupyterLab application directory is /opt/local/anaconda/share/jupyter/lab
[I 15:11:59.230 NotebookApp] Serving notebooks from local directory: /home/hadoop/ipynotebook
[I 15:11:59.230 NotebookApp] 0 active kernels
[I 15:11:59.230 NotebookApp] The Jupyter Notebook is running at:
[I 15:11:59.230 NotebookApp] http://localhost:8888/?token=1972eb523fea28d541985df7ed2ce55cc2bfada7e31eb9ea
[I 15:11:59.230 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 15:11:59.233 NotebookApp] 

    Copy/paste this URL into your browser when you connect for the first time,
    to login with a token:
        http://localhost:8888/?token=1972eb523fea28d541985df7ed2ce55cc2bfada7e31eb9ea&token=1972eb523fea28d541985df7ed2ce55cc2bfada7e31eb9ea
[I 15:12:02.594 NotebookApp] Accepting one-time-token-authenticated connection from 127.0.0.1
  • Write the program in IPython Notebook
  • View the Spark Standalone web UI

1.3 Summary

Before starting IPython Notebook, change into its working directory, for example ~/ipynotebook; adjust the path to your own setup.

1.3.1 Start IPython Notebook in local mode

PYSPARK_DRIVER_PYTHON=ipython PYSPARK_DRIVER_PYTHON_OPTS="notebook" pyspark
#### or
PYSPARK_DRIVER_PYTHON=ipython PYSPARK_DRIVER_PYTHON_OPTS="notebook" pyspark --master local[*]

1.3.2 Start IPython Notebook on Hadoop YARN

PYSPARK_DRIVER_PYTHON=ipython PYSPARK_DRIVER_PYTHON_OPTS="notebook" HADOOP_CONF_DIR=/opt/local/hadoop/etc/hadoop MASTER=yarn-client pyspark
#### or
PYSPARK_DRIVER_PYTHON=ipython PYSPARK_DRIVER_PYTHON_OPTS="notebook" HADOOP_CONF_DIR=/opt/local/hadoop/etc/hadoop pyspark --master yarn --deploy-mode client

1.3.3 Start IPython Notebook on Spark Standalone

PYSPARK_DRIVER_PYTHON=ipython PYSPARK_DRIVER_PYTHON_OPTS="notebook" MASTER=spark://node:7077 pyspark --num-executors 1 --total-executor-cores 1 --executor-memory 512m 
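
The three launch variants in this summary differ only in a few environment variables and arguments, so they can be assembled by one helper. This is a sketch under stated assumptions: the function name `pyspark_notebook_launch` and its parameters are hypothetical, and for Standalone it passes the master through `--master` instead of the `MASTER` variable used above.

```python
def pyspark_notebook_launch(mode, master_url=None, hadoop_conf_dir=None, extra_args=()):
    """Return (extra_env, argv) for starting the notebook in the given mode."""
    # Both variables are set on every command line in this summary.
    env = {
        "PYSPARK_DRIVER_PYTHON": "ipython",
        "PYSPARK_DRIVER_PYTHON_OPTS": "notebook",
    }
    if mode == "local":
        argv = ["pyspark", "--master", "local[*]"]
    elif mode == "yarn":
        if not hadoop_conf_dir:
            raise ValueError("yarn mode needs hadoop_conf_dir")
        env["HADOOP_CONF_DIR"] = hadoop_conf_dir
        argv = ["pyspark", "--master", "yarn", "--deploy-mode", "client"]
    elif mode == "standalone":
        if not master_url:
            raise ValueError("standalone mode needs master_url")
        argv = ["pyspark", "--master", master_url]
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return env, argv + list(extra_args)
```

A possible use is `env, argv = pyspark_notebook_launch("standalone", master_url="spark://node:7077")` followed by `subprocess.run(argv, env={**os.environ, **env}, cwd=os.path.expanduser("~/ipynotebook"))`, which needs `import os, subprocess`.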
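
This article is part of the Tencent Cloud self-media sharing program and is shared from the author's personal site/blog.
Originally published: 2019-08-26. In case of infringement, contact cloudcommunity@tencent.com for removal.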