我试图在EMR上安装conda,下面是我的引导脚本,看起来conda正在安装,但它没有添加到环境变量中。当我在EMR主节点上手动更新$PATH
变量时,它可以识别conda
。我想用conda治疗齐柏林飞艇。
在启动EMR实例时,我还尝试在配置中添加以下条件,但是仍然得到了下面提到的错误。
"classification": "spark-env",
"properties": {
"conda": "/home/hadoop/conda/bin"
}
[hadoop@ip-172-30-5-150 ~]$ PATH=/home/hadoop/conda/bin:$PATH
[hadoop@ip-172-30-5-150 ~]$ conda
usage: conda [-h] [-V] command ...
conda is a tool for managing and deploying applications, environments and packages.
#!/usr/bin/env bash
# Install conda
wget https://repo.continuum.io/miniconda/Miniconda3-4.2.12-Linux-x86_64.sh -O /home/hadoop/miniconda.sh \
&& /bin/bash ~/miniconda.sh -b -p $HOME/conda
conda config --set always_yes yes --set changeps1 no
conda install conda=4.2.13
conda config -f --add channels conda-forge
rm ~/miniconda.sh
echo bootstrap_conda.sh completed. PATH now: $PATH
export PYSPARK_PYTHON="/home/hadoop/conda/bin/python3.5"
echo -e '\nexport PATH=$HOME/conda/bin:$PATH' >> $HOME/.bashrc && source $HOME/.bashrc
conda create -n zoo python=3.7 # "zoo" is conda environment name, you can use any name you like.
conda activate zoo
sudo pip3 install tensorflow
sudo pip3 install boto3
sudo pip3 install botocore
sudo pip3 install numpy
sudo pip3 install pandas
sudo pip3 install scipy
sudo pip3 install s3fs
sudo pip3 install matplotlib
sudo pip3 install -U tqdm
sudo pip3 install -U scikit-learn
sudo pip3 install -U scikit-multilearn
sudo pip3 install xlutils
sudo pip3 install natsort
sudo pip3 install pydot
sudo pip3 install python-pydot
sudo pip3 install python-pydot-ng
sudo pip3 install pydotplus
sudo pip3 install h5py
sudo pip3 install graphviz
sudo pip3 install recmetrics
sudo pip3 install openpyxl
sudo pip3 install xlrd
sudo pip3 install xlwt
sudo pip3 install tensorflow.io
sudo pip3 install Cython
sudo pip3 install ray
sudo pip3 install zoo
sudo pip3 install analytics-zoo
sudo pip3 install analytics-zoo[ray]
#sudo /usr/bin/pip-3.6 install -U imbalanced-learn
发布于 2022-02-05 00:12:03
我通过修改脚本使conda工作,如下所示,emr python版本与conda版本发生了冲突。
wget https://repo.anaconda.com/miniconda/Miniconda3-py37_4.9.2-Linux-x86_64.sh -O /home/hadoop/miniconda.sh \
&& /bin/bash ~/miniconda.sh -b -p $HOME/conda
echo -e '\n export PATH=$HOME/conda/bin:$PATH' >> $HOME/.bashrc && source $HOME/.bashrc
conda config --set always_yes yes --set changeps1 no
conda config -f --add channels conda-forge
conda create -n zoo python=3.7 # "zoo" is conda environment name
conda init bash
source activate zoo
conda install python 3.7.0 -c conda-forge orca
sudo /home/hadoop/conda/envs/zoo/bin/python3.7 -m pip install virtualenv
并将齐柏林蟒和火种参数设置为:
“spark.pyspark.python": "/home/hadoop/conda/envs/zoo/bin/python3",
"spark.pyspark.virtualenv.enabled": "true",
"spark.pyspark.virtualenv.type":"native",
"spark.pyspark.virtualenv.bin.path":"/home/hadoop/conda/envs/zoo/bin/,
"zeppelin.pyspark.python" : "/home/hadoop/conda/bin/python",
"zeppelin.python": "/home/hadoop/conda/bin/python"
Orca只支持TF到1.5,因此它没有工作,因为我使用的是TF2。
https://stackoverflow.com/questions/70901724
复制相似问题