
Installing NVIDIA Apex for Python 3.8.5, compatible with PyTorch 1.9

Stack Overflow user
Asked 2021-09-14 01:04:33
2 answers, 5.6K views, 0 followers, 3 votes

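The code I am running apparently requires NVIDIA Apex (I did not know this at first and installed the wrong apex package). I do not know how to fix the final error: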

(proxy) [jalal@goku proxynca_pp]$ CUDA_VISIBLE_DEVICES=0,1 python train.py --dataset cub  --config config/cub.json --mode train --apex --seed 0
(1024, 4096)
train.py:12: MatplotlibDeprecationWarning: The 'warn' parameter of use() is deprecated since Matplotlib 3.1 and will be removed in 3.3.  If any parameter follows 'warn', they should be pass as keyword, not positionally.
  matplotlib.use('agg', warn=False, force=True)
Traceback (most recent call last):
  File "train.py", line 70, in <module>
    from apex import amp
  File "/scratch3/venv/proxy/lib/python3.8/site-packages/apex/__init__.py", line 13, in <module>
    from pyramid.session import UnencryptedCookieSessionFactoryConfig
ImportError: cannot import name 'UnencryptedCookieSessionFactoryConfig' from 'pyramid.session' (unknown location)
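
The traceback shows `apex/__init__.py` importing from `pyramid`, which suggests the installed `apex` is not NVIDIA's library. A minimal sketch for checking which `apex` distribution is present without importing it, using the standard-library `importlib.metadata` (Python 3.8+). The function names are hypothetical, and the version string `0.9.10.dev0` is simply the one pip reports in this post; the exact string returned on another machine may differ.

```python
from importlib import metadata

# Version that pip reports for the unrelated package in this post (assumption:
# metadata.version returns the same string that pip printed).
WRONG_APEX_VERSION = "0.9.10.dev0"


def installed_apex_version():
    """Return the version of the installed 'apex' distribution, or None."""
    try:
        return metadata.version("apex")
    except metadata.PackageNotFoundError:
        return None


def is_wrong_apex(version):
    """True if the version matches the package seen in this post's pip output."""
    return version == WRONG_APEX_VERSION


version = installed_apex_version()
if version is None:
    print("no 'apex' distribution installed")
elif is_wrong_apex(version):
    print("installed 'apex' matches the unrelated package from this post")
else:
    print("installed 'apex' version:", version)
```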

After getting the error above, I tried this answer: https://stackoverflow.com/a/67188946/2414957

(proxy) [jalal@goku proxynca_pp]$ pip uninstall apex
Found existing installation: apex 0.9.10.dev0
Uninstalling apex-0.9.10.dev0:
  Would remove:
    /scratch3/venv/proxy/lib/python3.8/site-packages/apex-0.9.10.dev0-py3.8.egg-info
    /scratch3/venv/proxy/lib/python3.8/site-packages/apex/*
Proceed (Y/n)? y
  Successfully uninstalled apex-0.9.10.dev0
(proxy) [jalal@goku proxynca_pp]$ git clone https://github.com/NVIDIA/apex
Cloning into 'apex'...
remote: Enumerating objects: 8256, done.
remote: Counting objects: 100% (343/343), done.
remote: Compressing objects: 100% (192/192), done.
remote: Total 8256 (delta 204), reused 240 (delta 139), pack-reused 7913
Receiving objects: 100% (8256/8256), 14.20 MiB | 0 bytes/s, done.
Resolving deltas: 100% (5605/5605), done.
(proxy) [jalal@goku proxynca_pp]$ cd apex
(proxy) [jalal@goku apex]$ pip install -v --disable-pip-version-check --no-cache-dir \
> --global-option="--cpp_ext" --global-option="--cuda_ext" ./
/scratch3/venv/proxy/lib/python3.8/site-packages/pip/_internal/commands/install.py:229: UserWarning: Disabling all use of wheels due to the use of --build-option / --global-option / --install-option.
  cmdoptions.check_install_build_global(options)
Using pip 21.2.4 from /scratch3/venv/proxy/lib/python3.8/site-packages/pip (python 3.8)
Processing /scratch3/research/code/fashion/proxynca_pp/apex
  DEPRECATION: A future pip version will change local packages to be built in-place without first copying to a temporary directory. We recommend you use --use-feature=in-tree-build to test your packages with this new behavior before it becomes the default.
   pip 21.3 will remove support for this functionality. You can find discussion regarding this at https://github.com/pypa/pip/issues/7555.
    Running command python setup.py egg_info


    torch.__version__  = 1.9.0+cu111


    running egg_info
    creating /scratch/tmp/pip-pip-egg-info-yc32vm37/apex.egg-info
    writing /scratch/tmp/pip-pip-egg-info-yc32vm37/apex.egg-info/PKG-INFO
    writing dependency_links to /scratch/tmp/pip-pip-egg-info-yc32vm37/apex.egg-info/dependency_links.txt
    writing top-level names to /scratch/tmp/pip-pip-egg-info-yc32vm37/apex.egg-info/top_level.txt
    writing manifest file '/scratch/tmp/pip-pip-egg-info-yc32vm37/apex.egg-info/SOURCES.txt'
    reading manifest file '/scratch/tmp/pip-pip-egg-info-yc32vm37/apex.egg-info/SOURCES.txt'
    writing manifest file '/scratch/tmp/pip-pip-egg-info-yc32vm37/apex.egg-info/SOURCES.txt'
    /scratch/tmp/pip-req-build-fg_khhkt/setup.py:67: UserWarning: Option --pyprof not specified. Not installing PyProf dependencies!
      warnings.warn("Option --pyprof not specified. Not installing PyProf dependencies!")
Skipping wheel build for apex, due to binaries being disabled for it.
Installing collected packages: apex
    Running command /scratch3/venv/proxy/bin/python3.8 -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/scratch/tmp/pip-req-build-fg_khhkt/setup.py'"'"'; __file__='"'"'/scratch/tmp/pip-req-build-fg_khhkt/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' --cpp_ext --cuda_ext install --record /scratch/tmp/pip-record-u812zb2v/install-record.txt --single-version-externally-managed --compile --install-headers /scratch3/venv/proxy/include/site/python3.8/apex


    torch.__version__  = 1.9.0+cu111


    /scratch/tmp/pip-req-build-fg_khhkt/setup.py:67: UserWarning: Option --pyprof not specified. Not installing PyProf dependencies!
      warnings.warn("Option --pyprof not specified. Not installing PyProf dependencies!")

    Compiling cuda extensions with
    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2018 NVIDIA Corporation
    Built on Sat_Aug_25_21:08:01_CDT_2018
    Cuda compilation tools, release 10.0, V10.0.130
    from /usr/local/cuda-10.0/bin

    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/scratch/tmp/pip-req-build-fg_khhkt/setup.py", line 159, in <module>
        check_cuda_torch_binary_vs_bare_metal(CUDA_HOME)
      File "/scratch/tmp/pip-req-build-fg_khhkt/setup.py", line 99, in check_cuda_torch_binary_vs_bare_metal
        raise RuntimeError("Cuda extensions are being compiled with a version of Cuda that does " +
    RuntimeError: Cuda extensions are being compiled with a version of Cuda that does not match the version used to compile Pytorch binaries.  Pytorch binaries were compiled with Cuda 11.1.
    In some cases, a minor-version mismatch will not cause later errors:  https://github.com/NVIDIA/apex/pull/323#discussion_r287021798.  You can try commenting out this check (at your own risk).
    Running setup.py install for apex ... error
ERROR: Command errored out with exit status 1: /scratch3/venv/proxy/bin/python3.8 -u -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/scratch/tmp/pip-req-build-fg_khhkt/setup.py'"'"'; __file__='"'"'/scratch/tmp/pip-req-build-fg_khhkt/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' --cpp_ext --cuda_ext install --record /scratch/tmp/pip-record-u812zb2v/install-record.txt --single-version-externally-managed --compile --install-headers /scratch3/venv/proxy/include/site/python3.8/apex Check the logs for full command output.

I have these packages installed:

(proxy) [jalal@goku apex]$ pip freeze
anykeystore==0.2
certifi==2021.5.30
charset-normalizer==2.0.4
cryptacular==1.6.2
cycler==0.10.0
defusedxml==0.7.1
greenlet==1.1.1
h5py==3.4.0
hupper==1.10.3
idna==3.2
joblib==1.0.1
kiwisolver==1.3.2
MarkupSafe==2.0.1
matplotlib==3.2.0
numpy==1.21.2
oauthlib==3.1.1
PasteDeploy==2.1.1
pbkdf2==1.3
Pillow==8.3.2
plaster==1.0
plaster-pastedeploy==0.7
pyparsing==2.4.7
pyramid==2.0
pyramid-mailer==0.15.1
python-dateutil==2.8.2
python3-openid==3.2.0
repoze.sendmail==4.4.1
requests==2.26.0
requests-oauthlib==1.3.0
scikit-learn==0.24.2
scipy==1.7.1
six==1.16.0
sklearn==0.0
SQLAlchemy==1.4.23
threadpoolctl==2.2.0
torch==1.9.0+cu111
torchaudio==0.9.0
torchvision==0.10.0+cu111
tqdm==4.62.2
transaction==3.0.1
translationstring==1.4
typing-extensions==3.10.0.2
urllib3==1.26.6
velruse==1.1.1
venusian==3.0.0
WebOb==1.8.7
WTForms==2.3.3
wtforms-recaptcha==0.3.2
zope.deprecation==4.4.0
zope.interface==5.4.0
zope.sqlalchemy==1.6

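The code is from this GitHub repo.

Edit: I got the steps from a Stack Overflow answer (linked above). I do not know how to find the right link or installation procedure that is compatible with PyTorch 1.9.

FYI, the git repo has no installation instructions, so I am installing things blindly.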

Stack Overflow user
Answered 2021-09-14 06:18:57

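Your CUDA toolkit appears to be version 10.0, while your PyTorch was built against CUDA 11.1. That is probably what it is complaining about.

From the error: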

Compiling cuda extensions with
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130
from /usr/local/cuda-10.0/bin

RuntimeError: Cuda extensions are being compiled with a version of Cuda that does not match the version used to compile Pytorch binaries. 
Pytorch binaries were compiled with Cuda 11.1.

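Can you try to make sure both versions are the same?

  1. If you already have CUDA 11.1 installed, export its path: export CUDA_HOME=/usr/local/cuda-11.1/
  2. Otherwise, install a PyTorch build for CUDA 10.
  3. As a last resort, you can remove just the minor-version check, for example when you have CUDA 10.0 installed but PyTorch was built with 10.2.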

setup.py

if (bare_metal_major != torch_binary_major):  # or (bare_metal_minor != torch_binary_minor)
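
The comparison described above can be sketched in a few lines. This is a minimal illustration, not Apex's actual `setup.py` code: the function names and the `check_minor` parameter are hypothetical, and the inputs are the `nvcc` line from the log in this post and a version string in the form that `torch.version.cuda` returns.

```python
import re


def parse_nvcc_release(nvcc_output):
    """Extract (major, minor) from the text printed by `nvcc -V`."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        raise ValueError("no 'release X.Y' found in nvcc output")
    return int(match.group(1)), int(match.group(2))


def parse_torch_cuda(torch_cuda):
    """Split a string such as '11.1' (the form of torch.version.cuda)."""
    major, minor = torch_cuda.split(".")[:2]
    return int(major), int(minor)


def cuda_versions_match(nvcc_output, torch_cuda, check_minor=True):
    """Compare the toolkit version with the version PyTorch was built with.

    check_minor=False mimics the relaxed check from option 3 (hypothetical flag).
    """
    nvcc_major, nvcc_minor = parse_nvcc_release(nvcc_output)
    torch_major, torch_minor = parse_torch_cuda(torch_cuda)
    if nvcc_major != torch_major:
        return False
    return (nvcc_minor == torch_minor) or not check_minor


# Line taken from the build log in this post.
NVCC_LOG = "Cuda compilation tools, release 10.0, V10.0.130"

print(cuda_versions_match(NVCC_LOG, "11.1"))                     # False
print(cuda_versions_match(NVCC_LOG, "11.1", check_minor=False))  # False
print(cuda_versions_match(NVCC_LOG, "10.2", check_minor=False))  # True
```

On a real machine the first argument would be the output of running `nvcc -V` from `CUDA_HOME`, and the second would be `torch.version.cuda`. The second call shows that relaxing the minor check does not help with the setup in the question, where the major versions differ (10 versus 11).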
Votes: -1
Original content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/69170666
