Can I update the CUDA version through conda?

Stack Overflow user
Asked 2021-08-03 17:46:45
1 answer · 6.3K views · 0 followers · 2 votes

This is what I currently have:

nvidia-smi
Wed Aug  4 01:40:39 2021       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.79       Driver Version: 410.79       CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla V100-SXM2...  On   | 00000000:00:0C.0 Off |                    0 |
| N/A   34C    P0    37W / 300W |      0MiB / 16130MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla V100-SXM2...  On   | 00000000:00:0D.0 Off |                    0 |
| N/A   34C    P0    36W / 300W |      0MiB / 16130MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla V100-SXM2...  On   | 00000000:00:0E.0 Off |                    0 |
| N/A   33C    P0    39W / 300W |      0MiB / 16130MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla V100-SXM2...  On   | 00000000:00:0F.0 Off |                    0 |
| N/A   37C    P0    41W / 300W |      0MiB / 16130MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

I want to install TensorFlow 2.3/2.4, so I need to upgrade CUDA to at least 10.1 in conda. I know how to install cudatoolkit in conda:

conda install cudatoolkit=10.1

But that does not seem to be enough:

Status: CUDA driver version is insufficient for CUDA runtime version

If I want to keep the older CUDA 10.0, can I update CUDA to 10.1 through conda? This does not work:

conda install cuda=10.1

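I am using Python 3.8. If I cannot keep CUDA 10.0, how can I upgrade directly to CUDA 10.1, with or without conda? Ideally I would upgrade through conda.

Addendum:

I installed cudatoolkit=10.1, but the CUDA driver is still insufficient. My conda list output shows:
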
cudatoolkit               10.1.243             h6bb024c_0  
tensorflow-gpu            2.3.0                    pypi_0    pypi

The following test works:

import tensorflow as tf
2021-08-04 04:21:31.110443: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1

In [3]: print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
2021-08-04 04:21:34.499432: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcuda.so.1
2021-08-04 04:21:34.665738: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-04 04:21:34.666369: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties: 
pciBusID: 0000:00:0c.0 name: Tesla V100-SXM2-16GB computeCapability: 7.0
coreClock: 1.53GHz coreCount: 80 deviceMemorySize: 15.75GiB deviceMemoryBandwidth: 836.37GiB/s
2021-08-04 04:21:34.666459: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-04 04:21:34.667017: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 1 with properties: 
pciBusID: 0000:00:0d.0 name: Tesla V100-SXM2-16GB computeCapability: 7.0
coreClock: 1.53GHz coreCount: 80 deviceMemorySize: 15.75GiB deviceMemoryBandwidth: 836.37GiB/s
2021-08-04 04:21:34.667064: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-04 04:21:34.667613: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 2 with properties: 
pciBusID: 0000:00:0e.0 name: Tesla V100-SXM2-16GB computeCapability: 7.0
coreClock: 1.53GHz coreCount: 80 deviceMemorySize: 15.75GiB deviceMemoryBandwidth: 836.37GiB/s
2021-08-04 04:21:34.667644: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2021-08-04 04:21:34.670275: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
2021-08-04 04:21:34.672971: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2021-08-04 04:21:34.673378: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2021-08-04 04:21:34.676043: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2021-08-04 04:21:34.677370: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10
2021-08-04 04:21:34.681850: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.7
2021-08-04 04:21:34.681989: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-04 04:21:34.682604: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-04 04:21:34.683196: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-04 04:21:34.683782: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-04 04:21:34.684353: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-04 04:21:34.684961: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-04 04:21:34.685513: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0, 1, 2
Num GPUs Available:  3

But the following test fails:

import tensorflow as tf
with tf.device('/gpu:0'):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
    c = tf.matmul(a, b)

with tf.Session() as sess:
    print (sess.run(c))

Error message:

2021-08-04 04:27:30.934969: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0, 1, 2
2021-08-04 04:27:30.935028: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
---------------------------------------------------------------------------
InternalError                             Traceback (most recent call last)
......
InternalError: cudaGetDevice() failed. Status: CUDA driver version is insufficient for CUDA runtime version

If the following statement is correct, why is my installation still broken, given that I have already installed cudatoolkit=10.1 in conda?

If you want to install a GPU driver, you could install a newer CUDA toolkit, which will have a newer GPU driver (installer) bundled with it. 

Do cudatoolkit and the CUDA driver still not match?

1 Answer

Stack Overflow user

Accepted answer

Answered 2021-08-03 19:44:59

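No, you cannot update the GPU driver through conda, and a newer driver is what you need to support CUDA 10.1 or anything later in your case. See here:

Anaconda requires that the user has installed a recent NVIDIA driver that meets the version requirements in the table below.

(the most recent table is here)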
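
To make the version requirement concrete, here is a minimal sketch of the comparison that table implies. The minimum driver numbers are the Linux values from NVIDIA's CUDA release notes for just these three releases; the dictionary and function names are made up for illustration and are not part of conda or CUDA.

```python
# Minimum Linux driver versions for a few CUDA toolkit releases, as listed in
# NVIDIA's CUDA release notes. Only a handful of entries are included here.
MIN_LINUX_DRIVER = {
    "10.0": (410, 48),
    "10.1": (418, 39),
    "10.2": (440, 33),
}


def parse_version(text):
    """Turn a dotted version such as '410.79' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_supports(driver_version, cuda_version):
    """Return True if the driver meets the minimum for the CUDA release.

    Only the major.minor part of cuda_version is used, so '10.1.243'
    is looked up as '10.1'.
    """
    key = ".".join(cuda_version.split(".")[:2])
    if key not in MIN_LINUX_DRIVER:
        raise KeyError(f"no minimum driver recorded for CUDA {key}")
    return parse_version(driver_version) >= MIN_LINUX_DRIVER[key]


if __name__ == "__main__":
    # Driver 410.79 is the version shown by nvidia-smi in the question.
    print(driver_supports("410.79", "10.0"))      # True
    print(driver_supports("410.79", "10.1.243"))  # False
```

With driver 410.79 the check passes for CUDA 10.0 and fails for cudatoolkit 10.1.243, which matches the "CUDA driver version is insufficient for CUDA runtime version" error in the question.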

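If you want to install a GPU driver, you could install a newer CUDA toolkit, which will have a newer GPU driver (installer) bundled with it. Or you can retrieve a driver here and install it. By newer CUDA toolkit, I mean a CUDA toolkit installer provided by NVIDIA (they are available here), not something provided through conda. You cannot do a driver update through conda.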
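
After a driver update, one way to confirm what the system actually reports is to read the banner line that nvidia-smi prints. The sketch below assumes the banner has the same "Driver Version: ... CUDA Version: ..." layout as the output in the question; the helper names are hypothetical. Note that the "CUDA Version" in that banner is the highest CUDA version the driver supports, not the version of any installed toolkit.

```python
import re
import shutil
import subprocess

# Matches the banner line printed by nvidia-smi, e.g.
# "| NVIDIA-SMI 410.79   Driver Version: 410.79   CUDA Version: 10.0 |"
BANNER_RE = re.compile(
    r"Driver Version:\s*(?P<driver>[\d.]+)\s+CUDA Version:\s*(?P<cuda>[\d.]+)"
)


def parse_banner(text):
    """Return (driver_version, max_supported_cuda) or None if not found."""
    match = BANNER_RE.search(text)
    if match is None:
        return None
    return match.group("driver"), match.group("cuda")


def read_versions():
    """Run nvidia-smi and parse its banner; None if it is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi"], capture_output=True, text=True, check=False
    )
    return parse_banner(result.stdout)


if __name__ == "__main__":
    # Prints a (driver, cuda) tuple, or None when nvidia-smi is missing
    # or its banner does not have the expected layout.
    print(read_versions())
```

If the reported driver version is still 410.79 after the update, the new driver did not get installed or loaded, and a conda cudatoolkit of 10.1 will keep failing in the same way.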

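I suggest you study the CUDA Linux installation guide, because the method used to install the previous driver (runfile or package manager) is probably the method you want to use for the next driver.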

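As another alternative (for example, if you do not have or cannot get admin access to the system), you could investigate CUDA forward compatibility. (This may also be of interest for compatibility.)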

Votes: 3
Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/68640658
