Running tensorflow-gpu on a CPU-only host

runzhliu
Published 2020-08-06 10:08:08
Included in the column: 容器计算 (Container Computing)

A tensorflow-gpu image is of course meant to run on a GPU host, but what happens if the container is scheduled onto a host that has no GPU?

# Import tensorflow
# python -c "import tensorflow"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/__init__.py", line 22, in <module>
    from tensorflow.python import pywrap_tensorflow  # pylint: disable=unused-import
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/__init__.py", line 49, in <module>
    from tensorflow.python import pywrap_tensorflow
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 74, in <module>
    raise ImportError(msg)
ImportError: Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
ImportError: libcuda.so.1: cannot open shared object file: No such file or directory


Failed to load the native TensorFlow runtime.

See https://www.tensorflow.org/install/install_sources#common_installation_problems

for some common reasons and solutions.  Include the entire stack trace
above this error message when asking for help.

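A tensorflow-gpu image normally needs a GPU, but what if a user wants to run it on CPU anyway? The requirement is not very reasonable: having chosen tensorflow-gpu, the job should run on a GPU, otherwise what is the point of running it on CPU?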

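Under the current scheduling logic, tasks of this kind are scheduled onto CPU-only machines. Those machines neither have the CUDA libraries installed nor use nvidia-docker, so when import tensorflow runs, a GPU image of this kind inevitably fails to find the CUDA library (libcuda.so.1 in the traceback above) and raises an error.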

# Run this command
# LD_DEBUG=libs python -c "import tensorflow"
ib/x86_64:/usr/lib		(system search path)
       475:	  trying file=/lib/x86_64-linux-gnu/tls/x86_64/libcuda.so.1
       475:	  trying file=/lib/x86_64-linux-gnu/tls/libcuda.so.1
       475:	  trying file=/lib/x86_64-linux-gnu/x86_64/libcuda.so.1
       475:	  trying file=/lib/x86_64-linux-gnu/libcuda.so.1
       475:	  trying file=/usr/lib/x86_64-linux-gnu/tls/x86_64/libcuda.so.1
       475:	  trying file=/usr/lib/x86_64-linux-gnu/tls/libcuda.so.1
       475:	  trying file=/usr/lib/x86_64-linux-gnu/x86_64/libcuda.so.1
       475:	  trying file=/usr/lib/x86_64-linux-gnu/libcuda.so.1
       475:	  trying file=/lib/tls/x86_64/libcuda.so.1
       475:	  trying file=/lib/tls/libcuda.so.1
       475:	  trying file=/lib/x86_64/libcuda.so.1
       475:	  trying file=/lib/libcuda.so.1
       475:	  trying file=/usr/lib/tls/x86_64/libcuda.so.1
       475:	  trying file=/usr/lib/tls/libcuda.so.1
       475:	  trying file=/usr/lib/x86_64/libcuda.so.1
       475:	  trying file=/usr/lib/libcuda.so.1
       475:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/__init__.py", line 22, in <module>
    from tensorflow.python import pywrap_tensorflow  # pylint: disable=unused-import
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/__init__.py", line 49, in <module>
    from tensorflow.python import pywrap_tensorflow
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 74, in <module>
    raise ImportError(msg)
ImportError: Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
ImportError: libcuda.so.1: cannot open shared object file: No such file or directory
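
The loader search shown above can be reproduced from Python without importing tensorflow at all. The following is a minimal sketch, not part of the original setup: the function name cuda_driver_available is hypothetical, and the only assumption is that ctypes.CDLL asks the same dynamic loader to open the library, so it fails on a CPU-only host in the same way the tensorflow import does.

```python
import ctypes


def cuda_driver_available(soname="libcuda.so.1"):
    """Return True if the dynamic loader can open the given shared library.

    cuda_driver_available is a hypothetical helper name. ctypes.CDLL raises
    OSError when the library cannot be found on the loader search path.
    """
    try:
        ctypes.CDLL(soname)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    if cuda_driver_available():
        print("libcuda.so.1 can be loaded on this host")
    else:
        print("libcuda.so.1 cannot be loaded; a tensorflow-gpu build "
              "that links against it is expected to fail on import")
```

Running a check like this at container start would turn the long traceback into a single, readable message, although it does not change where the container is scheduled.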

A more reasonable approach is probably to prevent users from submitting a GPU tensorflow image for a job that is going to run on CPU machines.
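
One way to picture such a guard is a check at submission time. The sketch below is purely illustrative: check_placement, its three parameters, and the rule that an image name containing "gpu" marks a GPU build are all assumptions made for this example, not the platform's actual scheduler fields or logic.

```python
def check_placement(image, gpu_request, node_has_gpu):
    """Return (ok, reason) for a job before it is placed on a node.

    All names here are hypothetical. The image-name test is a naive
    heuristic that assumes GPU builds carry "gpu" in their name or tag.
    """
    needs_gpu = "gpu" in image.lower()
    if needs_gpu and gpu_request == 0:
        return False, "GPU image submitted without a GPU request"
    if needs_gpu and not node_has_gpu:
        return False, "GPU image cannot run on a CPU-only node"
    return True, "ok"


if __name__ == "__main__":
    # A GPU image with no GPU requested is rejected before scheduling.
    print(check_placement("tensorflow/tensorflow:1.4.0-gpu", 0, False))
    # A CPU image on a CPU-only node passes.
    print(check_placement("tensorflow/tensorflow:1.4.0", 0, False))
```

Rejecting the job up front, under these assumptions, means the user sees a clear reason instead of an ImportError from inside the container.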

Originally published 2019/09/02 on the author's personal site/blog.