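I'm having trouble running a Python script on a compute cluster; apologies in advance if this is a naive mistake. I'm not sure whether the problem comes from misconfiguring my own conda virtual environment, but it still reproduces when I run the following command: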
srun -p use-everything --pty python test.py
I get this error:
Traceback (most recent call last):
  File "test.py", line 4, in <module>
    from acme.agents.tf import dqn
  File "/om2/user/armas/anaconda/envs/dist_rl/lib/python3.7/site-packages/acme/agents/tf/dqn/__init__.py", line 18, in <module>
    from acme.agents.tf.dqn.agent import DQN
  File "/om2/user/armas/anaconda/envs/dist_rl/lib/python3.7/site-packages/acme/agents/tf/dqn/agent.py", line 20, in <module>
    from acme import datasets
  File "/om2/user/armas/anaconda/envs/dist_rl/lib/python3.7/site-packages/acme/datasets/__init__.py", line 17, in <module>
    from acme.datasets.reverb import make_reverb_dataset
  File "/om2/user/armas/anaconda/envs/dist_rl/lib/python3.7/site-packages/acme/datasets/reverb.py", line 22, in <module>
    from acme.adders import reverb as adders
  File "/om2/user/armas/anaconda/envs/dist_rl/lib/python3.7/site-packages/acme/adders/reverb/__init__.py", line 21, in <module>
    from acme.adders.reverb.base import DEFAULT_PRIORITY_TABLE
  File "/om2/user/armas/anaconda/envs/dist_rl/lib/python3.7/site-packages/acme/adders/reverb/base.py", line 26, in <module>
    import reverb
  File "/om2/user/armas/anaconda/envs/dist_rl/lib/python3.7/site-packages/reverb/__init__.py", line 27, in <module>
    from reverb import item_selectors as selectors
  File "/om2/user/armas/anaconda/envs/dist_rl/lib/python3.7/site-packages/reverb/item_selectors.py", line 19, in <module>
    from reverb import pybind
  File "/om2/user/armas/anaconda/envs/dist_rl/lib/python3.7/site-packages/reverb/pybind.py", line 1, in <module>
    import tensorflow as _tf; from .libpybind import *; del _tf
ImportError: libpython3.7m.so.1.0: cannot open shared object file: No such file or directory
srun: error: node014: task 0: Exited with exit code 1
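The ImportError says the dynamic loader could not find libpython3.7m.so.1.0 when loading a compiled extension. As a rough diagnostic sketch (not part of the original post), the interpreter itself can report where it expects its shared library to live. The helper name expected_libpython is hypothetical, and the config variables may be None or name a static archive on some builds:

```python
import os
import sysconfig


def expected_libpython():
    """Return (directory, filename) this interpreter reports for its libpython."""
    libdir = sysconfig.get_config_var("LIBDIR")
    # INSTSONAME is normally the installed shared-library name on Linux builds;
    # fall back to LDLIBRARY if it is not set. Either may be None on some builds.
    soname = (sysconfig.get_config_var("INSTSONAME")
              or sysconfig.get_config_var("LDLIBRARY"))
    return libdir, soname


if __name__ == "__main__":
    libdir, soname = expected_libpython()
    print("LIBDIR:", libdir)
    print("library name:", soname)
    if libdir and soname:
        # Check whether the file is actually present in that directory.
        print("exists:", os.path.isfile(os.path.join(libdir, soname)))
```

Running this through the same srun command as the failing script should show whether the file exists on the compute node, which may differ from the login node.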
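I ran into the same problem on my local machine when running inside the virtual environment, and there I fixed it simply with the following command:
...
Here is some other information that may help.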
$which libpython
/usr/bin/which: no libpython in (/om2/user/armas/anaconda/envs/dist_rl/bin:/om2/user/armas/anaconda/bin:/om2/user/armas/anaconda/condabin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin)
$echo $PATH
/om2/user/armas/anaconda/envs/dist_rl/bin:/om2/user/armas/anaconda/bin:/om2/user/armas/anaconda/condabin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin
$echo $LD_LIBRARY_PATH
/om2/user/armas/anaconda/bin/
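Note that which only searches PATH for executables, so it would not be expected to find a shared library. Also, the LD_LIBRARY_PATH shown points at anaconda/bin/, while shared libraries in a conda environment usually live under the environment's lib directory. The sketch below (the helper name find_shared_lib is hypothetical) checks which directories in a colon-separated search path actually contain a given library file. It only approximates the loader, which also consults other locations such as rpath entries and the ld.so cache:

```python
import os


def find_shared_lib(soname, search_path):
    """Return the directories in a colon-separated path that contain soname."""
    hits = []
    for directory in search_path.split(":"):
        # Skip empty entries produced by leading, trailing, or doubled colons.
        if directory and os.path.isfile(os.path.join(directory, soname)):
            hits.append(directory)
    return hits


if __name__ == "__main__":
    soname = "libpython3.7m.so.1.0"  # name taken from the ImportError message
    path = os.environ.get("LD_LIBRARY_PATH", "")
    print(find_shared_lib(soname, path) or "not found in LD_LIBRARY_PATH")
```

An empty result for the current LD_LIBRARY_PATH would be consistent with the error in the traceback.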
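When I change my ..., i.e. ..., and then run the script, my anaconda thinks I don't have jax installed. I ran pip install dm-acme jax, and now when I run the script it tells me there is no module named atari_py. I think it is leading me down a dependency chain.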
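Rather than discovering missing packages one failed import at a time, a short check can list them all at once. This is only a sketch: the helper name missing_modules is hypothetical, and the module list is taken from the names that appear in this question. It reports whether a module can be located, not whether importing it succeeds, so a module whose compiled extension fails to load would still be reported as found:

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Top-level module names mentioned in the question; extend as needed.
    print(missing_modules(["tensorflow", "reverb", "jax", "atari_py"]))
```

Running it with the same interpreter and environment as the failing script (for example through srun) shows which packages that environment actually sees.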
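I installed acme by following this link, but inside a conda environment. My system administrator said acme may not be designed to work with Anaconda. If that is the case, why would that be?
Please let me know if I've left anything out and I'll add it. Thanks again!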
Posted on 2020-09-02 14:53:06
Try this:
sudo apt-get install libpython3.7
https://stackoverflow.com/questions/62945126