
Slurm can't find the select/linear plugin

Stack Overflow user
Asked on 2018-01-03 11:40:15

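I have been following this guide to install Slurm on my nodes. So far I have only copied the slurmd example into slurm.conf and run sudo start slurmd as the guide says, which produces the following error output according to journalctl: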

Dec 29 19:16:22 Node_2 slurmd[27681]: error: plugin_load_from_file: dlopen(/usr/lib/slurm/select_linear.so): /usr/lib/slurm/select_linear.so: undefined symbol: slurm_job_preempt_mode
Dec 29 19:16:22 Node_2 slurmd[27681]: error: Couldn't load specified plugin name for select/linear: Dlopen of plugin file failed
Dec 29 19:16:22 Node_2 systemd[1]: slurmd.service: Control process exited, code=exited status=1
Dec 29 19:16:22 Node_2 slurmd[27681]: fatal: Can't find plugin for select/linear
Dec 29 19:16:22 Node_2 systemd[1]: slurmd.service: Failed with result 'exit-code'.
Dec 29 19:16:22 Node_2 systemd[1]: Failed to start Slurm node daemon.

However, I do have this plugin, and I can see it:

sudo ls /usr/lib/slurm/select_linear.so
/usr/lib/slurm/select_linear.so

I also tried sudo slurmd -cDvvvvv and got the following output:

slurmd: error: plugin_load_from_file: dlopen(/usr/lib/slurm/select_linear.so): /usr/lib/slurm/select_linear.so: undefined symbol: slurm_job_preempt_mode
slurmd: error: Couldn't load specified plugin name for select/linear: Dlopen of plugin file failed
slurmd: fatal: Can't find plugin for select/linear

I also tried sudo slurmctld -cDvvvvv, which gave the following:

slurmctld: debug:  Log file re-opened
slurmctld: debug:  creating clustername file: /var/spool/slurm/ctld/clustername
slurmctld: Stack size set to 8388608
slurmctld: slurmctld version 17.11.0 started on cluster linux
slurmctld: debug3: Trying to load plugin /usr/lib/slurm/crypto_munge.so
slurmctld: Munge cryptographic signature plugin loaded
slurmctld: debug3: Success.
slurmctld: debug3: Trying to load plugin /usr/lib/slurm/select_linear.so
slurmctld: debug3: Success.
slurmctld: debug3: Trying to load plugin /usr/lib/slurm/preempt_none.so
slurmctld: preempt/none loaded
slurmctld: debug3: Success.
slurmctld: debug3: Trying to load plugin /usr/lib/slurm/checkpoint_none.so
slurmctld: debug3: Success.
slurmctld: debug:  Checkpoint plugin loaded: checkpoint/none
slurmctld: debug3: Trying to load plugin /usr/lib/slurm/acct_gather_energy_none.so
slurmctld: debug:  AcctGatherEnergy NONE plugin loaded
slurmctld: debug3: Success.
slurmctld: debug3: Trying to load plugin /usr/lib/slurm/acct_gather_profile_none.so
slurmctld: debug:  AcctGatherProfile NONE plugin loaded
slurmctld: debug3: Success.
slurmctld: debug3: Trying to load plugin /usr/lib/slurm/acct_gather_interconnect_none.so
slurmctld: debug:  AcctGatherInterconnect NONE plugin loaded
slurmctld: debug3: Success.
slurmctld: debug3: Trying to load plugin /usr/lib/slurm/acct_gather_filesystem_none.so
slurmctld: debug:  AcctGatherFilesystem NONE plugin loaded
slurmctld: debug3: Success.
slurmctld: debug2: No acct_gather.conf file (/etc/slurm-llnl/acct_gather.conf)
slurmctld: debug3: Trying to load plugin /usr/lib/slurm/jobacct_gather_none.so
slurmctld: debug:  Job accounting gather NOT_INVOKED plugin loaded
slurmctld: debug3: Success.
slurmctld: debug3: Trying to load plugin /usr/lib/slurm/ext_sensors_none.so
slurmctld: ExtSensors NONE plugin loaded
slurmctld: debug3: Success.
slurmctld: debug3: Trying to load plugin /usr/lib/slurm/switch_none.so
slurmctld: debug:  switch NONE plugin loaded
slurmctld: debug3: Success.
slurmctld: debug:  power_save module disabled, SuspendTime < 0
slurmctld: error: this host (Node_2/Node_2) not a valid controller (linux0 or (null))

Do you know what I need to do so that slurmd can find this plugin?


2 Answers

Stack Overflow user

Answered on 2018-01-07 17:42:41

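To me, this looks like a bug in SLURM.

My guess is that the select/linear plugin only makes sense when it is used by slurmctld, not when it is used by slurmd. The slurm_job_preempt_mode symbol is indeed defined in slurmctld, but it is not defined in slurmd.

FWIW, a slightly older version that has the same "missing" symbol in slurmd runs fine on RHEL7, so I guess the behavior depends on the operating system's linker (configuration).

My best advice is to report this directly to the SLURM developers.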
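The claim above about where slurm_job_preempt_mode is defined can be checked by hand. The following is a minimal shell sketch, not part of the original answer: it pulls the plugin path and the missing symbol out of a dlopen error line like the one in the question, and offers a helper that asks nm whether a given file defines that symbol. The function names (missing_symbol, plugin_path, defines_symbol) are hypothetical, and the sketch assumes GNU binutils nm and that the daemons export their symbols through the dynamic symbol table.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; they are not part of Slurm.

# Read log lines on stdin and print the symbol named in any
# "undefined symbol: NAME" dlopen error.
missing_symbol() {
    sed -n 's/.*undefined symbol: \([A-Za-z0-9_]*\).*/\1/p'
}

# Read log lines on stdin and print the file path given to dlopen(...).
plugin_path() {
    sed -n 's/.*dlopen(\([^)]*\)).*/\1/p'
}

# Succeed if file $1 defines dynamic symbol $2.
# Assumes GNU nm; fails quietly if the file or nm is missing.
defines_symbol() {
    nm -D --defined-only "$1" 2>/dev/null | grep -qw "$2"
}

# Demo on the error line quoted in the question:
# prints the missing symbol, then the plugin path.
sample_line='slurmd: error: plugin_load_from_file: dlopen(/usr/lib/slurm/select_linear.so): /usr/lib/slurm/select_linear.so: undefined symbol: slurm_job_preempt_mode'
printf '%s\n' "$sample_line" | missing_symbol
printf '%s\n' "$sample_line" | plugin_path
```

On a node with Slurm installed, calling defines_symbol with the path to the slurmctld binary and then with the path to the slurmd binary, each with slurm_job_preempt_mode as the second argument, would show whether the two daemons differ in the way this answer describes.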


Stack Overflow user

Answered on 2018-04-27 15:01:03

I ran into the same problem with another plugin (lua). My solution was to name the plugin explicitly when running rpmbuild:

rpmbuild --with lua -ta slurm-16.05.7.tar.bz2

Maybe this will help you too. I would try the following:

rpmbuild --with linear -ta slurm-16.05.7.tar.bz2
Original page content provided by Stack Overflow. Translation support provided by Tencent Cloud Xiaowei's IT-domain engine.
Original link:

https://stackoverflow.com/questions/48070840
