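I originally wanted to use ceph-fuse, but the mount kept failing with "mount can't read super block". I had only installed the ceph-fuse RPM on the host, yet kubelet was mounting CephFS with the kernel client, which was the last thing I expected. So how do you tell kubelet to use ceph-fuse for CephFS mounts? I googled for a long time and nobody seemed to cover this small problem. In the end I gave up and went straight to the source: kubernetes/pkg/volume/cephfs/cephfs.go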
    // check whether it belongs to fuse, if not, default to use kernel mount.
    if cephfsVolume.checkFuseMount() {
        klog.V(4).Info("CephFS fuse mount.")
        err = cephfsVolume.execFuseMount(dir)
        // cleanup no matter if fuse mount fail.
        keyringPath := cephfsVolume.GetKeyringPath()
        _, StatErr := os.Stat(keyringPath)
        if !os.IsNotExist(StatErr) {
            os.RemoveAll(keyringPath)
        }
        if err == nil {
            // cephfs fuse mount succeeded.
            return nil
        }
        // if cephfs fuse mount failed, fallback to kernel mount.
        klog.V(2).Infof("CephFS fuse mount failed: %v, fallback to kernel mount.", err)
    }
    klog.V(4).Info("CephFS kernel mount.")
    err = cephfsVolume.execMount(dir)
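That makes it clear. kubelet first checks for ceph-fuse. If the check passes, it mounts with ceph-fuse; if the check fails, or the fuse mount itself fails, it retries with the kernel mount. In other words, kubelet was using the kernel mount because either the ceph-fuse check or the ceph-fuse mount had failed. Let's see how the check works: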
    func (cephfsVolume *cephfsMounter) checkFuseMount() bool {
        execute := cephfsVolume.plugin.host.GetExec(cephfsVolume.plugin.GetPluginName())
        switch runtime.GOOS {
        case "linux":
            if _, err := execute.Command("/usr/bin/test", "-x", "/sbin/mount.fuse.ceph").CombinedOutput(); err == nil {
                klog.V(4).Info("/sbin/mount.fuse.ceph exists, it should be fuse mount.")
                return true
            }
            return false
        }
        return false
    }
It only tests whether /sbin/mount.fuse.ceph is executable. So the check was not the problem; on to the logs:
kubelet: I0424 17:09:38.366917 7829 reconciler.go:252] operationExecutor.MountVolume started for volume "xxx" (UniqueName: "kubernetes.io/cephfs/xxxx") pod "xxxx" (UID: "xxx")
kubelet: I0424 17:09:38.632672 7829 cephfs.go:259] CephFS fuse mount failed: Ceph-fuse failed: signal: aborted
kubelet: arguments: [-k /data/kubelet/pods/9db514c0-0d0b-4343-ad0f-ff71abe9d38d/volumes/kubernetes.io~cephfs/xxx
-pro~keyring/admin.keyring -m xxxx
olumes/kubernetes.io~cephfs/xxxx -r /production --id admin]
kubelet: Output: 2020-04-24 17:09:38.375590 7fa188e78ec0 -1 did not load config file, using default settings.
kubelet: 2020-04-24 17:09:38.393668 7fa188e78ec0 -1 init, newargv = 0x557071540720 newargc=11
kubelet: ceph-fuse[1256508]: starting ceph client
kubelet: ceph-fuse[1256508]: starting fuse
kubelet: , fallback to kernel mount.
systemd: Started Kubernetes transient mount for /data/kubelet/pods/xxx/volumes/kubernetes.io~cephfs/xx.
kernel: libceph: mon1 xxx session established
kernel: libceph: xx fsid xxx
kernel: ceph: problem parsing mds trace -5
kernel: ceph: mds parse_reply err -5
kernel: ceph: mdsc_handle_reply got corrupt reply mds0(tid:1)
kubelet: E0424 17:09:38.967287 7829 mount_linux.go:140] Mount failed: exit status 32
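According to the logs, the ceph-fuse mount aborted and the kernel mount that kubelet fell back to failed as well, yet mounting by hand with ceph-fuse worked fine. The only likely difference was the arguments (-k, -m and so on), so my guess was that the old ceph-fuse version did not support them and that upgrading ceph-fuse to the latest version would be enough. That turned out to be the case. I did not expect this many pitfalls here.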