Linux Device Driver Model - Uevent

Preface

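When a device is added to a running system (for example, when a USB drive is plugged into a PC), the driver has to detect the insertion and report the event to user space, which then handles it appropriately, for example by loading the USB driver or updating the UI. Delivering the event to user space requires a notification mechanism; the typical user-space handlers are mdev (hotplug) and udev, both of which were explained in the previous article. Linux implements the uevent mechanism on top of the device model, through the kobject_uevent function.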

The earlier section on ksets introduced the interface for registering a kset. It is repeated here as a refresher.

/**
 * kset_register - initialize and add a kset.
 * @k: kset.
 */
int kset_register(struct kset *k)
{
	int err;

	if (!k)
		return -EINVAL;

	kset_init(k);
	err = kobject_add_internal(&k->kobj);
	if (err)
		return err;
	kobject_uevent(&k->kobj, KOBJ_ADD);
	return 0;
}

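As shown, kset_register calls kobject_uevent to send an event whose action is KOBJ_ADD. This is a key difference between a kobject and a kset: registering a kset always reports the event to user space through kobject_uevent, whereas a standalone kobject that neither belongs to a kset nor has an ancestor that does cannot send events to user space through the uevent mechanism.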
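
The rule in the previous paragraph (an event can be sent only if the kobject or one of its ancestors belongs to a kset) can be modelled in user space. The following is a simplified sketch, not kernel code: toy_kset, toy_kobject and toy_find_kset are hypothetical names, and the structures keep only the parent and kset pointers.

```c
#include <stddef.h>

/* Hypothetical, stripped-down stand-ins for struct kset and struct kobject. */
struct toy_kset {
	const char *name;
};

struct toy_kobject {
	const char *name;
	struct toy_kobject *parent;	/* NULL at the top of the hierarchy */
	struct toy_kset *kset;		/* NULL if the object is in no kset */
};

/*
 * Walk up the parent chain until an object that belongs to a kset is found.
 * Returns that kset, or NULL when neither the object nor any ancestor has
 * one, which is the case in which no uevent can be sent.
 */
struct toy_kset *toy_find_kset(struct toy_kobject *kobj)
{
	struct toy_kobject *top = kobj;

	while (!top->kset && top->parent)
		top = top->parent;

	return top->kset;
}
```

A child object with no kset of its own still resolves to its parent's kset, while an object with neither a kset nor a parent yields NULL.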

Data Structures

struct kset_uevent_ops {
	int (* const filter)(struct kset *kset, struct kobject *kobj);
	const char *(* const name)(struct kset *kset, struct kobject *kobj);
	int (* const uevent)(struct kset *kset, struct kobject *kobj,
		      struct kobj_uevent_env *env);
};

kset_uevent_ops is the set of uevent callbacks that a kset provides.

filter: called when a uevent is about to be reported, so that the kset can block events it does not want reported. A return value of 0 drops the event.

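name: returns the name to report as the subsystem. If this callback is not provided, the name of the kset itself is used; if no name is available, the uevent is not reported.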

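uevent: called so that the kset can do its own private processing before the event is sent, typically adding kset-specific environment variables.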
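
How the filter and name callbacks decide whether an event goes out can be sketched in user space. This is an illustrative model under simplifying assumptions, not the kernel API: toy_uevent_ops, toy_event_subsystem and the two example callbacks are hypothetical, and objects are represented by their names only.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space model of the filter and name callbacks. */
struct toy_uevent_ops {
	int (*filter)(const char *kobj_name);		/* 0 means: drop the event */
	const char *(*name)(const char *kobj_name);	/* subsystem name, or NULL */
};

/*
 * Returns the subsystem string the event would carry, or NULL if the
 * event is dropped. ops, and each callback in it, may be NULL.
 */
const char *toy_event_subsystem(const struct toy_uevent_ops *ops,
				const char *kset_name,
				const char *kobj_name)
{
	if (ops && ops->filter && !ops->filter(kobj_name))
		return NULL;			/* filtered out by the kset */

	if (ops && ops->name)
		return ops->name(kobj_name);	/* NULL here also drops the event */

	return kset_name;			/* fall back to the kset's own name */
}

/* Example filter: only report objects whose name starts with "sd". */
int toy_block_filter(const char *kobj_name)
{
	return strncmp(kobj_name, "sd", 2) == 0;
}

/* Example name callback: every reported object is in subsystem "block". */
const char *toy_block_name(const char *kobj_name)
{
	(void)kobj_name;
	return "block";
}
```

With these example callbacks, an object named sda is reported under the subsystem block, while loop0 is filtered out; with no callbacks at all, the kset name is used as the subsystem.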

struct kobj_uevent_env {
	char *argv[3];
	char *envp[UEVENT_NUM_ENVP];
	int envp_idx;
	char buf[UEVENT_BUFFER_SIZE];
	int buflen;
};

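argv: the argument vector passed to the uevent_helper program.

envp: holds the address of each environment variable; at most 32 entries (UEVENT_NUM_ENVP) are supported.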

envp_idx: the index used to access envp, that is, the next free slot.

buf: the buffer that stores the environment strings; at most 2048 bytes (UEVENT_BUFFER_SIZE).

buflen: the offset used to access buf, that is, the number of bytes already in use.
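
The way envp, envp_idx, buf and buflen work together can be illustrated with a user-space sketch of the bookkeeping that add_uevent_var performs. It is an approximation written for this article, not the kernel source: toy_uevent_env and toy_add_var are hypothetical names, and the limits simply reuse the 32 and 2048 quoted above.

```c
#include <stdarg.h>
#include <stdio.h>

#define TOY_NUM_ENVP	32	/* mirrors the limits quoted above */
#define TOY_BUFFER_SIZE	2048

/* Hypothetical user-space copy of the envp/buf bookkeeping. */
struct toy_uevent_env {
	char *envp[TOY_NUM_ENVP];
	int envp_idx;
	char buf[TOY_BUFFER_SIZE];
	int buflen;
};

/*
 * Format one "KEY=value" string into the free part of buf, record its
 * address in envp and advance both cursors. Returns 0 on success and -1
 * if either the pointer array or the buffer is full.
 */
int toy_add_var(struct toy_uevent_env *env, const char *fmt, ...)
{
	va_list args;
	int room = TOY_BUFFER_SIZE - env->buflen;
	int len;

	if (env->envp_idx >= TOY_NUM_ENVP)
		return -1;

	va_start(args, fmt);
	len = vsnprintf(&env->buf[env->buflen], (size_t)room, fmt, args);
	va_end(args);

	if (len < 0 || len >= room)
		return -1;	/* the string would not fit */

	env->envp[env->envp_idx++] = &env->buf[env->buflen];
	env->buflen += len + 1;	/* keep the terminating NUL */
	return 0;
}
```

Each successful call leaves one more NUL-terminated string packed into buf and one more pointer in envp, so the strings end up back to back in a single buffer.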

Code Analysis

/**
 * kobject_uevent - notify userspace by sending an uevent
 *
 * @action: action that is happening
 * @kobj: struct kobject that the action is happening to
 *
 * Returns 0 if kobject_uevent() is completed with success or the
 * corresponding error when it fails.
 */
int kobject_uevent(struct kobject *kobj, enum kobject_action action)
{
	return kobject_uevent_env(kobj, action, NULL);
}

As the comment says, this function notifies user space by sending a uevent. action is the type of event that has occurred, and it is an enumeration, shown below:

enum kobject_action {
	KOBJ_ADD,             
	KOBJ_REMOVE,
	KOBJ_CHANGE,
	KOBJ_MOVE,
	KOBJ_ONLINE,
	KOBJ_OFFLINE,
	KOBJ_MAX
};

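KOBJ_ADD/KOBJ_REMOVE: the object was added or removed.

KOBJ_CHANGE/KOBJ_MOVE: the state of the object changed, or the object was moved.

KOBJ_ONLINE/KOBJ_OFFLINE: the object came online or went offline.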
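
In the kernel, the enum value indexes the kobject_actions string table, and the resulting lowercase word is what user space sees in the ACTION variable. The mapping can be sketched as follows; the toy_ names are hypothetical, and only the six strings themselves are taken from the kernel.

```c
#include <stddef.h>
#include <string.h>

enum toy_action {
	TOY_ADD,
	TOY_REMOVE,
	TOY_CHANGE,
	TOY_MOVE,
	TOY_ONLINE,
	TOY_OFFLINE,
	TOY_MAX
};

/* Same strings, in the same order, as the kernel's action names. */
static const char *const toy_action_names[TOY_MAX] = {
	[TOY_ADD]     = "add",
	[TOY_REMOVE]  = "remove",
	[TOY_CHANGE]  = "change",
	[TOY_MOVE]    = "move",
	[TOY_ONLINE]  = "online",
	[TOY_OFFLINE] = "offline",
};

/* Enum value to string; NULL for an out-of-range value. */
const char *toy_action_string(enum toy_action action)
{
	if ((unsigned int)action >= TOY_MAX)
		return NULL;
	return toy_action_names[action];
}

/* String to enum value; TOY_MAX when the string is not a known action. */
enum toy_action toy_action_from_string(const char *s)
{
	for (int i = 0; i < TOY_MAX; i++) {
		if (strcmp(s, toy_action_names[i]) == 0)
			return (enum toy_action)i;
	}
	return TOY_MAX;
}
```

A user-space handler would typically use the reverse direction, turning the ACTION string it receives back into a value it can switch on.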

/**
 * kobject_uevent_env - send an uevent with environmental data
 *
 * @action: action that is happening
 * @kobj: struct kobject that the action is happening to
 * @envp_ext: pointer to environmental data
 *
 * Returns 0 if kobject_uevent_env() is completed with success or the
 * corresponding error when it fails.
 */
int kobject_uevent_env(struct kobject *kobj, enum kobject_action action,
		       char *envp_ext[])
{
	struct kobj_uevent_env *env;
	const char *action_string = kobject_actions[action];
	const char *devpath = NULL;
	const char *subsystem;
	struct kobject *top_kobj;
	struct kset *kset;
	const struct kset_uevent_ops *uevent_ops;
	int i = 0;
	int retval = 0;
#ifdef CONFIG_NET
	struct uevent_sock *ue_sk;
#endif

	pr_debug("kobject: '%s' (%p): %s\n",
		 kobject_name(kobj), kobj, __func__);

	/* search the kset we belong to */
	top_kobj = kobj;
	while (!top_kobj->kset && top_kobj->parent)            // walk up the parent chain to find the kset this kobj belongs to
		top_kobj = top_kobj->parent;

	if (!top_kobj->kset) {                                 // a uevent can only be sent if a kset exists
		pr_debug("kobject: '%s' (%p): %s: attempted to send uevent "
			 "without kset!\n", kobject_name(kobj), kobj,
			 __func__);
		return -EINVAL;
	}

	kset = top_kobj->kset;                                // take the uevent_ops of the kset that was found
	uevent_ops = kset->uevent_ops;

	/* skip the event, if uevent_suppress is set*/
	if (kobj->uevent_suppress) {                           // if uevent_suppress is 1, no uevent is sent
		pr_debug("kobject: '%s' (%p): %s: uevent_suppress "
				 "caused the event to drop!\n",
				 kobject_name(kobj), kobj, __func__);
		return 0;
	}
	/* skip the event, if the filter returns zero. */
	if (uevent_ops && uevent_ops->filter)                 // apply the filter; a return value of 0 means the kset filtered out this event
		if (!uevent_ops->filter(kset, kobj)) {
			pr_debug("kobject: '%s' (%p): %s: filter function "
				 "caused the event to drop!\n",
				 kobject_name(kobj), kobj, __func__);
			return 0;
		}

	/* originating subsystem */
	if (uevent_ops && uevent_ops->name)                     // use the name callback to set the subsystem
		subsystem = uevent_ops->name(kset, kobj);
	else
		subsystem = kobject_name(&kset->kobj);
	if (!subsystem) {
		pr_debug("kobject: '%s' (%p): %s: unset subsystem caused the "
			 "event to drop!\n", kobject_name(kobj), kobj,
			 __func__);
		return 0;
	}

	/* environment buffer */
	env = kzalloc(sizeof(struct kobj_uevent_env), GFP_KERNEL);      // allocate the environment buffer
	if (!env)
		return -ENOMEM;

	/* complete object path */
	devpath = kobject_get_path(kobj, GFP_KERNEL);                 // get the path of this kobject
	if (!devpath) {
		retval = -ENOENT;
		goto exit;
	}

	/* default keys */
	retval = add_uevent_var(env, "ACTION=%s", action_string);         // add ACTION, DEVPATH and SUBSYSTEM to the environment buffer
	if (retval)
		goto exit;
	retval = add_uevent_var(env, "DEVPATH=%s", devpath);
	if (retval)
		goto exit;
	retval = add_uevent_var(env, "SUBSYSTEM=%s", subsystem);
	if (retval)
		goto exit;

	/* keys passed in from the caller */
	if (envp_ext) {                                                       // add the variables supplied by the caller
		for (i = 0; envp_ext[i]; i++) {
			retval = add_uevent_var(env, "%s", envp_ext[i]);
			if (retval)
				goto exit;
		}
	}

	/* let the kset specific function add its stuff */
	if (uevent_ops && uevent_ops->uevent) {
		retval = uevent_ops->uevent(kset, kobj, env);
		if (retval) {
			pr_debug("kobject: '%s' (%p): %s: uevent() returned "
				 "%d\n", kobject_name(kobj), kobj,
				 __func__, retval);
			goto exit;
		}
	}

	/*
	 * Mark "add" and "remove" events in the object to ensure proper
	 * events to userspace during automatic cleanup. If the object did
	 * send an "add" event, "remove" will automatically generated by
	 * the core, if not already done by the caller.
	 */
	if (action == KOBJ_ADD)
		kobj->state_add_uevent_sent = 1;
	else if (action == KOBJ_REMOVE)
		kobj->state_remove_uevent_sent = 1;

	mutex_lock(&uevent_sock_mutex);
	/* we will send an event, so request a new sequence number */
	retval = add_uevent_var(env, "SEQNUM=%llu", (unsigned long long)++uevent_seqnum);
	if (retval) {
		mutex_unlock(&uevent_sock_mutex);
		goto exit;
	}

#if defined(CONFIG_NET)            // with CONFIG_NET enabled, the uevent is sent over netlink
	/* send netlink message */
	list_for_each_entry(ue_sk, &uevent_sock_list, list) {
		struct sock *uevent_sock = ue_sk->sk;
		struct sk_buff *skb;
		size_t len;

		if (!netlink_has_listeners(uevent_sock, 1))
			continue;

		/* allocate message with the maximum possible size */
		len = strlen(action_string) + strlen(devpath) + 2;
		skb = alloc_skb(len + env->buflen, GFP_KERNEL);
		if (skb) {
			char *scratch;

			/* add header */
			scratch = skb_put(skb, len);
			sprintf(scratch, "%s@%s", action_string, devpath);

			/* copy keys to our continuous event payload buffer */
			for (i = 0; i < env->envp_idx; i++) {
				len = strlen(env->envp[i]) + 1;
				scratch = skb_put(skb, len);
				strcpy(scratch, env->envp[i]);
			}

			NETLINK_CB(skb).dst_group = 1;
			retval = netlink_broadcast_filtered(uevent_sock, skb,
							    0, 1, GFP_KERNEL,
							    kobj_bcast_filter,
							    kobj);
			/* ENOBUFS should be handled in userspace */
			if (retval == -ENOBUFS || retval == -ESRCH)
				retval = 0;
		} else
			retval = -ENOMEM;
	}
#endif
	mutex_unlock(&uevent_sock_mutex);

#ifdef CONFIG_UEVENT_HELPER             // with CONFIG_UEVENT_HELPER enabled, the uevent is also passed to uevent_helper
	/* call uevent_helper, usually only enabled during early boot */
	if (uevent_helper[0] && !kobj_usermode_filter(kobj)) {
		struct subprocess_info *info;

		retval = add_uevent_var(env, "HOME=/");
		if (retval)
			goto exit;
		retval = add_uevent_var(env,
					"PATH=/sbin:/bin:/usr/sbin:/usr/bin");
		if (retval)
			goto exit;
		retval = init_uevent_argv(env, subsystem);
		if (retval)
			goto exit;

		retval = -ENOMEM;
		info = call_usermodehelper_setup(env->argv[0], env->argv,
						 env->envp, GFP_KERNEL,
						 NULL, cleanup_uevent_env, env);
		if (info) {
			retval = call_usermodehelper_exec(info, UMH_NO_WAIT);
			env = NULL;	/* freed by cleanup_uevent_env */
		}
	}
#endif

exit:
	kfree(devpath);
	kfree(env);
	return retval;
}
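
The netlink branch above builds a payload that starts with an "action@devpath" header and is followed by each environment string, every one terminated by a NUL byte. A user-space listener, which would receive this buffer from a NETLINK_KOBJECT_UEVENT socket, has to walk those NUL-separated fields. The sketch below shows only that parsing step; toy_uevent_get is a hypothetical helper written for illustration, and it assumes the first field is always the header.

```c
#include <stddef.h>
#include <string.h>

/*
 * Look up KEY in a uevent payload of the form
 *   "action@devpath\0KEY1=val1\0KEY2=val2\0..."
 * Returns a pointer to the value inside msg, or NULL if the key is
 * absent or the message is truncated. The first field (the header)
 * is skipped.
 */
const char *toy_uevent_get(const char *msg, size_t len, const char *key)
{
	size_t keylen = strlen(key);
	size_t pos = 0;
	int is_header = 1;

	while (pos < len) {
		const char *field = msg + pos;
		const char *end = memchr(field, '\0', len - pos);

		if (!end)
			return NULL;	/* last field has no terminating NUL */

		if (!is_header &&
		    strncmp(field, key, keylen) == 0 && field[keylen] == '=')
			return field + keylen + 1;

		is_header = 0;
		pos = (size_t)(end - msg) + 1;	/* step past the NUL */
	}
	return NULL;
}
```

The length argument matters here: the payload contains embedded NUL bytes, so it cannot be treated as a single C string.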

The uevent_helper Mechanism

The kernel currently supports two delivery paths, netlink and uevent_helper. This section focuses on how uevent_helper is implemented.

uevent_helper is defined as follows:

#ifdef CONFIG_UEVENT_HELPER
char uevent_helper[UEVENT_HELPER_PATH_LEN] = CONFIG_UEVENT_HELPER_PATH;
#endif

CONFIG_UEVENT_HELPER_PATH can be found in the kernel's config file:

CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug"

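So the helper is /sbin/hotplug. But which program does this hotplug actually correspond to? On an embedded device, a startup script under the etc directory typically contains lines like these: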

echo /sbin/mdev >/proc/sys/kernel/hotplug
/sbin/mdev -s

Writing to /proc/sys/kernel/hotplug replaces the helper path at run time, so uevent_helper ends up invoking /sbin/mdev.
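
A helper such as mdev learns about the event through the environment it is started with, which the kernel fills with ACTION, DEVPATH, SUBSYSTEM and the other variables collected in env->envp. The sketch below shows how a helper might read them. It is a hypothetical illustration and not how mdev itself is written: toy_env_lookup and toy_is_add_event are invented names, and the environment is passed in as an array so that the logic can be exercised without a real hotplug event.

```c
#include <stddef.h>
#include <string.h>

/*
 * Find KEY in a NULL-terminated array of "KEY=value" strings, the same
 * shape as the envp array a helper program is started with.
 * Returns the value, or NULL if the key is not present.
 */
const char *toy_env_lookup(char *const envp[], const char *key)
{
	size_t keylen = strlen(key);

	for (size_t i = 0; envp[i]; i++) {
		if (strncmp(envp[i], key, keylen) == 0 &&
		    envp[i][keylen] == '=')
			return envp[i] + keylen + 1;
	}
	return NULL;
}

/* Returns 1 if envp describes an "add" event for the given subsystem. */
int toy_is_add_event(char *const envp[], const char *subsystem)
{
	const char *action = toy_env_lookup(envp, "ACTION");
	const char *subsys = toy_env_lookup(envp, "SUBSYSTEM");

	return action && subsys &&
	       strcmp(action, "add") == 0 &&
	       strcmp(subsys, subsystem) == 0;
}
```

A real helper would receive this array as its process environment and then act on the result, for example by creating or removing a device node.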

Now return to kobject_uevent_env and continue the analysis.

First comes the preparation that is done before the user-mode helper is invoked:

struct subprocess_info *call_usermodehelper_setup(char *path, char **argv,
		char **envp, gfp_t gfp_mask,
		int (*init)(struct subprocess_info *info, struct cred *new),
		void (*cleanup)(struct subprocess_info *info),
		void *data)
{
	struct subprocess_info *sub_info;
	sub_info = kzalloc(sizeof(struct subprocess_info), gfp_mask);
	if (!sub_info)
		goto out;

	INIT_WORK(&sub_info->work, __call_usermodehelper);                // initialize a work item whose handler is __call_usermodehelper
	sub_info->path = path;                                            // fill in subprocess_info from the arguments
	sub_info->argv = argv;
	sub_info->envp = envp;

	sub_info->cleanup = cleanup;
	sub_info->init = init;
	sub_info->data = data;
  out:
	return sub_info;
}

Next, call_usermodehelper_exec is called to start the user-mode program.

int call_usermodehelper_exec(struct subprocess_info *sub_info, int wait)
{
	DECLARE_COMPLETION_ONSTACK(done);                            // declare and initialize a completion on the stack
	int retval = 0;

	if (!sub_info->path) {                                      // without a path, clean up and return an error
		call_usermodehelper_freeinfo(sub_info);
		return -EINVAL;
	}
	helper_lock();                                               // increment the atomic counter running_helpers
	if (!khelper_wq || usermodehelper_disabled) {                // khelper_wq does not exist, or usermodehelper is disabled
		retval = -EBUSY;
		goto out;
	}
	/*
	 * Worker thread must not wait for khelper thread at below
	 * wait_for_completion() if the thread was created with CLONE_VFORK
	 * flag, for khelper thread is already waiting for the thread at
	 * wait_for_completion() in do_fork().
	 */
	if (wait != UMH_NO_WAIT && current == kmod_thread_locker) {     
		retval = -EBUSY;
		goto out;
	}

	/*
	 * Set the completion pointer only if there is a waiter.
	 * This makes it possible to use umh_complete to free
	 * the data structure in case of UMH_NO_WAIT.
	 */
	sub_info->complete = (wait == UMH_NO_WAIT) ? NULL : &done;
	sub_info->wait = wait;

	queue_work(khelper_wq, &sub_info->work);                               // queue the work item on the workqueue
	if (wait == UMH_NO_WAIT)	/* task has freed sub_info */          // with UMH_NO_WAIT, return right away
		goto unlock;

	if (wait & UMH_KILLABLE) {                                
		retval = wait_for_completion_killable(&done);                   // killable wait: a fatal signal can interrupt it
		if (!retval)
			goto wait_done;

		/* umh_complete() will see NULL and free sub_info */
		if (xchg(&sub_info->complete, NULL))
			goto unlock;
		/* fallthrough, umh_complete() was already called */
	}

	wait_for_completion(&done);                                      // in all other cases, block until the helper side calls complete()
wait_done:
	retval = sub_info->retval;
out:
	call_usermodehelper_freeinfo(sub_info);
unlock:
	helper_unlock();
	return retval;
}

When is the waiter woken up? Once the work item queued on the workqueue has been carried out, complete is triggered and the waiter wakes up.

/* This is run by khelper thread  */
static void __call_usermodehelper(struct work_struct *work)
{
	struct subprocess_info *sub_info =
		container_of(work, struct subprocess_info, work);
	int wait = sub_info->wait & ~UMH_KILLABLE;
	pid_t pid;

	/* CLONE_VFORK: wait until the usermode helper has execve'd
	 * successfully We need the data structures to stay around
	 * until that is done.  */
	if (wait == UMH_WAIT_PROC)
		pid = kernel_thread(wait_for_helper, sub_info,
				    CLONE_FS | CLONE_FILES | SIGCHLD);
	else {
		pid = kernel_thread(call_helper, sub_info,
				    CLONE_VFORK | SIGCHLD);
		/* Worker thread stopped blocking khelper thread. */
		kmod_thread_locker = NULL;
	}

	if (pid < 0) {
		sub_info->retval = pid;
		umh_complete(sub_info);
	}
}

Here a kernel thread is created. When it is scheduled, it runs call_helper, which in turn calls ____call_usermodehelper. That function ultimately calls:

	retval = do_execve(getname_kernel(sub_info->path),
			   (const char __user *const __user *)sub_info->argv,
			   (const char __user *const __user *)sub_info->envp);

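This launches the application from kernel context: the kernel thread's image is replaced by the user-space program. Finally, umh_complete calls complete to wake up the waiter.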

static void umh_complete(struct subprocess_info *sub_info)
{
	struct completion *comp = xchg(&sub_info->complete, NULL);
	/*
	 * See call_usermodehelper_exec(). If xchg() returns NULL
	 * we own sub_info, the UMH_KILLABLE caller has gone away
	 * or the caller used UMH_NO_WAIT.
	 */
	if (comp)
		complete(comp);
	else
		call_usermodehelper_freeinfo(sub_info);
}
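
The xchg on sub_info->complete decides which side is responsible for freeing sub_info: whoever finds the pointer already NULL knows the other side has let go. The hand-off can be modelled in user space with C11 atomics. This is a simplified sketch with hypothetical names (toy_sub_info, toy_umh_complete, toy_waiter_abandon); an int flag stands in for the completion, and a freed field stands in for the call to call_usermodehelper_freeinfo.

```c
#include <stdatomic.h>
#include <stddef.h>

struct toy_sub_info {
	_Atomic(int *) complete;	/* points at the waiter's flag, or NULL */
	int freed;			/* set where the real code would free */
};

/* Helper side: same shape as umh_complete(). */
void toy_umh_complete(struct toy_sub_info *info)
{
	int *done = atomic_exchange(&info->complete, (int *)NULL);

	if (done)
		*done = 1;		/* a waiter exists: signal it */
	else
		info->freed = 1;	/* nobody is waiting: this side frees */
}

/*
 * Waiter side giving up early, as on the UMH_KILLABLE path.
 * Returns 1 if the pointer was still set, meaning the helper side will
 * see NULL later and free the structure; returns 0 if the helper side
 * had already completed, so the waiter must finish the normal path.
 */
int toy_waiter_abandon(struct toy_sub_info *info)
{
	return atomic_exchange(&info->complete, (int *)NULL) != NULL;
}
```

Because both sides use the same atomic exchange, exactly one of them observes the non-NULL pointer, so the structure is signalled or freed once and never both.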

This completes the analysis.
