
Empowered RL, Explained Simply

Author: 用户1908973
Published: 2018-12-18 14:11:23
Included in the column: CreateAMind

"All Else Being Equal Be Empowered":

Inspired by examples from the animal kingdom, the social sciences, and games, the authors proposed empowerment, a rather universal function defined as the information-theoretic capacity of an agent’s actuation channel.

Organisms may be seen to maintain “essential variables” such as body temperature, sugar levels, and pH levels. Homeostasis provides organisms with a local gradient telling them which actions to take or which states to seek. The mechanism itself is universal and quite simple; however, the choice of variables and the methods of regulation are not. They have evolved and are specific to different phyla.

The unifying theme of these and many other examples is the striving towards situations where in the long term one could do many different things if one wanted to, where one has more control or influence over the world. Predators with better sensors and actuators can hunt better. Having high status in a group of chimpanzees allows one more mating choices. Having a lot of money enables one to engage in more activities. One can choose from an array of options. However, if one doesn’t know what to do, a good rule of thumb is to choose actions leading to higher status, more power, money and control. We will now apply this idea to “embodied” agents.

Empowerment can be seen as the agent’s potential to change the world, that is, how much the agent could do in principle. This is in general different from the actual change the agent inflicts.

Briefly, empowerment is defined as the capacity of the actuation channel of the agent.

The Communication Problem:

There is a sender and a receiver. The sender transmits a signal, denoted by a random variable X, to the receiver, who receives a potentially different signal, denoted by a random variable Y. The communication channel between the sender and the receiver defines how transmitted signals correspond to received signals. In the case of discrete signals the channel can be described by a conditional probability distribution p(y|x).
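To make the channel description concrete, here is a minimal sketch of how the mutual information I(X; Y) could be computed for a discrete channel. The function name `mutual_information` and the binary symmetric channel used as input are illustrative assumptions, not part of the original paper.

```python
import numpy as np

def mutual_information(p_x, p_y_given_x):
    """I(X;Y) in bits for a discrete channel.

    p_x:          shape (|X|,), distribution over transmitted signals
    p_y_given_x:  shape (|X|, |Y|), row x holds p(y|x)
    """
    p_x = np.asarray(p_x, dtype=float)
    p_y_given_x = np.asarray(p_y_given_x, dtype=float)
    p_xy = p_x[:, None] * p_y_given_x        # joint p(x, y)
    p_y = p_xy.sum(axis=0)                   # marginal p(y)
    indep = p_x[:, None] * p_y[None, :]      # product p(x) p(y)
    mask = p_xy > 0                          # treat 0 * log 0 as 0
    return float(np.sum(p_xy[mask] * np.log2(p_xy[mask] / indep[mask])))

# Assumed example: a binary symmetric channel that flips a bit with probability 0.1
bsc = np.array([[0.9, 0.1],
                [0.1, 0.9]])
uniform_mi = mutual_information([0.5, 0.5], bsc)   # sender uses both symbols equally
skewed_mi = mutual_information([0.9, 0.1], bsc)    # sender mostly sends symbol 0
```

With the same channel, the two input distributions give different values, which illustrates that mutual information depends on p(x) as well as on p(y|x).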

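Given the channel p(y|x), its capacity is the mutual information I(X; Y) maximized over all input distributions p(x): C = max_{p(x)} I(X; Y).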

Channel capacity is the maximum amount of information the received signal can contain about the transmitted signal. Thus, mutual information is a function of p(x) and p(y|x), whereas channel capacity is a function of the channel p(y|x) only. Another important difference is that mutual information is symmetric in X and Y and is thus acausal, whereas channel capacity requires complete control over X and is thus asymmetric and causal.
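Since channel capacity depends on p(y|x) alone, it can be computed by searching over input distributions. The sketch below uses the Blahut-Arimoto iteration, a standard algorithm for this maximization; the source does not say which method the authors used, and the function names and example channels are assumptions made for illustration.

```python
import numpy as np

def _kl_rows(W, p_y):
    """d[x] = KL( p(y|x) || p(y) ) in nats, with 0 * log 0 treated as 0."""
    safe_p_y = np.where(p_y > 0, p_y, 1.0)
    ratio = np.where(W > 0, W / safe_p_y, 1.0)
    return np.sum(W * np.log(ratio), axis=1)

def channel_capacity(p_y_given_x, iters=1000, tol=1e-12):
    """Blahut-Arimoto estimate of capacity (bits) and the maximizing p(x)."""
    W = np.asarray(p_y_given_x, dtype=float)       # shape (|X|, |Y|)
    p_x = np.full(W.shape[0], 1.0 / W.shape[0])    # start from the uniform input
    for _ in range(iters):
        d = _kl_rows(W, p_x @ W)
        new_p_x = p_x * np.exp(d)                  # reweight inputs by informativeness
        new_p_x /= new_p_x.sum()
        converged = np.max(np.abs(new_p_x - p_x)) < tol
        p_x = new_p_x
        if converged:
            break
    d = _kl_rows(W, p_x @ W)
    capacity_bits = float(p_x @ d) / np.log(2)     # I(X;Y) at the final p(x)
    return capacity_bits, p_x

# Assumed example channels
bsc_capacity, bsc_input = channel_capacity([[0.9, 0.1], [0.1, 0.9]])
z_capacity, z_input = channel_capacity([[1.0, 0.0], [0.5, 0.5]])
```

For the symmetric channel the maximizing input is uniform, while for the asymmetric Z-channel it is not, which is why the maximization over p(x) cannot be skipped in general.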

Assuming that the agent is allowed to perform any actions for n time steps, what is the maximum amount of information it can “inject” into the momentary reading of its sensor after these n time steps? The more of this information can be made to appear in the sensor, the more control or influence the agent has over its sensor.

We need to measure the maximum amount of information the agent could “inject” or transmit into its sensor by performing a sequence of actions of length n.
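As a rough illustration of this measurement, consider the special case of a deterministic world in which the sensor reads the agent's position. Every action sequence then leads to exactly one sensor state, so the capacity of the actuation channel reduces to log2 of the number of distinct states reachable in n steps. The grid world, the action set, and the function names below are assumptions chosen for the sketch.

```python
from itertools import product
from math import log2

# Assumed action set: four moves plus "stay"
ACTIONS = {"stay": (0, 0), "up": (0, -1), "down": (0, 1),
           "left": (-1, 0), "right": (1, 0)}

def step(state, action, width, height):
    """Deterministic move; walking into a wall leaves the agent in place."""
    x, y = state
    dx, dy = ACTIONS[action]
    nx, ny = x + dx, y + dy
    if 0 <= nx < width and 0 <= ny < height:
        return (nx, ny)
    return state

def empowerment(state, n, width, height):
    """n-step empowerment in bits for a deterministic grid world."""
    final_states = set()
    for seq in product(ACTIONS, repeat=n):      # every action sequence of length n
        s = state
        for a in seq:
            s = step(s, a, width, height)
        final_states.add(s)
    # Deterministic channel: capacity = log2(number of distinct outcomes)
    return log2(len(final_states))

center = empowerment((2, 2), n=2, width=5, height=5)
corner = empowerment((0, 0), n=2, width=5, height=5)
```

In this toy setting the agent in the middle of the grid has higher empowerment than the agent in the corner, because the walls cut off some of the states it could otherwise reach. For a stochastic world the count of reachable states is no longer enough, and the capacity has to be computed from the full distribution p(sensor state | action sequence).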

In the paper titled "Empowerment-driven Exploration using Mutual Information Estimation", empowerment is used as an intrinsic reward to empower reinforcement learning.

Originally published 2018-11-27. Shared from the CreateAMind WeChat official account.
