Empowered RL Explained Simply

CreateAMind

Published 2018-12-18 14:11:23

Included in the column: CreateAMind

"All Else Being Equal Be Empowered":

Inspired by examples from the animal kingdom, the social sciences and games, the authors propose empowerment, a rather universal function defined as the information-theoretic capacity of an agent’s actuation channel.

Organisms may be seen to maintain “essential variables”, such as body temperature, sugar levels and pH levels. Homeostasis provides organisms with a local gradient telling them which actions to take or which states to seek. The mechanism itself is universal and quite simple; the choice of variables and the methods of regulation, however, are not. They are evolved and are specific to different phyla.

The unifying theme of these and many other examples is the striving towards situations where in the long term one could do many different things if one wanted to, where one has more control or influence over the world. Predators with better sensors and actuators can hunt better. Having high status in a group of chimpanzees allows one more mating choices. Having a lot of money enables one to engage in more activities. One can choose from an array of options. However, if one doesn’t know what to do, a good rule of thumb is to choose actions leading to higher status, more power, money and control. We will now apply this idea to “embodied” agents.

Empowerment can be seen as the agent’s potential to change the world, that is, how much the agent could do in principle. This is in general different from the actual change the agent inflicts.

Briefly, empowerment is defined as the capacity of the actuation channel of the agent.

The Communication Problem:

There is a sender and a receiver. The sender transmits a signal, denoted by a random variable X, to the receiver, who receives a potentially different signal, denoted by a random variable Y. The communication channel between the sender and the receiver defines how transmitted signals correspond to received signals. In the case of discrete signals the channel can be described by a conditional probability distribution p(y|x).
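To make the channel description concrete, here is a minimal sketch of a discrete channel stored as a table of conditional probabilities. The binary symmetric channel with a 10% flip probability, the input distribution, and the function names `received_distribution` and `transmit` are all illustrative assumptions, not something taken from the original article.

```python
import numpy as np

# Hypothetical binary symmetric channel: each bit is flipped with probability 0.1.
# Row x holds the distribution p(y | x), so every row sums to 1.
channel = np.array([
    [0.9, 0.1],
    [0.1, 0.9],
])

def received_distribution(p_x, channel):
    """Marginal distribution of the received signal: p(y) = sum_x p(x) p(y|x)."""
    return np.asarray(p_x, dtype=float) @ channel

def transmit(x, channel, rng):
    """Sample one received symbol y for a transmitted symbol x."""
    return int(rng.choice(channel.shape[1], p=channel[x]))

# Assumed sender distribution: symbol 0 is sent 70% of the time.
p_y = received_distribution([0.7, 0.3], channel)

rng = np.random.default_rng(0)
y = transmit(1, channel, rng)  # usually 1, occasionally flipped to 0
```

With these assumed numbers the receiver sees symbol 0 with probability 0.66 and symbol 1 with probability 0.34.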

Given p(y|x) and a distribution p(x) over the transmitted signal, the mutual information I(X;Y) is the amount of information that the received signal contains, on average, about the transmitted signal.

Channel capacity is the maximum of this mutual information over all possible distributions p(x) of the transmitted signal, C = max over p(x) of I(X;Y). Thus, mutual information is a function of p(x) and p(y|x), whereas channel capacity is a function of the channel p(y|x) only. Another important difference is that mutual information is symmetric in X and Y and is thus acausal, whereas channel capacity requires complete control over X and is thus asymmetric and causal.
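The difference between the two quantities can be checked numerically. The sketch below computes mutual information for a chosen p(x), and approximates channel capacity with the Blahut-Arimoto iteration, a standard algorithm for maximizing mutual information over p(x). The article does not say how capacity should be computed, so the choice of algorithm, the function names, and the example channel are assumptions made for illustration.

```python
import numpy as np

def mutual_information(p_x, channel):
    """I(X;Y) in bits for input distribution p_x and channel[x, y] = p(y|x)."""
    p_x = np.asarray(p_x, dtype=float)
    joint = p_x[:, None] * channel           # p(x, y)
    p_y = joint.sum(axis=0)                  # p(y)
    indep = p_x[:, None] * p_y[None, :]      # p(x) p(y)
    mask = joint > 0                         # terms with p(x, y) = 0 contribute 0
    return float(np.sum(joint[mask] * np.log2(joint[mask] / indep[mask])))

def channel_capacity(channel, iters=500, tol=1e-12):
    """Blahut-Arimoto iteration: maximize I(X;Y) over p(x).

    Returns (capacity in bits, maximizing input distribution).
    """
    n_x = channel.shape[0]
    p_x = np.full(n_x, 1.0 / n_x)            # start from the uniform distribution
    for _ in range(iters):
        p_y = p_x @ channel
        # d[x] = KL divergence between p(y|x) and p(y), in nats
        with np.errstate(divide="ignore", invalid="ignore"):
            log_ratio = np.where(channel > 0, np.log(channel / p_y), 0.0)
        d = np.sum(channel * log_ratio, axis=1)
        new_p = p_x * np.exp(d)               # reweight inputs that stand out more
        new_p /= new_p.sum()
        converged = np.max(np.abs(new_p - p_x)) < tol
        p_x = new_p
        if converged:
            break
    return mutual_information(p_x, channel), p_x

# Hypothetical binary symmetric channel with flip probability 0.1.
bsc = np.array([[0.9, 0.1], [0.1, 0.9]])
capacity, best_p_x = channel_capacity(bsc)
skewed_mi = mutual_information([0.7, 0.3], bsc)
```

For this channel the capacity is about 0.531 bits and is reached by the uniform input distribution; the skewed distribution (0.7, 0.3) yields less mutual information over the same channel, which illustrates that capacity depends on p(y|x) alone while mutual information also depends on p(x).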

Assuming that the agent is allowed to perform any actions for n time steps, what is the maximum amount of information it can “inject” into the momentary reading of its sensor after these n time steps? The more information can be made to appear in the sensor, the more control or influence the agent has over its sensor.

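Empowerment therefore measures the maximum amount of information the agent could “inject” or transmit into its sensor by performing a sequence of actions of length n, that is, the capacity of the channel from n-step action sequences to the resulting sensor reading.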
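As an illustration of n-step empowerment, consider the simplest case of a deterministic world. With no noise, every action sequence leads to exactly one final sensor reading, so the capacity of the actuation channel reduces to the base-2 logarithm of the number of distinct readings the agent can reach. The 5x5 grid, the five actions, and the assumption that the sensor reads the agent's position are all hypothetical choices for this sketch.

```python
import math

# Hypothetical 5x5 grid world with deterministic moves.
# Assumption: the sensor simply reads the agent's (x, y) position.
SIZE = 5
ACTIONS = {
    "stay": (0, 0),
    "north": (0, -1),
    "south": (0, 1),
    "east": (1, 0),
    "west": (-1, 0),
}

def step(state, action):
    """Move one cell; bumping into the boundary leaves the agent in place."""
    x, y = state
    dx, dy = ACTIONS[action]
    nx, ny = x + dx, y + dy
    if 0 <= nx < SIZE and 0 <= ny < SIZE:
        return (nx, ny)
    return state

def empowerment(state, n):
    """n-step empowerment in bits, valid for deterministic dynamics only.

    Each action sequence yields one final reading, so the channel capacity
    equals log2 of the number of distinct readings reachable after n actions.
    """
    reachable = {state}
    for _ in range(n):
        reachable = {step(s, a) for s in reachable for a in ACTIONS}
    return math.log2(len(reachable))

center_1 = empowerment((2, 2), 1)  # 5 reachable cells
corner_1 = empowerment((0, 0), 1)  # 3 reachable cells
```

In this sketch the center cell has log2(5) bits of 1-step empowerment while a corner cell has only log2(3) bits, which matches the intuition that an agent boxed in by walls has fewer options. With stochastic dynamics this counting argument no longer holds, and the capacity has to be computed or estimated from the full distribution p(sensor reading | action sequence).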

In the paper titled "Empowerment-driven Exploration using Mutual Information Estimation", empowerment is used as an intrinsic reward to drive exploration in reinforcement learning.

Originally published: 2018-11-27

This article is shared from the CreateAMind WeChat official account.
