Deep Symbolic Reinforcement Learning: Improving the Ability to Learn Abstractions (Paper Walkthrough)

Towards Deep Symbolic Reinforcement Learning

Abstract

Deep reinforcement learning (DRL) brings the power of deep neural networks to bear on the generic task of trial-and-error learning [reinforcement learning is, at its core, trial-and-error learning], and its effectiveness has been convincingly demonstrated on tasks such as Atari video games and the game of Go. However, contemporary DRL systems inherit a number of shortcomings [and there are many] from the current generation of deep learning techniques. For example, they require very large datasets [data-hungry] to work effectively, entailing that they are slow [slow training] to learn even when such datasets are available.

Moreover, they lack the ability to reason on an abstract level [no abstract thinking], which makes it difficult to implement high-level cognitive functions such as transfer learning, analogical reasoning, and hypothesis-based reasoning. Finally, their operation is largely opaque to humans [opaque mechanisms], rendering them unsuitable for domains in which verifiability is important.

In other words, a trained model is hard to transfer to other tasks.

In this paper, we propose an end-to-end reinforcement learning architecture comprising a neural back end and a symbolic front end with the potential to overcome each of these shortcomings. As proof-of-concept, we present a preliminary implementation of the architecture and apply it to several variants of a simple video game. We show that the resulting system – though just a prototype – learns effectively, and, by acquiring a set of symbolic rules that are easily comprehensible to humans, dramatically outperforms a conventional, fully neural DRL system on a stochastic variant of the game.
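To make the two-stage idea concrete, here is a minimal sketch of how a perception stage might hand symbols to a reinforcement learner. It is an illustration under loud assumptions, not the paper's implementation: `extract_objects` is a hypothetical stand-in for the neural part (it reads characters from a toy grid instead of learning from pixels), and `symbolic_state`, `SymbolicQLearner`, the grid, the action names, and all hyperparameters are made up for this example. The learner is ordinary tabular Q-learning, swapped in because it is the simplest choice.

```python
from collections import defaultdict
import random

# Hypothetical toy frame: '+' is the agent, 'o' and 'x' are two object types.
FRAME = [
    "....",
    ".+o.",
    "...x",
]

def extract_objects(frame):
    """Stand-in for the neural stage: return (type, row, col) for each object.

    A real system would learn this mapping from raw pixels; here we simply
    read the characters of the toy grid.
    """
    return [(ch, r, c)
            for r, row in enumerate(frame)
            for c, ch in enumerate(row)
            if ch != "."]

def symbolic_state(objects, agent_type="+"):
    """Compose a symbolic state: each object's type and its offset from the agent."""
    agent = next(o for o in objects if o[0] == agent_type)
    relations = [(t, r - agent[1], c - agent[2])
                 for t, r, c in objects if t != agent_type]
    return tuple(sorted(relations))

class SymbolicQLearner:
    """Tabular Q-learning over symbolic states (assumed learner for this sketch)."""

    def __init__(self, actions, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
        self.q = defaultdict(float)  # (state, action) -> estimated value
        self.actions = list(actions)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)

    def act(self, state):
        # Epsilon-greedy: explore with probability epsilon, else act greedily.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # Standard one-step Q-learning update.
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])

# Usage: one rewarded step to the right, ending in an empty (terminal) state.
state = symbolic_state(extract_objects(FRAME))
learner = SymbolicQLearner(actions=["up", "down", "left", "right"], epsilon=0.0)
learner.update(state, "right", reward=1.0, next_state=())
```

Because the table is keyed by relations such as ("o", 0, 1), meaning "an object of type o one cell to the right", each learned entry can be read directly as a rule, which is the kind of human-comprehensible output the abstract refers to.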

The idea is to build abstract symbols on top of deep learning, which adds abstraction capability and improves learning ability.

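Abstracting to a higher level from the visual input that deep learning processes, and then thinking, acting, and planning in terms of higher-level semantic units at a higher cognitive level, is fundamental to improving the learning ability of deep networks. It is the route to capabilities such as cognition, reasoning, common sense, transfer learning, flexible adaptation to the environment, and action planning. Raising the level of abstraction at which the system understands the world is therefore very important. Work in progress includes the related posts "DeepMind's approach to general artificial intelligence" and "Google: beta-VAE, an unsupervised learning framework comparable to InfoGAN (with figures and code)".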

1 Introduction

Here we take a different approach. We propose a novel reinforcement learning architecture that addresses all of these issues at once in a principled way by combining neural network learning [combining deep networks with classical symbolic computation, in the hope of strengthening learning ability] with aspects of classical symbolic AI, gaining the advantages of both methodologies without their respective disadvantages. Central to classical AI is the use of language-like propositional representations to encode knowledge.

Thanks to their compositional structure, such representations are amenable to endless extension and recombination, an essential feature for the acquisition and deployment of high-level abstract concepts, which are key to general intelligence (McCarthy, 1987). Moreover, knowledge expressed in propositional form can be exploited by multiple high-level reasoning processes and has general-purpose application across multiple tasks and domains. Features such as these, derived from the benefits of human language, motivated several decades of research in symbolic AI.

But as an approach to general intelligence, classical symbolic AI has been disappointing. A major obstacle here is the symbol grounding problem (Harnad, 1990; Shanahan, 2005). The symbolic elements of a representation in classical AI – the constants, functions, and predicates – are typically hand-crafted, rather than grounded in data from the real world. Philosophically speaking, this means their semantics are parasitic on meanings in the heads of their designers rather than deriving from a direct connection with the world.

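The problem with traditional hand-crafted features, in the philosophical analysis, is that they are not generated from raw data and are not grounded in information from the real world. They are abstractions that a human has already processed, so the richness of the original signal is lost, and with it the foundation for later extension and for adapting to new environments. Deep learning is one way to solve this problem (see "DeepMind's approach to general artificial intelligence"), so integrating deep learning with symbolic computation is a promising thing to try.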

Pragmatically, hand-crafted representations cannot capture the rich statistics of real-world perceptual data, cannot support ongoing adaptation to an unknown environment, and are an obvious barrier to full autonomy. By contrast, none of these problems afflict machine learning. Deep neural networks in particular have proven to be remarkably effective for supervised learning from large datasets using backpropagation (LeCun et al., 2015; Schmidhuber, 2015). Deep learning is therefore already a viable solution to the symbol grounding problem in the supervised case, and for the unsupervised case, which is essential for a full solution, rapid progress is being made (Chen et al., 2016; Goodfellow et al., 2014; Greff et al., 2016; Higgins et al., 2016; Kingma and Welling, 2013). The hybrid neural-symbolic reinforcement learning architecture we propose relies on a deep learning solution to the symbol grounding problem.

The four design principles of the paper's conceptual prototype architecture

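1) Conceptual abstraction. Learning abstract concepts.

beta-VAE (see "Google: beta-VAE, an unsupervised learning framework comparable to InfoGAN (with figures and code)") could be used here to learn basic semantic features, which would greatly improve the system's understanding of the external world.

2) Compositional structure. Learned concepts can be combined, compared, and so on, to produce new concepts and attributes.

3) Common sense priors. The system has prior, common-sense knowledge: a basic understanding of the world before task training begins.

4) Causal reasoning. The ability to reason about cause and effect.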

Together, these principles combine the strengths of deep learning and symbolic computation.
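For principle 1, the beta-VAE suggestion can be made concrete with its training objective: a reconstruction term plus a KL term weighted by beta. The sketch below computes that loss for a single example in plain Python. It is not the paper's code. It assumes a diagonal Gaussian encoder with a unit Gaussian prior, uses squared error as the reconstruction term, and picks `beta=4.0` purely as an illustrative default. The function names are hypothetical.

```python
import math

def gaussian_kl(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0
                     for m, lv in zip(mu, logvar))

def beta_vae_loss(x, x_recon, mu, logvar, beta=4.0):
    """Squared-error reconstruction term plus beta-weighted KL term.

    beta = 1 gives the standard VAE objective (under this choice of
    reconstruction term); beta > 1 pushes the latent code harder toward
    the factorized unit-Gaussian prior.
    """
    recon = sum((a - b) ** 2 for a, b in zip(x, x_recon))
    return recon + beta * gaussian_kl(mu, logvar)

# Usage: a two-pixel input with a one-dimensional latent code.
loss = beta_vae_loss(x=[1.0, 0.0], x_recon=[0.0, 0.0], mu=[1.0], logvar=[0.0])
```

In this usage the reconstruction error is 1.0 and the KL term is 0.5, so with beta = 4 the loss is 3.0. Raising beta trades reconstruction quality for a more constrained latent code, which is the knob beta-VAE relies on to encourage separable factors.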

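The rest of the paper describes and analyzes the design, training, and results of the conceptual prototype.

The reported result is a large improvement over DQN.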

Recommended by zdx3578.

This article was shared from the WeChat official account CreateAMind (createamind). Author: zdx3578

Originally published: 2016-11-18
