Towards Deep Symbolic Reinforcement Learning


Deep reinforcement learning (DRL) brings the power of deep neural networks to bear on the generic task of trial-and-error learning (reinforcement learning is, in essence, trial-and-error learning), and its effectiveness has been convincingly demonstrated on tasks such as Atari video games and the game of Go. However, contemporary DRL systems inherit a number of shortcomings from the current generation of deep learning techniques. For example, they require very large datasets to work effectively, entailing that they are slow to learn even when such datasets are available.

Moreover, they lack the ability to reason on an abstract level, which makes it difficult to implement high-level cognitive functions such as transfer learning, analogical reasoning, and hypothesis-based reasoning. Finally, their operation is largely opaque to humans, rendering them unsuitable for domains in which verifiability is important. [Note: in short, the drawbacks are large data requirements, slow learning, no abstract reasoning, and opaque operation.]


In this paper, we propose an end-to-end reinforcement learning architecture comprising a neural back end and a symbolic front end with the potential to overcome each of these shortcomings. As proof-of-concept, we present a preliminary implementation of the architecture and apply it to several variants of a simple video game. We show that the resulting system – though just a prototype – learns effectively, and, by acquiring a set of symbolic rules that are easily comprehensible to humans, dramatically outperforms a conventional, fully neural DRL system on a stochastic variant of the game.
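As a rough illustration of the division of labour described above, the following Python sketch turns a raw frame into symbols and then learns over those symbols. It is not the paper's implementation. The back end is a hand-written stand-in (`extract_symbols`) rather than a trained network, and the one-row corridor game, the tabular Q-learning front end, and every name and hyperparameter are assumptions made for illustration.

```python
import random
from collections import defaultdict

# Everything here is a hypothetical toy, not the paper's system.
ACTIONS = {"left": -1, "right": +1}

def render(agent_col, goal_col, width=7):
    """Raw 'frame': one row of characters standing in for pixels."""
    row = ["."] * width
    row[goal_col] = "+"
    row[agent_col] = "A"
    return ["".join(row)]

def extract_symbols(frame):
    """Stand-in for the neural back end: list every object as (type, row, col)."""
    return [(ch, r, c)
            for r, row in enumerate(frame)
            for c, ch in enumerate(row) if ch != "."]

def symbolic_state(symbols):
    """Describe each non-agent object by its offset from the agent 'A'."""
    agent = next(s for s in symbols if s[0] == "A")
    return tuple(sorted((t, r - agent[1], c - agent[2])
                        for t, r, c in symbols if t != "A"))

def train(episodes=500, width=7, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Symbolic front end (assumed): tabular Q-learning over symbolic states."""
    rng = random.Random(seed)
    q = defaultdict(float)  # (symbolic state, action name) -> estimated value
    for _ in range(episodes):
        goal = rng.randrange(width)
        agent = rng.choice([c for c in range(width) if c != goal])
        for _ in range(4 * width):  # cap on episode length
            state = symbolic_state(extract_symbols(render(agent, goal, width)))
            if rng.random() < epsilon:
                action = rng.choice(list(ACTIONS))  # explore
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit
            # Move the agent, clamped to the corridor walls.
            agent = min(width - 1, max(0, agent + ACTIONS[action]))
            if agent == goal:
                target = 1.0  # reward on reaching the goal; episode ends
            else:
                nxt = symbolic_state(extract_symbols(render(agent, goal, width)))
                target = gamma * max(q[(nxt, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (target - q[(state, action)])
            if agent == goal:
                break
    return q

def greedy_rule(q, offset):
    """Read off the learned rule for 'goal is `offset` columns away'."""
    state = (("+", 0, offset),)
    return max(ACTIONS, key=lambda a: q[(state, a)])
```

After `q = train()`, a call such as `greedy_rule(q, 3)` reads off which action the table prefers when the goal is three columns to the right. Because the state is an offset relative to the agent rather than an absolute position, one table entry covers every layout with that offset, and each entry can be read as a small human-readable rule.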


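[Note: Abstracting to a higher level from the visual input that deep learning processes, and then thinking, acting, and planning in terms of higher-level semantic units at a higher cognitive level, is fundamental to improving the learning ability of deep networks. It is the route to building capabilities such as cognition, reasoning, common sense, transfer learning, flexible adaptation to the environment, and action planning. Providing an abstract level of understanding is very important. Related work in progress is covered in the posts "DeepMind's approach to general artificial intelligence" and "Google: beta-VAE, an unsupervised learning framework comparable to InfoGAN (with figures and code)".]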

1 Introduction

Here we take a different approach. We propose a novel reinforcement learning architecture that addresses all of these issues at once in a principled way by combining neural network learning with aspects of classical symbolic AI, gaining the advantages of both methodologies without their respective disadvantages. [Note: that is, deep networks are combined with traditional symbolic computation in the hope of strengthening learning ability.] Central to classical AI is the use of language-like propositional representations to encode knowledge.

Thanks to their compositional structure, such representations are amenable to endless extension and recombination, an essential feature for the acquisition and deployment of high-level abstract concepts, which are key to general intelligence (McCarthy, 1987). Moreover, knowledge expressed in propositional form can be exploited by multiple high-level reasoning processes and has general-purpose application across multiple tasks and domains. Features such as these, derived from the benefits of human language, motivated several decades of research in symbolic AI.

But as an approach to general intelligence, classical symbolic AI has been disappointing. A major obstacle here is the symbol grounding problem (Harnad, 1990; Shanahan, 2005). The symbolic elements of a representation in classical AI – the constants, functions, and predicates – are typically hand-crafted, rather than grounded in data from the real world. Philosophically speaking, this means their semantics are parasitic on meanings in the heads of their designers rather than deriving from a direct connection with the world.

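[Note: The problem with traditional hand-crafted features, on the philosophical analysis, is that the features are not generated from raw data and the data are not grounded in real-world information. The information is abstract information that people have already processed, so it has lost its original richness. That removes the foundation for later extension and for adaptation to new environments. Deep learning is a way to solve this problem (see the post "DeepMind's approach to general artificial intelligence"), so integrating deep learning with symbolic computation is a promising thing to try.]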

Pragmatically, hand-crafted representations cannot capture the rich statistics of real-world perceptual data, cannot support ongoing adaptation to an unknown environment, and are an obvious barrier to full autonomy. By contrast, none of these problems afflict machine learning. Deep neural networks in particular have proven to be remarkably effective for supervised learning from large datasets using backpropagation (LeCun et al., 2015; Schmidhuber, 2015). Deep learning is therefore already a viable solution to the symbol grounding problem in the supervised case, and for the unsupervised case, which is essential for a full solution, rapid progress is being made (Chen et al., 2016; Goodfellow et al., 2014; Greff et al., 2016; Higgins et al., 2016; Kingma and Welling, 2013). The hybrid neural-symbolic reinforcement learning architecture we propose relies on a deep learning solution to the symbol grounding problem.


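1) Conceptual abstraction. Learning abstract concepts.

[Note: beta-VAE can be used here to learn basic semantic features, which greatly improves understanding and cognition of the external world; see the post "Google: beta-VAE, an unsupervised learning framework comparable to InfoGAN (with figures and code)".]

2) Compositional structure. Learned concepts can be combined, compared, and so on, to produce new concepts and attributes.

3) Common sense priors. The agent has prior knowledge and common-sense knowledge: a basic understanding of the world before task training begins.

4) Causal reasoning. The ability to reason about cause and effect.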
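The note on item 1 points to beta-VAE as one way to learn such features. Below is a minimal sketch of a per-example beta-VAE training loss. It assumes a diagonal Gaussian encoder, a standard normal prior, and squared error as the reconstruction term. These choices, the function name, and the default `beta` are illustrative assumptions, not taken from the source.

```python
import math

def beta_vae_loss(x, x_recon, mu, log_var, beta=4.0):
    """Reconstruction error plus a beta-weighted KL term (illustrative sketch).

    x, x_recon : input and its reconstruction, as equal-length lists of floats
    mu, log_var: encoder mean and log-variance for each latent dimension
    beta       : weight on the KL term; 4.0 is an arbitrary illustrative value
    """
    # Assumed reconstruction term: summed squared error.
    recon = sum((a - b) ** 2 for a, b in zip(x, x_recon))
    # Closed-form KL divergence from N(mu, diag(exp(log_var))) to N(0, I).
    kl = 0.5 * sum(m * m + math.exp(lv) - lv - 1.0
                   for m, lv in zip(mu, log_var))
    return recon + beta * kl
```

Setting `beta` to 1 gives the usual VAE trade-off between the two terms; larger values put more pressure on the latent code to stay close to the prior.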





This article was shared from the WeChat official account CreateAMind (createamind); author: zdx3578.
