Deep Abstract Reinforcement Learning: Improving the Ability to Learn Abstractions (Paper Walkthrough)

Towards Deep Symbolic Reinforcement Learning

Abstract

Deep reinforcement learning (DRL) brings the power of deep neural networks to bear on the generic task of trial-and-error learning [note: reinforcement learning is, at its core, trial-and-error learning], and its effectiveness has been convincingly demonstrated on tasks such as Atari video games and the game of Go. However, contemporary DRL systems inherit a number of shortcomings [note: there are many] from the current generation of deep learning techniques. For example, they require very large datasets to work effectively, entailing that they are slow to learn even when such datasets are available [note: data-hungry and slow].

Moreover, they lack the ability to reason on an abstract level [note: no abstract thinking], which makes it difficult to implement high-level cognitive functions such as transfer learning, analogical reasoning, and hypothesis-based reasoning. Finally, their operation is largely opaque to humans [note: the mechanism is not transparent], rendering them unsuitable for domains in which verifiability is important.

Note: a trained model is hard to transfer to other tasks.

In this paper, we propose an end-to-end reinforcement learning architecture comprising a neural back end and a symbolic front end with the potential to overcome each of these shortcomings. As proof-of-concept, we present a preliminary implementation of the architecture and apply it to several variants of a simple video game. We show that the resulting system – though just a prototype – learns effectively, and, by acquiring a set of symbolic rules that are easily comprehensible to humans, dramatically outperforms a conventional, fully neural DRL system on a stochastic variant of the game.

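Note: the idea is to build abstract symbols on top of deep learning, which adds the capacity for abstraction and improves learning ability.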
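To make the "neural back end plus symbolic front end" idea from the abstract more concrete, here is a minimal sketch. It is not the paper's implementation. It assumes the back end has already turned each frame into a list of typed objects with grid positions; the `Obj` type, the object kinds, and the `extract_symbols` stub are hypothetical stand-ins for a learned perception module. The front end in this sketch is plain tabular Q-learning over the relative positions of the agent and the other objects.

```python
import random
from collections import defaultdict
from typing import NamedTuple


class Obj(NamedTuple):
    """A symbol produced by the (assumed) neural back end."""
    kind: str  # hypothetical object type, e.g. "agent", "good", "bad"
    x: int
    y: int


def extract_symbols(frame):
    """Stand-in for the neural back end.

    Assumption: a real system would detect objects from pixels. Here the
    "frame" is already a list of Obj, so this is a pass-through.
    """
    return list(frame)


def symbolic_state(objects):
    """Describe the scene as relations between the agent and every other object.

    The state is a sorted tuple of (kind, dx, dy), so it depends only on the
    relative layout, not on absolute screen position.
    """
    agent = next(o for o in objects if o.kind == "agent")
    rels = [(o.kind, o.x - agent.x, o.y - agent.y)
            for o in objects if o.kind != "agent"]
    return tuple(sorted(rels))


class SymbolicQLearner:
    """Tabular Q-learning over symbolic states (the front end in this sketch)."""

    def __init__(self, actions, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q = defaultdict(float)  # maps (state, action) -> estimated value
        self.rng = random.Random(seed)

    def act(self, state):
        # Epsilon-greedy: explore with probability epsilon, else pick the best action.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # Standard one-step Q-learning update.
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])


# Usage: one rewarded step toward a "good" object directly to the right.
frame = [Obj("agent", 2, 2), Obj("good", 3, 2)]
s = symbolic_state(extract_symbols(frame))  # (("good", 1, 0),)
learner = SymbolicQLearner(["left", "right"])
learner.update(s, "right", reward=1.0, next_state=())
```

Because the state is expressed in relative terms, the same learned value applies wherever the agent and the object appear on screen. That is one simple way a symbolic layer could help with transfer, and the Q-table entries remain readable as rules of the form "object of kind K at offset (dx, dy): prefer action A".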

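Abstracting at a higher level from the visual input that deep learning processes, and then thinking, acting, and planning in terms of higher-level semantic units and at a higher cognitive level, is fundamental to improving the learning ability of deep networks. It is the route to building capabilities such as cognition, reasoning, common sense, transfer learning, flexible adaptation to the environment, and action planning. Raising the level of abstraction at which the system understands the world is very important. Related posts on work in progress: "DeepMind's approach to artificial general intelligence" and "Google: beta-VAE, an unsupervised learning framework comparable to InfoGAN (figures and code)".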

1 Introduction

Here we take a different approach. We propose a novel reinforcement learning architecture that addresses all of these issues at once in a principled way by combining neural network learning with aspects of classical symbolic AI, gaining the advantages of both methodologies without their respective disadvantages [note: deep networks are combined with traditional symbolic computation in the hope of strengthening learning ability]. Central to classical AI is the use of language-like propositional representations to encode knowledge.

Thanks to their compositional structure, such representations are amenable to endless extension and recombination, an essential feature for the acquisition and deployment of high-level abstract concepts, which are key to general intelligence (McCarthy, 1987). Moreover, knowledge expressed in propositional form can be exploited by multiple high-level reasoning processes and has general-purpose application across multiple tasks and domains. Features such as these, derived from the benefits of human language, motivated several decades of research in symbolic AI.

But as an approach to general intelligence, classical symbolic AI has been disappointing. A major obstacle here is the symbol grounding problem (Harnad, 1990; Shanahan, 2005). The symbolic elements of a representation in classical AI – the constants, functions, and predicates – are typically hand-crafted, rather than grounded in data from the real world. Philosophically speaking, this means their semantics are parasitic on meanings in the heads of their designers rather than deriving from a direct connection with the world.

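Note: the problem with traditional hand-crafted features, on the philosophical analysis, is that the features are not generated from raw data and are not grounded in real-world information. They are abstractions that have already passed through human processing, so they lose the richness of the original data. This removes the foundation for later extension and for adapting to new environments. Deep learning is a way to solve this problem (see the related post "DeepMind's approach to artificial general intelligence"), so integrating deep learning with symbolic computation is a promising thing to try.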

Pragmatically, hand-crafted representations cannot capture the rich statistics of real-world perceptual data, cannot support ongoing adaptation to an unknown environment, and are an obvious barrier to full autonomy. By contrast, none of these problems afflict machine learning. Deep neural networks in particular have proven to be remarkably effective for supervised learning from large datasets using backpropagation (LeCun et al., 2015; Schmidhuber, 2015). Deep learning is therefore already a viable solution to the symbol grounding problem in the supervised case, and for the unsupervised case, which is essential for a full solution, rapid progress is being made (Chen et al., 2016; Goodfellow et al., 2014; Greff et al., 2016; Higgins et al., 2016; Kingma and Welling, 2013). The hybrid neural-symbolic reinforcement learning architecture we propose relies on a deep learning solution to the symbol grounding problem.

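The four design principles of the paper's conceptual prototype architecture

1) Conceptual abstraction. Learning abstract concepts.

Here beta-VAE (see the related post "Google: beta-VAE, an unsupervised learning framework comparable to InfoGAN (figures and code)") could be used to learn basic semantic features, greatly improving the system's understanding of the external world.

2) Compositional structure. Learned concepts can be combined, compared, and so on, to generate new concepts and attributes.

3) Common sense priors. The system has prior knowledge and common-sense knowledge: a basic understanding of the world before task training begins.

4) Causal reasoning. The ability to reason about cause and effect.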

Together, these principles combine the strengths of deep learning and symbolic computation.
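Principle 1 mentions beta-VAE as a candidate for learning the basic semantic features. As a rough illustration (not code from the paper or from the beta-VAE authors), the beta-VAE objective is the usual VAE loss with the KL term weighted by a factor beta. The sketch below assumes a diagonal Gaussian encoder, a standard normal prior, and a squared-error reconstruction term. The function name, arguments, and default `beta` are illustrative choices.

```python
import math


def beta_vae_loss(x, x_recon, mu, log_var, beta=4.0):
    """Per-example beta-VAE loss: reconstruction error + beta * KL.

    Assumptions (not from the article): squared-error reconstruction,
    diagonal Gaussian posterior N(mu, exp(log_var)), standard normal prior.
    beta=4.0 is an arbitrary illustrative default.
    """
    # Reconstruction term: sum of squared differences over input dimensions.
    recon = sum((a - b) ** 2 for a, b in zip(x, x_recon))
    # Closed-form KL( N(mu, sigma^2) || N(0, 1) ), summed over latent dimensions.
    kl = 0.5 * sum(m * m + math.exp(lv) - lv - 1.0
                   for m, lv in zip(mu, log_var))
    return recon + beta * kl
```

With `beta=1.0` this reduces to the standard VAE loss under the same assumptions. Values above 1 put more pressure on the latent code to match the prior, which is intended to encourage more disentangled latent factors.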

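The rest of the paper presents and analyzes the design, training, and results of the conceptual prototype.

The result is a large improvement over DQN; per the abstract, the prototype dramatically outperforms a conventional, fully neural DRL system on a stochastic variant of the game.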

Recommended by zdx3578.

Originally published on the WeChat public account CreateAMind (createamind).

Original publication date: 2016-11-18
