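[Core Technology Behind AlphaGo Zero] A Complete Collection of Deep Reinforcement Learning Resources (Papers / Code / Tutorials / Videos / Articles)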

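[Overview] Yesterday Google DeepMind published a new paper in Nature introducing AlphaGo Zero, its latest and strongest version to date. AlphaGo Zero uses no human knowledge beyond the rules of the game, learns purely by reinforcement learning, and merges the value network and the policy network into a single architecture. After three days of training it beat the previously published version of AlphaGo 100 to 0. The core technology behind AlphaGo Zero is deep reinforcement learning, so Zhuanzhi has collected and organized a comprehensive set of reinforcement learning resources below.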

First, a video of Google DeepMind researcher David Silver introducing AlphaGo Zero:

[Embedded video]

Zhuanzhi's complete collection of Deep Reinforcement Learning resources:

  • Nature papers

Mastering the game of Go without human knowledge

David Silver, Julian Schrittwieser, Karen Simonyan, Ioannis Antonoglou, Aja Huang, Arthur Guez, Thomas Hubert, Lucas Baker, Matthew Lai, Adrian Bolton, Yutian Chen, Timothy Lillicrap, Fan Hui, Laurent Sifre, George van den Driessche, Thore Graepel, Demis Hassabis: Nature 550(7676): 354-359 (2017). doi:10.1038/nature24270

https://www.nature.com/nature/journal/v550/n7676/full/nature24270.html (download the PDF to read it)

Mastering the game of Go with deep neural networks and tree search

David Silver, Aja Huang, Chris J. Maddison, Arthur Guez, Laurent Sifre, George van den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Vedavyas Panneershelvam, Marc Lanctot, Sander Dieleman, Dominik Grewe, John Nham, Nal Kalchbrenner, Ilya Sutskever, Timothy P. Lillicrap, Madeleine Leach, Koray Kavukcuoglu, Thore Graepel, Demis Hassabis: Nature 529(7587): 484-489 (2016)

  • Papers

Mastering the Game of Go without Human Knowledge

https://deepmind.com/documents/119/agz_unformatted_nature.pdf

Human-level control through deep reinforcement learning

http://www.nature.com/nature/journal/v518/n7540/full/nature14236.html

Playing Atari with Deep Reinforcement Learning

https://www.cs.toronto.edu/%7Evmnih/docs/dqn.pdf

Prioritized experience replay

https://arxiv.org/pdf/1511.05952v2.pdf

Dueling Network Architectures for Deep Reinforcement Learning (Dueling DQN)

https://arxiv.org/pdf/1511.06581v3.pdf

Deep Reinforcement Learning with Double Q-learning

https://arxiv.org/abs/1509.06461

Continuous Deep Q-Learning with Model-based Acceleration (NAF)

https://arxiv.org/pdf/1603.00748v1.pdf

Deterministic Policy Gradient Algorithms

http://jmlr.org/proceedings/papers/v32/silver14.pdf

Continuous control with deep reinforcement learning (DDPG)

https://arxiv.org/pdf/1509.02971v5.pdf

Asynchronous Methods for Deep Reinforcement Learning

https://arxiv.org/abs/1602.01783

Policy distillation

https://arxiv.org/abs/1511.06295

Control of Memory, Active Perception, and Action in Minecraft

https://arxiv.org/pdf/1605.09128v1.pdf

Unifying Count-Based Exploration and Intrinsic Motivation

https://arxiv.org/pdf/1606.01868v2.pdf

Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models

https://arxiv.org/pdf/1507.00814v3.pdf

Action-Conditional Video Prediction using Deep Networks in Atari Games

https://arxiv.org/pdf/1507.08750v2.pdf

Control of Memory, Active Perception, and Action in Minecraft (ICML 2016 version)

https://web.eecs.umich.edu/~baveja/Papers/ICML2016.pdf

PathNet

https://arxiv.org/pdf/1701.08734.pdf

  • Papers for NLP

Coarse-to-Fine Question Answering for Long Documents

https://homes.cs.washington.edu/~eunsol/papers/acl17eunsol.pdf

A Deep Reinforced Model for Abstractive Summarization

https://arxiv.org/pdf/1705.04304.pdf

Reinforcement Learning for Simultaneous Machine Translation

https://www.umiacs.umd.edu/~jbg/docs/2014_emnlp_simtrans.pdf

Dual Learning for Machine Translation

https://papers.nips.cc/paper/6469-dual-learning-for-machine-translation.pdf

Learning to Win by Reading Manuals in a Monte-Carlo Framework

http://people.csail.mit.edu/regina/my_papers/civ11.pdf

Improving Information Extraction by Acquiring External Evidence with Reinforcement Learning

https://arxiv.org/abs/1603.07954

Deep Reinforcement Learning with a Natural Language Action Space

http://www.aclweb.org/anthology/P16-1153

Deep Reinforcement Learning for Dialogue Generation

https://arxiv.org/pdf/1606.01541.pdf

Reinforcement Learning for Mapping Instructions to Actions

http://people.csail.mit.edu/branavan/papers/acl2009.pdf

Language Understanding for Text-based Games using Deep Reinforcement Learning

https://arxiv.org/pdf/1506.08941.pdf

End-to-end LSTM-based dialog control optimized with supervised and reinforcement learning

https://arxiv.org/pdf/1606.01269v1.pdf

End-to-End Reinforcement Learning of Dialogue Agents for Information Access

https://arxiv.org/pdf/1609.00777v1.pdf

Hybrid Code Networks: practical and efficient end-to-end dialog control with supervised and reinforcement learning

https://arxiv.org/pdf/1702.03274.pdf

Deep Reinforcement Learning for Mention-Ranking Coreference Models

https://arxiv.org/abs/1609.08667

  • Selected articles

Wikipedia: Reinforcement learning

https://en.wikipedia.org/wiki/Reinforcement_learning

Deep Reinforcement Learning: Pong from Pixels

http://karpathy.github.io/2016/05/31/rl/

CS 294: Deep Reinforcement Learning

http://rll.berkeley.edu/deeprlcourse/

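Reinforcement Learning Series, Part 1: Markov Decision Processes (in Chinese)

http://www.algorithmdog.com/%E5%BC%BA%E5%8C%96%E5%AD%A6%E4%B9%A0-%E9%A9%AC%E5%B0%94%E7%A7%91%E5%A4%AB%E5%86%B3%E7%AD%96%E8%BF%87%E7%A8%8B

Reinforcement Learning Series, Part 9: Deep Q Network (DQN) (in Chinese)

http://www.algorithmdog.com/drl

Reinforcement Learning Series, Part 3: Model-Free Policy Evaluation (in Chinese)

http://www.algorithmdog.com/reinforcement-learning-model-free-evalution

[Notes] Reinforcement Learning and MDPs (in Chinese)

http://www.cnblogs.com/mo-wang/p/4910855.html

An Introduction to Reinforcement Learning, with Implementation Code (in Chinese)

http://www.jianshu.com/p/165607eaa4f9

Deep Reinforcement Learning Series (2): Reinforcement Learning (in Chinese)

http://blog.csdn.net/ikerpeng/article/details/53031551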

Atari demo using a deep Q-network: the Nature paper on deep Q-networks (DQN)

http://www.nature.com/articles/nature14236

Lecture slides (PDF) used in David Silver's videos

https://pan.baidu.com/s/1nvqP7dB

What is reinforcement learning? (in Chinese)

http://www.cnblogs.com/geniferology/p/what_is_reinforcement_learning.html

David Silver's paper on deterministic policy gradient (DPG)

http://www.jmlr.org/proceedings/papers/v32/silver14.pdf

The Nature paper on AlphaGo

http://www.nature.com/articles/nature16961

AlphaGo-related resources

https://deepmind.com/research/alphago/

What’s the Difference Between Artificial Intelligence, Machine Learning, and Deep Learning?

https://blogs.nvidia.com/blog/2016/07/29/whats-difference-artificial-intelligence-machine-learning-deep-learning-ai/

Deep Learning in a Nutshell: Reinforcement Learning

https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-reinforcement-learning/

Bellman equation

https://en.wikipedia.org/wiki/Bellman_equation

Reinforcement Learning(RL) for Natural Language Processing(NLP)

https://github.com/adityathakker/awesome-rl-nlp

  • Video tutorials

Reinforcement learning tutorials (Morvan Zhou)

https://morvanzhou.github.io/tutorials/machine-learning/reinforcement-learning/

Reinforcement learning course by David Silver

https://www.bilibili.com/video/av8912293/?from=search&seid=1166472326542614796

CS234: Reinforcement Learning

http://web.stanford.edu/class/cs234/index.html

What is reinforcement learning?

https://www.youtube.com/watch?v=NVWBs7b3oGk

What is Q-learning? (Reinforcement learning)

https://www.youtube.com/watch?v=HTZ5xn12AL4

Reinforcement learning (Morvan Zhou)

https://morvanzhou.github.io/tutorials/machine-learning/ML-intro/

David Silver's Deep Reinforcement Learning, Lecture 1: Introduction (Chinese subtitles)

https://www.bilibili.com/video/av9831889/

David Silver's video lecture series (YouTube)

https://www.youtube.com/watch?v=2pWv7GOvuf0&list=PL7-jPKtc4r78-wCZcQn5IqyuWhBZ8fOxT

David Silver's video lecture series (Bilibili)

http://www.bilibili.com/video/av9831889/?from=search&seid=17387316110198388304

Deep Reinforcement Learning

http://videolectures.net/rldm2015_silver_reinforcement_learning/

  • Tutorial

Reinforcement Learning for NLP

http://www.umiacs.umd.edu/~jbg/teaching/CSCI_7000/11a.pdf

ICML 2016, Deep Reinforcement Learning tutorial

http://icml.cc/2016/tutorials/deep_rl_tutorial.pdf

DQN tutorial

https://medium.com/@awjuliani/simple-reinforcement-learning-with-tensorflow-part-4-deep-q-networks-and-beyond-8438a3e2b8df#.28wv34w3a

  • Code

OpenAI Gym

https://github.com/openai/gym

Google DeepMind's deep Q-network (DQN) source code

http://sites.google.com/a/deepmind.com/dqn/

ReinforcementLearningCode

https://github.com/halleanwoo/ReinforcementLearningCode

reinforcement-learning

https://github.com/dennybritz/reinforcement-learning

DQN

https://github.com/devsisters/DQN-tensorflow

DDPG

https://github.com/stevenpjg/ddpg-aigym

A3C (1)

https://github.com/miyosuda/async_deep_reinforce

A3C (2)

https://github.com/openai/universe-starter-agent

Originally published on the WeChat official account Zhuanzhi (Quan_Zhuanzhi)

Original publication date: 2017-10-20
