[Core Technology Behind AlphaGo Zero] A Collection of Deep Reinforcement Learning Resources (Papers, Code, Tutorials, Videos, Articles, and More)

WZEARW
Published 2018-04-09 15:33:40
Included in the column: Zhuanzhi (专知)

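[Introduction] Yesterday, Google DeepMind published a new paper in Nature introducing AlphaGo Zero, its newest and strongest version to date. AlphaGo Zero uses no human prior knowledge, learns purely through reinforcement learning, and merges the value network and the policy network into a single architecture. After three days of training, it defeated the previous version of AlphaGo 100 to 0. The core technology behind AlphaGo Zero is deep reinforcement learning, so Zhuanzhi has compiled a comprehensive collection of reinforcement learning resources for readers to explore.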

First, Google DeepMind researcher David Silver introduces AlphaGo Zero (embedded video in the original post).

Zhuanzhi's Deep Reinforcement Learning resource collection:

  • Nature papers

Mastering the game of Go without human knowledge

David Silver, Julian Schrittwieser, Karen Simonyan, Ioannis Antonoglou, Aja Huang, Arthur Guez, Thomas Hubert, Lucas Baker, Matthew Lai, Adrian Bolton, Yutian Chen, Timothy Lillicrap, Fan Hui, Laurent Sifre, George van den Driessche, Thore Graepel & Demis Hassabis: Nature 550(7676) (2017). doi:10.1038/nature24270

https://www.nature.com/nature/journal/v550/n7676/full/nature24270.html (download the PDF to read it)

Mastering the game of Go with deep neural networks and tree search

David Silver, Aja Huang, Chris J. Maddison, Arthur Guez, Laurent Sifre, George van den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Vedavyas Panneershelvam, Marc Lanctot, Sander Dieleman, Dominik Grewe, John Nham, Nal Kalchbrenner, Ilya Sutskever, Timothy P. Lillicrap, Madeleine Leach, Koray Kavukcuoglu, Thore Graepel, Demis Hassabis: Nature 529(7587): 484-489 (2016)

  • Papers

Mastering the Game of Go without Human Knowledge

https://deepmind.com/documents/119/agz_unformatted_nature.pdf

Human-level control through deep reinforcement learning

http://www.nature.com/nature/journal/v518/n7540/full/nature14236.html

Playing Atari with Deep Reinforcement Learning

https://www.cs.toronto.edu/%7Evmnih/docs/dqn.pdf

Prioritized experience replay

https://arxiv.org/pdf/1511.05952v2.pdf

Dueling DQN

https://arxiv.org/pdf/1511.06581v3.pdf

Deep Reinforcement Learning with Double Q-learning

https://arxiv.org/abs/1509.06461

Deep Q learning with NAF

https://arxiv.org/pdf/1603.00748v1.pdf

Deterministic policy gradient

http://jmlr.org/proceedings/papers/v32/silver14.pdf

Continuous control with deep reinforcement learning (DDPG)

https://arxiv.org/pdf/1509.02971v5.pdf

Asynchronous Methods for Deep Reinforcement Learning

https://arxiv.org/abs/1602.01783

Policy distillation

https://arxiv.org/abs/1511.06295

Control of Memory, Active Perception, and Action in Minecraft

https://arxiv.org/pdf/1605.09128v1.pdf

Unifying Count-Based Exploration and Intrinsic Motivation

https://arxiv.org/pdf/1606.01868v2.pdf

Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models

https://arxiv.org/pdf/1507.00814v3.pdf

Action-Conditional Video Prediction using Deep Networks in Atari Games

https://arxiv.org/pdf/1507.08750v2.pdf

Control of Memory, Active Perception, and Action in Minecraft

https://web.eecs.umich.edu/~baveja/Papers/ICML2016.pdf

PathNet

https://arxiv.org/pdf/1701.08734.pdf

  • Papers for NLP

Coarse-to-Fine Question Answering for Long Documents

https://homes.cs.washington.edu/~eunsol/papers/acl17eunsol.pdf

A Deep Reinforced Model for Abstractive Summarization

https://arxiv.org/pdf/1705.04304.pdf

Reinforcement Learning for Simultaneous Machine Translation

https://www.umiacs.umd.edu/~jbg/docs/2014_emnlp_simtrans.pdf

Dual Learning for Machine Translation

https://papers.nips.cc/paper/6469-dual-learning-for-machine-translation.pdf

Learning to Win by Reading Manuals in a Monte-Carlo Framework

http://people.csail.mit.edu/regina/my_papers/civ11.pdf

Improving Information Extraction by Acquiring External Evidence with Reinforcement Learning

http://people.csail.mit.edu/regina/my_papers/civ11.pdf

Deep Reinforcement Learning with a Natural Language Action Space

http://www.aclweb.org/anthology/P16-1153

Deep Reinforcement Learning for Dialogue Generation

https://arxiv.org/pdf/1606.01541.pdf

Reinforcement Learning for Mapping Instructions to Actions

http://people.csail.mit.edu/branavan/papers/acl2009.pdf

Language Understanding for Text-based Games using Deep Reinforcement Learning

https://arxiv.org/pdf/1506.08941.pdf

End-to-end LSTM-based dialog control optimized with supervised and reinforcement learning

https://arxiv.org/pdf/1606.01269v1.pdf

End-to-End Reinforcement Learning of Dialogue Agents for Information Access

https://arxiv.org/pdf/1609.00777v1.pdf

Hybrid Code Networks: practical and efficient end-to-end dialog control with supervised and reinforcement learning

https://arxiv.org/pdf/1702.03274.pdf

Deep Reinforcement Learning for Mention-Ranking Coreference Models

https://arxiv.org/abs/1609.08667

  • Selected articles

Wikipedia: Reinforcement learning

https://en.wikipedia.org/wiki/Reinforcement_learning

Deep Reinforcement Learning: Pong from Pixels

http://karpathy.github.io/2016/05/31/rl/

CS 294: Deep Reinforcement Learning

http://rll.berkeley.edu/deeprlcourse/

Reinforcement Learning Series, Part 1: Markov Decision Processes

http://www.algorithmdog.com/%E5%BC%BA%E5%8C%96%E5%AD%A6%E4%B9%A0-%E9%A9%AC%E5%B0%94%E7%A7%91%E5%A4%AB%E5%86%B3%E7%AD%96%E8%BF%87%E7%A8%8B

Reinforcement Learning Series, Part 9: Deep Q Network (DQN)

http://www.algorithmdog.com/drl

Reinforcement Learning Series, Part 3: Model-Free Policy Evaluation

http://www.algorithmdog.com/reinforcement-learning-model-free-evalution

[Notes] Reinforcement Learning and MDPs

http://www.cnblogs.com/mo-wang/p/4910855.html

Introduction to Reinforcement Learning, with Implementation Code

http://www.jianshu.com/p/165607eaa4f9

Deep Reinforcement Learning Series (2): Reinforcement Learning

http://blog.csdn.net/ikerpeng/article/details/53031551

Atari demo using a deep Q-network; the Nature paper on deep Q-networks (DQN):

http://www.nature.com/articles/nature14236

Lecture slides (PDF) used in David Silver's videos

https://pan.baidu.com/s/1nvqP7dB

What Is Reinforcement Learning?

http://www.cnblogs.com/geniferology/p/what_is_reinforcement_learning.html

David Silver's paper on deterministic policy gradient (DPG)

http://www.jmlr.org/proceedings/papers/v32/silver14.pdf

The Nature paper on AlphaGo:

http://www.nature.com/articles/nature16961

AlphaGo-related resources

deepmind.com/research/alphago/

What’s the Difference Between Artificial Intelligence, Machine Learning, and Deep Learning?

https://blogs.nvidia.com/blog/2016/07/29/whats-difference-artificial-intelligence-machine-learning-deep-learning-ai/

Deep Learning in a Nutshell: Reinforcement Learning

https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-reinforcement-learning/

Bellman equation

https://en.wikipedia.org/wiki/Bellman_equation

Reinforcement Learning(RL) for Natural Language Processing(NLP)

https://github.com/adityathakker/awesome-rl-nlp

  • Video tutorials

Reinforcement learning tutorials (Morvan Zhou)

https://morvanzhou.github.io/tutorials/machine-learning/reinforcement-learning/

Reinforcement learning course by David Silver

https://www.bilibili.com/video/av8912293/?from=search&seid=1166472326542614796

CS234: Reinforcement Learning

http://web.stanford.edu/class/cs234/index.html

What is reinforcement learning?

https://www.youtube.com/watch?v=NVWBs7b3oGk

What is Q-learning? (Reinforcement Learning)

https://www.youtube.com/watch?v=HTZ5xn12AL4

Machine learning introduction series (Morvan Zhou)

https://morvanzhou.github.io/tutorials/machine-learning/ML-intro/

David Silver's deep reinforcement learning course, Lecture 1: Introduction (Chinese subtitles)

https://www.bilibili.com/video/av9831889/

David Silver's open video course (YouTube)

https://www.youtube.com/watch?v=2pWv7GOvuf0&list=PL7-jPKtc4r78-wCZcQn5IqyuWhBZ8fOxT

David Silver's open video course (Bilibili)

http://www.bilibili.com/video/av9831889/?from=search&seid=17387316110198388304

Deep Reinforcement Learning

http://videolectures.net/rldm2015_silver_reinforcement_learning/

  • Tutorial

Reinforcement Learning for NLP

http://www.umiacs.umd.edu/~jbg/teaching/CSCI_7000/11a.pdf

ICML 2016, Deep Reinforcement Learning tutorial

http://icml.cc/2016/tutorials/deep_rl_tutorial.pdf

DQN tutorial

https://medium.com/@awjuliani/simple-reinforcement-learning-with-tensorflow-part-4-deep-q-networks-and-beyond-8438a3e2b8df#.28wv34w3a

  • Code

OpenAI Gym

https://github.com/openai/gym

Google DeepMind team's deep Q-network (DQN) source code:

http://sites.google.com/a/deepmind.com/dqn/

ReinforcementLearningCode

https://github.com/halleanwoo/ReinforcementLearningCode

reinforcement-learning

https://github.com/dennybritz/reinforcement-learning

DQN

https://github.com/devsisters/DQN-tensorflow

DDPG

https://github.com/stevenpjg/ddpg-aigym

A3C01

https://github.com/miyosuda/async_deep_reinforce

A3C02

https://github.com/openai/universe-starter-agent

Originally published on 2017-10-20 on the Zhuanzhi (专知) WeChat official account.