
[Code Collection] PyTorch Implementations of Deep Reinforcement Learning

昱良
Published 2018-11-23 10:42:53

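This post shares high-quality implementations of deep reinforcement learning algorithms written in PyTorch. The IPython notebooks are mainly meant to help with practicing and understanding the papers, so in some cases I favor readability over efficiency. For each paper, the implementation is uploaded first, followed by markup that explains each part of the code.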

Related papers


  1. Human Level Control Through Deep Reinforcement Learning [Publication] https://deepmind.com/research/publications/human-level-control-through-deep-reinforcement-learning/ [code] https://github.com/qfettes/DeepRL-Tutorials/blob/master/01.DQN.ipynb
  2. Multi-Step Learning (from Reinforcement Learning: An Introduction, Chapter 7) [Publication] https://github.com/qfettes/DeepRL-Tutorials/blob/master/01.DQN.ipynb [code] https://github.com/qfettes/DeepRL-Tutorials/blob/master/02.NStep_DQN.ipynb
  3. Deep Reinforcement Learning with Double Q-learning [Publication] https://arxiv.org/abs/1509.06461 [code] https://github.com/qfettes/DeepRL-Tutorials/blob/master/03.Double_DQN.ipynb
  4. Dueling Network Architectures for Deep Reinforcement Learning [Publication] https://arxiv.org/abs/1511.06581 [code] https://github.com/qfettes/DeepRL-Tutorials/blob/master/04.Dueling_DQN.ipynb
  5. Noisy Networks for Exploration [Publication] https://github.com/qfettes/DeepRL-Tutorials/blob/master/04.Dueling_DQN.ipynb [code] https://github.com/qfettes/DeepRL-Tutorials/blob/master/05.DQN-NoisyNets.ipynb
  6. Prioritized Experience Replay [Publication] https://arxiv.org/abs/1511.05952?context=cs [code] https://github.com/qfettes/DeepRL-Tutorials/blob/master/06.DQN_PriorityReplay.ipynb
  7. A Distributional Perspective on Reinforcement Learning [Publication] https://arxiv.org/abs/1707.06887 [code] https://github.com/qfettes/DeepRL-Tutorials/blob/master/07.Categorical-DQN.ipynb
  8. Rainbow: Combining Improvements in Deep Reinforcement Learning [Publication] https://arxiv.org/abs/1710.02298 [code] https://github.com/qfettes/DeepRL-Tutorials/blob/master/08.Rainbow.ipynb
  9. Distributional Reinforcement Learning with Quantile Regression [Publication] https://arxiv.org/abs/1710.10044 [code] https://github.com/qfettes/DeepRL-Tutorials/blob/master/09.QuantileRegression-DQN.ipynb
  10. Rainbow with Quantile Regression [code] https://github.com/qfettes/DeepRL-Tutorials/blob/master/10.Quantile-Rainbow.ipynb
  11. Deep Recurrent Q-Learning for Partially Observable MDPs [Publication] https://arxiv.org/abs/1507.06527 [code] https://github.com/qfettes/DeepRL-Tutorials/blob/master/11.DRQN.ipynb
  12. Advantage Actor Critic (A2C) [Publication1] https://arxiv.org/abs/1602.01783 [Publication2] https://blog.openai.com/baselines-acktr-a2c/ [code] https://github.com/qfettes/DeepRL-Tutorials/blob/master/12.A2C.ipynb
  13. High-Dimensional Continuous Control Using Generalized Advantage Estimation [Publication] https://arxiv.org/abs/1506.02438 [code] https://github.com/qfettes/DeepRL-Tutorials/blob/master/13.GAE.ipynb
  14. Proximal Policy Optimization Algorithms [Publication] https://arxiv.org/abs/1707.06347 [code] https://github.com/qfettes/DeepRL-Tutorials/blob/master/14.PPO.ipynb

PyTorch implementations


Originally published: 2018-10-23

Shared from the WeChat official account 机器学习算法与Python学习 (Machine Learning Algorithms and Python Learning).
