Reinforcement Learning Family Tree

CreateAMind

Published 2018-07-24 15:38:47

Collected in the column: CreateAMind

https://github.com/tigerneil/deep-reinforcement-learning-family

deep-reinforcement-learning-records

Explicitly shows the relationships between various deep reinforcement learning (DRL) techniques.

Dedicated to learning and research on DRL.

Policy gradient methods

Equivalence Between Policy Gradients and Soft Q-Learning
Trust Region Policy Optimization
Reinforcement Learning with Deep Energy-Based Policies
Q-Prop: Sample-Efficient Policy Gradient with an Off-Policy Critic
Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning (1 Jun 2017)

Explorations in DRL

Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models
EX2: Exploration with Exemplar Models for Deep Reinforcement Learning

Actor-Critic methods

The Reactor: A Sample-Efficient Actor-Critic Architecture (15 Apr 2017)
Sample Efficient Actor-Critic with Experience Replay
Reinforcement Learning with Unsupervised Auxiliary Tasks
Continuous control with deep reinforcement learning

Connection with other methods

Connecting Generative Adversarial Networks and Actor-Critic Methods

Connecting value and policy methods

Bridging the Gap Between Value and Policy Based Reinforcement Learning
Policy gradient and Q-learning

Unifying

Multi-step Reinforcement Learning: A Unifying Algorithm

Faster DRL

Neural Episodic Control

Apply RL to other domains

Tuning Recurrent Neural Networks with Reinforcement Learning

Multiagent Settings

Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments (7 Jun 2017)
Multiagent Bidirectionally-Coordinated Nets for Learning to Play StarCraft Combat Games (29 Mar 2017)

Originally published: 2017-08-02. In case of infringement, contact cloudcommunity@tencent.com for removal.

This article was shared from the CreateAMind WeChat official account.
