Blog | A Roundup of 100+ Reinforcement Learning Papers from the Past 10 Years of NIPS (2008-2018)

AI研习社
Published 2019-05-22 14:35:55

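This article originally appeared on the WeChat official account 深度强化学习算法 and is republished by AI研习社 with permission. You are welcome to follow the 深度强化学习算法 WeChat official account and the AI研习社 blog column.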

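NIPS (NeurIPS), in full the Conference and Workshop on Neural Information Processing Systems, is an international conference on machine learning and computational neuroscience. It is held every December and is organized by the NIPS Foundation. NIPS is a top conference in machine learning; in the China Computer Federation ranking of international academic conferences, it is a Class A conference in artificial intelligence. From 1987 to 2000 NIPS was held in Denver, USA, and from 2001 to 2010 in Vancouver, Canada. After that it was held in Granada, Spain (2011), Lake Tahoe (2012 to 2013), and Montreal, Canada (2014 to 2015).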

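Since deep learning took off a few years ago, NIPS has become one of the conferences most closely watched by academia and industry. Attendance rose from about 2,000 five years earlier to more than 8,000 in 2017. Paper submissions in 2017 also set a record, reaching 3,240. Recent statistics show that NIPS 2018 received as many as 4,900 submissions, over 1,600 more than the previous year. With so many papers appearing across all areas in recent years, this article collects the 108 reinforcement learning papers accepted at NIPS over the past 10 years, summarized as follows: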

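The table shows that 2012 was a watershed for reinforcement learning at NIPS: after a surge in accepted papers, the numbers declined, and then from 2015 they climbed steadily, reaching 38 accepted papers in 2018. The titles of the accepted papers for each year are listed below.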
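
The per-year counts behind that trend can be tallied with a short script. This is only a minimal sketch: the numbers are copied from the year headings in this article (not checked independently against the proceedings), and the variable names `accepted`, `early_peak`, and `low_point` are hypothetical, chosen for illustration.

```python
# Per-year counts copied from the year headings in this article
# (the number in parentheses after each year).
accepted = {
    2008: 3, 2009: 3, 2010: 5, 2011: 7, 2012: 11, 2013: 3,
    2014: 5, 2015: 2, 2016: 7, 2017: 24, 2018: 38,
}

# Total number of papers across all years.
total = sum(accepted.values())
# Year with the most accepted RL papers before 2015.
early_peak = max((y for y in accepted if y < 2015), key=accepted.get)
# Year with the fewest accepted RL papers overall.
low_point = min(accepted, key=accepted.get)

# Text bar chart: one '#' per accepted paper.
for year, n in accepted.items():
    print(f"{year} {n:>2} {'#' * n}")
print(f"total={total}, early peak={early_peak}, low point={low_point}")
```

The total matches the 108 papers stated above, with the early peak in 2012 and the low point in 2015.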

2008 (3)

  • Near-optimal Regret Bounds for Reinforcement Learning
  • Stress, noradrenaline, and realistic prediction of mouse behaviour using reinforcement learning
  • Optimization on a Budget: A Reinforcement Learning Approach

2009 (3)

  • Manifold Embeddings for Model-Based Reinforcement Learning under Partial Observability
  • Skill Discovery in Continuous Reinforcement Learning Domains using Skill Chaining
  • Training Factor Graphs with Reinforcement Learning for Efficient MAP Inference

2010 (5)

  • Nonparametric Bayesian Policy Priors for Reinforcement Learning
  • Constructing Skill Trees for Reinforcement Learning Agents from Demonstration Trajectories
  • Feature Construction for Inverse Reinforcement Learning
  • PAC-Bayesian Model Selection for Reinforcement Learning
  • Interval Estimation for Reinforcement-Learning Algorithms in Continuous-State Domains

2011 (7)

  • Nonlinear Inverse Reinforcement Learning with Gaussian Processes
  • A Reinforcement Learning Theory for Homeostatic Regulation
  • Action-Gap Phenomenon in Reinforcement Learning
  • Optimal Reinforcement Learning for Gaussian Systems
  • Reinforcement Learning using Kernel-Based Stochastic Factorization
  • MAP Inference for Bayesian Inverse Reinforcement Learning
  • Selecting the State-Representation in Reinforcement Learning

2012 (11)

  • Bayesian Hierarchical Reinforcement Learning
  • Exploration in Model-based Reinforcement Learning by Empirically Estimating Learning Progress
  • Nonparametric Bayesian Inverse Reinforcement Learning for Multiple Reward Functions
  • Inverse Reinforcement Learning through Structured Classification
  • Efficient Bayes-Adaptive Reinforcement Learning using Sample-Based Search
  • On-line Reinforcement Learning Using Incremental Kernel-Based Stochastic Factorization
  • Online Regret Bounds for Undiscounted Continuous Reinforcement Learning
  • Neurally Plausible Reinforcement Learning of Working Memory Tasks
  • Transferring Expectations in Model-based Reinforcement Learning
  • Efficient Reinforcement Learning for High Dimensional Linear Quadratic Systems
  • Cost-Sensitive Exploration in Bayesian Reinforcement Learning

2013 (3)

  • Reinforcement Learning in Robust Markov Decision Processes
  • Policy Shaping: Integrating Human Feedback with Reinforcement Learning
  • (More) Efficient Reinforcement Learning via Posterior Sampling

2014 (5)

  • Model-based Reinforcement Learning and the Eluder Dimension
  • Sparse Multi-Task Reinforcement Learning
  • Difference of Convex Functions Programming for Reinforcement Learning
  • Near-optimal Reinforcement Learning in Factored MDPs
  • RAAM: The Benefits of Robustness in Approximating Aggregated MDPs in Reinforcement Learning

2015 (2)

  • Sample Complexity of Episodic Fixed-Horizon Reinforcement Learning
  • Inverse Reinforcement Learning with Locally Consistent Reward Functions

2016 (7)

  • Tree-Structured Reinforcement Learning for Sequential Object Localization
  • Safe and Efficient Off-Policy Reinforcement Learning
  • Contextual-MDPs for PAC Reinforcement Learning with Rich Observations
  • Learning to Communicate with Deep Multi-Agent Reinforcement Learning
  • Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation
  • Cooperative Inverse Reinforcement Learning
  • Linear Feature Encoding for Reinforcement Learning

2017 (24)

  • Hybrid Reward Architecture for Reinforcement Learning
  • Shallow Updates for Deep Reinforcement Learning
  • Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning
  • Optimistic posterior sampling for reinforcement learning: worst-case regret bounds
  • Cold-Start Reinforcement Learning with Softmax Policy Gradient
  • Log-normality and Skewness of Estimated State/Action Values in Reinforcement Learning
  • Safe Model-based Reinforcement Learning with Stability Guarantees
  • Data-Efficient Reinforcement Learning in Continuous State-Action Gaussian-POMDPs
  • Deep Reinforcement Learning from Human Preferences
  • EX2: Exploration with Exemplar Models for Deep Reinforcement Learning
  • #Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning
  • Compatible Reward Inverse Reinforcement Learning
  • Bridging the Gap Between Value and Policy Based Reinforcement Learning
  • Online Reinforcement Learning in Stochastic Games
  • Reinforcement Learning under Model Mismatch
  • A multi-agent reinforcement learning model of common-pool resource appropriation
  • Imagination-Augmented Agents for Deep Reinforcement Learning
  • Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation
  • Unifying PAC and Regret: Uniform PAC Bounds for Episodic Reinforcement Learning
  • Repeated Inverse Reinforcement Learning
  • A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning
  • Dynamic Safe Interruptibility for Decentralized Multi-Agent Reinforcement Learning
  • Agent Reinforcement Learning

2018 (38)

  • The Importance of Sampling in Meta-Reinforcement Learning
  • Learning Temporal Point Processes via Reinforcement Learning
  • Data-Efficient Hierarchical Reinforcement Learning
  • Fast deep reinforcement learning using online adjustments from the past
  • Inference Aided Reinforcement Learning for Incentive Mechanism Design in Crowdsourcing
  • A Lyapunov-based Approach to Safe Reinforcement Learning
  • Reinforcement Learning of Theorem Proving
  • Simple random search of static linear policies is competitive for reinforcement learning
  • Meta-Gradient Reinforcement Learning
  • Reinforcement Learning for Solving the Vehicle Routing Problem
  • Learn What Not to Learn: Action Elimination with Deep Reinforcement Learning
  • REFUEL: Exploring Sparse Features in Deep Reinforcement Learning for Fast Disease Diagnosis
  • Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents
  • Distributed Multitask Reinforcement Learning with Quadratic Convergence
  • Constrained Cross-Entropy Method for Safe Reinforcement Learning
  • Reinforcement Learning with Multiple Experts: A Bayesian Model Combination Approach
  • Verifiable Reinforcement Learning via Policy Extraction
  • Deep Reinforcement Learning of Marked Temporal Point Processes
  • Evolution-Guided Policy Gradient in Reinforcement Learning
  • Meta-Reinforcement Learning of Structured Exploration Strategies
  • Diversity-Driven Exploration Strategy for Deep Reinforcement Learning
  • Genetic-Gated Networks for Deep Reinforcement Learning
  • Visual Reinforcement Learning with Imagined Goals
  • Unsupervised Video Object Segmentation for Deep Reinforcement Learning
  • Total stochastic gradient algorithms and applications in reinforcement learning
  • Fighting Boredom in Recommender Systems with Linear Reinforcement Learning
  • Randomized Prior Functions for Deep Reinforcement Learning
  • Scalable Coordinated Exploration in Concurrent Reinforcement Learning
  • Multi-Agent Reinforcement Learning via Double Averaging Primal-Dual Optimization
  • Negotiable Reinforcement Learning for Pareto Optimal Sequential Decision-Making
  • Teaching Inverse Reinforcement Learners via Features and Demonstrations
  • Hierarchical Reinforcement Learning for Zero-shot Generalization with Subtask Dependencies
  • Lifelong Inverse Reinforcement Learning
  • Multiple-Step Greedy Policies in Approximate and Online Reinforcement Learning
  • Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models
  • Sample-Efficient Reinforcement Learning with Stochastic Ensemble Value Expansion

Originally published: 2019-05-18