CreateAMind

544 articles
32 subscribers

All articles

用户1908973

SQN with Lunar landing challenge

Our team combined the SAC algorithm and the DQN method to give a useful way to solve the...

691
用户1908973

Teaching an agent how to walk with the SAC algorithm

https://github.com/rail-berkeley/softlearning: training for about ten hours with 24 ...

704
用户1908973

Randomized Prior Functions: Bayesian exploration, code and videos

https://sites.google.com/view/randomized-prior-nips-2018/

381
用户1908973

RND notes

RND: https://blog.openai.com/reinforcement-learning-with-prediction-based-reward...

883
用户1908973

Inverse reinforcement learning: learning a prior over human intent

LEARNING A PRIOR OVER INTENT VIA META-INVERSE REINFORCEMENT LEARNING

792
用户1908973

Transfer learning and exploration with the information bottleneck

Transfer and Exploration via the Information Bottleneck

642
用户1908973

Transfer learning and exploration with the information bottleneck

Transfer and Exploration via the Information Bottleneck

511
用户1908973

A Theory of State Abstraction for Reinforcement Learning

A Theory of State Abstraction for Reinforcement Learning

431
用户1908973

State Abstraction as Compression in Apprenticeship Learning

State Abstraction as Compression in Apprenticeship Learning https://github.com/d...

783
用户1908973

SQN algorithm results and code: Breakout-ram-v4 (Breakout)

Now for the results on LunarLander-v2 (also a fairly simple task...): AverageEpRet just won't go above 300... :(

851
用户1908973

Demo of our current work

Our dream is to create a safe driving system that works well under all circumstances,...

973
用户1908973

PPO-trained CARLA demo and method

Our dream is to create a safe driving system that works well under all circumstances,...

582
用户1908973

Self-discipline, sharing, learning: content shared from the check-in group

793
用户1908973

Motion Selective Prediction for Video Frame Synthesis

https://www.arxiv-vanity.com/papers/1812.10157/

873
用户1908973

FAVAE: Sequence Disentanglement using Information Bottleneck

FAVAE: Sequence Disentanglement using Information Bottleneck Principle

381
用户1908973

PlaNet code: comments and walkthrough

https://github.com/JingbinLiu/planet_A/commits/master

692
用户1908973

PlaNet training process and debugging workflow: study notes

https://github.com/createamind/Planet/issues/3

793
用户1908973

Motion Selective Prediction for Video Frame Synthesis

https://www.arxiv-vanity.com/papers/1812.10157/

772
用户1908973

CARLA Challenge competition

1011
用户1908973

Tencent paper: imitation learning

3. Exponentially weighted imitation learning from batched historical data (Exponentially Weighted Imitation Learning for Batched H...)

972
