Playing Montezuma's Revenge with RND (video included)

Environment: MontezumaRevengeNoFrameskip-v4

https://github.com/openai/random-network-distillation

https://blog.openai.com/reinforcement-learning-with-prediction-based-rewards/

Our team ran a new experiment on Montezuma's Revenge with the RND model, and the agent successfully reached 17 rooms. Watching the agent collect keys and use the sword makes you question what the real definition of intelligence is.
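For readers new to RND, here is a minimal NumPy sketch of the idea described in the OpenAI post linked above: the exploration bonus is the error of a trained predictor against a fixed, randomly initialized target network. This is not the code from the repository. The linear maps standing in for the two networks, the sizes, and all names (`W_TARGET`, `intrinsic_reward`, `train_predictor`) are assumptions made for illustration only.

```python
import numpy as np

OBS_DIM, FEAT_DIM = 16, 8  # hypothetical sizes, for illustration only

rng = np.random.default_rng(0)
# Fixed random target "network" (a linear map here, by assumption); never trained.
W_TARGET = rng.normal(size=(OBS_DIM, FEAT_DIM))


def intrinsic_reward(obs, w_pred):
    """Per-observation mean squared error between predictor and target features."""
    return np.mean((obs @ w_pred - obs @ W_TARGET) ** 2, axis=1)


def train_predictor(obs, w_pred, lr=0.1, steps=2000):
    """Fit the predictor to the target on visited observations by gradient descent."""
    for _ in range(steps):
        err = obs @ w_pred - obs @ W_TARGET
        grad = 2.0 * obs.T @ err / err.size  # gradient of the mean squared error
        w_pred = w_pred - lr * grad
    return w_pred


visited = rng.normal(size=(4, OBS_DIM))  # states the agent has seen often
novel = rng.normal(size=(1, OBS_DIM))    # a state it has not seen
w_init = rng.normal(size=(OBS_DIM, FEAT_DIM))
w_trained = train_predictor(visited, w_init)

print("visited bonus before training:", intrinsic_reward(visited, w_init).mean())
print("visited bonus after training: ", intrinsic_reward(visited, w_trained).mean())
print("novel bonus after training:   ", intrinsic_reward(novel, w_trained).mean())
```

After training, the bonus on the visited states should shrink toward zero while the bonus on the unseen state stays large, which is what pushes the agent toward rooms it has not explored yet.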

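Run replayer.py on the log directory, which by default is under /tmp, to bring up the training record shown below. It can plot the record, show the video, or save the video.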

The image above is actually a video, but only a single frame is shown here.

Demo

Start training

Originally published on the WeChat official account CreateAMind (createamind)

Original publication date: 2019-02-01
