GitHub - uber-research/ape-x: This repo replicates the results Horgan et al obta...

6 years ago

source link: https://github.com/uber-research/ape-x


Replication of Ape-X (Distributed Prioritized Experience Replay)

This repo replicates the results obtained by Horgan et al. in:

[1] Distributed Prioritized Experience Replay

Our code is based on OpenAI Baselines. The original code and related paper from OpenAI can be found here. Their implementation of DQN was modified to use TensorFlow custom ops.

Although Ape-X was originally designed as a distributed algorithm, this implementation is meant to maximize throughput on a single machine. It is optimized for 2 GPUs (data gathering and optimization) but could be modified to use only one. With 2 GPUs and 20 to 40 CPUs, you should be able to achieve human median performance in about 2 hours.
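For readers new to the idea behind [1], here is a minimal sketch of proportional prioritized sampling in plain Python. It is not the code used in this repo (which relies on TensorFlow custom ops); the class name `PrioritizedReplay`, its methods, and the default `alpha=0.6` are assumptions made for illustration, and importance-sampling weights are left out to keep it short.

```python
import random


class PrioritizedReplay:
    """Toy proportional prioritized replay buffer (illustrative only)."""

    def __init__(self, capacity, alpha=0.6):
        self.capacity = capacity
        self.alpha = alpha        # how strongly priorities skew sampling (assumed default)
        self.items = []           # stored transitions
        self.priorities = []      # one sampling weight per transition
        self.next_slot = 0        # ring-buffer write position

    def add(self, transition, priority):
        # In Ape-X the actor supplies an initial priority with each transition.
        weight = max(priority, 1e-6) ** self.alpha
        if len(self.items) < self.capacity:
            self.items.append(transition)
            self.priorities.append(weight)
        else:
            # Buffer is full: overwrite the oldest entry.
            self.items[self.next_slot] = transition
            self.priorities[self.next_slot] = weight
        self.next_slot = (self.next_slot + 1) % self.capacity

    def sample(self, batch_size, rng=random):
        # Draw indices with probability proportional to the stored weights.
        indices = rng.choices(range(len(self.items)),
                              weights=self.priorities, k=batch_size)
        return indices, [self.items[i] for i in indices]

    def update(self, indices, new_priorities):
        # The learner writes back fresh priorities (e.g. absolute TD errors).
        for i, priority in zip(indices, new_priorities):
            self.priorities[i] = max(priority, 1e-6) ** self.alpha
```

A learner loop would call `sample`, compute new TD errors for the batch, and pass them to `update` so that surprising transitions are replayed more often.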

How to run

Clone the repo

git clone https://github.com/uber-research/ape-x.git

Create a Python 3 virtual env

python3 -m venv env
. env/bin/activate

Install requirements

pip install tensorflow-gpu gym

Follow the setup under gym_tensorflow/README.md and run ./make to compile the custom ops.

Launch an experiment

python apex.py --env video_pinball --num-timesteps 1000000000 --logdir=/tmp/agent
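If you want to script launches, the command above can be assembled programmatically. The sketch below only builds the argument list from the flags shown in this README; the helper name `apex_command` and the per-environment log directory layout are hypothetical, and it does not start any process itself.

```python
import shlex


def apex_command(env, logdir, num_timesteps=1_000_000_000):
    """Build the argv for one apex.py run (hypothetical helper)."""
    return [
        "python", "apex.py",
        "--env", env,
        "--num-timesteps", str(num_timesteps),
        f"--logdir={logdir}",
    ]


# Only video_pinball is named in this README; add other games at your own risk.
for env in ["video_pinball"]:
    # One log directory per environment is an assumed convention.
    print(shlex.join(apex_command(env, f"/tmp/agent_{env}")))
```

The resulting list can be passed to `subprocess.run` once the custom ops are compiled and the virtual env is active.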

Monitor your results with TensorBoard

tensorboard --logdir=/tmp/agent

Visualize results

python demo.py --env video_pinball --logdir=/tmp/agent
