PyTorch implementation of "World Models"

Paper: Ha and Schmidhuber, "World Models", 2018. https://doi.org/10.5281/zenodo.1207631

Prerequisites

The implementation is based on Python 3 and PyTorch; see the PyTorch website for installation instructions. The remaining requirements are listed in the requirements file. To install them, run:

pip3 install -r requirements.txt

Running the worldmodels

The model is composed of three parts:

A Variational Auto-Encoder (VAE), whose task is to compress the input images into a compact latent representation.
A Mixture-Density Recurrent Network (MDN-RNN), trained to predict the latent encoding of the next frame given past latent encodings and actions.
A linear Controller (C), which takes as input both the latent encoding of the current frame and the hidden state of the MDN-RNN (computed from past latents and actions), and outputs an action. It is trained to maximize the cumulative reward using the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) from the cma Python package.
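
As a rough illustration of the controller described above, the sketch below computes an action as a single linear map over the concatenation of the latent vector and the MDN-RNN hidden state. It uses plain Python lists so that it runs without dependencies. The dimensions (32 latent, 256 hidden, 3 actions) follow the CarRacing setup in the paper; the class name `LinearController` and its methods are hypothetical and are not taken from this repository.

```python
import random

LATENT_SIZE = 32    # size of the VAE latent vector z (CarRacing setup in the paper)
HIDDEN_SIZE = 256   # size of the MDN-RNN hidden state h
ACTION_SIZE = 3     # steering, gas, brake


class LinearController:
    """Hypothetical linear controller: action = W @ [z, h] + b."""

    def __init__(self, rng):
        in_size = LATENT_SIZE + HIDDEN_SIZE
        # One row of weights per action dimension.
        self.weights = [[rng.gauss(0.0, 0.1) for _ in range(in_size)]
                        for _ in range(ACTION_SIZE)]
        self.biases = [0.0] * ACTION_SIZE

    def num_parameters(self):
        return sum(len(row) for row in self.weights) + len(self.biases)

    def act(self, latent, hidden):
        # Concatenate the latent encoding and the hidden state, then apply the linear map.
        features = list(latent) + list(hidden)
        return [sum(w * x for w, x in zip(row, features)) + b
                for row, b in zip(self.weights, self.biases)]


rng = random.Random(0)
controller = LinearController(rng)
latent = [rng.gauss(0.0, 1.0) for _ in range(LATENT_SIZE)]
hidden = [0.0] * HIDDEN_SIZE
action = controller.act(latent, hidden)
print(len(action), controller.num_parameters())  # 3 867
```

With these sizes the controller has only 867 parameters, which is small enough for a gradient-free optimizer such as CMA-ES to handle.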

In this code, the three components are trained separately, using the scripts trainvae.py, trainmdrnn.py, and traincontroller.py.

The training scripts take the following arguments:

--logdir : The directory in which the models are stored. If the specified logdir already exists, the script loads the existing model and continues training.
--noreload : Add this option to overwrite a model in logdir instead of reloading it.
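
The interaction between the two options can be sketched with `argparse` as follows. This is a minimal illustration, not the repository's actual argument handling: the function names and the `checkpoint.tar` file name are placeholders, and the real scripts may accept additional options.

```python
import argparse
from pathlib import Path


def parse_training_args(argv=None):
    parser = argparse.ArgumentParser(description="Hypothetical training entry point")
    parser.add_argument("--logdir", type=Path, required=True,
                        help="Directory in which models are stored")
    parser.add_argument("--noreload", action="store_true",
                        help="Start from scratch instead of reloading a saved model")
    return parser.parse_args(argv)


def should_reload(args, checkpoint_name="checkpoint.tar"):
    """Reload only if a checkpoint exists and --noreload was not given."""
    # checkpoint_name is a placeholder; the real file layout may differ.
    return (args.logdir / checkpoint_name).exists() and not args.noreload


args = parse_training_args(["--logdir", "exp_dir", "--noreload"])
print(should_reload(args))  # False, because --noreload was passed
```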

1. Data generation

Before launching the VAE and MDN-RNN training scripts, you need to generate a dataset of random rollouts and place it in the datasets/carracing folder.

Data generation is handled through the data/generation_script.py script, e.g.

python data/generation_script.py --rollouts 1000 --dir datasets/carracing --threads 8

Rollouts are generated using a Brownian random policy rather than the white-noise random action_space.sample() policy from gym, which yields more consistent rollouts.
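
To make the difference concrete, the sketch below contrasts the two kinds of policy on a single action dimension. It is an illustration under assumptions, not the repository's generation code: the function names, the noise scale `step_scale`, and the clipping to [-1, 1] are all chosen for the example.

```python
import random


def white_noise_actions(num_steps, rng, low=-1.0, high=1.0):
    """Independent uniform samples at every step, similar in spirit to action_space.sample()."""
    return [rng.uniform(low, high) for _ in range(num_steps)]


def brownian_actions(num_steps, rng, step_scale=0.1, low=-1.0, high=1.0):
    """Random walk: each action is the previous one plus small Gaussian noise, clipped to bounds."""
    actions = [0.0]
    for _ in range(num_steps - 1):
        proposal = actions[-1] + rng.gauss(0.0, step_scale)
        actions.append(min(high, max(low, proposal)))
    return actions


def mean_abs_change(actions):
    """Average absolute difference between consecutive actions."""
    diffs = [abs(b - a) for a, b in zip(actions, actions[1:])]
    return sum(diffs) / len(diffs)


rng = random.Random(0)
white = white_noise_actions(1000, rng)
brown = brownian_actions(1000, rng)
print(mean_abs_change(white), mean_abs_change(brown))
```

Consecutive Brownian actions change far less from step to step than white-noise actions, so the agent holds a steering or throttle direction long enough to produce a coherent trajectory.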

2. Training the VAE

The VAE is trained using the trainvae.py file, e.g.

python trainvae.py --logdir exp_dir

3. Training the MDN-RNN

The MDN-RNN is trained using the trainmdrnn.py file, e.g.

python trainmdrnn.py --logdir exp_dir

A VAE must have been trained in the same exp_dir for this script to work.
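
A dependency of this kind can be checked up front with a few lines of Python. The sketch below is hypothetical: the helper name `require_trained_vae` and the relative path `vae/best.tar` are assumptions made for illustration, and the actual checkpoint layout inside exp_dir may differ.

```python
from pathlib import Path


def require_trained_vae(logdir, vae_checkpoint="vae/best.tar"):
    """Return the VAE checkpoint path, or raise if no VAE has been trained in logdir."""
    # "vae/best.tar" is a hypothetical relative path used only for illustration.
    path = Path(logdir) / vae_checkpoint
    if not path.is_file():
        raise FileNotFoundError(
            f"No VAE checkpoint at {path}; run trainvae.py --logdir {logdir} first."
        )
    return path


try:
    require_trained_vae("directory_without_a_vae")
except FileNotFoundError as error:
    print(error)
```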

4. Training and testing the Controller

Finally, the controller is trained using CMA-ES, e.g.

python traincontroller.py --logdir exp_dir
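
Controller training follows a sample, evaluate, update loop. The sketch below is not CMA-ES: it is a much simpler isotropic Gaussian evolution strategy with a fixed step size, included only to illustrate that loop without extra dependencies. The toy reward stands in for the cumulative reward of a rollout, and all names and hyperparameters are assumptions made for the example.

```python
import random


def evolve(reward_fn, initial_params, rng, sigma=0.3, population=20, elite=5, generations=50):
    """Simple isotropic Gaussian evolution strategy (NOT CMA-ES) that maximizes reward_fn."""
    mean = list(initial_params)
    for _ in range(generations):
        # Sample candidate parameter vectors around the current mean.
        candidates = [[m + rng.gauss(0.0, sigma) for m in mean] for _ in range(population)]
        # Rank candidates by reward; in the real setting each evaluation would be a rollout.
        ranked = sorted(candidates, key=reward_fn, reverse=True)
        # Move the mean to the average of the best candidates.
        best = ranked[:elite]
        mean = [sum(values) / elite for values in zip(*best)]
    return mean


def toy_reward(params, target=(1.0, -2.0, 0.5)):
    """Stand-in for cumulative reward: higher is better, maximal at `target`."""
    return -sum((p - t) ** 2 for p, t in zip(params, target))


rng = random.Random(0)
start = [0.0, 0.0, 0.0]
solution = evolve(toy_reward, start, rng)
print(toy_reward(start), toy_reward(solution))
```

CMA-ES additionally adapts the step size and a full covariance matrix of the sampling distribution, which this simplified version leaves out.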

You can test the resulting policy with test_controller.py, e.g.

python test_controller.py --logdir exp_dir

Authors

Corentin Tallec - ctallec
Léonard Blier - leonardblier
Diviyan Kalainathan - diviyan-kalainathan

License

This project is licensed under the MIT License; see the LICENSE.md file for details.
