5 Frameworks for Reinforcement Learning on Python

source link: https://towardsdatascience.com/5-frameworks-for-reinforcement-learning-on-python-1447fede2f18

Programming your own Reinforcement Learning implementation from scratch can be a lot of work, but you don’t need to: there are lots of great, easy and free frameworks to get you started in a few minutes.

You can save a lot of effort by re-using existing RL libraries [photo by Markus Spiske on Unsplash.]

There are lots of standard libraries for supervised and unsupervised machine learning, like Scikit-learn, XGBoost or even Tensorflow, that can get you started in no time, and you can find loads of support online. Sadly, for Reinforcement Learning (RL) this is not the case.

It is not that there are no frameworks; as a matter of fact, there are many frameworks for RL out there. The problem is that there is no standard yet, so online support for getting started, fixing a problem or customizing a solution is not easy to find. This is probably because, while RL is a very popular research topic, it is still in the early days of being implemented and used in industry.

But this doesn’t mean there are no great frameworks out there that can help you start using RL to solve almost any problem you like. Here is a list of frameworks I have come to know and use over time, with their pros and cons. I hope this gives you a quick overview of some of the RL frameworks currently available, so you can choose the one that best fits your needs.

Keras-RL

I have to admit that, of the whole list, this is my favorite. I believe it is by far the simplest-to-understand code implementation of several RL algorithms, including Deep Q-Learning (DQN), Double DQN, Deep Deterministic Policy Gradient (DDPG), Continuous DQN (CDQN or NAF), Cross-Entropy Method (CEM), Dueling DQN and SARSA. When I say “simplest-to-understand code”, I refer not to using it, but to customizing it and using it as a building block for your own project*. The Keras-RL github also contains some examples that you can use to get started in no time. It is built on Keras, of course, and runs on top of a Tensorflow backend.
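For a taste of the API, here is a lightly adapted sketch of the CartPole DQN example from the Keras-RL repository; the network size and hyperparameters are illustrative, not tuned:

# Sketch of a DQN agent on CartPole with Keras-RL, adapted from the
# examples in the Keras-RL repository; hyperparameters are illustrative.
import gym
from keras.models import Sequential
from keras.layers import Dense, Flatten
from keras.optimizers import Adam
from rl.agents.dqn import DQNAgent
from rl.policy import EpsGreedyQPolicy
from rl.memory import SequentialMemory

env = gym.make('CartPole-v1')
nb_actions = env.action_space.n

# Small feed-forward Q-network: observation in, one Q-value per action out.
model = Sequential()
model.add(Flatten(input_shape=(1,) + env.observation_space.shape))
model.add(Dense(16, activation='relu'))
model.add(Dense(16, activation='relu'))
model.add(Dense(nb_actions, activation='linear'))

# Experience replay buffer and epsilon-greedy exploration.
memory = SequentialMemory(limit=50000, window_length=1)
policy = EpsGreedyQPolicy()

dqn = DQNAgent(model=model, nb_actions=nb_actions, memory=memory,
               nb_steps_warmup=100, target_model_update=1e-2, policy=policy)
dqn.compile(Adam(lr=1e-3), metrics=['mae'])
dqn.fit(env, nb_steps=10000, visualize=False, verbose=1)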

Unfortunately, Keras-RL has not been well maintained for quite a while, and its official documentation is not the best. This has given rise to a fork of the project called Keras-RL2.

(*) What did I use this framework for? Well, I’m glad you asked (or was it me?). I have used this framework to create a customized Tutored DQN agent; you can read more about it here.

Keras-RL2

Keras-RL2 is a fork of Keras-RL and, as such, it supports the same agents and is just as easy to customize. The big change is that Keras-RL2 is better maintained and uses Tensorflow 2.1.0. Unfortunately, there is no dedicated documentation for this library, although the documentation for Keras-RL can easily be used for the fork too.

OpenAI Baselines

OpenAI Baselines is a set of high-quality implementations of RL algorithms by OpenAI, one of the leading companies in AI research and development, and in RL in particular. It was conceived so that researchers could easily compare their RL algorithms, using the state-of-the-art implementations from OpenAI as a baseline, hence the name. The framework contains implementations of many popular agents such as A2C, DDPG, DQN, PPO2 and TRPO.

[plots from the Stable Baselines benchmark.]
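Since Baselines is designed to be run rather than imported, training usually goes through its command-line runner; following the usage documented in the Baselines README (the environment and step count below are just examples):

# Train PPO2 on CartPole through the runner documented in the README;
# the flag values here are illustrative.
python -m baselines.run --alg=ppo2 --env=CartPole-v1 --num_timesteps=1e5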

On the downside, OpenAI Baselines is not well documented, even though there are lots of useful comments in the code. In addition, since it was developed to serve as a baseline rather than as a building block, the code is not so friendly if you want to customize or modify some of the agents for your own projects. In fact, the next framework is a fork of it that solves most of these issues.

Stable Baselines

[image from the Stable Baselines documentation.]

Stable Baselines is a fork of OpenAI Baselines, with a major structural refactoring and code cleanups. The changes listed on its official documentation site are the following:

  • Unified structure for all algorithms
  • PEP8 compliant (unified code style)
  • Documented functions and classes
  • More tests & more code coverage
  • Additional algorithms: SAC and TD3 (+ HER support for DQN, DDPG, SAC and TD3)

I have personally used Stable Baselines in the past, and I can confirm it is really well documented and easy to use. It is even possible to train an agent on OpenAI Gym environments with a one-liner:

from stable_baselines import PPO2

model = PPO2('MlpPolicy', 'CartPole-v1').learn(10000)
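Going a bit beyond the one-liner, the usual train/save/reload cycle looks like this; a sketch following the patterns in the Stable Baselines quickstart, with illustrative paths and step counts:

# Train, save, reload and run a PPO2 agent; a sketch following the
# Stable Baselines quickstart. Paths and step counts are illustrative.
import gym
from stable_baselines import PPO2

env = gym.make('CartPole-v1')
model = PPO2('MlpPolicy', env, verbose=1)
model.learn(total_timesteps=10000)
model.save('ppo2_cartpole')

# Reload the trained policy and act with it in the environment.
model = PPO2.load('ppo2_cartpole')
obs = env.reset()
for _ in range(1000):
    action, _states = model.predict(obs)
    obs, reward, done, info = env.step(action)
    if done:
        obs = env.reset()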

Acme

The Coyote has been using ACME for decades, way ahead of his time! [image from Comicbook And Beyond.]

Acme comes from DeepMind, probably the best-known company doing RL research. It was developed for building readable, efficient, research-oriented RL algorithms, and it contains implementations of several state-of-the-art agents such as D4PG, DQN, R2D2, R2D3 and more. Acme uses Tensorflow as its backend, and some of the agent implementations combine JAX with Tensorflow.

Acme was developed with re-usability in mind, so its design is modular and easy to customize. Its documentation is not abundant, but it is enough to give you a nice introduction to the library, and there are also some Jupyter notebook examples to get you started.
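The basic pattern, roughly following Acme’s quickstart notebook, looks like the sketch below; module paths and constructor arguments have shifted between versions, so treat it as an outline rather than a definitive recipe:

# Rough sketch of a DQN agent with Acme, loosely based on its quickstart
# notebook; module paths and arguments may vary between versions.
import gym
import sonnet as snt
import acme
from acme import specs, wrappers
from acme.agents.tf import dqn

# Acme expects dm_env-style environments, so the Gym env is wrapped.
environment = wrappers.GymWrapper(gym.make('CartPole-v1'))
environment = wrappers.SinglePrecisionWrapper(environment)
spec = specs.make_environment_spec(environment)

# Q-network built with Sonnet, which Acme's Tensorflow agents use.
network = snt.Sequential([
    snt.Flatten(),
    snt.nets.MLP([64, 64, spec.actions.num_values]),
])

agent = dqn.DQN(environment_spec=spec, network=network)

# The environment loop drives the agent/environment interaction.
loop = acme.EnvironmentLoop(environment, agent)
loop.run(num_episodes=100)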

Takeaways

All of the frameworks listed here are solid options for any RL project; which one to use depends on your preferences and on what exactly you want to do with it. To make each framework and its pros and cons easier to compare, I have put together the following summary:

Keras-RL — Github

Choices of RL algorithms: ☆☆☆

Documentation: ☆☆☆

Customization: ☆☆☆☆☆

Maintenance: ☆

Backend: Keras and Tensorflow 1.14.

Keras-RL2 — Github

Choices of RL algorithms: ☆☆☆

Documentation: Not available

Customization: ☆☆☆☆☆

Maintenance: ☆☆☆

Backend: Keras and Tensorflow 2.1.0.

OpenAI Baselines — Github

Choices of RL algorithms: ☆☆☆

Documentation: ☆☆

Customization: ☆☆

Maintenance: ☆☆☆

Backend: Tensorflow 1.14.

Stable Baselines — Github

Choices of RL algorithms: ☆☆☆☆

Documentation: ☆☆☆☆☆

Customization: ☆☆☆

Maintenance: ☆☆☆☆☆

Backend: Tensorflow 1.14.

Acme — Github

Choices of RL algorithms: ☆☆☆☆

Documentation: ☆☆☆

Customization: ☆☆☆☆

Maintenance: ☆☆☆☆☆

Backend: Tensorflow v2+ and JAX.

If you have already decided on a framework, all you need now is an environment. You can start with OpenAI Gym, which is already used in most of these frameworks’ examples, but if you want to try RL on other tasks such as trading stocks, networking or producing recommendations, you can find a comprehensive list of ready-to-use environments here.
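If you have never used Gym, its interface boils down to a handful of calls; here is a minimal random-agent loop, where any of the agents above would take the place of the sampled action:

# Minimal OpenAI Gym interaction loop with random actions, showing the
# interface that the frameworks above build on.
import gym

env = gym.make('CartPole-v1')
obs = env.reset()
for _ in range(200):
    action = env.action_space.sample()  # stand-in for a trained agent
    obs, reward, done, info = env.step(action)
    if done:
        obs = env.reset()
env.close()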

If you know of any other good RL framework, please let me know in the responses below! Thanks for reading! :)

