
DQN Adventure: from Zero to State of the Art


This is an easy-to-follow, step-by-step Deep Q-Learning tutorial with clean, readable code.

The deep reinforcement learning community has made several independent improvements to the DQN algorithm. This tutorial presents the latest extensions to the DQN algorithm in the following order (a minimal DQN sketch follows the list):

  1. Playing Atari with Deep Reinforcement Learning [arxiv] [code]
  2. Deep Reinforcement Learning with Double Q-learning [arxiv] [code]
  3. Dueling Network Architectures for Deep Reinforcement Learning [arxiv] [code]
  4. Prioritized Experience Replay [arxiv] [code]
  5. Noisy Networks for Exploration [arxiv] [code]
  6. A Distributional Perspective on Reinforcement Learning [arxiv] [code]
  7. Rainbow: Combining Improvements in Deep Reinforcement Learning [arxiv] [code]
  8. Distributional Reinforcement Learning with Quantile Regression [arxiv] [code]
  9. Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation [arxiv] [code]
  10. Neural Episodic Control [arxiv] [code]
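
Before diving into the papers, it may help to see the core pieces every extension builds on. The snippet below is a minimal, illustrative sketch of a DQN value network and its TD target in PyTorch; the layer sizes and the CartPole-style input/action dimensions are assumptions for illustration, not the exact code from this repository.

```python
import torch
import torch.nn as nn

# Minimal DQN value network sketch. The sizes here are illustrative
# (CartPole-like state/action dimensions), not the repo's exact code.
class DQN(nn.Module):
    def __init__(self, num_inputs=4, num_actions=2):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(num_inputs, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, num_actions),
        )

    def forward(self, x):
        return self.layers(x)

def td_loss(model, target_model, batch, gamma=0.99):
    # batch: tensors of states, actions (long), rewards, next_states, dones (float)
    state, action, reward, next_state, done = batch
    q_values = model(state).gather(1, action.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        # Vanilla DQN target; Double DQN (paper 2) would instead select the
        # argmax action with `model` and evaluate it with `target_model`.
        next_q = target_model(next_state).max(1)[0]
        expected = reward + gamma * next_q * (1 - done)
    return nn.functional.mse_loss(q_values, expected)
```

Each later paper modifies one of these pieces: the network architecture (Dueling, Noisy), the target (Double, Distributional, Quantile), or the replay buffer (Prioritized).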

Environments

At the very beginning I recommend using small test problems so you can run experiments quickly. Then you can move on to environments with large observation spaces (a minimal setup sketch follows the list below).

  • CartPole - a classic RL environment that can be solved even on a single CPU
  • Atari Pong - the easiest Atari environment; it takes ~1 million frames to converge, compared with other Atari games that take > 40 million
  • Other Atari games - adjust the hyperparameters: target network update frequency = 10K, replay buffer size = 1M
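
As a starting point, here is a minimal, illustrative environment loop using the classic Gym API (the `CartPole-v1` and `PongNoFrameskip-v4` IDs are standard Gym environments; the Atari wrappers used in practice, such as frame stacking and reward clipping, are omitted here):

```python
import gym

# Classic Gym API (pre-0.26): reset() returns the observation and
# step() returns a 4-tuple. Newer Gym/Gymnasium versions differ.
env = gym.make("CartPole-v1")           # small test problem, fast to iterate on
# env = gym.make("PongNoFrameskip-v4")  # easiest Atari game, ~1M frames to converge

state = env.reset()
done = False
while not done:
    action = env.action_space.sample()  # random policy as a placeholder for the agent
    state, reward, done, info = env.step(action)
```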

If you get stuck…

  • First, remember that you are not stuck unless you have spent more than a week on a single algorithm. It is perfectly normal not to have all the required mathematics and CS background up front. For example, you will need the fundamentals of measure theory and statistics (especially the Wasserstein metric and quantile regression), statistical inference (importance sampling), and data structures (Segment Tree and k-d Tree); a small segment tree sketch follows this list.
  • Carefully go through the paper and try to see what problem the authors are solving. First understand the high-level idea of the approach, then read the code while skipping the proofs, and only after that go over the mathematical details and proofs.
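
For reference, below is a minimal, illustrative sketch of the sum segment tree behind proportional prioritized sampling (paper 4). It assumes a power-of-two capacity and is not this repository's implementation.

```python
import random

# Sum segment tree sketch for proportional prioritized sampling.
# Assumes `capacity` is a power of two; leaves live at [capacity, 2*capacity).
class SumTree:
    def __init__(self, capacity):
        self.capacity = capacity
        self.tree = [0.0] * (2 * capacity)  # internal node i stores tree[2i] + tree[2i+1]

    def update(self, idx, priority):
        # Set the priority of leaf `idx` and propagate sums up to the root.
        pos = idx + self.capacity
        self.tree[pos] = priority
        pos //= 2
        while pos >= 1:
            self.tree[pos] = self.tree[2 * pos] + self.tree[2 * pos + 1]
            pos //= 2

    def sample(self):
        # Draw a leaf index with probability proportional to its priority.
        mass = random.uniform(0.0, self.tree[1])  # tree[1] is the total priority
        pos = 1
        while pos < self.capacity:
            left = 2 * pos
            if mass <= self.tree[left]:
                pos = left
            else:
                mass -= self.tree[left]
                pos = left + 1
        return pos - self.capacity
```

The importance-sampling weights from the paper then correct for the non-uniform sampling probabilities that this tree produces.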

Best RL courses

  • David Silver's course link
  • Berkeley deep RL link
  • Practical RL link
