

GitHub - geek-ai/MAgent: A Platform for Many-agent Reinforcement Learning
source link: https://github.com/geek-ai/MAgent

This project is no longer maintained
Please see https://github.com/Farama-Foundation/MAgent for a maintained fork of this project that's installable with pip.
MAgent
MAgent is a research platform for many-agent reinforcement learning. Unlike previous research platforms, which focus on reinforcement learning with a single agent or only a few agents, MAgent aims to support reinforcement learning research that scales from hundreds to millions of agents.
Requirement
MAgent supports Linux and OS X running Python 2.7 or Python 3. We make no assumptions about the structure of your agents. You can write rule-based algorithms or use deep learning frameworks.
Install on Linux
git clone git@github.com:geek-ai/MAgent.git
cd MAgent
sudo apt-get install cmake libboost-system-dev libjsoncpp-dev libwebsocketpp-dev
bash build.sh
export PYTHONPATH=$(pwd)/python:$PYTHONPATH
Install on OSX
Note: Homebrew has an issue installing websocketpp; please refer to issue #17.
git clone git@github.com:geek-ai/MAgent.git
cd MAgent
brew install cmake llvm boost@1.55
brew install jsoncpp argp-standalone
brew tap david-icracked/homebrew-websocketpp
brew install --HEAD david-icracked/websocketpp/websocketpp
brew link --force boost@1.55
bash build.sh
export PYTHONPATH=$(pwd)/python:$PYTHONPATH
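On either platform, the final `export PYTHONPATH` line is what makes the built package visible to Python. A minimal sketch for checking this from the same shell session follows. It assumes Python 3 and assumes the package under `python/` is named `magent`, which this README does not state; `is_importable` is a hypothetical helper, not part of MAgent.

```python
import importlib.util


def is_importable(module_name):
    """Return True if Python can locate `module_name` on the current sys.path."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Assumption: the package built under python/ is called "magent".
    name = "magent"
    if is_importable(name):
        print(name + " is importable")
    else:
        print(name + " not found; check that PYTHONPATH includes <repo>/python")
```

If the check fails in a new terminal, the likely cause is that the `export` only applies to the shell where it was run.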
Examples
Training each of the following tasks takes about 1 day on a GTX 1080 Ti card. If out-of-memory errors occur, reduce infer_batch_size in the models.
Note: Run the following examples from the root directory of this repo. Do not cd to examples/.
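To see why a smaller infer_batch_size can help with out-of-memory errors, here is a minimal sketch of chunked inference. It assumes infer_batch_size caps how many agent observations go through one forward pass; `infer_in_batches` and `toy_policy` are hypothetical names for illustration and are not MAgent's code.

```python
def infer_in_batches(observations, policy_fn, infer_batch_size):
    """Apply policy_fn to observations in chunks of at most infer_batch_size.

    Only one chunk is passed to policy_fn at a time, so the size of a single
    forward pass is bounded by infer_batch_size, not by the number of agents.
    """
    if infer_batch_size <= 0:
        raise ValueError("infer_batch_size must be positive")
    actions = []
    for start in range(0, len(observations), infer_batch_size):
        chunk = observations[start:start + infer_batch_size]
        actions.extend(policy_fn(chunk))
    return actions


def toy_policy(batch):
    # Hypothetical stand-in for a network forward pass: action = obs mod 3.
    return [obs % 3 for obs in batch]


if __name__ == "__main__":
    obs = list(range(10))  # ten agents, one integer observation each
    print(infer_in_batches(obs, toy_policy, infer_batch_size=4))
```

The trade-off is more forward passes per step: a smaller chunk lowers peak memory but usually slows inference.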
Train
These are the three examples shown in the demo video. Video files are saved every 10 rounds. You can use render to watch them.
- pursuit
python examples/train_pursuit.py --train
- gathering
python examples/train_gather.py --train
- battle
python examples/train_battle.py --train
An interactive game to play with battle agents. You will act as a general and dispatch your soldiers.
- battle game
python examples/show_battle_game.py
Baseline Algorithms
The baseline algorithms (parameter-sharing DQN, DRQN, and A2C) are implemented in TensorFlow and MXNet. DQN performs best in our gridworld settings, where a large number of agents share parameters.
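As a rough illustration of parameter sharing, the sketch below lets every agent act from and update one shared value function. It swaps in tabular Q-learning for the neural network to stay self-contained, so it is not the DQN baseline itself; `SharedQTable` and its parameters are hypothetical.

```python
import random


class SharedQTable:
    """One Q-table whose entries are read and updated by every agent."""

    def __init__(self, n_actions, lr=0.1, gamma=0.9):
        self.n_actions = n_actions
        self.lr = lr
        self.gamma = gamma
        self.q = {}  # maps state -> list of action values

    def values(self, state):
        # Create a zero-initialised row the first time a state is seen.
        return self.q.setdefault(state, [0.0] * self.n_actions)

    def act(self, state, eps, rng):
        # Epsilon-greedy action selection over the shared values.
        if rng.random() < eps:
            return rng.randrange(self.n_actions)
        vals = self.values(state)
        return vals.index(max(vals))

    def update(self, state, action, reward, next_state):
        # One-step Q-learning update applied to the shared table.
        target = reward + self.gamma * max(self.values(next_state))
        vals = self.values(state)
        vals[action] += self.lr * (target - vals[action])


if __name__ == "__main__":
    rng = random.Random(0)
    shared = SharedQTable(n_actions=2)
    states = ["a", "b", "a"]  # three agents, each with its own state
    actions = [shared.act(s, eps=0.1, rng=rng) for s in states]
    # Every agent's transition updates the same parameters.
    for s, a in zip(states, actions):
        shared.update(s, a, reward=1.0, next_state=s)
    print(shared.q)
```

Because all agents write to the same parameters, the parameter count does not grow with the number of agents, which is what makes this approach practical at many-agent scale.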
Acknowledgement
Many thanks to Tianqi Chen for the helpful suggestions.