GitHub - huggingface/accelerate: A simple way to train and use PyTorch models with multi-GPU, TPU, mixed-precision
Source: https://github.com/huggingface/accelerate
Run your *raw* PyTorch training script on any kind of device
Easy to integrate
Accelerate was created for PyTorch users who like to write the training loop of PyTorch models but are reluctant to write and maintain the boilerplate code needed to use multi-GPUs/TPU/fp16.
Accelerate abstracts exactly and only the boilerplate code related to multi-GPUs/TPU/fp16 and leaves the rest of your code unchanged.
Here is an example:
```diff
  import torch
  import torch.nn.functional as F
  from datasets import load_dataset

+ from accelerate import Accelerator
+ accelerator = Accelerator()

- device = 'cpu'
+ device = accelerator.device

  model = torch.nn.Transformer().to(device)
  optimizer = torch.optim.Adam(model.parameters())

  dataset = load_dataset('my_dataset')
  data = torch.utils.data.DataLoader(dataset, shuffle=True)

+ model, optimizer, data = accelerator.prepare(model, optimizer, data)

  model.train()
  for epoch in range(10):
      for source, targets in data:
          source = source.to(device)
          targets = targets.to(device)

          optimizer.zero_grad()

          output = model(source)
          loss = F.cross_entropy(output, targets)

-         loss.backward()
+         accelerator.backward(loss)

          optimizer.step()
```
As you can see in this example, adding five lines to any standard PyTorch training script lets you run it on any kind of single or distributed node setting (single CPU, single GPU, multi-GPU, and TPU), with or without mixed precision (fp16).
In particular, the same code can then be run without modification on your local machine for debugging or in your training environment.
Accelerate can even handle the device placement for you (which requires a few more changes to your code, but is safer in general), so you can simplify your training loop further:
```diff
  import torch
  import torch.nn.functional as F
  from datasets import load_dataset

+ from accelerate import Accelerator
- device = 'cpu'
+ accelerator = Accelerator()

- model = torch.nn.Transformer().to(device)
+ model = torch.nn.Transformer()
  optimizer = torch.optim.Adam(model.parameters())

  dataset = load_dataset('my_dataset')
  data = torch.utils.data.DataLoader(dataset, shuffle=True)

+ model, optimizer, data = accelerator.prepare(model, optimizer, data)

  model.train()
  for epoch in range(10):
      for source, targets in data:
-         source = source.to(device)
-         targets = targets.to(device)

          optimizer.zero_grad()

          output = model(source)
          loss = F.cross_entropy(output, targets)

-         loss.backward()
+         accelerator.backward(loss)

          optimizer.step()
```
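To make the end state concrete, here is a minimal, self-contained sketch of the fully converted loop. The synthetic dataset, the `nn.Linear` stand-in model, and the batch size are illustrative choices, not part of the original example:

```python
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator

accelerator = Accelerator()

# Toy classification data so the script runs as-is:
# 128 samples, 16 features, 4 classes (illustrative only).
inputs = torch.randn(128, 16)
targets = torch.randint(0, 4, (128,))
data = DataLoader(TensorDataset(inputs, targets), batch_size=32, shuffle=True)

model = torch.nn.Linear(16, 4)  # stand-in for the Transformer above
optimizer = torch.optim.Adam(model.parameters())

# prepare() wraps the model, optimizer, and dataloader for the current
# setup (CPU, single/multi-GPU, or TPU) and moves them to the right device.
model, optimizer, data = accelerator.prepare(model, optimizer, data)

model.train()
for epoch in range(10):
    for source, target in data:
        optimizer.zero_grad()
        output = model(source)
        loss = F.cross_entropy(output, target)
        accelerator.backward(loss)  # replaces loss.backward()
        optimizer.step()
```

Note that no `.to(device)` calls remain: once everything has passed through `prepare`, device placement is handled for you.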
Accelerate also provides an optional CLI tool that allows you to quickly configure and test your training environment before launching the scripts. No need to remember how to use torch.distributed.launch or to write a specific launcher for TPU training!
On your machine(s) just run:

```bash
accelerate config
```

and answer the questions asked. This will generate a config file that will be used automatically to set the proper default options when you run

```bash
accelerate launch my_script.py --args_to_my_script
```
For instance, here is how you would run the GLUE example on the MRPC task (from the root of the repo):
```bash
accelerate launch examples/nlp_example.py
```
Why should I use Accelerate?
You should use Accelerate when you want to easily run your training scripts in a distributed environment without having to renounce full control over your training loop. This is not a high-level framework above PyTorch: it is just a thin wrapper, so you don't have to learn a new library. In fact, the whole API of Accelerate is in one class, the `Accelerator` object.
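To give a feel for that one-class surface, here is a short sketch using a few `Accelerator` members beyond `prepare` and `backward` (a non-exhaustive selection; see the documentation for the full API):

```python
from accelerate import Accelerator

accelerator = Accelerator()

# The same object exposes the distributed context your script runs in.
accelerator.print(f"training on {accelerator.device}")  # prints once, not once per process

if accelerator.is_main_process:
    # A typical place for work you want done exactly once,
    # e.g. logging or saving checkpoints.
    pass

accelerator.wait_for_everyone()  # synchronization barrier across processes
```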
Why shouldn't I use Accelerate?
You shouldn't use Accelerate if you don't want to write a training loop yourself. There are plenty of high-level libraries above PyTorch that will offer you that; Accelerate is not one of them.
Installation
This repository is tested on Python 3.6+ and PyTorch 1.4.0+.
First, create a virtual environment with the version of Python you're going to use and activate it.
Next, you will need to install PyTorch: refer to the official installation page for the specific install command for your platform. Accelerate can then be installed using pip as follows:
```bash
pip install accelerate
```
Supported integrations
- CPU only
- single GPU
- multi-GPU on one node (machine)
- multi-GPU on several nodes (machines)
- FP16 with native AMP (apex on the roadmap); see the sketch after this list for enabling fp16 from code
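As a rough sketch of opting into fp16 from code: recent accelerate versions accept a `mixed_precision` argument on the constructor, while older releases configured this through the file written by `accelerate config` (or a constructor flag), so treat the exact spelling below as version-dependent:

```python
from accelerate import Accelerator

# Assumes a recent accelerate version; older releases set this up via
# `accelerate config` instead of a constructor argument.
accelerator = Accelerator(mixed_precision="fp16")
# prepare()/backward() then handle autocast and gradient scaling for you.
```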