Testing nebullvm, the open-source AI accelerator, on TensorFlow, PyTorch and Hugging Face


You may have read on some blogs about nebullvm, the open-source library that optimizes AI models to make them faster at inference.

The real question is: does nebullvm actually achieve what it claims? Does it really deliver ~10x inference acceleration for deep learning models just by adding a few lines to your code?

Let’s test its full potential.

Introduction to the open-source library

Nebullvm takes an AI model as input and accelerates it by leveraging a technology called deep learning compilers. The library outputs an optimized version of the model that performs inference 5–20 times faster, where acceleration depends strongly on the input model and the hardware on which the optimization is performed.

As stated on GitHub, the goal of nebullvm is to let any developer benefit from deep learning (DL) compilers without having to spend tons of hours understanding, installing, testing and debugging this powerful technology.

Testing nebullvm on your models

We built 3 notebooks where the library can be tested on the most popular AI frameworks: TensorFlow, PyTorch and Hugging Face.

The notebooks will run locally on your hardware so you can get an idea of the performance you would achieve with nebullvm on your AI models.

Note that the first installation may take a few minutes, as the library also installs the deep learning compilers responsible for the optimization.
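To give an idea of what "a few lines of code" means in practice, here is a minimal sketch based on the PyTorch entry point shown in the project's early README (optimize_torch_model). The exact function name and arguments may differ between nebullvm versions, so treat this as illustrative and check the documentation on GitHub for the release you install.

```python
# First install the library (the first install may take a few minutes,
# since the deep learning compilers are installed as well):
#   pip install nebullvm

import torch
import torchvision.models as models

# Entry point as shown in the early README; name/signature may vary by version.
from nebullvm import optimize_torch_model

# Any PyTorch model works; ResNet-50 is used here as an example.
model = models.resnet50(pretrained=True)

# nebullvm tries the installed deep learning compilers on this machine and
# returns the fastest version of the model. batch_size and input_sizes
# describe the inference input; save_dir is where the optimized artifacts
# are stored (argument names assumed from the documentation of the time).
optimized_model = optimize_torch_model(
    model,
    batch_size=1,
    input_sizes=[(3, 224, 224)],
    save_dir=".",
)

# The optimized model is then used like a regular PyTorch module at inference time.
x = torch.randn(1, 3, 224, 224)
with torch.no_grad():
    y = optimized_model(x)
```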

Benchmarks

We have tested nebullvm on popular AI models and hardware from leading vendors.

  • Hardware: M1 Pro, NVIDIA T4, Intel Xeon, AMD EPYC
  • AI Models: EfficientNet, ResNet, SqueezeNet, BERT, GPT-2

The table below shows the response time in milliseconds (ms) of the non-optimized model and the optimized model for the various model-hardware pairings, averaged over 100 experiments. It also displays the speedup provided by nebullvm, where speedup is defined as the response time of the non-optimized model divided by the response time of the optimized model.
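The snippet below is not the authors' benchmarking code; it is only a rough sketch of how the average response time over 100 runs and the resulting speedup can be measured locally, reusing the model and optimized_model names from the sketch above.

```python
import time
import torch

def avg_latency_ms(model, x, n_runs=100):
    """Average forward-pass latency in milliseconds over n_runs."""
    with torch.no_grad():
        # A few warm-up runs so one-off initialization does not skew the numbers.
        for _ in range(10):
            model(x)
        start = time.perf_counter()
        for _ in range(n_runs):
            model(x)
        end = time.perf_counter()
    return (end - start) / n_runs * 1000

# model and optimized_model are assumed to be defined as in the previous sketch.
x = torch.randn(1, 3, 224, 224)
baseline_ms = avg_latency_ms(model, x)
optimized_ms = avg_latency_ms(optimized_model, x)

# Speedup: non-optimized response time divided by optimized response time.
speedup = baseline_ms / optimized_ms
print(f"{baseline_ms:.2f} ms -> {optimized_ms:.2f} ms ({speedup:.1f}x speedup)")
```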

Benchmarking of nebullvm on EfficientNet, ResNet, SqueezeNet, BERT, GPT-2. Source: Image by the author

At first glance, we can observe that the speedup varies greatly across hardware-model pairings. Overall, the library delivers strongly positive results, with most speedups ranging from 2x to over 30x.

To summarize, the results are:

  • Nebullvm provides positive acceleration to non-optimized AI models
  • These early results show poorer (yet positive) performance on Hugging Face models. Support for Hugging Face has just been released and improvements will be included in future versions
  • The library provides a ~2–3x boost on Intel and AMD hardware. This more modest gain is most likely due to the already highly optimized implementation of PyTorch for x86 devices
  • Nebullvm delivers extremely good performance on NVIDIA machines
  • The library also provides great performance on Apple M1 chips

Across all scenarios, nebullvm proves very useful for its ease of use, allowing you to take advantage of deep learning compilers without having to spend hours studying, testing and debugging this technology.

Remarks on pre-optimized models

Nebullvm is benchmarked against models that have not already been optimized with another accelerator. If you are already using a deep learning compiler on your models, such as Apache TVM, TensorRT or OpenVINO, it is unlikely that you will get a 5–20x speedup with nebullvm over your pre-optimized model.

Even in this case, however, nebullvm can still be of great help for its ease of use.

More about nebullvm

Full documentation on nebullvm is provided on GitHub. The main contributor to the library is Diego Fiori, with support from the GitHub community.

The library quickly grew to 800+ GitHub stars in its first month after launch, and it aims to keep expanding in performance and coverage. As mentioned on GitHub, the library aims to be:

  • Deep learning model agnostic. Nebullvm supports all the most popular architectures such as transformers, LSTMs, CNNs and FCNs.
  • Hardware agnostic. The library now works on most CPUs and GPUs and will soon support TPUs and other deep learning-specific ASICs.
  • Framework agnostic. Nebullvm supports the most widely used frameworks (PyTorch, TensorFlow and Hugging Face) and will soon support many more.
  • Secure. Everything runs locally on your machine.
  • Easy-to-use. It takes a few lines of code to install the library and optimize your models.
  • Leveraging the best deep learning compilers. There are many DL compilers that optimize the way your AI models run on your hardware, and it would take a developer countless hours to install and test them at every model deployment. The library does it for you!
