
Dragonfly: Alibaba P2P file distribution system

 5 years ago
source link: https://www.tuicool.com/articles/hit/BRraiii

Dragonfly



Introduction

Dragonfly is an intelligent P2P-based file distribution system. It addresses low efficiency, low success rates, and wasted network bandwidth in large-scale file distribution scenarios such as application deployment, large-scale cache file distribution, data file distribution, and image distribution. At Alibaba, the system performs 2 billion transfers and distributes 3.4 PB of data every month, and it has become one of the company's most important pieces of infrastructure. Its reliability reaches 99.9999%.

Container technologies bring DevOps many benefits, but they also bring challenges, most notably the efficiency of image distribution, especially when many applications need their images distributed at the same time. Dragonfly works well with both Docker and Pouch, and it is compatible with any other container technology without any modification to the container engine.

It delivers up to 57 times the throughput of native Docker and saves up to 99.5% of the registry's outbound bandwidth.

Dragonfly makes it simple and cost-effective to set up, operate, and scale the distribution of any kind of file, image, or data.

Features

This project is the open source version of Dragonfly; more internal features will be opened up gradually.

  • P2P-based file distribution: Files are transferred with P2P technology, which makes full use of each peer's bandwidth to improve download efficiency and saves a great deal of cross-IDC bandwidth, especially costly cross-border bandwidth.
  • Non-invasive support for all kinds of container technologies: Dragonfly can seamlessly distribute images for various container engines.
  • Host-level speed limit: Many download tools (wget, curl) can only limit the rate of the current download task, whereas Dragonfly can also limit the rate for the entire host.
  • Passive CDN: The CDN mechanism avoids repeated remote downloads.
  • Strong consistency: Dragonfly guarantees that all downloaded copies of a file are consistent, even if users do not provide a checksum (such as MD5).
  • Disk protection and highly efficient I/O: Disk space is checked in advance, synchronization is delayed, file blocks are written in the best order, network reads are separated from disk writes, and so on.
  • High performance: The Cluster Manager is completely closed-loop, meaning it does not rely on any database or distributed cache, so it processes requests with extremely high performance.
  • Automatic isolation of faulty nodes: Dragonfly automatically isolates abnormal nodes (peers or Cluster Managers) to improve download stability.
  • No pressure on the file source: Generally, only a few Cluster Managers download the file from the source.
  • Support for standard HTTP headers: Authentication information can be submitted through HTTP headers.
  • Effective concurrency control for Registry Auth: Reduces the pressure on the Registry Auth Service.
  • Simple and easy to use: Very little configuration is needed.
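To make the host-level speed limit concrete, here is a minimal sketch of one common way to build such a limit: a single token bucket shared by every download task on the host. This is not Dragonfly's actual implementation; the class name `HostRateLimiter`, its parameters, and the injectable `clock` are all hypothetical and exist only for illustration.

```python
import threading
import time


class HostRateLimiter:
    """Hypothetical token bucket shared by all download tasks on one host."""

    def __init__(self, rate_bytes_per_sec, burst_bytes, clock=time.monotonic):
        self.rate = rate_bytes_per_sec    # refill speed of the shared budget
        self.capacity = burst_bytes       # maximum tokens the bucket can hold
        self.tokens = burst_bytes         # start with a full bucket
        self.clock = clock                # injectable so the sketch is testable
        self.last = clock()
        self.lock = threading.Lock()      # tasks may run in different threads

    def try_acquire(self, nbytes):
        """Return True if nbytes may be transferred now, else False."""
        with self.lock:
            now = self.clock()
            # Refill in proportion to elapsed time, never above capacity.
            refill = (now - self.last) * self.rate
            self.tokens = min(self.capacity, self.tokens + refill)
            self.last = now
            if nbytes <= self.tokens:
                self.tokens -= nbytes
                return True
            return False
```

Because every task draws from the same bucket, the total transfer rate of the host stays bounded no matter how many downloads run at once. A per-task limit, by contrast, lets the total grow with the number of tasks.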

Comparison

Test Environment

  • Dragonfly server: 2 × (24 cores, 64 GB, 2000 Mb/s)
  • File source server: 2 × (24 cores, 64 GB, 2000 Mb/s)
  • Client: 4 cores, 8 GB, 200 Mb/s
  • Target file size: 200 MB
  • Execution date: 2016-04-20


With Dragonfly, the average download time stays at around 12 seconds no matter how many clients download the file. With wget, the download time keeps increasing as clients are added, and once the number of wget clients reaches 1200, the file source crashes and can no longer serve any client.
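A rough back-of-the-envelope model helps explain the shape of this result. The sketch below uses only the bandwidth figures from the test environment and assumes that clients share the file source's uplink equally, that 1 MB equals 8 Mb, and that protocol overhead, disk I/O, and server failure can be ignored. It is an idealized illustration, not a reproduction of the measurements, and the function names are hypothetical.

```python
FILE_SIZE_MB = 200            # target file size from the test environment
CLIENT_LINK_MBPS = 200        # bandwidth of one client, in Mb/s
SOURCE_LINK_MBPS = 2 * 2000   # two file source servers at 2000 Mb/s each


def direct_download_seconds(clients):
    """Idealized time when every client pulls the whole file from the source."""
    file_megabits = FILE_SIZE_MB * 8
    # Assumption: the source uplink is split evenly, capped by the client link.
    per_client_mbps = min(CLIENT_LINK_MBPS, SOURCE_LINK_MBPS / clients)
    return file_megabits / per_client_mbps


def client_link_floor_seconds():
    """Lower bound for any method: the file cannot arrive faster than the client link allows."""
    return FILE_SIZE_MB * 8 / CLIENT_LINK_MBPS


if __name__ == "__main__":
    for n in (10, 100, 1200):
        print(n, "clients:", round(direct_download_seconds(n), 1), "s direct from source")
    print("client link floor:", client_link_floor_seconds(), "s")
```

Under these assumptions, direct downloads slow down linearly once the source uplink is saturated, while a P2P scheme that keeps most traffic between peers can stay close to the floor set by the client's own link. That is consistent with a flat download time of the order reported above.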

License

Dragonfly is available under the Apache 2.0 License.

Commercial Support

If you need commercial support for Dragonfly, please contact us for more information: 云效.

Dragonfly is already integrated with AliCloud Container Service. If you need commercial support for AliCloud Container Service, please contact us for more information: Container Service.

