
GitHub - BIGBALLON/CIFAR-ZOO: PyTorch implementation of CNNs for CIFAR dataset (...

 5 years ago
source link: https://github.com/BIGBALLON/CIFAR-ZOO

README.md

Awesome CIFAR Zoo


This repository contains PyTorch code for multiple CNN architectures and improvement methods based on the following papers. We hope the implementation and results will be helpful for your research!

Requirements and Usage

Requirements

  • Python >= 3.5
  • PyTorch >= 0.4
  • TensorFlow/Tensorboard (if you want to use the tensorboard for visualization)
  • Other dependencies (pyyaml, easydict, tensorboardX)
pip install -r requirements.txt

Usage

Simply run one of the following commands to start training:

## 1 GPU for lenet
CUDA_VISIBLE_DEVICES=0 python -u train.py --work-path ./experiments/cifar10/lenet

## resume from ckpt
CUDA_VISIBLE_DEVICES=0 python -u train.py --work-path ./experiments/cifar10/lenet --resume

## 2 GPUs for preresnet1202
CUDA_VISIBLE_DEVICES=0,1 python -u train.py --work-path ./experiments/cifar10/preresnet1202

## 4 GPUs for densenet190bc
CUDA_VISIBLE_DEVICES=0,1,2,3 python -u train.py --work-path ./experiments/cifar10/densenet190bc

We use a YAML file, config.yaml, to store the parameters; check any of the files in ./experiments for more details.
You can view the training curves with TensorBoard: tensorboard --logdir path-to-event --port your-port.
The training log is written via logging; check log.txt in your work path.
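
As a rough illustration of the logging setup described above, here is a minimal sketch using only the Python standard library. The helper name get_logger, the log format, and the temporary directory are assumptions made for this example; they are not taken from the repository's train.py.

```python
import logging
import os
import tempfile


def get_logger(work_path, name="train"):
    """Return a logger that writes to <work_path>/log.txt and to the console.

    Hypothetical helper: the repository's own logging setup may differ.
    """
    os.makedirs(work_path, exist_ok=True)
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    if not logger.handlers:  # avoid attaching duplicate handlers on repeated calls
        fmt = logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        file_handler = logging.FileHandler(os.path.join(work_path, "log.txt"))
        file_handler.setFormatter(fmt)
        stream_handler = logging.StreamHandler()
        stream_handler.setFormatter(fmt)
        logger.addHandler(file_handler)
        logger.addHandler(stream_handler)
    return logger


if __name__ == "__main__":
    # A temporary directory stands in for a work path such as
    # ./experiments/cifar10/lenet.
    work_path = tempfile.mkdtemp()
    logger = get_logger(work_path)
    logger.info("epoch 1 finished")
    with open(os.path.join(work_path, "log.txt")) as f:
        print(f.read())
```

Writing to both a file and the console keeps the terminal output and log.txt in sync, which is convenient when resuming a run from a checkpoint.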

Results on CIFAR

Vanilla architectures

| architecture | GPUs | params | batch size | epoch | C10 test acc (%) | C100 test acc (%) |
|---|---|---|---|---|---|---|
| Lecun | 1 x 1080TI | 62K | 128 | 250 | 67.46 | 34.10 |
| alexnet | 1 x 1080TI | 2.4M | 128 | 250 | 75.56 | 38.67 |
| vgg19 | 1 x 1080TI | 20M | 128 | 250 | 93.00 | 72.07 |
| preresnet20 | 1 x 1080TI | 0.27M | 128 | 250 | 91.88 | 67.03 |
| preresnet110 | 1 x 1080TI | 1.7M | 128 | 250 | 94.24 | 72.96 |
| preresnet1202 | 2 x 1080TI | 19.4M | 128 | 250 | 94.74 | 75.28 |
| densenet100bc | 2 x 1080TI | 0.76M | 64 | 300 | 95.08 | 77.55 |
| densenet190bc | 4 x 1080TI | 25.6M | 64 | 300 | 96.11 | 82.59 |
| resnext29_16x64d | 2 x 1080TI | 68.1M | 128 | 300 | 95.94 | 83.18 |
| se_resnext29_16x64d | 2 x 1080TI | 68.6M | 128 | 300 | 96.15 | 83.65 |

With additional regularization

PS: the default data augmentation methods are RandomCrop + RandomHorizontalFlip + Normalize,
and a √ marks which additional method is used.

| architecture | epoch | cutout | mixup | C10 test acc (%) |
|---|---|---|---|---|
| preresnet20 | 250 | | | 91.88 |
| preresnet20 | 250 | √ | | 92.57 |
| preresnet20 | 250 | | √ | 92.71 |
| preresnet20 | 250 | √ | √ | 92.66 |
| preresnet110 | 250 | | | 94.24 |
| preresnet110 | 250 | √ | | 94.67 |
| preresnet110 | 250 | | √ | 94.94 |
| preresnet110 | 250 | √ | √ | 95.66 |
| se_resnext29_16x64d | 300 | | | 96.15 |
| se_resnext29_16x64d | 300 | √ | | 96.60 |
| se_resnext29_16x64d | 300 | | √ | 96.86 |
| se_resnext29_16x64d | 300 | √ | √ | 97.03 |
| shake_resnet26_2x64d | 1800 | | | 96.94 |
| shake_resnet26_2x64d | 1800 | √ | | 97.20 |
| shake_resnet26_2x64d | 1800 | | √ | 97.42 |
| shake_resnet26_2x64d | 1800 | √ | √ | 97.71 |

PS: shake_resnet26_2x64d achieved 97.71% test accuracy with cutout and mixup!!
It's cool, right?
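
For readers unfamiliar with the two extra regularizers, here is a minimal pure-Python sketch of the ideas behind cutout (zero out a random square patch of the input) and mixup (blend two samples and their labels with a Beta-distributed weight). It is only an illustration: the function names, the patch size `length`, and `alpha=1.0` are assumptions, and the repository's actual implementation works on PyTorch tensors and may differ in detail.

```python
import random


def cutout(image, length, rng=random):
    """Zero a length x length square centred on a random pixel.

    `image` is a 2-D list of floats (a single channel). The square is
    clipped at the image borders, so fewer pixels may be masked near edges.
    A new image is returned; the input is left unchanged.
    """
    h, w = len(image), len(image[0])
    cy, cx = rng.randrange(h), rng.randrange(w)
    y1, y2 = max(0, cy - length // 2), min(h, cy + length // 2)
    x1, x2 = max(0, cx - length // 2), min(w, cx + length // 2)
    return [
        [0.0 if (y1 <= y < y2 and x1 <= x < x2) else image[y][x] for x in range(w)]
        for y in range(h)
    ]


def mixup(x_a, y_a, x_b, y_b, alpha=1.0, rng=random):
    """Blend two flattened samples and their one-hot labels.

    lam is drawn from Beta(alpha, alpha); alpha=1.0 is an assumed default.
    """
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y, lam


if __name__ == "__main__":
    rng = random.Random(0)
    image = [[1.0] * 8 for _ in range(8)]
    for row in cutout(image, length=4, rng=rng):
        print(row)
    mixed_x, mixed_y, lam = mixup([1.0, 1.0], [1.0, 0.0], [0.0, 0.0], [0.0, 1.0], rng=rng)
    print(mixed_x, mixed_y, lam)
```

Blending the one-hot labels with the same weight as the inputs is equivalent to weighting the two cross-entropy losses by lam and 1 - lam.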

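With different LR schedulers

| architecture | epoch | step decay | cosine | htd(-6,3) | C10 test acc (%) |
|---|---|---|---|---|---|
| preresnet20 | 250 | √ | | | 91.88 |
| preresnet20 | 250 | | √ | | 92.13 |
| preresnet20 | 250 | | | √ | 92.44 |
| preresnet110 | 250 | √ | | | 94.24 |
| preresnet110 | 250 | | √ | | 94.48 |
| preresnet110 | 250 | | | √ | 94.82 |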
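
To make the three schedules concrete, here is a small sketch of each as a function of the epoch. The cosine and hyperbolic-tangent decay (htd) formulas follow their usual published forms; the step-decay milestones, `gamma`, and the base learning rate of 0.1 are hypothetical values chosen for illustration and are not read from the repository's config.yaml files.

```python
import math


def step_decay_lr(base_lr, epoch, milestones=(125, 190), gamma=0.1):
    """Multiply the rate by gamma at each milestone (milestones are hypothetical)."""
    passed = sum(1 for m in milestones if epoch >= m)
    return base_lr * gamma ** passed


def cosine_lr(base_lr, epoch, total_epochs):
    """Cosine annealing from base_lr down to 0 over total_epochs."""
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * epoch / total_epochs))


def htd_lr(base_lr, epoch, total_epochs, lower=-6.0, upper=3.0):
    """Hyperbolic-tangent decay; htd(-6,3) corresponds to lower=-6, upper=3."""
    t = lower + (upper - lower) * epoch / total_epochs
    return 0.5 * base_lr * (1.0 - math.tanh(t))


if __name__ == "__main__":
    total = 250
    for epoch in (0, 62, 125, 190, 249):
        print(
            epoch,
            round(step_decay_lr(0.1, epoch), 6),
            round(cosine_lr(0.1, epoch, total), 6),
            round(htd_lr(0.1, epoch, total), 6),
        )
```

With lower=-6 and upper=3, htd keeps the rate close to its initial value for much of training and then decays smoothly, and it does not reach exactly zero at the final epoch.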

Acknowledgments

Provided codes were adapted from

Feel free to contact me if you have any suggestions or questions. Issues are welcome,
and please create a PR if you find any bugs or want to contribute.

