
A Pytorch Implementation of Detectron

[Image] Example output of *e2e_mask_rcnn-R-101-FPN_2x* using Detectron pretrained weights.

[Image] Corresponding example output from Detectron.

[Image] Example output of *e2e_keypoint_rcnn-R-50-FPN_s1x* using Detectron pretrained weights.

This code follows the implementation architecture of Detectron. Only part of its functionality is supported; check the Supported Network modules section below for more information.

With this code, you can...

  1. Train your model from scratch.
  2. Run inference using pretrained weight files (*.pkl) from Detectron.

This repository was originally built on jwyang/faster-rcnn.pytorch. However, after many modifications, the structure has changed a lot and is now more similar to Detectron. I deliberately made everything similar or identical to Detectron's implementation, so as to reproduce results directly from the official pretrained weight files.

This implementation has the following features:

  • It is pure Pytorch code, except for some CUDA code for custom operations.

  • It supports multi-image batch training.

  • It supports multi-GPU training.

  • It supports three pooling methods. Notice that only RoI Align is revised to match the implementation in Caffe2. So, use it.

  • It is memory efficient. For data batching, there are two techniques available to reduce memory usage: 1) Aspect grouping: group images with similar aspect ratios into a batch. 2) Aspect cropping: crop images that are too long. Aspect grouping is implemented in Detectron, so it's used by default. Aspect cropping is an idea from jwyang/faster-rcnn.pytorch, and it's not used by default. (See the sketch after this list for the aspect-grouping idea.)

    Besides that, I implement a customized nn.DataParallel module which enables different batch blob sizes on different GPUs. Check the My nn.DataParallel section for more details.
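
A minimal sketch of the aspect-grouping idea mentioned above (the function name and details are illustrative, not the repo's actual dataloader code):

def group_by_aspect_ratio(aspect_ratios, batch_size):
    """aspect_ratios: one w/h float per image."""
    # Sort image indices by aspect ratio so each batch holds images of
    # similar shape, minimizing the padding needed to batch them together.
    order = sorted(range(len(aspect_ratios)), key=lambda i: aspect_ratios[i])
    return [order[i:i + batch_size] for i in range(0, len(order), batch_size)]

# Wide (2.0, 1.9) and tall (0.5, 0.6) images end up in separate batches:
print(group_by_aspect_ratio([2.0, 0.5, 1.9, 0.6], batch_size=2))  # [[1, 3], [2, 0]]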

Supported Network modules

  • Backbone architecture:

    • ResNet series: ResNet50_conv4_body, ResNet50_conv5_body, ResNet101_Conv4_Body, ResNet101_Conv5_Body, ResNet152_Conv5_Body
    • FPN: fpn_ResNet50_conv5_body, fpn_ResNet50_conv5_P2only_body, fpn_ResNet101_conv5_body, fpn_ResNet101_conv5_P2only_body, fpn_ResNet152_conv5_body, fpn_ResNet152_conv5_P2only_body

    ResNeXt backbones are also implemented, but not yet tested.

  • Box head: ResNet_roi_conv5_head, roi_2mlp_head

  • Mask head: mask_rcnn_fcn_head_v0upshare, mask_rcnn_fcn_head_v0up, mask_rcnn_fcn_head_v1up4convs, mask_rcnn_fcn_head_v1up

  • Keypoints head: roi_pose_head_v1convX

NOTE: the naming is similar to the one used in Detectron. Just remove the prepended add_, if there is any.
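
For example, mapping a Detectron module name to this repo's convention is just a prefix strip (assuming, as the note above says, the names differ only by the add_ prefix):

def to_local_name(detectron_name):
    # Strip the leading 'add_' used by Detectron's model-builder functions.
    return detectron_name[4:] if detectron_name.startswith('add_') else detectron_name

print(to_local_name('add_fpn_ResNet50_conv5_body'))  # fpn_ResNet50_conv5_body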

Supported Datasets

Only COCO is supported for now. However, the whole dataset library implementation is almost identical to Detectron's, so it should be easy to add more datasets supported by Detectron.

Configuration Options

Architecture-specific configuration files are put under configs. The general configuration file lib/core/config.py has almost all the options, with the same default values as in Detectron, so it's straightforward to transform the architecture-specific configs from Detectron.

How to transform configuration files from Detectron

  1. Remove MODEL.NUM_CLASSES. It will be set during the initialization of JsonDataset.
  2. Remove TRAIN.WEIGHTS, TRAIN.DATASETS and TEST.DATASETS.
  3. For module type options (e.g. MODEL.CONV_BODY, FAST_RCNN.ROI_BOX_HEAD ...), remove add_ in the string if it exists.
  4. If you want to load ImageNet pretrained weights for the model, add RESNETS.IMAGENET_PRETRAINED_WEIGHTS pointing to the pretrained weight file. If not, set MODEL.LOAD_IMAGENET_PRETRAINED_WEIGHTS to False.
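
The steps above can also be applied programmatically. A hedged sketch (assuming a standard Detectron YAML config; only the keys named in the steps are handled):

import yaml

def convert_detectron_cfg(path_in, path_out):
    with open(path_in) as f:
        cfg = yaml.safe_load(f)
    # 1. NUM_CLASSES is set from the dataset during JsonDataset init.
    cfg.get('MODEL', {}).pop('NUM_CLASSES', None)
    # 2. Weights and datasets are given on the command line instead.
    for section, key in [('TRAIN', 'WEIGHTS'), ('TRAIN', 'DATASETS'), ('TEST', 'DATASETS')]:
        cfg.get(section, {}).pop(key, None)
    # 3. Strip the add_ prefix from module type options.
    for section, key in [('MODEL', 'CONV_BODY'), ('FAST_RCNN', 'ROI_BOX_HEAD')]:
        value = cfg.get(section, {}).get(key)
        if isinstance(value, str) and value.startswith('add_'):
            cfg[section][key] = value[len('add_'):]
    # 4. Done manually: add RESNETS.IMAGENET_PRETRAINED_WEIGHTS, or set
    #    MODEL.LOAD_IMAGENET_PRETRAINED_WEIGHTS to False.
    with open(path_out, 'w') as f:
        yaml.safe_dump(cfg, f)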

Some more details

Some options are not used because the corresponding functionality is not implemented yet, while others are not used because I implemented the program in a different way.

Here are some options that have no effect and are worth noticing:

  • SOLVER.LR_POLICY, SOLVER.MAX_ITER, SOLVER.STEPS, SOLVER.LRS: For now, the training policy is controlled by these command line arguments:

    • --epochs: How many epochs to train. One epoch means one pass through the whole training set. Defaults to 6.
    • --lr_decay_epochs: Epochs on which to decay the learning rate. Decay happens at the beginning of an epoch; epochs are 0-indexed. Defaults to [4, 5]. (See the sketch after this list.)

    For more command line arguments, please refer to python train_net.py --help

  • SOLVER.WARM_UP_ITERS, SOLVER.WARM_UP_FACTOR, SOLVER.WARM_UP_METHOD: Training warm up in the paper Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour is not implemented.

  • OUTPUT_DIR: Use the command line argument --output_base_dir to specify the output directory instead.
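
A sketch of the resulting epoch-based schedule (the decay factor 0.1 is an assumption; check lib/core/config.py for the actual value):

def lr_at_epoch(base_lr, epoch, lr_decay_epochs=(4, 5), gamma=0.1):
    # Decay is applied at the beginning of each listed (0-indexed) epoch.
    return base_lr * gamma ** sum(1 for e in lr_decay_epochs if epoch >= e)

for epoch in range(6):
    print(epoch, lr_at_epoch(0.01, epoch))  # 0.01 for epochs 0-3, then 0.001, 0.0001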

While some more options are provided:

  • MODEL.LOAD_IMAGENET_PRETRAINED_WEIGHTS = True: Whether to load ImageNet pretrained weights.
    • RESNETS.IMAGENET_PRETRAINED_WEIGHTS = '': Path to the pretrained residual network weights. If it starts with '/', it is treated as an absolute path; otherwise, as a path relative to ROOT_DIR.
  • TRAIN.ASPECT_CROPPING = False, TRAIN.ASPECT_HI = 2, TRAIN.ASPECT_LO = 0.5: Options for aspect cropping to restrict image aspect ratio range.
  • RPN.OUT_DIM_AS_IN_DIM = True, RPN.OUT_DIM = 512, RPN.CLS_ACTIVATION = 'sigmoid': The official implementation of RPN uses the same number of input and output feature channels and uses sigmoid as the activation function for fg/bg class prediction. In jwyang's implementation, the output channel number is fixed to 512 and softmax is used as the activation function. (See the sketch below.)
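
A hedged sketch of the RPN difference (shapes only; the channel counts are illustrative and this is not the repo's actual module code):

import torch
import torch.nn as nn
import torch.nn.functional as F

in_dim, num_anchors = 1024, 15
x = torch.randn(1, in_dim, 50, 50)

# Detectron-style: the RPN conv keeps in_dim channels; sigmoid gives one
# foreground probability per anchor.
rpn_conv = nn.Conv2d(in_dim, in_dim, 3, padding=1)   # OUT_DIM_AS_IN_DIM = True
cls_score = nn.Conv2d(in_dim, num_anchors, 1)
fg_prob = torch.sigmoid(cls_score(F.relu(rpn_conv(x))))            # (1, 15, 50, 50)

# jwyang-style: the RPN conv is fixed to 512 channels; softmax over (bg, fg).
rpn_conv2 = nn.Conv2d(in_dim, 512, 3, padding=1)     # RPN.OUT_DIM = 512
cls_score2 = nn.Conv2d(512, 2 * num_anchors, 1)
logits = cls_score2(F.relu(rpn_conv2(x))).view(1, 2, num_anchors, 50, 50)
fg_prob2 = F.softmax(logits, dim=1)[:, 1]                          # (1, 15, 50, 50)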

My nn.DataParallel

TBA

Getting Started

Clone the repo:

git clone https://github.com/roytseng-tw/mask-rcnn.pytorch.git

Requirements

Tested under python3.

  • python packages
    • pytorch==0.3.1 (cuda80, cudnn7.1.2)
    • torchvision==0.2.0
    • numpy
    • scipy
    • opencv
    • pyyaml
    • pycocotools — for COCO dataset, also available from pip.
    • tensorboardX — for logging the losses in Tensorboard
  • An NVIDIA GPU and CUDA 8.0 or higher. Some operations only have GPU implementations.
  • NOTICE: different versions of the Pytorch package have different memory usage.

Compilation

Compile the CUDA code:

cd lib  # please change to this directory
sh make.sh

If you are using Volta GPUs, uncomment this line in lib/make.sh and remember to append a backslash to the line above. CUDA_PATH defaults to /usr/local/cuda. If you want to use a CUDA library at a different path, change this line accordingly.

It will compile all the modules you need, including NMS, ROI_Pooling, ROI_Crop and ROI_Align. (Actually, GPU NMS is never used ...)

Note that, if you use CUDA_VISIBLE_DEVICES to set GPUs, make sure at least one GPU is visible when compiling the code.

Data Preparation

Create a data folder under the repo:

cd {repo_root}
mkdir data
  • COCO: Download the COCO images and annotations from the COCO website.

    Make sure to put the files in the following structure:

    coco
    ├── annotations
    │   ├── instances_minival2014.json
    │   ├── instances_train2014.json
    │   ├── instances_train2017.json
    │   ├── instances_val2014.json
    │   ├── instances_val2017.json
    │   ├── instances_valminusminival2014.json
    │   ├── person_keypoints_train2014.json
    │   ├── person_keypoints_train2017.json
    │   ├── person_keypoints_val2014.json
    │   └── person_keypoints_val2017.json
    └── images
        ├── train2014
        ├── train2017
        ├── val2014
        └── val2017
    

    Download link for instances_minival2014.json and instances_valminusminival2014.json

    Feel free to put the dataset at any place you want, and then soft link the dataset under the data/ folder:

    ln -s path/to/coco data/coco

    It is recommended to put the images on an SSD for possibly better training performance.

    In my experience, COCO2014 has some mask annotations with a different (h, w) shape than the corresponding images. Maybe instances_minival2014.json and instances_valminusminival2014.json contain corrupted mask annotations. However, COCO2017 doesn't have this issue. It is said that COCO train2017 equals (COCO train2014 + COCO valminusminival2014) and COCO val2017 equals COCO minival2014. Hence, it should be fine to use the COCO 2017 train-val splits to reproduce the results.

Pretrained Model

I use ImageNet pretrained weights from Caffe for the backbone networks.

Download them and put them into {repo_root}/data/pretrained_model.

You can use the following command to download them all:

  - extra required packages: argparse_color_formater, colorama

python tools/download_imagenet_weights.py

NOTE: Caffe pretrained weights have slightly better performance than Pytorch pretrained ones. It is suggested to use the Caffe pretrained models from the above link to reproduce the results. By the way, Detectron also uses pretrained weights from Caffe.

If you want to use Pytorch pre-trained models, please remember to convert images from BGR to RGB, and also use the same data preprocessing (subtract mean and normalize) as used for the Pytorch pretrained models.
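
A hedged sketch of the two conventions (the mean/std values below are the commonly used ImageNet statistics and Detectron's BGR pixel means; verify them against the actual loader code):

import numpy as np

img_bgr = np.random.randint(0, 256, (480, 640, 3)).astype(np.float32)  # e.g. loaded with cv2

# Caffe-style (matches the weights downloaded above): keep BGR order and
# subtract per-channel pixel means in the 0-255 range.
caffe_mean = np.array([102.9801, 115.9465, 122.7717], dtype=np.float32)  # B, G, R
caffe_input = img_bgr - caffe_mean

# Pytorch-style: convert BGR to RGB, scale to [0, 1], then normalize.
img_rgb = img_bgr[:, :, ::-1] / 255.0
pytorch_input = (img_rgb - np.array([0.485, 0.456, 0.406])) / np.array([0.229, 0.224, 0.225])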

Train

  • Train mask-rcnn with res50 backbone from scratch

    python tools/train_net.py --dataset coco2017 --cfg configs/e2e_mask_rcnn_R-50-C4.yml --use_tfboard --bs {batch_size} --nw {num_workers}

    Use --bs to overwrite the default batch size (e.g. 8) to a proper value that fits into your GPUs. Similarly for --nw: the number of data loader threads defaults to 4 in config.py.

    Specify --use_tfboard to log the losses on Tensorboard.

  • Resume training with exactly the same settings from the end of an epoch

    python tools/train_net.py --dataset coco2017 --cfg configs/e2e_mask_rcnn_R-50-C4.yml --resume --load_ckpt {path/to/the/checkpoint} --bs {batch_size}

    The difference between w/ and w/o --resume: the optimizer state will be loaded from the checkpoint file only if --resume is specified. (See the sketch after this list.)

  • Train keypoint-rcnn

    python tools/train_net.py --dataset keypoints_coco2017 ...
  • Fine tune from the Detectron pretrained weights

    python train_net.py --dataset coco2017 --cfg cfgs/e2e_mask_rcnn_R-50-C4.yml --load_detectron {path/to/detectron/weight} --bs {batch_size}

    NOTE: the optimizer state (momentum buffers for SGD) is not loaded. (To be implemented)
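
A sketch of what --resume implies (the key names are assumptions, not necessarily the repo's actual checkpoint format): a checkpoint stores both model and optimizer state, and --resume restores the latter as well.

import torch

def load_checkpoint(model, optimizer, path, resume=False):
    ckpt = torch.load(path, map_location='cpu')
    model.load_state_dict(ckpt['model'])
    if resume:
        # Also restore optimizer state (e.g. SGD momentum buffers), so
        # training continues with exactly the same settings.
        optimizer.load_state_dict(ckpt['optimizer'])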

Inference

python tools/infer_simple.py --dataset coco --cfg cfgs/e2e_mask_rcnn_R-50-C4.yml --load_detectron {path/to/detectron/weight} --image_dir {dir/of/input/images}  --output_dir {dir/to/save/visualizations}

--output_dir defaults to infer_outputs.

Benchmark

TBA

Visualization

  • Train e2e_mask_rcnn_R-50_C4 from scratch for 1 epoch on coco_train_2017 with batch size 4:

    [Images]
