GitHub - KaiyangZhou/deep-person-reid: Pytorch implementation of deep person re-...
source link: https://github.com/KaiyangZhou/deep-person-reid
deep-person-reid
This repo contains PyTorch implementations of deep person re-identification models.
Pretrained models are available.
We will actively maintain this repo to incorporate new models.
Install
- cd to the folder where you want to download this repo.
- run git clone https://github.com/KaiyangZhou/deep-person-reid.
Prepare data
Create a directory to store reid datasets under this repo via
cd deep-person-reid/
mkdir data/
Market1501 [7]:
- download the dataset to data/ from http://www.liangzheng.org/Project/project_reid.html.
- extract the dataset and rename it to market1501.
MARS [8]:
- create a directory named mars/ under data/.
- download the dataset to data/mars/ from http://www.liangzheng.com.cn/Project/project_mars.html.
- extract bbox_train.zip and bbox_test.zip.
- download the split information from https://github.com/liangzheng06/MARS-evaluation/tree/master/info and put info/ in data/mars (we want to follow the standard split in [8]).
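Before training, it can help to confirm that the layout described above is in place. The following is a minimal sketch, not part of this repo: the function name is hypothetical, and it assumes the two MARS archives extract to bbox_train/ and bbox_test/.

```python
from pathlib import Path

# Directory names follow the preparation steps above. bbox_train/ and
# bbox_test/ are ASSUMED to be what the two zip archives extract to.
EXPECTED_DIRS = [
    "market1501",
    "mars/bbox_train",
    "mars/bbox_test",
    "mars/info",
]


def missing_dataset_dirs(root="data"):
    """Return the expected dataset directories that do not exist under root."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dataset_dirs("data")
    if missing:
        print("Missing directories under data/:", ", ".join(missing))
    else:
        print("All expected dataset directories are present.")
```

Run it from the repo root (deep-person-reid/) so that the relative path data/ resolves correctly.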
Dataset loaders
These are implemented in dataset_loader.py, where we have two main classes that subclass torch.utils.data.Dataset:
- ImageDataset: processes image-based person reid datasets.
- VideoDataset: processes video-based person reid datasets.
These two classes are used with torch.utils.data.DataLoader to provide batched data. A data loader with ImageDataset outputs batches of shape (batch, channel, height, width), while a data loader with VideoDataset outputs batches of shape (batch, sequence, channel, height, width).
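To make the two batch layouts concrete, here is a small sketch with stand-in datasets. ToyImageDataset and ToyVideoDataset are hypothetical classes that return random tensors; they are not the ImageDataset and VideoDataset in dataset_loader.py, whose constructors are not described here. The 256x128 image size matches the default height and width shown in the training log, and the sequence length of 15 is an assumption.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class ToyImageDataset(Dataset):
    """Hypothetical stand-in: one random image and a fake person ID per sample."""

    def __init__(self, num_samples=8, height=256, width=128):
        self.num_samples = num_samples
        self.height, self.width = height, width

    def __len__(self):
        return self.num_samples

    def __getitem__(self, index):
        img = torch.rand(3, self.height, self.width)  # (channel, height, width)
        pid = index % 4  # fake person ID
        return img, pid


class ToyVideoDataset(ToyImageDataset):
    """Hypothetical stand-in: one random tracklet clip per sample."""

    def __init__(self, seq_len=15, **kwargs):
        super().__init__(**kwargs)
        self.seq_len = seq_len  # assumed number of frames sampled per tracklet

    def __getitem__(self, index):
        # (sequence, channel, height, width)
        clip = torch.rand(self.seq_len, 3, self.height, self.width)
        pid = index % 4
        return clip, pid


# The default collate function stacks samples along a new leading batch axis.
img_batch, img_pids = next(iter(DataLoader(ToyImageDataset(), batch_size=4)))
vid_batch, vid_pids = next(iter(DataLoader(ToyVideoDataset(), batch_size=2)))
print(tuple(img_batch.shape))  # (4, 3, 256, 128)
print(tuple(vid_batch.shape))  # (2, 15, 3, 256, 128)
```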
Models
- models/ResNet.py: ResNet50 [1], ResNet50M [2].
- models/DenseNet.py: DenseNet121 [3].
Loss functions
- xent: cross entropy + label smoothing regularizer [5].
- htri: triplet loss with hard positive/negative mining [4].
We use Adam [6] everywhere, which turned out to be the most effective optimizer in our experiments.
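The two losses can be illustrated in plain Python. This is a sketch of the general techniques from [5] and [4], not the code in this repo: the function names are hypothetical, and the smoothing factor (0.1) and margin (0.3) are assumed values. Label smoothing mixes the one-hot target with a uniform distribution over classes; hard mining picks, for each anchor in the batch, the farthest same-identity sample and the closest different-identity sample.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]


def smoothed_cross_entropy(logits, target, epsilon=0.1):
    """Cross entropy against the target (1 - eps) * one_hot + eps / K."""
    num_classes = len(logits)
    loss = 0.0
    for k, log_prob in enumerate(log_softmax(logits)):
        one_hot = 1.0 if k == target else 0.0
        q = (1.0 - epsilon) * one_hot + epsilon / num_classes
        loss -= q * log_prob
    return loss


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hard_triplet_loss(features, labels, margin=0.3):
    """Mean over anchors of max(0, hardest_pos - hardest_neg + margin)."""
    losses = []
    for i, (anchor, label) in enumerate(zip(features, labels)):
        pos = [euclidean(anchor, f)
               for j, (f, l) in enumerate(zip(features, labels))
               if l == label and j != i]
        neg = [euclidean(anchor, f)
               for f, l in zip(features, labels) if l != label]
        if not pos or not neg:
            continue  # anchor has no valid positive or negative in this batch
        losses.append(max(0.0, max(pos) - min(neg) + margin))
    return sum(losses) / len(losses) if losses else 0.0


# Two identities with two samples each; features are 2-D for readability.
feats = [[0.0, 0.0], [3.0, 0.0], [1.0, 0.0], [4.0, 0.0]]
pids = [0, 0, 1, 1]
print(smoothed_cross_entropy([2.0, 0.5, -1.0], target=0))
print(hard_triplet_loss(feats, pids))
```

An xent+htri setup would combine the two values, for example by summing them; how this repo weights the terms is not stated here.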
Train
Training code is implemented mainly in
- train_img_model_xent.py: train image model with cross entropy loss.
- train_img_model_xent_htri.py: train image model with combination of cross entropy loss and hard triplet loss.
- train_vid_model_xent.py: train video model with cross entropy loss.
- train_vid_model_xent_htri.py: train video model with combination of cross entropy loss and hard triplet loss.
For example, to train an image reid model using ResNet50 and cross entropy loss, run
python train_img_model_xent.py -d market1501 -a resnet50 --max-epoch 60 --train-batch 32 --test-batch 32 --stepsize 20 --eval-step 20 --save-dir log/resnet50-xent-market1501 --gpu-devices 0
Then, you will see
==========
Args:Namespace(arch='resnet50', dataset='market1501', eval_step=20, evaluate=False, gamma=0.1, gpu_devices='0', height=256, lr=0.0003, max_epoch=60, print_freq=10, resume='', save_dir='log/resnet50/', seed=1, start_epoch=0, stepsize=20, test_batch=32, train_batch=32, use_cpu=False, weight_decay=0.0005, width=128, workers=4)
==========
Currently using GPU 0
Initializing dataset market1501
=> Market1501 loaded
Dataset statistics:
  ------------------------------
  subset   | # ids | # images
  ------------------------------
  train    |   751 |    12936
  query    |   750 |     3368
  gallery  |   751 |    15913
  ------------------------------
  total    |  1501 |    32217
  ------------------------------
Initializing model: resnet50
Model size: 25.04683M
==> Epoch 1/60
Batch 10/404 Loss 6.665115 (6.781841)
Batch 20/404 Loss 6.792669 (6.837275)
Batch 30/404 Loss 6.592124 (6.806587)
...
...
==> Epoch 60/60
Batch 10/404 Loss 1.101616 (1.075387)
Batch 20/404 Loss 1.055073 (1.075455)
Batch 30/404 Loss 1.081339 (1.073036)
...
...
==> Test
Extracted features for query set, obtained 3368-by-2048 matrix
Extracted features for gallery set, obtained 15913-by-2048 matrix
Computing distance matrix
Computing CMC and mAP
Results ----------
mAP: 68.8%
CMC curve
Rank-1 : 85.4%
Rank-5 : 94.1%
Rank-10 : 95.9%
Rank-20 : 97.2%
------------------
Finished. Total elapsed time (h:m:s): 1:57:44
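Two numbers in this log can be checked with simple arithmetic. The sketch below assumes that --stepsize and gamma describe a step decay (the learning rate is multiplied by gamma every stepsize epochs) and that the last incomplete training batch is dropped; both are assumptions inferred from the argument names and the 404 batches per epoch, not confirmed behavior of the training scripts. The function names are hypothetical.

```python
def learning_rate(epoch, base_lr=0.0003, stepsize=20, gamma=0.1):
    """Step decay (ASSUMED schedule): scale base_lr by gamma every `stepsize` epochs.

    `epoch` is 0-based, so epochs 0..19 use base_lr when stepsize is 20.
    """
    return base_lr * gamma ** (epoch // stepsize)


def batches_per_epoch(num_images, batch_size, drop_last=True):
    """Number of batches per epoch, optionally dropping the last partial batch."""
    full, remainder = divmod(num_images, batch_size)
    return full if drop_last or remainder == 0 else full + 1


for epoch in (0, 19, 20, 40, 59):
    print(f"epoch {epoch + 1:2d}: lr = {learning_rate(epoch):.1e}")

# 12936 training images with --train-batch 32
print(batches_per_epoch(12936, 32))  # 404
```

Under these assumptions the learning rate falls from 3e-4 to 3e-5 after epoch 20 and to 3e-6 after epoch 40, and 12936 images at batch size 32 give the 404 batches per epoch seen in the log.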
To use multiple GPUs, you can set --gpu-devices 0,1,2,3.
Please run python train_blah_blah.py -h for more details regarding arguments.
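As an illustration of how a flag like --gpu-devices is commonly handled, here is a sketch using argparse and the CUDA_VISIBLE_DEVICES environment variable. The flag names mirror the gpu_devices and use_cpu entries in the Args line of the training log, but build_parser and select_devices are hypothetical helpers, and whether the training scripts use this exact mechanism is an assumption.

```python
import argparse
import os


def build_parser():
    """Hypothetical subset of the training arguments."""
    parser = argparse.ArgumentParser(description="device selection sketch")
    parser.add_argument("--gpu-devices", default="0", type=str,
                        help="comma-separated GPU ids, e.g. 0,1,2,3")
    parser.add_argument("--use-cpu", action="store_true",
                        help="run on CPU only")
    return parser


def select_devices(argv):
    """Parse argv and expose the chosen GPUs to CUDA via the environment."""
    args = build_parser().parse_args(argv)
    if not args.use_cpu:
        # Must be set before any CUDA context is created to take effect.
        os.environ["CUDA_VISIBLE_DEVICES"] = args.gpu_devices
    return args


args = select_devices(["--gpu-devices", "0,1,2,3"])
print(args.gpu_devices.split(","))
```

With several devices visible, a PyTorch model is typically wrapped in torch.nn.DataParallel so that each batch is split across them.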
Results
Image person reid
Market1501
| Model | Size (M) | Loss | Rank-1/5/10 (%) | mAP (%) | Model weights | Published Rank | Published mAP |
| --- | --- | --- | --- | --- | --- | --- | --- |
| DenseNet121 | 7.72 | xent | 86.5/93.6/95.7 | 67.8 | download | | |
| DenseNet121 | 7.72 | xent+htri | 89.5/96.3/97.5 | 72.6 | download | | |
| ResNet50 | 25.05 | xent | 85.4/94.1/95.9 | 68.8 | download | 87.3/-/- | 67.6 |
| ResNet50 | 25.05 | xent+htri | 87.5/95.3/97.3 | 72.3 | download | | |
| ResNet50M | 30.01 | xent | 89.0/95.5/97.3 | 75.0 | download | 89.9/-/- | 75.6 |
| ResNet50M | 30.01 | xent+htri | 90.4/96.7/98.0 | 76.6 | download | | |
Video person reid
MARS
| Model | Size (M) | Loss | Rank-1/5/10 (%) | mAP (%) | Model weights | Published Rank | Published mAP |
| --- | --- | --- | --- | --- | --- | --- | --- |
| DenseNet121 | 7.59 | xent+htri | 82.6/93.2/95.4 | 74.6 | download | | |
| ResNet50 | 24.79 | xent | 74.5/88.8/91.8 | 64.0 | download | | |
| ResNet50 | 24.79 | xent+htri | 80.8/92.1/94.3 | 74.0 | download | | |
| ResNet50M | 29.63 | xent | 77.8/89.8/92.8 | 67.5 | download | | |
| ResNet50M | 29.63 | xent+htri | 82.3/93.8/95.3 | 75.4 | download | | |
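For readers unfamiliar with the metrics reported in these tables, the sketch below shows how Rank-k (CMC) and mAP can be computed from a query-by-gallery distance matrix. It is a simplified illustration with a hypothetical function name, not the evaluation code of this repo. In particular, it omits the camera-based filtering used in the standard Market1501 and MARS protocols, where gallery samples with the same identity and the same camera as the query are discarded.

```python
def evaluate(distmat, query_ids, gallery_ids, max_rank=20):
    """Return (cmc, mAP) from a query-by-gallery distance matrix (list of lists).

    cmc[k] is the fraction of queries whose first correct match appears
    within the top k+1 gallery results.
    """
    cmc_sum = [0.0] * max_rank
    average_precisions = []
    for q, row in enumerate(distmat):
        # Gallery indices sorted from closest to farthest.
        order = sorted(range(len(row)), key=lambda g: row[g])
        matches = [gallery_ids[g] == query_ids[q] for g in order]
        if not any(matches):
            continue  # query identity does not appear in the gallery
        first_hit = matches.index(True)
        for k in range(first_hit, max_rank):
            cmc_sum[k] += 1.0
        # Average precision: mean of precision values at each correct match.
        hits, precisions = 0, []
        for rank, is_match in enumerate(matches, start=1):
            if is_match:
                hits += 1
                precisions.append(hits / rank)
        average_precisions.append(sum(precisions) / len(precisions))
    if not average_precisions:
        raise ValueError("no query identity appears in the gallery")
    num_valid = len(average_precisions)
    cmc = [c / num_valid for c in cmc_sum]
    return cmc, sum(average_precisions) / num_valid


# Toy example: 2 queries, 3 gallery images.
toy_dist = [[0.1, 0.9, 0.5],
            [0.2, 0.8, 0.3]]
toy_cmc, toy_map = evaluate(toy_dist, query_ids=[1, 2],
                            gallery_ids=[1, 2, 1], max_rank=3)
print(toy_cmc, toy_map)
```

In the toy example the first query is matched at rank 1 and the second only at rank 3, so Rank-1 is 50% and mAP is (1 + 1/3) / 2.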
Test
Say you have downloaded ResNet50 trained with xent on market1501. The path to this model is 'saved-models/resnet50_xent_market1501.pth.tar' (create a directory to store model weights via mkdir saved-models/). Then, run the following command to test
python train_img_model_xent.py -d market1501 -a resnet50 --evaluate --resume saved-models/resnet50_xent_market1501.pth.tar --save-dir log/resnet50-xent-market1501 --test-batch 32
Likewise, to test a video reid model, you should have a pretrained model saved under saved-models/, e.g. saved-models/resnet50_xent_mars.pth.tar, then run
python train_vid_model_xent.py -d mars -a resnet50 --evaluate --resume saved-models/resnet50_xent_mars.pth.tar --save-dir log/resnet50-xent-mars --test-batch 2
Note that --test-batch in video reid represents the number of tracklets. If we set this argument to 2 and sample 15 images per tracklet, the resulting number of images per batch is 2*15=30. Adjust this argument according to your GPU memory.
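The arithmetic above can be turned into a quick helper for choosing --test-batch. This is only a sketch: both function names are hypothetical, and the "image budget" is simply the number of images per batch you already know fits on your GPU (for example, 32 from the image reid test command).

```python
def images_per_batch(test_batch, seq_len=15):
    """Images processed per batch when each tracklet contributes seq_len frames."""
    return test_batch * seq_len


def max_test_batch(image_budget, seq_len=15):
    """Largest number of tracklets whose frames fit within image_budget images."""
    return max(1, image_budget // seq_len)


print(images_per_batch(2))  # 30
print(max_test_batch(32))   # 2
```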
References
[1] He et al. Deep Residual Learning for Image Recognition. CVPR 2016.
[2] Yu et al. The Devil is in the Middle: Exploiting Mid-level Representations for Cross-Domain Instance Matching. arXiv:1711.08106.
[3] Huang et al. Densely Connected Convolutional Networks. CVPR 2017.
[4] Hermans et al. In Defense of the Triplet Loss for Person Re-Identification. arXiv:1703.07737.
[5] Szegedy et al. Rethinking the Inception Architecture for Computer Vision. CVPR 2016.
[6] Kingma and Ba. Adam: A Method for Stochastic Optimization. ICLR 2015.
[7] Zheng et al. Scalable Person Re-identification: A Benchmark. ICCV 2015.
[8] Zheng et al. MARS: A Video Benchmark for Large-Scale Person Re-identification. ECCV 2016.