CUHK Multimedia Laboratory | An Open-Source Video Object Detection & Tracking Platform (with Source Code Download)

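MMDetection V1.0 has been popular with users since its release. Along the way the project received many valuable suggestions, and many developers contributed code. MMDetection V2.0 was released on May 6, 2020.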

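By refactoring and optimizing each component of the models, V2.0 improves both the speed and the accuracy of MMDetection across the board, reaching what the release describes as the best level among existing detection frameworks. A finer-grained modular design makes MMDetection much easier to extend to new tasks, and it has become a base platform for detection-related projects. The documentation and tutorials have also been improved for a better user experience.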

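MMDetection implements networks and frameworks such as RPN, Fast R-CNN, Faster R-CNN, and Mask R-CNN. First, a brief comparison with Detectron:

Slightly higher performance
Slightly faster training
Slightly lower GPU memory usage

More importantly, code based on PyTorch is a generation ahead of code based on Caffe2 in ease of use. In the time it takes to install Detectron successfully, you could probably install mmdetection a dozen times.

Of course, Detectron also has clear advantages: it was the first comprehensive detection codebase, it carries the FAIR name, and its released models are fairly complete. The mmdetection developers are also working to expand their model zoo, but they still trail far behind in manpower and compute, so this will take time.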

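Now for the three points above in more detail. First, performance. The ResNet in the official PyTorch model zoo differs slightly in structure from the ResNet used by Detectron (in mmdetection, the variant is selected with the backbone's style parameter), so the models converge at different speeds. Experiments were therefore run with both structures: in general, the Detectron structure scores higher under the 1x lr schedule, while the PyTorch structure scores higher under the 2x schedule.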
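As a minimal illustration of the style parameter mentioned above, the fragment below shows two ResNet-50 backbone dicts in the form used by mmdetection configs. Only the `style` value differs between them; the remaining fields are typical values and should be read as assumptions, not as the exact settings behind the reported numbers.

```python
# Two backbone config fragments that differ only in `style`.
# 'pytorch': the stride-2 convolution of a bottleneck block is the 3x3 conv.
# 'caffe':   the stride-2 convolution is the first 1x1 conv
#            (the layout used by Detectron's ResNet).
# All field values other than `style` are illustrative assumptions.
common = dict(
    type='ResNet',
    depth=50,
    num_stages=4,
    out_indices=(0, 1, 2, 3),
    frozen_stages=1,
)

backbone_pytorch_style = dict(common, style='pytorch')
backbone_caffe_style = dict(common, style='caffe')

# Confirm that `style` is the only field that differs.
differing_keys = [
    key for key in backbone_pytorch_style
    if backbone_pytorch_style[key] != backbone_caffe_style[key]
]
print(differing_keys)
```

Because the two variants place the stride differently, pretrained weights are generally not interchangeable between them, which is one reason the two structures were benchmarked separately.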

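In terms of speed, the gap is fairly large for Mask R-CNN and small for the other models. With the same settings, Detectron needs 0.89 s per iteration, while mmdetection needs only 0.69 s. Fast R-CNN is the exception: it is slightly slower than in Detectron. In addition, Detectron runs about 20% slower on the developers' own servers than the officially reported speed, presumably because FB's Big Basin servers are more powerful.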
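To put the two Mask R-CNN timings in relative terms, the short calculation below converts them into a time saving and a speedup factor. It uses only the two per-iteration figures quoted in the text.

```python
# Per-iteration times for Mask R-CNN quoted in the text (seconds).
detectron_s = 0.89
mmdetection_s = 0.69

# Fraction of wall-clock time saved per iteration.
time_saved = (detectron_s - mmdetection_s) / detectron_s
# Ratio of iterations completed in the same amount of time.
speedup = detectron_s / mmdetection_s

print(f"time saved per iteration: {time_saved:.1%}")  # 22.5%
print(f"speedup: {speedup:.2f}x")                     # 1.29x
```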

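The advantage in GPU memory is more pronounced: usage is about 30% lower. This is partly due to the framework, though, and cannot be credited entirely to codebase optimizations. One result that surprised the developers is that the current version of the codebase can fit 4 images per card (12 GB) when running Mask R-CNN with ResNet-50, which is considerably less memory than it needed during the competition.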

MMTracking

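MMDetection is a PyTorch-based deep learning object detection toolbox open-sourced by SenseTime (winner of the 2018 COCO object detection challenge) and The Chinese University of Hong Kong.

At the start of 2021, OpenMMLab from the CUHK Multimedia Laboratory (MMLab) contributed another platform tool: MMTracking, an all-in-one video object perception platform. The framework is written in PyTorch, supports single object tracking, multiple object tracking, and video object detection, and is now open source. Let's take a closer look.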

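Key features:

The first unified video perception platform

MMTracking is the first open-source toolbox that unifies versatile video perception tasks, including video object detection, single object tracking, and multiple object tracking.

Modular design

MMTracking decomposes the video perception framework into different components, so a customized method can easily be built by combining different modules.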
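The idea of building a method by combining modules can be sketched with a small name-based registry. This is a hypothetical, self-contained illustration of the pattern only: `REGISTRY`, `register`, `build`, and the `Toy*` classes are invented for this example and are not the OpenMMLab implementation.

```python
# A hypothetical, minimal registry; NOT the OpenMMLab implementation.
REGISTRY = {}


def register(cls):
    """Make a component class buildable by its class name."""
    REGISTRY[cls.__name__] = cls
    return cls


def build(cfg):
    """Build a component from a dict whose 'type' key names a registered class."""
    args = dict(cfg)  # copy so the caller's config is not mutated
    cls = REGISTRY[args.pop('type')]
    return cls(**args)


@register
class ToyDetector:
    def __init__(self, score_thr=0.5):
        self.score_thr = score_thr


@register
class ToyTracker:
    def __init__(self, max_age=30):
        self.max_age = max_age


@register
class ToyPipeline:
    """Combines a detector and a tracker, each described by its own config dict."""

    def __init__(self, detector, tracker):
        self.detector = build(detector)
        self.tracker = build(tracker)


# Swapping a component only requires changing the nested config dict.
pipeline = build(dict(
    type='ToyPipeline',
    detector=dict(type='ToyDetector', score_thr=0.3),
    tracker=dict(type='ToyTracker', max_age=10),
))
print(type(pipeline.detector).__name__, pipeline.detector.score_thr)
```

The design choice illustrated here is that a method is described as data (nested dicts) rather than as code, so a different detector or tracker can be selected without editing the pipeline class.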

Simple, Fast and Strong

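Simple: MMTracking interacts with other OpenMMLab projects. It is built on MMDetection, so a detector can be selected simply by modifying the config file.

Fast: All operations run on the GPU. Training and inference are faster than in other implementations.

Strong: MMTracking reproduces state-of-the-art models, and some of them even outperform the official implementations.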

How to use:

1. Create a conda virtual environment and activate it.

conda create -n open-mmlab python=3.7 -y
conda activate open-mmlab

2. Install PyTorch and torchvision following the official instructions, e.g.,

conda install pytorch torchvision -c pytorch

Note: Make sure that your compilation CUDA version and runtime CUDA version match. You can check the supported CUDA version for precompiled packages on the PyTorch website.

E.g. 1: If you have CUDA 10.1 installed under /usr/local/cuda and would like to install PyTorch 1.5, you need to install the prebuilt PyTorch with CUDA 10.1.

conda install pytorch cudatoolkit=10.1 torchvision -c pytorch

E.g. 2: If you have CUDA 9.2 installed under /usr/local/cuda and would like to install PyTorch 1.3.1, you need to install the prebuilt PyTorch with CUDA 9.2.

conda install pytorch=1.3.1 cudatoolkit=9.2 torchvision=0.4.2 -c pytorch

If you build PyTorch from source instead of installing the prebuilt package, you can use more CUDA versions, such as 9.0.
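One way to check the note above is to ask the installed PyTorch which CUDA version it was built for and compare that with the toolkit under /usr/local/cuda (for example, the version reported by `nvcc --version`). The helper below is a small sketch; the function name `describe_torch_cuda` is made up for this example, and it degrades gracefully when PyTorch is not installed.

```python
import importlib.util


def describe_torch_cuda():
    """Report the CUDA version the installed PyTorch build was compiled against."""
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed in this environment"
    import torch
    # torch.version.cuda is None for CPU-only builds.
    return (f"PyTorch {torch.__version__}, "
            f"built for CUDA {torch.version.cuda}, "
            f"GPU available: {torch.cuda.is_available()}")


print(describe_torch_cuda())
```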

3. Install mmcv-full. We recommend installing the pre-built package as shown below.

pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu101/torch1.6.0/index.html

See the MMCV installation documentation for the versions of MMCV compatible with different PyTorch and CUDA versions. Optionally, you can compile mmcv from source with the following commands:

git clone https://github.com/open-mmlab/mmcv.git
cd mmcv
MMCV_WITH_OPS=1 pip install -e .  # package mmcv-full will be installed after this step
cd ..

Or directly run

pip install mmcv-full
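The pre-built package URL in step 3 encodes the CUDA and PyTorch versions in its path. The helper below sketches how that URL can be assembled for another combination. The function name `mmcv_index_url` is made up for this example, and it only follows the pattern of the URL quoted above; whether wheels are published for a given pair still has to be checked against the MMCV documentation.

```python
def mmcv_index_url(cuda_version, torch_version):
    """Build the find-links URL for pre-built mmcv-full wheels.

    Follows the pattern of the URL quoted in step 3. It does not check
    that wheels actually exist for the requested combination.
    """
    cu_tag = "cu" + cuda_version.replace(".", "")  # "10.1" -> "cu101"
    return ("https://download.openmmlab.com/mmcv/dist/"
            f"{cu_tag}/torch{torch_version}/index.html")


url = mmcv_index_url("10.1", "1.6.0")
print(f"pip install mmcv-full -f {url}")
```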

4. Install MMDetection.

pip install mmdet

Optionally, you can also build MMDetection from source in case you want to modify the code:

git clone https://github.com/open-mmlab/mmdetection.git
cd mmdetection
pip install -r requirements/build.txt
pip install -v -e .  # or "python setup.py develop"

5. Clone the MMTracking repository.

git clone https://github.com/open-mmlab/mmtracking.git
cd mmtracking

6. Install build requirements and then install MMTracking.

pip install -r requirements/build.txt
pip install -v -e .  # or "python setup.py develop"
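After the six steps above, a quick sanity check is to import the three packages and print their versions. The sketch below assumes the import names `mmcv`, `mmdet`, and `mmtrack`; the helper name `installed_versions` is made up for this example. A missing or broken package is reported instead of raising.

```python
import importlib


def installed_versions(modules=("mmcv", "mmdet", "mmtrack")):
    """Return {module name: version string, or None if the import fails}."""
    versions = {}
    for name in modules:
        try:
            module = importlib.import_module(name)
            versions[name] = getattr(module, "__version__", "unknown")
        except Exception:  # missing package or a broken build
            versions[name] = None
    return versions


for name, version in installed_versions().items():
    print(f"{name}: {version or 'NOT importable'}")
```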

Testing with the platform:

This section will show how to test existing models on supported datasets. The following testing environments are supported:

single GPU
single node multiple GPU
multiple nodes

During testing, different tasks share the same API and we only support samples_per_gpu = 1.

You can use the following commands for testing:

# single-gpu testing
python tools/test.py ${CONFIG_FILE} [--checkpoint ${CHECKPOINT_FILE}] [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}]
# multi-gpu testing
./tools/dist_test.sh ${CONFIG_FILE} ${GPU_NUM} [--checkpoint ${CHECKPOINT_FILE}] [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}]

Optional arguments:

CHECKPOINT_FILE : Filename of the checkpoint. For some MOT methods you do not need to pass it on the command line; specify the checkpoints in the config instead.
RESULT_FILE : Filename of the output results in pickle format. If not specified, the results will not be saved to a file.
EVAL_METRICS : Items to be evaluated on the results. Allowed values depend on the dataset, e.g., bbox is available for ImageNet VID, track is available for LaSOT, and both bbox and track are suitable for MOT17.
--cfg-options : If specified, the given key-value pairs will be merged into the config file.
--eval-options : If specified, the given key-value pairs will be passed as kwargs to the dataset.evaluate() function. It is only used for evaluation.
--format-only : If specified, the results will be formatted to the official format.
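The commands and arguments above can be assembled programmatically, which helps when testing several configs in a loop. The sketch below builds the argument list and rejects metrics that the text does not list for a dataset. The helper name `build_test_command` and the config path are hypothetical, and it assumes that `--eval` accepts several space-separated metrics.

```python
# Metrics per dataset, taken from the EVAL_METRICS description above.
ALLOWED_METRICS = {
    "ImageNet VID": {"bbox"},
    "LaSOT": {"track"},
    "MOT17": {"bbox", "track"},
}


def build_test_command(config, dataset, metrics, checkpoint=None, out=None, gpus=1):
    """Return the argv list for single-GPU or multi-GPU testing."""
    unsupported = set(metrics) - ALLOWED_METRICS[dataset]
    if unsupported:
        raise ValueError(f"{sorted(unsupported)} not available for {dataset}")
    if gpus == 1:
        cmd = ["python", "tools/test.py", config]
    else:
        cmd = ["./tools/dist_test.sh", config, str(gpus)]
    if checkpoint:
        cmd += ["--checkpoint", checkpoint]
    if out:
        cmd += ["--out", out]
    if metrics:
        # Assumption: --eval takes one or more space-separated metric names.
        cmd += ["--eval", *metrics]
    return cmd


# The config path is a placeholder, not a file known to exist in the repository.
cmd = build_test_command(
    "configs/my_mot_config.py", "MOT17", ["bbox", "track"], out="results.pkl"
)
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the root of the mmtracking checkout.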
