CVPR 2021 Complete Roundup: Papers by Category / Code / Projects / Paper Explainers (Continuously Updated) [Computer Vision]


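CVPR is one of the three top conferences in computer vision. CVPR 2021 has now published the IDs of all accepted papers: 1,663 papers were accepted, an acceptance rate of 23.7%. The rate is higher than last year's, but competition remains fierce. Related coverage: "CVPR 2021 acceptance results are out! 1,663 papers accepted and a higher acceptance rate. Did your paper make it?"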

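In this post we sort the latest CVPR 2021 papers by topic, and we will follow up with explainers and live technical talks on outstanding papers. We will keep tracking and categorizing CVPR 2021 papers as they appear; click the follow button at the end of the post to receive updates to this thread.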

We previously compiled and categorized the CVPR 2020 and CVPR 2019 papers as well; click the posts below to read them:

All of our CVPR paper compilations are collected in our GitHub project, which has earned 6,100 stars so far.
GitHub project: https://github.com/extreme-assistant/CVPR2021-Paper-Code-Interpretation

Other compilations in this CVPR 2021 series:

The CVPR 2021 papers are organized by research area below:

Categories:

1. Detection

2. Image Segmentation

3. Image Processing

4. Estimation

5. Image/Video Retrieval

6. Face

7. Object Tracking

8. Medical Imaging

9. Text Detection/Recognition

10. Remote Sensing Image

11. GAN/Generative/Adversarial

12. 3D Vision

13. Neural Network Structure

14. Neural Architecture Search (NAS)

15. Data Processing

16. Model Compression

Knowledge Distillation

17. Model Evaluation

18. Datasets

19. Active Learning

20. Few-shot/Zero-shot Learning

21. Continual Learning/Life-long Learning

22. Visual Reasoning

23. Transfer Learning/Domain Adaptation

24. Contrastive Learning

25. Reinforcement Learning

Uncategorized

Image Object Detection

[9] MeGA-CDA: Memory Guided Attention for Category-Aware Unsupervised Domain Adaptive Object Detection

paper

[8] OPANAS: One-Shot Path Aggregation Network Architecture Search for Object Detection

paper|code

[7] Semantic Relation Reasoning for Shot-Stable Few-Shot Object Detection

paper

[6] General Instance Distillation for Object Detection

paper

[5] Instance Localization for Self-supervised Detection Pretraining

paper|code

[4] Multiple Instance Active Learning for Object Detection

paper|code

[3] Towards Open World Object Detection

paper|code

[2] Positive-Unlabeled Data Purification in the Wild for Object Detection

[1] UP-DETR: Unsupervised Pre-training for Object Detection with Transformers

paper|code

Explainer: an unsupervised pre-trained detector

Video Object Detection

[3] Depth from Camera Motion and Object Detection

paper

[2] There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge

paper|video|project

[1] Dogfight: Detecting Drones from Drone Videos

3D Object Detection

[2] 3DIoUMatch: Leveraging IoU Prediction for Semi-Supervised 3D Object Detection

paper|code|project|video

[1] Categorical Depth Distribution Network for Monocular 3D Object Detection

paper

Activity Detection

[1] Coarse-Fine Networks for Temporal Activity Detection in Videos

paper

Anomaly Detection

[1] Multiresolution Knowledge Distillation for Anomaly Detection

paper

Human-Object Interaction (HOI) Detection

[1] End-to-End Human Object Interaction Detection with HOI Transformer

paper|code

Camouflaged Object Detection

[1] Simultaneously Localize, Segment and Rank the Camouflaged Objects

paper|code

Image Segmentation

[2] Few-Shot Segmentation Without Meta-Learning: A Good Transductive Inference Is All You Need?

paper|code

[1] PointFlow: Flowing Semantics Through Points for Aerial Image Segmentation

Panoptic Segmentation

[2] Cross-View Regularization for Domain Adaptive Panoptic Segmentation

paper

[1] 4D Panoptic LiDAR Segmentation

paper

Semantic Segmentation

[5] Learning Statistical Texture for Semantic Segmentation

paper

[4] Semi-supervised Domain Adaptation based on Dual-level Domain Mixing for Semantic Segmentation

paper

[3] Multi-Source Domain Adaptation with Collaborative Learning for Semantic Segmentation

paper

[2] Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges

paper|code

[1] PLOP: Learning without Forgetting for Continual Semantic Segmentation

paper

Instance Segmentation

[1] End-to-End Video Instance Segmentation with Transformers

paper

Matting

[1] Real-Time High Resolution Background Matting

paper|code|project|video

Estimation

Human Pose Estimation

[3] Differentiable Multi-Granularity Human Representation Learning for Instance-Aware Human Semantic Parsing

paper|code

[2] CanonPose: Self-supervised Monocular 3D Human Pose Estimation in the Wild

[1] PCLs: Geometry-aware Neural Reconstruction of 3D Pose with Perspective Crop Layers

paper

Gesture Estimation

[1] Camera-Space Hand Mesh Recovery via Semantic Aggregation and Adaptive 2D-1D Registration

paper|code

Optical Flow/Pose/Motion Estimation

[3] GDR-Net: Geometry-Guided Direct Regression Network for Monocular 6D Object Pose Estimation

paper|code

[2] Robust Neural Routing Through Space Partitions for Camera Relocalization in Dynamic Indoor Environments

paper|project

[1] MultiBodySync: Multi-Body Segmentation and Motion Estimation via 3D Scan Synchronization

paper|code

Depth Estimation

Image Processing

Image Restoration/Super-Resolution

[5] ClassSR: A General Framework to Accelerate Super-Resolution Networks by Data Characteristic

paper

[4] Learning Continuous Image Representation with Local Implicit Image Function

paper|code|video|project

[3] Multi-Stage Progressive Image Restoration

paper|code

[2] Data-Free Knowledge Distillation For Image Super-Resolution (an SR version of the DAFL algorithm)

[1] AdderSR: Towards Energy Efficient Image Super-Resolution (applying adder networks to image super-resolution)

paper|code

Explainer: Huawei open-sources adder neural networks

Image Shadow Removal/Image Reflection Removal

[2] Robust Reflection Removal with Reflection-free Flash-only Cues

paper|code

[1] Auto-Exposure Fusion for Single-Image Shadow Removal

paper|code

Image Denoising/Deblurring/Deraining/Dehazing

[2] ARVo: Learning All-Range Volumetric Correspondence for Video Deblurring

paper

[1] DeFMO: Deblurring and Shape Recovery of Fast Moving Objects

paper|code|video

Image Editing/Inpainting

[5] PISE: Person Image Synthesis and Editing with Decoupled GAN

paper|code

[4] DeFLOCNet: Deep Image Editing via Flexible Low level Controls

[3] PD-GAN: Probabilistic Diverse GAN for Image Inpainting

[2] Anycost GANs for Interactive Image Synthesis and Editing

paper|code

[1] Exploiting Spatial Dimensions of Latent in GAN for Real-time Image Editing

Image Translation

[3] Spatially-Adaptive Pixelwise Networks for Fast Image Translation

paper|project

[2] Image-to-image Translation via Hierarchical Style Disentanglement

paper|code

[1] Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation

paper|code|project

Face

[8] Soft-IntroVAE: Analyzing and Improving Introspective Variational Autoencoders

paper|code|project

[7] PISE: Person Image Synthesis and Editing with Decoupled GAN

paper|code

[6] WebFace260M: A Benchmark Unveiling the Power of Million-Scale Deep Face Recognition

paper|benchmark

[5] Cross Modal Focal Loss for RGBD Face Anti-Spoofing

paper

[4] When Age-Invariant Face Recognition Meets Face Age Synthesis: A Multi-Task Learning Framework

paper|code

[3] Multi-attentional Deepfake Detection

paper

[2] Image-to-image Translation via Hierarchical Style Disentanglement

paper|code

[1] A 3D GAN for Improved Large-pose Facial Recognition

paper

Object Tracking

[4] HPS: localizing and tracking people in large 3D scenes from wearable sensors

[3] Track to Detect and Segment: An Online Multi-Object Tracker

project|video

[2] Probabilistic Tracklet Scoring and Inpainting for Multiple Object Tracking

paper

[1] Rotation Equivariant Siamese Networks for Tracking

paper

Image/Video Retrieval

[1] QAIR: Practical Query-efficient Black-Box Attacks for Image Retrieval

paper

Action/Activity Recognition

[1] Behavior-Driven Synthesis of Human Dynamics

paper|code

Person Re-Identification

[3] Watching You: Global-guided Reciprocal Learning for Video-based Person Re-identification

paper

[2] Joint Noise-Tolerant Learning and Meta Camera Shift Adaptation for Unsupervised Person Re-Identification

paper

[1] Meta Batch-Instance Normalization for Generalizable Person Re-Identification

paper

Medical Imaging

[5] DeepTag: An Unsupervised Deep Learning Method for Motion Tracking on Cardiac Tagging Magnetic Resonance Images

paper

[4] Multi-institutional Collaborations for Improving Deep Learning-based Magnetic Resonance Image Reconstruction Using Federated Learning

paper|code

[3] 3D Graph Anatomy Geometry-Integrated Network for Pancreatic Mass Segmentation, Diagnosis, and Quantitative Patient Management

[2] Deep Lesion Tracker: Monitoring Lesions in 4D Longitudinal Imaging Studies

paper

[1] Automatic Vertebra Localization and Identification in CT by Spine Rectification and Anatomically-constrained Optimization

paper

Text Detection/Recognition

[1] What If We Only Use Real Datasets for Scene Text Recognition? Toward Scene Text Recognition With Fewer Labels

paper|code

Remote Sensing Image

[1] Deep Gradient Projection Networks for Pan-sharpening (super-resolution)

paper|code

Neural Architecture Search (NAS)

[4] OPANAS: One-Shot Path Aggregation Network Architecture Search for Object Detection

paper|code

[3] AttentiveNAS: Improving Neural Architecture Search via Attentive Sampling

paper

[2] ReNAS: Relativistic Evaluation of Neural Architecture Search (the importance of ranking loss in NAS predictors)

paper

[1] HourNAS: Extremely Fast Neural Architecture Search Through an Hourglass Lens (reducing the cost of NAS)

paper

GAN/Generative/Adversarial

[11] Soft-IntroVAE: Analyzing and Improving Introspective Variational Autoencoders

paper|code|project

[10] LOHO: Latent Optimization of Hairstyles via Orthogonalization

paper

[9] PISE: Person Image Synthesis and Editing with Decoupled GAN

paper|code

[8] Closed-Form Factorization of Latent Semantics in GANs

paper|code

[7] PD-GAN: Probabilistic Diverse GAN for Image Inpainting

[6] Anycost GANs for Interactive Image Synthesis and Editing

paper|code

[5] Efficient Conditional GAN Transfer with Knowledge Propagation across Classes

paper|code

[4] Exploiting Spatial Dimensions of Latent in GAN for Real-time Image Editing

[3] Hijack-GAN: Unintended-Use of Pretrained, Black-Box GANs

paper

[2] Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation

paper|code|project

[1] A 3D GAN for Improved Large-pose Facial Recognition

paper

3D Vision

[2] A Deep Emulator for Secondary Motion of 3D Characters

paper

[1] 3D CNNs with Adaptive Temporal Feature Resolutions

paper

Point Cloud

[9] Robust Point Cloud Registration Framework Based on Deep Graph Matching

paper|code

[8] TPCN: Temporal Point Cloud Networks for Motion Forecasting

paper|code

[7] PointGuard: Provably Robust 3D Point Cloud Classification

paper

[6] Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges

paper|code

[5] SpinNet: Learning a General Surface Descriptor for 3D Point Cloud Registration

paper|code

[4] MultiBodySync: Multi-Body Segmentation and Motion Estimation via 3D Scan Synchronization

paper|code

[3] Diffusion Probabilistic Models for 3D Point Cloud Generation

paper|code

[2] Style-based Point Generator with Adversarial Rendering for Point Cloud Completion

paper

[1] PREDATOR: Registration of 3D Point Clouds with Low Overlap

paper|code|project

3D Reconstruction

[1] PCLs: Geometry-aware Neural Reconstruction of 3D Pose with Perspective Crop Layers

paper

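Model Compression

[2] Manifold Regularized Dynamic Network Pruning (constraining sample complexity and network complexity during dynamic pruning)

[1] Learning Student Networks in the Wild (a model compression and acceleration technique that needs no original training data)

paper|code

Explainer: Huawei Noah's Ark Lab proposes data-free network compression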

Knowledge Distillation

[5] Semantic-aware Knowledge Distillation for Few-Shot Class-Incremental Learning

paper

[4] Teachers Do More Than Teach: Compressing Image-to-Image Models (https://arxiv.org/abs/2103.03467)

paper|code

[3] General Instance Distillation for Object Detection

paper

[2] Multiresolution Knowledge Distillation for Anomaly Detection

paper

[1] Distilling Object Detectors via Decoupled Features (distillation with foreground and background decoupled)

Neural Network Structure

[4] Coordinate Attention for Efficient Mobile Network Design

paper

[3] Rethinking Channel Dimensions for Efficient Model Design

paper|code

[2] Inverting the Inherence of Convolution for Visual Recognition

[1] RepVGG: Making VGG-style ConvNets Great Again

paper|code

Explainer: RepVGG: a minimalist architecture with SOTA performance that makes VGG-style models great again

Transformer

[3] Transformer Interpretability Beyond Attention Visualization

paper|code

[2] UP-DETR: Unsupervised Pre-training for Object Detection with Transformers

paper|code

Explainer: an unsupervised pre-trained detector

[1] Pre-Trained Image Processing Transformer (a pre-trained model for low-level vision)

paper

Graph Neural Networks (GNN)

[2] Quantifying Explainers of Graph Neural Networks in Computational Pathology

paper

[1] Sequential Graph Convolutional Network for Active Learning

paper

Data Processing

Data Augmentation

[1] KeepAugment: A Simple Information-Preserving Data Augmentation

paper

Representation Learning

[1] VirTex: Learning Visual Representations from Textual Annotations (representation learning)

paper|code

Normalization/Regularization

[3] Adaptive Consistency Regularization for Semi-Supervised Transfer Learning

paper|code

[2] Meta Batch-Instance Normalization for Generalizable Person Re-Identification

paper

[1] Representative Batch Normalization with Feature Calibration

Image Clustering

[2] Improving Unsupervised Image Clustering With Robust Learning

paper|code

[1] Reconsidering Representation Alignment for Multi-view Clustering

Model Evaluation

[1] Are Labels Necessary for Classifier Accuracy Evaluation? (can a test set without labels be used to evaluate a model?)

paper|explainer

Datasets

[2] Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges

paper|code

[1] Re-labeling ImageNet: from Single to Multi-Labels, from Global to Localized Labels

paper|code

Active Learning

[3] Vab-AL: Incorporating Class Imbalance and Difficulty with Variational Bayes for Active Learning

paper|code

[2] Multiple Instance Active Learning for Object Detection

paper|code

[1] Sequential Graph Convolutional Network for Active Learning

paper

Few-Shot Learning/Zero-Shot Learning

[6] Goal-Oriented Gaze Estimation for Zero-Shot Learning

paper|code

[5] Few-Shot Segmentation Without Meta-Learning: A Good Transductive Inference Is All You Need?

paper|code

[4] Counterfactual Zero-Shot and Open-Set Visual Recognition

paper|code

[3] Semantic Relation Reasoning for Shot-Stable Few-Shot Object Detection

paper

[2] Few-shot Open-set Recognition by Transformation Consistency

[1] Exploring Complementary Strengths of Invariant and Equivariant Representations for Few-Shot Learning

paper

Continual Learning/Life-long Learning

[2] Rainbow Memory: Continual Learning with a Memory of Diverse Samples

[1] Learning the Superpixel in a Non-iterative and Lifelong Manner

Visual Reasoning

[1] Transformation Driven Visual Reasoning

paper|code|project

Transfer Learning/Domain Adaptation

[6] Semi-supervised Domain Adaptation based on Dual-level Domain Mixing for Semantic Segmentation

paper

[5] Multi-Source Domain Adaptation with Collaborative Learning for Semantic Segmentation

paper

[4] Continual Adaptation of Visual Representations via Domain Randomization and Meta-learning

paper

[3] Domain Generalization via Inference-time Label-Preserving Target Projections

paper

[2] MetaSCI: Scalable and Adaptive Reconstruction for Video Compressive Sensing

paper|code

[1] FSDR: Frequency Space Domain Randomization for Domain Generalization

paper

Contrastive Learning

[1] Fine-grained Angular Contrastive Learning with Coarse Labels

paper

Reinforcement Learning

[1] Unsupervised Learning for Robust Fitting: A Reinforcement Learning Approach

paper

Uncategorized

Consensus Maximisation Using Influences of Monotone Boolean Functions

paper

Nutrition5k: Towards Automatic Nutritional Understanding of Generic Food

paper

Structured Scene Memory for Vision-Language Navigation

paper|code

Learning Asynchronous and Sparse Human-Object Interaction in Videos

paper

Self-supervised Geometric Perception

paper

Quantifying Explainers of Graph Neural Networks in Computational Pathology

paper

Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts

paper|project|video

Data-Free Model Extraction

paper

Patch-NetVLAD: Multi-Scale Fusion of Locally-Global Descriptors for Place Recognition

paper|code

Right for the Right Concept: Revising Neuro-Symbolic Concepts by Interacting with their Explanations

paper

Multi-Objective Interpolation Training for Robustness to Label Noise

paper|code

VX2TEXT: End-to-End Learning of Video-Based Text Generation From Multimodal Inputs (text generation)

paper

Scan2Cap: Context-aware Dense Captioning in RGB-D Scans (image captioning)

paper|code|project|video

Hierarchical and Partially Observable Goal-driven Policy Learning with Goals Relational Graph

paper

ID-Unet: Iterative Soft and Hard Deformation for View Synthesis

paper

PML: Progressive Margin Loss for Long-tailed Age Classification (long-tailed distribution, image classification)

paper

Diversifying Sample Generation for Data-Free Quantization (image generation)

paper

Domain Generalization via Inference-time Label-Preserving Target Projections

paper

DeRF: Decomposed Radiance Fields

project

Densely connected multidilated convolutional networks for dense prediction tasks (dense prediction)

paper

Weakly-supervised Grounded Visual Question Answering using Capsules

FLAVR: Flow-Agnostic Video Representations for Fast Frame Interpolation (video frame interpolation)

paper|code|project

Probabilistic Embeddings for Cross-Modal Retrieval

paper

Self-supervised Simultaneous Multi-Step Prediction of Road Dynamics and Cost Map

IIRC: Incremental Implicitly-Refined Classification

paper|project

Fair Attribute Classification through Latent Space De-biasing

paper|code|project

Information-Theoretic Segmentation by Inpainting Error Maximization

paper

UC2: Universal Cross-lingual Cross-modal Vision-and-Language Pretraining (video-language learning)

Less is More: CLIPBERT for Video-and-Language Learning via Sparse Sampling

paper|code

D-NeRF: Neural Radiance Fields for Dynamic Scenes

paper|project

Weakly Supervised Learning of Rigid 3D Scene Flow

paper|code|project

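This article was first published on the 极市 computer vision technology community.

WeChat official account: 极市平台 (ID: extrememart)
New computer vision content is posted every day.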
