Daily New Paper Recommendations: Computer Vision [2020-11-20]

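Foreword

We sort each day's new arXiv papers into categories and send them to you. If you would like us to cover an additional field or direction, send us a private message.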

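Today's update: 54 papers

Image Classification: 0 papers
Object Detection: 3 papers
Image Segmentation: 8 papers
Object Tracking: 1 paper
Face Recognition: 1 paper
3D: 7 papers
GAN: 8 papers
Other: 31 papers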

Reply with [20201120] in the account's message box to get the bundled papers.

Object Detection: 3 papers

[0] Learning to Predict the 3D Layout of a Scene

Authors: Jihao Andreas Lin, Jakob Brünker, Daniel Fährmann

Link: http://arxiv.org/abs/2011.09977

[1] Geography-Aware Self-Supervised Learning

Authors: Kumar Ayush, Burak Uzkent, Chenlin Meng, Marshall Burke, David Lobell, Stefano Ermon

Link: http://arxiv.org/abs/2011.09980

[2] Propagate Yourself: Exploring Pixel-Level Consistency for Unsupervised Visual Representation Learning

Authors: Zhenda Xie, Yutong Lin, Zheng Zhang, Yue Cao, Stephen Lin, Han Hu

Link: http://arxiv.org/abs/2011.10043

Image Segmentation: 8 papers

[0] Bidirectional RNN-based Few Shot Learning for 3D Medical Image Segmentation

Authors: Soopil Kim, Sion An, Philip Chikontwe, Sang Hyun Park

Link: http://arxiv.org/abs/2011.09608

Comments: Submitted to AAAI21

[1] Deep LF-Net: Semantic Lung Segmentation from Indian Chest Radiographs Including Severely Unhealthy Images

Authors: Anushikha Singh, Brejesh Lall, B. K. Panigrahi, Anjali Agrawal, Anurag Agrawal, DJ Christopher, Balamugesh Thangakunam

Link: http://arxiv.org/abs/2011.09695

[2] Attention-Based Transformers for Instance Segmentation of Cells in Microstructures

Authors: Tim Prangemeier, Christoph Reich, Heinz Koeppl

Link: http://arxiv.org/abs/2011.09763

Comments: Submitted to IEEE BIBM 2020

[3] Foreground-Aware Relation Network for Geospatial Object Segmentation in High Spatial Resolution Remote Sensing Imagery

Authors: Zhuo Zheng, Yanfei Zhong, Junjue Wang, Ailong Ma

Link: http://arxiv.org/abs/2011.09766

Code: https://github.com/Z-Zheng/FarSeg

Comments: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020

[4] Unifying Instance and Panoptic Segmentation with Dynamic Rank-1 Convolutions

Authors: Hao Chen, Chunhua Shen, Zhi Tian

Link: http://arxiv.org/abs/2011.09796

Code: https://git.io/AdelaiDet

[5] DCT-Mask: Discrete Cosine Transform Mask Representation for Instance Segmentation

Authors: Xing Shen, Jirui Yang, Chunbo Wei, Bing Deng, Jianqiang Huang, Xiansheng Hua, Xiaoliang Cheng, Kewei Liang

Link: http://arxiv.org/abs/2011.09876

[6] Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR Segmentation

Authors: Xinge Zhu, Hui Zhou, Tai Wang, Fangzhou Hong, Yuexin Ma, Wei Li, Hongsheng Li, Dahua Lin

Link: http://arxiv.org/abs/2011.10033

Code: https://github.com/xinge008/Cylinder3D

Comments: This work achieves the 1st place in the leaderboard of SemanticKITTI (until CVPR DDL) and based on this work, we also achieve the 1st place in the leaderboard of SemanticKITTI panoptic segmentation

[7] Propagate Yourself: Exploring Pixel-Level Consistency for Unsupervised Visual Representation Learning

Authors: Zhenda Xie, Yutong Lin, Zheng Zhang, Yue Cao, Stephen Lin, Han Hu

Link: http://arxiv.org/abs/2011.10043

Object Tracking: 1 paper

[0] TRAT: Tracking by Attention Using Spatio-Temporal Features

Authors: Hasan Saribas, Hakan Cevikalp, Okan Köpüklü, Bedirhan Uzun

Link: http://arxiv.org/abs/2011.09524

Face Recognition: 1 paper

[0] Visual Diver Face Recognition for Underwater Human-Robot Interaction

Authors: Jungseok Hong, Sadman Sakib Enan, Christopher Morse, Junaed Sattar

Link: http://arxiv.org/abs/2011.09556

3D: 7 papers

[0] TRAT: Tracking by Attention Using Spatio-Temporal Features

Authors: Hasan Saribas, Hakan Cevikalp, Okan Köpüklü, Bedirhan Uzun

Link: http://arxiv.org/abs/2011.09524

[1] Bidirectional RNN-based Few Shot Learning for 3D Medical Image Segmentation

Authors: Soopil Kim, Sion An, Philip Chikontwe, Sang Hyun Park

Link: http://arxiv.org/abs/2011.09608

Comments: Submitted to AAAI21

[2] Face Forgery Detection by 3D Decomposition

Authors: Xiangyu Zhu, Hao Wang, Hongyan Fei, Zhen Lei, Stan Z. Li

Link: http://arxiv.org/abs/2011.09737

[3] All-in-Focus Iris Camera With a Great Capture Volume

Authors: Kunbo Zhang, Zhenteng Shen, Yunlong Wang, Zhenan Sun

Link: http://arxiv.org/abs/2011.09908

Comments: To be published in International Joint Conference on Biometrics 2020

[4] Learning to Predict the 3D Layout of a Scene

Authors: Jihao Andreas Lin, Jakob Brünker, Daniel Fährmann

Link: http://arxiv.org/abs/2011.09977

[5] Multi-Plane Program Induction with 3D Box Priors

Authors: Yikai Li, Jiayuan Mao, Xiuming Zhang, Bill Freeman, Josh Tenenbaum, Noah Snavely, Jiajun Wu

Link: http://arxiv.org/abs/2011.10007

Comments: NeurIPS 2020. First two authors contributed equally. Project page: this http URL

[6] Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR Segmentation

Authors: Xinge Zhu, Hui Zhou, Tai Wang, Fangzhou Hong, Yuexin Ma, Wei Li, Hongsheng Li, Dahua Lin

Link: http://arxiv.org/abs/2011.10033

Code: https://github.com/xinge008/Cylinder3D

GAN: 8 papers

[0] Contextual Fusion For Adversarial Robustness

Authors: Aiswarya Akumalla, Seth Haney, Maksim Bazhenov

Link: http://arxiv.org/abs/2011.09526

[1] Robustified Domain Adaptation

Authors: Jiajin Zhang, Hanqing Chao, Pingkun Yan

Link: http://arxiv.org/abs/2011.09563

[2] Abnormal Event Detection in Urban Surveillance Videos Using GAN and Transfer Learning

Authors: Ali Atghaei, Soroush Ziaeinejad, Mohammad Rahmati

Link: http://arxiv.org/abs/2011.09619

Comments: 7 pages, 9 figures, 3 tables

[3] Watch and Learn: Mapping Language and Noisy Real-world Videos with Self-supervision

Authors: Yujie Zhong, Linhai Xie, Sen Wang, Lucia Specia, Yishu Miao

Link: http://arxiv.org/abs/2011.09634

Comments: NeurIPS 2020 Self-Supervised Learning Workshop

[4] Style Intervention: How to Achieve Spatial Disentanglement with Style-based Generators?

Authors: Yunfan Liu, Qi Li, Zhenan Sun, Tieniu Tan

Link: http://arxiv.org/abs/2011.09699

[5] An Experimental Study of Semantic Continuity for Deep Learning Models

Authors: Shangxi Wu, Jitao Sang, Xian Zhao, Lizhang Chen

Link: http://arxiv.org/abs/2011.09789

[6] Adversarial Threats to DeepFake Detection: A Practical Perspective

Authors: Paarth Neekhara, Brian Dolhansky, Joanna Bitton, Cristian Canton Ferrer

Link: http://arxiv.org/abs/2011.09957

[7] Creative Sketch Generation

Authors: Songwei Ge, Vedanuj Goswami, C. Lawrence Zitnick, Devi Parikh

Link: http://arxiv.org/abs/2011.10039

Code: https://github.com/facebookresearch/DoodlerGAN

Other: 31 papers

[0] Extracting and Learning Fine-Grained Labels from Chest Radiographs

Authors: Tanveer Syeda-Mahmood, Ph.D, K.C.L Wong, Ph.D, Joy T. Wu, M.D., M.P.H, Ashutosh Jadhav, Ph.D, Orest Boyko, M.D. Ph.D

Link: http://arxiv.org/abs/2011.09517

Comments: This paper won the Homer R. Warner Award at AMIA 2020 awarded to a paper that best describes approaches to improving computerized information acquisition, knowledge data acquisition and management, and experimental results documenting the value of these approaches. The paper shows a combination of textual and visual processing to automatically recognize complex findings in chest X-rays

[1] Neuro-Symbolic Representations for Video Captioning: A Case for Leveraging Inductive Biases for Vision and Language

Authors: Hassan Akbari, Hamid Palangi, Jianwei Yang, Sudha Rao, Asli Celikyilmaz, Roland Fernandez, Paul Smolensky, Jianfeng Gao, Shih-Fu Chang

Link: http://arxiv.org/abs/2011.09530

Code: https://github.com/hassanhub/R3Transformer

[2] StressNet: Detecting Stress in Thermal Videos

Authors: Satish Kumar, A S M Iftekhar, Michael Goebel, Tom Bullock, Mary H. MacLean, Michael B. Miller, Tyler Santander, Barry Giesbrecht, Scott T. Grafton, B.S. Manjunath

Link: http://arxiv.org/abs/2011.09540

Comments: 11 pages, 10 figures, 2 tables, Conference WACV2021

[3] An Efficient and Scalable Deep Learning Approach for Road Damage Detection

Authors: Sadra Naddaf-sh, M-Mahdi Naddaf-sh, Amir R. Kashani, Hassan Zargarzadeh

Link: http://arxiv.org/abs/2011.09577

Code: https://github.com/mahdi65/roadDamageDetection2020

[4] Patient-independent Epileptic Seizure Prediction using Deep Learning Models

Authors: Theekshana Dissanayake, Tharindu Fernando, Simon Denman, Sridha Sridharan, Clinton Fookes

Link: http://arxiv.org/abs/2011.09581

[5] ACRONYM: A Large-Scale Grasp Dataset Based on Simulation

Authors: Clemens Eppner, Arsalan Mousavian, Dieter Fox

Link: http://arxiv.org/abs/2011.09584

[6] Deep Multi-view Depth Estimation with Predicted Uncertainty

Authors: Tong Ke, Tien Do, Khiem Vuong, Kourosh Sartipi, Stergios I. Roumeliotis

Link: http://arxiv.org/abs/2011.09594

[7] HMFlow: Hybrid Matching Optical Flow Network for Small and Fast-Moving Objects

Authors: Suihanjin Yu, Youmin Zhang, Chen Wang, Xiao Bai, Liang Zhang, Edwin R. Hancock

Link: http://arxiv.org/abs/2011.09654

Comments: 8 pages, 10 figures

[8] Modeling Fashion Influence from Photos

Authors: Ziad Al-Halah, Kristen Grauman

Link: http://arxiv.org/abs/2011.09663

Comments: To appear in the IEEE Transactions on Multimedia, 2020. Project page: this https URL. arXiv admin note: substantial text overlap with arXiv:2004.01316

[9] Dense Label Encoding for Boundary Discontinuity Free Rotation Detection

Authors: Xue Yang, Liping Hou, Yue Zhou, Wentao Wang, Junchi Yan

Link: http://arxiv.org/abs/2011.09670

Code: https://github.com/Thinklab-SJTU/DCL_RetinaNet_Tensorflow

Comments: 12 pages, 6 figures, 8 tables

[10] Defocus Blur Detection via Salient Region Detection Prior

Authors: Ming Qian, Min Xia, Chunyi Sun, Zhiwei Wang, Liguo Weng

Link: http://arxiv.org/abs/2011.09677

[11] Learning Deep Video Stabilization without Optical Flow

Authors: Muhammad Kashif Ali, Sangjoon Yu, Tae Hyun Kim

Link: http://arxiv.org/abs/2011.09697

[12] Spectral Response Function Guided Deep Optimization-driven Network for Spectral Super-resolution

Authors: Jiang He, Jie Li, Qiangqiang Yuan, Huanfeng Shen, Liangpei Zhang

Link: http://arxiv.org/abs/2011.09701

[13] Latent-Separated Global Prediction for Learned Image Compression

Authors: Zongyu Guo, Zhizheng Zhang, Runsen Feng, Simeng Sun, Zhibo Chen

Link: http://arxiv.org/abs/2011.09704

Comments: 7 pages in main paper, with appendix

[14] Scene text removal via cascaded text stroke detection and erasing

Authors: Xuewei Bian, Chaoqun Wang, Weize Quan, Juntao Ye, Xiaopeng Zhang, Dong-Ming Yan

Link: http://arxiv.org/abs/2011.09768

Comments: 14 pages, 9 figures

[15] Deep Learning for Automated Screening of Tuberculosis from Indian Chest X-rays: Analysis and Update

Authors: Anushikha Singh, Brejesh Lall, B.K. Panigrahi, Anjali Agrawal, Anurag Agrawal, Balamugesh Thangakunam, DJ Christopher

Link: http://arxiv.org/abs/2011.09778

[16] Towards Spatio-Temporal Video Scene Text Detection via Temporal Clustering

Authors: Yuanqiang Cai, Chang Liu, Weiqiang Wang, Qixiang Ye

Link: http://arxiv.org/abs/2011.09781

[17] DeepMorph: A System for Hiding Bitstrings in Morphable Vector Drawings

Authors: Søren Rasmussen, Karsten Østergaard Noe, Oliver Gyldenberg Hjermitslev, Henrik Pedersen

Link: http://arxiv.org/abs/2011.09783

[18] TaL: a synchronised multi-speaker corpus of ultrasound tongue imaging, audio, and lip videos

Authors: Manuel Sam Ribeiro, Jennifer Sanger, Jing-Xuan Zhang, Aciel Eshky, Alan Wrench, Korin Richmond, Steve Renals

Link: http://arxiv.org/abs/2011.09804

Comments: 8 pages, 4 figures, Accepted to SLT2021, IEEE Spoken Language Technology Workshop

[19] Unmixing Convolutional Features for Crisp Edge Detection

Authors: Linxi Huan, Xianwei Zheng, Nan Xue, Wei He, Jianya Gong, Gui-Song Xia

Link: http://arxiv.org/abs/2011.09808

[20] Interval-valued aggregation functions based on moderate deviations applied to Motor-Imagery-Based Brain Computer Interface

Authors: Javier Fumanal-Idocin, Zdenko Takáč, Javier Fernández, Jose Antonio Sanz, Harkaitz Goyena, Ching-Teng Lin, Yu-Kai Wang, Humberto Bustince

Link: http://arxiv.org/abs/2011.09831

[21] Differentiable Data Augmentation with Kornia

Authors: Jian Shi, Edgar Riba, Dmytro Mishkin, Francesc Moreno, Anguelos Nicolaou

Link: http://arxiv.org/abs/2011.09832

[22] Everybody Sign Now: Translating Spoken Language to Photo Realistic Sign Language Video

Authors: Ben Saunders, Necati Cihan Camgoz, Richard Bowden

Link: http://arxiv.org/abs/2011.09846

[23] Recursive Deep Prior Video: a Super Resolution algorithm for Time-Lapse Microscopy of organ-on-chip experiments

Authors: Pasquale Cascarano, Maria Colomba Comes, Arianna Mencattini, Maria Carla Parrini, Elena Loli Piccolomini, Eugenio Martinelli

Link: http://arxiv.org/abs/2011.09855

Comments: Paper submitted to a peer-reviewed journal

[24] DeepRepair: Style-Guided Repairing for DNNs in the Real-world Operational Environment

Authors: Bing Yu, Hua Qi, Qing Guo, Felix Juefei-Xu, Xiaofei Xie, Lei Ma, Jianjun Zhao

Link: http://arxiv.org/abs/2011.09884

Comments: 14 pages, 5 figures

[25] Learning in School: Multi-teacher Knowledge Inversion for Data-Free Quantization

Authors: Yuhang Li, Feng Zhu, Ruihao Gong, Mingzhu Shen, Fengwei Yu, Shaoqing Lu, Shi Gu

Link: http://arxiv.org/abs/2011.09899

[26] Using Text to Teach Image Retrieval

Authors: Haoyu Dong, Ze Wang, Qiang Qiu, Guillermo Sapiro

Link: http://arxiv.org/abs/2011.09928

[27] Heterogeneous Contrastive Learning: Encoding Spatial Information for Compact Visual Representations

Authors: Xinyue Huo, Lingxi Xie, Longhui Wei, Xiaopeng Zhang, Hao Li, Zijie Yang, Wengang Zhou, Houqiang Li, Qi Tian

Link: http://arxiv.org/abs/2011.09941

Comments: 10 pages, 4 figures, 6 tables

[28] A Preliminary Comparison Between Compressive Sampling and Anisotropic Mesh-based Image Representation

Authors: Xianping Li, Teresa Wu

Link: http://arxiv.org/abs/2011.09944

Comments: 9 pages, 3 figures, 2 tables

[29] Proposing method to Increase the detection accuracy of stomach cancer based on colour and lint features of tongue using CNN and SVM

Authors: Elham Gholami, Seyed Reza Kamel Tabbakh, Maryam Kheirabadi

Link: http://arxiv.org/abs/2011.09962

[30] The Cube++ Illumination Estimation Dataset

Authors: Egor Ershov, Alex Savchik, Illya Semenkov, Nikola Banić, Alexander Belokopytov, Daria Senshina, Karlo Koscević, Marko Subašić, Sven Lončarić

Link: http://arxiv.org/abs/2011.10028

Code: https://github.com/Visillect/CubePlusPlus/

Thanks to arxiv.org
