「CV」 Object Detection Resource Roundup
Figure 1: Object detection development roadmap
1 Surveys
-
Object Detection with Deep Learning: A Review
2018-07-15 paper
$\bullet \bullet$ review
-
Deep Learning for Generic Object Detection: A Survey
IJCV 2018 2018-09-06 paper | blog-shine-lee | blog-Junr_0926 | blog-wyj2046
$\bullet \bullet$ generic -
Object Detection in 20 Years: A Survey
2019-03-13 paper | blog | blog-小金乌会发光-Z&M | blog-Shida | blog-好好学习,天天向上 | blog-Albertchen | blog-oneoverzero | blog-奔跑的Yancy | blog-HYY CS
$\bullet \bullet$ 20 years -
A Survey of Deep Learning-based Object Detection
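2019-07-11 paper | reading notes | blog-Merria28
$\bullet \bullet$ survey
Summarizes the latest deep-learning-based object detection methods and datasets, but does not cover the basics or offer in-depth discussion; -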
Deep Domain Adaptive Object Detection: a Survey
ICIP 2020 2020-02-17 paper
2 Theory
-
The Relationship Between Precision-Recall and ROC Curves
2006-12-13 paper -
Deep Neural Networks for Object Detection
google paper
Neural networks for object detection; -
Uncertainty Estimation in One-Stage Object Detection
2019-05-24 paper -
Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression
CVPR 2019 2019-02-25 paper
Designs a new loss function based on IoU;
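The GIoU entry above turns IoU into a loss. As a minimal sketch (not the authors' code), the function below computes IoU and GIoU for two axis-aligned boxes, where GIoU subtracts from IoU the fraction of the smallest enclosing box not covered by the union. The `(x1, y1, x2, y2)` box format and the function names are assumptions made for illustration.

```python
def box_area(box):
    """Area of an (x1, y1, x2, y2) box; degenerate boxes count as 0."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def iou_and_giou(a, b):
    """Return (IoU, GIoU) for two boxes in (x1, y1, x2, y2) format."""
    # Intersection rectangle (zero area when the boxes do not overlap).
    inter = box_area((max(a[0], b[0]), max(a[1], b[1]),
                      min(a[2], b[2]), min(a[3], b[3])))
    union = box_area(a) + box_area(b) - inter
    # Smallest axis-aligned box enclosing both inputs.
    enclose = box_area((min(a[0], b[0]), min(a[1], b[1]),
                        max(a[2], b[2]), max(a[3], b[3])))
    iou = inter / union if union > 0 else 0.0
    giou = iou - (enclose - union) / enclose if enclose > 0 else iou
    return iou, giou


def giou_loss(a, b):
    """Loss in [0, 2]: 0 for identical boxes, larger as boxes drift apart."""
    return 1.0 - iou_and_giou(a, b)[1]
```

Unlike a plain IoU loss, this value keeps changing when the boxes are disjoint, so it still gives a signal about how far apart they are.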
3 Detection Frameworks
3.1 One-stage
-
You Only Look Once: Unified, Real-Time Object Detection
2015-06-08 paper | home
YOLO
anchor free? -
SSD: Single Shot MultiBox Detector
ECCV 2016 Oral 2015-12-08 Paper | caffe-official | Caffe | mxnet | mxnet-cpp | Keras | Keras | Tensorflow | Pytorch | pytorch-more | TensorRT3 | TensorRT4 | ncnn | caffe | caffe
Multiple feature maps; -
Feature Pyramid Networks for Object Detection
CVPR 2017 2016-12-09 paper
FPN
Feature pyramid network -
YOLO9000: Better, Faster, Stronger
2016-12-25 paper
YOLOv2 -
DSSD : Deconvolutional Single Shot Detector
2017-01-23 paper | mxnet
DSSD: uses a top-down network structure to address small object detection -
Enhancement of SSD by concatenating feature maps for object detection
2017-05-26 paper
RSSD -
DSOD: Learning Deeply Supervised Object Detectors from Scratch
ICCV 2017 2017-08-03 paper | caffe-official | pytorch | pytorch | pytorch | pytorch | mxnet | mxnet | tensorflow
Trains the detection model from scratch: network design and training strategy; accuracy improves, speed is slightly slower
DSOD
-
Single-Shot Refinement Neural Network for Object Detection
CVPR 2018 2017-11-18 paper | caffe
RefineDet: combines the RPN of Faster R-CNN with FPN-style feature fusion, uses SSD for detection, and improves detection of small objects; -
Receptive Field Block Net for Accurate and Fast Object Detection
ECCV 2018 2017-11-21 paper | pytorch-official
Improves SSD by fusing multiple receptive fields;
RFBNet -
R-FCN-3000 at 30fps: Decoupling Detection and Classification
CVPR 2018 2017-12-05 paper
R-FCN-3000
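YOLO9000 merges detection and classification datasets to train the detector; this paper instead trains the detection classifier using only the ImageNet dataset with auxiliary bounding-box information -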
FSSD: Feature Fusion Single Shot Multibox Detector
2017-12-04 paper | caffe-official
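FSSD: fuses features, then downsamples again before prediction; speed drops slightly to 65 fps on a 1080Ti, and mAP rises to 82.7;
In theory it should work better on small objects; -
Single-Shot Object Detection with Enriched Semantics
CVPR 2018 2017-12-01 paper
DES: adds a semantic segmentation branch and a global activation module on top of SSD; the former enriches the low-level detection features, while the latter learns the semantic relationship between feature channels and object classes to enrich the high-level detection features; -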
Scale-Transferrable Object Detection
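CVPR 2018 2017 paper
STDN: improves adaptability to multiple scales (especially small objects); the backbone is DenseNet and detection follows SSD; introduces a scale-transfer layer that produces large feature maps with almost no extra parameters or computation (most other common methods use deconvolution or upsampling); -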
YOLOv3: An Incremental Improvement
2018-04-08 paper | home | darknet-行人 | darknet | tensorflow2 | 复现代码合集 | pytorch-app
YOLO V3 -
R-FCN++: Towards Accurate Region-Based Fully Convolutional Networks for Object Detection
AAAI 2018 2018-04-27 paper -
Object detection at 200 Frames Per Second
2018-05-16 paper -
You Only Look Twice: Rapid Multi-Scale Object Detection In Satellite Imagery
2018-05-24 paper -
M2Det: A Single-Shot Object Detector based on Multi-Level Feature Pyramid Network
AAAI 2019 2018-11-12 paper | pytorch-official -
ThunderNet: Towards Real-time Generic Object Detection
2019-03-28 paper | paper with code | pytorch | pytorch
First real-time detector on ARM, close to 25 fps; -
Libra R-CNN: Towards Balanced Learning for Object Detection
CVPR 2019 2019-04-04 paper | paper with code | mmdetection -
FCOS: Fully Convolutional One-Stage Object Detection
2019-04-02 paper | pytorch-official
FCOS
With or without anchors? -
xYOLO: A Model For Real-Time Object Detection In Humanoid Soccer On Low-End Hardware
2019-10-08 paper
Further improves on TinyYOLO for higher speed; also releases a soccer dataset; -
EfficientDet: Scalable and Efficient Object Detection
CVPR 2020 2019-11-20 paper | tensorflow-official | pytorch-new retail | paper with code | pytorch | pytorch | pytorch tvm | pytorch tracker -
YOLOv4: Optimal Speed and Accuracy of Object Detection
2020-04-23 paper | darknet | pytorch-app | How to train YOLOv4 and YOLOv5 -
OneNet: Towards End-to-End One-Stage Object Detection
2020-12-10 paper | pytorch-official
Redefines the strategy for matching predicted boxes to ground truth;
OneNet
3.2 Two-stage
-
Rich feature hierarchies for accurate object detection and semantic segmentation
2013-11 Paper
RCNN -
Fast R-CNN
2015-04-30 Paper
Achieves end-to-end detection with shared features; -
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
2015-06 Paper | pytorch
Introduces the landmark idea of anchor boxes; -
HyperNet: Towards Accurate Region Proposal Generation and Joint Object Detection
2016-04-03 paper -
R-FCN: Object Detection via Region-based Fully Convolutional Networks
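NIPS 2016 2016-05-20 Jifeng Dai paper | caffe-official | mxnet-official
RFCN
Adds position-sensitive score maps on the prediction feature maps to strengthen positional information and improve detection accuracy -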
Mask R-CNN
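ICCV 2017 2017-03-20 paper | keras | detectron-official | mxnet
Performs segmentation while also improving detection: fixes the problem that RoIPooling distorts the RoI region during pooling and extracts imprecise positional information; accomplishes segmentation by modifying the Faster R-CNN architecture -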
A-Fast-RCNN: Hard Positive Generation via Adversary for Object Detection
CVPR 2017 2017-04-11 paper | caffe
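Proposes learning an adversarial network that generates occluded and deformed examples; the adversary's goal is to generate examples that the object detector finds hard to classify; in the proposed framework, the original detector and the adversary are learned jointly -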
RON-Reverse Connection with Objectness Prior Networks for Object Detection
CVPR 2017 2017-07-06 paper | tensorflow | tensorflow
Heatmaps; -
Cascade R-CNN: Delving into High Quality Object Detection
CVPR 2018 2017-12-03 paper | caffe
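Built on a two-stage detector; Cascade R-CNN is a multi-stage extension of R-CNN, consisting of a sequence of detectors trained with increasing IoU thresholds, which makes them progressively more selective against close false positives; the cascade stages are trained sequentially, using the output of one stage to train the next; this resembles bootstrapping methods, except that the resampling in Cascade R-CNN does not aim to mine hard negatives; instead, by adjusting the bounding boxes, each stage aims to find a good set of close false positives for training the next stage -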
Path Aggregation Network for Instance Segmentation
CVPR 2018 2018-03-05 paper | Detection-official | paper with code | keras
An extension of Mask R-CNN that reworks the FPN and adds adaptive RoI handling; -
Domain Adaptive Faster R-CNN for Object Detection in the Wild
2018-03-08 paper -
Light-Weight RetinaNet for Object Detection
2019-05-24 paper -
Sparse R-CNN: End-to-End Object Detection with Learnable Proposals
2020-11-25 paper | pytorch-official | blog
Makes the anchors sparse;
3.3 Anchor free
-
A closer look: Small object detection in faster R-CNN
No PDF; presents an improved scheme for generating anchor proposals and modifies Faster R-CNN to exploit higher-resolution feature maps for small objects https://blog.csdn.net/zhangjunhit/article/details/78900298 -
Scale-aware Pixel-wise Object Proposal Networks
2016-01-19 paper
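[2015.1 - 2015.7] Proposes the Scale-aware Pixel-wise Object Proposal (SPOP) network, which generates proposals with high recall and average best overlap (ABO), even for small objects; in addition, it introduces a segmentation-like pixel-wise localization network that densely predicts object coordinates for each pixel, and develops a scale-aware object localization strategy that combines predictions from a large-size network and a small-size network with a weighting mechanism to improve coordinate prediction accuracy across object sizes -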
AutoAssign: Differentiable Label Assignment for Dense Object Detection
2020-07-07 Megvii paper | pytorch-official
AutoAssign
For dense object detection; outperforms all one-stage detectors of its time, such as ATSS, FreeAnchor and FCOS, by about 2%; -
End-to-End Object Detection with Fully Convolutional Network
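2020-12-07 Megvii, Xi'an Jiaotong University paper | pytorch-official
DeFCN
Based on FCOS; fully convolutional, so it needs neither anchors nor NMS and is end-to-end; inspired by DETR, it designs POTO; accuracy is on par with FCOS; speed is about the same;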
3.4 Transformer
-
End-to-End Object Detection with Transformers
2020-05-26 facebook paper | pytorch | blog
DETR -
Deformable DETR: Deformable Transformers for End-to-End Object Detection
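2020-10-08 SenseTime, USTC, CUHK paper
DETR suffers from slow convergence; here attention attends only to elements near a reference point; the deformable-convolution idea speeds up training, cutting training time to 1/10, and improves accuracy on small objects; -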
DETR for Pedestrian Detection
2020-12-12 paper
Addresses the transformer's weakness in handling occlusion; -
Toward Transformer-Based Object Detection
2020-12-17 paper
Class-agnostic
- Learning Objectness from Sonar Images for Class-Independent Object Detection
2019-07-01 paper | github
Detects objects without class information; objects are found by outputting a large number of candidate boxes;
Saliency-based
- Location, location, location: Satellite image-based real-estate appraisal
2020-06-04 paper
Assesses the relationship between satellite imagery and house prices;
Others
-
Deformable Convolutional Networks
ICCV 2017 oral 2017-03-17 Jifeng Dai paper | mxnet | blog -
Towards Universal Object Detection by Domain Attention
CVPR 2019 2019-04-09 paper | project | code-official-coming
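Uses attention to address poor generalization; it reads like filler work: this is just an ensemble of several models, fused at the end with so-called attention, and not much different from a voting scheme or ensemble learning in general;
Also, the improvement is made to the base network, so it belongs under classification rather than being confined to the detection task; -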
Receptive Field Block Net for Accurate and Fast Object Detection
RFB [2017.11 - 2018.7] https://arxiv.org/abs/1711.07767 -
An Analysis of Scale Invariance in Object Detection - SNIP
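SNIP [2017.11 - 2018.3] CVPR 2018; can be seen as a modified Image Pyramid; analyzes the relationship between small object scales and the scale of the pretrained model, and proposes Scale Normalization for Image Pyramids (SNIP): during training, gradients are back-propagated only for proposals whose size falls within a pre-specified range, while proposals that are too large or too small are ignored; at test time, an Image Pyramid of different sizes is built, the detector is run on every image, only outputs whose size falls within the specified range are kept, and the results are merged with NMS; this ensures the network always trains on objects of the same scale, which is what Scale Normalized in the title means https://arxiv.org/abs/1711.08189 http://bit.ly/2yXVg4c -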
Zero-Shot Detection
[2018.3] https://arxiv.org/abs/1803.07113 -
Zero-Shot Object Detection
[2018] eccv, https://arxiv.org/abs/1804.04340 -
Zero-Shot Object Detection by Hybrid Region Embedding
[2018] https://arxiv.org/abs/1805.06157 -
SAN: Learning Relationship between Convolutional Features for Multi-Scale Object Detection
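SAN [2018.8] Proposes the Scale Aware Network (SAN), which maps convolutional features from different scales into a scale-invariant subspace, and designs a unique learning method that considers purely the relationships between channels, without spatial information; the proposed SAN reduces feature differences in scale space and improves detection accuracy https://arxiv.org/abs/1808.04974v1 -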
Multi-scale Location-aware Kernel Representation for Object Detection
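[2018.4] CVPR; proposes a novel multi-scale location-aware kernel representation (MLKP) that incorporates discriminative high-order statistics into the representation of object proposals for effective object detection; MLKP is based on polynomial kernel approximation and can efficiently generate low-dimensional high-order representations, and its inherent location preservation and sensitivity mean it can be used flexibly for object detection https://arxiv.org/abs/1804.00428 caffe https://github.com/Hwang64/MLKP -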
CFENet: An Accurate and Efficient Single-Shot Object Detector for Autonomous Driving
CFENet [2018.6 - 2018.10] https://arxiv.org/abs/1805.09790 -
RepMet: Representative-based metric learning for classification and one-shot object detection
CVPR 2019 2018-06-12 paper
One-shot detection; -
Acquisition of Localization Confidence for Accurate Object Detection
IoU-Net [2018.7.30] ECCV; learns to predict the IoU between each detected bounding box and its matched ground truth https://arxiv.org/abs/1807.11590 -
SOD-MTGAN: Small Object Detection via Multi-Task Generative Adversarial Network
[2018] ECCV, no PDF; proposes an end-to-end multi-task generative adversarial network (MTGAN) for small object detection that works with any existing detector; In the MTGAN, the generator network produces super-resolved images and the multi-task discriminator network is introduced to distinguish the real high-resolution images from fake ones, predict object categories, and refine bounding boxes, simultaneously. More importantly, the classification and regression losses are back-propagated to further guide the generator network to produce super-resolved images for easier classification and better localization -
MetaAnchor: Learning to Detect Objects with Customized Anchors
MetaAnchor [2018.7 - 2018.11] NIPS; Megvii -
CornerNet: Detecting Objects as Paired Keypoints
CornerNet [2018.8] ECCV 2018; University of Michigan https://arxiv.org/abs/1808.01244 explainer https://zhuanlan.zhihu.com/p/41865617 -
Deep Feature Pyramid Reconfiguration for Object Detection
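[2018.8] ECCV; current feature pyramid designs are still inefficient at integrating semantic information across scales; this paper recasts the feature pyramid as a feature reconfiguration process and proposes a highly nonlinear yet computationally fast structure that integrates low-level representations with high-level semantic features; the network has two modules, global attention and local reconfiguration, which extract task-relevant features globally and locally, respectively, across different spatial locations and scales; importantly, both modules are lightweight, pluggable and end-to-end trainable https://arxiv.org/abs/1808.07993?context=cs -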
Recent Advances in Object Detection in the Age of Deep Convolutional Neural Networks
[2018.9] Object detection survey https://arxiv.org/abs/1809.03193 -
Context Refinement for Object Detection
ECCV 2018 http://openaccess.thecvf.com/content_ECCV_2018/papers/Zhe_Chen_Context_Refinement_for_ECCV_2018_paper.pdf -
Hybrid Knowledge Routed Modules for Large-scale Object Detection
[2018.10] NIPS; http://cn.arxiv.org/abs/1810.12681 -
Polarity Loss for Zero-shot Object Detection
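[2018.11] https://arxiv.org/abs/1811.08982 -
Parallel Feature Pyramid Network for Object Detection
[2018] ECCV; uses an SPP module to generate pyramid-shaped feature maps by widening the network rather than deepening it; proposes an MSCA module that effectively combines context information at different scales http://openaccess.thecvf.com/content_ECCV_2018/papers/Seung-Wook_Kim_Parallel_Feature_Pyramid_ECCV_2018_paper.pdf -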
YOLO-LITE: A Real-Time Object Detection Algorithm Optimized for Non-GPU Computers
YOLO-LITE [2018.11] 10x faster than tiny-yolo; https://arxiv.org/abs/1811.05588v1 https://github.com/reu2018dl/yolo-lite https://mp.weixin.qq.com/s/xNaXPwI1mQsJ2Y7TT07u3g -
Progressive Attention Guided Recurrent Network for Salient Object Detection
http://openaccess.thecvf.com/content_cvpr_2018/CameraReady/1235.pdf -
Bottom-up Object Detection by Grouping Extreme and Center Points
[2019.1] UT Austin; the most accurate among one-stage detectors; inference takes 333 ms; https://arxiv.org/abs/1901.08043 https://github.com/xingyizhou/ExtremeNet https://mp.weixin.qq.com/s/Th6vFE9Gy1Wl-Vf_Lz3w8w -
Scale-Aware Trident Networks for Object Detection
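TridentNet [2019.01] University of Chinese Academy of Sciences & TuSimple; nearly 10 percentage points higher than RetinaNet; mainly addresses the multi-scale problem https://arxiv.org/abs/1901.01892 https://mp.weixin.qq.com/s/7Pi5J8-d1HD2lapAD0_qHA -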
DeRPN: Taking a further step toward more general object detection
DeRPN [2018.11] Improves RPN by decoupling the anchor box into separate width and height dimensions, which reduces the complexity of anchor selection from O(n^2) to O(n) https://arxiv.org/abs/1811.06700v1 https://github.com/HCIILAB/DeRPN https://mp.weixin.qq.com/s/4nJGdV3qF4IsfxLM8BBeqg -
Consistent Optimization for Single-Shot Object Detection
[2019.1] Tsinghua University, University of Pennsylvania and ByteDance https://arxiv.org/abs/1901.06563 https://mp.weixin.qq.com/s/4T90Lac_1GX2uy8xtWb1Ng -
Multiple receptive fields and small-object-focusing weakly-supervised segmentation network for fast object detection
2019-04-19 paper -
A Real-Time Tiny Detection Model for Stem End and Blossom End of Navel Orange
2019-05-24 paper
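The SNIP entry in this section describes keeping only proposals whose size, at a given pyramid scale, falls inside a pre-specified range. A minimal sketch of that selection step follows; the use of sqrt(area) as the size measure, the function names, and the example range are assumptions for illustration, not values taken from the paper.

```python
import math


def in_valid_range(box, lo, hi):
    """True if sqrt(area) of an (x1, y1, x2, y2) box lies in [lo, hi]."""
    w, h = box[2] - box[0], box[3] - box[1]
    size = math.sqrt(max(0.0, w) * max(0.0, h))
    return lo <= size <= hi


def snip_filter(boxes, scale, lo, hi):
    """Indices of boxes whose size, after resizing the image by `scale`,
    falls inside the valid range [lo, hi] (hypothetical range values)."""
    keep = []
    for i, (x1, y1, x2, y2) in enumerate(boxes):
        resized = (x1 * scale, y1 * scale, x2 * scale, y2 * scale)
        if in_valid_range(resized, lo, hi):
            keep.append(i)
    return keep
```

During training the indices returned here would be the only proposals contributing gradients at that scale; at test time the same filter would be applied to detections from each pyramid level before merging them with NMS.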
4 Techniques
4.1 Basic components
4.1.1 backbone
- DetNet: A Backbone network for Object Detection
ECCV 2018 2018-04-17 paper | pytorch
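Analyzes the drawbacks of using classification networks for detection; heavy downsampling hurts detection performance, so the paper designs a new detection backbone, DetNet;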
DetNet
4.1.2 Anchors
-
MetaAnchor: Learning to Detect Objects with Customized Anchors
NIPS 2018 2018-07 Megvii & Fudan University Paper
Dynamic anchors; -
Region Proposal by Guided Anchoring
2019-01 CUHK-SenseTime Joint Lab & Amazon Rekognition & Nanyang Technological University Paper | mmdetection
Guided Anchoring: combines anchors with keypoints
4.1.3 NMS
-
NMS
Faster R-CNN includes an analysis of its effect; -
Soft-NMS – Improving Object Detection With One Line of Code
ICCV 2017 2017-04-14 Paper | code-official
Improves results noticeably under occlusion; -
Softer-NMS: Rethinking Bounding Box Regression for Accurate Object Detection
CVPR 2019 2018-09-23 Paper | code-official -
Acquisition of Localization Confidence for Accurate Object Detection
ECCV 2018 2018-07-30 Paper
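NMS Network: designs IoU-Net to estimate the IoU between a proposal and its corresponding ground-truth box, and proposes a new bounding-box regression algorithm as well as an improved NMS algorithm; explainer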
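To make the difference between the NMS variants listed above concrete, here is a minimal pure-Python sketch of greedy NMS and of Soft-NMS with Gaussian score decay. It is an illustration rather than any of the official implementations; the `(x1, y1, x2, y2)` box format, the function names, and the default thresholds are assumptions.

```python
import math


def iou(a, b):
    """IoU of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_thr=0.5):
    """Greedy NMS: visit boxes by descending score and drop any box that
    overlaps an already kept box by more than iou_thr."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_thr for j in keep):
            keep.append(i)
    return keep


def soft_nms(boxes, scores, sigma=0.5, score_thr=0.001):
    """Gaussian Soft-NMS: decay the scores of overlapping boxes instead of
    removing them. Returns (index, decayed_score) pairs in selection order."""
    scores = list(scores)  # work on a copy so the caller's list is untouched
    remaining = list(range(len(boxes)))
    keep = []
    while remaining:
        best = max(remaining, key=lambda i: scores[i])
        remaining.remove(best)
        if scores[best] < score_thr:
            break
        keep.append((best, scores[best]))
        for i in remaining:
            scores[i] *= math.exp(-iou(boxes[best], boxes[i]) ** 2 / sigma)
    return keep
```

With two heavily overlapping boxes, `nms` discards the lower-scoring one outright, while `soft_nms` keeps it with a reduced score, which is why the soft variant tends to help when objects occlude each other.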
4.2 Challenges
4.2.1 Class imbalance
-
Team PFDet’s Methods for Open Images Challenge 2019
2019-10-25 paper
Handles class imbalance and federated annotation; placed 3rd and 4th in the competition; -
Focal Loss for Dense Object Detection
2017-08-07 Facebook paper | mxnet | detectron-official
RetinaNet
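The focal loss from the RetinaNet entry above tackles class imbalance by down-weighting easy examples: FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t). The sketch below evaluates it for a single binary prediction; the function name is hypothetical, and the defaults alpha = 0.25 and gamma = 2 are the commonly cited settings, used here as assumptions.

```python
import math


def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-12):
    """Binary focal loss for one prediction.

    p: predicted probability of the positive class, in [0, 1]
    y: ground-truth label, 1 (positive) or 0 (negative)
    """
    p_t = p if y == 1 else 1.0 - p              # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    # (1 - p_t) ** gamma shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(max(p_t, eps))
```

Setting `gamma=0` recovers alpha-weighted cross-entropy, so the modulating factor is the only thing that separates the two losses.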
4.2.3 Small objects
- Detecting Small Signs from Large Images
2017-06-26 paper
-
Perceptual Generative Adversarial Networks for Small Object Detection
2017-05-14 paper
P-GAN
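Maps the features of small objects onto those of similar large objects to narrow the gap, so that small objects resemble large ones closely enough to fool the discriminator, thereby achieving small object detection -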
Quantization Mimic: Towards Very Tiny CNN for Object Detection
ECCV 2018 2017-05-15 CUHK, SenseTime paper
Detection, knowledge distillation, hybrid architecture; -
NAS-FPN: Learning Scalable Feature Pyramid Architecture for Object Detection
CVPR 2019 2019-04-16 paper | tensorflow -
Deep High-Resolution Representation Learning for Visual Recognition
2019-08-20 paper | pytorch-official | paper with code
HRNet
4.2.4 Performance
- Pelee: A Real-Time Object Detection System on Mobile Devices
NIPS 2018 paper | caffe
A detection network suited to mobile devices: lightweight and efficient;
4.2.5 Occlusion
- RRC: Accurate Single Stage Detector Using Recurrent Rolling Convolution
CVPR 2017 2017-04-19 paper | caffe
4.3 Innovations
4.3.1 Relations between objects
- Relation Networks for Object Detection
CVPR 2018 2017-11-30 paper | mxnet-official
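During detection, the relations between objects in the image, or the image context, can be used to improve results; these relations include both relative position and image feature relations;
4.3.2 Loss functions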
- Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression
2019-02-25 CVPR 2019 paper
Uses GIoU as the loss; the gain seems small, about 1%;
Others
- MegDet: A Large Mini-Batch Object Detector
CVPR 2018 2017-11-20 paper
Problems caused by the mini-batch size; warmup, cross-GPU BN;
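The warmup mentioned in the MegDet entry above is commonly implemented as a learning-rate ramp over the first training iterations, so that a large-batch run does not start at its full learning rate. A minimal sketch of a linear ramp follows; the function name, the `start_factor` parameter, and the linear shape are assumptions for illustration, not details taken from the paper.

```python
def warmup_lr(step, base_lr, warmup_steps, start_factor=0.1):
    """Learning rate at `step` under linear warmup.

    Ramps from start_factor * base_lr at step 0 up to base_lr at
    warmup_steps, then stays at base_lr.
    """
    if step >= warmup_steps:
        return base_lr
    frac = step / warmup_steps
    return base_lr * (start_factor + (1.0 - start_factor) * frac)
```

In a real training loop this value would be combined with the regular decay schedule that takes over once warmup ends.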
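5 Applications
People, text, traffic (lane lines, traffic signs, vehicles, ship hulls, seat belts), retail (logos, products), medical (lung nodules), video, aerial imagery, nature (clouds), animals, underwater object detection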
- DOTA: A Large-scale Dataset for Object Detection in Aerial Images
CVPR 2018 2017-11-28 paper
Aerial imagery
- Animal Detection in Man-made Environments
2019-10-24 paper | supplementary
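Animal detection in residential areas; the paper shows that animal detection for natural scenes does not transfer to residential scenes; the problem is ultimately solved with a synthetic dataset; the demo includes a number of annotation tools;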
3D
-
Patch Refinement – Localized 3D Object Detection
NIPS 2019 workshop 2019-10-09 paper -
End-to-End Multi-View Fusion for 3D Object Detection in LiDAR Point Clouds
CoRL 2019 2019-10-15 paper
Voxels + KITTI, Waymo Open;
Appendix
A References
- Deep Learning-Based Real-Time Multiple-Object Detection and Tracking from Aerial Imagery via a Flying Robot with GPU-Based Embedded Devices
- CVPR 2017, 2018 - object detection papers
- object detection
B Resources
a Papers
b Code
- Data augmentation
Paperspace, maozezhong
- opencv tracking and detection
- yolo small object detection
mmdetection
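a-PyTorch-Tutorial-to-Object-Detection
- nanodet
A lightweight anchor-free detection network, faster than YOLO4-tiny and with a smaller model;
On ARM 50m, cpu*4 200ms; 1080 GPU 20ms;
c Summaries
Object detection paper reading roadmap, hoya012, ECCV2018 object detection
Meitu: part 1, part 2, part 3
Object detection research survey + LocNet
Anchor mechanism, tricks in YOLOv2
NMS
C Datasets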
Name | Type | Size (train/val/test) | Notes | Release date |
---|---|---|---|---|
COCO | Generic object detection | Microsoft | ||
object365 | ||||