
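1. College of Computer Science and Technology, Jilin University, Changchun, Jilin 130012
2. Key Laboratory of Symbolic Computation and Knowledge Engineering of the Ministry of Education, Jilin University, Changchun, Jilin 130012
3. Editorial Department of the Journal of Jilin University (Engineering and Technology Edition), Jilin University, Changchun, Jilin 130012
4. School of Computer Science and Engineering, Changchun University of Technology, Changchun, Jilin 130012
5. School of Computer Science and Technology, Shandong University of Technology, Zibo, Shandong 255000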
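FAN Li-li (1991-), female, born in Yantai, Shandong; doctoral candidate. She received her master's degree from Jilin University in 2016. Her research focuses on computer vision and artificial intelligence. E-mail: llfan18@mails.jlu.edu.cn
ZHAO Hao-yu (1991-), male, born in Changchun, Jilin. He received his master's degree from Jilin University in 2016. His research focuses on intelligent information systems and the dissemination of science and technology information. Corresponding author. E-mail: zhaohaoyu@jlu.edu.cn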
Received: 2019-10-30
Revised: 2019-12-27
Accepted: 2019-12-27
Published in print: 2020-05-25
FAN Li-li, ZHAO Hao-yu, HU Huang-shui, et al. Survey of target detection based on deep convolutional neural networks[J]. Optics and Precision Engineering, 2020, 28(5): 1152-1164. DOI: 10.3788/OPE.20202805.1152.
Object detection, a fundamental visual recognition problem in computer vision, has been studied extensively over the past few decades. Its aim is to accurately locate specific objects in a given image and to assign a corresponding label to each object. In recent years, deep convolutional neural networks (DCNNs) have achieved a series of breakthroughs in image classification owing to their powerful feature learning and transfer learning capabilities, and they have attracted growing attention in object detection. How to apply CNNs to object detection and obtain better performance is therefore an important research topic. First, we review and introduce several types of classic object detection algorithms. Next, taking the emergence of deep learning algorithms as a starting point, we analyze the technical ideas and key problems of DCNNs in object detection and provide a systematic, comprehensive overview of the various detection methods. Finally, in view of the major challenges facing object detection and deep learning algorithms, we discuss several future directions to promote research on deep learning for object detection.