Multi-scale vehicle and pedestrian detection algorithm based on attention mechanism

LI Jing-yu (李经宇); YANG Jing (杨静); KONG Bin (孔斌); WANG Can (王灿); ZHANG Lu (张露)

doi:10.37188/OPE.20212906.1448

Information Science | Updated: 2021-07-03

- Original Chinese title: 基于注意力机制的多尺度车辆行人检测算法
- Multi-scale vehicle and pedestrian detection algorithm based on attention mechanism
- Optics and Precision Engineering, 2021, Vol. 29, No. 6, pp. 1448-1458
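- Author affiliations:
  
  1. Institute of Intelligent Machines, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei, Anhui 230031
  2. University of Science and Technology of China, Hefei, Anhui 230026
  3. Anhui Engineering Laboratory for Intelligent Driving Technology and Application, Hefei, Anhui 230031
  4. Hefei University, Hefei, Anhui 230601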
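- About the authors:
  
  LI Jing-yu (b. 1995), male, from Huaibei, Anhui; Ph.D. candidate. He received his bachelor's degree from Anhui University of Technology in 2018. His research interests include computer vision and multimodal methods. E-mail: jingyuli@mial.ustc.edu.cn
  KONG Bin (b. 1967), female, from Feidong, Anhui; Ph.D., research professor and doctoral supervisor. She received her bachelor's degree from Fudan University in 1986 and her Ph.D. from the University of Science and Technology of China in 2005. Her research interests include image processing, computer vision, and environment perception for intelligent robots. E-mail: bkong@iim.ac.cn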
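- Funding:
  
  Anhui Province New Energy Vehicle and Intelligent Connected Vehicle Industry Technology Innovation Engineering Project (C02018005); Integrated Project of the Major Research Plan of the National Natural Science Foundation of China (91320301); Youth "Spark" Project of the Director's Fund of Hefei Institutes of Physical Science, Chinese Academy of Sciences (YZJJ2020QN20)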
- DOI: 10.37188/OPE.20212906.1448
  Chinese Library Classification (CLC) number: TP394.1; TH691.9
- Received: 2020-08-28
  Revised: 2020-12-16
  Published in print: 2021-06-15
LI Jing-yu, YANG Jing, KONG Bin, et al. Multi-scale vehicle and pedestrian detection algorithm based on attention mechanism[J]. Optics and Precision Engineering, 2021, 29(06): 1448-1458. DOI: 10.37188/OPE.20212906.1448.

Abstract

For driverless cars operating in complex, rapidly changing traffic scenes, detecting the dynamic state of vehicles and pedestrians early and accurately is essential. Such scenes, however, involve rapid camera motion, large variations in target scale, occlusion, and changing illumination. To address these challenges, this paper proposes a multi-scale object detection algorithm based on an attention mechanism. Starting from the YOLOv3 network, an improved spatial pyramid pooling (SPP) module is first added to fuse and concatenate multi-scale local-region features, so that the network learns target features more comprehensively. Next, a spatial pyramid is used to shorten the information-fusion path between channels, yielding the YOLOv3-SPP-PAN network. Finally, a more efficient detector, SE-YOLOv3-SPP-PAN, is designed on the basis of the attention mechanism. Experimental results show that, compared with YOLOv3, the proposed SE-YOLOv3-SPP-PAN network improves mean average precision (mAP) by 2.2% while its inference speed still meets the real-time requirement of an intelligent driving platform. The proposed network is therefore more efficient and accurate than YOLOv3 and better suited to object detection in real intelligent driving scenarios.
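The abstract names two building blocks, spatial pyramid pooling and channel attention (the "SE" prefix presumably refers to squeeze-and-excitation), without giving their details. The following is a minimal pure-Python sketch of the generic versions of both, not the authors' improved SPP module or their network; the path-aggregation (PAN) part is not shown. The function names `max_pool_same`, `spp`, and `se_block`, the default kernel sizes (5, 9, 13, as commonly used in YOLOv3-SPP configurations), and the toy weights are all assumptions made for illustration.

```python
import math


def max_pool_same(fmap, k):
    """Stride-1 max pooling over a k x k window; windows are clipped at the
    border (equivalent to -inf padding), so the H x W size is preserved."""
    h, w = len(fmap), len(fmap[0])
    r = k // 2
    return [
        [
            max(
                fmap[y][x]
                for y in range(max(0, i - r), min(h, i + r + 1))
                for x in range(max(0, j - r), min(w, j + r + 1))
            )
            for j in range(w)
        ]
        for i in range(h)
    ]


def spp(channels, kernel_sizes=(5, 9, 13)):
    """Concatenate the input channels with their pooled copies along the
    channel axis (kernel sizes are an assumed, commonly used setting)."""
    out = list(channels)
    for k in kernel_sizes:
        out.extend(max_pool_same(c, k) for c in channels)
    return out


def se_block(channels, w1, w2):
    """Squeeze-and-excitation style channel reweighting.
    w1: hidden x C weights, w2: C x hidden weights (hypothetical values,
    biases omitted). Returns the rescaled channels and the per-channel gates."""
    # Squeeze: global average pooling gives one scalar per channel.
    z = [sum(map(sum, c)) / (len(c) * len(c[0])) for c in channels]
    # Excitation: fully connected -> ReLU -> fully connected -> sigmoid.
    hidden = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [[[g * v for v in row] for row in c] for c, g in zip(channels, gates)]
    return scaled, gates


if __name__ == "__main__":
    # Toy 2-channel, 2 x 2 feature map.
    feat = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [0.0, 8.0]]]
    print(len(spp(feat)))  # channel count after concatenation
    print(se_block(feat, [[1.0, 1.0]], [[1.0], [-1.0]])[1])  # channel gates
```

Because the pooled maps keep the input's spatial size, they can be concatenated directly with the original channels; the gates from `se_block` lie strictly between 0 and 1, so the attention step can only attenuate a channel relative to the others, never amplify it.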
