LI Jing-yu,YANG Jing,KONG Bin,et al.Multi-scale vehicle and pedestrian detection algorithm based on attention mechanism[J].Optics and Precision Engineering,2021,29(06):1448-1458. DOI: 10.37188/OPE.20212906.1448.
Multi-scale vehicle and pedestrian detection algorithm based on attention mechanism
In complex and dynamic traffic scenes, accurate and timely detection of vehicles and pedestrians by driverless cars is particularly important. However, unmanned driving scenarios involve rapid camera movement, large scale changes, target occlusion, and illumination changes. To overcome these challenges, this paper proposes a multi-scale target detection algorithm based on an attention mechanism. Based on the YOLOv3 network, multi-scale local area features were fused and concatenated by adding an improved spatial pyramid pooling module, so that the network could learn target features more comprehensively. Next, a spatial pyramid was used to shorten the information fusion path and construct the YOLOv3-SPP+-PAN network. Finally, an efficient attention mechanism-based target detector, SE-YOLOv3-SPP+-PAN, was designed. Numerical results from the simulated system indicate that the proposed SE-YOLOv3-SPP+-PAN network achieved an improvement of 2.2% in mean average precision over the YOLOv3 network while retaining superior real-time inference speed. These results indicate that the proposed SE-YOLOv3-SPP+-PAN network is more efficient and accurate than YOLOv3 and is therefore better suited to target detection in complex intelligent driving scenarios.
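To make the attention step concrete, the following is a minimal sketch of a generic squeeze-and-excitation (SE) channel reweighting, in the spirit of the SE design of Hu et al. It is not the authors' implementation: the abstract does not state where the SE module sits in the network, its reduction ratio, or its weights. The function name `se_block`, the toy feature map, and the weight values `w1` and `w2` are all assumptions made for illustration, and biases are omitted.

```python
import math


def se_block(feature_map, w1, w2):
    """Reweight the channels of a C x H x W feature map given as nested lists.

    w1: (C // r) x C weights of the reduction layer (hypothetical values).
    w2: C x (C // r) weights of the expansion layer (hypothetical values).
    Biases are omitted to keep the sketch short.
    """
    # Squeeze: global average pooling yields one scalar per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation, step 1: fully connected reduction followed by ReLU.
    hidden = [
        max(0.0, sum(w * s for w, s in zip(row, squeezed)))
        for row in w1
    ]
    # Excitation, step 2: fully connected expansion followed by a sigmoid,
    # giving one gate in (0, 1) per channel.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[gate * value for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return scaled, gates


# Toy input (assumed): 2 channels of size 2 x 2, reduction to 1 hidden unit.
features = [
    [[1.0, 2.0], [3.0, 4.0]],   # channel with strong activations
    [[0.0, 0.0], [0.0, 0.0]],   # channel with no activations
]
w1 = [[0.5, 0.5]]               # 1 x 2 reduction weights (illustrative)
w2 = [[1.0], [-1.0]]            # 2 x 1 expansion weights (illustrative)

scaled, gates = se_block(features, w1, w2)
print(gates)
```

With these illustrative weights the first channel receives the larger gate, so its activations are preserved more strongly than those of the second channel. In a trained detector the weights would be learned, and the gating would be applied to convolutional feature maps rather than hand-written lists.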