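School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan 114000, Liaoning, China
JING Xiuping (1996-), male, born in Anshan, Liaoning; master's student. He received his bachelor's degree from University of Science and Technology Liaoning in 2018. His research focuses on machine vision. E-mail: jxp_snowman@163.com
TIAN Ying (1971-), female, born in Shenyang, Liaoning; Ph.D., professor, and master's supervisor. She received her doctorate from Shenyang University of Technology in 2008. Her research covers computer vision, digital image processing, and pattern recognition. E-mail: astianying@126.com
Received: 2022-06-09
Revised: 2022-07-26
Published in print: 2023-03-25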
JING Xiuping, TIAN Ying. Lightweight vehicle detection using long-distance dependence and multi-scale representation [J]. Optics and Precision Engineering, 2023, 31(06): 950-961. DOI: 10.37188/OPE.20233106.0950.
Deep-learning-based vehicle detection plays a vital role in many fields and has been an important direction in computer vision in recent years. Lightweight vehicle detection involves exploring both network structure and computational efficiency, and it is widely used in areas such as intelligent transportation. In many scenes, however, vehicles captured by the camera vary greatly in scale and occlude one another, which reduces detection accuracy. To address these problems, this study proposes a vehicle detection method based on an improved YOLOv5s. First, a visual attention network captures long-distance dependencies and applies new weights to the original feature map, which increases the network's adaptability and improves its robustness to occlusion. Second, additional horizontal residual connections are constructed inside the residual module, so that a single module produces the same number of feature maps but with receptive fields of different sizes; features are thus extracted at a finer granularity, which enriches the network's multi-scale representation. Experimental results show that the improved network raises mAP by 2.1% on the Pascal VOC vehicle dataset and by 1.7% on the MS COCO vehicle dataset. Compared with the original network, the improved network has stronger multi-scale representation and better robustness to occlusion, and its detection results are more competitive.
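The idea of building horizontal residual connections inside one module, so that its outputs see receptive fields of different sizes, can be illustrated with a toy example. The sketch below is not the authors' module: it assumes the connections follow the hierarchical pattern popularized by Res2Net, works on 1-D signals instead of feature maps, and uses a fixed 3-tap averaging filter in place of a learned 3x3 convolution. All function names (`conv3`, `hierarchical_block`, `support_width`) are hypothetical.

```python
from typing import List


def conv3(signal: List[float]) -> List[float]:
    """3-tap averaging filter with zero padding (stand-in for a learned 3x3 conv)."""
    padded = [0.0] + signal + [0.0]
    return [(padded[i] + padded[i + 1] + padded[i + 2]) / 3.0
            for i in range(len(signal))]


def hierarchical_block(groups: List[List[float]]) -> List[List[float]]:
    """Assumed Res2Net-style wiring: the first group passes through unchanged;
    every later group is added to the previous group's output, then filtered."""
    outputs = [groups[0][:]]
    prev = None
    for group in groups[1:]:
        inp = group if prev is None else [a + b for a, b in zip(group, prev)]
        prev = conv3(inp)
        outputs.append(prev)
    return outputs


def support_width(signal: List[float]) -> int:
    """Width of the nonzero region, used here as an empirical receptive field."""
    nonzero = [i for i, v in enumerate(signal) if abs(v) > 1e-12]
    return nonzero[-1] - nonzero[0] + 1 if nonzero else 0


# Feed the same centered impulse into four channel groups and measure how far
# each group's response spreads.
impulse = [0.0] * 7 + [1.0] + [0.0] * 7
outputs = hierarchical_block([impulse[:] for _ in range(4)])
widths = [support_width(y) for y in outputs]
print(widths)  # [1, 3, 5, 7]
```

The number of output groups equals the number of input groups, yet each group responds over a progressively wider window, which is the multi-scale effect the abstract describes. In a real detector the groups would be channel splits of a feature map and the filters would be learned.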