Lightweight vehicle detection using long-distance dependence and multi-scale representation

JING Xiuping; TIAN Ying

doi:10.37188/OPE.20233106.0950


Information Science | Updated: 2023-03-25

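- Lightweight vehicle detection using long-distance dependence and multi-scale representation
- Optics and Precision Engineering, 2023, Vol. 31, No. 6, pp. 950-961
- Author affiliation:
  
  School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan, Liaoning 114000
- About the authors:
  
  JING Xiuping (1996-), male, born in Anshan, Liaoning; master's student. He received his bachelor's degree from the University of Science and Technology Liaoning in 2018. His research focuses on machine vision. E-mail: jxp_snowman@163.com
  TIAN Ying (1971-), female, born in Shenyang, Liaoning; Ph.D., professor, and master's supervisor. She received her doctorate from Shenyang University of Technology in 2008. Her research focuses on computer vision, digital image processing, and pattern recognition. E-mail: astianying@126.com
- Funding:
  
  National Natural Science Foundation of China (62072086); Education Department of Liaoning Province project (LJKZ0310)
- DOI: 10.37188/OPE.20233106.0950
  CLC number: TP391
- Received: 2022-06-09
  
  Revised: 2022-07-26
  
  Published in print: 2023-03-25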
JING Xiuping, TIAN Ying. Lightweight vehicle detection using long-distance dependence and multi-scale representation[J]. Optics and Precision Engineering, 2023, 31(06): 950-961. DOI: 10.37188/OPE.20233106.0950.

Abstract

Deep-learning-based vehicle detection plays a vital role in many fields and has been an important direction in computer vision in recent years. Lightweight vehicle detection involves exploring both network structure and computational efficiency, and it is widely used in areas such as intelligent transportation. In many scenarios, however, vehicle targets vary greatly in scale within the camera view and vehicles occlude one another, which reduces the precision of the detection network. To address these problems, this study proposes a vehicle detection method based on an improved YOLOv5s. First, a visual attention network captures long-distance dependencies and applies new weights to the original feature map, which increases the network's adaptability and improves its resistance to occlusion. Second, horizontal residual connections are constructed again inside the residual module, so that a single module produces feature maps that are equal in number but differ in receptive-field size. Feature extraction therefore occurs at a finer granularity, which enriches the network's multi-scale representation ability. The experimental results show that the improved network raises mAP by 2.1% on the Pascal visual object classes (VOC) vehicle dataset and by 1.7% on the MS COCO vehicle dataset. Compared with the original network, the improved network has stronger multi-scale representation and anti-occlusion abilities, and its detection results are more competitive.
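The abstract does not spell out the internals of the modified residual module, so the following is only a rough sketch of why nesting residual connections inside one module yields feature maps with different receptive fields. It assumes a Res2Net-style layout (the input is split into groups, the first group passes through unchanged, and each later group is convolved after adding the previous group's output), with stride 1 and no dilation. The function name `branch_receptive_fields` and its parameters are hypothetical and are not taken from the paper.

```python
def branch_receptive_fields(scales: int, kernel: int = 3) -> list[int]:
    """Receptive field (pixels along one axis) of each branch output in an
    assumed Res2Net-style hierarchical residual block (stride 1, no dilation)."""
    if scales < 1:
        raise ValueError("scales must be >= 1")
    fields = [1]  # branch 1 is an identity pass-through of its input group
    prev = 0      # receptive field of the previous convolved branch (0 = none yet)
    for _ in range(scales - 1):
        # The conv input is this branch's own group (field 1) plus, from the
        # third branch on, the previous branch output (field `prev`).
        # A kernel x kernel conv widens the larger of the two by (kernel - 1).
        current = max(1, prev) + (kernel - 1)
        fields.append(current)
        prev = current
    return fields


# Four groups with 3x3 convolutions give four distinct receptive fields.
print(branch_receptive_fields(4))  # [1, 3, 5, 7]
```

Under these assumptions, each group keeps the same channel count while its receptive field grows by `kernel - 1` per step, which is one way a single module can cover several object scales at once.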


