Multi-object pedestrian tracking method based on improved high resolution neural network

ZHANG Hongying; HE Pengyi; PENG Xiaowen

doi:10.37188/OPE.20233106.0860


Information Science | Updated: 2023-03-25

- Multi-object pedestrian tracking method based on improved high resolution neural network
- Optics and Precision Engineering, 2023, Vol. 31, No. 6, pp. 860-871
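- Author affiliation:
  College of Electronic Information and Automation, Civil Aviation University of China, Tianjin 300300
- About the authors:
  ZHANG Hongying (1978-), female, born in Tianjin, Ph.D., professor and master's supervisor. She received her bachelor's, master's, and doctoral degrees from Tianjin University in 2001, 2004, and 2007, respectively. Her research focuses on image engineering and computer vision. E-mail: carole_zhang0716@163.com
  HE Pengyi (1996-), male, born in Dongying, Shandong, master's student. He received his bachelor's degree from Chongqing Jiaotong University in 2019. His research focuses on image processing and computer vision. E-mail: 522497177@qq.com
- Funding:
  National Key Research and Development Program of China (2018YFB1601200)
- DOI: 10.37188/OPE.20233106.0860
  CLC number: TP391.4
- Received: 2022-05-26; Revised: 2022-06-17; Published in print: 2023-03-25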
ZHANG Hongying, HE Pengyi, PENG Xiaowen. Multi-object pedestrian tracking method based on improved high resolution neural network[J]. Optics and Precision Engineering, 2023, 31(06): 860-871. DOI: 10.37188/OPE.20233106.0860.

Abstract

To address the detection and tracking failures that occur when targets are occluded in multi-object pedestrian tracking, this study proposes an improved high-resolution neural network as the detection network. First, to strengthen the network's initial feature extraction for pedestrian targets, a second-generation bottleneck residual block structure is introduced into the backbone of the high-resolution neural network, enlarging the receptive field and improving feature expression. Second, a residual detection block architecture with a two-layer efficient channel attention module is designed to replace the original residual detection blocks in the multi-scale information exchange stage, improving the test performance of the whole network. Finally, the network is trained with suitably chosen parameters and the algorithm is evaluated on multiple test sets. Compared with FairMOT, the proposed algorithm improves tracking accuracy by 0.1%, 1.6%, and 0.8% on the 2DMOT15, MOT17, and MOT20 datasets, respectively. The algorithm is well suited to scenes with many targets and large occluded areas, and its tracking stability on longer video sequences is also substantially improved.
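The efficient channel attention idea mentioned in the abstract can be illustrated with a minimal sketch. It shows a single attention gate only, not the authors' two-layer module or their residual detection block, whose exact layout is not given here. The function names `eca_kernel_size` and `eca` are hypothetical, the feature map is a plain nested list of shape [C][H][W], and the 1D kernel weights, which would be learned in a real network, are passed in by the caller. The kernel-size rule with gamma = 2 and b = 1 follows the defaults commonly used for ECA-Net and is an assumption here.

```python
import math


def eca_kernel_size(channels, gamma=2, b=1):
    """Adaptive 1D kernel size: nearest odd value of (log2(C) + b) / gamma.

    gamma and b are assumed defaults, not values taken from this paper.
    """
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 == 1 else t + 1


def eca(feature_map, weights):
    """Apply one channel-attention gate to a [C][H][W] nested-list feature map.

    weights: odd-length 1D kernel shared by all channels (hypothetical values;
             learned during training in a real network).
    Returns the rescaled feature map and the per-channel gates.
    """
    c = len(feature_map)
    k = len(weights)
    pad = k // 2
    # 1. Global average pooling: one scalar descriptor per channel.
    pooled = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feature_map
    ]
    # 2. 1D convolution across neighbouring channels with zero padding,
    #    so channel interaction is local and no dimensionality reduction is used.
    padded = [0.0] * pad + pooled + [0.0] * pad
    mixed = [sum(weights[j] * padded[i + j] for j in range(k)) for i in range(c)]
    # 3. Sigmoid turns each mixed descriptor into a gate in (0, 1).
    gates = [1.0 / (1.0 + math.exp(-m)) for m in mixed]
    # 4. Rescale every value of a channel by that channel's gate.
    out = [[[v * g for v in row] for row in ch] for ch, g in zip(feature_map, gates)]
    return out, gates
```

For example, `eca(feature_map, [0.1, 0.8, 0.1])` on a three-channel map weights each channel by a gate computed from its own pooled response and those of its two neighbours. In a deep learning framework the same steps would typically be an adaptive average pool, a 1D convolution, and a sigmoid.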

