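Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), Jiangnan University, Wuxi 214122, Jiangsu, China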
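CHEN Ying (1976-), female, born in Lishui, Zhejiang; professor and doctoral supervisor. She received her master's degree and Ph.D. in engineering from Xi'an Jiaotong University in 2005. Her research interests include computer vision, pattern recognition, and signal processing. E-mail: chenying@jiangnan.edu.cn
ZHU Yu (1996-), female, born in Taizhou, Jiangsu; master's student. She received her bachelor's degree from Jiangsu University of Science and Technology in 2018. Her main research interests are computer vision and image processing. E-mail: 6181905014@stu.jiangnan.edu.cn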
Received: 2020-05-29
Revised: 2020-07-09
Print publication date: 2020-12-15
CHEN Ying, ZHU Yu. Multispectral pedestrian detection network under modal adaptive weight learning mechanism[J]. Optics and Precision Engineering, 2020, 28(12): 2700-2709. DOI: 10.37188/OPE.20202812.2700.
Existing pedestrian detection methods that fuse the infrared and visible modalities have difficulty adapting to changes in the external environment. To address this, a pedestrian detection network based on weight learning for multimodal information fusion is proposed. First, unlike the direct stacking of the two modalities used in most current studies, the weight learning fusion network accounts for the different contributions the two modalities make to the pedestrian detection task under different environmental conditions, and learns the differences between them through dual-stream interaction. Next, the network derives a weight for each modal feature from the current characteristics of that feature and performs weighted fusion to obtain the fused feature. Finally, a new feature pyramid is generated from the fused feature, and the size and density of the prior boxes are changed to enrich the prior information about pedestrians, completing the detection task. Experimental results show that the method achieves a log-average miss rate of 26.96% on the KAIST multispectral pedestrian detection dataset, which is 2.77% and 27.84% lower than the best existing direct stacking method and the baseline method, respectively. Adaptive weighted fusion of infrared and visible information therefore captures complementary modal information, adapts to external environmental changes, and substantially improves pedestrian detection performance.
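The weighted-fusion step described in the abstract can be illustrated with a minimal sketch. It is not the authors' architecture: the abstract states only that each modality receives a weight derived from the current characteristics of its features. The sketch assumes global average pooling as the feature summary, a softmax over one score per modality, and hypothetical scoring vectors `score_vis` and `score_ir` that a real network would learn. All function and variable names are illustrative.

```python
import math


def global_average(feature):
    """Mean of every channel; `feature` is a list of channels, each a flat list of floats."""
    return [sum(channel) / len(channel) for channel in feature]


def modality_weights(feat_vis, feat_ir, score_vis, score_ir):
    """Return (w_vis, w_ir): two non-negative weights that sum to 1.

    score_vis and score_ir are hypothetical scoring vectors of length 2*C
    (assumed here; in a trained network they would be learned parameters).
    """
    # Summarise both modalities and concatenate the two descriptors.
    descriptor = global_average(feat_vis) + global_average(feat_ir)
    # One scalar score per modality.
    logits = [
        sum(s * d for s, d in zip(score, descriptor))
        for score in (score_vis, score_ir)
    ]
    # Numerically stable softmax over the two scores.
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return exps[0] / total, exps[1] / total


def weighted_fusion(feat_vis, feat_ir, score_vis, score_ir):
    """Fuse two feature maps of identical shape with the adaptive weights."""
    w_vis, w_ir = modality_weights(feat_vis, feat_ir, score_vis, score_ir)
    fused = [
        [w_vis * v + w_ir * r for v, r in zip(ch_vis, ch_ir)]
        for ch_vis, ch_ir in zip(feat_vis, feat_ir)
    ]
    return fused, (w_vis, w_ir)


# Toy input: two channels, two spatial positions per channel.
# The visible features are weak (as in a dark scene), the infrared ones strong.
feat_vis = [[0.1, 0.2], [0.0, 0.1]]
feat_ir = [[0.9, 0.8], [0.7, 1.0]]
# Hypothetical scoring vectors: each modality is scored by its own pooled response.
score_vis = [1.0, 1.0, 0.0, 0.0]
score_ir = [0.0, 0.0, 1.0, 1.0]

fused, (w_vis, w_ir) = weighted_fusion(feat_vis, feat_ir, score_vis, score_ir)
print("visible weight:", w_vis, "infrared weight:", w_ir)
```

With these toy inputs the infrared weight is larger than the visible weight, so the fused feature leans on the modality with the stronger response. Direct stacking, by contrast, treats both modalities identically regardless of the scene.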