Driving obstacles prediction network merged with spatial attention

Jun-feng LEI; Rui HE; Jin-sheng XIAO

doi:10.3788/OPE.20202808.1850

Information Science | Updated: 2020-08-14

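- Driving obstacles prediction network merged with spatial attention
- Optics and Precision Engineering, 2020, Vol. 28, No. 8, pp. 1850-1860
- Author affiliation:
  
  School of Electronic Information, Wuhan University, Wuhan, Hubei 430072
- Author biographies:
  
  Jun-feng LEI (1975-), male, born in Wuhan, Hubei; Ph.D., associate professor. He received his Ph.D. from the School of Electronic Information, Wuhan University, in 2002. His main research interests are image processing and artificial intelligence. E-mail: jflei@whu.edu.cn
  
  Rui HE (1997-), male, born in Nanchang, Jiangxi; master's student. His main research interests are image processing and intelligent perception. E-mail: he_rui@whu.edu.cn
- Funding:
  
  Supported by the National Natural Science Foundation of China (41771457)
- DOI: 10.3788/OPE.20202808.1850
  
  CLC number: TP29; U495
- Received: 2020-04-27
  
  Revised: 2020-05-22
  
  Accepted: 2020-05-22
  
  Published in print: 2020-08-25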
Jun-feng LEI, Rui HE, Jin-sheng XIAO. Driving obstacles prediction network merged with spatial attention[J]. Optics and Precision Engineering, 2020, 28(8): 1850-1860. DOI: 10.3788/OPE.20202808.1850.

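Abstract (translated from the Chinese)

To address the limited target types, slow prediction speed, and poor accuracy of existing driving obstacle prediction methods, this paper proposes Coll-Net, a convolutional neural network that incorporates a spatial attention mechanism, together with a speed control and obstacle direction determination strategy based on Coll-Net. Mimicking how a driver judges obstacles from visual information, the network takes monocular images as input. It first preprocesses the image to obtain a region of interest, then extracts the spatial features within that region using a network of residual blocks. A spatial attention mechanism recalibrates the original features along the feature channels to obtain channel weights. The weights are normalized and applied to the spatial features of the corresponding channels to select the key features, which are finally fed into a fully connected layer and a Sigmoid function to generate the predicted probability. The vehicle sets its speed in real time according to the predicted obstacle probability and determines the obstacle direction from the probabilities predicted over multiple windows. Experiments show that Coll-Net reaches an obstacle prediction accuracy of 96.01% and an F1-score of 0.915, with a model inference time of only 24 ms. It detects multiple obstacle types in real time, including vehicles, pedestrians, guardrails, and walls, and still predicts well under low-contrast lighting. The Coll-Net-based speed control and obstacle direction determination strategy proves highly effective on the Udacity Self-Driving dataset.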
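The recalibration step described in the abstract (score each channel, normalize the weights, scale the matching spatial features, then apply a fully connected layer and a sigmoid) can be illustrated with a small NumPy sketch. The abstract does not specify how the channel weights are computed, so the global average pooling, the two-layer bottleneck, the sigmoid normalization, and every name and shape below are assumptions in the style of a squeeze-and-excitation block, not the authors' implementation.

```python
import numpy as np


def sigmoid(x):
    """Logistic function mapping real values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def recalibrate(features, w1, w2):
    """Reweight a (C, H, W) feature map channel by channel.

    Assumed design: global average pooling gives one descriptor per
    channel, a small bottleneck MLP scores the channels, and a sigmoid
    normalizes the scores to weights in (0, 1).
    """
    desc = features.mean(axis=(1, 2))        # (C,) one value per channel
    hidden = np.maximum(w1 @ desc, 0.0)      # (C // r,) ReLU bottleneck
    weights = sigmoid(w2 @ hidden)           # (C,) normalized channel weights
    # Broadcast each weight over its channel's H x W spatial map.
    return features * weights[:, None, None], weights


def predict_obstacle_probability(features, w1, w2, w_fc, b_fc):
    """Flatten the reweighted features, apply one linear layer and a sigmoid."""
    reweighted, _ = recalibrate(features, w1, w2)
    logit = w_fc @ reweighted.reshape(-1) + b_fc
    return float(sigmoid(logit))


# Hypothetical sizes and random parameters, for illustration only.
rng = np.random.default_rng(0)
C, H, W, R = 8, 4, 4, 2
features = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // R, C))
w2 = rng.standard_normal((C, C // R))
w_fc = rng.standard_normal(C * H * W) * 0.1

reweighted, weights = recalibrate(features, w1, w2)
p = predict_obstacle_probability(features, w1, w2, w_fc, 0.0)
```

In this sketch `weights` holds one value per channel, so channels the scoring step rates as unimportant are attenuated before the classifier sees them.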

Abstract

To address the limited detection targets, slow processing speed, and low accuracy of existing methods for driving obstacle prediction, this paper proposes an improved convolutional neural network called Coll-Net merged with spatial attention, a suitable speed control policy, and an obstacle direction determination method based on Coll-Net. Coll-Net imitates the vision mechanism of judging obstacles during driving, preprocesses the input monocular vision images to obtain the region of interest, and extracts the spatial features using a deep residual network framework. After collecting the spatial features, Coll-Net recalibrates the original features on the spatial feature channels by using the mechanism of spatial attention, which evaluates the features of every channel, enhances the important ones, and then rescales the weights of every channel and assigns the normalized weights to the corresponding spatial features in order to select critical features. The output feature map is passed to a fully connected layer; then a normalized obstacle probability in the range 0 to 1 is generated by a sigmoid function. Moreover, this paper proposes a driving policy that controls the driving speed and predicts the obstacle direction according to the probability generated by Coll-Net. Experimental results indicate that the prediction accuracy of Coll-Net on standard datasets reaches 96.01% and the F1 score reaches 0.915. Coll-Net performs well in detecting diverse obstacles such as cars, pedestrians, guardrails, and walls in real time (24 ms for inference), as well as in low-contrast conditions. Moreover, the driving policy based on Coll-Net is validated on the Udacity Self-Driving dataset.
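As a rough illustration of a driving policy driven by the predicted probability, the sketch below lowers the speed as the obstacle probability rises and picks the image window with the highest probability as the obstacle direction. The abstract gives no formulas, so the low-pass speed update, the three window names, the threshold, and all parameter values are hypothetical choices made for this example only.

```python
def update_speed(prev_speed, p_obstacle, v_max=10.0, alpha=0.5):
    """Smoothed speed command (assumed form, not taken from the paper).

    The target speed shrinks linearly with the obstacle probability, and
    alpha blends it with the previous speed to avoid abrupt changes.
    """
    target = (1.0 - p_obstacle) * v_max
    return (1.0 - alpha) * prev_speed + alpha * target


def obstacle_direction(window_probs, threshold=0.5):
    """Return the window with the highest obstacle probability.

    window_probs maps a window name to the probability predicted on that
    window. Returns None when no window reaches the threshold.
    """
    name, prob = max(window_probs.items(), key=lambda item: item[1])
    return name if prob >= threshold else None


# Hypothetical per-window probabilities from three crops of one frame.
speed = update_speed(prev_speed=8.0, p_obstacle=0.9)   # roughly 4.5
direction = obstacle_direction({"left": 0.12, "center": 0.35, "right": 0.91})
print(speed, direction)
```

With these made-up inputs the command drops from 8.0 to about 4.5 and the obstacle is reported on the right. A smaller `alpha` would make the speed react more slowly to a single high-probability frame.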
