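1. School of Mechanical, Electronic and Control Engineering, Beijing Jiaotong University, Beijing 100044, China
2. Frontiers Science Center for Smart High-speed Railway System, Beijing Jiaotong University, Beijing 100044, China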
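GUO Bao-qing (1978-), male, born in Laishui, Hebei; associate professor, Ph.D. He received his M.S. and Ph.D. degrees from Beijing Jiaotong University in 2004 and 2009, respectively. His research focuses on railway infrastructure inspection and machine-vision inspection technology. E-mail: bqguo@bjtu.edu.cn
XIE Guang-fei (1995-), male, born in Neijiang, Sichuan; graduate student. He received his bachelor's degree from Beijing Jiaotong University in 2014 and is currently pursuing a master's degree there. His main research interest is object detection algorithms for 3D LiDAR. E-mail: 18121290@bjtu.edu.cn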
Received: 2021-05-11
Revised: 2021-07-05
Published in print: 2021-11-15
GUO Bao-qing, XIE Guang-fei. Object detection algorithm based on image and point cloud fusion with N3D_DIOU[J]. Optics and Precision Engineering, 2021, 29(11): 2703-2713. DOI: 10.37188/OPE.20212911.2703.
Object detection is the basis of autonomous driving and robot navigation. To address the insufficient information in 2D images and the large data volume, uneven density, and low detection accuracy of 3D point clouds, this paper proposes a deep-learning-based network that fuses 2D images with 3D point clouds for 3D object detection. To reduce the computational load, the raw point cloud is first filtered with the frustum corresponding to each detection box produced by a 2D image detector. To handle the uneven point-cloud density, an improved voting-model network based on the generalized Hough transform is proposed for multiscale feature extraction. Finally, the 2D DIOU (Distance Intersection over Union) loss function is extended to 3D space as the N3D_DIOU (Normal 3 Dimensional DIOU) loss function, which improves the consistency between the predicted and ground-truth boxes and further improves point-cloud detection accuracy. Extensive experiments on the KITTI dataset show that, compared with classical methods, the proposed algorithm improves 3D detection accuracy for cars by 0.71% and bird's-eye-view detection accuracy by 7.28%.
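Two of the steps named in the abstract, frustum filtering and a distance-aware IoU loss in 3D, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `frustum_filter` and `diou_3d_loss`, the 3x4 projection-matrix convention, and the axis-aligned box format are all assumptions made for illustration. In particular, the exact definition of N3D_DIOU is not given on this page, so the loss below is only the standard DIoU form, 1 - IoU + d^2 / c^2, carried over to axis-aligned 3D boxes.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
# Assumed formats (hypothetical, not taken from the paper):
Box2D = Tuple[float, float, float, float]                # (u_min, v_min, u_max, v_max), pixels
Box3D = Tuple[float, float, float, float, float, float]  # (x_min, y_min, z_min, x_max, y_max, z_max)


def frustum_filter(points: Sequence[Point],
                   proj: Sequence[Sequence[float]],
                   box: Box2D) -> List[Point]:
    """Keep points whose image projection lies inside a 2D detection box.

    `proj` is an assumed 3x4 camera projection matrix applied to points
    already expressed in the camera coordinate frame.
    """
    u_min, v_min, u_max, v_max = box
    kept = []
    for x, y, z in points:
        # Homogeneous projection: one dot product per matrix row.
        u_h, v_h, w = (r[0] * x + r[1] * y + r[2] * z + r[3] for r in proj)
        if w <= 0.0:  # point is behind the camera, cannot be in the frustum
            continue
        u, v = u_h / w, v_h / w
        if u_min <= u <= u_max and v_min <= v <= v_max:
            kept.append((x, y, z))
    return kept


def diou_3d_loss(pred: Box3D, target: Box3D) -> float:
    """Axis-aligned 3D analogue of the DIoU loss: 1 - IoU + d^2 / c^2.

    d is the distance between box centers and c is the diagonal of the
    smallest axis-aligned box enclosing both. Boxes must have positive volume.
    """
    inter = 1.0
    center_dist_sq = 0.0
    enclose_diag_sq = 0.0
    for a in range(3):
        lo_p, hi_p = pred[a], pred[a + 3]
        lo_t, hi_t = target[a], target[a + 3]
        # Overlap along this axis, clamped at zero for disjoint boxes.
        inter *= max(min(hi_p, hi_t) - max(lo_p, lo_t), 0.0)
        center_dist_sq += ((lo_p + hi_p) / 2.0 - (lo_t + hi_t) / 2.0) ** 2
        enclose_diag_sq += (max(hi_p, hi_t) - min(lo_p, lo_t)) ** 2

    def volume(b: Box3D) -> float:
        return (b[3] - b[0]) * (b[4] - b[1]) * (b[5] - b[2])

    union = volume(pred) + volume(target) - inter
    iou = inter / union
    return 1.0 - iou + center_dist_sq / enclose_diag_sq


if __name__ == "__main__":
    identity_proj = [[1.0, 0.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0, 0.0],
                     [0.0, 0.0, 1.0, 0.0]]
    cloud = [(0.0, 0.0, 1.0), (5.0, 0.0, 1.0), (0.0, 0.0, -1.0)]
    print(frustum_filter(cloud, identity_proj, (-1.0, -1.0, 1.0, 1.0)))
    print(diou_3d_loss((0, 0, 0, 1, 1, 1), (2, 0, 0, 3, 1, 1)))
```

Unlike a plain IoU loss, the center-distance term still provides a gradient signal when the two boxes do not overlap, which is the usual motivation for DIoU-style losses. A rotated-box version, as typically needed for KITTI, would require a polygon intersection in the bird's-eye view and is beyond this sketch.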