School of Information and Control Engineering, Xi'an University of Architecture and Technology, Xi'an 710055, Shaanxi
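CHEN Jun-ying (1980-), female, born in Fengzhen, Inner Mongolia; Ph.D., associate professor, master's supervisor, and visiting scholar at the University of New South Wales. She received her master's degree from Xi'an Jiaotong University in 2004 and her Ph.D. from Xi'an Jiaotong University in 2010. She is currently a faculty member at the School of Information and Control Engineering, Xi'an University of Architecture and Technology, where her research focuses on computer vision and machine learning. E-mail: chenjy@xauat.edu.cn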
BAI Tong-yao (1993-), male, born in Hanzhong, Shaanxi; master's student whose research focuses on object detection algorithms.
Received: 2021-03-09
Revised: 2021-04-24
Published in print: 2021-09-15
CHEN Jun-ying, BAI Tong-yao, ZHAO Liang. 3D object detection based on fusion of point cloud and image by mutual attention[J]. Optics and Precision Engineering, 2021, 29(09): 2247-2254. DOI: 10.37188/OPE.20212909.2247.
Using image information to help point cloud data improve 3D object detection accuracy requires solving the problem of adaptively aligning and fusing the image feature space with the point cloud feature space. This paper proposes a deep learning network for 3D object detection based on adaptive fusion of multimodal features. First, the point cloud is partitioned into equally sized voxels, a feature representation is learned for each voxel from the points it contains, and a 3D sparse convolutional neural network extracts the point cloud features; in parallel, a ResNet-like network extracts the image features. Next, a mutual attention module adaptively aligns the image features with the point cloud features, producing point cloud features enhanced by the image features. Finally, a region proposal network (RPN) and a multitask learning network for classification and regression are applied to these features to perform 3D object detection. Experiments on the KITTI 3D object detection dataset show that, for car detection, the average precision is 88.76%, 77.63%, and 76.14% at the easy, moderate, and hard difficulty levels, respectively. The proposed method effectively fuses image and point cloud information and improves the accuracy of 3D object detection.
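The voxelization and attention-based fusion steps described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' network: the abstract does not give the formulation of the mutual attention module, so the code below assumes a generic scaled dot-product cross-attention in which voxel features act as queries and image features act as keys and values. The function names (`voxelize_mean`, `cross_attention_fuse`), the mean-pooled voxel feature, the projection matrices, and the concatenation at the end are all hypothetical choices made for illustration.

```python
import numpy as np


def voxelize_mean(points, voxel_size):
    """Assign each point (N, 3) to a regular grid cell and average the
    points in each occupied cell. Mean pooling is an assumption; the paper
    learns the voxel feature instead."""
    coords = np.floor(points / voxel_size).astype(np.int64)
    uniq, inverse = np.unique(coords, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)  # flatten for consistent indexing
    sums = np.zeros((len(uniq), points.shape[1]))
    np.add.at(sums, inverse, points)  # accumulate points per voxel
    counts = np.bincount(inverse, minlength=len(uniq))[:, None]
    return uniq, sums / counts


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention_fuse(voxel_feats, image_feats, w_q, w_k, w_v):
    """Enhance voxel features (V, Cv) with image features (P, Ci).

    Each voxel queries all image locations; the attention weights decide
    which image features are gathered for that voxel (assumed design).
    """
    q = voxel_feats @ w_q  # (V, d) queries from the point cloud branch
    k = image_feats @ w_k  # (P, d) keys from the image branch
    v = image_feats @ w_v  # (P, Co) values from the image branch
    attn = softmax(q @ k.T / np.sqrt(q.shape[1]), axis=1)  # (V, P)
    attended = attn @ v  # (V, Co) image information per voxel
    # Concatenate so the original point cloud feature is preserved.
    return np.concatenate([voxel_feats, attended], axis=1), attn


# Toy usage with random projections (illustrative values only).
rng = np.random.default_rng(0)
points = np.array([[0.1, 0.1, 0.1], [0.3, 0.3, 0.3], [1.2, 0.1, 0.1]])
voxel_coords, voxel_feats = voxelize_mean(points, voxel_size=1.0)
image_feats = rng.normal(size=(5, 6))  # 5 image locations, 6 channels
w_q = rng.normal(size=(3, 8))
w_k = rng.normal(size=(6, 8))
w_v = rng.normal(size=(6, 4))
fused, attn = cross_attention_fuse(voxel_feats, image_feats, w_q, w_k, w_v)
print(fused.shape, attn.shape)
```

In a trained network the projection matrices would be learned and the fused features would feed the region proposal stage; here they are random and serve only to show the data flow and tensor shapes.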