Semantic segmentation of 3D point cloud based on self-attention feature fusion group convolutional neural network

Jun YANG; Bozan LI

doi:10.37188/OPE.20223007.0840

Information Sciences | Updated: 2022-04-22

- Optics and Precision Engineering, Vol. 30, Issue 7, Pages: 840-853 (2022)
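- Author affiliations:
  
  1. Faculty of Geomatics, Lanzhou Jiaotong University, Lanzhou, Gansu 730070, China
  2. School of Automation and Electrical Engineering, Lanzhou Jiaotong University, Lanzhou, Gansu 730070, China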
- DOI: 10.37188/OPE.20223007.0840
  CLC: TP391
- Received: 08 October 2021
  Revised: 19 November 2021
  Published: 10 April 2022

YANG Jun, LI Bozan. Semantic segmentation of 3D point cloud based on self-attention feature fusion group convolutional neural network[J]. Optics and Precision Engineering, 2022, 30(07): 840-853. DOI: 10.37188/OPE.20223007.0840.

Abstract

Existing algorithms ignore the deep relationship between global single-point features and local geometric features. As a result, the captured local geometric information lacks discriminative power, and complex shapes are difficult to identify effectively. To address this problem, this paper proposes a semantic segmentation algorithm for three-dimensional (3D) point clouds based on a self-attention feature fusion group convolutional neural network. First, a proxy point graph convolution with a lightweight network framework is designed to extract the local geometric features of the point cloud, and a group convolution operation is added to reduce the amount of computation and the complexity, enriching the features with less redundant information. Second, a Transformer module exchanges feature information between the different branches so that the global features and the local geometric features compensate for each other, which enhances the completeness of the features. Third, the low-level semantic features of the point cloud are fused with the original point cloud to expand the receptive field of the local neighborhood and obtain high-level contextual semantic information. Finally, the features are fed into the segmentation module to complete fine-grained semantic segmentation. Experimental results show that the segmentation accuracy of the algorithm reaches 79.3% on the S3DIS dataset and 56.6% on the SemanticKITTI dataset. The algorithm can extract the key feature information of 3D point clouds, has relatively few network parameters, and is highly robust in semantic segmentation.
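
The abstract credits group convolution with reducing computation and complexity. The following is a minimal sketch of why that holds for a pointwise (1x1) convolution on per-point features; it is not the authors' implementation. The function names `conv_param_count` and `grouped_pointwise_conv`, the channel sizes (64 and 128), and the group count (4) are illustrative assumptions that do not come from the paper.

```python
from typing import List

Matrix = List[List[float]]


def conv_param_count(c_in: int, c_out: int, groups: int = 1) -> int:
    """Weight count of a pointwise (1x1) convolution with `groups` groups (bias ignored)."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    # Each group maps c_in/groups input channels to c_out/groups output channels.
    return (c_in // groups) * (c_out // groups) * groups


def grouped_pointwise_conv(x: List[float], weights: List[Matrix]) -> List[float]:
    """Apply a grouped pointwise convolution to the feature vector of one point.

    weights[g] has shape (c_out/groups) x (c_in/groups) and only sees the
    g-th slice of the input channels, which is where the saving comes from.
    """
    groups = len(weights)
    if groups == 0 or len(x) % groups:
        raise ValueError("input channels must be divisible by the number of groups")
    step = len(x) // groups
    out: List[float] = []
    for g, w in enumerate(weights):
        chunk = x[g * step:(g + 1) * step]
        for row in w:
            out.append(sum(a * b for a, b in zip(row, chunk)))
    return out


# Illustrative sizes (assumed, not taken from the paper).
dense = conv_param_count(64, 128, groups=1)
grouped = conv_param_count(64, 128, groups=4)
print(dense, grouped, dense // grouped)
```

With these assumed sizes, splitting the channels into G groups divides the weight count (and the multiply-accumulate count per point) by G. The trade-off is that channels in different groups no longer interact inside the layer, which is one reason such designs are usually paired with a later step that mixes information across branches.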
