Adaptive octree 3D image reconstruction based on plane patch

Cheng YAO; Caiwen MA

doi:10.37188/OPE.20223009.1113

Information Sciences | Updated: 2022-05-26

- Optics and Precision Engineering, Vol. 30, Issue 9, Pages 1113-1122 (2022)
- Author affiliations:
  1. Xi'an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xi'an, Shaanxi 710119
  2. University of Chinese Academy of Sciences, Beijing 100049
- CLC: TP394.1; TH691.9
- Received: 30 October 2021; Revised: 31 December 2021; Published: 10 May 2022
YAO Cheng, MA Caiwen. Adaptive octree 3D image reconstruction based on plane patch[J]. Optics and Precision Engineering, 2022, 30(09): 1113-1122. DOI: 10.37188/OPE.20223009.1113.


Abstract

This study proposes an adaptive octree convolutional neural network (O-CNN) based on plane patches for efficient 3D shape encoding and decoding. Unlike voxel-based or octree-based convolutional neural network (CNN) methods, which represent a 3D shape with voxels at a single resolution, the proposed method represents the shape adaptively with octree nodes at different levels and models the shape within each octree node with a planar patch. Based on this adaptive representation, an adaptive O-CNN encoder and decoder are designed for encoding and decoding 3D shapes. The encoder takes the plane-patch normals and displacements as input and performs 3D convolutions only on the octree nodes at each level, whereas the decoder infers the shape occupancy and subdivision status of the octree nodes at each level and estimates the best plane normal and displacement for each leaf octree node. The efficiency and effectiveness of the adaptive O-CNN on a generation task are verified by shape prediction from a single image: the chamfer distance error is 0.274, lower than the 0.294 obtained by OctGen, indicating better reconstruction. As a general framework for 3D shape analysis and generation, the plane-patch-based adaptive O-CNN not only reduces memory and computational costs but also exhibits better shape generation capability than existing 3D-CNN methods.
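
The adaptive representation described in the abstract (octree nodes at different levels, each leaf holding a plane normal and displacement) can be illustrated with a minimal sketch. It builds only the data structure from a point cloud and does not reproduce the network. The function names, the least-squares plane fit, and the subdivision rule (split while the maximum point-to-plane distance exceeds `tol * half`) are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def fit_plane_patch(points, center):
    """Fit a plane to `points` (m x 3). Returns a unit normal n, a displacement d
    with n . (p - center) + d ~= 0 on the patch, and the maximum fitting error."""
    centroid = points.mean(axis=0)
    centered = points - centroid
    # eigh returns eigenvalues in ascending order; the eigenvector of the
    # smallest one is the direction of least variance, i.e. the plane normal.
    _, vecs = np.linalg.eigh(centered.T @ centered)
    normal = vecs[:, 0]
    d = -normal @ (centroid - center)
    err = np.abs((points - center) @ normal + d).max()
    return normal, d, err

def build_adaptive_octree(points, center, half, depth=0, max_depth=5, tol=0.05):
    """Return leaf cells as tuples (center, half_size, depth, normal, d).
    Assumed rule: a cell becomes a leaf when one planar patch approximates its
    points to within tol * half, or when max_depth is reached."""
    if len(points) == 0:
        return []  # empty cell: no surface passes through it
    normal, d, err = fit_plane_patch(points, center)
    if err <= tol * half or depth == max_depth:
        return [(center, half, depth, normal, d)]
    # Child index: bit k is set when the point lies on the positive side of axis k.
    codes = (points >= center).astype(int) @ np.array([1, 2, 4])
    leaves = []
    for code in range(8):
        sign = np.array([(code >> k) & 1 for k in range(3)]) * 2 - 1
        child_center = center + sign * half / 2
        leaves += build_adaptive_octree(points[codes == code], child_center,
                                        half / 2, depth + 1, max_depth, tol)
    return leaves

rng = np.random.default_rng(0)

# A flat sheet is captured by a single coarse leaf.
sheet = np.column_stack([rng.uniform(-1, 1, (2000, 2)), np.full(2000, 0.1)])
sheet_leaves = build_adaptive_octree(sheet, np.zeros(3), 1.0)

# A sphere needs finer cells because its surface is curved everywhere.
directions = rng.normal(size=(20000, 3))
sphere = 0.8 * directions / np.linalg.norm(directions, axis=1, keepdims=True)
sphere_leaves = build_adaptive_octree(sphere, np.zeros(3), 1.0)

print(len(sheet_leaves), len(sphere_leaves))
```

In this sketch, flat regions stop subdividing early while curved regions descend to finer levels, which is the sense in which the representation is adaptive; the per-leaf `(normal, d)` pairs correspond to the kind of plane-patch signal the abstract says the encoder consumes and the decoder estimates.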
