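1. Xi'an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xi'an 710119, Shaanxi, China
2. University of Chinese Academy of Sciences, Beijing 100049, China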
Received: 30 October 2021
Revised: 31 December 2021
Published: 10 May 2022
YAO Cheng, MA Caiwen. Adaptive octree 3D image reconstruction based on plane patch[J]. Optics and Precision Engineering, 2022, 30(09): 1113-1122. DOI: 10.37188/OPE.20223009.1113.
This study proposes an adaptive octree convolutional neural network (O-CNN) based on plane patches for efficient 3D shape encoding and decoding. Unlike voxel-based or octree-based convolutional neural network (CNN) methods, which represent a 3D shape with voxels at a single resolution, the adaptive O-CNN represents the shape adaptively with octree nodes at different levels and models the shape inside each octree node with a plane patch. Building on this adaptive representation, an adaptive O-CNN encoder and decoder are designed for encoding and decoding 3D shapes. The encoder takes the plane patch normal and displacement as input and performs 3D convolution only on the octree nodes at each level, whereas the decoder infers the shape occupancy and subdivision status of the octree nodes at each level and estimates the best plane normal and displacement for each leaf octree node. The efficiency and effectiveness of the adaptive O-CNN on generation tasks are verified by shape prediction from a single image: the chamfer distance error is 0.274, lower than the 0.294 obtained by OctGen, indicating better reconstruction. As a general framework for 3D shape analysis and generation, the plane-patch-based adaptive O-CNN not only reduces memory and computational cost but also achieves better shape generation than existing 3D CNN methods.
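As a rough illustration of two quantities named in the abstract, the sketch below fits a plane patch (unit normal and displacement) to the surface points inside one octree cell and computes a chamfer distance between two point sets. It is not the authors' implementation. The function names, the plane convention `normal . (x - center) + displacement = 0`, the use of a least-squares fit, and the chamfer definition (sum of the two mean squared nearest-neighbour distances, without any scaling) are all assumptions made for this example; the paper may define or normalize them differently.

```python
import numpy as np


def fit_plane_patch(points, center):
    """Least-squares plane through `points`, expressed relative to `center`.

    Assumed convention (not taken from the paper): the patch is the plane
    normal . (x - center) + displacement = 0, with `normal` of unit length.
    """
    points = np.asarray(points, dtype=float)
    center = np.asarray(center, dtype=float)
    centroid = points.mean(axis=0)
    # The right singular vector with the smallest singular value is the
    # direction of least variance, i.e. the least-squares plane normal.
    _, _, vt = np.linalg.svd(points - centroid)
    normal = vt[-1]
    displacement = -float(normal @ (centroid - center))
    return normal, displacement


def patch_error(points, center, normal, displacement):
    """Largest absolute point-to-plane distance for the fitted patch."""
    points = np.asarray(points, dtype=float)
    center = np.asarray(center, dtype=float)
    return float(np.max(np.abs((points - center) @ normal + displacement)))


def chamfer_distance(a, b):
    """Symmetric chamfer distance between point sets a (N, 3) and b (M, 3).

    Assumed definition: mean squared distance from each point in `a` to its
    nearest neighbour in `b`, plus the same term from `b` to `a`.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Pairwise squared distances via broadcasting, shape (N, M).
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return float(d2.min(axis=1).mean() + d2.min(axis=0).mean())
```

One possible use, again only as an assumption about how an adaptive octree could be built: a cell whose `patch_error` exceeds a chosen tolerance would be subdivided further, while a cell that is already well approximated by its plane patch would be kept as a leaf.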