Unsupervised representation learning for cultural relics based on local-global bidirectional reasoning

LIU Jie; GENG Guohua; TIAN Yu; WANG Yi; LIU Yangyang; ZHOU Mingquan

doi:10.37188/OPE.20223018.2241

Information Science | Updated: 2022-09-26

- Optics and Precision Engineering, 2022, Vol. 30, No. 18, pp. 2241-2252
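- Author affiliations:
  
  1. National and Local Joint Engineering Research Center for Cultural Heritage Digitization, Northwest University, Xi'an 710127, Shaanxi, China
  2. School of Information Science and Technology, Northwest University, Xi'an 710127, Shaanxi, China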
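- About the authors:
  
  LIU Jie (1989-), female, born in Xinxiang, Henan; Ph.D. candidate. Her research focuses on 3D reconstruction and intelligent information processing. E-mail: jieliu2017@126.com
  GENG Guohua (1955-), female, born in Laixi, Shandong; professor and doctoral supervisor. Her research focuses on intelligent information processing, databases and knowledge bases, and image processing. E-mail: ghgeng@nwu.edu.cn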
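- Funding:
  
  National Key Research and Development Program of China (2019YFC1521103); National Natural Science Foundation of China (61731015); Key Industrial Chain Project of Shaanxi Province (2019ZDLSF07-02; 2019ZDLGY10-01); Key Research and Development Project of Shaanxi Province (2022GY-331); Special Project of the Education Department of Shaanxi Province (19JK0842); Key Research and Development and Transformation Program of Qinghai Province (2020-SF-140)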
- CLC number: TP391
- Received: 2022-03-27; revised: 2022-04-27; published in print: 2022-09-25
LIU Jie, GENG Guohua, TIAN Yu, et al. Unsupervised representation learning for cultural relics based on local-global bidirectional reasoning[J]. Optics and Precision Engineering, 2022, 30(18): 2241-2252. DOI: 10.37188/OPE.20223018.2241.

Abstract

Existing representation learning methods for ceramic cultural relics are supervised and depend on large amounts of labeled data. Manual labeling is time-consuming and labor-intensive, and supervised methods do not effectively learn the intrinsic structural information of point clouds. We propose an unsupervised representation learning network based on local-global bidirectional reasoning to extract deep features of ceramic cultural relics. First, a multi-scale shell convolution-based hierarchical encoder extracts local features of relic fragments at different scales. Second, a local-to-global reasoning module maps the extracted local features to global features; the difference between the two is measured with metric learning, and the network is trained iteratively. Third, a global-to-local reasoning module safeguards the quality of the learned global feature. The local-to-global module only supervises the local representations to be close to the global one, so a low-level generation task is used as a self-supervision signal: a folding-based decoder reconstructs the point cloud from the global feature in a coarse-to-fine manner, which lets the global feature capture more of the basic structural information of the point cloud. Finally, point cloud representations are learned through bidirectional reasoning between local structures and global shapes at different levels, and the learned representations are applied to the downstream task of point cloud classification. On the Terracotta Warriors dataset and the public ModelNet40 dataset, the model achieves classification accuracies of 93.33% and 92.02%, respectively, which are 4.4% and 2.82% higher than those of the supervised PointNet. The results show that the model narrows the gap between unsupervised and supervised learning methods in downstream classification tasks.
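
The two training signals named in the abstract, a metric-learning term that ties local features to the global feature and a reconstruction term that checks the global feature, can be illustrated with a minimal sketch. The abstract does not state which losses are used, so the N-pair/InfoNCE-style contrastive loss, the Chamfer distance, the `temperature` parameter, and every function name below are assumptions chosen for illustration, not the authors' implementation.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v] if norm > 0 else list(v)


def local_to_global_loss(local_feats, global_feats, temperature=0.1):
    """Contrastive (N-pair / InfoNCE-style) loss; an assumed choice.

    local_feats[i]  : list of local feature vectors for shape i
    global_feats[i] : one global feature vector for shape i
    Each local feature is pulled toward the global feature of its own shape
    and pushed away from the global features of the other shapes.
    """
    globals_n = [normalize(g) for g in global_feats]
    total, count = 0.0, 0
    for i, locals_i in enumerate(local_feats):
        for f in locals_i:
            f_n = normalize(f)
            logits = [dot(f_n, g) / temperature for g in globals_n]
            # Numerically stable log-sum-exp over all shapes in the batch.
            m = max(logits)
            log_z = m + math.log(sum(math.exp(x - m) for x in logits))
            total += log_z - logits[i]  # negative log-softmax of the true pair
            count += 1
    return total / count


def chamfer_distance(p, q):
    """Symmetric Chamfer distance between two point sets (lists of xyz tuples).

    Assumed here as the reconstruction error between an input point cloud
    and the decoder output.
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    def one_way(src, dst):
        return sum(min(sq_dist(a, b) for b in dst) for a in src) / len(src)

    return one_way(p, q) + one_way(q, p)


if __name__ == "__main__":
    # Two toy shapes, each with two 2-D local features and one global feature.
    local_feats = [[[1.0, 0.0], [0.9, 0.1]], [[0.0, 1.0], [0.1, 0.9]]]
    global_feats = [[1.0, 0.0], [0.0, 1.0]]
    print(local_to_global_loss(local_feats, global_feats))
    print(chamfer_distance([(0, 0, 0), (1, 0, 0)], [(0, 0, 0)]))
```

In this toy batch the contrastive loss is small when each shape's local features align with its own global feature and grows when the global features are swapped between shapes. A training objective in this spirit would sum the two terms, but the weighting between them is not given in the abstract.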
