Hand-object interaction pose estimation integrating multi-modal features and structure awareness (联合多模态特征与结构感知的手物交互姿态估计)

WANG Wenrun (王文润); DANG Jianwu (党建武); WANG Yangping (王阳萍); REN Pengbai (任鹏百); PAN Rui (潘瑞)

doi:10.37188/OPE.20253320.3265

Information Science | Updated: 2025-11-28

- Hand-object interaction pose estimation integrating multi-modal features and structure awareness
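- Optics and Precision Engineering, 2025, Vol. 33, Issue 20, pp. 3265-3280
- Author affiliations:
  
  1. National Virtual Simulation Experimental Teaching Center for Rail Transit Information and Control, Lanzhou Jiaotong University, Lanzhou 730070, Gansu, China
  2. School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, Gansu, China
  3. Gansu Provincial Engineering Research Center for Artificial Intelligence and Graphics & Image Processing, Lanzhou 730070, Gansu, China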
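- About the authors:
  
  WANG Wenrun (b. 1990), female, from Baiyin, Gansu, is a Ph.D. candidate and engineer. She received her bachelor's degree from Chongqing Normal University in 2013 and her master's degree from Lanzhou Jiaotong University in 2016. Her research focuses on human-computer interaction and virtual reality. E-mail: wangwenrun@mail.lzjtu.cn
  DANG Jianwu (b. 1963), male, from Fuping, Shaanxi, holds a Ph.D. and is a professor and doctoral supervisor. He received his bachelor's degree from Lanzhou Railway Institute in 1986, and his master's and doctoral degrees from Southwest Jiaotong University in 1992 and 1996, respectively. His research covers traffic information engineering and control, intelligent information processing, and image processing. E-mail: dangjw@mail.lzjtu.cn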
- Funding:
  
  National Natural Science Foundation of China (62067006; 62367005); University Youth Science Fund Project (2022012)
- DOI: 10.37188/OPE.20253320.3265
  Chinese Library Classification (CLC) number: TP391.41
- CSTR: 32169.14.OPE.20253320.3265
- Received: 2025-07-23; Revised: 2025-08-27; Published in print: 2025-10-25

WANG Wenrun, DANG Jianwu, WANG Yangping, et al. Hand-object interaction pose estimation integrating multi-modal features and structure awareness[J]. Optics and Precision Engineering, 2025, 33(20): 3265-3280.

Abstract

In the real world, hands inevitably interact with objects, so understanding the interaction behaviors and intentions between human hands and objects is of considerable research significance. This paper addresses the low accuracy of pose estimation during hand-object interaction, which is caused by mutual hand-object occlusion, hand self-occlusion, and complex interaction backgrounds, and proposes a 3D pose estimation method for hands and interacting objects that combines multi-modal features with structure awareness. First, the method exploits the complementary multi-modal features of color and depth images to effectively address complex backgrounds, hand self-occlusion, and mutual hand-object occlusion. Second, graph-based structure-awareness modules are designed for the hand, the interacting object, and the hand-object interaction, which help estimate more plausible and accurate 2D poses of the hand and the object. Finally, the estimated 2D poses are merged with the depth information from the depth image, and texture features are used to further refine the merged 3D poses into the final hand-object interaction 3D pose. To verify the effectiveness of the method, a series of experiments was conducted on datasets including FPHA and HO-3D; the pose errors of the hand and the interacting object are reduced to 9.62 mm and 14.37 mm, respectively. The experimental results show that the proposed method outperforms existing hand-object interaction pose estimation methods and exhibits strong robustness and generalization.
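
The step of merging 2D poses with depth information can be illustrated with a minimal sketch. The abstract does not specify how the merge is carried out, so the code below assumes the standard pinhole back-projection; the function names `lift_2d_to_3d` and `mean_joint_error` and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical, and the texture-based refinement stage is not modelled. The error function assumes the reported millimetre errors are mean Euclidean distances over keypoints, which the abstract does not confirm.

```python
import numpy as np


def lift_2d_to_3d(keypoints_2d, depth_map, fx, fy, cx, cy):
    """Back-project (u, v) pixel keypoints into camera space.

    Assumes a pinhole camera (hypothetical intrinsics fx, fy, cx, cy) and a
    depth map aligned with the color image; output units match the depth map.
    """
    kp = np.asarray(keypoints_2d, dtype=np.float64)
    height, width = depth_map.shape
    # Sample depth at the nearest pixel, clamped to the image bounds.
    u_idx = np.clip(np.rint(kp[:, 0]).astype(int), 0, width - 1)
    v_idx = np.clip(np.rint(kp[:, 1]).astype(int), 0, height - 1)
    z = depth_map[v_idx, u_idx].astype(np.float64)
    # Pinhole model: x = (u - cx) * z / fx, y = (v - cy) * z / fy.
    x = (kp[:, 0] - cx) * z / fx
    y = (kp[:, 1] - cy) * z / fy
    return np.stack([x, y, z], axis=1)


def mean_joint_error(pred_3d, gt_3d):
    """Mean Euclidean distance between predicted and ground-truth keypoints."""
    pred = np.asarray(pred_3d, dtype=np.float64)
    gt = np.asarray(gt_3d, dtype=np.float64)
    return float(np.linalg.norm(pred - gt, axis=1).mean())


# Toy usage: a flat depth map at 500 mm and two keypoints on the same row.
depth = np.full((100, 100), 500.0)
points_2d = [[50.0, 50.0], [60.0, 50.0]]
points_3d = lift_2d_to_3d(points_2d, depth, fx=500.0, fy=500.0, cx=50.0, cy=50.0)
```

In practice the sampled depth at an occluded joint belongs to the occluder rather than the joint itself, which is one reason a refinement stage such as the one the abstract mentions would be needed after this kind of lifting.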

