School of Mechanical and Automotive Engineering, Qingdao University of Technology, Qingdao 266520, Shandong, China
Received: 16 May 2022
Revised: 20 June 2022
Published: 10 October 2022
LIN Jingang, LI Dongnian, CHEN Chengjun, et al. Global hand pose estimation based on pixel voting[J]. Optics and Precision Engineering, 2022, 30(19): 2379-2389. DOI: 10.37188/OPE.20223019.2379.
Global hand pose estimation under changing gestures remains a challenging task in computer vision. To reduce the large errors typical of this task, a global hand pose estimation method based on pixel voting is proposed. First, a convolutional neural network with an encoder-decoder structure generates feature maps that carry semantic and pose information. Second, a semantic segmentation branch and a pose estimation branch extract the hand pixel positions and the pixel-wise pose votes from these feature maps, respectively. Finally, the pose votes of the hand pixels are aggregated to obtain the voting result. To address the scarcity of global hand pose datasets, a program for synthesizing hand datasets was built with the OpenSceneGraph (OSG) 3D rendering engine and a 3D hand model; it generates hand depth images and global pose labels under different gestures. Experimental results show that the mean error of the proposed method is 5.036°, indicating that it can robustly and accurately estimate the global hand pose from depth images.
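The aggregation step can be illustrated with a minimal sketch. The abstract does not state how a pose vote is represented or how votes are combined, so the choices here are assumptions made only for illustration: each pixel votes with a 3D direction vector, the segmentation output is a boolean mask, and the votes of hand pixels are combined by a normalized mean. The function names `aggregate_pose_votes` and `angular_error_deg` are hypothetical and are not taken from the paper.

```python
import numpy as np


def aggregate_pose_votes(votes, hand_mask):
    """Combine per-pixel direction votes of hand pixels into one unit vector.

    votes:     (H, W, 3) array, one 3D direction vote per pixel
               (assumed representation, not specified in the abstract).
    hand_mask: (H, W) boolean array, True where a pixel is labeled as hand.
    """
    hand_votes = votes[hand_mask]  # shape (N, 3): only hand pixels vote
    if hand_votes.shape[0] == 0:
        raise ValueError("hand_mask selects no pixels")

    # Normalize each vote so every hand pixel contributes equally.
    norms = np.linalg.norm(hand_votes, axis=1, keepdims=True)
    unit_votes = hand_votes / np.clip(norms, 1e-12, None)

    # Assumed aggregation rule: mean of unit vectors, renormalized.
    mean_vote = unit_votes.mean(axis=0)
    mean_norm = np.linalg.norm(mean_vote)
    if mean_norm < 1e-12:
        raise ValueError("votes cancel out; no dominant direction")
    return mean_vote / mean_norm


def angular_error_deg(estimate, truth):
    """Angle in degrees between an estimated and a ground-truth direction."""
    cos_angle = np.dot(estimate, truth) / (
        np.linalg.norm(estimate) * np.linalg.norm(truth)
    )
    return float(np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0))))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    truth = np.array([0.0, 0.0, 1.0])

    # Background pixels carry arbitrary votes; hand pixels vote near `truth`.
    votes = rng.normal(size=(32, 32, 3))
    mask = np.zeros((32, 32), dtype=bool)
    mask[8:24, 8:24] = True
    votes[mask] = truth + 0.1 * rng.normal(size=(int(mask.sum()), 3))

    estimate = aggregate_pose_votes(votes, mask)
    print("estimated direction:", estimate)
    print("angular error (deg):", angular_error_deg(estimate, truth))
```

Masking before averaging is what keeps background pixels from contaminating the estimate, which is the role the segmentation branch plays in the description above. A full global pose would need a rotation representation rather than a single direction; the sketch leaves that out.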