College of Support, Rocket Force University of Engineering, Xi'an 710025, Shaanxi, China
LI Qinghui (1989-), male, born in Linyi, Shandong, Ph.D. candidate. He received his B.S. and M.S. degrees from the Second Artillery Engineering University in 2011 and 2013, respectively. His research focuses on machine vision and pattern recognition. E-mail: brightlishi@gmail.com, lqhuiu1212@126.com
Received: 2018-02-06
Accepted: 2018-04-03
Published in print: 2018-10-25
李庆辉, 郑勇, 方浩. 利用几何特征和时序注意递归网络的动作识别[J]. 光学 精密工程, 2018,26(10):2584-2591. DOI: 10.3788/OPE.20182610.2584.
Qing-hui LI, Yong ZHENG, Hao FANG. Action recognition using geometric features and recurrent temporal attention network[J]. Optics and precision engineering, 2018, 26(10): 2584-2591. DOI: 10.3788/OPE.20182610.2584.
To improve the accuracy of skeleton-based human action recognition, an action recognition method using skeletal geometric features and a recurrent temporal attention network is proposed. First, the vectorized form of a rotation matrix is used to describe the relative geometric relationship between pairs of body parts; this descriptor is fused with joint coordinates and joint distances to form the skeleton's feature representation. Next, a temporal attention method is introduced: the amount of valuable information in the current frame is judged by comparing it with a weighted average of the preceding frames, and a multilayer perceptron generates the corresponding weight. Finally, the skeleton feature representation, multiplied by its weight, is propagated through a three-layer LSTM network to predict the action label. The method achieves recognition accuracies of 96.93% on the MSR-Action3D dataset and 80.50% on the UWA3D Multiview Activity II dataset. The experimental results show that the method recognizes human actions effectively and adapts well to viewpoint changes.
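The feature construction outlined in the abstract (vectorized rotation matrices between body-part pairs, fused with joint coordinates and pairwise joint distances) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the joint layout, the part pairing, and all function names are assumptions made for the example.

```python
import numpy as np

def skew(v):
    """Skew-symmetric cross-product matrix of a 3-vector."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def rotation_between(a, b):
    """Rotation matrix taking direction a onto direction b (Rodrigues' formula)."""
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    v, c = np.cross(a, b), np.dot(a, b)
    if np.isclose(c, -1.0):                 # antiparallel: 180-degree rotation
        axis = np.cross(a, np.eye(3)[np.argmin(np.abs(a))])
        axis /= np.linalg.norm(axis)
        K = skew(axis)
        return np.eye(3) + 2.0 * K @ K
    K = skew(v)
    return np.eye(3) + K + K @ K / (1.0 + c)

def frame_feature(joints, part_pairs):
    """Fuse vectorized rotations, joint coordinates, and pairwise joint
    distances into one per-frame descriptor, as the abstract describes."""
    feats = []
    for (i, j), (k, l) in part_pairs:       # each part = (start_joint, end_joint)
        R = rotation_between(joints[j] - joints[i], joints[l] - joints[k])
        feats.append(R.reshape(-1))         # 9-dim vectorized rotation matrix
    feats.append(joints.reshape(-1))        # raw joint coordinates
    n = len(joints)                         # pairwise distances (upper triangle)
    dists = [np.linalg.norm(joints[i] - joints[j])
             for i in range(n) for j in range(i + 1, n)]
    feats.append(np.array(dists))
    return np.concatenate(feats)

# Toy example: 4 joints and one body-part pair.
joints = np.array([[0., 0., 0.], [1., 0., 0.],
                   [0., 0., 1.], [0., 1., 1.]])
pairs = [((0, 1), (2, 3))]
f = frame_feature(joints, pairs)
print(f.shape)  # 9 (rotation) + 12 (coords) + 6 (distances) -> (27,)
```

In the full method, one such descriptor per frame would be scaled by the attention weight and fed to the LSTM; the pair list here covers a single part pair only for brevity.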