Action recognition model construction based on multi-scale deep convolution neural network

Zhi LIU; Jiang-tao HUANG; Xin FENG

doi:10.3788/OPE.20172503.0799


Information Sciences | Updated: 2020-07-07

- Optics and Precision Engineering Vol. 25, Issue 3, Pages: 799-805 (2017)
- Author affiliations:
  1. School of Computer Science, Chongqing University of Technology, Chongqing 400054
  2. School of Computer and Information Engineering, Guangxi Teachers Education University, Nanning, Guangxi 530001
- About the author:
  HUANG Jiang-tao, E-mail: hjt@gxtc.edu.cn
- DOI: 10.3788/OPE.20172503.0799
  CLC: TP394.1; TH691.9
- Received: 21 December 2016
  Accepted: 15 January 2017
  Published: 25 March 2017
Zhi LIU, Jiang-tao HUANG, Xin FENG. Action recognition model construction based on multi-scale deep convolution neural network[J]. Optics and Precision Engineering, 2017, 25(3): 799-805. DOI: 10.3788/OPE.20172503.0799.

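Abstract (Chinese version, translated)

To simplify the feature-extraction process in traditional human action recognition methods and to improve the generalization of the extracted features, this paper proposes a human action recognition method based on a deep convolutional neural network and multi-scale information. The method takes depth video as its object of study, builds a deep architecture based on convolutional neural networks, and fuses multi-scale information, namely coarse-grained global action patterns and fine-grained local hand movements, to recognize human actions. Experiments on the MSRDailyActivity3D dataset give an average recognition accuracy of 98% for actions No. 11 to No. 16 and 60.625% over all actions. The results show that the method recognizes human actions effectively: actions with pronounced movement are recognized accurately for the most part, while recognition accuracy declines for actions that involve only local hand movement.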
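The two-scale idea described here (one convolutional branch for the coarse, whole-body view and one for the fine-grained hand region, with their features fused before classification) can be illustrated with a minimal PyTorch sketch. This is not the authors' network: the class names `Branch` and `TwoScaleNet`, every layer size, the input resolutions, the use of a single depth frame per input, and fusion by simple concatenation are all assumptions made for illustration. Only the count of 16 action classes comes from the MSRDailyActivity3D dataset.

```python
import torch
import torch.nn as nn


class Branch(nn.Module):
    """Small convolutional feature extractor for one input scale.

    All layer sizes are hypothetical; the source does not specify them.
    """

    def __init__(self, in_channels: int = 1, feat_dim: int = 64):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # output is 32x1x1 for any input size
        )
        self.project = nn.Linear(32, feat_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x).flatten(1)  # (batch, 32)
        return torch.relu(self.project(x))  # (batch, feat_dim)


class TwoScaleNet(nn.Module):
    """Two parallel branches whose features are concatenated, then classified."""

    def __init__(self, num_classes: int = 16, feat_dim: int = 64):
        super().__init__()
        self.global_branch = Branch(feat_dim=feat_dim)  # coarse, full-frame view
        self.hand_branch = Branch(feat_dim=feat_dim)    # fine-grained hand crop
        self.classifier = nn.Linear(2 * feat_dim, num_classes)

    def forward(self, full_frame: torch.Tensor, hand_crop: torch.Tensor) -> torch.Tensor:
        fused = torch.cat(
            [self.global_branch(full_frame), self.hand_branch(hand_crop)], dim=1
        )
        return self.classifier(fused)  # unnormalized class scores


# Usage with random stand-ins for single-channel depth images (assumed sizes).
model = TwoScaleNet()
full_frame = torch.randn(4, 1, 120, 160)  # batch of 4 full depth frames
hand_crop = torch.randn(4, 1, 48, 48)     # batch of 4 hand-region crops
logits = model(full_frame, hand_crop)     # shape (4, 16)
```

Keeping the branches separate lets each one see its input at its own resolution; the adaptive pooling layer is a convenience of this sketch so that the two inputs need not share a size.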

Abstract

To simplify the feature-extraction process in Human Activity Recognition (HAR) and improve the generalization of the extracted features, an algorithm based on a multi-scale deep convolutional neural network was proposed. In this algorithm, depth video was selected as the research object, and a deep network based on parallel CNNs (Convolutional Neural Networks) was constructed to process coarse global information about the action and fine-grained local information about the hands simultaneously. Experiments were carried out on the MSRDailyActivity3D dataset. The average recognition accuracy on actions No. 11 to No. 16 was 98%, while that on all actions was 60.625%. The experimental results showed that the proposed algorithm could recognize human activity effectively. Almost all actions with obvious movements, and most actions with local movements only in the hands, could be recognized effectively.
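The two reported averages imply a figure for the remaining actions that the abstract does not state. The short calculation below derives it, assuming that the 60.625% figure is the unweighted mean of the per-action accuracies over all 16 MSRDailyActivity3D actions (equivalently, that every action has the same number of test samples). That assumption is not confirmed by the abstract, so the result is indicative only.

```python
# Reported values from the abstract (percent).
MEAN_ALL_ACTIONS = 60.625   # average accuracy over all actions
MEAN_ACTIONS_11_TO_16 = 98.0  # average accuracy over actions No. 11 to No. 16

NUM_ACTIONS = 16            # action classes in MSRDailyActivity3D
num_subset = 6              # actions No. 11 to No. 16
num_rest = NUM_ACTIONS - num_subset  # actions No. 1 to No. 10

# Assumption: the overall figure is an unweighted mean of per-action accuracies,
# so the per-action accuracies must sum to NUM_ACTIONS * MEAN_ALL_ACTIONS.
total = NUM_ACTIONS * MEAN_ALL_ACTIONS
mean_rest = (total - num_subset * MEAN_ACTIONS_11_TO_16) / num_rest

print(f"Implied average accuracy on actions No. 1 to No. 10: {mean_rest:.1f}%")
# prints "Implied average accuracy on actions No. 1 to No. 10: 38.2%"
```

Under that assumption, the ten remaining actions average about 38.2%, which shows how strongly the overall figure depends on the six well-recognized actions.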


