1. College of Computer Science and Engineering, Chongqing University of Technology, Chongqing 400054, China
2. College of Computer and Information Engineering, Guangxi Teachers Education University, Nanning, Guangxi 530001, China
LIU Zhi (b. 1977), male, from Gao'an, Jiangxi; Ph.D., associate professor. He received his Ph.D. in Computer Science and Technology from Sichuan University in 2011. His research interests include deep learning, human activity recognition, image processing, object tracking, and information fusion. E-mail: liuzhi@cqut.edu.cn
Received: 2016-12-21; Accepted: 2017-01-15; Published in print: 2017-03-25
刘智, 黄江涛, 冯欣. 构建多尺度深度卷积神经网络行为识别模型[J]. 光学 精密工程, 2017,25(3):799-805. DOI: 10.3788/OPE.20172503.0799.
Zhi LIU, Jiang-tao HUANG, Xin FENG. Action recognition model construction based on multi-scale deep convolution neural network[J]. Optics and Precision Engineering, 2017, 25(3): 799-805. DOI: 10.3788/OPE.20172503.0799.
Abstract: To simplify the feature-extraction process of traditional human activity recognition (HAR) methods and improve the generalization of the extracted features, this paper proposes an HAR method based on a multi-scale deep convolutional neural network. Taking depth video as the research object, a parallel CNN-based deep architecture is constructed that simultaneously processes coarse-grained global motion patterns and fine-grained local hand movements. Experiments on the MSRDailyActivity3D dataset show an average recognition accuracy of 98% on actions No. 11-16 and 60.625% over all actions. The results indicate that the proposed method recognizes human activities effectively: actions with obvious whole-body motion are recognized accurately, while accuracy decreases for actions involving only local hand motion.
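The multi-scale idea described in the abstract, a coarse global stream and a fine-grained hand-region stream processed in parallel and then fused, can be illustrated with a minimal NumPy sketch. This is not the authors' network: the 3x3 kernel, the 4x downsampling factor, and the hand bounding box are placeholder assumptions chosen only to show how the two scales are combined into one descriptor.

```python
import numpy as np

def conv_pool(frame, kernel):
    """Valid 2-D convolution followed by 2x2 max pooling (one toy CNN stage)."""
    kh, kw = kernel.shape
    h, w = frame.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(frame[i:i + kh, j:j + kw] * kernel)
    # 2x2 max pooling (drop any odd trailing row/column)
    ph, pw = out.shape[0] // 2, out.shape[1] // 2
    return out[:ph * 2, :pw * 2].reshape(ph, 2, pw, 2).max(axis=(1, 3))

def multiscale_features(depth_frame, hand_box, kernel):
    """Fuse a coarse global stream with a fine-grained hand-region stream."""
    # Global stream: downsample the whole depth frame (coarse scale).
    global_in = depth_frame[::4, ::4]
    # Local stream: crop the hand region at full resolution (fine scale).
    y0, y1, x0, x1 = hand_box
    local_in = depth_frame[y0:y1, x0:x1]
    g = conv_pool(global_in, kernel).ravel()
    l = conv_pool(local_in, kernel).ravel()
    # Concatenation mimics the fusion of the two parallel branches.
    return np.concatenate([g, l])

rng = np.random.default_rng(0)
frame = rng.random((128, 128))            # one synthetic depth frame
kernel = rng.random((3, 3))               # placeholder learned filter
feat = multiscale_features(frame, (40, 72, 40, 72), kernel)
print(feat.shape)                         # (450,): 225 global + 225 local
```

In the actual method each branch would be a trained multi-layer CNN and the fused descriptor would feed a classifier; the sketch only shows the parallel coarse/fine structure and the fusion step.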