Fifth Department, Rocket Force University of Engineering, Xi'an 710025, Shaanxi, China
LI Qing-hui (1989-), male, born in Linyi, Shandong, Ph.D. candidate. He received his B.S. and M.S. degrees from the Second Artillery Engineering University in 2011 and 2013, respectively. His research interests include machine vision and pattern recognition. E-mail: lqhui1212@126.com
Received: 2017-08-29
Accepted: 2017-10-09
Published in print: 2018-01-25
李庆辉, 崔智高, 姜柯. 结合限制密集轨迹与时空共生特征的行为识别[J]. 光学 精密工程, 2018, 26(1): 230-237. DOI: 10.3788/OPE.20182601.0230.
LI Q H, CUI Z G, JIANG K. Action recognition via restricted dense trajectories and spatio-temporal co-occurrence feature[J]. Optics and Precision Engineering, 2018, 26(1): 230-237. DOI: 10.3788/OPE.20182601.0230.
To overcome the limitations of traditional dense trajectories in real-world scenes, where excessive invalid trajectories waste storage and computation and severely hinder the extraction of discriminative features, a novel human action recognition algorithm using restricted dense trajectories and spatio-temporal co-occurrence descriptors is proposed. First, a human detector is applied to each video frame and the resulting bounding boxes are expanded; traditional dense interest points are then restricted to these expanded boxes, which greatly reduces the number of trajectories while preserving their discriminative power. Next, the restricted dense trajectories are obtained by tracking the refined points through the optical flow field for a fixed number of frames, and a set of descriptors encoding the relative spatial position and spatio-temporal context of the trajectories is built within a space-time volume centered on each trajectory. Finally, a Bag-of-Visual-Words (BoVW) model with a support vector machine (SVM) is used to encode and classify the feature vectors. On the KTH, YouTube, and HMDB51 action datasets, the recognition accuracy reaches 98.1%, 89.7%, and 66.9%, respectively. These results show that the proposed algorithm recognizes human actions in complex real scenes effectively.
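The sampling-restriction and tracking steps summarized in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes a person detector has already supplied (x, y, w, h) boxes and that a dense optical-flow field (e.g., Farneback flow) has been computed elsewhere; the box-expansion factor, grid stride, and all function names are assumptions for the sketch.

```python
import numpy as np

def expand_box(box, scale, frame_shape):
    """Expand a detected person box (x, y, w, h) by `scale`, clipped to the frame."""
    x, y, w, h = box
    dw, dh = int(w * (scale - 1) / 2), int(h * (scale - 1) / 2)
    x0, y0 = max(0, x - dw), max(0, y - dh)
    x1, y1 = min(frame_shape[1], x + w + dw), min(frame_shape[0], y + h + dh)
    return x0, y0, x1, y1

def restricted_dense_points(frame_shape, person_boxes, stride=5, scale=1.2):
    """Densely sample points on a regular grid, keeping only those inside an
    expanded person bounding box (the 'restriction' step of the abstract)."""
    ys, xs = np.mgrid[0:frame_shape[0]:stride, 0:frame_shape[1]:stride]
    pts = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(np.float32)
    keep = np.zeros(len(pts), dtype=bool)
    for box in person_boxes:
        x0, y0, x1, y1 = expand_box(box, scale, frame_shape)
        keep |= ((pts[:, 0] >= x0) & (pts[:, 0] < x1) &
                 (pts[:, 1] >= y0) & (pts[:, 1] < y1))
    return pts[keep]

def track_points(pts, flow):
    """Advance each point one frame by the dense optical-flow vector at its
    location. `flow` is an H x W x 2 displacement field."""
    idx = np.rint(pts).astype(int)
    return pts + flow[idx[:, 1], idx[:, 0]]
```

Iterating `track_points` over a fixed number of frames yields one restricted trajectory per retained sample point, after which descriptors would be computed in the space-time volume around each trajectory.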