Research Area:  Machine Learning
Human action recognition is an emerging goal of computer vision, with applications such as video surveillance and human-computer interaction. Despite many attempts to develop deep architectures that learn the spatio-temporal features of video, hand-crafted optical flow is still an important part of the recognition process. To integrate motion features more deeply into the learning process, we propose a spatio-temporal video recognition network in which a motion-aware long short-term memory module estimates the motion flow while extracting spatio-temporal features. The network incorporates an optical flow estimator based on kernelized cross-correlation. The proposed network can be used without any extra learning process, and there is no need to pre-compute and store the optical flow. Extensive experiments on two action recognition benchmarks verify the effectiveness of the proposed approach.
Keywords:  
ConvLSTM Network
Action Recognition
Human Action Recognition
Computer Vision
Machine Learning
Deep Learning
Author(s) Name:  Mahshid Majd & Reza Safabakhsh
Journal name:  Applied Intelligence
Conference name:  
Publisher name:  Springer
DOI:  10.1007/s10489-018-1395-8
Volume Information:  Volume 49, pages 2515–2521 (2019)
Paper Link:   https://link.springer.com/article/10.1007/s10489-018-1395-8