Research Area:  Machine Learning
The goal of human action recognition is to identify and understand the actions of people in videos and to output the corresponding labels. In addition to the spatial correlation present in 2D images, actions in a video also have temporal structure. Human actions are complex, and factors such as changes in viewpoint and background noise affect recognition. To address these problems, three algorithms based on convolutional neural networks (CNNs) are designed and implemented in this paper: Two-Stream CNN, CNN+LSTM, and 3D CNN. Each algorithm is explained and analyzed in detail. The HMDB-51 dataset is used to evaluate these algorithms. Experimental results show that all three methods effectively identify human actions in a given video, and the best-performing algorithm is selected.
Keywords:  
Author(s) Name:  Zeqi Yu; Wei Qi Yan
Journal name:  
Conference name:  35th International Conference on Image and Vision Computing New Zealand (IVCNZ)
Publisher name:  IEEE
DOI:  10.1109/IVCNZ51579.2020.9290594
Volume Information:  
Paper Link:   https://ieeexplore.ieee.org/abstract/document/9290594