Understanding human activities in visual data is closely tied to advances in several modern research areas, including object recognition, human dynamics, domain adaptation, and semantic segmentation. Over the last decade, human action recognition has evolved from early techniques that were limited to controlled environments into approaches that learn from millions of videos and cover a broad range of daily activities, with applications ranging from video surveillance to human-computer interaction.
The most prominent applications of human action recognition are in health care, well-being, digital gaming, sports, and general-purpose monitoring systems. Human action recognition spans numerous research topics in computer vision, including human detection in video, human pose estimation, human tracking, and the interpretation and understanding of time series data.
Human activity research has seen a surge in the use of Deep Learning (DL) techniques to improve recognition accuracy. Deep learning architectures applied to human action recognition include spatiotemporal networks, multiple stream networks, deep generative networks, and temporal coherency networks.
Common challenges faced by human action recognition algorithms include the complexity and variety of day-to-day activities, intra-subject and inter-subject variability for the same activity, the trade-off between privacy and performance, the reliability of computation on embedded and portable systems, and the difficulty of data annotation.
Most existing surveys have concentrated on specific problems, such as human action recognition using depth data, 3D-skeleton data, or still image data, spatiotemporal interest point-based approaches, and human walking motion recognition. Because human activity recognition has become a significant research area in recent years, it has been investigated extensively, and numerous comprehensive surveys and reviews have been published that provide overviews and discuss feature representations, deep learning techniques, challenges, and future directions.