A 3DCNN-LSTM Multi-Class Temporal Segmentation for Hand Gesture Recognition - 2022

Research Area: Machine Learning

Abstract:

This paper introduces a multi-class hand gesture recognition model that identifies a set of hand gesture sequences from two-dimensional RGB video recordings, using both the appearance and the spatiotemporal parameters of consecutive frames. The classifier combines a three-dimensional convolutional network with a long short-term memory (LSTM) unit. To reduce the need for a large-scale dataset, the model is first trained on a public dataset and then fine-tuned on the hand gestures of relevance, a technique known as transfer learning. Validation curves obtained with a batch size of 64 indicate an accuracy of 93.95% (±0.37) with a mean Jaccard index of 0.812 (±0.105) for 22 participants. The fine-tuned architecture illustrates the possibility of refining a model with a small set of data (113,410 fully labelled image frames) to cover previously unknown hand gestures. The main contribution of this work is a custom hand gesture recognition network driven by monocular RGB video sequences that outperforms previous temporal segmentation models while keeping a small-sized architecture that facilitates wide adoption.
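The mean Jaccard index quoted above measures how well predicted frame labels overlap with the ground-truth labels. The following is a minimal sketch of one common way to compute it for temporal segmentation, assuming one label per video frame and an unweighted average over classes. The function name `mean_jaccard` and the example labels are hypothetical, and the paper's exact averaging scheme (for instance per participant) is not stated in this abstract.

```python
from collections import Counter


def mean_jaccard(true_labels, pred_labels):
    """Unweighted mean of per-class Jaccard indices over frame labels.

    Assumption: both sequences hold one class label per frame, in the
    same frame order. This is an illustrative sketch, not the paper's code.
    """
    if len(true_labels) != len(pred_labels):
        raise ValueError("label sequences must have the same length")
    if not true_labels:
        raise ValueError("label sequences must not be empty")

    true_count = Counter(true_labels)   # frames per class in the ground truth
    pred_count = Counter(pred_labels)   # frames per class in the prediction
    intersection = Counter(
        t for t, p in zip(true_labels, pred_labels) if t == p
    )                                   # frames where both agree, per class

    scores = []
    for cls in set(true_count) | set(pred_count):
        # |A ∪ B| = |A| + |B| - |A ∩ B|; never zero for a class that occurs
        union = true_count[cls] + pred_count[cls] - intersection[cls]
        scores.append(intersection[cls] / union)
    return sum(scores) / len(scores)


# Hypothetical six-frame clip with two gesture classes.
truth = ["rest", "rest", "grasp", "grasp", "grasp", "rest"]
guess = ["rest", "grasp", "grasp", "grasp", "rest", "rest"]
score = mean_jaccard(truth, guess)
```

In this toy clip each class has two agreeing frames out of a union of four, so both per-class indices and their mean are 0.5.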

Keywords:
hand gesture classification
transfer learning
three-dimensional convolutional
LSTM network

Author(s) Name: Letizia Gionfrida, Wan M. R. Rusli, Angela E. Kedgley and Anil A. Bharath

Journal name: Electronics

Conference name:

Publisher name: MDPI

DOI: 10.3390/electronics11152427

Volume Information: Volume 11, Issue 15

Paper Link: https://www.mdpi.com/2079-9292/11/15/2427