Research Area:  Machine Learning
This paper introduces a multi-class hand gesture recognition model developed to identify a set of hand gesture sequences from two-dimensional RGB video recordings, using both the appearance and spatiotemporal parameters of consecutive frames. The classifier combines a convolutional network with a long short-term memory (LSTM) unit. To address the need for a large-scale dataset, the model is first trained on a public dataset and then fine-tuned on the hand gestures of interest through transfer learning. Validation curves obtained with a batch size of 64 indicate an accuracy of 93.95% (±0.37) with a mean Jaccard index of 0.812 (±0.105) for 22 participants. The fine-tuned architecture illustrates the possibility of refining a model with a small set of data (113,410 fully labelled image frames) to cover previously unknown hand gestures. The main contribution of this work is a custom hand gesture recognition network, driven by monocular RGB video sequences, that outperforms previous temporal segmentation models while retaining a small architecture that facilitates wide adoption.
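As an illustration of the Jaccard index reported in the abstract, the following is a minimal sketch of how a mean Jaccard index could be computed from per-frame gesture labels. It assumes that the score is the intersection over union of the frames assigned to each label, averaged over labels; the function names `jaccard_index` and `mean_jaccard` and the example labels are hypothetical and are not taken from the paper, whose exact evaluation protocol may differ.

```python
def jaccard_index(predicted, reference, label):
    """Intersection over union of the frame indices assigned to `label`.

    Assumption: both sequences hold one label per video frame.
    Returns None when the label appears in neither sequence.
    """
    pred_frames = {i for i, p in enumerate(predicted) if p == label}
    ref_frames = {i for i, r in enumerate(reference) if r == label}
    union = pred_frames | ref_frames
    if not union:
        return None
    return len(pred_frames & ref_frames) / len(union)


def mean_jaccard(predicted, reference):
    """Average the per-label Jaccard index over every label that occurs."""
    if len(predicted) != len(reference):
        raise ValueError("sequences must have one label per frame each")
    labels = sorted(set(predicted) | set(reference))
    scores = [jaccard_index(predicted, reference, label) for label in labels]
    return sum(scores) / len(scores)


# Hypothetical six-frame clip: ground truth versus a prediction that
# starts and ends the gesture one frame too early.
reference = ["rest", "rest", "fist", "fist", "fist", "rest"]
predicted = ["rest", "fist", "fist", "fist", "rest", "rest"]
print(mean_jaccard(predicted, reference))
```

In this toy clip each label overlaps on two of four frames, so a one-frame shift in the gesture boundaries already lowers the score noticeably, which is why the index is informative for temporal segmentation.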
Keywords:  
hand gesture classification
transfer learning
three-dimensional convolutional
LSTM network
Author(s) Name:  Letizia Gionfrida, Wan M. R. Rusli, Angela E. Kedgley and Anil A. Bharath
Journal name:  Electronics
Conference name:  
Publisher name:  MDPI
DOI:  10.3390/electronics11152427
Volume Information:  Volume 11, Issue 15
Paper Link:   https://www.mdpi.com/2079-9292/11/15/2427