In deep learning models, neural networks significantly grow the speech recognition task. Speech Recognition works on human inputs that enable machines to react on inserted text, voice, or any other inputs. Automatic speech recognition (ASR) is based on two probability functions: an acoustic model that computes the probability of correspondence of an utterance to the input word sequence and a language model that calculates the prior probability of the meaning of the input. A deep neural network as an acoustic model has improved the performance of the ASR system. Various deep neural network algorithms are applied for automatic speech recognition, such as convolutional neural networks, deep belief networks, and recurrent neural networks are most commonly used.
The application areas of speech recognition are Home automation, Interactive voice response, Mobile telephony, including smartphones, email, In-car systems, Health care, Military Telephony, IoT, and other domains. Recently, transformers for speech recognition that achieve better performance. Future advancements of speech recognition using deep neural networks are Deep RNN - long short term memory for speech recognition, subword-based HMM-DNN speech recognition, Advancing RNN Transducer Technology for Speech Recognition, and many more.
• Conventional speech recognition systems represent speech signals as a short-time stationary signal using Gaussian Mixture Models (GMMs) based on hidden Markov models (HMMs), but it is unable to model temporal dependencies for continuous signals.
• To tackle this issue, recent deep learning approaches such as convolutional neural networks (CNN) and deep recurrent neural networks (RNN) have led to a significant impact on speech recognition tasks because of their ability to model complex correlations in speech features.
• Deep learning algorithms permit discriminative training efficiently because they operate as a greedy layer-wise unsupervised pre-training and learn hierarchy from extracted features from each layer at a time.
• Deep Belief Networks, Convolutional Neural Networks, and Recurrent Neural Networks are acoustic models that have successfully outperformed GMM-based acoustic models.