wav2vec: Unsupervised Pre-training for Speech Recognition - 2019

Research Area:  Machine Learning

Abstract:

We explore unsupervised pre-training for speech recognition by learning representations of raw audio. wav2vec is trained on large amounts of unlabeled audio data and the resulting representations are then used to improve acoustic model training. We pre-train a simple multi-layer convolutional neural network optimized via a noise contrastive binary classification task. Our experiments on WSJ reduce WER of a strong character-based log-mel filterbank baseline by up to 36% when only a few hours of transcribed data is available. Our approach achieves 2.43% WER on the nov92 test set. This outperforms Deep Speech 2, the best reported character-based system in the literature while using two orders of magnitude less labeled training data.
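The "noise contrastive binary classification task" mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a simplified objective in which a context vector should score a true future sample high and distractor samples low, with each score passed through a sigmoid. The function names, the `neg_weight` parameter, and the toy vectors are hypothetical, and any learned projection between context and sample is omitted.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    # Branch on the sign so exp() never receives a large positive argument.
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def contrastive_binary_loss(context, positive, negatives, neg_weight=1.0):
    """Contrastive loss for one context vector (illustrative assumption).

    The true sample is classified as "real" and each distractor as "noise".
    """
    # Reward a high score for the true future sample.
    loss = -log_sigmoid(dot(context, positive))
    # Penalize high scores for the distractors.
    for z in negatives:
        loss -= neg_weight * log_sigmoid(-dot(context, z))
    return loss


# Toy vectors (hypothetical values, not real audio features).
context = [1.0, 0.5]
aligned = [2.0, 1.0]  # points in the same direction as the context
distractors = [[-1.0, 0.0], [0.0, -2.0]]

# Loss when the aligned vector is treated as the true sample.
good = contrastive_binary_loss(context, aligned, distractors)
# Loss when a distractor is wrongly treated as the true sample.
bad = contrastive_binary_loss(context, distractors[0], [aligned, distractors[1]])
print(good < bad)
```

In this toy setup the loss is lower when the aligned vector is the positive, which is the behavior such an objective is meant to reward during pre-training.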

Keywords:  

Author(s) Name:  Steffen Schneider, Alexei Baevski, Ronan Collobert, Michael Auli

Journal name:  Computer Science

Conference name:  

Publisher name:  arXiv:1904.05862

DOI:  10.48550/arXiv.1904.05862

Volume Information: