Analyzing acoustic word embeddings from pre-trained self-supervised speech models - 2023

Research Paper on Analyzing acoustic word embeddings from pre-trained self-supervised speech models

Research Area:  Machine Learning

Abstract:

Given the strong results of self-supervised models on various tasks, there have been surprisingly few studies exploring self-supervised representations for acoustic word embeddings (AWEs), fixed-dimensional vectors representing variable-length spoken word segments. In this work, we study several pre-trained models and pooling methods for constructing AWEs with self-supervised representations. Owing to the contextualized nature of self-supervised representations, we hypothesize that simple pooling methods, such as averaging, might already be useful for constructing AWEs. When evaluating on a standard word discrimination task, we find that HuBERT representations with mean-pooling rival the state of the art on English AWEs. More surprisingly, despite being trained only on English, HuBERT representations evaluated on Xitsonga, Mandarin, and French consistently outperform the multilingual model XLSR-53 (as well as wav2vec 2.0 trained on English).

Keywords:  

Author(s) Name:  Ramon Sanabria, Hao Tang, Sharon Goldwater

Journal name:  

Conference name:  ICASSP 2023 - IEEE International Conference on Acoustics, Speech and Signal Processing

Publisher name:  IEEE

DOI:  10.1109/ICASSP49357.2023.10096099

Volume Information:  2023