Research Area:  Machine Learning
There has been a lot of previous work on speech emotion recognition with machine learning methods. However, most of it relies on the effectiveness of labeled speech data. In this paper, we propose a novel algorithm that combines a sparse autoencoder with an attention mechanism. The aim is to benefit from both labeled and unlabeled data through the autoencoder, and to apply the attention mechanism to focus on speech frames that carry strong emotional information while ignoring frames that do not carry emotional content. The proposed algorithm is evaluated on three public databases in a cross-language setting. Experimental results show that the proposed algorithm provides significantly more accurate predictions compared to existing speech emotion recognition algorithms.
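The idea of attending to emotionally salient frames can be illustrated with a minimal sketch of softmax attention pooling over frame-level features. This is not the authors' implementation: the function name attention_pool, the scoring vector w, and the toy frame values are all hypothetical, and the sparse autoencoder that would produce the frame features is omitted.

```python
import math

def attention_pool(frames, w):
    """Pool frame-level features into one utterance-level vector.

    frames: list of per-frame feature vectors (lists of floats).
    w: hypothetical scoring vector with the same length as one frame.
    Returns (weights, pooled).
    """
    # Score each frame with a dot product against the scoring vector.
    scores = [sum(f_i * w_i for f_i, w_i in zip(frame, w)) for frame in frames]
    # Numerically stable softmax over the frame scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # The weighted sum of frames gives a single utterance-level vector.
    dim = len(frames[0])
    pooled = [sum(a * frame[d] for a, frame in zip(weights, frames))
              for d in range(dim)]
    return weights, pooled

# Toy input (assumed values): the middle frame has the largest score.
frames = [[0.1, 0.0], [2.0, 1.5], [0.2, 0.1]]
w = [1.0, 1.0]
weights, pooled = attention_pool(frames, w)
```

In this sketch, frames with low scores receive weights close to zero, so they contribute little to the pooled vector, which mirrors the abstract's goal of ignoring frames without emotional content.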
Keywords:  
machine learning
speech data
sparse autoencoder
attention mechanism
emotional information
public database
accurate prediction
Author(s) Name:  Ting-Wei Sun, An-Yeu Andy Wu
Journal name:  
Conference name:  2019 IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS)
Publisher name:  IEEE
DOI:  https://doi.org/10.1109/AICAS.2019.8771593
Volume Information:  -