Multimodal Deep Learning for Mental Disorders Prediction from Audio Speech Samples - 2019

Research Area:  Machine Learning

Abstract:

Key features of mental illnesses are reflected in speech. Our research focuses on designing a multimodal deep learning structure that automatically extracts salient features from recorded speech samples for predicting various mental disorders including depression, bipolar, and schizophrenia. We adopt a variety of pre-trained models to extract embeddings from both audio and text segments. We use several state-of-the-art embedding techniques including BERT, FastText, and Doc2VecC for the text representation learning and WaveNet and VGG-ish models for audio encoding. We also leverage huge auxiliary emotion-labeled text and audio corpora to train emotion-specific embeddings and use transfer learning in order to address the problem of insufficient annotated multimodal data available. All these embeddings are then combined into a joint representation in a multimodal fusion layer and finally a recurrent neural network is used to predict the mental disorder. Our results show that mental disorders can be predicted with acceptable accuracy through multimodal analysis of clinical interviews.
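The pipeline described in the abstract (per-segment text and audio embeddings, a fusion step, then a recurrent network that outputs a prediction) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. Several things here are assumptions: fusion is done by simple concatenation, the recurrent network is a plain Elman RNN, the weights are random and untrained, and the embedding sizes and the names `fuse`, `predict`, and `LABELS` are hypothetical. In the paper the embeddings come from pre-trained models such as BERT and WaveNet; here they are replaced by random toy vectors.

```python
import math
import random

# Classes named in the abstract; the label strings themselves are illustrative.
LABELS = ["depression", "bipolar", "schizophrenia"]


def init_matrix(rows, cols, rng, scale=0.1):
    """Random weight matrix as a list of rows (untrained, illustration only)."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def fuse(text_emb, audio_emb):
    """Assumed fusion step: concatenate both embeddings into one joint vector."""
    return list(text_emb) + list(audio_emb)


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(text_seq, audio_seq, hidden_size=8, seed=0):
    """Run a simple Elman RNN over fused segments; return class probabilities."""
    if not text_seq or len(text_seq) != len(audio_seq):
        raise ValueError("need non-empty, segment-aligned text and audio sequences")
    rng = random.Random(seed)
    fused = [fuse(t, a) for t, a in zip(text_seq, audio_seq)]
    w_in = init_matrix(hidden_size, len(fused[0]), rng)
    w_rec = init_matrix(hidden_size, hidden_size, rng)
    w_out = init_matrix(len(LABELS), hidden_size, rng)
    hidden = [0.0] * hidden_size
    for x in fused:
        # h_t = tanh(W_in x_t + W_rec h_{t-1})
        pre = [a + b for a, b in zip(matvec(w_in, x), matvec(w_rec, hidden))]
        hidden = [math.tanh(p) for p in pre]
    # Classify from the final hidden state.
    return dict(zip(LABELS, softmax(matvec(w_out, hidden))))


# Toy input: 4 interview segments, 5-dim text and 3-dim audio embeddings
# (made-up sizes standing in for real pre-trained embeddings).
toy_rng = random.Random(1)
text_seq = [[toy_rng.gauss(0, 1) for _ in range(5)] for _ in range(4)]
audio_seq = [[toy_rng.gauss(0, 1) for _ in range(3)] for _ in range(4)]
probs = predict(text_seq, audio_seq)
print(probs)
```

Because the weights are untrained, the printed probabilities carry no meaning; the sketch only shows how data would flow from two modalities through a joint representation into a sequence model.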

Keywords:  

Author(s) Name:  Habibeh Naderi, Behrouz Haji Soleimani, Stan Matwin

Journal name:  Computer Science

Conference name:  

Publisher name:  arXiv:1909.01067

DOI:  10.48550/arXiv.1909.01067

Volume Information: