Adversarial Multimodal Network for Movie Story Question Answering - 2020


Research Area:  Machine Learning

Abstract:

Visual question answering by using information from multiple modalities has attracted more and more attention in recent years. However, it is a very challenging task, as the visual content and natural language have quite different statistical properties. In this work, we present a method called Adversarial Multimodal Network (AMN) to better understand video stories for question answering. In AMN, we propose to learn multimodal feature representations by finding a more coherent subspace for video clips and the corresponding texts (e.g., subtitles and questions) based on generative adversarial networks. Moreover, a self-attention mechanism is developed to enforce our newly introduced consistency constraint in order to preserve the self-correlation between the visual cues of the original video clips in the learned multimodal representations. Extensive experiments on the benchmark MovieQA and TVQA datasets show the effectiveness of our proposed AMN over other published state-of-the-art methods.
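The abstract does not give the formula for the consistency constraint, so the following is only a minimal sketch of one way such a constraint could look. It assumes the self-correlation of a clip sequence is a scaled dot-product self-attention map, and that the constraint penalizes the mean squared difference between the map computed on the original visual features and the map computed on the learned multimodal representations. The function names (`self_correlation`, `consistency_loss`) and the toy vectors are hypothetical and are not taken from the paper.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_correlation(features):
    """Self-attention map: row-wise softmax of scaled pairwise dot products.

    `features` is a list of n clip vectors; the result is an n x n matrix.
    """
    scale = math.sqrt(len(features[0]))
    return [
        softmax([sum(a * b for a, b in zip(u, v)) / scale for v in features])
        for u in features
    ]


def consistency_loss(visual, multimodal):
    """Mean squared difference between the two self-attention maps.

    Assumed form of the constraint; the two inputs may have different
    feature dimensions because only the n x n maps are compared.
    """
    vis_map = self_correlation(visual)
    mm_map = self_correlation(multimodal)
    n = len(vis_map)
    squared = sum(
        (x - y) ** 2
        for vis_row, mm_row in zip(vis_map, mm_map)
        for x, y in zip(vis_row, mm_row)
    )
    return squared / (n * n)


# Toy example: three clips with 2-dimensional features.
visual = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
collapsed = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]  # all clips look alike

loss_same = consistency_loss(visual, visual)
loss_collapsed = consistency_loss(visual, collapsed)
```

In this sketch the loss is zero when the learned representations reproduce the visual self-correlation exactly, and it grows when the representations collapse the differences between clips. In a full model a term of this kind would be added to the adversarial and answering objectives; how the paper weights or combines them is not stated in the abstract.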

Keywords:  
Knowledge discovery
Motion pictures
Visualization
Task analysis
Generators
Generative adversarial networks
Natural languages

Author(s) Name:  Zhaoquan Yuan; Siyuan Sun; Lixin Duan

Journal name:   IEEE Transactions on Multimedia

Conference name:  

Publisher name:  IEEE

DOI:  10.1109/TMM.2020.3002667

Volume Information:  Volume 23