Research Area:  Machine Learning
Multimodal Summarization (MS) has attracted research interest in the past few years due to the ease with which users perceive multimodal summaries. It is important for MS models to consider the topic to which the target content belongs. In the current paper, we propose a topic-aware MS system which performs two tasks simultaneously: differentiating the images into "on-topic" and "off-topic" categories and further utilizing the "on-topic" images to generate multimodal summaries. The hypothesis is that the proposed topic similarity classifier will help in generating better multimodal summaries by focusing on important components of images and text which are specific to a particular topic. To develop the topic similarity classifier, we have augmented the existing popular MS data set, MSMO, with similar "on-topic" and dissimilar "off-topic" images for each sample. Our experimental results establish that the focus on "on-topic" features helps in generating topic-aware multimodal summaries, which outperform the state-of-the-art approach by 1.7% on the ROUGE-L metric.
Keywords:  
multimodal summarization
target content
topic-aware
similarity classifier
Author(s) Name:  Sourajit Mukherjee, Anubhav Jangra, Sriparna Saha, Adam Jatowt
Journal name:  Findings of the Association for Computational Linguistics: AACL-IJCNLP 2022
Conference name:  
Publisher name:  aclanthology
DOI:  
Volume Information:  volume:2022
Paper Link:   https://aclanthology.org/2022.findings-aacl.36/