Trending Research Topics in Multimodal Summarization

Leveraging multimodal content for podcast summarization - 2022

Research Area: Machine Learning

Abstract:

Podcasts are becoming an increasingly popular way to share streaming audio content. Podcast summarization aims at improving the accessibility of podcast content by automatically generating a concise summary consisting of text/audio extracts. Existing approaches either extract short audio snippets by means of speech summarization techniques or produce abstractive summaries of the speech transcription, disregarding the podcast audio. To leverage the multimodal information hidden in podcast episodes, we propose an end-to-end architecture for extractive summarization that encodes both acoustic and textual contents. It learns how to attend to relevant multimodal features using an ad hoc, deep feature fusion network. The experimental results achieved on a real benchmark dataset show the benefits of integrating audio encodings into the extractive summarization process. The quality of the generated summaries is superior to that achieved by existing extractive methods.
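
The general idea of attending over text and audio features before extractive selection can be illustrated with a small sketch. This is not the authors' architecture: the function names (fuse, extract_summary), the fixed query and scorer vectors, and the toy embeddings are all hypothetical stand-ins for what a trained model would learn from data. It assumes each sentence already has one text vector and one audio vector of the same length.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def fuse(text_vec, audio_vec, query):
    """Attention-weighted mix of one sentence's text and audio vectors.

    `query` is a hypothetical stand-in for a learned attention parameter.
    """
    weights = softmax([dot(text_vec, query), dot(audio_vec, query)])
    fused = [weights[0] * t + weights[1] * a
             for t, a in zip(text_vec, audio_vec)]
    return fused, weights

def extract_summary(sentences, text_vecs, audio_vecs, query, scorer, k=2):
    """Score each fused sentence vector and keep the top-k in original order.

    `scorer` is a hypothetical stand-in for a learned relevance layer.
    """
    scored = []
    for i, (t, a) in enumerate(zip(text_vecs, audio_vecs)):
        fused, _ = fuse(t, a, query)
        scored.append((dot(fused, scorer), i))
    top = sorted(scored, reverse=True)[:k]
    keep = sorted(i for _, i in top)
    return [sentences[i] for i in keep]

# Toy inputs (assumed values, not taken from any dataset).
sentences = ["a", "b", "c"]
text_vecs = [[1.0, 0.0], [0.0, 0.0], [1.0, 1.0]]
audio_vecs = [[0.0, 1.0], [0.0, 1.0], [1.0, 1.0]]
summary = extract_summary(sentences, text_vecs, audio_vecs,
                          query=[1.0, 0.0], scorer=[1.0, 0.0], k=2)
```

In a real system the per-sentence vectors would come from trained text and speech encoders, and the attention and scoring parameters would be learned end to end rather than fixed by hand.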

Keywords:
podcasts
audio content
text
audio
speech transcription
multimodal features
fusion network

Author(s) Name: Lorenzo Vaiani, Moreno La Quatra, Luca Cagliero, Paolo Garza

Journal name:

Conference name: Proceedings of the 37th ACM/SIGAPP Symposium on Applied Computing

Publisher name: ACM

DOI: https://doi.org/10.1145/3477314.3507106

Volume Information: -

Paper Link: https://dl.acm.org/doi/abs/10.1145/3477314.3507106
