Research Area:  Machine Learning
Multimodal dialog systems have attracted increasing attention from both academia and industry in recent years. Although existing methods have made some progress, they still face challenges in question understanding (i.e., user intention comprehension). In this paper, we present a relational graph-based context-aware question understanding scheme, which enhances user intention comprehension from local to global. Specifically, we first utilize multiple attribute matrices as guidance information to fully exploit the product-related keywords in each textual sentence, strengthening the local representation of user intentions. Afterwards, we design a sparse graph attention network to adaptively aggregate effective context information for each utterance, so as to understand user intentions comprehensively from a global perspective. Moreover, extensive experiments on a benchmark dataset show the superiority of our model over several state-of-the-art baselines.
Keywords:  
Multimodal Dialog System
Relational Graph
Context-aware Question Understanding
Sparse Graph Attention Network
Machine Learning
Author(s) Name:  Haoyu Zhang, Meng Liu, Zan Gao, Xiaoqiang Lei, Yinglong Wang, Liqiang Nie
Journal name:  
Conference name:  MM '21: Proceedings of the 29th ACM International Conference on Multimedia
Publisher name:  ACM
DOI:  10.1145/3474085.3475234
Volume Information:  Pages 695–703
Paper Link:   https://dl.acm.org/doi/abs/10.1145/3474085.3475234