An analysis of graph convolutional networks and recent datasets for visual question answering - 2022

Research Area:  Machine Learning

Abstract:

Graph neural networks are a deep learning approach that has recently been widely applied to structural and non-structural scenarios owing to their strong performance and interpretability. In non-structural scenarios, textual and visual research topics such as visual question answering (VQA) are important and require graph reasoning models. VQA aims to build a system that can answer questions about a given image and understand the semantic meaning behind it. The critical issues in VQA are to extract the visual and textual features effectively and to project both into a common space. These issues have a great impact on handling goal-driven, reasoning, and scene classification subtasks. In the same vein, it is difficult to compare model performance because most existing datasets do not group instances into meaningful categories. With the recent advances in graph-based models, considerable effort has been devoted to solving the problems mentioned above. This study focuses on graph convolutional network (GCN) studies and recent datasets for visual question answering tasks. Specifically, we review current related studies on GCN for the VQA task. In addition, 18 common and recent datasets for VQA are studied, though not all of them are discussed at the same level of detail. A critical review of GCN, datasets, and VQA challenges is also provided. Finally, this study will help researchers choose a suitable dataset for a particular VQA subtask, identify VQA challenges and the pros and cons of existing approaches, and further improve GCN for VQA.

Keywords:  
Computer vision
NLP
visual question answering (VQA)
Graph convolutional networks (GCN)
Datasets
Machine Learning
Deep Learning

Author(s) Name:  Abdulganiyu Abdu Yusuf, Feng Chong & Mao Xianling

Journal name:  Artificial Intelligence Review

Conference name:  

Publisher name:  Springer

DOI:  10.1007/s10462-022-10151-2

Volume Information:  Volume 55, pages 6277–6300 (2022)