A Survey of Multimodal Large Language Model from A Data-centric Perspective - 2024

Research Paper on A Survey of Multimodal Large Language Model from A Data-centric Perspective

Research Area:  Machine Learning

Abstract:

Multimodal large language models (MLLMs) enhance the capabilities of standard large language models by integrating and processing data from multiple modalities, including text, vision, audio, video, and 3D environments. Data plays a pivotal role in the development and refinement of these models. In this survey, we comprehensively review the literature on MLLMs from a data-centric perspective. Specifically, we explore methods for preparing multimodal data during the pretraining and adaptation phases of MLLMs. Additionally, we analyze the evaluation methods for the datasets and review the benchmarks for evaluating MLLMs. Our survey also outlines potential future research directions. This work aims to provide researchers with a detailed understanding of the data-driven aspects of MLLMs, fostering further exploration and innovation in this field.

Keywords:  

Author(s) Name:  Tianyi Bai, Hao Liang, Binwang Wan, Yanran Xu, Xi Li, Shiyu Li, Ling Yang, Bozhou Li, Yifan Wang, Bin Cui, Ping Huang, Jiulong Shan, Conghui He, Binhang Yuan, Wentao Zhang

Journal name:  Artificial Intelligence

Conference name:

Publisher name:  arXiv

DOI:  10.48550/arXiv.2405.16640

Volume Information:  Volume 11 (2024)