A Survey on Various Deep Learning Models for Automatic Image Captioning

Research Area: Machine Learning

Abstract:

Automatic image captioning is the generation of a caption for an image by a machine. Captioning is performed by recognizing the objects in an image, their attributes, and the relationships between them. The task involves computer vision for image understanding, natural language processing for syntax and semantics, and machine learning for caption generation. Typically, a CNN is used to extract the features of an image and an RNN is used to generate the sentence. Earlier, traditional machine learning approaches were used for this purpose, in which features are extracted from the input data by hand. Extracting handcrafted features from a large dataset is neither easy nor feasible. Later, various deep learning-based approaches were proposed. Within deep learning, retrieval-based and template-based methods were proposed, but they suffered from missing important objects and from fixed-length captions, respectively. End-to-end learning approaches based on deep networks then came into existence, and the image captioning task became more efficient. The objective of this paper is to study and compare various end-to-end learning-based frameworks for image captioning using standard evaluation metrics, and to understand how these frameworks can be used in various research applications. Along with the comparison, future challenges are also discussed.

Keywords:

Author(s) Name: Gaurav and Pratistha Mathur

Journal name:

Conference name: Journal of Physics: Conference Series

Publisher name: IOP

DOI: 10.1088/1742-6596/1950/1/012045

Volume Information:

Paper Link: https://iopscience.iop.org/article/10.1088/1742-6596/1950/1/012045/meta


A Survey on Various Deep Learning Models for Automatic Image Captioning - 2021
