Image Captioning: A Comprehensive Survey

Research Area: Machine Learning

Abstract:

The primary purpose of image captioning is to generate a natural-language caption for an image. A captioning system must identify the objects in the image, their actions, the relationships among them, and salient features that may be missing from the image. After identification, the next step is to generate a brief, relevant description of the image that is syntactically and semantically correct. Image captioning therefore draws on computer vision for identifying objects and on natural language processing for producing the description. It is difficult for a machine to imitate the abilities of the human brain; however, research in this field has made considerable progress. Deep learning techniques, using convolutional neural networks (CNNs) and long short-term memory (LSTM) networks, are capable of handling such problems. Image captioning can be used in many intelligent control systems and IoT-based devices. In this survey paper, we present different approaches to image captioning (retrieval-based, template-based, and deep-learning-based) as well as different evaluation techniques.
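
As a rough illustration of the template-based approach named in the abstract, the sketch below fills a fixed sentence template from detections that an upstream vision model is assumed to have produced already. The `Detection` fields, the `article` helper, and the template wording are hypothetical choices made for this example; they are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Detection:
    """Hypothetical output of the identification step (assumed, not from the paper)."""
    subject: str                  # main object, e.g. "dog" (assumed non-empty)
    action: Optional[str] = None  # e.g. "chasing"
    obj: Optional[str] = None     # object of the action, e.g. "ball"
    scene: Optional[str] = None   # e.g. "park"


def article(noun: str) -> str:
    """Pick an indefinite article with a simple first-letter vowel heuristic."""
    return "an" if noun[0].lower() in "aeiou" else "a"


def fill_template(d: Detection) -> str:
    """Fill the fixed template '<subject> is <action> <object> in the <scene>.'"""
    parts = [article(d.subject).capitalize(), d.subject, "is"]
    if d.action:
        parts.append(d.action)
        if d.obj:
            parts += [article(d.obj), d.obj]
    else:
        # No action detected: fall back to a neutral verb.
        parts.append("visible")
    if d.scene:
        parts += ["in", "the", d.scene]
    return " ".join(parts) + "."


caption = fill_template(Detection("dog", "chasing", "ball", "park"))
print(caption)
```

A fixed template keeps the output grammatical by construction, but every caption shares the same sentence shape. The deep-learning approaches the survey covers generate the sentence word by word instead.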

Keywords:

Author(s) Name: Himanshu Sharma; Manmohan Agrahari; Sujeet Kumar Singh; Mohd Firoj; Ravi Kumar Mishra

Journal name:

Conference name: International Conference on Power Electronics & IoT Applications in Renewable Energy and its Control (PARC)

Publisher name: IEEE

DOI: 10.1109/PARC49193.2020.236619

Volume Information:

Paper Link: https://ieeexplore.ieee.org/abstract/document/9087226
