Research Area:  Machine Learning
Image captioning is a process of automatically describing an image with one or more natural language sentences. In recent years, image captioning has witnessed rapid progress, from initial template-based models to the current ones, based on deep neural networks. This paper gives an overview of issues and recent image captioning research, with a particular emphasis on models that use the deep encoder-decoder architecture. We discuss the advantages and disadvantages of different approaches, along with reviewing some of the most commonly used evaluation metrics and datasets.
Keywords:  
Author(s) Name:  I. Hrga; M. Ivašić-Kos
Journal name:  
Conference name:  42nd International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO)
Publisher name:  IEEE
DOI:  10.23919/MIPRO.2019.8756821
Volume Information:  
Paper Link:  https://ieeexplore.ieee.org/abstract/document/8756821