Research Area:  Machine Learning
Automatic image annotation, automatic image tagging, and linguistic indexing of images rely on methodologies that overlap significantly. In this paper, we use the general term image captioning to refer to all such functions. Image captioning is the process of automatically generating metadata in the form of captions, i.e., sentences that describe the content of an image. It is used in image retrieval systems to locate images of interest in a database, on the web, or on personal devices. In recent years, investigators have applied Deep Learning to image captioning with some success. However, the reported results suffer from a number of deficiencies, namely limited accuracy and a lack of diversity and emotion in the resulting captions. To address some of these deficiencies, we propose to use generative adversarial models to produce new, combinatorial samples. More specifically, we propose to explore various autoencoders to generate more accurate and meaningful captions for images. Autoencoders are neural networks that learn data codings in an unsupervised manner. The research outlined in this paper is an ongoing investigative research project.
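To illustrate the autoencoder idea mentioned in the abstract, the following is a minimal sketch (not the authors' model): a single linear "code" layer learns a compressed representation of its input in an unsupervised way by minimizing reconstruction error. All dimensions, data, and hyperparameters here are hypothetical choices for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 200 samples with 16 features, compressed to a 4-d code.
n_samples, n_features, code_dim = 200, 16, 4
X = rng.normal(size=(n_samples, n_features))

# Encoder and decoder weights, initialized small.
W_enc = rng.normal(scale=0.1, size=(n_features, code_dim))
W_dec = rng.normal(scale=0.1, size=(code_dim, n_features))
lr = 0.01

def reconstruction_loss(X, W_enc, W_dec):
    code = X @ W_enc       # encode: project input to low-dimensional code
    X_hat = code @ W_dec   # decode: reconstruct the input from the code
    return ((X - X_hat) ** 2).mean()

initial = reconstruction_loss(X, W_enc, W_dec)

# Plain gradient descent on the mean squared reconstruction error.
for _ in range(500):
    code = X @ W_enc
    X_hat = code @ W_dec
    err = X_hat - X                                 # reconstruction error
    grad_dec = code.T @ err / n_samples             # gradient w.r.t. W_dec
    grad_enc = X.T @ (err @ W_dec.T) / n_samples    # gradient w.r.t. W_enc
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc

final = reconstruction_loss(X, W_enc, W_dec)
print(initial, final)  # the reconstruction error should decrease with training
```

This linear toy captures only the unsupervised-coding aspect; captioning models pair such an encoder with a language-generating decoder rather than a simple reconstruction layer.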
Keywords:  
Author(s) Name:  Soheyla Amirian; Khaled Rasheed; Thiab R. Taha; Hamid R. Arabnia
Journal name:  
Conference name:  International Conference on Computational Science and Computational Intelligence (CSCI)
Publisher name:  IEEE
DOI:  10.1109/CSCI49370.2019.00055
Volume Information:  
Paper Link:  https://ieeexplore.ieee.org/abstract/document/9071372