Image Captioning using Deep Learning: Text Augmentation By Paraphrasing Via Backtranslation - 2021

Research Area:  Machine Learning

Abstract:

Image captioning, an exciting but challenging area of research at the intersection of natural language processing and computer vision, has recently seen dramatic progress. Much of the progress has been due to advances in machine learning and the availability of suitable datasets consisting of images, each with multiple captions. However, there has been little research on the impact of varying the number of captions on the quality of image captioning results. In particular, does it improve image captioning performance if we increase the number of captions in a dataset by augmenting the existing captions with paraphrases created using backtranslation? In this paper, we address this question and make the following three main contributions. First, we present a novel method of adding captions to a dataset by means of paraphrasing via backtranslation. Backtranslation amounts, in our case, to translating an English caption to a French counterpart which is then translated back to English. Typically, this results in the backtranslated English caption being different from the original English caption. Such backtranslated English captions can thus be used to augment the original dataset. Second, using our method, we generate paraphrases for the seminal MS COCO image captioning dataset. Our paraphrase dataset consists of around 490 000 paraphrased captions that are different from the original captions in MS COCO. Third, in image captioning experiments with deep learning models, we use datasets augmented with backtranslated paraphrases. We study the extent to which augmenting image captioning training datasets with paraphrases improves image captioning models' performance, and report promising results. Code and data are publicly available.
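
The backtranslation step described in the abstract (English to French, then back to English, keeping only paraphrases that differ from the original) can be sketched roughly as follows. This is a minimal illustration and not the authors' released code: the function names, the `normalize` comparison rule, and the dictionary-based toy translators are all assumptions made for this example. In practice the two translator callables would wrap a real machine translation system.

```python
from typing import Callable, Dict, List

# A translator is any function mapping a sentence to its translation.
Translator = Callable[[str], str]


def backtranslate(caption: str, en_to_fr: Translator, fr_to_en: Translator) -> str:
    """Translate an English caption to French and back to English."""
    return fr_to_en(en_to_fr(caption))


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace (an assumed notion of 'same caption')."""
    return " ".join(text.lower().split())


def augment_captions(
    captions: Dict[str, List[str]],
    en_to_fr: Translator,
    fr_to_en: Translator,
) -> Dict[str, List[str]]:
    """Return a copy of `captions` with distinct backtranslated paraphrases appended."""
    augmented: Dict[str, List[str]] = {}
    for image_id, originals in captions.items():
        seen = {normalize(c) for c in originals}
        result = list(originals)
        for caption in originals:
            paraphrase = backtranslate(caption, en_to_fr, fr_to_en)
            key = normalize(paraphrase)
            # Keep the paraphrase only if it differs from every caption seen so far.
            if key not in seen:
                seen.add(key)
                result.append(paraphrase)
        augmented[image_id] = result
    return augmented


# Hypothetical toy translators: lookup tables standing in for a real MT system.
_EN_FR = {
    "a man rides a bike": "un homme fait du vélo",
    "a dog on the grass": "un chien sur l'herbe",
}
_FR_EN = {
    "un homme fait du vélo": "a man is cycling",
    "un chien sur l'herbe": "a dog on the grass",
}


def toy_en_to_fr(text: str) -> str:
    return _EN_FR.get(text, text)


def toy_fr_to_en(text: str) -> str:
    return _FR_EN.get(text, text)


captions = {"img1": ["a man rides a bike"], "img2": ["a dog on the grass"]}
augmented = augment_captions(captions, toy_en_to_fr, toy_fr_to_en)
```

In this toy run, the first caption gains a paraphrase because its round trip changes the wording, while the second does not because its round trip returns the original sentence unchanged.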

Author(s) Name:  Ingrid Ravn Turkerud; Ole Jakob Mengshoel

Conference name:  IEEE Symposium Series on Computational Intelligence (SSCI)

Publisher name:  IEEE

DOI:  10.1109/SSCI50451.2021.9659834
