A Survey on Data Augmentation for Text Classification - 2021

Research Area:  Machine Learning


Data augmentation, the artificial creation of training data for machine learning by transformations, is a widely studied research field across machine learning disciplines. While it is useful for increasing a model's generalization capabilities, it can also address many other challenges and problems, from overcoming a limited amount of training data, to regularizing the objective, to limiting the amount of data used to protect privacy. Based on a precise description of the goals and applications of data augmentation and a taxonomy for existing works, this survey is concerned with data augmentation methods for textual classification and aims to provide a concise and comprehensive overview for researchers and practitioners. Derived from the taxonomy, we divide more than 100 methods into 12 different groupings and give state-of-the-art references expounding which methods are highly promising by relating them to each other. Finally, research perspectives that may constitute a building block for future work are provided.

Keywords:

Data augmentation
Text classification
Neural networks
Natural language processing
Training data
Machine learning

Author(s) Name:  Markus Bayer, Marc-André Kaufhold, Christian Reuter

Journal name:  ACM Computing Surveys

Publisher name:  ACM

DOI:  10.1145/3544558

Volume Information:  Volume 55