Research Topics in Federated Learning for Natural Language Processing

  Natural language processing (NLP) helps machines understand and communicate in human language as it is spoken and written, which includes resolving its ambiguity. Modern NLP relies on deep learning and pre-trained language models, and it faces a data privacy problem because these models are typically trained on large amounts of data collected directly from users. Hence, there is a need to perform NLP tasks on data that is distributed across isolated organizations or users and that cannot be shared because of privacy concerns. Federated learning-based NLP addresses this privacy concern for user data from distributed sources. Federated learning trains a shared model across diverse, decentralized edge devices without the raw local data leaving those devices; only model updates are exchanged.
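  The training loop described above can be illustrated with a minimal sketch. The text does not name an aggregation scheme, so this example assumes federated averaging (FedAvg), one common choice, and uses a toy linear model in place of a real language model. All function names (local_update, federated_average, make_client_data, run_fedavg) and the synthetic data are hypothetical and exist only for illustration.

```python
import random


def local_update(weights, data, lr=0.1, epochs=5):
    """Run a few epochs of SGD on one client's private data for y ~ w*x + b.

    Only the updated weights are returned; the raw data never leaves the client.
    """
    w, b = weights
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return (w, b)


def federated_average(client_weights, client_sizes):
    """Server step: average client weights, weighted by local dataset size."""
    total = sum(client_sizes)
    w = sum(cw[0] * n for cw, n in zip(client_weights, client_sizes)) / total
    b = sum(cw[1] * n for cw, n in zip(client_weights, client_sizes)) / total
    return (w, b)


def make_client_data(n, x_lo, x_hi, rng):
    """Synthetic private data (assumption): noise-free samples of y = 2x + 1."""
    xs = [rng.uniform(x_lo, x_hi) for _ in range(n)]
    return [(x, 2.0 * x + 1.0) for x in xs]


def run_fedavg(rounds=50, seed=0):
    rng = random.Random(seed)
    # Three clients with different sizes and input ranges, a mild form of
    # the non-IID setting.
    clients = [
        make_client_data(20, -1.0, 0.0, rng),
        make_client_data(40, 0.0, 1.0, rng),
        make_client_data(10, -0.5, 0.5, rng),
    ]
    global_weights = (0.0, 0.0)
    for _ in range(rounds):
        # Each client starts from the current global model and trains locally.
        updates = [local_update(global_weights, data) for data in clients]
        # The server sees only the weights, never the (x, y) pairs.
        global_weights = federated_average(updates, [len(d) for d in clients])
    return global_weights


if __name__ == "__main__":
    w, b = run_fedavg()
    print(f"learned w={w:.3f}, b={b:.3f}")
```

  In a real NLP deployment the weights would be the parameters of a neural language model and the clients would be phones or institutions, but the structure of the round (broadcast, local training, weighted aggregation) would stay the same under this assumption.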
  NLP tasks to which federated learning has been applied include language modeling, text classification, speech recognition, sequence tagging, recommendation, health text mining, translation, and summarization. Recent advancements of federated learning in NLP include sentence-level text intent classification, Google keyboard suggestions, and medical named entity recognition. Challenges and future directions for federated learning in NLP include large language models, handling non-independent and identically distributed (non-IID) data, personalized federated learning for NLP, spatial adaptability, the privacy risk posed by reconstruction of the original data, and the computation-communication trade-off.