Word Embeddings and Pre-trained Language Models - S-Logix

Research Area: Machine Learning

Abstract:

Early word embedding algorithms such as word2vec and GloVe generate static distributional representations: each word receives a single vector regardless of the context and the sense in which it is used in a given sentence. As a result, they model ambiguous words poorly and lack coverage for out-of-vocabulary words. Hence a new wave of algorithms based on training language models, such as OpenAI GPT and BERT, has been proposed to generate contextual word embeddings. These models take word constituents as input, which allows them to generate representations for out-of-vocabulary words by combining the word pieces. Recently, fine-tuning pre-trained language models that have been trained on large corpora has consistently advanced the state of the art for many NLP tasks.
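
The idea of representing an out-of-vocabulary word by combining its word pieces can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy vocabulary, the two-dimensional vectors, and the helper names `wordpiece_tokenize` and `embed_word` are hypothetical and do not come from the chapter. The split uses a greedy longest-match rule with `##` marking continuation pieces, and the pieces are combined by simple averaging, which is only one possible pooling choice. The vectors here are static, so the sketch shows only the composition step; in models such as BERT the piece vectors are themselves computed from the surrounding sentence.

```python
# Toy illustration only: vocabulary, vectors, and helper names are hypothetical.
TOY_VOCAB = {"play", "##ing", "##ed", "un", "##believ", "##able", "[UNK]"}

TOY_VECTORS = {
    "play": [2.0, 0.0],
    "##ing": [0.0, 2.0],
    "##ed": [0.0, 1.0],
    "un": [1.0, 0.0],
    "##believ": [0.0, 1.0],
    "##able": [1.0, 1.0],
    "[UNK]": [0.0, 0.0],
}


def wordpiece_tokenize(word, vocab):
    """Split a word into vocabulary pieces, greedy longest match first."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        piece = None
        # Shrink the candidate from the right until it is in the vocabulary.
        while end > start:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation marker
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            # No piece matches at this position: the whole word is unknown.
            return ["[UNK]"]
        pieces.append(piece)
        start = end
    return pieces


def embed_word(word, vocab, vectors):
    """Average the vectors of a word's pieces (an assumed pooling choice)."""
    pieces = wordpiece_tokenize(word, vocab)
    dim = len(next(iter(vectors.values())))
    total = [0.0] * dim
    for piece in pieces:
        for i, value in enumerate(vectors[piece]):
            total[i] += value
    return [t / len(pieces) for t in total]


if __name__ == "__main__":
    # "unbelievable" is not a vocabulary entry, but its pieces are.
    print(wordpiece_tokenize("unbelievable", TOY_VOCAB))
    print(embed_word("playing", TOY_VOCAB, TOY_VECTORS))
```

Because the word is assembled from known pieces, a word that never appears as a whole in the vocabulary still gets a vector, whereas a static word-level lookup table would have no entry for it.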

Keywords:

Author(s) Name: Jose Manuel Gomez-Perez, Ronald Denaux, Andres Garcia-Silva

Journal name: A Practical Guide to Hybrid Natural Language Processing-Book

Conference name:

Publisher name: Springer Nature Switzerland

DOI: https://doi.org/10.1007/978-3-030-44830-1_3

Volume Information: pp. 17-31

Paper Link: https://link.springer.com/chapter/10.1007%2F978-3-030-44830-1_3

Understanding Word Embeddings and Language Models - 2020
