Pretrained word embeddings are an emerging area of natural language processing. They use self-supervised learning to learn contextual word representations from large-scale corpora, and they are a form of transfer learning. Their significance is that they yield strong performance on NLP tasks and capture both the semantic and syntactic meaning of a word learned from large amounts of training data. Pretrained models also provide better model initialization, universal language representations, and a regularizing effect that helps avoid overfitting. Pretrained word embeddings can be classified as word-level or character-level. Word2Vec and GloVe are the most popular word-level pretrained word embeddings.
Word2Vec is a shallow neural network architecture for constructing word embeddings from a text corpus. It comprises two models: the Continuous Bag-of-Words (CBOW) model, which takes the context surrounding each word as input and predicts the word in the middle, and the Continuous Skip-Gram model, which does the reverse of CBOW, predicting the surrounding context words given the target word. GloVe generates word embeddings by aggregating global word-word co-occurrence statistics from a given corpus. Character-level embedding models include ELMo, which captures latent syntactic and semantic information based on the concept of contextual string embeddings, and Flair embeddings, which map a sequence of words to a corresponding sequence of contextual vectors. Application areas of pretrained word embeddings include document search, information retrieval, survey-response and comment analysis, recommendation engines, sentiment analysis, text classification, and more.
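To make the CBOW/skip-gram distinction concrete, the sketch below (a minimal illustration, not any particular library's API; the function name and window size are assumptions) generates the (target, context) training pairs that the skip-gram model learns from. CBOW would consume the same windows in the opposite direction, predicting the target from its context words.

```python
def skip_gram_pairs(tokens, window=2):
    """Generate (target, context) training pairs for skip-gram.

    Each word is paired with every word within `window` positions
    on either side; the model is then trained to predict the
    context word from the target word (the reverse of CBOW).
    """
    pairs = []
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs

sentence = "the quick brown fox jumps".split()
print(skip_gram_pairs(sentence, window=1))
# e.g. ('quick', 'the') and ('quick', 'brown') both appear as pairs
```

With window=1, each interior word yields two pairs, so the training set grows roughly linearly with corpus size times window width.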
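The "global co-occurrence statistics" that GloVe aggregates can likewise be sketched in a few lines. This is an illustrative sketch of the counting step only (the function name is an assumption); the original GloVe formulation weights each co-occurrence by the inverse of the distance between the two words, which is reproduced here, and then factorizes the resulting matrix into word vectors.

```python
from collections import defaultdict

def cooccurrence_counts(tokens, window=2):
    """Accumulate distance-weighted word-word co-occurrence counts.

    Each pair of words appearing within `window` positions of each
    other contributes 1/distance to its (symmetric) count, as in the
    original GloVe weighting scheme.
    """
    counts = defaultdict(float)
    for i, word in enumerate(tokens):
        for j in range(max(0, i - window), i):
            weight = 1.0 / (i - j)
            counts[(word, tokens[j])] += weight
            counts[(tokens[j], word)] += weight
    return counts

counts = cooccurrence_counts("a b a".split(), window=2)
print(dict(counts))
```

Over a real corpus this table is large and sparse; GloVe learns vectors whose dot products approximate the logarithms of these counts.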