A word embedding is a learned representation of words for text and document analysis, in which words that are close in the vector space are expected to have similar meanings. Words are mapped to vectors of real numbers through language modeling and feature learning techniques, and these vectors capture both semantic and syntactic information. The main goals of word embedding are dimensionality reduction and predicting a word's surrounding context. It is used in many text analysis tasks, chiefly in natural language processing and information retrieval. Word embedding models are commonly classified into two families: predictive models, which predict a target word from its surrounding context (or the context from the target word) using a trained model, and count- or frequency-based models, which derive a word's representation from co-occurrence statistics of words across multiple contexts. Word2vec, doc2vec, TF-IDF (Term Frequency-Inverse Document Frequency), Skip-gram, GloVe (Global Vectors for Word Representation), CBOW (Continuous Bag of Words), BERT (Bidirectional Encoder Representations from Transformers), and the embedding layer are some common word embedding representations. Beyond general text analysis, word embeddings are applied in semantic analysis, syntax analysis, idiomaticity analysis, Part-of-Speech (POS) tagging, sentiment analysis, named entity recognition, textual entailment, and machine translation.
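The count- or frequency-based family described above can be illustrated with a minimal sketch: each word's vector is simply its co-occurrence counts with every vocabulary word inside a fixed context window. The function name, corpus, and window size below are illustrative choices, not part of any particular library; real count-based methods such as GloVe additionally apply weighting and factorization to these raw counts.

```python
from collections import defaultdict

def cooccurrence_vectors(sentences, window=2):
    """Count-based embedding sketch: represent each word by its
    co-occurrence counts with every other vocabulary word, counted
    within a symmetric context window (hypothetical helper)."""
    vocab = sorted({w for s in sentences for w in s})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = {w: [0] * len(vocab) for w in vocab}
    for sent in sentences:
        for i, word in enumerate(sent):
            lo, hi = max(0, i - window), min(len(sent), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][index[sent[j]]] += 1
    return vocab, vectors

# Toy corpus: "cat" and "dog" appear in the same contexts
# ("the" before, "sat" after), so their count vectors coincide,
# which is how adjacency in the vector space reflects similar usage.
corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
vocab, vecs = cooccurrence_vectors(corpus)
```

Predictive models such as Skip-gram and CBOW instead learn dense, low-dimensional vectors by optimizing a prediction objective over the same word-context pairs, rather than counting them directly.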