
A Multi-layer Bidirectional Transformer Encoder for Pre-trained Word Embedding: A Survey of BERT - 2020

Research Area:  Machine Learning

Abstract:

Language modeling is the task of assigning a probability distribution over sequences of words that matches the distribution of a language. A language model is required to represent text in a form understandable to a machine, and it can predict the probability of a word occurring in its surrounding context. In the existing research, most language models are based on unidirectional training. In this paper, we investigate a bidirectionally trained model, BERT (Bidirectional Encoder Representations from Transformers). BERT builds on the bidirectional idea, in contrast to other word embedding models (such as ELMo), and uses the comparatively new transformer encoder-based architecture to compute word embeddings. This paper describes how the model achieves state-of-the-art results on various NLP tasks. BERT can be trained bidirectionally over a large corpus, whereas the existing methods rely on unidirectional training (either left-to-right or right-to-left). This bidirectionality of the language model yields better results in context-related classification tasks in which words are used as input vectors. Additionally, BERT is designed for multi-task learning on context-related datasets, so it can perform different NLP tasks simultaneously. This survey focuses on a detailed description of the BERT-based technique for word embedding, its architecture, and the importance of this model for pre-training on a large corpus.
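As an illustration of how a pre-trained multi-layer bidirectional transformer encoder is typically used to compute contextual word embeddings, the minimal sketch below loads a BERT checkpoint and extracts per-token vectors. The Hugging Face transformers library, the bert-base-uncased model name, and the example sentence are assumptions for illustration only; they are not drawn from the surveyed paper.

# Minimal sketch (assumed setup, not from the paper): contextual word embeddings
# from a pre-trained BERT encoder via the Hugging Face "transformers" library.
import torch
from transformers import BertTokenizer, BertModel

# Load the pre-trained tokenizer and the bidirectional transformer encoder.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

sentence = "The bank raised interest rates."
# Tokenize, add the [CLS]/[SEP] markers, and return PyTorch tensors.
inputs = tokenizer(sentence, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# last_hidden_state has shape (batch, tokens, hidden_size): one contextual
# embedding per token, produced by the multi-layer bidirectional encoder.
token_embeddings = outputs.last_hidden_state
print(token_embeddings.shape)

Because the encoder attends to both the left and right context of every token, the vector for "bank" in this sentence differs from the vector the same word would receive in, say, "We sat on the river bank," which is the property the abstract highlights over unidirectional models.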

Keywords:  

Author(s) Name:  Rohit Kumar Kaliyar

Journal name:  

Conference name:  10th International Conference on Cloud Computing, Data Science & Engineering (Confluence)

Publisher name:  IEEE

DOI:  10.1109/Confluence47617.2020.9058044

Volume Information: