Semantic Similarity, or Semantic Textual Similarity (STS), is a core task in natural language processing (NLP) that measures the degree of semantic equivalence between two blocks of text, scoring how closely sentences or documents are related in meaning. It has a wide range of applications, including information retrieval, text summarization, sentiment analysis, plagiarism detection, information extraction, and biomedical ontologies. The general families of semantic similarity measures are knowledge-based, statistical-based, string-based, and language-model-based similarity.
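Of the families above, string-based similarity is the simplest to illustrate. A minimal sketch, using the Jaccard index over word sets as one representative string-based measure (the function name and tokenization are illustrative choices, not from the original text):

```python
def jaccard_similarity(text_a: str, text_b: str) -> float:
    """String-based similarity: Jaccard index over lowercase word sets."""
    tokens_a = set(text_a.lower().split())
    tokens_b = set(text_b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0  # two empty texts are treated as identical
    # |intersection| / |union| ranges from 0 (disjoint) to 1 (identical sets)
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)
```

For example, "the cat sat" and "the cat ran" share two of four distinct words, giving a score of 0.5. Note that purely string-based measures ignore meaning, which is exactly the gap the knowledge-based and language-model-based families aim to close.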
Recent semantic similarity methods fall into knowledge-based, corpus-based, deep neural network, and hybrid approaches. Deep learning is among the most recently developed methods for STS and provides enhanced performance. The most widely used deep neural network architectures include Convolutional Neural Networks (CNNs), Long Short-Term Memory (LSTM) networks, Bidirectional LSTMs (Bi-LSTMs), recursive Tree-LSTMs, and their combinations. These approaches use deep neural networks to estimate the similarity between word embeddings, with many models built on convolution and pooling operations. The most recent development in deep learning-based semantic similarity is transformer-based models, which comprise encoder and decoder stacks with multi-head attention mechanisms. Applications of deep learning-based semantic similarity include automated short answer grading, machine translation, image captioning, and short-paragraph similarity.
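The common final step shared by these embedding-based methods can be sketched without any particular network: encode each sentence as a vector, then compare the vectors with cosine similarity. The tiny hard-coded `EMBEDDINGS` table and mean pooling below are illustrative stand-ins for vectors that a CNN, LSTM, or transformer encoder would actually learn:

```python
import math

# Hypothetical 3-dimensional word vectors for illustration only;
# real systems learn embeddings from data rather than hard-coding them.
EMBEDDINGS = {
    "dog":   [0.9, 0.1, 0.0],
    "puppy": [0.8, 0.2, 0.1],
    "car":   [0.0, 0.1, 0.9],
}

def sentence_vector(sentence: str) -> list[float]:
    """Mean-pool the word embeddings of a sentence into one vector."""
    vecs = [EMBEDDINGS[w] for w in sentence.lower().split() if w in EMBEDDINGS]
    dim = len(next(iter(EMBEDDINGS.values())))
    if not vecs:
        return [0.0] * dim
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```

Under these toy vectors, "dog" scores higher against "puppy" than against "car", which is the behaviour a trained encoder is meant to produce at scale.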