A survey on the techniques, applications, and performance of short text semantic similarity - 2021

Research Area: Machine Learning

Abstract:

Short text similarity plays an important role in natural language processing (NLP) and has been applied in many fields. Because short texts lack sufficient context, measuring their similarity is difficult. Using semantic similarity to calculate textual similarity has attracted attention from academia and industry and has achieved better results. In this survey, we conduct a comprehensive and systematic analysis of semantic similarity. We first propose three categories of semantic similarity methods: corpus-based, knowledge-based, and deep learning (DL)-based. We analyze the pros and cons of representative and novel algorithms in each category. Our analysis also covers the applications of these similarity measurement methods in other areas of NLP. We then evaluate state-of-the-art DL methods on four common datasets; the results show that DL-based methods can better address the challenges of short text similarity, such as sparsity and complexity. In particular, the bidirectional encoder representations from transformers (BERT) model can make full use of the scarce information in short texts, together with semantic information, and obtains higher accuracy and F1 scores. Finally, we put forward some future directions.

Keywords:

Author(s) Name: Mengting Han, Xuan Zhang, Xin Yuan, Jiahao Jiang, Wei Yun, Chen Gao

Journal name: Concurrency and Computation: Practice and Experience

Conference name:

Publisher name: Wiley

DOI: 10.1002/cpe.5971

Volume Information: Volume 33, Issue 5, 10 March 2021

Paper Link: https://onlinelibrary.wiley.com/doi/abs/10.1002/cpe.5971
