Research Area:  Machine Learning
Distributed vector representations, or embeddings, map variable-length text to dense fixed-length vectors and capture prior knowledge that can be transferred to downstream tasks. Even though embeddings have become the de facto standard for text representation in deep-learning-based NLP tasks in both the general and clinical domains, there is no survey paper that presents a detailed review of embeddings in Clinical Natural Language Processing. In this survey paper, we discuss various medical corpora and their characteristics as well as medical codes, and present a brief overview and comparison of popular embedding models. We classify clinical embeddings and discuss each embedding type in detail. We discuss various evaluation methods, followed by possible solutions to various challenges in clinical embeddings. Finally, we conclude with some future directions that will advance research in clinical embeddings.
Author(s) Name:  Katikapalli Subramanyam Kalyan, S. Sangeetha
Journal name:  Journal of Biomedical Informatics
Publisher name:  Elsevier
Volume Information:  Volume 101, January 2020, 103323
Paper Link:  https://www.sciencedirect.com/science/article/pii/S1532046419302436