Research Area:  Machine Learning
Early detection of preventable diseases is important for better disease management, improved interventions, and more efficient health-care resource allocation. Various machine learning approaches have been developed to exploit information in Electronic Health Records (EHRs) for this task. The majority of previous attempts, however, focus on structured fields and lose the vast amount of information in the unstructured notes. In this work, we propose a general multi-task framework for disease onset prediction that combines both free-text medical notes and structured information. We compare the performance of different deep learning architectures, including CNN, LSTM, and hierarchical models. In contrast to traditional text-based prediction models, our approach does not require disease-specific feature engineering and can handle negations and numerical values that exist in the text. Our results on a cohort of about 1 million patients show that models using text outperform models using just structured data, and that models capable of using numerical values and negations in the text, in addition to the raw text, further improve performance. Additionally, we compare different visualization methods for medical professionals to interpret model predictions.
Keywords:  
Electronic Health Record
Chronic Disease Prediction
Machine Learning
Deep Learning
Medical Notes
Author(s) Name:  Jingshu Liu, Zachariah Zhang, Narges Razavian
Journal name:  
Conference name:  Proceedings of the 3rd Machine Learning for Healthcare Conference
Publisher name:  
DOI:  10.48550/arXiv.1808.04928
Volume Information:  Volume 2018
Paper Link:   http://proceedings.mlr.press/v85/liu18b.html