Research Area:  Machine Learning
Recent developments in machine learning algorithms have enabled models to exhibit impressive performance in healthcare tasks using electronic health record (EHR) data. However, the heterogeneous nature and sparsity of EHR data remain challenging. In this work, we present a model that utilizes heterogeneous data and addresses sparsity by representing diagnoses, procedures, and medication codes with temporal Hierarchical Clinical Embeddings combined with Topic modeling (HCET) on clinical notes. HCET aggregates various categories of EHR data and learns inherent structure based on hospital visits for an individual patient. We demonstrate the potential of the approach in the task of predicting depression at various time points prior to a clinical diagnosis. We found that HCET outperformed all baseline methods with a highest improvement of 0.07 in precision-recall area under the curve (PRAUC). Furthermore, applying attention weights across EHR data modalities significantly improved the performance as well as the model's interpretability by revealing the relative weight for each data modality. Our results demonstrate the model's ability to utilize heterogeneous EHR information to predict depression, which may have future implications for screening and early detection.
Keywords:  
Depression
Data models
Medical diagnostic imaging
Predictive models
Feature extraction
Diseases
Semantics
Author(s) Name:  Yiwen Meng; William Speier; Michael Ong
Journal name:  IEEE Journal of Biomedical and Health Informatics
Conference name:  
Publisher name:  IEEE
DOI:  10.1109/JBHI.2020.3004072
Volume Information:  Volume: 25
Paper Link:   https://ieeexplore.ieee.org/abstract/document/9122394/keywords#keywords