Research Area:  Big Data
Feature selection for predictive analytics continues to be a major challenge in the healthcare industry, particularly as it relates to readmission prediction. Several research works in mining healthcare data have focused on structured data for readmission prediction. Even within those works that are based on unstructured data, significant gaps exist in addressing class imbalance and context-specific noise removal, which necessitates new approaches to readmission prediction using unstructured data. In this work, a novel approach is proposed for feature selection and domain-related stop-word removal from unstructured data with class imbalance in discharge summary notes. The proposed predictive model uses these features along with other relevant structured data. Five iterations of predictions were performed to tune and improve the models, the results of which are presented and analyzed in this paper. The authors suggest future directions for implementing the proposed approach in hospitals or clinics that aim to leverage structured and unstructured discharge summary notes.
Keywords:  
Author(s) Name:  Arun Sundararaman, Srinivasan Valady Ramanathan and Ramprasad Thati
Journal name:  Big Data Research
Conference name:  
Publisher name:  ELSEVIER
DOI:  10.1016/j.bdr.2018.05.004
Volume Information:  Volume 13, September 2018, Pages 65-75
Paper Link:   https://www.sciencedirect.com/science/article/abs/pii/S2214579617303131