BioMed Research International
Volume 2019, Article ID 8532892, 9 pages
Research Article

Machine Learning Readmission Risk Modeling: A Pediatric Case Study

1 Research Center on Business Intelligence, University of Chile, Beauchef 851, Of. 502, Santiago, Chile
2 Hospital Dr. Exequiel González Cortés, Gran Avenida 3300, San Miguel, Santiago, Chile
3 Computation Intelligence Group, Basque University (UPV/EHU), P. Manuel Lardizabal 1, 20018 San Sebastian, Spain
4 ACPySS, San Sebastián, Spain

Correspondence should be addressed to Manuel Graña; manuel.grana@ehu.eus

Received 21 December 2018; Revised 8 March 2019; Accepted 1 April 2019; Published 15 April 2019

Academic Editor: Xudong Huang

Copyright © 2019 Patricio Wolff et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Background. Hospital readmission prediction in pediatric hospitals has received little attention. Studies have focused on analyzing readmission frequency stratified by disease and by demographic/geographic characteristics, but there are no predictive modeling approaches. Such models may be useful to identify preventable readmissions, which constitute a major portion of the cost attributed to readmissions.

Objective. To assess the all-cause readmission predictive performance achieved by machine learning techniques in the emergency department of a pediatric hospital in Santiago, Chile.

Materials. An all-cause admissions dataset was collected over six consecutive years in a pediatric hospital in Santiago, Chile. The variables collected are the same as those used to determine the administrative cost of the child's treatment.

Methods. Retrospective predictive analysis of 30-day readmission was formulated as a binary classification problem. We report classification results achieved with various model building approaches after data curation and preprocessing to correct class imbalance. We compute repeated cross-validation (RCV) with a decreasing number of folds to assess performance and its sensitivity to the imbalance in the test set and to the training set size.

Results. The increase in recall due to SMOTE class imbalance correction is large and statistically significant. The Naive Bayes (NB) approach achieves the best AUC (0.65); however, the shallow multilayer perceptron has the best PPV and f-score (5.6 and 10.2, respectively). NB and support vector machines (SVM) give comparable results if we consider the AUC, PPV, and f-score rankings over all RCV experiments. The high recall of the deep multilayer perceptron is due to a high false positive ratio. There is no detectable effect of the number of folds in the RCV on the predictive performance of the algorithms.

Conclusions. We recommend NB with a Gaussian distribution model as the most robust modeling approach for pediatric readmission prediction, since it achieves the best results across all training dataset sizes. The results show that the approach could be applied to detect preventable readmissions.
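
The metrics named in the abstract (recall, PPV, and f-score) can all be read off a confusion matrix. The following is a minimal sketch showing how, on a strongly imbalanced test set, a classifier can raise recall while PPV and f-score fall because of additional false positives. The function name and all counts are hypothetical values chosen for illustration; they are not taken from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return recall, PPV (precision), F-score and false positive rate."""
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    f_score = 2 * ppv * recall / (ppv + recall) if (ppv + recall) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return {"recall": recall, "ppv": ppv, "f_score": f_score, "fpr": fpr}


# Hypothetical test fold (assumed numbers): 10,000 admissions,
# of which 300 are readmitted within 30 days and 9,700 are not.

# Classifier A: moderate recall, moderate false positive rate.
a = classification_metrics(tp=180, fp=2910, fn=120, tn=6790)

# Classifier B: higher recall, bought with many more false positives.
b = classification_metrics(tp=270, fp=7760, fn=30, tn=1940)

for name, m in (("A", a), ("B", b)):
    print(name, {key: round(value, 3) for key, value in m.items()})
```

In this toy setting classifier B finds more of the readmissions than A, yet a smaller share of its alarms are true readmissions, so its PPV and f-score are lower. This is the kind of trade-off that makes it worth reporting recall together with PPV and f-score.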
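
SMOTE corrects class imbalance by creating synthetic minority-class samples on the line segments between a minority sample and one of its nearest minority-class neighbours. The sketch below is a simplified pure-Python version for numeric features, given only to illustrate the idea; it is not the authors' implementation. The helper names, the choice of `k`, the toy data, and the decision to oversample only the training fold (leaving the test fold at its natural imbalance) are assumptions.

```python
import random


def smote(minority, n_synthetic, k=5, rng=None):
    """Create n_synthetic points by interpolating between minority-class
    samples and one of their k nearest minority-class neighbours."""
    if rng is None:
        rng = random.Random(0)
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by squared Euclidean distance.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(
            key=lambda j: sum((a - b) ** 2 for a, b in zip(base, minority[j]))
        )
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


def balance_training_fold(X_train, y_train, k=5, seed=0):
    """Oversample class 1 (readmitted) up to the size of class 0.
    Only the training fold is passed in; the test fold is left untouched."""
    minority = [x for x, label in zip(X_train, y_train) if label == 1]
    n_majority = sum(1 for label in y_train if label == 0)
    n_extra = max(0, n_majority - len(minority))
    extra = smote(minority, n_extra, k, random.Random(seed))
    return X_train + extra, y_train + [1] * len(extra)


# Hypothetical toy fold: two numeric features, 3 readmissions among 12 stays.
X = [[float(i), float(i % 4)] for i in range(12)]
y = [1 if i in (2, 5, 9) else 0 for i in range(12)]

X_bal, y_bal = balance_training_fold(X, y, k=2)
print("readmitted:", sum(y_bal), "not readmitted:", len(y_bal) - sum(y_bal))
```

Within a repeated cross-validation loop, a function like `balance_training_fold` would be called once per split on the training portion only, so that the reported metrics still reflect the original class proportions in the held-out fold.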