Computational and Mathematical Methods in Medicine
Volume 2015, Article ID 576413, 6 pages
Research Article

Application of Random Forest Survival Models to Increase Generalizability of Decision Trees: A Case Study in Acute Myocardial Infarction

1Regional Knowledge Hub and WHO Collaborating Center for HIV Surveillance, Institute for Futures Studies in Health, Kerman University of Medical Sciences, Kerman 7616911317, Iran
2Department of Epidemiology, University of Tehran, Tehran, Iran
3Research Center for Modeling in Health, Institute for Futures Studies in Health, Kerman University of Medical Sciences, Kerman 7616911317, Iran

Received 12 September 2015; Revised 23 November 2015; Accepted 24 November 2015

Academic Editor: Issam El Naqa

Copyright © 2015 Iman Yosefian et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Background. Tree models provide an easily interpretable prognostic tool, but their results are unstable. Two approaches to enhance the generalizability of the results are pruning and the random survival forest (RSF). The aim of this study is to assess the generalizability of the saturated tree (ST), pruned tree (PT), and RSF. Methods. Data from 607 patients were randomly divided into training and test sets using 10-fold cross-validation. All three models were fitted on the training sets. The ST was constructed by searching for optimal cutoffs with the log-rank test. The PT was selected by plotting the error rate against the minimum sample size in terminal nodes. To construct the RSF, 1000 bootstrap samples were drawn from the training set. The C-index and the integrated Brier score (IBS) were used to compare the models. Results. The ST produced the most overoptimistic statistics: the mean difference between the C-index in the training and test sets was 0.237, versus 0.054 for the PT and 0.007 for the RSF. In terms of the IBS, the difference was 0.136 for the ST, 0.021 for the PT, and 0.0003 for the RSF. Conclusion. Pruning a tree and assessing its performance on a test set only partially improve the generalizability of decision trees; the RSF provides results that are highly generalizable.
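The abstract compares models with the C-index, i.e., the proportion of comparable patient pairs in which the model assigns the higher risk to the patient who fails sooner. As a minimal illustration (not the authors' implementation), the following sketch computes Harrell's C-index for right-censored data; the function name, the tie handling (tied risks count as half-concordant, tied times skipped), and the toy data are all assumptions for this example.

```python
import itertools

def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    A pair (i, j) is comparable when the subject with the shorter
    observed time actually experienced the event (event flag 1).
    The pair is concordant when that subject also has the higher
    predicted risk score.
    """
    concordant = 0.0
    comparable = 0
    for i, j in itertools.combinations(range(len(times)), 2):
        # Reorder so that subject i has the shorter observed time.
        if times[j] < times[i]:
            i, j = j, i
        if times[i] == times[j]:
            continue  # skip time ties in this simple sketch
        if not events[i]:
            continue  # shorter time is censored -> pair not comparable
        comparable += 1
        if risk_scores[i] > risk_scores[j]:
            concordant += 1.0
        elif risk_scores[i] == risk_scores[j]:
            concordant += 0.5  # tied risk scores count as half
    return concordant / comparable

# Toy data: risk ranking matches survival ordering exactly.
times  = [2, 5, 8, 12]          # observed follow-up times
events = [1, 1, 0, 1]           # 1 = event observed, 0 = censored
risks  = [0.9, 0.7, 0.4, 0.1]   # model-predicted risk
print(harrell_c_index(times, events, risks))  # -> 1.0 (perfect concordance)
```

A C-index of 0.5 corresponds to random ranking and 1.0 to perfect discrimination, which is why the paper reports the training-versus-test *difference* in C-index as a measure of overoptimism.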