Improving Prediction Accuracy of “Central Line-Associated Blood Stream Infections” Using Data Mining Models
Table 1
The six data mining methods used.
No
DM Method
Description
1
AdaBoost (AB)
AdaBoost [40], short for “Adaptive Boosting,” is “a machine learning meta-algorithm. It is a powerful classification algorithm that has practical success with applications in a wide variety of fields. Boosting is an approach to machine learning based on the idea of creating a highly accurate prediction rule by combining many relatively weak and inaccurate rules.”
2
Random forest (RF)
Random forest [41] is “an ensemble learning method for classification and regression that operate by constructing a multitude of decision trees at training time and outputting the class that is the mode of the classes (classification) or mean prediction (regression) of the individual trees. Random decision forests correct for decision trees’ habit of overfitting to their training set.”
3
Support vector machine (SVM)
The SVM [42] is used “to find the best classification function to distinguish between members of two classes in the training data. The metric for the concept of the ‘best’ classification function can be realized geometrically. It is considered a good classifier because of its high generalization performance without the need to add a priori knowledge, even when the dimension of the input space is very high.”
4
Multilayer Perceptron (MLP)
The MLP [43, 44] is “a feedforward artificial neural network model that maps sets of input data onto a set of appropriate outputs. An MLP consists of multiple layers of nodes in a directed graph, with each layer fully connected to the next one.”
5
Logistic regression (LR)
Logistic regression [45] is “a statistical method for analyzing a dataset in which there are one or more independent variables that determine an outcome. The outcome is measured with a dichotomous variable (in which there are only two possible outcomes).”
6
Naive Bayesian inference (NBI)
Naïve Bayesian inference [46] is “a method of statistical inference in which Bayes’ theorem is used to update the probability for a hypothesis as more evidence or information becomes available. Bayesian inference is an important technique in statistics and especially in mathematical statistics.”