Research Article

Improving Prediction Accuracy of “Central Line-Associated Blood Stream Infections” Using Data Mining Models

Table 1

The six data mining methods used.

NoDM MethodDescription

1AdaBoost (AB)AdaBoost [40], short for “Adaptive Boosting,” is “a machine learning meta-algorithm. It is a powerful classification algorithm that has practical success with applications in a wide variety of fields. Boosting is an approach to machine learning based on the idea of creating a highly accurate prediction rule by combining many relatively weak and inaccurate rules.”

2Random forest (RF)Random forest [41] is “an ensemble learning method for classification and regression that operate by constructing a multitude of decision trees at training time and outputting the class that is the mode of the classes (classification) or mean prediction (regression) of the individual trees. Random decision forests correct for decision trees’ habit of overfitting to their training set.”

3Support vector machine (SVM)The SVM [42] is used “to find the best classification function to distinguish between members of two classes in the training data. The metric for the concept of the ‘best’ classification function can be realized geometrically. It is considered a good classifier because of its high generalization performance without the need to add a priori knowledge, even when the dimension of the input space is very high.”

4Multilayer Perceptron (MLP)The MLP [43, 44] is “a feedforward artificial neural network model that maps sets of input data onto a set of appropriate outputs. An MLP consists of multiple layers of nodes in a directed graph, with each layer fully connected to the next one.”

5Logistic regression (LR)Logistic regression [45] is “a statistical method for analyzing a dataset in which there are one or more independent variables that determine an outcome. The outcome is measured with a dichotomous variable (in which there are only two possible outcomes).”

6Naive Bayesian inference (NBI)Naïve Bayesian inference [46] is “a method of statistical inference in which Bayes’ theorem is used to update the probability for a hypothesis as more evidence or information becomes available. Bayesian inference is an important technique in statistics and especially in mathematical statistics.”