Defining an Optimal Cut-Point Value in ROC Analysis: An Alternative Approach
ROC curve analysis is often applied to measure the diagnostic accuracy of a biomarker. The analysis yields two results: the diagnostic accuracy of the biomarker and the optimal cut-point value. Many methods have been proposed in the literature to obtain the optimal cut-point value. In this study, a new approach, alternative to these methods, is proposed. The proposed approach is based on the value of the area under the ROC curve. It defines the optimal cut-point value as the value whose sensitivity and specificity are the closest to the value of the area under the ROC curve and for which the absolute value of the difference between the sensitivity and specificity values is minimum. This approach is very practical. In this study, the results of the proposed method are compared with those of the standard approaches, by using simulated data with different distribution and homogeneity conditions as well as a real data set. According to the simulation results, the use of the proposed method is advised for finding the true cut-point.
The ROC curve is a mapping of the sensitivity versus 1 − specificity for all possible values of the cut-point between cases and controls. To measure the diagnostic ability of a biomarker, it is common to use summary measures such as the area under the ROC curve (AUC) and/or the partial area under the ROC curve (pAUC). A biomarker with AUC = 1 discriminates individuals perfectly as diseased or healthy. Meanwhile, an AUC = 0.5 means that there is no apparent distributional difference between the biomarker values of the two groups.
ROC analysis provides two main outcomes: the diagnostic accuracy of the test and the optimal cut-point value for the test. Cut-points dichotomize the test values and thereby provide the diagnosis (diseased or not). The identification of the cut-point value requires a simultaneous assessment of sensitivity and specificity. A cut-point will be referred to as optimal when the point classifies most of the individuals correctly [4, 5].
AUC, sensitivity, and specificity values are useful for the evaluation of a marker; however, they do not specify “optimal” cut-points directly. In the literature, there are many approaches using both sensitivity and specificity for cut-point selection [4–9]. One of the commonly used methods is the Youden index (J) method. This method defines the optimal cut-point as the point maximizing the Youden function, which is the difference between the true positive rate and the false positive rate over all possible cut-point values [6, 7]. Another approach is known as the point closest-to-(0, 1) corner in the ROC plane (ER), which defines the optimal cut-point as the point minimizing the Euclidean distance between the ROC curve and the point (0, 1). A third approach is based on the maximum achievable value of the chi-square statistic (min P), which is derived from the cross-tabulations of true disease status and the categorized variables that separate the biomarker into two categories according to all possible cut-point values. A more recent approach was proposed by Liu, which defines the optimal cut-point as the point maximizing the product of sensitivity and specificity (CZ). In the literature, there are studies comparing optimal metrics derived from the sensitivity, specificity, agreement, and distance [10, 11]. In these studies, it is generally recommended that researchers select the one that is most clinically relevant.
In this study, a new approach is proposed for the identification of the optimal cut-point value in ROC analysis. The approach is based on the area under the ROC curve (AUC), sensitivity, and specificity values. It defines the optimal cut-point value as the point minimizing the summation of absolute values of the differences between AUC and sensitivity and AUC and specificity provided that the difference between sensitivity and specificity is minimum.
In the following section, the background methodologies of the previous methods are first summarized, and then the proposed method is introduced. In Section 3, data generated under normal and gamma distribution models for the biomarker are used to compare the performance of the previous methods with that of the proposed one. Then, in Section 4, using data from a real-world study of heart-failure patients, the cut-points for pulse pressure, plasma sodium, LVEF, and heart rate in the prediction of mortality are calculated by applying the proposed and the previous methods. Finally, in Section 5, conclusions are given.
2. Previous Methods and the Proposed Method
2.1. Minimum P Value Approach
Let $X$ be a continuous biomarker that is assumed to be predictive of an event $D$ (i.e., $D = 1$ for diseased or $D = 0$ for not diseased). At any given possible cut-point $c$ of $X$, sensitivity ($Se(c)$) and specificity ($Sp(c)$) values are as follows:
$$Se(c) = P(X > c \mid D = 1), \qquad Sp(c) = P(X \le c \mid D = 0).$$
Cut-point $c$ separates the data into two groups, which forms a $2 \times 2$ table, as shown in Table 1.
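As a minimal sketch of these definitions, the following Python snippet computes the empirical sensitivity and specificity at a cut-point, together with the cell counts of the resulting 2 × 2 table. The convention that values above the cut-point are classified as diseased, the function names, and the toy data are illustrative assumptions and are not taken from the original study.

```python
def sens_spec(diseased, healthy, c):
    """Empirical Se(c) = P(X > c | D = 1) and Sp(c) = P(X <= c | D = 0)."""
    se = sum(x > c for x in diseased) / len(diseased)
    sp = sum(x <= c for x in healthy) / len(healthy)
    return se, sp


def two_by_two(diseased, healthy, c):
    """Counts (TP, FP, FN, TN) of the table induced by cut-point c."""
    tp = sum(x > c for x in diseased)   # diseased, classified positive
    fn = len(diseased) - tp             # diseased, classified negative
    tn = sum(x <= c for x in healthy)   # healthy, classified negative
    fp = len(healthy) - tn              # healthy, classified positive
    return tp, fp, fn, tn


# Hypothetical biomarker values, for illustration only.
diseased = [2.1, 2.5, 3.0, 3.2, 3.8, 4.1]
healthy = [1.0, 1.4, 1.9, 2.2, 2.6, 2.9]

print(sens_spec(diseased, healthy, 2.4))
print(two_by_two(diseased, healthy, 2.4))
```

Every criterion discussed in this section can be evaluated from these two quantities, computed over all candidate cut-points.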
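The minimum P value approach was proposed by Miller and Siegmund and defines the optimal point as the cut-point that maximizes the standard chi-square statistic with one degree of freedom. Writing $TP$, $FP$, $FN$, and $TN$ for the true positive, false positive, false negative, and true negative counts in the $2 \times 2$ table at cut-point $c$, the statistic is
$$\chi^2(c) = \frac{n\,(TP \cdot TN - FP \cdot FN)^2}{(TP + FN)(FP + TN)(TP + FP)(FN + TN)},$$
where $n = TP + FP + FN + TN$. As it was shown by Rota and Antolini, it can also be written in terms of classification probabilities:
$$\chi^2(c) = \frac{n\,p\,(1 - p)\,[Se(c) + Sp(c) - 1]^2}{\{p\,Se(c) + (1 - p)[1 - Sp(c)]\}\{p\,[1 - Se(c)] + (1 - p)\,Sp(c)\}},$$
where $p$ denotes the proportion of diseased subjects in the sample.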
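The following sketch illustrates how the maximally selected chi-square statistic could be computed, both from the cell counts of the 2 × 2 table and from sensitivity, specificity, and the proportion of diseased subjects. The function names, the rule that values above the cut-point are positive, and the toy data are assumptions made for illustration; the code is not the author's implementation.

```python
def chi_square_from_counts(tp, fp, fn, tn):
    """Pearson chi-square (1 df) of the 2 x 2 table; 0.0 if a margin is empty."""
    n = tp + fp + fn + tn
    denom = (tp + fn) * (fp + tn) * (tp + fp) * (fn + tn)
    if denom == 0:
        return 0.0
    return n * (tp * tn - fp * fn) ** 2 / denom


def chi_square_from_rates(se, sp, p, n):
    """Same statistic from sensitivity, specificity, and diseased proportion p."""
    test_pos = p * se + (1 - p) * (1 - sp)   # probability of a positive test
    test_neg = p * (1 - se) + (1 - p) * sp   # probability of a negative test
    if test_pos * test_neg == 0:
        return 0.0
    return n * p * (1 - p) * (se + sp - 1) ** 2 / (test_pos * test_neg)


def min_p_cut_point(diseased, healthy):
    """Cut-point (among observed values) with the largest chi-square."""
    best_c, best_stat = None, -1.0
    for c in sorted(set(diseased) | set(healthy)):
        tp = sum(x > c for x in diseased)
        fn = len(diseased) - tp
        tn = sum(x <= c for x in healthy)
        fp = len(healthy) - tn
        stat = chi_square_from_counts(tp, fp, fn, tn)
        if stat > best_stat:
            best_c, best_stat = c, stat
    return best_c, best_stat


# Hypothetical biomarker values, for illustration only.
diseased = [2.1, 2.5, 3.0, 3.2, 3.8, 4.1]
healthy = [1.0, 1.4, 1.9, 2.2, 2.6, 2.9]

c_min_p, chi2 = min_p_cut_point(diseased, healthy)
print(c_min_p, chi2)
```

Both functions return the same value at any cut-point, which is the algebraic equivalence between the count form and the classification-probability form.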
2.2. Youden Index
The Youden index ($J$) is a measure for evaluating the biomarker effectiveness. This measure was first introduced to the medical literature by Youden. $J$ is a function of $Se(c)$ and $Sp(c)$, such that
$$J = \max_{c}\,\{Se(c) + Sp(c) - 1\}$$
over all cut-points $c$; $c_J$ denotes the cut-point corresponding to $J$. When the value of $Se(c) + Sp(c) - 1$ is maximum, $c_J$ is the “optimal” cut-point value [6, 7].
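A minimal empirical sketch of the Youden criterion, that is, maximizing the true positive rate minus the false positive rate over the observed values, might look as follows. The function name, the direction of classification (values above the cut-point are positive), and the toy data are assumptions for illustration.

```python
def youden_cut_point(diseased, healthy):
    """Return (c_J, J): the observed value maximizing Se(c) + Sp(c) - 1."""
    candidates = sorted(set(diseased) | set(healthy))

    def j(c):
        se = sum(x > c for x in diseased) / len(diseased)
        sp = sum(x <= c for x in healthy) / len(healthy)
        return se + sp - 1.0

    c_j = max(candidates, key=j)  # first maximizer if there are ties
    return c_j, j(c_j)


# Hypothetical biomarker values, for illustration only.
diseased = [2.1, 2.5, 3.0, 3.2, 3.8, 4.1]
healthy = [1.0, 1.4, 1.9, 2.2, 2.6, 2.9]

c_j, j_value = youden_cut_point(diseased, healthy)
print(c_j, j_value)
```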
2.3. The Closest to (0, 1) Criteria (ER)
In this criterion, the “optimal” cut-point is defined as the point on the ROC curve closest to the point $(0, 1)$ [3, 4]:
$$ER(c) = \sqrt{[1 - Se(c)]^2 + [1 - Sp(c)]^2}.$$
Mathematically, the point $c_{ER}$ minimizing the $ER(c)$ function is called the “optimal” cut-point value.
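The Euclidean distance to the (0, 1) corner can be evaluated empirically in the same way. The sketch below assumes that values above the cut-point are classified as diseased; the function name and the toy data are hypothetical.

```python
import math


def closest_to_corner_cut_point(diseased, healthy):
    """Return (c_ER, ER): the observed value minimizing the distance
    between the ROC point (1 - Sp(c), Se(c)) and the corner (0, 1)."""
    candidates = sorted(set(diseased) | set(healthy))

    def er(c):
        se = sum(x > c for x in diseased) / len(diseased)
        sp = sum(x <= c for x in healthy) / len(healthy)
        return math.hypot(1.0 - se, 1.0 - sp)

    c_er = min(candidates, key=er)  # first minimizer if there are ties
    return c_er, er(c_er)


# Hypothetical biomarker values, for illustration only.
diseased = [2.1, 2.5, 3.0, 3.2, 3.8, 4.1]
healthy = [1.0, 1.4, 1.9, 2.2, 2.6, 2.9]

c_er, er_value = closest_to_corner_cut_point(diseased, healthy)
print(c_er, er_value)
```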
2.4. Concordance Probability Method (CZ)
The concordance probability method proposed by Liu defines the optimal cut-point as the point maximizing the product of sensitivity and specificity, $CZ(c) = Se(c) \times Sp(c)$. This product takes values between 0 and 1. The concordance probability of the dichotomized measure at cut-point $c$ can be expressed as the area of a rectangle associated with the ROC curve. The cut-point $c_{CZ}$ maximizing $CZ(c)$ actually maximizes the area of this rectangle.
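One way to see the rectangle interpretation is sketched below. It assumes the usual ROC plane with 1 − specificity on the horizontal axis and sensitivity on the vertical axis, and takes the rectangle spanned by the ROC point and the lower right corner (1, 0); the choice of corners is an assumption made for illustration.

```latex
% Rectangle with opposite corners (1 - Sp(c), Se(c)) and (1, 0)
\[
\text{width} = 1 - \bigl(1 - Sp(c)\bigr) = Sp(c), \qquad
\text{height} = Se(c) - 0 = Se(c),
\]
\[
\text{area} = Se(c) \times Sp(c).
\]
```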
2.5. The Proposed Method: Index of Union (IU)
Perkins and Schisterman stated that the “optimal” cut-point should be chosen as the point which classifies most of the individuals correctly and thus the fewest of them incorrectly. From this point of view, the Index of Union method is proposed in this study. This method provides an “optimal” cut-point which has maximum sensitivity and specificity values at the same time. In order to find the highest sensitivity and specificity values at the same time, the AUC value is taken as the starting value for both. For example, let the AUC value be 0.8. The next step is to look for a cut-point from the coordinates of the ROC curve whose sensitivity and specificity values are simultaneously close or equal to 0.8. This cut-point is then defined as the “optimal” cut-point. The above criteria correspond to the following equation:
$$IU(c) = |Se(c) - AUC| + |Sp(c) - AUC|. \tag{7}$$
The cut-point $c_{IU}$, which minimizes the $IU(c)$ function and the $|Se(c) - Sp(c)|$ difference, will be the “optimal” cut-point value.
In other words, the cut-point defined by the IU method should satisfy two conditions: (1) sensitivity and specificity obtained at this cut-point should be simultaneously close to the AUC value; (2) the difference between sensitivity and specificity obtained at this cut-point should be minimum. The second condition is not compulsory, but it is an essential condition when multiple cut-points satisfy the equation.
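The two conditions can be sketched in Python as a lexicographic minimization: first the sum of absolute differences from the AUC, then the absolute difference between sensitivity and specificity as a tie-breaker. The function name, the rounding used to detect ties in floating point, and the ROC coordinates below are hypothetical and serve only to illustrate the rule.

```python
def index_of_union(roc_points, auc, ndigits=10):
    """Pick the cut-point minimizing |Se - AUC| + |Sp - AUC|.

    roc_points: iterable of (cut_point, sensitivity, specificity).
    Ties (after rounding to ndigits) are broken by the smaller |Se - Sp|.
    Returns (cut_point, rounded IU value).
    """
    def key(point):
        _, se, sp = point
        iu = abs(se - auc) + abs(sp - auc)
        return (round(iu, ndigits), round(abs(se - sp), ndigits))

    best = min(roc_points, key=key)
    return best[0], key(best)[0]


# Hypothetical ROC coordinates: (cut-point, sensitivity, specificity).
roc_points = [
    (1.0, 0.90, 0.70),
    (1.5, 0.85, 0.75),  # IU = 0.10, |Se - Sp| = 0.10
    (2.0, 0.75, 0.75),  # IU = 0.10, |Se - Sp| = 0.00
    (2.5, 0.60, 0.85),
]

c_iu, iu_value = index_of_union(roc_points, auc=0.80)
print(c_iu, iu_value)
```

In this made-up example two cut-points share the same IU value, so the second condition decides between them.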
In order to illustrate how the IU method defines the “optimal” cut-point, the values obtained from an artificial data set are used. Some of the cut-points (with their sensitivity and specificity values) provided by the artificial data are given in Table 2. In this example, the AUC value is calculated as 0.918. For the sake of simplicity, specificity values are given in the table instead of 1 − specificity values. By using the IU method, one can easily find that the sensitivity (0.92) and specificity (0.92) values of the cut-point 1.985 are the nearest ones to the AUC value. Since the difference between these two values is also minimum, this cut-point will be called the “optimal” cut-point by the IU method.
However, it should be noted that choosing such a cut-point as the “optimal” cut-point may sometimes fail. For example, let $Se(c_1) = Sp(c_1) = AUC = 0.80$ at a cut-point $c_1$. Then, the statistic given in (7) will be 0, and the difference between $Se(c_1)$ and $Sp(c_1)$ will also be 0. Thus, according to the definition of optimality given in the IU method, cut-point $c_1$ will be accepted as the “optimal” cut-point. However, if there is a point $c_2$ whose sensitivity and specificity sum to 1.62, then its total misclassification rate will be 0.38 (which is smaller than that of the point $c_1$, i.e., 0.40). Hence, cut-point $c_2$ is a better optimized point than cut-point $c_1$, based on the definition of optimality given by Perkins and Schisterman.
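The arithmetic behind this counterexample can be written out as follows. The sketch takes the total misclassification rate to be the sum of the false negative and false positive rates, an assumption that reproduces the quoted values 0.40 and 0.38; the labels $c_1$ and $c_2$ are used here only for illustration.

```latex
\begin{align*}
IU(c_1) = 0 \ \text{and}\ Se(c_1) = Sp(c_1)
  &\;\Longrightarrow\; Se(c_1) = Sp(c_1) = AUC, \\
[1 - Se(c_1)] + [1 - Sp(c_1)] = 0.40
  &\;\Longrightarrow\; AUC = 0.80, \\
[1 - Se(c_2)] + [1 - Sp(c_2)] = 0.38
  &\;\Longrightarrow\; Se(c_2) + Sp(c_2) = 1.62 > 1.60 = Se(c_1) + Sp(c_1).
\end{align*}
```

In other words, the second point has the larger Youden-type sum even though its IU value cannot be smaller than 0.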
Geometrically, the idea behind the IU method is very similar to the idea behind the ER method. As can be seen in Figure 1, the IU method also tries to find the point on the ROC curve closest to a reference point, namely, the point $(1 - AUC, AUC)$. In the ER method, this reference point is taken as $(0, 1)$. However, instead of using the Euclidean distance as in the ER method, the IU method uses the absolute differences between the diagnostic accuracy measures and the AUC value. More specifically, the IU method searches for the point that minimizes the half perimeter of the ABCD rectangle seen in Figure 1. This rectangle is constructed by connecting the intersection points of the horizontal lines at heights $Se(c)$ and $AUC$ with the vertical lines at $1 - Sp(c)$ and $1 - AUC$.
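The half-perimeter interpretation can be checked with a short calculation. The sketch assumes the ROC plane with 1 − specificity on the horizontal axis and sensitivity on the vertical axis, and an axis-aligned rectangle whose opposite corners are the ROC point at cut-point $c$ and the reference point $(1 - AUC, AUC)$.

```latex
\begin{align*}
\text{width}  &= \bigl|(1 - Sp(c)) - (1 - AUC)\bigr| = |Sp(c) - AUC|, \\
\text{height} &= |Se(c) - AUC|, \\
\text{half perimeter} &= |Se(c) - AUC| + |Sp(c) - AUC| = IU(c), \\
\text{diagonal} &= \sqrt{[Se(c) - AUC]^2 + [Sp(c) - AUC]^2}.
\end{align*}
```

The diagonal is the Euclidean counterpart of the half perimeter; the ER method uses this type of distance, but measured to the corner $(0, 1)$ rather than to $(1 - AUC, AUC)$.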
3. Simulation Study
As shown by Rota and Antolini, although some of these methods are mathematically related, they do not necessarily identify the same true cut-point. That is, depending on the design of the study (balanced or unbalanced), the methods may identify different cut-points. According to their results, in the balanced homoscedastic scenario, the methods identified the same point; in the remaining scenarios (i.e., unbalanced homoscedastic and balanced/unbalanced heteroscedastic scenarios), the methods identified different cut-points. These results emphasize the importance of correctly defining the true cut-point in all possible scenarios.
Let us assume that a specific biomarker ($X$) is normally distributed in the diseased and nondiseased populations, with $X \sim N(\mu_1, \sigma_1^2)$ for diseased subjects and $X \sim N(\mu_0, \sigma_0^2)$ for nondiseased subjects. Under these assumptions, sensitivity and specificity can be written as
$$Se(c) = 1 - \Phi\!\left(\frac{c - \mu_1}{\sigma_1}\right), \qquad Sp(c) = \Phi\!\left(\frac{c - \mu_0}{\sigma_0}\right),$$
where $\Phi$ denotes the standard normal distribution function. The optimal cut-point occurs at the intersection of the normal probability density functions of diseased and nondiseased subjects (i.e., at $c = (\mu_0 + \mu_1)/2$ when the variances are equal) [7, 13]. The parameter values considered in [11, 13] and the corresponding true cut-points guarantee a wide variety of classification accuracies, ranging from poor to high [7, 11, 13]. The identification of the true theoretical cut-point for the IU method under this scenario is given in the Appendix.
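The normal model can be explored numerically with the Python standard library. The sketch below assumes equal variances and uses made-up parameter values (`MU0`, `MU1`, `SIGMA`); it evaluates sensitivity and specificity from the normal distribution function, uses the standard binormal expression for the AUC, and searches a grid for the cut-points maximizing the Youden function and minimizing the IU function.

```python
from statistics import NormalDist

# Hypothetical parameters, for illustration only (equal variances).
MU0, MU1, SIGMA = 0.0, 1.68, 1.0


def se(c):
    """Sensitivity: P(X > c) for diseased subjects, X ~ N(MU1, SIGMA^2)."""
    return 1.0 - NormalDist(MU1, SIGMA).cdf(c)


def sp(c):
    """Specificity: P(X <= c) for nondiseased subjects, X ~ N(MU0, SIGMA^2)."""
    return NormalDist(MU0, SIGMA).cdf(c)


# Binormal AUC with equal variances: Phi((MU1 - MU0) / (SIGMA * sqrt(2))).
auc = NormalDist().cdf((MU1 - MU0) / (SIGMA * 2 ** 0.5))


def iu(c):
    return abs(se(c) - auc) + abs(sp(c) - auc)


grid = [i / 1000 for i in range(-2000, 4001)]  # cut-points from -2 to 4
c_youden = max(grid, key=lambda c: se(c) + sp(c) - 1.0)
c_iu = min(grid, key=iu)
midpoint = (MU0 + MU1) / 2  # intersection of the two densities

print(midpoint, c_youden, c_iu, round(auc, 4))
```

For these illustrative parameters, both grid searches land on the midpoint of the two means, where the two densities intersect.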
Now assume that $X$ is gamma distributed, with separate parameter sets for diseased and nondiseased subjects. In this case, the theoretical cut-points of the methods differ; that is, the minimum P value approach, the Youden index, the concordance probability, and the point closest-to-(0, 1) corner each identify a different cut-point. For the Index of Union, the corresponding cut-points are estimated by the empirical estimation method given in Liu’s work (Figure 2).
In order to compare the performance of the cut-point selection methods with the performance of the method proposed in this study, a simulation study is conducted with different scenarios. These scenarios are the same as the ones given in Rota and Antolini’s work. The first scenario is the normal homoscedastic scenario with a balanced design, where all of the methods theoretically identify the same true cut-point. The second one is the unbalanced normal case, where all of the methods except the minimum P value approach identify the same cut-point. The last scenario is the gamma case, where all of the methods identify different cut-points.
In all scenarios, 1000 samples were generated, with sample sizes of 50, 100, and 200 per group for the balanced designs and with unequal numbers of diseased and nondiseased subjects for the unbalanced designs.
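For each sample, the optimal cut-points $\hat{c}_{minP}$, $\hat{c}_{J}$, $\hat{c}_{CZ}$, $\hat{c}_{ER}$, and $\hat{c}_{IU}$ for the minimum P value, the Youden index, the concordance probability, the point closest-to-(0, 1) corner, and the Index of Union are estimated, respectively. The relative bias and mean square error (MSE) values of each method are computed by $\frac{1}{1000}\sum_{i=1}^{1000} (\hat{c}_i - c)/c$ and $\frac{1}{1000}\sum_{i=1}^{1000} (\hat{c}_i - c)^2$, respectively ($c$ denotes the true cut-point and $\hat{c}_i$ denotes the cut-point estimated by the method in the $i$th sample).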
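The two performance measures can be sketched as below, assuming the usual definitions: the relative bias is the average of (estimate − true value) / true value over the simulated samples, and the MSE is the average squared deviation from the true value. The function names and the three estimates are hypothetical.

```python
def relative_bias(estimates, true_c):
    """Mean of (c_hat - c) / c over the simulated samples."""
    return sum((c_hat - true_c) / true_c for c_hat in estimates) / len(estimates)


def mean_square_error(estimates, true_c):
    """Mean of (c_hat - c)^2 over the simulated samples."""
    return sum((c_hat - true_c) ** 2 for c_hat in estimates) / len(estimates)


# Hypothetical cut-point estimates from three simulated samples.
estimates = [0.80, 0.90, 0.85]
true_c = 0.84

print(relative_bias(estimates, true_c))
print(mean_square_error(estimates, true_c))
```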
In order to estimate the standard deviation and the confidence interval (CI) for the optimal cut-point, the bootstrap resampling technique is applied. To calculate the bootstrap estimate, random sampling with replacement is used to draw 200 bootstrap samples within each of the 1000 generated samples. Moreover, to construct a 95% CI for the optimal cut-point, the basic percentile method is applied by taking the 2.5 and 97.5 percentiles of the bootstrap distribution.
The bootstrap estimator of the standard deviation for the estimated cut-point is calculated by taking the standard deviation of the 200 cut-point estimates. Within each of the simulation scenarios, the CIs are subsequently evaluated by computing the coverage probability and mean length.
All simulations are done in R version 3.2.0. To determine the estimates for the Youden index and the point closest-to-(0, 1) corner, the pROC library is used. For the estimates of the remaining methods, R code was written by the author and is available upon request.
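A bootstrap of this kind can be sketched in Python as follows. Several details are assumptions rather than facts from the study: resampling is done separately within the diseased and nondiseased groups, the percentiles are taken by a simple nearest-rank rule, the Youden criterion stands in for whichever cut-point estimator is being studied, and the data are simulated with made-up parameters.

```python
import random
import statistics


def youden_cut_point(diseased, healthy):
    """Observed value maximizing Se(c) + Sp(c) - 1 (values > c are positive)."""
    def j(c):
        se = sum(x > c for x in diseased) / len(diseased)
        sp = sum(x <= c for x in healthy) / len(healthy)
        return se + sp - 1.0
    return max(sorted(set(diseased) | set(healthy)), key=j)


def bootstrap_cut_point(diseased, healthy, estimator, n_boot=200, seed=0):
    """Bootstrap SD and 95% percentile CI of a cut-point estimator."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        # Resample with replacement within each group (an assumption).
        d_star = rng.choices(diseased, k=len(diseased))
        h_star = rng.choices(healthy, k=len(healthy))
        estimates.append(estimator(d_star, h_star))
    estimates.sort()
    lower = estimates[n_boot * 25 // 1000]        # about the 2.5th percentile
    upper = estimates[n_boot * 975 // 1000 - 1]   # about the 97.5th percentile
    return statistics.stdev(estimates), (lower, upper)


# Simulated data with hypothetical parameters, for illustration only.
data_rng = random.Random(1)
diseased = [data_rng.gauss(1.68, 1.0) for _ in range(50)]
healthy = [data_rng.gauss(0.0, 1.0) for _ in range(50)]

sd_boot, (ci_low, ci_high) = bootstrap_cut_point(diseased, healthy, youden_cut_point)
print(sd_boot, ci_low, ci_high)
```

Passing the estimator as an argument keeps the resampling logic separate from the cut-point criterion, so any of the methods in Section 2 could be plugged in.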
3.1. Simulation Results
Table 3 shows the results for the balanced design under normal homoscedastic distributions. The relative bias values of the previously proposed methods are similar to the results of Rota and Antolini’s work, except for the relative bias of the Youden index. In particular, for the poor classification accuracy scenarios (the lowest true cut-points, up to 0.52), the Youden index performs worse in the estimation of the optimal cut-point than in their results. However, this discrepancy is not seen in the comparison of MSEs; that is, the MSEs of all methods are similar to those of Rota and Antolini’s work.
When the relative bias and MSE values of the IU method are compared with those of the other methods, it can be seen that the IU method performs similarly to the point closest-to-(0, 1) corner method in most cases and better than the other methods (i.e., lower relative bias and lower MSE values).
For the balanced design under normal homoscedastic distributions, the bootstrap standard deviation, coverage probability, and mean length of the 95% bootstrap CI for the cut-point are shown in Table 4. As in Table 3, the results given in Table 4 are similar to those of Rota and Antolini’s work. That is, the bootstrap standard deviation of the minimum P value approach is still greater than that of the other methods, and better classification accuracies yield narrower 95% bootstrap CIs. The IU method achieves the smallest bootstrap standard deviation and the narrowest CIs in most of the scenarios. The coverage probabilities are close to the nominal level for all methods.
The relative bias and MSE results for the unbalanced design under normal homoscedastic distributions are shown in Table 5. Since the true cut-point for the minimum P value approach depends on the prevalence of the disease in the sample, different optimal cut-points are used for the comparisons. The relative bias values of all methods are similar to those of Rota and Antolini’s work, except for the minimum P value approach in the lowest classification accuracy scenario. For this scenario, the relative bias for the minimum P value approach is larger than the bias given in their work. For poor and poor-moderate classification accuracy (the lowest true cut-points, up to 0.52), the MSE is the lowest for the IU method, and, for moderate-high and high classification accuracy (the highest true cut-points, up to 1.28), both the point closest-to-(0, 1) corner method and the IU method attain the lowest MSE values.
For the unbalanced design under normal homoscedastic distributions, the bootstrap standard deviation, coverage probability, and mean length of the 95% bootstrap CI for the cut-point are given in Table 6. For this scenario, the lowest bootstrap standard deviation and mean length of the 95% bootstrap CI are obtained by the point closest-to-(0, 1) corner method and the IU method. As in the comparison of the relative bias and MSE values of the methods (Table 5), for poor and poor-moderate classification accuracy (the lowest true cut-points, up to 0.52), the IU method attains the lowest bootstrap standard deviation and mean length, and, for moderate-high and high classification accuracy (the highest true cut-points, up to 1.28), both the point closest-to-(0, 1) corner method and the IU method attain the lowest values. The coverage probabilities are close to the nominal level for all methods.
As shown in Rota and Antolini’s work, under a gamma distribution assumption with a balanced design, the theoretical true cut-points of the minimum P value, Youden index, concordance probability, and closest-to-(0, 1) corner methods are all different. For all classification accuracy scenarios, the theoretical true cut-points for the IU method are obtained based on the idea given in the article of Liu (Figure 2). The relative bias values of all methods are similar to those of Rota and Antolini’s work. The MSE attains its lowest value for the point closest-to-(0, 1) corner method and the IU method in all classification accuracy scenarios (Table 7).
For this design (under gamma distributions), the bootstrap standard deviation and the mean length of the 95% CI for the point closest-to-(0, 1) corner method and the IU method are lower than those of the other investigated approaches (Table 8). The coverage probabilities are close to the nominal level for all methods.
In all simulation scenarios, the IU method performs better in the estimation of the optimal cut-point than the other methods. The bootstrap standard deviation and the mean length of the 95% bootstrap CI for the IU method are also the smallest among all methods. Thus, across all simulation scenarios, and although the methods do not lead to a common cut-point in the gamma scenarios, the IU method is a better alternative than the previously proposed methods for identifying the optimal cut-point.
3.2. Cross-Validation of the Optimal Cut-Point
In order to evaluate the significance of the optimally selected cut-point, a twofold cross-validation process is used. The procedure is as follows:
(1) Generating data with the same properties given in this manuscript
(2) Applying all methods to the data and estimating cut-points for all methods
(3) Splitting data into two equal subsets, that is, subset I and subset II
(4) Applying all methods to subset I and estimating cut-points for all methods
(5) Assigning each observation in subset II to either one of two groups by using the cut-point obtained in the previous step
(6) Applying all methods to new subset II and estimating cut-points for all methods
(7) Assigning each observation in subset I to either one of two groups by using the cut-point obtained in the previous step
(8) Applying all methods to the combination of these two subsets and estimating cut-points for all methods
(9) Taking the difference between the cut-points obtained at the second step and at the last step
This procedure is applied for 4 scenarios (2 normal and 2 gamma scenarios) given in the manuscript. The results are shown in Figure 1 in Supplementary Material available online at https://doi.org/10.1155/2017/3762651. According to the results, for each method, the difference between the optimal cut-points estimated before and after cross-validation is around 0, and the IU method attains the smallest mean absolute difference in all four scenarios.
A real data set obtained from a study in cardiology is used as an example. Yildiran et al. investigated the association between pulse pressure and 2-year cardiovascular death in an entire heart-failure population. They prospectively enrolled 225 (188 male, 37 female) heart-failure patients with NYHA functional classes I–IV and a mean age of 56.5 years.
They recorded detailed histories of the 225 patients, including demographic characteristics, cardiovascular (CV) risk factors, and medication usage. The patients were divided into 4 NYHA classes in accordance with their medical histories and the findings upon physical examination and then into 2 groups according to their NYHA class (mild heart failure [classes I-II] and advanced heart failure [classes III-IV]). Levels of serum lipids, glucose, high-sensitivity C-reactive protein, blood urea nitrogen, creatinine, sodium, and potassium were measured by routine laboratory methods. Blood pressures were measured by sphygmomanometer in accordance with published guidelines. Pulse pressure was calculated as the difference between systolic and diastolic blood pressure, and the patients were divided accordingly into 4 quartiles (PP of <35, 35–45, 46–55, or >55 mmHg).
They used ROC analysis to define the cut-point values for pulse pressure, LVEF, plasma sodium value, and heart rate in predicting CV death. In this analysis, the 170 patients who had all four measurements at the same time were included (the measurements of 55 patients were missing). To obtain the optimal cut-point values, they used the ER approach.
Supplementary Table 1 reports some descriptive statistics of these four measurements. Pulse pressure, LVEF, and plasma sodium levels are significantly lower in dead patients than in alive patients, and heart rate is significantly higher in dead patients than in alive patients. According to the results of the Shapiro-Wilk normality test, heart rate and plasma sodium are normally distributed in both groups, LVEF is normally distributed in dead patients but not in alive patients, and pulse pressure is not normally distributed in either group. Among the nonnormally distributed variables, the distribution of LVEF in alive patients is left-skewed and the distributions of pulse pressure in both groups are right-skewed. Since the numbers of patients in the two groups differ considerably, the design is unbalanced, and the ratio between the numbers of patients is similar to the 50 : 100 scenario in the simulation protocol.
In this study, the data obtained from the study by Yildiran et al. are used, and all the methods, including the IU method, are applied to these data. The corresponding results are given in Table 9. The upper part of Table 9 shows the cut-points obtained by using the previously proposed methods. To define the cut-point with the IU method, some of the cut-points with their sensitivity and specificity values and the AUC value are given. According to this table, the IU method gives the same cut-points as the ER method for different AUC values (Figure 3).
Figure 3: Receiver operating characteristic curves for (a) LVEF, (b) plasma sodium, and (c) heart rate in the prediction of cardiovascular death.
Defining the optimal cut-point is very important when a continuous variable is considered as a diagnostic marker. Getting optimal classification level depends on the point chosen for diagnosis. The criteria for optimality can change according to the aim of the study. However, as a general rule, minimizing the total misclassification rates is a good approach. With IU method, since the difference between sensitivity and specificity values is minimum, this condition is met most of the time.
According to the results given in the tables, the proposed IU method can be a better alternative for defining the cut-point. When the optimal point is defined as the point that minimizes the misclassification rates or the point that equalizes the values of sensitivity and specificity, the IU method is better than the other methods in most of the comparison scenarios. This conclusion does not change with the distribution of the biomarker or the homogeneity of its variances. Changes in the sample size and the AUC value may affect the results but do not alter the interpretation.
The IU method uses the absolute differences between the diagnostic accuracy measures and the AUC value instead of the Euclidean distance. The reason is to keep the definition of the optimal point simple. With the IU method, one can identify the optimal cut-point simply by checking whether the sensitivity and specificity values are close enough to the AUC value. That is, complex calculations are not necessary for the IU method.
When the relative bias and MSE values of the IU method are compared with those of the previous methods, the IU method is seen to be better than the others. Thus, this method can be used for defining the optimal cut-point value, especially when the sample sizes of the two groups are equal and the AUC value is high (i.e., higher than 0.7).
A common practice is to select a cut-point which defines two risk groups for a continuously measured biomarker. A cut-point for a biomarker is meaningful for clinicians when it is clinically interpretable and understandable. The clinical meaning of a cut-point can be explained by its accuracy, that is, its true classification rate. Among all the methods, only two of them, the Youden index and the concordance probability, are based on the maximization of this rate. Thus, these methods provide interpretable cut-points.
The point closest to the (0, 1) point on the ROC curve method involves a quadratic term, and the clinical meaning of this term is unknown. Despite the lack of clinical meaning, it is shown in the literature that this method is superior to the other methods in estimating the true cut-point.
The IU method, like the Youden index and the concordance probability, tries to minimize the misclassification rate. Hence, it also provides an interpretable cut-point. In this study, it is shown that the IU method performs better than (or equal to) the point closest to the (0, 1) point on the ROC curve method. Therefore, the use of the IU method is advised to obtain a more interpretable and better optimized cut-point.
The IU method provides a cut-point whose sensitivity and specificity are equally high. This means that, in a cut-point determination process, if sensitivity and specificity are valued equally, the IU method seems to be the best option among all other methods.
Identification of the True Theoretical Cut-Point for the IU Method under the Normal Homoscedastic Distribution Case
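Let us consider the normal homoscedastic distribution scenario, in which the conditional distribution of the quantitative variable $X$ in group $d$ is $N(\mu_d, \sigma^2)$ for $d = 0, 1$ (assuming $\mu_0 < \mu_1$).
In particular, at cut-point $c$, specificity is $Sp(c) = \Phi((c - \mu_0)/\sigma)$ and sensitivity is $Se(c) = 1 - \Phi((c - \mu_1)/\sigma)$. Then the $IU(c)$ function can be written in one of the following forms (according to the signs of the differences inside the absolute values):
(i) $Se(c) + Sp(c) - 2AUC$
(ii) $Se(c) - Sp(c)$
(iii) $Sp(c) - Se(c)$
(iv) $2AUC - Se(c) - Sp(c)$
That is, $IU(c) = a\,Se(c) + b\,Sp(c) + k$, where $a, b \in \{-1, 1\}$ and $k$ does not depend on $c$. Thus, this formulation is a general form of the Youden index. So, the cut-point which optimizes the IU function can be obtained by taking the first derivative, $IU'(c) = -a\,f_1(c) + b\,f_0(c)$, where $f_1$ and $f_0$ are the normal probability density functions for diseased and nondiseased subjects. Since the normal distribution is symmetric, $\phi(-z) = \phi(z)$ for the standard normal density, and thus, for $a = b$, the root of $IU'(c) = 0$ is $c = (\mu_0 + \mu_1)/2$.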
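The root can be verified with a short worked derivation. The sketch below treats the case in which both sensitivity and specificity lie below the AUC near the optimum, so that $IU(c) = 2AUC - Se(c) - Sp(c)$; it writes the group means as $\mu_0 < \mu_1$ with common standard deviation $\sigma$, and this notation is an assumption of the sketch.

```latex
\begin{align*}
IU(c) &= 2\,AUC - 1 + \Phi\!\left(\frac{c - \mu_1}{\sigma}\right)
         - \Phi\!\left(\frac{c - \mu_0}{\sigma}\right), \\
IU'(c) &= \frac{1}{\sigma}\left[\phi\!\left(\frac{c - \mu_1}{\sigma}\right)
         - \phi\!\left(\frac{c - \mu_0}{\sigma}\right)\right] = 0
   \iff (c - \mu_1)^2 = (c - \mu_0)^2 \\
 &\iff c = \frac{\mu_0 + \mu_1}{2}.
\end{align*}
```

At this point the first density is increasing in $c$ and the second is decreasing, so $IU''(c) > 0$ and the root is a minimum of this form of the IU function.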
Conflicts of Interest
The author declares that there are no conflicts of interest regarding the publication of this paper.
The author gratefully thanks Dr. Refik Burgut, Dr. Nazan Alparslan, and Dr. Yasar Sertdemir for their valuable comments and suggestions and also thanks Dr. Tansel Yildiran for providing the data to illustrate the method.
Supplementary Table 1: Descriptive statistics of the application example of cut-point finding for Pulse pressure, LVEF, Plasma sodium level and Heart rate in prediction of mortality, from Yildiran et al. (2010).
Supplementary Figure 1: The difference between the optimal cut-points estimated before and after cross-validation is around 0 and the IU method gets the smallest mean absolute difference in all four scenarios.
M. H. Zweig and G. Campbell, “Receiver-operating characteristic (ROC) plots: a fundamental evaluation tool in clinical medicine,” Clinical Chemistry, vol. 39, no. 4, pp. 561–577, 1993.
M. S. Pepe, The Statistical Evaluation of Medical Tests for Classification and Prediction, vol. 28 of Oxford Statistical Science Series, Oxford University Press, Oxford, UK, 2003.
N. J. Perkins and E. F. Schisterman, “The inconsistency of “optimal” cut-points using two ROC based criteria,” American Journal of Epidemiology, vol. 163, no. 7, pp. 670–675, 2006.
T. Yildiran, M. Koc, A. Bozkurt, D. Y. Sahin, I. Unal, and E. Acarturk, “Low pulse pressure as a predictor of death in patients with mild to advanced heart failure,” Texas Heart Institute Journal, vol. 37, no. 3, pp. 284–290, 2010.