#### Abstract

A reasonable credit scoring model must have strong default identification ability, which means the credit score can effectively distinguish between defaulting and nondefaulting customers. The premise of determining the credit scores of small enterprises is to determine the weights of the indicators. This paper studies 3,045 Chinese small business loans, and two novel weighting methods, the "Wilks' Lambda method" and the "AUC value method," are proposed. Both satisfy the criterion that the greater an indicator's default identification ability, the greater its weight. Five weighting methods (the Wilks' Lambda method, the AUC value method, the G1 method, the entropy method, and the mean square deviation method) are compared. An important contribution of the paper is the finding that the Wilks' Lambda method is the most effective method for small businesses.

#### 1. Introduction

The essence of credit is a borrowing and lending relationship that aims at repayment. Credit risk is default risk, that is, the possibility that the borrower fails to repay the principal and interest as scheduled. Credit risk evaluation reveals the nature of debt default risk; it essentially estimates the customer's credit status and determines the ranking of loan customers.

A reasonable credit risk evaluation system must have strong default identification ability; that is, it must effectively distinguish between defaulting and nondefaulting customers. Determining the weights of the credit evaluation indicators reasonably is the key to the quality of the credit evaluation system. Among the many weighting methods, the choice of an appropriate one is crucial for credit risk evaluation. If the choice of the weighting method is not appropriate, it will directly affect the evaluation result: enterprises with poor credit will be evaluated as good businesses, which misleads the decision-making of financial institutions. The weight also reflects the importance of an indicator; that is, we can identify the key indicators that play an important role in credit risk evaluation according to their weights.

This paper studies 3,045 small business loans of a commercial bank in China, and two novel weighting methods, the "Wilks' Lambda method" and the "AUC value method," are proposed. Both satisfy the criterion that the greater an indicator's default identification ability, the greater its weight. Five weighting methods (the Wilks' Lambda method, the AUC value method, the G1 method, the entropy method, and the mean square deviation method) are compared. An important contribution of the paper is the finding that the Wilks' Lambda method is the most effective method for small businesses. The weight results show that nonfinancial indicators such as the "consumer price indicator" and "enterprise credit in 3 years" have the largest weights and play an important role in the default prediction of small enterprises. The credit scoring model is constructed according to the Wilks' Lambda method, which has the maximum default identification ability.

The rest of the paper is structured as follows: Section 2 is the review of the literature. Section 3 subsequently describes the model of weight indicator. Section 4 constructs the standard to choose the optimal weighting methods. Section 5 is the empirical study, and the final section concludes the study.

#### 2. Review of the Literature

In the existing research, artificial intelligence methods such as neural networks and SVM, and statistical methods such as logit regression, are used to build credit scoring models. Chai et al. established a credit scoring system by using both partial correlation analysis and probit regression [1]. Bai et al. used fuzzy rough-set theory and fuzzy C-means clustering to evaluate farmer credit levels [2]. Tong et al. introduced mixture cure models to the area of credit scoring [3]. Harris used SVM to assess credit risk [4]. Tanoue et al. forecasted default with a multistage model [5]. Chi et al. established a credit risk rating system by logit regression [6]. Shi et al. developed an approach combining Pearson correlation analysis with F-test significance discrimination for credit risk [7]. Shi et al. proposed a credit rating model that considers the impact of LGD [8]. Mizen and Tsoukas forecasted default ratings with an ordered probit model [9]. Danenas and Garsva [10] and Hilscher and Wilson [11] constructed linear credit scoring equations. Hasumi and Hirata studied the Japanese credit scoring market using data on 2,000 small- and medium-sized enterprises and a small-business credit scoring (SBCS) model [12]. Min and Lee proposed a DEA model for credit scoring [13].

For all the credit scoring models, the key point is to determine the weight of indicators. The existing weighting methods can be divided into three categories: subjective weight, objective weight, and combined weight [14].

Subjective weight was decided by the experts according to their experience, knowledge, and personal preferences. For example, the method of analytical hierarchy process (AHP) was used to weight the evaluation indicator [15–17]. Vidal used the Delphi method to determine the subjective weight of evaluation indicators [18].

Objective weight was decided by the data which belong to objective information. The objective weight methods include entropy method [19], standard deviation method [20], variation coefficient method [21], and goal programming method [22, 23]. Chen et al. used an entropy weight method to weight industries when analyzing the systemic risk of different industries and thereby established a credit evaluation model [24].

Existing research determines indicator weights based on the dispersion of the data and does not take default identification ability into account. In fact, for credit risk evaluation, the standard "the greater the default identification ability, the greater the indicator weight" should be satisfied.

Optimal weighting also belongs to objective weighting; the weight is obtained through goal programming. Its special feature is that the weight is derived from the evaluation results, so that default customers obtain lower evaluation scores and nondefault customers obtain higher scores. The disadvantage is that the optimal weight only ensures that the evaluation result is optimal; the size of the weight does not reflect the importance of the indicator, whereas objective and subjective weights do.

Combined weight combines the subjective weight and objective weight, which can ensure that the result not only relies on subjective experience of experts but also reflects the objective information of data [25, 26]. Ono used the Propensity Score Match method to weight an indicator and established a credit scoring model for Japan’s small businesses [27]. Huang used the second order of the least square method and the GMM-SYS method to weight an indicator, thereby examining the relationship between trade credit and bank credit [28].

In fact, the combined weight is not always reasonable, because it combines different weighting methods, especially objective and subjective weights; combining a good method with a bad method may leave the final result no better. On the issue of credit risk evaluation, the combined weight may merge a weighting method with large default identification ability and one with little, leading to a result that lacks default identification ability.

In credit risk evaluation research, the weighting method is often chosen subjectively and arbitrarily, without a standard. This study accomplishes two tasks: one is calculating indicator weights based on default identification ability, and the other is determining the optimal weighting method among five different weighting methods according to the maximum default identification ability.

#### 3. Weighting Methods

##### 3.1. Standardization of Rating Indicator Data

The standardization of indicator data transforms the original indicator data into standardized values between 0 and 1 in order to eliminate the impact of indicator dimensions. There are four types of indicators: positive indicators, negative indicators, interval indicators, and qualitative indicators. The standardization process is as follows.

Let *x*_{ij} be the standardized value of the *j*^{th} customer in the *i*^{th} indicator, be the original value of the *j*^{th} customer in the *i*^{th} indicator, *n* be the number of customers, *q*_{1} be the lower boundary of the optimal region, and *q*_{2} be the upper boundary. The positive indicator, negative indicator, and interval indicator can be standardized as follows:

The standardization of qualitative indicators is through expert interview, survey, etc. It is given in Table 1.
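As a concrete illustration, the quantitative standardization rules can be sketched in Python. The paper's formulas (1)-(3) are not reproduced in this excerpt, so the interval-indicator rule below is one common textbook formulation and should be treated as an assumption; the function names are illustrative, not from the paper.

```python
# Sketch of the three quantitative standardization rules (assumed forms
# of equations (1)-(3)); all outputs fall in [0, 1].

def std_positive(x, x_min, x_max):
    # Positive indicator: larger raw values are better.
    return (x - x_min) / (x_max - x_min)

def std_negative(x, x_min, x_max):
    # Negative indicator: smaller raw values are better.
    return (x_max - x) / (x_max - x_min)

def std_interval(x, x_min, x_max, q1, q2):
    # Interval indicator: values inside the optimal region [q1, q2]
    # score 1; values outside are penalized by their distance from
    # the region (an assumed, common formulation).
    if q1 <= x <= q2:
        return 1.0
    gap = q1 - x if x < q1 else x - q2
    return 1.0 - gap / max(q1 - x_min, x_max - q2)
```

For example, with the consumer price indicator's optimal region [101, 105] used later in the paper, any raw value inside that region standardizes to 1.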

##### 3.2. Subjective Weighting Method Based on the G1 Method

The subjective weight of an evaluation indicator can be obtained based on the experts' experience. The G1 method reflects the importance of indicators through the order that experts give. Once the order is given, the relative importance of any two adjacent indicators can be obtained, and this is the parameter used to calculate the weights. The steps for calculating the weights are as follows.

Step 1: determine the importance order of indicators by experts. The most important indicator is placed first, and the least important indicator is placed last.

Step 2: determine the value of the ratio *r*_{i} between two adjacent indicators *x*_{i-1} and *x*_{i}; the values of the ratio are shown in Table 2.

Step 3: calculate the weight of the last indicator by formula (4). The superscript "1" denotes the first weighting method.

Step 4: calculate the weights of the other indicators. On the basis of formula (4), the other indicators' weights are calculated by formula (5).

Through formulas (4) and (5), we can obtain the weight of every indicator. The results satisfy the rule that the higher an indicator's ranking, the more important it is and the larger its weight.
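The G1 computation can be sketched as follows. The function name is illustrative, and the ratio convention (`ratios[i]` is the importance ratio of the indicator ranked *i*+1 to the one ranked *i*+2, matching the role of *r*_{i} in formulas (4) and (5)) is an assumption based on the worked example in Section 5.2.1.

```python
def g1_weights(ratios):
    """Subjective G1 weights for m indicators ranked from most to
    least important, given the m-1 adjacent importance ratios."""
    m = len(ratios) + 1
    # Weight of the least important indicator (assumed form of eq (4)):
    # w_m = 1 / (1 + sum over k of the product r_k * ... * r_{m-1}).
    total = 1.0
    for k in range(len(ratios)):
        prod = 1.0
        for r in ratios[k:]:
            prod *= r
        total += prod
    w = [0.0] * m
    w[-1] = 1.0 / total
    # Back out the remaining weights (assumed form of eq (5)):
    # w_{i-1} = r_i * w_i, working from the last indicator upward.
    for i in range(m - 2, -1, -1):
        w[i] = ratios[i] * w[i + 1]
    return w
```

By construction the weights sum to one and are nonincreasing in rank whenever every ratio is at least one, matching the property stated after formulas (4) and (5).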

##### 3.3. Objective Weighting Method Based on Information Content

Methods such as the entropy weight method and the mean square deviation method can measure information content. The more discrete the data of an indicator, the more information the indicator reflects and the greater its weight; this ensures that indicators with more information content receive greater weights.

###### 3.3.1. Entropy Weight Method

Let *x*_{ij} be the standardized value of the *j*^{th} customer in the *i*^{th} indicator, be the average of the *i*^{th} indicator, *s*_{i} be the standard deviation of the *i*^{th} indicator, *n* be the number of customers, *m* be the number of indicators, *e*_{i} be the entropy of the *i*^{th} indicator, be the weight of the *i*^{th} indicator, and the superscript "2" denote the second weighting method. The formula is as follows:

The value of entropy *e*_{i} denotes the information content, (1 − *e*_{i}) denotes the difference coefficient, and the larger the difference coefficient, the larger the information content of the *i*^{th} indicator, so the larger the weight.
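A minimal sketch of this method in Python, assuming the standard entropy-weight formulation (customer proportions per indicator, entropy normalized by ln *n*, weights proportional to the difference coefficients 1 − *e*_{i}) matches equations (6) and (7); function names are illustrative.

```python
import math

def indicator_entropy(row):
    # Entropy of one indicator over n customers (assumed form of eq (6)):
    # p_j = x_j / sum(x), e = -(1 / ln n) * sum(p_j * ln p_j).
    n = len(row)
    total = sum(row)
    e = 0.0
    for x in row:
        p = x / total
        if p > 0:  # terms with p = 0 contribute nothing
            e -= p * math.log(p)
    return e / math.log(n)

def entropy_weights(X):
    # Weight each indicator (row of X) by its difference coefficient
    # 1 - e_i, normalized over all indicators (assumed form of eq (7)).
    diffs = [1.0 - indicator_entropy(row) for row in X]
    s = sum(diffs)
    return [d / s for d in diffs]
```

As the text states, a more dispersed indicator has a smaller entropy, a larger difference coefficient, and therefore a larger weight.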

###### 3.3.2. Mean Square Deviation Method

Let *x*_{ij} be the standardized value of the *j*^{th} customer in the *i*^{th} indicator, *n* be the number of customers, *m* be the number of indicators, *s*_{i} be the mean square deviation of the *i*^{th} indicator, be the weight of the *i*^{th} indicator, and the superscript “3” denote the third weighting method. The formula is as follows:

The value of the mean square deviation *s*_{i} reflects the dispersion of the data: the more discrete the data, the more the information content of the indicator and the larger the weight.
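This method can be sketched in the same style, assuming equations (8) and (9) take the standard form (each indicator's weight is its standard deviation normalized over all indicators); the function name is illustrative.

```python
import math

def msd_weights(X):
    """Mean-square-deviation weights for an m x n matrix X
    (rows = indicators, columns = customers); assumed form of
    equations (8)-(9)."""
    sds = []
    for row in X:
        n = len(row)
        mean = sum(row) / n
        # population mean square deviation of the indicator
        sds.append(math.sqrt(sum((x - mean) ** 2 for x in row) / n))
    total = sum(sds)
    # normalize so the weights sum to one
    return [s / total for s in sds]
```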

##### 3.4. Objective Weighting Method Based on Default Identification Ability

A reasonable credit risk evaluation system must have strong default identification ability, which can effectively distinguish between default and nondefault customers. Determining the weights of the credit evaluation indicators reasonably is the key to the quality of the credit evaluation system. If the choice of weighting method is not appropriate, it will directly affect the evaluation result: enterprises with poor credit will be evaluated as good businesses, and this will mislead the decision-making of financial institutions.

Therefore, this paper puts forward the idea of assigning weights to indicators according to the standard of default identification ability. Indicators with a stronger capability of distinguishing the default state should be given greater weights. We construct statistics related to the default state, such as the *F*-statistic and Wilks' Lambda *χ*^{2}-statistic, which measure default identification capability. We can also measure default identification capability through default judgment tools such as the ROC curve and the Gini coefficient.

###### 3.4.1. Objective Weighting Method Based on Wilks’ Lambda Method

The steps to calculate the weight based on Wilks' Lambda method are as follows.

Step 1: evaluate the sum of squares within groups *SS*_{wi} for the *i*^{th} indicator. According to the customer's actual default state, the *i*^{th} indicator is divided into two groups, the default group (denoted as 1) and the nondefault group (denoted as 0). Let *m* be the number of customers, *m*_{0} be the number of nondefault customers, *m*_{1} be the number of default customers, be the standardized value of the *j*^{th} nondefault customer in the *i*^{th} indicator, be the average of nondefault customers in the *i*^{th} indicator, be the standardized value of the *j*^{th} default customer in the *i*^{th} indicator, be the average of default customers in the *i*^{th} indicator, and be the average of the *i*^{th} indicator. The sum of squares within groups *SS*_{wi} for the *i*^{th} indicator is given by equation (10), which sums the squared deviations of the nondefault customers' values from their mean and of the default customers' values from their mean for the *i*^{th} indicator. The smaller the sum of squares within groups *SS*_{wi}, the smaller the value differences within the default and nondefault groups.

Step 2: evaluate the sum of squares between groups *SS*_{bi} for the *i*^{th} indicator, given by equation (11), which sums the squared deviations of the default and nondefault group averages from the mean of all customers for the *i*^{th} indicator. The larger the sum of squares between groups *SS*_{bi}, the larger the value differences between the default and nondefault groups.

Step 3: evaluate the eigenvalue *γ*_{i} for the *i*^{th} indicator, taking the maximum value of the discriminant criterion in discriminant analysis, named the eigenvalue *γ*_{i}, into the indicator weighting; that is, equation (12).

Step 4: evaluate Wilks' Lambda value Λ_{i} for the *i*^{th} indicator by equation (13).

Step 5: evaluate the *χ*^{2} statistic for the *i*^{th} indicator. Let *m* be the number of customers and *G* be the number of groups; in this study, there are two groups, the default group and the nondefault group, so *G* = 2. Let *J* be the number of variables; because we calculate the statistic of one indicator at a time, *J* = 1. The statistic is given by formula (14). The meaning of equations (12) to (14) is as follows: for the *i*^{th} indicator, the smaller the sum of squares within groups *SS*_{wi} and the larger the sum of squares between groups *SS*_{bi}, the larger the eigenvalue *γ*_{i} and the smaller Wilks' Lambda value Λ_{i}, so the larger the *χ*^{2} statistic, which means the stronger the indicator's ability to distinguish the default state.

Step 6: evaluate the weight of the *i*^{th} indicator. Normalization is applied to the value of the statistic calculated by formula (14) to obtain the weight of the *i*^{th} indicator; the superscript "4" denotes the fourth weighting method, named Wilks' Lambda method. The meaning of formula (15) is that the larger the statistic, the stronger the indicator's ability to distinguish the default state and the larger the weight of the *i*^{th} indicator.

The meaning of the weighting method based on Wilks' Lambda method: for the *i*^{th} indicator, the smaller the sum of squares within groups *SS*_{wi} and the larger the sum of squares between groups *SS*_{bi}, the larger the value differences between the default and nondefault groups and the larger the *χ*^{2} statistic, which means the stronger the indicator's ability to distinguish the default state. Accordingly, the weight of the *i*^{th} indicator is larger. This method makes the weight reflect the ability to identify the default state, which makes up for the disadvantage that existing indicator weights have nothing to do with the ability to identify the default state.
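The six steps above can be sketched in Python as follows. Since equations (12)-(14) are not reproduced in this excerpt, the closed forms γ_i = SS_bi / SS_wi, Λ_i = 1 / (1 + γ_i), and Bartlett's chi-square approximation χ² = -(m - 1 - (J + G)/2) · ln Λ are assumed from standard discriminant analysis; function names are illustrative.

```python
import math

def wilks_chi2(x, y):
    """Chi-square statistic from Wilks' Lambda for one indicator.
    x: standardized values; y: actual default states (1 = default,
    0 = nondefault). Assumes two groups (G = 2) and one variable
    (J = 1), as in steps 1-5 of the paper."""
    m = len(x)
    g0 = [v for v, d in zip(x, y) if d == 0]   # nondefault group
    g1 = [v for v, d in zip(x, y) if d == 1]   # default group
    mean0, mean1 = sum(g0) / len(g0), sum(g1) / len(g1)
    mean = sum(x) / m
    # Within-group and between-group sums of squares (eqs (10)-(11)).
    ss_w = (sum((v - mean0) ** 2 for v in g0)
            + sum((v - mean1) ** 2 for v in g1))
    ss_b = (len(g0) * (mean0 - mean) ** 2
            + len(g1) * (mean1 - mean) ** 2)
    gamma = ss_b / ss_w                 # eigenvalue (assumed eq (12))
    lam = 1.0 / (1.0 + gamma)           # Wilks' Lambda (assumed eq (13))
    # Bartlett approximation with J = 1, G = 2 (assumed eq (14)).
    return -(m - 1 - (1 + 2) / 2) * math.log(lam)

def wilks_weights(X, y):
    # Normalize the chi-square statistics into weights (eq (15)).
    stats = [wilks_chi2(row, y) for row in X]
    total = sum(stats)
    return [s / total for s in stats]
```

An indicator whose values barely differ between default and nondefault groups contributes a near-zero statistic and so receives a near-zero weight.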

###### 3.4.2. Objective Weighting Method Based on ROC Curve

The AUC value calculated from the ROC curve reflects default identification ability. The greater the AUC value, the better the indicator distinguishes default customers from nondefault customers and the higher the default identification accuracy. This means the indicator has stronger default identification ability, so the indicator weight should be greater.

The steps to calculate the weight based on the ROC curve method are as follows.

Step 1: build the logistic regression equation. Let *P*(*y* = 1) denote the default probability of the *j*^{th} customer; *z*_{j} denote the latent variable; *x*_{ij} denote the standardized score of the *i*^{th} indicator for the *j*^{th} customer; *n* denote the number of customers; *m* denote the number of indicators; *α* denote the constant; *β*_{i} denote the regression coefficient of the *i*^{th} indicator; and *ε* denote the random error term. The logistic regression model is given by equation (16). The regression coefficient *β* and its standard error *SE*_{β} can be obtained by maximum likelihood estimation, and this process can be carried out in SPSS software.

Step 2: predict the default probability. Substituting the customers' data into formulas (16) and (17), the default probability *P*(*y* = 1) can be predicted.

Step 3: classify the model identification results. Compare the calculated default probability *P*(*y* = 1) with the real default state of the customers: if *P*(*y* = 1) ≥ 0.5, the customer is discriminated as default; if *P*(*y* = 1) < 0.5, the customer is discriminated as nondefault. The classification result obtained by comparing predicted and real default states is shown in Table 3.

Step 4: construct the ROC curve. According to the classification results in Table 3, two variables are defined as the horizontal and vertical coordinates of the ROC curve. The vertical coordinate, also known as the true positive rate (TPR), is the ratio of correctly predicted default customers TP to all default customers (TP + FN), as expressed in formula (18). The horizontal coordinate, also known as the false positive rate (FPR), is the ratio of nondefault customers wrongly predicted as default FP to all nondefault customers (FP + TN), as expressed in formula (19).

Step 5: calculate the AUC value. The area under the ROC curve is the AUC value, which lies between 0 and 1. The greater the AUC value of an indicator, the stronger the indicator's default identification ability. If AUC = 1, the predicted results are entirely consistent with the actual states, which is the most ideal situation.

Step 6: evaluate the weight of the *i*^{th} indicator. Normalization is applied to the AUC values to obtain the weight of the *i*^{th} indicator; the superscript "5" denotes the fifth weighting method, named the ROC curve method.

The meaning of the weighting method based on the ROC curve: for the *i*^{th} indicator, the ROC curve is constructed from the proportion of correctly judged default customers TP among all default customers (TP + FN) and the proportion of correctly judged nondefault customers TN among all nondefault customers (FP + TN). The larger the area under the ROC curve, the stronger the default identification ability and the larger the weight of the indicator. This makes the weight reflect the ability to identify the default state and makes up for the disadvantage that existing indicator weights have nothing to do with the ability to identify the default state.
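A minimal sketch of the AUC-based weighting follows. One simplification is assumed and worth flagging: the paper fits a univariate logistic regression first, but because the logistic function is monotone, the AUC of the fitted probability equals the AUC of the indicator itself when the fitted coefficient is positive, so this sketch computes the AUC directly from the ranking of the indicator values (the pairwise-comparison form of the AUC); function names are illustrative.

```python
def auc(score, y):
    """AUC by pairwise comparison: the probability that a randomly
    chosen default customer (y = 1) scores higher than a randomly
    chosen nondefault one, with ties counted as 0.5."""
    pos = [s for s, d in zip(score, y) if d == 1]
    neg = [s for s, d in zip(score, y) if d == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def auc_weights(X, y):
    # Normalize the per-indicator AUC values into weights
    # (assumed form of eq (20)).
    aucs = [auc(row, y) for row in X]
    total = sum(aucs)
    return [a / total for a in aucs]
```

An indicator whose values rank default and nondefault customers no better than chance has an AUC near 0.5 and receives a correspondingly smaller weight.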

#### 4. Selection of the Optimal Weighting Model

How do we confirm the optimal weighting method for credit risk evaluation among many weighting methods? The standard for selecting the optimal weighting method is that the credit score has default identification ability; in other words, the credit scores of nondefaulting customers are relatively high and the credit scores of defaulting customers are relatively low.

(1) Calculate the credit evaluation score *z*_{j}. We obtain the indicator weights *ω*^{t} in Section 3, where the superscript "*t*" denotes the *t*^{th} weighting method. From the weights and the standardized values of the indicators, we can obtain each customer's credit score by the linear weighted method. Let *z*_{j} denote the credit score of the *j*^{th} customer; *x*_{ij} denote the standardized score of the *i*^{th} indicator for the *j*^{th} customer; *n* denote the number of customers; and *m* denote the number of indicators; the evaluation function is formula (21). Formula (21) assumes a linear relationship between the indicators and the credit score. Nonlinear evaluation models give similar results when choosing the weighting method; for example, the Logit, Tobit, and Probit models are nonlinear, and their rankings of small business credit scores are the same as those from formula (21). Therefore, this paper chooses the linear model to find the optimal weighting method; Wilks' Lambda weighting method remains the best, and the evaluation result remains the best, under nonlinear evaluation models.

(2) Determine the positive and negative ideal points. The positive ideal point is the hypothetical best evaluation result; in credit risk evaluation, it means all nondefault customers have the best value and all default customers have the worst value. Conversely, the negative ideal point is the worst evaluation result: nondefault customers have the worst values and default customers have the best values. For the linear weighted evaluation, the sum of the indicator weights is always equal to one and the customers' data lie in the interval from zero to one after standardization, so the credit score also lies in the interval from zero to one. The evaluation score vector **Z** for *n* customers satisfies equation (22), where *n*_{0} denotes the number of nondefault customers; *n*_{1} denotes the number of default customers; and superscript "(0)" denotes nondefault customers and "(1)" denotes default customers. The positive ideal point **Z**^{+} and the negative ideal point **Z**^{−} satisfy equation (23); **Z**^{+} and **Z**^{−} have the same structure as **Z**.

(3) Calculate the Euclidean distance. Let *D*^{+} denote the distance between the credit scores and the positive ideal point, *D*^{−} denote the distance between the credit scores and the negative ideal point, *z*_{j} denote the credit score of the *j*^{th} customer, the positive and negative ideal values of the *j*^{th} customer be as in equation (23), and *n* denote the number of customers. Then, formula (24) represents the closeness of the customers' evaluation values to the positive ideal values, and formula (25) represents the closeness of the customers' evaluation values to the negative ideal values.

(4) Calculate the neartude *C*_{t}. With *D*^{+} and *D*^{−} as above, the neartude based on the *t*^{th} weighting method is given by formula (26). The neartude satisfies 0 ≤ *C*_{t} ≤ 1. If the evaluation values equal the positive ideal values, the default customers' credit scores take the worst value 0 and the nondefault customers' credit scores take the best value 1, so the neartude *C*_{t} = 1. Similarly, if the evaluation values equal the negative ideal values, the default customers' credit scores take the best value 1 and the nondefault customers' credit scores take the worst value 0, so the neartude *C*_{t} = 0. The larger the neartude *C*_{t}, the closer the final credit scores are to the positive ideal values and the farther they are from the negative ideal values; that is, the larger the neartude, the better the evaluation result distinguishes default and nondefault customers.

(5) Select the optimal weighting method. According to the analysis of formula (26), the larger the neartude *C*_{t}, the better the evaluation result distinguishes default and nondefault customers, which means the credit score has greater default identification ability; because the evaluation score is a function of the weights, the corresponding weighting method is optimal. In short, the greater the neartude value, the better the weighting method. The meaning of selecting the optimal weighting method is as follows: through the distances of nondefault customers' scores to the positive ideal point and of default customers' scores to the negative ideal point, we construct the neartude, which reflects default identification ability. The greater the neartude value, the more easily the weighting method distinguishes between default and nondefault customers, so we can select the optimal method among different weighting methods. This overcomes the disadvantage of existing research that nondefault and default customers' scores overlap extensively, and it avoids the deficiency of choosing weighting methods arbitrarily without considering the purpose of evaluation.
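The selection criterion above can be sketched in a few lines, assuming formulas (22)-(26) take the TOPSIS-style form the prose describes (Euclidean distances to ideal score vectors, closeness *C*_{t} = *D*^{−} / (*D*^{+} + *D*^{−})); the function name is illustrative.

```python
import math

def neartude(scores, y):
    """Closeness of a credit-score vector to the ideal points.
    scores: linear-weighted credit scores in [0, 1];
    y: actual default states (1 = default, 0 = nondefault).
    The positive ideal gives every nondefault customer a score
    of 1 and every default customer 0; the negative ideal is
    the reverse (assumed form of eqs (22)-(23))."""
    z_pos = [0.0 if d == 1 else 1.0 for d in y]
    z_neg = [1.0 if d == 1 else 0.0 for d in y]
    # Euclidean distances to the two ideal points (eqs (24)-(25)).
    d_pos = math.sqrt(sum((z - p) ** 2 for z, p in zip(scores, z_pos)))
    d_neg = math.sqrt(sum((z - q) ** 2 for z, q in zip(scores, z_neg)))
    # Closeness C_t (assumed form of eq (26)): 1 at the positive
    # ideal, 0 at the negative ideal.
    return d_neg / (d_pos + d_neg)
```

To select the optimal weighting method, one would compute `neartude` for the score vector produced by each of the five weighting methods and keep the method with the largest value.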

#### 5. Empirical Study

##### 5.1. Credit Risk Evaluation Indicator System and the Indicator Data

###### 5.1.1. Credit Risk Evaluation Indicator System

We obtain the credit risk evaluation indicator system, which includes sixteen indicators, through a logistic regression model. Because establishing the indicator system is not the main content of this paper (our research concerns how to choose an optimal weighting method for credit risk evaluation), we use the indicator system directly. The indicator system of sixteen indicators is shown in Table 4. The sixteen indicators are explained in [6], so no further explanation is given in this paper.

The indicator system is shown in column (a) in Table 4.

###### 5.1.2. Data Obtained

There are two types of data in this paper. The first is the data of 3,045 small enterprise loans in 28 cities from a regional commercial bank in China over the past 20 years, comprising 2,995 nondefault small enterprises and 50 default small enterprises. The second is rankings of the indicators by significance obtained from 43 experts at the head office of a regional commercial bank.

The data for the 16 indicators are annotated in the order *X*_{1}, *X*_{2}, …, *X*_{16}, as shown in column (a) in Table 4. Table 4 consists of two parts: the first part is the original data, shown in columns 1–3045; the second part is the standardized data, shown in columns 3046–6090, recorded as the matrix (*x*_{ij}). The process of standardization is shown in (3).

The ranking data of 43 experts on the indicators are shown in rows 1–16 in Table 5. We will convert the rankings to values in Section 5.2.

###### 5.1.3. Standardization of Indices Data

For the original data matrix in rows 1–16 and columns 1–3045 in Table 4, each entry represents the original datum of the *i*^{th} indicator for the *j*^{th} customer. From the 3,045 data in each row, we can find the maximum and minimum values that are needed in formulas (1)–(3).

The original data can be standardized, and the standardization data are shown in columns 3046–6090 in Table 4.

There is a need to point out that there was one interval-type indicator in the 16 indices, which is the consumer price indicator. The best range of consumer price indicator is [101, 105]. Taking the original data into formula (3), the standardized data *x*_{ij} can be obtained.

According to the standardization method for qualitative indices in Section 3.1, the index values are transformed into the [0, 1] range.

##### 5.2. Calculation of Five Types of Indicator Weights

###### 5.2.1. Subjective Weight Based on the G1 Method

The importance order of indicators is determined by experts. The most important indicator “*X*_{10} working time in relevant industry” is in the first row in Table 6, and the least important indicator “*X*_{8} controlled income of each urban resident (yuan)” is in the last row in Table 6.

The value of the ratio *r*_{i} between two adjacent indices *x*_{i-1} and *x*_{i} was determined according to the rules in Table 2 by experts, and the results of the ratio are shown in column 2 in Table 6.

The weight of the least important indicator “*X*_{8} controlled income of each urban resident” is determined as follows: put the data *r*_{i} in rows 2–16 and column 2 in Table 6 into formula (4); the subjective weight is = [1 + (1 × 1.8 × 1.4 × … × 1.1 × 1) + … + (1.1 × 1) + 1]^{−1} = 0.007.

The result is shown in column 3 and row 16 in Table 6.

On the basis of , the weight of other indicators was calculated according to formula (5). For example, the indicator in row 15 in Table 6 “*X*_{7} consumer price indicator” shows = *r*_{16} × = 1 × 0.007 = 0.007. And so on, the weight of other indicators can be reverse-calculated; the results are shown in Table 6.

###### 5.2.2. Objective Weight Based on the Entropy Weight Method

Taking the indicator "*X*_{1} net cash flow ratio from current liabilities operating activities" as an example and putting the standardized data in row 2 and columns 3046–6090 in Table 4 into formula (6), we get the entropy value of indicator *X*_{1}: *e*_{1} = 0.996; the result is shown in column 2 in Table 7.

Similarly, we can calculate the entropy of other indicators; the results are shown in column 2 in Table 7.

Putting the entropy values in column 2 in Table 7 into formula (7), the weights of indicators were obtained which are shown in column 3 in Table 7.

###### 5.2.3. Objective Weight Based on the Mean Square Deviation Method

Taking the indicator "*X*_{1} net cash flow ratio from current liabilities operating activities" as an example and putting the standardized data in row 2 and columns 3046–6090 in Table 4 into formula (8), we get the mean square deviation value of indicator *X*_{1}: *s*_{1} = 0.104; the result is shown in column 4 in Table 7.

Similarly, we can calculate the mean square deviation value of other indicators; the results are shown in column 4 in Table 7.

Putting the mean square deviation values in column 4 in Table 7 into formula (9), the weights of indicators were obtained which are shown in column 5 in Table 7.

The first two weighting methods are based on information content, and the latter two weighting methods are based on default identification ability.

###### 5.2.4. Objective Weight Based on Wilks’ Lambda Method

Taking the indicator "*X*_{1} net cash flow ratio from current liabilities operating activities" as an example and putting the standardized data in row 2 and columns 3046–6090 in Table 4 into formulas (10)–(14), we get the *χ*^{2} statistic value of indicator *X*_{1}: 12.712; the result is shown in column 6 in Table 7.

Similarly, we can calculate the *χ*^{2} statistics value of other indicators; the results are shown in column 6 in Table 7.

Putting the *χ*^{2} statistics values in column 6 in Table 7 into formula (15), the weights of indicators were obtained, which are shown in column 7 in Table 7.
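Formulas (10)–(15) are not reproduced here, so the following is a sketch under an explicit assumption: the statistic is the standard univariate Wilks' Lambda for two groups (defaulting vs. nondefaulting) with Bartlett's chi-square approximation, and the weight is the normalized chi-square value. The function names are hypothetical:

```python
import numpy as np

def wilks_lambda_chi2(x, y):
    """Univariate Wilks' Lambda for one indicator x and binary default
    label y (1 = default, 0 = nondefault), with Bartlett's chi-square
    approximation. A sketch only; the paper's formulas (10)-(14) may use
    different constants.
    """
    m = len(x)
    groups = [x[y == k] for k in (0, 1)]
    ss_total = ((x - x.mean()) ** 2).sum()
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    lam = ss_within / ss_total  # Wilks' Lambda: small when groups separate
    # Bartlett approximation with p = 1 variable and g = 2 groups
    chi2 = -(m - 1 - (1 + 2) / 2) * np.log(lam)
    return lam, chi2

def chi2_weights(chi2_values):
    """One plausible reading of formula (15): normalize the chi-square
    statistics into weights summing to 1."""
    c = np.asarray(chi2_values, dtype=float)
    return c / c.sum()
```

An indicator whose values differ sharply between defaulting and nondefaulting customers yields a small Λ, hence a large *χ*^{2} and a large weight.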

###### 5.2.5. Objective Weight Based on the ROC Curve Method

Taking the indicator “*X*_{1} net cash flow ratio from current liabilities operating activities,” for example, and putting the standardized data in row 2 and columns 3046–3090 in Table 4 into formulas (16) and (17), we get the logistic regression model of indicator *X*_{1}, from which we can calculate the default probability *P*_{j}(*y* = 1) of the *j*^{th} customer. The calculated default probability is then compared with the threshold 0.5: if *P*_{j}(*y* = 1) ≥ 0.5, the customer is classified as defaulting; otherwise, as nondefaulting.

According to the classification result in Table 3, we can obtain the ROC curve based on formulas (18) and (19). Computing the area under the ROC curve gives AUC_{1} = 0.608; the result is shown in column 8 in Table 7. Similarly, we can calculate the AUC values of the other indicators; the results are shown in column 8 in Table 7. Putting the AUC values in column 8 in Table 7 into formula (20), the weights of indicators were obtained, which are shown in column 9 in Table 7.
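The AUC of a single indicator can be computed without tracing the full ROC curve by the rank (Mann-Whitney) identity. This is a sketch under assumptions: ties in scores are not handled, and `auc_weights` is one plausible reading of formula (20) (plain normalization), which the paper may define differently:

```python
import numpy as np

def auc_score(scores, y):
    """AUC of one indicator's default scores against binary labels
    (y = 1 for default), via the rank formulation equivalent to the
    area under the ROC curve of formulas (18)-(19). Assumes no tied scores.
    """
    ranks = np.empty(len(scores))
    ranks[np.argsort(scores)] = np.arange(1, len(scores) + 1)
    n_pos = int(y.sum())
    n_neg = len(y) - n_pos
    # Mann-Whitney identity: AUC from the rank sum of defaulting samples
    return (ranks[y == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def auc_weights(aucs):
    """Hypothetical reading of formula (20): normalize AUC values so the
    weights sum to 1."""
    a = np.asarray(aucs, dtype=float)
    return a / a.sum()
```

A perfectly separating indicator attains AUC = 1, while an uninformative one stays near 0.5.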

In order to show the difference among the 5 types of weighting methods, the weights of indicators are drawn in Figure 1.

##### 5.3. Selection of the Optimal Weighting Method

Each of the five weighting methods yields a set of indicator weights. To select the optimal weighting method, we calculate the neartude of the evaluation result produced by each method.

Taking the G1 weighting method for example and putting the G1 weights in column 3 in Table 6 into formula (21), we can obtain the credit score of every customer; the evaluation results are represented by the vector *Z*^{1}:

Putting the results *Z*^{1}, the positive ideal point *Z*^{+} = {1, …, 1, 0, …, 0}, and the number of customers *m* = 3045 into formula (24), the distance between the credit scores and the positive ideal point is obtained as *D*^{+} = 21.841. Similarly, we can get the distance between the evaluation result *Z*^{1} and the negative ideal point *Z*^{−} = {0, …, 0, 1, …, 1}, which is *D*^{−} = 35.221.

Putting the two distances into formula (26), the neartude of the G1 weighting method is *C*_{1} = *D*^{−}/(*D*^{+} + *D*^{−}) = 35.221/(21.841 + 35.221) = 0.617.

Similarly, we can calculate the neartudes of the other four weighting methods: entropy weighting method, *C*_{2} = 0.486; mean square deviation weighting method, *C*_{3} = 0.576; Wilks’ Lambda weighting method, *C*_{4} = 0.703; and ROC curve weighting method, *C*_{5} = 0.580.
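The neartude of any weighting method's score vector can be sketched as follows, assuming formula (24) is the Euclidean distance and formula (26) the standard relative closeness *C* = *D*^{−}/(*D*^{+} + *D*^{−}); here `y` is 1 for defaulting customers and 0 otherwise, and the function name is an assumption:

```python
import numpy as np

def neartude(z, y):
    """Neartude of an evaluation result z (credit scores in [0, 1]).

    The positive ideal point scores nondefaulters 1 and defaulters 0;
    the negative ideal point is the reverse. The closeness
    C = D- / (D+ + D-) is larger when scores sit nearer the positive
    ideal point, i.e., when default identification is better.
    """
    z = np.asarray(z, dtype=float)
    y = np.asarray(y, dtype=float)
    z_pos = 1.0 - y                            # positive ideal point
    z_neg = y                                  # negative ideal point
    d_pos = np.sqrt(((z - z_pos) ** 2).sum())  # distance D+ (formula (24))
    d_neg = np.sqrt(((z - z_neg) ** 2).sum())  # distance D-
    return d_neg / (d_pos + d_neg)             # formula (26)
```

With the distances reported above for the G1 method (*D*^{+} = 21.841, *D*^{−} = 35.221), this closeness form gives 35.221/57.062 ≈ 0.617.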

The neartude of five types of weighting methods is shown in Figure 2.

From the above, we know that the greater the neartude *C*, the better the evaluation result distinguishes between defaulting and nondefaulting customers, and the better the corresponding weighting method.

Among the five types of weighting methods, the neartude of the Wilks’ Lambda weighting method is the highest, at *C*_{4} = 0.703, so this weighting method is the most suitable for credit risk evaluation.

##### 5.4. Analysis of the Wilks’ Lambda Weight of Credit Evaluation Indicators

The optimal weighting result based on Wilks’ Lambda method is shown in Table 8. Adding the weights of the financial indicators in rows 1–6 in Table 8 gives a sum of 0.113, and adding the weights of the nonfinancial indicators in rows 7–16 gives a sum of 0.887. Thus, we can conclude that nonfinancial indicators are more important than financial indicators in credit risk evaluation for small businesses.

Adding the weights of the macroenvironment indicators in rows 7–9 in Table 8, the sum is 0.571. This shows that the macroeconomic factors are especially important in the credit risk evaluation of small business.

For small businesses, this result is intuitive: small businesses are more vulnerable to changes in external macroconditions because of their high risk, small scale, and so on.

##### 5.5. Credit Scoring Model

We obtained the optimal weighting method above. Using these weights and the standardized values of the indicators, we can compute the customers’ credit scores by the linear weighting method.

Let *z*_{j} denote the credit score of the *j*^{th} customer; the credit score is given as *z*_{j} = ∑_{i = 1}^{16}*w*_{i}*x*_{ij}, where *w*_{i} is the Wilks’ Lambda weight of the *i*^{th} indicator and *x*_{ij} is its standardized value for the *j*^{th} customer.

That is, *z*_{j} = 0.017 × net cash flow ratio from current liabilities operating activities + 0.025 × super-quick ratio + 0.024 × total outstanding loans to total assets ratio + 0.035 × net cash flow from operating activities + 0.006 × working capital allocation ratio + 0.007 × retained earnings growth rate + 0.237 × consumer price indicator + 0.147 × controlled income of each urban resident + 0.187 × Engel coefficient + 0.057 × working time in relevant industry + 0.006 × account opening status + 0.046 × product sales range + 0.027 × dwelling condition + 0.038 × working time holding the position + 0.108 × enterprise credit in 3 years + 0.034 × score of pledged collateral.
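The scoring expression above is a plain dot product and can be written out directly. The weight vector below is transcribed from the coefficients in the score (indicator order *X*_{1}–*X*_{16}); the function name `credit_score` is an assumption:

```python
import numpy as np

# Wilks' Lambda weights transcribed from the score expression, in the
# order X1..X16. They sum to about 1.001 due to rounding of the
# published coefficients.
WEIGHTS = np.array([0.017, 0.025, 0.024, 0.035, 0.006, 0.007,
                    0.237, 0.147, 0.187, 0.057, 0.006, 0.046,
                    0.027, 0.038, 0.108, 0.034])

def credit_score(x_std):
    """Linear weighted credit score z_j = sum_i w_i * x_ij for one
    customer's standardized 16-indicator vector (formula (21))."""
    return float(np.dot(WEIGHTS, x_std))
```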

#### 6. Conclusions

Among the many available weighting methods, choosing an appropriate one is the key to credit risk evaluation. If the choice of weighting method is not appropriate, it will directly affect the evaluation result: enterprises with poor credit may be evaluated as good businesses, misleading the decision-making of financial institutions. The weight can also reflect the importance of an indicator; that is, according to the weights, we can determine the key indicators that play an important role in credit risk evaluation.

A reasonable credit risk evaluation system must have strong default identification ability, which means the evaluation results can effectively distinguish between defaulting and nondefaulting customers. Existing research has often used a combined weight that merges subjective and objective weights. In fact, the combined weight is not always reasonable, because combining a good method with a bad one may make the final result worse than the good method alone.

This paper proposed a method for selecting the optimal weighting method for credit risk evaluation. The theoretical contribution is as follows: using the distances from nondefaulting customers’ scores to the positive ideal point and from defaulting customers’ scores to the negative ideal point, we construct the neartude, which reflects default identification ability. The greater the neartude, the better the weighting method distinguishes between defaulting and nondefaulting customers, so the optimal method can be selected among different weighting methods. This overcomes the disadvantage of existing research that nondefaulting and defaulting customers’ scores overlap substantially, and it avoids the deficiency of selecting a weighting method at random without considering the purpose of evaluation.

This paper proposed two novel weighting methods, the “Wilks’ Lambda method” and the “AUC value method,” based on the default identification ability of the indicators. For comparison, three traditional subjective and objective weighting methods are listed. The subjective weights of the evaluation indicators can be obtained by the G1 method, which reflects experts’ experience; the objective weights can be obtained by the entropy weight method and the mean square deviation method, which measure information content.

This paper also proposed how to determine the optimal weighting method among these five. The criterion is default identification ability: the optimal method yields credit scores with the largest difference between defaulting customers’ scores and nondefaulting customers’ scores.

The empirical study used loan data from 3,045 small business loans from a Chinese commercial bank and also used survey data from 43 experts from one regional commercial bank’s head office. An important contribution of the paper is to discover that Wilks’ Lambda method is the most effective method for small business and nonfinancial indicators such as “consumer price indicator” and “enterprise credit in 3 years” play an important role in prediction of small business default.

Our study opens up some potential avenues for future research. First, increasing the amount of data or using other databases in the empirical research could make the results more convincing. Second, a future study could develop more weighting methods based on default identification ability, which show the importance of explanatory indicators and give more reasonable evaluation results. Third, the weighting methods based on information content could be further improved, for example, by the topological entropy methods in [29, 30].

#### Data Availability

The data used to support the findings of this study have not been made available because of a contract with the commercial bank that supports the research on the confidentiality and nondisclosure of the data.

#### Conflicts of Interest

The authors declare that they have no conflicts of interest.

#### Acknowledgments

This work was supported by the Youth Project of National Natural Science Foundation of China (No. 71901055), the Key Projects of National Natural Science Foundation of China (No. 71731003), and the General Projects of National Natural Science Foundation of China (No. 71873103).