Mathematical Problems in Engineering

Special Issue

Data-driven Fuzzy Multiple Criteria Decision Making and its Potential Applications


Research Article | Open Access

Volume 2020 | Article ID 7504764 | 12 pages | https://doi.org/10.1155/2020/7504764

A Novel Ensemble Credit Scoring Model Based on Extreme Learning Machine and Generalized Fuzzy Soft Sets

Academic Editor: Zaoli Yang
Received 28 Feb 2020
Accepted 04 May 2020
Published 30 Jun 2020

Abstract

This paper discusses the combined use of ensemble learning, classification, and feature selection (FS) algorithms, together with training data balancing, to make the proposed credit scoring model more effective. The model proceeds as follows. First, the collected credit data are preprocessed. Second, an efficient feature selection algorithm based on the adaptive elastic net is employed to remove weakly related or uncorrelated variables and obtain high-quality training data. Third, a novel ensemble strategy is proposed to balance the imbalanced training data set for each extreme learning machine (ELM) classifier. Finally, a new weighting method for the single ELM classifiers in the ensemble model is established with respect to their classification accuracy, based on generalized fuzzy soft sets (GFSS) theory. A novel cosine-based distance measurement algorithm of GFSS is also proposed to calculate the weight of each ELM classifier. To confirm the efficiency of the proposed ensemble credit scoring model, we conducted comparative experiments on real-world credit data sets. The analysis and experimental results indicate that the proposed model improves classification performance in average accuracy, area under the curve (AUC), H-measure, and Brier’s score compared with the other single classifiers and ensemble approaches.

1. Introduction

Nowadays, financial institutions tend to adopt different risk assessment and credit scoring models to reduce potential risk [1]. By analyzing customer credit data to estimate the probability that potential borrowers will default on their loans, these evaluation approaches turn customer data into rules that support credit decisions [2]. An effective credit scoring model can therefore serve as a reliable support system that helps managers make financial decisions.

To handle the potential risk of financial services, in the past few years an increasing number of financial institutions have moved from traditional manual methods to advanced approaches that require building various types of evaluation models. For credit evaluation, three main groups of methods are widely utilized: statistical approaches, nonparametric approaches, and AI methods [3–7]. These methods work efficiently in various circumstances. Statistical methods include discriminant analysis models, linear probability models, and probit and logit models. Nonparametric approaches include the decision tree, the K-nearest neighbor algorithm, fuzzy logic, Naïve Bayes, and so on. AI methods are more advanced and technology-dependent, such as artificial neural networks, support vector machines (SVM), particle swarm optimization (PSO), and genetic algorithms (GA).

Many studies also indicate that ensemble approaches perform more effectively in credit evaluation than single classifiers. To avoid the downsides of single classifiers, an increasing number of researchers have switched to customized combinations of methods instead of using individual classification models separately. The principle of the hybrid approach is to preprocess the data fed to the classifiers. The ensemble approach gathers information from a group of classifiers trained on the same problem and combines their strengths to reach valid credit scoring decisions [8, 9]. In recent years, research on fuzzy soft sets theory has made great progress, especially in multiattribute decision making [10, 11]. The development of fuzzy soft sets theory provides a new perspective for building more advanced ensemble data classification and credit evaluation models [12].

The motivation of our study is to construct a more reliable credit scoring model that can generate accurate outcomes on imbalanced data. Three main approaches are addressed to achieve this goal: (1) improved elastic net-based feature selection, (2) a novel ensemble strategy and learning algorithm for imbalanced credit data, and (3) a dynamic weighting method for single ELM classifiers based on a newly proposed similarity measure of GFSS. In real applications of credit risk evaluation, especially in peer-to-peer lending, credit data may be gathered from many different channels, including social networking and judicial administration platforms. The data collected from these channels are usually very sparse, redundant, rough, and imbalanced (good customers generally outnumber bad customers) and often contain various weakly related or even uncorrelated features [13, 14]. These data characteristics make commonly used credit scoring models unstable, so that the credit evaluation results become unreliable and inaccurate. The three approaches proposed in this article are intended to handle these problems in credit scoring for imbalanced data.

Section 2 describes the construction of the new ensemble credit scoring model. The experimental outcomes are discussed in Section 3. Finally, Section 4 concludes the paper.

2. New Ensemble Credit Scoring Model

This section describes the construction of the ensemble classification model for credit scoring.

2.1. Adaptive Elastic Net-Based Feature Selection

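Many researchers have studied feature selection approaches suited to credit scoring, such as cost-sensitive methods [15], information gain ratio [16], and genetic algorithms [17]. The Lasso estimator shrinks regression coefficients toward zero through an $L_1$-norm penalty. It can therefore discard features (variables) and retain the most important ones, which yields simple but effective models while keeping high efficiency. Denote the historical credit scoring data as $(x_i, y_i)$, $i = 1, 2, \ldots, N$, where $x_i = (x_{i1}, \ldots, x_{ip})^{T}$ are the variables of the $i$th customer and $y_i$ are category tags (binary responses; 0 denotes default and 1 denotes nondefault). The regression model could be constructed as follows:

$$y_i = \beta_0 + \sum_{j=1}^{p} x_{ij}\beta_j + \varepsilon_i, \tag{1}$$

where $\beta_0$ and $\beta = (\beta_1, \ldots, \beta_p)^{T}$ are the intercept and regression coefficients, respectively, and $\varepsilon_i$ is the error term. Suppose that the observations are uncorrelated and that all the variables are normalized. The Lasso estimate of $\beta$ could be constructed as

$$\hat{\beta}(\text{Lasso}) = \arg\min_{\beta} \left\{ \lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_1 \right\}, \tag{2}$$

where $y = (y_1, \ldots, y_N)^{T}$, $X$ is the $N \times p$ matrix of variables, and $\lambda \ge 0$ is the tuning parameter.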

Based on the information above, a sufficiently large tuning parameter $\lambda$ shrinks some coefficients in $\beta$ exactly to zero. That is, Lasso sets more coefficients to zero as $\lambda$ gradually increases. In addition, the Lasso model can handle any number of candidate variables. Therefore, both the shrinkage of coefficients and the selection of features (variables) are carried out at the same time.

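Although Lasso has proved to be easily interpretable and effective under various circumstances, it still has some shortcomings [18]. Zou and Hastie [19] put forward an extension called the elastic net. Like Lasso, the elastic net conducts automatic variable selection and coefficient shrinkage at the same time, and it can also select groups of correlated variables. For any fixed nonnegative $\lambda_1$ and $\lambda_2$, the elastic net estimate of $\beta$ could be written as follows:

$$\hat{\beta}(\text{Enet}) = \left(1 + \frac{\lambda_2}{N}\right) \left\{ \arg\min_{\beta} \lVert y - X\beta \rVert_2^2 + \lambda_2 \lVert \beta \rVert_2^2 + \lambda_1 \lVert \beta \rVert_1 \right\}, \tag{3}$$

where $\lVert \beta \rVert_1 = \sum_{j=1}^{p} |\beta_j|$ is the $L_1$-norm element and $\lVert \beta \rVert_2^2 = \sum_{j=1}^{p} \beta_j^2$ is the $L_2$-norm element.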

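In addition, Zou and Zhang [20] pointed out that the elastic net does not possess the oracle property. They then proposed the adaptive elastic net, which combines the $L_2$ penalty with a weighted $L_1$ penalty to penalize the squared error loss. The adaptive elastic net can therefore be treated as a combination of the adaptive Lasso and the elastic net. The estimate of $\beta$ by the adaptive elastic net could be calculated as

$$\hat{\beta}(\text{AdaEnet}) = \left(1 + \frac{\lambda_2}{N}\right) \left\{ \arg\min_{\beta} \lVert y - X\beta \rVert_2^2 + \lambda_2 \lVert \beta \rVert_2^2 + \lambda_1^{*} \sum_{j=1}^{p} \hat{w}_j |\beta_j| \right\}, \tag{4}$$

where $\hat{w}_j = \left(|\hat{\beta}_j(\text{Enet})|\right)^{-\gamma}$, $j = 1, \ldots, p$, $\gamma$ is positive, while $\lambda_2$ is fixed and nonnegative.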

Using formula (4), we will be capable of obtaining the most significant attributes (“big fish”) from the variable pool. Then, we can plug them into credit scoring models to get a more precise result but with minimum computational and operational cost.
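The two-stage idea behind the adaptive elastic net can be illustrated with a minimal sketch. It assumes the penalized objective has the form "squared error + L2 penalty + weighted L1 penalty" and solves it by plain coordinate descent. The function names, the synthetic data, and the tuning values (`lam1`, `lam1_star`, `lam2`, `gamma`, `eps`) are hypothetical choices for illustration, not the settings used in the experiments.

```python
import random


def soft_threshold(rho, t):
    """Soft-thresholding operator: sign(rho) * max(|rho| - t, 0)."""
    if rho > t:
        return rho - t
    if rho < -t:
        return rho + t
    return 0.0


def penalized_fit(X, y, lam1, lam2, weights, n_iter=100):
    """Coordinate descent for
    ||y - X b||^2 + lam2 * ||b||^2 + lam1 * sum_j weights[j] * |b_j|."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X b (b starts at zero)
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # inner product of column j with the partial residual
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n))
            new_bj = soft_threshold(rho, lam1 * weights[j] / 2.0) / (col_sq[j] + lam2)
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


def adaptive_elastic_net(X, y, lam1, lam1_star, lam2, gamma=1.0, eps=1e-8):
    """Stage 1: elastic net. Stage 2: L1 penalty reweighted by stage-1 coefficients."""
    n, p = len(X), len(X[0])
    scale = 1.0 + lam2 / n
    enet = [scale * b for b in penalized_fit(X, y, lam1, lam2, [1.0] * p)]
    # eps (an assumption) avoids division by zero for coefficients already at zero
    weights = [(abs(b) + eps) ** (-gamma) for b in enet]
    return [scale * b for b in penalized_fit(X, y, lam1_star, lam2, weights)]


# Synthetic example: only the first two of four features drive the response.
rng = random.Random(0)
X_demo = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(200)]
y_demo = [2.0 * x[0] - 1.5 * x[1] + rng.gauss(0.0, 0.1) for x in X_demo]
beta_hat = adaptive_elastic_net(X_demo, y_demo, lam1=20.0, lam1_star=1.0, lam2=1.0)
selected = [j for j, b in enumerate(beta_hat) if b != 0.0]
print(selected)
```

Features whose first-stage coefficient is zero receive a very large weight in the second stage, so they stay excluded, while strong features are penalized only lightly.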

2.2. ELM-Based Classifier

The ELM model is a single-hidden-layer feedforward neural network (SLFN) whose input weights and hidden biases are chosen randomly and need no adjustment during training. The output weights are determined analytically through the Moore–Penrose generalized inverse of the hidden-layer output matrix. ELM exhibits excellent generalization performance and avoids an iterative training process, so it trains much faster than conventional ANN-type machine learning algorithms [21].

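For the historical training credit data set mentioned above, the input vector $x_i = (x_{i1}, \ldots, x_{ip})^{T}$ is the $i$th sample with $p$-dimensional features, and $i = 1, 2, \ldots, N$. Then, $p$ is the number of input neurons, which equals the number of input features. Let $L$ be the number of hidden neurons. Denote $C$ as the number of output neurons, which equals the number of categories. Denote the input weight matrix as $W = (w_1, w_2, \ldots, w_L)$, where $w_j$ is the vector connecting the $p$ input neurons with the $j$th hidden neuron. $b = (b_1, b_2, \ldots, b_L)$ is the bias vector of the hidden neurons, where $b_j$ is the bias value of the $j$th hidden neuron. These parameters do not change during the whole process. The output of the $j$th hidden neuron for the input $x_i$ could be computed as follows:

$$h_j(x_i) = G(w_j \cdot x_i + b_j), \tag{5}$$

where $G(x)$ is the activation function. Let $H$ be the hidden-layer output of all the samples. It can be calculated by using the following equation:

$$H = \begin{pmatrix} G(w_1 \cdot x_1 + b_1) & \cdots & G(w_L \cdot x_1 + b_L) \\ \vdots & \ddots & \vdots \\ G(w_1 \cdot x_N + b_1) & \cdots & G(w_L \cdot x_N + b_L) \end{pmatrix}_{N \times L}. \tag{6}$$

The $j$th column of $H$ is the output vector of the $j$th hidden node with respect to the inputs $x_1$ to $x_N$. The $i$th row is the output vector of the hidden layer with respect to the input $x_i$. The output of ELM can be calculated by

$$f(x_i) = \sum_{j=1}^{L} \beta_j G(w_j \cdot x_i + b_j), \quad i = 1, \ldots, N, \tag{7}$$

where $\beta_j$ is the weight vector that connects the $j$th hidden node with the output nodes. ELM is able to approximate the $N$ samples with zero error. In other words, $\sum_{i=1}^{N} \lVert f(x_i) - t_i \rVert = 0$, where $t_i$ is the target output vector of the $i$th sample. Then, the following equation can be obtained:

$$\sum_{j=1}^{L} \beta_j G(w_j \cdot x_i + b_j) = t_i, \quad i = 1, \ldots, N. \tag{8}$$

Equation (8) can also be rewritten as follows:

$$H\beta = T, \tag{9}$$

where $\beta = (\beta_1, \ldots, \beta_L)^{T}$ and $T = (t_1, \ldots, t_N)^{T}$.

Based on (9), the value of the output weight $\beta$ could be estimated using a least squares solution as follows:

$$\hat{\beta} = H^{\dagger} T, \tag{10}$$

where $H^{\dagger}$ stands for the Moore–Penrose generalized inverse of $H$. For credit scoring classification, the outcome of ELM is as follows:

$$f(x) = h(x)\hat{\beta} = h(x) H^{\dagger} T, \tag{11}$$

where $h(x) = (h_1(x), \ldots, h_L(x))$ is the hidden-layer output for the input $x$.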
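The training procedure described above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the implementation used in the experiments: the activation is taken to be a sigmoid, and, to stay within the standard library, the Moore–Penrose solution is approximated by solving the ridge-regularized normal equations $(H^{T}H + \epsilon I)\beta = H^{T}T$ with a small $\epsilon$. All function names and the toy data are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def hidden_output(X, W, b):
    """H[i][j] = G(w_j . x_i + b_j) for N samples and L hidden neurons."""
    return [[sigmoid(sum(wk * xk for wk, xk in zip(W[j], x)) + b[j])
             for j in range(len(W))] for x in X]


def solve(A, B):
    """Solve A Z = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[r]) + list(B[r]) for r in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        pivot = M[c][c]
        M[c] = [v / pivot for v in M[c]]
        for r in range(n):
            if r != c and M[r][c] != 0.0:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [row[n:] for row in M]


def train_elm(X, T, n_hidden, ridge=1e-6, seed=0):
    """Random input weights and biases; output weights by regularized least squares."""
    rng = random.Random(seed)
    p = len(X[0])
    W = [[rng.uniform(-1.0, 1.0) for _ in range(p)] for _ in range(n_hidden)]
    b = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
    H = hidden_output(X, W, b)
    n, c_out = len(H), len(T[0])
    # Ridge-regularized normal equations, an approximation of beta = pinv(H) T
    HtH = [[sum(H[i][a] * H[i][c] for i in range(n)) + (ridge if a == c else 0.0)
            for c in range(n_hidden)] for a in range(n_hidden)]
    HtT = [[sum(H[i][a] * T[i][k] for i in range(n)) for k in range(c_out)]
           for a in range(n_hidden)]
    return W, b, solve(HtH, HtT)


def predict_elm(X, W, b, beta):
    """Network output f(x) = h(x) beta for every sample in X."""
    H = hidden_output(X, W, b)
    return [[sum(h[j] * beta[j][k] for j in range(len(beta)))
             for k in range(len(beta[0]))] for h in H]


# Toy data: class 0 near the origin, class 1 near (1, 1); targets are one-hot.
X_toy = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.9, 1.0], [1.0, 0.8], [0.8, 0.9]]
labels_toy = [0, 0, 0, 1, 1, 1]
T_toy = [[1.0, 0.0] if lab == 0 else [0.0, 1.0] for lab in labels_toy]
W_toy, b_toy, beta_toy = train_elm(X_toy, T_toy, n_hidden=8)
outputs = predict_elm(X_toy, W_toy, b_toy, beta_toy)
predicted = [row.index(max(row)) for row in outputs]
print(predicted)
```

Because only the output weights are solved for, training reduces to one linear solve, which is the reason for the short running times associated with ELM.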

2.3. Ensemble Strategy for Imbalanced Data

To better solve the classification of imbalanced data, a considerable number of approaches have been used. They can be categorized into three types: preprocessing, cost-sensitive learning, and ensemble methodology. Preprocessing is able to decrease the classification bias based on the bias-variance decomposition to enhance the single classifier. Undersampling [22, 23], oversampling [24, 25], and strategic sampling are extensively utilized to offset imbalanced data.

Ensemble methodology can be viewed as a decision-making process that combines both individual learning algorithms and their outcomes in parallel to obtain the ultimate result. The basic idea behind the ensemble methodology is that the algorithm will get a number of single classifiers from the training set, and then it uses some ensemble strategies to integrate them to raise the accuracy and reliability of classification. Bagging [26], boosting [27], and stacking [28] are the most common ensemble approaches in credit scoring.

A novel ensemble strategy is designed for imbalanced data according to its imbalance ratio, which determines both the number of ELM classifiers that serve as single classifiers for credit scoring and the number of samples fed into each ELM as training data.

For any given historical credit training data set that contains $N$ samples, there are $N_g$ “good applicants” and $N_b$ “bad applicants,” such that $N_g + N_b = N$. The imbalance ratio, denoted IR, can be calculated as follows:

$$\mathrm{IR} = \frac{N_g}{N_b}. \tag{12}$$

After obtaining the IR, the number of single ELM classifiers $M$ in the ensemble model can be calculated as follows:

$$M = \lceil \mathrm{IR} \rceil, \tag{13}$$

where the symbol $\lceil \cdot \rceil$ represents the “ceiling” operation.

Equation (13) not only determines the number of ELM classifiers needed in the ensemble credit scoring model but also guides how the imbalanced data are balanced for each classifier. The proposed ensemble strategy for imbalanced data is built on the value of M. In the remainder of this subsection, we elaborate the proposed strategy in detail.

Firstly, calculate the imbalance ratio IR for any given historical credit training data set on the basis of the number of “good applicants” $N_g$ and the number of “bad applicants” $N_b$. Secondly, determine the number of ELM classifiers $M$ using (13).

Thirdly, for each of the first $M - 1$ ELM classifiers, we feed $N_b$ “good applicants” samples and all $N_b$ “bad applicants” samples into the classifier, so that the training data sets of the first $M - 1$ classifiers are balanced. Random sampling without replacement is employed to extract the “good applicants” samples for each classifier. After the first $M - 1$ training data sets have been extracted, there are $N_g - (M - 1)N_b$ “good applicants” samples that have not been extracted.

Finally, for the last ELM classifier, we put the remaining $N_g - (M - 1)N_b$ “good applicants” samples into the training data set. Considering that $N_g - (M - 1)N_b \le N_b$, we employ the SMOTE algorithm to create $M N_b - N_g$ synthetic “good applicants” samples from the remaining $N_g - (M - 1)N_b$ “good applicants” samples. Thus, for the last ELM classifier, there are still $N_b$ “good applicants” samples and $N_b$ “bad applicants” samples fed into it as the training data set.
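The bookkeeping of this strategy can be sketched as follows. The sketch only plans which “good applicants” go to which classifier and how many synthetic samples the last classifier needs; the oversampling step itself (SMOTE) is not implemented here. The function name and the dictionary layout are hypothetical, and the example counts are those of the German data set (700 good, 300 bad).

```python
import random


def plan_balanced_subsets(good_ids, bad_ids, seed=0):
    """Split good samples into M balanced training subsets (IR = N_g / N_b, M = ceil(IR))."""
    n_good, n_bad = len(good_ids), len(bad_ids)
    ir = n_good / n_bad
    m = max(1, (n_good + n_bad - 1) // n_bad)  # integer ceiling of n_good / n_bad
    shuffled = list(good_ids)
    random.Random(seed).shuffle(shuffled)  # random sampling without replacement
    subsets = []
    for k in range(m - 1):
        # the first M - 1 classifiers get N_b good samples and all bad samples
        subsets.append({"good": shuffled[k * n_bad:(k + 1) * n_bad],
                        "bad": list(bad_ids),
                        "n_synthetic_good": 0})
    remaining = shuffled[(m - 1) * n_bad:]
    # the last classifier gets the leftover good samples; the shortfall
    # (M * N_b - N_g samples) would be filled by an oversampler such as SMOTE
    subsets.append({"good": remaining,
                    "bad": list(bad_ids),
                    "n_synthetic_good": n_bad - len(remaining)})
    return ir, m, subsets


ir_demo, m_demo, subsets_demo = plan_balanced_subsets(list(range(700)),
                                                      list(range(700, 1000)))
print(m_demo, [(len(s["good"]), s["n_synthetic_good"]) for s in subsets_demo])
# prints: 3 [(300, 0), (300, 0), (100, 200)]
```

Every “good applicants” sample is used exactly once, and each classifier ends up with as many good samples (real plus synthetic) as bad samples.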

Through the described processes above, the ensemble strategy for the imbalanced data we proposed has been realized, and the training data for each classifier is balanced. In the next subsection, we will further introduce the GFSS theory-based ensemble credit scoring approach, which utilizes the results of each single ELM classifier.

2.4. GFSS Theory-Based Ensemble Credit Scoring Model

Once the results of each ELM classifier are available, we need to determine the weights of the classifiers with respect to their performance, which is expected to improve the classification accuracy considerably. The theory of soft sets, first put forward by Molodtsov [29], can be regarded as a tool for handling uncertainty in imprecise environments (e.g., credit scoring). Maji et al. [30] extended this work by combining fuzzy sets and soft sets. We first introduce the principle of generalized fuzzy soft sets and then put forward a similarity measure of generalized fuzzy soft sets based on the angular cosine. After that, we obtain the weight of each credit scoring model from the similarity measure and the classification accuracy. Finally, we build the generalized fuzzy soft sets theory-based ensemble credit scoring model.

2.4.1. GFSS Theory

Based on the theories proposed by Molodtsov [29] and Maji et al. [30], we can make some definitions for fuzzy soft sets.

Definition 1. Denote $U$ as the initial universal set. Denote $P$ as a set of parameters. $P(U)$ is the power set of $U$. $(F, P)$ is a soft set over $U$ if $F$ is a mapping given by $F : P \to P(U)$.

Definition 2. Denote $U$ as the initial universal set. Denote $P$ as a set of parameters. The set of all fuzzy subsets of $U$ is $I^U$. Let $A \subseteq P$. A pair $(F, A)$ is a fuzzy soft set over $U$ if $F$ is a mapping given by $F : A \to I^U$.
Then, Maji and Samanta’s [31] definition of GFSS is as follows:

Definition 3. Denote $U$ as the initial universal set. Denote $P$ as a set of parameters. Let $(U, P)$ be the soft universe. Denote $F : P \to I^U$ and $\mu$ as a fuzzy subset of $P$, i.e., $\mu : P \to I = [0, 1]$, where $I^U$ is the set of all fuzzy subsets of $U$. Denote $F_\mu$ as the mapping given by $F_\mu : P \to I^U \times I$, which can also be written as $F_\mu(e) = (F(e), \mu(e))$, where $F(e) \in I^U$. In this way, $F_\mu$ can be viewed as a GFSS over the soft universe $(U, P)$.
For every $e_i$, $F_\mu(e_i) = (F(e_i), \mu(e_i))$ indicates both the degree of belonging of the elements of $U$ to $F(e_i)$ and the degree of possibility of such belonging, which is represented by $\mu(e_i)$.
In this paper, $U$ denotes the historical customer credit data $(x_i, y_i)$, where $i = 1, 2, \ldots, N$; $F(e_i)$ represents the classification performance of a single classifier on a single customer, and $\mu(e_i)$ denotes the overall classification performance of that single classifier.

2.4.2. Similarity Measure of GFSS

The similarity measurement of GFSS plays an important role both in constructing the GFSS and in establishing our model. Thus, we establish a new approach to the similarity measure of GFSS for credit scoring.

Definition 4. For $M$ single classifiers, denote $F_m(e_t)$ and $\mu_m$, $m = 1, \ldots, M$, as the elements of $F_\mu$. They are defined as follows:

$$F_m(e_t) = 1 - \left| y_t - \hat{y}_t^{m} \right|, \tag{14}$$

where $y_t$ is the category tag of the $t$th customer (binary responses; 0 denotes default and 1 denotes nondefault), and $\hat{y}_t^{m}$ is the forecasting result of the $t$th customer predicted by the $m$th classifier, which lies between 0 and 1. $F_m(e_t)$ is the accuracy degree of classification of a single classifier for a single customer. This accuracy ranges between 0 and 1, which is in line with our initial intuition. In addition, $\mu_m$ is the $m$th classifier’s overall classification performance, which can be calculated as follows:

$$\mu_m = \frac{TP_m + TN_m}{TP_m + FN_m + TN_m + FP_m}, \tag{15}$$

where $TP_m$, $FN_m$, $TN_m$, and $FP_m$ are elements of the confusion matrix (Table 1). $TP_m$ is the number of good customers accurately labeled as good, $TN_m$ is the number of bad customers accurately labeled as bad, $FN_m$ is the number of good customers falsely labeled as bad, and $FP_m$ is the number of bad customers falsely labeled as good. The greater the value of $\mu_m$, the more accurate the result calculated by the $m$th classifier.
Based on the above discussion, it can be noted that $F_m(e_t)$ and $\mu_m$ from Definition 4 are able to evaluate the classification performance of every single model. Therefore, we can build the GFSS of the $m$th classifier as follows:

$$F_\mu^{m}(e_t) = \left( F_m(e_t), \mu_m \right), \quad t = 1, \ldots, N. \tag{16}$$


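Table 1: Confusion matrix of the mth classifier.

| Real class (%) | Projected class (%): Good | Projected class (%): Bad |
| --- | --- | --- |
| Good | TPm | FNm |
| Bad | FPm | TNm |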
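The two quantities of Definition 4 can be computed directly from the labels and the classifier scores. The sketch below assumes that the per-customer accuracy degree is one minus the absolute difference between the label and the predicted score (one reading consistent with the description that it lies between 0 and 1), and that a score of at least 0.5 is labeled “good”. Both the 0.5 threshold and the function names are assumptions made for illustration.

```python
def accuracy_degrees(y_true, y_score):
    """Per-customer accuracy degree, assumed here to be 1 - |y_t - score_t|."""
    return [1.0 - abs(y - s) for y, s in zip(y_true, y_score)]


def overall_performance(y_true, y_score, threshold=0.5):
    """Overall accuracy (TP + TN) / (TP + FN + TN + FP); label 1 = good, 0 = bad."""
    tp = fn = tn = fp = 0
    for y, s in zip(y_true, y_score):
        predicted = 1 if s >= threshold else 0  # assumed decision threshold
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1 and predicted == 0:
            fn += 1
        elif y == 0 and predicted == 0:
            tn += 1
        else:
            fp += 1
    return (tp + tn) / (tp + fn + tn + fp)


y_true_demo = [1, 1, 0, 0, 1]
y_score_demo = [0.9, 0.4, 0.2, 0.6, 0.7]
f_demo = accuracy_degrees(y_true_demo, y_score_demo)
mu_demo = overall_performance(y_true_demo, y_score_demo)
print(mu_demo)  # prints: 0.6
```

In this example the classifier labels three of the five customers correctly (TP = 2, TN = 1, FN = 1, FP = 1), which gives an overall performance of 0.6.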

Definition 5. For two GFSS and over the universal set U, where , the similarity measurement of GFSS can be calculated as follows:

2.4.3. Ensemble Credit Scoring Modeling

Determining the weight of every single model is the most important step in establishing the ensemble credit scoring model. The weights are determined by comparing the calculated credit scores with the actual records.

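The similarity measure of GFSS can be utilized for this determination, i.e., the calculated value $S_m$ between the GFSS of the $m$th classifier and $F_\mu^{*}$, where $F_\mu^{*}$ is the GFSS of the actual score of customers. In this way, the weight of the $m$th classifier can be calculated by

$$w_m = \frac{S_m}{\sum_{k=1}^{M} S_k}, \tag{18}$$

where $0 \le w_m \le 1$ and $\sum_{m=1}^{M} w_m = 1$. Thus, the final score could be calculated as

$$\hat{y}_t = \sum_{m=1}^{M} w_m \hat{y}_t^{m}. \tag{19}$$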
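The weighting and combination step can be sketched as follows. The sketch takes the similarity value of each classifier as a given input (it does not compute the GFSS similarity measure itself) and assumes that the weights are these similarity values normalized to sum to one, which is consistent with the stated constraints on the weights. The function names and the numbers in the example are hypothetical.

```python
def classifier_weights(similarities):
    """Normalize similarity values so that the weights sum to one (assumed form)."""
    total = sum(similarities)
    return [s / total for s in similarities]


def ensemble_scores(weights, scores_by_classifier):
    """Final score of each customer as the weighted sum of the classifier scores."""
    n_customers = len(scores_by_classifier[0])
    return [sum(w * scores[t] for w, scores in zip(weights, scores_by_classifier))
            for t in range(n_customers)]


# Three classifiers, two customers; similarity values are made-up inputs.
similarities_demo = [0.9, 0.6, 0.5]
scores_demo = [[0.8, 0.2],   # classifier 1
               [0.6, 0.4],   # classifier 2
               [0.9, 0.1]]   # classifier 3
weights_demo = classifier_weights(similarities_demo)
final_demo = ensemble_scores(weights_demo, scores_demo)
print(weights_demo, final_demo)
```

A classifier whose GFSS is more similar to that of the actual records receives a larger weight and therefore contributes more to the final score.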

Figure 1 presents the flow diagram of the ensemble model. The “ELM and GFSS Theory-Based Hybrid Ensemble” model, hereafter called EGHE, is described in Algorithm 1.

Algorithm 1: The EGHE model.
Input: Historical data of credit scoring (xi, yi).
Output: Final credit score of every single customer.
Step 1. Preprocessing of data.
Step 2. Variable selection using AEnet.
Step 3. Rebalancing of the imbalanced data by using the proposed ensemble strategy.
Step 4. Credit scoring by every single ELM classifier.
Step 5. Compute the per-customer accuracy degree and the overall classification performance of each classifier using (14) and (15).
Step 6. Calculate the GFSS similarity measure of every single classifier using (17).
Step 7. Calculate the weight of the mth classifier using (18).
Step 8. Get the final credit score of every customer using (19).

3. Results and Discussion

3.1. Preparation of Dataset

For the evaluation, we collected ten data sets: six credit data sets (three public and three private) and four additional imbalanced data sets with different IR. The public credit data sets can be obtained from the UCI Machine Learning Repository. They are real-world credit scoring data sets that are now widely used by researchers. The German, Australian, and Japanese data sets are used for extra verification. The private data sets consist of the Iranian data set, which has also been widely used in many studies, and the Bene 1 and Bene 2 data sets, which were obtained from two key financial institutions in the Benelux [32]. The Iranian data set contains customer data of small Iranian private banks [33, 34]. The four additional imbalanced data sets are also taken from the UCI Machine Learning Repository: Shuttle, Skin_segment, MiniBooNE, and LC2017Q1, the last of which contains Lending Club loan data from the first quarter of 2017. The characteristics of all experimental data sets can be found in Table 2.


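Table 2: Characteristics of the experimental data sets.

| Data sets | Size | Attributes | Good/bad | Imbalance ratio | Number of classifiers |
| --- | --- | --- | --- | --- | --- |
| Credit scoring data sets | | | | | |
| Germany | 1000 | 25 | 700/300 | 2.33 | 3 |
| Australia | 690 | 14 | 307/383 | 0.80 | 1 |
| Japan | 690 | 15 | 296/357 | 0.83 | 1 |
| Iran | 1000 | 27 | 950/50 | 19 | 19 |
| Bene 1 | 3,123 | 33 | 2,082/1,041 | 2 | 2 |
| Bene 2 | 7,190 | 33 | 5,033/2,157 | 2.33 | 3 |
| Additional imbalanced data sets | | | | | |
| Shuttle | 12,380 | 9 | 11,428/952 | 12.004 | 13 |
| Skin_segment | 117,728 | 3 | 114,039/3,679 | 30.99 | 31 |
| MiniBooNE | 201,355 | 10 | 196,555/4,800 | 40.95 | 41 |
| LC2017Q1 | 95,633 | 72 | 94,414/1,219 | 77.45 | 78 |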

In this paper, we compared our proposed model with four other state-of-the-art models, namely, the C5.0 decision tree, SVM with a radial basis function kernel (SVM-R), Deep Belief Networks (DBN), and Naïve Bayes (Bayes), to validate the performance of our approach. All continuous attributes are discretized into intervals. Every data set is randomly divided into a two-thirds training set and a one-third testing set. We use the open-source statistical platform R (version 3.2.2) to conduct our experiments.

3.2. Experimental Results

Different methods are utilized as comparison models to test the validity of the EGHE credit scoring model.

Firstly, the AEnet-based FS algorithm is used to obtain the highly correlated variables after initial data gathering and preprocessing. After selection, the number of variables decreases to various degrees in all data sets except Skin_segment, which keeps its three variables (Table 3). Given the computational complexity involved, deleting irrelevant or weakly correlated variables is becoming increasingly important for big-data-oriented credit assessment.


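Table 3: Number of features before and after feature selection.

| Data sets | Before | After |
| --- | --- | --- |
| Germany | 25 | 21 |
| Australia | 14 | 10 |
| Japan | 15 | 12 |
| Iran | 27 | 22 |
| Bene 1 | 33 | 15 |
| Bene 2 | 33 | 20 |
| Shuttle | 16 | 13 |
| Skin_segment | 3 | 3 |
| MiniBooNE | 10 | 9 |
| LC2017Q1 | 72 | 48 |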

After feature selection, we perform two experiments to examine the effect of feature selection on the classification of each model: (1) all single classifiers without feature selection and (2) all single classifiers with feature selection. Tables 4 and 5 show in detail the AUC, H-measure (HM), Brier’s score (BS), and accuracy (ACC) for all single classifiers without and with feature selection, respectively.


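Table 4: Performance of single classifiers without feature selection.

| Data set | Performance measure | C5.0 | SVM-R | DBN | Bayes | ELM |
| --- | --- | --- | --- | --- | --- | --- |
| Germany | AUC | 0.680 | 0.732 | 0.753 | 0.761 | 0.764 |
| Germany | HM | 0.336 | 0.362 | 0.399 | 0.338 | 0.379 |
| Germany | BS | 0.251 | 0.167 | 0.173 | 0.199 | 0.172 |
| Germany | ACC | 0.745 | 0.752 | 0.749 | 0.745 | 0.762 |
| Australia | AUC | 0.867 | 0.812 | 0.876 | 0.857 | 0.852 |
| Australia | HM | 0.516 | 0.603 | 0.613 | 0.581 | 0.610 |
| Australia | BS | 0.148 | 0.113 | 0.107 | 0.166 | 0.127 |
| Australia | ACC | 0.736 | 0.742 | 0.751 | 0.734 | 0.735 |
| Japan | AUC | 0.857 | 0.908 | 0.911 | 0.888 | 0.882 |
| Japan | HM | 0.689 | 0.608 | 0.608 | 0.667 | 0.618 |
| Japan | BS | 0.156 | 0.120 | 0.138 | 0.176 | 0.142 |
| Japan | ACC | 0.738 | 0.749 | 0.757 | 0.742 | 0.751 |
| Iran | AUC | 0.616 | 0.604 | 0.614 | 0.715 | 0.710 |
| Iran | HM | 0.111 | 0.074 | 0.061 | 0.192 | 0.133 |
| Iran | BS | 0.071 | 0.071 | 0.078 | 0.077 | 0.083 |
| Iran | ACC | 0.623 | 0.617 | 0.640 | 0.627 | 0.648 |
| Bene 1 | AUC | 0.728 | 0.821 | 0.767 | 0.740 | 0.752 |
| Bene 1 | HM | 0.310 | 0.366 | 0.250 | 0.335 | 0.351 |
| Bene 1 | BS | 0.267 | 0.176 | 0.168 | 0.297 | 0.199 |
| Bene 1 | ACC | 0.701 | 0.749 | 0.698 | 0.690 | 0.753 |
| Bene 2 | AUC | 0.745 | 0.789 | 0.746 | 0.707 | 0.762 |
| Bene 2 | HM | 0.385 | 0.309 | 0.238 | 0.276 | 0.270 |
| Bene 2 | BS | 0.143 | 0.178 | 0.142 | 0.172 | 0.153 |
| Bene 2 | ACC | 0.728 | 0.720 | 0.715 | 0.711 | 0.747 |
| Shuttle | AUC | 0.604 | 0.675 | 0.693 | 0.714 | 0.722 |
| Shuttle | HM | 0.299 | 0.334 | 0.367 | 0.317 | 0.358 |
| Shuttle | BS | 0.224 | 0.153 | 0.158 | 0.186 | 0.163 |
| Shuttle | ACC | 0.662 | 0.693 | 0.740 | 0.698 | 0.720 |
| Skin_segment | AUC | 0.770 | 0.749 | 0.804 | 0.802 | 0.805 |
| Skin_segment | HM | 0.458 | 0.556 | 0.564 | 0.543 | 0.576 |
| Skin_segment | BS | 0.131 | 0.105 | 0.099 | 0.156 | 0.119 |
| Skin_segment | ACC | 0.653 | 0.685 | 0.689 | 0.687 | 0.695 |
| MiniBooNE | AUC | 0.761 | 0.837 | 0.838 | 0.833 | 0.833 |
| MiniBooNE | HM | 0.613 | 0.560 | 0.559 | 0.625 | 0.584 |
| MiniBooNE | BS | 0.140 | 0.112 | 0.126 | 0.164 | 0.134 |
| MiniBooNE | ACC | 0.655 | 0.690 | 0.696 | 0.695 | 0.710 |
| LC2017Q1 | AUC | 0.547 | 0.557 | 0.563 | 0.669 | 0.671 |
| LC2017Q1 | HM | 0.099 | 0.067 | 0.057 | 0.181 | 0.126 |
| LC2017Q1 | BS | 0.062 | 0.066 | 0.072 | 0.071 | 0.078 |
| LC2017Q1 | ACC | 0.655 | 0.670 | 0.688 | 0.687 | 0.712 |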


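Table 5: Performance of single classifiers with feature selection.

| Data set | Performance measure | C5.0 | SVM-R | DBN | Bayes | ELM |
| --- | --- | --- | --- | --- | --- | --- |
| Germany | AUC | 0.690 | 0.756 | 0.762 | 0.774 | 0.768 |
| Germany | HM | 0.181 | 0.295 | 0.248 | 0.267 | 0.268 |
| Germany | BS | 0.230 | 0.163 | 0.172 | 0.192 | 0.173 |
| Germany | ACC | 0.751 | 0.759 | 0.752 | 0.758 | 0.765 |
| Australia | AUC | 0.882 | 0.902 | 0.915 | 0.912 | 0.883 |
| Australia | HM | 0.614 | 0.635 | 0.642 | 0.623 | 0.624 |
| Australia | BS | 0.121 | 0.104 | 0.131 | 0.121 | 0.130 |
| Australia | ACC | 0.749 | 0.750 | 0.755 | 0.741 | 0.748 |
| Japan | AUC | 0.861 | 0.910 | 0.919 | 0.910 | 0.878 |
| Japan | HM | 0.608 | 0.621 | 0.631 | 0.622 | 0.622 |
| Japan | BS | 0.116 | 0.112 | 0.126 | 0.113 | 0.119 |
| Japan | ACC | 0.749 | 0.752 | 0.766 | 0.753 | 0.758 |
| Iran | AUC | 0.631 | 0.650 | 0.612 | 0.723 | 0.638 |
| Iran | HM | 0.137 | 0.118 | 0.108 | 0.212 | 0.117 |
| Iran | BS | 0.068 | 0.071 | 0.078 | 0.072 | 0.081 |
| Iran | ACC | 0.652 | 0.645 | 0.650 | 0.652 | 0.658 |
| Bene 1 | AUC | 0.770 | 0.810 | 0.806 | 0.773 | 0.798 |
| Bene 1 | HM | 0.312 | 0.345 | 0.336 | 0.302 | 0.320 |
| Bene 1 | BS | 0.233 | 0.177 | 0.184 | 0.264 | 0.241 |
| Bene 1 | ACC | 0.721 | 0.746 | 0.719 | 0.721 | 0.753 |
| Bene 2 | AUC | 0.763 | 0.824 | 0.815 | 0.780 | 0.796 |
| Bene 2 | HM | 0.354 | 0.386 | 0.269 | 0.285 | 0.326 |
| Bene 2 | BS | 0.132 | 0.119 | 0.137 | 0.125 | 0.137 |
| Bene 2 | ACC | 0.734 | 0.734 | 0.715 | 0.723 | 0.747 |
| Shuttle | AUC | 0.652 | 0.682 | 0.686 | 0.723 | 0.729 |
| Shuttle | HM | 0.323 | 0.337 | 0.363 | 0.321 | 0.362 |
| Shuttle | BS | 0.242 | 0.155 | 0.156 | 0.188 | 0.165 |
| Shuttle | ACC | 0.685 | 0.699 | 0.743 | 0.707 | 0.727 |
| Skin_segment | AUC | 0.832 | 0.756 | 0.796 | 0.812 | 0.813 |
| Skin_segment | HM | 0.495 | 0.562 | 0.558 | 0.550 | 0.582 |
| Skin_segment | BS | 0.141 | 0.106 | 0.098 | 0.158 | 0.120 |
| Skin_segment | ACC | 0.675 | 0.692 | 0.712 | 0.696 | 0.702 |
| MiniBooNE | AUC | 0.822 | 0.845 | 0.830 | 0.844 | 0.841 |
| MiniBooNE | HM | 0.662 | 0.566 | 0.553 | 0.633 | 0.589 |
| MiniBooNE | BS | 0.151 | 0.113 | 0.125 | 0.166 | 0.135 |
| MiniBooNE | ACC | 0.707 | 0.704 | 0.689 | 0.704 | 0.717 |
| LC2017Q1 | AUC | 0.591 | 0.563 | 0.557 | 0.678 | 0.678 |
| LC2017Q1 | HM | 0.107 | 0.068 | 0.056 | 0.183 | 0.125 |
| LC2017Q1 | BS | 0.067 | 0.067 | 0.071 | 0.072 | 0.079 |
| LC2017Q1 | ACC | 0.682 | 0.687 | 0.681 | 0.696 | 0.719 |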

From the results of every single classifier in Tables 4 and 5, we can see that feature selection helps enhance the classification effectiveness of most single classifiers. After feature selection, C5.0 shows accuracy values that are 0.8%, 1.76%, 1.63%, 4.49%, 2.85%, 0.82%, 3.47%, 4.15%, 7.94%, and 4.12% greater than without feature selection for the Germany, Australia, Japan, Iran, Bene 1, Bene 2, Shuttle, Skin_segment, MiniBooNE, and LC2017Q1 data sets, respectively. In the same way, the accuracy of SVM-R changes by 0.93%, 1.08%, 0.67%, 4.53%, −0.4%, 1.94%, 0.87%, 1.02%, 2.03%, and 2.54%, respectively, for the ten experimental data sets. DBN and Naïve Bayes also improve on most data sets after feature selection. The accuracy of SVM-R on Bene 1 decreases by 0.4%, but such a small drop does not contradict the overall improvement in classification that our AEnet-based feature selection brings. Outliers and redundant, weakly related, or even unrelated variables not only fail to improve effectiveness but also hamper model establishment and add considerable computational cost.
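The percentages quoted above are relative accuracy gains. As a worked check, the following sketch recomputes two of them from the accuracy values in Tables 4 and 5; the function name is hypothetical.

```python
def relative_gain_percent(before, after):
    """Relative change of a metric in percent: (after - before) / before * 100."""
    return (after - before) / before * 100.0


# C5.0 on the Germany data set: ACC 0.745 without and 0.751 with feature selection.
gain_c50_germany = relative_gain_percent(0.745, 0.751)
# SVM-R on the Bene 1 data set: ACC 0.749 without and 0.746 with feature selection.
gain_svm_bene1 = relative_gain_percent(0.749, 0.746)
print(round(gain_c50_germany, 2), round(gain_svm_bene1, 2))
```

The first value is about 0.8% and the second about −0.4%, which matches the figures reported for C5.0 on Germany and SVM-R on Bene 1.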

It is noteworthy that, compared with C5.0, SVM-R, DBN, and Bayes, ELM achieves the highest accuracy on most of the data sets. Table 6 reports the average running time (total time for training and testing) for all models. For these experiments, we use an Intel i5-8500 CPU at 3.0 GHz and 16 GB of RAM.


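Table 6: Average running time (total time for training and testing) of all models.

| Data sets | C5.0 | SVM-R | DBN | Bayes | ELM |
| --- | --- | --- | --- | --- | --- |
| Germany | 1.76 | 2.89 | 4.33 | 1.72 | 1.38 |
| Australia | 1.16 | 2.02 | 2.99 | 1.07 | 0.97 |
| Japan | 1.25 | 2.05 | 2.90 | 1.13 | 0.94 |
| Iran | 1.82 | 3.01 | 4.31 | 1.69 | 1.44 |
| Bene 1 | 5.63 | 10.11 | 14.29 | 5.85 | 4.62 |
| Bene 2 | 12.71 | 21.13 | 28.96 | 12.32 | 10.01 |
| Shuttle | 21.79 | 35.78 | 53.61 | 21.29 | 17.11 |
| Skin_segment | 207.20 | 340.24 | 509.76 | 202.49 | 182.46 |
| MiniBooNE | 354.38 | 581.92 | 871.87 | 346.33 | 277.87 |
| LC2017Q1 | 591.26 | 970.78 | 1254.49 | 560.57 | 436.04 |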

From Table 6, we can see that ELM costs less time than other single models to carry out credit scoring activities. The efficiency of computing resources also makes ELM a great match for ensemble learning and modeling.

After feature selection, the ensemble strategy, and individual model classification have been carried out, the EGHE model can be built. Based on (14), (15), and (17)–(19), the weights of the single ELM classifiers are calculated according to their performance. To validate the effectiveness of EGHE, we compared it with several ensemble models, which fall into two groups. The first group combines four other FS algorithms with the GFSS-based combination: cost-sensitive, GA, information gain ratio (IGR), and elastic net (Enet). Cost-sensitive, GA, and IGR are popular feature selection approaches in the credit scoring area [16, 35, 36]. The second group applies four other combination approaches with AEnet-based feature selection: weighted average (WAVG) [37], majority voting (MajVot) [38], weighted voting (WVOT) [39], and fuzzy soft set (FSS). Those methods are frequently adopted when building combination models. They also employ ELM as the classifier but do not use the ensemble strategy proposed above; instead, they use random sampling to balance all training data sets. Table 7 displays the AUC, H-measure, Brier’s score, and accuracy for all ensemble models.


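Table 7: Performance of the ensemble models. Cost-sensitive, GA, IGR, and Enet are other feature selection methods with GFSS-based combination; WAVG, MajVot, WVOT, and FSS are traditional combination methods with AEnet-based feature selection.

| Data set | Performance measure | Cost-sensitive | GA | IGR | Enet | WAVG | MajVot | WVOT | FSS | EGHE |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Germany | AUC | 0.786 | 0.792 | 0.777 | 0.781 | 0.777 | 0.786 | 0.789 | 0.771 | 0.823 |
| Germany | HM | 0.286 | 0.177 | 0.226 | 0.238 | 0.222 | 0.285 | 0.246 | 0.305 | 0.325 |
| Germany | BS | 0.166 | 0.208 | 0.197 | 0.182 | 0.181 | 0.183 | 0.192 | 0.157 | 0.184 |
| Germany | ACC | 0.842 | 0.862 | 0.833 | 0.827 | 0.845 | 0.859 | 0.836 | 0.844 | 0.886 |
| Australia | AUC | 0.921 | 0.927 | 0.933 | 0.928 | 0.921 | 0.922 | 0.932 | 0.928 | 0.945 |
| Australia | HM | 0.667 | 0.519 | 0.573 | 0.628 | 0.637 | 0.637 | 0.632 | 0.653 | 0.659 |
| Australia | BS | 0.092 | 0.145 | 0.173 | 0.103 | 0.101 | 0.112 | 0.107 | 0.097 | 0.095 |
| Australia | ACC | 0.872 | 0.878 | 0.882 | 0.863 | 0.872 | 0.873 | 0.876 | 0.874 | 0.895 |
| Japan | AUC | 0.923 | 0.928 | 0.918 | 0.932 | 0.918 | 0.927 | 0.928 | 0.925 | 0.937 |
| Japan | HM | 0.650 | 0.492 | 0.566 | 0.618 | 0.602 | 0.642 | 0.617 | 0.648 | 0.660 |
| Japan | BS | 0.099 | 0.157 | 0.174 | 0.109 | 0.121 | 0.112 | 0.116 | 0.103 | 0.098 |
| Japan | ACC | 0.871 | 0.862 | 0.861 | 0.868 | 0.853 | 0.864 | 0.873 | 0.868 | 0.904 |
| Iran | AUC | 0.791 | 0.778 | 0.781 | 0.767 | 0.777 | 0.779 | 0.783 | 0.778 | 0.802 |
| Iran | HM | 0.279 | 0.150 | 0.217 | 0.162 | 0.283 | 0.108 | 0.107 | 0.294 | 0.303 |
| Iran | BS | 0.042 | 0.069 | 0.056 | 0.049 | 0.044 | 0.048 | 0.049 | 0.044 | 0.059 |
| Iran | ACC | 0.883 | 0.881 | 0.874 | 0.861 | 0.867 | 0.885 | 0.877 | 0.883 | 0.915 |
| Bene 1 | AUC | 0.843 | 0.812 | 0.822 | 0.821 | 0.882 | 0.879 | 0.886 | 0.888 | 0.881 |
| Bene 1 | HM | 0.396 | 0.263 | 0.338 | 0.258 | 0.324 | 0.441 | 0.385 | 0.447 | 0.351 |
| Bene 1 | BS | 0.161 | 0.240 | 0.256 | 0.197 | 0.188 | 0.159 | 0.198 | 0.154 | 0.173 |
| Bene 1 | ACC | 0.881 | 0.872 | 0.864 | 0.872 | 0.872 | 0.872 | 0.865 | 0.876 | 0.896 |
| Bene 2 | AUC | 0.921 | 0.838 | 0.858 | 0.868 | 0.878 | 0.844 | 0.876 | 0.883 | 0.888 |
| Bene 2 | HM | 0.537 | 0.401 | 0.478 | 0.489 | 0.505 | 0.436 | 0.432 | 0.496 | 0.506 |
| Bene 2 | BS | 0.091 | 0.136 | 0.168 | 0.122 | 0.102 | 0.115 | 0.112 | 0.103 | 0.114 |
| Bene 2 | ACC | 0.878 | 0.860 | 0.857 | 0.868 | 0.881 | 0.875 | 0.882 | 0.884 | 0.898 |
| Shuttle | AUC | 0.896 | 0.914 | 0.914 | 0.922 | 0.897 | 0.919 | 0.917 | 0.906 | 0.943 |
| Shuttle | HM | 0.653 | 0.510 | 0.562 | 0.625 | 0.620 | 0.635 | 0.622 | 0.636 | 0.658 |
| Shuttle | BS | 0.091 | 0.143 | 0.170 | 0.103 | 0.099 | 0.110 | 0.104 | 0.096 | 0.095 |
| Shuttle | ACC | 0.852 | 0.866 | 0.862 | 0.858 | 0.849 | 0.870 | 0.862 | 0.851 | 0.916 |
| Skin_segment | AUC | 0.902 | 0.913 | 0.899 | 0.926 | 0.896 | 0.922 | 0.915 | 0.903 | 0.936 |
| Skin_segment | HM | 0.637 | 0.486 | 0.555 | 0.615 | 0.586 | 0.638 | 0.607 | 0.631 | 0.659 |
| Skin_segment | BS | 0.096 | 0.154 | 0.171 | 0.107 | 0.119 | 0.110 | 0.113 | 0.098 | 0.098 |
| Skin_segment | ACC | 0.853 | 0.849 | 0.844 | 0.863 | 0.833 | 0.862 | 0.859 | 0.845 | 0.908 |
| MiniBooNE | AUC | 0.773 | 0.767 | 0.764 | 0.763 | 0.757 | 0.775 | 0.770 | 0.758 | 0.800 |
| MiniBooNE | HM | 0.272 | 0.149 | 0.213 | 0.159 | 0.277 | 0.109 | 0.104 | 0.286 | 0.302 |
| MiniBooNE | BS | 0.042 | 0.067 | 0.054 | 0.048 | 0.0439 | 0.047 | 0.047 | 0.041 | 0.059 |
| MiniBooNE | ACC | 0.875 | 0.877 | 0.885 | 0.876 | 0.884 | 0.880 | 0.893 | 0.890 | 0.903 |
| LC2017Q1 | AUC | 0.823 | 0.801 | 0.804 | 0.816 | 0.859 | 0.874 | 0.872 | 0.867 | 0.879 |
| LC2017Q1 | HM | 0.386 | 0.260 | 0.332 | 0.257 | 0.315 | 0.440 | 0.378 | 0.435 | 0.350 |
| LC2017Q1 | BS | 0.158 | 0.237 | 0.252 | 0.195 | 0.182 | 0.157 | 0.196 | 0.149 | 0.203 |
| LC2017Q1 | ACC | 0.861 | 0.888 | 0.877 | 0.875 | 0.871 | 0.878 | 0.871 | 0.873 | 0.912 |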

From Tables 7 and 5, we can see that, compared with the single classifiers, the ensemble methods show clear advantages in classification accuracy. Compared with the combined approaches in both groups, EGHE achieves the highest accuracy on all ten data sets and the highest AUC on eight of them, although it is not always the best in H-measure and Brier’s score. To further verify the effectiveness of the EGHE model, experiments on several state-of-the-art ensemble models are also performed. These are the EMPNGA-based multistage hybrid model put forward by Zhang and Xia [37]; the heterogeneous ensemble credit model put forward by Xia et al. [40]; the EBCA-RF&XGB-PSO model put forward by He et al. [41]; the heterogeneous ensemble learning-based two-stage credit risk model (TSHE) proposed by Papouskova and Hajek [42]; twin neural networks (TNN) proposed by Jayadeva et al. [43]; and a rule-based knowledge extraction (RKE) method recently proposed by Mahani and Baba [44]. Table 8 gives the results of these ensemble models on the different data sets.


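Table 8: Performance of state-of-the-art ensemble models and EGHE.

| Data set | Performance measure | EMPNGA-based model | Heterogeneous ensemble | EBCA-RF&XGB-PSO | TSHE | TNN | RKE | EGHE |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Germany | AUC | 0.802 | 0.795 | 0.798 | 0.775 | 0.811 | 0.790 | 0.823 |
| Germany | HM | 0.400 | 0.386 | 0.397 | 0.376 | 0.320 | 0.376 | 0.325 |
| Germany | BS | 0.158 | 0.164 | 0.163 | 0.184 | 0.181 | 0.158 | 0.184 |
| Germany | ACC | 0.768 | 0.859 | 0.869 | 0.839 | 0.873 | 0.874 | 0.886 |
| Australia | AUC | 0.940 | 0.923 | 0.931 | 0.933 | 0.931 | 0.929 | 0.945 |
| Australia | HM | 0.672 | 0.648 | 0.661 | 0.628 | 0.649 | 0.676 | 0.659 |
| Australia | BS | 0.092 | 0.101 | 0.092 | 0.101 | 0.095 | 0.099 | 0.096 |
| Australia | ACC | 0.875 | 0.842 | 0.861 | 0.852 | 0.882 | 0.877 | 0.895 |
| Japan | AUC | 0.932 | 0.925 | 0.919 | 0.935 | 0.923 | 0.921 | 0.937 |
| Japan | HM | 0.665 | 0.651 | 0.636 | 0.649 | 0.650 | 0.649 | 0.660 |
| Japan | BS | 0.095 | 0.091 | 0.097 | 0.090 | 0.097 | 0.090 | 0.098 |
| Japan | ACC | 0.872 | 0.883 | 0.889 | 0.887 | 0.890 | 0.887 | 0.904 |
| Iran | AUC | 0.876 | 0.824 | 0.831 | 0.824 | 0.790 | 0.821 | 0.802 |
| Iran | HM | 0.424 | 0.384 | 0.489 | 0.384 | 0.298 | 0.416 | 0.303 |
| Iran | BS | 0.058 | 0.047 | 0.061 | 0.043 | 0.058 | 0.053 | 0.059 |
| Iran | ACC | 0.907 | 0.908 | 0.921 | 0.902 | 0.911 | 0.910 | 0.915 |
| Bene 1 | AUC | 0.824 | 0.821 | 0.818 | 0.821 | 0.868 | 0.819 | 0.881 |
| Bene 1 | HM | 0.376 | 0.383 | 0.442 | 0.423 | 0.346 | 0.384 | 0.351 |
| Bene 1 | BS | 0.152 | 0.147 | 0.147 | 0.150 | 0.169 | 0.139 | 0.173 |
| Bene 1 | ACC | 0.872 | 0.869 | 0.869 | 0.865 | 0.883 | 0.872 | 0.896 |
| Bene 2 | AUC | 0.866 | 0.871 | 0.863 | 0.881 | 0.874 | 0.875 | 0.887 |
| Bene 2 | HM | 0.479 | 0.488 | 0.465 | 0.498 | 0.498 | 0.490 | 0.506 |
| Bene 2 | BS | 0.102 | 0.098 | 0.101 | 0.118 | 0.112 | 0.110 | 0.114 |
| Bene 2 | ACC | 0.871 | 0.867 | 0.875 | 0.887 | 0.885 | 0.882 | 0.898 |
| Shuttle | AUC | 0.913 | 0.870 | 0.850 | 0.865 | 0.929 | 0.850 | 0.943 |
| Shuttle | HM | 0.442 | 0.405 | 0.500 | 0.403 | 0.648 | 0.431 | 0.658 |
| Shuttle | BS | 0.060 | 0.050 | 0.062 | 0.045 | 0.095 | 0.055 | 0.096 |
| Shuttle | ACC | 0.886 | 0.878 | 0.892 | 0.886 | 0.902 | 0.892 | 0.916 |
| Skin_segment | AUC | 0.859 | 0.867 | 0.837 | 0.862 | 0.921 | 0.848 | 0.935 |
| Skin_segment | HM | 0.392 | 0.404 | 0.452 | 0.444 | 0.649 | 0.397 | 0.659 |
| Skin_segment | BS | 0.158 | 0.155 | 0.150 | 0.158 | 0.096 | 0.143 | 0.098 |
| Skin_segment | ACC | 0.870 | 0.877 | 0.889 | 0.898 | 0.894 | 0.893 | 0.908 |
| MiniBooNE | AUC | 0.903 | 0.919 | 0.883 | 0.925 | 0.788 | 0.906 | 0.800 |
| MiniBooNE | HM | 0.499 | 0.515 | 0.476 | 0.523 | 0.297 | 0.507 | 0.302 |
| MiniBooNE | BS | 0.106 | 0.103 | 0.103 | 0.124 | 0.058 | 0.114 | 0.059 |
| MiniBooNE | ACC | 0.890 | 0.895 | 0.901 | 0.897 | 0.889 | 0.901 | 0.903 |
| LC2017Q1 | AUC | 0.858 | 0.846 | 0.822 | 0.923 | 0.866 | 0.845 | 0.879 |
| LC2017Q1 | HM | 0.402 | 0.274 | 0.340 | 0.368 | 0.345 | 0.266 | 0.350 |
| LC2017Q1 | BS | 0.165 | 0.250 | 0.258 | 0.213 | 0.199 | 0.202 | 0.203 |
| LC2017Q1 | ACC | 0.888 | 0.897 | 0.903 | 0.895 | 0.898 | 0.903 | 0.912 |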

From Table 8, we can tell that the results of these models are very close. The accuracy of the EGHE model is the highest on every data set except Iran. There, the EBCA-RF&XGB-PSO model achieved a higher accuracy of 0.921 (against 0.915 for EGHE) because it uses the Extended Balance Cascade method, which can effectively address class imbalance. Nevertheless, EGHE, which is built on the ensemble strategy and GFSS theory, handles the thorny problem of unbalanced data classification better on most experimental data sets; it achieved the highest accuracy even on some severely skewed data sets, such as Shuttle, Skin_segment, MiniBooNE, and LC2017Q1.
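The accuracy comparison above can be checked directly against the ACC rows of Table 8. The following is a minimal sketch, not part of the original experiments: the values are copied from the table, while the names `MODELS`, `ACC`, `best_model`, `winners`, and `exceptions` are illustrative assumptions.

```python
# ACC rows copied from Table 8; column order follows the table header.
MODELS = ["EMPNGA-based model", "Heterogeneous ensemble", "EBCA-RF&XGB-PSO",
          "TSHE", "TNN", "RKE", "EGHE"]

ACC = {
    "Germany":      [0.768, 0.859, 0.869, 0.839, 0.873, 0.874, 0.886],
    "Australia":    [0.875, 0.842, 0.861, 0.852, 0.882, 0.877, 0.895],
    "Japan":        [0.872, 0.883, 0.889, 0.887, 0.890, 0.887, 0.904],
    "Iran":         [0.907, 0.908, 0.921, 0.902, 0.911, 0.910, 0.915],
    "Bene 1":       [0.872, 0.869, 0.869, 0.865, 0.883, 0.872, 0.896],
    "Bene 2":       [0.871, 0.867, 0.875, 0.887, 0.885, 0.882, 0.898],
    "Shuttle":      [0.886, 0.878, 0.892, 0.886, 0.902, 0.892, 0.916],
    "Skin_segment": [0.870, 0.877, 0.889, 0.898, 0.894, 0.893, 0.908],
    "MiniBooNE":    [0.890, 0.895, 0.901, 0.897, 0.889, 0.901, 0.903],
    "LC2017Q1":     [0.888, 0.897, 0.903, 0.895, 0.898, 0.903, 0.912],
}


def best_model(scores):
    """Return the name of the model with the highest score in one table row."""
    return MODELS[max(range(len(scores)), key=scores.__getitem__)]


# Most accurate model per data set, and the data sets where it is not EGHE.
winners = {name: best_model(row) for name, row in ACC.items()}
exceptions = [name for name, model in winners.items() if model != "EGHE"]

print(winners)
print(exceptions)
```

The same check would need `min` instead of `max` for the BS rows, because a lower Brier’s score is better.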

4. Conclusion

In this paper, we proposed a novel ensemble credit scoring model called EGHE, which integrates an efficient feature selection algorithm, a novel ensemble strategy, and a GFSS-based weighting method for single ELM classifiers. In the proposed model, the adaptive elastic net-based feature selection algorithm was first used to obtain high-quality training data, improving evaluation efficiency without reducing predictive precision. The ELM model was employed as the base classifier, and a novel ensemble strategy was devised to balance the imbalanced training data set for each ELM classifier. Additionally, we proposed a new weighting method to build the GFSS theory-based ensemble credit scoring model: a dual-scale classification accuracy metric, based on a new similarity measurement of GFSS, was constructed to compute the final weight of every single classifier. The main contribution of this paper is that the proposed EGHE is able to predict credit risk reliably and accurately, especially for unbalanced credit data. EGHE was compared with other credit scoring models on ten real-world data sets using four metrics (average accuracy, AUC, H-measure, and Brier’s score), and a variety of state-of-the-art ensemble models were included in the comparison to test its validity. The experimental results demonstrated that the proposed EGHE model was robust and represented a positive development in credit scoring.
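For readers who want to reproduce the kind of scores reported in the tables, three of the four metrics can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions, not the evaluation code used in the experiments: labels are assumed to be 0/1 with 1 as the positive class, the 0.5 decision threshold is an assumption, the function names and toy data are hypothetical, and the H-measure is omitted because it needs a cost-distribution integration that does not fit a short example.

```python
def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of samples whose thresholded score matches the 0/1 label."""
    hits = sum((p >= threshold) == bool(y) for y, p in zip(y_true, y_prob))
    return hits / len(y_true)


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and 0/1 label."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auc(y_true, y_prob):
    """Probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy labels and predicted default probabilities.
y_true = [1, 0, 1, 0, 1]
y_prob = [0.9, 0.2, 0.6, 0.7, 0.4]

print(accuracy(y_true, y_prob), brier_score(y_true, y_prob), auc(y_true, y_prob))
```

Accuracy and AUC are higher-is-better, whereas Brier’s score is lower-is-better, which matters when reading the BS rows of the result tables.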

Data Availability

The data sets used to support the findings of this study are available from the following sources:
(1) “Germany”: http://archive.ics.uci.edu/ml/datasets/Statlog+%28German+Credit+Data%29
(2) “Australia”: http://archive.ics.uci.edu/ml/datasets/Statlog+%28Australian+Credit+Approval%29
(3) “Japan”: https://archive.ics.uci.edu/ml/datasets/Japanese+Credit+Screening
(4) “Iran”: included in [34, 45]
(5) “Bene 1” and “Bene 2”: included in [32]
(6) “Shuttle”: http://archive.ics.uci.edu/ml/datasets/statlog+(shuttle)
(7) “Skin_segment”: http://archive.ics.uci.edu/ml/datasets/Skin+Segmentation
(8) “MiniBooNE”: http://academictorrents.com/details/7fafb101f9c7961f9b840daeb4af43039107ddef
(9) “LC2017Q1”: included in [41] and available at http://www.lendingclub.com

Disclosure

Dayu Xu and Xuyao Zhang are co-first authors.

Conflicts of Interest

The authors declare that they have no conflicts of interest regarding the publication of this paper.

Authors’ Contributions

Dayu Xu and Xuyao Zhang contributed equally to this work.

Acknowledgments

The authors acknowledge the support of the Ministry of Education of Humanities and Social Science Project (no. 20YJC630173), the National Natural Science Foundation of China (no. 31971493), the National Key Research and Development Program of China (no. 2018YFD0401403), the Zhejiang Province Key Science and Technology Projects (no. 2018C02050), the Hangzhou Agricultural and Social Development Project (no. 20190101A07), and the Zhejiang Education and Teaching Reform Project (no. jg20180175) of the Department of Education of Zhejiang Province.

References

  1. C. Serrano-Cinca and B. Gutiérrez-Nieto, “The use of profit scoring as an alternative to credit scoring systems in peer-to-peer (p2p) lending,” Decision Support Systems, vol. 89, pp. 113–122, 2016.
  2. C. Liberati and F. Camillo, “Personal values and credit scoring: new insights in the financial prediction,” Journal of the Operational Research Society, vol. 69, no. 12, pp. 1–21, 2018.
  3. J. Sun, H. Li, P.-C. Chang, and Q.-H. Huang, “Dynamic credit scoring using B & B with incremental-SVM-ensemble,” Kybernetes, vol. 44, no. 4, pp. 518–535, 2015.
  4. A. Kammoun, “Credit scoring models for a Tunisian microfinance institution: comparison between artificial neural network and logistic regression,” Review of Economics & Finance, vol. 6, pp. 61–78, 2016.
  5. Z. Zhao, S. Xu, B. H. Kang, M. M. J. Kabir, Y. Liu, and R. Wasinger, “Investigation and improvement of multi-layer perceptron neural networks for credit scoring,” Expert Systems with Applications, vol. 42, no. 7, pp. 3508–3516, 2015.
  6. S. Y. Sohn, D. H. Kim, and J. H. Yoon, “Technology credit scoring model with fuzzy logistic regression,” Applied Soft Computing, vol. 43, pp. 150–158, 2016.
  7. H. Xiao, Z. Xiao, and Y. Wang, “Ensemble classification based on supervised clustering for credit scoring,” Applied Soft Computing, vol. 43, pp. 73–86, 2016.
  8. S. Oreski and G. Oreski, “Genetic algorithm-based heuristic for feature selection in credit risk assessment,” Expert Systems with Applications, vol. 41, no. 4, pp. 2052–2064, 2014.
  9. D. Liang, C.-F. Tsai, and H.-T. Wu, “The effect of feature selection on financial distress prediction,” Knowledge-Based Systems, vol. 73, no. 1, pp. 289–297, 2015.
  10. F. Feng, H. Fujita, M. I. Ali, R. R. Yager, and X. Liu, “Another view on generalized intuitionistic fuzzy soft sets and related multiattribute decision making methods,” IEEE Transactions on Fuzzy Systems, vol. 27, no. 3, pp. 474–488, 2019.
  11. X. Peng and H. Garg, “Algorithms for interval-valued fuzzy soft sets in emergency decision making based on WDBA and CODAS with new information measure,” Computers & Industrial Engineering, vol. 119, pp. 439–452, 2018.
  12. F. Feng, Z. Xu, H. Fujita, and M. Liang, “Enhancing promethee method with intuitionistic fuzzy soft sets,” International Journal of Intelligent Systems, vol. 35, no. 7, pp. 1071–1104, 2020.
  13. R. Emekter, Y. Tu, B. Jirasakuldech, and M. Lu, “Evaluating credit risk and loan performance in online peer-to-peer (p2p) lending,” Applied Economics, vol. 47, no. 1, pp. 54–70, 2015.
  14. Y. Guo, W. Zhou, C. Luo, C. Liu, and H. Xiong, “Instance-based credit risk assessment for investment decisions in p2p lending,” European Journal of Operational Research, vol. 249, no. 2, pp. 417–426, 2016.
  15. S. Benítez-Peña, R. Blanquero, E. Carrizosa, and P. Ramírez-Cobo, “Cost-sensitive feature selection for support vector machines,” Computers & Operations Research, vol. 106, pp. 169–178, 2019.
  16. S. Jadhav, H. He, and K. Jenkins, “Information gain directed genetic algorithm wrapper feature selection for credit rating,” Applied Soft Computing, vol. 69, pp. 541–553, 2018.
  17. N. Kozodoi, S. Lessmann, K. Papakonstantinou, Y. Gatsoulis, and B. Baesens, “A multi-objective approach for profit-driven feature selection in credit scoring,” Decision Support Systems, vol. 120, pp. 106–117, 2019.
  18. L. Cui, L. Bai, Z. Zhang, Y. Wang, and E. R. Hancock, “Identifying the most informative features using a structurally interacting elastic net,” Neurocomputing, vol. 336, pp. 13–26, 2018.
  19. H. Zou and T. Hastie, “Regularization and variable selection via the elastic net,” Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol. 67, no. 2, pp. 301–320, 2005.
  20. H. Zou and H. H. Zhang, “On the adaptive elastic-net with a diverging number of parameters,” The Annals of Statistics, vol. 37, no. 4, pp. 1733–1751, 2009.
  21. A. Bequé and S. Lessmann, “Extreme learning machines for credit scoring: an empirical evaluation,” Expert Systems with Applications, vol. 86, pp. 42–53, 2017.
  22. W.-C. Lin, C.-F. Tsai, Y.-H. Hu, and J.-S. Jhang, “Clustering-based undersampling in class-imbalanced data,” Information Sciences, vol. 409-410, pp. 17–26, 2017.
  23. W. W. Y. Ng, J. Hu, D. S. Yeung, S. Yin, and F. Roli, “Diversified sensitivity-based undersampling for imbalance classification problems,” IEEE Transactions on Cybernetics, vol. 45, no. 11, pp. 2402–2412, 2015.
  24. G. Douzas and F. Bacao, “Self-Organizing Map Oversampling (SOMO) for imbalanced data set learning,” Expert Systems with Applications, vol. 82, pp. 40–52, 2017.
  25. G. Douzas, F. Bacao, and F. Last, “Improving imbalanced learning through a heuristic oversampling method based on K-means and SMOTE,” Information Sciences, vol. 465, pp. 1–20, 2018.
  26. S. Dahiya, S. S. Handa, and N. P. Singh, “A feature selection enabled hybrid-bagging algorithm for credit risk evaluation,” Expert Systems, vol. 34, no. 6, pp. 1–11, 2017.
  27. Y. Xia, C. Liu, and N. Liu, “Cost-sensitive boosted tree for loan evaluation in peer-to-peer lending,” Electronic Commerce Research and Applications, vol. 24, pp. 30–49, 2017.
  28. M. Doumpos and C. Zopounidis, “Model combination for credit risk assessment: a stacked generalization approach,” Annals of Operations Research, vol. 151, no. 1, pp. 289–306, 2007.
  29. D. Molodtsov, “Soft set theory-first results,” Computers & Mathematics with Applications, vol. 37, no. 4-5, pp. 19–31, 1999.
  30. P. K. Maji, R. Biswas, and A. R. Roy, “Fuzzy soft sets,” Journal of Fuzzy Mathematics, vol. 9, no. 3, pp. 589–602, 2001.
  31. P. K. Maji and S. K. Samanta, “Generalised fuzzy soft sets,” Computers & Mathematics with Applications, vol. 59, no. 4, pp. 1425–1432, 2010.
  32. B. Baesens, R. Setiono, C. Mues, and J. Vanthienen, “Using neural network rule extraction and decision tables for credit-risk evaluation,” Management Science, vol. 49, no. 3, pp. 312–329, 2003.
  33. K. Kennedy, B. Mac Namee, and S. J. Delany, “Using semi-supervised classifiers for credit scoring,” Journal of the Operational Research Society, vol. 64, no. 4, pp. 513–529, 2012.
  34. A. I. Marqués, V. García, and J. S. Sánchez, “Exploring the behaviour of base classifiers in credit scoring ensembles,” Expert Systems with Applications, vol. 39, no. 11, pp. 10244–10250, 2012.
  35. H. Zhao and S. Yu, “Cost-sensitive feature selection via the ℓ2,1-norm,” International Journal of Approximate Reasoning, vol. 104, pp. 25–37, 2019.
  36. D. Wang, Z. Zhang, R. Bai, and Y. Mao, “A hybrid system with filter approach and multiple population genetic algorithm for feature selection in credit scoring,” Journal of Computational and Applied Mathematics, vol. 329, pp. 307–321, 2018.
  37. D. Zhang and Z. Xia, “Weighted-averaging estimator for possible threshold in segmented linear regression model,” Journal of Statistical Planning and Inference, vol. 200, pp. 102–118, 2019.
  38. D. Tripathi, D. R. Edla, V. Kuppili, A. Bablani, and R. Dharavath, “Credit scoring model based on weighted voting and cluster based feature selection,” Procedia Computer Science, vol. 132, pp. 22–31, 2018.
  39. A. L. M. Vilela, C. Wang, K. P. Nelson, and H. E. Stanley, “Majority-vote model for financial markets,” Physica A: Statistical Mechanics and Its Applications, vol. 515, pp. 762–770, 2019.
  40. Y. Xia, C. Liu, B. Da, and F. Xie, “A novel heterogeneous ensemble credit scoring model based on bstacking approach,” Expert Systems with Applications, vol. 93, pp. 182–199, 2018.
  41. H. He, W. Zhang, and S. Zhang, “A novel ensemble method for credit scoring: adaption of different imbalance ratios,” Expert Systems with Applications, vol. 98, pp. 105–117, 2018.
  42. M. Papouskova and P. Hajek, “Two-stage consumer credit risk modelling using heterogeneous ensemble learning,” Decision Support Systems, vol. 118, pp. 33–45, 2019.
  43. Jayadeva, H. Pant, M. Sharma, and S. Soman, “Twin Neural Networks for the classification of large unbalanced datasets,” Neurocomputing, vol. 343, pp. 34–49, 2019.
  44. A. Mahani and A. R. Baba-Ali, “A new rule-based knowledge extraction approach for imbalanced datasets,” Knowledge and Information Systems, pp. 1–27, 2019.
  45. A. Marqués, V. García, and J. S. Sánchez, “Two-level classifier ensembles for credit risk assessment,” Expert Systems with Applications, vol. 39, pp. 10916–10922, 2012.

Copyright © 2020 Dayu Xu et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

