Special Issue: Machine Learning, Deep Learning, and Optimization Techniques for Transportation
Research on Credit Risk Measurement of Small and Micro Enterprises Based on the Integrated Algorithm of Improved GSO and ELM
Small and micro enterprises play a very important role in economic growth, technological innovation, employment, and social stability. Because they lack credible financial statements and reliable business records, they face financing difficulties, which has become an important factor hindering their development. Therefore, a credit risk measurement model based on the integrated algorithm of improved GSO (Glowworm Swarm Optimization) and ELM (Extreme Learning Machine) is proposed in this paper. First, according to the growth and development characteristics of small and micro enterprises in the big data environment, the formation mechanism of their credit risk is analyzed from the perspectives of granularity scaling, cross-border association, and global view driven by big data, and an index system for comprehensive credit measurement is established by summarizing and analyzing the factors that affect the credit evaluation indexes. Second, a new algorithm is built on the parallel integration of the good point set adaptive glowworm swarm optimization algorithm and the extreme learning machine. Finally, the integrated algorithm based on improved GSO and ELM is applied to credit risk measurement modeling: sample data of small and micro enterprises in China are collected, and simulation experiments are carried out in MATLAB. The experimental results show that the model is effective, feasible, and accurate. The results provide a reference for solving the credit risk measurement problem of small and micro enterprises and lay a foundation for theoretical research on credit risk management.
In recent years, China’s economy has maintained a good momentum of development. The number of domestic enterprises has grown steadily, especially small and micro enterprises, which now form a large and dynamic group among the participants in the market economy and are an important part of China’s economy. However, small and micro enterprises also face severe financing difficulties in the process of survival and development. Because of their own limitations and of political, economic, legal, and other external factors, they have few financing channels and high financing costs. From a macro perspective, China’s market economy system is still imperfect and its credit system is defective: small and micro enterprises cannot obtain a comprehensive and objective credit evaluation, and the financial investment industry has not paid them enough attention, which narrows their financing channels. From a micro perspective, small and micro enterprises have light assets, small scale, and opaque financial situations, so it is difficult to assess their credit risk reasonably. In addition, private lending carries high interest rates and costs, which makes financing difficult.
The measurement of the credit risk of small and micro enterprises is a research hotspot for scholars at home and abroad. In China, Man et al. construct a lasso logistic model to identify the key indicators that affect the credit risk of small and micro enterprises, based on default data from the bank loans of 496 unlisted small and micro enterprises. Han uses the fuzzy integral support vector machine algorithm to build a credit risk assessment model for small and micro enterprises from the perspectives of loan mode, financial situation, and enterprise characteristics. Su uses a mixed analysis of soft information and hard information to measure the credit risk of small and micro enterprises. Zhang et al. propose a dynamic combination evaluation model based on fuzzy clustering and the SOM-k algorithm to evaluate the credit risk of small and micro enterprises. Based on the theory of information asymmetry, Yoshino and Taghizadeh-Hesary built a theoretical model of SMEs’ credit and empirically analyzed the factors influencing the optimal guarantee ratio of enterprises. Hanggraeni et al. propose a more suitable credit risk measurement method for small- and medium-sized enterprises, which uses a default probability measure to improve the Z value and estimate its critical value. Based on the fuzzy matter-element Euclidean approach degree model, Zhou et al. elaborated the inherent law of credit risk evaluation for industrial park enterprises and provided a practical method for research on their credit rating. Shi et al. found that defaults on highly rated loans to small- and medium-sized enterprises lead to a low recovery rate, so they used three credit data sets to test a credit risk method that helps small- and medium-sized enterprises solve this mismatch problem.
In summary, current research on credit risk measurement methods for small and micro enterprises approaches the problem from two perspectives: constructing a credit evaluation index system and constructing a risk measurement model. Foreign scholars emphasize quantitative research, mining the credit data of small- and medium-sized enterprises for research and evaluation. Domestic scholars pay more attention to qualitative research on the construction of the indicator system together with quantitative research on credit risk measurement. In addition, big data provides more comprehensive and accurate digital management for many subject areas. Chen and Wu argue that, against the background of big data, traditional management has become data management and traditional decision-making has become decision-making based on data mining analysis. Research on big data applications should start from the data characteristics, problem characteristics, and management decision characteristics of big data. Against this background, a credit risk measurement model for small and micro enterprises driven by big data is proposed.
First, the characteristics of small and micro enterprises and the impact of big data on their credit evaluation are analyzed in depth from the perspectives of granularity scaling, cross-border association, and global view driven by big data, and the formation mechanism of their credit risk is explored. On this basis, credit risk measurement indicators are selected to establish a credit risk measurement indicator system. Second, because the relationships within credit risk data are inherently nonlinear, neural networks, with their adaptive learning characteristics, have become the most common and a relatively accurate classifier in credit risk measurement [10–13].
Therefore, the ELM feedforward neural network is employed to solve the problem of credit risk measurement. Because the number of hidden layer nodes has a great influence on the classification accuracy, the initial weights, thresholds, and hidden layer node parameters of ELM are optimized by the improved GSO algorithm. The resulting parallel integrated learning algorithm, based on the improved GSO algorithm and the ELM neural network, is applied to the credit risk measurement of small and micro enterprises.
The rest of this paper consists of three parts. Section 2 explains the relevant algorithms and theories: it puts forward the improved GSO algorithm based on the good point set and combines it with an adaptive step-size strategy. Section 3 introduces the index system for comprehensive credit measurement, established by summarizing and analyzing the influencing factors from the perspectives of granularity scaling, cross-border association, and global view driven by big data, and proposes a credit risk measurement model based on the integrated algorithm of IGSO and ELM. Section 4 selects 10 standard benchmark functions to evaluate the performance of the IGSO algorithm and then uses a data set of small and micro enterprises collected from a region in China to verify the credit risk measurement model based on the integrated algorithm of IGSO and ELM.
2. Description of Relevant Algorithms
2.1. GSO Algorithm
Metaheuristic algorithm is an improvement of heuristic algorithm, which is the combination of random algorithm and local search algorithm. Traditional metaheuristic algorithms include tabu search algorithm, simulated annealing algorithm, genetic algorithm, ant colony optimization algorithm, particle swarm optimization algorithm, artificial fish swarm algorithm, artificial bee colony algorithm, artificial neural network algorithm, glowworm swarm optimization algorithm, etc. At present, there are several new metaheuristic algorithms, such as Henry gas solubility (HGS) , Slime mould algorithm (SMA) , and Harris hawk optimization (HHO) . Swarm intelligence optimization algorithm is an algorithm designed based on the group behavior characteristics of natural organisms, and it is also a method to solve distributed problems. The common swarm intelligence algorithms are ant colony algorithm, particle swarm optimization algorithm, artificial fish swarm algorithm, artificial bee colony algorithm, wolf colony algorithm, glowworm swarm optimization algorithm, and so on. GSO (Glowworm swarm optimization) algorithm is not only a metaheuristic algorithm, but also a swarm intelligence algorithm.
The GSO algorithm is a bionic swarm intelligence algorithm proposed by the Indian scholars Krishnanand and Ghose in 2005, which simulates the foraging and courtship behavior of glowworm swarms in nature. Compared with other intelligent optimization algorithms, the GSO algorithm is simple and easy to implement, with a clear algorithm flow. Because it needs few parameters, has a simple working mechanism, is easy to program, and can capture multiple extreme values, this paper selects the GSO algorithm as the research object for the relevant theoretical and applied research. The algorithm has been applied to multimodal function optimization, multisource signal location, wireless sensor node layout, simulated robot swarms, cluster analysis, and other fields, which shows good research and application prospects.
The main idea of the GSO algorithm is that each glowworm in the search space represents a feasible solution of the optimization problem. Each glowworm has its own fluorescein and sensing radius. Its brightness is related to the target value at its location, and a glowworm with higher brightness has a better index value. In the iterative process, a brighter glowworm has a stronger attraction ability and attracts other glowworms to move towards it. Each glowworm also has its own decision radius, which is affected by the neighboring glowworms. When there are few glowworms around it, the decision radius of a glowworm increases so that it can take in more glowworms; when there are many glowworms around it, the decision radius becomes smaller. In the end, most of the glowworms gather at several positions with better objective function values. The GSO algorithm is described by the following formulas:

$$l_i(t) = (1 - \rho)\, l_i(t-1) + \gamma\, J(x_i(t)) \tag{1}$$

$$N_i(t) = \{\, j : \|x_j(t) - x_i(t)\| < r_d^i(t),\ l_i(t) < l_j(t) \,\} \tag{2}$$

$$p_{ij}(t) = \frac{l_j(t) - l_i(t)}{\sum_{k \in N_i(t)} \left( l_k(t) - l_i(t) \right)} \tag{3}$$

$$x_i(t+1) = x_i(t) + s\, \frac{x_j(t) - x_i(t)}{\|x_j(t) - x_i(t)\|} \tag{4}$$

$$r_d^i(t+1) = \min\{\, r_s,\ \max\{\, 0,\ r_d^i(t) + \beta\,( n_t - |N_i(t)| ) \,\} \,\} \tag{5}$$

where $l_i(t)$ is the fluorescein value of glowworm $i$ in the t-th iteration, $J(x_i(t))$ is the target function value at its position $x_i(t)$ in the t-th iteration, $\rho$ is the volatilization coefficient of the fluorescein value, $\gamma$ is the enhancement factor of fluorescein, $N_i(t)$ is the neighborhood set of glowworm $i$, $p_{ij}(t)$ is the probability that glowworm $i$ moves towards glowworm $j$, $\beta$ is the coefficient of change of the perception radius, $n_t$ is the number threshold of neighborhood glowworms, $r_s$ is the perception radius of the neighborhood, $r_d^i(t)$ is the dynamic decision domain, and $s$ is the step size.
The GSO algorithm is described as follows:
Step 1: Initialize the relevant parameters of the algorithm, including the population size, the number of iterations, and the other parameters to be set.
Step 2: Convert the objective function value corresponding to the position of each glowworm in the t-th iteration to a fluorescein value by formula (1).
Step 3: Within the radius of its dynamic decision domain, each glowworm chooses the individuals whose brightness is higher than its own to form its neighborhood set by formula (2). The probability of moving towards each individual in the neighborhood set is calculated by formula (3).
Step 4: Select the object to move towards and update the position of glowworm i according to formula (4).
Step 5: Update the dynamic decision domain radius of glowworm i according to formula (5).
Step 6: Judge whether the algorithm has reached the maximum number of iterations. If not, go to Step 2; otherwise, end.
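The six steps can be sketched in a few lines of Python with NumPy. This is a minimal illustration of the standard GSO loop for a minimization problem, not the authors' MATLAB implementation: the function name `gso_minimize`, every default parameter value, and the choice of the search-domain diagonal as the sensing radius are assumptions made only for this example.

```python
import numpy as np

def gso_minimize(f, lower, upper, n=50, iters=100, rho=0.4, gamma=0.6,
                 beta=0.08, n_t=5, s=0.03, l0=5.0, seed=0):
    """Basic GSO for minimization; brightness is driven by -f(x).

    All default values are illustrative assumptions, not the paper's settings.
    """
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    r_s = float(np.linalg.norm(upper - lower))   # sensing radius (assumed: domain diagonal)
    x = rng.uniform(lower, upper, size=(n, lower.size))  # Step 1: random initial positions
    lum = np.full(n, l0)                         # fluorescein of each glowworm
    r_d = np.full(n, r_s)                        # dynamic decision radius of each glowworm
    for _ in range(iters):
        # Step 2, formula (1): fluorescein update (lower f means brighter)
        J = -np.array([f(p) for p in x])
        lum = (1.0 - rho) * lum + gamma * J
        new_x = x.copy()
        for i in range(n):
            dist = np.linalg.norm(x - x[i], axis=1)
            # Step 3, formula (2): brighter glowworms inside the decision radius
            nbrs = np.flatnonzero((dist > 0) & (dist < r_d[i]) & (lum > lum[i]))
            if nbrs.size:
                # formula (3): roulette selection proportional to the brightness gap
                w = lum[nbrs] - lum[i]
                j = rng.choice(nbrs, p=w / w.sum())
                # Step 4, formula (4): fixed-size step towards the chosen neighbor
                d = x[j] - x[i]
                new_x[i] = x[i] + s * d / np.linalg.norm(d)
            # Step 5, formula (5): decision radius update
            r_d[i] = min(r_s, max(0.0, r_d[i] + beta * (n_t - nbrs.size)))
        x = np.clip(new_x, lower, upper)
    vals = np.array([f(p) for p in x])
    best = int(np.argmin(vals))
    return x[best], float(vals[best])
```

For example, `gso_minimize(lambda p: float(np.sum(p ** 2)), [-3, -3], [3, 3])` returns the best position of the final population and its objective value for a two-dimensional sphere function.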
Due to the uneven distribution of the initial solution in the solution space, the algorithm is unstable, slow in convergence speed, and low in accuracy. In order to avoid the premature problem of GSO algorithm, the idea of Good Point Set (GPS) Theory is employed to generate the initial glowworm population with uniform distribution. At the same time, a new inertia weight function of glowworm moving is used to dynamically update the moving step length, i.e., adaptive step-size, so as to further improve the stability, convergence speed, and accuracy of the GSO algorithm.
2.2. Improved GSO Algorithm
2.2.1. Good Point Set Improved GSO Algorithm
The theory of the good point set was put forward by Hua Luogeng and Wang Yuan in their book on the applications of number theory to approximate analysis. A good point set can provide better support for calculation in high-dimensional spaces. A glowworm optimization algorithm based on the good point set has previously been proposed to optimize the initial weights and thresholds of a BP neural network for agricultural drought evaluation. According to [23–25], the basic definition and structure of the good point set are described as follows.
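Definition 1. Let $G_m$ be the unit cube in m-dimensional Euclidean space, which is expressed as $G_m = \{ x = (x_1, x_2, \ldots, x_m) : 0 \le x_i \le 1,\ i = 1, 2, \ldots, m \}$.
Definition 2. According to the number $n$ and dimension $m$ of the sample, a point set of the same size is generated as the target set $P_n(k) = \{ (\{ r_1 k \}, \{ r_2 k \}, \ldots, \{ r_m k \}) \}$, where $k = 1, 2, \ldots, n$, $r = (r_1, r_2, \ldots, r_m) \in G_m$, and $\{ r_i k \}$ is the decimal part of $r_i k$. Compare the sample points with the target set, and assume $\varphi(n) = \sup_{\gamma \in G_m} \left| N_n(\gamma)/n - |\gamma| \right|$, where $\gamma = (\gamma_1, \ldots, \gamma_m)$, $N_n(\gamma)$ is the number of points of $P_n(k)$ satisfying $0 \le x_i < \gamma_i$ for all $i$, and $|\gamma| = \gamma_1 \gamma_2 \cdots \gamma_m$; $\varphi(n)$ is known as the deviation of the point set $P_n(k)$.
Definition 3. Set $\varphi(n) = C(r, \varepsilon)\, n^{-1+\varepsilon}$, where $C(r, \varepsilon)$ is a constant related only to $r$ and $\varepsilon$, $\varphi(n)$ is the deviation of $P_n(k)$, and $\varepsilon$ is an arbitrarily small positive number. Then $P_n(k)$ is considered a good point set and $r$ is a good point.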
The order of the error is related only to the number of samples and is independent of the spatial dimension of the samples, which makes the good point set very well suited to high-dimensional approximation. Generating the initial population with the good point set method is better than generating it randomly, because the deviation of the good point set method is $O(n^{-1+\varepsilon})$, whereas the deviation of the random method is $O(n^{-1/2})$.
Because the accuracy of a good point set is independent of the dimension, using it to design the initial glowworm population uniformly overcomes the shortcomings of the traditional methods and produces an initial population with better diversity.
The size of the initial glowworm population is n, and a set of n good points in the m-dimensional space can be taken by any of the following three methods, each of which defines the good point $r = (r_1, \ldots, r_m)$:
(1) Generating the good point set by the exponential sequence method: $r_j = \{ e^{j} \}$, $1 \le j \le m$.
(2) Generating the good point set by the square root sequence method: $r_j = \{ \sqrt{p_j} \}$, $1 \le j \le m$, where the $p_j$ are distinct prime numbers.
(3) Generating the good point set by the circle division method: $r_j = \{ 2\cos(2\pi j / p) \}$, $1 \le j \le m$, where $p$ is the smallest prime number satisfying $(p - 3)/2 \ge m$.
Here $\{\cdot\}$ denotes the decimal part. Using the good point set method to design the initial glowworm population uniformly can produce an initial population with better diversity. Figures 1–4 show the two-dimensional initial population distributions of size 400 generated by the random method and by the three good point set methods, respectively. The figures show directly that the points generated by the good point set methods are distributed much more uniformly than the random points. The construction of the good points is independent of the space dimension and can be used to solve high-dimensional problems. Because the construction is deterministic, the same number of points always yields the same distribution. Therefore, a relatively good initial glowworm population can be obtained by mapping the generated good points to the target solution space of the GSO algorithm.
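As an illustration of how such an initial population could be generated, the sketch below uses the circle division construction, taking the good point as 2cos(2πj/p) with p the smallest prime satisfying (p − 3)/2 ≥ m, and maps the points linearly onto a box-shaped search space. The function names and the linear mapping are assumptions made for this example, which is not taken from the authors' code.

```python
import numpy as np

def smallest_prime_at_least(v):
    """Return the smallest prime number that is >= v (simple trial division)."""
    def is_prime(q):
        return q >= 2 and all(q % d for d in range(2, int(q ** 0.5) + 1))
    while not is_prime(v):
        v += 1
    return v

def good_point_set(n, m, lower=0.0, upper=1.0):
    """n good points in m dimensions (circle division method), scaled to [lower, upper]."""
    p = smallest_prime_at_least(2 * m + 3)        # ensures (p - 3) / 2 >= m
    j = np.arange(1, m + 1)
    r = 2.0 * np.cos(2.0 * np.pi * j / p)         # components of the good point r
    k = np.arange(1, n + 1).reshape(-1, 1)
    unit = np.mod(k * r, 1.0)                     # decimal part of r_j * k, point k in row k-1
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    return lower + unit * (upper - lower)         # linear map onto the search box
```

Calling `good_point_set(400, 2)` gives a 400-by-2 array comparable in size to the populations plotted in the figures. Because no random numbers are involved, repeated calls return identical populations.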
2.2.2. Adaptive Step-Size Improved GSO Algorithm
In the GSO algorithm, each glowworm has a different search range determined by its sensing radius. Whether the GSO algorithm finds the global or a local optimal solution depends on whether the individual can move within the sensing range. As the number of iterations increases, glowworm individuals tend to converge near a peak. At this point, if the distance between a glowworm and the peak is less than the moving step, the individual will move to the other side of the peak, and in the next iteration it will move back again, still failing to reach the peak. The glowworm thus repeatedly moves back and forth around the peak, which is called the oscillation phenomenon. To solve this problem, it is necessary to adjust the step size dynamically according to the search results of different stages, so as to balance the global optimization ability, convergence speed, and optimization accuracy. Based on the idea of the inertia weight of particle velocity in the particle swarm optimization algorithm, an inertia weight function of the glowworm moving step, $s(t)$, is defined as a function of the iteration number $t$ (formula (6)). It is parameterized by the minimum moving step size $s_{\min}$, the maximum moving step size $s_{\max}$, and the maximum number of iterations $t_{\max}$.
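The inertia weight function of the glowworm moving step is shown in Figure 5. Therefore, in the improved GSO algorithm the position of the i-th glowworm is no longer updated by formula (4) but by formula (7):

$$x_i(t+1) = x_i(t) + s(t)\, \frac{x_j(t) - x_i(t)}{\|x_j(t) - x_i(t)\|} \tag{7}$$

where $x_j(t)$ is the position of the neighbor selected by glowworm $i$ and $s(t)$ is the adaptive step size given by formula (6).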
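To make the effect of an adaptive step concrete, the sketch below pairs the position update with a linearly decreasing step schedule. The linear form is purely an assumption borrowed from the linearly decreasing inertia weight commonly used in particle swarm optimization; it is not the paper's formula (6), and the default values of `s_min` and `s_max` are illustrative only.

```python
import numpy as np

def step_size(t, t_max, s_min=0.001, s_max=0.05):
    """Assumed linear schedule: s_max at t = 0, shrinking to s_min at t = t_max.

    This stands in for the paper's inertia weight function, whose exact
    form is not reproduced here.
    """
    return s_max - (s_max - s_min) * t / t_max

def move_towards(x_i, x_j, s):
    """Move glowworm i a distance s along the unit vector pointing to glowworm j."""
    x_i = np.asarray(x_i, dtype=float)
    d = np.asarray(x_j, dtype=float) - x_i
    return x_i + s * d / np.linalg.norm(d)
```

With a fixed step of 0.05, a glowworm that is 0.02 away from a peak overshoots it on every move, whereas a late-iteration step such as `step_size(95, 100)` is small enough for the glowworm to close in on the peak.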
2.3. Extreme Learning Machine
The traditional single hidden layer feedforward neural network model is mainly trained by gradient descent, as in the BP neural network algorithm. Its learning speed often fails to meet practical needs, it easily falls into a local optimal solution, and its parameters need to be adjusted for each application scenario.
In 2004, Huang et al. proposed a new feedforward neural network, the extreme learning machine (ELM). The ELM has advantages in data fitting, regression, classification, pattern recognition, and other related problems and has been widely used in image processing, medical diagnosis, fault inspection, traffic sign recognition, and other fields. The ELM is a fast learning algorithm for single hidden layer feedforward neural networks based on Moore-Penrose generalized inverse matrix theory.
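Suppose the number of neurons in the hidden layer of ELM is $L$ and the number of training samples is $N$. There are $N$ arbitrary samples $(x_j, t_j)$, where $x_j = [x_{j1}, x_{j2}, \ldots, x_{jn}]^T \in R^n$ and $t_j = [t_{j1}, t_{j2}, \ldots, t_{jm}]^T \in R^m$. A single hidden layer neural network with L hidden layer nodes can be expressed as

$$\sum_{i=1}^{L} \beta_i\, g(w_i \cdot x_j + b_i) = o_j, \quad j = 1, 2, \ldots, N \tag{8}$$

where $g(\cdot)$ is the activation function, $w_i$ represents the input weight, $\beta_i$ is the output weight, $b_i$ is the offset of the i-th hidden layer unit, $o_j$ is the network output for sample $j$, and $w_i \cdot x_j$ is the inner product of $w_i$ and $x_j$.
The learning goal of the single hidden layer neural network is to minimize the output error, which can be expressed as

$$\sum_{j=1}^{N} \| o_j - t_j \| = 0 \tag{9}$$

That is, there are $\beta_i$, $w_i$, and $b_i$, making

$$\sum_{i=1}^{L} \beta_i\, g(w_i \cdot x_j + b_i) = t_j, \quad j = 1, 2, \ldots, N \tag{10}$$

It can also be expressed as

$$H \beta = T \tag{11}$$

$$H = \begin{bmatrix} g(w_1 \cdot x_1 + b_1) & \cdots & g(w_L \cdot x_1 + b_L) \\ \vdots & \ddots & \vdots \\ g(w_1 \cdot x_N + b_1) & \cdots & g(w_L \cdot x_N + b_L) \end{bmatrix}_{N \times L} \tag{12}$$

$$\beta = \begin{bmatrix} \beta_1^T \\ \vdots \\ \beta_L^T \end{bmatrix}_{L \times m}, \quad T = \begin{bmatrix} t_1^T \\ \vdots \\ t_N^T \end{bmatrix}_{N \times m} \tag{13}$$

where $H$ is the output of the hidden layer nodes, $\beta$ is the output weight, and $T$ is the expected output.
In order to train the single hidden layer neural network, we hope to get $\hat{w}_i$, $\hat{b}_i$, and $\hat{\beta}$, making

$$\| H(\hat{w}_i, \hat{b}_i)\, \hat{\beta} - T \| = \min_{w, b, \beta} \| H(w_i, b_i)\, \beta - T \| \tag{14}$$

where $i = 1, 2, \ldots, L$, which is equivalent to minimizing the loss function:

$$E = \sum_{j=1}^{N} \left( \sum_{i=1}^{L} \beta_i\, g(w_i \cdot x_j + b_i) - t_j \right)^2 \tag{15}$$

Traditional gradient-based learning algorithms need to adjust all parameters in the process of iteration. In the ELM algorithm, once the input weights $w_i$ and the hidden layer biases $b_i$ are determined, the output matrix $H$ of the hidden layer is uniquely determined. Training the single hidden layer neural network can then be transformed into solving the linear system $H\beta = T$, and the output weight can be determined as

$$\hat{\beta} = H^{\dagger} T \tag{16}$$

where $H^{\dagger}$ is the Moore-Penrose generalized inverse of the hidden layer output matrix, called the pseudoinverse for short. Therefore, the learning process of ELM can be summarized in three steps:
(1) The input weights and hidden layer thresholds are given randomly.
(2) According to the input training data and the activation function of the hidden layer, the output matrix of the hidden layer is calculated.
(3) According to formula (16), the output weight of the network is calculated.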
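The three-step learning process maps directly onto a few lines of NumPy. The sketch below is a minimal illustration rather than the authors' implementation: the names `elm_fit` and `elm_predict`, the uniform range [-1, 1] for the random weights and biases, and the sigmoid activation are assumptions made for this example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def elm_fit(X, T, L, seed=0):
    """Train an ELM with L hidden nodes on inputs X (N x n) and targets T (N x m)."""
    rng = np.random.default_rng(seed)
    # Step (1): random input weights and hidden thresholds (range is an assumption)
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], L))
    b = rng.uniform(-1.0, 1.0, size=L)
    # Step (2): hidden layer output matrix H (N x L)
    H = sigmoid(X @ W + b)
    # Step (3): output weights from the Moore-Penrose pseudoinverse of H
    beta = np.linalg.pinv(H) @ T
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Network output for new inputs X using the stored parameters."""
    return sigmoid(X @ W + b) @ beta
```

No iteration is needed: the only fitted quantity is `beta`, which is the least-squares solution of the linear system for the fixed random hidden layer.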
3. Model of Credit Risk Measurement of Small and Micro Enterprises
3.1. Construction of Credit Risk Measurement Index System
The construction of credit risk measurement indicators for small and micro enterprises exhibits the three characteristics of a big data problem: granularity scaling, cross-border association, and global view. It is therefore itself a problem related to big data. First, granularity scaling refers to the digitalization of the elements of the credit problem of small and micro enterprises, which can be scaled among different granularity levels of the measurement indexes. Second, cross-border association refers to the expansion of the factor space of the credit problem. It is necessary not only to consider the conventional elements and domain perspectives of small and micro enterprises but also to emphasize “externality” and “cross-border” factors, associating internal data (such as the basic information, financial and nonfinancial data, and industry internal data of small and micro enterprises) with external data (such as the social contact and e-commerce data of business owners or managers). Finally, the global view refers to the global nature of the definition and solution of the credit index problem, emphasizing the control and interpretation of the overall picture of the relevant situation and its dynamic evolution.
Referring to the credit indicator systems used by Moody’s Investors Service and Standard & Poor’s, and combining them with the development characteristics of small and micro enterprises driven by big data, we pay full attention to the impact of the social platform and e-commerce platform data of enterprise owners on the credit status of small and micro enterprises. The credit status of enterprise operators, enterprise innovation ability, enterprise competitiveness, and staff quality are highlighted, so as to reasonably reflect the credit level of small and micro enterprises. Initially, 7 first-level indicators and 22 second-level indicators are selected. SPSS software is used for factor analysis to quantify the correlation between indicators, and principal component analysis is used to extract factors from the indicators. Finally, 7 primary indicators and 10 secondary indicators are selected as the credit risk measurement indicator system, which is shown in Table 1.
3.2. Model of Credit Risk Measurement
3.2.1. Ideas of the Integrated Algorithm
ELM randomly initializes the connection weights between the input layer and the hidden layer and the hidden layer thresholds before training. It does not need many iterations of learning and can directly calculate the least squares solution of the output weight matrix. Although its learning speed is fast and its parameter adjustment is simple, the robustness of the model is greatly affected when the training data set contains noise or is unevenly distributed. In addition, since the input weights and hidden layer thresholds are randomly selected, ELM needs more hidden layer nodes than other feedforward neural networks, which affects the stability and generalization of the network. Therefore, IGSO is used to optimize the weights and thresholds of ELM, and an ensemble strategy is applied to find the best weights and thresholds and the most reasonable number of hidden layer nodes.
The idea of the IGSO-ELM integrated learning algorithm is to determine the structure of ELM according to the input and output parameters, which in turn determines the coding length of each individual glowworm. Each individual in the population contains the initial weights and thresholds of ELM. That is to say, the initial weights and thresholds of ELM are obtained by decoding the glowworm individuals in the IGSO algorithm, and the IGSO algorithm is then used to optimize them. In this way, a parallel and interactive learning algorithm combining IGSO and ELM is built, and the optimal ELM weights and thresholds are finally obtained. The flow chart of IGSO-ELM is shown in Figure 6.
3.3. Description of the Integrated Algorithm
The number of ELM input nodes is determined by the credit risk measurement indexes of small and micro enterprises. The number of hidden layer nodes is decided by the number of samples. The output node indicates whether the credit record of the small or micro enterprise is in default. The IGSO-ELM algorithm is implemented as follows:
Step 1. Encoding: The parameters of ELM, namely the weights $w$, the thresholds $b$, and the initial number of hidden layer nodes $L$, are encoded as a real-valued glowworm. The glowworm population is initialized by the theory of the good point set. Calculate the output weights and create the ELM network according to formulas (11)–(14) and (16).
Step 2. Initializing parameters: Set the size of the glowworm population $n$, the initial fluorescein $l_0$, the sensing radius $r_s$, the initial step length $s_0$, the fluorescein volatility coefficient $\rho$, and the fluorescein renewal rate $\gamma$ of each glowworm. Initialize the iteration counter $t$ and set the maximum number of iterations $t_{\max}$.
Step 3. Calculating glowworm fitness: Decode each glowworm to generate the weights and thresholds of ELM, train and test the network to get the network test error (MAE), use this error as the fitness of the glowworm, and update the fluorescein value of each glowworm by formula (1).
Step 4. Updating glowworm locations: Calculate the neighborhood set by formula (2), compute the moving probability by formula (3), and update the location of each glowworm based on the moving direction towards the target selected by the roulette method and the step length calculated by formula (6).
Step 5. Updating the decision radius: Update the decision radius of each glowworm according to formula (5).
Step 6. Judging: Judge whether the output accuracy of ELM meets the end condition. If it does, the optimal results are given to the ELM network to produce the output and the iteration ends. Otherwise, judge whether the maximum number of iterations has been reached; if not, go to Step 3; otherwise, end.
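The encoding in Step 1 and the fitness evaluation in Step 3 could look like the sketch below. It is a simplified illustration under stated assumptions: the number of hidden nodes `L` is held fixed rather than encoded in the glowworm, the activation is a sigmoid, and the names `decode` and `fitness` are hypothetical.

```python
import numpy as np

def decode(vec, n_in, L):
    """Split a glowworm position vector into ELM input weights W and thresholds b."""
    W = vec[: n_in * L].reshape(n_in, L)
    b = vec[n_in * L : n_in * L + L]
    return W, b

def hidden_output(X, W, b):
    """Sigmoid hidden layer output matrix H for inputs X."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def fitness(vec, X_train, T_train, X_test, T_test, n_in, L):
    """Test MAE of the ELM defined by one glowworm (lower is better)."""
    W, b = decode(vec, n_in, L)
    # Output weights from the pseudoinverse of the training hidden layer matrix
    beta = np.linalg.pinv(hidden_output(X_train, W, b)) @ T_train
    pred = hidden_output(X_test, W, b) @ beta
    return float(np.mean(np.abs(pred - T_test)))
```

For a 10-10-1 network each glowworm is then a vector of length 10 × 10 + 10 = 110. Because GSO treats higher fluorescein as better, a minimizing fitness such as this MAE would enter formula (1) with its sign reversed.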
3.4. Model Performance Evaluation Index
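The performance evaluation criterion is an indispensable part of the measurement model. A proper evaluation criterion assesses the accuracy of different models, which allows them to be compared with each other, and can also be used to define a warning threshold. There are many error measures for evaluating the degree of match between the model and the observed data, such as MAE (mean absolute error), RMSE (root mean square error), CA (classification accuracy), and $R^2$ (squared correlation coefficient). The details are as follows:

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} | \hat{y}_i - y_i |$$

$$\mathrm{RMSE} = \sqrt{ \frac{1}{n} \sum_{i=1}^{n} ( \hat{y}_i - y_i )^2 }$$

$$\mathrm{CA} = \frac{N_c}{N_t} \times 100\%$$

$$R^2 = \frac{ \left[ \sum_{i=1}^{n} ( \hat{y}_i - \bar{\hat{y}} )( y_i - \bar{y} ) \right]^2 }{ \sum_{i=1}^{n} ( \hat{y}_i - \bar{\hat{y}} )^2 \, \sum_{i=1}^{n} ( y_i - \bar{y} )^2 }$$

where $\hat{y}_i$ and $\bar{\hat{y}}$ represent the measured value and the average measured value; $y_i$ and $\bar{y}$ represent the observed value and the average observed value; $N_c$ and $N_t$ represent the number of samples correctly classified and the number of test samples; and $n$ represents the number of observation samples.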
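The four indexes can be computed with a few NumPy helpers, sketched below. The 0.5 decision threshold used to turn a continuous network output into a default or non-default label is an assumption made for this example, and the classification accuracy is returned as a fraction rather than a percentage.

```python
import numpy as np

def mae(y_hat, y):
    """Mean absolute error between measured values y_hat and observed values y."""
    return float(np.mean(np.abs(y_hat - y)))

def rmse(y_hat, y):
    """Root mean square error."""
    return float(np.sqrt(np.mean((y_hat - y) ** 2)))

def classification_accuracy(y_hat, y, threshold=0.5):
    """Fraction of samples whose thresholded output matches the label (threshold assumed)."""
    return float(np.mean((y_hat >= threshold) == (y >= threshold)))

def squared_correlation(y_hat, y):
    """Squared Pearson correlation between measured and observed values."""
    dh = y_hat - y_hat.mean()
    dy = y - y.mean()
    return float(np.sum(dh * dy) ** 2 / (np.sum(dh ** 2) * np.sum(dy ** 2)))
```

As a small worked case, observed labels `[0, 1, 1, 0]` with outputs `[0.1, 0.9, 0.4, 0.2]` give an MAE of 0.25 and a classification accuracy of 0.75, since only the third sample falls on the wrong side of the threshold.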
4. Experiment and Analysis
All the code in this experiment is written on the MATLAB R2013a software platform. The PC used has an Intel(R) Core(TM) i7-7200U CPU at 2.71 GHz, 8.00 GB of memory, and a 64-bit Windows 10 operating system.
4.1. Performance of IGSO Algorithm Test Experiment
In order to verify the effectiveness of the IGSO algorithm based on good point set theory and the adaptive step length strategy, the 10 standard benchmark functions listed in Table 2 are selected for testing. Among them, f1 to f5 are two-dimensional functions and f6 to f10 are multidimensional functions with dimension d = 20. To show the advantages of the IGSO algorithm in solving multimodal functions, it is compared with the traditional GSO algorithm as follows.
The parameters of the two GSO algorithms are set as follows: the maximum number of iterations is 100 and the population size is 100. The initial fluorescein, initial step length, fluorescein volatility coefficient, fluorescein renewal rate, and field change rate of each glowworm are also set; like the other parameters of GSO and IGSO, their values come from experimental experience. The dynamic decision domain and perception radius of the glowworm individuals have an important influence on the results of the algorithm. Therefore, the experiments focus on determining the decision radius and perception radius of the glowworm individuals when solving each benchmark test function.
The initial perception domain and the dynamic decision domain differ from one test function to another. The decision radii of the glowworm individuals for f1 to f10 are 2.448, 2.548, 2.688, 2.448, 2.888, 35.048, 28.048, 30.048, 35.048, and 30.048, respectively.
In Table 3, Best, Worst, Mean, and Var record the best solution, the worst solution, the average solution, and the variance of the solutions over 30 independent runs. Figure 7 shows the convergence curves of the IGSO and GSO algorithms for the 10 functions. The performance test results of the IGSO algorithm are analyzed as follows:
(1) In terms of calculation accuracy, Table 3 shows that over the 30 repeated experiments the best, worst, and average values of the 10 functions solved by the IGSO algorithm are all better than those of the traditional GSO algorithm and are close to their respective standard values.
(2) In terms of convergence speed, Figure 7 shows that the IGSO algorithm converges much faster than the GSO algorithm for functions f1, f5, f9, and f10. For functions f6, f7, and f8 the convergence speed is the same as that of the GSO algorithm, because these are multidimensional functions and both algorithms search much more slowly in high-dimensional space than in two-dimensional space. For functions f2, f3, and f4 the IGSO algorithm converges much more slowly than the GSO algorithm, but its accuracy is better.
(3) In terms of stability, the variance values in Table 3 show that the results of the GSO algorithm differ widely across the 30 runs for functions f1, f2, f6, f9, and f10, and that the variance of the IGSO algorithm is smaller than that of the GSO algorithm for functions f1, f2, f6, and f9. This is because the initial population of the IGSO algorithm is generated by the good point set method, so the initial solution is the same every time the algorithm runs. Figure 7 also shows a certain fluctuation in the convergence curves during the iteration process. Overall, the GSO algorithm is less stable than the IGSO algorithm.
Take function f8 for example, comparing with the original GSO algorithm, the best solution, the worst solution, and the average solution of the 30 times independent experiments for IGSO algorithm decreased by 7.456504e−15, 1.612441e−08, and 2.003430e−12, respectively. The corresponding solution of original GSO algorithm decreased by 1.132974e−14, 1.612441e−08, and 6.449133e−09 respectively. The performance of the IGSO algorithm is the best, which indicates the abilities of IGSO algorithm to approximate to the most optimal solutions. Besides, the variance of the solutions for GSO and IGSO algorithm decreased by 2.696393e−16, 1.531690e−17 respectively, which means the IGSO algorithm can generate more accurate solutions stably. It is because the IGSO algorithm is based on the uniform distributed initial population produced by good point sets method and adaptive step-size strategy introduced by inertia weights, which made glowworm individuals in the algorithm keep an adaptive step-size to search for the optimal solution. Therefore, the IGSO algorithm has dramatic global search ability.
In short, compared with the original GSO algorithm, the IGSO algorithm shows a great improvement in convergence speed and calculation accuracy and some advantage in stability.
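The Best, Worst, Mean, and Var statistics discussed above can be computed from repeated runs as in the following Python sketch. The optimizer here is a toy random search standing in for GSO or IGSO, the sphere function is a hypothetical benchmark, and the use of the sample variance (the default of MATLAB's var) is an assumption; none of these are taken from Table 3.

```python
import random
import statistics

def sphere(x):
    """Hypothetical benchmark: sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)

def random_search(objective, dim, n_evals, rng):
    """Toy stand-in optimizer: best objective value among random points."""
    return min(
        objective([rng.uniform(-5.0, 5.0) for _ in range(dim)])
        for _ in range(n_evals)
    )

def summarize_runs(values):
    """Summarize repeated minimization runs (smaller is better)."""
    return {
        "Best": min(values),
        "Worst": max(values),
        "Mean": statistics.fmean(values),
        "Var": statistics.variance(values),  # sample variance (assumed)
    }

rng = random.Random(42)
results = [random_search(sphere, 2, 200, rng) for _ in range(30)]
summary = summarize_runs(results)
print(summary)
```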
4.2. Empirical Analysis on Credit Risk of Small and Micro Enterprises
4.2.1. Data Collection
The credit data of small and micro enterprises in China are used to test the IGSO-ELM model. The experimental data span 2017 to 2018 and include the following types of information: (1) personal information of the legal representatives of small and micro enterprises; (2) economic and financial ratio data of small and micro enterprises; and (3) current credit data of small and micro enterprises. The personal information of enterprise owners and the financial and economic data come from the CSMAR Taian Financial Research Database, and the enterprise credit data are obtained from the Sesame Credit business service platform. A total of 549 samples of small and micro enterprises were obtained and processed. According to the definition used by financial institutions, a loan is considered to be in default if it is overdue for more than 15 days. Of the 549 enterprises, 308 had no loan default, accounting for 56.11%, and the remaining 241 had loan defaults of different degrees, accounting for 43.89%. To compare the classification models (BPNN, ELM, ABC-ELM, PSO-ELM, GSO-ELM, and IGSO-ELM) effectively, the data set is randomly divided into two disjoint subsets: 75% of the samples form the training subset and 25% form the testing subset. Ten cross tests are used for each model. The advantage of cross testing is that the credit model can use the available training data (75% of the samples) to the maximum extent.
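The sample arithmetic and the random 75%/25% partition can be sketched in Python as follows. The function name, the seed, and the rounding of the training-set size are assumptions; the text does not state how the fractional sample count is rounded.

```python
import random

def split_indices(n_samples, train_fraction=0.75, seed=0):
    """Randomly split sample indices into disjoint training and testing sets."""
    rng = random.Random(seed)
    idx = list(range(n_samples))
    rng.shuffle(idx)
    n_train = round(n_samples * train_fraction)  # rounding rule is assumed
    return idx[:n_train], idx[n_train:]

n_total = 549          # samples reported in the text
n_non_default = 308    # enterprises without loan default
n_default = n_total - n_non_default

train_idx, test_idx = split_indices(n_total)
print(n_default, len(train_idx), len(test_idx))
print(round(100 * n_non_default / n_total, 2), round(100 * n_default / n_total, 2))
```

Repeating the split with different seeds gives one simple way to obtain several independent train/test partitions for the same model.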
4.2.2. Experimental Design
According to the research results in Section 4.1, 10 indexes are selected as the input layer of the ELM neural network, the hidden layer has 10 nodes, and the output layer has one node, so the ELM network structure is 10-10-1. To verify the efficiency of the IGSO-ELM model, the BPNN, ELM, ABC-ELM, GSO-ELM, and PSO-ELM models are also constructed for credit risk measurement analysis. The parameters of the ELM-related models are set as follows: the reconstructed data are normalized to [0, 1], the sigmoid function is employed as the activation function, and the type of ELM is set to classification mode.
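The ELM configuration described above (inputs normalized to [0, 1], sigmoid activation, one output node used for classification) can be illustrated with a minimal NumPy sketch of a basic ELM. The function names, the uniform range of the random weights, the 0.5 decision threshold, and the synthetic data are assumptions for illustration only; the study itself uses MATLAB and real credit data.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def min_max_normalize(X):
    """Scale each column to [0, 1]; constant columns are mapped to 0."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    return (X - lo) / span

def train_elm(X, y, n_hidden=10, seed=0):
    """Basic ELM: random input weights and thresholds, analytic output weights."""
    rng = np.random.default_rng(seed)
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))  # input weights
    b = rng.uniform(-1.0, 1.0, size=n_hidden)                # hidden thresholds
    H = sigmoid(X @ W + b)                                   # hidden-layer output
    beta = np.linalg.pinv(H) @ y                             # least-squares output weights
    return W, b, beta

def predict_elm(X, W, b, beta):
    """Classify by thresholding the single output node at 0.5 (assumed)."""
    return (sigmoid(X @ W + b) @ beta >= 0.5).astype(int)

# Synthetic stand-in data with 10 input indexes (not the credit data).
rng = np.random.default_rng(1)
X = min_max_normalize(rng.normal(size=(200, 10)))
y = (X[:, 0] > np.median(X[:, 0])).astype(float)

W, b, beta = train_elm(X, y, n_hidden=10)
accuracy = float(np.mean(predict_elm(X, W, b, beta) == y))
print(f"training accuracy: {accuracy:.3f}")
```

In the swarm-optimized variants, the random draw of W and b is the part that the optimization algorithm replaces.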
The parameters of the models related to the IGSO and GSO algorithms include the population size, the maximum number of iterations, the initial fluorescein, the initial step length, the fluorescein volatility coefficient, the fluorescein renewal rate, and the neighborhood (field) change rate of each glowworm. The initial perception domain and the dynamic decision domain are set to the same value, 58.248.
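How these parameters enter one iteration of the basic GSO algorithm can be sketched in Python as follows. The sketch follows the standard GSO update rules (fluorescein update, probabilistic move toward a brighter neighbor, decision-domain update) with a fixed step length; the adaptive step of IGSO is not reproduced. The default values of rho, gamma, beta, n_t, step, and r_s are values commonly quoted in the GSO literature and are assumptions here, not the settings of this study.

```python
import math
import random

def gso_step(positions, luciferin, radii, objective, rng,
             rho=0.4, gamma=0.6, beta=0.08, n_t=5, step=0.03, r_s=3.0):
    """One iteration of basic GSO (maximizes `objective`)."""
    n = len(positions)
    # 1. Fluorescein update: decay the old value, add the new fitness.
    luciferin = [(1.0 - rho) * l + gamma * objective(x)
                 for l, x in zip(luciferin, positions)]
    new_positions, new_radii = [], []
    for i in range(n):
        # 2. Neighbors: brighter glowworms inside the decision domain.
        neighbors = [j for j in range(n)
                     if j != i and luciferin[j] > luciferin[i]
                     and 0.0 < math.dist(positions[i], positions[j]) < radii[i]]
        x = list(positions[i])
        if neighbors:
            # 3. Pick a neighbor with probability proportional to the
            #    fluorescein difference and move one step toward it.
            weights = [luciferin[j] - luciferin[i] for j in neighbors]
            j = rng.choices(neighbors, weights=weights, k=1)[0]
            d = math.dist(positions[i], positions[j])
            x = [xi + step * (xj - xi) / d
                 for xi, xj in zip(positions[i], positions[j])]
        new_positions.append(x)
        # 4. Decision-domain update, bounded by the perception domain r_s.
        new_radii.append(min(r_s, max(0.0, radii[i] + beta * (n_t - len(neighbors)))))
    return new_positions, luciferin, new_radii

# Hypothetical demo: maximize -(x^2 + y^2), i.e. minimize the sphere function.
rng = random.Random(0)
pos = [[rng.uniform(-3, 3), rng.uniform(-3, 3)] for _ in range(30)]
luc, rad = [5.0] * 30, [3.0] * 30
for _ in range(100):
    pos, luc, rad = gso_step(pos, luc, rad, lambda x: -(x[0] ** 2 + x[1] ** 2), rng)
print(max(luc))
```

Since GSO is formulated as a maximization method, a minimization target such as an error measure is passed in with its sign reversed.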
For the BPNN model, the initial weights and thresholds are obtained by the Levenberg-Marquardt algorithm. The transfer functions of the hidden layer and the output layer are the sigmoid function and the linear function, respectively. The number of training iterations of the BPNN model is 10^3, the MSE target is 10^−2, and the learning rate is 10^−1. The number of hidden layer nodes in the BPNN and ELM models is determined by stepwise trial calculation: according to the size of the training sample, the number of hidden layer nodes is increased step by step, and the number at which the classification accuracy reaches its maximum is selected. The calculation results show that the number of hidden layer nodes of the ELM model is 20.
For the ABC-ELM algorithm, the parameters related to the ABC algorithm are set as follows: the population size n is 100, the numbers of employed foragers and onlookers are both n/2, the food source limit (Limit) is set to 50, and the maximum number of iterations is set to 50.
For the PSO-ELM algorithm, the parameters related to the PSO algorithm are set as follows: the population size n is 100, the acceleration factors C1 and C2 are both set to 2, the maximum velocity is Vmax = 0.5, and the minimum velocity is Vmin = −0.5.
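The way the acceleration factors and velocity limits above act on a particle can be illustrated with the standard PSO update in Python. The inertia weight w is not given in the text and is a hypothetical parameter here; the function name and the example vectors are likewise illustrative.

```python
import random

def pso_update(position, velocity, personal_best, global_best, rng,
               w=1.0, c1=2.0, c2=2.0, v_max=0.5, v_min=-0.5):
    """Standard PSO velocity and position update with velocity clamping.

    w is an assumed inertia weight; c1, c2, v_max, v_min follow the text.
    """
    new_position, new_velocity = [], []
    for x, v, p, g in zip(position, velocity, personal_best, global_best):
        v_new = w * v + c1 * rng.random() * (p - x) + c2 * rng.random() * (g - x)
        v_new = max(v_min, min(v_max, v_new))  # clamp to [Vmin, Vmax]
        new_velocity.append(v_new)
        new_position.append(x + v_new)
    return new_position, new_velocity

rng = random.Random(0)
x, v = [0.9, -0.4, 0.1], [0.0, 0.0, 0.0]
x_new, v_new = pso_update(x, v, personal_best=[0.5, 0.0, 0.1],
                          global_best=[0.0, 0.0, 0.0], rng=rng)
print(x_new, v_new)
```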
IGSO-ELM starts from 5 hidden nodes and gradually increases the number of hidden nodes to optimize the sample classification accuracy. As shown in Figure 8, the number of hidden nodes of IGSO-ELM is determined to be 20. Thus, the structures of the BPNN, ELM, ABC-ELM, GSO-ELM, and IGSO-ELM models are all 10-20-1.
To compare the convergence of the IGSO-ELM, GSO-ELM, PSO-ELM, and ABC-ELM models, Figure 9 shows the relationship between the fitness value, i.e., the MSE (mean square error), and the number of iterations for the four models. Figure 9 shows that the IGSO-ELM model needs fewer iterations and finds a stable solution close to the best goal. The main reason is that generating the initial population from the good point set and dynamically adjusting the moving step length improve the global search ability of the glowworm population.
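The MSE fitness that a swarm algorithm evaluates for each candidate solution can be sketched with NumPy as follows. The sketch assumes that a candidate vector encodes the ELM input weights and hidden thresholds, that the output weights are then solved analytically, and that the MSE is measured on the training data; the encoding, the function names, and the synthetic data are assumptions, not details confirmed by the text.

```python
import numpy as np

def candidate_length(n_inputs, n_hidden):
    """Number of values in one candidate: input weights plus hidden thresholds."""
    return n_inputs * n_hidden + n_hidden

def elm_fitness(candidate, X, y, n_hidden):
    """Training MSE of the ELM whose hidden layer is defined by `candidate`."""
    n_inputs = X.shape[1]
    W = candidate[: n_inputs * n_hidden].reshape(n_inputs, n_hidden)
    b = candidate[n_inputs * n_hidden:]
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))   # sigmoid hidden layer
    beta = np.linalg.pinv(H) @ y             # analytic output weights
    return float(np.mean((H @ beta - y) ** 2))

# Synthetic stand-in data for a 10-20-1 network (not the credit data).
rng = np.random.default_rng(0)
X = rng.random((60, 10))
y = (rng.random(60) > 0.5).astype(float)

# Evaluating a few random candidates; a swarm algorithm would instead
# move the candidates toward lower fitness over the iterations.
candidates = rng.uniform(-1.0, 1.0, size=(5, candidate_length(10, 20)))
fitness = [elm_fitness(c, X, y, 20) for c in candidates]
print(min(fitness))
```

Under this encoding a 10-20-1 network gives each glowworm, particle, or food source 220 dimensions.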
In Table 4, Best, Worst, Mean, and Var record the best solution, the worst solution, the average solution, and the variance of the solutions of the BPNN, ELM, ABC-ELM, PSO-ELM, GSO-ELM, and IGSO-ELM models over 10 independent experiments. The test set output results of the six models are shown in Figure 10. The performance test results of the six models are analyzed as follows.
First, Table 4 shows that the best, worst, and average solutions of the IGSO-ELM model over 10 independent experiments are 92.7273, 85.4545, and 88.8951, respectively, which are better than the corresponding solutions of the other five models.
Second, compared with the ABC-ELM, PSO-ELM, GSO-ELM, and IGSO-ELM models, the BPNN and ELM models, which are both single hidden layer feedforward neural networks, have lower classification accuracy. The best classification accuracy of the BPNN model is just 50%, and the best classification accuracy of the ELM model is only 81%, which shows that optimizing a single hidden layer feedforward neural network with swarm intelligence optimization algorithms is a sound way to improve classification accuracy.
Finally, Table 4 shows that the variance values of the BPNN and ELM models are far greater than those of the ABC-ELM, PSO-ELM, GSO-ELM, and IGSO-ELM models, which indicates that the BPNN and ELM models are relatively unstable compared with the other four models. The variance of the IGSO-ELM model is the smallest among the ABC-ELM, PSO-ELM, GSO-ELM, and IGSO-ELM models, which indicates that the IGSO-ELM model is more stable than the other three combination models. This is because the IGSO algorithm generates the initial population by the good point set method, which can greatly improve the performance of the whole IGSO-ELM model.
To better illustrate the advantages of the IGSO-ELM model, three commonly used machine learning evaluation indexes are selected: MAE, RMSE, and R2. MAE (mean absolute error) is the average absolute error between the predicted value and the observed value. RMSE (root mean square error) is the sample standard deviation of the difference (called the residual) between the predicted value and the observed value, and it describes the degree of sample dispersion. R2, also called the coefficient of determination or the squared correlation coefficient, measures the fraction of the total variation in the dependent variable that is explained by the independent variable.
Table 5 shows that the squared correlation coefficients of the BPNN, ELM, ABC-ELM, PSO-ELM, GSO-ELM, and IGSO-ELM models are NaN, 0.3888, 0.5952, 0.6668, 0.6261, and 0.7105, respectively. The R2 value of the ELM model is larger than that of the BPNN model, and its MAE and MSE values are smaller than those of the BPNN model, which shows that the ELM model has higher classification accuracy and better generalization performance than the BPNN model based on the gradient descent method.
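The three indexes can be computed as in the following Python sketch, which takes R2 to be the squared Pearson correlation between observed and predicted values, in line with the term "squared correlation coefficient". The function names and the example labels are hypothetical. The sketch also shows one situation in which this index is undefined (NaN), namely when all predictions are identical; whether this is the cause of the NaN reported for the BPNN model is not stated in the text.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def squared_correlation(y_true, y_pred):
    """Squared Pearson correlation; NaN if either series is constant."""
    n = len(y_true)
    mt, mp = sum(y_true) / n, sum(y_pred) / n
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mt) ** 2 for t in y_true)
    var_p = sum((p - mp) ** 2 for p in y_pred)
    if var_t == 0.0 or var_p == 0.0:
        return math.nan
    return cov * cov / (var_t * var_p)

# Hypothetical labels: 1 = default, 0 = no default.
y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]
print(mae(y_true, y_pred), rmse(y_true, y_pred), squared_correlation(y_true, y_pred))

# A model that predicts the same class for every sample.
print(squared_correlation(y_true, [1, 1, 1, 1, 1]))
```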
In Table 5, the classification accuracy (CA) of the ABC-ELM, PSO-ELM, and GSO-ELM models is higher than that of the ELM model, which shows that swarm intelligence algorithms improve the prediction accuracy of ELM. In addition, the classification accuracy of the IGSO-ELM model is higher than that of the ABC-ELM, PSO-ELM, and GSO-ELM models, which supports the direction taken in improving the ELM algorithm and shows the good performance of the IGSO-ELM model in the credit risk measurement of small and micro enterprises.
5. Conclusions
Because small and micro enterprises lack reliable financial statements and operating records, they face financing difficulties, which have become an important factor restricting their development. The credit status of small and micro enterprises plays an important role in their financing, so studying their credit risk measurement is of great significance. Therefore, a credit risk measurement model based on the improved GSO algorithm and the ELM algorithm was proposed in this paper. First, according to the growth and development characteristics of small and micro enterprises in the big data environment, the formation mechanism of their credit risk was analyzed from the perspectives of granularity scaling, cross-border correlation, and global view driven by big data, and a comprehensive evaluation index system was built by summarizing and analyzing the factors that influence credit evaluation indicators. Second, the traditional GSO algorithm was improved with good point set theory and a variable step-size strategy, and 10 standard benchmark functions were selected to test the effectiveness of the IGSO algorithm. The experimental results showed that the IGSO algorithm greatly improved on the GSO algorithm in stability, accuracy, and convergence speed. The integrated algorithm based on the improved GSO and ELM was then established: the number of hidden layer nodes in ELM is determined by the step-by-step trial method, and the weights and thresholds of ELM are optimized by the improved GSO algorithm. Finally, the simulation experiment verified that ELM is a simple and effective method for establishing the credit risk measurement model of small and micro enterprises. Thus, a credit risk measurement model of small and micro enterprises based on the IGSO-ELM integrated algorithm was proposed.
Sample data of small and micro enterprises in China were collected, and the simulation experiment was carried out with the MATLAB software tool. The experimental results showed that the model was effective, feasible, and accurate compared with the BPNN, ELM, ABC-ELM, PSO-ELM, and GSO-ELM models. The research results of this paper can provide a reference for solving the problem of credit risk measurement of small and micro enterprises and also lay a solid foundation for theoretical research on credit risk management.
Data Availability
The [.xlsx] data used to support the findings of this study are currently under embargo while the research findings are commercialized. Requests for data, 6 months after publication of this article, will be considered by the corresponding author.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
This study was supported by the fund of the Philosophy and Social Science Planning Project of Anhui Province (No. AHSKY2018D09).