This paper proposes a method for characterizing the credit risk of regional listed companies in China, with the aim of helping such companies avoid credit crises. Fifty-four listed companies in Hebei Province are selected as the research sample, and an index system for evaluating listed companies’ credit risk is built from four categories of financial indexes: profitability, operating capacity, solvency, and growth capability. The paper first screens fifteen indexes from the four categories with the gray clustering method to identify the effective variables for the prediction model. It then predicts the credit risk probability of the listed companies with a logistic regression model. Finally, using the annual-report financial data of the fifty-four companies from 2012 to 2017 as sample data, an empirical test is carried out in SPSS. The results show that the logistic regression model combined with gray clustering analysis achieves high predictive accuracy in evaluating the credit risk of listed companies. The gray logistic evaluation can therefore serve as a financial early warning tool for regional listed companies in China.

1. Introduction

The credit risk of listed companies is attracting the attention of a growing number of investors and researchers. Researchers usually build credit risk assessment models to help avoid credit risk. Four kinds of evaluation methods are widely used [1]: the credit rating method, the multivariate statistical analysis method, the dynamic measurement analysis method, and artificial intelligence methods such as neural networks. However, each has shortcomings. The credit rating method relies on subjective judgment, which limits its usefulness. The dynamic measurement analysis method can accurately describe the credit risk caused by changes in enterprise credit quality [2]; however, China’s stock market is not yet mature, so market value cannot reflect the real value of an enterprise. Artificial intelligence methods may become trapped in a local optimum and fail to reach the optimal solution [3]. The multivariate statistical analysis method is a traditional statistical method that mainly includes multivariate discriminant analysis and logistic model analysis. It is comparatively mature and widely used in many fields, and the logistic model has been adopted by many scholars. For example, Tang Haiou analyzed the financial indexes of 30 listed companies and verified the effectiveness of the Z-score model. Logistic model analysis builds the logistic regression model directly from the original data. This method is easy to operate, but it does not always achieve high prediction accuracy [4].

Therefore, in this paper, the authors establish a logistic regression model of listed companies’ credit risk and make use of the gray clustering method to estimate the parameters of the model. The logistic regression model is easy to operate, and the gray clustering method can improve its prediction accuracy. The paper studies the credit risk of Chinese regional listed companies together with their financial condition. Financial indexes are used as explanatory variables to quantify the risk, and the credit risk evaluation focuses on debt repayment ability. The dependent variable takes two values, “default” and “normal,” corresponding to companies with high credit risk and companies with low credit risk [5]. Because the dependent variable is binary, the paper uses the logistic regression model, which is widely used for binary classification problems, to predict the credit risk of the listed companies. Treating a high-risk company as normal is more dangerous than treating a low-risk company as high risk. The paper therefore uses 0.3 as the cut point when building the logistic regression model, and the accuracy rate of the model reaches 81.5%. Logistic regression analysis is the main modeling method. To select the model indexes, the gray correlation clustering method is used to simplify the index system of credit risk evaluation and to avoid multicollinearity among the explanatory variables [6]. This paper thus combines the gray relational clustering method with the actual conditions of the credit risk model to select variables, constructs the logistic regression model, and uses it to analyze and assess the credit risk of the listed companies in Hebei Province. This approach not only effectively reduces multicollinearity among the explanatory variables but also preserves their explanatory power as far as possible [7].

2. Methods

2.1. Building Models
2.1.1. Standardization of Assessment Index

Assume that there are m observation objects and that X is an n-dimensional random vector of the financial indexes of a company, where n is the number of selected financial indexes. Because the selected financial indexes differ in value range and measurement unit, they must be standardized. Index attributes commonly fall into three types: benefit indexes, cost indexes, and fixed indexes. For a benefit index, the larger the value, the better; for a cost index, the opposite holds; for a fixed index, the closer the value is to a certain target, the better [8]. Generally, these three types of indexes can be handled as follows:

The benefit index

The cost index

The fixed index (the reference point is the optimal value of the fixed index)

The standardized matrix Y can be obtained by standardizing the original index values according to the above method.
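The three cases can be illustrated with a short sketch. Because the formulas themselves are not reproduced here, the sketch assumes a common min-max form: $(x - \min)/(\max - \min)$ for a benefit index, $(\max - x)/(\max - \min)$ for a cost index, and $1 - |x - x^{*}| / \max|x - x^{*}|$ for a fixed index with optimal value $x^{*}$. The function name `standardize`, the index names, and the sample figures are hypothetical, and the paper may use a different variant.

```python
def standardize(column, kind, optimum=None):
    """Scale one index column to [0, 1] so that larger is always better.

    Assumed min-max forms (an illustration, not quoted from the paper):
      benefit: (x - min) / (max - min)
      cost:    (max - x) / (max - min)
      fixed:   1 - |x - optimum| / max|x - optimum|
    The column is assumed not to be constant (otherwise the range is zero).
    """
    lo, hi = min(column), max(column)
    if kind == "benefit":
        return [(x - lo) / (hi - lo) for x in column]
    if kind == "cost":
        return [(hi - x) / (hi - lo) for x in column]
    if kind == "fixed":
        spread = max(abs(x - optimum) for x in column)
        return [1 - abs(x - optimum) / spread for x in column]
    raise ValueError(f"unknown index kind: {kind}")


# Hypothetical values of three indexes observed for four companies.
roa = [2.0, 4.0, 6.0, 10.0]            # benefit index: larger is better
debt_ratio = [30.0, 50.0, 70.0, 90.0]  # treated here as a cost index
current_ratio = [1.0, 2.0, 3.0, 4.0]   # fixed index, assumed optimum 2.0

print(standardize(roa, "benefit"))
print(standardize(debt_ratio, "cost"))
print(standardize(current_ratio, "fixed", optimum=2.0))
```

Applying such a function column by column to the original index values yields a matrix in which every entry lies in [0, 1] and a larger entry always means a better condition.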

2.1.2. Gray Relational Cluster Analysis

A company’s financial indexes are highly correlated and exhibit multicollinearity. If the independent variables are left untreated, the predictive power of the logistic regression model will be greatly reduced [9]. Therefore, this paper applies the gray correlation clustering method to the financial indexes to simplify the index set and to reduce correlation and collinearity among the indexes [10].

Gray relational clustering can calculate the correlation degree of each sample according to the gray theory and combine the similar factors by forming the correlation matrix of characteristic variables [11]. This paper uses the method of gray correlation clustering to cluster the selected financial indexes. Each cluster is regarded as a collection of a series of similar index variables. Financial indexes are reclassified by clustering and the most representative factor in each cluster is selected to represent this series. The gray clustering method not only ensures the adequacy of information but also reduces the multicollinearity problems between variables [12].

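For the standardized matrix Y, the absolute correlation degree $\varepsilon_{ij}$ of $Y_i$ and $Y_j$ is calculated for all $i \le j$ ($i, j = 1, 2, \ldots, n$); thus, the upper triangular matrix A is obtained:

$$A = \begin{pmatrix} \varepsilon_{11} & \varepsilon_{12} & \cdots & \varepsilon_{1n} \\ & \varepsilon_{22} & \cdots & \varepsilon_{2n} \\ & & \ddots & \vdots \\ & & & \varepsilon_{nn} \end{pmatrix}$$

Here, $\varepsilon_{ii} = 1$ ($i = 1, 2, \ldots, n$), and A is called the characteristic variable correlation matrix.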

The critical value R is taken from [0, 1], and R > 0.5 is generally required. When the absolute correlation degree of Yi and Yj (i ≠ j) is at least R, Yi and Yj are merged into one class. The critical value affects the classification: the closer R is to 1, the finer the clustering and the fewer variables each cluster contains; the closer R is to 0, the coarser the clustering and the more variables each cluster contains. R can be adjusted according to the actual needs and the correlation between the factors. The final index variables are then determined and used to construct the logistic regression model. This method places no requirement on sample size, is easy to operate, and can effectively overcome the problem of multicollinearity between variables [13].
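The clustering step can be sketched as follows. The sketch assumes the usual definition of the absolute degree of gray incidence from gray system theory: each sequence is shifted to start at zero, $s = \sum_{k=2}^{N-1} x^{0}(k) + \tfrac{1}{2} x^{0}(N)$ is computed, and the degree is $(1 + |s_i| + |s_j|) / (1 + |s_i| + |s_j| + |s_i - s_j|)$. It also assumes that merging is transitive (if A joins B and B joins C, all three share a class). The function names and the toy data are hypothetical.

```python
def absolute_degree(x, y):
    """Absolute degree of gray incidence of two equal-length sequences."""
    def area(seq):
        # Shift the sequence to start at zero, then sum the interior
        # points plus half of the last point.
        z = [v - seq[0] for v in seq]
        return sum(z[1:-1]) + 0.5 * z[-1]

    sx, sy = area(x), area(y)
    return (1 + abs(sx) + abs(sy)) / (1 + abs(sx) + abs(sy) + abs(sx - sy))


def correlation_matrix(columns):
    """Upper triangular matrix A, filled only for i <= j."""
    n = len(columns)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            A[i][j] = absolute_degree(columns[i], columns[j])
    return A


def cluster(A, R):
    """Merge indexes i and j whenever A[i][j] >= R (i != j)."""
    n = len(A)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if A[i][j] >= R:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Toy data: three index sequences observed over four objects.
columns = [[0, 1, 2, 3], [0, 1, 2, 3], [0, 10, 20, 30]]
A = correlation_matrix(columns)
print(cluster(A, R=0.85))
```

In this toy case the first two sequences are identical, so their degree is 1 and they fall into one class, while the third stays alone. One representative per class would then be kept for modeling.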

2.1.3. Model Building

If we study the credit risk of a company, the value of the credit risk of the company can be regarded as a binary variable which is a random variable with (0-1) distribution. Logistic regression is the most commonly used method to study these variables [14].

Suppose that the original data sequence of the financial index variables of a listed company is given, each element being an actual observation value [15]. The model assumes the following nonlinear relationship between the probability of credit risk $P$ and the financial index variables:

$$P = \frac{1}{1 + e^{-Z}} \quad (5)$$

Applying the logit transformation converts (5) into a linear function:

$$\ln \frac{P}{1 - P} = Z \quad (6)$$

The financial index variables enter as follows:

$$Z = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n \quad (7)$$

Equation (6) is the logit form of the logistic function. Equation (7) is the linear combination of financial index variables that affect the probability of credit risk occurrence. $x_i$ is the explanatory variable. Parameter $\beta_0$ is a constant term. Parameter $\beta_i$ is a logistic regression coefficient. $n$ is the number of financial index variables. Usually, an iterative method is used to estimate the constant term and the regression coefficients in the logistic function [16]. According to gray theory, we build a differential equation [17]:

The letters a and u are called undetermined coefficients [18], and the value of a lies in the interval (−2, 2). Together they form the gray parameter matrix $\hat{a} = (a, u)^{T}$. By calculating the parameters a and u [19], we can obtain the forecasting value. The accuracy of the gray model is then checked with a residual error test. The residual error function is as follows:

The variance of residual error is described as follows:

The mean value of residual error is described as follows:
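As a hedged illustration of the gray model and its residual test, the sketch below assumes that the differential equation is the standard GM(1,1) whitening equation $dx^{(1)}/dt + a x^{(1)} = u$, where $x^{(1)}$ is the accumulated sequence of the original data $x^{(0)}$. It further assumes the residual $e(k) = x^{(0)}(k) - \hat{x}^{(0)}(k)$, its arithmetic mean, and a variance with divisor $N$. These forms, the function names, and the sample series are assumptions rather than details confirmed by the text.

```python
import math


def fit_gm11(x0):
    """Estimate a and u of an assumed GM(1,1) model for the sequence x0."""
    # Accumulated generating sequence x1(k) = x0(1) + ... + x0(k).
    x1, total = [], 0.0
    for v in x0:
        total += v
        x1.append(total)
    # Background values z(k) = 0.5 * (x1(k) + x1(k-1)) for k = 2..N.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x0))]
    y = x0[1:]
    # Least squares for y = -a * z + u (simple linear regression).
    z_mean = sum(z) / len(z)
    y_mean = sum(y) / len(y)
    slope = (sum((zi - z_mean) * (yi - y_mean) for zi, yi in zip(z, y))
             / sum((zi - z_mean) ** 2 for zi in z))
    return -slope, y_mean - slope * z_mean


def forecast_gm11(x0, a, u):
    """Fitted values of the original sequence from the time response."""
    x1_hat = [(x0[0] - u / a) * math.exp(-a * k) + u / a
              for k in range(len(x0))]
    return [x0[0]] + [x1_hat[k] - x1_hat[k - 1] for k in range(1, len(x0))]


def residual_stats(x0, x0_hat):
    """Residuals e(k), their mean, and their variance (divisor N)."""
    e = [obs - fit for obs, fit in zip(x0, x0_hat)]
    mean = sum(e) / len(e)
    var = sum((ek - mean) ** 2 for ek in e) / len(e)
    return e, mean, var


# Hypothetical series that grows by 10% per period.
x0 = [2.0 * 1.1 ** k for k in range(6)]
a, u = fit_gm11(x0)
e, e_mean, e_var = residual_stats(x0, forecast_gm11(x0, a, u))
print(a, u, e_mean, e_var)
```

For this growing series the estimated a is negative and well inside (−2, 2), and the residuals are small relative to the observations, which is the kind of outcome the residual test is meant to confirm.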

2.2. Empirical Analysis
2.2.1. Sample Selection and Index Selection

From 2012 to 2017, there were 56 listed companies in Hebei Province, one of which is an ST company. To ensure the comparability of the samples, Fangzhan, the ST company, is removed. Because the financial indexes of financial enterprises have special characteristics, Baoshuo shares are also removed. The sample of this paper therefore consists of 54 companies listed on the A-share market in Hebei from 2012 to 2017. These 54 companies belong to 18 different industries under the industry classification of the China Securities Regulatory Commission and are therefore broadly representative of industry [20]. The 18 industries are numbered as follows.

A1: coal industry, A2: food manufacturing industry, A3: liquor and beverage manufacturing industry, A4: metallurgical industry, A5: cement manufacturing industry, A6: chemical industry, A7: furniture manufacturing industry, A8: textile industry, A9: leather, fur, feather, and their products, A10: pharmaceutical industry, A11: mechanical industry, A12: electronic industry, A13: thermal power generation industry, A14: water transportation industry, A15: information technology service industry, A16: wholesale and retail trade industry, A17: real estate industry, and A18: livestock industry. Taking these 18 industry classifications as observation objects, each observation object has 15 characteristic data, which come from financial indexes that affect the credit risk of listed companies [21].

This paper selects 15 financial indexes as candidate variables from four financial index categories: profitability, operating capacity, solvency, and growth capability. The financial index data are taken from the annual reports of listed companies in the various industries [22]. The selected financial indexes are shown in Table 1.

2.2.2. Variable Processing

Based on the above sample data and the relevant financial index data, the average value of each index within each industry is used as the basis of the gray clustering analysis [23] (see Table 2 for detailed data). The gray clustering analysis is carried out with the Gray Modeling software (version 3.0).

The relevance matrix of characteristic variables is obtained as shown in Table 3.

In this paper, the critical value of 0.85 is taken as the basis of classification. The critical value not only ensures the sufficiency of information but also simplifies the indexes [24]. These 15 financial indexes are grouped into eight categories. X1, X3, and X15 are grouped into one category; X2, X6, and X10 are grouped into one category; X5 and X14 are grouped into one category; X7 and X11 are grouped into one category; X8 and X12 are grouped into another category; and each of the other three indexes X4, X9, and X13 is classified into one category separately. Finally, this paper selects the eight financial indexes X1, X2, X4, X5, X7, X9, X12, and X13 for modeling.

2.2.3. Modeling

Logistic model samples include unhealthy companies and healthy companies [25]. Most scholars regard listed companies that have been specially treated (ST or *ST) as unhealthy companies and the others as healthy companies. However, this classification method is not suitable for regional listed companies, for which little sample data is available. Therefore, we cannot group the companies by whether they are specially treated [26]; instead, all aspects of the financial condition of the listed companies should be considered comprehensively.

The occurrence of credit risk in listed companies is related to the financial condition of the enterprises [27]. Therefore, this paper holds that a listed company has credit risk if half or more of its financial indexes show signs of deterioration when compared with the industry average. If the company has credit risk, that is, it is “unhealthy,” the value is 1; if there is no credit risk, that is, it is “healthy,” the value is 0 [28].
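The labeling rule can be sketched in a few lines. The sketch assumes that “deterioration” means being worse than the industry average in the direction that matters for each index (lower for a benefit-type index, higher for a cost-type index). The function name, index names, and figures are hypothetical.

```python
def label_company(company, industry_avg, higher_is_better):
    """Return 1 (unhealthy) if at least half of the indexes are worse
    than the industry average, else 0 (healthy)."""
    worse = 0
    for name, value in company.items():
        if higher_is_better[name]:
            worse += value < industry_avg[name]
        else:
            worse += value > industry_avg[name]
    return 1 if 2 * worse >= len(company) else 0


# Hypothetical industry averages and index directions.
industry_avg = {"roa": 5.0, "turnover": 1.0, "debt_ratio": 50.0, "revenue_growth": 8.0}
higher_is_better = {"roa": True, "turnover": True, "debt_ratio": False, "revenue_growth": True}

# Two of the four indexes are worse than average, so the label is 1.
firm = {"roa": 3.0, "turnover": 1.2, "debt_ratio": 65.0, "revenue_growth": 9.0}
print(label_company(firm, industry_avg, higher_is_better))
```

The resulting 0/1 label is the dependent variable of the logistic regression.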

In the logistic model, the choice of cutting point has a great influence on the result of the model [29]. In practical application, the misjudgment rate of unhealthy companies increases with the value of the cutting point [30]: the larger the cutting point, the more unhealthy companies are misjudged as healthy. Generally speaking, misjudging a healthy company as unhealthy is not very dangerous, whereas treating a company with higher credit risk as healthy is more dangerous. Taking these considerations into account, this paper selects 0.3 as the cutting point to construct the logistic model [31].
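The trade-off behind the cutting point can be illustrated with made-up numbers. The sketch assumes that a company is predicted unhealthy when its estimated probability is at least the cutting point; the probabilities and labels below are hypothetical and are not the sample of this paper.

```python
def misjudgment_rates(probabilities, labels, cut):
    """Return (share of unhealthy firms predicted healthy,
    share of healthy firms predicted unhealthy) at the given cut."""
    pairs = list(zip(probabilities, labels))
    missed = sum(1 for p, y in pairs if y == 1 and p < cut)
    false_alarm = sum(1 for p, y in pairs if y == 0 and p >= cut)
    n_unhealthy = sum(labels)
    n_healthy = len(labels) - n_unhealthy
    return missed / n_unhealthy, false_alarm / n_healthy


# Hypothetical predicted probabilities; label 1 = unhealthy, 0 = healthy.
probs = [0.10, 0.20, 0.35, 0.45, 0.40, 0.60, 0.80, 0.25]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

for cut in (0.3, 0.5):
    print(cut, misjudgment_rates(probs, labels, cut))
```

On these numbers, raising the cut from 0.3 to 0.5 doubles the share of unhealthy companies that are missed while removing the false alarms, which is why a lower cut is preferred when missing an unhealthy company is the costlier error.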

Using SPSS software with maximum likelihood estimation, logistic regression analysis was carried out on the above 8 variables by the forward stepwise regression method [32]. The SPSS output is shown in Table 4. In the end, two variables enter the model: return on total asset ratio X2 and growth rate of operating revenue X13. The significance values of X2 and X13 are both less than 0.05, which shows that the two variables have good statistical characteristics and are significant.

To assess how well the model fits the data, the Hosmer-Lemeshow (H-L) goodness-of-fit test is applied to the model. The test results are shown in Table 5.

At a significance level of 0.05 with df = 8, the critical value of chi-square is 15.507. The H-L chi-square value is 6.061 < 15.507, and the significance value is 0.64 > 0.05. It can therefore be concluded that the model passes the H-L test and fits the data well [33]. Therefore, the logistic function can be obtained.
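The two figures quoted for the H-L test can be checked by hand. For an even number of degrees of freedom, the upper-tail probability of the chi-square distribution has the closed form $e^{-x/2} \sum_{k=0}^{df/2-1} (x/2)^k / k!$, which the following sketch evaluates. The function name is hypothetical.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable for even df,
    using exp(-x/2) * sum_{k < df/2} (x/2)**k / k!."""
    if df % 2 != 0:
        raise ValueError("this closed form needs an even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** k / math.factorial(k)
                                 for k in range(df // 2))


# Significance of the reported H-L statistic (df = 8).
print(round(chi2_sf_even_df(6.061, 8), 2))
# Tail area at the quoted critical value, which should be close to 0.05.
print(round(chi2_sf_even_df(15.507, 8), 3))
```

The first value reproduces the reported significance of 0.64, and the second confirms that 15.507 is the 0.05 critical value for df = 8.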

The default warning model is

Given the company’s X2 and X13 data, we can calculate the probability of default of the company. The results of the accuracy test of the model are as shown in Table 6.
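How such a model is applied can be sketched as follows, using the standard logistic form $P = 1/(1 + e^{-Z})$ with $Z = b_0 + b_2 X_2 + b_{13} X_{13}$ and the cutting point 0.3. The coefficient values below are placeholders chosen only for illustration; they are not the estimates reported in Table 4, and the function names are hypothetical.

```python
import math


def default_probability(x2, x13, b0, b2, b13):
    """P = 1 / (1 + exp(-Z)) with Z = b0 + b2 * X2 + b13 * X13."""
    z = b0 + b2 * x2 + b13 * x13
    return 1.0 / (1.0 + math.exp(-z))


def classify(p, cut=0.3):
    """Return 1 (unhealthy, default warning) if p >= cut, else 0."""
    return 1 if p >= cut else 0


# Placeholder coefficients for illustration only, NOT the fitted values.
B0, B2, B13 = 0.5, -0.2, -0.05

# Hypothetical company: return on total assets 3.0, revenue growth 10.0.
p = default_probability(x2=3.0, x13=10.0, b0=B0, b2=B2, b13=B13)
print(p, classify(p))
```

With the actual estimated coefficients substituted for the placeholders, the same two steps give the default probability and the warning label of any company whose X2 and X13 are known.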

2.2.4. Empirical Test

To further test the effectiveness of the logistic default early warning model based on gray relational cluster analysis [34], this paper randomly selects 8 healthy companies and 8 unhealthy companies for an empirical test (the data come from the 2017 annual reports of Hebei listed companies in the Ruisi Database) [35]. The default probability is calculated by substituting the financial data of the 16 enterprises into equation (7). The test results are shown in Table 7.

3. Results and Discussion

The data analysis shows that two indicators, the return on total asset ratio and the growth rate of operating revenue, play an essential role in the credit risk evaluation of the listed companies. Both indicators are positively correlated with the credit status of the listed companies [36]. Because the number of regional listed companies in Hebei Province is small, all the sample data are used to estimate the parameters of the model. The predicted values of all sample data are then obtained, and the prediction effect is checked by the accuracy rate and the misjudgment ratio [37]. According to Table 6, the overall classification accuracy rate of the logistic model is 81.5%, the classification accuracy rate of healthy companies is 85%, and the classification accuracy rate of unhealthy companies is 71.4%. Therefore, the model has good prediction ability [38].

From Table 8, we can see that the prediction accuracy of the credit risk assessment model of listed companies in Hebei Province (which was established by logistic regression method) reaches 81.25%, and the probability of misjudgment is 18.75%. The prediction result is relatively satisfactory [39].
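The reported rates can be cross-checked with simple arithmetic. The split of the 54 sample companies into 40 healthy and 14 unhealthy used below is inferred from the percentages (it is the only split of 54 that yields exactly 85% and about 71.4%) and is not quoted from Table 6. The function name is hypothetical.

```python
def rates(correct_healthy, n_healthy, correct_unhealthy, n_unhealthy):
    """Accuracy for healthy firms, for unhealthy firms, and overall."""
    total = n_healthy + n_unhealthy
    return (correct_healthy / n_healthy,
            correct_unhealthy / n_unhealthy,
            (correct_healthy + correct_unhealthy) / total)


# In-sample check: inferred split of 40 healthy and 14 unhealthy companies.
healthy, unhealthy, overall = rates(34, 40, 10, 14)
print(f"{healthy:.1%} {unhealthy:.1%} {overall:.1%}")

# Empirical test: 16 companies, 13 classified correctly and 3 misjudged.
print(13 / 16, 3 / 16)
```

Under this inferred split, 34 of 40 healthy and 10 of 14 unhealthy companies are classified correctly, which gives 44 of 54 overall and matches the three in-sample rates. In the 16-company test, 13 correct classifications correspond to the reported 81.25% and 3 misjudgments to 18.75%.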

4. Conclusions

This paper uses gray cluster analysis to cluster the financial indexes of the selected listed companies, which reduces the number of indexes and the correlation between them while preserving the financial index information. Logistic regression analysis is then applied to the reduced indexes to construct the credit risk assessment model, which also reduces the workload of collecting invalid data. The empirical results show that the model has good predictive ability and accuracy in assessing the credit risk of listed companies in China, so it can serve as an early warning tool for their credit risk.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.


Acknowledgments

This research was supported by the Science & Technology Research Project of Colleges and Universities in Hebei Province, Grant no. ZD2020403; Scientific Research Project of Tangshan Normal University, Grant no. 2020A13; and Natural Science Foundation of Hebei Province, Grant no. A2019105110.