Abstract

In a fiercely competitive market, companies constantly face the threat of falling into a global financial crisis (GFC). A global financial crisis refers to a crisis in global financial assets, financial institutions, or financial markets. However, the threat of a GFC is not unavoidable; it can be predicted in advance. Building a GFC prediction model is therefore of great significance to a company’s development. This article studies a GFC prediction model for listed companies based on statistical and AI methods. The sample is divided into 40 training samples and 16 test samples: 8 companies are randomly selected as test samples from the financially healthy companies and 8 from the GFC companies, and the remaining 40 serve as training samples. For the primary selection of characteristic indicators, this paper adopts the frequency statistics method: the indicators that appear most frequently in previous research are selected, and the indicator selection is made on this basis. This article uses the Kolmogorov–Smirnov (K-S) goodness-of-fit test. Each early warning indicator selected in this article should be able to distinguish between GFC and non-GFC companies, so the indicators are screened one by one. The indicators of each year are substituted into the factor functions obtained by factor analysis to produce a new group of variables, and SPSS 16.0 is then used for binomial logistic regression analysis for each year. This article uses the KMO and Bartlett tests. The null hypothesis of Bartlett’s test of sphericity is that the correlation coefficient matrix is an identity matrix, and its statistic is obtained from the determinant of the correlation coefficient matrix.
The prediction accuracy of the nonlinear combination discriminant method improved in each of the three years preceding the GFC, and in year (t − 3), which is relatively far from the crisis, the accuracy rate reached 83%. The results show that combining statistics and AI significantly improves the prediction accuracy of the GFC prediction model for listed companies.

1. Introduction

With the active development of the capital market, companies can raise low-cost funds from the capital market to accelerate their development, and investors can use the capital market to invest and obtain higher returns. However, modern enterprises face an increasingly adverse market environment, and risk is a constant concern for managers. The modern business environment is characterized mainly by economic globalization, the rapid development of information technology, customer orientation, and changes in business models and management methods. These factors are affected by politics, the economy, society, and technology. The survival of a modern enterprise is a process in which various risks are continuously generated and resolved. When a GFC occurs in an enterprise, it affects stakeholders to varying degrees and can seriously harm the interests of investors.

For investors, an effective GFC early warning model helps them correctly analyze, judge, and predict the financial status of an enterprise, establish sound investment concepts, and make correct investment decisions. Because of information asymmetry, most of the information that investors obtain is delayed. By the time investors learn of abnormal corporate finances, losses are likely to have occurred. Therefore, being able to correctly judge a company’s financial status and predict financial risks helps minimize investment risk, which is important to investors.

With the rapid development of financial integration, the financial security of one country seriously affects the pace of economic development of other countries. Konstantakopoulos et al. believe that the global financial crisis has had a significant impact on people’s mental health, leading to an increasing incidence of mental disorders and suicide. Regarding the diagnostic classification rate of new CMHC cases each year, no significant difference was observed. Although their research has some reference value, it lacks some necessary data [1]. Motsi et al. studied the changes in bank competition behavior in sub-Saharan Africa after the 2007-2008 global financial crisis. They adopted the Panzar-Rosse competition model and found that the degree of competition among banks in sub-Saharan Africa has increased. Nevertheless, when the GFC broke out in 2007/2008, the development of the banking system slowed somewhat. Subsequently, as regulators sought to restore system stability, prudential policy underwent major adjustments, which in turn changed the competitive behavior of banks. Policymakers should continue to formulate and promote policies aimed at developing financial intermediaries and improving the competitive behavior of banks in sub-Saharan Africa. However, their research contains no specific experimental procedure [2]. Debunov believes that for companies under market conditions, not only total profit but also financial capability is important. The ability of a company to resist the threat of bankruptcy is a necessary condition for its long-term operation and sustainable development. He proposed using artificial neural networks to establish an economic-mathematical model of corporate financial sustainability, removing human factors and improving the speed and accuracy of diagnosing bankruptcy threats. The model is illustrated with Ukrainian companies under the conditions following the economic crisis of 2014-2015. To establish the financial sustainability model, he constructed a three-layer feedforward artificial neural network. As input factors, he suggested using 17 financial indicators to make the most comprehensive assessment of a company’s financial sustainability. Although his model has some reference value for financial institutions, investment funds, audit firms, and enterprises themselves in predicting corporate bankruptcy in time, it lacks specific experimental results [3].

In constructing the indicator system, this article selects both financial and nonfinancial indicators in order to reflect the company’s actual financial status and overall picture fully and accurately. The nonfinancial indicators are chosen mainly from factors that affect the company’s financial goals, such as corporate governance, ownership structure, and auditing. This article analyzes the advantages and disadvantages of each model from three aspects: classification accuracy, the costs of the two types of misjudgment, and operability. The aim is to find a model that is better suited to enterprise GFC early warning research, is highly operable, and better helps the relevant stakeholders of the enterprise avoid risks.

2. GFC Prediction Model for Listed Companies

2.1. Statistics and AI

Statistical data is a general term for the numerical data and other data related to the national economy and social phenomena obtained in the course of statistical work, and is the result of measuring those phenomena. The Pearson $\chi^2$ statistic is the sum of the relative squared deviations between the observed frequencies and the frequencies predicted by the model. Its calculation formula is [4]:

$$\chi^2 = \sum_{k=1}^{K} \frac{(O_k - E_k)^2}{E_k}$$

Among them, $K$ is the number of covariate types, $O_k$ represents the observed frequency of the $k$-th covariate type, $E_k$ represents the predicted frequency of the $k$-th covariate type, and the degrees of freedom equal the difference between the number of covariate types and the number of parameters [5].
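The statistic described above, a sum of squared deviations between observed and predicted frequencies with each term scaled by the predicted frequency, can be sketched in a few lines. The function name and the frequencies below are hypothetical and serve only as an illustration; the choice of 2 estimated parameters is likewise an assumption.

```python
def pearson_chi_square(observed, expected):
    """Sum of squared deviations, each divided by the predicted frequency."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical frequencies for K = 3 covariate types (not taken from the paper).
observed = [12, 7, 21]
expected = [10.0, 10.0, 20.0]
chi2 = pearson_chi_square(observed, expected)

# Degrees of freedom: number of covariate types minus number of parameters.
# The value 2 is an assumed parameter count for illustration.
dof = len(observed) - 2
```

A larger value relative to the degrees of freedom indicates a poorer fit between the model and the data.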

Let $\hat{L}_s$ represent the maximum likelihood value estimated for the fitted model, which measures the degree to which the selected model fits the sample data, and let $\hat{L}_f$ represent the maximum likelihood value of the saturated model. The ratio $\hat{L}_s / \hat{L}_f$ is called the likelihood ratio [6]. The natural logarithm of the likelihood ratio multiplied by −2 is usually used as the D statistic:

$$D = -2 \ln\left(\frac{\hat{L}_s}{\hat{L}_f}\right)$$

For the Hosmer-Lemeshow (HL) test, the data are arranged in ascending order of their predicted probabilities and divided into groups [7]. The statistic is as follows:

$$HL = \sum_{j=1}^{J} \frac{(y_j - n_j \hat{p}_j)^2}{n_j \hat{p}_j (1 - \hat{p}_j)}$$

Among them, $J$ is the number of groups, with $J \le 10$; $n_j$ is the number of cases in the $j$-th group; $y_j$ is the observed number of events in the $j$-th group; and $\hat{p}_j$ is the predicted event probability of the $j$-th group [8].
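As a hedged illustration of the grouping procedure just described, the sketch below sorts cases by predicted probability, splits them into at most 10 groups of near-equal size, and accumulates the usual Hosmer-Lemeshow term for each group. The equal-size grouping rule and the function name are assumptions made for this example; statistical packages may form the groups differently.

```python
def hosmer_lemeshow(y_true, p_pred, groups=10):
    """HL statistic for binary outcomes (0/1) and predicted probabilities."""
    pairs = sorted(zip(p_pred, y_true))  # ascending predicted probability
    n = len(pairs)
    stat = 0.0
    for j in range(groups):
        chunk = pairs[j * n // groups:(j + 1) * n // groups]
        if not chunk:
            continue
        n_j = len(chunk)                      # cases in group j
        y_j = sum(y for _, y in chunk)        # observed events in group j
        p_j = sum(p for p, _ in chunk) / n_j  # mean predicted probability
        denom = n_j * p_j * (1.0 - p_j)
        if denom > 0:
            stat += (y_j - n_j * p_j) ** 2 / denom
    return stat
```

The result is usually compared with a chi-square distribution with J − 2 degrees of freedom; a small value suggests that predicted and observed event counts agree within each group.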

The empirical risk $R_{emp}(w)$ and the actual risk $R(w)$ satisfy the following relationship with a probability of at least $1 - \eta$ [9]:

$$R(w) \le R_{emp}(w) + \sqrt{\frac{h\left(\ln(2n/h) + 1\right) - \ln(\eta/4)}{n}}$$

where $h$ is the VC dimension of the function set, and $n$ is the number of samples [10].
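To make the role of h and n concrete, the following sketch evaluates the confidence term of the bound. It assumes the commonly cited Vapnik form, sqrt((h(ln(2n/h) + 1) − ln(η/4)) / n); that expression and the function names are assumptions for illustration only.

```python
import math


def vc_confidence(h, n, eta):
    """Confidence term of the risk bound (assumed Vapnik form)."""
    return math.sqrt((h * (math.log(2 * n / h) + 1) - math.log(eta / 4)) / n)


def risk_upper_bound(r_emp, h, n, eta=0.05):
    """Upper bound on the actual risk, holding with probability >= 1 - eta."""
    return r_emp + vc_confidence(h, n, eta)
```

The term grows with the VC dimension h and shrinks as the number of samples n increases, which is why a simpler function set is preferred when the sample is small, as it is in this study.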

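The classic radial basis function uses the following decision rule:

$$f(x) = \operatorname{sgn}\left(\sum_{i=1}^{N} a_i K_\gamma\left(\lvert x - x_i \rvert\right) - b\right)$$

Among them, $K_\gamma(\lvert x - x_i \rvert)$ depends on the distance $\lvert x - x_i \rvert$ between the two vectors [11].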
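A minimal sketch of such a decision rule is given below, assuming a Gaussian kernel exp(−γ‖x − x_i‖²) as the distance-dependent function. The kernel choice, the centers, and the weights are hypothetical; in practice they are obtained by training.

```python
import math


def gaussian_kernel(x, center, gamma):
    """Kernel value that depends only on the distance between x and center."""
    dist_sq = sum((a - b) ** 2 for a, b in zip(x, center))
    return math.exp(-gamma * dist_sq)


def rbf_decision(x, centers, weights, bias, gamma=1.0):
    """Sign of the weighted kernel sum minus the bias: returns +1 or -1."""
    score = sum(w * gaussian_kernel(x, c, gamma)
                for w, c in zip(weights, centers)) - bias
    return 1 if score >= 0 else -1


# Hypothetical centers and weights: one center per class.
centers = [(0.0, 0.0), (3.0, 3.0)]
weights = [1.0, -1.0]
label_near_first = rbf_decision((0.1, 0.0), centers, weights, bias=0.0)
label_near_second = rbf_decision((3.0, 2.9), centers, weights, bias=0.0)
```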

In general, a neuron is a multi-input, single-output nonlinear device. A neuron usually has multiple dendrites, which mainly receive incoming information, but only one axon, whose tail carries many axon terminals that transmit information to other neurons. Axon terminals connect with the dendrites of other neurons to transmit signals, and the location of this connection is biologically called a “synapse.” The neuron model is expressed as follows [12]:

$$u_i = \sum_{j} w_{ij} x_j - \theta_i$$

Among them, $u_i$ is the internal state of the neuron, $\theta_i$ is the threshold, $x_j$ is the $j$-th input signal, and $w_{ij}$ represents the weight of the connection to the neuron [13].

The output of the neuron is represented by the function $f$, and the S-shaped (sigmoid) function is most commonly used to realize the nonlinear characteristics of the network [14]:

$$f(u_i) = \frac{1}{1 + e^{-c u_i}}$$

Among them, $c$ is a constant.
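The neuron described above can be sketched as a weighted sum of inputs minus a threshold, passed through an S-shaped function with slope constant c. The function names and the numbers are illustrative assumptions.

```python
import math


def sigmoid(u, c=1.0):
    """S-shaped activation; c controls the steepness of the curve."""
    return 1.0 / (1.0 + math.exp(-c * u))


def neuron_output(inputs, weights, threshold, c=1.0):
    """Internal state u = sum(w_j * x_j) - threshold, then the S function."""
    u = sum(w * x for w, x in zip(weights, inputs)) - threshold
    return sigmoid(u, c)


# Hypothetical neuron with two inputs whose weighted sum equals the threshold.
out = neuron_output([1.0, 1.0], [0.5, 0.5], threshold=1.0)
```

When the weighted input exactly equals the threshold, the internal state is zero and the output sits at the midpoint of the S curve.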

The network structure of the Elman neural network is shown in Figure 1. The hidden layer neurons can use either nonlinear or linear transfer functions [15].

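The nonlinear state space expression of the Elman neural network is [16]:

$$\begin{aligned}
y(k) &= g\left(w^{3} x(k)\right) \\
x(k) &= f\left(w^{1} x_c(k) + w^{2} u(k-1)\right) \\
x_c(k) &= x(k-1)
\end{aligned}$$

Among them, $y$, $x$, $u$, and $x_c$ represent the $m$-dimensional output neuron vector, the $n$-dimensional hidden layer neuron vector, the $r$-dimensional input vector, and the $n$-dimensional feedback state vector, respectively; $w^{1}$, $w^{2}$, and $w^{3}$ are the connection weights from the context layer to the hidden layer, from the input layer to the hidden layer, and from the hidden layer to the output layer; and $f$ and $g$ are the transfer functions of the hidden layer and the output layer [17].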
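A single time step of an Elman network can be sketched as follows. The sketch assumes the commonly used state equations x(k) = f(W1·xc(k) + W2·u(k−1)), xc(k) = x(k−1), y(k) = g(W3·x(k)), with tanh as the hidden transfer function and a linear output layer. These choices, the weight names, and the numbers are assumptions for illustration.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def elman_step(u_prev, x_context, w1, w2, w3):
    """One time step; returns (output y, new hidden state x).

    w1: n x n context-to-hidden weights, w2: n x r input-to-hidden weights,
    w3: m x n hidden-to-output weights.
    """
    pre = [a + b for a, b in zip(matvec(w1, x_context), matvec(w2, u_prev))]
    x = [math.tanh(v) for v in pre]  # assumed nonlinear hidden transfer function
    y = matvec(w3, x)                # assumed linear output layer
    return y, x


# Hypothetical network with r = 1 input, n = 2 hidden units, m = 1 output.
w1 = [[0.0, 0.0], [0.0, 0.0]]
w2 = [[1.0], [0.0]]
w3 = [[1.0, 1.0]]
context = [0.0, 0.0]
y, context = elman_step([0.5], context, w1, w2, w3)  # new hidden state becomes the next context
```

Feeding the returned hidden state back in as the context at the next step is what gives the network its memory of earlier inputs.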

2.2. GFC Prediction Model

An enterprise develops gradually from the first occurrence of financial risk to a full GFC, so the GFC of an enterprise can be predicted. Early warning of a GFC is therefore a very important part of the financial risk management of the entire enterprise. An effective GFC early warning system can not only determine the status of a corporate GFC but also analyze its causes, which prompts the company to take corresponding countermeasures and avoid similar situations [18]. In addition, an effective GFC early warning system can issue warning signals in advance, so that the company can discover financial risks in time, take targeted measures to control their spread, and return to a normal financial status [19].

In a binary classification model, the dependent variable $y$ takes one of two values to represent the two possible outcomes, and the independent variables are $x_1, x_2, \ldots, x_n$. When an error term $\varepsilon$ exists, the regression model can be written as follows [20]:

$$y = \alpha + \sum_{i=1}^{n} \beta_i x_i + \varepsilon$$

If the probability that the dependent variable $y$ takes the value 1 is $p$, then the probability that it takes the value 0 is $1 - p$. So that the predicted value lies in the interval [0, 1], the model is written as the following distribution function:

$$p = P(y = 1 \mid x) = F\left(\alpha + \sum_{i=1}^{n} \beta_i x_i\right)$$

Different prediction models correspond to different functional forms of $F$. The logistic regression model is the prediction model obtained when $F$ is the logistic function, namely $F(z) = 1/(1 + e^{-z})$, so the logistic regression model takes the following form:

$$p = \frac{1}{1 + \exp\left(-\left(\alpha + \sum_{i=1}^{n} \beta_i x_i\right)\right)}$$
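As a minimal sketch, the logistic form maps a linear combination of the independent variables to a probability between 0 and 1. The function name and the coefficients below are hypothetical, not estimates from this study.

```python
import math


def logistic_probability(x, alpha, betas):
    """P(y = 1 | x) under a logistic model with intercept alpha and slopes betas."""
    z = alpha + sum(b * xi for b, xi in zip(betas, x))
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)  # numerically stable branch for strongly negative z
    return ez / (1.0 + ez)


# Hypothetical coefficients for two indicators.
p_mid = logistic_probability([0.0, 0.0], alpha=0.0, betas=[1.0, 2.0])
p_high = logistic_probability([1.0, 1.0], alpha=0.0, betas=[1.0, 2.0])
```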

When a GFC occurs, the financial risk early warning system can identify the most likely causes of the financial risk and the crisis, so that management can take effective measures in a timely and accurate manner. The system can record the causes and roots of the crisis, analyze them in detail, produce an analytical report, and formulate detailed measures and plans for handling similar crises in the future. Loopholes in the existing management system and regulations can then be remedied, which improves the functions of the corporate financial early warning system and further reduces hidden financial risks.

As leaders in the development of the market economy, listed companies have relatively complete financial systems. Faced with complicated and rapidly changing domestic and foreign economic conditions, a financial early warning model suited to listed companies enables management to detect abnormalities in financial operations early and prompts it to adjust business strategies in advance, reducing the probability that the financial situation deteriorates. However, financial early warning systems are not yet widely used by domestic listed companies, for several reasons. For example, the quality of accounting information affects the effectiveness of the financial early warning system, the decision-makers of listed companies lack awareness of actively using such a system, and research on its practical application is not yet sufficient.

3. Simulation Experiment of the GFC Prediction Model of Listed Companies

3.1. Sources of Financial Data

The samples selected in this article cover domestic listed companies in various industries. This article reflects the financial status of these listed companies through five categories of indicators: solvency, profitability, operating ability, development ability, and cash flow. Through an analysis of the significance of these indicators, 19 financial indicators were selected. When the data for the 19 financial indicators were collected and sorted, missing values were filled with the mean of the nearest non-missing values. One-sided data were not selected. The final sample is divided into 40 training samples and 16 test samples: 8 companies are randomly selected as test samples from the financially healthy companies and 8 from the GFC companies, and the remaining 40 serve as training samples. The sample grouping is shown in Table 1.
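The random split described above can be sketched as follows. The sketch assumes that the 56 companies consist of 28 financially healthy and 28 GFC companies, which the text does not state explicitly; the company identifiers and the seed are hypothetical.

```python
import random


def stratified_split(healthy, crisis, test_per_class=8, seed=0):
    """Randomly draw the same number of test companies from each class."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    test = rng.sample(healthy, test_per_class) + rng.sample(crisis, test_per_class)
    held_out = set(test)
    train = [c for c in healthy + crisis if c not in held_out]
    return train, test


# Hypothetical company identifiers (assumed 28 per class).
healthy = [f"H{i:02d}" for i in range(28)]
crisis = [f"C{i:02d}" for i in range(28)]
train, test = stratified_split(healthy, crisis)
```

Drawing the same number of test companies from each class keeps the test set balanced, so the two types of misjudgment are evaluated on equal numbers of cases.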

3.2. Selection of Financial Indicators

For the primary selection of characteristic indicators, this paper adopts the frequency statistics method: the indicators that appear most frequently in previous research are selected, which makes the selection more general, and the indicator selection is carried out on this basis. This paper initially selects 23 financial indicators that reflect five aspects of the company: profitability, solvency, operating capacity, growth capacity, and cash flow, as shown in Table 2.

3.3. Normality Test of Early Warning Indicators

In statistics, there are two main approaches to normality testing in multivariate analysis. The first is to test each variable separately; if every variable is normal, the set of variables is taken to follow a normal distribution. The second is to consider multiple variables at the same time and conduct a multivariate normality test. This article uses the Kolmogorov–Smirnov (K-S) goodness-of-fit test. Each early warning indicator selected in this article should be able to distinguish between GFC and non-GFC companies, so the indicators are screened one by one. The first approach is therefore the appropriate one here, that is, verifying whether each indicator individually conforms to the normal distribution.
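A per-indicator K-S check can be sketched with the standard library alone. The sketch compares the empirical distribution of one indicator with a normal distribution fitted by the sample mean and standard deviation, and uses the large-sample 5% critical value 1.36/√n. That critical value strictly applies when the normal parameters are known in advance, so with estimated parameters the check is conservative; the function names are hypothetical.

```python
import math
import statistics


def normal_cdf(x, mean, std):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def ks_statistic_normal(values):
    """Largest gap between the empirical CDF and the fitted normal CDF."""
    data = sorted(values)
    n = len(data)
    mean = statistics.mean(data)
    std = statistics.stdev(data)
    d = 0.0
    for i, x in enumerate(data):
        cdf = normal_cdf(x, mean, std)
        # Check the gap just after and just before the empirical step at x.
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d


def looks_normal(values, coeff=1.36):
    """True when the K-S statistic is below the approximate 5% critical value."""
    return ks_statistic_normal(values) < coeff / math.sqrt(len(values))
```

Each indicator would be passed through `looks_normal` separately, which matches the one-variable-at-a-time approach chosen in this section.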

3.4. Construction of GFC Prediction Model

Before the neural network is created and trained, it is first initialized so that no remnants of previous values or operations affect the creation of the model. The initialization parameters of the neural network in this study are set entirely to the default values of the MATLAB software package. The larger the sample available to the neural network, the better it can be trained and the better the final training effect. Therefore, this article randomly divides all the samples into two parts, using the financial data of the ST crisis samples for the two, three, and four years before the crisis together with the data of the matched companies for the corresponding years. A total of 415 sets of data are used as the input of the neural network. The deviation between the output value and the expected value is calculated and then propagated backward from the output layer to the input layer, adjusting each weight to reduce the deviation.

3.5. Test of Logistic Regression Model

The indicators of each year are substituted into the factor functions obtained by factor analysis to produce a new group of variables. SPSS 16.0 is then used for binomial logistic regression analysis for each year. The curve of the logistic regression model is S-shaped, with a predicted maximum close to 1 and a minimum close to 0. Usually, 50% is selected as the split point. In other words, if the value of the dependent variable calculated by the model is greater than 0.5, the company is classified as a GFC enterprise; otherwise, it is regarded as a sound enterprise.
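The 0.5 split point and the resulting misjudgment rates can be sketched as follows. Here a type I misjudgment means that a GFC company is classified as healthy and a type II misjudgment means that a healthy company is classified as GFC; the probabilities are hypothetical.

```python
def classify(probabilities, cutoff=0.5):
    """Label 1 (GFC) when the predicted probability exceeds the cutoff, else 0."""
    return [1 if p > cutoff else 0 for p in probabilities]


def misjudgment_rates(y_true, y_pred):
    """Return (type I rate, type II rate, overall accuracy)."""
    crisis = [p for t, p in zip(y_true, y_pred) if t == 1]
    healthy = [p for t, p in zip(y_true, y_pred) if t == 0]
    type_1 = crisis.count(0) / len(crisis)    # GFC judged healthy
    type_2 = healthy.count(1) / len(healthy)  # healthy judged GFC
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return type_1, type_2, accuracy


# Hypothetical predicted probabilities for two GFC and two healthy companies.
y_true = [1, 1, 0, 0]
y_pred = classify([0.9, 0.4, 0.2, 0.7])
rates = misjudgment_rates(y_true, y_pred)
```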

3.6. Factor Analysis

This article uses the KMO and Bartlett tests. The KMO test statistic compares the simple correlation coefficients and the partial correlation coefficients between variables. The null hypothesis of Bartlett’s test of sphericity is that the correlation coefficient matrix is an identity matrix, and its statistic is obtained from the determinant of the correlation coefficient matrix. If the statistic is large and the corresponding probability value is less than the chosen significance level, the null hypothesis is rejected and the correlation coefficient matrix cannot be regarded as an identity matrix. In other words, the original variables are correlated, and the data are suitable for factor analysis. Otherwise, the data are not suitable for factor analysis.
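Bartlett’s statistic can be sketched from the determinant of the correlation matrix using the standard expression χ² = −(n − 1 − (2p + 5)/6)·ln|R| with p(p − 1)/2 degrees of freedom, where n is the number of samples and p the number of variables. The helper names are hypothetical, and the determinant routine is a plain Gaussian elimination written only for this illustration.

```python
import math


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(map(float, row)) for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def bartlett_sphericity(corr, n_samples):
    """Return (chi-square statistic, degrees of freedom)."""
    p = len(corr)
    chi2 = -(n_samples - 1 - (2 * p + 5) / 6.0) * math.log(determinant(corr))
    dof = p * (p - 1) // 2
    return chi2, dof
```

For an identity correlation matrix the determinant is 1 and the statistic is 0, so the null hypothesis is retained; strongly correlated variables drive the determinant toward 0 and the statistic upward.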

4. Results and Discussion

The short-term solvency indicator variables selected in this article are the current ratio (X1), quick ratio (X2), working capital ratio (X3), cash ratio (X4), and working capital to total assets ratio (X5). The results of the correlation analysis between the variables X1, X2, X3, X4, and X5 are shown in Figure 2. The correlation coefficient between X1 and X2 reaches 0.9802, which indicates that they are highly correlated. One of them should be removed, so the correlation between each of these two variables and the other variables is analyzed further. Following the principle of keeping the variables that are less correlated with the others, X1 is eliminated. Using the same method, the correlations between the remaining indicator variables are compared pair by pair, and redundant variables are eliminated one at a time.
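The pairwise screening step can be sketched as follows: compute the Pearson correlation for every pair of indicators and flag the pairs whose absolute correlation exceeds a threshold. The threshold of 0.9 and the indicator values are hypothetical; the paper does not state the cutoff it used.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def highly_correlated_pairs(columns, threshold=0.9):
    """List (name_a, name_b, r) for pairs with |r| above the threshold."""
    names = list(columns)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson_r(columns[a], columns[b])
            if abs(r) > threshold:
                pairs.append((a, b, r))
    return pairs


# Hypothetical indicator values for four companies.
columns = {
    "X1": [1.0, 2.0, 3.0, 4.0],
    "X2": [1.1, 2.0, 3.2, 3.9],
    "X4": [4.0, 1.0, 3.0, 2.0],
}
flagged = highly_correlated_pairs(columns)
```

For each flagged pair, the variable that is more strongly correlated with the remaining indicators would then be dropped, in line with the elimination principle described above.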

Taking P = 0.5 as the decision point and substituting the sample data of the training group to test the model gives the results shown in Table 3. For GFC companies, the forecast accuracy reached 89.8% in year t − 2 and 81.6% in year t − 3, but the forecasting ability declined sharply in year t − 4, when only 55.1% of GFC companies could be identified. For normal companies, the model maintained a relatively high recognition rate from t − 2 to t − 4: the lowest rate, in year t − 2, still reached 81.6%, and the highest, in year t − 3, was 85.7%.

To interpret the 8 extracted factors, this paper applies the maximum variance (varimax) method of orthogonal rotation to the factor loading matrix, as shown in Figure 3. The dominant variable of principal component 1 is the cost and expense profit margin, whose loading of 0.887 is much higher than that of the other indicators. Principal component 1 therefore reflects capital utilization efficiency, and its representative indicator is the cost and expense profit margin. The dominant variable of principal component 2 is the total asset turnover rate, with a loading of 0.830, which represents the utilization rate of the company’s assets. In principal component 3, the loading of the enterprise asset scale indicator is 0.713, significantly higher than that of the other indicators, so principal component 3 can be interpreted as the asset scale factor, and its representative indicator is the total assets of the enterprise. In principal component 4, the loading of the total asset growth rate is significantly higher than that of the other indicators, so principal component 4 can be interpreted as a growth factor, and its representative indicator is the growth rate of total assets. The dominant variable of principal component 5 is the current ratio, which reflects the company’s ability to pay. Principal component 5 can therefore be summarized as the debt repayment factor, with the current ratio as its representative variable. In principal component 6, the dominant variables are operating cash flow per share and the liquidity ratio of short-term loans, which reflect the company’s ability to obtain cash. Principal component 6 can therefore be summarized as the cash flow factor, with operating cash flow per share as its representative indicator. In principal components 7 and 8, the loadings of the audit opinion type and the profit per share indicators are significantly higher than those of the other indicators. The main variables of principal component 7 are profit per share, return on net assets, and return on total assets.

The accuracy of the prediction models is shown in Table 4. In the three-category Logistic forecasting model, the rate at which GFC companies are misjudged as non-GFC companies is 27.61%, the rate at which non-GFC companies are misjudged as GFC companies is 7.14%, and the overall accuracy rate is 87.19%. The two-category Logistic regression results show that the accuracy rate for GFC companies is 62.70%, the accuracy rate for non-GFC companies is 94.00%, and the total accuracy rate is 85.30%. Overall, the ordered three-category Logistic prediction model is better than the two-category Logistic prediction model: its accuracy rate is 1.89% higher, and the cost of misjudgment is relatively reduced. In predicting non-financial crises, the accuracy of the former is 1.14% lower than that of the latter, but in predicting financial crises, the accuracy of the former is 9.69% higher than that of the latter, which is of great significance in practical applications.
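The three differences quoted above follow directly from the reported rates, as the short check below shows. Class accuracy for the three-category model is taken as 100% minus the corresponding misjudgment rate, which is an assumption about how the figures were derived; the variable names are illustrative.

```python
# Reported rates in percent (from the comparison of the two Logistic models).
three_cat = {"miss_gfc": 27.61, "miss_non_gfc": 7.14, "overall": 87.19}
two_cat = {"acc_gfc": 62.70, "acc_non_gfc": 94.00, "overall": 85.30}

# Overall accuracy advantage of the three-category model.
overall_gain = round(three_cat["overall"] - two_cat["overall"], 2)

# Non-GFC accuracy: the three-category model is slightly lower.
non_gfc_gap = round(two_cat["acc_non_gfc"] - (100 - three_cat["miss_non_gfc"]), 2)

# GFC accuracy: the three-category model is clearly higher.
gfc_gain = round((100 - three_cat["miss_gfc"]) - two_cat["acc_gfc"], 2)
```

The values come out as 1.89, 1.14, and 9.69 percentage points respectively, matching the differences stated in the text.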

The empirical results of the Fisher two-class discriminant model, the Logistic regression analysis model, and the nonlinear combination model based on the BP neural network are compared in Figure 4. The prediction accuracy of the nonlinear combination discriminant method improved in each of the three years preceding the GFC, and in year (t − 3), which is relatively far from the crisis, the accuracy rate reached more than 83%. This shows that the combined forecasting model can, to a certain extent, take into account the forecasting information of each individual model, thereby improving forecasting accuracy.

The discrimination results of the BP neural network model are shown in Table 5. For the control sample, the BP neural network produces 0 type I misjudgments (a GFC enterprise misjudged as a financially healthy enterprise), a misjudgment rate of 0%, and 0 type II misjudgments (a financially healthy enterprise misjudged as a GFC enterprise), also a misjudgment rate of 0%. The total number of misjudgments is therefore 0, and the classification accuracy of the BP neural network on the control sample is 100%. For the test sample, there are 4 type I misjudgments, a misjudgment rate of 22.2%, and 3 type II misjudgments, a misjudgment rate of 16.7%, so the total number of misjudgments is 7 and the overall misjudgment rate is 19.5%. The classification accuracy of the BP neural network on the test samples therefore reaches 80.5%. In general, the BP neural network model has a good forecasting effect and can give early warning of a GFC three years in advance.

The comparison of the predicted values of the GFC early warning models is shown in Figure 5. The forecasting capabilities of the five GFC early warning models are all high. On the one hand, the discrimination accuracy for ST companies is higher than that for normal companies. This can be understood as follows: the experiment tries to avoid misjudging an ST company as a normal company, which satisfies the restriction on the first type of error in statistical testing and reduces that error. On the other hand, the discriminant analysis model and the regression analysis model are widely used: their prediction accuracy is very high, reaching 90%, they are simple to operate, their requirements on samples are not strict, and they show good predictive ability, which is why most scholars choose to use them. The overall prediction accuracy of the BP neural network is higher than that of the other two, which mainly reflects the adaptability and fault tolerance of the neural network. In particular, the prediction accuracy of the BP combined prediction model is as high as 95%. By integrating the classification information of the sample subjects, the BP combined prediction model improves the accuracy of prediction.

The accuracy of PSO-SVM on the different data sets is shown in Figure 6. The average accuracy of the model reaches 88%, 90.5%, and 86% on the three data sets, all within the interval 0.8∼0.95. The standard deviation shows that the consistency of the model’s prediction results improves as the indicator system is continuously optimized. Among the three sets of sample data, the volatility for the data screened for the first time is relatively small, indicating that the PSO-SVM model established in this article recognizes the first-screened data well and reaches a high degree of accuracy.

The performance comparison of the early warning models is shown in Figure 7. The indicators of both early warning models are above 80%, and both show good early warning performance. Comparing the indicators one by one shows that the random forest model is better on every performance indicator except the positive hit rate. This shows that although the Lasso-logistic early warning model yields slightly more positive feedback after an early warning signal is issued, it still lags behind the random forest early warning model in the coverage accuracy of the ST early warning signal and in overall prediction accuracy.

Table 6 compares the differences in the overall accuracy of each model. Among the single models, the SVM model shows the largest difference in accuracy, reaching 13.3%, and the Fisher discriminant model shows the smallest, 3.3%. The difference in overall discriminant accuracy of the linear combination model is smaller than that of the three single models above, falling further to 2%. It can be seen that the combined early warning model is more robust than a single early warning model, and that combining models helps improve robustness.

5. Conclusions

With increasingly fierce market competition, listed companies are being placed under special treatment (ST) because of financial failures, causing huge losses to investors and directly affecting the healthy development of the securities market. This paper constructs a Logistic regression model and a Fisher linear discriminant analysis model, both of which have good prediction accuracy in the three years before financial distress. Comparing the two, the Logistic regression model is higher than the Fisher linear discriminant analysis model in both the overall accuracy rate and the first-type accuracy rate, and therefore has a better prediction effect. Each GFC early warning model has its own advantages and disadvantages. In actual application, an early warning model suited to each company should be selected according to the company’s own conditions and the characteristics of the model. A company can also consider combining several early warning models to achieve a better early warning effect. For a company, the most important task is to integrate its own GFC early warning model so that it can give early warning of a GFC and avoid it, thereby achieving better and smoother development.

Data Availability

No data were used to support this study.

Conflicts of Interest

The authors declare that they have no conflicts of interest.