#### Abstract

Bankruptcy prediction is an important problem in financial decision support for the stakeholders of firms, including auditors, managers, shareholders, debt-holders, and potential investors, as well as for academic researchers. Popular discourse on financial distress forecasting focuses on developing discrete models to improve prediction. The aim of this paper is to develop a novel hybrid financial distress model that combines various statistical and machine learning methods. A multiple attribute decision making method is then used to choose the best model among those implemented. The proposed approach is applied to Iranian companies to which previous models had been applied, and the results show that prediction can be consolidated with the help of the hybrid approach.

#### 1. Introduction

Predicting the financial distress of listed companies is important to both the companies and their investors. However, because of the uncertainty of the business environment and strong competition, even companies with a sound operating mechanism face the possibility of business failure and bankruptcy. Whether the financial distress of listed companies can be forecast effectively and in time therefore bears on the companies’ development, the interests of numerous investors, and the order of the capital market [1–3].

Most topical studies have adopted a multiple-variable approach to the prediction of financial distress by combining accounting and nonaccounting data in a variety of statistical formulas [4–7]. Whether the predictive value of accounting information was assessed on samples of industrial or of nonindustrial firms, the misclassification rates were low; hence the explanatory variables had significant predictive power. Ratios based on accounting earnings, reported cash flow, and book debt figured prominently in various statistical formulas, especially those applied to the industrial sector, such as univariate analysis, multiple discriminant analysis, and the logit and probit models [8–11]. Although these methods use historical samples to create a diagnostic model, they cannot learn inductively and dynamically from new data, which greatly affects forecasting accuracy. More recently, many studies have demonstrated that artificial intelligence methods such as decision trees, neural networks, and support vector machines can be alternative methods for financial distress prediction [12–14]. Moreover, various studies have compared statistical and machine learning methods in terms of their ability to predict financial data [15–18].

In addition, recent research has exploited multiple attribute decision making (MADM) methods in financial analysis to improve the final outputs [19–22].

This paper focuses on optimizing financial distress forecasting for companies listed on the Tehran Stock Exchange (TSE) of Iran through a hybrid approach that significantly outperforms existing discrete models. Because financial ratios are important in describing a company’s situation, factor analysis is applied to summarize the effect of the financial ratios, and the resulting combinations of all of them are used. The extracted predictors are then used to forecast financial distress in a hybrid approach that classifies businesses with both traditional statistical distress models and machine learning methods. Another important issue in this analysis is homogenizing the businesses through a clustering method to improve the prediction models. In addition, an MADM method is used to identify the best model across different classification performance measures. A comparison of the final results shows that the prediction of financial distress is significantly consolidated.

The paper is organized as follows: Section 2 presents a short review of the literature in the field of financial distress forecasting. Section 3 briefly describes the applied methods. Section 4 explains the proposed approach and presents the empirical evidence from Iran. The paper ends with concluding remarks in Section 5.

#### 2. Review of the Literature

The early prediction of distress is essential for companies and investors or lending institutions that wish to protect their financial investments. As a consequence, modeling, prediction, and classification of companies to determine whether they are potential candidates for financial distress have become key topics of debate and detailed research.

Corporate bankruptcy was first modeled, classified, and predicted by Beaver [23]. He defined financial distress as bankruptcy, insolvency, liquidation for the benefit of a creditor, default on loan obligations, or missed preferred dividend payments. In that study, “cash flow to total debt” had the highest discriminatory power of the ratios examined. Altman’s model is perhaps the best known of the early studies [4]. He developed a $Z$-score bankruptcy prediction model and determined a cut point of the $Z$-score (2.675) to classify healthy and distressed firms. The results showed that the $Z$-score model predicted well one and two years before financial distress but not three to five years before it. A number of authors, such as Taffler [24, 25], Pantalone and Platt [26], Betts and Belhoul [27], and Piesse and Wood [28], followed his work and applied the $Z$-score model to different markets, time periods, and industries. Deakin [29] and Blum [30] also used multiple-variable statistical techniques subsequent to Altman [31].

Furthermore, most recent studies have adopted a multiple-variable approach to the prediction of financial distress by combining accounting and nonaccounting data in a variety of statistical formulas. In the reviewed literature, 64% of all authors used statistical techniques, whose overall predictive accuracy was 84%; 25% used machine learning models, whose overall accuracy was 88%; and 11% used theoretical models, whose accuracy was calculated as 85% [32]. Table 1 briefly presents some recent studies in financial distress forecasting.

In general, studies of the value of financial data for bankruptcy prediction show that accounting data are able to predict financial distress in companies. It should be noted, however, that there is no strong consensus on which financial ratios to use in predicting financial distress, and the results vary with the financial ratios and research methods chosen. In this research, ratios on which there is broad consensus are used [37–39].

#### 3. Methodology

In this section, the methods applied in our paper are briefly described. Factor analysis, the $k$-means method, discriminant analysis, the logit model, decision trees, neural networks, and TOPSIS are presented, respectively.

##### 3.1. Factor Analysis

Factor analysis is a dimension reduction method of multivariate statistics, which explores the latent variables from manifest variables. Two methods for factor analysis are generally in use, principal component analysis and the maximum likelihood method. The main procedure of principal component analysis can be described in the following steps when applying factor analysis [40].

*Step 1.* Find the correlation matrix ($R$) or variance-covariance matrix for the objects to be assessed.

*Step 2.* Find the eigenvalues ($\lambda_k$, $k = 1, 2, \ldots, m$) and eigenvectors ($\beta_k = [\beta_{1k}, \ldots, \beta_{ik}, \ldots, \beta_{pk}]$) for assessing the factor loading ($a_{ik} = \sqrt{\lambda_k}\,\beta_{ik}$) and the number of factors ($m$).

*Step 3.* Consider the eigenvalue ordering ($\lambda_1 > \cdots > \lambda_k > \cdots > \lambda_m$; $\lambda_m > 0$) to decide the number of common factors and pick the number of common factors to be extracted by a predetermined criterion.

*Step 4.* According to Kaiser [41], use the Varimax criterion to find the rotated factor loading matrix, which provides additional insights for the rotation of the factor axes.

*Step 5.* Name each factor by referring to its combination of manifest variables.
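Steps 1 to 3 can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the function name `pca_factor_loadings`, the retention rule (keep factors whose eigenvalue exceeds 1), and the synthetic data are all assumptions, and the Varimax rotation of Step 4 is omitted.

```python
import numpy as np

def pca_factor_loadings(X, min_eigenvalue=1.0):
    """Unrotated principal-component factor loadings from the correlation matrix."""
    # Step 1: correlation matrix of the manifest variables (columns of X).
    R = np.corrcoef(X, rowvar=False)
    # Step 2: eigenvalues and eigenvectors of the symmetric matrix R.
    eigvals, eigvecs = np.linalg.eigh(R)
    # Step 3: order the eigenvalues from largest to smallest.
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Assumed retention criterion: keep factors with eigenvalue > min_eigenvalue.
    m = max(1, int(np.sum(eigvals > min_eigenvalue)))
    # Loading of variable i on factor k: sqrt(lambda_k) * beta_ik.
    loadings = eigvecs[:, :m] * np.sqrt(eigvals[:m])
    # Share of total variance carried by the retained factors.
    explained = eigvals[:m].sum() / eigvals.sum()
    return loadings, eigvals, explained

# Synthetic data: six hypothetical ratios driven by two latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
weights = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]], dtype=float)
X = latent @ weights.T + 0.3 * rng.normal(size=(200, 6))

loadings, eigvals, explained = pca_factor_loadings(X)
print(loadings.shape, round(float(explained), 2))
```

The sum of squared loadings in each column equals the corresponding eigenvalue, which is a quick sanity check on the decomposition.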

##### 3.2. $k$-Means Method

This method clusters objects into deterministic partitions by minimizing the total squared error function given by MacQueen (1967) [40, 42]:

$$E = \sum_{i=1}^{k} \sum_{x \in C_i} \lVert x - m_i \rVert^{2}, \tag{1}$$

where $k$ is the number of clusters in the data, $m_i$ is the center of cluster $C_i$, and $x$ is a data point in cluster $C_i$. Different solutions can be attained depending on the initial guess of cluster centers; therefore, the procedure should be repeated multiple times, and the final solution is selected as the one that gives the maximum separation between clusters.
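The restart procedure can be illustrated with a small NumPy sketch. The function `kmeans`, its parameters, and the toy current-ratio values are hypothetical. The sketch keeps the run with the lowest total squared error; since the total sum of squares of a fixed data set is constant, that run also has the largest between-cluster sum of squares.

```python
import numpy as np

def kmeans(X, k, n_restarts=10, n_iter=100, seed=0):
    """Lloyd's algorithm with random restarts; returns (centers, labels, error)."""
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(n_restarts):
        # Initial guess: k distinct data points chosen at random.
        centers = X[rng.choice(len(X), size=k, replace=False)]
        for _ in range(n_iter):
            # Squared Euclidean distance of every point to every center.
            dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
            labels = dists.argmin(axis=1)
            # Move each center to the mean of its points (keep it if empty).
            new_centers = np.array([
                X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
                for j in range(k)
            ])
            if np.allclose(new_centers, centers):
                break
            centers = new_centers
        # Total squared error of this run.
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        error = dists[np.arange(len(X)), labels].sum()
        if best is None or error < best[2]:
            best = (centers, labels, error)
    return best

# Hypothetical current ratios of nine companies (one feature per company).
current_ratios = np.array([[0.5], [0.6], [0.7], [1.4], [1.5], [1.6], [3.0], [3.2], [3.4]])
centers, labels, error = kmeans(current_ratios, k=3, n_restarts=50)
print(np.sort(centers.ravel()), error)
```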

##### 3.3. Discriminant Analysis

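Let $x$ be a $p$-dimensional normal random vector belonging to class $k$ if $x \sim N_p(\mu_k, \Sigma)$, $k = 1, 2$, where $\mu_1 \neq \mu_2$, and $\Sigma$ is a positive definite symmetric matrix. If $\mu_1$, $\mu_2$, and $\Sigma$ are known, the optimal classification rule is Fisher’s linear discriminant rule,

$$\delta_F(x) = I\left\{\delta^{\top} \Sigma^{-1} (x - \bar{\mu}) \geq 0\right\}, \tag{2}$$

where $\bar{\mu} = (\mu_1 + \mu_2)/2$, $\delta = \mu_1 - \mu_2$, and $I$ denotes the indicator function, with value 1 corresponding to classifying $x$ to class 1 and 0 to class 2. Fisher’s rule is equivalent to the Bayes rule with equal prior probabilities for two classes. The misclassification rate of the optimal rule is

$$R_{\mathrm{OPT}} = \Phi\left(-\frac{\Delta_p}{2}\right), \qquad \Delta_p = \sqrt{\delta^{\top} \Sigma^{-1} \delta}, \tag{3}$$

where $\Phi$ is the standard normal distribution function.

In practice, Fisher’s rule is typically not directly applicable, because the parameters are usually unknown and need to be estimated from the samples. Let $\{x_{1i},\ i = 1, \ldots, n_1\}$ and $\{x_{2i},\ i = 1, \ldots, n_2\}$ be independent and identically distributed random samples from $N_p(\mu_1, \Sigma)$ and $N_p(\mu_2, \Sigma)$, respectively. The maximum likelihood estimators of $\mu_1$, $\mu_2$, and $\Sigma$ are

$$\bar{x}_k = \frac{1}{n_k} \sum_{i=1}^{n_k} x_{ki}, \quad k = 1, 2, \qquad S = \frac{1}{n} \sum_{k=1}^{2} \sum_{i=1}^{n_k} (x_{ki} - \bar{x}_k)(x_{ki} - \bar{x}_k)^{\top}, \tag{4}$$

where $n = n_1 + n_2$, and setting

$$\hat{\delta} = \bar{x}_1 - \bar{x}_2, \qquad \bar{x} = \frac{\bar{x}_1 + \bar{x}_2}{2}, \tag{5}$$

and $\hat{\Sigma}^{-1} = S^{-1}$ (or the generalized inverse $S^{-}$ when $S^{-1}$ does not exist), Fisher’s rule becomes the classic LDA [40, 43]:

$$\delta_{\mathrm{LDA}}(x) = I\left\{\hat{\delta}^{\top} S^{-1} (x - \bar{x}) \geq 0\right\}. \tag{6}$$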
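The plug-in rule can be sketched as follows. The names `fit_lda` and `predict_lda` and the synthetic two-class data are assumptions made for illustration. `numpy.linalg.pinv` stands in for the inverse so that the generalized-inverse case is covered as well.

```python
import numpy as np

def fit_lda(X1, X2):
    """Estimate the LDA direction and midpoint from two class samples."""
    n1, n2 = len(X1), len(X2)
    mean1, mean2 = X1.mean(axis=0), X2.mean(axis=0)
    # Pooled maximum likelihood covariance estimate (divides by n = n1 + n2).
    centered = np.vstack([X1 - mean1, X2 - mean2])
    S = centered.T @ centered / (n1 + n2)
    delta_hat = mean1 - mean2
    midpoint = (mean1 + mean2) / 2
    # pinv equals the ordinary inverse when S is nonsingular.
    direction = np.linalg.pinv(S) @ delta_hat
    return direction, midpoint

def predict_lda(X, direction, midpoint):
    """Return 1 for class 1 and 0 for class 2."""
    return ((X - midpoint) @ direction >= 0).astype(int)

# Synthetic, well-separated samples (hypothetical factor scores).
rng = np.random.default_rng(1)
X1 = rng.normal(loc=[2.0, 2.0], scale=0.5, size=(50, 2))
X2 = rng.normal(loc=[-2.0, -2.0], scale=0.5, size=(50, 2))

direction, midpoint = fit_lda(X1, X2)
print(predict_lda(np.array([[2.0, 2.0], [-2.0, -2.0]]), direction, midpoint))
```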

##### 3.4. Logit Model

Binary responses, for example, success and failure, are the most common form of categorical data, and the most popular model for them is the logit model. For a binary response, $Y$, and a vector of explanatory variables, $X$, let $\pi(x) = P(Y = 1 \mid X = x)$ denote the success probability when $X$ takes value $x$. This probability is the parameter for the binomial distribution [44]. The logistic regression model has a linear form for the logit of this probability,

$$\operatorname{logit}[\pi(x)] = \log\frac{\pi(x)}{1 - \pi(x)} = \alpha + \beta^{\top} x. \tag{7}$$
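As a small numerical illustration of the logit link, the following sketch converts a linear predictor into a success probability and back. The coefficients and the input vector are hypothetical values chosen only for the example, not estimates from this study.

```python
import math

def success_probability(x, alpha, beta):
    """Invert the logit link: pi(x) = 1 / (1 + exp(-(alpha + beta'x)))."""
    linear = alpha + sum(b * xi for b, xi in zip(beta, x))
    return 1.0 / (1.0 + math.exp(-linear))

def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))

alpha, beta = -0.5, [1.2, -0.8]   # hypothetical coefficients
x = [0.3, 0.1]                    # hypothetical explanatory variables
p = success_probability(x, alpha, beta)
print(p, logit(p))
```

Applying `logit` to the returned probability recovers the linear predictor, here -0.5 + 1.2 × 0.3 - 0.8 × 0.1 = -0.22.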

##### 3.5. Decision Trees

A decision tree (DT) is a machine learning technique used in classification, clustering, and prediction tasks. A well-known tree-growing algorithm for generating a DT is Quinlan’s ID3 [45]. It starts from the root node, which tests the best attribute. A branch is generated for each value of that attribute, and each branch leads to a new node. To choose the best attribute, ID3 uses an entropy-based definition of information gain to select the test attribute within each node. The entropy characterizes the purity of a sample set. Suppose $S$ is a set of $s$ data samples. We assume that the class label attribute has $m$ different values, which define $m$ different classes $C_i$ ($i = 1, \ldots, m$), and let $s_i$ be the number of samples in class $C_i$. Equation (8) gives the expected information needed to classify a given sample:

$$I(s_1, \ldots, s_m) = -\sum_{i=1}^{m} p_i \log_2 p_i, \tag{8}$$

where $p_i$ is the probability of any sample belonging to $C_i$, which is estimated using $s_i/s$.

Let attribute $A$ have $v$ different values $\{a_1, \ldots, a_v\}$. Attribute $A$ divides $S$ into $v$ subsets $\{S_1, \ldots, S_v\}$, where $S_j$ contains the samples in $S$ that have value $a_j$ of $A$. If $A$ is selected as the test attribute, these subsets correspond to the branches grown from the node that contains $S$. Let $s_{ij}$ be the number of samples of class $C_i$ in subset $S_j$. The entropy, or expected information, of the division of $S$ by $A$ is given by

$$E(A) = \sum_{j=1}^{v} \frac{s_{1j} + \cdots + s_{mj}}{s}\, I(s_{1j}, \ldots, s_{mj}), \tag{9}$$

where the first term on the right is the weight of the $j$th subset and is equal to the number of samples in the subset divided by the total number of samples in $S$. Equation (10) gives the expected information for a given subset $S_j$:

$$I(s_{1j}, \ldots, s_{mj}) = -\sum_{i=1}^{m} p_{ij} \log_2 p_{ij}, \tag{10}$$

where $p_{ij} = s_{ij}/|S_j|$ is the probability that a sample of $S_j$ belongs to class $C_i$. Equation (11) is the information gained by branching on $A$:

$$\operatorname{Gain}(A) = I(s_1, \ldots, s_m) - E(A). \tag{11}$$

In other words, $\operatorname{Gain}(A)$ is the expected reduction in entropy that results from knowing the value of attribute $A$. A smaller expected entropy $E(A)$ gives a higher information gain and a division into subsets of higher purity. Therefore, the decision tree selects as the test attribute the one with the highest information gain. This creates a node marked with that attribute, where each value of the attribute creates a branch and the samples are divided accordingly.
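The entropy and information-gain computation can be checked on a toy example. The attribute values ("high"/"low" debt) and class labels ("D" for distressed, "H" for healthy) below are invented for illustration. The function names are hypothetical as well.

```python
import math
from collections import Counter

def entropy(labels):
    """Expected information of a label list, in bits."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(values, labels):
    """Entropy of the labels minus the weighted entropy after splitting on values."""
    total = len(labels)
    subsets = {}
    for v, y in zip(values, labels):
        subsets.setdefault(v, []).append(y)
    # Each subset is weighted by its share of the samples.
    expected = sum(len(s) / total * entropy(s) for s in subsets.values())
    return entropy(labels) - expected

# Hypothetical attribute (debt level) and class label (D = distressed, H = healthy).
debt = ["high", "high", "high", "low", "low", "low", "low", "high"]
status = ["D", "D", "D", "H", "H", "H", "D", "H"]
gain = information_gain(debt, status)
print(round(gain, 4))
```

Here the labels are evenly split (entropy of 1 bit), and each branch holds a 3:1 mix, so the gain is 1 - 0.8113 ≈ 0.1887 bits.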

The decision tree contains leaves, which indicate the value of the classification variable, and decision nodes, which specify the test to be carried out. For each outcome of a test, a leaf or a decision node is assigned until all the branches end in the leaves of the tree [45–47].

##### 3.6. Neural Network

A neural network is a technique that imitates the functionality of the human brain using a set of interconnected vertices. It is based on an artificial representation of the human brain, through a directed acyclic graph with nodes (neurons) organized into layers. In a typical feed-forward architecture, there are a layer of input nodes, a layer of output nodes, and a series of intermediate layers. The input signals are multiplied by their corresponding weights to give the value of $\mathrm{net}_j$ as in

$$\mathrm{net}_j = \sum_{i=1}^{n} w_{ij} x_i + \theta_j, \tag{12}$$

where $\mathrm{net}_j$ is the weighted sum of input signals at node $j$; $\theta_j$ is the threshold (bias) value; $w_{ij}$ is the weight associated with the connection between node $j$ and the input node $i$; $x_i$ is the value of input node $i$; and $n$ is the number of input nodes. A sigmoid activation function (13) is applied to the weighted sum:

$$f(\mathrm{net}_j) = \frac{1}{1 + e^{-\mathrm{net}_j}}. \tag{13}$$

The value calculated from (13) is the output signal from node $j$, which can be considered as the input signal to the next layer [48].
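A single layer of such nodes can be sketched in plain Python. The weights, thresholds, and inputs are hypothetical, and the sketch assumes the threshold enters the weighted sum as an additive bias.

```python
import math

def sigmoid(net):
    """Logistic activation applied to a weighted sum."""
    return 1.0 / (1.0 + math.exp(-net))

def layer_output(inputs, weights, thresholds):
    """Outputs of one layer; weights[j][i] links input node i to node j."""
    outputs = []
    for w_j, theta_j in zip(weights, thresholds):
        # Weighted sum of the input signals plus the bias term (assumed additive).
        net_j = sum(w * x for w, x in zip(w_j, inputs)) + theta_j
        outputs.append(sigmoid(net_j))
    return outputs

inputs = [0.5, -1.0]                   # hypothetical input signals
weights = [[1.0, 0.5], [-2.0, 1.0]]    # hypothetical weights for two nodes
thresholds = [0.0, 1.0]                # hypothetical bias values
out = layer_output(inputs, weights, thresholds)
print(out)
```

Feeding `out` into another call of `layer_output` with a second set of weights gives the next layer's signals, which is how a feed-forward pass proceeds.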

##### 3.7. TOPSIS

TOPSIS (technique for order preference by similarity to an ideal solution) is a popular approach to MADM (multiple attribute decision making) that has been widely used in the literature. Presented by Hwang and Yoon, it consists of the following steps [49].

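*Step 1.* The decision matrix is normalized through the application of

$$r_{ij} = \frac{x_{ij}}{\sqrt{\sum_{i=1}^{m} x_{ij}^{2}}}, \quad i = 1, \ldots, m,\; j = 1, \ldots, n,$$

where $x_{ij}$ is the value of alternative $i$ on criterion $j$.

*Step 2.* A weighted normalized decision matrix is obtained by multiplying the normalized matrix with the weights of the criteria,

$$v_{ij} = w_j\, r_{ij}.$$

*Step 3.* The positive indicator score ($A^{+}$) (maximum value) and the negative indicator score ($A^{-}$) (minimum value) are determined by

$$A^{+} = \{v_1^{+}, \ldots, v_n^{+}\},\; v_j^{+} = \max_i v_{ij}; \qquad A^{-} = \{v_1^{-}, \ldots, v_n^{-}\},\; v_j^{-} = \min_i v_{ij}.$$

*Step 4.* The distance of each alternative from $A^{+}$ and $A^{-}$ is calculated using

$$d_i^{+} = \sqrt{\sum_{j=1}^{n} \left(v_{ij} - v_j^{+}\right)^{2}}, \qquad d_i^{-} = \sqrt{\sum_{j=1}^{n} \left(v_{ij} - v_j^{-}\right)^{2}}.$$

*Step 5.* The closeness coefficient for each alternative ($CC_i$) is calculated by applying

$$CC_i = \frac{d_i^{-}}{d_i^{+} + d_i^{-}}.$$

*Step 6.* At the end of the analysis, the ranking of alternatives is made possible by comparing the $CC_i$ values.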
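The six steps can be sketched with NumPy as below. The function `topsis`, the `benefit` flags, and the score matrix are assumptions for illustration and are not taken from the paper's tables. The `benefit` flags follow the common convention that for a cost-type criterion (such as an error rate) the ideal value is the minimum rather than the maximum.

```python
import numpy as np

def topsis(decision_matrix, weights, benefit):
    """Closeness coefficients and ranking (best first) of the alternatives."""
    X = np.asarray(decision_matrix, dtype=float)
    w = np.asarray(weights, dtype=float)
    benefit = np.asarray(benefit, dtype=bool)
    # Step 1: vector normalization of each criterion column.
    R = X / np.sqrt((X ** 2).sum(axis=0))
    # Step 2: weighted normalized matrix.
    V = R * w
    # Step 3: ideal and negative ideal values (reversed for cost criteria).
    ideal = np.where(benefit, V.max(axis=0), V.min(axis=0))
    anti_ideal = np.where(benefit, V.min(axis=0), V.max(axis=0))
    # Step 4: Euclidean distances to both reference points.
    d_plus = np.sqrt(((V - ideal) ** 2).sum(axis=1))
    d_minus = np.sqrt(((V - anti_ideal) ** 2).sum(axis=1))
    # Step 5: closeness coefficient.
    cc = d_minus / (d_plus + d_minus)
    # Step 6: rank alternatives from highest to lowest closeness.
    ranking = np.argsort(-cc)
    return cc, ranking

# Hypothetical scores of three models on accuracy, error rate, and AUC.
scores = [[0.90, 0.10, 0.95],
          [0.85, 0.15, 0.88],
          [0.80, 0.20, 0.82]]
cc, ranking = topsis(scores, weights=[1 / 3, 1 / 3, 1 / 3], benefit=[True, False, True])
print(cc, ranking)
```

In this toy matrix the first model is best on every criterion, so it coincides with the ideal solution and its closeness coefficient is 1.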

#### 4. The Proposed Approach and Empirical Evidence

Corporate bankruptcy forecasting plays a central role in academic finance research, business practice, and government regulation as a form of financial decision support. Consequently, accurate prediction of default probability is extremely important.

The main purpose of this study is not only to improve prediction performance through a hybrid analysis approach but also to employ a multiple attribute decision making (MADM) method to choose the best alternative classification model. Figure 1 briefly presents the flowchart of the optimization approach.

As shown in Figure 1, the first step compiles a list of important and available financial ratios, including liquidity measurement ratios, profitability indicator ratios, debt ratios, operating performance ratios, cash flow indicator ratios, and investment valuation ratios. In the proposed approach, all ratios are considered for the sake of forecasting efficiency. Factor analysis is then used for dimension reduction, because the number of predictors is high; without it, the models may overfit. Moreover, after factor analysis the predictors are more influential and significant than before, because each carries more information on the listed companies.

Another suggestion of this research for obtaining more sophisticated models is to homogenize the companies. The wide variety of businesses can make forecasting models inefficient. As a solution, this step clusters the businesses on an influential ratio and then applies the forecasting methods. A comparison of performance measures shows a remarkable improvement over recent financial distress models.

Subsequently, the predictors extracted by the factor analysis are used to forecast financial distress with traditional statistical distress models and machine learning methods, separately in each cluster.

As the last but not least step, we choose the best model for the data set on the basis of different classification performance measures, because no classification method has so far achieved the best score on all evaluation measures. Different multiple attribute decision making (MADM) methods often produce different outcomes when selecting or ranking a set of decision alternatives involving multiple attributes. TOPSIS, one of the best-known MADM methods, is used here to identify the best model; other MADM methods can also be applied. The final results show that the prediction of financial distress is significantly consolidated.

In the following, empirical evidence for the proposed approach is presented. The population under study consists of the manufacturing companies listed on the Tehran Stock Exchange (TSE) for the year ended March 21, 2011. These companies were chosen because their financial information is available. There are 461 companies listed on the TSE in 37 industry groups, of which 412 are manufacturing companies and 49 are nonmanufacturing ones. Manufacturing companies outnumber the other listed companies and, because of their extensive activities, are granted more loans. A sample of 180 companies was chosen for this research.

On the Tehran Stock Exchange, the criterion for a company’s exit from the capital market is Article 141 of the Commercial Law. Under that article, a company whose retained losses exceed 50% of its capital is regarded as bankrupt. In the sample, 58 companies are bankrupt under this law. The nonbankrupt companies were randomly selected from the remaining list.

In this research, ratios on which there is broad consensus were computed for 180 manufacturing companies quoted on the Tehran Stock Exchange for one year (the year ended March 21, 2011). The data required to calculate the ratios were gathered from the companies’ balance sheets and income statements. The financial ratios used in the prediction are listed in “The Definition of Variables Used.”

Popular discourse on financial distress prediction deals with the selection of important variables because of the enormous number of financial ratios [27, 30]. Hence, in this study, factor analysis was applied to the correlation matrix to reduce the dimension of the variables and summarize their effects in factors. Table 2 shows the four common factors obtained by the principal component method with Varimax rotation. They account for a cumulative proportion of 93 percent of the total sample variance.

It is fairly clear that the first factor represents the debt and cash flow conditions (a strong combination of DR, DER, ROS, ROE, and WCTA). The second factor almost symbolizes the liquidity conditions (a strong combination of TA, CR, and QR), and the third one approximately denotes the operating performance of the listed companies (a strong combination of SFA, STA). The last factor is the investment conditions (a combination of SI, TLDS, CLOE, and TLOE). For example, we have

However, because the listed companies differ in size and type of industry, they are not homogeneous, which makes prediction models inefficient. As a solution, we clustered the listed companies with the $k$-means technique based on a single ratio. The current ratio (CR) is a popular financial ratio used to test a company’s liquidity by measuring the proportion of current assets available to cover current liabilities. It is therefore crucial for improving distress prediction, and the listed companies were clustered based on their capital positions. Table 3 shows the result: the listed companies can be divided into three levels of current ratio (CR), namely small, average, and high.

In the next step, the usual statistical distress prediction methods are applied. First of all, logit models of the form

$$\operatorname{logit}[\pi] = \alpha + \beta_1 F_1 + \beta_2 F_2 + \beta_3 F_3 + \beta_4 F_4$$

are fitted for the clusters, where $F_1$ is the debt and cash flow conditions, $F_2$ is the liquidity conditions, $F_3$ is the operating performance, and $F_4$ is the investment conditions. None of the listed companies assigned to the third cluster is distressed. Also, for logistic regression, the average error count with clustering is about 7.5 percent lower than that without cluster analysis.

The next statistical method for bankruptcy prediction is discriminant analysis. Table 4 presents the linear discriminant functions for financial distress in each cluster. The average error count is significantly reduced in comparison with the general function.

In addition, the machine learning methods, decision trees and neural networks, are applied without a separate analysis for each cluster, because both methods can be used for both classification and clustering purposes. In this part, the important issue is to use all ratios and the extracted factors as predictors separately.

To classify the distressed companies with a decision tree, 100 randomly chosen companies were used to build the tree and the rest to test it (see Figure 2).

The tree in Figure 2 means that if, for a company, $F_1$ (the debt and cash flow conditions) is greater than 0.41, then the company is not distressed, and if $F_1$ is less than 0.23 and $F_2$ (the liquidity conditions) is less than 0.16, then the company is distressed. To evaluate the built trees, we applied about 30 percent of the data. The results showed that the tree built directly on the financial ratios had a greater misclassification error than the tree built on the factors (20%). In other words, a comparison of the two trees showed that the tree built on factors used more information and hence gave a more reliable result.

Finally, the neural network is implemented with three hidden layers and a hyperbolic activation function. The final result shows an overall accuracy of 94 percent, which is 12 percent less than all ratios directly used.

Four models are thus available to forecast financial distress (in fact, more classification methods could be applied), and the best one for our data set has to be selected. As mentioned above, the TOPSIS method is used to choose the best model based on different classification performance measures. Table 5 reports some classification performance measures for the applied methods, where TP is true positive, FP is false positive, TN is true negative, FN is false negative, and

$$\text{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN}, \qquad \text{Error rate} = \frac{FP + FN}{TP + FP + TN + FN},$$

$$\text{Precision} = \frac{TP}{TP + FP}, \qquad \text{Sensitivity} = \frac{TP}{TP + FN}, \qquad \text{Specificity} = \frac{TN}{TN + FP},$$

and AUC is the area under a receiver operating characteristic (ROC) curve, where the ROC space is defined by FP and TP as the $x$- and $y$-axes, respectively, and depicts the relative trade-offs between true positives and false positives.
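These measures follow directly from the four confusion-matrix counts, as the short sketch below shows. The counts are hypothetical and are not taken from Table 5; AUC is left out because it requires the full ranking of scores rather than a single confusion matrix.

```python
def classification_measures(tp, fp, tn, fn):
    """Standard performance measures from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "error_rate": (fp + fn) / total,
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

# Hypothetical counts for one classifier on 100 test companies.
measures = classification_measures(tp=40, fp=5, tn=50, fn=5)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Accuracy and error rate always sum to 1, so in a TOPSIS table they carry the same information with opposite orientation.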

As mentioned, there are various methods for choosing an ideal alternative in MADM problems, and TOPSIS is one of them. In the TOPSIS approach, the best alternative is the one nearest to the ideal solution and farthest from the negative ideal solution. It is also assumed that all the criteria have identical weights and importance. Table 6 presents a brief calculation of this method, where $r_1, \ldots, r_6$ are the normalized criteria (*accuracy, error rate, precision, sensitivity, specificity*, and *AUC*, resp.), $d^{+}$ is the distance from the ideal alternative, $d^{-}$ is the distance from the negative ideal alternative, and CC is the relative closeness to the ideal solution.

Based on the last column of Table 6, the models rank from best to worst as decision trees, neural network, logit analysis, and discriminant analysis, whereas a decision maker choosing the best model from the information in Table 5 alone might judge differently.

#### 5. Conclusion

Enterprise bankruptcy forecasting has always been an important issue in business and financial decision support. In this research, a hybrid approach is suggested to improve prediction performance and give more supportive results.

First, factor analysis was used to identify and summarize combinations of correlated financial ratios. Then the $k$-means algorithm was used to cluster the companies, homogenize them, and obtain more accurate results.

Next, to predict financial distress, multiple logistic regression analysis, multiple discriminant analysis, the decision tree, and the neural network, all well-known methods in this field, were applied. Finally, the best classifier was chosen with the help of a multiple attribute decision making (MADM) method, TOPSIS.

The proposed approach was applied to Iranian companies on the Tehran Stock Exchange, to which previous models had been applied, and the results show that prediction can be consolidated with the help of the hybrid analysis. The comparison among the methods used clearly showed that the decision tree, followed by the neural network, performed remarkably well in comparison with the others.

The hybrid approach advanced here provides insight into the complex interaction of the common bankruptcy prediction methods and suggests avenues for applying MADM methods in this area in future research.

#### The Definition of Variables Used

| Variable | Definition |
| --- | --- |
| TA | Total assets |
| CR | The ratio of current assets to current liabilities |
| QR | The ratio of cash and equivalents, short-term investments, and accounts receivable to current liabilities |
| DR | The ratio of total liabilities to total assets |
| DER | The ratio of total liabilities to total owner’s equity |
| SI | The ratio of sales to inventories |
| TLDS | The ratio of total liabilities to daily sales |
| SFA | The ratio of sales to fixed assets |
| STA | The ratio of sales to total assets |
| ROS | The ratio of net income to sales |
| ROE | The ratio of net income to average owner’s equity |
| CLOE | The ratio of current liabilities to owner’s equity |
| TLOE | The ratio of total liabilities to owner’s equity |
| WCTA | The ratio of working capital to total assets |

#### Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.