Abstract

This paper proposes a novel ensemble learning approach based on logistic regression (LR) and artificial intelligence tools, namely, support vector machine (SVM) and back-propagation neural networks (BPNN), for corporate financial distress forecasting in fashion and textiles supply chains. Firstly, related concepts of LR, SVM, and BPNN are introduced. Then, the forecasting results by LR are introduced into the SVM and BPNN techniques as an additional input, so that these techniques can recognize the fitting errors made by LR. Moreover, an empirical analysis of Chinese listed companies in the fashion and textile sector is implemented to compare the methods, and some related issues are discussed. The results suggest that the proposed ensemble learning approach can achieve higher forecasting performance than the individual models.

1. Introduction

In the fashion and textile industry, there is a great deal of change due to global sourcing and high levels of price competition. In addition, fashion and textiles have market characteristics such as short product lifecycles, high volatility, low predictability, and a high level of impulse purchasing [1-3]. The rigorous competition and rapidly changing demand cause financial risk to corporations in a fashion and textile supply chain [4-8]. As financial risk may spread from one corporation to another within the supply chain, the prediction of corporate financial distress is important to fashion and textile supply chain management.

For better performance in fashion retailing, more and more research has focused on forecasting, including sales forecasting [9-11], fashion retail forecasting [12], and color trend forecasting [13, 14]. However, in the fashion and textile sector, little attention has been paid to corporate financial distress forecasting, which is important to various stakeholders (i.e., management, investors, employees, shareholders, and other interested parties) as it provides them with timely warnings. From a managerial perspective, financial distress forecasting tools allow for taking timely strategic actions so that financial distress can be avoided [15].

Many traditional techniques have been presented to predict corporate financial distress, including univariate approaches [16], linear multiple discriminant approaches (MDA) [17, 18], multiple regression [19, 20], logistic regression [21], and factor analysis [22]. However, strict assumptions of traditional statistics such as linearity, normality, and independence among predictor variables limit their applications in the real world [23].

Due to the limitations of traditional statistical and econometric models, some nonlinear and artificial intelligence (AI) models, including neural networks [24], case-based reasoning [25, 26], and support vector machine [27], have been used for corporate financial distress forecasting. However, individual forecasting methods have limited capability in describing financial characteristics. In particular, for some complex forecasting problems, there may be a bias in the results when only an individual method is used [28]. A more appropriate approach for improving the forecasting accuracy is the combination of individual methods, which always performs better than the worst individual model on predictions and sometimes can outperform the best individual model [29].

Some hybrid methods have been used for corporate financial distress forecasting [30]. In one set of experiments, an AdaBoost ensemble with single attribute test (SAT) outperforms an AdaBoost ensemble with decision tree (DT), a single DT classifier, and a single support vector machine (SVM) classifier. The conclusion drawn is that the choice of weak learner is crucial to the performance of an AdaBoost ensemble, and that an AdaBoost ensemble with SAT is more suitable for corporate financial distress forecasting of Chinese listed companies [31]. Also, empirical results indicate that the integration of principal component analysis (PCA) with MDA can produce better performance in short-term financial distress forecasting of Chinese listed companies [32]. However, there is not yet an ensemble learning approach that improves the forecasting performance of one method by recognizing the Type I and Type II errors of another method.

This paper proposes a novel ensemble learning approach based on logistic regression (LR) and artificial intelligence tools, namely, support vector machine (SVM) and back-propagation neural networks (BPNN), for the prediction of corporate financial distress in fashion and textiles supply chains. Firstly, related concepts of LR, SVM, and BPNN are introduced. Then, the forecasting results by LR are introduced into the SVM/BPNN technique as an ensemble approach. An empirical analysis is implemented to compare the methods, and some related issues are discussed.

The rest of this paper is organized as follows. The basic concepts of LR, SVM, and BPNN are introduced in Section 2. We describe our proposed ensemble learning approach in Section 3. Section 4 presents an empirical analysis to illustrate the proposed approach and some related issues, including performance comparison and analysis. Finally, we conclude and discuss future research in Section 5.

2. Basic Concepts of LR, SVM, and BPNN

In this section, the concepts of LR, SVM, and BPNN are introduced as follows.

2.1. Logistic Regression Model

In a logistic regression (LR) model, the dependent variable is categorical and has two or more levels [33]. In this study, we consider the situation where we observe a binary outcome variable $y$ and a vector of covariates $x = (x_1, \ldots, x_m)^T$ for each of $n$ individuals. We code the two classes via a 0/1 response $y_i$, where $y_i = 1$ is for the first class (financial distress) and $y_i = 0$ is for the second one (no financial distress). Let $p(x) = P(y = 1 \mid x)$ be the conditional probability associated with the first class. In a logistic regression model, the probability of the dichotomous outcome event is related to a set of explanatory variables as follows:

$$\operatorname{logit}(p(x)) = \ln \frac{p(x)}{1 - p(x)} = \beta_0 + \beta^T x, \quad (1)$$

where $\beta_0$ is the intercept, $\beta = (\beta_1, \ldots, \beta_m)^T$ is the coefficient vector of the model, and $\beta^T$ is its transpose. Equation (1) is the logit transformation and $p(x)/(1 - p(x))$ is the odds.

Let $D = \{(x_i, y_i)\}_{i=1}^{n}$ be the training data set, whose elements are independent and identically distributed. The regression coefficients estimated from the data are interpretable as log-odds ratios or, in terms of $\exp(\beta_j)$, as odds ratios. The log-likelihood for the $n$ observations is used for estimating the regression coefficients as follows:

$$l(\beta_0, \beta) = \sum_{i=1}^{n} \left[ y_i \ln p(x_i) + (1 - y_i) \ln\left(1 - p(x_i)\right) \right], \quad (2)$$

where $\exp(\beta_j)$ gives the odds ratio of the $j$th indicator and this value reflects the effect of that indicator on financial distress.
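The logit model and its log-likelihood can be illustrated with a minimal sketch. The function names, the toy data, and the coefficient values below are assumptions made purely for illustration; they are not estimates from the paper's data set.

```python
import math

def predict_proba(beta0, beta, x):
    """Probability of class 1 (financial distress) under the logit model."""
    z = beta0 + sum(b * xi for b, xi in zip(beta, x))
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(beta0, beta, X, y):
    """Sum of y*ln(p) + (1-y)*ln(1-p) over all observations."""
    total = 0.0
    for xi, yi in zip(X, y):
        p = predict_proba(beta0, beta, xi)
        total += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total

def odds_ratios(beta):
    """exp(beta_j): multiplicative change in the odds per unit of x_j."""
    return [math.exp(b) for b in beta]

# Hypothetical toy data: two indicators for four companies.
X = [[0.2, 1.5], [0.9, 0.3], [0.4, 1.1], [0.8, 0.2]]
y = [0, 1, 0, 1]
beta0, beta = 0.0, [2.0, -1.0]  # assumed coefficients, not fitted values

ll = log_likelihood(beta0, beta, X, y)
ors = odds_ratios(beta)
```

In practice the coefficients would be chosen to maximize `log_likelihood`, typically with a numerical optimizer.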

2.2. Support Vector Machine

Support vector machine (SVM), proposed by Vapnik, has been shown to possess excellent capability for classification [34]. The conventional SVM achieves classification by mapping the input vectors onto a high-dimensional feature space and then constructing a linear model that implements nonlinear class boundaries in the original space. The SVM employs an algorithm that finds a special kind of linear model, that is, the optimal hyperplane, which refers to the maximum-margin hyperplane and yields the maximum separation between decision classes. Thus, the optimal hyperplane separates the training examples with the maximum distance from the separating hyperplane to the closest training data samples. The training examples closest to the maximum-margin hyperplane are called support vectors. All other training examples play no role in defining the optimal hyperplane. As a result, it is possible for SVM models to effectively perform binary classification with a small number of training samples [28].

For the linearly separable case, a hyperplane that separates the binary decision classes in the case of $m$ attributes can be represented as the following equation:

$$y = w_0 + w_1 x_1 + \cdots + w_m x_m, \quad (3)$$

where $y$ is the outcome, $x_i$ is the value of attribute $i$, and $w_i$ is the weight of $x_i$ learned by the algorithm. In (3), the weights $w_i$ are the parameters that determine the hyperplane. By using the support vectors, SVM models approximate the maximum-margin hyperplane as follows:

$$y = b + \sum_{i} \alpha_i y_i \, x(i) \cdot x, \quad (4)$$

where $y_i$ is the class value of the training example $x(i)$, $x$ is the example to be classified, and the sum runs over the support vectors. The problem of finding the support vectors and the parameters $b$ and $\alpha_i$ can be transformed into a linearly constrained quadratic programming (QP) problem.

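For the linearly separable case, we assume that all data points are at least distance 1 from the hyperplane $w^T x + b = 0$. Then, given a training set of instance-label pairs $(x_i, y_i)$, $i = 1, \ldots, l$, where $x_i \in R^m$ and $y_i \in \{1, -1\}$, the data points will be correctly classified by

$$y_i (w^T x_i + b) \ge 1, \quad i = 1, \ldots, l. \quad (5)$$

The SVM finds an optimal separating hyperplane with the maximum margin by solving the following quadratic optimization problem:

$$\min_{w, b} \; \frac{1}{2} w^T w \quad \text{s.t.} \quad y_i (w^T x_i + b) \ge 1, \quad i = 1, \ldots, l. \quad (6)$$

By adopting nonnegative slack variables $\xi_i$ together with a penalty parameter $C > 0$, we can transform (6) into (7) as follows:

$$\min_{w, b, \xi} \; \frac{1}{2} w^T w + C \sum_{i=1}^{l} \xi_i \quad \text{s.t.} \quad y_i (w^T x_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, \ldots, l. \quad (7)$$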
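To make the role of the slack variables concrete, the following minimal sketch evaluates the margin constraint and the soft-margin objective for a fixed hyperplane. The weights, the data points, and the penalty parameter `C` are assumed values chosen for illustration; the sketch does not solve the QP problem itself.

```python
def margin(w, b, x, y):
    """Functional margin y * (w . x + b) of one labeled point."""
    return y * (sum(wi * xi for wi, xi in zip(w, x)) + b)

def slack(w, b, x, y):
    """Smallest nonnegative slack that satisfies margin >= 1 - slack."""
    return max(0.0, 1.0 - margin(w, b, x, y))

def soft_margin_objective(w, b, X, Y, C):
    """0.5 * w.w plus C times the total slack."""
    reg = 0.5 * sum(wi * wi for wi in w)
    return reg + C * sum(slack(w, b, x, y) for x, y in zip(X, Y))

# Hypothetical hyperplane and points; labels are +1 / -1.
w, b = [1.0, 0.0], 0.0
X = [[2.0, 0.0], [-2.0, 1.0], [0.5, 0.0], [1.0, 0.0]]
Y = [1, -1, 1, -1]

slacks = [slack(w, b, x, y) for x, y in zip(X, Y)]
objective = soft_margin_objective(w, b, X, Y, C=1.0)
```

Points with zero slack satisfy the hard-margin constraint; a slack between 0 and 1 marks a point inside the margin, and a slack above 1 marks a misclassified point.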

By solving (7), we can find the hyperplane that provides the minimum number of training errors. For the nonlinearly separable case, SVM models are able to undertake the classification by constructing a linear model that implements the nonlinear class boundaries by transforming the inputs into the high-dimensional feature space. In this case, (4) can be modified into a high-dimensional version as follows:

$$y = b + \sum_{i} \alpha_i y_i K(x(i), x). \quad (8)$$

The function $K(x(i), x)$ is the kernel function, which implicitly maps the input vectors into a high-dimensional feature space. Usually, there are three types of kernel functions: the linear kernel, $K(x_i, x_j) = x_i^T x_j$; the polynomial kernel, $K(x_i, x_j) = (x_i^T x_j + 1)^d$, where $d$ is the degree of the polynomial kernel; and the Gaussian radial basis function (RBF), $K(x_i, x_j) = \exp\left(-\|x_i - x_j\|^2 / (2\sigma^2)\right)$, where $\sigma$ is the bandwidth of the kernel.
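The three kernels and the kernelized decision function can be sketched as follows. The parametrization of the polynomial and RBF kernels (the `+ 1` offset and the `2 * sigma**2` denominator) is one common convention and is assumed here; the support vectors, multipliers, and bias are hypothetical values, not the result of training.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def linear_kernel(u, v):
    return dot(u, v)

def polynomial_kernel(u, v, d=2):
    # Assumed form (u . v + 1)^d with degree d.
    return (dot(u, v) + 1.0) ** d

def rbf_kernel(u, v, sigma=1.0):
    # Assumed form exp(-||u - v||^2 / (2 * sigma^2)) with bandwidth sigma.
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def decision_value(x, support_vectors, labels, alphas, b, kernel):
    """b + sum_i alpha_i * y_i * K(x(i), x); its sign gives the class."""
    return b + sum(a * y * kernel(sv, x)
                   for sv, y, a in zip(support_vectors, labels, alphas))

# Hypothetical support vectors and multipliers.
svs, labels, alphas, bias = [[1.0, 0.0], [-1.0, 0.0]], [1, -1], [0.5, 0.5], 0.0
value = decision_value([2.0, 0.0], svs, labels, alphas, bias, linear_kernel)
```

Swapping `linear_kernel` for `rbf_kernel` in the last call changes the decision boundary from linear to nonlinear without changing the rest of the code.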

2.3. Back-Propagation Neural Networks

In financial forecasting, back-propagation neural networks (BPNN) are often used as a benchmark model [35-38]. The procedure to set up a BPNN includes the following steps: select input and output variables; determine the number of layers and the number of neurons in the hidden layers; learn from real data; test; recall. Let $q$ be the number of hidden nodes and $p$ the dimension of the input vector (the lagged observations). The relationship between the output ($y_t$) and the inputs ($y_{t-1}, \ldots, y_{t-p}$) has the following mathematical representation:

$$y_t = \alpha_0 + \sum_{j=1}^{q} \alpha_j \, g\left(\beta_{0j} + \sum_{i=1}^{p} \beta_{ij} y_{t-i}\right) + \varepsilon_t, \quad (9)$$

where $\alpha_j$ ($j = 0, 1, \ldots, q$) is the connection weight of the $j$th hidden node, $\beta_{ij}$ ($i = 0, 1, \ldots, p$; $j = 1, \ldots, q$) is the connection weight between the $i$th input node and the $j$th hidden node, and $\varepsilon_t$ is the error term. The logistic function is often used as the hidden layer transfer function as follows:

$$g(z) = \frac{1}{1 + \exp(-z)}. \quad (10)$$

The BPNN model performs a nonlinear functional mapping from the inputs ($y_{t-1}, \ldots, y_{t-p}$) to $y_t$ as follows:

$$y_t = f(y_{t-1}, \ldots, y_{t-p}, w) + \varepsilon_t, \quad (11)$$

where $w$ is a vector of all parameters and $f$ is a function determined by the network structure and connection weights. Thus, the neural network is equivalent to a nonlinear autoregressive model.
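The mapping performed by a single-hidden-layer network can be sketched as a forward pass. The argument names and the weight layout (`beta[j][i]` holds the weight from input `i` to hidden node `j`) are assumptions for illustration; training the weights by back-propagation is not shown.

```python
import math

def logistic(z):
    """Hidden-layer transfer function 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(inputs, alpha0, alpha, beta0, beta):
    """Output of a network with one hidden layer and a linear output node.

    alpha0   : output bias
    alpha[j] : weight from hidden node j to the output
    beta0[j] : bias of hidden node j
    beta[j][i]: weight from input i to hidden node j
    """
    hidden = [
        logistic(beta0[j] + sum(w * x for w, x in zip(beta[j], inputs)))
        for j in range(len(alpha))
    ]
    return alpha0 + sum(a * h for a, h in zip(alpha, hidden))

# Hypothetical network: 3 inputs, 2 hidden nodes, all input weights zero.
out = forward(
    inputs=[0.4, 0.1, 0.9],
    alpha0=0.25,
    alpha=[1.0, 1.0],
    beta0=[0.0, 0.0],
    beta=[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
)
```

With all input weights set to zero each hidden node outputs 0.5, so the result depends only on `alpha0` and `alpha`, which makes the sketch easy to check by hand.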

However, individual methods for corporate financial distress forecasting have a disadvantage for error recognition. Therefore, in the following section, a novel ensemble learning approach based on logistic regression and SVM/BPNN is proposed for error recognition and the improvement of forecasting performance.

3. A Novel Ensemble Learning Approach

Many hybrid methods have been used for economic forecasting [39-41], and risk analysis has been addressed in supply chain management [42-48]. When the methods of LR, SVM, and BPNN are used for corporate financial distress forecasting, the financial ratio indices of companies are the input variables and the corresponding corporate financial state (0/1) of the companies is the output. The principle of the new ensemble learning approach based on LR and SVM/BPNN is that the forecasting results of corporate financial distress by LR are set as another input variable within the SVM/BPNN framework, corresponding to the output of corporate financial state. In this way, both Type I error (reject-true error) and Type II error (accept-false error) in LR analysis can be recognized by SVM/BPNN.

In summary, the overall process of the ensemble learning approach (LR-SVM/BPNN), shown in Figure 1, consists of the following three main steps.

(1) Stepwise regression analysis is used to remove independent variables that have no significant linear relationship with the dependent variable. In this way, the remaining independent variables are significant to the dependent variable and multicollinearity is removed.

(2) Logistic regression analysis is implemented based on the training sample set. Then, the critical value of the financial distress probability is set as 0.5. If the probability is more than 0.5, the value is 1 and the company is predicted to have financial distress; otherwise, the value is 0 and the company is predicted to have no financial distress.

(3) The forecasting results by LR are introduced into the SVM/BPNN technique as a new variable. Then, there will be a new forecasting result by the LR-SVM/BPNN approach.
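Steps (2) and (3) above can be sketched as a thresholding step followed by feature augmentation. The function names, the indicator values, and the LR probabilities below are hypothetical; the second-stage SVM or BPNN that would be trained on the augmented inputs is not shown.

```python
def lr_class(prob, threshold=0.5):
    """Step (2): 1 (distress) if the LR probability exceeds the threshold."""
    return 1 if prob > threshold else 0

def augment_features(X, lr_probs, threshold=0.5):
    """Step (3): append the LR forecast (0/1) to each input vector."""
    return [list(x) + [lr_class(p, threshold)] for x, p in zip(X, lr_probs)]

# Hypothetical financial indicators and LR probabilities for three companies.
X = [[0.12, 1.8], [0.03, 0.6], [0.25, 2.4]]
lr_probs = [0.31, 0.77, 0.50]

X_aug = augment_features(X, lr_probs)
```

The rows of `X_aug` would then serve as the inputs of the SVM or BPNN, with the observed financial state as the target. A probability of exactly 0.5 maps to 0 because the rule requires the probability to be more than 0.5.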

On the whole, logistic regression is a linear model, while SVM and BPNN are nonlinear models. When the forecasting results by LR are introduced into the SVM/BPNN technique as an input variable, SVM/BPNN can recognize the fitting errors made by LR. Therefore, the ensemble learning approach LR-SVM/BPNN is expected to achieve better forecasting performance.

4. Empirical Analysis

4.1. Data Description and Experiment Design

The fashion and textile sector is a traditional pillar industry in China. In recent years, China's exports in the fashion and textile sector have accounted for more than 25% of global trade. However, many fashion and textile companies have been hit hard since the financial crisis in 2008. When a fashion and textile company suffers from financial distress, other companies in its supply chains will be subjected to financial risk. Therefore, financial distress forecasting is important in fashion and textiles supply chains.

In this study, the data for our experiment were collected from the Shanghai Stock Exchange and Shenzhen Stock Exchange databases in China. In 2012, there were 88 fashion and textiles related listed companies, such as Shenzhen Victor Onward Textile Industrial Co., Ltd. (000018); Xinlong Holding (Group) Company Ltd. (000955); Hubei Maiya Co., Ltd. (000971); Shandong Demian Incorporated Company (002072); Lanzhou Sanmao Industrial Co., Ltd. (000779); Ningxia Zhongyin Cashmere Co. Ltd. (000982); Sichuan Langsha Holding Ltd. (600137); Hunan Huasheng Co., Ltd. (600156); Xinjiang Tianshan Wool Tex Stock Co., Ltd. (000813); Nanjing Textiles Import & Export Corp., Ltd. (600250); and Henan Xinye Textile Co., Ltd. (002087), where the numbers in brackets are the stock codes of the corresponding listed companies. We selected 15 companies that had once received special treatment (ST) and 45 non-ST companies as samples, where ST companies are regarded as being in financial distress in this study. Also, we selected 12 variables and categorized them into four major types: earning ability, operating ability, debt-repaying ability, and growing ability. The indicators belonging to each type are listed in Table 1.

The data set consisting of 60 samples is listed in Table 2, where the variable $y_t$ represents the financial distress state (0 means no financial distress and 1 means financial distress) in period $t$ and $x_{i,t-1}$ represents the value of the $i$th variable in period $t-1$. In this study, $x_{i,t-1}$ is an independent variable and $y_t$ is the dependent variable.

The training set (first 40 samples in Table 2) and the testing set (last 20 samples in Table 2) used in empirical analysis are described in Table 3. Here, the training set is used to acquire the parameters of forecasting models, while the testing set is used to measure the forecasting performance of forecasting models.

4.2. Experiment Results and Analysis

In the empirical analysis of corporate financial distress forecasting, the financial state of listed companies in period $t$ is the dependent variable/output and the variables in period $t-1$ are the inputs. Normalization of the variables in period $t-1$ is implemented and the normalized values are listed in Table 4.
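The text does not state which normalization is applied. As one possibility, a minimal sketch of min-max scaling to the interval [0, 1] is given below; the choice of min-max scaling, the function name, and the sample values are all assumptions.

```python
def min_max_normalize(column):
    """Rescale one variable to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]

# Hypothetical values of one financial indicator across companies.
raw = [2.0, 4.0, 6.0]
scaled = min_max_normalize(raw)
```

When a separate testing set is used, the minimum and maximum are usually taken from the training set only and then reused for the testing set.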

As a result, the methods can be used for corporate financial distress forecasting in the next period once the values of the variables in the current period are available. By using the training set and the testing set in Table 3, we obtain the forecasting performance comparison of the methods with the values of the variables in period $t-1$ and the financial state in period $t$, as shown in Table 5.

Moreover, the independent variable values in periods $t-2$ and $t-3$ are used for forecasting corporate financial distress (the dependent variable in period $t$), and the comparison of forecasting performance with period $t-2$ and period $t-3$ data is shown in Tables 6 and 7.

From the results in Table 5, the accuracy achieved by BPNN and SVM is 80%, and that achieved by LR-BPNN and LR-SVM is 85%, both of which are much higher than the 50% accuracy of LR. In addition, by comparing the results in Tables 5-7, we find that the total accuracy decreases as the lag between the input period and the forecast period $t$ grows; that is, the total accuracy with period $t-1$ data is better than that with period $t-2$ data, which is better than that with period $t-3$ data. Therefore, we can conclude that artificial intelligence techniques, that is, SVM and BPNN, can achieve better forecasting performance than LR. In addition, as SVM/BPNN can recognize the fitting errors made by LR, the ensemble learning approaches LR-BPNN and LR-SVM can achieve better forecasting performance than the individual methods when the forecasting results by LR are introduced into the SVM/BPNN technique as an input variable.
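The accuracy figures quoted above are proportions of correctly classified testing samples. The sketch below shows how such a figure and the two kinds of misclassification can be counted on a testing set of 20 samples; the label vectors are hypothetical and are not the paper's data.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives, with 1 = financial distress."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)

# Hypothetical testing set: 5 distressed and 15 healthy companies.
y_true = [1] * 5 + [0] * 15
# Hypothetical forecasts with one missed distress case and two false alarms.
y_pred = [1, 1, 1, 1, 0] + [1, 1] + [0] * 13

counts = confusion_counts(y_true, y_pred)
acc = accuracy(y_true, y_pred)
```

With 20 testing samples, 17 correct forecasts correspond to 85% accuracy, 16 to 80%, and 10 to 50%.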

5. Conclusions and Future Work

In this study, logistic regression (LR) model is integrated with artificial intelligence tools, that is, support vector machine (SVM) and back-propagation neural networks (BPNN), for corporate financial distress forecasting in fashion and textiles supply chains. Empirical analysis of Chinese listed companies in fashion and textile sector is implemented for the comparison of the methods, and some related issues are discussed.

The contribution of this study is that a novel ensemble learning approach is developed for corporate financial distress forecasting in fashion and textiles supply chains. In the framework of the proposed ensemble learning approach, the forecasting results by LR are introduced into the SVM and BPNN techniques, which can recognize the fitting errors made by LR. The results suggest that artificial intelligence tools perform better than LR and that the proposed ensemble learning approach can achieve better forecasting performance than the individual models. By using the proposed approach, managers in fashion and textiles companies can predict the financial state of their suppliers, manufacturers, and retailers in advance and respond quickly for better supply chain performance.

It is expected that future research would benefit from the concentration on other methods for corporate financial distress forecasting, using data from a wider sample of fashion and textiles companies.

Acknowledgments

The authors sincerely thank the anonymous referees for their valuable suggestions and comments. This work was supported by the National Natural Science Foundation of China (Grant nos. 70871107 and 71101028), China Postdoctoral Science Foundation (Grant no. 20060400103), the Program for Innovative Research Team in UIBE, the Program for Excellent Talents, UIBE, and The Royal Academy of Engineering for research exchanges with China and India scheme.