Research Article  Open Access
A Seasonal Time-Series Model Based on Gene Expression Programming for Predicting Financial Distress
Abstract
The prediction of financial distress is an important and challenging research topic in the financial field. Many methods have been proposed for predicting firm bankruptcy and financial crisis, including artificial intelligence and traditional statistical methods, and past studies have shown that artificial intelligence methods yield better prediction results than traditional statistical methods. Financial statements are issued quarterly; hence, the financial crisis of companies is seasonal time-series data, and the attribute data affecting the financial distress of companies is nonlinear and nonstationary time-series data with fluctuations. Therefore, this study employed a nonlinear attribute selection method to build a nonlinear financial distress prediction model: that is, this paper proposed a novel seasonal time-series gene expression programming model for predicting the financial distress of companies. The proposed model has several advantages: (i) unlike previous models, it incorporates the concept of time series; (ii) the proposed integrated attribute selection method can find the core attributes and reduce high-dimensional data; and (iii) the proposed model can generate rules and mathematical formulas of financial distress that serve as references for investors and decision makers. The results show that the proposed method outperforms the listed classifiers under three criteria; hence, the proposed model has competitive advantages in predicting the financial distress of companies.
1. Introduction
Financial distress, also known as “financial crisis,” refers to the situation in which cash flow is insufficient to cover current debt. A listed company undergoing financial crisis or bankruptcy may affect the stability of the entire capital market or even cause investor panic and economic losses, seriously damaging the interests of shareholders, creditors, investors, and company employees. With a feasible financial crisis warning model for listed companies, business managers can adopt countermeasures at the early stage of a financial crisis to prevent the damage from spreading. For investors, a financial crisis warning model can also strengthen the capital market by providing safeguards to investors who may not be aware of the operating status of the companies. Therefore, a feasible warning model is able to detect the problems of a listed company early in order to prevent significant losses to investors.
Currently, there are a great number of financial crisis prediction models, and each model has its own applicable timing, advantages, and disadvantages. Financial crisis prediction models can be divided into statistical methods and artificial intelligence methods. Statistical methods require the data variables to satisfy certain assumptions, whereas artificial intelligence methods require no assumption of any probability distribution. In addition, many artificial intelligence methods are model-free [1], requiring only the adjustment of some environmental parameters for machine learning to train the output of the prediction rules. From the literature, this study finds that early researches have some drawbacks: (1) the indices selected in some previous researches are based on personal experience and opinion; (2) the variables of a statistical model need to follow relevant assumptions [2]; (3) most past researches use linear attribute selection methods, or no attribute selection at all, to build the prediction model; (4) the traditional GA uses a fixed-length coding method whose performance is poor when facing complicated problems, whereas GP uses a nonlinear and dynamic data structure for coding, which is capable of handling complicated problems [3].
To properly handle the above problems, this paper proposed a novel seasonal time-series gene expression programming model for predicting the financial distress of companies. The proposed model has several features: it is a seasonal time-series model; it can find the core attributes and reduce high-dimensional data; and it can generate decision rules and mathematical formulas to build a feasible warning model that provides references to investors and decision makers.
The remainder of this paper is organized as follows. Section 2 provides the related literature review, including financial crisis, attribute selection methods, related data mining techniques, and gene expression programming. Section 3 describes the proposed method and algorithm. The experimental results and comparison are presented in Section 4. The last section is the conclusion.
2. Literature Review
This section introduces related work on financial crisis, attribute selection methods, related data mining techniques, and gene expression programming.
2.1. Financial Crisis
In general, financial crisis refers to a company having insufficient cash flow to pay its debt. Beaver [4] defined financial distress as bankruptcy, preferred dividends in arrears, bank overdrafts, and irredeemable debenture. Deakin [5] believed that financial distress included companies that had gone bankrupt, were unable to pay their debt, or had undergone liquidation for creditors’ benefit. Carmichael [6] proposed that financial distress referred to hindrances to business performance, actually demonstrated as insufficient liquidity, insufficient equity, insufficient funds, and delinquencies. Foster [7] expressed that so-called financial distress referred to a company having serious problems converting assets into cash, the resolution of which required transforming the company’s operating method or form of existence. Morris [8] listed 12 signs indicating corporate financial distress in descending order of severity. Quinlan [9] pointed out that financial crisis referred to the situation where the operating cash flow of a company was insufficient to satisfy its current debt obligations such that the company was forced to adopt corrective actions. Bankruptcy is defined as a legal outcome in which a court finds that the corporation has no ability to pay its debt, or in which expropriation is performed after the payment of debt is pursued. Altman [10] defined financial crisis as restructuring due to legally determined bankruptcy. Ohlson [11] also defined a company in financial crisis as one satisfying the legal standard for the determination of bankruptcy. The definition of financial crisis has become clearer and includes the notion of early prevention. The more definite the definition, the earlier companies can discover and prevent a crisis, and the greater the chance of saving such companies.
In other words, financial crisis is not equivalent to bankruptcy; however, a company that goes bankrupt must have experienced a financial crisis.
Sun et al. [12] proposed a definition of financial crisis and presented various definitions from different viewpoints. Different scholars may have different interpretations according to their research objectives, but most of the definitions can be divided into two types. In theoretical analysis, financial crisis has different degrees: a mild financial crisis may involve temporary difficulties in cash flow, while a serious financial crisis may involve corporate failure or bankruptcy. A corporation in a financial crisis may experience dynamic change between these two extremes. In empirical research, the criteria for research sampling and the restrictions of data availability are stated clearly, and financial crisis is often explicitly defined as certain situations that clearly describe financial difficulty, such as statutory bankruptcy.
For financial crisis, the related methods include traditional statistical methods and artificial intelligence methods. However, the variables of a statistical model require some assumptions [2]: (a) variables have normal distributions; (b) the variables need to be independent of each other; (c) variables must have a high discriminative ability for separating solvent companies from insolvent ones; (d) each record in the dataset must be complete; and (e) corporate classification shall be clearly defined, meaning that companies belonging to a certain class shall not belong to another class. In contrast to statistical models, artificial intelligence has parallel computation ability such that it is able to handle nonlinear system problems, and artificial intelligence is widely used in the field of financial crisis prediction. Artificial intelligence methods include the neural network (NN), genetic algorithm (GA), rough set theory (RST), case-based reasoning (CBR), support vector machine (SVM), and k-nearest neighbor (KNN) methods.
Nevertheless, statistical methods and artificial intelligence methods have their own merits and drawbacks. For example, the widely used statistical multiple discriminant analysis (MDA) has the advantage of simple and outstanding interpretation; however, its strict statistical assumptions limit its application, and it is a static determination model. In contrast, the backpropagation neural network of artificial intelligence requires no assumption of any probability distribution, and it is an effective tool for modeling nonlinear systems. Therefore, many researchers use three-layer backpropagation neural networks to predict financial crisis.
Many researches used financial ratios to build financial crisis prediction models; this paper summarizes the data samples and results of related studies in Table 1. From Table 1, we can find that in recent years many researches utilized attribute selection and artificial intelligence classifiers to build financial crisis models, and these methods are primarily based on practical industrial data to develop financial crisis prediction models. Table 2 lists the financial ratios used in financial crisis prediction models. Table 2 shows that researches using financial ratios have gradually increased.
 
MDA: multivariate discriminant analysis, DT: decision tree, NN: neural network, PCA: principal component analysis, LR: logistic regression, NB: Naive Bayes, MLP: multilayer perceptron neural network, CART: classification and regression tree, SVM: support vector machines, and RSBL: random subspace binary logit. 

2.2. Attribute Selection
Attribute selection methods include subjective attribute selection, in which the researcher asks experts to select attributes based on consensus decisions, such as the Delphi panel and the analytic hierarchy process (AHP), and objective attribute selection, which since 1970 has been widely used in various research fields, such as statistical model analysis [13], machine learning [14], and data mining [15]. Attribute selection is an important step in data mining preprocessing, and its primary purpose is to select important and useful information from a vast amount of information, that is, selecting useful attributes and deleting unnecessary and irrelevant ones. Through the step of attribute selection, a more effective outcome can be achieved [16].
Algorithms for attribute selection are based on different evaluation rules and can generally be divided into two types: one comprising the filter and wrapper methods, and the other known as the embedded method. The filter method evaluates attributes based on the characteristics of the data, such as distance, correlation, or consistency, rather than using a particular algorithm to perform the attribute selection. Since the filter method does not require any learning algorithm, it has a relatively fast computation speed. The wrapper method uses a particular algorithm to perform attribute selection; therefore, it often demonstrates relatively better performance, but it requires longer computation time and higher costs. The embedded (hybrid) method integrates the filter and wrapper methods by using different bases and computation processes to perform attribute selection [16]. The attribute selection methods used in this research are introduced below.
2.2.1. Decision Tree (DT)
Since 1960, many scholars have used tree structures to perform data analysis, including AID, ID3, CHAID, and FACT, which shows that the decision tree is a widely used classification and prediction tool. The decision tree is a tree structure similar to a flowchart, in which each node represents a test on an attribute, each branch represents an outcome of a completed test, and the leaf nodes represent the classification or the distribution of the classification [26]. ID3 [27] is a decision tree algorithm based on information theory, and its basic strategy is to pick the attribute with the highest information gain. C4.5, designed by Quinlan [9], is an expansion of ID3 used for solving issues that ID3 cannot handle [28, 29], such as reducing pruning errors, processing continuous attributes, preventing overfitting of data, controlling the depth of the decision tree, deciding when to prune, choosing an appropriate attribute selection measure, processing training data with missing values, increasing computation efficiency, and processing attributes with different costs.
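As an illustration of the information-gain criterion that ID3 and C4.5 build on, the following Python sketch computes Shannon entropy and the gain of a candidate split on a toy set of healthy/distressed firms; the labels and the split itself are invented for illustration:

```python
import math
from collections import Counter

def entropy(labels):
    # Shannon entropy of a list of class labels.
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(labels, partitions):
    # Gain from splitting `labels` into the given partitions (lists of labels):
    # parent entropy minus the size-weighted entropy of the branches.
    n = len(labels)
    remainder = sum(len(p) / n * entropy(p) for p in partitions)
    return entropy(labels) - remainder

# Toy example: 8 firms, 4 healthy (H) and 4 distressed (D),
# split by one candidate attribute into two branches.
labels = ["H"] * 4 + ["D"] * 4
left, right = ["H", "H", "H", "D"], ["H", "D", "D", "D"]
gain = information_gain(labels, [left, right])  # about 0.189 bits
```

ID3 would compute this gain for every candidate attribute and branch on the one with the highest value.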
2.2.2. Support Vector Machine (SVM)
The support vector machine was proposed by Cortes and Vapnik [30]; it is a supervised learning method widely used in classification and regression analysis. SVM performs classification with hyperplanes: among the candidate hyperplanes, it seeks the one that maximizes the margin between two classes. A nonlinear classifier is obtained by applying a kernel function to SVM; the form of the computation is similar to the linear case, except that each inner product is replaced by the kernel function. This allows the algorithm to fit the maximum-margin hyperplane in a transformed, possibly nonlinear high-dimensional feature space, so that the classifier is a hyperplane in the high-dimensional attribute space. The advantages of SVM [31] include the following: (a) the nonlinear decision classifier allows the theoretical and actual results to match well; (b) the problem of overfitting rarely occurs; (c) the problem of overly large dimensionality due to excessive features is not obvious; (d) it does not converge at a local optimal solution but converges at the global optimal solution; and (e) the kernel function can be applied flexibly.
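The kernel substitution described above can be sketched as follows; the support vectors, multipliers, and bias passed to `decision_value` are placeholders for whatever a trained SVM would supply, not values from the paper:

```python
import math

def linear_kernel(x, y):
    # Plain inner product: the linear SVM case.
    return sum(a * b for a, b in zip(x, y))

def rbf_kernel(x, y, gamma=0.5):
    # Gaussian (RBF) kernel: replaces the inner product so the same
    # algorithm finds a maximum-margin hyperplane in a nonlinear feature space.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def decision_value(x, support_vectors, alphas, labels, bias, kernel):
    # SVM decision function: the sign of this value is the predicted class.
    return bias + sum(a * y * kernel(sv, x)
                      for sv, a, y in zip(support_vectors, alphas, labels))
```

Swapping `linear_kernel` for `rbf_kernel` in `decision_value` is exactly the substitution the text describes: the algorithm is unchanged, only the notion of inner product differs.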
2.2.3. Multilayer Perceptron (MLP)
In the 1950s, scientists mimicking the operation of the human brain proposed the “perceptron” neural element model, known as the artificial neural network (ANN). In the 1980s, the Hopfield neural network was proposed, and artificial neural network theory gradually drew more attention. Up to the present day, many new structures and theories are still continuously proposed, among which the most popular is the multilayer perceptron [32], also known as the backpropagation neural network. The structure of the backpropagation neural network includes the input layer, hidden layer, and output layer. The input layer, the first layer, receives external information, and the output layer, the last layer, generates the solution of the model. Between the input layer and the output layer, there may be one or multiple hidden layers, which are used for identifying complicated patterns in the data. The advantages of MLP are that it is able to generate a nonlinear model with high accuracy; moreover, its high flexibility allows it to process different types of variable inputs. The drawback of MLP is that, since there can be one or multiple hidden layers and the learning rate and other parameters must be set, training can be extremely time consuming.
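A minimal forward pass through a one-hidden-layer perceptron illustrates the layer structure described above; the weights and the two-ratio input are invented for illustration, and no backpropagation training is shown:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense_layer(inputs, weights, biases):
    # One fully connected layer: weighted sum plus bias through a sigmoid.
    return [sigmoid(sum(w * x for w, x in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

def mlp_forward(x, hidden_w, hidden_b, output_w, output_b):
    # Input layer -> one hidden layer -> output layer.
    hidden = dense_layer(x, hidden_w, hidden_b)
    return dense_layer(hidden, output_w, output_b)

# Toy network: 2 inputs (e.g. two financial ratios), 3 hidden nodes,
# 1 output read as the degree of financial distress.
out = mlp_forward([0.4, -1.2],
                  hidden_w=[[0.5, -0.3], [0.1, 0.8], [-0.6, 0.2]],
                  hidden_b=[0.0, 0.1, -0.1],
                  output_w=[[0.7, -0.4, 0.9]],
                  output_b=[0.05])
```

Training would adjust the weight and bias arrays by backpropagating the prediction error, which is the time-consuming part the text refers to.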
2.2.4. Rough Set Theory (RS)
In 1982, Pawlak proposed the rough set theory, which has been proven to be an effective mathematical tool for exploring data models [33]. It is a mathematical method for processing inaccurate, uncertain, and incomplete data. It uses analysis and inference on data to discover implied knowledge and to disclose potential patterns. Its core mechanism is to use equivalence relations to partition a target set. It mainly uses the difference between the lower and upper approximations in set theory, based on the concept of conditional probability, to perform calculations on the already-classified set in order to obtain an objective result. The basic concept of rough set is to obtain the core set via reduction, based on the discernibility matrix established from the entire data system, and to establish the result in a two-dimensional decision table of criteria attributes, research subjects, and decision attributes [34]. Since Pawlak proposed the rough set theory in 1982, it has provided an excellent research method for many researchers and has achieved rapid development. Applications of the rough set theory in various fields include disease identification models [35], medication knowledge [36], stock market forecasting [37], business decline prediction [38], customer relationships [39], and building windward surface value analysis [40].
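The lower and upper approximations at the heart of rough set theory can be sketched as follows; the toy universe, indiscernibility relation, and target set are invented for illustration:

```python
from collections import defaultdict

def approximations(universe, indiscernibility, target):
    # Partition the universe into equivalence classes by the indiscernibility
    # key, then build the lower and upper approximations of the target set.
    classes = defaultdict(set)
    for obj in universe:
        classes[indiscernibility(obj)].add(obj)
    lower, upper = set(), set()
    for eq in classes.values():
        if eq <= target:
            lower |= eq   # class entirely inside the target: certain members
        if eq & target:
            upper |= eq   # class overlapping the target: possible members
    return lower, upper

# Toy example: firms 1-6 grouped by a discretized attribute (value mod 3);
# the target set is the firms known to be in distress.
lower, upper = approximations(range(1, 7), lambda o: o % 3, {1, 2, 4})
boundary = upper - lower  # firms that cannot be classified with certainty
```

The boundary region (upper minus lower) is exactly the "rough" part of the set; attribute reduction seeks the smallest attribute subset that keeps these approximations unchanged.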
2.2.5. Radial Basis Function Network
The radial basis function network (RBF network) is an artificial neural network using radial basis functions as activation functions. The output of the RBF network is a linear combination of the radial basis functions of the inputs and the neuron parameters. The RBF network has numerous applications, including function approximation, time-series prediction, classification, and system control. Broomhead and Lowe [41] first established the radial basis function network. The RBF network typically includes three layers: an input layer, a hidden layer with nonlinear activation functions, and a linear output layer. The input can be modeled as a real vector, and the output is a scalar function of the input vector. The difference between the RBF network and the multilayer perceptron lies in the hidden nodes: the hidden node of the multilayer perceptron (MLP, including BP) computes a linear combination of its inputs and applies a sigmoid or step activation function, whereas the most prominent characteristic of the RBF network is that the hidden-node basis function uses a distance function (such as the Euclidean distance) and a radial basis function (such as the Gaussian function) as the activation function. In the RBF network, the number of hidden-layer neurons is equal to the number of radial basis functions, which is also equal to the number of centers. Consequently, during the construction of a radial basis neural network, the number of centers of the radial basis functions determines the size, complexity, and computational efficiency of the network, and the locations of the centers affect the convergence rate of the network.
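The RBF network output, a linear combination of Gaussian hidden-node activations, can be sketched as follows; the centers, weights, and width are illustrative placeholders, one hidden neuron per center:

```python
import math

def gaussian_rbf(x, center, width=1.0):
    # Hidden-node basis function: Gaussian of the Euclidean distance
    # between the input vector and this neuron's center.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, center))
    return math.exp(-sq_dist / (2.0 * width ** 2))

def rbf_network(x, centers, weights, width=1.0):
    # Network output: linear combination of the hidden-node activations,
    # one weight per center.
    return sum(w * gaussian_rbf(x, c, width) for w, c in zip(weights, centers))

# Two centers; the input sits exactly on the first center, so that
# neuron fires at 1.0 while the distant one contributes almost nothing.
y = rbf_network([0.0, 0.0], centers=[[0.0, 0.0], [5.0, 5.0]], weights=[1.0, 2.0])
```

The sketch makes the text's point concrete: adding a center adds one hidden neuron and one output weight, so the center count directly fixes the network's size.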
2.2.6. KNearest Neighbor (KNN)
The KNN algorithm [42] is a theoretically mature method and one of the simplest machine learning algorithms; it is widely used today owing to its effectiveness and ease of use. The KNN classification algorithm classifies an instance according to its nearest training samples in a feature space. In other words, it searches for the k neighbors nearest to the new data, on the premise that instances of the same class have high similarity to each other, so that the similarity to instances of known classes can be calculated in order to evaluate the possible class of an instance of unknown class. Similarity is calculated using distance functions, in which a smaller distance means greater similarity. To understand the operation of KNN, we explain the basic steps of the KNN algorithm as follows: (1) determine the parameter k, the number of nearby neighbors; (2) calculate the distance between the query instance and all of the training samples; (3) sort the distances from step (2), and determine the nearest neighbors based on the k smallest distances; (4) gather the classes of the k nearest neighbors; (5) use the simple majority of the k nearest neighbor classes as the predicted value for the query instance.
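The five steps above map directly to a few lines of Python; the two-feature samples are invented for illustration:

```python
import math
from collections import Counter

def knn_classify(query, samples, k=3):
    # samples: list of (feature_vector, class_label) pairs.
    # Steps (2)-(3): compute all distances and keep the k smallest.
    nearest = sorted(samples, key=lambda s: math.dist(query, s[0]))[:k]
    # Step (4): gather the classes of the k nearest neighbors.
    votes = [label for _, label in nearest]
    # Step (5): simple majority vote decides the predicted class.
    return Counter(votes).most_common(1)[0][0]

# Toy example with two financial-ratio features.
samples = [([0.1, 0.2], "healthy"), ([0.0, 0.3], "healthy"),
           ([0.9, 0.8], "distress"), ([1.0, 0.7], "distress"),
           ([0.2, 0.1], "healthy")]
pred = knn_classify([0.15, 0.25], samples, k=3)
```

Step (1) corresponds to choosing the `k` argument; all three nearest neighbors of the query here are healthy firms, so the majority vote is unanimous.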
2.3. Gene Expression Programming (GEP)
Gene expression programming (GEP) was proposed by Ferreira in 2001 [3]. The method combines the advantages of the genetic algorithm (GA) and genetic programming (GP), making it a genotype/phenotype genetic algorithm. It separates the evolutionary process from evaluation: individuals are first encoded as chromosomes of fixed length and then expressed as tree structures of different sizes and shapes when the fitness values are evaluated. Therefore, the method has the simplicity of the genetic algorithm and the functionality of genetic programming at the same time, so that in solving nonlinear problems it has excellent data processing and exploration ability. In other words, GEP is one of the stable linear GP techniques. GEP is mainly constructed from a function set, a terminal set, a fitness function, control parameters, and termination criteria. The algorithm utilizes text strings of fixed length and uses tree structures of different sizes and shapes to express the solution of the problem; this tree is known as the expression tree. A characteristic of GEP is that it is able to decompose a complicated evolution procedure into multiple subprocedures. Each GEP gene consists of a fixed-length list of symbols drawn from the function set or the terminal set.
In the genetic algorithm [43, 44], the most important part is to express the problem in chromosomes, and the set of all solutions of the problem is called a population. In the population, each solution is a chromosome; therefore, a population can be seen as a set of chromosomes. Each chromosome is formed from a fixed number of genes, and each gene represents a certain independent variable; therefore, each solution uses a fixed number of such variables to express its characteristics. Gene expression programming combines the advantages of the genetic algorithm and genetic programming, and its process is similar to theirs. At first, an initial population is generated randomly, with the number of chromosomes in the population set according to the problem. Next, the chromosomes are expressed as expression tree structures, and the fitness function is used to calculate the fitness value of each chromosome. In the calculation of the fitness function, a greater fitness value means that the chromosome better fits our requirements for the solution of the problem, and conversely for a smaller value. Based on this standard, excellent chromosomes are preserved, chromosomes of poor quality are eliminated, and the genetic operations of mutation, transposition, and recombination are used to generate the next generation of the population. This process is repeated iteratively until the set termination criteria are reached, at which point the computation stops.
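The generational loop just described can be sketched with a deliberately simplified stand-in: a fixed-length bit-string chromosome whose fitness is its number of 1-bits, evolved with elitist selection, one-point recombination, and point mutation. This mirrors only the structure of the loop; real GEP would express each chromosome as an expression tree and evaluate it against data:

```python
import random

def evolve(pop_size=20, length=10, generations=30, seed=1):
    # Stand-in fitness: count of 1-bits (a placeholder for the
    # classification-accuracy fitness an actual GEP run would use).
    rng = random.Random(seed)
    fitness = lambda chrom: sum(chrom)
    # Random initial population of fixed-length chromosomes.
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[:pop_size // 2]          # preserve the fitter half
        children = []
        while len(survivors) + len(children) < pop_size:
            p1, p2 = rng.sample(survivors, 2)
            cut = rng.randrange(1, length)
            child = p1[:cut] + p2[cut:]          # one-point recombination
            i = rng.randrange(length)
            child[i] ^= rng.random() < 0.1       # occasional point mutation
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)

best = evolve()
```

Because the fitter half is carried over unchanged each generation, the best fitness never decreases, which is the "excellent chromosomes are preserved" behavior described above.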
3. Proposed Method
Since 1966, many methods have been used to predict company bankruptcy and financial crisis, and many research results indicate that artificial intelligence yields better results than traditional statistical methods do. Although many statistical methods and artificial intelligence technologies have been used to establish prediction models for financial crisis over the past years, only a few have used a time-series model. The data of financial crisis is seasonal time-series data; financial data is nonlinear and unstable; and economic and financial systems change with the changes of model structures and behaviors. As a result, for different times and data, different time-series models are required to interpret the evidence-based data. To keep the research data objective, this study used objective financial ratios as research variables. In addition, to give the class tag (financially healthy/crisis) of the research data balanced classes, this study redefined the term financial distress based on the research done by Chen et al. [45]. Finally, the research is based on the assumption that the profitability of a company does not vary differently across the four seasons. In other words, this study employed asset profitability [46] to redefine financial crisis as follows: if the quarterly asset profitability of a company is greater than zero, then we define the company as healthy, whereas if its quarterly asset profitability is smaller than zero, then we define the company as in financial distress. Asset profitability refers to the total income before tax divided by the total assets, which is a common measure of the profitability of a company.
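The healthy/distress labeling rule just defined is simple enough to state in code; the treatment of a profitability of exactly zero is not specified in the text, so assigning it to the distress class here is an assumption, and the figures are invented:

```python
def label_quarter(income_before_tax, total_assets):
    # Asset profitability = total income before tax / total assets.
    profitability = income_before_tax / total_assets
    # > 0 -> healthy, < 0 -> distress; the boundary case of exactly zero
    # is not specified in the text, so it is treated as distress here.
    return "healthy" if profitability > 0 else "distress"

# Two invented quarterly observations for the same firm.
labels = [label_quarter(12.5, 480.0), label_quarter(-3.1, 480.0)]
```

Applied quarter by quarter, this rule produces the seasonal time series of class tags that the prediction model is trained on.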
This paper utilized nonlinear attribute selection to expedite the convergence of gene expression programming. In addition, the reasons for the design of the financial crisis prediction model include the following. (a) High-dimensional data cannot be handled with ease; for example, a financial statement includes a large number of financial ratio attributes. To overcome this problem, this study proposed a new integrated attribute selection method to rank the attributes, in order to reduce the data dimension, establish a nonlinear prediction model, and determine whether the financial status of a company is healthy. (b) The three methods of gene expression programming (GEP), genetic algorithm (GA), and genetic programming (GP) have extremely similar evolution steps; moreover, GEP includes the properties of GA and GP, and GEP is able to achieve the objective of using simple encoding to solve complicated problems. The traditional GA uses a fixed-length encoding method; consequently, it tends to yield poor results when dealing with complicated problems. GP uses a nonlinear and dynamic data structure for encoding, so it is able to handle complicated problems, but its operation is relatively complex. GEP combines the fixed-length chromosome encoding of GA with the individuals of different sizes and shapes of GP. In other words, the genotype is expressed using the method of GA, and the tree structure of GP is used for the expression part. According to Ferreira's 2006 work Gene Expression Programming, the evolution speed and effectiveness of GEP outperform those of GA and GP.
The procedure of the research model proposed in this paper is shown in Figure 1. Firstly, the dataset of financial statements of corporate operation status is collected from Taiwan Economic Journal (TEJ), and the financial ratio is calculated from collected attribute data. Secondly, the ten attribute selection methods are used to select attributes, and the integrated attribute selection method proposed is used to rank the ordering of attribute. Thirdly, the gene expression programming is utilized to perform training, respectively, and the most optimal timeseries financial crisis prediction model is established. Finally, the other prediction models are compared with the proposed model, and this study also compares the performance of the linear and nonlinear attribute selection methods.
Computational Steps. To make the proposed model easy to understand, it is split into four steps, introduced in detail below.
Step 1 (data preprocessing). For high-dimensional data, preprocessing is an extremely important step. In general, the financial data and statements of companies are incomplete, with missing values, interference values, or outliers. Therefore, this study collected the financial statements of the operating status of companies from the Taiwan Economic Journal, obtained the related attribute data that possibly affects the financial crisis of the companies, and assembled a dataset covering a five-year period with 54 variables and a total of 13,452 records. After the data is collected, the financial ratios are calculated, and the redundant attributes, missing values, and empty values are deleted in this preprocessing step.
Step 2 (attribute selection). The primary purpose of attribute selection is to select important and useful information from a vast dataset and to delete unnecessary or interfering attributes. Through attribute selection, fewer attributes can yield a more effective result. This step employed the 10 most commonly used linear and nonlinear attribute selection methods, respectively, to select attributes, and the attributes selected by each attribute selection method are normalized according to their importance level to rank their ordering, where the normalized score of the ith attribute reflects its importance level under each attribute selection method. In this paper, the nonlinear attribute selection methods used include the k-nearest neighbor method, support vector machine, multilayer perceptron, radial basis function network (RBF network), and rough set theory. The linear methods used include the chi-square, logistic regression, decision tree, linear discriminant analysis, and Naive Bayes methods. To integrate the results of different attribute selection methods, this study proposed a new integrated attribute selection method to rank the ordering of attributes and provide a flexible selection method. The steps of the integrated attribute selection are introduced as follows. Step 1: using the financial health/distress of a company as the dependent variable (class label) and the other 34 attributes as independent variables, the attribute importance of each attribute under the optimized linear and nonlinear attribute selection methods is calculated, respectively. Step 2: the importance levels under the above linear and nonlinear attribute selection methods are normalized, where the normalized score of the ith attribute reflects its importance level under each attribute selection method. Next, joins and disjoins of the selected attributes are formed under the linear and nonlinear attribute selection methods, respectively.
Here, the join is defined such that when any two attribute selection methods have a common attribute, that attribute is a member of the join set. Accordingly, joins and disjoins are generated for the linear attribute selection methods. Similarly, joins and disjoins are also generated for the nonlinear attribute selection methods.
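The normalization and join/disjoin operations above can be sketched as follows; the min-max normalization scheme and the attribute names are assumptions for illustration, since the paper does not show the exact formula:

```python
def normalize(scores):
    # Min-max normalize raw importance scores to [0, 1]; the exact
    # normalization used in the paper is not shown, so this is an assumption.
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {attr: 1.0 for attr in scores}
    return {attr: (s - lo) / (hi - lo) for attr, s in scores.items()}

def join_and_disjoin(selected_a, selected_b):
    # Join: attributes chosen by both methods;
    # disjoin: attributes chosen by exactly one of them.
    a, b = set(selected_a), set(selected_b)
    return a & b, a ^ b

# Invented importance scores from one selection method, plus the
# attribute subsets chosen by two different methods.
ranked = normalize({"ROA": 0.9, "ROE": 0.6, "NetIncome": 0.3})
join, disjoin = join_and_disjoin(["ROA", "ROE"], ["ROA", "NetIncome"])
```

Applying `join_and_disjoin` pairwise across the linear methods, and separately across the nonlinear ones, yields the join and disjoin attribute sets the integrated method feeds to the classifier.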
Step 3 (establishing the optimal time-series financial crisis prediction model by GEP). According to the linear and nonlinear attribute selection methods as well as the integrated attribute selection methods, there are a total of 14 sets of attributes with which to build time-series financial crisis prediction models using the GEP algorithm, respectively, such that the optimal time-series financial crisis prediction model is established. The basic computation steps of gene expression programming are listed as follows (the procedure is shown in Figure 2): (1) set the fitness function for correct classification; the fitness of any GEP individual i is defined as the number of companies classified correctly, where N refers to the total number of companies in the dataset, T_j refers to the target value for the current financial status of company j, and P_j(i) refers to the financial status of company j predicted by GEP individual i; (2) for the initial population, generate a fixed-length chromosome for each individual (candidate solution) randomly; (3) express each chromosome as an expression tree and evaluate the fitness of each individual; (4) perform reproduction and modification according to the fitness values, and select the best individuals; (5) repeat Steps (2)–(4) for a given number of generations.
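Step (1)'s fitness measure, a count of correctly classified companies, is a one-liner; the example labels below are invented:

```python
def gep_fitness(targets, predictions):
    # Number of companies whose predicted financial status matches the
    # target; the GEP loop maximizes this count over generations.
    return sum(t == p for t, p in zip(targets, predictions))

# Invented targets T_j and the predictions P_j(i) of one GEP individual.
score = gep_fitness(["healthy", "distress", "healthy", "distress"],
                    ["healthy", "distress", "distress", "distress"])
```

Here one of the four companies is misclassified, so the individual scores 3 out of a maximum fitness of 4.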
Step 4 (evaluation and comparison). This study employed the decision tree C4.5, MLP, and SVM data mining techniques to compare classification performance against the proposed model. For objectivity and standardization, this study utilized the three most common evaluation indices to measure the performance of a financial crisis model: Type I error, Type II error, and accuracy.
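The three evaluation indices can be computed from the prediction results as sketched below. Note that the mapping of Type I/Type II to the two misclassification directions varies across the bankruptcy-prediction literature; the convention assumed here (Type I = a distressed firm predicted healthy, Type II = a healthy firm predicted distressed) is one common choice, not necessarily the authors' exact definition.

```python
def evaluation_indices(y_true, y_pred, distress=1, healthy=0):
    """Return (accuracy, Type I error rate, Type II error rate).
    Assumed convention: Type I = distressed firm predicted healthy,
    Type II = healthy firm predicted distressed."""
    n = len(y_true)
    n_distress = sum(1 for t in y_true if t == distress)
    n_healthy = n - n_distress
    type1 = sum(1 for t, p in zip(y_true, y_pred)
                if t == distress and p == healthy) / n_distress
    type2 = sum(1 for t, p in zip(y_true, y_pred)
                if t == healthy and p == distress) / n_healthy
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / n
    return accuracy, type1, type2

# Toy example: two distressed and two healthy firms
acc, t1, t2 = evaluation_indices([1, 1, 0, 0], [1, 0, 0, 1])
```

Reporting Type I and Type II errors separately matters here because the dataset is imbalanced (far more healthy firms than distressed ones), so overall accuracy alone can mask poor detection of distressed firms.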
4. Experiment and Comparison
To verify the proposed method, the financial statements on company operation status published by the Taiwan Economic Journal (TEJ) were collected, and the related attribute data was obtained. The data collection period covers the quarterly statements from 2008/1 to 2013/3, with a total of 54 variables and 13,452 records. After preprocessing by deleting redundant attributes, missing values, and empty values, there are a total of 35 attributes (including the class) with 8,278 records, among which 6,196 records are healthy companies and 2,535 records are companies in financial distress. To verify performance, this study conducts the time-series experiment in sequential quarters. The total financial distress dataset covers 21 quarters for each company, and the experimental dataset is partitioned into the first 14 quarters as the training dataset and the last 7 quarters as the testing dataset. The variables and the number of records in the time-series dataset are shown in Table 3.
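The sequential-quarter partitioning described above can be sketched as follows. The record structure (a `quarter` key holding a year/quarter tuple) is a hypothetical representation for illustration; the point is that the split respects temporal order rather than shuffling records randomly, so the model is always tested on quarters later than those it was trained on.

```python
def split_by_quarter(records, n_train_quarters=14):
    """Partition time-stamped records into a training set (the first
    n_train_quarters quarters) and a testing set (the remaining
    quarters), preserving temporal order."""
    quarters = sorted({r["quarter"] for r in records})
    train_q = set(quarters[:n_train_quarters])
    train = [r for r in records if r["quarter"] in train_q]
    test = [r for r in records if r["quarter"] not in train_q]
    return train, test

# Hypothetical toy records keyed by (year, quarter) tuples
records = [{"quarter": (2008, q), "x": q} for q in range(1, 5)]
train, test = split_by_quarter(records, n_train_quarters=2)
```

For the actual dataset, the same call with the 21 observed quarters and `n_train_quarters=14` yields the 14-quarter training / 7-quarter testing partition used in the experiments.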

The term definitions of the 35 attributes and the financial ratio formulas are shown in Table 4 [25]. Based on the computational steps proposed in Section 3, the results of the 14 attribute selection methods are shown in Table 5. For evaluation, this study combines the attributes selected by each of the 14 attribute selection methods with six classifiers, respectively; the results for the three evaluation indices (Type I error, Type II error, and classification accuracy) are listed in Table 6. Based on Tables 5 and 6, the findings are as follows:
(1) Attribute selection: from Table 5, three attributes appear more than 10 times across the 14 attribute selection methods and are thus the highest-frequency attributes. These three financial ratios are ROA(A)EBI%, Return on Equity%A, and Net Income%.
(2) Accuracy: combining the attributes selected by each of the 14 attribute selection methods with the six classifiers, the results in Table 6 show that the accuracy of the GEP classifier is better than that of the other listed classifiers. Among the attribute selection methods, the decision tree method yields better accuracy than the others. Accordingly, the proposed model, which combines the decision tree attribute selection method with the GEP classifier, has the highest accuracy.
(3) Type I error: among the six classifiers, MLP has the best Type I error performance, and the RBF network attribute selection method performs better than the other listed methods; therefore, the combination of the RBF network attribute selection method and the MLP classifier is optimal for Type I error.
(4) Type II error: the GEP classifier has the best Type II error performance, and the logistic attribute selection method performs better than the other methods; therefore, the combination of the logistic attribute selection method and the GEP classifier is optimal for Type II error.
(5) Synthetic evaluation across the three indices: as Table 6 shows, GEP is the best classifier overall, and the linear attribute selection methods perform better than the nonlinear ones, which indicates that the proposed model is advantageous. In addition, when the three evaluation indices are averaged over the four join and disjoin attribute selection sets (linear and nonlinear, respectively), the linear join attribute selection shows better performance for accuracy and Type I error (Table 6).
Finally, a key advantage of the GEP algorithm is that its computational results can be expressed both as a tree structure and as an equation, which makes the financial distress prediction model easy to understand; this study therefore presents both formats in this section. After GEP computation, only five attributes (attributes 1, 12, 22, 33, and 34) and their related coefficients are employed in the financial distress prediction model. The tree structure of the proposed model is shown in Figure 3. The equation of the financial distress prediction model expresses the GEP model class as a set of nested conditional (if-then-else) operators over the selected attribute values, with the final output rounded at a threshold to yield the healthy/distress classification.
5. Conclusion
This study has presented the results of the bankruptcy prediction model in Section 4. Compared with the shortcomings of previous research noted in the Introduction, this study has overcome several of them: (1) this study adopted objectively selected indices rather than subjectively selected ones; (2) the variables need not follow the relevant statistical assumptions; (3) most past studies used a linear attribute selection method, or no attribute selection, to build the prediction model, whereas this research uses nonlinear attribute selection methods; and (4) this research has proposed a novel seasonal time-series model based on gene expression programming for predicting the financial distress of companies. The experimental results indicate that the proposed model performs better than the other listed classifiers, and that linear attribute selection methods perform better than nonlinear ones; this means the proposed method has relative advantages in predicting the financial distress of companies. Since financial statements are quarterly statements, the attribute data affecting the financial crisis of companies is nonlinear and nonstationary time-series data with fluctuations. The results indicate that the prediction outcome of the artificial intelligence GEP model proposed in this research is relatively stable in terms of accuracy, Type I error, and Type II error.
Conflicts of Interest
There are no conflicts of interest related to this paper.
References
[1] P. Dayan and K. C. Berridge, "Model-based and model-free Pavlovian reward learning: revaluation, revision, and revelation," Cognitive, Affective & Behavioral Neuroscience, vol. 14, no. 2, pp. 473–492, 2014.
[2] T. Korol, "Early warning models against bankruptcy risk for Central European and Latin American enterprises," Economic Modelling, vol. 31, no. 1, pp. 22–30, 2013.
[3] C. Ferreira, "Gene expression programming: a new adaptive algorithm for solving problems," Complex Systems, vol. 13, no. 2, pp. 87–129, 2001.
[4] W. H. Beaver, "Financial ratios as predictors of failure," Journal of Accounting Research, vol. 4, pp. 71–111, 1966.
[5] E. B. Deakin, "A discriminant analysis of predictors of business failure," Journal of Accounting Research, vol. 10, no. 1, pp. 167–179, 1972.
[6] D. R. Carmichael, The Auditor's Reporting Obligation: The Meaning and Implementation of the Fourth Standard of Reporting, American Institute of Certified Public Accountants, 1972.
[7] G. Foster, Financial Statement Analysis, Pearson Education India, 1978.
[8] R. C. Morris, Early Warning Indicators of Corporate Failure: A Critical Review of Previous Research and Further Empirical Evidence, Ashgate, 1997.
[9] J. R. Quinlan, C4.5: Programs for Machine Learning, Elsevier, 2014.
[10] E. I. Altman, "Financial ratios, discriminant analysis and the prediction of corporate bankruptcy," The Journal of Finance, vol. 23, no. 4, pp. 589–609, 1968.
[11] J. A. Ohlson, "Financial ratios and the probabilistic prediction of bankruptcy," Journal of Accounting Research, vol. 18, no. 1, pp. 109–131, 1980.
[12] J. Sun, H. Li, Q.-H. Huang, and K.-Y. He, "Predicting financial distress and corporate failure: a review from the state-of-the-art definitions, modeling, sampling, and featuring approaches," Knowledge-Based Systems, vol. 57, pp. 41–56, 2014.
[13] P. Mitra, C. A. Murthy, and S. K. Pal, "Unsupervised feature selection using feature similarity," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 3, pp. 301–312, 2002.
[14] A. L. Blum and P. Langley, "Selection of relevant features and examples in machine learning," Artificial Intelligence, vol. 97, no. 1-2, pp. 245–271, 1997.
[15] M. Dash, K. Choi, P. Scheuermann, and H. Liu, "Feature selection for clustering: a filter solution," in Proceedings of the 2nd IEEE International Conference on Data Mining (ICDM '02), pp. 115–122, December 2002.
[16] H. Liu and L. Yu, "Toward integrating feature selection algorithms for classification and clustering," IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 4, pp. 491–502, 2005.
[17] H. Frydman, E. I. Altman, and D. Kao, "Introducing recursive partitioning for financial classification: the case of financial distress," The Journal of Finance, vol. 40, no. 1, pp. 269–291, 1985.
[18] C. E. Mossman, G. G. Bell, L. M. Swartz, and H. Turtle, "An empirical comparison of bankruptcy models," The Financial Review, vol. 33, no. 2, pp. 35–54, 1998.
[19] A. F. Atiya, "Bankruptcy prediction for credit risk using neural networks: a survey and new results," IEEE Transactions on Neural Networks and Learning Systems, vol. 12, no. 4, pp. 929–935, 2001.
[20] M.-Y. Chen, "Predicting corporate financial distress based on integration of decision tree classification and logistic regression," Expert Systems with Applications, vol. 38, no. 9, pp. 11261–11272, 2011.
[21] H. Li, Y.-C. Lee, Y.-C. Zhou, and J. Sun, "The random subspace binary logit (RSBL) model for bankruptcy prediction," Knowledge-Based Systems, vol. 24, no. 8, pp. 1380–1388, 2011.
[22] R. Geng, I. Bose, and X. Chen, "Prediction of financial distress: an empirical study of listed Chinese companies using data mining," European Journal of Operational Research, vol. 241, no. 1, pp. 236–247, 2015.
[23] D. Liang, C. Tsai, and H. Wu, "The effect of feature selection on financial distress prediction," Knowledge-Based Systems, vol. 73, pp. 289–297, 2015.
[24] W.-W. Wu, "Beyond business failure prediction," Expert Systems with Applications, vol. 37, no. 3, pp. 2371–2376, 2010.
[25] C.-H. Cheng and S.-H. Wang, "A quarterly time-series classifier based on a reduced-dimension generated rules method for identifying financial distress," Quantitative Finance, vol. 15, no. 12, pp. 1979–1994, 2015.
[26] J. Han and M. Kamber, Data Mining: Concepts and Techniques, Morgan Kaufmann, San Francisco, CA, 2001.
[27] J. R. Quinlan, "Induction of decision trees," Machine Learning, vol. 1, no. 1, pp. 81–106, 1986.
[28] B. Mak and T. Munakata, "Rule extraction from expert heuristics: a comparative study of rough sets with neural networks and ID3," European Journal of Operational Research, vol. 136, no. 1, pp. 212–229, 2002.
[29] A. Ozturk and A. Arslan, "Classification of transcranial Doppler signals using their chaotic invariant measures," Computer Methods and Programs in Biomedicine, vol. 86, no. 2, pp. 171–180, 2007.
[30] C. Cortes and V. Vapnik, "Support-vector networks," Machine Learning, vol. 20, no. 3, pp. 273–297, 1995.
[31] L. Auria and R. A. Moro, "Support vector machines (SVM) as a technique for solvency analysis," SSRN Electronic Journal, 2008.
[32] L. Almeida, "Multilayer perceptrons," in Handbook of Neural Computation, pp. C1.2:1–C1.2:30, IOP Publishing Ltd. and Oxford University Press, 1997.
[33] S. Greco, B. Matarazzo, and R. Slowinski, "Rough sets theory for multicriteria decision analysis," European Journal of Operational Research, vol. 129, no. 1, pp. 1–47, 2001.
[34] Z. Pawlak and A. S. Pawlak, "Modification of iodometric determination of total and reactive sulfide in environmental samples," Talanta, vol. 48, no. 2, pp. 347–353, 1999.
[35] S. Tsumoto, "Automated extraction of medical expert system rules from clinical databases based on rough set theory," Information Sciences, vol. 112, no. 1–4, pp. 67–84, 1998.
[36] H.-C. Chou, C.-H. Cheng, and J.-R. Chang, "Extracting drug utilization knowledge using self-organizing map and rough set theory," Expert Systems with Applications, vol. 33, no. 2, pp. 499–508, 2007.
[37] H. J. Teoh, C.-H. Cheng, H.-H. Chu, and J.-S. Chen, "Fuzzy time series model based on probabilistic approach and rough set rule induction for empirical research in stock markets," Data & Knowledge Engineering, vol. 67, no. 1, pp. 103–117, 2008.
[38] A. I. Dimitras, R. Slowinski, R. Susmaga, and C. Zopounidis, "Business failure prediction using rough sets," European Journal of Operational Research, vol. 114, no. 2, pp. 263–280, 1999.
[39] C.-H. Cheng and Y.-S. Chen, "Classifying the segmentation of customer value via RFM model and RS theory," Expert Systems with Applications, vol. 36, no. 3, pp. 4176–4184, 2009.
[40] P. M. Briggen, B. Blocken, and H. L. Schellen, "Wind-driven rain on the facade of a monumental tower: numerical simulation, full-scale validation and sensitivity analysis," Building and Environment, vol. 44, no. 8, pp. 1675–1690, 2009.
[41] D. S. Broomhead and D. Lowe, "Radial basis functions, multivariable functional interpolation and adaptive networks," DTIC Document, 1988.
[42] N. S. Altman, "An introduction to kernel and nearest-neighbor nonparametric regression," The American Statistician, vol. 46, no. 3, pp. 175–185, 1992.
[43] A. E. Eiben and J. E. Smith, Introduction to Evolutionary Computing, Natural Computing Series, Springer, Berlin, Germany, 1st edition, 2003.
[44] J. H. Holland, Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence, University of Michigan Press, Oxford, UK, 1975.
[45] Y. Chen, L. L. Zhang, and L. Zhang, "Financial distress prediction for Chinese listed manufacturing companies," Procedia Computer Science, vol. 17, pp. 678–686, 2013.
[46] E. J. McMillan, Not-for-Profit Budgeting and Financial Management, John Wiley & Sons, 2010.
Copyright
Copyright © 2018 ChingHsue Cheng et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.