Abstract

Techniques for constructing high-order fuzzy time series models fall into three kinds, based on advanced algorithms, computational methods, and grouping the fuzzy logical relationships, respectively. The last kind of model has been widely applied and researched because it is easy for decision makers to understand. To improve the fuzzy time series forecasting model, this paper presents a novel high-order fuzzy time series model, denoted as GTS(M,N), on the basis of generalized fuzzy logical relationships. Firstly, the paper introduces some concepts of the generalized fuzzy logical relationship and an operation for combining the generalized relationships. Then, the proposed model is implemented in forecasting enrollments of the University of Alabama. As an example of in-depth research, the proposed approach is also applied to forecast the close price of the Shanghai Stock Exchange Composite Index. Finally, the effects of the number of orders and hierarchies of fuzzy logical relationships on the forecasting results are discussed.

1. Introduction

In the last two decades, the fuzzy time series approach [1–3] has been widely used for its power in dealing with imprecise knowledge variables in decision making. Many studies have proposed new methods or improved the accuracy of fuzzy time series forecasting. To simplify the computational process, Chen [4] improved Song’s methods and presented a simplified forecasting model in 1996. Since the lengths of intervals greatly affect forecasting accuracy in fuzzy time series, Yu and many others [5–10] adjusted the lengths of intervals by distribution-based or optimization techniques. To obtain higher forecasting accuracy, weighted models that account for recurrence and chronological order have also been developed [11–15]. In addition, many models based on the conventional fuzzy time series were combined with novel algorithms or technologies. For example, Singh [16–18] proposed some methods to forecast crop production based on a computational method with different parameters. Lee et al. [19–22] presented several models based on fuzzy time series, genetic algorithms, simulated annealing, and type-2 fuzzy sets to forecast temperature and TAIFEX. Kuo [23–25] first introduced particle swarm optimization (PSO) into fuzzy time series models for forecasting TAIFEX. Song’s [3] and Aladag’s models [26, 27] gained more accurate forecasts by employing artificial neural networks to determine fuzzy relationships.

Although the first-order fuzzy time series models have a simple structure, they struggle to explain more complex relationships. The first-order models are also unable to meet the demands of multifactor or long-term time series forecasting. Compared with alternative forecasting models, such as ARIMA, Hidden Markov, and ARCH models, there is still much room for higher forecasting accuracy in applying fuzzy time series models. For these reasons, Chen et al. [28–32] proposed some new methods which applied a high-order fuzzy time series model to forecast enrolments. Aladag et al. [9, 26] introduced a high-order model based on a feed-forward neural network. Lee et al. [20, 33] also presented some high-order models based on two-factor and genetic-simulated annealing techniques. Many other time series researchers [18, 22, 34–37] have also shown interest in high-order fuzzy time series forecasting models.

In the process of forecasting with fuzzy time series models, the Fuzzy Logical Relationship (FLR) is one of the most critical factors that influence the forecasting accuracy. To obtain high forecasting accuracy, a lot of effort has been put into mining the FLRs from fuzzy time series. In terms of the techniques for partitioning the universe of discourse and constructing the fuzzy logical relationships, the above high-order models fall into three classes. The first class mines the FLRs by applying advanced algorithms or theories such as genetic algorithms, rough sets, neural networks, type-2 fuzzy sets, and simulated annealing [20, 22, 26, 27, 30, 32, 34, 35]. The second class is represented by Singh [16–18], whose models are based on a computational method with different parameters. The third class consists of models based on grouping the FLRs, represented by [9, 28, 29, 31, 33, 36, 37]. In general, the first kind of hybrid model can achieve higher forecasting accuracy than the other two. However, the forecasting process of these algorithms is not easy to understand. Unlike fuzzy set theory, their procedures and forecasts are neither understandable nor accountable for most decision makers. Although the second kind of model has been applied to real-life problems of crop and rice production as well as to enrolment forecasts, these models have little, if anything, to do with FLRs in their forecasting procedures. They obtain high forecasting accuracy by dividing the intervals to localize the forecast values accurately. In the third kind of model, the procedures of mining FLRs and the forecasting principles are based solely on the sets of FLRs. The forecasting procedure and principles are clear to fuzzy time series researchers and easy for decision makers to understand.

For these reasons, this paper proposes a high-order fuzzy time series model based on generalized fuzzy logical relationships [38]. The process of creating the relationship matrices and finding the patterns of time series fluctuations is carried out on the basis of understandable fuzzy rules. Of the above three kinds of models, the proposed model belongs to the third. Hwang’s [28] and Chen’s models [29, 31] are chosen as the counterparts for comparing single-factor forecasting results with a fixed interval length, for three reasons. The first is that the models of Chen [29] and Li [36] are similar in finding the most appropriate forecasting principle with state-transition analysis and a backtracking scheme. The second is that the models of Li et al. [36] and Lee et al. [33] aim at multifactor forecasting problems, and the last is that the models in [9, 37] are improved by finding an optimal interval length. Two data sets were used for the empirical analysis: the enrolments of the University of Alabama and the close price of the Shanghai Stock Exchange Composite Index (SSECI). In terms of the three evaluation criteria, the root mean squared error, mean absolute error, and mean absolute percentage error, the proposed method gets more satisfactory forecasts than the counterparts.

The rest of this paper is organized as follows. In Section 2, we briefly review the concepts of fuzzy time series. In Section 3, a new model based on high-order generalized fuzzy logical relationships is presented and applied to forecasting enrolments. In Section 4, we compare the average forecasting accuracy rates of the proposed method with the methods presented in [28, 29, 31]. The effects of parameters on forecasting accuracy are also discussed in this section. Conclusions and future work are given in Section 5.

2. Preliminaries

To make our exposition self-contained, this section reviews some definitions and the framework of fuzzy time series forecasting models. The framework [13] is summarized first, followed by some related definitions of the generalized fuzzy logical relationship.

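Definition 1 (see [39, 40]). A fuzzy set $A$ of the universe of discourse $U$, $U = \{u_1, u_2, \ldots, u_n\}$, is defined as follows:

$$A = \frac{f_A(u_1)}{u_1} + \frac{f_A(u_2)}{u_2} + \cdots + \frac{f_A(u_n)}{u_n}, \quad (1)$$

where $f_A$ is the membership function of the fuzzy set $A$, $f_A : U \to [0, 1]$, and $f_A(u_i)$ denotes the membership degree of $u_i$ in the fuzzy set $A$, $1 \le i \le n$.

Definition 2. Let $Y(t)$ $(t = \ldots, 0, 1, 2, \ldots)$, a subset of $\mathbb{R}$, be the universe of discourse in which fuzzy sets $f_i(t)$ $(i = 1, 2, \ldots)$ are defined. Let $F(t)$ be a collection of $f_i(t)$. Then, $F(t)$ is called a fuzzy time series on $Y(t)$.

Definition 3. Let $F(t)$ be a fuzzy time series. If $F(t)$ is caused by $F(t-1), F(t-2), \ldots, F(t-n)$, then the fuzzy logical relationship is represented by $F(t-n), \ldots, F(t-2), F(t-1) \to F(t)$ and it is called the $n$th-order fuzzy time series forecasting model.

Definition 4. Let $F(t-1) = A_i$ and $F(t) = A_j$. The relationship between two consecutive observations, $F(t-1)$ and $F(t)$, is referred to as a fuzzy logical relationship (FLR) and denoted by $A_i \to A_j$, where $A_i$ is called the left-hand side (LHS) and $A_j$ the right-hand side (RHS) of the FLR.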

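Definition 5. Let $F(t) = (\mu_1(t), \mu_2(t), \ldots, \mu_n(t))$, where $\mu_i(t)$ is the membership degree of the observation at time $t$ in $A_i$. If $\mu_i(t-m)$ and $\mu_j(t)$ are the maximum values of $F(t-m)$ and $F(t)$, respectively, then let $A_i \to A_j$ be called the $m$-order first principal fuzzy relationship, noted as $A_i \xrightarrow{(m,1)} A_j$ (generalized fuzzy logical relationship). If $\mu_i(t-m)$ is the $k$th maximum value of $F(t-m)$, then let $A_i \to A_j$ be called the $m$th-order $k$th-principal fuzzy logical relationship, noted as $A_i \xrightarrow{(m,k)} A_j$.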

From Definition 5, the fuzzy logical relationship is more general than that of the conventional fuzzy time series model. In fact, the logical relationship reduces to that of the conventional models when $m = k = 1$, and the forecasting rules are obtained from grouping these relationships. We therefore name it the generalized fuzzy logical relationship. According to Definition 4, all fuzzy logical relationships in the training data set can be further grouped together into different fuzzy logical relationship groups (FLRGs) according to their common left-hand sides. For given $m$ and $k$, the fuzzy logical relationships can be grouped into a matrix denoted as $R^{(m,k)}$ with the grouping method proposed by Lee et al. [13]. Here, $r^{(m,k)}_{ij}$, the element of matrix $R^{(m,k)}$, is the number of $m$th-order $k$th-principal fuzzy logical relationships from $A_i$ to $A_j$.

Then, there are $M \times N$ fuzzy logical relationship matrices for a given training data set, where $M$ is the number of orders and $N$ the number of principal hierarchies. To forecast time series with these generalized fuzzy logical relationship matrices, we define the intersection operation as follows.

Definition 6. Let represent the LHSs of FLRG in the -orderth-principal fuzzy logical relationship at time . Let be the number of FLRG in the th-order th-principal fuzzy logical relationship,. To compute the logical relationships between FLRGs, the intersection operator is defined as

Based on the above definitions, this paper presents a high-order fuzzy time series model in the following section.

3. Proposed Model

3.1. Procedure of GTS(M,N)

In this section, we present a new forecasting method based on high-order and generalized fuzzy logical relationships. Since the proposed model is related to the number of orders, denoted by $M$, and the number of hierarchies of the principal fuzzy relationship, denoted by $N$, we name the proposed model GTS(M,N). In other words, GTS(M,N) means an $M$-order fuzzy time series model based on $N$-principal fuzzy logical relationships.

Step 1. Define the universe of discourse and the intervals for rule extraction. The universe of discourse can be defined as $U$ = [starting, ending]. Using intervals of equal length, $U$ is partitioned into $n$ intervals $u_1, u_2, \ldots, u_n$. For each interval $u_i$, $m_i$ is the midpoint of $u_i$, whose corresponding fuzzy set is $A_i$.
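As a minimal sketch of Step 1, the equal-length partition and its midpoints can be computed as below. The helper name `partition_universe` is hypothetical, and the bounds 13000 and 20000 are an assumption chosen to agree with the seven midpoints 13500 to 19500 and the interval length 1000 used for the enrolment example.

```python
def partition_universe(start, end, n_intervals):
    """Split [start, end] into n_intervals equal intervals.

    Returns the list of (lower, upper) bounds and the list of midpoints.
    """
    length = (end - start) / n_intervals
    intervals = [(start + i * length, start + (i + 1) * length)
                 for i in range(n_intervals)]
    midpoints = [(lower + upper) / 2 for lower, upper in intervals]
    return intervals, midpoints


# Assumed bounds for the enrolment example (not stated explicitly in the text).
intervals, midpoints = partition_universe(13000, 20000, 7)
print(midpoints)
```

Each midpoint then serves as the representative value of its fuzzy set in the later steps.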

Step 2. Define fuzzy sets based on the universe of discourse and fuzzify the historical data. The fuzzy set $A_i$ would be expressed as $A_i = f_{A_i}(u_1)/u_1 + f_{A_i}(u_2)/u_2 + \cdots + f_{A_i}(u_n)/u_n$, where $f_{A_i}(u_j) \in [0, 1]$ indicates the membership degree of $u_j$ in $A_i$. The historical and observed data are fuzzified according to the definition of fuzzy sets. For example, a datum is fuzzified to $A_j$ when the maximal membership degree of the datum is the $j$th number. In other words, if $\mu_j(t) = \max_i \mu_i(t)$, then the datum at time $t$ should be classified into the $j$th class. In this paper, the fuzzy sets are defined with the triangular fuzzy function shown in formula (3).
The membership degree of the value at time $t$ in $A_i$ is defined by formula (4):

$$\mu_i(t) = \max\left\{0,\; 1 - \frac{|x(t) - m_i|}{2l}\right\}, \quad (4)$$

where $x(t)$ is the observed value at time $t$, $m_i$ is the midpoint of $u_i$, and $l$ is the length of the interval.
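The fuzzification in Step 2 can be sketched as follows. The membership rule used here, $\max\{0, 1 - |x - m_i| / (2l)\}$, is an assumption inferred from the membership vectors quoted in the worked example of Section 3.2, for instance (0.02, 0.52, 0.98, 0.48, 0, 0, 0) for 1975. The value 15460 for the 1975 enrolment is the commonly cited figure for this data set and is assumed to match Table 1. The function names are hypothetical.

```python
def triangular_memberships(x, midpoints, interval_length):
    """Membership degree of observation x in each fuzzy set A_i.

    Assumed rule: a triangle centred on the midpoint m_i whose half-base
    is twice the interval length, clipped at zero.
    """
    half_base = 2 * interval_length
    return [max(0.0, 1.0 - abs(x - m) / half_base) for m in midpoints]


def fuzzify(x, midpoints, interval_length):
    """Return the 1-based index j of the fuzzy set A_j with maximal membership."""
    degrees = triangular_memberships(x, midpoints, interval_length)
    return max(range(len(degrees)), key=degrees.__getitem__) + 1


MIDPOINTS = [13500, 14500, 15500, 16500, 17500, 18500, 19500]

# 15460 is assumed to be the 1975 enrolment listed in Table 1.
degrees_1975 = [round(d, 4) for d in triangular_memberships(15460, MIDPOINTS, 1000)]
print(degrees_1975)
print(fuzzify(15460, MIDPOINTS, 1000))
```

Under these assumptions the 1975 observation is classified into the third class, in line with the statement that its two largest memberships are the third and the second.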

Step 3. Establish the fuzzy logical relationships based on the orders and hierarchies of principal fuzzy logical relationship. Given the sample data set and the definition of fuzzy sets, all fuzzy logical relationships between two consecutive data can be created. To forecast the time series, the fuzzy logical relationship matrix must be created in this step based on the fuzzy logical relationships.

Among many different methods, the method proposed by Lee in [13] is chosen in this paper. For example, the fuzzy logical relationships of a GTS(M,N) model can be grouped into $M \times N$ relationship matrices denoted by $R^{(m,k)}$, $m = 1, \ldots, M$, $k = 1, \ldots, N$.
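One way to read Step 3 is sketched below: for order m and principal k, the matrix entry (i, j) counts the times t at which the k-th largest membership of the observation at t − m belongs to A_i and the largest membership of the observation at t belongs to A_j. This counting rule is an inference from the matrix rows quoted in Section 3.2, not a statement of the authors' exact construction, and it may not coincide with theirs in every entry. The enrolment figures are the commonly cited University of Alabama series for 1971 to 1992, assumed to match Table 1; the membership rule and all names are likewise assumptions.

```python
ENROLMENTS = [13055, 13563, 13867, 14696, 15460, 15311, 15603, 15861,
              16807, 16919, 16388, 15433, 15497, 15145, 15163, 15984,
              16859, 18150, 18970, 19328, 19337, 18876]  # 1971..1992 (assumed)
MIDPOINTS = [13500 + 1000 * i for i in range(7)]
INTERVAL_LENGTH = 1000


def ranked_sets(x):
    """0-based indices of the fuzzy sets, sorted by decreasing membership of x."""
    degrees = [max(0.0, 1.0 - abs(x - m) / (2 * INTERVAL_LENGTH))
               for m in MIDPOINTS]
    return sorted(range(len(degrees)), key=lambda i: -degrees[i])


def relationship_matrix(series, order, principal):
    """Count relationships from the principal-th ranked set at t - order
    to the top-ranked set at t (assumed reading of the grouping step)."""
    ranks = [ranked_sets(x) for x in series]
    size = len(MIDPOINTS)
    matrix = [[0] * size for _ in range(size)]
    for t in range(order, len(series)):
        lhs = ranks[t - order][principal - 1]
        rhs = ranks[t][0]
        matrix[lhs][rhs] += 1
    return matrix


r11 = relationship_matrix(ENROLMENTS, order=1, principal=1)
r12 = relationship_matrix(ENROLMENTS, order=1, principal=2)
print(r11[2])  # third row, quoted in the text as (0, 0, 7, 2, 0, 0, 0)
print(r12[1])  # second row, quoted in the text as (2, 1, 6, 0, 0, 0, 0)
```

With order 1 and principal 1 this reduces to the conventional first-order grouping, so the same helper covers both the conventional and the generalized case.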

Step 4. Forecasting model. Let , and let be the $k$th max membership degree, respectively. By Definition 6, we have the intersection fuzzy logical relationship , and then the $m$th forecast is produced by formula (5), where “” is a composition operation for forecasting with the following principles: (1) if the sum of equals 0, then the forecasted value is , the midpoint of the interval corresponding to ; (2) otherwise, the forecasted value is the weighted aggregate of .
Then, for a given $M$, there are $M$ forecasts for time $t$. The conclusive forecasting value for time $t$ can be obtained by the following formula: where is the adjustment parameter for the $m$th forecast; the parameter can also be obtained by minimizing the RMSE or other evaluation criteria on the training data set.
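The two forecasting principles of Step 4 can be sketched as below. Three points are assumptions rather than facts from the text: the intersection of Definition 6 is taken to be the element-wise minimum of the selected matrix rows, the fallback interval in principle (1) is passed in by the caller because the text does not make it recoverable, and the final combination is written as a plain weighted sum of the per-order forecasts. All function names are hypothetical.

```python
MIDPOINTS = [13500, 14500, 15500, 16500, 17500, 18500, 19500]


def intersect(rows):
    """Element-wise minimum of matrix rows (assumed form of the intersection)."""
    return [min(values) for values in zip(*rows)]


def compose(weights, midpoints, fallback_index):
    """Apply the two principles to one intersected row.

    (1) If all weights are zero, return the midpoint at fallback_index.
    (2) Otherwise return the weighted aggregate of the midpoints.
    """
    total = sum(weights)
    if total == 0:
        return midpoints[fallback_index]
    return sum(w * m for w, m in zip(weights, midpoints)) / total


def combine(forecasts, adjustments):
    """Weighted sum of the per-order forecasts (assumed form of the final step)."""
    return sum(a * f for a, f in zip(adjustments, forecasts))


# Rows quoted in Section 3.2 for the first-order forecast of 1976.
row = intersect([[0, 0, 7, 2, 0, 0, 0], [2, 1, 6, 0, 0, 0, 0]])
first_order = compose(row, MIDPOINTS, fallback_index=2)
print(row, first_order)
```

For the quoted first-order rows, only the third position survives the intersection, so the weighted aggregate coincides with the midpoint 15500 reported in the worked example.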

3.2. Computation of GTS(M,N) on Forecasting Enrollments

Since most of the conventional models have been presented for forecasting the historical enrolments of the University of Alabama, in this section, we present the stepwise procedure of the proposed method for forecasting this time series with $M = 3$ and $N = 2$. The historical enrolments of the University of Alabama from 1971 to 1992 are shown in Table 1.

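Step 1. Partition the universe of discourse $U = [13000, 20000]$ into seven intervals $u_1, u_2, \ldots, u_7$, where $u_1 = [13000, 14000]$, $u_2 = [14000, 15000]$, $u_3 = [15000, 16000]$, $u_4 = [16000, 17000]$, $u_5 = [17000, 18000]$, $u_6 = [18000, 19000]$, and $u_7 = [19000, 20000]$; their midpoints are 13500, 14500, 15500, 16500, 17500, 18500, and 19500, respectively.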

Step 2. Let $A_1$ = “not many,” $A_2$ = “not too many,” $A_3$ = “many,” $A_4$ = “many many,” $A_5$ = “very many,” $A_6$ = “too many,” and $A_7$ = “too many many.” $A_1, A_2, \ldots, A_7$ are fuzzy sets corresponding to linguistic values for “enrolments.”

With the triangular-shaped membership function defined by formula (4), the fuzzy sets of all observations are obtained by formula (3) and listed in the last column of Table 1. Thus, the fuzzy logical relationships corresponding to given $m$ and $k$ are listed in Table 2.

Step 3. Divide the derived fuzzy logical relationships into groups based on the states of the enrollments of fuzzy logical relationships.

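In this paper, we use the method of Lee [13] to construct the fuzzy logical relationship matrix. For example, let $m = 1$ and $k = 1$. The set of fuzzy logical relationships is listed as $A_1 \to A_1$, $A_1 \to A_1$, $A_1 \to A_2$, $A_2 \to A_3$, $A_3 \to A_3$, $A_3 \to A_3$, $A_3 \to A_3$, $A_3 \to A_4$, $A_4 \to A_4$, $A_4 \to A_4$, $A_4 \to A_3$, $A_3 \to A_3$, $A_3 \to A_3$, $A_3 \to A_3$, $A_3 \to A_3$, $A_3 \to A_4$, $A_4 \to A_6$, $A_6 \to A_6$, $A_6 \to A_7$, $A_7 \to A_7$, $A_7 \to A_6$, which are all of the FLRs in the third column of Table 2.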

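They are then grouped and weighted by the recurrent fuzzy relationships as follows:
Group 1: $A_1 \to A_1$ with weight 2, $A_1 \to A_2$ with weight 1;
Group 2: $A_2 \to A_3$ with weight 1;
Group 3: $A_3 \to A_3$ with weight 7, $A_3 \to A_4$ with weight 2;
Group 4: $A_4 \to A_4$ with weight 2, $A_4 \to A_3$ with weight 1, $A_4 \to A_6$ with weight 1;
Group 5: $A_6 \to A_6$ with weight 1, $A_6 \to A_7$ with weight 1;
Group 6: $A_7 \to A_7$ with weight 1, $A_7 \to A_6$ with weight 1.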
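The grouping and weighting above can be reproduced with a short sketch. The label sequence below (the index of the fuzzy set with maximal membership for each year from 1971 to 1992) is an assumption intended to match the last column of Table 1; the variable names are hypothetical.

```python
from collections import Counter

# Assumed fuzzified enrolments for 1971..1992 (index j of the fuzzy set A_j).
labels = [1, 1, 1, 2, 3, 3, 3, 3, 4, 4, 4,
          3, 3, 3, 3, 3, 4, 6, 6, 7, 7, 6]

# First-order FLRs: consecutive pairs (LHS, RHS).
flrs = list(zip(labels, labels[1:]))

# Group by left-hand side; the weight is the number of recurrences.
groups = {}
for (lhs, rhs), weight in Counter(flrs).items():
    groups.setdefault(lhs, {})[rhs] = weight

for lhs in sorted(groups):
    print(f"A{lhs}:", {f"A{rhs}": w for rhs, w in groups[lhs].items()})
```

No group appears for the fifth fuzzy set because, under the assumed labels, no observation is classified into it, which is why the six groups above skip from the fourth to the sixth fuzzy set.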

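The fuzzy logical relationship matrix then is

$$R^{(1,1)} = \begin{pmatrix} 2 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 7 & 2 & 0 & 0 & 0 \\ 0 & 0 & 1 & 2 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 & 1 & 1 \end{pmatrix}.$$

Based on the fourth, fifth, sixth, seventh, and eighth columns of Table 2, $R^{(1,2)}$, $R^{(2,1)}$, $R^{(2,2)}$, $R^{(3,1)}$, and $R^{(3,2)}$ can also be obtained in a similar way. These six fuzzy logical relationship matrices are listed in (7), (8), and (9).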

Step 4. Calculate the forecasts. To carry out the calculations by the proposed principles, we illustrate the forecasting process for the enrolment of 1976 as follows. From Table 1, the fuzzy set of the enrolment of 1975 is (0.02, 0.52, 0.98, 0.48, 0, 0, 0). The two maximum membership attributes are the third and the second. Then, with $m = 1$, the third row of $R^{(1,1)}$ is (0, 0, 7, 2, 0, 0, 0) and the second row of $R^{(1,2)}$ is (2, 1, 6, 0, 0, 0, 0) from (7). Definition 6 then gives the intersection of these two rows.

According to the second principle listed in Section 3.1, the first-order forecast is then $m_3$, that is, 15500, which is the midpoint of $u_3$.

When $m = 2$, the fuzzy set of 1974 is (0.402, 0.902, 0.598, 0.098, 0, 0, 0), as shown in Table 1. The two maximum membership attributes are the second and the third. The second row of $R^{(2,1)}$ then is (0, 0, 1, 0, 0, 0, 0) and the third row of $R^{(2,2)}$ is (0, 0, 2, 0, 0, 0, 0) from (8). Definition 6 again gives their intersection. By the first principle listed in Section 3.1, the second-order forecast also is 15500.

In a similar way, from Table 1 and with $m = 3$, we can see that the fuzzy set of 1973 is (0.8165, 0.6835, 0.1835, 0, 0, 0, 0). The two maximum membership attributes are the first and second ones. The first row of $R^{(3,1)}$ is (0, 1, 2, 0, 0, 0, 0) and the second row of $R^{(3,2)}$ is (0, 1, 5, 2, 0, 0, 0) from (9). Definition 6 gives their intersection, and the third-order forecast follows from the second principle listed in Section 3.1 as the corresponding weighted aggregate.

Finally, the forecasted value for 1976 is 15590, obtained by combining the three forecasts with the adjustment parameters, which are the regression result on the enrolment data. Table 3 shows the actual and the forecasted enrolments of the data set.

4. Empirical Analysis

4.1. Data Description

To demonstrate the effectiveness of the proposed models, sufficient data are needed. Here, the enrolments and the SSECI are used as the illustration data sets for the empirical analysis.

The two time series were selected for the following reasons. The enrolment data of the University of Alabama were chosen for two reasons. The first is that most fuzzy time series studies have used this well-known time series in their experiments, so many studies are available for reference. The other is that the series is simple and makes it easy to display the process of the proposed model. Since time series models have been used for stock price forecasting for many years, the daily SSECI covering the period from 1997 to 2006 is adopted for further experiments.

For the first data, the determined length of the seven intervals is 1000. All of the forecasting results, from one order to ten orders, are compared with those of the conventional fuzzy time series models based on fuzzy logical relationships.

To discuss the effects of the parameters $M$ and $N$, the number of orders and hierarchies of fuzzy logical relationships, on the forecasting results, models of order one to ten (time-lag periods) are run on the second data set with ten different lengths of intervals. The forecasting results obtained by all of the high-order models are compared in terms of the three evaluation criteria reviewed in the following subsection.

4.2. Criteria of Evaluation

In statistics, the root mean squared error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE) are three typical ways to quantify the difference between values implied by an estimator and the actual values of the quantity being estimated. MSE is a risk function for measuring the average of the squares of the difference. For an unbiased estimator, the MSE is the variance, and the RMSE is the square root of the variance known as the standard error. Furthermore, RMSE is superior to MSE for the reason that its scale is the same as the forecasts. Thus, we take RMSE as the first representative of the size of an average error. As an average of the absolute percent errors, MAPE serves as a criterion for the comparisons of forecasting results in the paper. Some comparisons of accuracy in the forecasted values of our proposed models with other models are made on the basis of the three criteria.
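For reference, the three criteria can be computed as in the sketch below, using their standard definitions. The function names are illustrative and MAPE is expressed in percent, which is an assumption about the reporting convention; the sample values are made up for the demonstration and are not taken from the paper's tables.

```python
import math


def rmse(actual, forecast):
    """Root mean squared error."""
    squared = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    ratios = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(ratios) / len(ratios)


# Illustrative values only.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 400.0]
print(rmse(actual, forecast), mae(actual, forecast), mape(actual, forecast))
```

RMSE and MAE share the scale of the forecasts, whereas MAPE is scale-free, which makes it convenient when comparing the enrolment and stock index experiments.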

4.3. Performance Evaluation

In Table 4, this study compares the RMSE, MAE, and MAPE of the proposed method and the counterparts [28, 29, 31] on the enrolment experiment. Table 4 shows that the proposed method gets smaller RMSE, MAE, and MAPE than Hwang’s model [28], and also smaller than Chen’s model [31] of 2011 in most cases. One shortcoming is that the proposed model does not always achieve a higher average forecasting accuracy rate than the model [29] of 2002, although its results improve as the order increases. In summary, the results suggest that the proposed model obtains better forecasts as the orders and hierarchies increase. No such trend appears in the three counterparts.

Moreover, we also apply the proposed method to forecasting the close price of the Shanghai stock index of 2003 with interval length $l = 60$. The comparison of the three criteria is listed in Table 5. From this table, we can see that the proposed model gets the best forecasts among the compared models when and , and that it performs better than Hwang’s and Chen’s models [28, 29] in all cases; the shortcoming shown in Table 4 is gone. On the whole, there is a “law” that forecasting errors are reduced when $M$ or $N$ is increased in the proposed model. However, this trend is not evident for the three counterparts in Tables 4 and 5. In Table 4, the three evaluation criteria of Hwang’s and Chen’s models [28, 31] are decreasing while those of model [29] are increasing. In contrast, in Table 5 the three evaluation criteria of Hwang’s and Chen’s models are increasing while those of Chen’s model [29] are decreasing. The use of different data sets seems to be the most likely explanation for this contradiction. From these discussions, we conclude that the proposed model’s performance is more reasonable and robust than that of the three counterparts.

Tables 4 and 5 depict that , , and have the same results, although a larger $M$ or $N$ is more likely to yield smaller forecasting errors. In fact, this trend is affected by the definitions of the fuzzy sets and the membership function. As an in-depth analysis, Figure 1 shows three triangular membership functions. The RMSEs of forecasts of 2003 by with and the three membership functions are listed in Table 6. The table tells us that the more intervals are covered by the membership function, the more the forecasting accuracy differs. This is more obvious when is increasing. Another conclusion is that the forecasts of the third membership function are not always better than those of the first function.

To further investigate the relation between the parameters and the length of interval, the proposed model has been applied to forecasting stock index close prices covering ten years from 1997 to 2006 with and . Since Tables 4 and 5 confirm that the behaviour of MAE and MAPE is similar to that of RMSE, we will only list the RMSE comparison of these experiments as follows.

Some examples of actual values and forecasts of 2003 are depicted in Figures 2 and 3. Figure 2 shows us that has a better performance than in the same length of intervals, that is, . Figure 3 illustrates that gets the better forecasts when the lengths of intervals are small. Furthermore, some properties will be further described in Figures 4 and 5 which depict the mean forecasting errors of the ten years. Figure 4 shows the relation between the RMSEs and lengths of intervals with different orders, and Figure 5 shows the relation between the RMSEs and orders with different lengths of intervals.

From Figure 4, it is clear that the longer the interval, the bigger the RMSE, and that the higher-order models are better than the lower-order ones. This conclusion is also confirmed by Figure 5, which conveys another important message: shorter intervals result in more robust forecasts. Overall, these conclusions are important for applying the proposed model to other data sets or areas.

5. Conclusion

After discussing the high-order fuzzy time series models and presenting the definition and operation for the generalized fuzzy logical relationship, we have proposed a novel high-order fuzzy time series model based on the new relationship. The work is driven by three main motivations. Firstly, advanced fuzzy time series models call for a generalization of the fuzzy logical relationship. Secondly, we aim to abstract the relationship matrices among time series and find the patterns of time series fluctuations based on understandable fuzzy rules. The last one is to make the fuzzy time series model explain more complex relationships.

Using the enrolments of the University of Alabama and the close price of the Shanghai Stock Exchange Composite Index as data sets for evaluating the models, the experimental results give two conclusions: the performance of GTS(M,N) is more reasonable than that of the three conventional fuzzy time series models proposed earlier by Hwang et al. [28] and Chen and Chen [29, 31]; the number of orders and of principal fuzzy logical relationships affects the forecasting result slightly. The higher the order, the better the forecasting results; the more hierarchies of principal fuzzy logical relationships, the smaller the forecasting error, although the decrease is not unlimited.

For future research, some suggestions are provided to improve this work. The relation between the principal fuzzy relationships and the conventional fuzzy relationships needs to be further discussed. For example, what is the effect of changing the definition of the membership function and the operations on principal fuzzy logical relationships, and how large is it? Since the proposed model is based on the generalized fuzzy logical relationship, it is also worth studying how the model can be improved by hybridizing it with some advanced algorithms.

Acknowledgments

This work was partially supported by the National Natural Science Foundation of China (nos. 31260273, 61261027), the Key Project of Chinese Ministry of Education (no. 210116), the Natural Science Foundation of Jiangxi Province, China (nos. 2010GQS0127, 20114BAB211013, 20122BAB211033, 20122BAB201044, 20122BAB2010), and the JiangXi Provincial Foundation for Leaders of Disciplines in Science (20113BCB22008).