#### Abstract

Particularly in recent years, artificial intelligence optimization techniques have been used to make fuzzy time series approaches more systematic and improve forecasting performance. Besides, some fuzzy clustering methods and artificial neural networks with different structures are used in the fuzzification of observations and determination of fuzzy relationships, respectively. In approaches considering the membership values, the membership values are determined subjectively or fuzzy outputs of the system are obtained by considering that there is a relation between membership values in identification of relation. This necessitates defuzzification step and increases the model error. In this study, membership values were obtained more systematically by using Gustafson-Kessel fuzzy clustering technique. The use of artificial neural network with single multiplicative neuron model in identification of fuzzy relation eliminated the architecture selection problem as well as the necessity for defuzzification step by constituting target values from real observations of time series. The training of artificial neural network with single multiplicative neuron model which is used for identification of fuzzy relation step is carried out with particle swarm optimization. The proposed method is implemented using various time series and the results are compared with those of previous studies to demonstrate the performance of the proposed method.

#### 1. Introduction

Nowadays, it is of vital importance to make predictions about the future in terms of planning and strategy formulation. This can be realized by accurate and realistic analysis of information and data that have emerged from past to present. Different approaches, namely, stochastic and nonstochastic approaches, have been proposed in the literature for the analysis of time series. The use of nonstochastic models such as the fuzzy time series approach has recently become widespread. In some cases, expressing the observations of time series by linguistic values or fuzzy sets is more realistic. These types of time series are called fuzzy time series and their analysis should be made via fuzzy time series analysis methods rather than traditional ones. In recent years, to analyse nonlinear time series such as 1/f noise time series, Li et al. [1], Li et al. [2], and Li and Zhao [3] presented different approaches which are expressed as stochastic models. In addition, Li et al. [4] stated that a sufficient condition for a 1/f noise type time series to be predictable is that the variance of its prediction errors exists and described some challenges in the prediction of 1/f noise type time series. The main advantage of fuzzy time series approaches is that they do not need the assumptions that stochastic models do. In particular, since fuzzy time series methods need neither the linear model assumption nor a probability distribution assumption, they can be used effectively to analyse the nonlinear time series that are frequently encountered in real-world problems.

The concept of fuzzy time series was first introduced by Song and Chissom [5] based on fuzzy set theory proposed by Zadeh [6]. Fuzzy time series can be evaluated under two main headings as time-variant and time-invariant. Song and Chissom [5] have reported that internal relations belonging to fuzzy time series are supposed to change over time in time-variant fuzzy time series but not in time-invariant ones. Song and Chissom [7] proposed an algorithm for the solution of time-invariant fuzzy time series which are the subject of almost all the studies in the literature. As the subject of this study is time-invariant fuzzy time series, in the remainder of the paper the term of “fuzzy time series” will be used instead of “time-invariant fuzzy time series.” As in fuzzy inference systems, fuzzy time series forecasting models consist of three steps as fuzzification, identification of fuzzy relation, and defuzzification which have an influence on forecasting performance of the method. Many researchers have carried out studies using different approaches on these three steps.

Until recently, the fuzzification step relied on partitioning the universe of discourse into intervals. Song and Chissom [5, 7, 8] and Chen [9, 10] fixed the interval lengths arbitrarily, whereas Huarng [11] used average- and distribution-based methods and Egrioglu et al. [12] used optimization-based methods. In addition, for the analysis of time series containing trend, a ratio-based length of intervals was proposed by Huarng and Yu [13]. Furthermore, Yolcu et al. [14] proposed a new approach and used single-variable constrained optimization to determine the ratio for the length of intervals, which changes in time, in the partition of the universe of discourse. More recently, Kuo et al. [15, 16], Davari et al. [17], Park et al. [18], Hsu et al. [19], and Egrioglu et al. [20] used particle swarm optimization, whereas Chen and Chung [21] and Lee et al. [22, 23] proposed methods using genetic algorithms for the determination of the changing length of intervals.

Although subjective judgments are avoided in these studies using optimization techniques, membership values are still determined subjectively and not all membership values are considered. The problem of subjectively determined membership values may be eliminated by using fuzzy clustering techniques. In this regard, Cheng et al. [24], Li et al. [25], Aladag et al. [26], Alpaslan et al. [27], Egrioglu et al. [12, 28], and Alpaslan and Cagcag [29] addressed it by using the fuzzy C-means (FCM) and Gustafson-Kessel fuzzy clustering techniques.

Identification of fuzzy relation is the step in which the appropriate model is determined. Therefore, this step plays the most important role in forecasting performance. In this stage, Song and Chissom [5, 7, 8] used fuzzy relation matrix and represented the fuzzy logic relations with only one matrix. Sullivan and Woodall [30] used transition matrices based on Markov chain instead of using fuzzy logic relation matrix. Chen [9] proposed a simpler approach using fuzzy logic group relationships tables by claiming that matrix calculations are based on complex processes. The approach proposed by Chen [9] is the most commonly used approach in the literature. Huarng and Yu [31] proposed a first-order fuzzy time series approach which uses feedforward neural networks (FFANN) in this step. Aladag et al. [32] developed the approach proposed by Huarng and Yu [31] and proposed a high-order fuzzy time series forecasting model which uses FFANN in the determination of fuzzy relations. In all of these approaches, when determining the fuzzy relations representing the internal relation of fuzzy time series, only the fuzzy set having the highest membership value was considered and membership values were ignored. Although Yu and Huarng [33] proposed an approach which considers the membership values, their approach has determined membership values subjectively. Alpaslan et al. [27] and Yolcu et al. [34] used FCM technique instead of determining the membership values subjectively. The use of ANN in identification of fuzzy relations has many advantages and disadvantages as well. Determination of unit number in hidden layer (architecture structure) and excessive number of parameters to be used during the analysis are the most prominent ones. Although Aladag [35] eliminated this problem by using artificial neural network with single multiplicative neuron model (SMNM-ANN) in the determination of fuzzy relations, membership values were not considered. 
Nevertheless, as the system output of these approaches consists of fuzzy set numbers or membership values, a defuzzification step is necessary. This may be a factor that increases the model error. An approach that does not require a defuzzification step would eliminate the forecasting error that may occur in this step and improve the performance of the method.

Almost all approaches proposed in the literature focus on the autoregressive (AR) model; in other words, these approaches suppose that the time series is affected only by its own lagged variables. However, there are various approaches based on the autoregressive moving average (ARMA) model, such as the method proposed by Egrioglu et al. [20], and on the seasonal autoregressive integrated moving average (SARIMA) model, such as the methods proposed by Egrioglu et al. [36], Uslu et al. [37], Aladag et al. [38], and Alpaslan et al. [27].

The proposed method uses the Gustafson-Kessel fuzzy clustering technique in the fuzzification step, so membership values are obtained more systematically. The use of SMNM-ANN in the identification of fuzzy relations eliminates the architecture selection problem and the need for a defuzzification step by constituting the target values from the real observations of the time series. The training of SMNM-ANN, which is used in the determination of fuzzy relations, is carried out with particle swarm optimization. The proposed method comprises a first-order fuzzy time series model and can be referred to as an AR model. The main differences of the proposed method from previous studies are that it needs neither the defuzzification stage nor the identification of the ANN architecture.

The rest of this paper is organized as follows. In Section 2, the basic concepts of fuzzy time series are briefly reviewed. In Section 3, PSO, the Gustafson-Kessel fuzzy clustering technique, and SMNM-ANN are briefly presented under the related methods main heading. In Section 4, we introduce a new hybrid fuzzy time series method. In Section 5, we apply the proposed method to different time series and compare the forecasts of the proposed method with those of the existing methods. In the last section, the conclusions are discussed.

#### 2. Fuzzy Time Series

The fuzzy time series was firstly introduced by Song and Chissom [5]. The fuzzy time series and time-variant and time-invariant fuzzy time series definitions are given below by Song and Chissom [5].

*Definition 1.* Let $Y(t)$ $(t = \ldots, 0, 1, 2, \ldots)$, a subset of real numbers, be the universe of discourse on which fuzzy sets $f_j(t)$ are defined. If $F(t)$ is a collection of $f_1(t), f_2(t), \ldots$, then $F(t)$ is called a fuzzy time series defined on $Y(t)$.

*Definition 2.* Suppose that $F(t)$ is caused by $F(t-1)$ only; that is, $F(t-1) \rightarrow F(t)$. Then, this relation can be expressed as $F(t) = F(t-1) \circ R(t, t-1)$, where $R(t, t-1)$ is the fuzzy relationship between $F(t-1)$ and $F(t)$, and is called the first-order model of $F(t)$. "$\circ$" represents max-min composition of fuzzy sets.

*Definition 3.* Suppose that $R(t, t-1)$ is a first-order model of $F(t)$. If for any $t$, $R(t, t-1)$ is independent of $t$, that is, for any $t$, $R(t, t-1) = R(t-1, t-2)$, then $F(t)$ is called a time-invariant fuzzy time series; otherwise, it is called a time-variant fuzzy time series.

Song and Chissom [7] first introduced an algorithm based on the first-order model for forecasting time-invariant $F(t)$. In Song and Chissom’s work [7], the fuzzy relationship matrix $R(t, t-1) = R$ is obtained by many matrix operations. The fuzzy forecasts are obtained based on max-min composition as follows:

$$F(t) = F(t-1) \circ R.$$

The dimension of the matrix $R$ depends on the number of fuzzy sets, that is, on the number of partitions of the universe of discourse. Using more fuzzy sets therefore requires additional matrix operations to obtain the matrix $R$.
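As a minimal sketch of the max-min composition used in the first-order model, the snippet below computes $F(t) = F(t-1) \circ R$ for a small relation matrix. The function name `max_min_compose` and the numerical values are assumptions chosen only for illustration; they are not taken from Song and Chissom's work.

```python
def max_min_compose(f_prev, relation):
    """Return F(t) = F(t-1) o R using max-min composition.

    f_prev   : membership degrees of F(t-1), one per fuzzy set
    relation : square matrix R with relation[i][j] in [0, 1]
    """
    n_sets = len(f_prev)
    return [
        max(min(f_prev[i], relation[i][j]) for i in range(n_sets))
        for j in range(n_sets)
    ]


# Hypothetical three-set example (values chosen only for illustration).
f_prev = [0.2, 1.0, 0.5]
R = [
    [1.0, 0.5, 0.0],
    [0.5, 1.0, 0.5],
    [0.0, 0.5, 1.0],
]
f_next = max_min_compose(f_prev, R)
print(f_next)  # [0.5, 1.0, 0.5]
```

The size of `R` grows with the square of the number of fuzzy sets, which is the cost the paragraph above refers to.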

#### 3. Related Methods

##### 3.1. Particle Swarm Optimization (PSO)

Particle swarm optimization, which is a population-based heuristic algorithm, was firstly proposed by Eberhart and Kennedy [39]. Distinguishing feature of this heuristic algorithm is that it simultaneously examines different points in different regions of the solution space to find the global optimum solution. Local optimum traps can be avoided because of this feature.

In the literature, it was shown that using time-varying parameters can increase the convergence speed of the algorithm. Ma et al. [40] employed time-varying acceleration coefficients in the standard particle swarm optimization method. In another study, Shi and Eberhart [41] used a time-varying inertia weight. In the modified particle swarm optimization, these time-varying constituents are used together. This is the only difference between the standard and modified particle swarm optimization methods.

*Algorithm 4.* The modified particle swarm optimization.

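*Step 1.* The positions of each $k$th $(k = 1, 2, \ldots, pn)$ particle are randomly determined and kept in a vector $X_k$ given as follows:

$$X_k = \{x_{k,1}, x_{k,2}, \ldots, x_{k,d}\}, \quad k = 1, 2, \ldots, pn,$$

where $x_{k,j}$ $(j = 1, 2, \ldots, d)$ represents the $j$th position of the $k$th particle, and $pn$ and $d$ represent the numbers of particles in the swarm and positions, respectively.

*Step 2.* Velocities are randomly determined and stored in a vector $V_k$ as follows:

$$V_k = \{v_{k,1}, v_{k,2}, \ldots, v_{k,d}\}, \quad k = 1, 2, \ldots, pn.$$

*Step 3.* According to the evaluation function, the $Pbest_k$ and $Gbest$ particles given in (4) are determined:

$$Pbest_k = (p_{k,1}, p_{k,2}, \ldots, p_{k,d}), \quad k = 1, 2, \ldots, pn, \qquad Gbest = (p_{g,1}, p_{g,2}, \ldots, p_{g,d}), \tag{4}$$

where $Pbest_k$ is a vector which stores the positions corresponding to the $k$th particle’s best individual performance and $Gbest$ represents the best particle, that is, the one with the best evaluation function value found so far.

*Step 4.* Let $c_1$ and $c_2$ represent the cognitive and social coefficients, respectively, and let $w$ be the inertia parameter. Let $(c_{1i}, c_{1f})$, $(c_{2i}, c_{2f})$, and $(w_1, w_2)$ be the intervals which include the possible values for $c_1$, $c_2$, and $w$, respectively. At each iteration, these parameters are calculated by using the following formulas:

$$c_1 = (c_{1f} - c_{1i}) \frac{t}{maxt} + c_{1i},$$

$$c_2 = (c_{2f} - c_{2i}) \frac{t}{maxt} + c_{2i},$$

$$w = (w_2 - w_1) \frac{maxt - t}{maxt} + w_1,$$

where $maxt$ and $t$ represent the maximum iteration number and the current iteration number, respectively.

*Step 5.* The values of velocities and positions are updated by using the following formulas:

$$v_{k,j}^{(t+1)} = w\, v_{k,j}^{(t)} + c_1\, rand_1 \left(p_{k,j} - x_{k,j}^{(t)}\right) + c_2\, rand_2 \left(p_{g,j} - x_{k,j}^{(t)}\right),$$

$$x_{k,j}^{(t+1)} = x_{k,j}^{(t)} + v_{k,j}^{(t+1)},$$

where $rand_1$ and $rand_2$ are random values from the interval $[0, 1]$.

*Step 6.* Steps 3 to 5 are repeated until a predetermined maximum iteration number ($maxt$) is reached.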
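The steps of Algorithm 4 can be sketched in a few lines of standard-library Python. This is an illustrative sketch under stated assumptions rather than the authors' implementation: the function name `modified_pso`, the linear interpolation of $c_1$, $c_2$, and $w$ over the iterations, the initialization range, and every default parameter value are assumptions, since the text does not fix them here.

```python
import random


def modified_pso(evaluate, dim, n_particles=20, max_iter=200,
                 c1=(2.5, 0.5), c2=(0.5, 2.5), w=(0.4, 0.9), seed=0):
    """Minimise `evaluate` over `dim` variables with time-varying c1, c2, w.

    c1, c2: (initial, final) values, interpolated linearly over iterations.
    w: (w1, w2); the inertia weight moves from w2 down to w1.
    Every default here is an illustrative assumption, not a value from the text.
    """
    rng = random.Random(seed)
    # Steps 1-2: random initial positions and velocities.
    x = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_particles)]
    v = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_particles)]
    # Step 3 (first pass): each particle's best and the swarm's best so far.
    pbest = [p[:] for p in x]
    pbest_val = [evaluate(p) for p in x]
    g = min(range(n_particles), key=lambda k: pbest_val[k])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for t in range(1, max_iter + 1):
        # Step 4: coefficients for the current iteration t.
        frac = t / max_iter
        c1_t = (c1[1] - c1[0]) * frac + c1[0]
        c2_t = (c2[1] - c2[0]) * frac + c2[0]
        w_t = (w[1] - w[0]) * (1.0 - frac) + w[0]
        # Step 5: update velocities, then positions.
        for k in range(n_particles):
            for j in range(dim):
                r1, r2 = rng.random(), rng.random()
                v[k][j] = (w_t * v[k][j]
                           + c1_t * r1 * (pbest[k][j] - x[k][j])
                           + c2_t * r2 * (gbest[j] - x[k][j]))
                x[k][j] += v[k][j]
        # Step 3 (repeated): refresh the personal and global bests.
        for k in range(n_particles):
            val = evaluate(x[k])
            if val < pbest_val[k]:
                pbest[k], pbest_val[k] = x[k][:], val
            if val < gbest_val:
                gbest, gbest_val = x[k][:], val
    return gbest, gbest_val


def sphere(p):
    """Toy evaluation function with its minimum at the origin."""
    return sum(z * z for z in p)


best, best_val = modified_pso(sphere, dim=3)
print(best_val)
```

Because the best values are only replaced by strictly better ones, the returned `best_val` can never be worse than the best of the initial random swarm.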

##### 3.2. The Gustafson-Kessel Fuzzy Clustering Technique

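The Gustafson-Kessel fuzzy clustering algorithm was first proposed by Gustafson and Kessel [42]. Let $F_i$ be the covariance matrix of the $i$th cluster, $v_i$ the center of the $i$th cluster, $u_{ij}$ the membership degree of the $j$th observation $x_j$ in the $i$th cluster, and $m > 1$ the fuzziness index, with $c$ clusters and $n$ observations. For the $i$th cluster, its associated Mahalanobis distance is defined as

$$d^2(x_j, v_i) = (x_j - v_i)^T A_i (x_j - v_i), \quad A_i = \left[\rho_i \det(F_i)\right]^{1/p} F_i^{-1},$$

where $p$ is the dimension of the observations and $\rho_i$ is the cluster volume, usually set to 1. The covariance matrices are computed as follows:

$$F_i = \frac{\sum_{j=1}^{n} u_{ij}^{m} (x_j - v_i)(x_j - v_i)^T}{\sum_{j=1}^{n} u_{ij}^{m}}.$$

The objective function is defined as

$$J = \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^{m}\, d^2(x_j, v_i).$$

The objective function is then minimized under the following constraints:

$$0 \le u_{ij} \le 1, \qquad \sum_{i=1}^{c} u_{ij} = 1 \ \text{for all } j, \qquad 0 < \sum_{j=1}^{n} u_{ij} < n \ \text{for all } i.$$

In this minimization problem, the centers and the membership degrees are updated according to the expressions given below:

$$v_i = \frac{\sum_{j=1}^{n} u_{ij}^{m} x_j}{\sum_{j=1}^{n} u_{ij}^{m}}, \qquad u_{ij} = \frac{1}{\sum_{k=1}^{c} \left( d(x_j, v_i) / d(x_j, v_k) \right)^{2/(m-1)}}.$$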

##### 3.3. Single Multiplicative Neuron Model

In neurons of feedforward neural networks, the input signal is calculated based on addition function. Yadav et al. [43] proposed a single multiplicative neuron model. In the model, the input signal of the neuron is estimated by the multiplication function. Yadav et al. [43] showed that single multiplicative neuron model gives better forecasting performance for time series forecasting. Zhao and Yang [44] recommended the use of PSO instead of backpropagation learning algorithm proposed by Yadav et al. [43] in the training of single multiplicative neuron model. The structure of single multiplicative neuron model for 5 inputs is given in Figure 1.

This model has a single neuron and, unlike a feedforward neural network, applies multiplication to the signals coming into the neuron. The function $\Omega(x, w, b)$ is the product of the weighted inputs. The multiplicative neuron model with five inputs given in Figure 1 has 10 parameters: five weights corresponding to the inputs $(w_1, \ldots, w_5)$ and five biases $(b_1, \ldots, b_5)$. Suppose that the activation function is taken as the logistic function given below:

$$f(x) = \frac{1}{1 + e^{-x}}.$$

In this case, the net value of the neuron is obtained as follows:

$$net = \Omega(x, w, b) = \prod_{i=1}^{5} (w_i x_i + b_i).$$

Thus, as the net value passes through the activation function, the output of the neuron is obtained as $y = f(net)$. The fitness function used during the training of the multiplicative neuron model with PSO can be taken as the sum of squared differences between the target values and the output values over all learning samples:

$$SSE = \sum_{j=1}^{n} (d_j - y_j)^2,$$

where $d_j$ and $y_j$ represent the target value and the output of the network corresponding to the $j$th learning sample.
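A short sketch may make the forward pass of the single multiplicative neuron concrete: the weighted inputs are multiplied rather than summed, and the product is passed through the logistic function. The names `smnm_output` and `sse` and the sample values are assumptions introduced for illustration.

```python
import math


def smnm_output(inputs, weights, biases):
    """Forward pass of a single multiplicative neuron with logistic activation."""
    net = 1.0
    for x, w, b in zip(inputs, weights, biases):
        net *= w * x + b  # multiplicative aggregation of the weighted inputs
    # Numerically stable form of 1 / (1 + exp(-net)).
    if net >= 0:
        return 1.0 / (1.0 + math.exp(-net))
    e = math.exp(net)
    return e / (1.0 + e)


def sse(samples, targets, weights, biases):
    """Sum of squared errors between targets and neuron outputs."""
    return sum(
        (d - smnm_output(x, weights, biases)) ** 2
        for x, d in zip(samples, targets)
    )


# Hypothetical five-input neuron: 5 weights + 5 biases = 10 parameters.
weights = [0.5, -0.3, 0.8, 0.1, 0.9]
biases = [0.2, 0.7, -0.1, 0.4, 0.3]
y = smnm_output([0.1, 0.4, 0.3, 0.8, 0.6], weights, biases)
print(y)
```

Since the output of the logistic function lies strictly between 0 and 1, target values would need to be on a comparable scale for the error to be meaningful.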

#### 4. Proposed Method

In fuzzy time series approaches, each stage plays a decisive role in the forecasting performance of the method. Many studies on these steps have been conducted in the literature. Along with more systematic approaches in the fuzzification step, the flexible and superior calculation abilities of ANN have recently been widely used in the identification of fuzzy relations. These studies have advantages but also disadvantages, such as the determination of the number of units in the hidden layer (architecture structure), the subjective identification of membership values, and the excessive number of parameters to be used during the analysis. This study aims to propose a model which is free from all these problems. In the proposed method, membership values are obtained more systematically by using the Gustafson-Kessel fuzzy clustering technique in the fuzzification step. The use of SMNM-ANN in the identification of fuzzy relations eliminates the architecture selection problem and the necessity for a defuzzification step by constituting target values from the real observations of the time series; thus, the forecasting performance of the method is improved. The training of SMNM-ANN, which is used in the identification of fuzzy relations, is carried out with particle swarm optimization. The main advantages of the proposed method can be summarized as follows.

(i) With the use of a fuzzy clustering method in the fuzzification step, subjective judgments are no longer needed.

(ii) With the use of ANN in the identification of fuzzy relations, complex matrix operations and complex fuzzy relation tables are not needed. In addition, one may benefit from the flexible modeling advantage of ANN.

(iii) The use of SMNM-ANN eliminates the problem of determining the number of units in the hidden layer.

(iv) Taking the inputs of SMNM-ANN from membership values increases the amount of information used in the solution process and thus provides an approach that is more in line with fuzzy set theory.

(v) The defuzzification step is no longer needed, because the SMNM-ANN target values are constituted from the real values of the time series; the forecasting error that may occur in this step is thus prevented and the forecasting performance of the method is improved.

The algorithm of the proposed method is given below in steps.

*Step 1.* For a given $c$, where $c$ is the number of fuzzy sets, the Gustafson-Kessel algorithm is applied to the crisp time series. The centers of the fuzzy sets and the membership degrees of every observation with respect to these centers are obtained. Finally, the ordered fuzzy sets $L_r$, $r = 1, 2, \ldots, c$, are obtained according to the ascending order of the centers, which are denoted by $v_r$, $r = 1, 2, \ldots, c$.

For better understanding, consider a time series with 8 observations: 20, 30, 40, 30, 20, 50, 60, and 80. Let $c$, the number of fuzzy sets, be 3. When the Gustafson-Kessel method is applied to these data, the centers of the fuzzy sets and the membership degrees of each observation, which denote the degree to which that observation belongs to the related fuzzy set, are as given in Table 1. Table 1 gives, for example, the membership degree of the first observation in the second fuzzy set ($L_2$).
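The fuzzification step can be sketched for the eight-observation series above. The code follows the standard Gustafson-Kessel updates (centers, fuzzy covariance, norm-inducing term, memberships) specialised to scalar data; the function names, the evenly spread starting centers, the fuzziness index `m = 2`, and the unit cluster volume `rho = 1` are all assumptions. With scalar data and unit volume, the norm-inducing term equals 1, so the distance reduces to the squared Euclidean distance. The resulting numbers are not claimed to match Table 1.

```python
def memberships_from_distances(d2, m):
    """Membership update: u_ij is proportional to d2_ij ** (-1 / (m - 1))."""
    c, n = len(d2), len(d2[0])
    u = [[0.0] * n for _ in range(c)]
    for j in range(n):
        hits = [i for i in range(c) if d2[i][j] == 0.0]
        if hits:  # observation sits exactly on one or more centers
            for i in hits:
                u[i][j] = 1.0 / len(hits)
            continue
        inv = [d2[i][j] ** (-1.0 / (m - 1.0)) for i in range(c)]
        total = sum(inv)
        for i in range(c):
            u[i][j] = inv[i] / total
    return u


def gustafson_kessel_1d(data, c, m=2.0, rho=1.0, n_iter=200, tol=1e-9):
    """Gustafson-Kessel clustering of a scalar series (dimension p = 1)."""
    lo, hi = min(data), max(data)
    # Deterministic start (an assumption): centers spread over the data range.
    centers = [lo + (i + 0.5) * (hi - lo) / c for i in range(c)]
    u = memberships_from_distances(
        [[(x - v) ** 2 for x in data] for v in centers], m)
    for _ in range(n_iter):
        new_centers, d2 = [], []
        for i in range(c):
            wts = [u_ij ** m for u_ij in u[i]]
            total = sum(wts)
            v = sum(wt * x for wt, x in zip(wts, data)) / total
            # Fuzzy covariance F_i; a scalar variance when p = 1.
            f = sum(wt * (x - v) ** 2 for wt, x in zip(wts, data)) / total
            # Norm-inducing term A_i = (rho * det F_i)^(1/p) / F_i with p = 1.
            a_i = (rho * f) / f if f > 0.0 else rho
            new_centers.append(v)
            d2.append([a_i * (x - v) ** 2 for x in data])
        u = memberships_from_distances(d2, m)
        shift = max(abs(p - q) for p, q in zip(new_centers, centers))
        centers = new_centers
        if shift < tol:
            break
    # Order the fuzzy sets by ascending center, as in Step 1.
    order = sorted(range(c), key=lambda i: centers[i])
    return [centers[i] for i in order], [u[i] for i in order]


series = [20, 30, 40, 30, 20, 50, 60, 80]
centers, memberships = gustafson_kessel_1d(series, c=3)
print(centers)
```

Here `memberships[r][t]` is the degree of observation `t` in the `r`th ordered fuzzy set, and for each observation the degrees sum to 1.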

*Step 2.* Define the fuzzy relationship with SMNM-ANN.

The number of inputs of SMNM-ANN, used for determining fuzzy relationships, is equal to the number of fuzzy sets ($c$). The architecture of the network is shown in Figure 2. In Figure 2, $\mu_{L_i}(x_{t-1})$ denotes the membership degree of the observation $x_{t-1}$ of the time series in the $i$th fuzzy set. The target values of SMNM-ANN are the real observations of the time series at time $t$, while the inputs of the network are the membership degrees of the observation at time $t-1$ in each of the fuzzy sets.

For example, suppose that we consider the time series given in Table 1. When we defined the architectural structure as given in Figure 2, the input and the targets of ANN would be as in Table 2.
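The pairing of inputs and targets described above can be sketched as follows: the membership degrees of the observation at $t-1$ form one input row, and the crisp observation at $t$ is the target. The function name `make_training_pairs` and the membership values are hypothetical; they are not the entries of Table 1 or Table 2.

```python
def make_training_pairs(series, memberships):
    """Pair the memberships of x_{t-1} (inputs) with the crisp x_t (target).

    memberships[i][t] is the degree of observation t in fuzzy set i.
    """
    c = len(memberships)
    inputs = [[memberships[i][t - 1] for i in range(c)]
              for t in range(1, len(series))]
    targets = list(series[1:])
    return inputs, targets


# Hypothetical membership degrees for two fuzzy sets (not the values of Table 1).
series = [20, 30, 40, 30]
memberships = [
    [0.9, 0.6, 0.2, 0.6],  # fuzzy set L1
    [0.1, 0.4, 0.8, 0.4],  # fuzzy set L2
]
inputs, targets = make_training_pairs(series, memberships)
for row, target in zip(inputs, targets):
    print(row, "->", target)
```

A series with $n$ observations yields $n - 1$ learning samples, because the first observation has no predecessor.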

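The function $\Omega(x, w, b)$ is the product of the weighted inputs and is obtained by (16), where $f$ is the activation function and $\hat{x}_t$ is the output of the model. The output of the model is calculated as in (17):

$$net = \Omega(x, w, b) = \prod_{i=1}^{c} \left( w_i\, \mu_{L_i}(x_{t-1}) + b_i \right), \tag{16}$$

$$\hat{x}_t = f(net) = \frac{1}{1 + e^{-net}}. \tag{17}$$

Here $\mu_{L_i}(x_{t-1})$ is the membership degree of the observation at time $t-1$ in the $i$th fuzzy set. In the case where the number of fuzzy sets defined for the fuzzification process is $c$, there are $2c$ variables to be optimized by PSO. The position of these variables for a particle can be shown as in Figure 3, where $w_i$ and $b_i$, $i = 1, 2, \ldots, c$, are the weights and biases of SMNM-ANN, respectively.

The training of the SMNM-ANN given in Figure 2 is carried out via PSO with the following substeps.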

*Step 2.1.* The parameters of the PSO algorithm are determined. $(c_{1i}, c_{1f})$, $(c_{2i}, c_{2f})$, and $(w_1, w_2)$ are the possible starting and end values for the cognitive component coefficient $c_1$, the social component coefficient $c_2$, and the inertia parameter $w$, respectively. $maxt$ represents the maximum number of iterations, $t$ is the number of the current iteration, and $V_k$ holds the velocities of the $k$th particle for the weights and biases of SMNM-ANN.

*Step 2.2.* Starting positions of the variables to be optimized by PSO are randomly generated. The positions and velocities of each $k$th particle are randomly determined and kept in vectors $X_k$ and $V_k$ given as follows:

$$X_k = \{x_{k,1}, x_{k,2}, \ldots, x_{k,d}\}, \qquad V_k = \{v_{k,1}, v_{k,2}, \ldots, v_{k,d}\}, \quad k = 1, 2, \ldots, pn,$$

where $x_{k,j}$ $(j = 1, 2, \ldots, d)$ represents the $j$th position of the $k$th particle for the weights and biases of SMNM-ANN, and $pn$ and $d$ represent the number of particles in the swarm and positions, respectively (here $d = 2c$). The initial positions and velocities of each particle in the swarm are randomly generated from uniform distributions.

*Step 2.3.* Evaluation function values for each particle are computed. The root mean square error (RMSE) given below is used as the evaluation function:

$$RMSE = \sqrt{\frac{1}{n} \sum_{t=1}^{n} \left( x_t - \hat{x}_t \right)^2},$$

where $n$ represents the number of learning samples for SMNM-ANN and $x_t$ and $\hat{x}_t$ are the real observation and the forecast of the time series at time $t$, respectively.

*Step 2.4.* $Pbest_k$ $(k = 1, 2, \ldots, pn)$ and $Gbest$ are determined according to the evaluation function values calculated in the previous step. $Pbest_k$ is a vector that stores the positions corresponding to the $k$th particle’s best individual performance, and $Gbest$ is the best particle, that is, the one with the best evaluation function value found so far:

$$Pbest_k = (p_{k,1}, p_{k,2}, \ldots, p_{k,d}), \qquad Gbest = (p_{g,1}, p_{g,2}, \ldots, p_{g,d}).$$

*Step 2.5.* New values of positions and velocities are calculated. New values of positions and velocities for each particle are computed by using the following formulas:

$$v_{k,j}^{(t+1)} = w\, v_{k,j}^{(t)} + c_1\, rand_1 \left(p_{k,j} - x_{k,j}^{(t)}\right) + c_2\, rand_2 \left(p_{g,j} - x_{k,j}^{(t)}\right),$$

$$x_{k,j}^{(t+1)} = x_{k,j}^{(t)} + v_{k,j}^{(t+1)},$$

where $c_1$, $c_2$, and $w$ take their values for the current iteration $t$, and $rand_1$ and $rand_2$ are randomly generated from the uniform distribution on $[0, 1]$.

Steps 2.3–2.5 are repeated until the maximum number of iterations $maxt$ is reached. Finally, the elements of $Gbest$ are taken as the optimal solution.
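The evaluation function of Step 2.3 can be sketched as a function that decodes one particle into the weights and biases of SMNM-ANN and returns the RMSE over the learning samples. Several things here are assumptions: the name `rmse_fitness`, the layout of the particle (weights first, then biases; Figure 3 may order them differently), and the rescaling of the targets to the interval (0, 1), which is needed because the logistic output lies in that interval and which the text does not describe.

```python
import math


def logistic(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def rmse_fitness(particle, inputs, targets):
    """RMSE of the SMNM-ANN whose 2c parameters are stored in `particle`.

    Assumed layout: the first c entries are weights, the last c are biases.
    `targets` are assumed to be rescaled to the interval (0, 1).
    """
    c = len(particle) // 2
    weights, biases = particle[:c], particle[c:]
    squared_error = 0.0
    for row, target in zip(inputs, targets):
        net = 1.0
        for mu, w, b in zip(row, weights, biases):
            net *= w * mu + b  # product of the weighted membership degrees
        squared_error += (target - logistic(net)) ** 2
    return math.sqrt(squared_error / len(targets))


# Hypothetical membership inputs for c = 2 fuzzy sets and scaled targets.
inputs = [[0.1, 0.9], [0.4, 0.6]]
fitness = rmse_fitness([0.0, 0.0, 0.0, 0.0], inputs, [1.0, 0.0])
print(fitness)  # 0.5
```

With all parameters at zero the neuron outputs 0.5 for every sample, which gives an error of 0.5 against targets of 0 and 1. A PSO routine would call such a function once per particle in every iteration and keep the particle with the smallest value.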

#### 5. Applications

The proposed method was applied to five different time series, namely, the Taiwan stock index (TAIEX) for the years 2000, 2001, 2002, 2003, and 2004. In the analysis of TAIEX, we used the observations of the last three months as the out-of-sample observations (test data). Therefore, we carried out five different analyses to evaluate the performance of the proposed method.

In the implementation of the proposed method, a new time series constituted from the first-order differences of the original time series was used instead of the original series, as in Yu and Huarng’s study [33]. The creation of the new time series can be summarized as follows.

Firstly, the differences between every two consecutive observations at $t$ and $t-1$ are obtained:

$$d_t = x_t - x_{t-1}.$$

The differences may turn out to be negative. To ensure that all the universes of discourse are positive, we add different positive constants to the differences for different years:

$$d'_t = d_t + const.$$

For a better understanding of the implementation of the proposed method, let us examine the TAIEX time series in 2004. The stock index for 2004/1/5 is 6125.42 and that for 2004/1/6 is 6144.01. Hence,

$$d_t = 6144.01 - 6125.42 = 18.59.$$

For the year 2004, the minimum of all the differences is −455.17. Hence, 500 is considered to be appropriate as the constant for the year 2004:

$$d'_t = 18.59 + 500 = 518.59.$$

Moreover, the outputs of the SMNM-ANN are the forecasts of the next difference. For example, when the forecasted difference between 10/4 and 10/5 is obtained as $\hat{d}'_t$, the forecast is calculated as follows:

$$\hat{x}_t = x_{t-1} + \hat{d}'_t - 500.$$

In the same way, 700, 300, 300, 200, and 500 are considered to be appropriate as the constants for the years 2000, 2001, 2002, 2003, and 2004, respectively. Moreover, in the analysis of all TAIEX data, the number of fuzzy sets is varied between 5 and 15 and parameters of PSO are determined as , , , and .
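The differencing transform and its inverse can be sketched with the two TAIEX values quoted above. The function names are assumptions introduced for illustration; the arithmetic follows the description in the text (difference, add a positive constant, then undo both when forming the forecast).

```python
def shifted_differences(series, const):
    """First-order differences shifted by `const` so that all are positive."""
    diffs = [series[t] - series[t - 1] for t in range(1, len(series))]
    if min(diffs) + const <= 0:
        raise ValueError("const is too small to make every difference positive")
    return [d + const for d in diffs]


def forecast_from_difference(previous_obs, forecast_shifted_diff, const):
    """Turn a forecast of the shifted difference back into an index level."""
    return previous_obs + forecast_shifted_diff - const


taiex = [6125.42, 6144.01]  # two consecutive TAIEX values quoted in the text
shifted = shifted_differences(taiex, const=500)
print(round(shifted[0], 2))  # 518.59

# If the network forecast the shifted difference exactly, the level is recovered.
restored = forecast_from_difference(taiex[0], shifted[0], const=500)
print(round(restored, 2))  # 6144.01
```

The check inside `shifted_differences` mirrors the reasoning in the text: with a minimum difference of −455.17 for 2004, a constant of 500 keeps every shifted value positive.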

The RMSE criterion was used to evaluate the results of the proposed method and of the other methods in the literature.

The optimal results are obtained from nine, thirteen, six, seven, and five fuzzy sets for TAIEX data of years 2000, 2001, 2002, 2003, and 2004, respectively. Prediction error for the optimal results obtained from the proposed method as well as prediction error of other fuzzy time series methods is presented in Table 3.

Considering Table 3, it can be concluded that forecasting performances of the proposed method for all TAIEX data are better than those found in the literature with respect to RMSE criterion.

#### 6. Conclusions and Discussion

It is of vital importance to make predictions about the future in terms of planning and strategy formulation. This can be realized by accurate and realistic analysis of information and data that have emerged from past to present. Expressing observations of time series with linguistic and fuzzy clusters and analyzing these types of time series via fuzzy time series methods rather than conventional ones would provide more realistic approaches and more accurate outcomes.

Many studies aiming at making fuzzy time series approaches more systematic have been introduced in the literature. To this end, fuzzy clustering methods and artificial neural networks with different structures are used in the fuzzification of observations and the determination of fuzzy relationships, respectively. Considering membership values, especially in the identification of fuzzy relations, seems to be a factor that improves the forecasting performance of the method. In approaches considering the membership values, the membership values are determined subjectively, or fuzzy outputs of the system are obtained by assuming that there is a relation between membership values in the identification of the relation. This necessitates a defuzzification step and increases the model error. The study aimed to overcome all these problems. For this purpose, membership values were obtained more systematically by using the Gustafson-Kessel fuzzy clustering technique in the fuzzification step. In the identification of fuzzy relations, problems such as architecture selection were eliminated by using an artificial neural network with single multiplicative neuron model (SMNM-ANN), and the defuzzification step is no longer needed because the target values are constituted from the real values of the time series. The training of the artificial neural network with single multiplicative neuron model is carried out with particle swarm optimization. The main differences of the proposed method from previous studies are that it needs neither defuzzification nor the identification of the architecture of the artificial neural network. In conclusion, considering the advantages and the superior forecasting performance of the method demonstrated in the applications, it can be argued that the proposed method is applicable and contributes to the fuzzy time series literature. In future studies, the proposed method can be extended to a high-order structure. Moreover, a feedback mechanism, like the moving average terms in ARMA, can be added to the model.