Abstract

Neural networks (NNs), type-1 fuzzy logic systems (T1FLSs), and interval type-2 fuzzy logic systems (IT2FLSs) have been shown to be universal approximators, which means that they can approximate any nonlinear continuous function. Recent research shows that embedding an IT2FLS in an NN can be very effective for a wide range of nonlinear complex systems, especially when handling imperfect or incomplete information. In this paper we show, based on the Stone-Weierstrass theorem, that an interval type-2 fuzzy neural network (IT2FNN), which uses a set of rules and interval type-2 membership functions (IT2MFs), is a universal approximator. Simulation results of nonlinear function identification using the IT2FNN for one and three variables and for the Mackey-Glass chaotic time series prediction are presented to illustrate the concept of universal approximation.

1. Introduction

Several authors have contributed to universal approximation results. An overview can be found in [18]; further references to prime contributors in function approximation by neural networks are in [4, 9–12] and in type-2 fuzzy logic modeling in [13–23]. It has been shown that a three-layer NN can approximate any real continuous function [24]. The same has been shown for a T1FLS [1, 25] using the Stone-Weierstrass theorem [3]. A similar analysis was made by Kosko [2, 9] using the concept of fuzzy regions. In [3, 26] Buckley shows that, with a Sugeno model [27], a T1FLS can be built with the ability to approximate any nonlinear continuous function. Also, by combining the neural and fuzzy logic paradigms [28, 29], an effective tool can be created for approximating any nonlinear function [4]. In this sense, an expert can use a type-1 fuzzy neural network (T1FNN) [10–12, 30] or IT2FNN systems and find interpretable solutions [15–17, 31–34]. In general, Takagi-Sugeno-Kang (TSK) T1FLSs are able to approximate by the use of polynomial consequent rules [7, 27]. This paper uses the Levenberg-Marquardt backpropagation learning algorithm for adapting the antecedent and consequent parameters of an adaptive IT2FNN, since its efficiency and soundness make it fit for these optimization problems. An adaptive IT2FNN is used as a universal approximator of any nonlinear function. A set of IT2FNNs is universal if and only if (iff), given any process, there is an IT2FNN in the set such that the difference between the output of the IT2FNN and the output of the process is less than a given tolerance.

The main contribution of this paper is the proposed IT2FNN architectures, which are shown to be universal approximators and are illustrated with several benchmark problems to verify their applicability to real-world problems.

2. Interval Type-2 Fuzzy Neural Networks

An IT2FNN [15, 31, 35] combines a TSK interval type-2 fuzzy inference system (TSKIT2FIS) [13, 14, 33, 34] with an adaptive NN in order to take advantage of the best characteristics of both models. In general, when an IT2FNN is represented graphically, rectangles are used to represent adaptive nodes and circles to represent nonadaptive nodes. Output values of even nodes (green color) and odd nodes (blue color) represent uncertainty intervals (Figures 1–4). In this kind of interval type-2 neurofuzzy adaptive network, nodes represent processing units called neurons, which can be classified into crisp and fuzzy neurons.

The IT2FNN-1 architecture has 5 layers (Figure 1) [35]. Adaptive nodes in the fuzzification layer (layer 1) have a function equivalent to the lower-upper membership functions. Nonadaptive nodes in the rules layer (layer 2) interconnect with the fuzzification layer (layer 1) in order to generate the antecedents of the TSK IT2FIS rules. Adaptive nodes in the consequent layer (layer 3) are connected to the input layer (layer 0) to generate the rule consequents. Nonadaptive nodes in the type-reduction layer (layer 4) evaluate the left-right values with the Karnik and Mendel (KM) algorithm [13, 14]. The nonadaptive nodes in the defuzzification layer (layer 5) average the left-right values.

The IT2FNN-3 architecture has 8 layers (Figure 2) [35] and uses IT2FNs for fuzzifying the inputs (layers 1-2). Nonadaptive nodes in the rules layer (layer 3) interconnect with the lower-upper linguistic values layer (layer 2) to generate the antecedents of the TSK IT2FIS rules. Adaptive nodes in layer 4 adapt the left-right firing strengths, biasing the lower-upper firing strengths of the rules with the synaptic weights between layers 3 and 4. The nonadaptive nodes of layer 5 normalize the lower-upper firing strengths of the rules. Nonadaptive nodes in the consequent layer (layer 6) interconnect with the input layer (layer 0) to generate the rule consequents. Nonadaptive nodes in the type-reduction layer (layer 7) evaluate the left-right values by adding the products of the normalized lower-upper firing strengths and the left-right values of the rule consequents. The node in the defuzzification layer is adaptive, and its output is defined as a biased average of the left-right values, controlled by a weighting parameter. This parameter (0.5 by default) adjusts the uncertainty interval defined by the left-right values.

Architectures IT2FNN-0 and IT2FNN-2, which are shown to be universal approximators in Sections 3.2 and 3.3, respectively, are described in more detail in Sections 2.1 and 2.2.

2.1. IT2FNN-0 Architecture

An IT2FNN-0 is a seven-layer IT2FNN, which integrates a first order TSKIT2FIS (interval type-2 fuzzy antecedents and real consequents) with an adaptive NN. The IT2FNN-0 (Figure 3) layers are described as follows.

Layer 0. Inputs

Layer 1. Adaptive type-1 fuzzy neuron (T1FN) , where the transfer function is a membership function, is the weighted sum of inputs () and the synaptic weights (), and is the threshold for each neuron.

Layer 2. Nonadaptive T1FN. This layer contains T-norm and S-norm fuzzy nodes where is the number of nodes in layers 1 and 2 for all , and , where is the table of indices of the antecedents of the rules , where is a vector of indices for each node of layer 2if else end,

where are lower and upper membership function values, respectively. and are vectors with even and odd indices of the nodes of layer 2.

Layer 3. Lower-upper firing strength. This layer has nonadaptive nodes that generate the lower-upper firing strengths of the TSK IT2FIS rules (7), where the antecedent membership function is the Gaussian interval type-2 membership function, igaussmtype2.
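As an illustration of what a Gaussian interval type-2 membership function returns, the following minimal sketch assumes the widely used Gaussian primary membership function with a fixed standard deviation and an uncertain mean in an interval [m1, m2]. The function name and its parameter order are hypothetical and are not taken from the igaussmtype2 definition in this paper.

```python
import math


def gauss(x, mean, sigma):
    """Ordinary (type-1) Gaussian membership grade."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2)


def igauss_uncertain_mean(x, sigma, m1, m2):
    """Return (lower, upper) membership grades of x.

    Assumption: Gaussian primary MF with fixed sigma > 0 and a mean
    that may lie anywhere in [m1, m2], with m1 <= m2.
    """
    # Upper MF: envelope of all Gaussians whose mean is in [m1, m2].
    if x < m1:
        upper = gauss(x, m1, sigma)
    elif x > m2:
        upper = gauss(x, m2, sigma)
    else:
        upper = 1.0
    # Lower MF: the smaller of the two extreme Gaussians.
    if x <= (m1 + m2) / 2.0:
        lower = gauss(x, m2, sigma)
    else:
        lower = gauss(x, m1, sigma)
    return lower, upper


if __name__ == "__main__":
    for x in (-2.0, 0.0, 0.4, 2.0):
        print(x, igauss_uncertain_mean(x, sigma=1.0, m1=-0.5, m2=0.5))
```

The pair of grades bounds the footprint of uncertainty at x; when m1 equals m2 the two grades coincide and the function reduces to a type-1 Gaussian.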

Layer 4. Lower-upper firing strength rule normalization. Nodes in this layer are nonadaptive, and the output of each node is defined as the ratio between the lower-upper firing strength of its rule and the sum of the lower-upper firing strengths of all rules (13) and (14). If we view the normalized firing strengths as fuzzy basis functions (FBFs) (32) and (33) and the rule consequents as linear functions (16), then the network output can be viewed as a linear combination of the basis functions (20) and (21).
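The normalization step and the resulting FBF expansion can be sketched as follows. This is a minimal illustration, assuming product inference (as stated in Section 3.2) and assuming that lower and upper firing strengths are normalized separately; all function names are hypothetical.

```python
def firing_strengths(lower_grades, upper_grades):
    """Lower and upper firing strength of one rule (product T-norm)."""
    lower, upper = 1.0, 1.0
    for lo, up in zip(lower_grades, upper_grades):
        lower *= lo
        upper *= up
    return lower, upper


def normalize(strengths):
    """Divide each rule's firing strength by the sum over all rules."""
    total = sum(strengths)
    return [s / total for s in strengths]


def fbf_expansion(basis, consequents):
    """Linear combination of basis functions and rule consequents."""
    return sum(b * y for b, y in zip(basis, consequents))


if __name__ == "__main__":
    # Two rules, two inputs: membership grades per antecedent.
    rule_1 = firing_strengths([0.5, 0.5], [1.0, 0.8])
    rule_2 = firing_strengths([0.2, 0.9], [0.4, 1.0])
    lower_basis = normalize([rule_1[0], rule_2[0]])
    upper_basis = normalize([rule_1[1], rule_2[1]])
    print(fbf_expansion(lower_basis, [2.0, 4.0]))
    print(fbf_expansion(upper_basis, [2.0, 4.0]))
```

Because the normalized strengths are nonnegative and sum to one, each expansion is a convex combination of the rule consequents evaluated at the current input.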

Layer 5. Rule consequents. Each node is adaptive and has its own consequent parameters. The output of each node corresponds to the partial output of its rule (16).

Layer 6. Estimating left-right interval values (18), nodes are nonadaptive with outputs . Layer 6 output is defined by where

Layer 7. Defuzzification. The node in this layer is adaptive, and its output, (20) and (21), is defined as a weighted average of the left-right values, controlled by a weighting parameter. This parameter (default value 0.5) adjusts the uncertainty interval defined by the left-right values.
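A minimal sketch of this defuzzification step is given below. The parameter name gamma and the choice of which end point it multiplies are assumptions made for illustration; with the default value 0.5 the output is simply the midpoint of the left-right interval, whichever convention is used.

```python
def defuzzify(y_left, y_right, gamma=0.5):
    """Weighted average of the left and right end points.

    Assumption: gamma weights the left end point and (1 - gamma)
    the right end point; gamma is a hypothetical parameter name.
    """
    return gamma * y_left + (1.0 - gamma) * y_right


if __name__ == "__main__":
    print(defuzzify(1.0, 3.0))        # midpoint of the interval
    print(defuzzify(1.0, 3.0, 0.25))  # biased toward the right end point
```

For any gamma in [0, 1] the crisp output stays inside the interval defined by the left-right values.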

2.2. IT2FNN-2 Architecture

An IT2FNN-2 [31] is a six-layer IT2FNN, which integrates a first order TSKIT2FIS (interval type-2 fuzzy antecedents and interval type-1 fuzzy consequents), with an adaptive NN. The IT2FNN-2 (Figure 4) layers are described in a similar way to the previous architectures.

3. IT2FNN as a Universal Approximator

Based on the description of the interval type-2 fuzzy neural networks, it is possible to prove that, under certain conditions, the resulting IT2FIS has unlimited approximation power to match any nonlinear function on a compact set [36, 37], using the Stone-Weierstrass theorem [5, 6, 10, 30].

3.1. Stone-Weierstrass Theorem

Theorem 1 (Stone-Weierstrass theorem). Let $Z$ be a set of real continuous functions on a compact set $U$. If (i) $Z$ is an algebra, that is, the set $Z$ is closed under addition, multiplication, and scalar multiplication; (ii) $Z$ separates points on $U$, that is, for every $x, y \in U$, $x \neq y$, there exists $f \in Z$ such that $f(x) \neq f(y)$; and (iii) $Z$ vanishes at no point of $U$, that is, for each $x \in U$ there exists $f \in Z$ such that $f(x) \neq 0$, then the uniform closure of $Z$ consists of all real continuous functions on $U$; that is, $(Z, d_\infty)$ is dense in $(C[U], d_\infty)$ [36–38].

Theorem 2 (universal approximation theorem). For any given real continuous function $g$ on the compact set $U \subset \mathbb{R}^n$ and arbitrary $\varepsilon > 0$, there exists $f \in Z$ such that $\sup_{x \in U} |g(x) - f(x)| < \varepsilon$.
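The approximation error in Theorem 2 is measured in the sup-metric. As a rough numerical illustration, the sketch below estimates the sup-distance between two functions of one variable on a closed interval by taking the maximum over a uniform grid. The function name and the example pair of functions are hypothetical, and a grid maximum only bounds the true supremum from below.

```python
import math


def sup_distance(f, g, a, b, n=1001):
    """Grid estimate of sup over [a, b] of |f(x) - g(x)|."""
    step = (b - a) / (n - 1)
    return max(abs(f(a + i * step) - g(a + i * step)) for i in range(n))


def cubic_sine(x):
    """Third-order Taylor polynomial of sin(x) around 0."""
    return x - x ** 3 / 6.0


if __name__ == "__main__":
    error = sup_distance(math.sin, cubic_sine, -1.0, 1.0)
    print(f"estimated sup-distance on [-1, 1]: {error:.6f}")
```

An approximator satisfies the bound of Theorem 2 for a given tolerance when this distance, taken over the whole compact set, falls below that tolerance.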

3.2. Applying Stone-Weierstrass Theorem to the IT2FNN-0 Architecture

In the IT2FNN-0, the domain on which we operate is almost always compact. It is a standard result in real analysis that every closed and bounded set in $\mathbb{R}^n$ is compact. We now apply the Stone-Weierstrass theorem to show the representational power of the IT2FNN with simplified fuzzy if-then rules. We consider the subset of the IT2FNN-0 shown in Figure 5. The set of IT2FNN-0s with singleton fuzzifier, product inference, center-of-sets type reduction, and Gaussian interval type-2 membership functions consists of all FBF expansion functions of the form (38), (40), where the antecedent membership function is the Gaussian interval type-2 membership function, igaussmtype2, defined by (27) and (31). If we view the normalized firing strengths as fuzzy basis functions (32) and (33) and the rule consequents as linear functions (34), then the output of (38) and (40) can be viewed as a linear combination of the fuzzy basis functions, and the IT2FNN-0 system is then equivalent to an FBF expansion. Let the set of all FBF expansions (38) and (40), with terms given by (13) and (38), be equipped with the sup-metric; then it is a metric space [38]. We use the Stone-Weierstrass theorem (Theorem 1) to prove our result.

Suppose we have two IT2FNN-0s; the output of each system can be expressed as an FBF expansion of the form (38), (40).

Lemma 3. The set of FBF expansions is closed under addition.

Proof. The proof of this lemma requires the IT2FNN-0 to be able to represent sums of functions. Suppose we have two IT2FNN-0s, each with its own number of rules. The output of each system can be expressed as an FBF expansion in which the FBFs are known to be nonlinear. Therefore, an IT2FNN-0 equivalent to the sum of the two systems can be constructed, in which the consequents are sums of the consequents of the two systems, each multiplied by the respective FBF expansion (Theorem 1), and there exists a function in the set that matches the sum within any given tolerance (Theorem 2). Since the sum is again an FBF expansion of the same form, we conclude that the set is closed under addition. Note that the consequents can be linear, since the FBFs form a nonlinear interval basis and therefore the resultant function is a nonlinear interval (see Figure 5).

Lemma 4. The set of FBF expansions is closed under multiplication.

Proof. As in Lemma 3, we model the product of the outputs of two IT2FNN-0s. The product can be expressed as an expansion over all pairs of rules of the two systems. Therefore, an IT2FNN-0 equivalent to the product of the two systems can be constructed, in which the consequents are sums of the pairwise products of the consequents of the two systems, each multiplied by the respective product of FBFs (Theorem 1), and there exists a function in the set that matches the product within any given tolerance (Theorem 2). Since the product is again an FBF expansion of the same form, we conclude that the set is closed under multiplication. Note that the consequents can be linear, since the FBFs form a nonlinear interval basis and therefore the resultant function is a nonlinear interval. Also, even if the two consequents were linear, their product is evidently polynomial (see Figure 5).
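The algebraic identity behind Lemmas 3 and 4 can be checked numerically. The sketch below is only an illustration under simplifying assumptions: it uses a single normalized expansion per system (as for the lower or the upper branch alone), one input, type-1 Gaussian weights, and linear consequents. It verifies that the sum and the product of two such expansions equal a single normalized expansion over all pairs of rules, with pairwise products of weights and combined consequents. All names are hypothetical.

```python
import math


def gauss(x, mean, sigma):
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2)


def expansion(x, rules):
    """Normalized expansion; each rule is (mean, sigma, a, b), consequent a*x + b."""
    weights = [gauss(x, m, s) for m, s, _, _ in rules]
    outputs = [a * x + b for _, _, a, b in rules]
    return sum(w * y for w, y in zip(weights, outputs)) / sum(weights)


def combined(x, rules_1, rules_2, combine):
    """Single expansion over all rule pairs of the two systems."""
    numerator, denominator = 0.0, 0.0
    for m1, s1, a1, b1 in rules_1:
        for m2, s2, a2, b2 in rules_2:
            weight = gauss(x, m1, s1) * gauss(x, m2, s2)
            numerator += weight * combine(a1 * x + b1, a2 * x + b2)
            denominator += weight
    return numerator / denominator


RULES_1 = [(-1.0, 1.0, 0.5, 1.0), (1.0, 0.8, -2.0, 0.3)]
RULES_2 = [(0.0, 1.5, 1.0, -1.0), (2.0, 1.0, 0.2, 0.7), (-2.0, 0.9, 3.0, 0.0)]

if __name__ == "__main__":
    for x in (-1.0, 0.3, 2.0):
        f1, f2 = expansion(x, RULES_1), expansion(x, RULES_2)
        print(f1 + f2, combined(x, RULES_1, RULES_2, lambda u, v: u + v))
        print(f1 * f2, combined(x, RULES_1, RULES_2, lambda u, v: u * v))
```

The combined system has as many rules as there are rule pairs; its consequents are linear for the sum and quadratic for the product, which matches the remark that the product of linear consequents is polynomial.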

Lemma 5. The set of FBF expansions is closed under scalar multiplication.

Proof. Let an arbitrary IT2FNN-0 be given by (20); multiplying its output by a real scalar is equivalent to multiplying each rule consequent by that scalar. Therefore we can construct an IT2FNN-0 that computes the scaled function in the form of the proposed IT2FNN-0, and the set is closed under scalar multiplication.
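The step used in this proof can be written out for a generic FBF expansion. The notation below is assumed for illustration only and is not the paper's own: $M$ rules, basis functions $\phi_k$, linear consequents $y_k$ with coefficients $a_{k,i}$, and a real scalar $c$.

```latex
c\,f(x) \;=\; c \sum_{k=1}^{M} \phi_k(x)\, y_k(x)
        \;=\; \sum_{k=1}^{M} \phi_k(x)\,\bigl(c\, y_k(x)\bigr),
\qquad
c\, y_k(x) \;=\; \sum_{i=1}^{n} (c\, a_{k,i})\, x_i \;+\; c\, a_{k,0}.
```

The scaled consequents are again linear in the inputs, so the scaled function has the same form with the same basis functions.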

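Lemma 6. For every pair of distinct points in the domain, there exists an FBF expansion in the set that takes different values at the two points; that is, the set of FBF expansions separates points on the domain.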

Proof. We prove that separates points on . We prove this by constructing a required (20); that is, we specify such that for arbitrarily given with . We choose two fuzzy rules in the form of (8) for the fuzzy rule base (i.e., ). Let and . If and with , we define two interval type-2 fuzzy sets and with If , then and , ; that is, only one interval type-2 fuzzy set is defined. We define two real value sets and with , where . Now we have specified all the design parameters except ; that is, we have already obtained a function which is in the form of (10) with and given by (18), (20), and (21). With this , we have where where Since , there must be some such that ; hence, we have and . If we choose and , then . Separability is satisfied whenever an IT2FNN-0 can compute strictly monotonic functions of each input variable. This can easily be achieved by adjusting the membership functions of the premise part. Therefore, separates points on .

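Lemma 7. For each point of the domain, there exists an FBF expansion in the set that is nonzero at that point; that is, the set of FBF expansions vanishes at no point of the domain.

Proof. Finally, we prove that the set vanishes at no point of the domain. By observing (8)–(11), (20), and (21), we simply choose all rule consequents to be positive; that is, any FBF expansion with positive consequents serves as the required function.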

Proof of Theorem 2. From (20) and (21), it is evident that the set of FBF expansions is a set of real continuous functions on the compact domain, which are established by using complete interval type-2 fuzzy sets in the IF parts of the fuzzy rules. Using Lemmas 3, 4, and 5, the set is proved to be an algebra. By using the Stone-Weierstrass theorem together with Lemmas 6 and 7, we establish that the proposed IT2FNN-0 possesses the universal approximation capability.

3.3. Applying the Stone-Weierstrass Theorem to the IT2FNN-2 Architecture

We now consider the subset of the IT2FNN-2 shown in Figure 2. The set of IT2FNN-2s with singleton fuzzifier, product inference, KM type-reduction defuzzifier [13, 14], and Gaussian interval type-2 membership functions consists of all FBF expansion functions, where the antecedent membership function is the Gaussian interval type-2 membership function, igaussmtype2, defined by (8)–(11). If we view the four normalized firing-strength terms as basis functions (44), (46), (49), (50) and the rule consequents as linear functions (41), then the output can be viewed as a linear combination of the basis functions. Let the set of all such FBF expansions be equipped with the sup-metric; then it is a metric space [38]. The following lemmas show that this set is dense in the set of all real continuous functions defined on the compact domain. We use the Stone-Weierstrass theorem (Theorem 1) to prove this result.

Suppose we have two IT2FNN-2s; the output of each system can be expressed as an FBF expansion of the form described above.

Lemma 8. The set of FBF expansions is closed under addition.

Proof. The proof of this lemma requires the IT2FNN-2 to be able to represent sums of functions. Suppose we have two IT2FNN-2s, each with its own number of rules. The output of each system can be expressed as an FBF expansion. Therefore, an IT2FNN-2 equivalent to the sum of the two systems can be constructed, in which the consequents are sums of the consequents of the two systems, each multiplied by the respective FBF expansion (Theorem 1), and there exists a function in the set that matches the sum within any given tolerance (Theorem 2). Since the sum is again an FBF expansion of the same form, we conclude that the set is closed under addition. Note that the consequents can be linear intervals, since the FBFs form a nonlinear basis and therefore the resultant function is a nonlinear interval (see Figure 6).

Lemma 9. The set of FBF expansions is closed under multiplication.

Proof. As in Lemma 8, we model the product of the outputs of two IT2FNN-2s, which is the last point we need to demonstrate before we can conclude that the Stone-Weierstrass theorem can be applied to the proposed reasoning mechanism. The product can be expressed as an expansion over all pairs of rules of the two systems. Therefore, an IT2FNN-2 equivalent to the product of the two systems can be constructed, in which the consequents are sums of the pairwise products of the consequents of the two systems, each multiplied by the respective FBF expansion (Theorem 1), and there exists a function in the set that matches the product within any given tolerance (Theorem 2). Since the product is again an FBF expansion of the same form, we conclude that the set is closed under multiplication. Note that the consequents can be linear intervals, since the FBFs form a nonlinear interval basis and therefore the resultant function is a nonlinear interval. Also, even if the two consequents were linear, their product is evidently a polynomial interval (see Figure 10).

Lemma 10. The set of FBF expansions is closed under scalar multiplication.

Proof. Let an arbitrary IT2FNN-2 be given by (51); multiplying its output by a real scalar is equivalent to multiplying each rule consequent by that scalar. Therefore we can construct an IT2FNN-2 that computes the scaled FBF expansion in the form of the proposed IT2FNN-2, and the set is closed under scalar multiplication.

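Lemma 11. For every pair of distinct points in the domain, there exists an FBF expansion in the set that takes different values at the two points; that is, the set of FBF expansions separates points on the domain.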

We prove this by constructing a required ; that is, we specify such that for arbitrarily given with . We choose two fuzzy rules in the form of (8) for the fuzzy rule base (i.e., ). Let and . If and with , we define two interval type-2 fuzzy sets and with If , then and , ; that is, only one interval type-2 fuzzy set is defined. We define two interval value real sets and . Now we have specified all the design parameters except ; that is, we have already obtained a function which is in the form of (20), (21) with and given by (8)–(11). With this , we have where where Since , there must be some such that ; hence, we have and . If we choose and , then . Therefore, separates point on .

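Lemma 12. For each point of the domain, there exists an FBF expansion in the set that is nonzero at that point; that is, the set of FBF expansions vanishes at no point of the domain.

Proof. Finally, we prove that the set vanishes at no point of the domain. By observing (8)–(11), (20), and (21), we simply choose all rule consequents to be positive; that is, any FBF expansion with positive consequents serves as the required function.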

Proof of Theorem 2. From (20) and (21), it is evident that the set of FBF expansions is a set of real continuous functions on the compact domain, which are established by using complete interval type-2 fuzzy sets in the IF parts of the fuzzy rules. Using Lemmas 8, 9, and 10, the set is proved to be an algebra. By using the Stone-Weierstrass theorem together with Lemmas 11 and 12, we establish that the proposed IT2FNN-2 possesses the universal approximation capability.

Therefore, by choosing an appropriate class of interval type-2 membership functions, we can conclude that the IT2FNN-0 and IT2FNN-2 with simplified fuzzy if-then rules satisfy the five criteria of the Stone-Weierstrass theorem.

4. Application Examples

In this section the results from simulations using ANFIS, IT2FNN-0, IT2FNN-1 [35], IT2FNN-2, and IT2FNN-3 [35] are presented for nonlinear system identification and for forecasting the Mackey-Glass chaotic time series [39], with different signal-to-noise ratio values, SNR (dB) = 0, 10, 20, 30, and noise-free, as the source of uncertainty. These examples are used as benchmark problems to test the ideas proposed in the paper. The IT2FNN-1 and IT2FNN-3 architectures are very similar to IT2FNN-0 and IT2FNN-2, respectively [35], and their results are presented for comparison purposes. The proposed IT2FNN architectures are validated using 10-fold cross-validation [40, 41], considering the sum of squared errors (SSE) or the root mean square error (RMSE) in the training or test phase. We use cross-validation to measure the variability of the RMSE in the training and testing phases in order to compare the IT2FNN network architectures. The cross-validation procedure is carried out using Matlab's crossvalind function. Noise is added with Matlab's awgn function.
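For readers without Matlab, the way noise is scaled to a target SNR can be sketched as follows. This is not the awgn function itself; it is a minimal stand-in that assumes the signal power is measured from the data and that the noise is zero-mean Gaussian. The function name and arguments are hypothetical.

```python
import math
import random


def add_noise(signal, snr_db, rng=None):
    """Add zero-mean Gaussian noise so that the SNR is about snr_db decibels."""
    rng = rng or random.Random(0)
    signal_power = sum(s * s for s in signal) / len(signal)
    # SNR(dB) = 10 * log10(signal_power / noise_power)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise_std = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, noise_std) for s in signal]


def measured_snr_db(clean, noisy):
    """Empirical SNR in decibels of a noisy copy of a clean signal."""
    signal_power = sum(c * c for c in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10.0 * math.log10(signal_power / noise_power)


if __name__ == "__main__":
    clean = [math.sin(0.01 * i) for i in range(20000)]
    for snr in (0, 10, 20, 30):
        noisy = add_noise(clean, snr, random.Random(1))
        print(snr, round(measured_snr_db(clean, noisy), 2))
```

At 0 dB the noise power equals the signal power, so lower SNR values correspond to the harder, more uncertain cases.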

In $k$-fold cross-validation [40], the original sample is randomly partitioned into $k$ subsamples. Of the $k$ subsamples, a single subsample is retained as the validation data for testing the model, and the remaining $k - 1$ subsamples are used as training data. The cross-validation process is then repeated $k$ times (the folds), with each of the $k$ subsamples used exactly once as the validation data. The $k$ results from the folds can then be averaged (or otherwise combined) to produce a single estimate. The advantage of this method over repeated random subsampling is that all observations are used for both training and validation, and each observation is used for validation exactly once. 10-fold cross-validation is commonly used. Three application examples are used to illustrate the proofs of universality, as follows.
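The partitioning described above can be sketched in a few lines. This is a minimal stand-in for illustration, not Matlab's crossvalind; the function names are hypothetical, and the RMSE helper is included only because it is the statistic reported in the experiments.

```python
import math
import random


def k_fold_splits(n_samples, k, seed=0):
    """Return a list of (train_indices, test_indices), one pair per fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


def rmse(targets, predictions):
    """Root mean square error between two equally long sequences."""
    squared = [(t - p) ** 2 for t, p in zip(targets, predictions)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    for train, test in k_fold_splits(50, 10):
        print(len(train), len(test))
```

In an experiment, a model would be trained on each train split and scored with rmse on the matching test split, and the ten scores would then be averaged.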

Experiment 1 (identification of a one-variable nonlinear function). In this experiment we approximate a nonlinear function with an added uniform noise component, using a one-input one-output IT2FNN, 50 training data sets with 10-fold cross-validation and uniform noise levels, six IT2MFs of type igaussmtype2, 6 rules, and 50 epochs. Once the ANFIS and IT2FNN models are identified, a comparison is made, taking into account the RMSE statistic values with 10-fold cross-validation. Table 1 and Figure 7 show the resulting RMSE (CHK) values for ANFIS and IT2FNN; it can be seen that the IT2FNN architectures [31] perform better than ANFIS.

Experiment 2 (identification of a three-variable nonlinear function). A three-input one-output IT2FNN is used to approximate the nonlinear Sugeno function [27]. 216 training data sets are generated with 10-fold cross-validation, and 125 are used for tests, with 2 igaussmtype2 IT2MFs for each input, 8 rules, and 50 epochs. Once the ANFIS and IT2FNN models are identified, a comparison is made with the RMSE statistic values and 10-fold cross-validation. Table 2 and Figure 8 show the resulting RMSE (CHK) values for ANFIS and IT2FNN. It can be seen that the IT2FNN architectures [31] perform better than ANFIS.

Experiment 3 (predicting the Mackey-Glass chaotic time series). The Mackey-Glass chaotic time series is a well-known benchmark [39] for systems modeling. 1200 data sets are generated from fixed initial conditions using the fourth-order Runge-Kutta method, with different levels of uniform noise added. For comparison with other methods, an input-output vector of past and future samples is chosen for the IT2FNN model. A four-input one-output IT2FNN model is used for the Mackey-Glass chaotic time series prediction, choosing 500 data sets for training and 500 data sets for testing with a 10-fold cross-validation test, 2 IT2MFs of type igaussmtype2 for each input, 16 rules, and 50 epochs. The ANFIS and IT2FNN models are identified, and their RMSE statistic values are compared with 10-fold cross-validation. Table 3 and Figures 9 and 10 show the number of points outside the uncertainty interval evaluated by the IT2FNN model, and the RMSE values for training (TRN) and test (CHK) obtained for the ANFIS and IT2FNN models. It can be seen that the IT2FNN architectures predict the Mackey-Glass chaotic time series better.
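The data generation step can be sketched as follows. All numerical settings here are assumptions taken from the usual form of this benchmark rather than from this paper: the equation dx/dt = 0.2 x(t - tau) / (1 + x(t - tau)^10) - 0.1 x(t), delay tau = 17, x(0) = 1.2, x(t) = 0 for t < 0, an integration step of 0.1, and the pattern format [x(t-18), x(t-12), x(t-6), x(t); x(t+6)]. The delayed term is held constant within each Runge-Kutta step, which is a common simplification.

```python
import math


def mackey_glass(n_samples, tau=17, x0=1.2, dt=0.1):
    """Return x(0), x(1), ..., x(n_samples - 1) of the Mackey-Glass series."""
    steps_per_unit = int(round(1.0 / dt))
    delay_steps = tau * steps_per_unit
    history = [0.0] * delay_steps + [x0]  # assumed x(t) = 0 for t < 0

    def derivative(x, x_delayed):
        return 0.2 * x_delayed / (1.0 + x_delayed ** 10) - 0.1 * x

    for _ in range((n_samples - 1) * steps_per_unit):
        x = history[-1]
        x_delayed = history[-1 - delay_steps]
        # Fourth-order Runge-Kutta step with the delayed term held fixed.
        k1 = derivative(x, x_delayed)
        k2 = derivative(x + 0.5 * dt * k1, x_delayed)
        k3 = derivative(x + 0.5 * dt * k2, x_delayed)
        k4 = derivative(x + dt * k3, x_delayed)
        history.append(x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0)
    return history[delay_steps::steps_per_unit]


def make_patterns(series):
    """Build ([x(t-18), x(t-12), x(t-6), x(t)], x(t+6)) pairs."""
    return [
        ([series[t - 18], series[t - 12], series[t - 6], series[t]], series[t + 6])
        for t in range(18, len(series) - 6)
    ]


if __name__ == "__main__":
    series = mackey_glass(1200)
    patterns = make_patterns(series)
    print(len(series), len(patterns))
```

Noise at a chosen level would be added to the generated series before the patterns are split into training and test sets.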

5. Conclusions

In this paper we have shown that an interval type-2 fuzzy neural network (IT2FNN) is a universal approximator. Simulation results of nonlinear function identification using the IT2FNN for one and three variables and for the Mackey-Glass chaotic time series prediction have been presented to illustrate the theoretical result. In these experiments, the estimated RMSE values for nonlinear function identification with 10-fold cross-validation for the hybrid architectures (IT2FNN-2:A2C0 and IT2FNN-2:A2C1) illustrate the proof, based on the Stone-Weierstrass theorem, that they are universal approximators for efficient identification of nonlinear functions. Also, it can be seen that, as the signal-to-noise ratio (SNR) increases, the IT2FNN architectures handle uncertainty more efficiently. We have also illustrated the ideas presented in the paper with the benchmark problem of Mackey-Glass chaotic time series prediction.

Acknowledgments

The authors would like to thank CONACYT and DGEST for the financial support given for this research project. The student J. R. Castro was supported by a scholarship from MYCDI, UABC-CONACYT.