Abstract

Continuity of power supply is of utmost importance to consumers and is possible only through the coordinated and reliable operation of power system components. The power transformer is a prime piece of equipment in the transmission and distribution system and needs to be monitored continuously for its well-being. Since ratio methods cannot provide a correct diagnosis because of borderline problems and the probability of multiple faults, artificial intelligence could be the best approach. Dissolved gas analysis (DGA) interpretation may provide an insight into developing incipient faults and is adopted as the preliminary diagnosis tool. In the proposed work, the diagnosis ability of backpropagation (BP) and radial basis function (RBF) neural networks and of the adaptive neurofuzzy inference system (ANFIS) is compared, and the diagnosis results are presented in terms of error measure, accuracy, network training time, and number of iterations.

1. Introduction

The power transformer is a costly element of prime importance in the power system, and the reliability of the system depends upon its well-being. Close and continuous monitoring and maintenance help to maintain its service condition. Thermal and electrical stresses can cause incipient faults, which may further lead to failure of the equipment; fault detection at an early stage can save the equipment. The important tool for diagnosing such faults is DGA. Rogers ratio, Doernenburg ratio, IEC ratio, and Duval triangle are some of the standards established for diagnosis. The ratio methods are based on single fault prediction, so in situations with multiple faults the diagnosis becomes erroneous. Among the existing methods for identifying incipient faults, DGA is the most popular and successful [13]. When a fault such as overheating or discharge occurs inside the transformer, it produces a corresponding characteristic amount of gases in the transformer oil. This is the underlying principle of DGA. Through the analysis of the concentrations of the dissolved gases, their gassing rates, and the ratios of certain gases, the DGA method can determine the type of fault in the transformer. The commonly collected and analyzed gases are H2, CH4, C2H2, C2H4, C2H6, CO2, and CO. An ANSI/IEEE standard and IEC publication 599 [4, 5] describe three DGA approaches: (1) the key gas method; (2) the Rogers ratio method; and (3) the Doernenburg ratio method. All three methods are computationally straightforward. However, in some cases these methods provide erroneous diagnoses or no conclusion about the fault type. The key gas method provides the basis for the qualitative determination of fault types from the gases that are typical or predominant at various temperatures.
Now, if the fault is very severe, then all of the gas concentrations will be high, yet insufficient to register a fault when using the values specified in the IEEE standard [2]. Also, the gas ratios obtained for a particular transformer sample may not fall within the ANSI/IEEE-specified ranges, leading to the failure of the ratio methods for transformer diagnosis [6]. In recent years, many researchers have studied the application of artificial intelligence, such as neural networks and fuzzy set theory, to increase diagnosis accuracy [6–15]. Fuzzy systems, though good at handling uncertainties, cannot learn from previous diagnosis results and, hence, are not able to adjust the diagnostic rules automatically [10–13]. To account for uncertainties, artificial neural networks (ANNs) have been proposed to diagnose transformer faults because of their superior learning capabilities [6–9]. In general, fuzzy systems and neural networks deal efficiently with two different areas of information processing. Fuzzy systems are good at various aspects of uncertain knowledge representation, while neural networks are efficient structures that are capable of learning from examples. Both techniques complement each other. The generalized regression neural network was used in [14], but since this network is a one-pass network, its efficiency for fault detection is somewhat low. An application of fuzzy clustering and a radial basis function neural network has been reported [15]; however, when one type of fault is in the neighborhood of another type of fault, the chances of false diagnosis may increase.

In this paper, transformer fault diagnosis using supervised neural networks and ANFIS is investigated. In the initial work, the diagnosis was carried out using backpropagation (BP) and radial basis function (RBF) neural networks, which belong to the category of supervised networks and are presented in Section 2; at the later stage, the diagnosis was carried out using the TSK model of ANFIS, presented in Section 3. Section 4 provides the diagnosis results obtained by all of the methods listed above.

2. Supervised Neural Networks and Training Algorithms

2.1. Feed-Forward or Backpropagation (BP) Network

In neural networks such as the backpropagation (feed-forward) and radial basis function (RBF) networks, training is supervised: the desired result is known for the samples in the training data. The backpropagation algorithm supports batch training, in which the samples are presented as a batch before the weights are updated, and incremental training, in which the weights are updated as each sample is presented. Many algorithms, namely, gradient descent, gradient descent with momentum, conjugate gradient, quasi-Newton, and the reduced-memory Levenberg-Marquardt algorithm, are available in the neural network toolbox. In this paper, the on-line or adaptive Levenberg-Marquardt algorithm, which is fast and consumes less memory, is used for feed-forward neural network learning.

Network design includes selection of the input, hidden, and output layers, the network topology, and the weighted connections between nodes. The corresponding connection weights are also determined in the process.

Figure 1 presents the artificial neural network used in fault diagnosis of power transformers; it has a three-layer feed-forward structure with input, hidden, and output layers. Only one hidden layer is shown to illustrate the architecture; the designed network, however, has three hidden layers. The nodes in each layer receive input signals from the previous layer and pass their output to the subsequent layer. The nodes of the input layer receive a set of input signals from the outside system and deliver the input data directly to the hidden layer through the weighted links. The network is designed for seven inputs, the gas concentrations, and one output corresponding to the fault. Three hidden layers consisting of 7-7-1 neurons are selected for better design, so as to reveal the hidden relationship between faults and gas composition.
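As a rough illustration of how signals pass from layer to layer in such a feed-forward network, the following Python sketch runs a forward pass for seven inputs and one output. The layer sizes `[7, 7, 7, 1]` are one possible reading of the "7-7-1" description, and the tanh hidden activation, linear output, random initial weights, and the names `init_layers` and `forward` are all assumptions made for illustration; the paper's own network was built in the MATLAB toolbox.

```python
import numpy as np

def init_layers(sizes, seed=0):
    """Create (weights, bias) pairs for consecutive layer sizes (illustrative init)."""
    rng = np.random.default_rng(seed)
    return [(rng.normal(0.0, 0.5, (n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def forward(X, layers):
    """Propagate inputs layer by layer: tanh on hidden layers, linear output."""
    a = X
    for k, (W, b) in enumerate(layers):
        z = a @ W + b
        a = z if k == len(layers) - 1 else np.tanh(z)
    return a

# Assumed topology: 7 gas-concentration inputs -> 7 -> 7 -> 1 fault output.
sizes = [7, 7, 7, 1]
layers = init_layers(sizes)

# Four placeholder samples of seven scaled gas concentrations (not real DGA data).
X = np.random.default_rng(1).random((4, 7))
y = forward(X, layers)
```

Each row of `y` is the untrained network's single output for one sample; training would adjust the weights in `layers`.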

2.2. Levenberg-Marquardt Algorithm

The proposed network is then trained and tested using the Levenberg-Marquardt algorithm. This algorithm needs less memory space and is faster in operation than gradient descent and other algorithms. Each learning iteration (epoch) consists of the following basic steps:
(1) compute the Jacobian matrix $J$ (by using finite differences or the chain rule);
(2) compute the error gradient $g = J^{T} e$;
(3) approximate the Hessian using the cross product Jacobian, $H = J^{T} J$;
(4) solve $(H + \lambda I)\,\delta = g$ to find $\delta$;
(5) update the network weights $w$ using $\delta$;
(6) recalculate the sum of squared errors using the updated weights;
(7) if the sum of squared errors has not decreased, discard the new weights, increase $\lambda$ using 5, and go to step 4;
(8) else, decrease $\lambda$ using 5 and stop,
where $e$ is the vector of network errors, $\lambda$ is the damping or scaling factor, $I$ is the identity matrix, and $\delta$ is the increment at each iteration.
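The learning steps above can be sketched in a few lines of Python. This is a minimal illustration on a toy two-parameter curve fit rather than the transformer network itself. The model `w0 * exp(w1 * x)`, the starting damping value, the adjustment factor of 10, the upper bound on the damping, and all function names are assumptions chosen for the example.

```python
import numpy as np

def residuals(w, x, t):
    # Toy model (assumption): y = w0 * exp(w1 * x); error vector e = y - t.
    return w[0] * np.exp(w[1] * x) - t

def jacobian(w, x):
    # Jacobian of the error vector with respect to the weights (chain rule).
    ex = np.exp(w[1] * x)
    return np.column_stack([ex, w[0] * x * ex])

def levenberg_marquardt(w, x, t, lam=1e-2, factor=10.0, epochs=50):
    sse = np.sum(residuals(w, x, t) ** 2)
    for _ in range(epochs):
        J = jacobian(w, x)              # Jacobian matrix
        e = residuals(w, x, t)
        g = J.T @ e                     # error gradient
        H = J.T @ J                     # cross-product approximation of the Hessian
        improved = False
        while lam <= 1e12:
            # Solve the damped system for the weight increment.
            delta = np.linalg.solve(H + lam * np.eye(len(w)), g)
            w_new = w - delta
            sse_new = np.sum(residuals(w_new, x, t) ** 2)
            if sse_new < sse:
                # Error decreased: keep the weights and relax the damping.
                w, sse, lam = w_new, sse_new, lam / factor
                improved = True
                break
            # Error did not decrease: discard the weights, increase the damping.
            lam *= factor
        if not improved:
            break
    return w, sse

x = np.linspace(0.0, 1.0, 20)
t = 2.0 * np.exp(-1.5 * x)              # synthetic targets, true weights (2, -1.5)
w_fit, sse_fit = levenberg_marquardt(np.array([1.0, 0.0]), x, t)
```

A small damping value makes each step close to a Gauss-Newton step, while a large value shortens the step toward plain gradient descent, which is why the damping is raised only after a rejected update.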

2.3. Radial Basis Function (RBF) Network

As shown in Figure 2, the network consists of three layers (input, hidden, and output). The input layer is made up of nodes that connect the network to its environment. At the input of each hidden-layer neuron, the distance between the neuron center and the input vector is calculated, and a Gaussian bell function is applied to it to form the output of the neuron. The output layer is linear and supplies the response of the network to the activation pattern. Selection of the radial basis function width parameter and of the number of radial basis neurons in the hidden layer is an important step. A larger width results in a smaller network and faster execution. The maximum number of neurons may be the number of inputs, but the minimum number can be determined experimentally [10]. The network structure depends solely upon the number of neurons in the hidden layer. Training the network with the specified performance parameters yields the number of neurons and the diagnosis error. Learning strategies include learning of the centers, the spreads, and the output layer weights. Centers can be fixed randomly, self-organized, or selected in a supervised manner.

Clustering also can be performed in self-organized learning. Supervised learning of RBF network is performed using least mean square (LMS) algorithm. RBF training with supervised selection of centers and spread is done by using the following equations.

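Output layer weights (linear weights):

$$w_i(n+1) = w_i(n) - \eta_1 \frac{\partial E(n)}{\partial w_i(n)}$$

Position of centers is given by

$$t_i(n+1) = t_i(n) - \eta_2 \frac{\partial E(n)}{\partial t_i(n)}$$

Spreads of centers (hidden layer):

$$\Sigma_i^{-1}(n+1) = \Sigma_i^{-1}(n) - \eta_3 \frac{\partial E(n)}{\partial \Sigma_i^{-1}(n)}$$

where $n$ is the iteration index, $w_i$ is the weight vector, $\Sigma_i$ is an $m \times m$ matrix, and "$m$" is the feature dimension. $\eta_1$, $\eta_2$, and $\eta_3$ are the step sizes. $\partial E(n)/\partial w_i(n)$ is the change in error with respect to the weight at each iteration. $\partial E(n)/\partial t_i(n)$ is the change in error with respect to the centre.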

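For the linear combination of the basis functions, $y(x) = \sum_{i} w_i \varphi_i(x)$ is used. Here $\varphi_i$ is a Gaussian function:

$$\varphi_i(x) = \exp\left(-\frac{\lVert x - t_i \rVert^{2}}{2\sigma_i^{2}}\right)$$

where $t_i$ is the centre vector of a region, $x$ is an input vector, and $\sigma_i$ is the radius or width of the receptive field.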

The sum squared error to be minimized between the actual output and the target is given by the following equation:

$$E = \sum_{j} (d_j - y_j)^{2}$$

where "$d_j$" is the desired output and "$y_j$" is the network output.
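To make the roles of the Gaussian hidden layer, the linear output layer, and the sum squared error concrete, the following Python sketch evaluates a small RBF network and applies a batch gradient (LMS-style) update to the output weights only. The Gaussian form with $2\sigma^2$ in the denominator, the fixed centers taken from the first ten samples, the width of 0.8, the step size, and the random placeholder data are all assumptions for illustration. Supervised adjustment of the centers and spreads is not shown.

```python
import numpy as np

def rbf_forward(X, centers, widths, weights):
    """Gaussian hidden layer followed by a linear output layer."""
    # Squared Euclidean distance between every input and every center.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    phi = np.exp(-d2 / (2.0 * widths ** 2))
    return phi @ weights, phi

def sum_squared_error(d, y):
    return float(np.sum((d - y) ** 2))

rng = np.random.default_rng(0)
X = rng.random((30, 7))          # placeholder inputs: 30 samples, 7 features
d = X.sum(axis=1)                # placeholder targets, not real fault labels
centers = X[:10].copy()          # assumption: centers fixed at ten of the samples
widths = np.full(10, 0.8)        # assumption: one common width
weights = np.zeros(10)
eta = 0.005                      # step size for the output-layer weights

y, phi = rbf_forward(X, centers, widths, weights)
sse_before = sum_squared_error(d, y)

for _ in range(200):
    y, phi = rbf_forward(X, centers, widths, weights)
    # Gradient step on the output weights, proportional to -dE/dw.
    weights += eta * phi.T @ (d - y)

y, phi = rbf_forward(X, centers, widths, weights)
sse_after = sum_squared_error(d, y)
```

Because the output layer is linear in its weights, this update minimizes a quadratic error surface, which is one reason the output weights of an RBF network are comparatively easy to train.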

In [12], an OLS-based RBFNN is proposed to optimize the parameters of the network for transformer fault diagnosis. The authors selected sufficient training exemplars from previous literature, and the performance of the network in terms of misclassification and hidden neurons is presented. A method based on the k-means clustering algorithm and an RBF neural network is proposed in [13], with an accuracy of 82.2% and 78 neurons in the hidden layer, using a database drawn from research papers. An SOM cell splitting algorithm is used to obtain the optimal architecture of an RBF network for fault classification of power transformers [14].

3. Incipient Fault Diagnosis Using ANFIS

3.1. Fuzzy Inference System (FIS)

It is generally difficult to determine the hidden relationship between the gas concentrations and the fault type. Fuzzy set theory can be used to handle this type of uncertainty. In the proposed methodology, the gas concentrations are classified by range as low (L), medium (M), and high (H). The bell-shaped membership function is used for all input gases, and fuzzy inference rules are then developed. The FIS consists of antecedent (if) and consequent (then) parts, and the rules are of the following form.

If MH = M and AE = M and EE = L and EM = H, then the condition —Rule  1.

Similarly, using the same gas ratios with linguistic variables other than those defined in Rule 1, many such rules can be formulated as per the experience of the researcher. However, using the concentrations of 5 prominent gases with assigned linguistic variables and membership functions, various rules can be generated.

Using the max/min composition, the fuzzy inference, that is, the antecedent, consists of rules as shown below....The consequent part will specify the fault condition:Condition ;Condition .

ANFIS combines the best features of fuzzy systems and neural networks. The fuzzy system represents prior knowledge as a set of constraints, that is, a network topology, which reduces the optimization search space; the neural network contributes the adaptation of backpropagation to a structured network, which automates the parametric tuning of the fuzzy controller. Fuzzy inference is the actual process of mapping from a given input to an output using fuzzy logic. The process involves membership functions for input and output, fuzzy logic operators, and if-then rules. The architecture of the fuzzy inference system is shown in Figure 3.

The process involves fuzzification, an inference engine (the rules), and defuzzification. The crisp inputs are fuzzified into the range from 0 to 1, using different membership functions with values for each linguistic label [15]. In [16], a fuzzy logic system built using the International Electrotechnical Commission (IEC) Code, Central Electricity Generating Board (CEGB), and American Society for Testing and Materials (ASTM) standards is proposed as a case study on DGA data of a power transformer, in which crisp logic and fuzzy logic are used to interpret the fault type.

The input feature selection is based on competitive learning and a neural fuzzy model in which the fuzzy rule base for fault identification was designed by applying the subtractive clustering method, which is very good at handling noisy input data [17]. That approach was verified by testing on standard and practical data and, using the radius parameter in subtractive clustering, was shown to be an efficient method with 96.7% diagnosis accuracy as compared to Rogers ratio and other neural fuzzy techniques.

The most important methods used in FIS are the Mamdani and Takagi-Sugeno-Kang (TSK) methods. The main difference lies in the consequent of the fuzzy rules. In the proposed work, the TSK method of FIS has been used in the fuzzy toolbox of MATLAB, in which the fuzzy rules are generated from the input-output dataset of 563 power transformer oil samples.

The TSK model combines fuzzy sets in the antecedents with a crisp function in the output:
if $x$ is $A$ and $y$ is $B$, then $z = f(x, y)$;
if $x$ is small, then $y = f_1(x)$;
if $x$ is medium, then $y = f_2(x)$;
if $x$ is large, then $y = f_3(x)$.

Here $A$ and $B$ are the fuzzy sets in the antecedent, while $z = f(x, y)$ is a crisp function in the consequent. $f(x, y)$ is the polynomial in the input variables $x$ and $y$. Small, medium, and large are the fuzzy sets with the membership functions used in the present work.

In the architecture of the TSK ANFIS model, five layers are available, each performing a different function. In layers 1 and 4, the nodes are adaptive and are represented by node functions. In layers 2, 3, and 5, the nodes are fixed. The overall output, computed as the sum of all incoming signals at layer 5, is given by

$$O = \sum_{i} \bar{w}_i f_i$$

where $\bar{w}_i$ is the normalized firing strength from layer 3 and $f_i$ is the output of the $i$th rule.
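The five-layer computation described above can be traced by hand for a single input with three fuzzy sets. The following Python sketch is only an illustration: the generalized bell parameters, the first-order consequent coefficients, and the names `gbell` and `tsk_output` are assumed values and labels, not those identified from the DGA data.

```python
import numpy as np

def gbell(x, a, b, c):
    """Generalized bell membership function with width a, slope b, center c."""
    return 1.0 / (1.0 + np.abs((x - c) / a) ** (2 * b))

def tsk_output(x, mf_params, consequents):
    # Layer 1 (adaptive): membership grade of x in each fuzzy set.
    mu = np.array([gbell(x, *p) for p in mf_params])
    # Layer 2 (fixed): firing strength; with one input it equals the grade.
    w = mu
    # Layer 3 (fixed): normalized firing strengths.
    w_bar = w / w.sum()
    # Layer 4 (adaptive): first-order rule outputs f_i = p_i * x + r_i.
    f = np.array([p * x + r for p, r in consequents])
    # Layer 5 (fixed): overall output as the weighted sum of rule outputs.
    return float(np.sum(w_bar * f)), w_bar, f

# Illustrative parameters for "small", "medium", and "large" on a 0..1 input.
mf_params = [(0.25, 2, 0.0), (0.25, 2, 0.5), (0.25, 2, 1.0)]
consequents = [(0.1, 6.4), (-0.5, 4.0), (1.0, -2.0)]

out, w_bar, f = tsk_output(0.5, mf_params, consequents)
```

Since the normalized firing strengths sum to one, the output is always a weighted average of the individual rule outputs, so it cannot fall outside their range.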

3.2. ANFIS Structure

The details of the network structure used in the proposed work are as follows.

It uses 7 input gas concentrations with g-bell membership functions and one output with a linear transfer function. The input-output relationship is developed using fuzzy logic, and the inference regarding the particular fault is obtained. The generated Sugeno (TSK) FIS structure is as follows.

fisNet =
    name: 'ANFIS'
    type: 'Sugeno'
    and method: 'prod'
    or method: 'max'
    defuzz. method: 'wtaver'
    imp. method: 'prod'
    agg. method: 'max'
    input: [1×7 struct]
    output: [1×1 struct]
    rule: [1×2187 struct]

4. Results and Discussion

In Section 2 the architecture, design and algorithms used for training BP and RBF network are discussed in detail. In Section 3, the FIS methodology based on the gas ratios or the concentration of gases is highlighted. Generated FIS with input, output and rule structure is also presented.

In this diagnosis, eight faulty conditions, namely, arcing, corona, low energy discharge (D1), high energy discharge (D2), thermal fault of temperature of 150–300 degrees Celsius, thermal fault of temperature between 300 and 700 degrees Celsius, thermal fault of temperature >700 degrees Celsius, and corona with solid insulation degradation, and normal or healthy condition are considered. The incipient fault conditions are based on the energy and temperature at which the seven prominent gases such as H2, CH4, C2H6, C2H4, C2H2, CO and CO2 evolved. Generally CO and CO2 are responsible for solid insulation degradation. The chances of failure of equipment due to solid insulation degradation are less; hence five gases are enough to make the final diagnosis. But all the gas concentrations are considered and the additional combinational fault, that is, corona with solid insulation degradation is given due consideration.

DGA interpretation is used as the main basis for dealing with all the faulty conditions. A total of 563 DGA samples of power transformers from a reputed ISO-certified testing unit were used in the database. Of the 563 samples, 40%, 30%, and 30% were used for training, testing, and model validation, respectively. The network structure and the diagnosis results, with error as the performance measure, were carefully studied, and the comparative performance of the networks is presented.
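A 40/30/30 partition of 563 samples does not divide evenly, so some rounding is needed. The Python sketch below shows one way to make such a split. The random shuffling, the rounding rule, the seed, and the function name `split_indices` are assumptions, since the paper does not state how the samples were assigned.

```python
import numpy as np

def split_indices(n_samples, fractions=(0.4, 0.3, 0.3), seed=0):
    """Shuffle sample indices and cut them into train/test/validation parts."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_train = int(round(fractions[0] * n_samples))
    n_test = int(round(fractions[1] * n_samples))
    train = order[:n_train]
    test = order[n_train:n_train + n_test]
    val = order[n_train + n_test:]      # remainder goes to validation
    return train, test, val

train_idx, test_idx, val_idx = split_indices(563)
```

Under this rounding rule the three parts contain 225, 169, and 169 samples, with no index shared between them.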

4.1. Diagnosis by Feed-Forward (Backpropagation) Network

Figures 4 and 5 show the details of the architecture, the training, testing, and validation of the model, and the regression analysis. The network performs well in training, where the performance at epoch number 388 was 0.0445; however, the model validation is not as good. It takes about 5000 iterations for the network performance, in terms of mean squared error (MSE), to reach 0.056, and the training time is of the order of hours, as shown in Figure 4. The accuracy in diagnosis during training, testing, and validation was 99.55, 99.11, and 94.4%, respectively. The features revealed during the diagnosis with this 7-7-1 network are better performance in terms of MSE and fewer neurons in the hidden layer, implying less memory space, but the execution time during validation of the model is too long. This indicates that the network is not suitable for diagnosis on the given DGA samples.

4.2. Diagnosis by RBF Neural Network

Details of the network structure and the algorithm used in training are discussed in the previous section. Optimum performance of the network was observed at epoch number 135, where the sum squared error (SSE) was finally reduced to 0.015, as shown in Figure 8, indicating 98.78% true diagnosis accuracy. The network takes less time for execution, but the number of neurons in the hidden layer, as finally determined during the experimentation, was 135. With more samples in the database, more neurons were used, and hence better accuracy in diagnosis was obtained. The network performance seems to be superior for this problem. Network training in terms of SSE, number of neurons, and epochs is shown in Figure 6, and the error between the actual network output and the target and the performance curve are shown in Figures 7 and 8.

4.3. Diagnosis by ANFIS

The ANFIS was trained with the number of epochs set to 3000 and the error goal set to 0. ANFIS training on the 563 transformer oil samples was performed using 3 g-bell membership functions for each input and a linear function for the output. The "and" method is used for the inputs and weighted averaging for the output; weighted averaging is also used for defuzzification. The TSK model was trained on 40% of the samples, and testing and validation were each performed on 30% of the samples. The trained ANFIS provided a diagnosis with a root mean squared error (RMSE) of 0.28, indicating an accuracy of 93.83%, and the best validation performance was obtained with an accuracy of 92.33%.

The input membership functions are shown in Figure 9. The ANFIS module with 7 inputs, 1 output, and 2187 rules automatically generated by the trained system model is shown in Figure 10. Since there are 7 input parameters and 3 membership functions are used for each, the number of rules generated is $3^7 = 2187$. The rule viewer is shown in Figure 11 with the rules for the 7 inputs and the output of the system. Figure 12 shows the performance curve with the root mean square error (RMSE) and the number of epochs during training. It has been observed that the trained ANFIS provides better performance at epoch number 3000. The FIS used in this work generates many rules, some of which may be redundant, but the performance is reasonably good. Other membership functions, for example, triangular, trapezoidal, and sigmoid, were also tried, but the diagnosis error was too high, so those results are not included in the present study and only the results for the single membership function are presented.
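The rule count follows from grid partitioning: every combination of one membership function per input forms one rule. The short Python sketch below enumerates those combinations for 7 inputs with 3 labels each. The label names `L`, `M`, and `H` are borrowed from the linguistic variables in Section 3.1 purely for illustration.

```python
from itertools import product

labels = ("L", "M", "H")     # three membership functions per input
n_inputs = 7                 # seven gas concentrations

# One rule antecedent per combination of labels across all inputs.
rules = list(product(labels, repeat=n_inputs))
n_rules = len(rules)
```

The count grows exponentially with the number of inputs, which is consistent with the remark that some of the generated rules may be redundant.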

The comparative diagnosis performance of the methods is shown in Table 3. To overcome the drawbacks of neural networks as stated earlier, ANFIS could have been the best choice. The ANFIS is slow in convergence as compared to RBF and occupies more memory space but since it possesses the advantages of both least square and gradient descent, better performance is revealed during the investigation.

4.4. Diagnosis Using Rogers Ratio Method

It uses 4 gas ratios such as CH4/H2, C2H6/CH4, C2H4/C2H6, and C2H2/C2H4 and has been coded as , , , and , respectively, and the ranges of ratios are shown in Table 1.

Fault diagnosis suggested based upon the gas ratios is shown in Table 2.

MATLAB code has been used to match the ratio codes with the related fault. Of the available 563 samples, 55 DGA samples were used, and only 29 were classified correctly, an accuracy of 52.73%. This method is not very accurate and in many cases gives no diagnosis; it is not able to cover the entire range of the input space.
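The four gas ratios named above, and the accuracy figure quoted for this method, can be reproduced with a few lines of Python. This sketch only computes the ratios; the code ranges of Table 1 and the fault assignments of Table 2 are not reproduced here, and the function names and the example concentrations are hypothetical.

```python
def rogers_ratios(h2, ch4, c2h6, c2h4, c2h2):
    """Return the four Rogers gas ratios; None where the denominator is zero."""
    def ratio(num, den):
        return num / den if den > 0 else None
    return {
        "CH4/H2": ratio(ch4, h2),
        "C2H6/CH4": ratio(c2h6, ch4),
        "C2H4/C2H6": ratio(c2h4, c2h6),
        "C2H2/C2H4": ratio(c2h2, c2h4),
    }

def accuracy_percent(n_correct, n_total):
    return round(100.0 * n_correct / n_total, 2)

# Hypothetical concentrations in microlitre/litre, for illustration only.
ratios = rogers_ratios(h2=100.0, ch4=50.0, c2h6=25.0, c2h4=50.0, c2h2=5.0)
rogers_accuracy = accuracy_percent(29, 55)
```

Returning `None` for a zero denominator makes the "no diagnosis" case explicit instead of raising a division error.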

4.5. Diagnosis Using Duval Triangle

Michel Duval developed the Duval triangle using a database of thousands of transformer DGAs for diagnosis. The Duval triangle is shown in Figure 13. This method has proved to be accurate, but it mainly depends upon the concentration of gases at medium and low levels, which affects the diagnosis.

When using Duval triangle for diagnosis, C2H2, C2H4, and CH4 values from the testing laboratory are plotted and a point that lies within one of the triangle fault zones or rarely might fall on the borderline between two fault zones will determine the particular fault.

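The percentage of the prominent gases can be determined as follows:
%C2H2 = $100x/(x + y + z)$, for $x$ = [C2H2] in microlitre/litre;
%C2H4 = $100y/(x + y + z)$, for $y$ = [C2H4] in microlitre/litre;
%CH4 = $100z/(x + y + z)$, for $z$ = [CH4] in microlitre/litre.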
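As a small worked example of the percentage calculation, the following Python sketch converts three gas concentrations into the relative percentages that locate a point in the triangle. The function name and the sample concentrations are hypothetical, and the zone boundaries of the triangle itself are not encoded here.

```python
def duval_percentages(c2h2, c2h4, ch4):
    """Relative percentages of C2H2, C2H4, and CH4 (inputs in microlitre/litre)."""
    total = c2h2 + c2h4 + ch4
    if total <= 0:
        raise ValueError("at least one gas concentration must be positive")
    return tuple(100.0 * gas / total for gas in (c2h2, c2h4, ch4))

# Hypothetical sample: 10, 30, and 60 microlitre/litre.
pct_c2h2, pct_c2h4, pct_ch4 = duval_percentages(10.0, 30.0, 60.0)
```

The three percentages always sum to 100, which is what allows a sample to be plotted as a single point in triangular coordinates.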

The faults specified are as follows:
PD: partial discharge;
T1: thermal fault less than 300°C;
T2: thermal fault between 300°C and 700°C;
T3: thermal fault greater than 700°C;
D1: low energy discharge (sparking);
D2: high energy discharge (arcing);
DT: mix of thermal and electrical fault.

Making use of 55 DGA samples from the 563 available samples, Duval triangle provides correct diagnosis for 40 samples showing 72.66% accuracy and is better in diagnosis as compared to Rogers ratio method.

5. Conclusion

A comparison of the diagnosis ability of backpropagation (BP) and radial basis function (RBF) neural networks and of the adaptive neurofuzzy inference system (ANFIS) has been carried out, and the diagnosis results in terms of error measure, accuracy, network training time, and number of iterations are presented. It was found that the BP network performs better during training but fails to sustain that performance in validation. The RBF network is consistent in its performance during both training and testing, and its validation accuracy is better than that of the BP network. In the present work, ANFIS provides the least accuracy as compared to the BP and RBF networks. It is slower to converge than RBF, takes more iterations, and occupies more memory space, but it provides reasonably good diagnosis results and accuracy during training and also during testing on unknown samples. It is superior in diagnosis as it uses either backpropagation or a combination of least squares estimation and backpropagation for membership function parameter estimation. It overcomes the drawback of the BP network getting stuck in local minima. It learns automatically from the input-output data. The performance of ANFIS can be improved further by using various membership functions and parameters.

The artificial intelligence methods provide better accuracy in fault diagnosis as compared to Rogers ratio method and Duval triangle and the diagnosis results by all the methods are listed in Tables 3 and 4.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

Authors are grateful to M/S.B.R. Industrial Services for providing sufficient DGA data of power transformers of MSETCL. Without the oil samples the work would not have been completed. Thanks are due to the reviewers for their constructive criticism which made it possible to incorporate the changes for better formulation of the paper in true sense.