#### Abstract

Demand for reliable power supply has risen sharply as the power industry has grown, and meeting that demand requires a substantial power grid. The running and testing data of power equipment contain a large amount of information that underpins online monitoring and fault diagnosis and, ultimately, condition-based maintenance. In this paper, an intelligent fault diagnosis model for power equipment based on case-based reasoning (IFDCBR) is proposed. The model aims to discover the latent rules of equipment faults through data mining. It constructs a condition case base for the equipment by analyzing four categories of data: online recording data, historical data, basic test data, and environmental data. SVM regression analysis is then applied to mine the case base and establish the equipment condition fingerprint. The running data of the equipment can be diagnosed against this condition fingerprint to detect whether a fault is present. Finally, the intelligent model and the three-ratio method are both evaluated on a set of practical data. The results indicate that the intelligent model is more effective and accurate in fault diagnosis.

#### 1. Introduction

The increasing scale of the power system and the growing amount of power equipment have accelerated the integration of the power grid. Following the reform of the electricity system, electric power enterprises continuously improve their service quality to satisfy customer demand. Meanwhile, reducing operation costs has become a primary goal of electric power enterprises seeking to maximize profits. Although the number of power grid accidents decreased significantly during the "Ninth Five-Year Plan" period, 3 accidents were still caused by electrical equipment faults, accounting for 23.1% of the total number of accidents. Furthermore, power grid accidents caused by equipment malfunction tend to increase year by year. On average, the power load affected by each accident has risen to 585.7 MW per accident, and the accident recovery time has grown to 526 min [1]. Ensuring the safety of power equipment, detecting potential faults in time, reducing power system accidents effectively, and improving power supply quality and reliability have therefore become the most urgent problems for the power system. Real-time online monitoring of equipment condition is central to the safety of the power system [2, 3]. As a result, the maintenance of power equipment underpins the normal operation of the power system.

Condition-based maintenance (CBM), which is driven by the actual health of the power equipment, is both targeted and timely. CBM can ensure the safety of the power grid, minimize the resources spent on equipment maintenance, reduce the operating cost of power enterprises, and improve the benefits to enterprise and society. Consequently, CBM has gained ground in both research and commercial realms [4, 5]. CBM assesses the health status of equipment comprehensively, based on online condition information and attribute information of the power equipment. It then decides whether to repair the equipment according to the assessment result. To ensure the safety and reliability of the power equipment, equipment in imperfect running condition requires timely maintenance. In contrast, the maintenance frequency for equipment in good running condition should be reduced, so that the good condition is preserved for as long as possible [6–8].

State diagnosis of power equipment relies on a vast amount of equipment state data. It is affected by the historical running data and the external environment of the equipment. State diagnosis first estimates whether the equipment needs maintenance by comparing current data with the equipment's own historical data and by analyzing data generated by equipment of the same kind in different time phases. It then helps to identify the type, cause, and severity of a fault. Based on the results of state diagnosis, reliable strategies and methods can be introduced to overcome these faults [9].

In this paper, we propose an intelligent fault diagnosis model for power equipment based on case-based reasoning (IFDCBR) to satisfy the new requirements of the power grid. The rest of the paper is organized as follows. Section 2 constructs a new kernel function for SVM regression analysis from the radial basis kernel function and the polynomial kernel function. Section 3 presents IFDCBR. Section 4 reports a case study that applies both IFDCBR and the three-ratio method; the results show that IFDCBR is more accurate and more widely applicable. Finally, Section 5 summarizes the paper and outlines the remaining challenges.

#### 2. Relevant Theories

##### 2.1. Case-Based Reasoning

Case-based reasoning (CBR) is an important method for learning and problem solving in artificial intelligence. The core idea of CBR is to apply past experience to similar problems in the present or the future [10, 11]. Moreover, its problem-solving ability increases as experience accumulates. When no suitable algorithm or model is available for a problem, CBR can use case-based experience to solve it effectively [12–14].

The typical CBR problem-solving process consists of four main steps: case retrieval, case reuse, case revision, and case preservation [15], as shown in Figure 1. First, to analyze the target sample, CBR searches the problem space for similar cases (case retrieval). It then either uses the solution of the retrieved case directly (case reuse) or adapts it (case revision) to solve the new problem. Finally, the target case is saved to the case base after being screened (case preservation).
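The four steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Case` class, the Euclidean distance used for retrieval, and the `max_distance` threshold are hypothetical choices, not part of the model described in this paper.

```python
from __future__ import annotations

from dataclasses import dataclass
from math import dist
from typing import Optional


@dataclass
class Case:
    features: tuple[float, ...]  # problem description
    solution: str                # stored solution, e.g. a diagnosis label


def retrieve(case_base: list[Case], query: tuple[float, ...]) -> Case:
    """Case retrieval: return the stored case closest to the query."""
    return min(case_base, key=lambda c: dist(c.features, query))


def solve(case_base: list[Case], query: tuple[float, ...],
          max_distance: float = 1.0) -> Optional[str]:
    """Case reuse: adopt the nearest solution if the case is close enough.

    Returning None signals that the solution needs revision (case revision).
    """
    nearest = retrieve(case_base, query)
    if dist(nearest.features, query) <= max_distance:
        return nearest.solution
    return None


def retain(case_base: list[Case], query: tuple[float, ...], solution: str) -> None:
    """Case preservation: store the confirmed case for future reasoning."""
    case_base.append(Case(query, solution))


# Hypothetical two-case base and query
case_base = [Case((0.0, 0.0), "normal"), Case((5.0, 5.0), "fault")]
diagnosis = solve(case_base, (0.2, 0.1))
retain(case_base, (0.2, 0.1), diagnosis)
```

Because every retained case becomes a candidate for later retrieval, the case base grows more useful as experience accumulates.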

##### 2.2. Regression Analysis of SVM

Regression analysis is a statistical method to identify the interdependence between two or more variables in quantitative relationship [16, 17].

According to the theory of CBR, the core of fault diagnosis is to find an optimal functional relation between the status and the operation data of power equipment from the historical running data set. Essentially, fault diagnosis of power equipment is a method for classifying equipment status data [18].

One key technique in classifying data is to find the optimal hyperplane, that is, an optimal linear discriminant function. Let the sample data be $\{(x_i, y_i)\}$, $i = 1, 2, \ldots, l$, divided into two categories, $+1$ and $-1$, where $x_i \in R^n$, $y_i \in \{+1, -1\}$ is the category label of sample $x_i$, and $l$ is the number of samples. The general form of the function under linearly separable conditions is

$$f(x) = w \cdot x + b, \tag{1}$$

where $w$ is an $n$-dimensional vector and $b$ is the offset.

We need to determine the coefficients $w$ and $b$ of the linear function so that the sum of squared deviations between $f(x_i)$ and the observed $y_i$ is minimized. In other words, this is the process of solving for the optimal regression hyperplane.

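Vapnik proposed the $\varepsilon$-insensitive loss function, which ignores errors below a threshold $\varepsilon$:

$$L_\varepsilon(y, f(x)) = |y - f(x)|_\varepsilon, \tag{2}$$

where

$$|y - f(x)|_\varepsilon = \begin{cases} 0, & |y - f(x)| \le \varepsilon, \\ |y - f(x)| - \varepsilon, & \text{otherwise}. \end{cases} \tag{3}$$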

The value $\varepsilon$ is a non-negative number. A deviation between $f(x)$ and the observed $y$ is ignored when it does not exceed $\varepsilon$. The function therefore defines an insensitive zone of width $2\varepsilon$, called the pipeline, as shown in Figure 2.
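As a small illustration, the insensitive loss can be written as a one-line function. The sketch assumes the standard form of the loss, in which a deviation up to a threshold `eps` costs nothing and larger deviations are charged only for the excess; the numeric values are made up.

```python
def eps_insensitive_loss(y: float, f_x: float, eps: float) -> float:
    """Return 0 inside the pipeline of half-width eps, else the excess deviation."""
    return max(0.0, abs(y - f_x) - eps)


inside = eps_insensitive_loss(1.0, 1.05, eps=0.1)   # deviation 0.05 is ignored
outside = eps_insensitive_loss(1.0, 1.5, eps=0.1)   # deviation 0.5 exceeds the pipeline
```

A sample inside the pipeline contributes no loss at all, which is why it has no influence on the fitted hyperplane.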

The deviation is 0 if a sample lies inside the pipeline. Otherwise, the sample is an outlier. Outliers are mostly errors or noise and should be tolerated rather than fitted, so that the hyperplane suits most of the samples. Therefore, we introduce slack variables $\xi_i$, $\xi_i^*$ and a penalty factor $C$ to control the influence of the outliers:

$$\begin{aligned} y_i - w \cdot x_i - b &\le \varepsilon + \xi_i, \\ w \cdot x_i + b - y_i &\le \varepsilon + \xi_i^*, \\ \xi_i,\ \xi_i^* &\ge 0, \quad i = 1, \ldots, l. \end{aligned} \tag{4}$$

The pipeline and the slack variables in the linear case are shown in Figure 2.

**Figure 3:** (a) Feature figure of the RBF kernel; (b) feature figure of the polynomial kernel.

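Essentially, the SVM regression problem is to fit the samples $\{(x_i, y_i)\}$ with $f(x) = w \cdot x + b$, where $x_i \in R^n$ and $y_i \in R$. After introducing the slack variables $\xi_i$, $\xi_i^*$ and the penalty factor $C$, seeking the optimal hyperplane becomes a convex quadratic programming problem under constraints based on the $\varepsilon$-insensitive function:

$$\begin{aligned} \min_{w,\, b,\, \xi,\, \xi^*} \quad & \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{l} (\xi_i + \xi_i^*) \\ \text{s.t.} \quad & y_i - w \cdot x_i - b \le \varepsilon + \xi_i, \\ & w \cdot x_i + b - y_i \le \varepsilon + \xi_i^*, \\ & \xi_i,\ \xi_i^* \ge 0, \quad i = 1, \ldots, l. \end{aligned} \tag{5}$$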

$C$ is a penalty coefficient fixed beforehand, generally determined by experiment. For given slack values of the outliers, the loss in the objective function increases as the penalty factor $C$ increases.
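The effect of the penalty factor can be checked numerically. The sketch below assumes the standard SVR primal objective, half the squared norm of the weight vector plus the penalty factor times the total slack, and takes each slack at its smallest feasible value (the insensitive loss of the sample). The data, the model `w`, `b`, and the threshold `eps` are hypothetical.

```python
def svr_objective(w, b, samples, eps, C):
    """Evaluate 0.5 * ||w||^2 + C * (total slack) for a linear model w.x + b."""
    norm_term = 0.5 * sum(wi * wi for wi in w)
    slack = 0.0
    for x, y in samples:
        f_x = sum(wi * xi for wi, xi in zip(w, x)) + b
        # Smallest slack that satisfies the pipeline constraint for this sample
        slack += max(0.0, abs(y - f_x) - eps)
    return norm_term + C * slack


# Two samples on the line y = x and one outlier (hypothetical data)
samples = [((1.0,), 1.0), ((2.0,), 2.0), ((3.0,), 5.0)]
low_penalty = svr_objective((1.0,), 0.0, samples, eps=0.1, C=1.0)
high_penalty = svr_objective((1.0,), 0.0, samples, eps=0.1, C=10.0)
```

With the same outlier and the same slack, raising the penalty factor from 1 to 10 raises the objective, so a larger penalty pushes the optimizer to fit outliers more closely.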

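To solve the convex quadratic programming problem, we introduce the Lagrange function

$$\begin{aligned} L = {} & \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{l} (\xi_i + \xi_i^*) - \sum_{i=1}^{l} \alpha_i \left( \varepsilon + \xi_i - y_i + w \cdot x_i + b \right) \\ & - \sum_{i=1}^{l} \alpha_i^* \left( \varepsilon + \xi_i^* + y_i - w \cdot x_i - b \right) - \sum_{i=1}^{l} \left( \eta_i \xi_i + \eta_i^* \xi_i^* \right), \end{aligned} \tag{6}$$

where $\alpha_i, \alpha_i^*, \eta_i, \eta_i^* \ge 0$ are the Lagrange multipliers. At the optimum the function satisfies the conditions $\partial L / \partial w = 0$, $\partial L / \partial b = 0$, $\partial L / \partial \xi_i = 0$, and $\partial L / \partial \xi_i^* = 0$, from which we obtain the hyperplane

$$f(x) = \sum_{i=1}^{l} (\alpha_i - \alpha_i^*)(x_i \cdot x) + b. \tag{7}$$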

For nonlinear regression problems, the basic idea of SVM is to use a kernel function to map data samples that are inseparable in the $n$-dimensional input space into a high-dimensional feature space in which they become separable.

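The nonlinear problem in the original data space is converted into a linear problem in a high-dimensional feature space by introducing a kernel function $K(x_i, x_j)$. We use $K(x_i, x_j)$ to replace the inner product $x_i \cdot x_j$, that is,

$$K(x_i, x_j) = \varphi(x_i) \cdot \varphi(x_j), \tag{8}$$

where $\varphi$ denotes the mapping into the feature space.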

With the kernel function $K(x_i, x)$ so defined, the nonlinear regression function of SVM is

$$f(x) = \sum_{i=1}^{l} (\alpha_i - \alpha_i^*) K(x_i, x) + b. \tag{9}$$
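Evaluating such a regression function is a weighted sum of kernel values between the input and the stored training samples. The sketch below assumes the usual dual form, in which each coefficient is the difference of a pair of Lagrange multipliers, and uses an RBF kernel in the `exp(-gamma * ||u - v||^2)` form as a stand-in; the support vectors, coefficients, and offset are hypothetical.

```python
from math import exp


def rbf(u, v, gamma=0.5):
    """RBF kernel in the exp(-gamma * ||u - v||^2) form (an assumed form)."""
    return exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def svr_predict(x, support_vectors, dual_coefs, b, kernel=rbf):
    """Return sum_i coef_i * K(x_i, x) + b for the given kernel."""
    return sum(c * kernel(sv, x) for sv, c in zip(support_vectors, dual_coefs)) + b


# Hypothetical trained model with two support vectors
support_vectors = [(0.0,), (1.0,)]
dual_coefs = [1.0, -0.5]
prediction = svr_predict((0.0,), support_vectors, dual_coefs, b=0.2)
```

Only the kernel changes between the linear and the nonlinear case; the mapping into the feature space never has to be computed explicitly.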

##### 2.3. Construct the Kernel Function

A kernel function maps samples that are inseparable in a low-dimensional space to linearly separable samples in a high-dimensional feature space. It avoids having to determine the mapping function and the high-dimensional feature space explicitly, and it sidesteps the "curse of dimensionality" that arises from operating directly in the high-dimensional space [19, 20].

It is straightforward to check whether a kernel function satisfies Mercer's theorem. Kernel functions include local kernel functions and global kernel functions, and the kind of kernel function strongly influences how efficiently information is extracted from the data. A local kernel function suits small-scale data and gives higher accuracy, whereas a global kernel function suits large-scale data at lower accuracy. The RBF kernel function is a typical local kernel function, while the polynomial kernel function is a good global kernel function.

A global or a local kernel function alone cannot fully capture the distribution characteristics of the data. In practice, we should therefore construct a new kernel function that balances global and local behavior according to the characteristics of the data.

For the RBF kernel function, we fix a test point and select a kernel parameter; the resulting characteristics are shown in Figure 3(a). The kernel has an appreciable effect only in a small area near the test point and tends to 0 away from it, so it is insensitive to distant data. For the polynomial kernel function, we likewise fix a test point and the kernel parameters; its characteristics are shown in Figure 3(b). Its effect extends both near the test point and far from it, which gives the polynomial kernel function a strong ability to generalize over global data.

In this paper, we construct a new kernel function from these two kernel functions:

$$K(x, x_i) = \lambda_1 K_{\mathrm{RBF}}(x, x_i) + \lambda_2 K_{\mathrm{poly}}(x, x_i), \tag{10}$$

where $\lambda_1$ and $\lambda_2$ are the coefficients that adjust the contributions of the RBF kernel function and the polynomial kernel function to the compound kernel function. By adjusting the coefficients, we can maintain both fitting and generalization ability for data samples of different distributions. Figure 4 shows the characteristics of the compound kernel function for one setting of the test point, the kernel parameters, and the coefficients. The compound kernel function combines the characteristics of the RBF kernel function and the polynomial kernel function, so it fits and generalizes well, and it can be adapted to data sets of different distributions by adjusting $\lambda_1$ and $\lambda_2$.
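A compound kernel of this kind can be sketched as a weighted sum of the two base kernels. The code is a minimal sketch under stated assumptions: the kernel forms follow the common `exp(-gamma * ||u - v||^2)` and `(gamma * u.v + coef0) ** degree` conventions, and the weights `w_rbf` and `w_poly`, the parameter values, and the test point are all hypothetical rather than the settings used in this paper.

```python
from math import exp


def rbf_kernel(u, v, gamma):
    """Local kernel: decays toward 0 as u moves away from v."""
    return exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def poly_kernel(u, v, gamma, coef0, degree):
    """Global kernel: keeps responding far away from v."""
    return (gamma * sum(a * b for a, b in zip(u, v)) + coef0) ** degree


def compound_kernel(u, v, w_rbf, w_poly, gamma=1.0, coef0=1.0, degree=2):
    """Weighted sum of the RBF and polynomial kernels."""
    return (w_rbf * rbf_kernel(u, v, gamma)
            + w_poly * poly_kernel(u, v, gamma, coef0, degree))


test_point = (0.2,)
near = compound_kernel((0.25,), test_point, w_rbf=0.5, w_poly=0.5)
far = compound_kernel((3.0,), test_point, w_rbf=0.5, w_poly=0.5)
```

Keeping both weights non-negative matters: a non-negative weighted sum of two kernels that satisfy Mercer's theorem satisfies it as well, so the compound function remains a valid kernel.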

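By introducing the compound kernel function into formula (9), we obtain the SVM nonlinear regression function

$$f(x) = \sum_{i=1}^{l} (\alpha_i - \alpha_i^*) \left[ \lambda_1 K_{\mathrm{RBF}}(x_i, x) + \lambda_2 K_{\mathrm{poly}}(x_i, x) \right] + b, \tag{11}$$

where $\alpha_i$ and $\alpha_i^*$ are the Lagrange multipliers, with $0 \le \alpha_i \le C$ and $0 \le \alpha_i^* \le C$.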

#### 3. IFDCBR

##### 3.1. Data Model

A unified information model for power equipment is the basis of state diagnosis and a prerequisite for the standardization of equipment [21]. The unified model facilitates macro-level status diagnosis of equipment and makes it possible to share status data among equipment of the same or a similar category.

*Definition 1.* One defines the information model of power equipment as a five-tuple with the following components.

(i) A collection of common intrinsic properties of the equipment, for example, its name, type, manufacturing number, and manufacturer. The specific properties are determined by the particular equipment.

(ii) A vector that consists of the various equipment characteristic values.

(iii) A collection of relationships among the equipment characteristics.

(iv) A collection of ordered data that consists of the equipment characteristics and their ratings.

(v) A collection of environmental factors.

The running data of power equipment are gathered and stored in time order, so each record is closely tied to its gathering time $t$. We construct an abstract spatial data model for the equipment in which the monitoring value and the time $t$ form the axes, as shown in Figure 5.

Each point in the model represents the monitoring value at time $t$. Slicing the data at time $t$ along the time axis yields the equipment monitoring data sample at that time.

Based on the information model and abstract spatial data model of equipment, we can define the power equipment case.

*Definition 2.* A power equipment case is a data collection that reflects the status of the power equipment. One defines it as a triple, Case = (data sample, environment, condition), where

(i) the data sample is the sample of the equipment monitoring points at time $t$;

(ii) the environment is the set of environmental factors of the equipment;

(iii) the condition is the state of the equipment in that environment at time $t$. Generally, the condition is a constant that serves as the classification threshold of the equipment conditions, and it is the key basis for case training. The condition is set to null when its value cannot be confirmed in the process of case reasoning.
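One way to hold such a case in code is a small record type whose condition field may be empty. This is a sketch only: the class name, field names, and the example values (gas names, readings, timestamps) are hypothetical and chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class EquipmentCase:
    timestamp: str                    # gathering time of the sample
    sample: dict[str, float]          # monitoring values at that time
    environment: dict[str, float]     # environmental factors
    condition: Optional[int] = None   # None while the state is unconfirmed

    def is_labeled(self) -> bool:
        """Only cases with a confirmed condition can be used for training."""
        return self.condition is not None


# Hypothetical cases: one confirmed, one still awaiting diagnosis
labeled = EquipmentCase("2014-01-01T00:00", {"H2": 12.0}, {"temperature": 25.0}, condition=1)
unlabeled = EquipmentCase("2014-01-01T01:00", {"H2": 48.0}, {"temperature": 26.0})
training_cases = [c for c in (labeled, unlabeled) if c.is_labeled()]
```

Keeping the unconfirmed condition as `None` rather than a placeholder number prevents undiagnosed cases from leaking into training.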

##### 3.2. Status Fingerprint

We assume a status case base for a given piece of equipment in which five types of conditions are identified. According to these types, we set up five case subsets, one per condition type. We then train on the cases to establish the status fingerprint reader according to the SVM nonlinear regression function shown in formula (11).

We assume a status fingerprint reader for the five subsets, a training goal for each subset, and a common tolerance. The five subsets of conditions can then be mapped to different data intervals, as shown in Figure 6.

According to the theory of SVM regression analysis, we can get the status fingerprint readers by training the cases. We set

In this paper, we define the equipment status fingerprint as a pair that consists of the status fingerprint recognizer and the standard value of fingerprint identification, and we set a tolerance for fingerprint similarity. We can thus set up the equipment status fingerprint, as shown in Table 1.

##### 3.3. Diagnosis Model

We establish the equipment case base from running data, historical data, environmental factors, and equipment data. We then establish the equipment fingerprints by training on the cases with SVM regression, in order to diagnose the status of the power equipment. IFDCBR has two phases: learning and application. In the learning phase, we train the fingerprint recognizer and establish the fingerprint identification database; in the application phase, we diagnose the status by fingerprint identification. IFDCBR is shown in Figure 7.

In the learning phase, the equipment case base is built from the basic information, historical operating data, parameter calibration, and environmental data of the equipment. We then set the learning goals, train the equipment fingerprint recognizer, and establish the fingerprint identification database.

In the application phase of IFDCBR, we first obtain the current equipment monitoring data. After filtering and normalization, these data form the untested equipment condition case. Next, we apply the equipment fingerprint recognizer to the untested case to obtain the tested equipment fingerprint. Finally, we compare the tested fingerprint with the equipment condition fingerprints; the condition fingerprint that fits best determines the diagnosed category.
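The comparison step can be sketched as a nearest-standard-value lookup with a tolerance. This is an illustration under stated assumptions: the recognizer is taken to return a single number, each condition is taken to have one standard value, and the condition names, standard values, and tolerance below are hypothetical.

```python
def diagnose(recognizer_output, fingerprints, tolerance):
    """Return the condition whose standard value is closest to the output.

    Returns None when no standard value lies within the tolerance, which
    leaves the case unclassified for later review.
    """
    best = min(fingerprints, key=lambda name: abs(fingerprints[name] - recognizer_output))
    if abs(fingerprints[best] - recognizer_output) <= tolerance:
        return best
    return None


# Hypothetical standard values of fingerprint identification
fingerprints = {"normal": 1.0, "fault A": 2.0, "fault B": 3.0}
matched = diagnose(2.1, fingerprints, tolerance=0.3)
unmatched = diagnose(2.5, fingerprints, tolerance=0.3)
```

Returning no category for an output that falls between two intervals is a design choice of this sketch; such cases are natural candidates for expert assessment before they are stored in the case base.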

The diagnosis results produced in the application phase can be assessed and used to optimize the equipment fingerprint recognizer. At the same time, storing the newly diagnosed equipment condition cases in the equipment case base improves the equipment condition fingerprints. In this way, IFDCBR is capable of self-learning and self-optimization.

#### 4. Empirical Analysis

We implemented IFDCBR in software in order to test it and then diagnosed an oil-immersed transformer using both IFDCBR and the three-ratio method. The results show that IFDCBR is more effective and accurate.

##### 4.1. Classification of Transformer Condition

A classification of equipment conditions is the premise of equipment state analysis. If fine-grained condition types cannot be obtained, coarse-grained types are used instead. A reasonable condition classification is very important for avoiding overfitting and underfitting in IFDCBR. By analyzing transformer faults and the related information on gas dissolved in transformer insulation oil [22–24], we divide the transformer condition into six types, as shown in Table 2.

##### 4.2. Modeling

The test object is a 110 kV oil-immersed transformer of Hebei Electric Power Company that is registered in the PMS system and the oil chromatography online monitoring system. A total of 1229 gas concentration records were collected and combined with equipment data such as temperature and humidity; after the data were organized and noisy records removed, 819 useful records remained. Of these, 573 are normal operation records and 246 are failure records. Taking these as the experimental data, we obtain the distribution over the status classes shown in Table 3.

The sample library of equipment status, T, is established from the experimental data: T1 is the subset of normal samples, T2 the subset of low energy discharge fault samples, T3 the subset of high energy discharge fault samples, T4 the subset of partial discharge fault samples, T5 the subset of low temperature thermal fault samples, and T6 the subset of high temperature thermal fault samples.

Once the sample library is established, the key to building the diagnostic model is to optimize the parameters of the recognizer. In this paper, we optimize them using the Cyclic Variable Method [25]. We train the fingerprint recognizer by repeatedly running the related tools in LIBSVM and obtain the parameter values shown in Table 4.
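Assuming that the Cyclic Variable Method refers to cyclic coordinate search, in which one parameter is varied at a time while the others are held fixed, the optimization loop can be sketched as follows. The function name, the candidate grids, and the toy objective are hypothetical; in practice the objective would be something like a cross-validation error of the trained recognizer.

```python
def cyclic_coordinate_search(objective, start, grids, sweeps=5):
    """Minimize objective by varying one parameter at a time over its grid."""
    current = dict(start)
    best = objective(current)
    for _ in range(sweeps):
        improved = False
        for name, candidates in grids.items():
            for value in candidates:
                trial = {**current, name: value}
                score = objective(trial)
                if score < best:
                    best, current, improved = score, trial, True
        if not improved:  # a full sweep without progress: stop early
            break
    return current, best


def toy_error(p):
    """Toy stand-in for a validation error, with its minimum at C=10, gamma=0.1."""
    return (p["C"] - 10) ** 2 + (p["gamma"] - 0.1) ** 2


grids = {"C": [1, 10, 100], "gamma": [0.01, 0.1, 1]}
params, error = cyclic_coordinate_search(toy_error, {"C": 1, "gamma": 1}, grids)
```

Coordinate search needs far fewer evaluations than a full grid over all parameters, at the price of possibly stopping in a local minimum when the parameters interact strongly.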

Through the above training, we finally obtain the five kernel function parameters. The status fingerprint of the 110 kV oil-immersed transformer is shown in Table 5.

##### 4.3. Results Analysis

To ensure accuracy and reasonableness when the classifier is trained, each type of data is randomly divided into two parts at a ratio of 4 : 1: 80% of the data is used to train the fingerprint recognizer, and the other 20% is used for testing. The highest classification accuracy is 95.19%, the lowest is 75%, and the average is 82.19%, as shown in Table 6.
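A per-class random split at a ratio of 4 : 1 can be sketched as below. The sketch is illustrative only: it groups the data into the two totals reported above (573 normal and 246 failure records) rather than the six condition types, uses integer placeholders instead of real records, and fixes a random seed for repeatability.

```python
import random


def split_by_class(samples_by_class, train_fraction=0.8, seed=0):
    """Shuffle each class separately and cut it into train and test parts."""
    rng = random.Random(seed)
    train, test = {}, {}
    for label, samples in samples_by_class.items():
        shuffled = list(samples)
        rng.shuffle(shuffled)
        cut = round(len(shuffled) * train_fraction)
        train[label] = shuffled[:cut]
        test[label] = shuffled[cut:]
    return train, test


# Placeholder records: indices stand in for the real monitoring samples
data = {"normal": list(range(573)), "failure": list(range(246))}
train, test = split_by_class(data)
```

Splitting each class on its own keeps the class proportions of the full data set in both parts, so a rare fault type is not left out of the test set by chance.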

For the five groups of data in Table 7, the traditional three-ratio method (TRM) either yields no result or misjudges the fault, whereas the diagnostic results obtained by IFDCBR are consistent with the actual situation. The validity and accuracy of the model are therefore high and should increase as more learning experience is gathered.

#### 5. Summary

State diagnosis with artificial intelligence technology has become an important part of smart grid construction. In this paper, CBR is used to organize the online monitoring data, historical operating data, environmental data, and basic test data of similar power transmission equipment, and a device status fingerprint is established to analyze real-time detection data. The results show that the diagnostic model has higher accuracy and wider applicability, and that it provides a viable solution for discovering and eliminating latent failures of power transmission equipment. However, the diagnostic model still faces many challenges in terms of self-learning, self-optimization, and the assessment of diagnostic results.

#### Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

#### Acknowledgments

This research was supported by NSF of Jiangsu Province of China (Grant no. BK20141452), Program of Natural Science Research of Jiangsu Higher Education Institutions of China (Grant no. 14KJB470006), High level talents in Nanjing Normal University research start-up research project (Grant no. 2014111XGQ0078), and Jiangsu province postdoctoral research funding scheme (Grant no. 1402216C).