#### Abstract

Series arc fault detection can improve the safety of low-voltage power systems. Existing arc fault detection methods mainly rely on indicators derived from frequency-domain or time-frequency-domain transformations for feature extraction. Such indicators struggle to capture comprehensive arc information, which results in low detection accuracy. This paper presents a method that extracts comprehensive information and combines it with a convolutional neural network to detect arc faults. First, an arc fault experimental platform is built according to the UL1699 standard, and the current signals of various loads under different operating conditions are collected. Then, the current of a single cycle is embedded by delay coordinates, and a distance matrix is calculated from the 50 vectors reconstructed from that cycle. Finally, a convolutional neural network classification model is designed to mine the information in the distance matrix and detect series arc faults. The experimental results show that the average accuracy of the method for arc fault identification across various loads is 99.00% and that the required sampling frequency is low. The method is suitable for lines with different loads and shows a degree of robustness, so it has the potential to be implemented in hardware.

#### 1. Introduction

With the continuous increase in residential electrification in recent years, the incidence of electrical fires has risen sharply. Electrical fires have many causes, among which arc faults in low-voltage lines are one of the main ones [1]. A series arc fault can generate high temperatures at very small currents, so traditional electrical fault protection devices may not operate in this situation. Therefore, a series arc fault classification method with high accuracy is needed to issue an alarm when an arc fault occurs and thereby ensure the safe use of electricity.

Waveforms of the arc current tend to be random, and the waveforms produced by some special loads during normal operation are similar to those of the fault arc current in other loads, which increases the difficulty of arc fault detection across various series loads. To address this, traditional arc detection methods usually extract features first and then analyze those features with a classifier, but the feature extraction step raises several difficulties. Time-domain metrics are easy to extract, but they have low stability and are susceptible to extreme values. Extracting the frequency-domain information of current signals requires a frequency-domain or time-frequency-domain transform. Frequency-domain transforms mainly include the Fourier transform [2] and the chirp-Z transform (CZT) [3], and the commonly used time-frequency-domain transforms are the wavelet transform [4, 5] and the Hilbert-Huang transform [6]. However, these transformations are computationally intensive and hard to implement in hardware [7]. To solve the problems encountered in feature extraction, it is preferable to convert the one-dimensional signal into image information and use the image as the new object of study.

In earlier research, some arc fault detection methods have been proposed that convert one-dimensional signals into images as research objects. The authors in [8] realized unsupervised online detection of arc faults by reconstructing the phase space of the current series with a window width of two cycles and extracting the geometric and attribute characteristics of the image with a PCA-based fault detection method. The authors in [9] used a gray-level co-occurrence matrix to extract twenty-one eigenvalues from the recurrence plot and adopted regularized linear discriminant analysis to detect arc faults. However, relevant information may appear anywhere in the image, and these methods select only some eigenvalues that describe the image rather than analyzing the image globally. The authors in [10] extracted characteristic grayscale images of arcs and proposed a time-domain visual recognition method based on a multilayer convolutional neural network (CNN), with an accuracy of 97.7%. However, the sampling frequency used in this method is 1 MHz, and obtaining the grayscale image of each half-cycle requires ten thousand values, which is too many to process efficiently.

In order to preserve all the information in the image and reduce the amount of computation at the same time, this paper adopts phase space reconstruction to reduce the dimension of the arc current data and thus the size of the feature matrix. Phase space reconstruction has the advantages of strong noise immunity and good handling of nonlinear data. In order to fully collect the latent information in the image, the CNN is introduced into the series arc fault detection method. The convolutional layers of the CNN can simultaneously identify connections between each class of feature, and good performance is achieved by utilizing features extracted from the input image, which gives the network strong self-learning capabilities [11].

Therefore, in this paper, a combination of phase space reconstruction and CNN is applied to series arc fault detection. First, series arc fault signals and normal working signals under ten typical loads with fifteen types of power are collected on the experimental platform. Then, the delay time and embedding dimension are obtained by the average mutual information method and the false nearest neighbor method, each cycle of the collected data is embedded in delay coordinates, and the feature set is constructed. Finally, a suitable CNN model is designed. Convolution over the distance matrix images of the arc fault and normal states of the typical loads extracts latent features from the distance matrix, realizing series arc fault detection for multiple loads.

#### 2. Experimental Device and Data Acquisition

In this section, a series arc fault generation and acquisition device is designed according to the standard UL1699-2008AFCI [12]. The series arc test platform is mainly composed of an air switch, a point-contact arc generation device, voltage and current transformers, a data acquisition card, and different loads. The device and the circuit diagram are shown in Figures 1 and 2, respectively. The arc fault generation device consists of a copper rod and a static-contact carbon rod. At the beginning of the experiment, the two electrodes are in contact with each other, and the circuit is then closed. The distance between the two electrodes is then adjusted by a knob, and when the electrodes are separated by a certain distance, an arc is generated. The sampling frequency is set at 10 kHz.

Experimental loads can be divided into two categories: linear and nonlinear loads. The linear loads include a hair dryer in hot-air mode, a kettle, and an induction cooker. The nonlinear loads comprise an air conditioner, fluorescent lamps of different powers (10 W, 20 W, 45 W, 55 W, 65 W, and 75 W), a refrigerator, a vacuum cleaner, a dimmer, a computer, and a hair dryer in cold-air mode. These loads cover the resistive, inductive, and capacitive loads commonly used in daily life, so they are representative to a certain extent, and they also include the interference loads of the unwanted tripping tests specified in the standard UL1699-2008AFCI [12] and the national standard GB/T 31143-2014 “General Requirements for Arc Fault Protection Appliances (AFDD)” [13]. Among them, the circuit current waveforms of the kettle and the dimming lamp in one experiment are shown in Figure 3. The current waveform of the kettle during normal operation is close to a sinusoid, and after the arc fault occurs, a clear “zero rest” phenomenon appears (a flat shoulder near the current zero crossing). In the normal operating state, the current waveform of the fluorescent lamp stays near a small value for a long time. When the arc fault occurs, the waveform has a long “zero rest,” while a large spike pulse appears outside the zero-rest period.

#### 3. Construction of a Fault Characteristic Matrix Based on Phase Space Reconstruction

##### 3.1. Phase Space Reconstruction Theory

Data collected with a small instrument transformer often contain a lot of sampling noise, which reduces the accuracy of fault detection. Many earlier detection methods apply noise reduction to the original signal before calculating features, such as wavelet packet noise reduction [14, 15] and wavelet threshold noise reduction [16]. These noise reduction methods require time-frequency-domain decomposition of the signal. Because those detection methods already use wavelet or wavelet packet decomposition to extract features, the data can be denoised in the same step. In our case, carrying out a dedicated noise reduction step based on time-frequency-domain decomposition and then using other techniques to construct the numerical matrices would violate the original intention of letting the CNN collect arc feature information independently. To simplify the fault detection process, it is better to skip noise reduction, so a method that is highly resistant to noise is needed.

By selecting an appropriate embedding dimension, phase space reconstruction can reduce the influence of noise on the data, and the technique is independent of the sampling frequency. Since arc current generation is affected by a variety of nonlinear factors and exhibits chaotic properties, methods based on chaos theory have a natural advantage in dealing with arc data, and phase space reconstruction can reduce the dimension of a chaotic system and decrease the amount of calculation. Delay-coordinate embedding is the most common method of phase space reconstruction. It reconstructs a phase space from a single time series and thereby recovers the original dynamical system. According to the Takens embedding theorem, the $m$-dimensional phase space vector for a one-dimensional time series $\{x_i\}$, $i = 1, 2, \ldots, N$, with delay time $\tau$ is constructed as follows:

$$X_i = \left[x_i,\; x_{i+\tau},\; \ldots,\; x_{i+(m-1)\tau}\right], \quad i = 1, 2, \ldots, N-(m-1)\tau .$$
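The delay-coordinate construction described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: the function name `delay_embed` and the synthetic sine cycle are assumptions, while the values m = 4 and a delay of 50 samples are the ones the paper arrives at in Section 3.2, and 200 samples per cycle follows from the 10 kHz sampling rate with 50 samples being a quarter cycle.

```python
import numpy as np

def delay_embed(x, m, tau):
    """Return the delay-coordinate vectors of a 1-D series x.

    Row i is [x[i], x[i + tau], ..., x[i + (m - 1) * tau]].
    """
    x = np.asarray(x, dtype=float)
    n_vectors = len(x) - (m - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series too short for this m and tau")
    # Index grid: one row per reconstructed vector, one column per coordinate.
    idx = np.arange(n_vectors)[:, None] + tau * np.arange(m)[None, :]
    return x[idx]

# Hypothetical single cycle of 200 samples (a clean sine stands in for a current).
t = np.arange(200) / 200.0
cycle = np.sin(2 * np.pi * t)

vectors = delay_embed(cycle, m=4, tau=50)
print(vectors.shape)  # (50, 4)
```

With 200 samples, m = 4, and a delay of 50, the number of vectors is 200 - 3 × 50 = 50, which matches the fifty vectors per cycle used later for the distance matrix.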

##### 3.2. Determination of the Embedding Dimension and Delay Time

To embed delay coordinates, two parameters must be determined: the embedding dimension $m$ and the delay time $\tau$. Some methods obtain both parameters at once, such as the C-C method and the differential entropy method [17], which avoids the inconsistencies of separate calculations. When multiple loads are involved, however, the resulting pairs of values always differ between loads, so some loads would be neglected. If a jointly estimated pair is taken apart, the validity of the parameters is difficult to ensure, but if the pair is kept together, its applicability to other loads cannot be guaranteed. Therefore, to find values that apply well to all loads, the two parameters are solved separately, and each is determined by summarizing how it varies across loads. Since calculating the embedding dimension requires the delay time, the delay time is solved first. Both the normal-operation and the arc fault waveform data are nonlinear, so the average mutual information method is used to determine $\tau$. Based on the definition of information entropy in information theory, the average mutual information method [18] uses the average mutual information to represent the nonlinear correlation of two sequences, so that $\tau$ can be selected effectively. The average mutual information between a finite time series $\{x_i\}$ and the sequence $\{x_{i+\tau}\}$ delayed by $\tau$ is calculated as follows:

$$I(\tau) = \sum_{i} p(x_i, x_{i+\tau}) \log \frac{p(x_i, x_{i+\tau})}{p(x_i)\,p(x_{i+\tau})}$$

where $p(x_i, x_{i+\tau})$ is the joint probability distribution of the sequences $\{x_i\}$ and $\{x_{i+\tau}\}$, $p(x_i)$ and $p(x_{i+\tau})$ are the corresponding marginal distributions, and the $\tau$ at which $I(\tau)$ reaches its first local minimum is the optimal delay time. Before the calculation, the data of each cycle are normalized by dividing by the maximum amplitude, and the current signals of ten different loads in normal operation are selected to calculate the mutual information corresponding to different delays.
The results are shown in Figure 4, from which it can be seen that the mutual information of different loads differs clearly in both trend and magnitude. In terms of magnitude, the fluorescent and dimming lamps are at the lowest level, and the remaining loads differ little. In terms of trend, except for the refrigerator, the vacuum cleaner, and the hair dryer in cold-air mode, most load curves resemble a convex function, reaching a unique minimum at a delay of fifty sample points (i.e., 1/4 cycle). In cold-air mode, the hair dryer curve fluctuates and produces two troughs for delays between 30 and 70, reaching a minimum of 0.38 at delays of about 45 and 55, with a mutual information of 0.43 at a delay of 50. The curve of the vacuum cleaner is generally “W”-shaped, reaching a minimum of 0.45 at delays of 22 and 77, with a mutual information of 0.54 at a delay of 50. The curve of the refrigerator is also “W”-shaped, reaching a minimum of 0.47 at delays of 22 and 80, with a mutual information of 0.56 at a delay of 50. Although the mutual information of these three loads is not at its minimum at a delay of 50 sample points, it is close to the minimum. The delay time directly determines the size of the constructed feature matrix, and the input images must have a uniform size for the CNN to identify arc faults of various loads in the next step, so this paper sets the delay time of every load to 50 sampling points.
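The first-local-minimum rule for choosing the delay can be illustrated with a simple histogram estimate of the average mutual information. The sketch below is an assumption-laden example, not the authors' code: the function names, the choice of 16 histogram bins, the natural logarithm, and the synthetic ten-cycle sine series are all hypothetical.

```python
import numpy as np

def average_mutual_information(x, tau, bins=16):
    """Histogram estimate of I(tau) between x[i] and x[i + tau], in nats (tau >= 1)."""
    x = np.asarray(x, dtype=float)
    a, b = x[:-tau], x[tau:]
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    p_ab = joint / joint.sum()                  # joint distribution
    p_a = p_ab.sum(axis=1, keepdims=True)       # marginal of x[i]
    p_b = p_ab.sum(axis=0, keepdims=True)       # marginal of x[i + tau]
    mask = p_ab > 0                             # skip empty cells (0 * log 0 = 0)
    return float(np.sum(p_ab[mask] * np.log(p_ab[mask] / (p_a @ p_b)[mask])))

def first_local_minimum(values):
    """Index of the first interior local minimum, or None if there is none."""
    for k in range(1, len(values) - 1):
        if values[k] < values[k - 1] and values[k] <= values[k + 1]:
            return k
    return None

# Hypothetical series: ten cycles of a sine, 200 samples per cycle, normalized
# by its maximum amplitude as described in the text.
t = np.arange(2000)
series = np.sin(2 * np.pi * t / 200.0)
series = series / np.max(np.abs(series))

curve = [average_mutual_information(series, tau) for tau in range(1, 101)]
k = first_local_minimum(curve)
print("first local minimum at tau =", None if k is None else k + 1)
```

The estimate depends on the number of bins and the length of the series, so the location of the minimum for measured currents should be read from curves such as those in Figure 4 rather than from this toy signal.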

The state of a system in phase space always converges to certain attractors. When the embedding dimension $m$ is too small, points on the attractor lie close to or cross one another, and the topological properties of the original attractor are not retained. If $m$ is too large, the amount of computation increases and noise is amplified. Therefore, the smallest embedding dimension that retains the topological properties of the attractor should be chosen. The false nearest neighbor (FNN) method is an effective way to determine the embedding dimension, and the delay time $\tau$ must be determined before $m$ is calculated. If the minimum embedding dimension of a time series is $m_0$, the attractor reconstructed in an $m_0$-dimensional delay space matches the attractor in the original phase space, but if the dimension of the space is smaller than $m_0$, this one-to-one correspondence is broken. In that case, some points are projected into the neighborhood of other points in the low-dimensional space although they are not actually adjacent in a higher-dimensional space. Such points are called false nearest neighbors. Based on this, for each point in the time series, we find the distance to its nearest neighbor in the $m$-dimensional space, then increase the dimension of the phase space by 1 and calculate the distance between the same two points again, which determines whether the neighbor is false. Let $X_i(m)$ denote the $i$-th vector in the $m$-dimensional space and $X_i^{NN}(m)$ its nearest neighbor. The formulas for the two distances are as follows [19]:

$$R_m(i) = \left\lVert X_i(m) - X_i^{NN}(m) \right\rVert, \qquad R_{m+1}(i) = \left\lVert X_i(m+1) - X_i^{NN}(m+1) \right\rVert$$

where $R_m(i)$ and $R_{m+1}(i)$ are the distances between $X_i$ and its nearest neighbor in the $m$ and $m+1$ dimensions, respectively. If the difference between the two distances is too large, the point is marked as a false nearest neighbor. The judgment criteria used here are those proposed by Kennel, who introduced the FNN method.
If either of formulas (4) and (5) is satisfied, the point is determined to be a false nearest neighbor:

$$\sqrt{\frac{R_{m+1}^2(i) - R_m^2(i)}{R_m^2(i)}} > R_{\mathrm{tol}} \tag{4}$$

$$\frac{R_{m+1}(i)}{R_A} > A_{\mathrm{tol}} \tag{5}$$

where $R_A$ is an estimate of the size of the attractor, and in this paper, $R_{\mathrm{tol}}$ is taken as fifteen and $A_{\mathrm{tol}}$ is taken as two.

We calculate the proportion of false nearest neighbors for different embedding dimensions $m$. If the proportion is less than 5%, the embedding dimension can be considered sufficient to unfold the chaotic attractor. Sampled current data of the ten loads are used for the calculation, and the results are shown in Figure 5: the minimum embedding dimension for which the proportion of false nearest neighbors is below 5% for all loads is four.
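The false nearest neighbor test can be sketched with a brute-force neighbor search. This is a minimal illustration under stated assumptions, not the authors' code: the names `false_neighbor_fraction`, `r_tol`, and `a_tol` are hypothetical (the defaults of 15 and 2 are the tolerances quoted above), the attractor size is approximated by the standard deviation of the series, Euclidean distance is assumed, and a Hénon map series stands in for measured current data because its behavior is well known.

```python
import numpy as np

def delay_embed(x, m, tau):
    """Rows are [x[i], x[i + tau], ..., x[i + (m - 1) * tau]]."""
    n = len(x) - (m - 1) * tau
    idx = np.arange(n)[:, None] + tau * np.arange(m)[None, :]
    return x[idx]

def false_neighbor_fraction(x, m, tau, r_tol=15.0, a_tol=2.0):
    """Fraction of nearest neighbors in dimension m that are false in m + 1."""
    x = np.asarray(x, dtype=float)
    emb_next = delay_embed(x, m + 1, tau)       # vectors in dimension m + 1
    emb = emb_next[:, :m]                       # the same points in dimension m
    n = len(emb)
    # Pairwise Euclidean distances in dimension m (brute force, O(n^2) memory).
    diff = emb[:, None, :] - emb[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)              # a point is not its own neighbor
    nn = dist.argmin(axis=1)
    r_m = dist[np.arange(n), nn]
    extra = np.abs(emb_next[:, m] - emb_next[nn, m])   # gap in the added coordinate
    r_next = np.sqrt(r_m ** 2 + extra ** 2)
    r_a = x.std()                               # assumed estimate of attractor size
    valid = r_m > 0                             # exact duplicates cannot be tested
    crit_ratio = extra[valid] / r_m[valid] > r_tol
    crit_size = r_next[valid] / r_a > a_tol
    return float(np.mean(crit_ratio | crit_size))

def henon(n, a=1.4, b=0.3, discard=100):
    """x-coordinate of the Henon map, used here only as a stand-in test series."""
    x, y = 0.0, 0.0
    out = []
    for k in range(n + discard):
        x, y = 1.0 - a * x * x + y, b * x
        if k >= discard:
            out.append(x)
    return np.array(out)

series = henon(1000)
fractions = {m: false_neighbor_fraction(series, m, tau=1) for m in (1, 2, 3)}
print(fractions)
```

Applied to a current cycle, the same function would be called with the delay of 50 samples and increasing values of m until the fraction falls below the 5% threshold used in the paper.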

##### 3.3. Construction of the Feature Matrix of the Arc

After phase space reconstruction, $M$ vectors $X_i$ of dimension $m$ are obtained, and the distance matrix is expressed as follows:

$$D = \left[d_{ij}\right]_{M \times M}, \qquad d_{ij} = \left\lVert X_i - X_j \right\rVert, \quad i, j = 1, 2, \ldots, M .$$

In this paper, $X_i$ is a four-dimensional vector, and a total of fifty vectors are calculated for each segment of data, so the final distance matrix is 50 × 50 and serves as input to the intelligent recognizer of series arcs. To observe the characteristics of the matrix more intuitively, the matrix values are converted into grayscale: the lighter the color, the smaller the value, and the darker the color, the larger the value, forming a 50 × 50 grayscale map. For the vacuum cleaner, the distance matrix is calculated from the sampling data of one period. It can be observed from Figure 6 that the distance matrix of the normal current forms a uniform gradient, whereas the distance matrix of the fault data shows image distortion. Along a row or column of the matrix, the elements of the normal-current distance matrix first decrease and then increase. In the arc fault matrix, this trend is broken at certain points, because the arc disrupts the dynamics of the normal circuit.
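The construction of the 50 × 50 input can be sketched as follows. This is an illustrative example under assumptions: Euclidean distance is assumed for the matrix entries, the function names are hypothetical, the grayscale mapping simply follows the convention stated above (light for small values, dark for large ones), and a clean sine cycle stands in for measured current.

```python
import numpy as np

def distance_matrix(vectors):
    """Pairwise Euclidean distances between the rows of `vectors`."""
    diff = vectors[:, None, :] - vectors[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def to_grayscale(matrix):
    """Map values to 0..255 so that small distances are light (255) and large ones dark (0)."""
    span = matrix.max() - matrix.min()
    if span > 0:
        scaled = (matrix - matrix.min()) / span
    else:
        scaled = np.zeros_like(matrix)
    return np.round(255 * (1.0 - scaled)).astype(np.uint8)

# Hypothetical cycle: 200 samples, embedded with m = 4 and a delay of 50 samples.
cycle = np.sin(2 * np.pi * np.arange(200) / 200.0)
m, tau = 4, 50
n_vectors = len(cycle) - (m - 1) * tau
vectors = cycle[np.arange(n_vectors)[:, None] + tau * np.arange(m)[None, :]]

D = distance_matrix(vectors)
gray = to_grayscale(D)
print(D.shape, gray.dtype)  # (50, 50) uint8
```

The matrix is symmetric with a zero diagonal, so the diagonal of the grayscale map is always the lightest part of the image.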

#### 4. Faulty Arc Diagnosis Based on CNN

A CNN is a deep feed-forward neural network with local connections and weight sharing. It is composed of several types of layers, including convolutional, pooling, activation, batch normalization, and fully connected layers. CNNs achieve extremely high accuracy in image classification and face recognition [20, 21]. They can effectively classify noisy data and have a certain degree of fault tolerance [22]. Classic CNN architectures include LeNet, AlexNet, ResNet, GoogLeNet, and DenseNet [22]. In recent years, although the basic layers of CNNs have not undergone major changes, network performance has been improved through different combinations of the basic layers or through other operations. With the appearance of deep CNNs, research on CNNs has developed rapidly, but because of vanishing or exploding gradients in deep training, network performance declines if depth alone is pursued [23]. Therefore, the key is to find a network framework, training method, and data appropriate to the problem at hand.

CNNs can handle data of different dimensions. A one-dimensional CNN is suitable for fixed-length time segments, and its main difference from the commonly used two-dimensional CNN is the way the feature detectors move over the data [24]. The sampled load current is a one-dimensional time series, but applying a one-dimensional CNN to it raises some problems. The first is that one-dimensional CNNs generally assume features that are not correlated with their location, whereas arc features tend to appear at the peaks and zero crossings of the sampled signal. The second is that a single cycle does not contain enough samples, and if the data of multiple cycles are analyzed, accuracy and timeliness are reduced. These problems do not arise for the two-dimensional CNN.

The convolutional layer is the core component of the network. Its neurons are connected to subregions of the input image or of the output of the previous layer, and it learns the features localized in these regions while scanning the image. The formula for the convolution operation is as follows:

$$y_i^{l}(j) = f\left(w_i^{l} * x^{l}(j) + b_i^{l}\right)$$

where $y_i^{l}(j)$ is the $j$-th local output of layer $l$, $w_i^{l}$ and $b_i^{l}$ represent the weight and bias of the $i$-th filter kernel in layer $l$, $x^{l}(j)$ is the $j$-th local input of layer $l$, and $f(\cdot)$ is the activation function.

Overfitting of neural networks can be mitigated by adding a pooling layer after the activation, so a pooling layer is usually placed after the convolutional layer. It reduces the dimension of the feature maps in the network and lowers the computational complexity. The formula for the maximum pooling operation is as follows:

$$p_i^{l}(j) = \max_{(j-1)s + 1 \,\le\, t \,\le\, (j-1)s + W} a_i^{l}(t)$$

where $a_i^{l}(t)$ is the value of the $t$-th neuron in layer $l$ of the $i$-th channel, and $s$ and $W$ are the step size and width of the pooling kernel, respectively.
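The convolution and maximum pooling operations described above can be written out directly in NumPy for a single channel. This is a didactic sketch, not the network used in the paper: the 5 × 5 kernel, the 2 × 2 pooling window with stride 2, the random weights, and the function names are assumptions chosen for illustration, and the sliding product is a cross-correlation, which is what CNN frameworks usually call convolution.

```python
import numpy as np

def conv2d_valid(image, kernel, bias=0.0):
    """Single-channel 'valid' sliding product plus bias, followed by ReLU."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for r in range(oh):
        for c in range(ow):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel) + bias
    return np.maximum(out, 0.0)   # ReLU plays the role of the activation f

def max_pool2d(feature, width=2, stride=2):
    """Maximum over each width x width window, moved in steps of `stride`."""
    oh = (feature.shape[0] - width) // stride + 1
    ow = (feature.shape[1] - width) // stride + 1
    out = np.empty((oh, ow))
    for r in range(oh):
        for c in range(ow):
            window = feature[r * stride:r * stride + width,
                             c * stride:c * stride + width]
            out[r, c] = window.max()
    return out

rng = np.random.default_rng(0)
image = rng.random((50, 50))            # stands in for a 50 x 50 distance matrix
kernel = rng.standard_normal((5, 5))    # hypothetical 5 x 5 filter

feature = conv2d_valid(image, kernel)
pooled = max_pool2d(feature)
print(feature.shape, pooled.shape)  # (46, 46) (23, 23)
```

A 50 × 50 input with a 5 × 5 kernel gives a 46 × 46 feature map, and 2 × 2 pooling with stride 2 halves each side to 23 × 23, which shows how pooling reduces the amount of data passed to later layers.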

LeNet-5 is a CNN that was originally used for handwritten character recognition and has recently been applied to arc fault diagnosis. To improve the accuracy and stability of the network, the following improvements are made on the basis of the LeNet-5 architecture:

(1) To speed up the training of the CNN and reduce its sensitivity to network initialization, a batch normalization layer is placed between each convolutional layer and the nonlinearity. It normalizes each mini-batch of observations for each channel independently.

(2) The sigmoid function was used in the original convolutional layers to construct the output features. In this article, it is replaced with the ReLU activation function, which sets part of the output to zero and thereby improves the sparseness of the network [25], reduces overfitting to some extent, avoids vanishing gradients, and speeds up training. The softmax function is used in the output layer to convert the output into values that can be interpreted as probabilities.

(3) The average pooling layers of the original network are replaced with maximum pooling layers. The advantage of maximum pooling is that the network can focus on the most important elements, which reduces the impact of parameter errors in the convolutional layers on the estimate.

(4) RMSProp is selected as the optimizer. RMSProp uses an exponentially decaying average of squared gradients, and when the loss value oscillates locally, the adapted update magnitude allows the optimizer to escape the oscillation and converge quickly.
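The building blocks named in the four improvements can each be expressed in a few lines. The sketch below is not the network of Figure 7: it only illustrates batch normalization without learnable scale and shift, ReLU, softmax, and one common form of the RMSProp update. All function names and the hyperparameter values (learning rate, decay, epsilon) are assumptions for illustration.

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    """Normalize each channel over the mini-batch and spatial axes.

    x has shape (batch, channels, height, width); the learnable scale and
    shift of a full batch normalization layer are omitted here.
    """
    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def relu(x):
    """Set negative activations to zero, which makes the output sparse."""
    return np.maximum(x, 0.0)

def softmax(logits):
    """Convert scores to values in (0, 1) that sum to one along the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)   # numerical stability
    e = np.exp(shifted)
    return e / e.sum(axis=-1, keepdims=True)

def rmsprop_step(w, grad, cache, lr=1e-3, decay=0.9, eps=1e-8):
    """One RMSProp update using an exponentially decaying average of grad**2."""
    cache = decay * cache + (1.0 - decay) * grad ** 2
    w = w - lr * grad / (np.sqrt(cache) + eps)
    return w, cache

# Batch normalization followed by ReLU on a hypothetical mini-batch.
rng = np.random.default_rng(0)
batch = rng.standard_normal((16, 6, 23, 23)) * 3.0 + 1.0
activated = relu(batch_norm(batch))

# Softmax over two classes (normal, arc fault) for one hypothetical sample.
probs = softmax(np.array([[2.0, -1.0]]))

# RMSProp on the toy loss f(w) = w**2, whose gradient is 2 * w.
w, cache = 5.0, 0.0
for _ in range(200):
    w, cache = rmsprop_step(w, 2.0 * w, cache, lr=0.1)
print(probs, w)
```

Because the update divides by the running root mean square of the gradient, the step length stays roughly on the order of the learning rate regardless of the gradient scale, which is the property item (4) relies on.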

The final network used in this paper is shown in Figure 7, and its design parameters are listed in Table 1.

#### 5. Analysis of Experimental Results

The experimental environment is as follows: the software used is MATLAB 2020b, the operating system is 64-bit Microsoft Windows 10, and the CPU is an AMD Ryzen 7 5800H. Sixty types of loads are selected, giving a total of 900 sets of data, which are randomly divided into training and test sets at a ratio of 7 : 3, and the distance matrix of each set is then calculated as the input of the CNN. To select appropriate model parameters, different batch sizes and numbers of training rounds are tried, and the test accuracy corresponding to each parameter setting is shown in Table 2.
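The random 7 : 3 division of the 900 samples can be sketched as an index shuffle. This is a hypothetical illustration: the function name and the fixed seed are assumptions, and the text does not state whether the split is stratified by load or by class, so none is applied here.

```python
import numpy as np

def split_indices(n_samples, train_fraction=0.7, seed=0):
    """Shuffle sample indices and split them into training and test parts."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_train = int(round(train_fraction * n_samples))
    return order[:n_train], order[n_train:]

train_idx, test_idx = split_indices(900)
print(len(train_idx), len(test_idx))  # 630 270
```

Calling the function with a different seed for each repetition corresponds to re-dividing the data before every experiment.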

Since the accuracy is highest when the number of training rounds is forty and the batch size is sixteen, this set of parameters is selected for ten experiments, and the two datasets are re-divided before each experiment to reduce the impact of the dataset division on the results. The results of the ten experiments are shown in Table 3. To show the training trends and advantages of the model more clearly, the training curves of the improved LeNet-5 network constructed in this paper and of four other networks (traditional LeNet-5, the residual convolutional neural network ResNet18 [23], AlexNet [26], and DarkNet19 [27]) are plotted. They reflect the changes in loss and prediction accuracy on the test set during training. From Figure 8 and Table 3, it can be seen that the improved LeNet-5 network converges faster than the other four networks and achieves the highest test-set accuracy, with an average detection accuracy on the test set of 99.00%.

To verify the applicability of this method to series arc fault identification for each load, fifty samples (with normal and fault samples in a 1 : 1 ratio) are selected for each load, and the average accuracy over ten recognition runs with the CNN is calculated. Table 4 shows that the recognition accuracy for series arc faults is high for all loads.

Next, the robustness of the proposed scheme is verified. This paper analyzes the influence of noisy data, partially missing data, and outlier data on the recognition results: (1) noisy data: Gaussian noise with a signal-to-noise ratio of 30 is added to the original data; (2) partially missing data: ten samples are randomly selected in every period and treated as missing values, each filled with the preceding value; (3) outlier data: one sample is randomly selected in every period and replaced with an outlier. The three kinds of abnormal data are illustrated in Figure 9. Based on 900 different abnormal datasets, we calculate the recognition accuracy for each of the three kinds of abnormal data. The results are shown in Table 5, from which it can be inferred that the proposed scheme has a certain robustness.
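The three perturbations can be sketched for a single period as follows. This is an illustration under assumptions, not the authors' procedure: the signal-to-noise ratio of 30 is interpreted as 30 dB, the outlier magnitude (three times the peak amplitude) is hypothetical because the text does not specify it, and all function names are invented for the example.

```python
import numpy as np

def add_gaussian_noise(x, snr_db, rng):
    """Add white Gaussian noise so that signal power / noise power = 10**(snr_db / 10)."""
    signal_power = np.mean(x ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    return x + rng.normal(0.0, np.sqrt(noise_power), size=x.shape)

def drop_and_forward_fill(x, n_missing, rng):
    """Mark n_missing random samples as missing and fill each with the previous value."""
    y = x.copy()
    # Index 0 is never dropped so that every gap has a preceding value.
    missing = np.sort(rng.choice(np.arange(1, len(x)), size=n_missing, replace=False))
    for i in missing:            # ascending order, so consecutive gaps chain correctly
        y[i] = y[i - 1]
    return y, missing

def inject_outlier(x, rng, scale=3.0):
    """Replace one random sample with a value well outside the signal range."""
    y = x.copy()
    i = int(rng.integers(len(x)))
    y[i] = scale * np.max(np.abs(x)) * rng.choice([-1.0, 1.0])
    return y, i

rng = np.random.default_rng(0)
cycle = np.sin(2 * np.pi * np.arange(200) / 200.0)   # hypothetical clean period

noisy = add_gaussian_noise(cycle, snr_db=30.0, rng=rng)
filled, missing = drop_and_forward_fill(cycle, n_missing=10, rng=rng)
spiked, position = inject_outlier(cycle, rng)
print(missing, position)
```

Each perturbed period would then be embedded and converted to a distance matrix in the same way as the clean data before being passed to the trained network.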

In terms of hardware implementation, network and hardware technology has progressed rapidly in recent years, and data acquisition and storage have become more convenient. A deep convolutional neural network system can be implemented on a field programmable gate array (FPGA) hardware platform. Such a platform can process about 300 convolution operations across all convolutional layers simultaneously in one clock cycle, and it is flexible, modular, and compact. Because the convolution object in this paper is a matrix rather than an image, the hardware processing speed will be faster. If the acquisition frequency is 10 kHz, the memory chip operating frequency is required to be about 500 kHz.

To demonstrate the advantages of the proposed method for the identification of series arc faults, this paper selects contemporary methods for comparison and analysis, as shown in Table 6. The standards followed by the experimental platforms of the methods in Table 6 are consistent, but their sampling frequencies differ.

Compared with the methods in [8, 9], the method in this paper does not require manual extraction and screening of indicators and retains all the information in the image, so the recognition accuracy can be improved. Compared with the method in [10], the input matrix of our convolutional neural network is a quarter of the size, which greatly improves the operation speed. Meanwhile, the sampling frequency of the proposed method is far lower than that of the other three methods, and the method strikes a good balance between the computation needed to extract arc information and the amount of information available for analysis. The accuracy of this method is also the highest among the compared methods.

#### 6. Conclusion

In this paper, a high-precision diagnostic method suitable for arc detection across multiple loads is presented. The effectiveness of the model is demonstrated on the arc data collected in the experiments, and the following conclusions are drawn:

(1) Phase space reconstruction is used to process the originally acquired signal without manual extraction of features. The constructed distance matrix can fully retain arc fault characteristics while reducing the size of the feature matrix and the amount of calculation.

(2) The CNN proposed in this paper has a 99% recognition accuracy for fault diagnosis across many different loads, and it has a higher convergence speed and accuracy than the other CNNs compared.

(3) Theoretical analysis and experimental verification show that the completeness of the extracted information and the properties of the CNN give the proposed method a high recognition accuracy. Its accuracy is higher than that of the other detection methods compared, and it is suitable for lines with different loads. Our work may provide new ideas for the research and development of arc fault protection appliances.

#### Data Availability

The data used to support the findings of this study are included within the article.

#### Conflicts of Interest

The authors declare that there are no conflicts of interest.

#### Acknowledgments

This work was supported by the Natural Science Foundation of Hebei Province (Grant No. E2019502123).