#### Abstract

With the growing demand for electrical energy and the development of technology, electrical devices have become more complex. This has increased the importance of power quality in the electrical power system. This paper presents a deep learning-based system to recognize the power quality disturbances (PQDs) in a solar photovoltaic (SPV) plant integrated with power system networks. The PQDs are analyzed using the continuous wavelet transform (CWT), and image files are obtained from the scalograms of the CWT. These image files are then used to recognize PQDs with a hybrid deep learning approach based on convolutional neural network (CNN), neighborhood component analysis (NCA), and support vector machine (SVM). In this hybrid approach, the image files are given as input to AlexNet and GoogLeNet. The NCA is applied to the features obtained from the last dropout layer of each architecture. The distinctive features selected by the NCA are classified using the SVM algorithm. To evaluate the proposed approach, PQD data are obtained from a modified IEEE 13-bus test system including the SPV system. Several analyses and comparisons are carried out to verify the success of the proposed approach. It has been found that the proposed hybrid deep learning approach can accurately recognize the PQDs even though the SPV plant has a negative effect on the power quality of the integrated power system.

#### 1. Introduction

Recently, the rapid development of technology has increased both the need for energy demand of consumers and the use of complex loads. The inadequacy of traditional energy generation methods has led to the expansion of the use of hybrid energy systems with the solar photovoltaic (SPV) energy systems. A power system integrated with SPV needs more effective solutions in power system issues such as operation, protection, reliability, and stability. Besides, collisions during the coordination and operation of the SPV plant integrated power system cause power quality disturbances (PQDs) that can create undesirable effects on both the grid and end-user equipment [1]. A PQD can be defined as the deviation in standard values of the power system parameters, voltage, current, and frequency [1, 2].

Identifying a PQD type in a microgrid integrated with SPV is an important issue in terms of determining the decision-making process of the system and the appropriate operating modes for energy management [3, 4]. Traditionally, in literature, the proposed approaches for identification of the PQDs consist of two stages which are feature extraction and classification. In the feature extraction stage, the most important and critical function is to obtain distinctive features of the data. If the distinctive features get high score, the classification stage has a high recognition performance. In the feature extraction stage, many signal processing techniques have been proposed to analyze PQDs. Fast Fourier transform (FFT) [5], short-time Fourier transform [6], wavelet transform (WT) [7–10], S-transform (ST) [1, 11–14], Hilbert-Huang transform [15, 16], Kalman filter (KF) [17], and curvelet transform (CT) [18] are used for feature extraction of PQDs. In the classification stage, types of the PQDs are determined by using the features obtained from feature extraction stage. Before the 2000s, artificial neural networks (ANNs) [19] were used in the classification stage of recognition system for PQDs. Later, many methods such as support vector machine (SVM) [8, 20], fuzzy expert system (FES) [17, 21], k-nearest neighbor [22], and genetic algorithm [23] have been developed for intelligent classification systems and these systems have been widely used in the classification stage of PQDs. Nowadays, deep learning-based classifiers used as a popular approach in intelligent image recognition algorithms are used in recognition systems for PQDs [3, 7, 9, 18, 24].

In recent years, the growing integration of SPV systems into the power system has led to an increase in research on power quality. In [25], a survey of the negative impacts of SPV integration on the power system is presented. The authors also compare the performance of artificial intelligence (AI) systems with that of traditional methods in mitigating PQDs, and they emphasize that AI methods generally perform significantly better than traditional methods in terms of controllability and response time. In [1], an algorithm based on ST and fuzzy c-means (FCM) clustering is presented for recognition of the PQDs associated with SPV energy penetration in the power system. In [25], the authors present an ST-based method for detecting PQDs, islanding, interruption, and grid synchronization in a power system containing a renewable energy source. In [26], a method combining WT, SVM, and independent component analysis is proposed for detection of the PQDs in an SPV integrated microgrid.

In this paper, a hybrid deep learning approach is proposed for the classification of PQDs. The proposed algorithm can classify PQDs occurring in both SPV integrated power systems and conventional power systems. Firstly, a CWT-based decomposition technique is presented to obtain scalogram images representing the time-frequency information of PQD signals that occur in the SPV plant integrated power system. Secondly, the proposed algorithm includes two deep learning architectures, namely, GoogLeNet and AlexNet. The scalograms obtained in the processing stage are given as input to both architectures, which serve as feature extractors: the outputs of their dropout layers are used as features. Thirdly, the NCA is applied to the features obtained from the two deep learning architectures. Thus, distinctive features of PQDs with reduced data size are obtained. Finally, the features selected by the NCA are applied to the SVM classifier and the PQD classification is performed. PQDs generated by MATLAB/Simulink are used to evaluate the proposed algorithm. The experimental results show that the automatic PQD classification algorithm can analyze and classify the PQDs efficiently. The main contributions of this paper are as follows:

(1) The main objective of this paper is to classify PQDs, and classification of the PQDs associated with SPV energy penetration in the power system is a serious issue in hybrid energy systems. Since this paper includes two CNN architectures with direct links, it takes advantage of the most effective features selected from the very fine hidden information in the original input space. Each hidden layer of the CNNs used in the proposed algorithm is an independent feature extractor. With the NCA feature selection process, more compact and complex features are obtained and an effective PQD classification is performed.

(2) Considering that a PQD is a signal containing 1D time-amplitude information, the proposed approach converts 1D PQD data to 2D image data and finally converts the 2D image data back to 1D PQD features, which are reduced in size and have effective distinguishing characteristics. Thus, the proposed approach can also be used in other classification problems that apply deep learning to one-dimensional data.

(3) A large number of experimental analyses are conducted to verify the performance of the proposed hybrid deep learning approach under various noise levels. The results indicate that the proposed approach has a high classification accuracy and that the resulting deep learning-based PQD recognition algorithm is robust against noise.

This paper is structured into six sections. In Section 2, brief definitions of CWT, CNN, NCA, and SVM are given. Section 3 explains the PQD datasets obtained from the modified IEEE 13-bus test system. The proposed algorithm is described in Section 4. Experimental results and discussion for the proposed algorithm are given in Section 5. Finally, conclusions are discussed in Section 6.

#### 2. Methodology

##### 2.1. Continuous Wavelet Transform

In the literature, many physical quantities are expressed by signals. Among the many methods used to analyze these signals, the WT has become one of the most preferred. Basically, the WT is a signal processing method capable of processing data at different scales and resolutions [27]. The Fourier transform decomposes a signal into sine waves of various frequencies. Similarly, the WT decomposes a signal into shifted and scaled versions of a mother wavelet. Two different approaches are used in the WT: the discrete WT (DWT) and the CWT. In general, the DWT is preferred for decomposition, compression, and feature determination, while the CWT is preferred for time-frequency analysis. The key advantage of the DWT over the CWT is that, for certain scales, the transformation and reconstruction of signals are excellent, whereas the CWT suffers from edge effects. On the other hand, the CWT can analyze all potential scales or frequencies, while the DWT is limited to scales that are power-of-two multiples of the sampling interval. Another advantage of the CWT over the DWT is that it can be applied directly at any scale without iterations. Both the CWT [7] and the DWT [8, 9] have been proposed as effective signal processing techniques in many PQD classification papers; in this paper, the CWT is chosen for the analysis of PQDs. In the CWT, the signal is matched against the mother wavelet at all times and scales by means of compression, expansion, and translation of the wavelet. The mathematical representation of the CWT is given in (1) and the scaled and shifted mother wavelet is given in (2).

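$$\mathrm{CWT}(a, b) = \int_{-\infty}^{\infty} x(t)\, \psi_{a,b}^{*}(t)\, dt, \tag{1}$$

$$\psi_{a,b}(t) = \frac{1}{\sqrt{|a|}}\, \psi\!\left(\frac{t - b}{a}\right). \tag{2}$$

Here, *x*(*t*) is the continuous-time signal, *ψ*(*t*) is the mother wavelet function, *ψ*\*(*t*) is the conjugate of the mother wavelet function, *b* is the translation (time) parameter, and *a* is the scaling (frequency) parameter.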
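As an illustration, the CWT definition can be discretized directly: for each scale *a* and shift *b*, the signal is correlated with the conjugate of the scaled and shifted mother wavelet, and the scalogram is the squared magnitude of the result. The sketch below is a minimal, deliberately slow version under stated assumptions. It uses a complex Morlet wavelet as a stand-in (the paper uses a Morse wavelet), and the function names, the 960 Hz sampling rate, and the scale-to-frequency relation for the Morlet wavelet are choices made only for this example.

```python
import cmath
import math

def morlet(t, w0=6.0):
    """Complex Morlet mother wavelet (assumption: stand-in for the Morse wavelet)."""
    return math.pi ** -0.25 * cmath.exp(1j * w0 * t) * math.exp(-0.5 * t * t)

def cwt(x, scales, dt, w0=6.0):
    """Naive discretization of CWT(a, b) = (1/sqrt(a)) * sum x(t) psi*((t - b)/a) dt."""
    n = len(x)
    coeffs = []
    for a in scales:
        row = []
        for ib in range(n):                      # shift b = ib * dt
            acc = 0j
            for it in range(n):                  # time t = it * dt
                acc += x[it] * morlet((it - ib) * dt / a, w0).conjugate()
            row.append(acc * dt / math.sqrt(a))
        coeffs.append(row)
    return coeffs

def scalogram(coeffs):
    """Squared magnitude of the wavelet coefficients."""
    return [[abs(c) ** 2 for c in row] for row in coeffs]

def scale_for_frequency(freq_hz, w0=6.0):
    """Approximate Morlet scale whose centre frequency is freq_hz."""
    return w0 / (2.0 * math.pi * freq_hz)

# Example: a pure 60 Hz sine, 16 samples per cycle (illustrative values only).
fs = 960.0
signal = [math.sin(2.0 * math.pi * 60.0 * i / fs) for i in range(256)]
scales = [scale_for_frequency(60.0), scale_for_frequency(240.0)]
power = scalogram(cwt(signal, scales, 1.0 / fs))
```

For a pure 60 Hz sine, the scalogram row at the 60 Hz scale carries far more energy than the row at the 240 Hz scale, which is the kind of time-frequency contrast that the scalogram images encode. A practical implementation would use an FFT-based transform rather than this triple loop.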

##### 2.2. Deep Learning and Convolutional Neural Network

Deep learning is an approach to machine learning developed to generate systems capable of feature extraction especially from large amounts of labeled training data such as image recognition [28, 29] and video analysis [30]. It consists of an advanced neural network containing many hidden layers [31]. In recent years, deep learning has been very popular due to the large amount of data in the field of image processing and major advances in graphics card-based computing. It is seen in the literature that many different deep learning architectures are used. Nowadays, the CNN is an innovative deep learning architecture used in many research areas. The CNN has been developed inspired by biological neural networks. The difference from classical neural networks is that it learns self-attributes from the raw image itself [32]. Convolution process is used in at least one of the hidden layers instead of general matrix multiplication [33]. Since the neurons in each layer do feature inference by learning the nonlinear relationships of the original inputs, there is no need to extract an extra feature [34]. Generally, a CNN architecture consists of three basic layers with different tasks. These are convolution, pooling, and fully connected layers. The CNN uses the pixels of the image as input. Pixels generate the motif consisting of edge combinations. Object parts are obtained from combining different motifs. Then, object parts are combined to form an object [35]. In the convolution layer, the output value of a pixel is calculated as the weighted sum of itself and its neighboring pixel values. In this layer, a convolution filter representing the weight matrix is used to create various feature maps. It has also an effective architecture for combining the image into these feature maps. The filters apply the convolution process to the images in the previous layer to generate the output data. A feature map, namely, activation map, is obtained as a result of the convolution process. 
Basically, the equation for the convolution process is given as follows:

$$b(n) = \sum_{k=0}^{N-1} x(k)\, h(n - k).$$

Here, *b* is the output vector, *x* is the signal itself, *h* represents the filter, and *N* represents the number of elements.
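The convolution sum can be sketched in a few lines for the 1D case, with *x* the signal, *h* the filter, and *b* the output. This is a plain illustrative implementation (full-length output, zero padding implied outside the filter support), not the optimized 2D routine a CNN framework would use; the function name is an assumption made for this example.

```python
def convolve(x, h):
    """Discrete 1D convolution: b[n] = sum over k of x[k] * h[n - k]."""
    n_out = len(x) + len(h) - 1
    b = [0.0] * n_out
    for n in range(n_out):
        for k in range(len(x)):
            if 0 <= n - k < len(h):      # h is zero outside its support
                b[n] += x[k] * h[n - k]
    return b

# Example: a two-tap summing filter applied to a short signal.
result = convolve([1, 2, 3], [1, 1])
```

In a convolution layer the same weighted-sum idea is applied in two dimensions, with the filter weights learned during training.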

In the CNN, the activation layer is generally used after the convolution layer. This layer sets the negative values of the convolution outputs to zero. Thus, convolution outputs having a linear structure are transformed into a nonlinear structure. Although many functions are used for the activation process in deep learning networks, the rectified linear unit (ReLU) function is the most frequently used. The mathematical expression of the ReLU activation function is given as follows:

$$f(x) = \max(0, x).$$

After the ReLU layer, the pooling layer is used to reduce the size of the data that will be used in the next convolution layer. Besides, the pooling layer reduces the processing load on the next layer and helps prevent the network from memorizing the training data (overfitting).
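The activation and pooling steps described here can be illustrated on a 1D list of values. The sketch below assumes max pooling with non-overlapping windows, which is one common choice; the text does not state which pooling variant or window size is used, and the function names are hypothetical.

```python
def relu(values):
    """Rectified linear unit: negative inputs become zero."""
    return [max(0.0, v) for v in values]

def max_pool_1d(values, size=2):
    """Non-overlapping max pooling (assumed variant); a trailing partial window is dropped."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]

# Example: activation followed by pooling.
activated = relu([-1.5, 0.5, 2.0, -0.2])
pooled = max_pool_1d(activated, size=2)
```

Pooling with a window of two halves the amount of data passed to the next layer, which is the size reduction the paragraph refers to.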

The fully connected layer, placed after the last pooling layer, is connected to all activations of the previous layer. The feature vector of the image obtained before this layer is given as its input.

The number of layers can also be different in different architectures. Since large data is processed in the CNN architecture, a problem such as overfitting of the network may occur. Dropout layer is used to prevent this situation [36].

The last layer used in the CNN architecture is the classification layer, namely, the Softmax layer. This layer produces as many outputs as there are classes. A probability between 0 and 1 is calculated for each output, and the class predicted by the network is the one whose probability is closest to 1. The Softmax function calculates the probability distribution over the *k* output classes as given in the following equation:

$$P(y = j \mid x) = \frac{e^{x_j}}{\sum_{i=1}^{k} e^{x_i}}, \quad j = 1, \ldots, k.$$

Deep learning has shown improved performance compared to other machine learning techniques as it can extract universal features from very complex datasets [35]. In deep learning algorithm, the high-level features are computed using the fully connected layers and, in general, classification processing is performed by applying these high-level features to the Softmax layer input. In this paper, two deep learning architectures are used to obtain high-level features of PQDs and SVM algorithm is used as a classifier.
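For reference, the Softmax step mentioned above maps a vector of class scores to probabilities that sum to one, and the predicted class is the one with the largest probability. A minimal sketch follows; subtracting the maximum score before exponentiation is a standard numerical safeguard added here and is not something the text describes.

```python
import math

def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(scores)                            # shift for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict(scores):
    """Index of the most probable class."""
    probs = softmax(scores)
    return max(range(len(probs)), key=lambda j: probs[j])

# Example with three hypothetical class scores.
probabilities = softmax([2.0, 1.0, 0.1])
```

In the approach of this paper the Softmax classifier is replaced by an SVM, so this function only illustrates what the standard CNN output layer computes.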

##### 2.3. Neighborhood Component Analysis

The features obtained from the dataset used in the classification process greatly affect the success of the classifier. Therefore, feature selection should be done before classifier training, so that the classifier reaches a higher success in a shorter time [37]. Nowadays, various techniques are used to classify high-dimensional data by reducing its size. For this purpose, dimensionality reduction techniques such as principal component analysis, linear discriminant analysis, sequential forward selection, sequential backward selection, t-distributed stochastic neighbor embedding, lasso, evolutionary optimization, and feature collection are used [38]. NCA is a nonparametric embedded technique based on the k-nearest neighbor algorithm that discards unsuitable and redundant features from a given feature set and creates a small subset of features. Correct feature selection not only reduces the size of features but also increases the execution speed and generalization performance of algorithms. While NCA does not lose any information during the dimensionality reduction process, it also ranks the features by importance. The NCA algorithm learns the feature weight vector by maximizing the leave-one-out classification accuracy with an optimized regularization parameter [39]. Let the training set be given as follows:

$$S = \{(x_i, y_i),\; i = 1, 2, \ldots, n\},$$

where $x_i \in \mathbb{R}^d$ is a *d*-dimensional feature vector, $y_i \in \{1, 2, \ldots, c\}$ are class labels, *n* is the number of observations, and *c* is the number of classes. In terms of the weighting vector $w$, the distance function between two observations $x_i$ and $x_j$ is given as follows:

$$d_w(x_i, x_j) = \sum_{r=1}^{d} w_r^2\, |x_{ir} - x_{jr}|,$$

where $w_r$ are the feature weights. A leave-one-out technique is applied to the training set (*S*) to maximize classification accuracy. Each observation $x_i$ is labeled by randomly choosing a point $x_j$ from *S* as its reference point, and this choice is made according to a probability distribution. The probability that $x_j$ is selected as the reference point of $x_i$ can be expressed as follows:

$$p_{ij} = \begin{cases} \dfrac{k\left(d_w(x_i, x_j)\right)}{\sum_{l \neq i} k\left(d_w(x_i, x_l)\right)}, & i \neq j, \\ 0, & i = j, \end{cases}$$

where *k* is the kernel function, $k(z) = \exp(-z/\sigma)$, and *σ* is the kernel width. The probability of the classifier correctly classifying $x_i$ is $p_i$:

$$p_i = \sum_{j} y_{ij}\, p_{ij},$$

where

$$y_{ij} = \begin{cases} 1, & y_i = y_j, \\ 0, & \text{otherwise}. \end{cases}$$

Therefore, an approximate leave-one-out classification accuracy can be written as

$$F(w) = \frac{1}{n} \sum_{i=1}^{n} p_i.$$

NCA maximizes *F(w)* with respect to $w$ using an objective regularized by the term *λ*:

$$F(w) = \frac{1}{n} \sum_{i=1}^{n} p_i - \lambda \sum_{r=1}^{d} w_r^2,$$

where *λ* > 0 is a regularization parameter that can be set via cross validation.

Since the objective function *F(w)* is differentiable, its derivative with respect to $w_r$ can be calculated as follows:

$$\frac{\partial F(w)}{\partial w_r} = 2 w_r \left[ \frac{1}{n\sigma} \sum_{i=1}^{n} \left( p_i \sum_{j \neq i} p_{ij}\, |x_{ir} - x_{jr}| - \sum_{j} y_{ij}\, p_{ij}\, |x_{ir} - x_{jr}| \right) - \lambda \right].$$

At the best *λ* value, the classification loss is minimal. A relative threshold value (*T*) is defined by [39] for selecting the important features using the feature weights:

$$T = \tau \cdot \max(w).$$

Here, *τ* is the tolerance, fixed to 0.02, and $w$ are the updated feature weights. If the weight of a feature is greater than the threshold value *T*, the feature is selected as important, and the remaining features are removed from the training set. When the value of *λ* is too large, all features become irrelevant because their weights approach zero. Therefore, the choice of *λ* is very important, as it reduces the classification loss [39].
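To make the NCA quantities concrete, the sketch below evaluates the regularized leave-one-out objective for a given weight vector and applies the relative threshold to a set of weights. It assumes the weighted L1 distance with squared weights, the exponential kernel with width *σ*, and the threshold form *T* = *τ* · max(*w*). It does not perform the gradient-based optimization of the weights, and all function names and the toy data are hypothetical.

```python
import math

def nca_objective(X, y, w, sigma=1.0, lam=0.0):
    """Average leave-one-out probability of correct classification, minus lam * sum(w_r^2)."""
    n = len(X)
    total = 0.0
    for i in range(n):
        kern = []
        for j in range(n):
            if j == i:
                kern.append(0.0)                      # a point never references itself
            else:
                dist = sum(wr * wr * abs(a - b)
                           for wr, a, b in zip(w, X[i], X[j]))
                kern.append(math.exp(-dist / sigma))
        norm = sum(kern)
        # p_i: probability mass assigned to reference points of the same class
        same_class = sum(k for k, label in zip(kern, y) if label == y[i])
        total += same_class / norm
    return total / n - lam * sum(wr * wr for wr in w)

def select_features(weights, tau=0.02):
    """Indices of features whose weight exceeds T = tau * max(weights)."""
    threshold = tau * max(weights)
    return [r for r, wr in enumerate(weights) if wr > threshold]

# Toy data: feature 0 separates the two classes, feature 1 does not.
X = [[0.0, 0.3], [0.1, 0.9], [1.0, 0.2], [1.1, 0.8]]
y = [0, 0, 1, 1]
score_informative = nca_objective(X, y, [3.0, 0.0])
score_uninformative = nca_objective(X, y, [0.0, 3.0])
```

Putting the weight on the informative feature gives a much higher objective value than putting it on the uninformative one, which is the behavior the optimizer exploits when it learns the weights.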

##### 2.4. Support Vector Machines

Classifiers, which characterize events according to selected attributes, have an important role in pattern recognition. SVM is an algorithm developed by Vapnik for solving pattern recognition and classification problems [40]. By using kernels, SVM solves nonlinear problems without explicitly constructing a high-dimensional feature space. It can also be applied to both binary and multiclass problems and provides precise classification with noise-insensitive support vectors [41]. In classification with SVM, examples of two classes, usually denoted by the class labels {−1, +1}, are separated from each other with the help of a decision function obtained from the training data. For this purpose, a hyperplane is found which separates the training data in the most appropriate way: among all separating hyperplanes, SVM selects the one that maximizes the distance to the points closest to it. Kernel functions are needed for SVM to handle nonlinear problems. The data in the nonlinear input space is transformed into a high-dimensional feature space with the kernel function, where it becomes possible to separate the data linearly [42].

The SVM was first developed for binary classification and then successfully used for multiclass classification problems by adopting a set of binary SVMs. The most commonly used methods in multiclass SVM are one-against-all (OAA), one-against-one (OAO), and directed acyclic graph SVM [43]. In the OAO method, *k*(*k* − 1)/2 binary models are created for a *k*-class problem. Each model is trained with the data of two different classes, and the class of a test sample is predicted by majority voting. In the other method, OAA, *k* binary classification tasks are created to solve the *k*-class problem. In this process, training is performed with a positive label for the current class and a negative label for all other classes. OAO and OAA methods are considered to be quite effective for multiclass classification problems [43].
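The bookkeeping behind the OAO scheme can be sketched as follows: enumerate every pair of classes, let each pairwise binary model cast a vote, and return the class with the most votes. The binary SVMs themselves are not implemented here. The function names are hypothetical, and the vote list in the example stands in for the outputs of trained pairwise models.

```python
from collections import Counter
from itertools import combinations

def oao_pairs(classes):
    """All unordered class pairs; one binary model is trained per pair."""
    return list(combinations(classes, 2))

def oao_predict(pairwise_winners):
    """Majority vote over the winning labels of the pairwise models."""
    return Counter(pairwise_winners).most_common(1)[0][0]

# Ten classes (nine PQD types plus the normal sine) give 45 pairwise models,
# compared with 10 binary models in the OAA scheme.
pairs = oao_pairs(range(10))
winner = oao_predict(["sag", "sag", "swell"])
```

Ties are resolved here by the order in which votes are first seen, which is only one of several possible tie-breaking rules.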

#### 3. Test System and PQD Data

In this paper, PQD datasets obtained from the modified IEEE 13-bus test system in Matlab/Simulink environment are used to evaluate the proposed approach. The test system is created by adding SPV array system and different types of loads to the standard IEEE 13-bus test system [44]. The test system created for this paper is shown in Figure 1.

In the modified test system, the SPV plant is connected to bus 680 of the original IEEE 13-bus test system through a 5 km long transmission line and a transformer. The SPV array consists of 500 parallel strings. A 1.5 MW SPV system is integrated into the power system. In the test system, three transformers are placed as shown in Figure 1.

In order to train and test the proposed PQD recognition system, two datasets for PQDs have been obtained from system designed in the Matlab/Simulink environment. The first dataset includes single and multiple PQDs. The second is achieved by adding noise to the first dataset. So, the proposed system is evaluated with regard to the classification capability using both noisy and noiseless PQDs. PQD data are obtained on the test system at 256 samples/cycle sampling rate for fourteen cycles with different variations. Nine types of PQDs and normal sine are generated from test system by integrating the SPV plant.

Each dataset consists of ten classes: nine types of PQDs and the normal sine. The PQDs are sag, sag with harmonics, swell, swell with harmonics, interruption, oscillatory transient, harmonic distortion, notch, and notch with harmonics. The PQD types and their data sizes are listed in Table 1. PQD data and sinusoidal data are obtained within the parameters specified in the IEEE 1159 standard [45]. PQD data are obtained from simulations using random parameter values between the lower and upper limits of parameters such as the moment of occurrence, fault resistance, load values, capacitor values, and inception time. For each parameter, the step size is chosen to provide the best data distribution between its minimum and maximum values. In classification studies, deepening the network does not significantly improve recognition accuracy when the dataset is very small. When the dataset is too large, problems such as time consumption and overfitting occur during the training phase. Therefore, in this paper, approximately 1000 pieces of data are obtained for each class.

Normal signal data is created at standard voltage amplitude varying from 0.95 pu to 1.05 pu. Sag data is obtained by single line-to-ground fault event. Swell data is obtained from the unfaulted phases during single line-to-ground fault event. Interruption data is generated by three-phase fault event. Oscillatory transient data is created by switching a three-phase capacitor bank. Harmonic data is obtained by switching a three-phase nonlinear load. Notch data is created by connecting a three-phase thyristor-controlled rectifier. Sag with harmonics data is created by single line-to-ground fault event and switching a three-phase nonlinear load. Swell with harmonics data is obtained by single line-to-ground fault event and switching a three-phase nonlinear load. Notch with harmonics data is created by connecting a three-phase thyristor-controlled rectifier and a three-phase nonlinear load. Second type dataset includes single and multiple PQDs with Gaussian white noises. In order to determine the performance of the proposed PQD recognition system in different noisy environments, four different datasets have been created by adding the noise with signal-to-noise ratio of 20, 30, 40, and 50 dB to the PQD data in first dataset.
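The noisy datasets can be pictured with a short sketch that adds Gaussian white noise to a signal at a prescribed signal-to-noise ratio. The noise power is derived from the measured signal power as P_noise = P_signal / 10^(SNR/10). The function name, the seeded random generator, and the test sine are assumptions made for this example; the paper generates its data in Matlab/Simulink.

```python
import math
import random

def add_awgn(signal, snr_db, rng=None):
    """Return the signal plus zero-mean Gaussian white noise at the requested SNR (dB)."""
    if rng is None:
        rng = random.Random(0)                 # fixed seed for reproducibility
    signal_power = sum(s * s for s in signal) / len(signal)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise_std = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, noise_std) for s in signal]

# Example: fourteen cycles of a 1 pu sine at 256 samples per cycle, 20 dB SNR.
clean = [math.sin(2.0 * math.pi * i / 256) for i in range(14 * 256)]
noisy_20db = add_awgn(clean, 20)
```

Repeating the call with 30, 40, and 50 dB would give the remaining noise levels mentioned above.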

#### 4. The Proposed Recognition System

The proposed hybrid deep learning approach for the PQDs is presented and discussed in this section. Figure 2 presents summary of the procedures of the proposed system in this paper. These procedures are divided into three stages: preprocessing, processing, and classification.

During preprocessing stage, PQD is extracted from three-phase voltage signal, segmented as fourteen cycles, converted to a relative scale (in pu, where pu stands for per unit), and saved to hard disk. Thus, at the end of this stage, 1D data with a total of 14 × 256 = 3584 samples for a PQD are obtained.
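The preprocessing arithmetic is simple enough to sketch: a fourteen-cycle window at 256 samples per cycle is cut from the recorded voltage and divided by a base value to express it in per unit. The base value (taken here as the nominal peak voltage), the function names, and the start index are assumptions made for illustration.

```python
SAMPLES_PER_CYCLE = 256
CYCLES_PER_SEGMENT = 14
SEGMENT_LENGTH = SAMPLES_PER_CYCLE * CYCLES_PER_SEGMENT   # 14 x 256 samples

def segment(samples, start=0):
    """Cut one fourteen-cycle window starting at the given sample index."""
    window = samples[start:start + SEGMENT_LENGTH]
    if len(window) != SEGMENT_LENGTH:
        raise ValueError("recording too short for a full fourteen-cycle segment")
    return window

def to_per_unit(samples, base_value):
    """Scale voltages to per unit (base_value is an assumed nominal peak voltage)."""
    return [s / base_value for s in samples]

# Example with a dummy recording of 4000 samples and an arbitrary base value.
recording = [float(i % 7) for i in range(4000)]
pqd_segment = to_per_unit(segment(recording), base_value=7.0)
```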

The processing stage is the most important part of the proposed hybrid deep learning approach. This stage consists of two substages. The first substage of the processing is feature extraction of PQDs based on CWT and CNN. The second substage is feature selection stage using NCA algorithm.

In the first substage of the processing, scalograms of PQDs are first generated by CWT and saved to hard disk as image files. The scalogram represents the time-frequency energy density obtained by the WT and is the square of the amplitude of the WT. In other words, a scalogram is a visual representation of the WT, where the *x*-axis and *y*-axis represent time and frequency, and the *z*-axis shows magnitude displayed as a color gradient [46]. In this paper, PQDs are transformed into scalograms, and images representing meaningful and distinctive time-frequency features of PQDs are obtained. Therefore, the first substage of the processing is very important for the performance of the classification stage. In the CWT analysis using the Morse wavelet, the coefficients of each piece of PQD data are calculated and displayed as scalograms. Each scalogram is saved to hard disk as an RGB jpeg file with a 224 × 224-pixel resolution. Figure 3 shows waveforms and scalograms of ten randomly selected pieces of PQD data, one for each class. Scalograms are obtained for both types of dataset (noiseless and noisy). In Figure 4, the time-amplitude waveforms and scalograms of a sag signal with no noise and with 20 dB noise are given. Although adding noise significantly distorts the waveform of this sag signal, the scalogram changes only slightly, in the blue regions representing high frequency components. In the final process of the first substage, the CNN algorithm is used to obtain a 1D feature vector from the 2D scalogram images. Compared to traditional learning algorithms, a CNN can learn directly from raw data without object segmentation or feature extraction. However, improving the quality and distinctive characteristics of the data presented to a CNN will increase the classification performance. 
Therefore, it is of great importance to search for an optimal time-frequency method in many deep learning-based studies [7, 9, 18, 47]. In this paper, scalograms representing the distinctive time-frequency features of PQD signals are obtained using the CWT method. Two CNN architectures, GoogLeNet and AlexNet, are then used as feature extractors. In the GoogLeNet architecture, features are obtained from the activation data at the output of the dropout layer (newDropout) at level-141. The AlexNet features are derived from the activation data at the output of the dropout layer (Drop7) at level-22. As a result, a total of 1024 and 4096 1D features are obtained from the GoogLeNet and AlexNet architectures, respectively.

In the second substage of the processing, the feature selection process is performed by applying the NCA algorithm to the features obtained from CNN architectures. A total of 5120 features obtained for GoogLeNet and AlexNet architectures are applied to the NCA algorithm. At the output of the NCA algorithm, the feature weight of each feature is obtained. A total of 1000 features with high feature weight are sorted in descending order. In this paper, further analysis is performed to obtain the feature set with the highest classification performance. In this analysis, firstly, the first feature with the highest feature weight is applied to the SVM classifier and classifier performance is obtained for only first feature. Secondly, the feature set is obtained by using the first feature and the second feature. Classification process is performed using the two features with the best feature weight. Then, the other features are incrementally added to the feature set. Finally, as a result of 1000 classification processes, the feature set with the highest performance is used in the classification stage.
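The incremental search described in this paragraph can be sketched as a loop over prefixes of the weight-ranked feature list: the classifier is evaluated on the best feature, then on the best two, and so on, and the prefix with the highest accuracy is kept. In the sketch below a nearest-centroid classifier is used as a lightweight stand-in for the SVM, and all names and the toy data are hypothetical.

```python
def rank_features(weights, top_k):
    """Indices of the top_k features, sorted by descending feature weight."""
    order = sorted(range(len(weights)), key=lambda r: weights[r], reverse=True)
    return order[:top_k]

def centroid_accuracy(X_train, y_train, X_test, y_test, cols):
    """Accuracy of a nearest-centroid classifier (stand-in for the SVM) on the given columns."""
    sums, counts = {}, {}
    for row, label in zip(X_train, y_train):
        vec = [row[c] for c in cols]
        prev = sums.get(label, [0.0] * len(cols))
        sums[label] = [s + v for s, v in zip(prev, vec)]
        counts[label] = counts.get(label, 0) + 1
    centroids = {lab: [s / counts[lab] for s in sums[lab]] for lab in sums}
    correct = 0
    for row, label in zip(X_test, y_test):
        vec = [row[c] for c in cols]
        pred = min(centroids, key=lambda lab: sum(
            (a - b) ** 2 for a, b in zip(vec, centroids[lab])))
        correct += int(pred == label)
    return correct / len(y_test)

def best_prefix(X_train, y_train, X_test, y_test, ranked):
    """Smallest number of top-ranked features that reaches the highest accuracy."""
    best_n, best_acc = 0, -1.0
    for n in range(1, len(ranked) + 1):
        acc = centroid_accuracy(X_train, y_train, X_test, y_test, ranked[:n])
        if acc > best_acc:                     # strict: ties keep the smaller set
            best_n, best_acc = n, acc
    return best_n, best_acc

# Toy example: three features, of which the second has the largest weight.
weights = [0.1, 0.9, 0.5]
ranked = rank_features(weights, top_k=2)
```

With 1000 ranked features this loop trains and evaluates the classifier 1000 times, which matches the number of classification runs mentioned in the text.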

The classification stage, which is the last step of the proposed hybrid deep learning approach, classifies the PQD using SVM. Hyperparameter optimization of SVM is performed by a grid search approach. Hyperparameters can be thought of as coding design, kernel function, box constraint level, and standardization of input feature space with z-score. The coding design evaluates the model of OAO or OAA classification. Optimal hyperparameters are found which minimize classification loss of SVMs using the 5-fold cross-validation set. Hyperparameter optimization defines the optimal values of hyperparameters of SVM from the search space defined in Table 2.

#### 5. Experimental Results and Discussion

The results of the proposed hybrid deep learning algorithm are presented and discussed in this section. In this paper, two datasets are used to evaluate the proposed system. Dataset-1 contains simulated PQDs obtained from the modified IEEE 13-bus test system with the SPV plant, while dataset-2 is based on noisy PQDs. So, the proposed system is evaluated with regard to the classification capability of the PQDs under the conditions of noise and no noise.

A series of experiments have been conducted to evaluate the performance of the proposed approach. In the first experiment, each PQD dataset is divided into 50% training and 50% test samples. Firstly, dataset-1 is applied to the proposed algorithm. The optimization stage yields the SVM hyperparameters “OAO,” “Polynomial,” “0.01,” and “No” for coding design, kernel function, box constraint level, and *z*-score normalization, respectively. From Table 3, it can be seen that all classes are classified correctly for the noiseless condition. In the second experiment, the proposed algorithm is tested using dataset-2 created under the different noise conditions. The results are also given in Table 3. For noisy signals, some samples of the notch and notch with harmonics classes are misclassified because the distinctive features of a notch are difficult to detect under high-level noise, while the other classes are correctly classified. These misclassifications gradually decrease as the noise level decreases.

In the third experiment, the performance of the processing stage is investigated using dataset-2 with 20 dB. Figure 5 shows the distribution of feature weights obtained at the end of the processing stage. Among the 1000 features with the highest feature weight, 831 features are AlexNet features and the rest are GoogLeNet features. In Figure 6, the classification results obtained for the 1000 features with the highest feature weight are shown. As can be seen in Figure 6, the best classification performance is obtained by using the first 48 features with the highest feature weight. Among these features, 38 are AlexNet features and the rest are GoogLeNet features. Table 4 gives the classification accuracy of each CNN architecture for dataset-2 with 20 dB. According to these results, it is very important to determine the optimal number of features using the feature selection algorithm. Besides, it is seen that AlexNet features are more effective than GoogLeNet features.

In the final experiment, the performance of the proposed approach is evaluated using fewer training samples. In the literature, the amount of training data in the classification of PQDs is generally higher than the amount of test data, although different training/testing ratios are used, such as 33.3 : 66.6 [16], 50 : 50 [20], 55 : 45 [9], 66.6 : 33.3 [7], 70 : 30 [18], and 80 : 20 [14]. In this experiment, each dataset is split into 10% training and 90% testing samples. Thus, the numbers of training and testing samples for each dataset are 958 and 8630, respectively. The results are given in Table 5. It is seen from Table 5 that the classification accuracy for this experiment is 99.85% for dataset-1, 96.28% for dataset-2 with 20 dB, and about 99.70% for dataset-2 with no noise. This is a promising result. As can be seen from Table 5, the proposed algorithm has a high recognition performance even if a small number of training samples are used.

Regarding the computational burden, the proposed algorithm tests the PQD data in an effective computation time on a personal computer with the following characteristics: Windows PC with Intel(R) Core(TM) i5-8300H CPU @ 2.53 GHz, 8.0 GB RAM, CUDA Toolkit 11.4, and GeForce GTX 1650 GPU (4 GB memory). In the computation analysis, the computation time for the processing and classification stages of the proposed algorithm has been determined. In the first substage of the processing stage, the conversion of 1D PQD data to 2D image data using CWT takes 34 ms on average. In the second substage of the processing stage, obtaining features from the activation data of the pretrained GoogLeNet and AlexNet architectures takes 128 ms on average. In the classification stage, the SVM classification process, which is performed using the features selected in the proposed algorithm, takes 22 ms on average. Thus, the testing time of the proposed algorithm is about 184 ms. The IEC 61000-4-30 standard [48] defines a basic measurement time interval for voltage quality parameters: a 10-cycle time interval for 50 Hz power systems or a 12-cycle time interval for 60 Hz power systems. Hence, the proposed PQD classification is performed in 184 ms (about 11 cycles at 60 Hz), which is within the 12-cycle analysis window defined in the IEC 61000-4-30 standard. This computation time is sufficient for monitoring PQD data and can be reduced by using higher performance computers or embedded systems such as FPGAs. Hence, it is deduced that the proposed algorithm shows remarkable accuracy, strong robustness, and a low computational burden in classifying the PQDs in the considered SPV plant integrated power system.
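The timing budget above can be checked with a few lines of arithmetic. The stage times are the averages reported in this section; the dictionary keys and function names are illustrative only.

```python
# Average per-stage times reported in this section, in milliseconds.
STAGE_MS = {"cwt_image": 34, "cnn_features": 128, "svm_classify": 22}


def total_time_ms(stages):
    """Total testing time as the sum of the stage times."""
    return sum(stages.values())


def cycles(duration_ms, grid_hz):
    """Number of fundamental cycles spanned by `duration_ms`."""
    return duration_ms * grid_hz / 1000.0


def window_ms(n_cycles, grid_hz):
    """Duration in milliseconds of an n-cycle measurement window."""
    return n_cycles * 1000.0 / grid_hz


total = total_time_ms(STAGE_MS)
# IEC 61000-4-30 windows: 10 cycles at 50 Hz, 12 cycles at 60 Hz.
for grid_hz, n_cycles in ((50, 10), (60, 12)):
    print(grid_hz, round(cycles(total, grid_hz), 2), round(window_ms(n_cycles, grid_hz), 1))
```

Both windows last 200 ms, so the 184 ms testing time fits within either; it corresponds to about 11 cycles at 60 Hz and about 9 cycles at 50 Hz.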

##### 5.1. Comparison with Other Methods

To evaluate the performance of the proposed algorithm, it is compared with the recently published methods in [3, 7, 9, 13, 17, 18, 20, 23, 49–52]. The comparison results are shown in Table 6. In [3], a CNN algorithm using a dataset obtained from the mathematical model of PQDs is presented. A total success rate of 99.96% is achieved for 16 PQDs. In [7], a PQD recognition system based on CWT, CNN, and Bayesian optimization algorithm (BOA) is introduced for a dataset containing 1500 real-life disturbance signals, and a classification rate of 99.8% is achieved for five types of PQDs. In [9], a new approach using WT, singular spectrum analysis (SSA), compressive sensing (CS), and deep neural network (DNN) for recognition of PQDs is presented. Optimal features, instead of the raw PQD data, are used as the input of the DNN. The WT-CS-DNN and SSA-CS-DNN methods achieve classification results of 99.56% and 99.85%, respectively. In [13], a real-time method using ST, FCM, and decision tree (DT) is presented for detection and classification of PQDs, and a classification rate of 98.33% is achieved for ten types of PQDs. In [17], a hybrid technique based on KF and DWT is used for classification of PQDs. Features obtained from this hybrid technique are applied as input to the FES. The average classification accuracies for seven types of PQDs are 92.3%, 97%, and 98.71% with SNR of 20 dB, 30 dB, and 40 dB, respectively. In [18], an approach to detect and classify the PQDs based on SSA, CT, and CNN is proposed. For validation of this approach, 31 categories of synthetic and real PQDs are used. Classification results for nine types of single PQDs and 22 types of complex PQDs are 100% and 99.52%, respectively. In [20], adaptive chirp mode pursuit (ACMP) is used for the preprocessing phase of PQDs, and grasshopper optimization algorithm (GOA) is proposed for the optimization of SVM parameters. The average classification accuracy for the 23 types of PQDs is 99.56%. In [23], ST and time-time transform (TT) are used for detection and feature extraction of PQDs. In addition, the presented algorithm uses nondominated sorting genetic algorithm II (NSGA-II) for multiobjective feature selection, and DT is used for classification. The computational time and overall accuracy of the presented algorithm are 0.194 s and 99.93%, respectively. In [49], a downsampling empirical mode decomposition (DEMD) and a metaheuristic grey wolf-based optimized random forest approach (ORFA) with machine learning are proposed to detect islanding conditions and classify nonislanding PQDs. That approach is simulated in MATLAB/Simulink with the IEEE 13-bus test system integrated with SPV and wind energy systems. Classification results for nine types of PQDs with no noise and with 20 dB noise are 99.70% and 99.10%, respectively. In [50], the authors present the detection and classification of islanding and PQDs for multiple distributed generation systems using an adaptive cross variational mode decomposition (XVMD) with reduced kernel ridge regression (RKRR). A classification rate of 99.2% is achieved for twelve types of PQDs. In [51], a sequence-to-sequence deep learning architecture based on the bidirectional gated recurrent unit (Bi-GRU) for type recognition and time location of PQDs is presented. The overall accuracy is over 98% for both synthetic and real signals in a noisy environment, and the corresponding execution time is around 0.34 s. In [52], an automatic classification technique based on WT, Probabilistic Neural Network (PNN), and Artificial Bee Colony (ABC) is presented. With this technique, the classification accuracy for 16 types of PQDs is 99.87% and the computational time is 0.176 s.

Table 6 compares the performance of these recently published methods in noiseless and noisy environments. For the noiseless environment, the method proposed in this paper classifies all samples correctly, while the other methods have lower classification rates. For the noisy environment with 20 dB, the result obtained in this paper is lower than the other papers’ results because the distinctive feature of the notch is difficult to detect under high-level noise. For noisy environments with 30 dB, 40 dB, and 50 dB, the results of the proposed method are high compared with those of the methods proposed in recent years. Besides, the results in this paper are obtained using half of all data for training, while other papers use a larger share of the data for training. Some of the cited works do not analyze computational time. Among those that do, some have a lower computational time than the algorithm proposed in this paper and others a higher one. However, the computational time of any technique is influenced by several other factors such as sampling frequency, feature extraction technique, classifier type, and processor speed. Considering all such factors and the results of the recently published literature, the proposed hybrid deep learning approach is a viable and suitable option for the PQD classification problem.

#### 6. Conclusion

In this paper, a hybrid deep learning approach based on CWT, CNN, NCA, and SVM has been proposed for classification of PQDs in the SPV plant integrated power system. Even though the SPV plant integrated power system has a negative effect on power quality, the obtained results show that the accuracy of the proposed approach is significantly higher than that of recently published methods. The high performance is achieved in the processing stage through the efficient CWT- and CNN-based feature extraction method and the NCA-based feature selection algorithm. The proposed approach classifies all samples correctly in the noiseless environment and maintains high performance under severe noise. The comparison results show that the proposed approach has a strong capability for recognition of PQDs even in noisy environments. The proposed approach performs very well in the recognition of various PQDs in the SPV plant integrated power system. Besides, the proposed method can be used in emerging power system technologies such as microgrids and grid-connected hybrid power generation systems.

#### Data Availability

The power quality disturbance data used to support the findings of this study are available from the corresponding author upon request.

#### Conflicts of Interest

The authors declare that they have no conflicts of interest.