#### Abstract

As a crucial and widely used component in industrial fields with great complexity, the health condition of rotating machinery is directly related to production efficiency and safety. Consequently, recognizing and diagnosing rotating machine faults remain to be one of the main concerns in preventing failures of mechanical systems, which can enhance the reliability and efficiency of mechanical systems. In this paper, a novel approach based on blind parameter identification of MAR model and mutation hybrid GWO-SCA optimization is proposed to diagnose faults for rotating machinery. Signals collected from different types of faults were firstly split into sets of intrinsic mode functions (IMFs) by variational mode decomposition (VMD), the decomposing mode number* K* of which was preset with central frequency observation method. Then the multivariate autoregressive (MAR) model of all IMFs was established, whose order was determined by Schwartz Bayes Criterion (SBC), and all parameters of the model were identified blindly through QR decomposition, where key features were subsequently extracted via principal component analysis (PCA) to construct feature vectors of different fault types. Afterwards, a hybrid optimization algorithm combining mutation operator, grey wolf optimizer (GWO), and sine cosine algorithm (SCA), termed mutation hybrid GWO-SCA (MHGWOSCA), was proposed for parameter selection of support vector machine (SVM). The optimal SVM model was later employed to classify different fault samples. The engineering application and contrastive analysis indicate the availability and superiority of the proposed method.

#### 1. Introduction

Rotating machinery is a crucial part in modern industrial manufacture, and its failure may result in serious safety accidents and economic losses. Therefore, one of the major topics to be investigated in preventing failures of mechanical systems is recognizing and diagnosing rotating machine faults [1, 2]. In various rotating machineries, rolling bearings are the most widely used components, which are greatly related to the remaining lifetime of machineries [3]. Unfortunately, rolling bearings are prone to failure to some extent because of complex operating conditions and structural characteristics [4]. Given this fact, effective and feasible fault diagnosis methods need to be developed for rolling bearings, to promote the reliability of rotating machinery [5].

Owing to rich information carried by vibration signals, most of fault diagnosis methods for rolling bearings rely on analyzing vibration signals [6]. However, vibration signal is commonly nonstationary in actual operation, which restricts the extraction effect for fault features, and thus affects accuracy of fault diagnosis. Addressing this issue, many time-frequency signal analysis methods are proposed to analyze nonstationary signal, including empirical mode decomposition (EMD) [7], ensemble empirical mode decomposition (EEMD) [8], empirical wavelet transform (EWT) [9] and variational mode decomposition (VMD) [10]. EMD has advantages in dealing with nonstationary signal and does not need to preset any basis function, but its performance is affected by end effects and mode mixing. To overcome these defects, EEMD as an improved version of EMD is proposed by introducing noise-assisted analysis method, which is also increasing the computational cost and cannot completely neutralize the added noise. EWT is a self-adaptive signal decomposition method that inherits advantages of EMD and wavelet transform, but the selection of an appropriate wavelet filter bank remains to be a problem. Unlike the methods mentioned above, VMD is a quasi-orthogonal signal decomposition method and its main structure is to construct and solve a constrained variational problem. It can effectively realize the separation of each frequency component in signal, avoiding the problem of mode mixing or filter selection. Its advancement and effectiveness have been demonstrated in many previous studies [11, 12]. As the nonstationarity of signal weakened by above signal processing methods, fault features can be smoothly extracted. Due to the high efficiency and consideration of variable correlation, multivariate autoregressive (MAR) model [13] is introduced to extract features from vibration signal, which has been widely adopted for feature extraction and shows superior performance [14, 15]. Further, key components among the features extracted from MAR model are screened out via principal component analysis (PCA) to finally construct feature vectors [16].

The essence of fault diagnosis for rolling bearings is a pattern recognition problem; that is, the fault type is determined by fault features of samples. In engineering application, many alternative methods are available for addressing this issue. For instance, Bayesian decision has strong recognition ability by considering prior probability as well as class conditional probability, and its good accuracy requires assumption of a suitable prior model [17]. K-nearest neighbor (KNN) based on Euclidean or Manhattan distances is easy to realize and susceptible to distribution of samples [18]. Artificial neural network (ANN) has good ability in processing recognition problem with the support of a large number of samples [19]. Based on statistical learning and structural risk minimization theory, support vector machine (SVM) is an excellent machine learning method. SVM finds an optimal hyperplane that meets the classification requirements, where the hyperplane maximizes the interval between two classes, thus promoting classification performance. Currently, SVM combined with feature extraction method has been successfully employed in the field of pattern recognition [20, 21].

Although SVM has superior ability in pattern recognition, its performance is affected by parameters. In view of this, various optimization algorithms are developed and applied to search the best parameters, such as genetic algorithm (GA) [22], particle swarm optimization (PSO) [23], sine cosine algorithm (SCA) [24], and grey wolf optimizer (GWO) [25]. As novel optimization algorithms, SCA and GWO proposed by Mirjalili et al. have proved effective in many previous works [26–28]. To achieve better performance in convergence precision, a mutation hybrid GWO-SCA (MHGWOSCA) optimization method is proposed in this paper. To sum up, based on blind parameter identification of MAR model and mutation hybrid GWO-SCA optimized SVM, a fault diagnosis method for rotating machinery is presented in the research. Firstly, VMD was applied to decompose the vibration signal into a set of intrinsic mode functions (IMFs), and the decomposing mode number* K* was preset with central frequency observation method. Then the MAR model of all IMFs was established, whose order was ascertained by Schwartz Bayes Criterion (SBC). And all parameters within the model were identified blindly through QR decomposition, where the key features were subsequently extracted via PCA to construct feature vectors of different fault samples. Afterwards, a hybrid optimization algorithm combining mutation operator, GWO, and SCA theory, called mutation hybrid GWO-SCA (MHGWOSCA), was proposed for parameter selection of SVM. Finally, the optimal SVM model was employed to classify different fault samples. The engineering application and contrastive analysis reveal the availability and superiority of the proposed method.

The paper is organized as follows: Section 2 is dedicated to the basic knowledge of VMD, MAR, and SVM. Section 3 introduces the proposed fault diagnosis method based on blind parameter identification of MAR model and mutation hybrid GWO-SCA optimized SVM. Section 4 presents the engineering application, the result analysis of which demonstrates the availability of the proposed method. The conclusion is summarized in Section 5.

#### 2. Fundamental Theories

##### 2.1. Variational Mode Decomposition

Variational Mode Decomposition (VMD) is a new adaptive signal preprocessing method [29]. By setting a mode number* K* in advance, the given signal can be decomposed into* K* band-limited intrinsic mode functions (IMFs). The cyclic iterative method is used to obtain the optimal solution of constrained variational problem, through which the frequency center and bandwidth of each mode are determined. The constrained variation problem can be described as follows:where represents the set of mode functions and represents the set of central frequencies, while and are the partial derivative of time* t* for the function and unit pulse function, respectively. is the given real valued input signal.

For the above variational problem, quadratic penalty function term and Lagrange multiplier are employed to transform it into an unconstrained problem. Then problem (1) can be specified as follows:where *α* and represent penalty factor and Lagrange multiplier, respectively, and* f*(*t*) is the given signal.

Then alternate direction method is utilized to obtain the saddle point of Lagrange multiplier, that is, solving (2) by optimizing* m*_{k},* w*_{k}, and *β*(*t*) alternately [30]. The optimization problems of* m*_{k} and* w*_{k} are formulated as (3) and (4), respectively.

Solving problems (3) and (4), the iterative equations are deduced as follows.

The Lagrange multipliers can be iterated with where* τ* is an updating parameter.

The main steps of VMD can be summarized as follows:

*Step 1. *Initialize ,* n*=1.

*Step 2 :. *Execute loop,* n*=*n*+1.

*Step 3. *Update* m*_{k} and* w*_{k} based on (5) and (6).

*Step 4. *Update *β* based on (7).

*Step 5. *If , loop end, else turn to Step 2 for next iteration.

##### 2.2. Multivariate Autoregressive Model

Autoregressive (AR) model deduces the regressive variables with itself, namely, using a linear combination of white noise at time* n* and values of previous* p* times to describe the output of system at time* n*. The equation can be described as follows [31]:where is the autoregressive coefficient, is the white noise with mean 0 and variance* σ*^{2}, and* p* is the order of model AR(*p*).

When multivariate modeling is required, AR model must be analyzed and established multiple times. For this reason, the model order determined by different variables may be inconsistent. Compared with AR model, multivariate autoregressive (MAR) model only models once and can reflect the internal relationship among different variables. Let be the value of the* k*-th variable at time* n*; obviously it relates to not only the values of previous* p* times and the white noise at time* n*, but also the values of previous* p* times of other variables [13]. The equation of MAR model can be described aswhere* p* is the order of MAR model, is the autoregressive coefficient for the* k*-th variable when the* m*-th variable is delayed with step* i*, and is the random error of the* m*-th variable.

##### 2.3. Support Vector Machine

Support vector machine (SVM) is a machine learning model developed by Vapnik [32]. It can deduce the optimal solution between model complexity and learning ability with limited information, so as to obtain the best classification accuracy. SVM has unique advantages in solving problems of limited samples, nonlinear and high-dimensional data. For a given sample set , the main idea of SVM is to map samples into a high-dimensional feature space and find a hyperplane to solve the corresponding nonlinear problem in low-dimensional space. The hyperplane function can be constructed aswhere* w* is weight vector and* b* is bias parameter, while represents the inner product of* w* and* x*.

Taking binary classification issue as an example, to classify samples correctly with sufficient classification interval, all samples are required to meet the following constraint.

In the sequel, classification interval can be calculated as , and maximizing the interval is equivalent to minimizing . To solve the linear indivisibility problem, slack term* ξ* and penalty factor

*C*are brought into (11), and hence construction of the optimal hyperplane is transformed into a constrained optimization problem.

When mapping samples to high-dimensional space, different inner product kernel functions will form different algorithms. Among them, radial basis function is widely employed in application of pattern recognition, which can be described aswhere is kernel parameter.

The solution of constrained optimization problem (12) is determined by saddle point of Lagrange function, and this quadratic programming problem is transformed into the corresponding dual problem; that is,where is Lagrange multiplier.

With the optimal value of* μ*_{i} obtained from above dual problem, the optimal classification discriminant function can be ascertained as follows.

#### 3. Fault Diagnosis by Blind Parameter Identification of MAR Model and Mutation Hybrid GWO-SCA Optimized SVM

##### 3.1. Blind Parameter Identification of MAR Model

Multivariate autoregressive (MAR) model is the multivariate version of autoregressive (AR) model. By synchronous autoregressive modeling and parameter identification of multiple variables, it possesses advantages of high efficiency and consideration of variable correlation. Parameter identification is a critical part of MAR analysis and appropriate parameters will contribute to a better fitting effect, during which process the order selection of MAR model not only relates to the extraction of essential features implicated in the sequence, but also has a great impact on recognition efficiency. In this research, Schwartz Bayes Criterion (SBC) is adopted to estimate the order of MAR model. The SBC function is as follows [33]:where* n*_{p}=*mp*+1, , and* R*_{22} is the fourth part of the upper triangular matrix obtained by QR decomposition of K matrix of the model (see [33] for more details).

The specific method of order selection is that, within the given range of model order,* S*(*p*) will gradually decrease with the order increasing. If* S*(*p*) takes the minimum value or there is no significant decreasing in* S*(*p*) as the order increases, the corresponding* p* value is the model order.

Considering the inner relation among different variable sequences, there will be a mass of parameters in MAR model. Generally, minimum variance criterion is used to estimate model parameters directly [34], but it would be computationally expensive for MAR model. To achieve better computational efficiency, a fast calculation method based on QR decomposition is applied to identify the parameters (see [33] for more details). Then all the identified parameters are combined to construct the feature vector of the multivariate sequences.

##### 3.2. Mutation GWO-SCA Optimization

###### 3.2.1. Sine Cosine Algorithm

Sine cosine algorithm (SCA) is a newly swarm optimization algorithm, which is simple in structure and easy to realize. Using only sine and cosine functions, SCA achieves good optimization effect. Exploration and exploitation are the two key phases when SCA processes optimization problem [24]. With an elaborate mechanism of exploration and exploitation, SCA is able to search for feasible solutions quickly in the searching space. Let be the position of* i*-th individual, whereupon each solution of the optimization problem corresponds to a position of corresponding individual in the searching space, where* d* is the dimension of individuals. The core updating equation of SCA is as follows [24]:where is the best position of individual* i*. The parameter determines the region of individual* i *at next iteration, where ,* l*, and* a* are the maximum number of iterations, the current number of iterations, and a constant, respectively. The parameter* r*_{2} is a random number in the scope of , which defines the distance that next movement should be towards. To stochastically emphasize () or deemphasize () the effect of the best position of individual, a random weight* r*_{3} within is brought in the equation. Finally, the parameter* r*_{4} is a random number in the range of to switch fairly between sine and cosine components.

###### 3.2.2. Grey Wolf Optimizer

Grey wolf optimizer is a swarm intelligence algorithm based on strict organization system of grey wolves and its sophisticated cooperative hunting method. GWO algorithm simulates the hierarchy and hunting behavior of grey wolves in the wild. Wolves are divided into four subgroups, i.e., *α*, *β*,* δ*, and

*. The first three subgroups are in turn the three wolves with the best fitness values, which guide other wolves (*

*ω**) to search for the target. During the optimization process, the wolves update the positions of*

*ω**α*,

*β*,

*, and*

*δ**. The updating equations of grey wolves can be described as [25]where , , and are distances between the position of individual*

*ω**α*,

*β*, and

*and an individual, respectively, while*

*δ**C*

_{1},

*C*

_{2}, and

*C*

_{3}are random numbers in .

*A*

_{1}=2

*a∙r*-

*a*,

*A*

_{2}=2

*a∙r*-

*a*, and

*A*

_{3}=2

*a∙r*-

*a*, where

*a=*2-2(

*l*/) is an adaptive parameter which will decrease linearly to 0 with iteration and

*r*is a real random number in . , , and represent the current position of individual

*α*,

*β*, and

*, respectively.*

*δ**X*indicates the final position of an individual which would be in a random place within a circle defined by individual

*α*,

*β*, and

*. Finally,*

*δ**l*is the current number of iterations.

###### 3.2.3. Mutation Hybrid GWO-SCA Optimization

In the updating strategy of GWO, the position of *α* is of the most importance, which guides the search behavior of other individuals. Besides in SCA, sine and cosine functions in (17) possess outstanding ability of balancing exploration and exploitation process. Hence the combination of capabilities between GWO and SCA is expected to improve the convergence performance, which makes the hybrid algorithm, grey wolf optimizer-sine cosine algorithm (GWO-SCA), powerful in both exploring different regions and exploiting promising regions. The part of individual *α* in (18) is improved with the updating strategy of SCA [35]:where values of parameter* r*_{1},* r*_{2}, and* r*_{4} are the same as in SCA, and the meaning of ,* C*_{1}, , and* X* are the same as in GWO, while* l* is the current time of iterations. Except for the subsequent mutation operation, the rest of the processing procedure of GWO remains unchanged.

Analysis of GWO shows that diversity of individuals is gradually losing with iteration and it is likely to result in the local optimum in the later stage of iteration, which will affect the optimization effect. To solve this problem, mutation operators are introduced to enrich diversity of individuals [36]. The new equation containing the mutation operators is applied to enhance (19):where* G* and* T* (*T*<) are the given mutation amplitude and mutation period, respectively, while* unif* represents a random number that conforms to the uniform distribution* U *. It can be observed that, during the updating process, the position of individual *α*, *β*, and* δ *will mutate periodically, which gives them a certain ability to jump out of local optimum even in the later stage of iterations.

The main steps of proposed mutation hybrid GWO-SCA (MHGWOSCA) are shown below:

*Step 1. *Initialize the population randomly in the given range of variables and set mutation parameters.

*Step 2. *Calculate the fitness value of each individual.

*Step 3. *Choose and save the best three individuals as *α*, *β*, and* δ*.

*Step 4. *Update other individuals (* ω*) according to (21) and (22).

*Step 5. *Update parameters* a*,* A*, and* C*.

*Step 6. *If the maximum number of iterations is not met, go to Step 2.

*Step 7. *Output the position of individual *α* as the optimal solution.

##### 3.3. Diagnosis Method Based on Blind Parameter Identification of MAR Model and Mutation Hybrid GWO-SCA Optimization

In this section, to promote the fault diagnosis accuracy, a fault diagnosis method based on blind parameter identification of MAR model and mutation hybrid GWO-SCA optimized SVM is proposed. The specific steps are detailed as follows:

*Step 1. *Collect the vibration signals.

*Step 2. *Determine the mode number* K* of VMD by central frequency observation method.

*Step 3. *Decompose the samples into sets of IMFs with VMD.

*Step 4. *Establish the MAR model of all IMFs and apply SBC to determine the model order; then MAR parameter vectors of different fault samples can be identified by QR decomposition.

*Step 5. *Extract principal features from parameter vectors based on PCA; thus the feature vectors of different fault types are obtained.

*Step 6. *Optimize the parameters* C* and for SVM with the proposed MHGWOSCA optimization method.

*Step 7. *Train the SVM model with optimal parameters* C* and ; then the optimal SVM model is obtained.

*Step 8. *Apply the optimal SVM model to classify different fault samples and evaluate the performance of the model.

The flowchart of the proposed fault diagnosis model is shown in Figure 1.

#### 4. Engineering Application

##### 4.1. Data Collection

Vibration signals with different fault locations and sizes gathered from Bearings Data Center of Case Western Reserve University [37] were employed as the experiment data in this research. The experiment stand was mainly composed of a motor, an accelerometer, a torque sensor/encoder, and a dynamometer. The bearing was a deep groove ball bearing with model SKF6205-2RS. Accelerometers were applied to collect vibration signal; meanwhile, the torque sensor/encoder was for the measurement of power and speed. By using electro-discharge machining, the experiment device simulated three fault states of the rolling bearing: inner race fault, ball element fault, and outer race fault, and the depth of faults was 0.011 inches. In the experiment, vibration signals collected from the drive end (DE) were taken as the research objects, where the sample frequency was 12kHz and the rotation speed was 1750 rpm under the rated load of 2 hp. To fully verify the validity of the proposed fault diagnosis method, 9 types of samples were used in this paper, namely, inner race fault, ball fault, and outer race fault with diameters of 0.007, 0.014, and 0.021 inches (i.e., each of the three types of faults has three defect sizes). Further, vibration signal of each fault type contained 59 samples and each sample contained 2048 sample points. Image of the experiment device is shown in Figure 2, and the experimental data [5] are listed in Table 1.

##### 4.2. Evaluation Metrics

In this application, adjusted Rand index (ARI), normalized mutual information (NMI), F-measure (F), and accuracy (ACC) as four widely used evaluation measures are employed to evaluate the diagnosis results [38], which reflect the matching degree between classification results and real sample distribution information.

Let be the given actual label set and be the classified result set. ARI considers the number of samples with the same labels and different labels. The scope of ARI is and the higher value indicates that more samples are labeled correctly. It can be expressed aswhere* n*_{11} is the number of sample pairs with the same label in both* Φ* and* Ω*, while* n*_{00} is the number of sample pairs with different labels in and* Ω*. is all possible combinations of sample pairs.

NMI is one of the common external evaluation metrics that can measure the degree of agreement between two data distributions [39]. The value of NMI lies between 0 and 1, and the higher value indicates a better classification performance. The calculation equation is as follows:where* p*(* φ*,

*) is the joint probability function of*

*ω**Φ*and

*Ω*, and

*p*(

*) and*

*φ**p*(

*) are the probability distribution functions of*

*ω**Φ*and

*Ω*, respectively.

F and ACC are derivations from the confusion matrix whose structure is reflected in Table 2 [40, 41]. F comprehensively considers the proportion of true positive result in positive actual label and in positive classified result. ACC represents the proportion of all correctly classified results in total results. The scope of these two metrics is both , and the higher value indicates a better classification performance. F and ACC can be defined as (25) and (26), respectively.

##### 4.3. Application to Fault Diagnosis of Rolling Bearing

To demonstrate the superiority of the proposed VMD-MAR-MHGWOSCA-SVM method, the experiment was carried out in a comparative form at each stage, as follows: EMD was applied for comparison in the signal processing stage; fuzzy entropy (FE) [42] and permutation entropy (PE) [5] were used in the feature extraction stage for comparison; GWO, SCA, and HGWOSCA were employed for comparison in the parameter optimization stage. The parameters of the contrastive experiments were set in the same way as the proposed method.

In the proposed method, the first step is to decompose signal of each fault type into a set of IMFs. The mode number* K* of VMD needs to be determined in advance with central frequency observation method. If* K* is too large, mode mixing would occur because central frequencies of adjacent IMFs are too close. In contrast, it is difficult to weaken the nonstationarity effectively if* K* is too small. Central frequencies under different* K* value are listed in Table 3, where* K* was ascertained by the sample of inner race fault with diameter of 0.007 inches (L1). As illustrated in Table 3, when* K *was set to 5, the first two central frequencies were relatively close, which meant excessive decomposition had occurred; hence parameter* K *was set to 4. The original signal waveforms of different fault types are shown in Figure 3 and decomposition results of 0.007 inches faults (L1, L4, L7) are illustrated in Figure 4. It can be observed that waveforms from 9 types of faults are quite different and IMFs of one fault type are also discriminative to a certain extent.

**(a)**

**(b)**

**(c)**

**(d)**

**(e)**

**(f)**

**(g)**

**(h)**

**(i)**

**(a)**

**(b)**

**(c)**

When decomposed components of a sample were obtained, a multivariate autoregressive (MAR) model was established for all IMFs. The values of SBC value* S*(*p*) are shown in Figure 5, where the order of MAR model was determined according to the sample of fault type L1. The figure shows that as the model order* p* increases,* S*(*p*) decreases gradually, but the value of* S*(*p*) is no longer significantly decreased when* p* increases to 10. Hence, the MAR model order* p* is selected as 10 in the experiment. After establishing the MAR model, all parameters were identified based on the QR decomposition and thus the fault feature vectors were formed. Further, principal features were extracted from all features by using principal component analysis (PCA), thus achieving a balance between the computational cost and the integrity of feature information. The principal component (PC) distribution of the feature vectors obtained by blind parameter identification of MAR model is illustrated in Figure 6 and Table 4, while the 3D projection of feature vectors for different faults is shown in Figure 7.

During optimization stage, the feature vectors of different fault types were divided randomly into two parts with 30 and 29 vectors, respectively, and the 30 ones were used for training while the 29 ones for testing. The proposed MHGWOSCA approach was applied to optimize the penalty factor* C* and the kernel parameter , where individual number and iteration time were set to 30 and 100, respectively, while searching ranges of* C* and were both . Considering the maximum number of iterations and the number of optimization parameters, mutation amplitude was set to 0.5 and mutation period was set to 5. During optimization process, fitness values were calculated by fivefold cross validation for the training samples. Hence, the optimization effect of the given* C* and can be measured by average accuracy of cross validation, and then the optimal* C *and were obtained. The convergence procedure of MHGWOSCA is depicted in Figure 8, from which it can be observed that, with the progress of iteration, the average fitness value of individuals rises rapidly at first and then tends to be stable; that is, the individuals are close to the global optimal solution. In addition, due to the effect of the mutation operators, the best three individuals *α*, *β*, and* δ *change greatly and periodically, trying to jump out of the local optimum without affecting the overall convergence trend. The shaded part shows the distribution of the convergence curve in 10 runs. The convergence curves of different optimization methods are compared in Figure 9, while the average fitness values for different optimization methods in the last 20 iterations are listed in Table 5. The convergence data are the average of 10 runs. The figure shows that the convergence rate of SCA is significantly lower than other methods; GWO and HGWOSCA both have good convergence trend; MHGWOSCA first converges to a high level in the initial stage and shows the best convergence effect at later stage of iterations among all methods.

**(a)**

**(b)**

**(a)**

**(b)**

With the optimal parameters* C *and , the SVM model optimized by MHGWOSCA was trained and applied to recognize testing samples. For an in-depth verification of the proposed method, diagnosis results were averaged over 10 runs and training samples were chosen at random in every run. Furthermore, deviation of each result was calculated as a reference for result evaluation. The optimal* C* and are presented corresponding to the best accuracy among 10 runs.

Fault diagnosis results for different methods are shown in Table 6 and Figure 10. It can be seen from Table 6 that the evaluation values of the proposed VMD-MAR-MHGWOSCA-SVM method are the highest among all four metrics, i.e., 09573, 0.9554, 0.9808, and 0.9808, and the deviations of the evaluation values are also at a small level. Specifically, the comparison between EMD-MAR-MHGWOSCA-SVM and the proposed method indicates the effectiveness of VMD; the comparison of VMD-FE-HGWOSCA-SVM, VMD-PE-HGWOSCA-SVM, and the proposed method shows MAR model possess a better feature extraction ability; similarly, it can be concluded from the contrastive experiment among VMD-MAR-GWO-SVM, VMD-MAR-HGWOSCA-SVM, VMD-MAR-SCA-SVM, and the proposed method that MHGWOSCA improves the diagnosis performance of SVM. Figure 11 is the boxplots of evaluation values, illustrating the performance of different diagnosis methods. As shown in the figure, the proposed method achieves the highest evaluation values and possesses outstanding stability among all methods.

#### 5. Conclusions

For rolling bearing fault diagnosis, a new method based on blind parameter identification of MAR model and mutation hybrid GWO-SCA optimized SVM is proposed in this study. Due to the nonstationarity of original signal, signals collected from different types of faults were firstly split into a set of IMFs by VMD, the decomposing mode number* K *of which was determined with central frequency observation method. Then all the IMFs were employed to establish a MAR model, whose order was determined by SBC. Later, all parameters of the MAR model were identified blindly through QR decomposition and key features of parameter vectors were extracted via PCA, thus forming the fault feature vectors. Finally, a hybrid optimization method combining mutation operator, GWO, and SCA theory, termed mutation hybrid GWO-SCA (MHGWOSCA), was proposed to optimize SVM model which was subsequently applied to realize the pattern recognition for different fault samples. The contrastive experiments in Section 4 were implemented among the proposed VMD-MAR-MHGWOSCA-SVM method and other seven relevant methods, where nine types of faults with different locations and sizes were successfully diagnosed. The analysis of results indicates that the proposed method exhibits the best performance in all classification metrics, showing its outstanding precision and stability among all methods.

#### Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

#### Conflicts of Interest

The authors declare no conflicts of interest.

#### Acknowledgments

This work was supported by the National Natural Science Foundation of China (nos. 51741907 and 51409095), the Fundamental Research Project for Application Supported by Yichang Science and Technology Bureau (no. A17-302-a12), the Open Fund of Hubei Provincial Key Laboratory for Operation and Control of Cascaded Hydropower Station (no. 2017KJX06), and the Research Fund for Excellent Dissertation of China Three Gorges University (no. 2019SSPY070).