#### Abstract

Spiral bevel gears are basic transmission components which are widely used in mechanical equipment. Monitoring and diagnosing their running states is important for ensuring the safe operation of the entire equipment. The vibration signals of spiral bevel gears are typically quite complicated, as they present both nonlinear and nonstationary characteristics. In previous studies, multiscale permutation entropy (MPE) has been proven to be an effective nonlinear analysis tool for evaluating the complexity and irregularity of complex mechanical systems. Therefore, MPE values can be used as sensitive features for spiral bevel gear fault identification. However, if the MPE values are used directly to construct the feature vectors, some problems are encountered, such as a large number of characteristic quantities, high dimensionality, and difficulty in achieving diagnostic accuracy and efficiency at the same time. In order to improve the accuracy and efficiency of fault recognition for spiral bevel gears, locality preserving projection (LPP) can be applied to reduce the dimensionality of the high-dimensional feature vectors constructed from MPE, since it has the ability to extract low-dimensional sensitive information from high-dimensional feature data. In order to obtain the diagnostic results directly, a classifier is necessary. When compared with traditional neural networks, extreme learning machines (ELMs) have the advantages of faster training speed and stronger learning ability. In summary, this study proposes using MPE values that have been optimized and dimensionally reduced by LPP as the feature vectors, along with an ELM as the classifier for fault mode identification, as a fault diagnosis method for spiral bevel gears. The proposed method was applied to the diagnosis of spiral bevel gears in four states. Then, the MPE-LPP-ELM results were compared with those obtained using the MPE-PCA-ELM and MPE-ELM methods. The respective diagnostic accuracies were 100%, 98.75%, and 98.75%, and the respective diagnostic times were 0.0023 s, 0.0033 s, and 0.0078 s. These results confirmed the accuracy and superiority of the proposed method.

#### 1. Introduction

Spiral bevel gears are the core transmission components of power transmissions and speed transformations in modern mechanical equipment. They are widely used in such fields as aerospace, automobile, energy, and so on due to their large meshing surfaces and stable transmission abilities [1]. However, due to the complex and changeable operational environment of spiral bevel gears, including long-term operations in environments with variable speeds and high loads, as well as the influencing effects of errors in manufacturing and installation, faults may very easily occur. These include gear tooth breakage and gear tooth surface scratches and pitting, which may seriously affect the reliability and performance of the whole equipment [2, 3]. Therefore, it is of major significance to find and detect potential faults in spiral bevel gear transmission systems in a real-time manner, as well as arranging reasonable equipment maintenance practices, in order to avoid serious equipment failures and reduce economic and property losses [4].

When the spiral bevel gear failure occurs, it will be reflected in various aspects, such as vibration [5], thermal imaging [6], and oil. The vibration-based method has been proven to be the most effective method for fault diagnosis of spiral bevel gears [7]. It has been found that, due to the continuous impact actions of the teeth of spiral bevel gears during meshing processes, the complex in situ noise of working environments tends to lead to many interference noise components of vibration signals. As a result, such fault vibration signals present high background noise effects and strong nonlinear nonstationary characteristics [8]. The nonlinear characteristics of spiral bevel gears are manifested in the complexity of the vibration signals, and the complexity of different working conditions may not be the same [9, 10]. Therefore, as a method for detecting the time series complexity and kinetic mutations, permutation entropy methods have been used due to their high computational efficiency and strong robustness [11]. Such methods were first proposed by Bandt et al. [12] and have been effectively verified in the field of mechanical fault diagnosis [13]. However, the deep feature information of the faults cannot be fully extracted by only using permutation entropy analysis on a single scale. It has been determined that in order to characterize the levels of the complexity of fault signals more accurately, multiscale analyses of time series are required. In previous research investigations, based on the aforementioned requirements, Aziz et al. put forward the concept of multiscale permutation entropy (MPE) [14]. MPE has the ability to perform the multiscale processing of complex vibration signals of mechanical systems according to various time series and can accurately extract fault feature information from both local and global perspectives [15]. 
Therefore, the calculated entropy values are considered to be more objective and are currently widely used in the fault diagnosis of complex mechanical systems [16]. For example, MPE methods have been successfully applied to the fault diagnosis of gears and bearings [17, 18]: by inputting the MPE values of the vibration signals as characteristic quantities into an improved binary-tree support vector machine, various gear and bearing states can be effectively distinguished. In [19], MPE was used to identify the normal state and three fault states of rolling bearings, and satisfactory results were obtained by clustering and recognition analysis. At present, when MPE is used to construct feature vectors for fault diagnosis, the entropy values at multiple scales are generally used directly. However, this leads to problems such as a large number of characteristic quantities, high dimensionality, relatively time-consuming training and testing, and difficulty in achieving diagnostic accuracy and efficiency at the same time.

In order to improve the fault recognition accuracy and efficiency of spiral bevel gears, it is necessary to reduce the dimensionality of the high-dimensional characteristic quantities, remove the redundant information, and extract the optimal low-dimensional sensitive characteristic quantities. Traditional dimensionality reduction methods, such as principal component analysis (PCA) [20, 21] and independent component analysis (ICA) [22, 23], have been successfully applied in the field of fault diagnosis. However, although PCA can effectively reduce the dimensionality of data features with Gaussian linear distributions, the majority of spiral bevel gear vibration signals measured in situ do not follow such distributions. Furthermore, although ICA can process non-Gaussian data, it cannot effectively and concisely reduce the dimensionality of some Gaussian data features. Moreover, both PCA and ICA project the original high-dimensional data into low-dimensional linear spaces based on the global features of linear structures, whereas the vibration signals of spiral bevel gears are known to be strongly nonlinear and nonstationary. Effective methods are therefore required for processing nonlinear structural data, retaining the potential low-dimensional local characteristics in the data features, and then extracting the low-dimensional sensitive features, which are key to accurate diagnosis. In recent years, manifold learning techniques have been developed which include effective unsupervised data dimensionality reduction algorithms. Among those, locality preserving projection (LPP) [24, 25] methods have displayed the ability to extract low-dimensional sensitive information from high-dimensional feature data and then effectively mine and retain the local manifold characteristics of the data. Previous studies using LPP have achieved good classification effects, and the method has been successfully applied to image processing and fault diagnosis. For example, as detailed in [26, 27], LPP methods have been used to realize effective recognition of bearing fault states and to improve diagnostic efficiency.

At present, BP networks and PNN networks are considered to be effective neural network methods for fault mode recognition. With the continuing development of neural network theory, Huang et al. [28] put forward the theory of extreme learning machines (ELMs). The algorithm addresses the long training times and weak global optimization abilities of single-hidden-layer feedforward neural networks. When compared with traditional neural networks, it has displayed faster training speed and stronger learning ability. In recent years, the use of ELMs as intelligent diagnosis methods for the vibration signals of faults in rotating machinery has been greatly promoted. Examples of the use of ELMs in the diagnosis of bearing faults are presented in [29, 30].

In summary, this study proposes using MPE values that have been optimized and dimensionally reduced by LPP as the feature vectors, along with an ELM as the classifier for fault mode identification, in order to carry out research on fault diagnosis methods for spiral bevel gears. In addition, in view of the complex and serious noise interference in the working environments of spiral bevel gears, the original vibration signals were denoised prior to the construction of the MPE characteristic quantities. The proposed method provides a valuable fault diagnosis approach for spiral bevel gears, which combines the nonlinear feature extraction ability of MPE, the feature dimensionality reduction ability of LPP, and the strong learning ability of ELM.

#### 2. MPE-LPP-ELM Fault Diagnosis Method

##### 2.1. Construction of the Original Feature Vectors of the MPE

Aziz et al. put forward the multiscale permutation entropy theory by studying permutation entropy and multiscale sample entropy; it is a further optimization of permutation entropy. For an input signal, multiscale permutation entropy first performs multiscale coarse-graining of the signal and then calculates the permutation entropy values of the coarse-grained results. Assuming that the spiral bevel gear signal to be detected is a time series $\{x_i,\ i = 1, 2, \ldots, N\}$ of length *N*, the calculation process of the multiscale permutation entropy is as follows.

(1) The detection signal was at first coarse-grained, and the coarse-grained result was as follows:

$$y_j^{(f)} = \frac{1}{f} \sum_{i=(j-1)f+1}^{jf} x_i, \quad 1 \le j \le \frac{N}{f}, \tag{1}$$

where *f* is the scale factor; $y_j^{(f)}$ indicates the coarse-grained multiscale time series; and *N* represents the length of the original time series to be measured. When the scale factor *f* = 1, the coarse-grained series is the original signal, so the result is the permutation entropy of the original signal.

(2) After the phase-space reconstruction of the coarse-grained $y_j^{(f)}$, the following was obtained:

$$Y_e^{(f)} = \left\{ y_e^{(f)},\ y_{e+\tau}^{(f)},\ \ldots,\ y_{e+(m-1)\tau}^{(f)} \right\}, \tag{2}$$

where *e* represents the *e*th reconstruction component ($e = 1, 2, \ldots, N/f - (m-1)\tau$); *m* indicates the embedded dimension; and $\tau$ is the delay time.

(3) After the reconstructed sequence was arranged in ascending order, the following was obtained:

$$y_{e+(j_1-1)\tau}^{(f)} \le y_{e+(j_2-1)\tau}^{(f)} \le \cdots \le y_{e+(j_m-1)\tau}^{(f)}, \tag{3}$$

where *e*, *m*, and $\tau$ are as defined above. The original location code of the reconstruction sequence was $S(r) = (j_1, j_2, \ldots, j_m)$, with $r = 1, 2, \ldots, R$ and $R \le m!$. There were $m!$ different permutations, and the occurrence probability $P_r$ of each permutation code was counted. Therefore, the multiscale permutation entropy after coarse-graining was defined as follows:

$$H_p(m) = -\sum_{r=1}^{R} P_r \ln P_r. \tag{4}$$

(4) When $P_r = 1/m!$, $H_p(m)$ had the maximum value $\ln(m!)$. Then, following the normalization of $H_p(m)$, the normalized multiscale permutation entropy $H_p = H_p(m)/\ln(m!)$ was obtained.

In this study, through detailed analysis of the measured signals and consultation of the related literature, the scale factor *f* was set to 15, the embedded dimension *m* to 4, and the delay time $\tau$ to 1. The multiscale permutation entropy was able to better detect the mutation characteristics of the vibration signals of the spiral bevel gears in different states on multiple scales. Then, the multiscale permutation entropy of the reconstructed signals was calculated, and the original high-dimensional eigenvector was constructed.
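As an illustration of the calculation process above, the following is a minimal Python sketch of normalized MPE using the parameter settings of this study (scale factors 1 to 15, *m* = 4, delay 1). The function names, the use of NumPy, and the stable tie-breaking rule for equal values are assumptions made for illustration and are not taken from the original implementation.

```python
import math

import numpy as np


def coarse_grain(x, scale):
    """Average consecutive, non-overlapping windows of length `scale`."""
    x = np.asarray(x, dtype=float)
    n_windows = len(x) // scale
    return x[: n_windows * scale].reshape(n_windows, scale).mean(axis=1)


def permutation_entropy(y, m=4, tau=1):
    """Normalized permutation entropy of a 1-D series (value in [0, 1])."""
    if m < 2:
        raise ValueError("embedded dimension m must be at least 2")
    y = np.asarray(y, dtype=float)
    n_vectors = len(y) - (m - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series too short for the chosen m and tau")
    # Each row is one reconstructed vector [y_e, y_{e+tau}, ..., y_{e+(m-1)tau}].
    idx = np.arange(n_vectors)[:, None] + tau * np.arange(m)[None, :]
    # Ordinal pattern of each row; ties keep their order of appearance (assumption).
    patterns = np.argsort(y[idx], axis=1, kind="stable")
    # Relative frequency of every pattern that actually occurs.
    _, counts = np.unique(patterns, axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log(p)).sum() / math.log(math.factorial(m)))


def multiscale_permutation_entropy(x, max_scale=15, m=4, tau=1):
    """One normalized entropy value per scale factor 1..max_scale."""
    return np.array(
        [permutation_entropy(coarse_grain(x, s), m, tau)
         for s in range(1, max_scale + 1)]
    )
```

Applied to one denoised sample, `multiscale_permutation_entropy` returns a 15-dimensional vector that can serve as one row of the original high-dimensional feature matrix.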

##### 2.2. Feature Reductions Based on LPP

As one of the classical manifold learning methods, locality preserving projection (LPP) maps high-dimensional data into a low-dimensional space through a Laplacian eigenmap projection according to the structural relationships between adjacent points of the original data. LPP reduces the dimensionality while maintaining the local structure of the original data. Therefore, it can retain the inherent nonlinear structures and local sensitivity characteristics of the original high-dimensional eigenvectors of the spiral bevel gears, effectively reduce the dimensions of the high-dimensional features, and extract the optimal low-dimensional sensitive characteristic quantities.

The original high-dimensional eigenvectors of the spiral bevel gears can be expressed as the high-dimensional space matrix $X = [x_1, x_2, \ldots, x_n] \in \mathbb{R}^{f \times n}$, in which the number of samples is *n* and the dimensionality of each sample is *f*. Through the LPP algorithm, the low-dimensional projection matrix can be obtained. That is to say, the low-dimensional sensitive eigenvector matrix $H = [h_1, h_2, \ldots, h_d]$ (where $h_i$ is a vector of $\mathbb{R}^{f}$ and *d* < *f*) can be determined so that the nonlinear neighborhood relationship of the original high-dimensional eigenvectors is maintained after the mapping to the low-dimensional space $y_i = H^{T} x_i$ (where $y_i \in \mathbb{R}^{d}$) is completed.

In order to obtain the projection matrix, LPP minimizes the distance between $y_i$ and $y_j$ for neighboring samples by optimizing the function (5) as follows:

$$\min_{h} \frac{1}{2} \sum_{i,j} (y_i - y_j)^2 W_{ij} = \min_{h} h^{T} X L X^{T} h, \tag{5}$$

where $y_i = h^{T} x_i$, $i, j = 1, 2, \ldots, n$; *n* is the number of input samples; *h* indicates the projection vector; $L = D - W$ represents the symmetric and positive semidefinite Laplacian matrix; *D* is the diagonal matrix satisfying $D_{ii} = \sum_{j} W_{ij}$, under the constraint $h^{T} X D X^{T} h = 1$; and *W* is the relation matrix which can be defined by using the k-nearest neighbor algorithm, written here in the standard heat-kernel form, as follows:

$$W_{ij} = \begin{cases} \exp\left(-\dfrac{\lVert x_i - x_j \rVert^2}{\sigma}\right), & x_i \in N_k(x_j) \ \text{or} \ x_j \in N_k(x_i), \\ 0, & \text{otherwise}, \end{cases} \tag{6}$$

where $N_k(x)$ denotes the set of the *k* nearest neighbors of $x$ and $\sigma$ is the heat-kernel parameter.

Therefore, the projection matrix was obtained by solving the following generalized eigenvalue decomposition problem:

$$X L X^{T} h = \lambda X D X^{T} h. \tag{7}$$

Then, the projection matrix formed by the eigenvectors corresponding to the *d* smallest eigenvalues was the sensitive eigenvector matrix $H$, which could be utilized to realize the high-dimensional feature reductions.
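The LPP procedure described in this section can be sketched in a few lines of NumPy. This is a minimal illustration rather than the implementation used in this study: the heat-kernel weights and their parameter `sigma`, the small ridge term added for numerical stability, and the reduction of the generalized eigenvalue problem to a standard one through a Cholesky factorization are all assumptions. Samples are stored as rows here, so `X.T @ L @ X` plays the role of the matrix product in the eigenvalue problem.

```python
import numpy as np


def lpp(X, n_components=3, k=7, sigma=1.0):
    """Locality preserving projection.

    X has shape (n_samples, n_features). Returns (H, Y), where H is the
    (n_features, n_components) projection matrix and Y = X @ H.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Pairwise squared Euclidean distances between samples.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    # Indices of the k nearest neighbours (column 0 is normally the sample itself).
    nn = np.argsort(d2, axis=1)[:, 1:k + 1]
    rows = np.repeat(np.arange(n), k)
    cols = nn.ravel()
    W = np.zeros((n, n))
    W[rows, cols] = np.exp(-d2[rows, cols] / sigma)  # heat-kernel weights (assumed)
    W = np.maximum(W, W.T)  # i and j are linked if either is a neighbour of the other
    D = np.diag(W.sum(axis=1))
    L = D - W
    A = X.T @ L @ X
    B = X.T @ D @ X
    # Small ridge so that the Cholesky factorization below is well defined.
    B += 1e-9 * np.trace(B) / B.shape[0] * np.eye(B.shape[0])
    # Turn A h = lambda B h into a symmetric standard problem with B = C C^T.
    C_inv = np.linalg.inv(np.linalg.cholesky(B))
    _, vecs = np.linalg.eigh(C_inv @ A @ C_inv.T)  # eigenvalues in ascending order
    H = C_inv.T @ vecs[:, :n_components]  # eigenvectors of the smallest eigenvalues
    return H, X @ H
```

With the settings reported later in this paper (three output dimensions and *k* = 7), a call such as `H, Y = lpp(features, n_components=3, k=7)` would map a hypothetical 15-dimensional MPE feature matrix `features` to three dimensions.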

##### 2.3. Fault Mode Identifications Based on ELM

In recent years, with the further development of neural networks, ELM algorithms have emerged. They improve on single-hidden-layer feedforward neural networks (SLFNs) and largely overcome the shortcomings of SLFNs, such as time-consuming training and difficulties in global optimization, and have therefore become simple and efficient learning methods. Only the number of hidden layer nodes and the activation function need to be set; the connection weights between the input layer and the hidden layer and the thresholds of the hidden layer are generated randomly. An ELM consists of an input layer, a hidden layer, and an output layer, and its structure is detailed in Figure 1.

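The mathematical model of the ELM is as follows:

$$\sum_{i=1}^{M} \beta_i \, g(\omega_i \cdot X_j + b_i) = Y_j, \quad j = 1, 2, \ldots, Z, \tag{8}$$

where $\omega_i$ is the input weight; $b_i$ denotes the deviation value (bias) of the hidden layer; $\beta_i$ is the output weight; *M* is the number of hidden layer nodes; and $g(\cdot)$ represents the activation function, for which the sigmoid function was selected in this study; *Z* indicates the number of samples; and $X_j$ and $Y_j$ are the input and output vectors, respectively.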

For the *Z* samples $(X_j, Y_j)$, the number of hidden layer nodes is denoted by *M*, and the output weight $\beta$ is obtained by training in order to complete the ELM network training. The algorithm process used in this study was as follows:

(1) The input weight $\omega_i$ and hidden layer deviation $b_i$ were set randomly and remained unchanged after initialization, *i* = 1, 2, …, *M*.

(2) The output matrix *G* of the hidden layer was calculated.

(3) The output weight values $\hat{\beta}$, $\hat{\beta} = G^{+} Y$, were calculated, where $G^{+}$ is the Moore-Penrose generalized inverse of *G* and *Y* is the matrix of target outputs.
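The three training steps above can be written as a short NumPy sketch. It is offered only as an illustration: the class name `ELM`, the uniform range used for the random weights, the one-hot encoding of the class labels, and the argmax decision rule are assumptions that are not specified in the text.

```python
import numpy as np


class ELM:
    """Single-hidden-layer extreme learning machine for classification."""

    def __init__(self, n_hidden=14, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Sigmoid activation of the randomly weighted inputs (matrix G).
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, labels):
        X = np.asarray(X, dtype=float)
        labels = np.asarray(labels)
        self.classes_ = np.unique(labels)
        # One-hot target matrix Y (assumed encoding).
        Y = (labels[:, None] == self.classes_[None, :]).astype(float)
        # Step 1: random input weights and hidden biases, fixed afterwards.
        self.W = self.rng.uniform(-1.0, 1.0, size=(X.shape[1], self.n_hidden))
        self.b = self.rng.uniform(-1.0, 1.0, size=self.n_hidden)
        # Step 2: hidden layer output matrix G.
        G = self._hidden(X)
        # Step 3: output weights from the Moore-Penrose pseudoinverse of G.
        self.beta = np.linalg.pinv(G) @ Y
        return self

    def predict(self, X):
        scores = self._hidden(np.asarray(X, dtype=float)) @ self.beta
        return self.classes_[np.argmax(scores, axis=1)]
```

Because the only fitted quantity is the output weight matrix, training reduces to a single least-squares solve, which is consistent with the short training times attributed to ELMs in this paper.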

The flow of the spiral bevel gear fault diagnosis method proposed in this study is shown in Figure 2, and the specific diagnosis steps were as follows.

###### 2.3.1. Signal Acquisition

The vibration signals under the S1, S2, and S3 rotation speed conditions of the spiral bevel gears were collected. Then, taking the rotation speed condition S1 as an example, the normal state was set as A; one-third broken tooth state as B; two-thirds broken tooth state as C; and serious scratch state as D. There were a total of *n* samples in each fault state.

###### 2.3.2. CEEMDAN Denoising

CEEMDAN (complete ensemble empirical mode decomposition with adaptive noise) was applied to the state samples collected under the S_{1} rotation speed condition in Step (1). The correlation coefficients were then used to select the two intrinsic mode functions (IMFs) with the highest correlation, which were reconstructed in order to obtain signals containing the main information components. The corresponding reconstructed signals of the four states were denoted as *A*_{c}, *B*_{c}, *C*_{c}, and *D*_{c}, respectively.
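The IMF selection step can be illustrated with the sketch below. It assumes that the IMFs have already been computed by some CEEMDAN implementation and are passed in as rows of an array, and that the correlation coefficient in question is the Pearson correlation between each IMF and the original signal; the function name and both of these choices are illustrative assumptions.

```python
import numpy as np


def reconstruct_from_top_imfs(signal, imfs, n_keep=2):
    """Sum the `n_keep` IMFs most correlated with the original signal.

    `imfs` has shape (n_imfs, n_samples) and is assumed to come from a
    CEEMDAN decomposition computed elsewhere.
    """
    signal = np.asarray(signal, dtype=float)
    imfs = np.asarray(imfs, dtype=float)
    # Pearson correlation of every IMF with the original signal.
    corr = np.array([np.corrcoef(signal, imf)[0, 1] for imf in imfs])
    # Indices of the most correlated IMFs, returned in their original order.
    keep = np.sort(np.argsort(corr)[-n_keep:])
    return imfs[keep].sum(axis=0), keep, corr
```

The function returns the reconstructed signal together with the selected indices and all correlation coefficients, so the selection can be inspected before the MPE values are calculated.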

###### 2.3.3. Construction of High-Dimensional Eigenvector

Multiscale permutation entropy was calculated for the reconstructed signals obtained in Step (2), and the high-dimensional eigenvectors which could preliminarily characterize the different fault states of spiral bevel gears were obtained as follows:

$$T_t = \left[ \mathrm{MPE}_{t1}, \mathrm{MPE}_{t2}, \ldots, \mathrm{MPE}_{tf} \right], \tag{9}$$

where *t* = 1, 2, …, *n* indexes the samples and *f* indicates the scale factor.

###### 2.3.4. Feature Reductions

The manifold-based dimensionality reduction of LPP was applied to the obtained high-dimensional eigenvectors. Redundant features were eliminated, and the optimal low-dimensional sensitive eigenvectors among the high-dimensional eigenvectors were effectively extracted as follows:

$$T_t^{\mathrm{LPP}} = \left[ y_{t1}, y_{t2}, \ldots, y_{td} \right], \tag{10}$$

where $y_{tk}$ is the *k*th component, *k* = 1, 2, …, *d*, and *d* is the number of eigenvalues retained for the LPP projection matrix.

###### 2.3.5. Recognition and Diagnosis

The optimal sensitive eigenvectors were input into the ELM for training and testing purposes, and the diagnosis results were obtained.

#### 3. Example Analysis of the Spiral Bevel Gear Fault Diagnosis

##### 3.1. Collection of the Experimental Data

In this study, the effectiveness of the proposed diagnosis method based on MPE-LPP and ELM was proven through fault diagnosis experiments of spiral bevel gears. The experimental data were obtained from the fault test bench for spiral bevel gear box as shown in Figure 3, which was mainly composed of a spiral bevel gear box and its accessory equipment and signal acquisition system.

The auxiliary equipment included a drive motor, governor, coupler, and so on. Then, the PULSE system of the B&K Company was used as the signal acquisition system, and the driving gear at the input shaft was taken as the test gear, with the number of gear teeth set as 10. The fault states of normal gears; gears with one-third broken tooth; gears with two-thirds broken tooth; and gears with severe scratches were simulated in this study, as shown in Figure 4.

During the experimental testing processes, the input shaft speed was controlled by the speed governor in order to maintain constant speeds of 900 r/min, 1,200 r/min, and 1,500 r/min for the simulations of three rotation speed conditions of the spiral bevel gears denoted as S_{1}, S_{2}, and S_{3}, respectively. A vibration acceleration sensor was installed at the bearing seat of the input shaft, and the sampling frequency was set as 16384 Hz.
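For orientation, the characteristic frequencies implied by these operating conditions follow from simple arithmetic. The short sketch below derives the shaft rotation frequency, the gear mesh frequency of the 10-tooth driving gear (number of teeth multiplied by the shaft frequency), and the number of samples recorded per shaft revolution; the function and constant names are illustrative, and the derived values are not quoted from the original text.

```python
SAMPLING_RATE_HZ = 16384  # sampling frequency stated in the text
TEST_GEAR_TEETH = 10      # teeth on the driving (test) gear


def operating_frequencies(speed_rpm):
    """Characteristic frequencies for a given input shaft speed in r/min."""
    shaft_hz = speed_rpm / 60.0
    return {
        "shaft_hz": shaft_hz,
        "mesh_hz": TEST_GEAR_TEETH * shaft_hz,
        "samples_per_rev": SAMPLING_RATE_HZ / shaft_hz,
    }


for label, rpm in (("S1", 900), ("S2", 1200), ("S3", 1500)):
    print(label, operating_frequencies(rpm))
```

At 900 r/min, for example, the shaft rotates at 15 Hz and the mesh frequency is 150 Hz, so the sampling rate resolves each revolution with roughly a thousand samples.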

##### 3.2. Case Analysis of the Fault Diagnosis

This study first selected the vibration signals with an input shaft speed of 900 r/min for analysis. Figure 5 shows the time-domain vibration signals of the four states of the spiral bevel gears at 900 r/min. It can be seen in the figure that the amplitude of the vibration signal increased significantly when the spiral bevel gears were in a fault state.

CEEMDAN was used to decompose the original vibration signals, and the two intrinsic mode functions (IMFs) with the largest correlation coefficients were selected for reconstruction in order to obtain the denoised signals. The meaningless low-frequency components were thereby eliminated, while the middle- and high-frequency parts containing the main information were retained, which effectively filtered out the interference of the background noise and increased the proportion of impact components in the reconstructed signals.

The entropy values under the best scale factors *f* of the MPE were selected as the eigenvectors. To evaluate each scale factor, the sum of the absolute values of the pairwise MPE differences between the different fault states under the same *f* was taken as *S*_{k}. The larger the *S*_{k} value, the better the scale factor *f* discriminates between the different faults. Then, by calculating the multiscale permutation entropy of the spiral bevel gear reconstruction signals *A*_{c}, *B*_{c}, *C*_{c}, and *D*_{c}, it was confirmed that the MPE values varied with the scale factor under the same state. The MPE values of the reconstructed signals of the four states under the same scale factor were also different, and the larger *S*_{k} values were observed to be mainly distributed in the first several scale factors, as shown in Table 1.
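The discrimination measure *S*_{k} described above can be computed as in the following sketch, which sums the absolute MPE differences over all pairs of states at each scale factor. The function name and the input layout (one row per state, one column per scale factor) are assumptions, and the numbers in the usage note are hypothetical values chosen for illustration, not entries of Table 1.

```python
from itertools import combinations

import numpy as np


def scale_discrimination(mpe_by_state):
    """Return S_k for every scale factor.

    `mpe_by_state` has shape (n_states, n_scales); entry [i, k] is the MPE
    of state i at the (k + 1)-th scale factor.
    """
    mpe = np.asarray(mpe_by_state, dtype=float)
    s = np.zeros(mpe.shape[1])
    # Add |MPE_i - MPE_j| for every unordered pair of states (i, j).
    for i, j in combinations(range(mpe.shape[0]), 2):
        s += np.abs(mpe[i] - mpe[j])
    return s
```

For instance, with four states whose hypothetical entropies at one scale are 0.5, 0.6, 0.8, and 0.7, the six pairwise differences sum to 1.0, whereas a scale at which all four states share the same entropy gives 0 and therefore carries no discriminative information.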

The MPE values of the denoised signals of the spiral bevel gears in the four states at scale factors up to *f* = 15 were taken to construct the high-dimensional original eigenvectors. Then, through the manifold learning characteristics of LPP, the original high-dimensional eigenvectors were optimized and reduced in order to obtain the optimal low-dimensional sensitive eigenvectors. The target dimensionality was set to 3, and the number of nearest neighbors was *k* = 7. Figure 6 shows the three-dimensional diagram of the MPE-LPP eigenvectors. It can be seen in the figure that the classification effects of the four states were ideal, among which the clustering effects of the normal state, one-third broken tooth state, and two-thirds broken tooth state were found to be superior. The four states of the spiral bevel gears could be visually identified by taking the gears under the one-third broken tooth state as the center.

In order to verify the effectiveness of the proposed MPE-LPP feature extraction method, the results were compared with those obtained using the feature extraction methods MPE-PCA and MPE. In all cases, the original vibration signals were denoised by CEEMDAN. The three-dimensional diagrams constructed from the first three dimensions of the MPE-PCA and MPE eigenvectors are shown in Figures 7 and 8. It can be seen in Figure 7 that, following the dimension reduction by PCA, the gear states with normal teeth, one-third broken teeth, and severe scratches were similar, and individual samples of the gears with one-third broken teeth overlapped with those of the gears with severe scratches. In Figure 8, it can be seen that the important information based on the multiscale permutation entropy was mainly contained in the first few scale factors. When combined with the data detailed in Table 1, it was determined that the classification effects of the reconstructed signals in the four states were better with the scale factors *f* = 2, 3, and 4, and those were therefore taken as the eigenvectors. In addition, as shown in Figure 8, some of the samples of the gears with one-third and two-thirds broken teeth were mixed up, and the clustering effects of the same states were found to be poor, with the distribution areas observed to be relatively scattered.
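For reference, the PCA baseline used in this comparison can be sketched as follows. The sketch computes the principal components through a singular value decomposition of the centered feature matrix, which is one standard way of performing PCA; the function name and this particular route are assumptions, since the text does not describe how PCA was implemented.

```python
import numpy as np


def pca_reduce(X, n_components=3):
    """Project X (n_samples, n_features) onto its leading principal components."""
    X = np.asarray(X, dtype=float)
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T
```

Unlike LPP, this projection is chosen only to preserve global variance and takes no account of which samples are neighbors, which is the distinction drawn between the two methods in the Introduction.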

In the present study, twenty groups of data in each of the four states of the spiral bevel gears were selected to construct eighty training samples. The eigenvectors were extracted using the feature extraction methods MPE-LPP, MPE-PCA, and MPE, and ELM training was then performed. In addition, twenty groups of data were likewise taken from each of the four states of the spiral bevel gears to form eighty testing samples for classification and diagnosis purposes, and the results are shown in Figure 9. The number of hidden layer neurons in the ELM algorithm was set as 14, and the testing sample categories were denoted as “1”, “2”, “3”, and “4”, respectively, in order to represent the four states of normal gears, one-third broken tooth gears, two-thirds broken tooth gears, and severely scratched gears. The diagnosis results are shown in Table 2. It can be seen in Figure 9 and Table 2 that the completely correct diagnosis of the four states could be realized by applying the three-dimensional eigenvectors optimized and dimensionally reduced by MPE-LPP. When MPE-PCA was used and the first three PCA components were likewise taken, one sample in the two-thirds broken tooth state was misclassified, and the average diagnostic accuracy was 98.75%. Then, this study selected the original high-dimensional eigenvectors of the MPE, which had 15 dimensions, and determined that there was still one misclassified sample in the two-thirds broken tooth state, giving an average diagnostic accuracy of 98.75%. However, the time consumption was 0.0078 seconds, which was significantly longer than that of the three-dimensional eigenvectors of the MPE-LPP and MPE-PCA extraction methods.
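As a quick check of the reported figures, one misclassified sample out of the eighty testing samples corresponds to the following average accuracy:

$$\text{accuracy} = \frac{80 - 1}{80} = \frac{79}{80} = 98.75\%.$$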

In order to further verify the effectiveness of the proposed method, the vibration signals of the spiral bevel gears with input shaft speeds of 1,200 r/min and 1,500 r/min were selected for further case analysis. Figures 10 and 11 show the three-dimensional diagrams of the three-dimensional eigenvectors of the MPE-LPP, MPE-PCA, and MPE extraction methods at the respective speeds. It can be seen in Figure 10 that the classification effects of the four states were still ideal in the three-dimensional diagram of the MPE-LPP, but some of the samples of the gears with one-third and two-thirds broken teeth were mixed up in the three-dimensional diagram of the MPE. In Figure 11, the clustering effects of the MPE-LPP eigenvectors were still found to be superior, whereas both MPE-PCA and MPE showed overlap between the one-third and two-thirds broken tooth states. This indicates that the clustering effects of the MPE-LPP eigenvectors were better than those of the MPE-PCA and MPE at speeds of 1,200 r/min and 1,500 r/min. The ELM fault identification results of the spiral bevel gears at different speeds are detailed in Table 3. Those results also reflect the superiority of the proposed MPE-LPP method in both fault diagnosis accuracy and diagnosis speed.

#### 4. Conclusions

In this study, a fault diagnosis method for spiral bevel gears was proposed based on MPE-LPP and ELM. The MPE values optimized and dimensionally reduced by LPP were used as the feature vectors, and an ELM was used as the classifier for fault mode identification. The vibration signals of spiral bevel gears in four states under three rotating speeds were analyzed, and the results of the proposed method were compared with those of the MPE-PCA-ELM and MPE-ELM recognition methods. It was determined that the proposed MPE-LPP-ELM method had obvious advantages in diagnostic accuracy and diagnostic speed in the fault diagnosis of spiral bevel gears when compared to the other examined methods. The proposed MPE-LPP-ELM recognition method can also be used for other diagnostic tasks, such as the diagnosis of spur gears and bearings, and even for other pattern recognition tasks. Therefore, further popularization and application of this method should be considered in future research.

#### Data Availability

The data are available upon request.

#### Conflicts of Interest

The authors declare that they have no conflicts of interest.

#### Acknowledgments

This work was supported by the National Natural Science Foundation of China (Grant no. 11872022) and the National Aeronautical Science Foundation of China (Grant no. 20200033116001).