#### Abstract

As the volume of data generated by rolling bearing condition monitoring grows, mining valuable information from massive data and identifying unknown bearing states has become a research hotspot in recent years. In Internet technology, collaborative filtering recommendation provides users with an intelligent means of filtering information. To address the difficulty of designing a recommendation system scoring matrix in the field of fault diagnosis, we first obtain a bearing feature matrix based on the wavelet frequency band energy, then design a scoring matrix that describes the bearing state, and finally combine these two matrices into a joint scoring matrix for bearing state identification. On this basis, a collaborative filtering recommendation system for bearing state identification is proposed, built on matrix factorization-based collaborative filtering and the gradient descent algorithm. The method is verified on two types of rolling bearing fault data: faults at different positions and different types of faults on the outer ring. The results show that the identification accuracy exceeds 90% in both cases.

#### 1. Introduction

Rolling bearings are the most widely used general mechanical parts in mechanical equipment. Whether a bearing is running normally often directly affects the performance of the whole machine, so condition monitoring and fault diagnosis of rolling bearings are of great practical significance [1, 2]. As the volume of condition monitoring data grows, the problem of information overload is becoming more prominent. Mining valuable information from massive data and identifying unknown bearing states has become a research hotspot in recent years.

In Internet technology, the recommendation system is an effective solution to the problem of information overload. Its design goal is to provide users with an intelligent means of filtering information when they are unable to process massive amounts of information [3–6]. Specifically, the recommendation system learns the user’s interests and behavior patterns by collecting and analyzing various user data and then recommends the information and services the user needs.

Among the many recommendation methods, collaborative filtering is currently the most widely used. Its core idea [7] is to analyze user interests based on the user’s behavior records, find neighbor users in the user group whose interests are similar to those of the target user, and then integrate these neighbors’ evaluations of certain information to form the system’s prediction of the target user’s preferences. Finally, the system makes corresponding recommendations based on these predicted preferences. For the black-start decision problem, Leng et al. [8] proposed a novel black-start decision-making method based on collaborative filtering. In the proposed method, values in the decision matrix are withheld, and the collaborative filtering technique is adopted to predict the withheld values. Based on the predicted values and true values, the Mean Absolute Error (MAE) weights of all the indexes are obtained. Finally, the MAE weights are used to compute the overall assessment value of each black-start scheme. To address the overload of electronic medical data and the serious shortage of medical resources, Ma et al. [9] combined Skyline queries with a collaborative filtering-based scoring method in local areas and proposed a recommendation algorithm for intelligent personalized guidance. Hu et al. [10] explored the Gene Interest (GI) of individual patients and proposed a novel TOP-N Gene-based Collaborative Filtering (GeneCF) algorithm that recommends genes to patients based on their GI. The GeneCF algorithm aims to match more accurate gene recommendations to patients and achieves exceptional precision and coverage. Collaborative filtering recommendation technology has thus achieved varying degrees of success in various fields in dealing with information overload and personalized recommendation.

However, in the field of mechanical fault diagnosis, it is difficult to design a reasonable scoring matrix, and the application of collaborative filtering recommendation technology is still in its infancy. Xu [11] applied collaborative filtering theory to the fault diagnosis of civil aircraft. Similarities between faults are calculated by the Pearson method and the vector cosine method, and the concepts of metasimilarity and weight are introduced to remedy the defects of the collaborative filtering method. Guo [12] made failure recommendations for online electric multiple units by using collaborative filtering algorithms based on real-time status data and further provided accessible solutions for failures through a solution knowledge base. However, the above methods all belong to memory-based collaborative filtering, which relies on calculating the similarity of faults. The sparseness of fault data often leads to inaccurate similarity calculation, so that more universal conditions and more general faults cannot be identified.

To address the difficulty of designing a recommendation system scoring matrix in the field of fault diagnosis, we first obtain a bearing feature matrix based on the wavelet frequency band energy, then design a scoring matrix that describes the bearing state, and finally combine these two matrices into a joint scoring matrix for bearing state identification. On this basis, a collaborative filtering recommendation system for bearing state identification is proposed, built on matrix factorization-based collaborative filtering and the gradient descent algorithm. The method is verified on two types of rolling bearing fault data: faults at different positions and different types of faults on the outer ring. The results show that the identification accuracy exceeds 90% in both cases.

#### 2. Matrix Factorization-Based Collaborative Filtering

Collaborative filtering can usually be divided into two categories [13–15]: memory-based collaborative filtering and model-based collaborative filtering. Matrix factorization is an important method in model-based collaborative filtering.

In the following, the movie recommendation system is taken as an example to introduce the matrix factorization-based collaborative filtering [16, 17].

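Table 1 is the “User-Movie” score table, from which the “User-Movie” scoring matrix can be obtained:

$$\mathbf{R} = \begin{bmatrix} r_{11} & r_{12} & \cdots & r_{1 n_m} \\ r_{21} & r_{22} & \cdots & r_{2 n_m} \\ \vdots & \vdots & \ddots & \vdots \\ r_{n_u 1} & r_{n_u 2} & \cdots & r_{n_u n_m} \end{bmatrix}, \tag{1}$$

where $n_m$ is the number of movies and $n_u$ is the number of users.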

The element in the *i*-th row and the *j*-th column of **R** (denoted as $r_{ij}$) is the score of user *i* for movie *j*. User *i* may not have scored all movies, and the goal of the movie recommendation system is to predict the user’s score for each movie whose score is missing and to make recommendations to the user accordingly.
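A scoring matrix with missing entries can be sketched as follows. This is only an illustration and does not reproduce Table 1: the scores and the names `R`, `observed_entries`, and `missing_entries` are hypothetical, and `None` is assumed to mark a movie that the user has not scored.

```python
# Hypothetical "User-Movie" scores: rows are users, columns are movies.
# None marks a missing score that the recommender should predict.
R = [
    [5.0, None, 1.0],
    [4.0, 2.0, None],
    [None, 1.0, 5.0],
]

def observed_entries(R):
    """Return the (user i, movie j, score) records that exist."""
    return [(i, j, r) for i, row in enumerate(R)
            for j, r in enumerate(row) if r is not None]

def missing_entries(R):
    """Return the (user i, movie j) pairs whose score must be predicted."""
    return [(i, j) for i, row in enumerate(R)
            for j, r in enumerate(row) if r is None]
```

Only the observed records are used to learn the model; the missing pairs are the ones for which a predicted score is later produced.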

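The idea of matrix factorization-based collaborative filtering is to factorize the higher-dimensional “User-Movie” scoring matrix into the product of two lower-dimensional matrices: the user feature matrix and the movie feature matrix. Suppose $\boldsymbol{\Theta} \in \mathbb{R}^{n_u \times k}$ is the user feature matrix and $\mathbf{X} \in \mathbb{R}^{n_m \times k}$ is the movie feature matrix, where $n_u$ is the number of users, $n_m$ is the number of movies, and $k$ is the dimension of the feature vectors. The *i*-th row of $\boldsymbol{\Theta}$ is the feature vector of user *i* (denoted as $\boldsymbol{\theta}_i$). The *j*-th row of **X** is the feature vector of movie *j* (denoted as $\mathbf{x}_j$). The factorization of **R** is shown in the following equation:

$$\mathbf{R} \approx \boldsymbol{\Theta}\mathbf{X}^{T}. \tag{2}$$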

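The main task of matrix factorization-based collaborative filtering is the learning of the model, that is, using the existing scoring data to learn the best user feature matrix $\boldsymbol{\Theta}$ and movie feature matrix **X**. Once the two matrices are obtained, they can be used to get user *i*’s predicted score for movie *j*:

$$\hat{r}_{ij} = \boldsymbol{\theta}_i \mathbf{x}_j^{T}, \tag{3}$$

where $\boldsymbol{\theta}_i$ and $\mathbf{x}_j$ are the feature (row) vectors of user *i* and movie *j*.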

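The loss function of a single scoring example (for existing scoring records) is defined by the square of the error as follows:

$$L_{ij} = \frac{1}{2}\left(\boldsymbol{\theta}_i \mathbf{x}_j^{T} - r_{ij}\right)^{2}, \tag{4}$$

where $r_{ij}$ is the existing score of user *i* for movie *j*, $\boldsymbol{\theta}_i \mathbf{x}_j^{T}$ is the corresponding predicted score, and the factor 1/2 simplifies the gradient.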

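For a sample set containing $N$ samples, the overall cost function is as follows:

$$J(\boldsymbol{\Theta}, \mathbf{X}) = \frac{1}{2}\sum_{(i,j) \in G}\left(\boldsymbol{\theta}_i \mathbf{x}_j^{T} - r_{ij}\right)^{2}, \tag{5}$$

where *G* is the set of all “User-Movie” score records in the sample set, so that $|G| = N$.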

The training of the matrix factorization model can be described by the following minimization problem:

$$\min_{\boldsymbol{\Theta}, \mathbf{X}} J(\boldsymbol{\Theta}, \mathbf{X}), \tag{6}$$

that is, looking for suitable parameters $\boldsymbol{\Theta}$ and **X** to minimize the overall cost function in equation (5).

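To prevent overfitting, a regularization term (penalty term) is added, and the complete description is

$$\min_{\boldsymbol{\Theta}, \mathbf{X}} \; \frac{1}{2}\sum_{(i,j) \in G}\left(\boldsymbol{\theta}_i \mathbf{x}_j^{T} - r_{ij}\right)^{2} + \frac{\lambda}{2}\left(\lVert \boldsymbol{\Theta} \rVert_F^{2} + \lVert \mathbf{X} \rVert_F^{2}\right), \tag{7}$$

where $\lambda$ is the regularization coefficient and *G* is the set of existing score records.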
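The regularized cost can be evaluated with a few lines of code. The sketch below assumes a common form of the objective: half the summed squared error over the existing records plus an L2 penalty of weight `lam / 2` on both feature matrices. The names `predict`, `regularized_cost`, `Theta`, `X`, `G`, and `lam`, as well as the numbers, are hypothetical.

```python
def predict(theta_i, x_j):
    """Predicted score: inner product of a user vector and a movie vector."""
    return sum(t * x for t, x in zip(theta_i, x_j))

def regularized_cost(Theta, X, G, lam):
    """Half the squared error over the records in G plus an L2 penalty (assumed form)."""
    error_term = sum((predict(Theta[i], X[j]) - r) ** 2 for i, j, r in G)
    penalty = sum(t * t for row in Theta for t in row)
    penalty += sum(x * x for row in X for x in row)
    return 0.5 * error_term + 0.5 * lam * penalty

# Hypothetical two-user, two-movie example with feature dimension 2.
Theta = [[1.0, 0.0], [0.0, 1.0]]
X = [[2.0, 0.0], [0.0, 3.0]]
G = [(0, 0, 2.0), (1, 1, 1.0)]  # (user, movie, score) records
cost = regularized_cost(Theta, X, G, lam=0.1)
```

Here the first record is predicted exactly and the second is off by 2, so the error term contributes 2.0 and the penalty contributes 0.75.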

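The gradient descent method is one of the methods used to deal with the above minimization problem. It proceeds as follows:

(1) Select the appropriate feature vector dimension *k* and the regularization coefficient $\lambda$. Initialize $\boldsymbol{\Theta}$ and **X** with small random numbers.

(2) For each sample in the sample set (user *i*, movie *j*, score $r_{ij}$):

(a) Calculate the prediction error:

$$e_{ij} = \boldsymbol{\theta}_i \mathbf{x}_j^{T} - r_{ij}. \tag{8}$$

(b) Update $\boldsymbol{\theta}_i$ and $\mathbf{x}_j$ as follows:

$$\boldsymbol{\theta}_i \leftarrow \boldsymbol{\theta}_i - \alpha\left(e_{ij}\mathbf{x}_j + \lambda\boldsymbol{\theta}_i\right), \qquad \mathbf{x}_j \leftarrow \mathbf{x}_j - \alpha\left(e_{ij}\boldsymbol{\theta}_i + \lambda\mathbf{x}_j\right), \tag{9}$$

where $\alpha$ is the learning rate, $\boldsymbol{\theta}_i$ and $\mathbf{x}_j$ are the feature vectors of user *i* and movie *j*, and both updates use the values from before the step.

(c) Repeat (a) and (b) over the sample set until the cost function converges.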
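The procedure above can be sketched as a per-sample (stochastic) gradient descent loop. This is a minimal illustration under stated assumptions, not the authors' implementation: the learning rate `alpha`, the fixed number of passes `epochs` (used instead of a convergence test), the initialization range, and the toy scores are all hypothetical choices.

```python
import random

def train_mf(G, n_users, n_items, k=2, lam=0.01, alpha=0.02, epochs=1000, seed=0):
    """Learn user vectors Theta and item vectors X from (i, j, score) records."""
    rng = random.Random(seed)
    # Step (1): initialize both feature matrices with small random numbers.
    Theta = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    X = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for i, j, r in G:
            # Step (a): prediction error for this record.
            err = sum(t * x for t, x in zip(Theta[i], X[j])) - r
            # Step (b): update both vectors using the values from before the step.
            for f in range(k):
                t, x = Theta[i][f], X[j][f]
                Theta[i][f] = t - alpha * (err * x + lam * t)
                X[j][f] = x - alpha * (err * t + lam * x)
    return Theta, X

# Hypothetical records: 3 users, 2 movies, scores on a 1-5 scale.
G = [(0, 0, 5.0), (0, 1, 1.0), (1, 0, 4.0), (1, 1, 1.0), (2, 0, 1.0), (2, 1, 5.0)]
Theta, X = train_mf(G, n_users=3, n_items=2)
max_err = max(abs(sum(t * x for t, x in zip(Theta[i], X[j])) - r) for i, j, r in G)
```

In practice the loop would stop once the cost changes by less than a tolerance between passes; a fixed number of passes is used here only to keep the sketch short.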

#### 3. Collaborative Filtering Recommendation System for Bearing State Identification

Matrix factorization-based collaborative filtering is built on a scoring matrix. However, for the identification of the state of a rolling bearing, there is no specific scoring rule, so the corresponding scoring matrix cannot be established directly. Because the wavelet frequency band energy reflects the state of the rolling bearing well, this paper obtains the bearing feature matrix from the wavelet frequency band energy. Because the level of a score reflects the degree of “likes” very well, this paper also designs a scoring matrix that describes the bearing state. Finally, the two matrices are combined to obtain the joint scoring matrix for bearing state identification. On the basis of the joint scoring matrix, this paper proposes a collaborative filtering recommendation system for bearing state identification, based on matrix factorization-based collaborative filtering and the gradient descent algorithm.

Assume that there are a total of *u* sets of rolling bearing vibration signal data $x_1, x_2, \ldots, x_u$ and that these rolling bearings have a total of $v$ different types of states $s_1, s_2, \ldots, s_v$. The states of the first *h* sets of training data $x_1, \ldots, x_h$ are known, and the collaborative filtering recommendation system is used to identify the states of the last $u - h$ sets of test data $x_{h+1}, \ldots, x_u$.

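Decompose the signal data $x_i$ into *a* layers using wavelet packets to obtain a total of $2^a$ subbands. The signal can then be expressed as follows [18]:

$$x_i(t) = \sum_{j=1}^{2^a} x_{i,j}(t), \tag{10}$$

where $x_{i,j}(t)$ is the reconstructed signal of the *j*-th subband.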

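Let the subband signal $x_{i,j}(t)$ correspond to the energy $E_{i,j}$, and then

$$E_{i,j} = \int \left| x_{i,j}(t) \right|^{2} \, dt = \sum_{n} \left| x_{i,j}(n) \right|^{2}, \tag{11}$$

where $x_{i,j}(n)$ are the amplitudes of the discrete points of the subband signal.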

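The total energy of the signal is

$$E_i = \sum_{j=1}^{2^a} E_{i,j}, \tag{12}$$

where $E_{i,j}$ is the energy of the *j*-th of the $2^a$ subbands.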

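The normalized energy feature vector is constructed as follows:

$$\mathbf{a}_i = \left[ a_{i,1}, a_{i,2}, \ldots, a_{i,2^a} \right], \tag{13}$$

where $a_{i,j} = E_{i,j} / E_i$ is the share of the total energy $E_i$ carried by the *j*-th subband and $\sum_{j=1}^{2^a} a_{i,j} = 1$.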
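The energy feature can be sketched in a few lines of code. The example below is a simplified illustration: it uses the Haar filter pair for the wavelet packet split to stay dependency-free, which is an assumption rather than the wavelet basis used in the experiments, and it computes energies from the subband coefficients, which equals the signal energy for an orthonormal filter pair. All function names and the test signal are hypothetical.

```python
import math

def haar_split(signal):
    """One wavelet packet split: return (approximation, detail) at half length."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[n] + signal[n + 1]) * s for n in range(0, len(signal) - 1, 2)]
    detail = [(signal[n] - signal[n + 1]) * s for n in range(0, len(signal) - 1, 2)]
    return approx, detail

def wavelet_packet_subbands(signal, levels):
    """Split every node at every level, giving 2**levels coefficient lists.

    The subbands come out in natural (filter-bank) order, not frequency order.
    """
    nodes = [list(signal)]
    for _ in range(levels):
        nodes = [band for node in nodes for band in haar_split(node)]
    return nodes

def energy_feature_vector(signal, levels):
    """Normalized subband energies; the entries sum to 1."""
    bands = wavelet_packet_subbands(signal, levels)
    energies = [sum(c * c for c in band) for band in bands]
    total = sum(energies)
    return [e / total for e in energies]

# Hypothetical test signal: a sine wave with 64 samples, decomposed into 3 layers.
signal = [math.sin(2 * math.pi * 4 * n / 64) for n in range(64)]
features = energy_feature_vector(signal, levels=3)
```

A three-layer decomposition yields 8 features per signal; a five-layer decomposition, as used in the experiments, would yield 32.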

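According to the feature vector, this paper designs a bearing feature score table, as shown in Table 2, and obtains the corresponding scoring matrix **A**, as shown in the following equation:

$$\mathbf{A} = \begin{bmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,2^a} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,2^a} \\ \vdots & \vdots & \ddots & \vdots \\ a_{u,1} & a_{u,2} & \cdots & a_{u,2^a} \end{bmatrix}, \tag{14}$$

where row *i* holds the normalized subband energies of the *i*-th of the *u* data sets and $2^a$ is the number of subbands.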

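According to the corresponding state of the bearing, this paper designs the state score table of the bearing, as shown in Table 3, and obtains the corresponding scoring matrix **B**, as in equation (15). For each set of training data, the score of its actual state is given the maximum value of 1, while the score of every other state is given a small value $\varepsilon$. For each set of test data, the score for every state is unknown and is given a value of 0:

$$\mathbf{B} = \left( b_{i,j} \right) \in \mathbb{R}^{u \times v}, \qquad b_{i,j} = \begin{cases} 1, & i \le h \text{ and } x_i \text{ is in state } s_j, \\ \varepsilon, & i \le h \text{ and } x_i \text{ is not in state } s_j, \\ 0, & i > h, \end{cases} \tag{15}$$

where *u* is the number of data sets, $v$ is the number of states, and the first *h* data sets are the training data.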

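In this paper, the bearing feature scoring matrix **A** and the bearing state scoring matrix **B** are combined to obtain the joint scoring matrix **C** for bearing state identification, as shown in the following equation:

$$\mathbf{C} = \begin{bmatrix} \mathbf{A} & \mathbf{B} \end{bmatrix}, \tag{16}$$

where each row of **C** is the corresponding row of **A** followed by the corresponding row of **B**, so that the number of columns of **C** is the number of subband features plus the number of states.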
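Building the joint scoring matrix amounts to a row-wise concatenation. The sketch below assumes that training rows come first and test rows last, and it uses `eps = 0.01` as the small score for states a training sample is not in; that value, the function names, and the feature scores in `A` are hypothetical.

```python
def state_score_matrix(train_states, n_test, n_states, eps=0.01):
    """State scores: 1 for the known state of a training row, eps for the
    other states, and 0 everywhere for test rows (score unknown)."""
    B = [[1.0 if s == state else eps for s in range(n_states)]
         for state in train_states]
    B += [[0.0] * n_states for _ in range(n_test)]
    return B

def joint_score_matrix(A, B):
    """Concatenate feature scores and state scores row by row: C = [A B]."""
    return [a_row + b_row for a_row, b_row in zip(A, B)]

# Hypothetical data: two training rows (states 0 and 1) and one test row,
# each described by two normalized energy features.
A = [[0.7, 0.3], [0.2, 0.8], [0.6, 0.4]]
B = state_score_matrix(train_states=[0, 1], n_test=1, n_states=2)
C = joint_score_matrix(A, B)
```

The state columns of the test rows are the entries that the factorization is later asked to predict.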

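Our goal is to factorize the joint scoring matrix **C** for bearing state identification into the product of two feature matrices $\boldsymbol{\Theta}$ and **X**, as shown in the following equation:

$$\mathbf{C} \approx \boldsymbol{\Theta}\mathbf{X}^{T}, \tag{17}$$

where each row of $\boldsymbol{\Theta}$ is the *k*-dimensional feature vector of one data set (one row of **C**) and each row of **X** is the *k*-dimensional feature vector of one column of **C**.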

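Find the optimal parameters $\boldsymbol{\Theta}$ and **X** using the following equation to minimize the overall cost function $J$:

$$\min_{\boldsymbol{\Theta}, \mathbf{X}} J(\boldsymbol{\Theta}, \mathbf{X}) = \frac{1}{2}\sum_{(i,j) \in G}\left(\boldsymbol{\theta}_i \mathbf{x}_j^{T} - c_{i,j}\right)^{2} + \frac{\lambda}{2}\left(\lVert \boldsymbol{\Theta} \rVert_F^{2} + \lVert \mathbf{X} \rVert_F^{2}\right), \tag{18}$$

where $c_{i,j}$ is an element of **C**, *G* is the set of known scores in **C**, $\boldsymbol{\theta}_i$ and $\mathbf{x}_j$ are the *i*-th row of $\boldsymbol{\Theta}$ and the *j*-th row of **X**, and $\lambda$ is the regularization coefficient.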

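Finally, the gradient descent method is used to optimize the parameters, and then the predicted score of the test data $x_i$ for the state $s_j$ is obtained:

$$\hat{c}_{i,j'} = \boldsymbol{\theta}_i \mathbf{x}_{j'}^{T}, \tag{19}$$

where $j'$ is the column of the joint scoring matrix that corresponds to the state $s_j$, $\boldsymbol{\theta}_i$ is the learned feature vector of $x_i$, and $\mathbf{x}_{j'}$ is the learned feature vector of that column.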

Then, the state with the highest predicted score is taken as the identified state of the test data.
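The final identification step is an argmax over the predicted state scores. The sketch below assumes that the state columns follow the feature columns in the joint matrix, so the state vectors are the last rows of `X`; the function name and the hand-picked values of `Theta` and `X` are hypothetical and stand in for learned parameters.

```python
def identify_state(Theta, X, row, n_features, n_states):
    """Return (index of the best state, list of predicted state scores) for one row."""
    scores = [sum(t * x for t, x in zip(Theta[row], X[n_features + j]))
              for j in range(n_states)]
    best = max(range(n_states), key=lambda j: scores[j])
    return best, scores

# Hand-picked stand-ins for learned parameters: one feature column, two states.
Theta = [[1.0, 0.0], [0.0, 1.0]]
X = [[0.5, 0.5],   # vector of the feature column
     [0.9, 0.1],   # vector of the column for state 0
     [0.2, 0.8]]   # vector of the column for state 1
state_a, scores_a = identify_state(Theta, X, row=0, n_features=1, n_states=2)
state_b, scores_b = identify_state(Theta, X, row=1, n_features=1, n_states=2)
```

Returning the full score list alongside the argmax keeps the per-state scores available, which is what the predicted-score tables report.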

#### 4. Instance Verification

##### 4.1. Instance Verification of Different Fault Forms Identification of Bearing Outer Ring

Deep groove ball bearings of type 6205EKA are selected in four states: outer ring pitting (Figure 1(a)), outer ring crack (Figure 1(b)), outer ring current damage (Figure 1(c)), and normal. At motor speeds of 600 rpm and 1200 rpm, a total of 424 sets of time-series data samples were obtained by using the bearing test bench shown in Figure 2(a): 95 sets of outer ring pitting, 96 sets of outer ring crack, 139 sets of current damage, and 94 sets of normal. Vibration acceleration sensors are arranged as shown in Figure 2(b), and B&K’s PULSE data acquisition system is used. The sampling frequency is 16384 Hz, and the sampling time is 10 s.

The 424 sets of data samples are randomly divided into three parts: a training set (255 sets), a cross-validation set (84 sets), and a test set (85 sets). Following the design method of the state identification scoring matrix proposed in this paper, the Daubechies wavelet (D2) is selected as the wavelet basis based on experience, a five-layer wavelet packet decomposition is carried out, and the state identification scoring matrix of each of the three data sets is obtained. Then, the bearing state is identified using the method proposed in this paper.
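The random division can be sketched as a shuffled index split. The part sizes 255, 84, and 85 sum to 424, which matches the 95 + 96 + 139 + 94 samples collected. The function name and the fixed seed are hypothetical; the source does not state how the random split was generated.

```python
import random

def split_indices(n_total, n_train, n_cv, n_test, seed=0):
    """Shuffle sample indices and cut them into training, cross-validation, and test parts."""
    if n_train + n_cv + n_test != n_total:
        raise ValueError("part sizes must sum to the total number of samples")
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)  # seeded only for reproducibility
    train = indices[:n_train]
    cv = indices[n_train:n_train + n_cv]
    test = indices[n_train + n_cv:]
    return train, cv, test

train_idx, cv_idx, test_idx = split_indices(424, 255, 84, 85)
```

The same helper covers a 450-sample data set split into 270, 90, and 90 by changing the arguments.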

Table 4 shows the joint score for cross-validation set bearing status identification (*x*1 to *x*85 is the cross-validation set, and *x*86 to *x*340 is the training set).

The model is learned on the training set with different regularization coefficients and feature numbers *k*, and the cross-validation set is used for state identification; the resulting identification rates are shown in Table 5 and Figure 3. According to this, when , or , the state identification rate on the cross-validation set reached 92.86%.
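The parameter selection can be sketched as a grid search that keeps every parameter pair tied for the best cross-validation rate. The reported 92.86% is consistent with 78 correct identifications out of the 84 cross-validation sets, since 78/84 ≈ 0.9286. In the sketch below, the function names are hypothetical and `toy_rates` holds made-up placeholder rates that only exercise the search; they are not results from Table 5.

```python
def identification_rate(predicted, actual):
    """Percentage of samples whose predicted state matches the actual state."""
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return 100.0 * correct / len(actual)

def grid_search(lams, ks, evaluate):
    """Evaluate every (lam, k) pair and return the best rate with all pairs that reach it.

    evaluate(lam, k) is expected to train on the training set and return the
    identification rate on the cross-validation set.
    """
    results = {(lam, k): evaluate(lam, k) for lam in lams for k in ks}
    best_rate = max(results.values())
    best_params = [params for params, rate in results.items() if rate == best_rate]
    return best_rate, best_params

# Placeholder evaluator with made-up rates, used only to exercise the search.
toy_rates = {(0.01, 5): 80.0, (0.01, 10): 90.0, (0.1, 5): 90.0, (0.1, 10): 85.0}
best_rate, best_params = grid_search([0.01, 0.1], [5, 10],
                                     lambda lam, k: toy_rates[(lam, k)])
```

Returning all tied pairs reflects the situation described above, where more than one parameter setting reaches the best cross-validation rate and each is then checked on the test set.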

Table 6 shows the predicted bearing state scores for the cross-validation set when .

Select , and , respectively, to evaluate the performance of the model on the test set; the recognition rates obtained are 91.76% and 87.06%, respectively, which indicates that the model generalizes well under these parameters. Taking , as an example, Table 7 shows the specific identification results of the model for the various states on the test set.

##### 4.2. Instance Verification of Fault Identification in Different Locations

To further verify the effectiveness of the fault identification method proposed in this paper, this section identifies bearing faults at different locations. The data are public data provided by the Case Western Reserve University Bearing Data Center. The vibration acceleration data of the drive end bearing at a sampling frequency of 12000 Hz are selected. A total of 450 sets of time-series data samples were sorted out, covering four types of data: normal (10 sets), inner ring fault (160 sets), ball fault (160 sets), and outer ring fault (120 sets). The 450 sets of data samples are randomly divided into three parts: a training set (270 sets), a cross-validation set (90 sets), and a test set (90 sets). Following the design method of the state identification scoring matrix proposed in this paper, the Daubechies wavelet (D2) is selected as the wavelet basis based on experience, a five-layer wavelet packet decomposition is carried out, and the state identification scoring matrix of each of the three data sets is obtained. Then, the bearing state is identified using the method proposed in this paper.

Table 8 shows the joint score for cross-validation set bearing status identification (*x*1 to *x*90 is the cross-validation set, and *x*91 to *x*360 is the training set).

The model is learned on the training set with different regularization coefficients and feature numbers *k*, and the cross-validation set is used for state identification; the resulting identification rates are shown in Table 9 and Figure 4. According to this, when , or , the state identification rate on the cross-validation set reached 94.44%.

Table 10 shows the predicted bearing state scores for the cross-validation set when .

Select , and , respectively, to evaluate the performance of the model on the test set; the recognition rates obtained are 94.44% and 92.22%, respectively, which indicates that the model generalizes well under these parameters. Taking , as an example, Table 11 shows the specific identification results of the model for the various states on the test set.

#### 5. Discussion

In the above verification experiments, the identification rate of the bearing state exceeded 90%. To further improve the identification accuracy of the algorithm, future work will focus on three aspects: expanding the number of samples, further optimizing the model parameters, and establishing a bearing state feature score table with more comprehensive information (not limited to wavelet energy features).

The bearing fault identification method based on collaborative filtering recommendation technology proposed in this paper has a distinctive fault scoring mechanism. The level of the score for each state of the bearing represents the sensitivity to that state, the same bearing has a score for every state, and new data samples are easily integrated into an existing fault identification model to improve its accuracy. As mechanical equipment produces more and more information, the amount of data grows, the mechanisms of fault occurrence become more complicated, and multiple faults frequently occur together. Under these conditions, the method has good prospects for identifying the severity of a fault (through the level of the score) and multiple simultaneous faults (through the scores of the different states).

#### 6. Conclusions

This paper applies collaborative filtering technology to the field of mechanical equipment fault identification. Section 2 takes the movie recommendation system as an example to introduce matrix factorization-based collaborative filtering. Section 3 proposes a collaborative filtering recommendation algorithm for fault state recognition based on matrix factorization. Section 4 carries out instance verifications. Section 5 discusses ways to improve the method as well as its advantages and potential. For the identification of the rolling bearing state, this paper proposes a design method for the scoring matrix. First, the bearing feature matrix is obtained from the wavelet frequency band energy. Then, since the level of a score reflects the degree of “likes” very well, a scoring matrix that describes the bearing state is designed. Finally, the two matrices are combined into a joint scoring matrix for bearing state identification. On the basis of the joint scoring matrix, this paper proposes a collaborative filtering recommendation system for bearing state identification, based on matrix factorization-based collaborative filtering and the gradient descent algorithm.

Experiments were carried out on normal bearings and on bearings with pitting, crack, and current damage on the outer ring, and the vibration signal data were obtained. Together with existing vibration data for outer ring, ball, and inner ring faults of rolling bearings, these data were used to verify the method proposed in this paper. The model was learned on the training set with different regularization coefficients and numbers of features *k*, and the cross-validation set was used for state identification. The parameter sets with the highest identification rate on the cross-validation set were then applied to the test set to evaluate their generalization ability. The results show that the method can effectively identify the bearing state and that the optimized parameters have good generalization ability.

#### Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

#### Conflicts of Interest

The authors declare that they have no conflicts of interest regarding the publication of this paper.

#### Acknowledgments

This research was funded by the National Natural Science Foundation of China (grant nos. 51575178 and 51805161) and the Hunan Natural Science Foundation of China (grant no. 2018JJ2120).