Abstract

Preventive protection of cultural relics is one of the central concerns in archaeology and comprises environmental monitoring and the accurate prediction of cultural relic diseases. To address shortcomings in the analysis of cultural relic data and in the prediction of cultural relic diseases, a prediction model for diseases of immovable cultural relics based on the relevance vector machine (RVM) is proposed. First, the key factors affecting the diseases of immovable cultural relics are identified by principal component analysis, which also reduces the dimension of the data. Then, the RVM model is constructed within the Bayesian framework, and its hyperparameters are estimated by maximizing the marginal likelihood. Finally, the prediction accuracy of the model is compared with that of traditional disease prediction methods. The experimental results demonstrate that the proposed RVM-based approach not only yields a sparser model but also achieves better prediction accuracy than the traditional methods based on the radial basis function neural network and the support vector machine.

1. Introduction

Over the long course of human history, an immense body of valuable cultural heritage has accumulated. Protecting it is not only a precondition for maintaining the world's cultural diversity and passing on human civilization but also a shared human responsibility. Cultural heritage comprises material cultural heritage and intangible cultural heritage. Material cultural heritage mainly refers to cultural relics of historical, artistic, and scientific value, which are divided into movable and immovable cultural relics. Movable cultural relics include important works of art, documents, manuscripts, books and materials, and representative objects [1, 2]. Immovable cultural relics include ancient cultural sites, ancient tombs, ancient buildings, grottoes and temples, stone carvings, murals, and important modern historical sites and representative buildings [3, 4]. An example of an immovable cultural relic is shown in Figure 1. Nearly 770,000 immovable cultural relics are registered in China, including 2,352 national key cultural relics protection units [5]. Cultural relics are treasures of humankind, and protecting them is a duty that should not be neglected.

Because immovable cultural relics are exposed to their surroundings over long periods, their diseases are strongly affected by both human and natural factors [6]. With the accelerating pace of modernization, many large-scale construction projects have been launched, and precious immovable cultural relics are often damaged or destroyed in the course of construction [7, 8]. The state now attaches importance to the protection of immovable cultural relics and has formulated relevant laws and regulations to reduce such damage. Although these measures have achieved remarkable results [9], the impact of natural conditions remains severe.

Preventive protection of immovable cultural relics refers to predicting the crack opening of cultural relics by monitoring and analysing natural conditions, such as the surrounding climate, in the early stage of disease [10, 11]. A preventive protection system therefore comprises environmental monitoring of immovable cultural relics and accurate prediction of their diseases. With current technology, environmental monitoring is mature, whereas the analysis of environmental data and the accurate prediction of cultural relic diseases have become active research topics [12]. However, environmental data are usually high dimensional, strongly intercorrelated, and nonlinear, which makes disease prediction difficult [13]. Because many natural factors affect the diseases of immovable cultural relics, the most important factors must be selected through dimension reduction. The most popular algorithm for this purpose is principal component analysis (PCA) [14], which applies an orthogonal transformation to the original, mutually correlated variables to obtain a smaller set of uncorrelated new variables that retain as much of the original information as possible [15].

Machine learning algorithms for cultural relic disease prediction have been favoured by many researchers [16]; for example, the radial basis function (RBF) neural network [17] and the support vector machine (SVM) [18] have both been studied. However, each has its own range of application and its own shortcomings. The RBF neural network has strong nonlinear mapping and generalization abilities and can therefore be used for prediction in complex settings. However, because the minimum of its objective function is sought by gradient descent, it easily falls into a local optimum, and its network structure is complex. The SVM, on the other hand, builds on VC dimension theory and structural risk minimization, which avoids the curse of dimensionality as well as overfitting and local optima. However, its kernel function is restricted by the Mercer condition, its performance is sensitive to parameter settings, and its training time grows with the number of samples [19]. Therefore, a simple and effective method for predicting the diseases of cultural relics is needed.

The relevance vector machine (RVM) is an efficient machine learning algorithm proposed by Tipping in 2000 [20]. It ensures the sparsity of the model by placing a zero-mean Gaussian prior, governed by hyperparameters, on the weight vector [21, 22]. The hyperparameters can be estimated by maximizing the marginal likelihood [23]. By combining Bayesian theory with maximum likelihood estimation, the RVM can accurately predict quantities related to safety and reliability in engineering problems, such as aero-engine fractures and cultural relic fractures [24, 25]. In this paper, an RVM-based crack prediction method for immovable cultural relics is proposed.

Compared with existing disease prediction algorithms for immovable cultural relics, this paper makes the following main contributions:

(1) The key factors affecting the diseases of immovable cultural relics are identified by PCA, which also reduces the dimension of the data.

(2) An RVM-based disease prediction method for immovable cultural relics is proposed, and the prediction model is established to realize accurate prediction.

(3) The proposed RVM-based approach is compared with the traditional RBF-based and SVM-based methods, and it is shown to yield a sparser model and higher accuracy than these traditional machine learning algorithms.

2. RVM-Based Immovable Cultural Relics Disease Prediction

2.1. Normalization

Normalization is an important method for making data dimensionless [26]. By a simple calculation, it transforms a dimensioned expression into a dimensionless one [27], so that absolute values of physical quantities become relative values and the trends of quantities with different ranges can be compared intuitively [28]. The normalization formula is as follows [29]:

$$x'_{ij} = \frac{x_{ij} - x_{j,\min}}{x_{j,\max} - x_{j,\min}}, \quad i = 1, 2, \ldots, n, \quad j = 1, 2, \ldots, m,$$

where $x'_{ij}$ is the value after normalization, $x_{ij}$ is the value before normalization, $x_{j,\max}$ and $x_{j,\min}$ are the maximum value and the minimum value of the $j$-th feature before normalization, respectively, $n$ is the number of data samples (including training samples and test samples), and $m$ is the number of features of the original data.
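The min-max normalization described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: the function name `min_max_normalize`, the handling of constant columns, and the sample values are all assumptions made for the example.

```python
import numpy as np

def min_max_normalize(X):
    """Rescale each column (feature) of X to the range [0, 1]."""
    X = np.asarray(X, dtype=float)
    col_min = X.min(axis=0)
    col_max = X.max(axis=0)
    span = col_max - col_min
    # Assumption: a constant column is mapped to 0 instead of dividing by zero;
    # the text does not say how such columns are handled.
    span[span == 0] = 1.0
    return (X - col_min) / span

# Illustrative values only: rows are samples, columns are two features
# with very different magnitudes.
X = np.array([[1.0, 200.0],
              [3.0, 400.0],
              [5.0, 1000.0]])
X_norm = min_max_normalize(X)
```

Because the minimum and maximum are taken per column, features with very different magnitudes end up on a common scale while the ordering of values within each feature is preserved.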

2.2. Principal Component Analysis

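PCA identifies the principal components with the largest contribution rates from the eigenvalues of the correlation coefficient matrix, thereby reducing the dimension of the data. The normalized data are first standardized so that the mean value of each attribute is 0:

$$z_{ij} = \frac{x'_{ij} - \bar{x}_j}{s_j},$$

where $x'_{ij}$ is the $j$-th index of the $i$-th data sample after normalization, $\bar{x}_j$ is the mean value of the $j$-th feature over the $n$ samples, $\bar{x}_j = \frac{1}{n}\sum_{i=1}^{n} x'_{ij}$, and $s_j$ is the standard deviation of the sample, $s_j = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x'_{ij} - \bar{x}_j\right)^2}$.

The correlation coefficient matrix is constructed:

$$R = (r_{jk})_{m \times m}, \qquad r_{jk} = \frac{1}{n-1}\sum_{i=1}^{n} z_{ij} z_{ik}, \quad j, k = 1, 2, \ldots, m.$$

The eigenvalues $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_m \ge 0$ of the correlation coefficient matrix are calculated from $|R - \lambda I| = 0$, and the corresponding eigenvectors are denoted $u_k = (u_{k1}, u_{k2}, \ldots, u_{km})^{T}$. Each eigenvector defines a linear combination [30]:

$$F_k = u_{k1} z_1 + u_{k2} z_2 + \cdots + u_{km} z_m, \quad k = 1, 2, \ldots, m,$$

where $F_k$ is the $k$-th component and $z_j$ is the $j$-th standardized variable in the sample.

The contribution rate and cumulative contribution rate of the eigenvalues are calculated as follows:

$$\eta_k = \frac{\lambda_k}{\sum_{j=1}^{m} \lambda_j}, \qquad \eta_{\Sigma}(p) = \frac{\sum_{k=1}^{p} \lambda_k}{\sum_{j=1}^{m} \lambda_j},$$

where $p$ is the number of principal components [31].

The components corresponding to the first $p$ eigenvalues, whose cumulative contribution rate reaches a preset value, are selected as the principal components.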
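The PCA steps above (standardization, correlation matrix, eigen-decomposition, and selection by cumulative contribution rate) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the function name `pca_by_correlation`, the 0.95 threshold, and the synthetic data are hypothetical choices for the example.

```python
import numpy as np

def pca_by_correlation(X, threshold=0.95):
    """PCA on the correlation matrix; keep the first p components whose
    cumulative contribution rate reaches `threshold`."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Standardize each feature to zero mean and unit sample standard deviation.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    R = Z.T @ Z / (n - 1)                       # correlation coefficient matrix
    eigvals, eigvecs = np.linalg.eigh(R)        # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]           # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    contribution = eigvals / eigvals.sum()      # contribution rate per component
    cumulative = np.cumsum(contribution)        # cumulative contribution rate
    # Smallest p with cumulative[p - 1] >= threshold, capped at the feature count.
    p = min(int(np.searchsorted(cumulative, threshold)) + 1, len(eigvals))
    projection = eigvecs[:, :p]                 # columns are retained eigenvectors
    scores = Z @ projection                     # principal component scores
    return scores, projection, contribution, cumulative

# Synthetic, illustrative data: 6 features driven by 3 underlying factors.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 3))
X = np.hstack([base, base @ rng.normal(size=(3, 3)) + 0.01 * rng.normal(size=(200, 3))])
scores, projection, contribution, cumulative = pca_by_correlation(X, threshold=0.95)
p = scores.shape[1]
```

The returned `projection` matrix plays the role of the eigenvector table that relates each retained component to the original variables, and `scores` would be the reduced inputs passed on to the regression model.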

2.3. RVM Modeling

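As an efficient machine learning method, the RVM can be used for both classification and regression [32]. The relationship between the input and output of the RVM regression model can be expressed as follows:

$$t_i = y(x_i; w) + \varepsilon_i, \qquad y(x; w) = \sum_{i=1}^{N} w_i K(x, x_i) + w_0,$$

where $x$ is the input feature vector and $x \in \mathbb{R}^{d}$. $x_i$ represents the $i$-th input sample in the training set, and $i = 1, 2, \ldots, N$. $N$ is the number of input samples, and $\mathbb{R}^{d}$ is the $d$-dimensional real space. $t = (t_1, t_2, \ldots, t_N)^{T}$ is the target vector, and $t_i \in \mathbb{R}$. $y(x; w)$ is the output value determined by the weight vector $w = (w_0, w_1, \ldots, w_N)^{T}$. $K(x, x_i)$ is a kernel function. In the sparse Bayesian framework, the additive noise is assumed to be $\varepsilon_i \sim \mathcal{N}(0, \sigma^2)$, where $\mathcal{N}(\cdot)$ represents a Gaussian distribution and $\sigma^2$ is the variance of the Gaussian noise [33].

When the targets are independent of each other, the likelihood function of the sample set can be expressed as

$$p(t \mid w, \sigma^2) = (2\pi\sigma^2)^{-N/2} \exp\left(-\frac{\lVert t - \Phi w \rVert^2}{2\sigma^2}\right),$$

where $\Phi$ is the kernel matrix composed of kernel functions $K(x_i, x_j)$, namely, $\Phi = [\phi(x_1), \phi(x_2), \ldots, \phi(x_N)]^{T}$ and $\phi(x_i) = [1, K(x_i, x_1), K(x_i, x_2), \ldots, K(x_i, x_N)]^{T}$.

To avoid the overfitting that arises when $w$ and $\sigma^2$ are calculated directly by maximum likelihood estimation [34], a Gaussian prior distribution with a mean value of 0 and a precision parameter $\alpha_i$ is assigned to each weight $w_i$ [35]:

$$p(w \mid \alpha) = \prod_{i=0}^{N} \mathcal{N}\left(w_i \mid 0, \alpha_i^{-1}\right),$$

where $\alpha$ is the hyperparameter vector of $N + 1$ dimensions, $\alpha = (\alpha_0, \alpha_1, \ldots, \alpha_N)^{T}$.

According to Markov properties [36], for a new input vector $x_*$, the predictive distribution of the corresponding target $t_*$ is as follows [37]:

$$p(t_* \mid t) = \int p(t_* \mid w, \alpha, \sigma^2)\, p(w, \alpha, \sigma^2 \mid t)\, dw\, d\alpha\, d\sigma^2,$$

where $p(w, \alpha, \sigma^2 \mid t) = p(w \mid t, \alpha, \sigma^2)\, p(\alpha, \sigma^2 \mid t)$. However, $p(\alpha, \sigma^2 \mid t) \propto p(t \mid \alpha, \sigma^2)\, p(\alpha)\, p(\sigma^2)$ cannot be evaluated in closed form. The posterior distribution of the weights can be obtained by the following formula [38]:

$$p(w \mid t, \alpha, \sigma^2) = \frac{p(t \mid w, \sigma^2)\, p(w \mid \alpha)}{p(t \mid \alpha, \sigma^2)} = \mathcal{N}(w \mid \mu, \Sigma),$$

where the variance $\Sigma = (\sigma^{-2}\Phi^{T}\Phi + A)^{-1}$ and the parameter $A = \mathrm{diag}(\alpha_0, \alpha_1, \ldots, \alpha_N)$. The normalizing term is the marginal likelihood $p(t \mid \alpha, \sigma^2) = \mathcal{N}(t \mid 0, \sigma^2 I + \Phi A^{-1} \Phi^{T})$, where $I$ is the identity matrix.

The final approximate value is as follows [39]:

$$p(t_* \mid t) \approx \int p(t_* \mid w, \sigma^2_{MP})\, p(w \mid t, \alpha_{MP}, \sigma^2_{MP})\, dw,$$

where $\alpha_{MP}$ and $\sigma^2_{MP}$ are the maximum likelihood estimations of equation (10), which determine the optimal values of the model weights.

After $\alpha_{MP}$ and $\sigma^2_{MP}$ are obtained by the iterative method, the predicted value and prediction variance of the RVM are as follows:

$$y_* = \mu^{T} \phi(x_*), \qquad \sigma_*^2 = \sigma^2_{MP} + \phi(x_*)^{T}\, \Sigma\, \phi(x_*),$$

where the covariance $\Sigma = (\sigma_{MP}^{-2}\Phi^{T}\Phi + A)^{-1}$, the parameter $A = \mathrm{diag}(\alpha_{MP})$, and the posterior distribution mean $\mu = \sigma_{MP}^{-2}\, \Sigma\, \Phi^{T} t$.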

2.4. Disease Prediction for Immovable Cultural Relics

For the accurate prediction of immovable cultural relic diseases, a prediction method based on PCA dimension reduction and RVM regression is proposed. The specific process is shown in Figure 2.

First, the crack width and the environmental factors affecting the cracks of cultural relics (including ultraviolet intensity, precipitation, and wind speed) were monitored at Dafo temple in Binzhou City, Shaanxi Province. Since the magnitudes and units of the different environmental factors are not consistent, the data must be normalized so that they are rescaled to the range from 0 to 1 while retaining their trends. Because there are 13 kinds of environmental impact factors, the dimension of the data is then reduced by PCA to extract the most important factors affecting the cracks of immovable cultural relics as the sample inputs. The extracted principal component data are divided into training samples $\{X, t\}$ and test samples $\{X_*, t_*\}$, where $X$ and $X_*$ are the inputs of the training samples and test samples, respectively, and $t$ and $t_*$ are the outputs (crack width) of the training samples and test samples, respectively. Next, the Gaussian kernel function maps the training samples to a higher dimensional space, and finally, the Gaussian kernel matrix $K$ is formed. The Gaussian kernel function is

$$K(x_i, x_j) = \exp\left(-\frac{\lVert x_i - x_j \rVert^2}{2\delta^2}\right),$$

where $\delta$ is the width of the Gaussian kernel, $x_i$ and $x_j$ are the $i$-th and $j$-th training samples, respectively, $K$ is a matrix of size $N \times N$, namely, $K = [K(x_i, x_j)]_{N \times N}$, and $N$ is the number of training samples.

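The purpose of constructing the kernel function is to map the training inputs from a low-dimensional space to a high-dimensional space so as to obtain a better training effect [40]. The initial values of the hyperparameters $\alpha$ and the noise variance $\sigma^2$ are set, together with the maximum number of iterations. The following formulas are used to iterate $\alpha$ and $\sigma^2$:

$$\alpha_i^{new} = \frac{\gamma_i}{\mu_i^2}, \qquad (\sigma^2)^{new} = \frac{\lVert t - \Phi\mu \rVert^2}{N - \sum_i \gamma_i},$$

where $(\sigma^2)^{new}$ is the updated value of the noise variance and $\alpha_i^{new}$ is the updated value of the hyperparameter $\alpha_i$. The variable $\gamma_i$ measures how well the corresponding weight $w_i$ is determined by the data, $\gamma_i = 1 - \alpha_i \Sigma_{ii}$, and $\Sigma_{ii}$ is the $i$-th diagonal element of $\Sigma$. The covariance $\Sigma = (\sigma^{-2}\Phi^{T}\Phi + A)^{-1}$, and the parameter $A = \mathrm{diag}(\alpha_0, \alpha_1, \ldots, \alpha_N)$. The mean value of the posterior probability distribution is $\mu = \sigma^{-2}\, \Sigma\, \Phi^{T} t$, and $\mu_i$ is its $i$-th element. $t$ is the target vector. $N$ is the number of input samples. $\Phi$ is the kernel matrix composed of the kernel function $K(x_i, x_j)$; $\Phi = [\phi(x_1), \phi(x_2), \ldots, \phi(x_N)]^{T}$, and $\phi(x_i) = [1, K(x_i, x_1), K(x_i, x_2), \ldots, K(x_i, x_N)]^{T}$. After updating, some of the $\alpha_i$ tend to infinity and the corresponding $w_i$ are 0; the remaining $\alpha_i$ stay finite, and the corresponding $x_i$ are called the relevance vectors. After training is finished, the best $\alpha_{MP}$ and $\sigma^2_{MP}$ are obtained.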
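The iterative re-estimation described above can be sketched as follows. This is a hedged illustration of the standard RVM update rules, not the authors' implementation: the function name `train_rvm`, the default initial values, the cap `alpha_max` used to treat a hyperparameter as infinite, the relative-change tolerance `tol`, and the synthetic data are all assumptions. For brevity, the example uses a random design matrix with two truly relevant columns instead of a kernel matrix.

```python
import numpy as np

def train_rvm(Phi, t, alpha_init=1.0, sigma2_init=0.1,
              max_iter=1000, alpha_max=1e9, tol=1e-6):
    """Iteratively re-estimate alpha and sigma^2 for RVM regression."""
    Phi = np.asarray(Phi, dtype=float)
    t = np.asarray(t, dtype=float)
    N, M = Phi.shape
    alpha = np.full(M, alpha_init)
    sigma2 = sigma2_init
    active = np.ones(M, dtype=bool)          # basis functions not yet pruned

    def posterior(mask, s2):
        # Posterior covariance and mean over the weights that are still active.
        P = Phi[:, mask]
        Sigma = np.linalg.inv(P.T @ P / s2 + np.diag(alpha[mask]))
        mu = Sigma @ P.T @ t / s2
        return P, mu, Sigma

    for _ in range(max_iter):
        P, mu, Sigma = posterior(active, sigma2)
        gamma = 1.0 - alpha[active] * np.diag(Sigma)        # well-determinedness
        alpha_new = gamma / np.maximum(mu ** 2, 1e-30)      # alpha_i = gamma_i / mu_i^2
        resid = t - P @ mu
        sigma2_new = (resid @ resid) / max(N - gamma.sum(), 1e-12)
        converged = abs(sigma2_new - sigma2) / sigma2 < tol
        # Assumption: alpha values reaching alpha_max are treated as infinite,
        # so the corresponding weights are fixed at zero and no longer updated.
        alpha[active] = np.minimum(alpha_new, alpha_max)
        sigma2 = sigma2_new
        active = alpha < alpha_max
        if converged:
            break

    _, mu_active, Sigma = posterior(active, sigma2)
    mu = np.zeros(M)
    mu[active] = mu_active                   # pruned weights stay exactly zero
    return mu, Sigma, active, sigma2

# Synthetic, illustrative problem: only columns 2 and 7 carry signal.
rng = np.random.default_rng(0)
Phi = rng.normal(size=(60, 10))
w_true = np.zeros(10)
w_true[[2, 7]] = [1.5, -2.0]
t = Phi @ w_true + 0.05 * rng.normal(size=60)
mu, Sigma, active, sigma2 = train_rvm(Phi, t)
```

The columns left in `active` correspond to the relevance vectors when `Phi` is built from a kernel matrix, and the returned `Sigma` covers only those retained weights.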

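The test input samples are mapped to the higher dimensional space by the same Gaussian kernel function. The Gaussian kernel function is

$$K(x_{*i}, x_j) = \exp\left(-\frac{\lVert x_{*i} - x_j \rVert^2}{2\delta^2}\right),$$

where $\delta$ is the width of the Gaussian kernel, $x_{*i}$ is the value of the $i$-th test sample, $x_j$ is the $j$-th training sample, and $K_*$ is a matrix of size $N_* \times N$, namely, $K_* = [K(x_{*i}, x_j)]_{N_* \times N}$. $N$ is the number of training samples, and $N_*$ is the number of test samples.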

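The Gaussian kernel matrix of the test samples is constructed with the Gaussian kernel function: $\Phi_*$ is the kernel matrix composed of the kernel function $K(x_{*i}, x_j)$, namely, $\Phi_* = [\phi(x_{*1}), \phi(x_{*2}), \ldots, \phi(x_{*N_*})]^{T}$ and $\phi(x_{*i}) = [1, K(x_{*i}, x_1), K(x_{*i}, x_2), \ldots, K(x_{*i}, x_N)]^{T}$.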
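Building the training and test kernel matrices can be sketched as follows. This is a minimal sketch under stated assumptions: the kernel is parametrized as exp(-||a - b||^2 / (2 * width^2)), which is one common form of the Gaussian kernel, and the helper names `gaussian_kernel_matrix` and `add_bias_column`, the optional bias column, and the sample points are hypothetical.

```python
import numpy as np

def gaussian_kernel_matrix(X_a, X_b, width):
    """Kernel matrix with entries exp(-||a - b||^2 / (2 * width^2)).
    The 2 * width^2 parametrization is an assumption of this sketch."""
    X_a = np.asarray(X_a, dtype=float)
    X_b = np.asarray(X_b, dtype=float)
    # Squared Euclidean distances between every row of X_a and every row of X_b.
    sq = ((X_a ** 2).sum(axis=1)[:, None]
          + (X_b ** 2).sum(axis=1)[None, :]
          - 2.0 * X_a @ X_b.T)
    sq = np.maximum(sq, 0.0)      # clip tiny negatives caused by round-off
    return np.exp(-sq / (2.0 * width ** 2))

def add_bias_column(K):
    """Prepend a column of ones for the bias weight (an optional modelling choice)."""
    return np.hstack([np.ones((K.shape[0], 1)), K])

# Illustrative points only.
X_train = np.array([[0.0, 0.0], [3.0, 4.0]])
X_test = np.array([[0.0, 0.0]])
K_train = gaussian_kernel_matrix(X_train, X_train, width=5.0)   # shape (2, 2)
K_test = gaussian_kernel_matrix(X_test, X_train, width=5.0)     # shape (1, 2)
Phi_test = add_bias_column(K_test)                              # shape (1, 3)
```

The test kernel matrix always measures test samples against the training samples, so its column count matches the training set, which is what lets the trained weights be applied to it directly.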

The best $\alpha_{MP}$ and $\sigma^2_{MP}$ are determined by the trained RVM model, and then, the predicted value $y_*$ and the variance $\sigma_*^2$ can be obtained by the prediction formulas of Section 2.3 (equations (12) and (13)).
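The prediction step can be sketched as follows, assuming the posterior mean `mu`, the posterior covariance `Sigma`, and the noise variance `sigma2` are already available from training. The function name `rvm_predict` and all numerical values are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
import numpy as np

def rvm_predict(Phi_star, mu, Sigma, sigma2):
    """Predictive mean and variance for each row phi of Phi_star:
    mean = phi^T mu, variance = sigma2 + phi^T Sigma phi."""
    Phi_star = np.asarray(Phi_star, dtype=float)
    mean = Phi_star @ mu
    # Row-wise quadratic form phi^T Sigma phi without forming the full product.
    var = sigma2 + np.einsum("ij,jk,ik->i", Phi_star, Sigma, Phi_star)
    return mean, var

# Illustrative values only (not taken from the experiments).
mu = np.array([1.0, 2.0])
Sigma = np.diag([0.01, 0.04])
sigma2 = 0.25
Phi_star = np.array([[1.0, 0.0],
                     [1.0, 3.0]])
y_star, var_star = rvm_predict(Phi_star, mu, Sigma, sigma2)
```

The predictive variance is never smaller than the noise variance, and it grows for test points whose basis vector lies in directions where the weights are poorly determined.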

3. Experiments

3.1. Dataset

The data in this paper come from environmental monitoring of rock mass fractures at Dafo temple, Binzhou City, Shaanxi Province, China. The disease of immovable cultural relics can be expressed by the degree of fracture opening, and the factors affecting fracture opening include wind speed (m/s), temperature, ultraviolet intensity (μW/cm²), precipitation intensity (mm/h), and other climatic conditions. Therefore, these conditions are used as the input variables, and the crack opening is used as the output variable. The environmental monitoring data and fracture opening are shown in Table 1.

Monitoring the fracture opening of the Dafo temple rock mass and the climatic factors under natural conditions for 500 days yielded 500 groups of related data. The first 450 groups are taken as training samples for the RVM model, and the remaining 50 groups are taken as test samples to verify the trained model. When a different number of samples is selected for prediction, the prediction error fluctuates up or down, but the expected predicted trend is not affected.

Since the input variables have different units and differ greatly in magnitude (from less than one to tens of thousands), each group of data must be normalized. After normalization, the data of each influencing factor lie between 0 and 1, and the variation trend of each variable can be observed. It can be seen from Table 1 that there is a complex nonlinear relationship between the rock mass fissures of Dafo temple and the various environmental factors.

Through PCA dimension reduction, the contribution rate and cumulative contribution rate of each component are computed from the eigenvalues, and the principal components that meet the cumulative contribution requirement are then selected. The results of PCA are shown in Table 2.

It can be seen from Table 2 that the cumulative contribution rate of the first eight principal components reaches 95.23%, so the first 8 components with the largest contribution rate are selected as the principal components extracted by PCA.

Then, the projection matrix is composed of the eigenvectors corresponding to the selected eigenvalues. It represents the relationship between the reduced principal components and the original data and is shown in Table 3.

It can be seen from Table 3 that principal component 1 mainly reflects cumulative illumination and dew point; principal component 2 is affected most by dew point; principal components 3 and 4 mainly reflect SO2 concentration and humidity, respectively; principal components 5 and 6 are most closely related to light and dust, respectively; and principal components 7 and 8 are most closely related to ultraviolet intensity and O3 concentration, respectively.

3.2. RVM-Based Disease Prediction Modeling

After normalization and PCA, the RVM-based disease prediction model for immovable cultural relics is constructed and its effectiveness is verified:

Step 1. Initialize the hyperparameter vector $\alpha$ and the variance $\sigma^2$, and set the maximum number of iterations.

Step 2. Set a maximum value for $\alpha_i$. During the RVM iterations, any $\alpha_i$ that exceeds this maximum is considered to tend to infinity; the corresponding weight $w_i$ is then 0 and is no longer updated. A variance threshold is also set: when the relative error of the variance is less than the threshold, the training requirement is considered to be met and the loop exits.

Step 3. After 1000 iterations, the training data meet the accuracy requirement, and 22 of the $\alpha_i$ remain finite (125 remain finite when the data are not reduced by PCA); the corresponding $w_i$ are not 0, and the optimal model parameters are obtained.

Step 4. Feed the test samples into the trained model to predict the fracture values of the immovable cultural relics, and compare the predictions with the measured values.

3.3. Performance Index

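To verify the prediction performance of the proposed model, the RBF neural network, SVM, and RVM are used to train on and predict the degree of cracking of the immovable cultural relics. The prediction performance of each model is then evaluated with four indicators, namely, the mean absolute error ($MAE$), mean absolute percentage error ($MAPE$), root mean square error ($RMSE$), and coefficient of determination ($R^2$) [41]:

$$MAE = \frac{1}{n_{\mathrm{test}}} \sum_{i=1}^{n_{\mathrm{test}}} \left| \hat{y}_i - y_i \right|,$$

$$MAPE = \frac{100\%}{n_{\mathrm{test}}} \sum_{i=1}^{n_{\mathrm{test}}} \left| \frac{\hat{y}_i - y_i}{y_i} \right|,$$

$$RMSE = \sqrt{\frac{1}{n_{\mathrm{test}}} \sum_{i=1}^{n_{\mathrm{test}}} \left( \hat{y}_i - y_i \right)^2},$$

$$R^2 = 1 - \frac{\sum_{i=1}^{n_{\mathrm{test}}} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{n_{\mathrm{test}}} \left( y_i - \bar{y} \right)^2},$$

where $\hat{y}_i$ is the predicted value, $y_i$ is the actual value, $\bar{y}$ is the average of the actual values, and $n_{\mathrm{test}}$ is the number of test samples.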
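The four indicators can be computed as in the following sketch. It assumes the common definitions of each metric, with MAPE reported as a percentage; the function name `regression_metrics` and the sample values are hypothetical and are not taken from the experiments.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """MAE, MAPE (in percent), RMSE, and coefficient of determination R^2."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    mae = np.mean(np.abs(err))
    # MAPE is undefined when an actual value is zero; this sketch assumes none are.
    mape = np.mean(np.abs(err / y_true)) * 100.0
    rmse = np.sqrt(np.mean(err ** 2))
    r2 = 1.0 - np.sum(err ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    return {"MAE": mae, "MAPE": mape, "RMSE": rmse, "R2": r2}

# Illustrative values only.
y_true = [2.0, 4.0, 5.0, 10.0]
y_pred = [2.5, 3.5, 5.0, 9.0]
metrics = regression_metrics(y_true, y_pred)
```

MAE and RMSE are in the units of the crack opening, whereas MAPE and the coefficient of determination are scale free, which is why the four are usually reported together.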

4. Results and Discussion

In this section, we compare the RVM-based prediction approach for immovable cultural relics (RVM-DP) with the RBF neural network-based (RBF-DP), support vector machine-based (SVM-DP), and PCA-preprocessed RVM-based (PCA-RVM-DP) methods.

First, the 450 sets of training data are used to construct the RBF neural network, SVM, and RVM models. The corresponding parameters are obtained through training, and the 50 sets of test data are then used to verify the models. The model parameters of the three methods are selected as follows: the RBF neural network is trained as an exact radial basis function network with a spread of 223; the SVM is trained with a radial basis function kernel with a kernel parameter of 5.9 and a regularization coefficient of 0.3; and the RVM is trained with a Gaussian kernel function with a kernel width of 5.6. Finally, an RVM-based disease prediction model of immovable cultural relics is constructed on the PCA-reduced data. The original 13 features are reduced to 8 features by PCA, and the dimension-reduced training samples and test samples are then used to construct and verify the RVM model. In this case, the Gaussian kernel width of the RVM model is 1.5. The fracture opening and the predicted values of the four methods are shown in Figure 3. The box plot of the absolute prediction errors of the four models is shown in Figure 4.

Figure 3 shows that the predictions of the PCA-RVM-DP and RVM-DP models are close to the real values, whereas some of the predicted values of RBF-DP and SVM-DP deviate seriously from the true values, so their prediction performance is poor. Figure 4 shows that the prediction errors of RBF-DP and SVM-DP are relatively large and that both their maximum error and their average error exceed those of the PCA-RVM-DP and RVM-DP models. Moreover, the prediction performance of PCA-RVM-DP is similar to that of RVM-DP even though it uses fewer input features, which demonstrates the effectiveness of PCA.

Table 4 summarizes the performance indicators of the four prediction models, which makes their prediction performance easier to compare. Table 4 shows that the prediction results of PCA-RVM-DP and RVM-DP are very close. The coefficient of determination reflects the goodness of fit of the prediction results: the closer its value is to 1, the better the fit. PCA-RVM-DP and RVM-DP have the best fit.

5. Conclusions

In this paper, an RVM-based prediction method for diseases of immovable cultural relics is proposed. First, the key factors affecting the diseases of immovable cultural relics are identified by PCA, and the dimension-reduced data are divided into a training set and a test set. Second, the Gaussian kernel matrix of the training set is constructed, and the parameters of the RVM are obtained by iteration. Finally, the Gaussian kernel matrix of the test samples is constructed, and the model and parameters optimized on the training set are used to predict the crack width of the test samples.

We compared the proposed RVM-DP approach with the RBF-DP, SVM-DP, and PCA-RVM-DP methods. The results show that the traditional RBF-DP and SVM-DP methods have large errors in predicting the diseases of immovable cultural relics, whereas RVM-DP and PCA-RVM-DP achieve similar and higher prediction accuracy, with PCA-RVM-DP yielding the sparser model.

Data Availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This research was supported in part by grants from the National Natural Science Foundation of China (61703329), the China Postdoctoral Science Foundation (2018M633538), the Xi'an Science and Technology Plan Project (2020KJRC0068), the Scientific Research Program funded by Shaanxi Provincial Education Department (18JK1005), the 13th Five Year Plan of Education Science in Shaanxi Province (SGH18H159), the Key R&D Projects of Shaanxi Province (2019GY-097), and the Key Industrial Chain Projects of Shaanxi Province (2019ZDLGY15-04-02).