#### Abstract

Firstly, a genetic algorithm (GA) and simulated annealing (SA) optimized fuzzy c-means (FCM) clustering algorithm was proposed in this paper, developed to allow for a clustering analysis of massive concrete cube specimen compression test data. Then, using an optimized error correction time series estimation method based on the wavelet neural network (WNN), a concrete cube specimen compressive strength test data estimation model was constructed. Taking the results of the cluster analysis as data samples, a short-term accurate estimation of concrete quality was carried out. It was found that the mean absolute percentage error, e_{1}, and the root mean square error, e_{2}, for the samples were 6.03385% and 3.3682 kN, respectively, indicating that the proposed method had higher estimation accuracy and was suitable for short-term quality estimations of concrete compressive test data.

#### 1. Introduction

Expressway and railway construction projects have become more dependent on information technology in the past decade [1]; however, much of the collected information is not being extracted or effectively utilized [2]. Much of this construction data comes from laboratory-based concrete cube specimen compression tests, the quality of which directly affects the quality of the whole project and remains relevant through the operation and maintenance stages. Because concrete quality has far-reaching impacts on the overall project, strengthening the monitoring of the compressive test process is vital to construction safety and project success. While test machine data are monitored over long periods, it is also necessary to make accurate short-term estimations based on previous test data, so that possible problems are identified before abnormalities occur and corrections can be made to ensure the quality of the concrete used in the project. Compared with the traditional laboratory sampling of concrete cube specimens, conducting concrete quality assessments by applying time series estimation algorithms to the test data can more effectively utilize the massive information data and more accurately estimate the concrete compressive data.

There has been significant research on estimation methods, with the most classical estimation algorithms being support vector machines (SVM), WNNs, and decision trees. While SVM methods [3] use statistical theory to minimize structural risk, when the data quantity is large the algorithm is slow, consumes a great deal of computer memory, and cannot resolve multiclassification problems [4]. Consequently, many alternative optimization methods have been proposed. In [5], for example, a cuckoo search (CS) algorithm was used to optimize the unknown parameters in a support vector machine model, and in [6], a decision tree inductive algorithm was applied to classify specific data. However, when data are incomplete, decision tree performance degrades, which leads to overfitting and uneven distributions [7]. To address this, Yang and Fong (2013) [8] developed an incremental optimization mechanism to optimize fast decision trees. The compression test data in this study have fixed classifications and a large volume, characteristics well suited to prediction with an artificial neural network (ANN). Filik (2016) developed a new hybrid approach for wind speed estimation using a fast block least mean square algorithm and an artificial neural network [9]; however, artificial neural networks have been found to have inherent faults such as weak error tolerance and missing information [10].

WNNs, which apply wavelet theory to neural networks, have been found to compensate for the Fourier transform's lack of time-domain localization when predicting time series problems [11, 12]. For example, Zhang and Wu (2015) [13] proposed a GA-WNN model to optimize piezoresistive pressure sensors and the corresponding measurement systems. Sharma and Yang (2016) [14] proposed a mixed WNN for short-term solar irradiance forecasting, compared three algorithms with the mixed WNN wavelet algorithm, and showed that the wavelet algorithm significantly reduced estimation errors in solar radiation. Guan and Luh (2013) [15] used an improved WNN to solve a short-term load forecasting problem without an estimated forecast interval, and Falamarzi and Palizdan (2016) [16] used a wavelet transformation to decompose input data and improve the estimation accuracy of a transpiration estimation model.

The WNN has therefore been found to offer good estimation performance and better practicability and applicability for compression data, and on this basis this paper applies wavelet technology to ensure effective estimations. While previous research has mainly focused on pressure transmitters, solar radiation estimation, short-term load forecasting, and other fields, no research to date has examined concrete test data estimation methods for engineering construction, nor the methods needed for the sample extraction and classification of massive test monitoring data.

In this paper, a time series estimation method based on the WNN is therefore proposed, with the fracture load, F (kN), and the compressive strength (MPa) being selected as the concrete cube specimen compressive data attributes. An FCM algorithm optimized with a combined simulated annealing and genetic algorithm (SA+GA) is applied to classify the test specimens from a YAW-2000 compression-testing machine in the TJ-01 contract section of the Shaanxi Xi'an-Hancheng Intercity Railway. Then, using the improved WNN algorithm, the estimation model is trained on the clustering results, after which the trained estimation model is employed to estimate the test data. Finally, the estimated data are compared with the actual measurements to verify the accuracy and validity of the proposed method.

The remainder of this paper is organized as follows. In Section 2, a hybrid optimization FCM algorithm based on the GA+SA is proposed to cluster the sample data. In Section 3, an improved time series estimation model based on the WNN is proposed. Section 4 uses the clustering method proposed in Section 2 to cluster the sample data, after which the sample data are used to train the compression test data estimation model, and the measured data are compared with the data from the estimation model. Finally, concluding comments are given in Section 5.

#### 2. Optimized FCM Clustering Based on the GA+SA

All the data in this paper come from the information management system of the Shaanxi Xi'an-Hancheng Intercity Railway construction project. The concrete cube specimen compressive test data from the YAW-2000 compression-testing machine of the TJ-01 tender section are analyzed and estimated, the aim being to estimate the concrete quality in advance and to prevent unqualified concrete from being used in the construction materials. The fracture load, F (kN), and the compressive strength (MPa) are used to estimate the concrete compressive test data.

The FCM algorithm calculates the clustering centers by optimizing its objective function with the Lagrange multiplier method. However, because the initial FCM clustering center is random, the algorithm can easily fall into a local optimal solution if the clustering center is not properly chosen. As the clustering results depend strongly on the clustering center, incorrect clustering boundary divisions and inaccurate clustering results can follow [17]. To overcome this problem, following [18], a GA+SA is used to optimize the initial FCM algorithm clustering center.
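As an illustration, the alternating membership and center updates of standard FCM can be sketched in a few lines of Python. The function name, the optional `init` argument, the toy 2-D points in the usage note, and the fuzziness exponent m = 2 are our own illustrative choices, not taken from the paper's experiments; the random default initialization is exactly the sensitivity to the initial clustering center that the GA+SA hybrid is meant to remove.

```python
import math
import random

def fcm(points, c=2, m=2.0, iters=50, init=None, seed=0):
    """Plain fuzzy c-means on low-dimensional points (illustrative sketch).

    With no `init`, centers are seeded randomly -- the sensitivity the
    GA+SA optimization addresses by choosing good initial centers."""
    centers = [tuple(v) for v in init] if init else random.Random(seed).sample(points, c)
    for _ in range(iters):
        # Membership update: u_ik = 1 / sum_j (d_ik / d_jk)^(2/(m-1))
        U = []
        for x in points:
            d = [max(math.dist(x, v), 1e-12) for v in centers]
            U.append([1.0 / sum((d[i] / d[j]) ** (2.0 / (m - 1.0)) for j in range(c))
                      for i in range(c)])
        # Center update: v_i = sum_k u_ik^m x_k / sum_k u_ik^m
        for i in range(c):
            w = [U[k][i] ** m for k in range(len(points))]
            s = sum(w)
            centers[i] = tuple(sum(wk * p[dim] for wk, p in zip(w, points)) / s
                               for dim in range(len(points[0])))
    return centers, U
```

For example, `fcm([(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)], c=2, init=[(0.0, 0.0), (5.0, 5.0)])` converges to centers near (0.05, 0) and (5.05, 5), with each membership row summing to 1.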

##### 2.1. Genetic Algorithm (GA)

The GA [19] is a practical algorithm that imitates nature's evolutionary process of selection and survival of the fittest, in which the genes adapted to the environment remain: individuals that best fit the environment have a better chance of propagating their offspring. Similarly, the GA transforms data into corresponding binary numbers, corresponding to genes in the genetic process, after which a natural selection process using an adaptive (fitness) function takes place. The genetic process involves inheritance, variation, and selection to eventually produce a new population: crossovers between binary strings correspond to heredity, bit flips between 0 and 1 correspond to genetic mutations, and the fitness judgment corresponds to natural gene selection.

The flowchart of GA is shown in Figure 1 and the overall structure of GA is described as follows.

*Step 1 (algorithm parameters initialization). *Before any GA starts to calculate, the parameter values must be determined: the group number, N, the crossover probability, P_c, the mutation probability, P_m, and the fitness function, f(x).

*Step 2 (encoding). *A chromosome encoding is needed to describe each chromosome in the population. The encoding method determines how the problem is structured in the algorithm and which genetic operators are used. Each chromosome is made up of a sequence of genes from a certain alphabet, which can consist of binary digits (0 and 1), floating-point numbers, integers, symbols (e.g., A, B, C, and D), etc., and each chromosome represents a solution to the problem. In our GA model, the sample of concrete cube specimen compressive test data is taken as the initial population, the quantitative characteristics are transformed into genes, and the encoding is binary.

*Step 3 (fitness evaluation). *The fitness is computed for each chromosome in the current generation. The fitness value, f_i, of each chromosome (string) is calculated and sorted by size, and the total population fitness (the sum of the f_i) is calculated as well.

*Step 4 (selection). *At each iteration, the best chromosomes are chosen for reproduction. The selection probability of each string is calculated as p_i = f_i / Σ_j f_j, and the cumulative probability, q_i = Σ_{j=1}^{i} p_j, is calculated as well. The selection process can be based on the roulette method, which ensures that strings with large fitness values are selected with high probability.

*Step 5 (crossover). *This procedure generates new chromosomes by exchanging the genes at the same positions of two different chromosomes selected for breeding the next generation. First, a random number, r, between 0 and 1 is computed for each chromosome, and if r < P_c, that chromosome and another selected chromosome are chosen to cross. Then, according to the value of r, the crossover location of each chromosome pair is obtained, and the content on both sides of the crossover point is exchanged.

*Step 6 (mutation). *In the selected chromosome, some genes undergo hetero-transformation. If r < P_m, the gene at that position is flipped: a 1 becomes 0 when mutated, and vice versa.

*Step 7 (determination criterion). *When the stop criterion is satisfied, the algorithm ends and the best chromosome, together with the corresponding result, is given as the output. Otherwise, the algorithm repeats Steps 4–7.
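Steps 1–7 above can be sketched as a minimal binary-coded GA. This is an illustrative implementation only: the function name, the strictly positive OneMax-style fitness in the usage note, and the default bit length and generation count are our assumptions for demonstration (the paper's own parameter values for N, P_c, and P_m appear in Section 2.3).

```python
import random

def ga_maximize(fitness, n_bits=10, pop_n=20, pc=0.4, pm=0.02, gens=60, seed=0):
    """Binary GA per Steps 1-7: roulette-wheel selection, single-point
    crossover with probability pc, bit-flip mutation with probability pm.
    `fitness` must return a strictly positive number."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_n)]  # Step 2
    best = max(pop, key=fitness)
    for _ in range(gens):
        fits = [fitness(ch) for ch in pop]                # Step 3: fitness evaluation
        # Step 4: roulette-wheel selection (probability proportional to f_i)
        pop = [rng.choices(pop, weights=fits)[0][:] for _ in range(pop_n)]
        # Step 5: single-point crossover on consecutive pairs
        for i in range(0, pop_n - 1, 2):
            if rng.random() < pc:
                cut = rng.randrange(1, n_bits)
                pop[i][cut:], pop[i + 1][cut:] = pop[i + 1][cut:], pop[i][cut:]
        # Step 6: bit-flip mutation (0 -> 1, 1 -> 0)
        for ch in pop:
            for j in range(n_bits):
                if rng.random() < pm:
                    ch[j] ^= 1
        best = max(pop + [best], key=fitness)             # Step 7: keep best so far
    return best
```

For instance, with the toy fitness `ones = lambda ch: 1 + sum(ch)`, `ga_maximize(ones)` drives the population toward the all-ones chromosome.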

As the GA makes random selections based on probability, it can avoid the problem, suffered by other optimization algorithms, of falling too quickly into a local optimal solution. However, genetic algorithms also have disadvantages: they tend to converge prematurely, ending the search too early, and their parameter selection and general form have no quantitative stipulation or uniform format. Yet because their general compatibility is good, they can be combined with a variety of other algorithms [20].

##### 2.2. Simulated Annealing Algorithm (SA)

When the temperature of a metal is high, its internal energy is high and its entropy is large, so the system tends toward disorder. As the temperature decreases, the entropy of the metal decreases, and the energy decreases commensurately. If the energy of the searched object is taken as the objective function, its optimum can be determined through an annealing process. The simulated annealing algorithm therefore takes an initial temperature, T, as the starting point, sets the objective function and acceptance probability, and then continuously reduces the temperature to determine the optimal solution [21]. It has the advantages of strong local search ability and a short running time; however, its global searching ability is poor, so it is combined with the GA [22] to obtain better results.

The flowchart of SA is shown in Figure 2 and the overall structure of SA is described as follows.

*Step 1 (initialization). *Set an initial solution, S_{0}, make it the best solution, and calculate the value of the objective function, E(S_{0}); then set an initial temperature, T_{0}, and the genetic number of iterations, G, and let *k* = 1.

*Step 2 (setting loop part). *Do while (T_{k} > T_{end})

Produce a new state solution, S_{new}; calculate the value of the objective function, E(S_{new}), and ∆E = E(S_{new}) − E(S_{0}).

If ∆E < 0, make S_{0} = S_{new}; otherwise, if min{1, exp[−∆E/T_{k}]} ≥ random[0, 1), then S_{0} = S_{new}.

*Step 3. *Once the Metropolis criterion is satisfied, reduce the temperature, T_{k+1} = update(T_{k}), and set *k* = *k* + 1.

*Step 4 (terminating condition). *When the stop criterion is satisfied, output the search results.
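Steps 1–4 above can be sketched as a short routine. The uniform neighbour move, the geometric `update` rule, the default parameter values, and the quadratic test function in the usage note are illustrative assumptions, not the paper's settings.

```python
import math
import random

def simulated_annealing(E, s0, T0=100.0, T_end=0.01, q=0.95, L=20, seed=0):
    """SA per Steps 1-4: at each temperature run a Metropolis chain of
    length L, accept worse moves with probability exp(-dE/T), then cool
    geometrically (T <- q*T) until the end temperature is reached."""
    rng = random.Random(seed)
    s = best = s0
    T = T0
    while T > T_end:
        for _ in range(L):                        # Metropolis chain
            s_new = s + rng.uniform(-1.0, 1.0)    # neighbouring solution
            dE = E(s_new) - E(s)
            # Metropolis acceptance criterion
            if dE < 0 or math.exp(-dE / T) >= rng.random():
                s = s_new
            if E(s) < E(best):
                best = s
        T *= q                                    # geometric cooling schedule
    return best
```

For example, minimizing `E = lambda x: (x - 3.0) ** 2` from `s0 = -10.0` returns a solution close to 3.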

##### 2.3. GA+SA+FCM Clustering Algorithm

The GA+SA algorithm first generates the initial population and then generates a new population through the selection, crossover, and mutation of the GA. The new population is selected and replaced by the SA. When the iteration count reaches the genetic number of iterations, G, or the fitness value reaches the expected value, the algorithm exits the inner loop. After exiting, the outer loop reduces the temperature parameter, T, of the SA until the temperature reaches the preset end value, T_{end}, or the algorithm converges, ending the cycle. The GA+SA+FCM clustering algorithm flowchart is shown in Figure 3 and the steps of the GA+SA optimized FCM algorithm can be described as follows.

*Step 1 (parameters initialization). *Set the group number, N, genetic number of iterations, G, crossover probability, P_c, mutation probability, P_m, starting temperature, T_{0}, cooling coefficient, q, and end temperature, T_{end}.

*Step 2 (fitness evaluation). *According to the number of clusters, c, cluster groups are generated; the GA encodes the initial values and generates the initial population. The fitness value, f_i, of each chromosome (string) is calculated and sorted, and the total population fitness is calculated as well.

*Step 3. *Set the cycle count variable g=0.

*Step 4 (cycling part). *Perform GA steps such as selection, crossover, and mutation to generate a new population. Use SA to calculate a best solution to replace the new population obtained from the GA.

*Step 5. *While g < G, set g = g + 1 and repeat Step 4; otherwise, exit the cycling part and jump to Step 6.

*Step 6 (get the optimal solution). *If the temperature has reached the end temperature, T_{end}, take the best solution calculated by the SA as the optimal solution; otherwise, continue to reduce the temperature and seek the optimal solution.

The GA+SA optimized FCM algorithm is used to cluster the compression test data sample. First, the number of clusters, *c*, is determined according to [23], and the control parameters are initialized as follows: group number, N = 10, genetic number of iterations, G = 50, crossover probability, P_c = 0.4, and mutation probability, P_m = 0.02. According to [24], the remaining control parameters are set as follows: cooling coefficient, *q* = 0.95, starting temperature, T_{0} = 100, end temperature, T_{end}, and Metropolis chain length, L.
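To make the hybrid concrete, the GA/SA search needs a fitness for a candidate set of initial centers. A natural choice, consistent with the FCM objective described in this section, is f = 1/(1 + J_m), so that smaller objective values score higher; the sketch below is our own illustrative glue code (the function names and the m = 2 closed-form membership are assumptions).

```python
import math

def fcm_objective(points, centers, m=2.0):
    """FCM objective J_m = sum_k sum_i u_ik^m * d_ik^2, using the
    closed-form membership u_ik = (1/d_ik^2) / sum_j (1/d_jk^2),
    which is valid for the common choice m = 2."""
    J = 0.0
    for x in points:
        d2 = [max(math.dist(x, v) ** 2, 1e-24) for v in centers]
        inv = [1.0 / d for d in d2]
        s = sum(inv)
        for i in range(len(centers)):
            u = inv[i] / s                 # membership of point x in cluster i
            J += (u ** m) * d2[i]
    return J

def center_fitness(points, centers):
    """GA/SA fitness for candidate initial centers: smaller J, larger fitness."""
    return 1.0 / (1.0 + fcm_objective(points, centers))
```

Well-placed candidate centers then score strictly higher than poorly placed ones, which is what drives the hybrid search toward good FCM initializations.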

The GA+SA+FCM algorithm divides the concrete compressive test specimen data into three categories, which lays the foundation for the construction of the concrete compressive test data estimation model.

The performance of the different clustering algorithms is evaluated by comparing cluster validity indices [25]: the partition coefficient (PC), partition entropy (PE), partition index (SC), separation index (S), Xie-Beni validity index (XB), Dunn validity index (DI), and alternative Dunn index (ADI).
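Two of the simpler indices, PC and PE, can be computed directly from the membership matrix U using their standard definitions; the other indices also require the data points and centers. The sketch below is illustrative (function names are ours), with U given as one membership row per data point.

```python
import math

def partition_coefficient(U):
    """PC = (1/n) * sum_k sum_i u_ik^2; values near 1 mean a crisp partition."""
    return sum(u * u for row in U for u in row) / len(U)

def partition_entropy(U):
    """PE = -(1/n) * sum_k sum_i u_ik * ln(u_ik); values near 0 are better."""
    return -sum(u * math.log(u) for row in U for u in row if u > 0.0) / len(U)
```

For a perfectly crisp partition, PC = 1 and PE = 0; for a maximally fuzzy two-cluster partition (all memberships 0.5), PC = 0.5 and PE = ln 2 ≈ 0.693.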

#### 3. Wavelet Neural Network Time Series Estimation Algorithm

##### 3.1. Wavelet Neural Network (WNN)

The WNN combines the wavelet function with the neural network by using a wavelet function as the activation function. Because the wavelet function can analyze a signal in both the time and frequency domains, it can express the complete signal information and capture local signal characteristics; therefore, the WNN has a simpler structure and higher estimation accuracy than a general neural network [26].

The WNN computation process is divided into three steps.

*Step 1 (calculate the hidden layer’s output). *The hidden layer output is

$$h(j) = h_j\left(\frac{\sum_{i}\omega_{ij}x_i - b_j}{a_j}\right), \quad j = 1, 2, \ldots, l$$

where *h(j)* is the output value of the *j*-th node of the hidden layer, ω_{ij} is the connection weight of the input layer and the hidden layer, *h*_{j} is the wavelet basis function, *b*_{j} is the translation factor for the wavelet basis function, and *a*_{j} is the telescopic factor for the wavelet basis function. This paper uses the Mexican hat (Mexh) function as the basis function, with the Mexh wavelet being the second derivative of the Gaussian function.

The Mexh function expression is

$$h(x) = (1 - x^2)\,e^{-x^2/2}$$

*Step 2 (calculate the output layer’s output). *The output layer output is

$$y(k) = \sum_{i=1}^{l}\omega_{ik}\,h(i), \quad k = 1, 2, \ldots, m$$

where ω_{ik} is the weight from the hidden layer to the output layer, *h*(*i*) is the output of the *i*-th hidden layer node, *l* is the number of hidden layer nodes, and *m* is the number of output layer nodes.

*Step 3 (correct the error). *In this paper, a gradient modification method is used to modify the weights of the WNN and its basis function parameters. As the gradient learning method efficiency is low, an increasing momentum method is used to improve the efficiency [27]. The correction error is set to* e*, and the correction processes are described as follows:

Calculate the network estimation error:

$$e = \sum_{k=1}^{m}\left[yn(k) - y(k)\right]$$

where *yn*(*k*) is the desired output and *y*(*k*) is the output value predicted by the WNN.

The correction formulas for the weights and parameters add the momentum term:

$$\omega_{ik}^{(t+1)} = \omega_{ik}^{(t)} + \Delta\omega_{ik}^{(t+1)}, \quad a_j^{(t+1)} = a_j^{(t)} + \Delta a_j^{(t+1)}, \quad b_j^{(t+1)} = b_j^{(t)} + \Delta b_j^{(t+1)}$$

where $\Delta\omega_{ik}^{(t+1)}$, $\Delta a_j^{(t+1)}$, and $\Delta b_j^{(t+1)}$ are calculated based on the network estimation error:

$$\Delta\omega_{ik}^{(t+1)} = -\eta\,\frac{\partial e}{\partial \omega_{ik}^{(t)}} + \alpha\,\Delta\omega_{ik}^{(t)}$$

and analogously for $a_j$ and $b_j$, where *η* is the learning efficiency and *α* is the momentum factor.
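Steps 1 and 2 (the forward pass through a Mexh wavelet hidden layer and a linear output layer) can be sketched as follows. The class layout and random initialization ranges are our own assumptions, and the momentum-based correction of Step 3 is omitted for brevity.

```python
import math
import random

def mexh(x):
    """Mexican-hat wavelet (second derivative of a Gaussian, normalization dropped)."""
    return (1.0 - x * x) * math.exp(-x * x / 2.0)

class WNN:
    """Minimal wavelet neural network: one wavelet hidden layer, linear output."""
    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        r = rng.uniform
        self.w1 = [[r(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
        self.w2 = [[r(-1, 1) for _ in range(n_hidden)] for _ in range(n_out)]
        self.a = [r(0.5, 1.5) for _ in range(n_hidden)]   # dilation factors a_j
        self.b = [r(-1, 1) for _ in range(n_hidden)]      # translation factors b_j

    def forward(self, x):
        # Step 1: h(j) = psi( (sum_i w_ij * x_i - b_j) / a_j )
        h = [mexh((sum(w * xi for w, xi in zip(row, x)) - bj) / aj)
             for row, bj, aj in zip(self.w1, self.b, self.a)]
        # Step 2: y(k) = sum_i w_ik * h(i)
        return [sum(w * hj for w, hj in zip(row, h)) for row in self.w2]
```

Note that `mexh(0) = 1` and `mexh(±1) = 0`, so each hidden node responds most strongly near its translated, dilated center, which is what gives the WNN its time-frequency localization.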

##### 3.2. Build Estimation Model

Based on the clustering results from Section 2.3, the WNN is used to construct the test data estimation model for each concrete compressive test data set [28]. The network is divided into three layers: an input layer, a hidden layer, and an output layer [29]. The input layer takes the fracture load and compressive strength of the previous *n* tests from the test machine, the hidden layer nodes are composed of the wavelet function, and the output layer gives the predicted compressive test data for the current test. Figure 4 shows the flowchart for the concrete compressive test data estimations.

In this paper, the mean absolute percentage error, *e*_{1}, and the root-mean-square error, *e*_{2}, are selected as the evaluation indices [30] to assess the accuracy of the estimation results, the formulas for which are

$$e_1 = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{y_i - z_i}{z_i}\right| \times 100\%, \qquad e_2 = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i - z_i)^2}$$

where *y*_{i} is the estimated data and *z*_{i} is the actual measured data.
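The two evaluation indices translate directly into code; the function names `mape` and `rmse` are our own labels for e_1 and e_2.

```python
import math

def mape(y_pred, z_meas):
    """e1: mean absolute percentage error, in percent."""
    return 100.0 * sum(abs(y - z) / abs(z) for y, z in zip(y_pred, z_meas)) / len(z_meas)

def rmse(y_pred, z_meas):
    """e2: root-mean-square error, in the data's own units (kN or MPa here)."""
    return math.sqrt(sum((y - z) ** 2 for y, z in zip(y_pred, z_meas)) / len(z_meas))
```

For example, predictions of 110 and 95 against measured values of 100 and 100 give e_1 = 7.5%.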

##### 3.3. Analysis of Restrictive Conditions and Applicability

WNN estimation is not a completely unfamiliar tool, but it is still at the exploratory stage for concrete test data estimation. Wavelet analysis itself can handle nonstationary time series and is widely used in the prediction of data flows with complex characteristics, so it is feasible to apply the WNN to concrete quality test data estimation.

Many factors affect the prediction results for concrete test data, such as the loading rate of the compression-testing machine, differences between operators, changes of compression-testing machine, concrete cube test blocks of different sizes, singular test data, and so on. However, for a given compression-testing machine using standard concrete cube specimens, the time series of the compression test data is self-correlated. In this estimation model, all the test data come from the same YAW-2000 compression-testing machine; the equipment is computer-controlled and has a hydraulic synchronous loading device, which eliminates the influence of machine-to-machine variation and loading rate instability on the estimation results. As for estimation errors caused by singular test data, the training test data and the actual measured verification data both come from the Shaanxi Xi'an-Hancheng Intercity Railway construction information platform, which alerts on unqualified data and flags suspicious data for review; the data remaining in the platform are therefore reliable.

#### 4. Simulation Analysis

Using concrete compressive test data from the Shaanxi Xi'an-Hancheng Intercity Railway construction information platform, a detailed simulation was conducted. Using one-step-ahead estimation, test data from the YAW-2000 compression-testing machine in the TJ-01 contract section field laboratory were selected for clustering and training. There are three types of concrete cube specimens depending on the concrete mix ratio (C20, C30, and C40). The standard concrete specimen is a 150 mm × 150 mm × 150 mm cube. Through the compressive strength test, we obtained the fracture load, F (kN), and the compressive strength (MPa), the formula for which is

$$\text{compressive strength} = \frac{F}{A}$$

where A is the pressure-resistant area of the specimen and A = 150 mm × 150 mm.
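With the kN-to-N unit conversion made explicit, the strength formula for the standard 150 mm cube becomes a one-line helper (the function name is ours):

```python
def compressive_strength_mpa(fracture_load_kn, side_mm=150.0):
    """Compressive strength = F / A: F in kN converted to N, A in mm^2,
    so the result is in N/mm^2, i.e. MPa."""
    area_mm2 = side_mm * side_mm          # A = 150 mm * 150 mm = 22500 mm^2
    return fracture_load_kn * 1000.0 / area_mm2
```

For example, a 675 kN fracture load on a standard cube corresponds to 30 MPa, i.e., a C30-level strength.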

The data from 200 concrete compressive tests on the YAW-2000 compression-testing machine were taken as training samples to build the estimation models. The data from the tests after these 200 were then used as the actual measured data samples for comparison with the estimated data; the comparison results verify the validity of the estimation model.

The 200 concrete compressive test data used as the training sample are shown in Table 1.

##### 4.1. Clustering Analysis

Cross-validation was used to determine the clustering number for the FCM algorithm,* c*=3, and the GA+SA+FCM clustering algorithm was used to train the samples for the analysis. The concrete compressive test sample clustering results are shown in Figure 5.

To verify the clustering effect of the optimization algorithm, the FCM clustering algorithm, the GA + FCM clustering algorithm, the SA + FCM clustering algorithm, and the GA+SA + FCM clustering algorithm were each used to cluster the samples. The validity measure values for these four methods are shown in Table 2.

In Figure 5, *X* and *Y* are the coordinate axes. The group in the lower left corner is C20, the middle group is C30, the rightmost group is C40, and the small red circle in each group marks the cluster center. In Table 2, the GA+SA+FCM algorithm's XB index was 3.2631, its DI index was 0.0308, and its ADI index was 0.0084. Compared with the simulation results of the other three algorithms, its clusters were more compact and its separation ratio was better. These results indicate that the GA+SA+FCM algorithm can be used to cluster the concrete compressive test data.

##### 4.2. Simulation Analysis

To verify the estimation effects of the WNN, a BP neural network and a WNN were used to estimate the concrete compressive test data under the same experimental conditions, and the two sets of estimated data were compared with the actual measured data; the scatter diagram results are shown in Figure 6. The actual measured data samples, fifty C30 and fifty C40 concrete cube specimen test data, were selected from the same YAW-2000 compression-testing machine after the 200 training sample tests.

**(a) Fracture load estimation results of the C30 concrete.**

**(b) Compressive strength estimation results of the C30 concrete.**

**(c) Fracture load estimation results of the C40 concrete.**

**(d) Compressive strength estimation results of the C40 concrete.**

A comparison of the measured data with the WNN estimated data indicated that the WNN was able to estimate the fracture load and compressive strength of the concrete compressive test data, with the estimated data trends being basically consistent with the measured data trends; however, when the data changed suddenly, the estimation error increased noticeably. From a comparison of the measured data, the WNN estimated data, and the BP neural network estimated data, it was found that both estimation algorithms were able to accurately estimate the fracture load, F, and the compressive strength of the concrete compressive test data when the data changed gently and regularly. However, when the data changed abruptly, the BP neural network estimation errors were larger than the WNN estimation errors at the same positions. As shown in Figures 6(a), 6(b), 6(c), and 6(d), at the 6th, 7th, 30th, 32nd, and 47th tests in Figures 6(a) and 6(b), and the 3rd, 12th, 43rd, and 44th tests in Figures 6(c) and 6(d), the WNN estimation accuracy was clearly higher than that of the BP neural network, which proved that the WNN was able to overcome some of the BP neural network's shortcomings and was superior for these estimations. Table 3 shows the evaluation index data for the two estimation algorithms, where *e*_{1} is the mean absolute percentage error and *e*_{2} is the root mean square error. The *e*_{1} values for the C30 and C40 concrete fracture loads estimated by the WNN were 4.7121 and 4.7409, far smaller than the BP results of 9.5870 and 9.3584; the corresponding *e*_{2} values were 7.2174 and 7.4650, far smaller than the BP results of 11.2873 and 11.4894. The *e*_{1} and *e*_{2} values for the C30 and C40 concrete compressive strengths showed the same pattern. The evaluation index comparison showed that the WNN estimation effect was superior to that of the BP neural network.

#### 5. Conclusions

As the GA+SA optimized clustering algorithm does not fall into local optima and has a rapid calculation speed, it proved effective in searching for the global optimal solution and had a significantly better clustering effect than the traditional algorithms. It was also shown that the traditional FCM clustering algorithm depends strongly on the clustering center; an incorrect selection of the clustering center leads to an incorrect clustering boundary, which significantly reduces the clustering effect.

The WNN estimation algorithm was found to be able to analyze and estimate the central lab's clustered concrete compressive test data, with the estimation results clearly meeting the requirements. In the WNN error correction process, an increasing-momentum method was adopted to improve the learning efficiency and to remedy the low efficiency of traditional gradient modification methods.

The WNN estimation algorithm based on the GA+SA+FCM optimized clustering algorithm proposed in this paper was able to analyze and estimate the concrete compressive test data; the clustering effect was obvious and the estimation accuracy met the requirements. For example, the C30 concrete fracture load estimation indices reached *e*_{1} = 4.7121% and *e*_{2} = 3.0389 kN, and the C40 concrete's reached *e*_{1} = 4.7409% and *e*_{2} = 3.7480 kN. The C30 concrete compressive strength estimation gave *e*_{1} = 7.2174% and *e*_{2} = 3.2447, and the C40 concrete compressive strength estimation gave *e*_{1} = 7.4650% and *e*_{2} = 3.4412. Therefore, as the estimation algorithm was able to predict the concrete quality in advance, any problems associated with the use of unqualified concrete could be avoided.

#### Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

#### Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

#### Acknowledgments

This research was supported by the Fundamental Research Funds for the Central Universities (no. 300102258402, Remote Intelligent Monitoring of Mechanized Construction Quality) and by the 2017 Traffic Research Project of the Shaanxi Communications Department, Research on Information Technology for Quality and Safety Management of Railway Construction Projects, under Grant no. 17-55X.