Abstract

In this study, the daily milk yield records of Holstein dairy cows over one lactation period (305 days) were used as lactation time series data. Empirical mode decomposition (EMD) was used to decompose the nonstationary milk yield series, which contains multiple oscillation modes, into several components. After the interference components were eliminated, the remaining components were superimposed and reconstructed to obtain the denoised milk yield series. The denoised series retained the basic characteristics of the original milk yield series and corrected the errors in the original data. A back propagation neural network (BPNN) was used to predict both the original and the denoised milk yield series, and the results were compared. The results showed that it is feasible to use EMD to smooth the original daily milk yield data. The denoised milk yield series benefited the learning of the prediction model and improved the accuracy of daily milk yield prediction for dairy cows.

1. Introduction

In predictive studies of the milk yield of dairy cows, Brody et al. [1] used a mathematical model in 1923 to describe the functional relationship between milk yield and lactation time. Since then, a large body of mathematical modeling research has been carried out to predict milk yield [2-4]. The basic idea behind selecting or constructing a model is generally the same: fit the mean milk yield of cows over a certain period of time and then use the model to predict milk yield. Predicting the population milk yield from population mean statistics is useful for investigating the nutritional requirements of a dairy cow population [5-7]; however, such treatment ignores the differences in production performance between individual cows. Clearly, applying such a model to predict the milk yield of individual dairy cows will increase the prediction error [8-10]. With the increasingly demanding standards of fine feeding for dairy cows, the milk yield prediction accuracy for individual cows should be improved so that the nutritional requirements of individual cows can be obtained and the dietary nutrient concentrations for individual cows can be clustered, yielding the optimal grouping scheme and group feeding formula for dairy cows.

Biological differences between individual dairy cows, together with differences in the metabolism of feed nutrients, cause the daily milk yield of dairy cows to change over time. Therefore, within a lactation period, the daily milk yield of an individual cow forms a nonlinear time series. Measuring errors during production and unpredictable external effects distort some of the daily milk yield values and act as the noise term. Viewed against the timing characteristics of the entire lactation period, these values are the unpredictable part of the milk yield series; they disturb the prediction model's learning of the major data characteristics and reduce its prediction performance [11-13]. Because of the complex physiological mechanisms of dairy cows, traditional forecasting methods and models have not been flexible enough to predict lactation dynamically and reliably, and single forecasting models usually do not provide accurate predictions. Combined forecasting models bring together the advantages of different techniques or methods to predict data trends more effectively. This study employed empirical mode decomposition (EMD) [14] to decompose the original milk yield series according to its time scale characteristics, reconstructed the components into a denoised milk yield series, and carried out a simulation forecast using the BPNN model. The results suggested that the denoised milk yield series preserved the properties of the original data well, which helped improve the prediction accuracy of the model. The proposed combined model enhances the learning capability of the prediction model.

2. Materials and Methods

2.1. Data

In this study, daily milk yield data were converted to 4% fat-corrected milk (FCM) [15]. One lactation period of a dairy cow was defined as one milk yield series; the lactation period ran from calving to the 305th lactation day after delivery, and the last milk yield record was not used for prediction. The data in a milk yield series were randomly divided into a training set and a verification set: the training set samples were used to train the model, while the verification set samples were used for model verification.

2.2. Relevant Theories and Techniques

EMD-based milk yield data decomposition: EMD was used to make the original milk yield series stationary by decomposing it into a small, finite number of intrinsic mode functions (IMFs) and a residual [14]. The original milk yield series $x(t)$ is the cumulative sum of these timing components:

$$x(t) = \sum_{i=1}^{n} c_i(t) + r_n(t)$$

Here $c_i(t)$ stands for the IMFs of the original milk yield series arranged from high frequency to low frequency, $n$ represents the number of IMFs, and $r_n(t)$ indicates the trend of the original milk yield series $x(t)$. The original milk yield series was decomposed by EMD as follows:

(1) All local maxima and local minima of the original milk yield series $x(t)$ were identified. The upper envelope [16] $u(t)$ through the local maxima and the lower envelope $l(t)$ through the local minima were obtained with the cubic spline function [17]. The mean of the upper and lower envelopes was $m(t) = [u(t) + l(t)]/2$, and the difference between the original milk yield series and the mean was $h(t) = x(t) - m(t)$.

(2) If $h(t)$ was an IMF, then $h(t)$ was IMF1, and $c_1(t) = h(t)$. Otherwise, $x(t)$ was replaced by $h(t)$ and the calculation returned to step (1) until IMF1 was extracted.

(3) The residual was calculated as $r_1(t) = x(t) - c_1(t)$. If $r_1(t)$ was not a monotonic function, then $x(t)$ was replaced by $r_1(t)$ and steps (1)-(3) were repeated to extract the next IMF; otherwise, the decomposition ended with the IMFs $c_1(t), \ldots, c_n(t)$ and the residual $r_n(t)$.

(4) The high-frequency IMF1 was removed [18], and the other IMFs were summed and added to the residual to obtain the denoised milk yield series $\tilde{x}(t) = \sum_{i=2}^{n} c_i(t) + r_n(t)$.
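The decomposition and reconstruction steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the study: the envelopes are built by piecewise-linear interpolation rather than the cubic spline, a fixed number of sifting passes (`n_sift`) stands in for a formal IMF test, and all function names and the synthetic `series` are hypothetical.

```python
import math


def local_extrema(x):
    """Return index lists of the interior local maxima and minima of x."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i - 1] < x[i] >= x[i + 1]:
            maxima.append(i)
        elif x[i - 1] > x[i] <= x[i + 1]:
            minima.append(i)
    return maxima, minima


def envelope(x, idx):
    """Piecewise-linear envelope through the points (i, x[i]) for i in idx.

    Assumption: linear interpolation replaces the cubic spline, and the
    first/last extremum value is held constant toward the series ends.
    """
    env = []
    for t in range(len(x)):
        if t <= idx[0]:
            env.append(x[idx[0]])
        elif t >= idx[-1]:
            env.append(x[idx[-1]])
        else:
            k = max(j for j in range(len(idx)) if idx[j] <= t)
            a, b = idx[k], idx[k + 1]
            w = (t - a) / (b - a)
            env.append((1 - w) * x[a] + w * x[b])
    return env


def sift_once(x):
    """One sifting pass: subtract the mean of the upper and lower envelopes."""
    maxima, minima = local_extrema(x)
    if not maxima or not minima:
        return None  # no oscillation left: x is treated as the residual
    upper, lower = envelope(x, maxima), envelope(x, minima)
    return [xi - (u + l) / 2 for xi, u, l in zip(x, upper, lower)]


def emd(x, max_imfs=8, n_sift=10):
    """Split x into IMFs (high to low frequency) and a residual."""
    imfs, residual = [], list(x)
    for _ in range(max_imfs):
        h = sift_once(residual)
        if h is None:
            break
        for _ in range(n_sift - 1):
            nxt = sift_once(h)
            if nxt is None:
                break
            h = nxt
        imfs.append(h)
        residual = [r - c for r, c in zip(residual, h)]
    return imfs, residual


def denoise(imfs, residual):
    """Drop IMF1 (highest frequency); sum the other IMFs and the residual."""
    out = list(residual)
    for imf in imfs[1:]:
        out = [o + c for o, c in zip(out, imf)]
    return out


# Synthetic stand-in for a milk yield series: trend + slow wave + day-to-day jitter.
series = [30 + 0.01 * t + 2 * math.sin(t / 5) + 0.5 * (-1) ** t for t in range(100)]
imfs, residual = emd(series)
denoised = denoise(imfs, residual)
```

By construction the IMFs and the residual sum back to the input, so the denoised series differs from the original only by the removed first IMF.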

2.3. BPNN Model

BPNN is a multilayer feedforward neural network with forward signal transfer and backward error propagation. It is frequently used in nonlinear prediction and is one of the most widely applied neural network models at present. A three-layer BPNN model was constructed for the simulation forecast. The numbers of input layer and output layer neurons in the model topology depend on the prediction task, and the Hecht-Nielsen method was used to obtain the number of neurons in the hidden layer [19]: if the number of neurons in the input layer is $n$, then the number of neurons in the hidden layer should be $2n + 1$. BPNN depends strongly on the initial weights and biases, and a good set of weights and biases is of crucial importance to the prediction performance of the network. Under the premise of guaranteeing the minimal training error, the improved distribution iteration was adopted to obtain these parameters, replacing the random parameter generation of the standard BPNN.
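To make the three-layer structure concrete, the following Python sketch builds a small network with a tanh hidden layer and a linear output and trains it by plain gradient descent. It is only an illustration: the class and function names are hypothetical, the toy data are invented, and the weights are initialized randomly as in a standard BPNN, so the improved distribution iteration described above is not reproduced.

```python
import math
import random


def hidden_size(n_inputs):
    """Hecht-Nielsen rule: 2n + 1 hidden neurons for n input neurons."""
    return 2 * n_inputs + 1


class BPNN:
    """Three-layer feedforward network: tanh hidden layer, linear output."""

    def __init__(self, n_inputs, seed=0):
        rng = random.Random(seed)
        self.n_hidden = hidden_size(n_inputs)
        # Assumption: plain random initialization, not an optimized one.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                   for _ in range(self.n_hidden)]
        self.b1 = [0.0] * self.n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(self.n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        """Forward signal transfer; returns hidden activations and output."""
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        return hidden, sum(w * h for w, h in zip(self.w2, hidden)) + self.b2

    def predict(self, x):
        return self.forward(x)[1]

    def train_step(self, x, target, lr=0.05):
        """One forward pass, then back-propagate the squared error."""
        hidden, y = self.forward(x)
        err = y - target  # derivative of 0.5 * err**2 with respect to y
        for j, h in enumerate(hidden):
            grad_h = err * self.w2[j] * (1.0 - h * h)  # back through tanh
            self.w2[j] -= lr * err * h
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * grad_h * xi
            self.b1[j] -= lr * grad_h
        self.b2 -= lr * err
        return 0.5 * err * err


# Toy usage: learn target = 0.5 * x on inputs already scaled to [-1, 1].
samples = [([-1 + 0.1 * k], 0.5 * (-1 + 0.1 * k)) for k in range(21)]
net = BPNN(n_inputs=1)
loss_before = sum(0.5 * (net.predict(x) - t) ** 2 for x, t in samples)
for _ in range(200):
    for x, t in samples:
        net.train_step(x, t)
loss_after = sum(0.5 * (net.predict(x) - t) ** 2 for x, t in samples)
```

With one input neuron the rule gives three hidden neurons; comparing `loss_before` and `loss_after` shows whether training reduced the error on the toy data.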

2.4. Simulation Forecast

The simulation forecast comprised the original or denoised milk yield series, the search for distribution iteration adaptive operators, and the BPNN (Figure 1), as described below:
(1) The original or denoised milk yield series was normalized to the interval [-1, 1]
(2) The distribution iteration adaptive operators were found, the BPNN was initialized, and the self-adapting operator search algorithm was used to optimize the initial weights and biases of the BPNN
(3) The BPNN was trained starting from the optimal weights and biases, with the training set milk yield as the input variable and the next-day milk yield of the training set as the output variable; the verification set was then used for testing
(4) $R^2$, RMSE, and SSE were employed to evaluate the prediction results
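The data preparation in steps (1) and (3) can be sketched as follows. The sketch assumes a linear min-max mapping onto [-1, 1] (the exact normalization formula is not reproduced here), pairs each day's yield with the next day's yield, and splits the pairs at random; the function names, the synthetic series, and the training set size of 200 are placeholders rather than values from the study.

```python
import math
import random


def normalize(series):
    """Map a series linearly onto [-1, 1] (minimum -> -1, maximum -> 1)."""
    lo, hi = min(series), max(series)
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in series]


def next_day_pairs(series):
    """Pair each day's yield (input) with the next day's yield (target)."""
    return list(zip(series[:-1], series[1:]))


def random_split(pairs, n_train, seed=0):
    """Shuffle the pairs and cut them into training and verification sets."""
    shuffled = pairs[:]
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


# Synthetic 305-day series standing in for real records (hypothetical values).
series = [30 + 5 * math.sin(day / 40) for day in range(1, 306)]
scaled = normalize(series)
pairs = next_day_pairs(scaled)  # the last day has no next-day target
train, verify = random_split(pairs, n_train=200)  # 200 is a placeholder size
```

A 305-day series gives 304 input-target pairs, because the final day has no following value to predict.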

2.5. Evaluation Indexes of Prediction Performance

R-square ($R^2$, the coefficient of determination) was used to characterize how well the model fits the data; a value closer to 1 indicates a better fit.

Root mean squared error (RMSE) was the fitting standard deviation of the regression system.

The sum of squares due to error (SSE) closer to 0 indicated better model selection and fitting and more successful data prediction.
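The three indexes can be computed as in the sketch below. It assumes the common definitions $R^2 = 1 - \mathrm{SSE}/\mathrm{SST}$ and $\mathrm{RMSE} = \sqrt{\mathrm{SSE}/n}$; a variant of RMSE that divides by the residual degrees of freedom instead of $n$ would give slightly different values. The function names and the four example values are hypothetical.

```python
import math


def sse(observed, predicted):
    """Sum of squares due to error."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


def rmse(observed, predicted):
    """Root mean squared error, assuming division by the sample count n."""
    return math.sqrt(sse(observed, predicted) / len(observed))


def r_square(observed, predicted):
    """Coefficient of determination: 1 - SSE / SST."""
    mean = sum(observed) / len(observed)
    sst = sum((o - mean) ** 2 for o in observed)
    return 1.0 - sse(observed, predicted) / sst


# Hypothetical daily yields (kg) and predictions, for illustration only.
observed = [30.0, 32.0, 31.0, 35.0]
predicted = [31.0, 32.0, 30.0, 34.0]
scores = (r_square(observed, predicted),
          rmse(observed, predicted),
          sse(observed, predicted))
```

For these four values SSE is 3 and SST is 14, so $R^2 = 1 - 3/14 \approx 0.79$ and RMSE is $\sqrt{0.75} \approx 0.87$.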

3. Results and Discussion

3.1. Denoising of the Milk Yield Series

In Figure 2, each observed daily milk yield of a cow is one time point of the milk yield series within a lactation period; the series is nonstationary, nonlinear, complex, and noisy. Because the IMFs capture features at different scales, EMD was used to decompose the original milk yield series and extract all of the finite IMFs and the residual series. After the first IMF was removed, the remaining IMFs and the residual were recombined to obtain the denoised milk yield series.

The milk yield series was decomposed through EMD to obtain the IMFs and the residual series. The resulting IMFs contained the local characteristic signals of the original milk yield series at various time scales, which was the “roguing” [20, 21] reflection of the entire original milk yield series. The IMFs had obvious physical significance: they represented the hidden fluctuation components of the original milk yield series at various scales, while the residual indicated the basic trend of milk yield, and this trend was consistent with the trend features of group statistical data [22-24].

In the original milk yield series, milk yield increased gradually at first (early lactation period), peaked in the second or third lactation month and stayed at that level for a period of time (peak lactation period), and thereafter decreased slowly (middle-late lactation period). This trend is consistent with previous reports [25, 26]. The denoised milk yield series also reflected the lactation peak at 49-79 days well, corrected the measuring error on the 271st lactation day, and on the whole preserved the individual milk yield characteristics of the original milk yield series (Figure 3).

3.2. BPNN Prediction

The training sets of the original milk yield series and the denoised milk yield series were used to fit the prediction equation with BPNN. The prediction scatter diagram for the training set of the original milk yield series is shown on the left of Figure 4, and that for the training set of the denoised milk yield series on the right of Figure 4. The prediction scatter diagram for the verification set of the original milk yield series is shown on the left of Figure 5, and that for the verification set of the denoised milk yield series on the right of Figure 5.

The prediction scatter diagram for the training set of the original milk yield series was clearly scattered, whereas an obvious correlation could be seen for the training set of the denoised milk yield series. The minimum of 10.4 kg occurred on the 271st lactation day in the original milk yield series and was confirmed on verification to be a measuring error; the predicted value was 32.3 kg, which gave the maximum residual of 21.9 kg. After denoising, this abnormal value was corrected to 17.8 kg, and the residual was reduced. At the same time, the milk yield values in the training set of the original milk yield series were mostly concentrated between 30 kg/d and 35 kg/d, and this feature was reflected in the scatter diagram for the training set of the denoised milk yield series: the greatest scatter density was seen at 30-35 kg/d on the right of Figure 4.

The prediction scatter diagram for the verification set of the original milk yield series was less clearly scattered, which indicated that fewer errors and noisy data points were randomly assigned to the verification set of the original milk yield series than to the training set. An obvious correlation could still be observed for the verification set of the denoised milk yield series. In terms of residuals, the verification set of the denoised milk yield series also performed better than that of the original milk yield series.

In this study, the prediction effect of the denoised milk yield series was better than that of the original milk yield series, and the indexes of correlation and residual in the denoised milk yield series were better than those of the original milk yield series, which partially overcame the problem of prediction distortion of the measuring errors and noise data.

3.3. BPNN Evaluation

The training sets of the original milk yield series and the denoised milk yield series were each fitted using BPNN (Table 1). For the model fitted to the original milk yield series, the determination coefficient $R^2$ was 0.37, RMSE was 4.35, and SSE was 3827.68. For the model fitted to the denoised milk yield series, $R^2$ was 0.90, RMSE was 1.37, and SSE was 379.90. When the model fitted to the original series was applied to the original verification set samples, $R^2$ was 0.56, RMSE was 3.35, and SSE was 1146.28. When the model fitted to the denoised series was applied to the denoised verification set samples, $R^2$ was 0.86, RMSE was 1.79, and SSE was 326.81. Clearly, using the original milk yield series directly gave a substantially lower goodness of fit and a much larger error, which suggests that the prediction model built on the original milk yield series performed poorly. By contrast, the prediction model built on the denoised milk yield series showed better model selection and fitting, as well as more successful data prediction [27].

In Table 1, the observed value is the original milk yield value, and the sample counts are frequencies. The random division of each milk yield series into a training set and a verification set is described in Section 2.1.

The training set and the verification set were compared within the same prediction model. For the original milk yield series, the $R^2$ of the verification set was 0.19 higher (19 percentage points) than that of the training set, RMSE was 1.00 lower, and SSE was 2681.40 lower. For the denoised milk yield series, the $R^2$ of the verification set was 0.04 lower (4 percentage points) than that of the training set, RMSE was 0.42 higher, and SSE was 53.09 lower. The differences in $R^2$, RMSE, and SSE between the training set and the verification set were large for the original milk yield series, which suggested poor stability of the prediction model fitted to the original series; this model was also associated with the problem of overfitting.
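The gaps between the training set and the verification set quoted above follow directly from the reported indexes, as the short check below shows. The dictionary layout and the function name are illustrative only; the numbers are the $R^2$, RMSE, and SSE values reported in Section 3.3.

```python
# Reported (R^2, RMSE, SSE) for each series and dataset.
reported = {
    "original": {"training": (0.37, 4.35, 3827.68),
                 "verification": (0.56, 3.35, 1146.28)},
    "denoised": {"training": (0.90, 1.37, 379.90),
                 "verification": (0.86, 1.79, 326.81)},
}


def gaps(series):
    """Verification minus training for R^2, RMSE, and SSE, rounded to 2 places."""
    train = reported[series]["training"]
    verify = reported[series]["verification"]
    return tuple(round(v - t, 2) for v, t in zip(verify, train))


original_gap = gaps("original")
denoised_gap = gaps("denoised")
```

The original series gives differences of 0.19, -1.00, and -2681.40, while the denoised series gives -0.04, 0.42, and -53.09, so the fit to the denoised series changes far less between the two datasets.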

4. Conclusion

The original milk yield series of a dairy cow within a lactation period displays dynamic complexity. Applying EMD to reduce the noise of the original milk yield series before the prediction model is constructed removes the abnormal points in the series, yields the basic trend of milk yield within the lactation period together with the daily milk yield fluctuations at various scales, and restores the clear lactation features of individual cows. Subjecting the original milk yield series of individual cows to EMD denoising and then fitting a nonlinear model (such as BPNN) for prediction is a feasible strategy, which can not only improve the daily milk yield prediction performance but also enhance the stability of the prediction model.

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that they have no competing interests.

Authors’ Contributions

Zhiyong Cao and Zhijuan Cao contributed equally to this work as co-first authors.

Acknowledgments

This research was supported financially by the project “Major Science and Technology in Yunnan Province/2018ZF012” and the project “Soil Pollution Prevention and Control in Yunnan Province/201839# A3008784.”