Abstract

The security of network information in the Internet of Things faces enormous challenges. Traditional security defense mechanisms are passive and have certain loopholes. Intrusion detection can monitor network security and actively take corresponding measures. Neural network-based intrusion detection technology has adaptive capabilities, which allow it to adapt to complex network environments and provide a high intrusion detection rate. To address the vulnerability of the farmland Internet of Things to intrusion, we use a neural network to construct a farmland Internet of Things intrusion detection system that detects anomalous intrusion. In this study, the temperature collected by the IoT acquisition system is taken as the research object and is divided into different time granularities for feature analysis. We provide the detection standard for the data training and detection module by comparing the traditional ARIMA method with the neural network method. The results show that the temperature series carries abundant information. In addition, the neural network predicts the temperature sequence at varying time granularities better and keeps the prediction error small. It provides the testing standard for the construction of an intrusion detection system for the Internet of Things.

1. Introduction

The big data of agricultural production is based on the continuous observation of the environmental elements of the farmland. It integrates massive multisource and multiscale information [1, 2]. Relying on the perception terminals of the Internet of Things (hereafter IoT) to collect farmland environmental information has become widespread [3-6]. The IoT sensor terminal integrates various sensors, such as meteorology, water and salt, soil, and groundwater sensors, and combines ground and air sensor clusters to collect and transmit all kinds of data in real time. Sensor nodes in the Internet of Things are usually distributed in unattended environments, which makes them vulnerable to external malicious attacks and places high security requirements on the nodes. In terms of attack mode and intrusion behavior, attacks mainly disrupt the normal routing and forwarding of nodes and consume node resources [7-9]. Although the existing intrusion detection technology for wireless sensor networks can resist system attacks to a great extent, it also has shortcomings [6], such as a high false alarm rate, unstable detection speed, and an attack feature library that is not updated in time. With the development of artificial intelligence, neural networks have attracted much attention because of their ability to learn by themselves and to search for optimal solutions at high speed. Using the principles and technology of neural networks to realize intrusion detection has become a new direction in the development of intrusion detection technology in recent years. It emulates the theory and method of the biological information processing mode to obtain an intelligent information processing function [10]. The intrusion detection system based on a neural network belongs to the category of abnormal intrusion detection and includes a data acquisition module, a data training and detection module, and a response module. 
The core of the neural network-based intrusion detection system is the data training and detection module, which is the subject of this study. Prediction data are added to the data training and detection module. By using the better prediction method, accurate prediction of the collected data is realized [11], the characteristics of the collected information are extracted, the internal association rules of the collected information are mined, and the detection standard for subsequent accurate intrusion detection is provided.

At present, the prediction of farmland climate mainly involves indicators such as rainfall, humidity, wind speed, and soil temperature. Among them, Ashok Mishra ran the SWAP crop model for rice under two scenarios and realized rainfall forecasting; it was confirmed that accurate prediction of rainfall could save rice irrigation water [12]. I. Białobrzewski used neural network modeling and the STATISTICA method to predict relative air humidity and found that the neural network prediction results are more accurate [13]. To model and predict mean hourly wind speed, R.E. Abdel-Aal used GMDH-based abductive networks and verified that abductive network predictions have better predictive effects than neural networks [14]. Z. Gao et al. used the revised force-restore method to predict soil temperatures in naturally occurring nonuniform soil [15]. In summary, most temperature prediction is aimed at atmospheric temperature, but the general climate and the field microclimate have different climatic characteristics. Agricultural microclimate research is of great significance to the development of agricultural production, and farmland temperature is critical to crop production. Therefore, the temperature of the agricultural microclimate is taken as the research variable [16]. Moreover, a suitable forecast model is chosen to predict the farmland temperature and provide data guidance for agricultural production.

Although there are many studies on the prediction of atmospheric temperature, most of them predict the average annual temperature, the monthly average temperature, or the average daily temperature. Time has a significant influence on the prediction results, so the time factor should be taken into account in the prediction [17]. Regarding atmospheric temperature prediction, Changjun used the Winters method to predict the average monthly temperature from June to August in summer [18]. Zhang Yingchun used an artificial neural network learning algorithm to predict monthly average temperature data in the Karamay Desert [19]. B. Ustaoglu used three artificial neural network algorithms (RBF, FFBP, and GRNN) to predict the daily average, maximum, and minimum temperature series [20]. For the prediction of greenhouse temperature, Zuo Zhiyu and others established an ARMA 1-step prediction model using the time series analysis method and predicted the greenhouse temperature for the next period with an acquisition time unit of 30 minutes [21]. Zhang Xiaodan used a parameter-optimized support vector machine to model and predict the daytime temperature sequence in the greenhouse, with a data time interval of one hour [22]. Huihui Yu et al. used an improved PSO to optimize the LSSVM to predict the temperature series collected in a solar greenhouse; a temperature sequence with a time granularity of 6 hours was predicted by contrasting different methods [23]. It can be seen that most forecasts of climate temperature use monthly and daily forecasting units, and the time granularity of greenhouse temperature forecasts is mostly in units of hours. However, the greenhouse temperature is controlled more manually than the temperature of the farmland microclimate. 
Therefore, the temperature of farmland microclimate is taken as the research object, the temperature time series data is organized, different time granularities are divided, and the trend characteristics are analyzed. The traditional time series analysis method and the neural network prediction method are used to predict the different time granularity, respectively. Based on the feature extraction and prediction of the collected data, we construct the association rule base of intrusion detection and update the rule base of the detected intrusion information and achieve the goal of dynamic learning.

2. Method

The Internet of Things makes all kinds of integrated embedded sensors work together to monitor, perceive, and collect information about various environments or monitored objects in real time. The embedded system analyzes the data, and through adaptive wireless network communication, the collection and perception of various signals in the physical world are realized. However, because many sensors are distributed in relatively open and unsupervised places, they are easily attacked from outside. Therefore, the security of the Internet of Things is becoming an important research direction. The power supply, communication ability, computation, and storage of Internet of Things nodes are all limited. In this case, how to establish an effective security system, detect all kinds of intrusions and malicious attacks, and ensure the reliability of the Internet of Things is particularly important [24]. From the point of view of security technology, the technologies for the security of the Internet of Things include authentication technology to ensure its own security, key establishment and distribution mechanisms to ensure secure transmission, and data encryption to ensure the security of the data itself [25]. These technologies are passive precautions and cannot detect intrusions actively. Intrusion detection based on Internet of Things security technology is a proactive defense technology. By monitoring the state, behavior, and usage of the whole network and system, the intrusion detection system detects unauthorized use by system users and attempts by external invaders to intrude into the network or system. It can not only identify intrusion from the outside but also monitor the illegal behavior of internal users [26]. Zhang Jianfeng et al. have carried out a series of discussions on the intrusion detection technology of WSN and introduced the application of neural networks to intrusion detection technology in the Internet of Things [27]. By dividing the intrusion detection system into different modules, the neural network is applied to each module to realize intelligent and dynamic detection. Through feature extraction and prediction of the collected data, the predefined data set rules and attack data set rules are used to train the neural network module, which provides a dynamic rule base for the intrusion detection system.
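As a hedged illustration of how a temperature prediction could serve as a detection standard, the following minimal sketch flags readings whose deviation from the predicted value exceeds a tolerance. The function name `flag_anomalies`, the tolerance, and the sample readings are illustrative assumptions and are not taken from the system described here.

```python
def flag_anomalies(observed, predicted, tolerance):
    """Return indices where |observed - predicted| exceeds tolerance.

    Hypothetical helper: the tolerance (in degrees C) is an assumed
    parameter that would be tuned from the model's prediction error.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have equal length")
    return [
        i
        for i, (obs, pred) in enumerate(zip(observed, predicted))
        if abs(obs - pred) > tolerance
    ]


# Made-up readings: the third value deviates strongly from the forecast.
observed = [20.1, 20.6, 35.0, 21.2]
predicted = [20.0, 20.5, 21.0, 21.3]
suspicious = flag_anomalies(observed, predicted, tolerance=2.0)
```

In such a scheme, flagged indices would only be candidates for further inspection, since a large residual may also come from a sensor fault or an unusual weather event.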

Before analyzing and forecasting meteorological data, it is necessary to examine the characteristics of the sequence and grasp how the data change. The main feature analyses for meteorological data are seasonal analysis and periodic analysis. Seasonal analysis examines the climate differences between seasons, commonly through the amplitude of climatic elements, where a large amplitude indicates strong seasonality. Through seasonal analysis, we can understand the seasonal changes of the data and perform seasonal decomposition as needed. Periodic analysis explores whether a variable shows a regular pattern of change over time [28]. Periodic patterns range from relatively long timescales, such as annual and seasonal cycles, to shorter quarterly and weekly cycles and even shorter daily and hourly cycles. The object of this study is the temperature sequence. Feature analysis of the series helps to understand its variation and can also be used to distinguish the different prediction time granularities.

Temperature series are time series data, and the commonly used methods of analyzing time series data are divided into traditional time series prediction models and data-driven time series forecasting models [29]. The conventional time series prediction models mainly include ARMA (AR, MA), ARIMA, and the improved time series models Threshold Auto-Regressive (TAR), Vector Auto-Regression (VAR), Auto-Regressive Conditional Heteroscedasticity (ARCH), and Generalized Auto-Regressive Conditional Heteroscedasticity (GARCH). The ARMA model requires the series to be autocorrelated, with an autocorrelation coefficient higher than 0.5, and it can only be used to predict economic phenomena that are related to their own earlier values. The main problem of the TAR model for meteorological time series prediction is that it requires a lot of complicated optimization work in the modeling process [30]. The VAR model can be viewed as a multivariate extension of the AR model; using it requires eliminating the periodic nonstationarity of the variables [31]. ARCH and GARCH are stochastic processes that describe how the variance of random variables changes over time, but not all time series data exhibit heteroskedasticity [32, 33]. The ARIMA model only requires endogenous variables without resorting to other exogenous variables. Data-driven time series prediction methods include chaotic, gray, fuzzy logic, neural network, and SVM time series forecasting. When selecting a chaotic time series forecasting model, the specific characteristics of the time series should be analyzed to grasp the nature of the chaotic precursors. Gray prediction still needs improvement regarding the gray measure, sequence operator, correlation measure, residual error correction, etc. Fuzzy time series has problems with the quantitative level of fuzzy inference, prediction accuracy, and dependence on problem-specific prior knowledge. SVM is challenging to apply to large-scale training samples, and its speed of operation needs to be improved. By contrast, the neural network has better nonlinear mapping ability, generalization ability, and fault tolerance. Based on the above analysis, this experiment selected the ARIMA model from traditional time series analysis and the data-driven neural network model to predict the farmland temperature.

ARIMA. The ARIMA model only needs endogenous variables and does not need other exogenous variables. It requires the time series data to be stationary. Moreover, the model can in essence capture only linear relationships and cannot capture nonlinear relationships.

Step 1. Test the stationarity of the original sequence. If the p-value of the stationarity test is greater than 0.05, the sequence is treated as nonstationary and is differenced, and the test is repeated on the differenced sequence. If the differenced sequence is stationary, the original sequence is first-order difference stationary. If it is still nonstationary, a second-order difference is taken and tested; at most two orders of differencing are applied. If the second-order differenced sequence is still nonstationary, the sequence is regarded as nonstationary and is not suitable for the next prediction step.

Step 2. According to the identification rules of time series models, the corresponding model is established. If the partial autocorrelation function of the stationary sequence is truncated and the autocorrelation function is tailed, the sequence is suitable for an AR model; if the partial autocorrelation function is tailed and the autocorrelation function is truncated, the sequence is suitable for an MA model; if both the partial autocorrelation function and the autocorrelation function are tailed, the sequence is suitable for an ARMA model. (Truncation is the property that the autocorrelation function (ACF) or partial autocorrelation function (PACF) of a time series is zero after a certain order (such as the PACF of AR); tailing is the property that the ACF or PACF is not zero after a certain order (such as the ACF of AR).)

Step 3. Carry out parameter estimation and test whether it has statistical significance.

Step 4. The hypothesis test is used to diagnose whether the residual sequence is white noise.

Step 5. Predictive analysis was performed using the tested models.
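The differencing in Step 1 and the ACF/PACF inspection in Step 2 can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the PACF is computed with the Durbin-Levinson recursion (a standard choice, not necessarily the one used in this study), and the stationarity test itself is not implemented here.

```python
def difference(series, order=1):
    """Apply differencing `order` times: y[t] = x[t] - x[t-1]."""
    out = list(series)
    for _ in range(order):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag (series must not be constant)."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / denom
        for k in range(max_lag + 1)
    ]


def pacf(series, max_lag):
    """Sample partial autocorrelation via the Durbin-Levinson recursion."""
    rho = acf(series, max_lag)
    result = [1.0]
    phi_prev = []  # coefficients of the AR(k-1) fit from the previous step
    for k in range(1, max_lag + 1):
        num = rho[k] - sum(phi_prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        phi_new = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)]
        phi_new.append(phi_kk)
        result.append(phi_kk)
        phi_prev = phi_new
    return result
```

A PACF that drops to near zero after lag p while the ACF decays slowly would point to an AR(p) model, following the identification rule in Step 2.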

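Levenberg-Marquardt Algorithm. The Levenberg-Marquardt algorithm is the most widely used nonlinear least squares algorithm [34]. It uses gradient information to find the maximum (or minimum) of a function. For the functional relation $x = f(p)$, given $f(\cdot)$ and a noise-containing observation vector $x$, the goal of the algorithm is to estimate $p$. The calculation steps are as follows.

Step 1. Take the initial point $p_0$ and the termination control constant $\varepsilon$, and calculate $\varepsilon_0 = \lVert x - f(p_0) \rVert$. Set $k = 0$ and choose an initial damping factor $\lambda_0 > 0$ and a scaling factor $v > 1$.

Step 2. Calculate the Jacobi matrix $J_k$, calculate $N_k = J_k^{T} J_k + \lambda_k I$, and construct the incremental normal equation $N_k \delta_k = J_k^{T} \left( x - f(p_k) \right)$.

Step 3. Solve the incremental normal equation to obtain $\delta_k$. (1) If $\lVert x - f(p_k + \delta_k) \rVert < \varepsilon_k$, where $\varepsilon_k = \lVert x - f(p_k) \rVert$, let $p_{k+1} = p_k + \delta_k$; if $\lVert \delta_k \rVert < \varepsilon$, stop the iteration and output the result; otherwise let $\lambda_{k+1} = \lambda_k / v$ and go to Step 2. (2) If $\lVert x - f(p_k + \delta_k) \rVert \geq \varepsilon_k$, let $\lambda_{k+1} = v \lambda_k$, re-solve the normal equation to obtain $\delta_k$, and return to (1).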
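The iteration can be sketched in plain Python as below. This is a minimal illustration, not the implementation used in the experiments: the helper names, the default damping factor `lam=1e-3`, the scaling factor `v=10.0`, and the sinusoidal test model with an assumed 365-day period are all assumptions made for the example.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def levenberg_marquardt(f, jac, p0, xs, ys, lam=1e-3, v=10.0, eps=1e-10, max_iter=200):
    """Estimate p so that f(x, p) approximates ys in the least-squares sense."""
    p = list(p0)
    n = len(p)

    def residuals(params):
        return [y - f(x, params) for x, y in zip(xs, ys)]

    def sq_norm(r):
        return sum(e * e for e in r)

    r = residuals(p)
    err = sq_norm(r)
    for _ in range(max_iter):
        rows = [jac(x, p) for x in xs]  # Jacobian, one row per sample
        jtj = [[sum(row[i] * row[j] for row in rows) for j in range(n)] for i in range(n)]
        jtr = [sum(row[i] * e for row, e in zip(rows, r)) for i in range(n)]
        while True:
            # Damped normal equation: (J^T J + lam I) delta = J^T r
            damped = [[jtj[i][j] + (lam if i == j else 0.0) for j in range(n)] for i in range(n)]
            delta = solve_linear(damped, jtr)
            candidate = [pi + di for pi, di in zip(p, delta)]
            new_r = residuals(candidate)
            new_err = sq_norm(new_r)
            if new_err < err:
                break
            lam *= v  # step rejected: increase damping and re-solve
            if lam > 1e12:
                return p  # no further improvement is possible
        p, r, err = candidate, new_r, new_err
        lam /= v  # step accepted: decrease damping
        if math.sqrt(sum(d * d for d in delta)) < eps:
            break
    return p


# Assumed example: a seasonal curve a*sin(w*x + phase) + offset, period 365 days.
W = 2.0 * math.pi / 365.0


def model(x, p):
    return p[0] * math.sin(W * x + p[1]) + p[2]


def model_jac(x, p):
    return [math.sin(W * x + p[1]), p[0] * math.cos(W * x + p[1]), 1.0]


xs = list(range(0, 365, 5))
ys = [model(x, [10.0, 0.5, 15.0]) for x in xs]  # synthetic, noise-free data
fit = levenberg_marquardt(model, model_jac, [8.0, 0.0, 12.0], xs, ys)
```

The sketch compares squared residual norms rather than norms, which gives the same accept or reject decision as Step 3.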

3. Materials

This section introduces the primary data sources and the temperature data collected by the automatic acquisition equipment of the Internet of Things, and describes how the data are preprocessed and analyzed.

Monitoring Data. The real-time sensing system for dynamic farmland information based on the Internet of Things addresses significant real-time problems, such as real-time dynamic detection of salt, alkali, and water, rapid self-diagnosis of equipment faults, and online automatic real-time warning. The information database realizes data receiving, cleaning, storage, integration, and sharing, effectively improves the authenticity and reliability of the collected data, and provides a useful data service foundation for subsequent data mining and precision agriculture [35]. The data are divided into two groups. The first group is air temperature data from the Dongying city meteorological station, sampled every 3 hours. We selected the data from 2014-2017, a total of 11,680 records. The second group was acquired by IoT equipment in Dongying and is collected once per hour. We chose the data for the whole year of 2016, a total of 8,784 records.

Data Feature Analysis. Before data analysis, the two groups of data are preprocessed to fill in missing values, smooth noisy data, identify and delete outliers, resolve inconsistencies, and eliminate duplicate data. The data are then transformed into a form suitable for data mining by smoothing and normalization.
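Two of these preprocessing steps can be sketched as follows. The specific techniques shown, linear interpolation for missing values and min-max scaling for normalization, are assumptions chosen for illustration; the text does not state which methods were actually used, and the function names are hypothetical.

```python
def fill_missing(values):
    """Fill None gaps by linear interpolation between known neighbours.

    Leading or trailing gaps are filled with the nearest known value.
    """
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("series has no known values")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = values[left] + weight * (values[right] - values[left])
    return filled


def min_max_normalize(values):
    """Scale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant series: map everything to 0
    return [(v - lo) / (hi - lo) for v in values]
```

When a model is trained on normalized data, the same minimum and maximum would need to be kept so that predictions can be mapped back to degrees Celsius.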

Annual Statistical Variation. Four-year overall air temperature changes are obtained by plotting the farmland air temperature for 2014-2017, as shown in Figure 1.

It can be seen from Figure 1 that the curve of farmland air temperature in the region is similar to a sinusoidal curve. Within a year, the air temperature first increases and then decreases. The highest temperatures occur every year in the three months from June to August, and the lowest temperatures occur in January, February, and December.

To compare the gap between the daily maximum and minimum temperatures, we selected the daily maximum and minimum temperatures of the month with the most substantial temperature change, September (0.528), the month with the smallest change, July (0.270), and two months with moderate change, April (0.421) and December (0.393), as the objects of study. Their changing trends are shown in Figure 2.

The gaps between the daily maximum and minimum temperatures in July are smaller than those in September, which are evenly distributed. The highest and lowest temperature curves in December are distributed around 0°C, and the temperature gap is smallest on the 22nd day.

Diurnal Temperature Change. The daily variation data for each month were obtained by the statistical mean of 24 hours of daily temperature in each month.
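The hourly averaging described above can be sketched as a simple group-by. The function name `mean_by_hour` and the `(hour, temperature)` input layout are assumptions for illustration; in practice the readings of one month would be passed in at a time.

```python
from collections import defaultdict


def mean_by_hour(readings):
    """Average temperature per hour of day.

    `readings` is an iterable of (hour_of_day, temperature) pairs.
    """
    buckets = defaultdict(list)
    for hour, temp in readings:
        buckets[hour].append(temp)
    return {hour: sum(temps) / len(temps) for hour, temps in sorted(buckets.items())}
```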

By analyzing Figure 3, we find that the daily temperature trends are similar across months: the temperature is highest between 11:00 and 16:00 and lowest between 22:00 and 04:00 the next day. The mean daily temperature was lowest in January and highest in July. The temperature difference between day and night was largest in September and smallest in July.

Stationarity Test of Air Temperature Time Series. Before ARIMA predictive modeling, we need to test the stationarity of the data. The stationarity test results for the 2014-2017 air temperature time series are given in Table 1. The time series is first-order difference stationary, that is, the temperature time series is stationary after first-order differencing.

Fitting Results and Analysis. The fitting of the average daily temperature in 2015 and 2016 can be obtained as follows.

The fitting curve for the two years is a sine function. Its coefficients (with 95% confidence bounds) are shown in Table 2 and Figure 4.

The mean squared error of the fit is 3.201 and $R^2$ is 0.9067. The series shows no discernible trend in the short term but has a certain periodicity. Traditional time series forecasting may therefore not achieve the ideal effect, whereas the neural network has good learning ability and has some advantages in predicting such a sequence.

4. Modeling

Using ARIMA model and the L-M algorithm model, respectively, for modeling, we obtain the experimental results.

4.1. Data Group

The two groups of temperature data with different time granularity were each divided into a training set, a verification set, and a test set in the proportion 0.7:0.15:0.15. First, the average monthly temperature for 2014-2017 is modeled and predicted, and the effect of time granularity on temperature prediction [36] is analyzed. Then the temperature data with a time granularity of 3 hours are modeled and predicted. We selected the temperature data from 2017 to verify the model and choose the best model. Finally, the data for the last week of 2016 were forecast at a time granularity of 1 hour.
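The 0.7:0.15:0.15 division can be sketched as below. The split shown is chronological (earliest records for training, latest for testing), which is an assumption; the text does not say whether the sets were taken in time order or sampled at random. The function name is hypothetical.

```python
def chronological_split(series, train=0.7, val=0.15):
    """Split a series into training, verification, and test parts in time order.

    The test part receives whatever remains after the first two parts.
    """
    n = len(series)
    n_train = int(n * train)
    n_val = int(n * val)
    return (
        series[:n_train],
        series[n_train:n_train + n_val],
        series[n_train + n_val:],
    )


# Example with the size of the hourly 2016 data set (8,784 records).
hourly = list(range(8784))
train_set, val_set, test_set = chronological_split(hourly)
```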

4.2. Model Construction

Two forecasting methods were used in the experiment to predict the farmland temperature. The ARIMA prediction experiment first carries out the stationarity test; once the time series is stationary, the model order is determined and the forecast is carried out. Before using the neural network for prediction, the data are cleaned, the input and output variables are formatted, and the data set is divided according to the stated proportions. The network is then trained and optimized iteratively until the best state is reached. The execution steps of the two prediction methods are shown in Figures 5 and 6.

Figures 5 and 6 show the prediction process of the two modeling methods. The two forecasting methods place different requirements on the data: the ARIMA model requires stationary data, whereas the neural network does not. The critical steps of the ARIMA model are to make the data stationary and to determine the order of the model, while the key to the neural network lies in optimizing the model.

4.3. Results

The model training is realized by the Levenberg-Marquardt algorithm. The network was trained according to the sample input vector, the target vector, and the preset numbers of hidden layer nodes and delays. The error autocorrelation was used to judge whether the trained network is optimal, and the model was continuously optimized until the autocorrelation coefficients of the error reached the optimal range.

By setting the time granularity for months, we forecast the monthly average temperature. The results are as shown in Figure 7.

The MSE diagram shows the variation of the mean square error of the training, validation, and test data over the training epochs. The overall trends of the three curves are similar. The best state is reached at the sixth epoch, at which the mean square error of the test data is minimized. In the training state graph, MU first dropped and then rose, then fell again and remained constant, which indicates that the model had reached its optimal state. The regression diagram describes the regression of the three datasets. Most of the data lie near the diagonal, indicating that the regression works well.

The upper half of Figure 8(a) is the response of the output element to the time series, and the lower half is the output error, whose range is (-5, 5), which indicates that the error is small. In Figure 8(b), except for the zero-order autocorrelation, the correlation coefficients should ideally not exceed the upper and lower confidence bounds. Some coefficients lie outside the confidence interval, which indicates that the prediction results are not ideal; the reason is that the amount of data is relatively small.

Because of the small amount of data, Figure 9 shows that the model learning effect is only moderate. The L-M predictions differ slightly from the real values, but the trend is the same, which is consistent with the error and error autocorrelation results in Figure 8. The amount of data affects the prediction accuracy of the model. The black line shows the data predicted by the traditional time series forecasting method ARIMA. The RMSE, MAE, MPE, and MASE of the fitted ARIMA model are 1.588801, 1.051737, -86.78105, and 0.3993068, respectively. The trend of the ARIMA predictions is the same as that of the real values, but the numerical difference is significant: the average difference is 6.537237°C, whereas that of the L-M neural network is 0.548778°C.

Daily Sequence Prediction. The time granularity is set to the day, using the data collected every three hours. The prediction results are shown in Figure 10.

Figure 10(a) shows the MSE of the three datasets over 15 training epochs; the MSE is best at about the ninth epoch. The first curve of Figure 10(b) shows that the training gradient decreases. When the value of MU no longer changes, the model training has reached its best state and training should stop; otherwise, overfitting will occur and degrade the prediction. The third plot is the validation check of the neural network, which is mainly used to observe the effect of network evolution.

The target and output sequences of the three datasets are all distributed sinusoidally. The error diagram shows that the time series error is small and distributed around 0. Figure 11(b) shows that the error has a high zero-order autocorrelation, while the other autocorrelation values are small and mostly lie within the upper and lower confidence bounds.

Comparing the predicted values of the different methods with the real values, as shown in Figure 12, the sequences predicted by the ARIMA method are not accurate, indicating that ARIMA cannot achieve good results in predicting temperature time series.

Performance Evaluation. Two evaluation indexes are used to measure the accuracy of the model: mean square error (MSE) and $R^2$. The accuracy of the ARIMA model is measured by the mean absolute scaled error (MASE). The MSE is the expectation of the square of the difference between the estimate and the true value; it evaluates the degree of change of the data, and a smaller MSE shows that the prediction model describes the experimental data more accurately. $R^2$ is similar to MSE, but the difference is that $R^2$ compares the trend of the predicted value with the actual value. An $R^2$ close to 1 indicates that the linear relationship between $\hat{y}_i$ and $y_i$ is very close. The corresponding calculation formulas are (1) and (2).

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2 \tag{1}$$

$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2} \tag{2}$$

In formula (1) and formula (2), $\hat{y}_i$ is the predictive value, $y_i$ is the real value, $\bar{y}$ is the mean of the real values, and $n$ is the sample capacity; together, MSE and $R^2$ evaluate the performance of the model. Besides, in formula (1), the numerator is the sum of squares of errors, and $n$ is the degree of freedom. The evaluation index of the model is calculated as shown in Table 3.
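The two indexes can be computed as in the following minimal sketch. It assumes that the second index is the coefficient of determination, computed as one minus the ratio of the residual sum of squares to the total sum of squares; the function names and sample values are illustrative only.

```python
def mse(actual, predicted):
    """Mean square error: sum of squared errors divided by the sample size."""
    n = len(actual)
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n


def r_squared(actual, predicted):
    """Coefficient of determination (assumed form of the second index)."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up values for illustration only.
actual = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 4.0]
example_mse = mse(actual, predicted)
example_r2 = r_squared(actual, predicted)
```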

In Table 3, the MSE for the monthly mean temperature is larger than the MSE for the daily temperature, because fewer data are available for the monthly mean temperature. The $R^2$ values from the two experiments indicate that the fit is good and that the prediction error is small relative to the real values.

Model Application. To verify the accuracy of the temperature prediction model, we selected the air temperature data collected in the project area Dongying in 2016.

According to Figure 13, it can be seen that the neural network algorithm has high accuracy and consistency.

By calculating the MSE and $R^2$ of the model again, we obtained Table 4. The MSE values in the table are all less than 1.8, and the $R^2$ values are above 94%, which indicates that the error of the model prediction is small and the degree of fit to the verification data is high.

4.4. Summary

When the amount of data is small, the predictions of both models differ considerably from the real values, but the L-M model is better than ARIMA. The prediction effect of the ARIMA model is weak when the data volume is large and the series has no long-term trend.

5. Summary and Prospect

The security of wireless sensor networks has always been a focus of attention. Conflicts between security protection measures and attack modes in the Internet of Things will continue to emerge. Therefore, applying new intelligent models to intrusion detection is one of the key points of its security research.

In this study, we investigate the application of neural networks in intrusion detection systems. By modeling and analyzing the real-time data collected by the Internet of Things terminal, we constructed the intrusion detection rule base. The main research work consists of the following two points.

(1) The neural network is applied to the intrusion detection system, which makes full use of the self-organization and self-learning ability of the neural network. This compensates for the lack of active protection in the security technology of the Internet of Things.

(2) Taking temperature data as an example, we study the accuracy and efficiency of the neural network and the traditional ARIMA model in predicting this type of data. The study provides a reference for introducing prediction research into intrusion detection.

Through the research and discussion of the intrusion detection system, we propose a network intrusion detection system based on a neural network. On the premise of guaranteeing the security and reliability of the system, the system fully considers the intelligent characteristics of data acquisition and intrusion detection nodes in the Internet of Things. Using the intelligent perception ability of intrusion detection nodes in the Internet of Things (IoT), we synthesize intrusion detection and data prediction and provide a new scheme for the construction of the IoT security system.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work has been funded by Shandong’s independent innovation achievements transformation project (2014ZZCX07106).