Abstract

Research on stealthiness has become an important topic in the field of data integrity (DI) attacks. To construct stealthy DI attacks, a common assumption in most related studies is that attackers have prior model knowledge of physical systems. In this paper, this assumption is relaxed and a covert agent is proposed based on the least squares support vector regression (LSSVR). By estimating a plant model from control and sensory data, the LSSVR-based covert agent can closely imitate the behavior of the physical plant. The covert agent is then used to construct a covert loop, which can keep both the controller's input and output stealthy over a finite time window. Experiments have been carried out to show the effectiveness of the proposed method.

1. Introduction

Industrial control systems (ICSs) are widely deployed in modern critical infrastructures (CIs), and their incapacitation can cause serious damage to equipment, environment, or even people’s lives [1]. During the past ten years, many efforts have been made to improve the security of ICSs [2, 3]. Among the existing research on ICSs security, a great deal of attention has been given to the study of stealthy data integrity (DI) attacks [4, 5], which can violate the integrity of control and sensory data. The purpose of such attacks is to disrupt the physical process while remaining stealthy with respect to anomaly detectors [6].

To construct stealthy DI attacks, a common assumption in most related studies is that attackers have prior model knowledge of physical systems. Kwon et al. [7] investigated three kinds of stealthy deception attacks on a linear time-invariant system with Gaussian noise. Their results showed that if an attacker had perfect model knowledge of the target system, he could carefully design a stealthy attack to avoid being detected by the monitoring system. Pang et al. [8] proposed stealthy false data injection (FDI) attacks for both the feedback and forward channels of networked control systems. It was assumed that the attacker knew the detailed system parameters. Such an assumption can also be found in the recent work of Teixeira et al. [9], Sedghi and Jonckheere [10], Manandhar et al. [11], and Dutta and Langbort [12]. In particular, in [9], the authors also considered a more moderate scenario where the attacker's model knowledge contains some uncertainties. In [13], the authors presented a covert agent structure and showed that the better the covert agent's model of the plant, the easier it was for the covert agent to hide its actions.

Besides perfect model knowledge of physical systems, a stronger assumption is that attackers also have other model knowledge of target systems. Cárdenas et al. [14] studied three types of stealthy attacks that aimed at raising the pressure in a tank without being detected. The powerful attacker was assumed to have prior knowledge of the exact plant model and the anomaly detection scheme. In the work of Teixeira et al. [15], the model knowledge was divided into three categories: the model of the physical system, the model of the feedback controller, and the model of the anomaly detector. Attacks constrained by different levels of prior model knowledge were illustrated by experiments on a quadruple-tank process control testbed. In [16], the authors considered a stronger adversary who not only knew the physical model and the detection scheme, but also could adapt to different detection thresholds.

As discussed before, most prior works on stealthy DI attacks are based on various assumptions that attackers have model knowledge of target systems at different levels. However, there is no description of how such model knowledge can be obtained by an attacker. Although the assumptions of model knowledge are very useful for identifying subtle and stealthy malicious attacks, it may be difficult to acquire such prior knowledge in many practical scenarios, where explicit models of physical systems are usually not available directly [17].

Recently, increasing attention has been paid to stealthy DI attacks without prior model knowledge of physical systems. Unlike the studies discussed before, Yu and Chin [18] proposed a principal component analysis (PCA) based method to design blind FDI attacks, which did not need any prior knowledge of the Jacobian matrix in the smart grid. Furthermore, Anwar and Mahmood [19] clarified that the PCA based blind attack strategy was only valid for measurements with Gaussian noise. In the case of gross errors, they proposed the accelerated proximal gradient (APG) method to circumvent the gross error issue and construct stealthy attacks. Most recently in [20], the authors proposed a sparse optimization based strategy for constructing stealthy attacks and demonstrated how FDI attacks could be constructed blindly, that is, without the system model knowledge. However, these three studies were closely tied to the smart grid, and the proposed methods were designed for the approximation of the Jacobian matrix.

In the framework of a general dynamic cyber-physical system (CPS), Yuan and Mo [21] applied the classical system identification technique to the construction of stealthy attacks. The spectral factorization based method was used to identify the transfer function of the physical system by observing the input-output data from the system. Furthermore, they proved a necessary condition and a sufficient condition, under which the perfect model of the system could be successfully identified. However, such conditions are overly restrictive for widespread applications. In fact, it is more realistic to consider that the identified model of the system is not perfect. That is, there is a model error between the identified model and the real system model. Motivated by this consideration, we explore the possibility that an attacker can carry out stealthy DI attacks on the ICS by identifying an imperfect model of the system.

The most similar work to ours is the recent study of Kim et al. [22], where a subspace estimation method was used to estimate a system operating subspace from sensor measurements. Based on the subspace information, stealthy attacks could be constructed without the need of prior system model knowledge. As shown in Figure 1(a), the unobservable attack is launched by adding a corresponding perturbation to the sensor data, and the modified sensor data can avoid being detected by the anomaly detector. However, because the ultimate objective of the attack is to disrupt the system’s behavior, the controller’s output will be abnormal. Another similar case is the replay attack, which also does not require any prior knowledge. It gathers sequences of data for a certain amount of time and afterwards just repeats the recorded data. Teixeira et al. [15] introduced an interesting instance of this attack scenario which consists of applying a physical attack to the plant while using the replay attack to render the physical attack stealthy. However, the replay attack on the sensor data could also cause anomalies in the controller’s output, and this point will be revealed later in our experiments.

Our goal is to design a covert agent to keep the controller’s input and output both stealthy over a finite time window. To this end, we propose a function estimation based covert agent as shown in Figure 1(b). The proposed covert agent can be used to construct a two-loop covert structure in Figure 1(c), which consists of two loops: the covert loop and the attack loop. In comparison, Figure 1(d) shows a typical structure of the prior model knowledge based covert attack [13]. The core idea of such structure is to calculate the attack effect on the plant output measurements and subtract the effect from the measured plant output. By contrast, in the two-loop covert structure, the covert loop covers up the effect of the real attack on the physical plant by closely imitating the expected behavior of the physical plant over a finite time window. For the sake of concentrating on the stealthiness, this paper will be restricted to the construction of the covert loop and will not deal with the attack loop.

The main contribution of this paper is the exploratory attempt to establish the feasibility of machine learning based stealthy DI attacks. In this paper, we use the least squares support vector regression (LSSVR) to demonstrate that point. The LSSVR has emerged as a popular data-driven modeling method, and it has uniform approximation ability for any complex nonlinear system [23]. As far as we know, there is no LSSVR-based DI attack reported in the literature. Overall, the contributions of this work are threefold. First, we give a formal description of the LSSVR-based covert agent. Second, we present the procedure of how to train a covert agent model. Third, we provide a case study of a continuous stirred tank heater (CSTH) pilot plant to illustrate and demonstrate the effectiveness of the covert agent.

It is necessary to mention that the purpose of this work is not to facilitate stealthy attacks but to disclose the potential attacks, where the attackers do not need any prior model knowledge of physical systems, and to encourage the corresponding research of the defending methods. The rest of this paper is organized as follows. Section 2 introduces the LSSVR for function estimation. Section 3 gives the covert agent model and the procedure of training the model. Section 4 is an overview of the experiments, and the experimental results are presented in Section 5. Finally, conclusions and future work are summarized in Section 6.

2. Least Squares Support Vector Regression (LSSVR)

The least squares support vector machine (LSSVM) is a variant of the standard support vector machine (SVM) [24]. By changing the inequality constraints of the SVM into equality constraints, the LSSVM replaces the computationally expensive convex quadratic programming problem with a set of linear equations, which largely speeds up training. The LSSVM for regression is called LSSVR, which has been extended and applied to forecasting in many studies [25–27]. In this section, we briefly introduce the LSSVR for function estimation.

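Given a training set $\{(x_i, y_i)\}_{i=1}^{N}$, the regression function of LSSVR can be defined as follows:

$$f(x) = w^{T}\varphi(x) + b, \tag{1}$$

where $x \in \mathbb{R}^{d}$, $y \in \mathbb{R}$, and $\varphi(\cdot)$ is the mapping from the original feature space to the high dimensional feature space. $w$ is the coefficient vector and $b$ is a bias term. The optimization problem of LSSVR is given as follows:

$$\min_{w,\,b,\,e} \; J(w, e) = \frac{1}{2} w^{T} w + \frac{\gamma}{2} \sum_{i=1}^{N} e_i^{2} \quad \text{s.t.} \quad y_i = w^{T}\varphi(x_i) + b + e_i, \quad i = 1, \ldots, N, \tag{2}$$

where $\gamma$ is the regularization parameter and $e_i$ is the slack variable for $x_i$. The Lagrangian is constructed as follows:

$$L(w, b, e, \alpha) = J(w, e) - \sum_{i=1}^{N} \alpha_i \left( w^{T}\varphi(x_i) + b + e_i - y_i \right), \tag{3}$$

where $\alpha_i$ are the Lagrange multipliers. The conditions for optimality are

$$\begin{aligned}
\frac{\partial L}{\partial w} = 0 &\;\Rightarrow\; w = \sum_{i=1}^{N} \alpha_i \varphi(x_i), \\
\frac{\partial L}{\partial b} = 0 &\;\Rightarrow\; \sum_{i=1}^{N} \alpha_i = 0, \\
\frac{\partial L}{\partial e_i} = 0 &\;\Rightarrow\; \alpha_i = \gamma e_i, \\
\frac{\partial L}{\partial \alpha_i} = 0 &\;\Rightarrow\; w^{T}\varphi(x_i) + b + e_i - y_i = 0.
\end{aligned} \tag{4}$$

With the solution of

$$\begin{bmatrix} 0 & \mathbf{1}^{T} \\ \mathbf{1} & \Omega + \gamma^{-1} I \end{bmatrix} \begin{bmatrix} b \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ \mathbf{y} \end{bmatrix}, \tag{5}$$

where $\mathbf{y} = [y_1, \ldots, y_N]^{T}$, $\mathbf{1} = [1, \ldots, 1]^{T}$, $\alpha = [\alpha_1, \ldots, \alpha_N]^{T}$, and $\Omega_{ij} = \varphi(x_i)^{T}\varphi(x_j) = K(x_i, x_j)$, for $i, j = 1, \ldots, N$, the LSSVR model for function estimation is

$$f(x) = \sum_{i=1}^{N} \alpha_i K(x, x_i) + b. \tag{6}$$

The kernel function $K(\cdot, \cdot)$ is any symmetric function that satisfies Mercer's condition. In this study, the radial basis function (RBF) is used as the kernel function due to its strong nonlinear modeling ability. The RBF is formulated as follows:

$$K(x, x_i) = \exp\left( -\frac{\lVert x - x_i \rVert^{2}}{2\sigma^{2}} \right). \tag{7}$$

Using RBF kernels, the LSSVR has only two tuning parameters: the regularization parameter ($\gamma$) and the kernel function parameter ($\sigma$), which is fewer than the number of tuning parameters of the standard SVR.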
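To make the training step concrete, the following is a minimal sketch of LSSVR function estimation in plain Python. It builds the linear system obtained from the optimality conditions, solves it for the bias and the Lagrange multipliers, and evaluates the resulting kernel expansion. The function names, the toy data, and the parameter values are illustrative assumptions, as is the kernel width convention exp(-||a - b||^2 / (2 sigma^2)); the experiments in this paper use the LS-SVMlab toolbox rather than this code.

```python
import math


def rbf_kernel(a, b, sigma):
    """RBF kernel exp(-||a - b||^2 / (2 sigma^2)); the width convention is an assumption."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def solve_linear(matrix, rhs):
    """Solve matrix * z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[row][j] -= factor * aug[col][j]
    z = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][j] * z[j] for j in range(row + 1, n))
        z[row] = (aug[row][n] - tail) / aug[row][row]
    return z


def lssvr_fit(xs, ys, gamma, sigma):
    """Return (alpha, b) solving [[0, 1^T], [1, Omega + I / gamma]] [b; alpha] = [0; y]."""
    n = len(xs)
    system = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0] + [rbf_kernel(xs[i], xs[j], sigma) for j in range(n)]
        row[i + 1] += 1.0 / gamma  # regularization term on the diagonal
        system.append(row)
    solution = solve_linear(system, [0.0] + list(ys))
    return solution[1:], solution[0]


def lssvr_predict(x, xs, alpha, b, sigma):
    """Evaluate f(x) = sum_i alpha_i K(x, x_i) + b."""
    return sum(a * rbf_kernel(x, xi, sigma) for a, xi in zip(alpha, xs)) + b


# Toy data (assumption): noiseless samples of sin(x) on [0, 3].
train_x = [[0.1 * i] for i in range(31)]
train_y = [math.sin(x[0]) for x in train_x]
alpha, b = lssvr_fit(train_x, train_y, gamma=1000.0, sigma=0.5)
estimate = lssvr_predict([1.55], train_x, alpha, b, sigma=0.5)
```

Because training reduces to one dense linear solve, the cost grows roughly cubically with the number of samples, which is one reason a short recording window is convenient in this setting.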

3. Covert Agent Based on LSSVR

3.1. Covert Agent Model

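Suppose that the physical plant is a linear time-invariant (LTI) process, which is modeled in a discrete-time state-space form [28, 29]:

$$x(k+1) = A x(k) + B u(k), \qquad y(k) = C x(k) + v(k), \tag{8}$$

where $x(k) \in \mathbb{R}^{n}$ is the state variable, $u(k) \in \mathbb{R}^{p}$ is the control input, and $y(k) \in \mathbb{R}^{m}$ is the measurement vector. The measurement noise $v(k)$ is an independent Gaussian noise vector with zero mean and covariance $R$. The system operates in closed loop and the control input is given by the feedback controller:

$$u(k) = g(y(k)), \tag{9}$$

where $g(\cdot)$ is the controller function that makes the closed-loop system stable.

We now consider the case where the attacker can both capture and inject the data transmitted via the network (i.e., $u(k)$ and $y(k)$). The control and sensory data are recorded by the attacker to generate the training dataset, which is described by the following notation:
(i) $T = \{1, 2, \ldots, N\}$ is a set of sampling instants over a finite time window;
(ii) $Y = \{y(k) \mid k \in T\}$ is a dataset of output variables captured over the sampling time window $T$;
(iii) $U = \{u(k) \mid k \in T\}$ is a dataset of input variables captured over the sampling time window $T$;
(iv) $y(k) = [y_1(k), \ldots, y_m(k)]^{T}$ is a data record of output variables at the $k$th time instant;
(v) $u(k) = [u_1(k), \ldots, u_p(k)]^{T}$ is a data record of input variables at the $k$th time instant;
(vi) $y_i(k)$ denotes the value of the $i$th output variable at the $k$th time instant;
(vii) $u_j(k)$ denotes the value of the $j$th input variable at the $k$th time instant;
(viii) $S_y = \{y_1, \ldots, y_m\}$ is a set of output variables of the physical plant;
(ix) $S_u = \{u_1, \ldots, u_p\}$ is a set of input variables of the physical plant.

From the system model in (8), we have

$$y(k+1) = C A x(k) + C B u(k) + v(k+1). \tag{10}$$

If $C$ is nonsingular, then $x(k) = C^{-1}\left( y(k) - v(k) \right)$ and we can obtain

$$y(k+1) = C A C^{-1} \left( y(k) - v(k) \right) + C B u(k) + v(k+1). \tag{11}$$

In order to reduce the effect of Gaussian noise, a wavelet filter $\mathcal{W}(\cdot)$ is applied to the data. The filtered data are given by

$$\hat{y}(k) = \mathcal{W}(y(k)), \tag{12}$$

and the estimated noises are

$$\hat{v}(k) = y(k) - \hat{y}(k). \tag{13}$$

Based on (12) and (13), (11) can be rewritten as

$$\hat{y}(k+1) = C A C^{-1} \hat{y}(k) + C B u(k) + \varepsilon(k), \tag{14}$$

where

$$\varepsilon(k) = C A C^{-1} \left( \hat{v}(k) - v(k) \right) + \left( v(k+1) - \hat{v}(k+1) \right). \tag{15}$$

Assume that the signal noise can be well filtered by $\mathcal{W}(\cdot)$ and the error $\varepsilon(k)$ can be ignored. Then, (14) changes to

$$\hat{y}(k+1) = C A C^{-1} \hat{y}(k) + C B u(k). \tag{16}$$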

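Without the prior knowledge of $A$, $B$, and $C$, we use the LSSVR to estimate the mapping in (16) from the training data with the input $(\hat{y}(k), u(k))$ and the output $\hat{y}(k+1)$, where $\hat{y}$ denotes the filtered output. However, for each output variable $y_i$, $i = 1, \ldots, m$, we do not have the knowledge of the relatedness between $y_i$ and the other variables. For the relatedness between $y_i$ and the input variables $u_j$, we keep it loose and select all $u_j(k)$, $j = 1, \ldots, p$, as the input data. For the relatedness between $y_i$ and the output variables, we select only $\hat{y}_i(k)$ as the input, for the reason that the sample $\hat{y}_i(k)$ is heavily correlated with the next sample $\hat{y}_i(k+1)$ in a physical process. Therefore, the function estimation of $\hat{y}_i(k+1)$ is given by

$$\hat{y}_i(k+1) = f_i(z_i(k)), \tag{17}$$

where

$$z_i(k) = [\hat{y}_i(k), u_1(k), \ldots, u_p(k)]^{T} \tag{18}$$

and $f_i$ is the LSSVR model of Section 2 trained for the $i$th output variable.

Then, the prediction of $y_i(k+1)$ can be expressed as

$$\tilde{y}_i(k+1) = f_i(\tilde{z}_i(k)), \tag{19}$$

where

$$\tilde{z}_i(k) = \begin{cases} [\hat{y}_i(k), u_1(k), \ldots, u_p(k)]^{T}, & k = t_s, \\ [\tilde{y}_i(k), u_1(k), \ldots, u_p(k)]^{T}, & k > t_s, \end{cases} \tag{20}$$

where $t_s$ is the start time when the physical plant is covered by the covert agent. From (12) and (13), we have the output of the covert agent, which is the estimation of $y(k)$; that is,

$$y_c(k) = \tilde{y}(k) + \hat{v}(k). \tag{21}$$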
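The input selection and the recursive prediction described above can be sketched as follows. Each training input pairs the current sample of one output variable with all control inputs at the same instant, and the target is the next sample of that output variable; once the covert agent starts, each prediction is fed back as the next input. The function names and the linear `toy_model` are hypothetical stand-ins chosen for illustration; in the paper the one-step model is a trained LSSVR.

```python
def build_training_pairs(y_filtered, controls):
    """Pair [y_i(k), u_1(k), ..., u_p(k)] with the target y_i(k + 1)."""
    last = len(y_filtered) - 1
    inputs = [[y_filtered[k]] + list(controls[k]) for k in range(last)]
    targets = [y_filtered[k + 1] for k in range(last)]
    return inputs, targets


def covert_predictions(one_step_model, y_start, controls):
    """Roll the one-step model forward, feeding each prediction back as input."""
    predictions = []
    current = y_start  # last genuine (filtered) sample before the agent starts
    for u in controls:
        current = one_step_model([current] + list(u))
        predictions.append(current)
    return predictions


# Hypothetical stand-in for a trained regressor: y(k+1) = 0.9 y(k) + 0.5 u(k).
def toy_model(z):
    return 0.9 * z[0] + 0.5 * z[1]


recorded_y = [1.0, 1.4, 1.76]            # one output variable over three instants
recorded_u = [[1.0], [1.0], [1.0]]       # a single control input at each instant
pairs_in, pairs_out = build_training_pairs(recorded_y, recorded_u)
rollout = covert_predictions(toy_model, y_start=1.0, controls=[[1.0], [1.0]])
```

Since every prediction after the first depends on earlier predictions, model error accumulates along the rollout, which is consistent with stealthiness being claimed only over a finite time window.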

3.2. Procedure of Training the Covert Agent Model

The training of the covert agent model consists of three phases: the data recording phase, the model training phase, and the output predicting phase. In the first phase, the control and sensory data are recorded to generate a training dataset of output and input records ($Y$ and $U$), which will be used to train the covert plant model in the second phase. As shown in Figure 2, the dataset is first preprocessed to generate the required data for training each LSSVR model. Then, the optimal parameters $\gamma$ and $\sigma$ for each LSSVR are obtained through an automated grid search with $k$-fold cross-validation [30] on the training data. Finally, the outputs of this phase are the trained LSSVR models; that is, there is an LSSVR model for each of the $m$ output variables of the physical plant. In the third phase, as described in the previous subsection, the predictions are generated based on the LSSVR models, and they are fed back to the controller to cover the real outputs of the physical plant.
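The parameter selection in the second phase can be illustrated with a small grid search over the two tuning parameters, scored by cross-validation. This is a sketch under stated assumptions: the folds are contiguous blocks, the score is the mean squared validation error, and `smoother_fit` is a hypothetical stand-in model (a shrunk Gaussian kernel smoother, not LSSVR) used only so that the example runs on its own. The grids and the fold count are illustrative values, not the settings used in the experiments.

```python
import itertools
import math


def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous (train, validation) index pairs."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        validation = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, validation))
        start += size
    return folds


def grid_search(fit, xs, ys, gammas, sigmas, k):
    """Return the (gamma, sigma) pair with the lowest mean squared validation error."""
    best_params, best_error = None, float("inf")
    for gamma, sigma in itertools.product(gammas, sigmas):
        squared_errors = []
        for train, validation in k_fold_indices(len(xs), k):
            predictor = fit([xs[i] for i in train], [ys[i] for i in train], gamma, sigma)
            squared_errors += [(predictor(xs[i]) - ys[i]) ** 2 for i in validation]
        error = sum(squared_errors) / len(squared_errors)
        if error < best_error:
            best_params, best_error = (gamma, sigma), error
    return best_params, best_error


def smoother_fit(train_x, train_y, gamma, sigma):
    """Hypothetical stand-in model: a shrunk Gaussian kernel smoother, not LSSVR."""
    def predict(x):
        weights = [math.exp(-(x - xi) ** 2 / (2.0 * sigma ** 2)) for xi in train_x]
        return sum(w * y for w, y in zip(weights, train_y)) / (sum(weights) + 1.0 / gamma)
    return predict


# Toy data and grids (assumptions chosen only for illustration).
xs = [0.1 * i for i in range(30)]
ys = [math.sin(x) for x in xs]
gammas = [1.0, 10.0, 100.0]
sigmas = [0.1, 0.5, 1.0]
best_params, best_error = grid_search(smoother_fit, xs, ys, gammas, sigmas, k=5)
```

Any model exposing the same `fit(train_x, train_y, gamma, sigma)` interface can be plugged in, so one search would be run per output variable to obtain one tuned model each.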

4. Experiment Overview

The covert loop is illustrated by a case study of a continuous stirred tank heater (CSTH) pilot plant. In this section, the CSTH Simulink platform is briefly introduced, and the experiment setup is presented. Moreover, the assessment method used to evaluate the experimental results is also presented.

4.1. The CSTH Simulink Platform

The configuration of the CSTH plant is shown in Figure 3. Hot water and cold water are mixed in a stirred tank, heated by steam through a heating coil, and drained from the tank through a long pipe. A more detailed description of the CSTH model can be found in [31].

Our experiment is based on the CSTH Simulink model with closed-loop control, which is provided by Thornhill et al. (http://personal-pages.ps.ic.ac.uk/~nina/CSTHSimulation/index.htm). Under the closed-loop control, the CSTH model runs to a steady state from a nonsteady initial condition. The steady-state valve positions and instrument conditions in this experimental case are shown in Table 1 [31]. The simulation input and output represent electronic signals on 4–20 mA scale. The inputs to the CSTH are control signals of the cold water and steam valves. The outputs are electronic measurements from the temperature, level, and cold water flow.

Based on the CSTH basic Simulink model, Gaussian noises are added to the three outputs of the CSTH. Figure 4 shows the normal control signals and measurements under the closed-loop control. The default simulation time is 1000 s, and the default sampling rate is 3600 samples per hour (s/h). The “nonsteady” initial phase of the CSTH plant lasts for about 150 seconds (s) and is excluded from all the experiments in this paper.

4.2. Experiment Setup

The CSTH plant depicted in Figure 3 is simulated in Matlab/Simulink, and its execution starts with the predefined base values. The covert agent is constructed based on the LSSVR method, which is available in the free LS-SVMlab toolbox (http://www.esat.kuleuven.be/sista/lssvmlab). In addition, the cumulative sum (CUSUM) algorithm is used to evaluate the stealthy time, which will be introduced in the next subsection. The setups of the experiments are illustrated in Figure 5. In order to better assess the stealthiness of the covert agent, we use the replay attack as a comparison and set up two observers to get the experimental data in the simulation. Observer 1 is used to capture the sensor data (i.e., the temperature, level, and cold water flow measurements), and Observer 2 is used to capture the output of the controller (i.e., the control signals of the cold water and steam valves).

4.3. Assessment Method

In order to evaluate experimental results, the stealthy time is used, and it is defined as

$$T_{\text{stealthy}} = t_d - t_s, \tag{22}$$

where $t_s$ is the start time of the covert agent or the replay attack and $t_d$ is the time when an anomaly is detected. A longer stealthy time is favorable to the attackers, as they can have more time to make the physical plant go into an unsafe state while remaining stealthy with respect to the anomaly detectors. In this paper, the anomaly detector is designed based on the CUSUM algorithm, which is one of the most commonly used algorithms for change detection problems [14]. Mathematical details of the CUSUM method can be found in [32].
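As an illustration of how the stealthy time could be measured, the sketch below runs a textbook two-sided CUSUM statistic over a sequence of samples and reports the elapsed time until the first alarm. The exact CUSUM variant, the drift term, and the threshold used in the experiments are not specified here, so all of these, together with the function names and the toy signal, are assumptions.

```python
def cusum_alarm_index(samples, target_mean, drift, threshold):
    """Return the first index where a two-sided CUSUM exceeds threshold, else None."""
    upper = lower = 0.0
    for index, value in enumerate(samples):
        deviation = value - target_mean
        upper = max(0.0, upper + deviation - drift)   # accumulates upward shifts
        lower = max(0.0, lower - deviation - drift)   # accumulates downward shifts
        if upper > threshold or lower > threshold:
            return index
    return None


def stealthy_time(samples, start_time, sample_period, target_mean, drift, threshold):
    """Stealthy time t_d - t_s in seconds; None if no alarm is raised in the window."""
    alarm = cusum_alarm_index(samples, target_mean, drift, threshold)
    if alarm is None:
        return None
    detection_time = start_time + alarm * sample_period
    return detection_time - start_time


# Toy signal (assumption): 5 nominal samples, then a sustained shift of +1.
signal = [10.0] * 5 + [11.0] * 10
result = stealthy_time(signal, start_time=501.0, sample_period=1.0,
                       target_mean=10.0, drift=0.25, threshold=2.0)
```

In this toy run the statistic stays at zero for the nominal samples, then grows by 0.75 per sample and first exceeds the threshold at index 7, so `result` is 7.0 seconds. A run with no alarm returns `None`, which corresponds to remaining stealthy for the whole observed window.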

5. Experimental Results

In this study, we capture data from the two observers within the time window [201 s, 400 s] in a normal process and use them as the training or replaying data in the experiments. In order to get the statistical results, we run 100 simulations for the covert agent and the replay attack, respectively. In each individual simulation run, the covert agent or the replay attack starts at a randomly chosen discrete time instant $t_s$ and persists for 200 seconds.

To get the corresponding stealthy time, the CUSUM algorithm is applied to the data that are obtained from the two observers in the simulations. The thresholds in CUSUM algorithm are determined based on the normal data in the time window ranging from 200 s to 1000 s, and each threshold is selected under the condition that it will not cause any false alarm on the normal data. In this section, we first introduce a covert agent experiment and a replay attack experiment. Then, we give the statistical results of all the experimental tests.

5.1. The Covert Agent and Replay Attack Experiments

In the two experiments, the covert agent and the replay attack are both started at the time $t_s$ = 501 s. Figures 6 and 7 show the results of the covert agent experiment. Figures 6(a) and 7(a) show a comparison of data with and without a covert agent, and Figures 6(b) and 7(b) show the detection of the changes using the CUSUM algorithm. In comparison, Figures 8 and 9 show the results of the replay attack experiment.

From Figures 6 and 8, we can see that the covert agent is able to imitate the behaviors of the three output variables over a finite time window, just like the replay attack does. What is more, the peaks of the CUSUM standard errors in the covert agent experiment are smaller than the ones in the replay attack experiment, which means that the covert agent has better stealthiness and can avoid being detected by the CUSUM with a lower threshold. From Figures 7 and 9, we can see that the covert agent can also keep the control output stealthy, but the replay attack causes anomalies in the controller’s output.

5.2. Statistical Results

Figure 10 shows the statistical results of the 100 simulations on the covert agent. Figure 10(a) shows the distribution of the stealthy time as histograms of counts, and Figure 10(b) gives the distribution of proportions by the empirical cumulative distribution function (CDF). The empirical CDF $F(t)$ is defined as the proportion of the stealthy time values less than or equal to $t$. As can be seen, the stealthy time is longer than 40 seconds in most of the covert agent simulations. Figure 11 shows the statistical results of the 100 simulations on the replay attack. Although the replayed sensor data can avoid being detected by the CUSUM detector, the replay attack is more likely to induce abnormal behavior in the controller's output. More specifically, for one of the two control variables, the stealthy time is no more than 40 seconds in all the replay attack simulations.
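The empirical CDF used in these figures is simple to compute directly from a list of stealthy times. The snippet below is a minimal sketch; the function name and the sample values are hypothetical and are not results from the experiments.

```python
def empirical_cdf(values, t):
    """Proportion of values less than or equal to t."""
    return sum(1 for v in values if v <= t) / len(values)


# Hypothetical stealthy times in seconds, for illustration only.
stealthy_times = [12, 35, 40, 58, 200]
proportion = empirical_cdf(stealthy_times, 40)
```

Here three of the five hypothetical values are at most 40, so `proportion` is 0.6. Read this way, a curve that stays low for small `t` indicates that few runs were detected early.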

6. Conclusions and Future Work

This paper has investigated the design problem of machine learning based stealthy DI attacks on industrial control systems. An LSSVR-based covert agent has been presented to estimate the model of the physical system, by which attackers can carry out a stealthy DI attack without the need for prior model knowledge of the physical system. The experimental results demonstrate that the covert loop can keep both the control output and the sensor data stealthy over a finite time window. For future work, the proposed covert agent can be further extended to a two-loop covert structure, in which an attack agent can be added. In addition, it is also of interest to investigate methods for detecting LSSVR-based attacks.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work is supported by the National Key Research and Development Program (no. 2016YFB1001404) and the National Natural Science Foundation of China (Normal Projects no. 61672093 and no. 61432004).