Abstract

We consider the secure state estimation of linear time-invariant Gaussian systems subject to dynamic malicious attacks. An error compensator is proposed to reduce the impact of erroneous local data on state estimation. Building on it, a new estimation algorithm based on the Gaussian mixture model (GMM) is proposed for dynamic attacks; it clusters the local state estimates autonomously and effectively improves the remote estimation accuracy. The superiority of the proposed algorithm is verified by numerical simulations.

1. Introduction

Cyberphysical systems (CPSs), such as transportation networks and smart grids, integrate sensing, computing, and control technologies with a communication infrastructure. Tight integration and cooperation between cyber and physical components are the defining features of CPSs [1]. However, CPSs are vulnerable to attacks, especially network attacks on the data and communication channels, which can cause serious harm to the national economy and social security; examples include the Stuxnet storm reported in [2], the Stuxnet malware [3], power blackouts in Brazil [4], and the Maroochy water breach [5]. Owing to the widespread application of CPSs in critical real-life infrastructures [6], the security of CPSs has become an increasingly important issue that has attracted the attention of many researchers over the past decades.

In the recent literature, secure state estimation has been an important research direction in CPS security. In [7], a distributed state estimation method based on parallelized stream computing is proposed, which not only significantly improves the speed of state estimation but also reduces the interregional convergence correlation and the residual pollution. In [8], a new sequential estimation method is proposed to improve the estimation accuracy; it sequentially estimates the states with the particle filter (PF) and the parameters with the separable natural evolution strategy (SNES). The state estimation of three-phase power system models is studied in [9]. In [10], a Bayesian-network-based state estimation algorithm for wireless power transfer (WPT) systems is proposed, which estimates the WPT system states in a distributed way using the Bayesian tree structure. In [11], a robust generalized maximum likelihood (GM) estimator, which leverages modified projection statistics and a Huber convex score function, is designed to bound the influence of observation outliers while maintaining high statistical estimation efficiency. In [12], a distributed dynamic state estimation method for microgrids incorporating distributed energy resources is presented. In [13], a robust generalized maximum-likelihood Koopman operator-based Kalman filter (GM-KKF) is designed, which can estimate the rotor angle and speed of synchronous generators. In [14], a correlation-aided robust adaptive unscented Kalman filter (UKF) for decentralized dynamic state estimation of power systems with unknown inputs is presented, which requires fewer measurements for dynamic state estimation while achieving better robustness against bad data. In [15, 16], state estimation methods based on undamaged sensors are studied. In [17, 18], state estimation for different systems is studied based on convex optimization methods.
In [19], by modeling and adopting a variety of models, a random Bayesian approach is proposed to solve the state estimation problem under switching patterns and signal attacks. In [20], state estimation against fixed target attacks, switched target attacks with disturbance, and sparse sensor attacks is considered, and a sufficient condition for the existence of the switched observer is given. In [21], a fusion algorithm based on the Gaussian mixture model is presented to solve the estimation problem of a linear time-invariant Gaussian system under stealthy attacks. However, dynamic attacks are not considered. In [22], a dynamic combination strategy and a distributed Kalman filter are proposed, which improve the robustness of the system against random error data injection and replay attacks.

Most of the studies mentioned above focus on static attacks. However, dynamic attacks are very common in real systems. Therefore, this paper considers the state estimation for a networked system subject to dynamic adversaries, as shown in Figure 1. Different sensors are attacked randomly at each time instant, and it is assumed that the number of attacked sensors does not exceed half of the sensors.

Inspired by [21], we design an error compensator to reduce the impact of incorrect data on state estimation. Based on that, a new GMM-based state estimation algorithm is presented, which can effectively improve the state estimation accuracy against dynamic adversaries. The contributions of this article are as follows:
(1) A new error compensator is proposed to alleviate the influence of wrong data on state estimation. It judges, based on the observability of the system, whether the beliefs generated by the expectation-maximization (EM) algorithm are accurate, and corrects the doubtful beliefs.
(2) By introducing the error compensator, a new GMM-based estimation algorithm is presented, which can improve the estimation accuracy effectively. The proposed algorithm fuses the local data with the centralized Kalman filter, adopting the modified beliefs as the weights of the local data.

The rest of the paper is organized as follows. Section 2 formulates the model of the considered system and the problem of interest. Section 3 proposes the error compensator and the new GMM-based state estimation algorithm against dynamic adversaries. In Section 4, the effectiveness of the proposed algorithm is demonstrated by numerical simulations. Conclusions are given in Section 5.

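Notation: $\mathbb{N}$ and $\mathbb{R}$ are the sets of positive integers and real numbers, respectively. $\mathbb{R}^n$ denotes the $n$-dimensional Euclidean space. $\mathbb{S}^n_{+}$ ($\mathbb{S}^n_{++}$) is the set of positive semidefinite (definite) matrices. We write $X \geq Y$ when $X - Y \in \mathbb{S}^n_{+}$. $X^{T}$ denotes the transpose of matrix $X$. $\mathbb{E}[\cdot]$ is the expectation of a random variable. $\mathcal{N}(\mu, \Sigma)$ is the Gaussian distribution with mean $\mu$ and covariance matrix $\Sigma$, and $x \sim \mathcal{N}(\mu, \Sigma)$ denotes that $x$ follows the Gaussian distribution $\mathcal{N}(\mu, \Sigma)$. $\mathrm{diag}(\cdot)$ denotes a block diagonal matrix.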

2. Problem Formulation

Consider the following networked system under attacks: where denotes the system state, represents the measured value from sensor at time , and is attack signal. The number of sensors is denoted by . is the process noise, and . is the measurement noise, and . Meanwhile, it is assumed that , , where , . , . The initial state is independent of and for all and . and are detectable and controllable, respectively.

The malicious attack satisfies the following assumptions:

Assumption 1. Any () sensors can be corrupted by the adversary, and the output values of the sensors are changed. Only when sensor is unattacked, .

Assumption 2. The number of attacked sensors is unknown, stochastic, and variable.

Assumption 3. The system parameters and noise statistics are known for the adversary.

Assumption 4. is statistically independent of and , respectively.

Remark 1. According to [23, 24], it is impossible to accurately reconstruct the state of a system when more than half the sensors are attacked. Thus, we assume that the maximum number of damaged sensors does not exceed in this paper, i.e., the upper limit of is .

When the system is not attacked, the measurements at time instant can be stacked as where

Then, we adopt a centralized Kalman filter as the remote estimator: where and are the priori and the posteriori estimation of the system state , respectively. and are the priori and posteriori estimation error covariance, respectively. is the Kalman filter gain.

From [21], we know that the information-form Kalman filter can be expressed as

Similarly, the local Kalman filter for sensor can be written as

It is noted that and can be calculated offline. According to [25], the Kalman filter converges from any initial condition exponentially when and are detectable and controllable, respectively. The steady-state values of local and centralized Kalman filter are defined as

It is assumed that the system starts from the steady state with and , and the fixed gains of the local and centralized Kalman filters can be represented as

The objective of this paper is to design a new GMM-based estimation method for systems suffering from dynamic adversaries.
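The steady-state quantities used in this section can be illustrated numerically. The following is a minimal sketch, not the authors' implementation. It assumes a scalar state with the standard model form (state coefficient `a`, process-noise variance `q`, per-sensor measurement coefficients `c` and noise variances `r`), which matches the scalar parameter ranges quoted in the simulation section; all function and variable names are hypothetical.

```python
import random


def steady_state_covariance(a, q, c, r, tol=1e-12, max_iter=10_000):
    """Iterate the scalar Riccati recursion (information form) to its fixed point.

    a, q : state coefficient and process-noise variance (assumed scalar model)
    c, r : lists of measurement coefficients and measurement-noise variances
    Returns (a priori covariance, a posteriori covariance) at steady state.
    """
    info_gain = sum(ci * ci / ri for ci, ri in zip(c, r))
    post = q  # any positive starting value works for this recursion
    for _ in range(max_iter):
        prior = a * a * post + q
        new_post = 1.0 / (1.0 / prior + info_gain)
        if abs(new_post - post) < tol:
            return a * a * new_post + q, new_post
        post = new_post
    raise RuntimeError("Riccati recursion did not converge")


# Parameter ranges follow the simulation section; the seed is arbitrary.
rng = random.Random(0)
m = 15
a = rng.uniform(0.4, 0.99)
q = rng.uniform(0.5, 2.0)
c = [rng.uniform(1.0, 2.0) for _ in range(m)]
r = [rng.uniform(1.0, 2.0) for _ in range(m)]

# Centralized filter: all sensors are stacked into one update.
prior, post = steady_state_covariance(a, q, c, r)
# Fixed centralized gain, one entry per sensor.
gains = [post * ci / ri for ci, ri in zip(c, r)]
# Local filters: each sensor runs its own recursion.
local_covs = [steady_state_covariance(a, q, [ci], [ri]) for ci, ri in zip(c, r)]
```

Because the covariance recursion does not depend on the measurements, these values can be computed offline, and each local steady-state covariance is at least as large as the centralized one.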

3. The GMM-Based State Estimation

In this section, an error compensator and the GMM-based state estimation algorithm against dynamic adversaries are proposed.

3.1. Modeling and the EM Algorithm

For a Gaussian mixture model with components [21], the mean and covariance of the -th component are expressed as and , respectively. is the mixture component weights of , and . In this case, the mixture density of a Gaussian mixture model can be expressed as where and are the Gaussian distribution density and weight of the -th component, respectively. Function is the probability density function (pdf) for Gaussian random variables:

At time instant , we denote the means of the state variables for sensor as under the unattacked scenario and under the attacked-scenario, respectively. and represent the covariance when sensor is unattacked and attacked, respectively. The local state estimation follows different distributions depending on whether sensor is attacked or not. According to the definition of GMM and the analysis of Kalman filtering in [25], it can be known that when sensor is unattacked (defined as the first component), follows the Gaussian distribution with the mean and the fixed covariance , i.e., . When sensor is attacked (defined as the second component), the exact distribution of is unknown since the specific type and the starting time of attacks are unknown. In this case, similar to [21], we can adopt a Gaussian distribution with the first and second moments, i.e., , to approximate the distribution of all local estimates in the second component. Then, can be described by the following 2-component Gaussian mixture model: where and are the weights of the first and second components at time , respectively.

The observation data set is defined as . According to [26, 27], it is known that the expectation-maximization (EM) algorithm can be adopted to find the maximum likelihood estimates for the parameter using . The log likelihood is shown as

Generally, the EM algorithm is divided into two steps: the expectation step and the maximization step. First, the parameter is initialized at each time ; then, the expectation step generates a belief based on and for each sensor: where and represent the probability of sensor belonging to the component and , respectively.

Given all beliefs and , the parameters are reestimated in the maximization step:

The expectation and maximization steps are iterated until the parameter estimates converge. This iterative procedure maximizes the concave lower bound of the log likelihood in (14).
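The expectation and maximization steps can be sketched for scalar local estimates as follows. This is a generic textbook EM for a two-component mixture, offered as an illustration rather than a transcription of Equations (15)-(19). It assumes that the variance of the first (unattacked) component is held fixed and that the second component is initialized from the first and second moments of all data; the function names, the median-based initial mean, and the sample values are hypothetical.

```python
import math
import statistics


def gaussian_pdf(x, mean, var):
    """Density of a scalar Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_two_component(data, mu1, var1, weight1=0.5, max_iter=200, tol=1e-10):
    """EM for a scalar 2-component mixture; var1 stays fixed (assumption)."""
    n = len(data)
    mu2 = statistics.mean(data)        # first moment of all data
    var2 = statistics.pvariance(data)  # second central moment of all data
    prev_ll = -math.inf
    for _ in range(max_iter):
        # Expectation step: belief that each sample belongs to each component.
        p1 = [weight1 * gaussian_pdf(x, mu1, var1) for x in data]
        p2 = [(1.0 - weight1) * gaussian_pdf(x, mu2, var2) for x in data]
        g1 = [u / (u + v) for u, v in zip(p1, p2)]
        g2 = [1.0 - g for g in g1]
        log_lik = sum(math.log(u + v) for u, v in zip(p1, p2))
        if abs(log_lik - prev_ll) < tol:
            break
        prev_ll = log_lik
        # Maximization step: re-estimate weight, both means, second variance.
        n1 = max(sum(g1), 1e-12)
        n2 = max(sum(g2), 1e-12)
        weight1 = n1 / n
        mu1 = sum(g * x for g, x in zip(g1, data)) / n1
        mu2 = sum(g * x for g, x in zip(g2, data)) / n2
        var2 = max(sum(g * (x - mu2) ** 2 for g, x in zip(g2, data)) / n2, 1e-6)
    return g1, g2, (weight1, mu1, mu2, var2)


# Eight estimates near the true state and three corrupted ones (made-up values).
estimates = [0.9, 1.1, 1.0, 0.95, 1.05, 1.02, 0.98, 1.0, 6.0, 5.5, 6.5]
g1, g2, (w1, mu1, mu2, var2) = em_two_component(
    estimates, mu1=statistics.median(estimates), var1=0.05
)
```

On this well-separated data the beliefs `g1` approach 1 for the first eight samples and `g2` approaches 1 for the last three. When attacked outputs resemble normal ones, the clustering can be less clear-cut.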

3.2. The Error Compensator

In this subsection, an error compensator is proposed to reduce the influence of incorrect data on the state estimation.

According to Section 3.1, the EM algorithm can be used to calculate the GMM parameters and find the maximum likelihood estimates. However, the convergence and clustering results of the EM algorithm are affected by the initial parameters. In this paper, the first and second moments are adopted as the initial parameters of the second cluster. Because the dynamic adversary is random and its specific type is unknown, the output of some attacked sensors may be similar to that of normal sensors at some moments. In this case, will be miscalculated as in the iterative process (15)-(19), since the observed data are considered to be closer to the second cluster by the EM algorithm. When this happens, the estimation accuracy is seriously reduced because the number of data available for fusion is less than . On the other hand, measurements that are similar to the true measurements can provide useful information for the remote state estimation, which means that data belonging to the second cluster can also be adopted to estimate the system state. Hence, a compensator is designed to solve the above problem.

represents the average of all at time instant , which can be calculated as follows:

According to the EM algorithm, tends to 1 if and only if sensor is attacked and the expectation step is accurate, which causes to approach . When the expectation step is miscalculated, tends to since approaches 0 for the attacked sensor . According to Assumptions 1-4, the maximum number of damaged sensors does not exceed (namely, ), which means . Hence, it can be known that and are miscalculated if . Based on the above analysis, the compensator is designed as follows: where and are the modified beliefs, and represents a threshold, which can be adjusted according to the performance requirements of the actual system.
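The idea behind the compensator can be sketched as below. The exact compensator rule is not reproduced here, so the sketch assumes the simplest rule consistent with the paragraph above: since fewer than half of the sensors can be attacked, an average attacked-belief above the threshold signals that the two clusters were mislabeled, and the two beliefs of every sensor are swapped. The function name `compensate`, the default threshold, and the belief values are hypothetical.

```python
def compensate(g1, g2, threshold=0.65):
    """Return modified beliefs (unattacked, attacked) and the average attacked-belief.

    g1[i], g2[i] : beliefs from the EM expectation step that sensor i is
                   unattacked / attacked.
    Assumed rule : if the mean of g2 exceeds the threshold, the cluster
                   labels are treated as swapped and the beliefs are exchanged.
    """
    avg_attacked = sum(g2) / len(g2)
    if avg_attacked > threshold:
        return list(g2), list(g1), avg_attacked
    return list(g1), list(g2), avg_attacked


# Mislabeled case: 10 of 15 sensors look "attacked", which the assumptions rule out.
bad_g1 = [0.02] * 10 + [0.98] * 5
bad_g2 = [0.98] * 10 + [0.02] * 5
fixed_g1, fixed_g2, bad_avg = compensate(bad_g1, bad_g2)

# Consistent case: only 5 of 15 sensors look attacked, so nothing changes.
ok_g1 = [0.98] * 10 + [0.02] * 5
ok_g2 = [0.02] * 10 + [0.98] * 5
kept_g1, kept_g2, ok_avg = compensate(ok_g1, ok_g2)
```

Under this assumed rule, a threshold that is too high never triggers a correction, while one that is too low corrects beliefs that were already right.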

3.3. The GMM-Based State Estimation Approach against Dynamic Attacks

In this subsection, a GMM-based estimation algorithm is proposed to deal with dynamic attacks, which can improve the estimation accuracy effectively.

Theorem 2. Consider the linear time-invariant system (1)-(2) and a dynamic adversary satisfying Assumptions 1-4. Then the remote state estimate can be calculated by Equations (22a)-(22d), where the initial values and are the steady-state values of the remote estimator when .

Proof. According to Definition 2 in [16, 28], if sensors are attacked, the following system is still observable in the absence of attacks: where is the set of unattacked sensors, and is the measurement stacked by the set . Similarly, and are the system parameter and the measurement noise stacked by the set , respectively. The pair is observable.
According to Section 2, Equation (6) can be expanded as where the default weight of each sensor is equal to 1 when the sensor is not attacked.

Based on the above analysis, we can calculate the remote state estimate using the undamaged sensors. The belief represents the probability that the sensor is undamaged. We can therefore fuse the local data by adopting as the new weight of the local data, which yields Equations (22a)-(22d).
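The weighting idea in the proof can be illustrated with a scalar information-form update in which each sensor's contribution is scaled by its belief. This is a simplified sketch under stated assumptions, not Equations (22a)-(22d): it weights raw measurements rather than local estimates, assumes a scalar state, and uses hypothetical names and sample values throughout.

```python
def weighted_fusion_step(x_post, p_post, y, beliefs, a, q, c, r):
    """One belief-weighted update of a scalar information-form Kalman filter.

    x_post, p_post : previous a posteriori estimate and covariance
    y              : list of sensor measurements at the current time
    beliefs        : weight in [0, 1] per sensor (1 = trusted, 0 = ignored)
    a, q, c, r     : assumed scalar model parameters
    """
    x_prior = a * x_post
    p_prior = a * a * p_post + q
    info = 1.0 / p_prior + sum(
        g * ci * ci / ri for g, ci, ri in zip(beliefs, c, r)
    )
    p_new = 1.0 / info
    x_new = p_new * (
        x_prior / p_prior
        + sum(g * ci * yi / ri for g, ci, yi, ri in zip(beliefs, c, y, r))
    )
    return x_new, p_new


a, q = 0.9, 1.0
c = [1.0, 2.0, 1.5]
r = [1.0, 1.0, 2.0]
clean = [1.0, 2.1, 1.4]
attacked = [1.0, 2.1, 50.0]  # third sensor corrupted (made-up value)

# With zero belief on the third sensor, the corrupted value has no effect.
x_robust, p_robust = weighted_fusion_step(1.0, 0.5, attacked, [1.0, 1.0, 0.0], a, q, c, r)
x_clean, p_clean = weighted_fusion_step(1.0, 0.5, clean, [1.0, 1.0, 0.0], a, q, c, r)
# With all weights equal to 1 this reduces to the ordinary centralized update.
x_naive, p_naive = weighted_fusion_step(1.0, 0.5, attacked, [1.0, 1.0, 1.0], a, q, c, r)
```

Setting all weights to 1 recovers the unweighted centralized filter, which matches the remark that the default weight of an unattacked sensor is 1.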

The system is assumed to reach steady state before time . The adversary can launch dynamic attacks at any time when . Starting from time , the local state estimate is calculated from the measurement of sensor at each time instant . Based on that, the remote estimator clusters the local state estimates and calculates the parameter by the EM algorithm according to Equations (15)-(19). Then, the error compensator is used to correct the erroneous beliefs. Finally, based on the modified belief , the local data are fused by Theorem 2 to obtain the state estimate . The whole process is summarized in Algorithm 1.

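Algorithm 1
// Run the Kalman filters to steady state.
Initialize the local and remote estimators;
for each time instant before the attacks start do
  // The local data reach steady state.
  for each sensor do
    run the local Kalman filter recursion;
  end for
  // The remote estimator reaches steady state.
  run the centralized Kalman filter recursion;
end for
// GMM clustering by the EM algorithm.
Set the starting time instant;
for each subsequent time instant do
  for each sensor do
    calculate the local state estimate from the sensor measurement;
  end for
  // The EM algorithm.
  Initialize the GMM parameter;
  while the maximum likelihood estimates are not achieved do
    The expectation step: calculate the beliefs according to Equations (15)-(16).
    The maximization step: calculate the parameter by Equations (17)-(19).
  end while
  // The error compensator.
  calculate the average belief;
  for each sensor do
    if the compensator condition holds then
      set the modified beliefs by the first case of the compensator;
    else
      set the modified beliefs by the second case of the compensator;
    end if
  end for
  // Remote state estimation.
  calculate the remote state estimate by Equations (22a)-(22d);
end for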

4. Numerical Simulation

In this section, the effectiveness of the GMM-based estimation algorithm is verified through numerical simulations. Similar to [21], we consider a linear time-invariant dynamic process measured by 15 sensors. The system parameters and are randomly generated from the intervals [0.4, 0.99] and [0.5, 2], respectively. Matrices and , , are randomly generated from the interval [1, 2]. The system reaches steady state before , and the attack signal starts from time , assuming that sensors are attacked by at each time instant .

4.1. Example 1

In this example, the estimation accuracy of the GMM-based method with and without the compensator under dynamic attacks is compared. Similar to [15], the attack signal is assumed to be a linear function of the measurement noise: where and are real numbers from the intervals [-5, 5] and [-10, 10], respectively. Meanwhile, satisfies Assumptions 1-4.
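A dynamic attack of this kind can be sketched as follows. The sketch assumes the linear form attack = alpha * noise + beta, with alpha drawn from [-5, 5] and beta from [-10, 10]. It also assumes that a fresh random subset of at most half the sensors is attacked at each time instant. The function name `dynamic_attack` and the seed are hypothetical.

```python
import random


def dynamic_attack(noise, rng, max_attacked):
    """Draw one time step of a dynamic attack on a random subset of sensors.

    noise        : list of measurement-noise samples, one per sensor
    rng          : random.Random instance used for all draws
    max_attacked : upper bound on the number of attacked sensors
    Returns (attack signal per sensor, set of attacked sensor indices).
    """
    m = len(noise)
    count = rng.randint(0, max_attacked)          # number attacked varies in time
    attacked = set(rng.sample(range(m), count))   # which sensors varies in time
    attack = []
    for i in range(m):
        if i in attacked:
            alpha = rng.uniform(-5.0, 5.0)
            beta = rng.uniform(-10.0, 10.0)
            attack.append(alpha * noise[i] + beta)  # assumed linear form
        else:
            attack.append(0.0)                      # unattacked sensors see no change
    return attack, attacked


rng = random.Random(1)
m = 15
history = []
for _ in range(100):
    noise = [rng.gauss(0.0, 1.0) for _ in range(m)]
    history.append(dynamic_attack(noise, rng, max_attacked=m // 2))
```

With 15 sensors the bound `m // 2` equals 7, so fewer than half of the sensors are corrupted at every step while both the number and the identity of the attacked sensors change over time.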

Set the threshold in the following example. In Figure 2, the trajectories of the actual state and the states estimated by the GMM-based estimation method with and without the compensator are plotted. The estimated states of the GMM-based method with the compensator (dotted line) are closer to the actual state than those without the compensator (red line). Figure 3 shows the estimation error covariance for the GMM-based method with and without the compensator. The estimation error covariance of the method without the compensator (red line) is larger than that with the compensator (black line), which means that the proposed error compensator can effectively reduce the impact of faulty data on state estimation. According to Figures 2 and 3, the GMM-based estimation method achieves higher estimation accuracy against dynamic attacks with the compensator than without it.

The number of attacked sensors at each moment when is plotted in Figure 4, and the state estimation and corresponding error covariance of the GMM-based algorithm when the compensator takes different thresholds are shown in Figures 5 and 6, respectively. The state estimation accuracy is higher when and 0.65 than when and 0.95, which indicates that the performance of the remote estimator deteriorates when is too large or too small. Hence, the threshold can be adjusted according to the actual performance requirements of the real system.

4.2. Example 2

Distributed and centralized false-data detectors are common, and they determine whether an attack exists based on the statistical characteristics of the innovation and , respectively. According to [21], a well-designed dynamic attack can successfully bypass the distributed detector but fails to remain stealthy to the centralized false-data detector. In this subsection, we compare the proposed approach with estimation methods based on different false-data detectors.
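An innovation-based detector can be sketched as a chi-square test on the normalized innovation. The text does not state which test statistic the compared detectors use, so the chi-square form, the scalar model, the function name, and the numeric values below are assumptions. The default threshold 6.635 is the 99% quantile of the chi-square distribution with one degree of freedom.

```python
def chi_square_alarm(y, x_prior, p_prior, c, r, threshold=6.635):
    """Flag a scalar measurement whose normalized innovation is too large.

    y                : measurement of one sensor
    x_prior, p_prior : a priori state estimate and its error covariance
    c, r             : measurement coefficient and noise variance (assumed model)
    threshold        : alarm level for the chi-square statistic
    """
    innovation = y - c * x_prior
    innovation_var = c * c * p_prior + r
    statistic = innovation * innovation / innovation_var
    return statistic > threshold


quiet = chi_square_alarm(y=1.1, x_prior=1.0, p_prior=0.5, c=1.0, r=1.0)
alarm = chi_square_alarm(y=11.0, x_prior=1.0, p_prior=0.5, c=1.0, r=1.0)
```

A distributed detector would run such a test per sensor on its local innovation, whereas a centralized detector would test the stacked innovation of all sensors.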

Similar to [21], the attack signal is set as where satisfies Assumptions 1-4.

In Figure 7, the trajectories of the actual state and the states estimated by the methods based on different detectors are plotted. The GMM-based state estimate (black line) is closer to the actual state than the estimates based on the distributed and centralized detectors (red and green lines). Figure 8 shows the estimation error covariance of the corresponding methods; the GMM-based estimation error covariance is much smaller than those based on the distributed and centralized detectors. These results show that the proposed GMM-based estimation approach can improve the estimation performance effectively.

5. Conclusion

This paper studies the state estimation problem against dynamic malicious attacks. An error compensator is presented, which can effectively reduce the influence of erroneous local data on state estimation. Based on that, a new GMM-based state estimation algorithm is proposed to improve the estimation accuracy for systems suffering from dynamic attacks. Finally, the effectiveness of the proposed algorithm is verified by numerical simulations. In future work, we will extend the GMM-based approach to systems with parametric uncertainties.

Data Availability

Some or all data, models, or code generated or used during the study are available from the corresponding author by request (Cui Zhu).

Conflicts of Interest

The authors declare that they have no conflicts of interest related to this work.

Acknowledgments

The author would like to thank the tutor and anonymous reviewers for their suggestions, which improved the quality of work. This work was supported by the National Natural Science Foundation of China (grant numbers 61603047, 61773334), the Scientific Research Project of Beijing Municipal Educational Commission (grant number KM201911232014), the Key Research Cultivation Program of Beijing Information Science and Technology University (grant number 2121YJPY221), and the Qin Xin Talents Cultivation Program, Beijing Information Science and Technology University (grant number QXTCPC202110).