Wireless Communications and Mobile Computing

Special Issue: Recent Advances in Cloud-Aware Mobile Fog Computing 2020

Research Article | Open Access

Volume 2020 | Article ID 8897926 | https://doi.org/10.1155/2020/8897926

Chao Wang, Bailing Wang, Hongri Liu, Haikuo Qu, "Anomaly Detection for Industrial Control System Based on Autoencoder Neural Network", Wireless Communications and Mobile Computing, vol. 2020, Article ID 8897926, 10 pages, 2020. https://doi.org/10.1155/2020/8897926

Anomaly Detection for Industrial Control System Based on Autoencoder Neural Network

Academic Editor: Fuhong Lin
Received 23 Apr 2020
Revised 05 Jun 2020
Accepted 08 Jul 2020
Published 03 Aug 2020


As the Industrial Internet of Things (IIoT) develops rapidly, cloud computing and fog computing have become effective measures for problems such as limited computing resources and increased network latency. Industrial Control Systems (ICS) play a key role in the development of IIoT, and their security affects the whole IIoT. ICS cover many sectors, such as water supply systems and electric utilities, which are closely related to people’s lives. In recent years, ICS have been connected to the Internet and exposed to cyberspace instead of being isolated from the outside world, and the risk of being attacked has increased as a result. To protect these assets, intrusion detection systems (IDS) have drawn much attention. Unlike signature-based techniques, which are another kind of IDS, anomaly detection can detect unknown attacks. In this paper, we propose an anomaly detection method in which a composite autoencoder model learns the normal pattern. Unlike the common autoencoder neural network that either predicts or reconstructs data, our model predicts and reconstructs the input data at the same time, which overcomes the shortcomings of using either one alone. Using the error obtained by the model, a change ratio is put forward to locate the most suspicious devices that may be under attack. Finally, we verify the performance of our method through experiments on the SWaT dataset. The results show that the proposed method exhibits improved performance, with 88.5% recall and 87.0% F1-score.

1. Introduction

In the context of Industry 4.0, the Industrial Internet of Things (IIoT) has attracted great attention in both academia and industry. Within IIoT, more and more devices are connected and produce massive industrial data every day, which requires powerful computing resources. Benefiting from cloud computing, enterprises can move computing tasks to the cloud instead of their own physical machines to relieve this pressure. Cloud computing can also be integrated into the mobile computing environment, which is called mobile cloud computing.

However, with the development of IIoT, the amount of data collected from sensors and other devices grows exponentially. When these data are transmitted to the cloud, network latency and bandwidth become a bottleneck [1]. Edge computing (e.g., the cloud-aware fog computing mechanism [2, 3]) is one of the most promising solutions to these problems. In real applications, industrial enterprises can perform necessary computing tasks close to the machines in the industrial control system (ICS), so that only important computing tasks or results are delivered to and stored in the cloud center [4]. This sharply reduces the workload of the cloud center. Some research focuses on the problems of edge computing itself. Since data are sent to remote machines, the security of data transmission is one such problem, and the authors of [5] proposed a new secure communication scheme to solve it.

In fact, ICS is an important part of the industrial edge, and its safety issues are becoming critical to the IIoT’s development. Attacks on ICS have increased over the last two decades, and the most famous one is Stuxnet. Damage to ICS can cause serious consequences; therefore, the study of methods to protect these systems is very important.

An intrusion detection system (IDS) is one means of protection. IDS fall into two classes according to the techniques they use, namely, signature-based IDS and anomaly-based IDS [6]. Signature-based techniques detect attacks by comparing new events, such as traffic and commands, with the characteristics of known attacks. An anomaly-based IDS constructs a template of normal behavior and detects attacks by measuring the deviation of observed behavior from the template. Since anomaly detection needs only data from normal working conditions to learn the normal profile, it can detect unknown attacks. We focus on anomaly-based techniques in this paper.

There are two main issues in anomaly detection for ICS. First, modeling such a complex system is challenging: because the system contains many devices, it is hard to learn the associations between them. Second, locating the abnormal devices is difficult. Methods that find the devices working abnormally help operators check the system promptly and correctly, which reduces the losses caused by anomalies.

Many anomaly detection techniques have been developed specifically for ICS. A discrete multi-input and multioutput (MIMO) system model [7] is used to represent the control and process behavior, but that model does not apply to nonlinear systems. Exploiting the fact that traffic between devices is periodic, Goldenberg and Wool [8] modeled Modbus/TCP traffic with a deterministic finite automaton (DFA). However, their model suits single-period traffic patterns only. Deep learning has shown promising results in learning complicated relations between variables, and several works use deep learning methods for anomaly detection [9, 10]. To achieve anomaly detection in ICS, we use a composite autoencoder model similar to [11] to learn the work pattern of ICS. Our contributions are as follows.
(i) We propose a composite autoencoder model that learns the work pattern of ICS by predicting and reconstructing the input data. Anomalies are detected using the error obtained by the model.
(ii) Using the error distribution, we locate the variables that behave abnormally. We define a change ratio to find which devices are under attack.
(iii) We conduct experiments on the SWaT dataset, which is collected from a real ICS. The experimental results show that our method outperforms three other methods, with 88.5% recall and 87.0% F1-score.

The remainder of this paper is organized as follows. Related work on anomaly detection is summarized in Section 2. Section 3 briefly introduces the dataset used in this paper. The problem of detecting anomalies in time series data is analyzed in Section 4. Section 5 describes our proposed method in detail. Section 6 presents the experiments and performance analysis. Lastly, Section 7 presents our conclusions and future work.

2. Related Work

The security issues of ICS have been extensively studied. In [12], the authors presented threats to these infrastructures and methods for securing them. In [13, 14], the authors surveyed research on ICS and presented challenges that still need to be addressed. Two strategies for securing SCADA networks in general were described in [15]. Besides, many researchers have put forward new techniques based on anomaly detection.

Anomaly detection methods discover attacks by estimating how far observed behavior deviates from the normal profile. The normal profile can be constructed from many kinds of data sources. The work in [8, 16–18] used network traffic within ICS as the data source to model normal communication. The Hidden Markov model (HMM) [16] was used to model packets delivered between devices for intrusion detection. In addition, [17] proposed a method that learns the Modbus/TCP traffic transactions using the request message only. The authors of [18] employed a dynamic Bayesian network structure to characterize normal command and data sequences at the network level and achieved a low false positive rate.

Every device in ICS is assumed to have its own behavior pattern. The authors of [19] used features such as the data response time and the physical operation time to create a physical fingerprint for the devices within the ICS environment. To find malicious code, a deep learning method was utilized to model the normal behavior of the power-grid controller [20].

In this research, we use the time series data of devices within ICS to model the normal working condition. Several deep learning methods have been put forward for anomaly detection in multivariable time series. Malhotra et al. [9] used a stacked LSTM neural network that is trained on normal data only; the prediction errors determine whether an observation is normal or anomalous. The mechanism in [10] reconstructs the normal time series data with an LSTM encoder-decoder. In [21], a deep convolutional neural network (CNN) is used to predict the time series.

Our experiments use the SWaT dataset, which is described in Section 3. Several works have already used this dataset. Goh et al. [22] used an LSTM neural network to predict the time series of ICS and the cumulative sum (CUSUM) for anomaly detection, but only the P1 stage of the dataset was examined. Timed automata learning is combined with a Bayesian network for anomaly detection in [23]. Inoue et al. [24] proposed two methods to detect the anomalies, namely, DNN and one-class SVM. In this research, we compare our proposed method with the work in [23, 24].

3. Dataset Description

In this paper, we use the Secure Water Treatment (SWaT) dataset to test and verify our method. The SWaT dataset is provided by the Singapore University of Technology and Design [25]. It is collected from a water treatment plant testbed that produces purified water. Within the testbed, there are a number of devices, which are categorized into sensors and actuators. The devices are distributed over six stages, as shown in Figure 1.

The data of all sensors and actuators are logged every second during the 11 days of operation, so the dataset is a multivariable time series. The first seven days ran under normal conditions, and 36 attacks were launched in the remaining four days. The attack data contain four types of attacks, namely, SSSP, SSMP, MSSP, and MSMP, which are described in detail in Table 1. For a full explanation of the dataset, please refer to [25].

The SWaT dataset also contains traffic data, which have already been parsed. In this paper, however, we focus on the time series data.

Table 1: Attack types in the SWaT dataset.

Attack type | Meaning | Example
Single Stage Single Point (SSSP) | Attack on one device in a single stage | The 1st attack with target MV-101
Single Stage Multi Point (SSMP) | Attack on multiple devices in a single stage | The 16th attack with targets MV-101 and LIT-101
Multi Stage Single Point (MSSP) | Attack on one device in each of several stages | The 33rd attack with targets AIT-402 and AIT-502
Multi Stage Multi Point (MSMP) | Attack on multiple devices in several stages | The 18th attack with targets P-602, DPIT-301, and MV-302

4. Problem Statement

As described in Section 3, the data we deal with are multivariable time series. Consider a time series $X = \{x_1, x_2, \ldots, x_n\}$ of length $n$, where one point $x_t$ is an $m$-dimensional vector at time $t$. We slide a window of length $l$ over the time series to obtain multiple shorter time series. A sequence of continuous observations from time $t_1$ to $t_2$ is denoted as $X_{t_1:t_2}$.

The objective of anomaly detection for multivariable time series is to find the anomalous parts by exploiting the regular patterns that appear in the historical data.

In this research, we use the observations under normal working conditions to train the model, which learns the system work pattern by predicting the future time series and reconstructing the original input. After the model has learned the working pattern, it is applied to the test dataset, which includes attacks. Anomalies are identified by comparing the error on the attack data against a threshold. A higher error indicates that the point is more likely to be anomalous.

5. Proposed Method

The whole process of our proposed method consists of two parts, the training phase and the testing phase. The training phase learns the working pattern of ICS using only normal data. Once the model is trained, the error it produces on the normal training data is used to select an anomaly threshold. The testing phase checks the model’s performance on data that include attacks: the trained model reconstructs and predicts the testing data, and the observations whose error is higher than the threshold are marked as anomalous. After that, the abnormal part that may be suffering an attack is located. The whole structure of our method is shown in Figure 2, and the details of our model are introduced in this section. To learn the pattern of ICS, we employ a composite autoencoder model, which has the power to learn nonlinear relationships. In addition, a change ratio is proposed to locate the anomalous devices.

5.1. Autoencoder Model

Researchers from different scientific fields have applied deep learning in their respective area of research, since deep learning has exhibited a powerful ability to solve complex problems in fields like computer vision and natural language processing in recent years.

An autoencoder neural network is an unsupervised learning framework. Generally speaking, an autoencoder has an input layer, an output layer, and one or more hidden layers. Unlike common deep neural networks, an autoencoder has hidden layers that are smaller than the input layer, which lets it learn a compressed representation of the input.

In this research, we use the autoencoder architecture shown in Figure 3, which is divided into two parts, the encoder and the decoder. The encoder learns a compressed representation of the input data, from which the decoder reconstructs the input. In some studies, the decoder is removed after training and the remaining part is used for classification. We, however, use the whole autoencoder to reconstruct the original input.

We choose LSTM as the building block of the autoencoder model to deal with time series. LSTM is a special kind of recurrent neural network (RNN) designed to solve problems suffered by common RNNs. Due to space limitations, we do not describe it in detail.

5.2. Composite Model

The LSTM autoencoder can reconstruct its input (see Section 5.1). It can also be used as a future predictor. In this paper, we use a composite autoencoder to learn the pattern of normal ICS operation. The model performs input reconstruction and future prediction at the same time, which lets it learn better data representations than either task alone [11]. Its architecture is shown in Figure 4.

Suppose the input of the composite model is a time series $X_{t-l+1:t}$ of length $l$. The model outputs two parts, namely, the reconstructed time series and the predicted time series, which we denote as $\hat{X}^{r}$ and $\hat{X}^{p}$. For the input $X_{t-l+1:t}$, $\hat{X}^{r}_{t-l+1:t}$ is the reconstructed series and $\hat{X}^{p}_{t+1:t+l}$ is the predicted series. For example, take an input time series of length 5, $\{x_1, x_2, x_3, x_4, x_5\}$. The output of the model has two parts, the reconstruction part $\{\hat{x}_1, \ldots, \hat{x}_5\}$ and the prediction part $\{\hat{x}_6, \ldots, \hat{x}_{10}\}$.
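The pairing of one input window with its two targets can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `composite_targets` is hypothetical, plain integers stand in for the multivariable observations, and the prediction target is assumed to have the same length as the input window.

```python
def composite_targets(series, l):
    """Pair every input window with its reconstruction and prediction targets."""
    samples = []
    # A window starting at `start` needs l observations for the input
    # and l further observations for the prediction target.
    for start in range(len(series) - 2 * l + 1):
        window = series[start:start + l]
        future = series[start + l:start + 2 * l]
        samples.append({"input": window, "reconstruct": window, "predict": future})
    return samples


series = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]  # stand-ins for x_1 ... x_10
samples = composite_targets(series, l=5)
print(samples[0]["input"])    # [1, 2, 3, 4, 5]
print(samples[0]["predict"])  # [6, 7, 8, 9, 10]
```

With ten observations and a window of five, exactly one complete sample exists, which mirrors the length-5 example in the text.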

The mean square error (MSE) is used to calculate the difference between the actual input and the output. Consider an input vector $x$ whose $i$th dimension value is $x^{(i)}$, and let the corresponding prediction or reconstruction output be $\hat{x}^{(i)}$. The calculation of MSE is given in (1).

The loss function calculates the MSE over the whole time length. The loss of the reconstruction part is $L_r$ and the loss of the prediction part is $L_p$. The model tries to minimize the sum of both parts, $L = L_r + L_p$.
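As a rough illustration of how the two losses combine, the sketch below computes a per-vector MSE and adds the reconstruction and prediction losses. The function names are hypothetical, and averaging the MSE over the time steps of a window is an assumption; the text only states that the loss covers the whole time length.

```python
def mse(actual, output):
    """Mean squared error between two equally long vectors."""
    return sum((a - o) ** 2 for a, o in zip(actual, output)) / len(actual)


def sequence_loss(actual_seq, output_seq):
    """Average the per-time-step MSE over a whole window (assumed aggregation)."""
    return sum(mse(a, o) for a, o in zip(actual_seq, output_seq)) / len(actual_seq)


def composite_loss(x, x_rec, x_future, x_pred):
    """Sum of the reconstruction loss and the prediction loss."""
    return sequence_loss(x, x_rec) + sequence_loss(x_future, x_pred)


# Toy values: two time steps of a 2-dimensional input, one future step.
x = [[0.0, 1.0], [1.0, 1.0]]
x_rec = [[0.0, 1.0], [1.0, 0.0]]   # second step is reconstructed badly
x_future = [[1.0, 0.0]]
x_pred = [[0.0, 0.0]]
print(composite_loss(x, x_rec, x_future, x_pred))  # 0.75
```

Here the reconstruction loss is 0.25 and the prediction loss is 0.5, so the model would be trained to reduce their sum of 0.75.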

5.3. Anomaly Detection

In this section, we introduce the process of anomaly detection. As mentioned in the previous section, the two parts of the composite autoencoder output are not matched in the time dimension: for an input window $X_{t-l+1:t}$, the reconstruction covers the same interval as the input, whereas the prediction covers the following interval from $t+1$ to $t+l$. The output values must therefore be aligned in the time dimension before they are passed to anomaly detection.

The errors of the two parts are calculated separately and added together to obtain the overall error value $e_t$. The calculation of $e_t$ is given by (2), (3), and (4).

To eliminate errors caused by abrupt changes in the input data, we use an exponentially weighted moving average (EWMA), as in [26], to obtain a smoothed error $e^{s}_t$.

Meanwhile, we raise the smoothed error to the power $p$ to decrease the false negative rate.

When the error is higher than the threshold $\tau$, the observation at time $t$ is classified as an anomaly. The threshold is selected from the error on the training data. The steps used to detect anomalies are listed in Algorithm 1.

Input: time series $X$; reconstructed time series $\hat{X}^{r}$; predicted time series $\hat{X}^{p}$; error threshold $\tau$
Output: anomaly result $A$
1: Calculate the sum of errors by (2), (3), (4)
2: Calculate the smoothed error by the EWMA method (5)
3: Calculate the $p$-power error by (8)
4: Mark the anomalies by (9)
5: Return the anomalous observations
Algorithm 1: Anomaly detection.

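The steps of Algorithm 1 can be sketched in plain Python as follows. This is a minimal sketch, not the authors' implementation: the function names are hypothetical, the reconstruction and prediction errors are assumed to be already aligned in time, and the EWMA uses the common span-based smoothing factor alpha = 2 / (span + 1), which may differ from the form used in the paper.

```python
def ewma(errors, span):
    """Exponentially weighted moving average with alpha = 2 / (span + 1)."""
    alpha = 2.0 / (span + 1)
    smoothed, prev = [], None
    for e in errors:
        prev = e if prev is None else alpha * e + (1 - alpha) * prev
        smoothed.append(prev)
    return smoothed


def score(rec_err, pred_err, span, power):
    """Steps 1-3: add both error parts, smooth them, raise to a power."""
    total = [r + p for r, p in zip(rec_err, pred_err)]
    return [s ** power for s in ewma(total, span)]


def detect(rec_err, pred_err, span, power, threshold):
    """Step 4: mark every time step whose processed error exceeds the threshold."""
    return [e > threshold for e in score(rec_err, pred_err, span, power)]


# Threshold: maximum processed error seen on (toy) normal training data.
threshold = max(score([0.5, 0.5, 0.5], [0.5, 0.5, 0.5], span=3, power=2))

# Toy test data with a spike in the reconstruction error at the second step.
flags = detect([0.5, 2.5, 0.5], [0.5, 0.5, 0.5], span=3, power=2, threshold=threshold)
print(flags)  # [False, True, True]
```

In this toy run the third step is still flagged although its raw error has returned to the normal level, because the moving average decays gradually.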
5.4. Locate Attacked Variables

Generally speaking, attackers who aim to damage the ICS or disrupt its normal operation choose to compromise devices such as sensors or actuators. After the method described in the last section has detected an anomaly caused by an attack, the next step is to locate the variables (devices in ICS) that are under attack. This helps operators find the anomaly and take measures to bring the system back to normal.

Unlike [26], which uses the variables with the greatest error, we use a change ratio to indicate whether a device is anomalous. We assume that if a device is attacked, the error obtained in the anomaly detection phase differs between the normal and attacked conditions. Specifically, our model should output a lower error when ICS works normally than when an attack happens. Under this assumption, the change ratio is proposed to measure the difference. The change ratio is given in (10).

When an observation at time $t$ is classified as an anomaly, every dimension of its error is checked by calculating the difference between the average error over a time window before the attack and the average over a window after it. The length of the time window is denoted as $w_c$. In this research, some attacks in the test dataset have multiple targets. Therefore, we take the top-$k$ variables by change ratio as the devices that are under attack.
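The locating step can be illustrated with the sketch below. It assumes one plausible form of the change ratio, the relative increase of the mean error after the detection time over the mean error before it; the exact definition in (10) may differ. The function names and the error values are made up for illustration, and only the device names come from the dataset.

```python
def change_ratio(before, after, eps=1e-12):
    """Relative change of the mean error after an event versus before it (assumed form)."""
    mean_before = sum(before) / len(before)
    mean_after = sum(after) / len(after)
    return (mean_after - mean_before) / (mean_before + eps)


def locate(errors_by_device, t, window, k):
    """Return the k devices whose error rose the most around time index t (t >= 1)."""
    ratios = {}
    for device, errs in errors_by_device.items():
        before = errs[max(0, t - window):t]
        after = errs[t:t + window]
        ratios[device] = change_ratio(before, after)
    return sorted(ratios, key=ratios.get, reverse=True)[:k]


# Made-up per-device error series; an anomaly is flagged at index 2.
errors = {
    "LIT-101": [0.1, 0.1, 0.9, 0.9],
    "MV-101": [0.2, 0.2, 0.3, 0.3],
    "P-101": [0.1, 0.1, 0.1, 0.1],
}
suspects = locate(errors, t=2, window=2, k=2)
print(suspects)  # ['LIT-101', 'MV-101']
```

Ranking by relative change rather than by absolute error means a device whose error is always high does not dominate the list.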

6. Experiments and Result Analysis

This section presents the experiments on the SWaT dataset and their results, based on the method described above. First, we describe the preprocessing steps for the input data. After that, we introduce the machine environment in which our model is trained and the evaluation metrics used to assess our method. Finally, we analyze the results in detail, including the attack detection results and the comparisons with other methods.

6.1. Data Preprocessing

There are 51 variables (devices) in the SWaT dataset. The training data contain 496,800 records collected while the testbed worked under normal conditions. The testing data contain 449,919 records and include 36 attacks. Because the system needs time to stabilize, the first part of the normal data has been trimmed.

In the experiment, we find that some variables are unstable, e.g., AIT-201, whose distribution differs greatly between the training dataset and the test dataset. Since we train the model on normal data only, these unstable and unrepresentative variables are removed, including AIT-201, P-201, and all the variables in P6. In particular, the variables in the P6 stage were not fully used during data collection, as noted in [23]. After removing these variables, 45 variables remain.

In this research, we use a sliding window approach to divide the original time series, and the length of the window is $l$. To learn the whole pattern in the training phase, consecutive windows overlap: the beginning of each window repeats the end of the previous one. When calculating the training error to select the threshold, or when testing the model, the data are divided without overlap.
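The two ways of dividing the data can be sketched with a single helper. The function name and the toy series are hypothetical; the step between window starts is simply the window length minus the overlap.

```python
def sliding_windows(series, length, overlap=0):
    """Cut a series into windows of `length` that share `overlap` points."""
    step = length - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than the window length")
    return [series[i:i + length] for i in range(0, len(series) - length + 1, step)]


series = list(range(10))
train_windows = sliding_windows(series, length=4, overlap=2)  # overlapping, for training
test_windows = sliding_windows(series, length=4)              # disjoint, for thresholding and testing
print(len(train_windows), len(test_windows))  # 4 2
```

Overlapping windows yield more training sequences from the same data. In this sketch, trailing points that do not fill a complete window are dropped.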

To speed up training and improve detection accuracy, all the training data are scaled to the range (0, 1). Scaling all the data together, including the test data, would be wrong, since information from the test set would leak into training. Instead, the test set, which includes the attacks, is scaled using the minimum and maximum of every variable from the training set. Because our method uses a composite model, the reconstructed part and the predicted part of the output do not cover the same time window. As a result, the first sequence has only a reconstruction part and the last sequence has no prediction part. In these cases, the error counts only one part.
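The scaling rule described above can be sketched as follows. The function names and the numbers are illustrative only; the point is that the minimum and maximum are computed on the training rows and then reused unchanged for the test rows.

```python
def fit_min_max(train_rows):
    """Per-variable minimum and maximum, computed on the training data only."""
    columns = list(zip(*train_rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def scale(rows, mins, maxs):
    """Apply min-max scaling with previously fitted statistics."""
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, lo, hi in zip(row, mins, maxs)]
        for row in rows
    ]


train = [[0.0, 10.0], [4.0, 20.0]]   # two variables, two training records
test = [[2.0, 25.0]]                 # second variable exceeds the training maximum

mins, maxs = fit_min_max(train)
print(scale(train, mins, maxs))  # [[0.0, 0.0], [1.0, 1.0]]
print(scale(test, mins, maxs))   # [[0.5, 1.5]]
```

Test values outside the training range map outside the unit interval, which is expected under attack conditions and keeps test information out of the fitted statistics.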

In the experiment, we set the window length $l$ to 120 seconds and the overlap length to 115 seconds. Although the overlap between consecutive time series is large, we believe the model learns a better representation from the larger number of time series produced this way. The other parameters of the method are as follows. The window $w$ used in (6) is 120 seconds, the same as $l$. The power $p$ is 4, and the time window $w_c$ for locating suspicious variables is also 120 seconds. The maximum error obtained on the nonoverlapping training time series is selected as the threshold.

6.2. Model Training

The model is trained and tested on our machine which consists of Intel(R) Xeon(R) CPU E5-2640 v4 @ 2.40GHz, 3 NVIDIA Tesla P100 PCIe 12GB, and 128 GB RAM.

We implement our model with the Keras library (version 2.2.4), which makes programming neural networks more convenient than using TensorFlow directly. To reduce the training and testing time, we use CuDNNLSTM as the building block, which behaves the same as LSTM.

In the experiment, the two layers of the encoder have 64 and 32 neurons, respectively. The first layer is larger than the input dimension, which does not strictly obey the rule that hidden layers are smaller than the input layer, but it still gives better performance in the experiments. The layers of the reconstruction part and the prediction part have 32 and 64 neurons, symmetric with the encoder.

We choose the Adam optimizer [27] and train for up to 100 epochs, with early stopping to cut down the training time. To mitigate the problem of local minima when training the neural network, we train the composite autoencoder multiple times and present the best one. The training losses of both parts are shown in Figure 5.

6.3. Evaluation Metrics

In the dataset, every one-second record is labeled as “attack” or “normal”. We use these labels directly to evaluate our detection results. All possible outcomes are listed in Table 2. A true positive (TP) is an attack record that is classified as an attack. A false negative (FN) is an attack record that is marked as normal. A true negative (TN) is a normal record that is classified correctly. Finally, a false positive (FP) is a normal record that is misclassified as an attack. We use precision, recall, and F1-score to measure performance.

Table 2: Possible detection results.

True class | Detected as attack | Detected as normal
Attack | True positive (TP) | False negative (FN)
Normal | False positive (FP) | True negative (TN)
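The three metrics follow from the counts in the table by their standard definitions: precision = TP / (TP + FP), recall = TP / (TP + FN), and F1 = 2 · precision · recall / (precision + recall). The sketch below computes them from per-second labels; the function name and the toy labels are illustrative and not taken from the dataset.

```python
def evaluate(labels, predictions):
    """Precision, recall, and F1-score from per-record labels (1 = attack, 0 = normal)."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


labels = [1, 1, 1, 0, 0, 0, 0, 1]
predictions = [1, 1, 0, 1, 0, 0, 0, 1]
precision, recall, f1 = evaluate(labels, predictions)
print(precision, recall, f1)  # 0.75 0.75 0.75
```

True negatives do not enter any of the three metrics, so a long stretch of correctly classified normal records does not inflate the scores.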

6.4. Result Analysis

Before diving into the overall performance, we first analyze the detected results of some attacks. We demonstrate our method using two examples, attack no. 5 and attack no. 31.

In Figure 6, the original value is presented with the reconstruction part and the prediction part. The value of the attacked sensor was decreased to a lower value during the attack. Since we use the EWMA method to smooth the error, the shape of the error is not a rectangle. We detect this attack with 100% recall. Because the attack decreases the sensor’s value to an extremely low value, the error obtained after $p$-power processing is large. As a result, this attack is relatively easy to detect.

Another example is attack no. 31, which impacts device LIT-101, as shown in Figure 7. Compared with attack no. 5, it causes a smaller change in value. The error is also small, which affects the detection performance. The recall for this attack is 79.6%.

From Figures 6 and 7, we can see that the error does not shrink quickly after the attack is over. This is because the system needs time to stabilize. The recall of all attacks is shown in Figure 8.

We compare the results of our method (Com-AE) with three other methods: DNN and SVM [24] and TABOR [23]. The performance obtained is shown in Table 3. Our method achieves higher recall and F1-score than the other three methods, but its precision is lower. Our analysis shows that a higher threshold yields higher precision but reduces the number of detected attacks. To detect more attacks, we use the maximal error obtained during the training phase as the threshold, as described before.

Method | Precision | Recall | F1 score


A detailed comparison of recall for every attack is listed in Table 4. The testing data, recorded in the remaining 4 days, contain 36 attacks in total. DNN and SVM fail to detect some attacks. The TABOR method detects 24 attacks, and our method Com-AE detects 26 attacks. Some attacks may have little impact on the system, since all the methods fail to detect them.


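Table 4: Recall of every attack for each method, with the devices located by our method.

Attack no. | Attack point | Located devices | DNN | SVM | TABOR | Com-AE
1 | MV-101 | FIT-101, MV-101, PIT-502 | 0 | 0 | 0.049 | 0.966
2 | P-102 | P-102, P-302, FIT-301 | 0 | 0 | 0.930 | 0.919
3 | LIT-101 | P-302, FIT-301, DPIT-301 | 0 | 0 | 0 | 0.120
5 | AIT-202 | AIT-202, P-203, P-302 | 0.717 | 0.720 | 0.995 | 1.0
6 | LIT-301 | P-205, P-203, P-101 | 0 | 0.888 | 0 | 0.354
7 | DPIT-301 | P-302, DPIT-301, FIT-301 | 0.927 | 0.919 | 0.992 | 0.992
8 | FIT-401 | FIT-401, UV-401, P-501 | 1 | 0.433 | 0.994 | 0.150
9 | FIT-401 | UV-401, FIT-401, P-501 | 0.978 | 1 | 0.998 | 1.0
13 | MV-303 | P-101, MV-301, P-302 | 0 | 0 | 0.597 | 0.275
15 | AIT-504 | AIT-504, AIT-503, AIT-501 | 0.845 | 0.848 | 0.997 | 0.929
17 | UV-401, AIT-502, P-501 | UV-401, P-501, FIT-504 | 0.998 | 1 | 0.998 | 0.961
18 | P-602, DPIT-301, MV-302 | DPIT-301, P-302, FIT-301 | 0.867 | 0.875 | 0 | 0.984
19 | P-203, P-205 | P-203, P-205, P-204 | 0 | 0 | 0 | 0.523
21 | P-101, LIT-301 | P-102, AIT-503, FIT-201 | 0 | 0 | 0.999 | 0.976
23 | P-302 | P-302, MV-304, FIT-301 | 0.936 | 0.936 | 1.000 | 0.977
25 | P-101, MV-101, LIT-101 | FIT-201, P-102, AIT-503 | 0 | 0.003 | 0.999 | 0.964
27 | LIT-301 | P-205, P-203, P-101 | 0 | 0.905 | 0 | 0.045
28 | LIT-101 | P-302, FIT-301, DPIT-301 | 0 | 0 | 0.890 | 0.225
29 | P-101 | P-102, P-101, AIT-503 | 0 | 0 | 0.990 | 0.178
30 | P-101, P-102 | AIT-202, P-101, FIT-201 | 0 | 0 | 0.258 | 0.507
31 | LIT-101 | AIT-503, MV-304, LIT-101 | 0 | 0.119 | 0.889 | 0.796
32 | P-501, FIT-502 | FIT-504, FIT-503, PIT-502 | 1 | 1 | 0.998 | 0.795
33 | AIT-402, AIT-502 | FIT-101, AIT-502, AIT-402 | 0.923 | 0.927 | 0.996 | 1.0
34 | FIT-401, AIT-502 | AIT-503, FIT-401, AIT-502 | 0.940 | 0 | 0.369 | 0.788
35 | FIT-401 | FIT-401, P-501, UV-401 | 0.933 | 0.927 | 0.997 | 0.805
36 | LIT-301 | LIT-301, AIT-402, AIT-502 | 0 | 0.357 | 0 | 0.321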

6.5. Detected Target

Because of space limitations, we list only the attacks associated with devices within the P1 stage in Table 5. As Table 5 shows, our method correctly locates the target in 5 of the 10 attacks. Even in these 5 attacks, however, the attacked device is not always ranked as the most suspicious one.


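Table 5: Located devices for attacks on the P1 stage.

Attack no. | Attack point | Located devices
1 | MV-101 | FIT-101, MV-101, P-502
2 | P-102 | P-102, P-302, FIT-301
3 | LIT-101 | P-302, FIT-301, DPIT-301
21 | P-101, LIT-301 | P-102, AIT-503, FIT-201
25 | P-101, MV-101, LIT-101 | FIT-201, P-102, AIT-503
28 | LIT-101 | P-302, FIT-301, DPIT-301
29 | P-101 | P-102, P-101, AIT-503
30 | P-101, P-102 | AIT-202, P-101, FIT-201
31 | LIT-101 | AIT-503, MV-304, LIT-101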

There are several reasons for this. Since ICS is a complex system, its variables are closely associated with each other, and the system includes two types of devices, namely, sensors and actuators. When one variable is under attack, other variables change as well, and their degree of change may be higher than that of the attacked variable. How to find the most suspicious part while taking these factors into account will be a direction of our future work.

For the detailed detection results, please refer to Table 4, where the detection results of all 36 attacks are listed.

7. Conclusions and Future Work

A composite autoencoder neural network model is proposed in this paper. The model aims to learn the pattern of ICS working conditions using data collected under normal conditions only. The model predicts and reconstructs the original input time series at the same time, and the error obtained from both parts is used to determine whether an observation of the multivariable time series is anomalous or normal. Given the anomaly results, the change ratio is used to locate the variables that are under attack.

We demonstrate the effectiveness of our method through experiments on the SWaT dataset, which is collected from a real industrial control system. The F1 score is 87.0% and the recall is 88.5%, which are higher than those of the compared methods.

In the future, there are some directions to enhance the performance of our method.
(i) When handling the error obtained by the neural network model, a static value is selected as the threshold to determine whether an anomaly happens. A dynamic threshold would be more accurate.
(ii) Although the algorithm used for locating attacked variables can find the most likely target in some attack scenarios, its accuracy needs improvement.

Also, we have only experimented with our method on the SWaT dataset. It is important to test its performance on more datasets.

Data Availability

Readers who want to reproduce our result or test their own methods can access the dataset used in this research from the website: https://itrust.sutd.edu.sg/itrust-labs_datasets. Please follow the instructions on the website to obtain this dataset.

Conflicts of Interest

All the authors hereby declare no conflicts of interest.

Acknowledgments


We thank the iTrust, Centre for Research in Cyber Security, Singapore University of Technology and Design for designing and sharing the SWaT dataset. This research is funded by the National Key Research and Development Program of China (No. 2018YFB2004200).

References


  1. F. Song, Z. Ai, Y. Zhou, I. You, K. R. Choo, and H. Zhang, “Smart collaborative automation for receive buffer control in multipath industrial networks,” IEEE Transactions on Industrial Informatics, vol. 16, no. 2, pp. 1385–1394, 2020. View at: Publisher Site | Google Scholar
  2. R. Menchaca-Mendez, B. Luna-Nuez, R. MenchacaMendez, A. Yee-Rendon, R. Quintero, and J. Favela, “Opportunistic mobile sensing in the fog,” Wireless Communications and Mobile Computing, vol. 2018, Article ID 2796282, 18 pages, 2018. View at: Publisher Site | Google Scholar
  3. C. Gong, F. Lin, X. Gong, and Y. Lu, “Intelligent cooperative edge computing in the internet of things,” IEEE Internet of Things Journal, p. 1, 2020. View at: Publisher Site | Google Scholar
  4. F. Song, M. Zhu, Y. Zhou, I. You, and H. Zhang, “Smart collaborative tracking for ubiquitous power iot in edge-cloud interplay domain,” IEEE Internet of Things Journal, vol. 7, no. 7, pp. 6046–6055, 2020. View at: Publisher Site | Google Scholar
  5. H. Hui, C. Zhou, S. Xu, and F. Lin, “A novel secure data transmission scheme in industrial internet of things,” China Communications, vol. 17, no. 1, pp. 73–88, 2020. View at: Publisher Site | Google Scholar
  6. I. N. Fovino, “SCADA system cyber security,” in Secure Smart Embedded Devices, Platforms and Applications, pp. 451–471, Springer, New York, NY, USA, 2014. View at: Publisher Site | Google Scholar
  7. S. Zhanwei and L. Zenghui, “Abnormal detection method of industrial control system based on behavior model,” Computers & Security, vol. 84, pp. 166–178, 2019. View at: Publisher Site | Google Scholar
  8. N. Goldenberg and A. Wool, “Accurate modeling of modbus/tcp for intrusion detection in SCADA systems,” International Journal of Critical Infrastructure Protection, vol. 6, no. 2, pp. 63–75, 2013. View at: Publisher Site | Google Scholar
  9. P. Malhotra, L. Vig, G. Shroff, and P. Agarwal, “Long short term memory networks for anomaly detection in time series,” European Symposium on Artificial Neural Networks, i6doc, 2015. View at: Google Scholar
  10. P. Malhotra, A. Ramakrishnan, G. Anand, L. Vig, P. Agarwal, and G. Shroff, “LSTM-based encoder-decoder for multi-sensor anomaly detection,” in ICML 2016 Anomaly Detection Workshop, New York, NY, USA, July 2016. View at: Google Scholar
  11. N. Srivastava, E. Mansimov, and R. Salakhutdinov, “Unsupervised learning of video representations using LSTMs,” in Proceedings of the 32nd International Conference on International Conference on Machine Learning - Volume 37, ICML15, pp. 843–852, Lille, France, 2015. View at: Google Scholar
  12. S. Plaga, N. Wiedermann, S. D. Anton, S. Tatschner, H. Schotten, and T. Newe, “Securing future decentralised industrial IoT infrastructures: challenges and free open source solutions,” Future Generation Computer Systems, vol. 93, pp. 596–608, 2019. View at: Publisher Site | Google Scholar
  13. M. Krotofil and D. Gollmann, “Industrial control systems security: what is happening?” in 2013 11th IEEE International Conference on Industrial Informatics (INDIN), pp. 670–675, Bochum, Germany, July 2013. View at: Publisher Site | Google Scholar
  14. M. Kaouk, J. Flaus, M. Potet, and R. Groz, “A review of intrusion detection systems for industrial control systems,” in 2019 6th International Conference on Control, Decision and Information Technologies (CoDIT), pp. 1699–1704, Paris, France, April 2019. View at: Publisher Site | Google Scholar
  15. R. Chandia, J. Gonzalez, T. Kilpatrick, M. Papa, and S. Shenoi, “Security strategies for SCADA networks,” in Critical Infrastructure Protection. ICCIP 2007. IFIP International Federation for Information Processing, vol 253, E. Goetz and S. Shenoi, Eds., pp. 117–131, Springer, Boston, MA, USA, 2008. View at: Publisher Site | Google Scholar
  16. K. Stefanidis and A. G. Voyiatzis, “An HMM-based anomaly detection approach for SCADA systems,” in Information Security Theory and Practice. WISTP 2016. Lecture Notes in Computer Science, vol 9895, S. Foresti and J. Lopez, Eds., pp. 85–99, Springer, Cham, Heraklion, Greece, 2016. View at: Publisher Site | Google Scholar
  17. B.-K. Kim, D.-H. Kang, J.-C. Na, and T.-M. Chung, “Detecting abnormal behavior in scada networks using normal traffic pattern learning,” in Computer Science and its Applications. Lecture Notes in Electrical Engineering, vol 330, J. Park, I. Stojmenovic, H. Jeong, and G. Yi, Eds., Springer, Berlin, Heidelberg. View at: Publisher Site | Google Scholar
  18. M. K. Yoon and G. Ciocarlie, “Communication pattern monitoring: improving the utility of anomaly detection for industrial control systems,” in Proceedings 2014 Workshop on Security of Emerging Networking Technologies, San Diego, CA, USA, 2014. View at: Publisher Site | Google Scholar
  19. D. Formby, P. Srinivasan, A. Leonard, J. Rogers, and R. Beyah, “Who’s in control of your control system? Device fingerprinting for cyber-physical systems,” in Proceedings 2016 Network and Distributed System Security Symposium, San Diego, CA, USA, Febuary 2016. View at: Publisher Site | Google Scholar
  20. Z. He, A. Raghavan, G. Hu, S. Chai, and R. Lee, “Power-grid controller anomaly detection with enhanced temporal deep learning,” in 2019 18th IEEE International Conference On Trust, Security And Privacy In Computing And Communications/13th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE), pp. 160–167, Rotorua, New Zealand, August 2019. View at: Publisher Site | Google Scholar
  21. M. Munir, S. A. Siddiqui, A. Dengel, and S. Ahmed, “Deepant: a deep learning approach for unsupervised anomaly detection in time series,” IEEE Access, vol. 7, pp. 1991–2005, 2019. View at: Publisher Site | Google Scholar
  22. J. Goh, S. Adepu, M. Tan, and S. L. Zi, “Anomaly detection in cyber physical systems using recurrent neural networks,” in 2017 IEEE 18th International Symposium on High Assurance Systems Engineering (HASE), Singapore, Singapore, January 2017. View at: Publisher Site | Google Scholar
  23. Q. Lin, S. Adepu, S. Verwer, and A. Mathur, “Tabor: A graphical model-based approach for anomaly detection in industrial control systems,” in Proceedings of the 2018 on Asia Conference on Computer and Communications Security, ASIACCS 18, New York, NY, USA, May 2018. View at: Publisher Site | Google Scholar
  24. J. Inoue, Y. Yamagata, Y. Chen, C. M. Poskitt, and J. Sun, “Anomaly detection for a water treatment system using unsupervised machine learning,” in 2017 IEEE International Conference on Data Mining Workshops (ICDMW), pp. 1058–1065, New Orleans, LA, USA, November 2017. View at: Publisher Site | Google Scholar
  25. J. Goh, S. Adepu, K. N. Junejo, and A. Mathur, “A dataset to support research in the design of secure water treatment systems,” in Critical Information Infrastructures Security. CRITIS 2016. Lecture Notes in Computer Science, vol 10242, G. Havarneanu, R. Setola, H. Nassopoulos, and S. Wolthusen, Eds., Springer, Cham, 2016. View at: Publisher Site | Google Scholar
  26. D. Shalyga, P. Filonov, and A. Lavrentyev, “Anomaly detection for water treatment system based on neural network with automatic architecture optimization,” 2018, http://arxiv.org/abs/1807.07282. View at: Google Scholar
  27. D. Kingma and J. Ba, “Adam: A method for stochastic optimization,” in International Conference on Learning Representations, San Diego, CA, USA, 2015. View at: Google Scholar

Copyright © 2020 Chao Wang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
