Abstract

With the development of information technology, an increasing number of industrial control systems (ICSs) can be accessed from the public Internet (with authentication). In such an open environment, cyberattacks become a serious threat to both ICS integrity and data privacy. As a countermeasure, anomaly detection systems are often deployed to analyze the network traffic. However, due to privacy regulations, network packets cannot be directly processed in plaintext in many countries. In this work, we present a privacy-preserving anomaly detection platform for ICS. The platform consists of three nodes running low-latency MPC protocols to evaluate live network packets using decision trees on the fly with privacy assurance. Our benchmark result shows that the platform can process 1000 packets with a depth-9 decision tree in about 11 seconds in the LAN setting.

1. Introduction

A modern industrial control system (ICS) is a complex distributed system that consists of multiple field devices, e.g., sensors, actuators, and instrumentation, as well as some control/management systems. ICS is the interface of cyber-physical system (CPS), enabling humans to control operations and receive data from devices. In recent years, ICS has been widely used in many industrial scenarios, such as gas, water, and nuclear power systems, and the security of these systems is critical.

As shown in Figure 1, a typical architecture of an industrial control system has four layers. (i) The enterprise management layer offers business services and is often connected to the public network; it may include the enterprise resource planning (ERP) system, manufacturing execution system (MES), and management information system (MIS). (ii) The supervisory control layer receives and stores data from the underlying devices and then gives appropriate responses. (iii) The process control layer has programmable logic controllers (PLCs) and remote terminal units (RTUs), which directly control devices in the underlying layer. (iv) The field control layer has multiple field devices that receive commands from and send data to the process control layer. As the enterprise management layer connects to the public network, ICS is exposed to cyberattacks. As attacks grow more sophisticated, the corresponding countermeasures also need to be upgraded. In practice, a number of well-known industrial control systems have been severely affected by cyberattacks. For instance, the Stuxnet worm spied on and reprogrammed the industrial systems controlling centrifuges at an Iranian nuclear facility [1]. In 2021, hackers breached Colonial Pipeline using a compromised password, and Colonial Pipeline had to pay the hackers a ransom [2].

To enhance ICS security, defense systems such as intrusion (or anomaly) detection systems (IDSs) are deployed in ICS to detect potential cyberattacks. By data source, an IDS can be classified as a network intrusion detection system (NIDS) or a host-based intrusion detection system (HIDS). The NIDS examines network traffic, while the HIDS monitors the system data logs. By detection approach, an IDS can be classified as signature-based or anomaly-based. The former detects intrusions by matching known harmful patterns, while the latter flags behavior, such as network traffic, that deviates from a model of normal activity.

In this work, we aim to design an anomaly-based network intrusion detection platform for ICS. The platform can be deployed alongside any existing off-the-shelf ICS, and it can examine live network packets on the fly and raise alarms once an anomaly is detected. However, in many countries, processing network packets in plaintext violates local privacy laws and regulations. The European Union adopted the General Data Protection Regulation [3] in 2016, which sets a clear standard for the processing of personal information. In the United States, the California Consumer Privacy Act [4] took effect in 2020, creating a series of privacy rights for consumers, such as the rights to know, access, and delete. In China, the 2020 edition of the Personal Information Security Specification [5] sets out clear rules for the life cycle of personal information, including collection, storage, use, processing, transmission, disclosure, and deletion. These regulations have a profound impact on systems that store and/or process personal information. Therefore, our anomaly detection platform is designed to be privacy preserving.

As a closely related work, Gao et al. [6] used a homomorphic encryption scheme to encrypt data when training and applying an ICS-specific anomaly detection model. However, homomorphic encryption incurs heavy computational overhead. Instead, we utilize low-latency secure multi-party computation (MPC) techniques for privacy-preserving anomaly detection.

More specifically, our platform consists of three non-colluding servers that run low-latency MPC protocols to analyze network packets in real time using the gradient boosting decision tree (GBDT) model with privacy assurance. GBDT is an effective machine learning algorithm that classifies input data rapidly and with high accuracy.

1.1. Our Contributions

In this work, we present an efficient MPC-based privacy-preserving anomaly detection platform for ICS. More specifically, the contributions of this work are as follows:
(i) We propose a new MPC-based anomaly detection architecture for ICS, and it is compatible with any off-the-shelf ICS.
(ii) We design several new constant-round low-latency MPC protocols for privacy-preserving decision tree evaluation.
(iii) We implement a prototype of the proposed system, and our benchmark result shows that processing 1000 network packets with a depth-9 decision tree takes 11 seconds in the LAN setting.

1.2. Roadmap

The remainder of this paper is organized as follows. We introduce the preliminary knowledge about the approach we used in Section 2. Then, system overview and security model of our platform are given in Section 3. Section 4 describes privacy-preserving decision tree evaluation in detail. We present the performance of the proposed platform in Section 5. The related work is given in Section 6. Finally, conclusion and future work are given in Section 7.

2. Preliminary

2.1. Notations

Throughout the paper, we use the following notations. Denote as the security parameter. Denote a value indexed by a label as . $(t, n)$-secret sharing divides a secret into $n$ parts such that any $t$ participants can jointly reveal the secret. The notations for (2, 2)-additive secret sharing, (3, 3)-additive secret sharing, and (2, 3)-additive secret sharing are given in Table 1.

$r \leftarrow R$ means to randomly sample the element $r$ from the set $R$. In addition, $y \leftarrow f(x)$ means that $y$ is the output when the function $f$ takes $x$ as input. For , map it to by adding .

2.2. Gradient Boosting Decision Tree

Our proposed platform mainly uses gradient boosting decision tree (GBDT) as intrusion detection model. Decision tree is a classical machine learning model, which is efficient and interpretable. Its non-leaf node is decision node, which performs a test to decide to go to left sub-tree or right sub-tree. Its leaf node is the end of a decision path that begins with root node, including prediction result. Boosting is a kind of algorithm that combines many weak learners into a strong learner. The first step is training a base learner, like decision tree. Then, adjust training samples according to the classification result of base learner, so that those misclassified samples will get more attention in the subsequent training process. After that, train next weak learner using adjusted training samples. Repeat the process iteratively to obtain enough weak classifiers and combine them together according to their weight to obtain a strong classifier. Gradient boosting is an algorithm in boosting, which iterates the new learner through gradient descent.

The GBDT is a learning algorithm based on boosting. In essence, each new regression decision tree is built along the gradient descent direction of the loss function of the previous round, and the regression decision trees are finally combined into a gradient boosting decision tree. When $x$ is the input of GBDT, its classification result is $\hat{y} = \sum_{k=1}^{K} f_k(x)$, where $K$ is the number of decision trees in GBDT and $f_k(x)$ is the $k$-th tree's output. In general, each tree in GBDT is a CART tree.
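The additive prediction rule above can be illustrated with a short plaintext sketch. The `Node` class, the helper names, and the two toy trees in the test are hypothetical and only serve the example; the branching rule (go right when the feature value exceeds the threshold) follows the convention described in Section 4.3.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Node:
    """Hypothetical tree node: a decision node if it has children, else a leaf."""
    feature: int = 0            # index of the feature tested at this node
    threshold: float = 0.0      # go right if x[feature] > threshold
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    value: float = 0.0          # prediction stored at a leaf

    def is_leaf(self) -> bool:
        return self.left is None and self.right is None


def tree_output(root: Node, x: Sequence[float]) -> float:
    """Follow one decision path from the root to a leaf."""
    node = root
    while not node.is_leaf():
        node = node.right if x[node.feature] > node.threshold else node.left
    return node.value


def gbdt_score(trees: List[Node], x: Sequence[float]) -> float:
    """GBDT output: the sum of the outputs of all trees."""
    return sum(tree_output(t, x) for t in trees)
```

Because the final score is a plain sum, each tree can be evaluated independently and the outputs added afterwards.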

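Given a training set $\{(x_i, y_i)\}_{i=1}^{n}$, where $x_i$ is the input feature vector and $y_i$ is its class label, the process of training GBDT consists of $K$ rounds of iteration. In the $k$-th iteration, the goal is to generate a decision tree $f_k$ that minimizes the objective function
$$\mathrm{Obj}^{(k)} = \sum_{i=1}^{n} l\left(y_i, \hat{y}_i^{(k-1)} + f_k(x_i)\right) + \Omega(f_k), \qquad \Omega(f_k) = \gamma T + \frac{1}{2}\lambda \sum_{j=1}^{T} w_j^2, \tag{1}$$
where $\Omega(f_k)$ is the regularization term and $\hat{y}_i^{(k-1)}$ is the $i$-th sample's classification result in the $(k-1)$-th iteration. $l$ is the loss function, $T$ is the number of leaf nodes, and $w_j$ is the value of the $j$-th leaf node. Applying a second-order Taylor expansion to expression (1) and dropping the terms that do not depend on $f_k$ gives
$$\mathrm{Obj}^{(k)} \approx \sum_{i=1}^{n} \left[ g_i f_k(x_i) + \frac{1}{2} h_i f_k^2(x_i) \right] + \Omega(f_k), \tag{2}$$
where $g_i$ is the first-order derivative and $h_i$ is the second-order derivative of the loss function $l$ with respect to $\hat{y}_i^{(k-1)}$.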

The -th decision tree only includes a root node with the training set initially. Suppose a sample set in a node is partitioned into and ; is defined as follows.

Perform a test for each possible split point and select the optimal split point that causes minimum . If the current node does not meet the splitting requirements, for example, the depth of the current node reaches the maximum, the current node becomes a leaf node with a value , and is defined as follows.
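For reference, the split criterion and leaf value described in this subsection are commonly written as follows in the XGBoost-style formulation that the text appears to follow. This is a reconstruction under that assumption, not necessarily the authors' exact equations. Here $I$ is the sample set of a node, $I_L$ and $I_R$ are the two parts after a split, $g_i$ and $h_i$ are the first- and second-order derivatives of the loss, and $\gamma$ and $\lambda$ are the regularization coefficients.

```latex
\begin{align*}
G_I &= \sum_{i \in I} g_i, \qquad H_I = \sum_{i \in I} h_i, \\
w_I^{*} &= -\frac{G_I}{H_I + \lambda}
  && \text{(value of a leaf holding sample set } I\text{)}, \\
\mathrm{Obj}(I) &= -\frac{1}{2}\,\frac{G_I^{2}}{H_I + \lambda} + \gamma
  && \text{(objective contribution of that leaf)}, \\
\Delta(I_L, I_R) &= \mathrm{Obj}(I_L) + \mathrm{Obj}(I_R) - \mathrm{Obj}(I) \\
  &= -\frac{1}{2}\left[\frac{G_{I_L}^{2}}{H_{I_L} + \lambda}
     + \frac{G_{I_R}^{2}}{H_{I_R} + \lambda}
     - \frac{G_I^{2}}{H_I + \lambda}\right] + \gamma .
\end{align*}
```

Under this convention the optimal split is the one with the smallest $\Delta$, which matches the "minimum" wording above; the more familiar "gain" to be maximized is simply $-\Delta$.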

GBDT has strong classification ability in anomaly detection task.

2.3. Secure Multi-Party Computation

Secure multi-party computation permits two or more participating parties to obtain output result by jointly computing over sensitive data from respective inputs. At the same time, the participating parties do not learn more about other parties’ inputs than the information about the output, so that each participating party can get computation result without leaking sensitive message.

Secure multi-party computing usually includes two different adversary models, namely, semi-honest security model and malicious security model. A semi-honest security model is one in which the adversary will honestly perform the intended calculation process but may wish to know the information of each party to the maximum extent. A malicious security model is one in which an adversary can control, manipulate, and arbitrarily contaminate information on a multi-party computing network. In this work, we mainly consider the semi-honest security model.

Although the first MPC protocol was proposed by A. C.-C. Yao [7] in the 1980s, practical implementations have appeared only in the last eighteen years or so. Nowadays, MPC is becoming more important as data privacy receives more and more attention. It has been adopted for private set intersection [8] and privacy-preserving machine learning [9].

Secret sharing is one of the important building blocks of MPC. The remainder of this section introduces the distributed interval containment function (DICF) and the shared oblivious transfer that we adopt in our MPC protocol.

2.3.1. Function Secret Sharing

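Function secret sharing (FSS) [10] splits a function $f$ into two shares $f_0$ and $f_1$ such that $f_0(x) + f_1(x) = f(x)$ for each input $x$ in a domain of $N$ elements. A distributed point function (DPF) is an FSS scheme for point functions. A point function $f_{\alpha,\beta}$ has only one non-zero value $\beta$ in its range, taken at the input $\alpha$. There are two algorithms in DPF:
(i) $\mathsf{Gen}(\alpha, \beta)$: It generates a pair of keys $(k_0, k_1)$. Each key is a share of $f_{\alpha,\beta}$ that does not reveal $\alpha$ and $\beta$.
(ii) $\mathsf{Eval}(b, k_b, x)$: For $b \in \{0, 1\}$, it outputs $y_b$ such that $y_0 + y_1 = f_{\alpha,\beta}(x)$.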

Denote run on all inputs by .
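The correctness property of a DPF can be illustrated with a deliberately naive construction: each key is a full-length additive share of the point function's truth table. This is only a sketch of the interface. Practical DPF schemes use much more compact keys (logarithmic rather than linear in the domain size), and the function names, the modulus `MOD`, and the key layout below are assumptions made for illustration.

```python
import secrets

MOD = 2**32  # assumed output ring Z_{2^32}


def dpf_gen(alpha: int, beta: int, n: int):
    """Naive Gen: additively share the truth table of f_{alpha,beta} over {0..n-1}."""
    k0 = [secrets.randbelow(MOD) for _ in range(n)]
    k1 = [((beta if j == alpha else 0) - k0[j]) % MOD for j in range(n)]
    return k0, k1


def dpf_eval(key, x: int) -> int:
    """Naive Eval: a party's share of f_{alpha,beta}(x)."""
    return key[x]


def dpf_eval_all(key):
    """Full-domain evaluation: the party's shares for every input."""
    return list(key)
```

Each key on its own is a uniformly random vector, so a single key holder learns nothing about the position or the value of the non-zero point.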

DICF [11] is also an FSS scheme; it decides whether a secret input value lies in a publicly known interval. Denote an interval containment function for the interval $[p, q]$ as
$$f_{p,q}(x) = \begin{cases} 1, & p \le x \le q, \\ 0, & \text{otherwise.} \end{cases}$$

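DICF uses the offset interval containment function defined as
$$f_{p,q}^{r_{in}, r_{out}}(x) = f_{p,q}(x - r_{in}) + r_{out},$$
where $f_{p,q}$ is the interval containment function for the public interval $[p, q]$, and $r_{in}$ and $r_{out}$ are random offset values. Like DPF, DICF also consists of two algorithms.
(i) $\mathsf{Gen}(p, q, r_{in}, r_{out})$: it generates $(k_0, k_1)$, where $p, q$ are publicly known and $r_{in}$ and $r_{out}$ are unrevealed.
(ii) $\mathsf{Eval}(b, k_b, x)$: outputs a result $y_b$, so that $y_0 + y_1 = f_{p,q}^{r_{in}, r_{out}}(x)$.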
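A minimal sketch of how an offset interval containment function is used, assuming the common form in which the evaluators see only a masked input $x + r_{in}$ and obtain shares of the comparison bit plus $r_{out}$. As with the DPF sketch, the keys here are naive truth-table shares rather than the compact keys of the scheme in [11]; the domain size `N`, the modulus `MOD`, and all function names are illustrative assumptions.

```python
import secrets

N = 2**8      # assumed input domain Z_N
MOD = 2**32   # assumed output ring


def icf(p: int, q: int, x: int) -> int:
    """Plain interval containment: 1 if p <= x <= q, else 0."""
    return 1 if p <= x <= q else 0


def dicf_gen(p: int, q: int, r_in: int, r_out: int):
    """Naive Gen: share the table of g(m) = icf(m - r_in) + r_out over masked inputs m."""
    table = [(icf(p, q, (m - r_in) % N) + r_out) % MOD for m in range(N)]
    k0 = [secrets.randbelow(MOD) for _ in range(N)]
    k1 = [(t - a) % MOD for t, a in zip(table, k0)]
    return k0, k1


def dicf_eval(key, masked_x: int) -> int:
    """Naive Eval on the masked input x + r_in (mod N)."""
    return key[masked_x]
```

The evaluators never see $x$ itself, only the masked value, and the output offset $r_{out}$ is removed by whoever is entitled to the result.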

2.3.2. Oblivious Transfer

Oblivious transfer (OT) [12] is an important basic block in many MPC protocols. In oblivious transfer protocol, a sender has multiple messages and only one of them will be selected by receiver. Which message is selected is oblivious to the sender and the receiver can only obtain the selected message.

Shared OT is a kind of OT scheme that is used to fetch value in the shared form without revealing the value. In our approach, we utilize a 3-party shared OT protocol. In this protocol, three participants share a data vector and an index , as holds , holds , and holds , where . Then, they can fetch in the shared form without revealing by jointly computing.

2.4. Intrusion Detection

The main goal of intrusion detection system [13] is to detect cyberattacks. Cyberattack is any type of offensive action against computer systems, computer networks, or personal computer. Damaging, exposing, modifying, disabling software or services, or stealing or accessing data from any computer without authorization is considered an attack on the computer and computer network. According to the attack mode, cyberattack can be divided into active attack and passive attack. An active attack attempts to destroy computer system, which includes denial of service (DoS), distributed denial of service (DDoS), and botnet, while a passive attack aims to learn information about network system like port scan attack.

DoS deliberately attacks flaws in the network protocol implementation or depletes the target’s resources by brutal means, so that service or network cannot provide normal services. DDoS is a special form of denial of service attack based on DoS. It is a distributed and coordinated large-scale attack that may come from multiple attackers.

Botnet refers to the use of one or more means of transmission to infect a large number of hosts with bot program virus, so as to form a one-to-many control network between the controller and the infected host. The attacking process of port scanning attack is usually to remotely scan each port of the target computer, detect the services provided by different ports, and then record the response of the target computer to collect its information.

Generally, network anomaly detection requires the information about data packets, such as packet header characteristics, characteristics about TLS, and packet length.

3. System Framework and Security Model

This section gives the overview of our system framework firstly and describes the security model in Section 3.2.

3.1. System Framework

There are several components in the system framework, as depicted in Figure 2. The ICS preprocessed its packages firstly, including extracting features and secret sharing. For each data package, a feature vector is extracted from it. The feature vector contains the information about packet header and packet length. Then, the feature vector is divided into three parts using (2, 3)-additive secret sharing among . Next, the parts of secret are distributed to three servers, where a well-trained CART model is stored in the shared form using (2, 2)-additive secret sharing between and . Finally, the three nodes jointly obtain a detection result based on CART model, running MPC protocols described in Section 4.

3.2. Security Model

In the process of anomaly detection, we cannot store and process sensitive data from ICS in plaintext, according to respective laws and regulations. In order to detect attacks in network traffic with privacy assurance, we adopt MPC to achieve our goals. Firstly, we assume that there is a component in the ICS that can extract feature vectors of its network packages. This component shall be trusted, and it will then secretly share the extracted features to the three MPC nodes of our platform. One out of the three MPC nodes can be semi-honestly corrupted by the adversary. The shared process result will be sent back to the system admin of ICS, who will recover the result and make further actions accordingly.

3.2.1. Security Requirements

As described above, our proposed platform should protect the privacy of ICS data when examining the sensitive data. Besides, the platform should respond accurately and quickly so that the ICS can identify anomalies in time. Thus, we define the following key security requirements.
(i) Data Privacy. Though we inspect sensitive data from the ICS, the data will not be stored or processed in plaintext. Even if an MPC node is semi-honestly corrupted by the adversary, the data privacy can still be protected.
(ii) Accuracy. As the platform's task is anomaly detection, the accuracy of the detection model should be as high as possible.
(iii) Timeliness. Our platform should respond to the ICS as fast as possible so that the ICS can handle cyberattacks in a timely manner.

4. Privacy-Preserving Decision Tree Evaluation

This section describes the MPC protocols utilized in our approach. Firstly, we describe 3-party shared OT in Section 4.1. Then, we give the whole detection process, including data preprocessing, tree model storage, and evaluation.

4.1. 3-Party Shared OT

Given a replicated shared data vector and an additively shared index , three participants hold the shared form , , and respectively, that is similar to [14]. Then, the three participants can obtain in shared form by running our 3-party shared OT protocol.

4.1.1. Intuition

Our protocol is mainly constructed on the basis of Paul et al. [14], whose main idea is that each participant serves as the generator of DPF scheme, while the other two participants serve as evaluators to get -th value of vector in the shared form. For instance, let be the DPF generator and be the DPF evaluators. Firstly, and randomly select , respectively. Then, exchange and send to . After that, compute and computes . It is easy to see that . Next, generates a pair of DPF keys for point function and sends keys to evaluators. Finally, run full domain evaluation to jointly obtain ( is 1 if , and is 0 otherwise). Note that -th element in shifted vector is , as is cyclic shifted to the right position. After all these steps, hold in shared form. Following similar steps, can jointly get and can jointly get . In our 3-party shared OT protocol, we let the generator produce DPF keys of , where is randomly picked by generator. Then, the generator can produce DPF keys, which leads to less communication. Subsequently, all participants jointly compute and reveal to evaluators. At the end, evaluators can get in the shared form.

4.1.2. Protocol Description

The 3-party shared OT is depicted in Protocol 1. Initially, for , and agree on a random seed as , and if index greater than 2, is the brief form of . Note that as the index , we omit in the rest of this paper. Before each round of shared OT, for , node generates DPF keys . Then, sends to and to . All participants generate some using random seeds and pseudo-random function PRF. They jointly compute and reveal to evaluators and . Next, evaluators use DPF keys to get by running algorithm. Then, they jointly obtain

They can jointly get as . Lastly, participants rerandomize shares to ensure their uniform distribution.

Initialization:
   for eachdo
     and have the same random seed ;
   end
Preparing:
   for eachdo
    Generate ;
    Generate a pair of keys for ;
    Send to , to ;
   end
for eachdo
 Receive from the environment;
end
for eachdo
for eachdo
  ;
  ;
end
;
 Send to , to ;
end
for eachdo
 Receive from , from ;
end
for eachdo
for eachdo
  ;
end
;
;
;
, ;
 Return ;
end
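
The mask-and-shift idea behind the 3-party shared OT can be simulated in a single process. The sketch below is not Protocol 1 itself: it uses naive unit-vector shares in place of real DPF keys, it passes the index in the clear to the simulation (in the protocol the index is secret-shared and only the masked offset is opened), and it omits the PRF-based rerandomization. All names are hypothetical.

```python
import secrets

MOD = 2**32  # assumed share ring


def share3(vec):
    """Split vec into three additive components D0 + D1 + D2 = vec (mod MOD)."""
    d0 = [secrets.randbelow(MOD) for _ in vec]
    d1 = [secrets.randbelow(MOD) for _ in vec]
    d2 = [(v - a - b) % MOD for v, a, b in zip(vec, d0, d1)]
    return [d0, d1, d2]


def unit_vector_shares(r: int, n: int):
    """Stand-in for DPF keys of the point function with value 1 at position r."""
    k0 = [secrets.randbelow(MOD) for _ in range(n)]
    k1 = [((1 if j == r else 0) - k0[j]) % MOD for j in range(n)]
    return k0, k1


def shared_ot(components, index: int):
    """Return six additive shares whose sum is the selected element."""
    n = len(components[0])
    out_shares = []
    for comp in components:                # one pass per DPF generator
        r = secrets.randbelow(n)           # generator's secret random position
        k0, k1 = unit_vector_shares(r, n)  # keys sent to the two evaluators
        delta = (index - r) % n            # the only value opened to the evaluators
        # Both evaluators hold this component; after the shift, shifted[r] == comp[index].
        shifted = [comp[(j + delta) % n] for j in range(n)]
        out_shares.append(sum(a * b for a, b in zip(k0, shifted)) % MOD)
        out_shares.append(sum(a * b for a, b in zip(k1, shifted)) % MOD)
    return out_shares
```

Since `delta` is the index masked by a uniformly random `r`, opening it reveals nothing about the index to the evaluators, which is the intuition used in the security analysis.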
4.2. Data Preprocessing

Before ICS transmits its network packages, the data need to be preprocessed in two steps. Firstly, ICS extracts a feature vector for each package so that decision tree can detect on package level. Then, it completes data desensitization. A feature vector is shared as , where and . Then holds , holds , and holds .
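The sharing step can be sketched as follows, assuming the (2, 3) scheme is the replicated additive sharing referred to in Section 4.1: the feature vector is split into three additive components and each server receives two of them, so any two servers can reconstruct while a single server sees only random-looking values. The modulus, the function names, and the assignment of components to servers are assumptions for illustration.

```python
import secrets

MOD = 2**32  # assumed share ring


def replicated_share(x):
    """Split x into three components; server j holds components j and j+1 (mod 3)."""
    x0 = [secrets.randbelow(MOD) for _ in x]
    x1 = [secrets.randbelow(MOD) for _ in x]
    x2 = [(v - a - b) % MOD for v, a, b in zip(x, x0, x1)]
    parts = [x0, x1, x2]
    return [(parts[j], parts[(j + 1) % 3]) for j in range(3)]


def reconstruct(views, a: int, b: int):
    """Recover x from the views of any two distinct servers a and b."""
    comps = {}
    for j in (a, b):
        comps[j] = views[j][0]
        comps[(j + 1) % 3] = views[j][1]
    if len(comps) != 3:
        raise ValueError("two distinct servers are required")
    return [(u + v + w) % MOD for u, v, w in zip(comps[0], comps[1], comps[2])]
```

Feature values would first need to be encoded as ring elements (for example, fixed-point integers); that encoding is outside this sketch.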

4.3. Storage of Tree Model

As we adopt a constant-round MPC protocol that needs a full binary tree and our trained model is just a binary tree, we will pad the binary tree as depicted in Figure 3 when storing the tree model. Then, we get a full binary tree such that adding tree nodes does not affect the final result. This full binary tree can be saved as two vectors and , where is the number of non-leaf nodes and is the number of leaf nodes in a full binary tree with depth . denotes the non-leaf node with index . The index increases from top to bottom, left to right. If a non-leaf node has non-leaf sub-nodes, its left child node is and its right child node is . The values and belongs to the i-th non-leaf node. Given a feature vector, the decision tree algorithm extracts the -th value of the feature vector to compare with the threshold . If ti-th value of feature vector is greater than , perform the same operation on the right child node, otherwise on left child node. The algorithm will be end when the node is a leaf node. The in is classification result when the leaf node with index is the end of decision path. and are shared as and . Then, holds and holds , where , and .
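The padding and the two-vector layout can be sketched in plaintext. The sketch assumes 0-based level-order indexing, in which the children of node `i` are `2*i + 1` and `2*i + 2`; the paper's exact index formulas are not reproduced here. A leaf that sits above the target depth is replaced by a dummy decision node whose two subtrees both lead to the same label, so the result is unchanged. The tuple encoding of trees and the function names are hypothetical.

```python
def to_full_arrays(tree, depth: int):
    """Pad a binary tree to a full tree of the given depth.

    tree is either an int leaf label or a tuple (feature, threshold, left, right).
    Returns (nodes, leaves): nodes[i] = (feature, threshold) for the 2**depth - 1
    non-leaf nodes in level order, and leaves holds the 2**depth labels.
    """
    n_internal = 2**depth - 1
    nodes = [None] * n_internal
    leaves = [None] * (2**depth)

    def fill(t, i, level):
        if level == depth:
            if not isinstance(t, int):
                raise ValueError("tree is deeper than the requested depth")
            leaves[i - n_internal] = t
        elif isinstance(t, int):
            nodes[i] = (0, 0.0)  # dummy test; both branches give the same label
            fill(t, 2 * i + 1, level + 1)
            fill(t, 2 * i + 2, level + 1)
        else:
            feature, threshold, left, right = t
            nodes[i] = (feature, threshold)
            fill(left, 2 * i + 1, level + 1)
            fill(right, 2 * i + 2, level + 1)

    fill(tree, 0, 0)
    return nodes, leaves


def evaluate(nodes, leaves, x):
    """Plaintext evaluation: go right if the feature value exceeds the threshold."""
    i = 0
    while i < len(nodes):
        feature, threshold = nodes[i]
        i = 2 * i + 2 if x[feature] > threshold else 2 * i + 1
    return leaves[i - len(nodes)]
```

In the platform these two vectors would then be additively shared between two servers rather than kept in the clear.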

4.4. Evaluation

When the three servers received a feature vector, respective feature values will be compared with each value in non-leaf node . For each edge in decision tree, and will obliviously set their cost to 0 if the edge is selected according to the comparison; otherwise, set to a random non-zero value, as depicted in Figure 4. Then, and jointly sum up edge costs for all paths. Among all costs of paths, only one is zero, that is, the corresponding path is the decision process and the classification of the leaf node in this path is detection result.
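The edge-cost idea can be checked with a plaintext sketch: the edge chosen by each comparison costs 0, the other edge gets a random positive cost, and the only root-to-leaf path with total cost 0 is the true decision path. In the protocol these costs are secret-shared ring elements; here they are plain positive integers so that sums cannot cancel. The level-order layout (children of node `i` at `2*i + 1` and `2*i + 2`) and the function name are assumptions.

```python
import secrets


def path_costs(nodes, x, depth: int):
    """Return the total edge cost of the path to each of the 2**depth leaves."""
    left_cost, right_cost = [], []
    for feature, threshold in nodes:
        go_right = x[feature] > threshold
        noise = 1 + secrets.randbelow(2**16)      # strictly positive
        left_cost.append(noise if go_right else 0)
        right_cost.append(0 if go_right else noise)

    costs = []
    for leaf in range(2**depth):
        i, total = 0, 0
        for level in range(depth):
            bit = (leaf >> (depth - 1 - level)) & 1  # 0 = left edge, 1 = right edge
            total += right_cost[i] if bit else left_cost[i]
            i = 2 * i + 1 + bit
        costs.append(total)
    return costs
```

The position of the single zero selects the leaf label, which is why the protocol afterwards only needs to locate that zero obliviously.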

As described in Protocol 2, the process of evaluation contains three key steps: feature selection, comparison, and path evaluation.

Initialization:
   for eachdo
     and agree on the same random seed ;
   end
Preparing:
   ;
   foreachdo
     generates ;
generates keys for ;
     sends to , to ;
   end
for eachdo
for eachdo
  ;
  ;//feature selection
  Run 3-party shared OT protocol to get ;
  ;//comparison
end
;
 Send to ;
end
for each do
receives from , from ;
for eachdo
  ;
  ;
  ;//path evaluation
;
end
;
fordo
  Sum up the share of edge costs along -th leaf node’s path;
  ;
  , ;
  ;
end
  , ;
  Send to
end
receives from , from ;
for eachdo
ifthen
  ;
   for point function ;
  Send to , to ;
  Return ;
end
end
for eachdo
receives ;
;
;
 Return ;
end
4.4.1. Feature Selection

For each node , is stored in and in the shared form . will be extended to , and holds 0. Then, run 3-party shared OT mentioned above to get the feature value .

4.4.2. Comparison

Comparison depends on the DICF scheme, where generates keys and are evaluators. generates a pair of keys for each non-leaf node to compare corresponding feature value of input with a random value , such that servers cannot obtain . Then, get a value by jointly computing. Next, and jointly get comparison result by evaluating DICF keys with , as if and otherwise.

4.4.3. Path Evaluation

and generate random value r_i together for each non-leaf node in the tree. Then, and locally compute the left out-going edge cost and the right-going edge cost for node . Then, as depicted in Figure 4, and jointly get path costs , where and only one path cost in is 0. To obliviously get classification result according to the position of 0 in , jointly pick a random value , and cyclic shift and to the right position to obtain and . Then, they generate a random vector , and jointly compute ( is the element with index i in ). Subsequently, reveal to . After obtaining , generates a pair of DPF keys for point function , where is the index of . Lastly, serve as evaluators and jointly compute . send to ICS and sends to ICS, so that the system admin of ICS gets classification result. Note that the result .

4.5. Security Analysis

The main building block of our privacy-preserving decision tree evaluation protocol is the 3-party shared OT (cf. Section 4.1). The construction of the 3-party shared OT protocol is inspired by Paul et al. [14]. At high level, in turn, each of the three servers plays the role of a DPF generator, and the other two servers play the role of DPF evaluators to obtain the -th position value of their (2, 3)-replicated shared data . In particular, for instance, when is the DPF generator, it generates a pair of DPF keys for a random position . Let the shared OT choice be . In the online phase, the servers jointly open to . They can then shift the shared data by position such that the -th position of the shifted data is .

We now analyze the security of this part. First of all, revealing leaks no information about , as is masked by information theoretically. Moreover, assuming that the underlying DPF scheme is secure, and obtain the shared form of . Repeating the above process for all three servers, we obtain the final result. Note that, for efficiency, we use PRF to generate shares of 0 without communication; assume that the underlying PRF is secure, and the generated shares are computationally indistinguishable from uniformly random ones. Therefore, when any of the three servers is semi-honest corrupted, its view is computationally indistinguishable from a few random shares (and the DPF keys).

When the model is not a full binary tree, we pad the model to a full binary tree by adding dummy nodes; therefore, the tree evaluation process does not leak any information about the tree structure to the MPC players. The security of the feature selection phase can be reduced to the security of the 3-party shared OT. In addition, we adopt the DICF scheme for secure comparison, and its security is proven in [11]. Finally, with regard to the path evaluation, we designed an encoding scheme for the tree such that we can evaluate the path within one multiplicative round. As a result, after evaluation, only the output label will be 0, and the remaining labels are uniformly random. Since each tree uses fresh random encoding instances, the three servers cannot learn any additional information other than the intended output label.

5. Implementation and Benchmark

5.1. Dataset Description and Experiment Setup

In the experiment, we adopt CICIDS2017 [15] as dataset. It captures normal packages and attacks in simulated network environment that is similar to the real-world network. The dataset contains several CSV files, and each of them includes a kind of attack. We perform experiment on the CSV files corresponding to DDoS, DoS, botnet, port scan, and web attacks, as described in Table 2. Each entry in CSV file contains a feature vector and a class label. The feature vector includes 78 features, such as timestamp, source IP, destination IP, package length, and protocol.

We evaluate the performance of the GBDT model and the CART model for the binary classification task with the CICIDS2017 dataset. The samples of DDoS, DoS, botnet, port scan, and web attacks are labeled as attack, and the others are labeled as normal. We randomly select 80% of this dataset as our training set and use the remainder as the validation set.

5.2. Evaluation Metrics

To evaluate the performance of intrusion detection, we adopt the evaluation metrics Precision, Recall, and F1-score. These metrics depend on four parameters. (i) True Positive (TP) denotes the number of attack samples that are correctly classified. (ii) False Negative (FN) denotes the number of attack samples that are wrongly classified. (iii) True Negative (TN) denotes the number of normal samples that are correctly classified. (iv) False Positive (FP) denotes the number of normal samples that are wrongly classified.
(i) Precision = TP / (TP + FP) indicates how many of the samples classified as attacks are real attacks.
(ii) Recall = TP / (TP + FN) indicates how many attack samples are correctly classified. Since the proportion of attacks in the total sample is small and an attack can cause severe consequences, we need to identify as many attacks as possible. Therefore, Recall is an important evaluation metric.
(iii) F1-score = 2 · Precision · Recall / (Precision + Recall) is calculated from Precision and Recall and shows the trade-off between them.
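The three metrics follow directly from the confusion-matrix counts. The helper below is a minimal sketch; the function name is hypothetical, and the counts used in the test are made up for illustration rather than taken from the experiments.

```python
def precision_recall_f1(tp: int, fp: int, fn: int):
    """Compute Precision, Recall, and F1-score, treating 'attack' as the positive class."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

True negatives do not appear in any of the three formulas, which is why these metrics stay informative when normal traffic heavily outnumbers attacks.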

5.3. Intrusion Detection Result

In our experiment, we utilize GBDT mentioned in Section 2.2 as our detection model firstly. We set the regularization coefficients and and learning rate as 0.1. In the GBDT model, each tree’s maximum depth is set to 9. Figure 5 shows the performance of GBDT model, when the iterations are 5, 10, 15, 20, 40, and 60.

We use regularization item to prevent overfitting. As shown in Figure 5, the metrics and -score notably enhance as the number of iterations increases. When the number of iterations reaches 60, the GBDT model performs good on three metrics (, , and ).

Then, we use a CART decision tree model as detection model to evaluate its performance on CICIDS2017 dataset. We set the maximum depth of decision tree to 9 and obtain its evaluation result (, , and ). By adopting multiple decision trees for iterative learning, the GBDT model obtain stronger classification ability than single CART decision tree.

5.4. Time Efficiency

We run our platform in different network environments to evaluate its time efficiency. The network environments are simulated, including LAN (0.1 ms RTT, 1 Gbps bandwidth), MAN (6 ms RTT, 100 Mbps bandwidth), and WAN (80 ms RTT, 40 Mbps bandwidth). We set the depth of full decision tree to 5, 7, 9, 11, and 13. We evaluate the time efficiency of our proposed platform when the platform evaluates one tree and evaluates one thousand trees. Each tree is a full binary tree. Our benchmarks are executed on a desktop with Intel(R) Core i7 8700 CPU @ 3.2 GHz, and the operating system is Ubuntu 18.04.2 LTS with 6 CPUs, 32 GB memory, and 1 TB SSD.

As shown in Table 3, our platform performs well in the different simulated network environments. It can evaluate one thousand trees of depth 9 in 11 seconds in the LAN setting (0.1 ms RTT, 1 Gbps bandwidth). However, increasing the depth of the decision tree results in more communication cost because the constant-round protocol's communication cost is $O(2^d)$, where $d$ is the depth of the full tree. Therefore, the proposed protocol is not suitable for tree models whose depth is greater than 9. As GBDT uses multiple trees, evaluating the GBDT model is significantly slower than evaluating the CART model. Each tree of GBDT can be evaluated independently, and finally the client sums up the trees' evaluation results to obtain the classification result. Therefore, we can improve the parallel computing capability of the proposed platform to enhance the time efficiency of the GBDT model.

6. Related Work

Anomaly detection has been developed for decades and is widely used as a defensive method in conventional networks. However, since ICS differs from conventional network systems, anomaly detection techniques cannot be used in ICS directly. Availability and real-time performance are required of an ICS-specific IDS [16]. There is a large body of work on ICS-specific IDS. With the development of machine learning (ML) and deep learning (DL) algorithms, most recent works use them to detect anomalies in ICS. The authors in [17] evaluated several machine learning models, such as Nearest Neighbor, Random Forests, Naive Bayes, SVM, AdaBoost, and JRip, on an ICS dataset called Power System Dataset. In [18], the authors evaluated different ML and DL algorithms using their generated ICS dataset Electra. These algorithms include One-Class SVM, SVM, Isolation Forest, Random Forest, and Neural Network. In [19], the authors used the Pearson Correlation Coefficient (PCC) to select packet features and used the Gaussian Mixture Model (GMM) to transform important features for privacy preservation. Then, they used the transformed features as input of a Kalman Filter to detect anomalies. In [20], the authors utilized a Bloom filter to store the signature database for packet-based intrusion detection and applied an LSTM model to learn temporal features.

There have been several works on private branching program (BP) and decision tree evaluation with a constant number of communication rounds. The work in [21] evaluates a BP with input encrypted by a homomorphic public-key cryptosystem. However, it is impractical when the input feature vector is too large. After that, several evaluation protocols with constant communication rounds were proposed. In [22], the authors utilized additive homomorphic encryption (AHE) and OT for oblivious feature selection and converted the BP model into a secure program with garbled circuits for comparison. Bost et al. [23] evaluated a decision tree with costly fully homomorphic encryption (FHE) by treating the decision tree as a high-degree polynomial. The authors of [24] used OT to select the leaf node and the DGK protocol based on AHE instead of FHE for comparison. Raymond et al. [25] improved the work in [24] by representing the decision tree as linear functions instead of a high-degree polynomial. They computed the "path cost" of each leaf node and used it to decide which leaf node contains the classification result. The authors of [26] reviewed prior constant-round approaches and proposed a modular construction from three constant-round sub-protocols: private feature selection, secure comparison, and oblivious path evaluation.

7. Conclusion and Future Work

In this paper, we proposed a privacy-preserving anomaly detection platform for industrial control systems. It depends on two main components: the detection model and the MPC protocol. We use GBDT and CART as anomaly detection models, which are able to detect anomalies with high accuracy. As information privacy is protected by laws and regulations in many countries, we adopt an MPC protocol that evaluates network packets from the ICS with a decision tree while the sensitive data remain hidden. The experimental results indicate that the proposed platform can detect anomalies at the packet level in real time with high accuracy.

Our platform can be developed in several ways in the future. Firstly, we plan to evaluate the performance of our platform in a simulated environment that resembles real environment. In addition, to make detection model more practical, it is necessary to use real data of ICS as training set. Lastly, we will utilize a privacy-preserving machine learning approach in training stage to ensure training data privacy.

Data Availability

The experiment data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This study was supported by the National Key R&D Program of China (no. 2021YFB3101601), the National Natural Science Foundation of China (grant no. 62072401), and the Open Project Program of Key Laboratory of Blockchain and Cyberspace Governance of Zhejiang Province. This project was also supported by Input Output (iohk.io).