Abstract

Industrial control systems (ICS) underpin many key industries, and attacks on them can cause heavy losses. Traditional passive cybersecurity defenses have difficulty dealing with increasingly complex threats, and the knowledge graph offers a new way to analyze and process data in cybersecurity analysis. We propose a novel overall framework for data-driven industrial control network security defense, which integrates fragmented multisource threat data with an industrial network layout through a cybersecurity knowledge graph. To better correlate data when constructing the knowledge graph, we propose a distant supervised relation extraction model, ResPCNN-ATT. It is based on a deep residual convolutional neural network and an attention mechanism; it reduces the influence of noisy data in distant supervision and uses deep residual learning to better extract deep semantic features from sentences. We empirically demonstrate the performance of the proposed method in the general cybersecurity domain on the dataset CSER, where the proposed model achieves higher accuracy than the other models compared. The dataset ICSER is then used to construct a cybersecurity knowledge graph (CSKG) based on an analysis of a specific industrial control scenario, and the knowledge graph is visualized to support further security analysis of the industrial control system.

1. Introduction

Industrial control systems (ICS) are used in key industries such as oil and gas production, electricity, chemical processing, transportation, and manufacturing. As these systems have been connected to the Internet, they have faced growing security problems and cyberattacks in recent years. Stuxnet [1], for example, infected and manipulated programmable logic controllers (PLCs) and caused serious physical damage to equipment, which led to system failure. In 2016, the power system of Ukraine was attacked by a variant of the BlackEnergy malicious code [2], resulting in a large-scale power outage that affected 225,000 citizens. Industrial control networks underlie a great deal of important infrastructure; a cyberattack on them can cause huge losses and endanger the economy, public safety, and human life [3]. With the support of 5G technology, the industrial Internet will be integrated with the development of 5G [4], which promotes industrial development while introducing more security risks, so it is necessary to further strengthen industrial network security.

Data-driven prediction and analysis of cybersecurity incidents is a hot topic in current cybersecurity research. By mining correlations among industrial control network data, the asset equipment information of an industrial control system can be associated with the corresponding vulnerabilities, so that potential internal and external threat relationships can be identified at fine granularity and an asset threat graph can be constructed for a specific industrial control network structure. Visualization makes the threat situation more explicit in the security analysis of ICS and provides accurate support for decision-making in industrial control network security protection. Currently, numerous open source threat intelligence sources periodically update threat feeds that are fed into various analytical solutions. Security news, security forums, and vulnerability information are important data sources for cyberthreat intelligence. However, these data are fragmented, and it is difficult to correlate such multisource data.

A cybersecurity knowledge graph (CSKG) is a powerful tool for data-driven threat intelligence computing. Through a CSKG, researchers can intuitively see cybersecurity entities and the relations between them, such as the utilization relation between malware and vulnerabilities, the employment relation between attackers and organizations, and the ownership relation between software and vulnerabilities. Relation extraction is a very important task in the construction of a CSKG from unstructured data.

In relation extraction, the lack of labeled data for training is a challenge when constructing a network security knowledge graph. A common technique for coping with this difficulty is distant supervision in natural language processing. Distant supervision strategy is an effective method of automatically labeling training data. However, the assumption in the distant supervision method is too strong, leading to the wrong label problem.

In this paper, we first propose a novel overall framework for data-driven industrial control network security defense. To better mine entity relations in cybersecurity data, we propose a novel cybersecurity relation extraction model, ResPCNN-ATT, which combines Residual Learning, Piecewise Convolutional Neural Networks (PCNN), and multi-instance ATTention. The main contributions of the article are as follows:
(i) A novel data-driven industrial network security defense framework is proposed, which structures fragmented multisource data and integrates it with the industrial network layout.
(ii) A distant supervised cybersecurity relation extraction model based on ResPCNN-ATT is proposed to reduce the impact of noisy data in open source threat intelligence data sources.
(iii) ResPCNN-ATT first uses the pretrained word vector and the position vector between cybersecurity entity pairs as the model input and then uses PCNN to extract the semantic features. Deep residual learning is used to solve the problem of gradient disappearance caused by noisy data. A multi-instance attention mechanism is used to calculate the correlation between each instance and the corresponding relation to reduce the impact of noisy data.
(iv) The datasets CSER and ICSER are constructed. We first empirically demonstrate the performance of the proposed method in the general cybersecurity domain on the dataset CSER. We then analyze the asset information and network layout of the Electric Power and Intelligent Control Testbed (EPIC) and use the dataset ICSER to construct a cybersecurity knowledge graph for EPIC, visualizing the knowledge graph to support further security analysis of the industrial control system.

The rest of the paper is organized as follows. We describe related works in Section 2 and propose the overall framework in Section 3. The structure definition of CSKG is analyzed in Section 4. The cybersecurity relation extraction model and details are shown in Section 5, and performance evaluation of the model is discussed in Section 6. In Section 7, we construct and visualize a cybersecurity knowledge graph based on a specific industrial control scenario. Section 8 draws conclusions.

2. Related Works

Industrial control systems (ICS) consist of integrated hardware and software components for monitoring and controlling various industrial processes, often deployed in critical infrastructure such as water treatment plants, power grids, and gas pipelines [5]. In recent years, more and more ICS components have been connected to the Internet, exposing more and more security vulnerabilities that may be exploited by attackers [6]. Vulnerabilities are an important internal cause of network security risks. They exist at all levels and links of the information network; once exploited by malicious actors, they affect the normal operation of the system and its services [7]. Given the increasing number of attack events, the serious consequences of attacks, and the many threats in the complex industrial network environment [8, 9], it is crucial to study industrial network security. Traditional passive cybersecurity defense measures have difficulty dealing with increasingly complex threats; cybersecurity analysis capability based on vulnerabilities, threat intelligence, and other sources must be strengthened to enhance the active defense capability of industrial network security.

Structuring and organizing data can improve the efficiency and accuracy of cybersecurity analysis. Sadighian et al. [10] proposed ONTIDS, an ontology-based alarm association framework that uses context information. By defining the ontology structure, security alarms are represented and stored, and the association between alarm information is regularized; on this basis, rules are set to filter alarms, which reduces the false alarm rate and facilitates network security analysis. To further achieve cybersecurity information correlation and semantic analysis, many studies are devoted to improving the interpretation, feature correlation, and data processing of alarm logs, reducing the false alarm rate, and enhancing cybersecurity analysis capability [11-13].

Data-driven cybersecurity event prediction and analysis are hot topics in current cybersecurity research [14]. Shu et al. introduced a new methodology that models threat discovery as a graph computation problem for threat intelligence [15]. Yu et al. proposed a relation extraction method for the construction of a knowledge graph in the food field [16]. As a semantic knowledge base, a knowledge graph is a powerful tool for managing large-scale knowledge consisting of entities and the relations between them. Using a knowledge graph to analyze and process data provides a new approach to cybersecurity analysis: it integrates fragmented open source data, identifies their correlations, associates asset equipment in ICS with the corresponding vulnerability information, uncovers potential internal and external threat relations, and supports more accurate analysis of industrial control network security. It is therefore crucial to mine the associations among data resources efficiently and accurately.

Natural language processing technology [17-19] tends to consider only the domain name and IP address when analyzing the relation between malicious entities, both of which have very simple relation definitions. Pingle et al. proposed the RelExt [20] system, which strives to improve various cyberthreat representation schemes, especially cybersecurity knowledge graphs (CSKG), by predicting the relations between cybersecurity entities identified by a cybersecurity named entity recognizer. VIEM [21] analyzed a large number of inconsistencies by extracting software names and software versions from public security vulnerability reports, where the extraction of relations is more complicated.

Relation extraction (RE) is one of the most important topics in NLP. Many relation extraction methods have been proposed [22-24], such as bootstrapping, unsupervised relation discovery, and supervised classification. Most existing supervised RE methods require a large amount of labeled relation-specific training data, which is very time-consuming and labor-intensive to produce. Distant supervision was proposed to generate training data automatically. Under the framework of distant supervised learning, some recent work [25-28] attempts to use deep neural networks for relation prediction. Although distant supervision is an effective strategy for automatically labeling training data, it always suffers from the wrong label problem.

3. Overall Framework

Numerous open source threat intelligence sources periodically update threat feeds that are fed into various analytical solutions; structuring these data and applying them to specific scenarios is important for cybersecurity analysis. As shown in Figure 1, we propose a data-driven industrial control network security analysis framework based on a cybersecurity knowledge graph. We combine threat intelligence, such as third-party attack reports and vulnerability libraries, with asset network layouts, so that the internal network layout and the threat information corresponding to assets in the network are integrated with external threat intelligence. A knowledge graph extends the problem of cybersecurity analysis to the study of graph structure; graph-based analysis is conducive to the development of effective system protection, detection, and response mechanisms.

We first analyze ICS scenarios to identify asset equipment and the communication layout. On this basis, we mine external vulnerability information from vulnerability libraries such as Cybersecurity and Infrastructure Security Agency (CISA) (https://www.us-cert.gov/ics), National Vulnerability Database (NVD) (https://nvd.nist.gov/), Common Weakness Enumeration (CWE) (https://cwe.mitre.org/), and Common Vulnerabilities and Exposures (CVE) (http://cve.mitre.org/). We collect data through focused crawling and, after processing, obtain the key corpus for constructing a knowledge graph. We then use cybersecurity entity identification and relation extraction to form a cybersecurity knowledge graph (CSKG), which offers structured analysis data for specific cybersecurity scenarios. Based on the constructed CSKG, visualization technology can show the connection between assets and threats clearly, and it becomes easier to query entities, relations, and paths. Building on the knowledge graph, knowledge reasoning can further be used to forecast correlations between threats and assets and to analyze industrial control network security more comprehensively.

We have done a lot of research on the key technologies of the knowledge graph. Information extraction, as a key technology of CSKG, is of great significance in the entire architecture. Cybersecurity entities have the characteristics of mixed Chinese and English, confusing classification, and unclear features, and the existing related datasets are also very few, leading to difficulties in cybersecurity entity relation extraction.

For the lack of related datasets, we construct dataset CSER for general cybersecurity relation extraction and dataset ICSER for industrial control network relation extraction. First, the cybersecurity entity recognition model based on FT-CNN-BiLSTM-CRF proposed by Qin et al. [29] is used to extract cybersecurity entity pairs. This method uses artificial feature templates to extract local context features and further uses a neural network to automatically extract character features and global text features. Cybersecurity entity pairs were used to manually annotate some of the relation extraction corpora and match entity pairs with text data from vulnerability databases to form final datasets. Finally, the cybersecurity relation extraction dataset CSER and industrial control network relation extraction dataset ICSER are constructed.
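The matching step described above can be illustrated with a minimal sketch of distant supervision labeling. This is not the authors' pipeline: the seed fact, the relation label `attack`, the example sentences, and the function name are all hypothetical, and entity matching is reduced to a plain substring test.

```python
# Hypothetical seed facts: (head entity, tail entity) -> relation label.
SEED_FACTS = {
    ("Stuxnet", "PLC"): "attack",
}

def label_by_distant_supervision(sentences, seed_facts):
    """Label every sentence that mentions both entities of a seed fact."""
    bags = {}
    for sentence in sentences:
        for (head, tail), relation in seed_facts.items():
            # Distant supervision assumption: co-occurrence implies the relation.
            if head in sentence and tail in sentence:
                bags.setdefault((head, tail, relation), []).append(sentence)
    return bags

sentences = [
    "Stuxnet manipulated the PLC and damaged equipment.",
    "Researchers compared Stuxnet with other PLC malware samples.",  # noisy match
    "BlackEnergy caused a power outage.",
]
bags = label_by_distant_supervision(sentences, SEED_FACTS)
```

The second sentence mentions both entities without expressing the relation, yet it receives the same label. This is the wrong label problem that the model in Section 5 is designed to tolerate.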

4. CSKG Structure Definition

4.1. Scenario Analysis

In this paper, we take Electric Power and Intelligent Control Testbed (EPIC) from iTrust Labs (https://itrust.sutd.edu.sg/itrust-labs-home/itrust-labs_epic/) as a specific industrial control network scenario. We analyze the network layout and list the key asset equipment and resources in EPIC.

EPIC is a power testbed that mimics a small real-world smart grid system. It includes four stages, namely, generation, transmission, microgrid, and smart home, and each stage is controlled by its own PLC/controller. There are communication channels between the SCADA, distributed control system (DCS), and energy management system (EMS) and each PLC/controller. Attackers can exploit vulnerabilities to enter the communication network, maliciously manipulate the control flow, and launch a DDoS attack on the PLC control flow, after which the system cannot work normally. Attackers can also use the communication channel to enter the SCADA workstation and operate on the SMA portal to launch more attacks.

According to [30], the communication layout of EPIC is shown in Figure 2. It is composed of a SCADA workstation, a historian, programmable logic controllers (PLCs), intelligent electronic devices (IEDs), access points (APs), and switches (SWs). Redundancy in the ring network is achieved using High-availability Seamless Redundancy (HSR) and Media Redundancy Protocol (MRP).

EPIC uses the IEC 61850 standard as the communication protocol for automation systems. There are two main protocols: Manufacturing Message Specification (MMS) and Generic Object Oriented Substation Event (GOOSE). The standard allows data communication between the IEDs, PLCs, and SCADA workstation. The PLCs use MMS to communicate with the SCADA workstation and the IEDs and communicate through GOOSE in the four stages. The fieldbus communication between the physical process and the PLC, Master PLC, and SCADA of each stage is achieved through optional wired and wireless channels.

The key asset resources in EPIC [31] mainly include the following: the SCADA system, which uses PcVue in EPIC and runs on a personal computer with the Windows operating system; the PLCs, which are WAGO PFC200 series controllers that perform logic control in EPIC, are located on the control and network panel, work based on firmware and control logic programs, and in a few cases use Modbus TCP/IP communication; Codesys (Codesys v3), which is the programming standard of the PLCs; the IEDs, which are SIPROTEC relays from Siemens used for protection and control, are located in the control center, use the IEC 61850 standard to communicate with the rest of the system, and maintain the entire process through firmware and control logic; the VSDs, for which SEW Eurodrive units and the corresponding motors are used, located in the motor/generator room; and the network switches and access points, which are HIRSCHMANN products located in the network control panel.

4.2. Ontology Structure

EPIC-related vulnerabilities are mined to form a knowledge graph that corresponds to the network layout and asset information of EPIC. For convenience, the study mainly considers the assets involved in the communication layout of EPIC. We use assets as keywords to collect strongly correlated information from vulnerability databases and form a relation extraction corpus together with common vulnerabilities in ICS. The communication layout in EPIC is mapped into multiple groups of bidirectional communication relations between nodes, which are represented by triples. The connection between the internal network layout and external threat information is established by matching nodes with specific asset information, thus forming the final industrial control network security knowledge graph. The ontology structure we define in this paper is shown in Figure 3.

We define 9 relations including model, _have_, version_, AKA, version, _by_, CVSS_score, module, help_out, and conn and additionally define two relations, comm and asset_info, to represent the connection relation in the EPIC communication network and the asset information. There are 11 relations in total. We use <head, tail, relation> to denote the head entity, the tail entity, and the relation between them. The information of the network layout is mapped into triples <asset1, asset2, comm>, such as <MIED1, MIED2, comm>. Furthermore, <asset, Product, asset_info> combines the internal network layout with external threat intelligence by connecting asset nodes with the product information they use. Through analysis of the vulnerability databases, each vulnerability number is associated with its CVSS score, solution, attack vector, and other relevant vulnerability numbers, making vulnerability analysis more multidimensional.
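As an illustration of the triple representation, the following sketch maps a communication link to `comm` triples and attaches product information through `asset_info`. It is a sketch under assumptions rather than the paper's storage format: the node names MIED1 and MIED2 come from the text, while `ExampleProduct`, the function names, and the choice to store one bidirectional link as two directed triples are hypothetical.

```python
from collections import namedtuple

# Field order follows the <head, tail, relation> convention in the text.
Triple = namedtuple("Triple", ["head", "tail", "relation"])

def comm_triples(links):
    """Map each bidirectional link to two directed comm triples (an assumption)."""
    triples = []
    for a, b in links:
        triples.append(Triple(a, b, "comm"))
        triples.append(Triple(b, a, "comm"))
    return triples

# "ExampleProduct" is a placeholder, not a real EPIC product name.
layout = [("MIED1", "MIED2")]
graph = comm_triples(layout)
graph.append(Triple("MIED1", "ExampleProduct", "asset_info"))

def tails(graph, head, relation):
    """Return all tail entities linked from head by the given relation."""
    return [t.tail for t in graph if t.head == head and t.relation == relation]
```

Calling `tails(graph, "MIED1", "asset_info")` returns the product node, which is the point where external vulnerability triples would attach to the internal layout.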

5. The Proposed Model

In this section, we describe the architecture of the proposed cybersecurity entity relation extraction model and then introduce each component of the model in detail.

Under the framework of distant supervised learning, the problem of insufficient label data in deep learning can be solved, but at the same time, it also brings some problems, such as the low-quality label data and the wrong label data. This would have a great impact on subsequent tasks of entity relation extraction. In view of the above problems, we propose a distant supervised relation extraction model ResPCNN-ATT based on the deep residual neural network and attention mechanism. The framework is shown in Figure 4. The model is mainly composed of a vector representation layer, a deep residual convolutional network layer, and a multi-instance attention layer.

The model first uses the pretrained word vector and the position vector between entity pairs as input, which can highlight the role of the two entities, and then uses the piecewise convolutional neural networks to extract semantic features. At the same time, deep residual learning is introduced to solve the problem of gradient disappearance caused by noise data, so as to extract more effective semantic features. Finally, in order to better capture the more important semantic features in sentences, the multi-instance attention mechanism is used to calculate the correlation between instances and corresponding relation, so as to reduce the impact of noise data and improve the performance of relation extraction.

5.1. Vector Representation

The vector representation layer in the model mainly includes word embedding and position embedding.

5.1.1. Word Embedding

Before training the relation extraction model, the text data needs to be vectorized so that the model can read the data. Compared with traditional one-hot coding, word vector mapping can represent more semantic and syntactic information. Word vector mapping maps each word in the text to a $d_w$-dimensional real-valued vector, which is a distributed representation of the word. When training a neural network model, the most common method is to randomly initialize all parameters and then use an optimization algorithm to optimize them. Research shows that when a neural network is initialized with pretrained word vectors, the parameters can converge to a better local minimum.

For a given sentence $x = \{w_1, w_2, \ldots, w_n\}$ consisting of $n$ words, word2vec is used to map each word to a low-dimensional real-valued vector space. This yields a vector representation of each word in the sentence and forms a word vector query matrix $E$. Each input training sequence can then be mapped through the word vector query matrix $E$ to obtain the corresponding real-valued vectors.

5.1.2. Position Embedding

In the relation extraction task, we focus on finding the relation of entity pairs. Words close to the entities, such as the verbs "attack" and "use", are often better able to highlight the relation between the two entities. Therefore, to make full use of the information in the sentence, the position of each word relative to the two entities is an important feature in the relation extraction task. This paper uses the position embedding (PE) representation proposed by Zeng et al.: the relative distances from the current word to entity $e_1$ and to entity $e_2$ are each converted into a vector through an embedding and concatenated with the word vector. If the dimension of the word vector is $d_w$ and the dimension of the position vector is $d_p$, then the dimension of the sentence vector at each word position is $d = d_w + 2 d_p$.

For example, the vectorized representation of “Alies discover Chrome has XSS vulnerabilities” is shown in Figure 5, where “Chrome” and “XSS” correspond to entities $e_1$ and $e_2$, respectively. The distance from “Alies” to “Chrome” is 2, the distance from “Alies” to “XSS” is 4, the distance from “vulnerabilities” to “Chrome” is -3, and the distance from “vulnerabilities” to “XSS” is -1.
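The distances in this example can be reproduced with a short sketch. The sign convention (entity index minus word index) is inferred from the numbers quoted above, the function names are hypothetical, and the dimensions passed to `sentence_vector_dim` are illustrative values, not the paper's settings.

```python
def relative_positions(tokens, entity1, entity2):
    """Return (distance to entity1, distance to entity2) for every token."""
    i1, i2 = tokens.index(entity1), tokens.index(entity2)
    # Convention inferred from the example: entity index minus word index.
    return [(i1 - i, i2 - i) for i in range(len(tokens))]

def sentence_vector_dim(d_w, d_p):
    """One word vector plus two position vectors per token."""
    return d_w + 2 * d_p

tokens = "Alies discover Chrome has XSS vulnerabilities".split()
positions = relative_positions(tokens, "Chrome", "XSS")

print(positions[0])   # distances for "Alies"
print(positions[-1])  # distances for "vulnerabilities"
print(sentence_vector_dim(50, 5))  # illustrative dimensions only
```

In a full model, each integer distance would index a trainable position embedding table before being concatenated with the word vector.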

5.2. Deep Residual Neural Network

In cybersecurity relation extraction tasks, the main challenge is that the length of the input sentence is variable and not fixed, and important feature information may appear in any area of the sentence. Therefore, in order to be able to use all local features and predict relations globally, this paper uses a piecewise convolutional neural network PCNN model to extract semantic features in sentences.

In this paper, a residual convolution block is designed for residual learning. Each residual convolution block is a sequence of two convolution layers. After each convolution layer, the ReLU activation function is applied as a nonlinear mapping, and features are then extracted using local max pooling. The kernel size of all convolution operations in the residual convolution block is $h$, and border padding guarantees that the newly generated features have the same size as the original ones. The convolution kernels of the two layers are $w_1$ and $w_2$. The first layer of the residual convolution block is

$$\tilde{c}_i = f\left(w_1 \cdot c_{i:i+h-1} + b_1\right).$$

The second layer is

$$c'_i = f\left(w_2 \cdot \tilde{c}_{i:i+h-1} + b_2\right),$$

where $f$ is the ReLU function and $b_1, b_2$ are bias vectors. In this paper, we optimize the residual learning to get the output vector of the residual convolution block, $\hat{c} = c + c'$ [32, 33].
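A residual convolution block of this kind can be sketched in plain Python on a single-channel feature sequence. This is a simplified illustration, not the authors' implementation: it uses scalar features, omits the local pooling step, and all function names are hypothetical.

```python
def relu(x):
    return x if x > 0.0 else 0.0

def conv1d_same(seq, kernel, bias):
    """1D convolution with zero padding so the output keeps the input length, then ReLU."""
    h = len(kernel)
    pad = h // 2
    padded = [0.0] * pad + list(seq) + [0.0] * (h - 1 - pad)
    return [
        relu(sum(k * padded[i + j] for j, k in enumerate(kernel)) + bias)
        for i in range(len(seq))
    ]

def residual_block(seq, kernel1, bias1, kernel2, bias2):
    """Two stacked convolution layers plus a shortcut from input to output."""
    hidden = conv1d_same(seq, kernel1, bias1)
    out = conv1d_same(hidden, kernel2, bias2)
    # Shortcut connection: add the block input to the output of the second layer.
    return [x + y for x, y in zip(seq, out)]

features = [1.0, 2.0, 3.0]
unchanged = residual_block(features, [0.0, 0.0, 0.0], 0.0, [0.0, 0.0, 0.0], 0.0)
```

With all-zero kernels the block returns its input unchanged, which shows why stacking such blocks does not degrade the signal: each block only has to learn a correction to the identity mapping.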

After the semantic feature is acquired by the convolution layer, the most representative local feature is further extracted by the pooling layer. In order to capture characteristic information of different sentence structures, a piecewise max pooling process is used.
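Piecewise max pooling can be sketched as follows for the output of one convolution filter. The segment boundaries used here (each entity token closes its segment) are one common convention and are an assumption, as are the function name and the example values.

```python
def piecewise_max_pool(features, pos1, pos2):
    """Max-pool one filter's feature sequence over three segments split at the entity positions."""
    first, second = sorted((pos1, pos2))
    segments = [
        features[: first + 1],            # start of sentence up to the first entity
        features[first + 1 : second + 1],  # between the entities, up to the second entity
        features[second + 1 :],            # after the second entity
    ]
    # An empty segment (entity at the sentence boundary) pools to 0.0 in this sketch.
    return [max(seg) if seg else 0.0 for seg in segments]

# Six feature values for a six-token sentence with entities at indices 2 and 4.
pooled = piecewise_max_pool([0.1, 0.7, 0.3, 0.9, 0.2, 0.5], 2, 4)
```

Compared with a single max over the whole sentence, the three pooled values keep coarse information about where a strong feature occurs relative to the two entities.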

5.3. Multi-Instance Attention

In the relational extraction model, sentence-level attention is built on multiple instances, dynamically reducing the weight of noisy instances, and making full use of semantic information in these sentences to obtain final sentence vector representation.

For the instance set $S = \{x_1, x_2, \ldots, x_n\}$ describing the same entity pair $(e_1, e_2)$, $x_i$ is the instance vector output by the convolution layer and $n$ is the number of instances contained in the set $S$. This paper calculates the correlation degree between the instance vector $x_i$ and the relation $r$. To reduce the impact of noisy data and make full use of the semantic information contained in each instance in the set, the instance set vector $s$ depends on every instance in the set:

$$s = \sum_{i=1}^{n} \alpha_i x_i,$$

where $\alpha_i$ is the weight of the input instance vector $x_i$, which measures its correlation with the corresponding relation $r$. The weight $\alpha_i$ is calculated as follows:

$$\alpha_i = \frac{\exp(q_i)}{\sum_{k=1}^{n} \exp(q_k)},$$

where $q_i$ is a query-based function, which indicates the degree of matching between the input instance vector $x_i$ and the prediction relation $r$.

The conditional probability of the prediction relation $r$ is calculated by the softmax function:

$$p(r \mid S) = \frac{\exp(o_r)}{\sum_{k=1}^{n_r} \exp(o_k)},$$

where $n_r$ is the number of relation types. The vector $o$ is used to predict the relation between pairs of cybersecurity entities:

$$o = M s + b,$$

where $M$ is the relation matrix and $b$ represents the bias vector.
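The weighting of instances in a bag can be sketched as below. The query-based function is simplified here to a dot product between each instance vector and a relation query vector, which is an assumption made for illustration; the function names and example vectors are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_bag_vector(instances, query):
    """Weight each instance by the softmax of its match score with the relation query."""
    # Assumed scoring function: plain dot product between instance and query.
    scores = [sum(x * q for x, q in zip(inst, query)) for inst in instances]
    weights = softmax(scores)
    dim = len(instances[0])
    bag = [sum(w * inst[d] for w, inst in zip(weights, instances)) for d in range(dim)]
    return bag, weights

# Two instance vectors; the first matches the query, the second does not.
bag, weights = attention_bag_vector([[1.0, 0.0], [0.0, 1.0]], [2.0, 0.0])
```

The instance that matches the query receives the larger weight, so a wrongly labeled sentence contributes less to the bag vector than it would under uniform averaging.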

6. Performance Evaluation

In this section, we empirically demonstrate the performance of the proposed method on datasets CSER and ICSER. The commonly used Precision-Recall (P-R) curve, AUC value, and average accuracy (P@N) are used to evaluate the model. The P-R curve is drawn with the recall rate as the abscissa and the accuracy rate (precision) as the ordinate, using the precision and recall obtained at different confidence levels. The AUC value is the area under the P-R curve. Generally, the larger the AUC value is, the better the model performs. P@N is the accuracy rate calculated over the first N relation instances.
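These metrics can be computed from a list of predictions ranked by descending confidence, as in the sketch below. It assumes that each prediction is marked 1 if correct and 0 otherwise, that at least one prediction is correct, and that all positives appear in the ranked list; the trapezoidal rule for the area and the function names are choices made for this illustration.

```python
def precision_at_n(ranked_labels, n):
    """Fraction of correct predictions among the top n."""
    top = ranked_labels[:n]
    return sum(top) / len(top)

def pr_curve(ranked_labels):
    """Return (recall, precision) after each prediction in ranked order."""
    total_pos = sum(ranked_labels)
    points, hits = [], 0
    for k, label in enumerate(ranked_labels, start=1):
        hits += label
        points.append((hits / total_pos, hits / k))
    return points

def pr_auc(points):
    """Area under the P-R curve by the trapezoidal rule, starting at recall 0."""
    area = 0.0
    prev_recall, prev_precision = 0.0, points[0][1]
    for recall, precision in points:
        area += (recall - prev_recall) * (precision + prev_precision) / 2.0
        prev_recall, prev_precision = recall, precision
    return area

ranked = [1, 1, 0, 1]  # hypothetical predictions, most confident first
curve = pr_curve(ranked)
area = pr_auc(curve)
```

For this toy ranking, `precision_at_n(ranked, 2)` is 1.0 because the two most confident predictions are both correct, and the precision drops once the third, incorrect prediction is included.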

6.1. Datasets and Parameters

To verify the performance of our proposed model, we build a cybersecurity entity relation (CSER) dataset, in which 10 types of relations are labeled. The dataset CSER is crawled from the Freebuf (https://www.freebuf.com/) website and the WooYun vulnerability database and includes network text data such as technology sharing, network security, and vulnerability information.

The word vector dimension and the position vector dimension are each selected from a set of candidate values. During training, the Adam optimizer is used. The learning rate is selected from {0.01, 0.001, 0.0001}, and the batch size processed in one iteration is selected from {40, 160, 640, 1280}. To prevent the model from overfitting, the dropout method is used in the CNN. Other parameters are shown in Table 1.

6.2. Results and Analysis

The experimental comparison in this paper mainly compares two aspects of the models.

The first aspect is the encoder: different CNN-based models, mainly CNN, PCNN, and ResPCNN, are used to encode the training data and extract the semantic features of sentences.

The second aspect is how CNN/PCNN/ResPCNN uses the information in each bag. Three different methods are used to process the information in the bag, namely, AVE, ONE, and ATT. AVE assigns the same weight to all the sentences in the bag for an entity pair; that is, each of the $n$ sentences receives weight $\alpha_i = 1/n$. ONE takes the instance vector with the highest confidence; that is, the sentence with the highest score in each bag represents the entire bag. All models in this paper are trained and tested on the dataset CSER. Figures 6-8 show the P-R curves of the results for the different bag methods. AVE can introduce more sentence information, but because it weights every sentence equally, it also introduces noise from wrongly labeled data, which reduces the performance of relation extraction, so AVE has the lowest performance among the bag methods. The AUC difference between ONE and ATT on the PCNN model is 0.12%, which indicates that their relation extraction performance does not differ much. On the ResPCNN and CNN models, the relation extraction performance of ATT is slightly higher than that of ONE; ATT achieves a higher accuracy rate over the whole recall range.
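For contrast with attention-based weighting, the AVE and ONE strategies can be sketched in a few lines. The instance vectors and confidence scores below are hypothetical, and the function names are chosen for this illustration.

```python
def ave_bag_vector(instances):
    """AVE: every instance in the bag receives the same weight 1/n."""
    n = len(instances)
    dim = len(instances[0])
    return [sum(inst[d] for inst in instances) / n for d in range(dim)]

def one_bag_vector(instances, scores):
    """ONE: the single instance with the highest confidence represents the bag."""
    best = max(range(len(instances)), key=lambda i: scores[i])
    return instances[best]

bag_instances = [[1.0, 0.0], [0.0, 1.0]]
ave_vector = ave_bag_vector(bag_instances)
one_vector = one_bag_vector(bag_instances, [0.2, 0.9])
```

AVE mixes in every sentence, including wrongly labeled ones, while ONE discards all but one sentence; attention sits between the two by keeping every sentence but down-weighting poor matches.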

From Figure 9, the AUC value of the model ResPCNN-ATT is the highest value on the dataset CSER, which reaches 12.68%. The model ResPCNN-ATT proposed in this paper can better extract the deep semantic information of sentences, indicating that the introduction of the ATT method can effectively reduce the redundant data in distant supervised learning.

As can be seen from Table 2, comparing the accuracy of the first 100, 200, and 300 relation instances on the dataset CSER, the relation extraction accuracy of ResPCNN-ATT is the highest, which reaches 32.67%. However, the accuracy of the CSER dataset is lower than other datasets. This is because the sentences in the CSER dataset are mixed with Chinese and English; the more complicated the sentence structure is, the less obvious the entity relation characteristics are, and the less the corpus data is.

To further analyze the proposed relation extraction model and verify the effectiveness of introducing residual learning, comparative experiments are designed in which the depth of the ResPCNN-ATT model is increased. The number of convolutional layers is increased by adding residual convolution blocks, and the experimental comparison is performed on the CSER dataset. Figure 10 shows the P-R curves of models with different depths.

7. CSKG Construction and Visualization for ICS

The proposed model ResPCNN-ATT performs well on the dataset CSER, and further, we apply ResPCNN-ATT to the relation extraction task in the construction of a knowledge graph for EPIC.

7.1. Relation Extraction

We analyze the key assets and the communication relations between the assets in EPIC and obtain the dataset through distant supervision labeling. Because strong data correlation is required, after filtering and cleaning, 19,838 examples of industrial control network security entity relations were finally formed. 15,937 sentences were randomly selected as training data, which included 3838 entity pairs, and 4001 sentences were selected as test data, which included 876 entity pairs.

Experiments are carried out on the dataset ICSER with ResPCNN-ATT models of depth 3 and depth 5, which correspond to different numbers of convolution layers. Figure 11 shows the P-R curves at the two depths. The curves show that introducing residual learning is effective even when the model is shallow, as at depths 3 and 5.

Table 3 shows the prediction accuracy and AUC values of the test set in the first 100, 200, and 300 relation instances of the model at two depths. Based on the complex industrial control network security dataset, the model has performed well.

7.2. Visualization and Analysis

Finally, 3878 relationships are extracted and stored. Each asset entity has communication relations with other assets in the network layout. Each asset node matches at least one piece of asset equipment; through the brands, models, or components used by that equipment, the corresponding vulnerability information can be connected with the asset. Part of the relations of the asset node SCADA workstation is shown in Figure 12.
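Once the relations are stored as triples, connecting an asset to its vulnerability information amounts to a short path query. The sketch below uses a breadth-first search over hypothetical triples: the relation names come from Section 4.2, but the direction of each relation, the node `VULN-A`, and the score value are placeholders rather than extracted data.

```python
from collections import deque

def reachable(triples, start, max_hops):
    """Breadth-first search over (head, tail, relation) triples; returns hop counts per node."""
    adjacency = {}
    for head, tail, _relation in triples:
        adjacency.setdefault(head, []).append(tail)
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in adjacency.get(node, []):
            if nxt not in hops:
                hops[nxt] = hops[node] + 1
                queue.append(nxt)
    return hops

# Placeholder triples: an asset, its product, a vulnerability, and a score node.
example_triples = [
    ("PLC", "WAGO PFC200", "asset_info"),
    ("WAGO PFC200", "VULN-A", "_have_"),
    ("VULN-A", "7.5", "CVSS_score"),
]
within_two_hops = reachable(example_triples, "PLC", 2)
```

Limiting the search to two hops returns the product and its vulnerability while leaving out the score node, which illustrates how a hop limit scopes the part of the graph shown for one asset.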

The versions, components, and vulnerabilities of the WAGO PFC200 series products used by the PLCs in EPIC can be seen in Figure 13. The correlation between different vulnerabilities is defined, such as the correlation between vulnerabilities from CVE and CWE, which enables network analysis to locate the source code faster and more accurately.

As shown in Figure 14, the CVSS score can quantify the vulnerability threat level; information such as vulnerability solutions, patch links, and security recommendations is structurally related to the corresponding vulnerability, which can help to troubleshoot equipment failures and strengthen security status. The asset vulnerability corresponding to the vulnerability, such as the port number used, is associated with the exploit relationship.

The preliminary construction of the EPIC industrial control network security knowledge graph not only facilitates daily management, daily maintenance, and network security analysis but also supports the completion of downstream tasks of the knowledge graph. The knowledge expression form in the knowledge graph is simple, intuitive, flexible, and rich. Based on the existing knowledge graph structure, we can deepen the industrial control network security defense at a deeper level and make network security defense research more diversified. Further, through knowledge reasoning, we can link to hidden entities and predict new relationships. It helps find out new attack behaviors and improve the richness and accuracy of the knowledge graph. The mining of entities and relationships offers constant supplement for the existing knowledge graph and makes sense in decision-making, to enhance the active defense capability of industrial control network security.

8. Conclusions

In this paper, we propose a novel data-driven industrial network security defense framework, which structures fragmented multisource data and integrates these threat data with the industrial network structure. In order to better mine entity relations in cybersecurity data, we introduce a novel distant supervised cybersecurity relation extraction model ResPCNN-ATT. The experimental results show that the model proposed in this paper has the highest accuracy of relation extraction compared with other model methods on cybersecurity datasets. Further, based on specific industrial control network security scenarios, we constructed an ICS security knowledge graph by applying ResPCNN-ATT, which strengthens the cybersecurity analysis capabilities. In the future, we intend to introduce reinforcement learning to the model to further reduce the impact of noise and study the downstream application tasks of the industrial control network security knowledge graph to strengthen the industrial control network security defense capabilities.

Data Availability

All the data used to support this study were supplied by Guowei Shen under license and so cannot be made freely available. Requests for access to these data should be made to Guowei Shen.

Conflicts of Interest

The authors declare that there is no conflict of interest regarding the publication of this paper.

Acknowledgments

This work is supported by the National Natural Science Foundation of China under Grant 61802081 and Big Data Application on Improving Government Governance Capabilities National Engineering Laboratory Open Fund Project (No.W-2018023).