Abstract

Computer network security has become increasingly controversial among many businesses as a result of the rise in cyber threats. Artificial neural network (ANN) is mature research in this field, whereas the traditional algorithm is slower, in feedback error, and has the disadvantage of easy convergence to local extreme value. To guide against these threats, in this paper, ANN-back propagation (BP) algorithm is used to establish the relationship between the level of cyber security situational awareness (CSA) and the perceptual parameters and quantitatively evaluate the situational awareness. This study established the ANN-back propagation (BP) to make the relationship between the level of cyber security situational awareness (CSA) and the perceptual parameters, which evaluates situational awareness. The ANN-BP with variable step size learning strategy and simulated annealing method is used for optimization to build a virtual network environment. The proposed model offers better precision, improved sensitivity, and higher (0.987%) accuracy.

1. Introduction

Computing resources, cyber security, software programs, and data are protected from attack using a combination of policies, techniques, technologies, and procedures [1]. Cyber security is comprised of several rules, technologies, and processes that work in concert [2]. Application, network, host, and data-level cyber security mechanisms exist. Several tools, such as firewalls, antivirus software, intrusion detection systems (IDSs), and intrusion protection systems, are used to prevent and detect security breaches and assaults (IPSs) [3, 4]. However, many opponents still possess an advantage because all they need to do is to take advantage of a vulnerability in the systems that need to be protected is find one. A wider attack surface is created as a result of the growing number of internet-connected systems. As the attackers become smarter, they build zero-day exploits and malware that evade security safeguards, allowing them to remain undetected for extended periods [5]. They are attacks which have never been seen before but are generally variations on a previously known one. Exploitation strategies are being commoditized, which exacerbates the situation by allowing for quick distribution without organizations that abuse their permitted access [6]. Compromise indications are present throughout an attack’s lifecycle; there may even be substantial indicators of an oncoming attack. Finding these markers, which may be dispersed across the environment, is a problem. Figure 1 depicts the cyber security situation awareness cycle.

Applications, servers, smart devices, and other cyber-enabled resources create massive amounts of data from machine-to-machine and human-to-machine interactions. Data generated by cyber security technologies, such as the security information event management (SIEM) system, might overwhelm security analysts [7]. The military program’s security posture can be improved by utilizing data science in cyber security. Data analytic-based cyber protection technologies are starting to appear. System signatures are being replaced by network intrusion detection systems (NIDS), which analyze packet transmission [8]. Figure 2 depicts the cyber security situation awareness attributes.

A robust intrusion detection system (IDS) is required to protect against all of these threats. ANN, decision tree (DT), support vector machine (SVM), naive Bayes, and other machine learning (ML) algorithms have been used to detect threats [9]. With the rise of cyber threats, computer network security has become a major issue for many businesses. A robust intrusion detection system (IDS) is required to protect against all of these threats. ANN, decision tree (DT), support vector machine (SVM), naive Bayes, and other machine learning (ML) algorithms have been used to detect threats. Considering the various threats, it is necessary to employ a combination of strategies to effectively improve intrusion detection in computer networks. This is because invaders are getting a foothold in the cyber world, and the consequences for businesses are difficult to quantify. The ANN is the most advanced research in this field; however, the traditional BP approach is slow in terms of feedback error and has the drawback of being quickly convergent to local extreme values.

This study develops a better model for detecting threats on a computer network. To accurately forecast threats, a comparison of ANN-BP and DT is studied. With this study, a network administrator can be rest assured based on these predictions.

1.1. Related Work

A machine learning-based intrusion detection system’s effectiveness and efficiency in detecting threats to organizations and other network users are interesting and valuable. Because cyber threats fall into a variety of categories, certain machine learning algorithms may not be able to detect them on a computer network. The data fusion model was initially introduced by cyber situation awareness (CSA), which then improved and adjusted the original model, established the CSA model, and clarified the CSA’s purpose [10]. The CSA model is well developed and mature as of 2016, related research has not advanced significantly since then, the CSA’s primary duty is evaluation, and it conducts extensive research in this area. The objective of the mathematical model (MM) method, which comprehensively takes into account various situational factors, is to assess the network situation from various angles [11]. Situation assessment was the first application of the MM approach. It integrates several systems and concentrates on various elements that influence the network state. The situation factor reflects the situation’s status from several angles, but MM only addresses multiattribute fusion and excludes the fusion of multisource data, and the model it employs is fixed [12]. The knowledge reasoning (KR) method emerged if just specific assessment findings could be reached, neglecting the uncertainty aspect to overcome the two main difficulties of MM. On the one hand, KR processes ambiguous information using fuzzy sets, probability theory, evidence theory, etc. [13]. On the other hand, the reasoning is used to obtain knowledge that has multiple sources and multiple attributes. A popular area of study is the KR approach as represented by Bayesian networks [14]. Numerous documents have surfaced, and further research is important. The largest problem facing KR is how to get inference rules and prior probabilities, particularly for a novel research area like CSA. To address this challenge, the pattern recognition (PR) method was developed. With its significant capacity for learning, the PR has been educated to use historical data or practice samples to mine the situation mode division’s knowledge methodically and scientifically since 2005 [14, 15]. The majority of the already used evaluation techniques will essentially introduce data mining, which also illustrates the tendency of a thorough application of numerous techniques the examination of the idea that “problems appear to solve issues” is embodied in the creation of evaluation methodologies. Research on the representation of knowledge is uncommon and only recently has it begun [4]. The crucial position of CSA has been simultaneously formed by network management requirements and broad application prospects, and associated research has continued to advance. The preliminary examination of existing CSA revealed the following characteristics of relevant research: (i) other aspects of network security situational awareness, such as traffic, faults, topology, and survivability, are infrequently studied; (ii) system architecture is the subject of study and is at a relatively advanced stage. It accepts data from joint directors of laboratories (JDL), despite variations. (iii) The representation of the network system is based on a hierarchical structure; the representation of uncertain information is primarily simple grading and turned into discrete data; the design principles of the fusion model and the Endsley situational awareness model. (iv) Weight analysis is the basic foundation for the method’s evaluation [4, 16]. The mathematical data fusion method has been attempted to be incorporated into CSA in some studies. Other issues still need to be resolved in the CSA research, which has attracted attention [12]. Figure 2 displays the situational awareness indicator system in cyberspace. Figure 3 contains several variables that have an impact on the situational security awareness system.

1.2. Study Characteristics and Problems

The existing research has focused mostly on the network management technologies already in use, failed to take into account the security situation in each unit, and is unable to provide a thorough analysis and presentation of the whole situation [17], owing to the absence of thorough and organized research on the entire CSA system. Second, the knowledge representation is inadequate. The term “situation” is too limited and does not adequately capture the general and macroscopic properties of the scenario. Although it is straightforward and obvious to use a hierarchical structure to depict a network system, it cannot reveal the intricate relationships between network pieces. Relationships prevent the potential situational information in multisource and multiattribute data from being mined [18]. Concerning data representation, research needs to be done on how to choose and expand the feature measurement employed for situation evaluation and create a fair and comprehensive indicator system. Third, there is no single standard for scenario assessment. This is partially attributable to the fact that a scenario is an abstract idea [19]. What kind of circumstance qualifies as good, and how good is it? Both grading and scoring are unreliable and lack a scientific foundation. However, the fundamental causes are neither persuasive nor evident. There cannot be agreement on the scenario and the situation assessment due to the formal definition of the situation and the absence of indications and techniques to gauge the benefits and drawbacks of the evaluation results. However, there are several distinct evaluation techniques. Nearly all theoretical data fusion approaches have been applied to the scenario appraisal stage, and new ways are constantly being developed. This entails extensive, recurring research, and sometimes even the application of specific mathematical techniques to deepen the theory. Only the standards for evaluation are consistent. Accurately, the goals and directions of the study on evaluation methods are well defined [19, 20]. Fourth, the available research is comparatively unconnected and only addresses the situation itself. There is a lack of integration between levels on the vertical (level 1/3) and horizontal (level 4/5/6) axes. It is not incorporated into the system for fusing data. The research framework in Figure 3 separates the research of each layer from one another and makes it difficult to directly apply level 1 because the current three main research areas are level 2 integration, the key technology of integration, without involving communication, interaction, system management, etc. The fusion’s output is utilized as the starting point for measuring situational awareness features; it does not meet the requirements for fusion at levels 3 and higher, which is better for threat analysis and decision-making [3, 20]. It has attracted widespread interest from the academic community and is currently a hot topic in the data fusion sector. However, the majority of studies only focus on one component of CSA and are still in the theoretical exploratory stage. The focus of present and future research efforts is comprehensive and in-depth theoretical research and the deployment of actual applications, and the most pressing requirement is to build a unified assessment system [3, 7].

1.3. System Evaluation Index

The ultimate goal of situational awareness in cyberspace is to protect the security of network information and to provide an integrated situational awareness system in cyberspace. The regulatory authorities can understand the operation status of the network and lay a strategic foundation for the development of cyberspace [21]. The potential sense of network space states is shown as and and , where is the quantitative evaluation result, represents the application of the fusion algorithm, represents the evaluation index set, denotes the specific analysis index, represents the performance of various fusion algorithms and represents the integration process of the evaluation system. The mode of situational awareness in cyberspace and description of the nature of situational awareness in cyberspace from four aspects is started. It covers the information entities in the network and fully reflects the status quo of situational awareness in cyberspace. These four aspects are, respectively, weakness, victimization, risk, and stability. The number of data parameters (weakness, victimization, risk, and stability indexes) of the definition reflects the network space situational awareness [9]. According to the four dimensions of situational awareness in cyberspace, this study established an index system of perceived security status. The results show that in the current network state, the system can calculate the trend of network quantitative analysis, to realize real-time network monitoring and control [21].

In this study, the occurrence of network space is divided into four levels, that is, risk, weakness, stability, and victimization. The first level index is described in two levels. According to the national first aid emergency plan and combined with the characteristics of network hazard elements, cyberspace is divided into five levels. The interval [0, 1] is used to describe the weight of indicators at several levels, as listed in Table 1. The grade matrix of the four first-level indicators and cyberspace situation is listed in Table 2, and the security level of first-level indicators is listed in Table 3. This study demonstrates the superiority of a multiclassifier ensemble over a single classifier.

2. Propose Optimization Approach

2.1. ANN-BP Algorithm

The BP neuron (input ) node through the weighted value () input parameter (), neural element threshold , excitation function , and the output parameter is expressed.

The hidden layer of ANN-BP requires a continuous excitation function, and the excitation function is selected as a sigmoid function at this time. The ANN-BP model includes the following two steps, information transmission and reverse correction of errors [22, 23]. The ANN-BP tends to fall into local minima, and the simulated annealing algorithm can fill this shortcoming [22]. During the first step set the starting temperature , the starting state , the lowest temperature , the optimal solution , and the current target value . While on the second step, if the pretemperature is higher than , the number of iterations is the dimension of the initial solution. News is produced in each iteration, and the calculated function value () and the phase ortho error difference . If , the current state value is taken as the optimal solution of the current iteration, that is, . If is greater than or equal to 0; then, this step size adjustment increases the error, because it will be settled with probability. Select the Metropolis criterion to calculate after iteration the optimal solution and its state value are obtained. The final step, first annealing optimization, detempering process . If , then ; on the contrary, it indicates that the global optimal solution is the current optimal solution. The situation awareness assessment process of the ANN-BP is optimized, as shown in Figure 4. The ANN-BP model is optimized and used to observe the current network condition, and the network grade difference is compared to determine whether it affects the current network condition. ANN-BP was a situational awareness evaluation process and a decision tree (DT).

2.2. Situational Assessment

Ten experts are selected to evaluate the system, which ensured the accuracy of the method. Recurrent back propagation in data mining is fed forward until a fixed value is achieved. After that, the error is computed and propagated backwards.

The main difference between both of these methods is that the mapping is rapid instatic back propagation while it is nonstatic in recurrent back propagation. The condition of the system is as follows:

The dispersion of the expert opinion is as follows:

where is the score taught by the position, according to the state potential special point, the full , is the second-level evaluation index, which is constructed into the expected output of the second-level evaluation. This experiment has 100 samples, 20 of which are test sets, and 80 of which are training sets. The network (Honeywall) topology diagram is presented in Figure 5.

Figure 6 shows the first layer inputs the divine meridian element , that is, the -dimensional vector , where ; the output of neurons in the hidden layer is , , threshold , ; , , threshold , . The weight from the input layer to the hidden layer can be an matrix, i.e., . Similarly, the weight matrix from the hidden layer to the output layer is . Each element of the neural network output is

When the training sample is sent to the output layer, it is compared with the expected:

The sum of training sample errors is the total error, where is the sigma count , is the 1, 2 sigma count sigma :

where the “count” is the total number of samples.

According to equation (4), if ( is the specified minimum error number), then, the bundle formation is trained, and the threshold value and corresponding weight of the neuron are determined. Adjust the weight along the negative gradient direction, that is

where is called the learning coefficient, equation (6) is transformed into

where is the number of layers, is the neuron in the layer is the output of the neuron in the layer: , while .

3. Results and Discussion

The specific steps of the experiment include Honeypot installation and configuration, Honeywell shut down construction and configuration, installation, my SQL database login, and information collection. This experiment uses software to build virtual simulation, and the Matlab platform is employed to implement the standard ANN-BP. Table 4 shows the results of a comparison between the output of several test samples and the expected output. The input data is preprocessed by , and the data is quantified to the interval [0, 1]. The traditional method of performing hyperparameter optimization has been grid search, or parameter swap, which is simply a complete search through a manually specified subset of the hyperparameter space of the learning algorithm. Where is the datum of the generation table, is the minimum value, and is the maximum value. Figure 7 is the comparison result between the situation value evaluated by the ANN-BP and the actual situation value. The ANN-BP algorithm can evaluate the situation of situational awareness in cyberspace, which weakens the human factor and improves the objectivity and authority of the results and can deal with nonlinear questions very effectively. The experimental results show that not only do the 80 training sets based on the ANN-BP model meet the evaluation of experts and professors but also the 20 test sets are consistent with the actual situation. The ANN-BP algorithm can evaluate the situation of situational awareness in cyberspace, which weakens the human factor improves the objectivity and authority of the results and can deal with nonlinear questions very effectively. The experimental results show that the modified BP oracle network can accurately evaluate the situational awareness system in cyberspace. In this study, the ANN-BP algorithm is used to quantitatively evaluate the level of situational awareness in network space. First, the requirements of situational awareness assessment in network space are analyzed. Then, in Figure 8, performance comparison of ANN-BP and a decision tree the shortcomings of traditional ANN-BP are introduced, and the defiring optimization method is used for improvement. When these different models are employed to make major predictions, stakeholders are seeking more transparency and explanation. Results depicting the output of ML models are crucial in cybersecurity, as specialists expect substantially more information from the model than a simple binary output for their analysis.

4. Conclusion

In this study, an ANN-BP creates a robust cyber situation awareness model that enhanced cybersecurity by improving intrusion prediction. A machine learning-based intrusion detection system’s effectiveness and efficiency in detecting threats to organizations and other network users are interesting and valuable. Because cyber threats fall into a variety of categories, certain machine learning algorithms may not be able to detect them on a computer network. The relationship between the level of cyber situational awareness and the perceptual parameters quantitatively evaluates the situation of situational awareness. The variable step size learning strategy and simulated annealing method are used for optimization to build a virtual network environment, after the construction and modelling of a threat detection prediction model, which suggests a safe system than a single classifier. They are data-oriented, which makes it easier to detect patterns in the datasets using the IDS classifiers. The proposed work offers better precision (0.989) and 0.984 sensitivity, while its accuracy was useful at 98.7% than the DT classifiers. It seems a small difference but important to note that there is a difference. This system has importance for the computer network administrator. Future work can introduce multiensemble algorithms in the perception phase to improve IDS accuracy.

Data Availability

Data has been cited inside the paper and also available upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

The authors appreciate the research facilities and library access of affiliated institutes/universities. This research was funded by NSFC (nos. 61972106 and 62072131), the Key R&D Program of Guangdong Province (no. 2019B010136003), the National Key Research and Development Plan (Grant no. 2019QY1406), and Science and Technology Projects in Guangzhou (no. 202102010442).