Abstract

Most network security research studies based on signaling games assume that either the attacker or the defender is the sender of the signal and the other party is the receiver of the signal. The attack and defense process is commonly modeled and analyzed from the perspective of one-way signal transmission. Aiming at the reality of two-way signal transmission in network attack and defense confrontation, we propose a method of active defense strategy selection based on a two-way signaling game. In this paper, a two-way signaling game model is constructed to analyze the network attack and defense processes. Based on the solution of a perfect Bayesian equilibrium, a defense strategy selection algorithm is presented. The feasibility and effectiveness of the method are verified using examples from real-world applications. In addition, the mechanism of the deception signal is analyzed, and conclusions for guiding the selection of active defense strategies are provided.

1. Introduction

Network information technology is developing rapidly, and interconnected systems are on the rise [1]. However, network security incidents pose a major and perpetual problem [2]. Defense technologies represented by firewalls, intrusion detection, and antivirus software provide passive response defense based on a priori knowledge and attack characteristics, but they cannot respond to new types of complex network attacks in an effective and timely manner [3]. If the defending party can actively select a targeted defense strategy by predicting the attacker’s actions and disrupt or block the attack process, while simultaneously maximizing its own benefits, then the defense may be called an active defense [4]. The essence of cybersecurity is a battle between offense and defense. The effectiveness of the defense depends not only on its own strategic action but is also influenced and constrained by the attacker’s action [5]. The key issue is how to select the optimal active defense strategy in an information-constrained confrontation environment.

The characteristics of opposite goals, strategic dependence, and noncooperative relationships in network attack and defense are in line with the core philosophy of game theory, namely, optimal decision in an environment of conflict. Some scholars, such as the authors of Refs. [6–11], have established network security models based on game theory, analyzed the offensive and defensive confrontation process, and solved the game equilibrium to determine the defense strategy and guide defense actions. We classified and analyzed the existing research results by combining the two factors of game information and action timing and came to the following conclusions:
(1) In a static game with complete information, there are many premise assumptions and the model is easy to establish, as demonstrated in Ref. [12].
(2) In a dynamic game with complete information, given the sustained nature of the offensive and defensive confrontation process, the effect of previous actions on the subsequent game process can be studied, as shown in Ref. [13].
(3) In a static game with incomplete information, the players may use the static Bayes’ rule to infer the opponent’s private information and break through the complete information assumption, such as in Ref. [14].
(4) In a dynamic game with incomplete information, the late player observes the partial action of the early player, even without fully understanding the behavior type. However, since the behavior is type dependent, one can modify the a priori judgment of the behavior type of the early player by using the dynamic Bayes’ rule, as depicted in Ref. [15].

Since neither the offense player nor the defense player can fully understand the opponent’s information, and since the confrontation process is dynamic and persistent, the dynamic game with incomplete information is more in line with actual network attack and defense. Hence, this type of game is the focus of current network security game research.

A signaling game is a typical dynamic game with incomplete information, which provides a formal mathematical way to analyze how identity and deception are coupled in cyber-social systems [16]. It describes the strategic interplay of the game process through signal transmission [17], which is well-suited for studying the selection of active defense strategies. In Ref. [18], from the perspective of dynamic confrontation and limited information, a two-stage signaling game model is constructed to derive an optimal defense strategy. As demonstrated in Ref. [19], the signaling game model can be used to analyze the moving target defense. The defense side can alter the information asymmetry of the two sides by releasing the dynamically transformed signal and thereby expand its own benefits. In Ref. [20], the DDoS attack and defense process is modeled as a multistage signaling game, and an equilibrium solution is found. Moreover, the server port hopping defense strategy has been demonstrated to be effective. In Ref. [21], a multistage offensive and defensive signaling game model is constructed for modeling the multistage dynamic attack and defense process under incomplete information constraints. Also, the signal attenuation factor is used to quantify the influence of the defensive signal of the defending party. In Ref. [22], to address the spear-phishing attack of industrial control systems, a multistage offensive and defensive game model is established. Defense strategies are selected based on the comprehensive consideration of the benefits and costs. Finally, Ref. [23] analyzes the security issues of the Internet of Things through a multistage game model and provides specific defense strategies.

Despite their strengths, all the studies above assume that the network attack and defense process involves only one-way signal transmission, so the process is modeled and analyzed by designating either the attacker or the defender as the signal sender and the other party as the signal receiver. However, in an actual network attack and defense process, the attacker and the defender will have a series of strategic interactions. The attack and defense parties are generally both senders and receivers of signals. If the sender’s transmitted signal is viewed as a stimulus, then the response chosen by the recipient is a reaction. In a two-way sustained stimulus-response process, the defender and the attacker are constantly adjusting and optimizing their respective strategies, thus dynamically propelling the attack and defense evolution [24]. Therefore, the game signal in network attack and defense should be a two-way send-and-receive mechanism.

To address the problem described above, we construct a two-way signaling game model to analyze the network attack and defense processes based on a two-way transmission mechanism of actual attack and defense signals. Based on the solution of the perfect Bayesian equilibrium, a defense strategy selection algorithm is presented. The main contributions of this work are as follows:
(1) Two-way signal transmission mechanism: both the offense and defense parties play a dual role of sender and receiver. While affecting the other party’s strategy selection by releasing the signal, they are also affected by the signal released by the other party.
(2) Game signal set containing both true and fake signals: in order to disrupt the cognitive decision-making process of the other party, both the offense and defense sides in the process of network confrontation use information countermeasures that release a mixture of true and false signals. Since the signal recipient has a certain discriminating ability against false signals, the deceptive effect of the false signal diminishes as the attack and defense game progresses.
(3) Dynamic multistage game process: the offensive and defensive confrontation continues in multiple stages as both sides continue to learn and evolve based on the interaction of signals, dynamically adjust the action strategy, and maximize their gains.

Through a two-way signal transmission mechanism, the method proposed in this paper can more accurately characterize the offensive and defensive strategy confrontation process. Hence, this method more closely models an actual network attack and defense process. It also serves as a better theoretical reference, providing practical guidance in the selection of active defense strategies under dynamic conditions of incomplete information.

2. Construction of a Two-Way Attack and Defense Game Signal Model

2.1. Analysis of Attack and Defense Game Process
2.1.1. Basic Signaling Game Process

The basic signaling game consists of two players: the signal sender and the signal receiver. First, according to the Harsanyi conversion [25], the virtual player “Nature” selects the type of signal sender as θ and transforms the selection problem under the condition of incomplete information into a selection problem under the condition of uncertainty type. The signal sender knows that its type is θ, but the signal receiver only knows the a priori probability P(θ) that the sender belongs to type θ. The signal sender releases a signal H, and the signal receiver, having observed signal H, uses Bayes’ rule to deduce the posteriori probability from the a priori probability P(θ) and subsequently selects an action strategy. The signal sender determines its own action strategy by predicting the signal receiver’s action strategy, and both parties strive to maximize their respective gains. The process of the basic signaling game is shown in Figure 1.
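The receiver's belief update described above can be illustrated with a minimal sketch. The function name `posterior`, the type and signal labels, and the signal likelihoods are hypothetical values chosen for illustration; only the prior (0.4, 0.6) is taken from the case study in Section 4.2.1.

```python
def posterior(prior, likelihood, signal):
    """Return P(type | signal) for every sender type using Bayes' rule.

    prior:      dict mapping type -> a priori probability P(type)
    likelihood: dict mapping type -> dict of signal -> P(signal | type)
    """
    joint = {t: prior[t] * likelihood[t].get(signal, 0.0) for t in prior}
    total = sum(joint.values())
    if total == 0:
        # The signal has zero probability under every type, so Bayes' rule is
        # undefined; keeping the prior is one convention (an assumption here).
        return dict(prior)
    return {t: p / total for t, p in joint.items()}


# Hypothetical likelihoods: how often each sender type releases each signal.
prior = {"enhanced": 0.4, "regular": 0.6}
likelihood = {
    "enhanced": {"h_high": 0.9, "h_low": 0.1},
    "regular": {"h_high": 0.3, "h_low": 0.7},
}
post = posterior(prior, likelihood, "h_high")
```

With these illustrative numbers, observing the enhanced signal raises the receiver's belief in the enhanced type from 0.4 to about 0.67, which is the kind of correction the receiver applies before selecting its action strategy.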

2.1.2. Two-Way Attack and Defense Signaling Game Process

Network confrontations are dynamic and sustained. The attacker and the defender take sequential actions, and each party selects its own action strategy after observing the signal released by the other party. The two-way signaling game process is shown in Figure 2.

(1) Initial Configuration (ICN). The defender acts as the signal sender, and the attacker acts as the signal receiver. The defender deploys the network information system and configures the network topology, IP address, and network segmentation. Since the network must provide services to the outside world, it is characterized by open sharing, interconnection, and interoperability. The network must also have homologous, isomorphic, and homogenous characteristics of information network products. The attacker can gather information on the initial configuration of the defender through a variety of avenues, including infiltration by social engineering means, continuous scanning and detection, and public information acquisition [26]. Such information serves as the basis for the attacker to launch a network attack. In this work, the information is treated as a signal HD released by the defender. The attacker observes the signal HD, corrects the a priori judgment regarding the type of defender, and identifies its attack strategy. The game process is shown in the S1 stage of Figure 2.

(2) Dynamic Confrontation (DCN). Both the offense and defense sides are constantly switching between the role of the signal sender and the signal receiver. Each stage of the game consists of a basic signaling game, as shown in the S2, S3, and Si stages in Figure 2. In the S2 phase, the attacker selects the attack strategy and releases the signal HA. The defender receives the signal HA, corrects the a priori judgment about the type of the attacker, and selects the defense strategy accordingly. In the S3 stage, the defender releases the signal HD and the attacker receives the signal HD and again corrects the a priori assessment regarding the type of the defender to determine the attack strategy. In the process of dynamic confrontation, the signal is transmitted in both directions, and both the offense and defense sides use Bayes’ rule to incrementally correct their estimate of the true type of the other party. From the perspective of the defender, the termination condition of the game is when the attacker stops the attack and no longer releases signals. The game process is shown in the Sn phase of Figure 2.

2.2. Definition of Two-Way Attack-Defense Signaling Game Model

The signal plays a role in the strategic interaction between the sender and receiver. The sender of the signal determines the content of the signal and influences the recipient’s action strategy through the signal. According to the Cyber Kill Chain model [27], the first stage of network reconnaissance is an intelligence gathering activity, such as detection and scanning, which is conducted by the attacker on the defender. This may be regarded as receiving the signal released by the defender. In the course of the confrontation, the sender of the signal may adopt the idea of deception by releasing signals that do not match its own type for the purpose of misleading the other party’s judgment and expanding its own gain [28]. Therefore, the signals transmitted by both the offense and defense parties can be divided into two types: real signals and deception signals.

Definition 1 (real signal (RS)). A real signal is a signal that reflects the true type of the player. The player chooses the action strategy according to its own type. In the process of implementing its strategy, some private information is inevitably exposed; this information is transmitted to the receiver as a real signal. A real signal is accompanied by an action strategy, and the release of a real signal does not require additional cost.

Definition 2 (deception signal (DS)). A deception signal is a signal that does not match the true type of the player. In order to conceal its real type, the player induces the signal receiver to establish a wrong correction to the a priori probability by sending a signal that does not match its type, thereby rendering the receiver into a passive state. Since a signal will not be generated for no reason, the deceptive player must pay an extra cost to release the deceptive signal [29]. For example, if a low-defense user wishes to spoof as a high-defense user, it must deploy some camouflage facility and pay a certain defense cost to release the spoofing signal. The release of defensive signals by the defense player is a concrete manifestation of the active defense philosophy [30], in line with the deceptive concept that “when we are able to attack, we must seem unable; when using our forces, we must seem inactive” in Sun Tzu’s The Art of War.
Based on the above analysis, a two-way signaling game (TWSG) model is constructed for the two-way transmission mechanism in the actual network attack and defense confrontation process.

Definition 3. The TWSG model has ten elements:
(1) Player space. It includes two players: the defender ND and the attacker NA.
(2) Type space. It consists of the defender type θD and the attacker type θA. The type of a player is private information, determined by the action strategy, and the player type can affect the game return of both parties.
(3) Signal space. It consists of the defense signal HD and the attack signal HA. The signal receiver can estimate the type of the sender according to the signal received, and the signal space logically corresponds to the type space. However, due to the existence of the spoofing signal, a specific signal does not correspond strictly to a specific type of attacker or defender.
(4) Number of game stages T. The two-way signaling game continues in multiple stages, and the tth stage of the game is represented as TWSG(t).
(5) Spoofing signal attenuation factor σ. After multiple strategic interactions between the attacker and defender, the two sides become more familiar with each other, and the influence of deception signals is gradually attenuated. The posteriori probability generated in the tth stage of the game is modified by the factor σ to make it more realistic. The initial stage deception signal is not attenuated. For a sufficiently large T, the influence of the spoofing signal disappears completely: the signal and type then constitute a corresponding relationship, and the two-way signaling game degenerates into a static game of incomplete information.
(6) Gain discount factor. It represents the ratio between the gain in stage t + 1 and the gain in stage t and is used to convert the gain of a future stage into the present value.
(7) Strategy space. SD is the defender's strategy, and SA is the attacker's strategy.
(8) A priori probability space. PD is the set of a priori probabilities of the defender, and it represents the a priori probability of the attacker’s type known to the defender. PA is the set of a priori probabilities of the attacker, and it represents the a priori probability of the defender’s type known to the attacker.
(9) Posteriori probability space. It contains the defender's posteriori probability set, meaning the defender’s posteriori assessment of the attacker’s type, and the attacker’s posteriori probability set, meaning the attacker’s posteriori assessment of the defender’s type.
(10) Gain space. It contains the defender’s gain and the attacker’s gain.

2.3. Gain Calculation

Based on the characteristics of the two-way signaling game model, we provide the following definition and calculation method for the game return.

Definition 4. The system damage cost (SDC), attack cost (AC), defense cost (DC), and related definitions and calculation methods can be found in Refs. [23, 31, 32]. Among them, SDC is affected by the combination of attack and defense strategies: it is recorded per strategy pair and represents the damage that the system suffers under a given defense strategy and a given attack strategy.

Definition 5 (deception cost). The deception defense cost (DDC) is the cost incurred to the defense party for actively releasing a spoofing signal to confuse the attacker. The deception attack cost (DAC) is the cost incurred to the attacking party for actively releasing a spoofing signal to confuse the defender.
According to the cost/reward calculation method, the attacker's return is the SDC and its total cost is the sum of the AC and DAC. The defender’s cost is the sum of the SDC, DC, and DDC.
The discount factor is used to convert future earnings into current gain, so the gain target functions of the offensive and defensive parties accumulate the discounted gains of the individual game stages. According to the attack and defense types, the attack-defense strategies can be divided into different levels, such as enhanced type and regular type. The costs and returns of the strategies at the same level are basically the same. For example, if an attack level contains a total of h attack strategies, then the probability that the attacker selects any one of them is 1/h, and the gain from this attack level can be expressed as the average of the gains of its h strategies. Similarly, the gain of a defense level is the average of the gains of the defensive strategies it contains.
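The cost and return bookkeeping of this section can be sketched as follows. This is an illustration under stated assumptions rather than the paper's exact gain functions: it assumes the attacker's stage gain is SDC minus AC and DAC, that the defender's stage gain is the negative of SDC plus DC and DDC, and that stage t is weighted by the discount factor raised to t − 1. All function names and numeric inputs are hypothetical.

```python
from statistics import mean


def attacker_gain(sdc, ac, dac=0.0):
    # The attacker's return is the SDC; its total cost is AC + DAC.
    return sdc - ac - dac


def defender_gain(sdc, dc, ddc=0.0):
    # The defender's cost is SDC + DC + DDC (no positive return is assumed).
    return -(sdc + dc + ddc)


def level_gain(strategy_gains):
    # Strategies at the same level are equally likely (probability 1/h each),
    # so the level gain is the plain average of the per-strategy gains.
    return mean(strategy_gains)


def discounted_total(stage_gains, delta):
    # Present value of the stage gains; the first stage is not discounted
    # (assumed weighting delta ** (t - 1) for stage t = 1, 2, ...).
    return sum(g * delta ** t for t, g in enumerate(stage_gains))


# Hypothetical inputs, for illustration only.
a = attacker_gain(sdc=100, ac=30, dac=10)
d = defender_gain(sdc=100, dc=20, ddc=5)
avg = level_gain([60, 40])
total = discounted_total([100, 100, 100], delta=0.5)
```

Keeping the deception costs as optional arguments mirrors Definition 1: a real signal carries no extra cost, so DAC or DDC is zero unless a spoofing signal is released.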

3. Two-Way Signaling Game Equilibrium Solution and Defense Strategy Selection

A two-way signaling game is a finite game consisting of several basic signaling games. In the game, the attacker and defender alternately act as signal senders and receivers and the single role equilibrium solution is no longer applicable. In this paper, we first present the solution process for a one-stage game equilibrium and then apply it to a multistage equilibrium solution.

We carry out the calculation and analysis for the single-stage game equilibrium solution by referring to the signal sender as the Leader and the signal receiver as the Follower. The relevant parameters are set as follows:
(1) Signal sender action strategy
(2) Signal receiver action strategy
(3) Defender type space θD = (enhanced type defense, regular type defense)
(4) Defender signal space HD = (hDH, hDM) = (enhanced defense signal, regular defense signal)
(5) Attacker type space θA = (enhanced attack, regular attack)
(6) Attacker signal space HA = (hAH, hAM) = (enhanced attack signal, regular attack signal)

3.1. Single-Stage Game Equilibrium Solution

Definition 6. The TWSG(t) game equilibrium solution EQt consists of three parts: the Leader’s signal strategy, the Follower’s action strategy, and the Follower’s posteriori probability of the Leader type. Because the Follower can be the attacker or the defender in different game stages, these parts are defined per stage. According to game theory, the equilibrium should satisfy two conditions:
(i) Given the posteriori probability, the Follower's strategy is the optimal response to the signal released by the Leader.
(ii) Given the Follower's strategy, the Leader's signal strategy is optimal for the Leader.
Here, the posteriori probability of the Leader type is calculated by the Follower using Bayes' rule from the a priori probability P, the observed signal h, and the Leader's signal strategy.
Solving for the perfect Bayesian equilibrium is more complex, and the entire process may be divided into the following three steps:
(1) Step 1. Calculate the Follower's optimal strategy based on the signal received.
(2) Step 2. Deduce the Leader's optimal strategy.
(3) Step 3. Select the perfect Bayesian equilibrium solution.
The detailed process is shown in the Appendix.
Based on game theory, the perfect Bayesian equilibrium solution is the optimal strategy for the player [33]. Therefore, the defender should determine the active defense strategy based on its role and game equilibrium EQt.
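The two equilibrium conditions can be checked mechanically on a small game. The following sketch enumerates pure-strategy perfect Bayesian equilibria by brute force; it is not the Appendix procedure, and the function `pure_pbe`, the labels, and the payoff numbers are hypothetical. It assumes two Follower actions, so that testing point beliefs is enough for signals that no Leader type sends.

```python
from itertools import product


def pure_pbe(types, signals, actions, prior, u_lead, u_foll):
    """Enumerate pure-strategy perfect Bayesian equilibria of a signaling game.

    u_lead and u_foll map (type, signal, action) to the payoff of each player.
    """
    def is_best_reply(sig, act, belief):
        def expected(a):
            return sum(belief[t] * u_foll[(t, sig, a)] for t in types)
        return all(expected(act) >= expected(a) for a in actions)

    equilibria = []
    for sig_choice in product(signals, repeat=len(types)):
        lead = dict(zip(types, sig_choice))            # type -> signal
        for act_choice in product(actions, repeat=len(signals)):
            foll = dict(zip(signals, act_choice))      # signal -> action
            # Condition (ii): no Leader type gains by sending another signal.
            if any(u_lead[(t, lead[t], foll[lead[t]])] < u_lead[(t, s, foll[s])]
                   for t in types for s in signals):
                continue
            beliefs, consistent = {}, True
            for s in signals:
                senders = [t for t in types if lead[t] == s]
                if senders:
                    # On-path signal: the belief follows from Bayes' rule.
                    mass = sum(prior[t] for t in senders)
                    candidates = [{t: (prior[t] / mass if t in senders else 0.0)
                                   for t in types}]
                else:
                    # Off-path signal: try every point belief (assumption).
                    candidates = [{t: float(t == t0) for t in types}
                                  for t0 in types]
                # Condition (i): the Follower's action is optimal under a belief.
                support = [b for b in candidates if is_best_reply(s, foll[s], b)]
                if not support:
                    consistent = False
                    break
                beliefs[s] = support[0]
            if consistent:
                equilibria.append((lead, foll, beliefs))
    return equilibria


# Hypothetical game: the Leader prefers the Follower to back off and pays a
# cost of 2 for a deception signal; the Follower backs off only against "H".
types, signals, actions = ["H", "M"], ["hH", "hM"], ["back_off", "attack"]
truthful = {"H": "hH", "M": "hM"}
u_lead = {(t, s, a): (1 if a == "back_off" else 0) - (0 if truthful[t] == s else 2)
          for t in types for s in signals for a in actions}
u_foll = {(t, s, a): 1 if (t == "H") == (a == "back_off") else 0
          for t in types for s in signals for a in actions}
eqs = pure_pbe(types, signals, actions, {"H": 0.4, "M": 0.6}, u_lead, u_foll)
```

In this toy game the deception cost exceeds the benefit of misleading the Follower, so the only equilibrium found is the separating one in which each Leader type sends its real signal. Lowering the deception cost in `u_lead` is a simple way to see pooling equilibria appear.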

3.2. Multistage Game Equilibrium Solution

In the multistage continuous confrontation process, the defense party may incrementally revise its assessment of the attacker’s motivation and behavioral preference using the stimulus-response learning mechanism, reduce the impact of the attacker’s deception signal, and implement a targeted active defense strategy to maximize the expected return.
(1) In the first stage of the game TWSG(1), the Leader is the defender and the Follower is the attacker. Based on the Harsanyi conversion, the virtual player “Nature” selects the type of the defender: one type is selected with a priori probability p1, and the other is selected with probability 1 − p1. The defender releases the signals hDH and hDM. Based on the observed signals, the attacker selects its strategy type and corrects its a priori assessment of the defender type. According to the single-stage game equilibrium solution process in Section 3.1, the game equilibrium EQ1 can be obtained for TWSG(1). The TWSG(1) game tree is shown in Figure 3.
(2) In the second stage of the game TWSG(2), the Leader is the attacker and the Follower is the defender. The attacker selects the attack strategy according to EQ1 and sends a signal to the defender. The offense and defense sides have interchanged their roles as the sender and receiver of the signal. Through the TWSG(1) game, both the offensive and defensive sides have gained some mutual understanding, and the decay of the deception signal begins to emerge. At this point, the attacker no longer relies on “Nature” to select the type. Instead, the selection probabilities are determined by the signal attenuation factor σ of the deception signal and the posteriori probability in EQ1. The TWSG(2) game tree is shown in Figure 4.
(3) In the third stage of the game TWSG(3), the Leader is the defender and the Follower is the attacker. The defender selects the defense strategy according to EQ2 and sends a signal to the attacker. The attack and defense roles are interchanged again. After the first two stages of the game, the attenuation effect of the deception signal is more pronounced, and the defender's type selection probabilities are again determined by σ and the posteriori probability in EQ2. The TWSG(3) game tree is shown in Figure 5.
(4) In the T-stage of the game TWSG(T), the Leader is the defender and the Follower is the attacker.

As described in Section 2.1.2, both the attacker and the defender continuously interchange their roles as the sender and receiver of the signal during the ongoing confrontation, which dynamically adjusts the strategy and moves the game process forward. When the game stage T is large enough, the spoofing signal will be screened by the other party and its influence will completely disappear. The two-way signaling game will degenerate into a static game of incomplete information. The defender will continue to use defensive measures as the Leader releases signals to the outside world. The attacker will terminate the confrontational behavior and act only as the Follower to receive the signals sent by the defender. The TWSG(T) game tree is shown in Figure 6.
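The alternation of roles across stages can be laid out as a simple schedule. The sketch below is illustrative only: the generator `stage_schedule` is hypothetical, the strict odd/even alternation is extrapolated from stages one to three, and the attenuation weight σ raised to t − 1 is an assumed form chosen because it leaves the first stage unattenuated and vanishes for large t when σ is between 0 and 1.

```python
def stage_schedule(T, sigma):
    """Yield (stage, leader, follower, deception_weight) for stages 1..T."""
    for t in range(1, T + 1):
        # Odd stages: the defender sends the signal; even stages: the attacker.
        if t % 2 == 1:
            leader, follower = "defender", "attacker"
        else:
            leader, follower = "attacker", "defender"
        # Assumed attenuation form: no attenuation at t = 1, decaying after.
        weight = sigma ** (t - 1)
        yield t, leader, follower, weight


# sigma = 0.5 is a hypothetical value used only to show the decay.
schedule = list(stage_schedule(T=9, sigma=0.5))
```

With nine stages the last Leader is the defender, which matches the description of TWSG(T) above.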

3.3. Defense Strategy Selection Algorithm and Comparison with Results

The algorithm for designing the active defense strategy is shown in Algorithm 1.

Input: Two-way signaling game model TWSG
Output: Active defense strategy
Initialize the elements of TWSG;
Calculate the attack gain;
Calculate the defense gain;
for (t = 1; t <= T; t++)
{
    Initialize the parameters of TWSG(t);
    Leader releases signal H;
    Calculate {Inferred optimal dependence strategy for Follower};
    Calculate {Inferred optimal dependence strategy for Leader};
    Generate posteriori inference of the Leader type for Follower based on Bayes’ rule;
    If the two strategies and the posteriori inference are not in conflict,
    Then create the equilibrium EQt;
    Return the defense strategy given by EQt;
}
End

If the number of types on the defense side is n, the number of types on the attacker side is m, the number of game stages is t, the number of defense strategies is , and the number of attack strategies is h, then according to Refs. [17, 21], the time complexity of the active defense strategy selection algorithm is and the space complexity is .

The results of our method are compared with available research on signaling games in Table 1.

The signal transmission mechanism refers to whether the signal transmission direction in the model is one-way or two-way. The attenuation of the deception signal indicates whether the model characterizes the deception signal attenuation phenomenon. The game process distinguishes whether the model has single-stage or multistage analysis capability. The model expansion indicates whether the types and strategies of attack and defense in the model can be expanded; the better the expansion ability, the wider the scope of application of the model. The equilibrium solution of the model represents the degree of detail of the game equilibrium solution process; the more detailed the solution process, the more practical it is. The operating cost refers to the time complexity and space complexity of the defense strategy selection algorithm; the lower the operating cost, the better the performance.

Most previous studies use the one-way signal transmission mechanism to model the attack and defense process, and less consideration is given to the phenomenon of deception signal attenuation in the confrontation. Additionally, some studies are limited to single-stage game analysis. In this paper, we conduct an in-depth analysis of the two-way signal transmission mechanism, establish a two-way signaling game model, provide a detailed game equilibrium solution process, and design a defense strategy selection algorithm. In terms of signal transmission mechanisms, deception signal attenuation, and game process, this work comes closer to actual network attack and defense, and the model has better scalability and practicability. By sending deception signals, both the offense and defense sides seek to control the other party’s strategy selection as well as maximize their own expected returns. This process embodies the confrontational philosophy under the condition of limited information.

Zhu et al. [34] propose two iterative reinforcement learning algorithms that allow the defender to identify optimal defenses. Reinforcement learning and the signaling game model each have their own advantages and disadvantages, and they suit different application scenarios. The purpose of this paper is to analyze the process of network attack and defense. Reinforcement learning is a black box: although the optimal defenses can be obtained, the analysis process and principles cannot be visualized. When the two-way signaling game model is used to analyze the network attack-defense confrontation, the analysis process and principles can be visualized more clearly.

4. Real Case Application and Results Analysis

4.1. Experimental Environment and Parameter Configuration

In order to verify the feasibility and effectiveness of the proposed method, an experimental network environment was set up to carry out a simulation experiment. The experimental network was a typical business network, which was divided into three areas: external network, internal network, and DMZ. The attack and defense scenario was set as follows: the attacker was located in the external network area and attempted to remotely attack the internal network zone of the enterprise intranet. The defender was the network security administrator of the enterprise and selected the active defense strategy according to the method in the paper. The topology of the experimental network is shown in Figure 7.

To ensure the availability and security of the enterprise network, a set of access control rules was set up between the network partitions, as shown in Table 2. Each entry in the table indicates whether access was allowed, was not allowed, or required certain permissions.

In general, the database server (databaseserver) stores a large amount of confidential data of the enterprise, so it was set as the target of attack in the experiment. According to the access control rules in Table 2, the attacker cannot directly access the databaseserver; however, through multiple steps, the vulnerability of the bastion server in the DMZ area can be used to obtain access to the internal network area, thereby achieving the goal of the attack.

Combined with the description of Common Vulnerabilities and Exposures (CVE) information in the information security vulnerability library [35], the vulnerability scanning tool Nessus was used to detect and discover the security vulnerabilities that existed in the experimental network. The security vulnerability of the experimental network is given in Table 3.

The attacker used the security vulnerabilities and defects that existed in the enterprise network to select an attack strategy consisting of several atomic attack actions. The defender selected a defense strategy containing different atomic defense actions in a targeted manner [36]. According to the attack and defense classification of the Lincoln Laboratory [37], we obtained the attack and defense strategies and their operating costs, as shown in Table 4.

In Refs. [17, 28], historical statistical data and expert experience were combined to provide the SDC values for different combinations of attack and defense strategies, as shown in Table 5, and to set the spoofing signal attenuation factor and the gain discount factor. By the ninth stage, the discounted gain has very little influence on the total return calculation; thus, the number of game stages was set to T = 9.

4.2. Equilibrium Solution and Strategy Selection
4.2.1. TWSG(1) Game Equilibrium and Defense Strategy

“Nature” selects the type of defense strategy with a probability of (0.4, 0.6). When the strategy type of the defender is , the signal hDH is sent out. When the type of the attack strategy is , there are a total of four strategy combinations: , , , and . The SDC values for different combinations of attack-defense strategies are given in Table 5.

Under the first strategy combination , the spoof signal of the attacker is DAC = 0. Thus,

The gains for the other three strategy combinations can be calculated in the same way:

, , and .

Since the probability for selecting different strategies at the same attack and defense level is the same, the probability for each strategy combination is 0.25, and therefore the average gain u12 of the attacker under strategy type is

Similarly, we have

, , , and .

The same method yields the offensive and defensive gains under the other combinations of strategy types.
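The averaging step used in this subsection can be sketched in Python. This is a minimal illustration that assumes, as stated in the text, that every strategy combination within a pair of strategy types is equally likely. The function name and the four sample gains are hypothetical placeholders and are not taken from Table 5.

```python
from typing import Sequence


def average_gain(combination_gains: Sequence[float]) -> float:
    """Average gain over equally likely strategy combinations."""
    if not combination_gains:
        raise ValueError("at least one strategy combination is required")
    probability = 1.0 / len(combination_gains)  # 0.25 when there are four combinations
    return sum(probability * gain for gain in combination_gains)


# Hypothetical gains for four strategy combinations (placeholders, not Table 5 values).
sample_gains = [1000.0, 2000.0, 3000.0, 4000.0]
sample_average = average_gain(sample_gains)
print(sample_average)
```

The same function would be applied once per pair of strategy types, for the attacker's gains and for the defender's gains separately.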

Using the equilibrium solution algorithm of Section 3.3, a pooling equilibrium solution is obtained for TWSG(1). There are two possible combinations of strategy types:
Option 1: the defender selects strategy type and releases signal hDH, and the attacker selects strategy type . In this case, U11 = −2960 and U12 = 1830.
Option 2: the defender selects strategy type and releases signal hDH, and the attacker selects strategy type . In this case, U11 = −2727.5 and U12 = 2037.5.

Because the defender’s gain U11 is higher under option 2 (−2727.5 versus −2960), the defender selects option 2 as the defense strategy, designated as . The game tree of attack and defense is shown in Figure 8.
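The selection between the two pooling-equilibrium options can be sketched as follows. The sketch assumes that U11 is the defender's gain and U12 the attacker's gain, and that the defender simply picks the option with the larger U11. The dictionary layout and function name are illustrative and are not part of the algorithm in Section 3.3.

```python
from typing import Dict, Tuple

# (defender gain U11, attacker gain U12) for the two options reported for TWSG(1).
options: Dict[str, Tuple[float, float]] = {
    "option 1": (-2960.0, 1830.0),
    "option 2": (-2727.5, 2037.5),
}


def best_option_for_defender(candidates: Dict[str, Tuple[float, float]]) -> str:
    """Return the option whose defender gain (first tuple entry) is largest."""
    return max(candidates, key=lambda name: candidates[name][0])


chosen = best_option_for_defender(options)
print(chosen)
```

With the reported values, the comparison reduces to −2727.5 against −2960, so the sketch returns option 2.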

4.2.2. TWSG(2) Game Equilibrium and Defense Strategy

In the TWSG(1) equilibrium solution process, the attacker may choose either the strategy type or , and therefore the defender’s posterior belief about the attacker’s type is revised to (0.5, 0.5). Using the equilibrium solution algorithm described in Section 3.3, the solution of TWSG(2) remains a pooling equilibrium. There are two possible combinations of strategies:
(i) The attacker selects strategy type and releases signal hAM, and the defender selects strategy type 
(ii) The attacker selects strategy type and releases signal hAM, and the defender selects strategy type

Therefore, the defender selects the regular type strategy, designated as .

4.2.3. Game Equilibrium and Defense Strategy for Stages Three through Nine

Using the above method, the game equilibrium for each stage is solved sequentially.

For stages three through six, as shown in Table 6, the game equilibrium solution remains a pooling equilibrium, but the deceptive signal is gradually attenuated. In stages seven through nine, the deception signal is completely attenuated, the game evolves into a static game of incomplete information, and the pooling equilibrium solution becomes a separating equilibrium solution. At this point, the defender selects the enhanced strategy type and releases the enhanced signal hDH, designated as .

4.3. Experimental Analysis

Based on the above experiments and data analysis, the following general conclusions can be drawn from the offensive and defensive game equilibria and gains, independent of specific parameter values.

(1) Deception signals can improve attack and defense performance.
The game equilibrium solutions for stages one through six are pooling equilibrium solutions, indicating that, in the initial stages of the offensive and defensive game, the defender may adopt the regular type of defense strategy while confusing and misleading the attacker by releasing the spoofing signal hDH. By disrupting the attacker’s cognition, the defender can maximize its own gain at a small cost. The defender should therefore make full use of spoofing signals and release them actively. At the same time, the defender should improve its ability to identify the attacker’s spoofing signals so that the attacker’s motivation and preferences can be recognized as early as possible and a targeted active defense strategy can be implemented.

(2) The role of the spoofing signal is limited and attenuates over time.
As the game progresses, the spoofing signal is gradually attenuated. In the seventh through ninth stages, the game equilibrium solution becomes a separating equilibrium solution, indicating that the deception signal no longer has any effect. The defender no longer releases spoofing signals but instead increases its defensive input and adopts an enhanced defense strategy against network attacks. Therefore, when selecting a strategy, the defender should account for the limitations of the spoofing signal and delay its attenuation by improving the quality of the signal. At the same time, attention should be given to collecting threat information and amplifying the limitations of the attacker’s spoofing signal.

(3) Spoofing signals can slow the attack and reduce its suddenness.
An analysis of the first through ninth stages of the game shows that the deception signal released by the defender can delay the formation of the network kill chain and gain some reaction time for the defender. The deception signal can partially offset the time asymmetry advantage and the first-move advantage possessed by the attacker. However, because of its limitations, the spoofing signal alone cannot completely resist network attacks. Therefore, the defender should follow the evolution of the game and use other means of defense to dynamically adjust the defense strategy and maximize its own return.

(4) Security losses can be reduced by enhancing defense capabilities.

We compare the returns obtained when different strategy types are adopted. In the first through sixth stages, the defender adopts the regular type of defense strategy and its average return is −2853. In the seventh through ninth stages, the defender chooses the enhanced defense strategy type and its average return is −2496. This shows that when faced with continuous high-intensity network attacks, the defender should increase its security investment, enhance its defense capabilities, and reduce its security losses.
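As a rough check on the size of this improvement, the relative reduction in the defender's average loss can be worked out from the two reported averages. The symbols below are introduced here for illustration only: \bar{U}_{\mathrm{reg}} is the average return in stages one through six and \bar{U}_{\mathrm{enh}} is the average return in stages seven through nine.

```latex
\[
\frac{\bar{U}_{\mathrm{enh}} - \bar{U}_{\mathrm{reg}}}{\lvert \bar{U}_{\mathrm{reg}} \rvert}
= \frac{-2496 - (-2853)}{2853}
= \frac{357}{2853}
\approx 0.125
\]
```

The average loss per stage is thus about 12.5% smaller under the enhanced strategy type. The two averages come from different stages of the game, so this is an indicative comparison and not a controlled one.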

5. Conclusion

Active defense is at the forefront of network security research, and strategy selection is the key to defense effectiveness. Under attack-defense confrontation with limited information, the defender’s optimal strategy is difficult to determine; a signaling game model is an effective way to address this problem. Because one-way signal transmission does not reflect actual network attack and defense, we analyzed the two-way signal transmission process, constructed a two-way signaling game model, provided a multistage perfect Bayesian equilibrium solution process, and designed an active defense strategy selection algorithm. The feasibility and effectiveness of the method were verified through example applications and analysis. By analyzing the experimental results, we identified the mechanism behind the effectiveness and limitations of the deceptive signal and summarized four conclusions that guide the selection of active defense strategies. Compared with existing research, the proposed two-way signaling game model represents the confrontation between offensive and defensive strategies more accurately and more closely resembles an actual network attack and defense process. Our work therefore provides a basis and a reference for active defense strategy selection under dynamic incomplete information conditions.

Appendix

Example Solution of Perfect Bayesian Equilibrium

Based on the parameter settings in this paper, the attacking party and the defending party each have two strategy types and release two types of signals. The Leader types are represented by the symbols LH and LM, the signal space is represented by hLH and hLM, and the Follower types are represented by the symbols FH and FM. {u11, u21, u31, …, u81} are the gains of the Leader, and {u12, u22, u32, …, u82} are the gains of the Follower. The single-stage signaling game tree is shown in Figure 9.

Step 1. Follower strategy calculation.
First, we assume the posterior inference for the different signal sets on the single-stage game tree to be . We then calculate the maximum return .
When H = hLH, and the condition is satisfied.
Assuming that ,
we solve and obtain , and .
For , and .
For , and .
Similarly, we obtain .
For , .
For , .
By repeating the above process, we calculate for H = hLM.

Step 2. Leader strategy calculation.
For , when and , and we obtain .
Similarly, we obtain for different sections of and .
By repeating the above process, we calculate for .

Step 3. Calculate equilibrium solution.
We obtain and in Step 1 and Step 2, respectively. Combining these with the prior probability PL yields the posterior probability . If the calculated value of does not conflict with the premise hypothesis , then the equilibrium solution is .
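The consistency check in Step 3 can be sketched in Python as a Bayes' rule update followed by a comparison with the belief assumed in Step 1. This is only an illustrative sketch. The function names, the dictionary representation of the Leader's signaling strategy, and the assignment of the prior (0.4, 0.6) to the types LH and LM are assumptions made for the example.

```python
from typing import Dict

Belief = Dict[str, float]


def posterior_given_signal(prior: Belief,
                           signal_prob: Dict[str, Dict[str, float]],
                           signal: str) -> Belief:
    """Bayes' rule: P(type | signal) from P(type) and P(signal | type)."""
    joint = {t: prior[t] * signal_prob[t].get(signal, 0.0) for t in prior}
    total = sum(joint.values())
    if total == 0.0:
        raise ValueError("signal is off the equilibrium path; Bayes' rule does not apply")
    return {t: p / total for t, p in joint.items()}


def is_consistent(assumed: Belief, computed: Belief, tol: float = 1e-9) -> bool:
    """True when the computed posterior matches the belief assumed in Step 1."""
    return all(abs(assumed[t] - computed[t]) <= tol for t in assumed)


# Hypothetical pooling example: both Leader types send hLH with probability 1.
prior = {"LH": 0.4, "LM": 0.6}
signal_prob = {"LH": {"hLH": 1.0}, "LM": {"hLH": 1.0}}
posterior = posterior_given_signal(prior, signal_prob, "hLH")
consistent = is_consistent({"LH": 0.4, "LM": 0.6}, posterior)
print(posterior, consistent)
```

In a pooling strategy the signal carries no information about the Leader's type, so the posterior equals the prior and the assumed belief passes the check. A belief that failed the check would be discarded, and the corresponding candidate would not be an equilibrium.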

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest regarding the publication of this paper.

Authors’ Contributions

Xiaohu Liu and Hengwei Zhang contributed equally to this work.

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under Grant nos. 61521003 and 61572517 and in part by the Henan Science and Technology Research Project under Grant no. 182102210144.