Research Article  Open Access
Adaptive Secure MIMO Transmission Mechanism against Smart Attacker
Abstract
The MIMO transmission against a smart attacker has recently been formulated as a noncollaborative game, in which both the MIMO transmitter and the malicious attacker try to maximize their predefined utilities. In this paper, by carefully analyzing the Nash Equilibrium (NE), we focus on the conditions under which the game outcome favors the malicious attacker instead of the MIMO transmitter. In this adverse case, it is highly desirable to develop an effective mechanism that suppresses the attacker’s motivation to attack and thereby secures the communication. Motivated by this, an adaptive secure MIMO transmission scheme is proposed to help the MIMO transmitter better resist malicious attackers in adverse channel conditions. Compared with the existing game-based strategy, the proposed scheme adjusts not only the transmit power of the MIMO transmitter but also its transmission probability. Our analysis shows that the proposed scheme can be regarded as a generalized adaptive transmission scheme: when the adaptive transmit power policy is enough to suppress the attack motivation, the proposed scheme reduces to the adaptive power control scheme; otherwise, both the adaptive transmit power and the adaptive probabilistic transmission are employed to suppress the attack motivation. The results confirm that the proposed adaptive transmission scheme offers a way to enhance secure MIMO transmission performance in adverse conditions.
1. Introduction
When a malicious attacker can cleverly switch its attack mode amongst eavesdropping [1], jamming [2], and spoofing [3] to obstruct the secure communication between a transmitter and its receiver, it imposes critical challenges on the design of secure transmission strategies. Game theory provides a useful framework for deriving the optimal transmission strategy in the presence of uncertain attack modes [4–6]. A MIMO wiretap zero-sum game was formulated in [4], with the secrecy rate as the utility function, to analyze the conditions of equilibrium outcomes under various strategies. An exemplary multichannel spectrum access game (SAG) with unknown environment dynamics and limited information about other players was considered in [5] to find the best communication strategy through a joint reinforcement learning and type identification algorithm. In [6], a Bayesian game model was presented to allocate the defense effort among the nodes when the network defenders have little knowledge of the opponent’s information, and an explicit solution to this dilemma was derived. Afterwards, game theory-based secure transmission was extensively studied in [7–10] to cope with active (smart) attackers. A zero-sum game between the MIMO transmitter and a jamming-capable eavesdropper was formulated in [7] to derive the optimal mixed strategy and the realized Nash equilibria. Secret communication in multichannel networks was studied in [8] by using game-theoretical methods. In [9], two simple stochastic games were proposed to deal with the dual threat of jamming and eavesdropping. The results show that, under some conditions, incorporating a time slot for malicious threat detection into the transmission protocol can improve the confidentiality and reliability of the communication without additional transmission delay.
Motivated by the aforementioned progress, some research efforts have combined game theory with reinforcement learning to devise secure transmission strategies. In [10], a Q-learning-based strategy was developed to cope with the smart attacker. Reinforcement learning-based spoofing detection and a deep reinforcement learning-based authentication scheme were proposed in [11] to further enhance authentication, with Q-learning being utilized. A reinforcement learning-based power control scheme was developed in [12] to resist jamming attacks on the communication between in-body sensors and the WBAN coordinator. In [13], the interaction between the user and a smart interferer in an ambient backscatter communication network was formulated as a game, and the closed-form equilibrium of the Stackelberg game was obtained. Because information about the system SNR and the interferer’s transmission strategy is lacking, a Q-learning algorithm was proposed to derive the optimal strategy in a dynamic iterative manner. A zero-sum game between a base station equipped with multiple antennas and a smart jammer in a nonorthogonal multiple access (NOMA) system was formulated in [14]. The Stackelberg equilibrium of the anti-jamming NOMA transmission game was derived, and a reinforcement learning-based power control scheme was proposed for the downlink NOMA transmission without knowledge of the jamming and radio channel parameters.
As a reinforcement learning (RL) algorithm that maximizes the long-term expected reward in multistate environments, Q-learning provides an effective technical solution for multiplayer general-sum games. The conditions under which Q-learning converges with probability 1 to the optimal values, the dynamic analysis framework, and the asymptotic convergence classes were addressed in [15–17], respectively. MIMO transmission against a more powerful smart attacker, which can use programmable radio devices (for instance, software-defined radios) to perform multiple types of attacks such as eavesdropping, jamming, and spoofing, was investigated in [18]. It was shown that the problem can be formulated as a noncooperative game, in which a power control strategy based on reinforcement learning can suppress the attack motivation of smart attackers in a dynamic MIMO transmission game without knowledge of the attack and the radio channel model. Nonetheless, most of the existing research efforts focus on how to derive the optimal power control strategy for suppressing the attack motivation of the malicious attacker by means of reinforcement learning. Unfortunately, as we will illustrate in this paper, power control alone is no longer effective in suppressing the attack motivation of the malicious attacker in an adverse environment. In this case, it becomes highly desirable to develop a new mechanism that helps the MIMO transmitter better resist the malicious attacker. This is the most important research motivation of our work.
The Nash Equilibrium (NE) provides the basis for determining the optimal solution in the noncooperative game framework: at an NE, no player has an incentive to change his/her strategy, because a player gains nothing by deviating from the chosen strategy while the other players keep their strategies unchanged. The NE analysis in [18] revealed that, when the transmit power of the MIMO transmitter is large enough, the game between the MIMO transmitter and the smart malicious attacker favors the MIMO transmitter; namely, the attacker tends to stay idle. Although the reinforcement learning-based power control strategy can be used to cope with the smart attacker, one may readily observe the limitation of this design. In some adverse channel conditions, a very large transmit power might be infeasible, especially when the transmit power is limited. In this case, the challenges of secure MIMO transmission arise. Exploring the critical factors that affect the strategies of both parties and developing an effective scheme that lets the MIMO transmitter realize secure transmission against the attacker under adverse channel conditions are the other two motivations of our work. To this end, an adaptive secure MIMO transmission scheme is proposed in this paper, in which not only the transmit power of the MIMO transmitter but also the transmission probability is adjusted. The proposed scheme can be interpreted as a generalized adaptive transmission strategy. When the adaptive transmit power policy is enough to suppress the attack motivation, the proposed scheme reduces to the adaptive power control scheme; otherwise, both the adaptive transmit power and the adaptive probabilistic transmission are employed to suppress the attack motivation of the smart attacker.
The contributions of this paper can be briefly summarized as follows:
(i) A comprehensive Nash Equilibrium (NE) analysis of the noncollaborative game framework in [18] is presented to explore the critical factors that dominate the game decisions. In this way, we show that power control alone is not enough to suppress the attack motivation of the malicious attacker in adverse channel conditions.
(ii) A new probabilistic transmission scheme is proposed to realize a novel adaptive secure MIMO transmission scheme against the smart attacker. It is shown that, with both the probabilistic transmission control and the power control policy, the capability of the MIMO transmitter to resist the smart attacker can be improved, especially when compared with the original scheme in [18].
The remainder of this paper is organized as follows. The system model and the game problem formulation between the MIMO transmitter and the malicious attacker are reviewed in Section 2. The comprehensive NE analysis is presented in Section 3 to show that the game decision will not always be friendly to the MIMO transmitter. The proposed adaptive secure MIMO transmission mechanism is addressed in Section 4. Numerical analysis is presented in Section 5 to show the applicability of the proposed scheme. Finally, we conclude our work in Section 6.
Notations. All boldface letters indicate vectors (lower case) or matrices (upper case). The superscripts denote the conjugate transpose. represents the determinant of a matrix. is a identity matrix.
2. System Model and the MIMO Game Problem Formulation
2.1. System Model
Let us consider the Alice-Bob-Eve communication model illustrated in Figure 1, in which Alice is assumed to be provisioned with M transmit antennas, Bob is assumed to be provisioned with receive antennas, and Eve has N antennas. Alice is supposed to communicate with Bob, while Eve is smart and will try to choose his attack mode amongst eavesdropping, jamming, and spoofing to obstruct the communication between Alice and Bob. The Alice-Bob link, the Alice-Eve link, and the Eve-Bob link can be represented by the channel matrix , the channel matrix , and the channel matrix , respectively. Throughout the paper, the ith largest eigenvalue of will be denoted by , .
Alice is assumed to transmit an M-dimensional signal vector to Bob, and the transmit power is , where stands for the maximum allowed transmit power at Alice. When Eve decides not to attack the communication (namely, the attack mode indicator ), the MIMO transmission rate R can be given by [19]
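As a rough illustration of the no-attack rate, the following sketch evaluates the textbook MIMO rate with equal power allocation across the M transmit antennas, log2 det(I + (P / (M σ²)) H Hᴴ), and the equivalent sum over the eigenvalues of H Hᴴ. The equal-power assumption, the noise variance parameter `noise_var`, and the function names are assumptions made here for illustration; the normalization used in [19] may differ.

```python
import numpy as np


def mimo_rate(H, P, noise_var=1.0):
    """Rate in bit/s/Hz for channel H (receive x transmit) and total power P,
    assuming equal power allocation over the transmit antennas."""
    n_rx, n_tx = H.shape
    gram = H @ H.conj().T  # n_rx x n_rx Hermitian matrix H H^H
    a = np.eye(n_rx) + (P / (n_tx * noise_var)) * gram
    _, logdet = np.linalg.slogdet(a)  # natural log of |det(a)|
    return float(logdet / np.log(2.0))


def mimo_rate_from_eigs(H, P, noise_var=1.0):
    """Same rate written as a sum over the eigenvalues of H H^H."""
    n_tx = H.shape[1]
    eigs = np.linalg.eigvalsh(H @ H.conj().T)  # real eigenvalues, ascending
    eigs = np.clip(eigs, 0.0, None)            # guard against tiny negatives
    return float(np.sum(np.log2(1.0 + P * eigs / (n_tx * noise_var))))


# Example with a random 3 x 4 complex channel (hypothetical values).
rng = np.random.default_rng(0)
H_example = rng.standard_normal((3, 4)) + 1j * rng.standard_normal((3, 4))
rate_det = mimo_rate(H_example, P=5.0)
rate_eig = mimo_rate_from_eigs(H_example, P=5.0)
```

The eigenvalue form makes explicit why the text tracks the ordered eigenvalues of the channel: each one contributes its own log term to the rate.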
If Eve chooses to overhear the transmission by Alice (namely, the attack mode indicator ), the received signal at Eve will be
where represents the N-dimensional additive zero-mean white Gaussian noise vector, , . In this case, the achievable MIMO secrecy rate will be [2]
If Eve decides to block Alice’s transmission with a jamming signal (namely, the attack mode indicator ), and we assume , is the transmit power by Eve in the jamming mode, Bob will receive the following signal:
where represents the dimension white Gaussian noise at Bob, , . Then, the achievable secrecy rate can be given by [2]
If Eve chooses to spoof Bob by sending a fake signal with restricted transmit power when Alice is silent (namely, the attack mode indicator ), is the transmit power by Eve in the spoofing mode and Bob will receive the following signal:
The spoofing attack mode aims only at transmitting spoofing information to Bob. The achievable secrecy rate of Alice when Eve spoofs Bob can be given by [18]
where γ represents the spoofing message utility coefficient.
2.2. MIMO Transmission Game Problem Formulation
In order to cope with the smart attacker, the MIMO transmission scheme can be derived by employing a noncollaborative game framework, in which the following two utility functions of Alice and Eve are assumed [18]:
where represents the unit power consumption at Alice, , and if ; otherwise, . represents the cost for Eve to choose attack mode . Considering the secure MIMO transmission game denoted by with the game participants , game strategy , and utility functions , the Nash Equilibrium (NE) strategy should satisfy the following conditions:
The NE condition that is friendly to Alice and the question of how to completely suppress Eve’s attack motivation are highlighted in [18], in which a reinforcement learning-based power control strategy was also proposed to realize the attack-free game result (namely, ). Nonetheless, there exist multiple NE conditions, and the game result will not always favor Alice. In this paper, we focus on the case in which the game result favors the malicious attacker instead of the MIMO transmitter. Obviously, it is highly desirable to develop an effective mechanism to cope with the smart attacker in this case for better secure communication. This is exactly the problem that we highlight in this paper.
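To make the NE notion concrete, the sketch below enumerates the pure-strategy equilibria of a small two-player game in which Alice picks a row (for instance, a discrete transmit power level) and Eve picks a column (idle, eavesdrop, jam, or spoof). The payoff tables are hypothetical numbers chosen for illustration, not values from the paper, and the discretization of Alice's power is also an assumption.

```python
import numpy as np


def pure_nash_equilibria(u_alice, u_eve):
    """Return all (row, col) pairs where neither player gains by deviating alone."""
    u_alice = np.asarray(u_alice, dtype=float)
    u_eve = np.asarray(u_eve, dtype=float)
    equilibria = []
    for i in range(u_alice.shape[0]):
        for j in range(u_alice.shape[1]):
            # Alice cannot do better by switching rows while Eve keeps column j.
            alice_ok = u_alice[i, j] >= u_alice[:, j].max()
            # Eve cannot do better by switching columns while Alice keeps row i.
            eve_ok = u_eve[i, j] >= u_eve[i, :].max()
            if alice_ok and eve_ok:
                equilibria.append((i, j))
    return equilibria


# Rows: low power, high power. Columns: idle, eavesdrop, jam, spoof.
# Hypothetical utilities: high power makes every attack unprofitable for Eve.
u_alice_toy = [[1.0, 0.2, 0.1, 0.3],
               [1.5, 0.6, 0.4, 0.8]]
u_eve_toy = [[0.0, 0.5, 0.2, 0.1],
             [0.0, -0.1, -0.2, -0.3]]
ne_toy = pure_nash_equilibria(u_alice_toy, u_eve_toy)
```

In this toy table the only equilibrium pairs high power with an idle Eve, which mirrors the Alice-friendly outcome described for [18]; other payoff tables can just as easily yield an equilibrium in an attack column.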
3. Nash Equilibrium Analysis
3.1. Nash Equilibrium in support of Alice
Considering the utility functions in (8) and (9), as well as the Nash Equilibrium condition in (10) and (11), one may readily derive that, the NE condition at in support of Alice for the game can be achieved when the following conditions are satisfied:
In fact, the three threshold values , , and can be interpreted as the minimum reward expected by Eve if he decides to overhear Alice, to jam Bob, or to spoof Bob. The NE conditions in (12) imply that, when the attack reward realized by Eve is less than the predefined minimum expected reward, Eve will give up attacking; namely, the game between Alice and Eve favors Alice. For given minimum expected rewards , , and , whether the NE conditions at can be achieved depends on the underlying channel . These NE conditions cannot always be satisfied, especially when the channel changes.
3.2. Nash Equilibrium in support of Eve
In the same way, on the basis of (10) and (11), we may derive the Nash Equilibrium conditions in support of Eve for in Table 1.

(1) Nash Equilibrium in Eavesdrop Mode. By comparing the NE conditions with those in (12), one may readily observe that the game decision favors the eavesdrop attack mode when the predefined eavesdropping reward can be fulfilled and the relative eavesdropping gain of is the largest among the three attack modes. A further examination of the conditions reveals that a better channel between Alice and Eve makes the game more likely to settle on the eavesdrop attack mode. This agrees with intuition, because Eve is then closer to Alice and in a better position to overhear the transmission by Alice.
(2) Nash Equilibrium in Jam Mode. In the same way, by comparing the NE conditions with those in (12), one may readily observe that the game decision favors the jam attack mode if the predefined jamming reward can be fulfilled and the relative jamming gain of is the largest among the three attack modes. By carefully examining the conditions, we may observe that a reasonable channel between Alice and Bob together with a reasonable channel between Eve and Bob makes the game more likely to settle on the jam attack mode (one may see this from ). This agrees with intuition, because both Alice and Eve are then in good conditions to transmit to Bob.
(3) Nash Equilibrium in Spoof Mode. Similarly, by comparing the NE conditions with those in (12), one may readily conclude that the game decision favors the spoof attack mode if the predefined spoofing reward can be fulfilled and the relative spoofing gain of is the largest among the three attack modes. A further examination of the conditions reveals that a better channel between Eve and Bob makes the game more likely to settle on the spoof attack mode. This agrees with intuition, because Eve is then closer to Bob and in a good position to spoof the reception by Bob.
In order to explicate more clearly the above NE conditions, let us summarize briefly their dependency on the underlying channel gains of .
From the perspective of the attacker, a large (the Eve-Bob link is good) is a beneficial situation for Eve to choose either the jam or the spoof mode. If also becomes large (the Alice-Bob link is good), Eve will tend to jam. The NE in support of Eve may also occur when neither the Alice-Bob link nor the Eve-Bob link is in good condition, while the Alice-Eve link is reasonable. In this case, the game decision may lead Eve to overhear the transmission by Alice. The above discussion clearly shows that the game between Alice and Eve will not always favor Alice. When the wireless environment becomes adverse, such that Eve is inclined to attack, power control alone is no longer effective in suppressing Eve’s attack motivation. As we will address in Section 4, the adaptive probabilistic transmission design can then be utilized to discourage Eve from attacking.
4. New Adaptive Secure Transmission Design
4.1. Probabilistic Transmission Policy
In Section 3, we have shown that the NE at can be achieved when all the possible attack rewards of Eve are less than the predefined minimum expected rewards , , and . In this case, a reinforcement learning-based power control strategy can be employed to approach the NE at [18], at the cost of some increase in the required transmit power at Alice. However, when some attack reward of Eve is larger than the corresponding predefined minimum expected reward, the transmit power control strategy may no longer be effective in suppressing the attack. In order to better fulfill the secure transmission requirements and to realize reasonable secure transmission in adverse conditions, we propose to use a probabilistic transmission strategy. Unlike the original transmission scheme [18], in which Alice always transmits irrespective of the current channel conditions, in the proposed adaptive probabilistic transmission scheme Alice may decide to transmit in a probabilistic manner, especially when the game decision favors Eve. To this end, the utility function of Alice can be modified as follows:
where represents the transmission probability of Alice. One may readily observe that the above utility function reduces to the traditional utility function assumed in [18] when . We will show later that the proposed adaptive probabilistic transmission scheme subsumes the original adaptive transmission scheme in [18] as a special case by letting .
Because Eve will try to obstruct the communication between Alice and Bob with a certain paid cost, we may assume the worst case that Eve knows the probabilistic transmission control mechanism at Alice; the utility function at Eve can thus be revised as follows:
On the basis of the two updated utility functions in (13) and (14), we may derive the NEs in Table 2.

By comparing Tables 1 and 2, we may readily observe from the modified NE conditions how the proposed probabilistic transmission scheme further suppresses Eve’s attack motivation. With the transmission probability control mechanism, the three possible attack rewards and the associated relative attack rewards are reduced by a discount coefficient that is proportional to the transmission probability . This is the rationale by which the proposed probabilistic transmission suppresses Eve’s attack motivation.
Of course, the proposed probabilistic transmission scheme incurs some degradation in the achieved secrecy capacity, since Alice no longer attempts to transmit in every slot, regardless of how Eve reacts and of the underlying channel conditions. Nonetheless, as we will illustrate in Section 5, in adverse channel conditions where the traditional game strategy (power control strategy) fails to suppress Eve’s attack motivation, the proposed adaptive secure transmission scheme can still help to improve the secure transmission between Alice and Bob by discouraging Eve from attacking. Thus, some loss in the transmission opportunity is still worthwhile.
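The discounting effect of the transmission probability can be sketched with a deliberately simplified model. It assumes, purely for illustration, that Eve's utility in attack mode k is the transmission probability times a fixed gain minus a fixed cost, and that staying idle yields zero; the names `gains`, `costs`, and `rho` are hypothetical and do not reproduce the utilities in (13) and (14).

```python
def eve_best_mode(rho, gains, costs):
    """Return 0 if Eve prefers to stay idle, else the 1-based index of her best attack mode.

    Assumed model: utility of mode k is rho * gains[k] - costs[k]; idle utility is 0.
    """
    best_mode, best_utility = 0, 0.0
    for k, (gain, cost) in enumerate(zip(gains, costs), start=1):
        utility = rho * gain - cost
        if utility > best_utility:
            best_mode, best_utility = k, utility
    return best_mode


def max_deterring_probability(gains, costs):
    """Largest transmission probability at which no attack mode is strictly profitable."""
    ratios = [cost / gain for gain, cost in zip(gains, costs) if gain > 0]
    return min([1.0] + ratios)


# Hypothetical gains and costs for eavesdropping, jamming, and spoofing.
toy_gains = [2.0, 1.0, 0.5]
toy_costs = [0.5, 0.5, 0.5]
rho_max = max_deterring_probability(toy_gains, toy_costs)
mode_always_on = eve_best_mode(1.0, toy_gains, toy_costs)
mode_throttled = eve_best_mode(rho_max, toy_gains, toy_costs)
```

Under these assumed numbers, transmitting in every slot leaves eavesdropping profitable, whereas lowering the transmission probability to the returned threshold removes the incentive. The price is that Alice transmits less often, which is the secrecy-capacity loss noted above.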
4.2. Reinforcement Learning-Based Adaptive Secure Transmission Scheme
The Q-learning-based algorithm in [18] can be modified to derive both the optimal transmit power strategy and the optimal probabilistic transmission policy for Alice, thereby realizing the adaptive secure MIMO transmission scheme, as summarized in Algorithm 1. Specifically, stands for the Q function of Alice, in which s represents the system state and P and denote the two actions of Alice. indicates the maximum of over all possible actions, given the state s. The learning rate represents the weight of the current quality during the learning process, while is the discount factor that denotes the uncertainty of Alice about the future gains. Alice observes the strategy of Eve in the th slot and takes it as its current state . As time goes by, Alice is able to choose the optimal power control strategy and . In practical applications, we may set a minimum probabilistic transmission parameter according to the minimum transmission rate required by the Alice-Bob link.
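A generic tabular Q-learning loop over the joint action (transmit power, transmission probability) might look as follows. This is a sketch under stated assumptions rather than the authors' Algorithm 1: the epsilon-greedy exploration rule, the parameter names `alpha` (learning rate) and `delta` (discount factor), and the environment `toy_env` with its gain, cost, and power price are all hypothetical.

```python
import math
import random
from itertools import product


def q_learning(env_step, n_states, powers, probs, n_slots=2000,
               alpha=0.5, delta=0.7, epsilon=0.1, seed=0):
    """Tabular Q-learning over joint actions (transmit power, transmit probability)."""
    rng = random.Random(seed)
    actions = list(product(powers, probs))
    Q = [[0.0] * len(actions) for _ in range(n_states)]
    state = 0  # assumed initial state: Eve was idle in the previous slot
    for _ in range(n_slots):
        # Epsilon-greedy action selection (an assumed exploration rule).
        if rng.random() < epsilon:
            a = rng.randrange(len(actions))
        else:
            a = max(range(len(actions)), key=lambda i: Q[state][i])
        utility, next_state = env_step(state, actions[a])
        # Standard Q-learning update with learning rate alpha and discount delta.
        target = utility + delta * max(Q[next_state])
        Q[state][a] = (1 - alpha) * Q[state][a] + alpha * target
        state = next_state
    return Q, actions


def toy_env(state, action, gain=2.0, cost=0.8, power_price=0.1):
    """Hypothetical environment: Eve attacks iff her discounted reward exceeds her cost.

    The state (Eve's previous mode) is accepted for interface symmetry but unused here.
    """
    power, rho = action
    attacked = rho * gain > cost
    rate = math.log2(1.0 + power)
    utility = rho * (rate - (1.0 if attacked else 0.0)) - power_price * power
    return utility, int(attacked)  # next state: 1 if Eve attacked, else 0


Q, actions = q_learning(toy_env, n_states=2, powers=[1.0, 2.0], probs=[0.3, 0.6, 1.0])
greedy = {s: actions[max(range(len(actions)), key=lambda i: Q[s][i])] for s in range(2)}
```

Restricting `probs` to values at or above a chosen floor is one simple way to encode the minimum transmission probability mentioned in the text.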

5. Numerical Analysis
In order to show the applicability of the proposed adaptive probabilistic transmission scheme, we focus on the conditions in which the traditional game results favor Eve. By doing so, we show clearly that, when the conditions tend to support Eve, the traditional game fails to guarantee secure transmission from Alice to Bob. On this basis, we then highlight how the proposed probabilistic transmission policy can be utilized along with the power control strategy to improve the secure transmission from Alice to Bob by suppressing the attack motivation of Eve. In all simulations, unless otherwise stated, we assume , , , , , , and . In simulations, we assume and . is assumed to make sure that a very low transmission rate can be avoided.
5.1. No Attack Benchmark System
The following channel conditions are assumed in simulations as the benchmark setting: , , and . In this setup, although the Alice-Bob link is not the best among the three links, the transmit power control strategy at Alice can successfully suppress the attack motivation of Eve, as shown in Figure 2(b). As illustrated in Figure 2(c), at the cost of an increased transmit power at Alice, secure transmission from Alice to Bob can be realized (see Figure 2(a)). For comparison, the realized secrecy capacity, the attack mode probabilities of Eve, and the required transmit power at Alice when probabilistic transmission is added to the same benchmark system are illustrated in Figure 3. One may readily observe that, with the probabilistic transmission control strategy, the attack motivation of Eve is also suppressed and secure transmission from Alice to Bob is guaranteed. Meanwhile, less transmit power is required due to the probabilistic transmission strategy (Figure 3(c)). The cost is some loss in the achieved secrecy rate from Alice to Bob. In the following numerical analysis, we show that the proposed probabilistic transmission strategy can be combined with transmit power control to form a more robust transmission scheme in the presence of malicious attacks.
5.2. Ability to Suppress the Eavesdropping
Let us consider the following channel conditions as a typical eavesdropping setup: , , and . Compared with the benchmark system in Section 5.1, now the Alice-Eve link is the best among the three involved links, the Eve-Bob link is the worst, and the Alice-Bob channel remains unchanged. According to our analysis in Section 3.2, the traditional game decision now favors eavesdropping by Eve. As shown in Figure 4(a), the power control strategy alone fails to suppress Eve’s motivation to overhear the transmission by Alice. In fact, the eavesdropping probability of Eve is now about , as illustrated in Figure 4(b). We do not include the secrecy rate performance, since there is in fact no secure rate at all.
Obviously, it is highly desirable for Alice to find an effective mechanism to resist the attack by Eve and fulfill the secure transmission requirement. In the same eavesdropping setting, if the proposed adaptive probabilistic transmission scheme is utilized along with the power control strategy, the realized secrecy rate, the attack mode probabilities of Eve, the required transmit power, and the probabilistic transmission control at Alice are as illustrated in Figure 5. The game results now favor Alice again, as illustrated in Figure 5(b). One may also note that, by lowering the transmission probability of Alice, we can effectively suppress the attack motivation of Eve. As a result, secure transmission from Alice to Bob can be realized, as illustrated in Figure 5(a). Since the Alice-Eve link is noticeably better than the Alice-Bob link, there is some expected loss in the realized secrecy capacity when compared with the benchmark system, in which the same Alice-Bob link is assumed. From Figure 5(c), we may see that this loss can be explained by the low transmit power and the low transmission probability at Alice, which result from the game decision when the proposed probabilistic transmission policy is utilized.
5.3. Ability to Suppress the Spoofing
Let us consider the following channel conditions as a typical spoofing setup: , , and . Compared with the benchmark system in Section 5.1 and the eavesdropping system in Section 5.2, now the Eve-Bob link is the best among the three involved links, the Alice-Eve link is the worst, and the Alice-Bob channel remains unchanged. According to our analysis in Section 3.2, the traditional game decision now favors spoofing by Eve. As illustrated in Figure 6(b), the spoofing probability of Eve is now about . In order to resist the spoofing attack, Alice needs a relatively high transmit power to achieve a relatively small secrecy capacity, as illustrated in Figures 6(a) and 6(c). Here, we can clearly see that the power control strategy alone cannot effectively suppress Eve’s motivation to spoof the reception at Bob.
In the same spoofing setting, if the proposed adaptive probabilistic transmission scheme is utilized along with the power control strategy, the realized secrecy rate, the attack mode probabilities of Eve, the required transmit power, and the probabilistic transmission control at Alice are as illustrated in Figure 7. The game results now favor Alice again. As illustrated in Figure 7(b), the no-attack probability is now about . One may also note that, by lowering the transmission probability of Alice, we can effectively suppress the attack motivation of Eve. As a result, improved secure transmission from Alice to Bob can be realized, as illustrated in Figure 7(a). Comparing the required transmit power in Figures 6(c) and 7(c), one may readily observe that, with the proposed probabilistic transmission control mechanism, less transmit power is needed to counteract the spoofing attack by Eve while a better secrecy capacity is realized, which is attractive for practical applications.
5.4. Ability to Suppress the Jamming
Let us consider the following channel conditions as a typical jamming setup: , , and . Compared with the benchmark system in Section 5.1, the eavesdropping system in Section 5.2, and the spoofing system in Section 5.3, we now have a much better Alice-Bob link, the Eve-Bob link is in good condition, and the Alice-Eve link is the worst. According to our analysis in Section 3.2, the traditional game decision now favors jamming by Eve. As illustrated in Figure 8(b), the jamming probability of Eve is now about . Alice needs a relatively high transmit power to resist the jamming attack, as illustrated in Figure 8(c). Here, we can clearly see that the power control strategy alone cannot effectively suppress Eve’s motivation to obstruct the reception at Bob.
In the same jamming setting, if the proposed adaptive probabilistic transmission scheme is utilized along with the power control strategy, the realized secrecy rate, the attack mode probabilities of Eve, the required transmit power, and the probabilistic transmission control at Alice are as illustrated in Figure 9. The game results now favor Alice again. As illustrated in Figure 9(b), the no-attack probability is now about . One may also note that, by lowering the transmission probability of Alice, we can effectively suppress the attack motivation of Eve. As a result, improved secure transmission from Alice to Bob can be realized, as illustrated in Figure 9(a). Comparing the required transmit power in Figures 8(c) and 9(c), one may readily observe that, with the proposed probabilistic transmission control mechanism, less transmit power is needed to counteract the jamming attack by Eve while a better secrecy capacity is realized, which is desirable in practical applications.
In summary, our analysis results confirm that the proposed adaptive probabilistic MIMO transmission scheme does provide an effective method for the MIMO transmitter to better resist malicious attackers in adverse channel conditions, in which not only the transmit power of the MIMO transmitter but also the transmission probability is adjusted to suppress the attack motivation. Meanwhile, we can also conclude from the four typical settings that (1) when the adaptive transmit power policy is enough to suppress the attack motivation, the use of both the adaptive transmit power and the probabilistic transmission control leads to a more energy-efficient MIMO transmission scheme, but with some loss in the realized secrecy capacity; (2) when the adaptive transmit power policy is no longer enough to suppress the attack motivation, the use of both the adaptive transmit power and the probabilistic transmission control is highly recommended, not only in terms of attack motivation suppression but also from the perspective of the realized secrecy capacity and the improved energy efficiency. In terms of the realized secrecy capacity, one may also note that the proposed adaptive probabilistic transmission control policy appears particularly attractive in the spoofing and jamming scenarios.
6. Conclusion
In this paper, we focus on the noncollaborative game between a MIMO transmitter and one smart malicious attacker, both of which try to maximize their predefined utilities. By carefully analyzing the Nash Equilibrium (NE) and the critical conditions that affect the achieved NE, an adaptive probabilistic MIMO transmission scheme is proposed to help the MIMO transmitter better resist the malicious attacker in adverse channel conditions. Compared with the existing game-based strategy, the proposed adaptive probabilistic transmission scheme tunes not only the transmit power of the MIMO transmitter but also the transmission probability. Our analysis results reveal that the proposed adaptive probabilistic transmission can be regarded as a generalized version of the previous adaptive transmission scheme, which can significantly suppress the attack motivation of the smart attacker and improve the secrecy capacity, even if the adaptive power control strategy fails. Other sophisticated strategies can also be employed to further improve the adaptive secure MIMO transmission scheme. For instance, as in [20], when a full-duplex (FD) Bob is assumed, some subsets of the antennas at Bob can be utilized to send artificial noise to obstruct the eavesdropping by Eve. We leave this for future work.
Data Availability
All data in this work were generated by a simulation platform that we developed ourselves rather than taken from any existing data set. To make sure that the data were properly generated, we verified that the results are consistent with existing publications, for instance, [18] and [2].
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments
The work was supported in part by the National Natural Science Foundation of China (NSFC) under Grant no. 61771406 and the Key Research Program on Innovation in Industry, University and Research by Guangzhou University. The work of Q. Chen was also supported by the Lingnan Yingjie Project of Guangzhou Municipal Government.
References
[1] Y. Tung, S. Han, D. Chen, and K. Shin, “Vulnerability and protection of channel state information in multiuser MIMO networks,” in Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security (CCS ’14), pp. 775–786, Scottsdale, AZ, USA, November 2014.
[2] L. Xiao, J. Liu, Q. Li, N. B. Mandayam, and H. V. Poor, “User-centric view of jamming games in cognitive radio networks,” IEEE Transactions on Information Forensics and Security, vol. 10, no. 12, pp. 2578–2590, 2015.
[3] P. Baracca, N. Laurenti, and S. Tomasin, “Physical layer authentication over MIMO fading wiretap channels,” IEEE Transactions on Wireless Communications, vol. 11, no. 7, pp. 2564–2573, May 2012.
[4] A. Mukherjee and A. L. Swindlehurst, “Optimal strategies for countering dual-threat jamming/eavesdropping-capable adversaries in MIMO channels,” in Proceedings of the IEEE Military Communications Conference (MILCOM), pp. 1695–1700, San Jose, CA, USA, November 2010.
[5] X. He, H. Dai, P. Ning, and R. Dutta, “A stochastic multichannel spectrum access game with incomplete information,” in Proceedings of the IEEE International Conference on Communications (ICC), pp. 4799–4804, London, UK, June 2015.
[6] A. Garnaev, M. Baykal-Gursoy, and H. V. Poor, “Incorporating attack-type uncertainty into network protection,” IEEE Transactions on Information Forensics and Security, vol. 9, no. 8, pp. 1278–1287, 2014.
[7] A. Mukherjee and A. L. Swindlehurst, “Jamming games in the MIMO wiretap channel with an active eavesdropper,” IEEE Transactions on Signal Processing, vol. 61, no. 1, pp. 82–91, 2013.
[8] A. Garnaev and W. Trappe, “The eavesdropping and jamming dilemma in multichannel communications,” in Proceedings of the 2013 IEEE International Conference on Communications (ICC), pp. 2160–2164, Budapest, Hungary, June 2013.
[9] A. Garnaev, M. Baykal-Gursoy, and H. V. Poor, “A game theoretic analysis of secret and reliable communication with active and passive adversarial modes,” IEEE Transactions on Wireless Communications, vol. 15, no. 3, pp. 2155–2163, 2016.
[10] C. Xie and L. Xiao, “User-centric view of smart attacks in wireless networks,” in Proceedings of the 2016 IEEE International Conference on Ubiquitous Wireless Broadband (ICUWB), Nanjing, China, October 2016.
[11] L. Xiao, G. Sheng, X. Wan, W. Su, and P. Cheng, “Learning-based PHY-layer authentication for underwater sensor networks,” IEEE Communications Letters, vol. 23, no. 1, pp. 60–63, 2019.
[12] G. Chen, Y. Zhan, Y. Chen, L. Xiao, Y. Wang, and N. An, “Reinforcement learning based power control for in-body sensors in WBANs against jamming,” IEEE Access, vol. 6, pp. 37403–37412, 2018.
[13] A. Rahmati and H. Dai, “Reinforcement learning for interference avoidance game in RF-powered backscatter communications,” in Proceedings of the 2019 IEEE International Conference on Communications (ICC), Shanghai, China, May 2019.
[14] L. Xiao, Y. Li, C. Dai, H. Dai, and H. V. Poor, “Reinforcement learning-based NOMA power allocation in the presence of smart jamming,” IEEE Transactions on Vehicular Technology, vol. 67, no. 4, pp. 3377–3389, 2018.
[15] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, MIT Press, Cambridge, MA, USA, 1998.
[16] E. R. Gomes and R. Kowalczyk, “Dynamic analysis of multiagent Q-learning with ε-greedy exploration,” in Proceedings of the ACM 26th Annual International Conference on Machine Learning (ICML 09), pp. 369–376, Montreal, Canada, June 2009.
[17] M. Wunder, M. Littman, and M. Babes, “Classes of multiagent Q-learning dynamics with ε-greedy exploration,” in Proceedings of the ACM Annual International Conference Machine Learning (ICML), pp. 1167–1174, Haifa, Israel, 2010.
[18] Y. Li, L. Xiao, H. Dai, and H. V. Poor, “Game theoretic study of protecting MIMO transmissions against smart attacks,” in Proceedings of the IEEE International Conference on Communications, London, UK, May 2017.
[19] A. Goldsmith, Wireless Communications, Cambridge University Press, Cambridge, UK, 2005.
[20] L. Li, A. P. Petropulu, and Z. Chen, “MIMO secret communications against an active eavesdropper,” IEEE Transactions on Information Forensics and Security, vol. 12, no. 10, pp. 2387–2401, 2017.
Copyright
Copyright © 2020 Xujun Shen et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.