A Novel Optimal Strategy for Communication System in the Maritime Industry Based on Game Theory

Li, Weijie; Qi, Qin; Liu, Yifang

doi:https://doi.org/10.1155/2022/3996295

Wireless Communications and Mobile Computing


Special Issue

Artificial Intelligence and Computing on Industrial Applications


Research Article | Open Access

Volume 2022 | Article ID 3996295 | https://doi.org/10.1155/2022/3996295


Weijie Li,¹ Qin Qi,²,³,⁴ and Yifang Liu⁵

Academic Editor: Chia-Huei Wu

Received 02 Jul 2022

Accepted 20 Jul 2022

Published 22 Aug 2022

Abstract

In this paper, we employ convex optimization and the saddle point equation to find the optimal two-player payoff in the iterated rock-paper-scissors game. We also describe the equivalent payoff, written as a two-person non-zero-sum matrix, in the hypothetical game system, which provides a possible way to make quantitative analyses. In addition, we use interior-point methods to simulate the rock-paper-scissors game, and our numerical results verify the hypothesis that the payoff equation continues to hold as the payoff parameter changes and is not affected by other factors.

1. Introduction

It is challenging for human beings to make optimal decisions in noncooperative strategic interactions [1]. The best approach to rock-paper-scissors is to behave truly randomly: each choice is played about a third of the time, so the opponent cannot anticipate what comes next. The rock-paper-scissors (RPS for short) game, widely used to study competitive phenomena in society and biology, especially species diversity and pattern formation [2–7], offers a new way to study such decisions. The concept of Nash equilibrium (NE), developed under the assumption that the players are sufficiently rational to learn the strategies of competing players accurately and to optimize their own strategy accordingly, plays a fundamental role in both classic game theory and evolutionary game theory [1, 8–13]. Furthermore, much effort has been devoted to investigating the RPS game using a variety of models, such as the “rock-paper-scissors dynamics model” [14, 15], the scale-free memory model [16], and the cyclic dominance model [17–19]. Note that if rock were a dominant strategy for both players (i.e., the best choice whatever the opponent plays), the equilibrium of the game would be unique: both players would always choose rock.

Convex optimization is a reliable and efficient optimization method, which can obtain the global optimal solution precisely by selecting the appropriate algorithm. Convex optimization is frequently used to solve optimization problems because of its strong scalability and wide applications, which include electronic technology [20–23], software engineering [24–27], and machine learning [28]. It can be fully applied to game theory and sheds light on the optimal strategy in the iterated RPS game. When we check the payoff table for rock, paper, and scissors, however, we see that no such equilibrium exists. There is no pair of choices in which each player's selection is a best response to the other's, so the game has no pure-strategy Nash equilibrium.
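The absence of a pure-strategy equilibrium can be checked by brute force. The following is a minimal sketch, assuming a payoff of 1 for a tie, 0 for a loss, and a winning payoff `a` larger than 1; the names `payoff` and `pure_nash_equilibria` are hypothetical and introduced only for this illustration.

```python
from itertools import product

ACTIONS = ("R", "P", "S")
BEATS = {"R": "S", "P": "R", "S": "P"}  # each key beats its value


def payoff(mine, theirs, a=2.0):
    """Assumed payoffs: tie -> 1, win -> a, loss -> 0."""
    if mine == theirs:
        return 1.0
    return a if BEATS[mine] == theirs else 0.0


def pure_nash_equilibria(a=2.0):
    """Return all action pairs in which each action is a best response to the other."""
    equilibria = []
    for x, y in product(ACTIONS, repeat=2):
        best_x = max(payoff(alt, y, a) for alt in ACTIONS)
        best_y = max(payoff(alt, x, a) for alt in ACTIONS)
        if payoff(x, y, a) == best_x and payoff(y, x, a) == best_y:
            equilibria.append((x, y))
    return equilibria


print(pure_nash_equilibria(a=2.0))  # prints []
```

Under these assumed payoffs, a best response must beat the opponent's action, and two actions cannot beat each other, so the list is empty whenever the winning payoff exceeds the tie payoff.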

In this paper, we employ convex optimization (CVX for short) and the saddle point equation to optimize the two players’ payoffs in the iterated RPS game, which has a considerable amount to offer in both theoretical research and practical applications. A useful heuristic is to make a move that yields either a win or a draw, which ensures that you will not lose. For instance, if your opponent has thrown scissors twice, you can suppose that they will not play it a third time in a row and will play either rock or paper; in that case, paper either wins or draws. In order to verify the hypothesis of the study, we assign different values to the convex optimization model and then draw graphs with simulation software to find the pertinent rule, which shows the feasibility of the convex optimization method.

Rock-paper-scissors (also known as ro-sham-bo, and by other orderings of the three items) is a hand game usually played between two people, in which each participant simultaneously forms one of three shapes with an outstretched hand. The shapes are “rock” (a closed fist), “paper” (a flat hand), and “scissors” (a fist with the index finger and middle finger extended, forming a V). “Scissors” is the same as the two-fingered V sign (also denoting “victory” or “peace”), except that it is pointed horizontally rather than held vertically.

The marginal contribution of our study is that we introduce the two-player non-zero-sum matrix to describe the payoffs of the two players quantitatively, and we simulate the system with Newton’s method. We use the saddle point equation to calculate how players can obtain the maximal payoff and find that the EPRs of both players increase with the payoff parameter. This method takes into account the opponent’s previous moves in order to decide whether the opponent is inclined to choose one move over another. In essence, each of rock, paper, and scissors has a “score,” and the score of the move the opponent has just played is increased after every round.
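The move-scoring idea described above can be sketched as a simple frequency counter. This is an illustrative sketch only, not the authors' implementation; the class name `FrequencyPredictor` and the tie-breaking rule are assumptions.

```python
import random

BEATEN_BY = {"R": "P", "P": "S", "S": "R"}  # each value beats its key


class FrequencyPredictor:
    """Keeps one score per move and counters the opponent's most frequent move."""

    def __init__(self):
        self.scores = {"R": 0, "P": 0, "S": 0}

    def observe(self, opponent_move):
        # The score of the move the opponent just played is increased.
        self.scores[opponent_move] += 1

    def next_move(self, rng=random):
        # With no history, fall back to a uniformly random move.
        if not any(self.scores.values()):
            return rng.choice("RPS")
        top = max(self.scores.values())
        likely = sorted(m for m, s in self.scores.items() if s == top)
        # Ties between equally scored moves are broken at random (an assumption).
        return BEATEN_BY[rng.choice(likely)]


predictor = FrequencyPredictor()
for move in "SSR":
    predictor.observe(move)
print(predictor.next_move())  # scissors has the highest score, so this prints R
```

Such a predictor only helps against opponents whose play is not random; against a uniformly random opponent no counting scheme gains an edge.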

2. Model

For the sake of simplicity, we take a two-person non-zero-sum RPS game model as an example to study the noncooperative game system. In this game, each player plays innumerable rounds and can choose only one action among R, P, and S in each round, as shown in Figure 1. The payoff of the winning action is the only parameter of this game [1] (see Figure 2), and rational players make their decisions according to its value. Both players get a unit payoff when they choose the same action. Otherwise, the player whose action beats the opponent’s receives the winning payoff, while the losing player receives zero payoff.
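The payoff rule above can be written as a pair of 3×3 matrices. The sketch below is an assumption-laden illustration: the symbol `a` for the winning payoff and the function name `payoff_bimatrix` are hypothetical, chosen because the original symbols are not available here.

```python
ACTIONS = ("R", "P", "S")
BEATS = {"R": "S", "P": "R", "S": "P"}  # each key beats its value


def payoff_bimatrix(a):
    """Return (A, B): A[i][j] is the row player's payoff and B[i][j] the
    column player's payoff when row plays ACTIONS[i] and column plays ACTIONS[j]."""
    A = [[0.0] * 3 for _ in range(3)]
    B = [[0.0] * 3 for _ in range(3)]
    for i, row_action in enumerate(ACTIONS):
        for j, col_action in enumerate(ACTIONS):
            if row_action == col_action:
                A[i][j] = B[i][j] = 1.0   # tie: unit payoff for both players
            elif BEATS[row_action] == col_action:
                A[i][j] = a               # row wins, column gets zero
            else:
                B[i][j] = a               # column wins, row gets zero
    return A, B


A, B = payoff_bimatrix(3.0)
for row in A:
    print(row)
```

In this sketch the two payoffs in a cell sum to 2 on a tie and to `a` otherwise, so the game is constant-sum only when `a` equals 2, which is one way to see why the model is described as non-zero-sum.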

Figure 2

The payoff matrix: each element of the payoff matrix is the payoff from the row player to the column player. In its ordinary form, rock-paper-scissors is a simultaneous zero-sum game with only two kinds of outcome: a draw, or a win for one player and a loss for the other. A player who decides to play rock will beat another player who has chosen scissors (“rock smashes scissors” or, at times, “rock blunts scissors”) but will lose to a player who has picked paper (“paper covers rock”); a play of paper in turn loses to a play of scissors (“scissors cuts paper”). If both players choose the same shape, the game is tied and is usually replayed immediately to break the tie. The game originated in China and spread through increased contact with East Asia, and different variants of the signs developed over time. It is not feasible to gain an edge over a genuinely random opponent. However, it is possible to gain an important advantage by exploiting the psychological weaknesses of inherently nonrandom adversaries, and people do tend to be nonrandom players. As a result, competitions have been held for algorithms that play rock-paper-scissors.

During competitions, players often plan their sequence of three gestures before the tournament begins. Some tournament players use tactics to confuse or trick the other player into an illegal move, which results in a loss. One such tactic is to call out the name of one move while throwing another, in order to misdirect the opponent.

3. Results

3.1. Convex Optimization Equation

The expected payoff per round (EPR) of each of the two players is as follows:

Here, each player has three payoff probabilities, one for each of the actions R, P, and S.
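For a bimatrix game, the expected payoff per round of a player is the probability-weighted sum of that player's payoff entries. The sketch below assumes mixed strategies `x` and `y` (probabilities of R, P, S for the row and column player) and a winning payoff `a`; these names are illustrative assumptions and may differ from the authors' notation.

```python
def rps_matrix(a):
    """Row player's payoffs for actions ordered (R, P, S): tie 1, win a, loss 0."""
    return [[1.0, 0.0, a],
            [a, 1.0, 0.0],
            [0.0, a, 1.0]]


def expected_payoff(matrix, x, y):
    """Expected payoff sum_i sum_j x[i] * matrix[i][j] * y[j]."""
    return sum(x[i] * matrix[i][j] * y[j] for i in range(3) for j in range(3))


uniform = [1.0 / 3.0] * 3
A = rps_matrix(2.0)
print(expected_payoff(A, uniform, uniform))          # close to (1 + a) / 3 = 1.0
print(expected_payoff(A, [1.0, 0, 0], [0, 0, 1.0]))  # rock against scissors: 2.0
```

When both players mix uniformly, each of the nine cells has probability 1/9, which gives (3 + 3a)/9 = (1 + a)/3 under the assumed payoffs.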

Compared with the traditional payoff function, our payoff equation is strictly convex. The convex optimization problem is described by (3) and (4).

With the incidence matrices, we can rewrite problems (3) and (4) as (5) and (6).

Here, each player selects a strategy. As (1)–(4) above are convex, all their optima are global optima. In the game of rock-paper-scissors, two adversaries who throw their gestures at random each win, lose, or draw with equal probability. Such play makes it a game of sheer luck rather than skill; certainly, if everybody could play perfectly at random, nobody could gain an edge over anybody else.

In this paper, the system we study is a two-person non-zero-sum game. In order to study it quantitatively, we suppose that one player makes a decision first and the other player acts afterwards according to that decision. The rule that paper covers rock still makes sense when one object is larger than another: paper beats rock not because the rock is harmed, but because covering it renders the rock invisible and useless to the rest of the world. For rational players, one player wants to minimize a given payoff while the other wants to maximize it, and the roles are exchanged for the other payoff. With highly problematic assumptions on “rationality,” equilibrium solutions, information, and knowledge, game theory can become ineffective as an instrument for analysing real-world occurrences.

The best defense of the player who moves first is to choose the strategy that minimizes this payoff, while the other player should choose the strategy that maximizes one of the payoffs of the system.

We define , , and , where denote the values of the expected payoff (gain), respectively.

When and player wants to minimize , then he should maximize and its coefficient . That is to say, when and , is the minimal payoff. From the inner minimization in (8), we have . The other two cases () proceed similarly. Two-person games are the simplest form of competitive situation. Such games have only two players, and they are called zero-sum games when one player’s gain is exactly the other player’s loss.

The inner optimization can be described as follows, where each unit vector is all zeros except for a one in a single position, that is, a deterministic strategy. These optimization expressions are to be compared with the standard form of a convex optimization function. According to a study presented at the annual conference of the Academy of Management, the most difficult aspect of making decisions is not finding the proper answer but having the fortitude to act on that information.

In this paper, we introduce a scalar variable representing the value of the inner minimization:

Writing in matrix notation where , , , and

The symbol ≥ denotes “greater than or equal to,” and the symbol ≤ denotes “less than or equal to.” Thus, we have obtained the standard convex optimization equation of our model.

3.2. The Saddle Point Equation

The primal problem (11) is convex with convex payoff mainly decided by player .

The Lagrangian is

Thus, the dual function is

For each feasible pair of dual variables, the Lagrange dual function gives us a lower bound on the optimal value of the optimization problem (15). As an aside on the psychology of the game, throwing scissors is said to show a calm and crafty player who is waiting for a chance, much as one must be careful when cutting an object or opening a box with scissors; a player who believed that throwing paper would smother rock has simply served scissors a lovely dinner. Thus, we have a lower bound that depends on the dual parameters. Problem (15) is translated into the optimization problem:

This is called the Lagrange dual problem corresponding to problem (15). The Slater condition [28] says that strong duality between (17) and (18) holds if the quadratic inequality constraints are strictly feasible; i.e., if there exists an with , . Strong duality between (17) and (18) will be proven in the next section.
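As a hedged illustration of how a Lagrangian, a dual function, and a dual problem fit together, consider the linear-program form of a matrix game, in the style of the textbook treatment cited as [29]. The symbols $P$ (payoff matrix), $u$ (mixed strategy), $t$ (scalar bound), and the multipliers $\lambda$, $\mu$, $\nu$ are assumptions made for this sketch; they need not match the authors' notation or the numbered problems in this section.

```latex
% Assumed primal problem over (u, t)
\begin{aligned}
\text{minimize}\quad   & t \\
\text{subject to}\quad & u \succeq 0,\quad \mathbf{1}^T u = 1,\quad P^T u \preceq t\,\mathbf{1}.
\end{aligned}

% Lagrangian with multipliers \lambda \succeq 0, \mu \succeq 0, and \nu
\begin{aligned}
L(u,t,\lambda,\mu,\nu)
  &= t + \lambda^T\!\left(P^T u - t\,\mathbf{1}\right) - \mu^T u + \nu\left(1 - \mathbf{1}^T u\right) \\
  &= \nu + \left(1 - \mathbf{1}^T \lambda\right) t + \left(P\lambda - \nu\,\mathbf{1} - \mu\right)^T u.
\end{aligned}

% Dual function: the infimum over (u, t) is finite only if both coefficients vanish
g(\lambda,\mu,\nu) =
\begin{cases}
\nu     & \text{if } \mathbf{1}^T\lambda = 1 \text{ and } P\lambda - \nu\,\mathbf{1} = \mu, \\
-\infty & \text{otherwise.}
\end{cases}

% Dual problem after eliminating \mu \succeq 0
\begin{aligned}
\text{maximize}\quad   & \nu \\
\text{subject to}\quad & \lambda \succeq 0,\quad \mathbf{1}^T \lambda = 1,\quad P\lambda \succeq \nu\,\mathbf{1}.
\end{aligned}
```

In this assumed form, the dual problem is the other player's optimization over the mixed strategy $\lambda$, which is the sense in which the two players' problems are dual to each other.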

Now, suppose the order of play is reversed, so that the player who previously moved second chooses first and the other player responds. Following a similar argument, if the players follow the optimal strategy, the first mover should choose the strategy that minimizes the corresponding payoff, which results in a payoff equivalent to the problem in Equation (19). Equations (13) and (19) are the standard convex optimization payoff functions of the two players. Note that in game theory one player’s problem is dual to the other’s. Many powerful algorithms have been developed as a result of rock-paper-scissors programming competitions. For instance, Iocaine Powder, the winner of the First International RoShamBo Programming Competition in 1999, uses a heuristically designed compilation of strategies. For each strategy it deploys, it also contains six metastrategies, which defeat second-guessing, triple-guessing, second-guessing of the opponent, and so on.

The previous analysis assumes that the two players make their decisions in sequence. Now suppose that both players make their decisions at the same time. In comparison with Equations (13) and (19), we accordingly have

We call Equation (20) the saddle point equation. It is a description of the saddle point at which players get the maximal payoff. Equation (13) gives the left part of Equation (20), whereas (19) provides the right part of Equation (20).
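The equality of the two sides of a saddle point equation can be illustrated numerically. The sketch below is a simplification rather than the authors' model: it treats one player's payoff matrix as a zero-sum matrix game, assumes payoffs of 1 for a tie, 0 for a loss, and `a` for a win, and searches a grid of mixed strategies instead of solving a convex program. All function names are hypothetical.

```python
def simplex_grid(steps):
    """Yield every mixed strategy whose probabilities are multiples of 1/steps."""
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            yield (i / steps, j / steps, (steps - i - j) / steps)


def rps_matrix(a):
    """Row player's payoffs for actions ordered (R, P, S)."""
    return [[1.0, 0.0, a],
            [a, 1.0, 0.0],
            [0.0, a, 1.0]]


def max_min(A, steps=30):
    """Row player commits first and maximizes the worst-case payoff."""
    return max(
        min(sum(x[i] * A[i][j] for i in range(3)) for j in range(3))
        for x in simplex_grid(steps)
    )


def min_max(A, steps=30):
    """Column player commits first and minimizes the best-case payoff of the row player."""
    return min(
        max(sum(A[i][j] * y[j] for j in range(3)) for i in range(3))
        for y in simplex_grid(steps)
    )


A = rps_matrix(2.0)
print(max_min(A), min_max(A))  # both close to (1 + a) / 3 = 1.0
```

The grid step of 1/30 is chosen so that the uniform strategy (1/3, 1/3, 1/3) lies on the grid; with a coarser or misaligned grid the two sides would only agree approximately.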

Similarly, looking at the payoff of the other player, we have another saddle point equation

The above equation is also a description of the saddle point at which players can achieve the maximal payoff of the system we study. Analogously to Equation (20), its left part and right part correspond to Equations (13) and (19), respectively. Based on this, we denote the left and right parts of Equation (21) as

We use strong duality theorem to prove Equation (20). The same principle can also be used to prove Equation (21).

To prove it, first, it is noted that

This means that we can write the optimal value of the primal payoff as

We also have by the definition of the dual function. Thus, strong duality can be described as the equality . satisfies the strong max-min property or the saddle point property [29].

3.3. Optimization Algorithm

With linear equality and inequality constraints reduced to a sequence of linear constraint problems, the optimization problem in standard form is as follows:

We have the following conditions, which are called the Karush-Kuhn-Tucker (KKT) conditions [29].

Because problem (15) is convex, the KKT conditions are also sufficient for the points to be primal and dual optimal. Thus, we have

Then, and are primal and dual optimal, with zero duality gap. Interior-point methods solve problem (15) by applying Newton’s method to a sequence of equality constrained problems.
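The way an interior-point method applies Newton's method to a sequence of smooth problems can be seen on a one-dimensional toy problem. The sketch below is not problem (15): it minimizes `x` subject to `x >= 1`, whose centering problem `t*x - log(x - 1)` has the known minimizer `1 + 1/t`. The function names, the starting point, and the parameters `mu` and `eps` are assumptions for this illustration.

```python
def center(t, x, tol=1e-12, max_iter=50):
    """Newton's method on f(x) = t*x - log(x - 1), defined for x > 1."""
    for _ in range(max_iter):
        grad = t - 1.0 / (x - 1.0)
        hess = 1.0 / (x - 1.0) ** 2
        # Stop when the squared Newton decrement grad^2 / hess is small.
        if grad * grad / hess < tol:
            break
        step = -grad / hess
        # Halve the step until the iterate stays strictly feasible (x > 1).
        while x + step <= 1.0:
            step *= 0.5
        x += step
    return x


def barrier_method(x=2.0, t=1.0, mu=10.0, eps=1e-6):
    """Solve a sequence of centering problems with increasing t.

    With one inequality constraint the duality gap after centering is 1/t,
    so the loop stops once 1/t falls below eps."""
    while True:
        x = center(t, x)
        if 1.0 / t < eps:
            return x
        t *= mu


print(center(10.0, 2.0))   # close to 1 + 1/10
print(barrier_method())    # close to the constrained optimum x = 1
```

Each outer iteration starts Newton's method from the previous center, which is why the method needs only a few Newton steps per value of `t`.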

First, we rewrite problem (15) as an optimization problem in which the inequality constraints are implicit in the objective, using the indicator function for the nonpositive reals, which equals zero for nonpositive arguments and infinity otherwise.

Then, we define a function that approximates the indicator function by using the barrier method, with a parameter that determines the accuracy of the approximation. Like the indicator function, this approximation is convex and nondecreasing and takes on the value ∞ for positive arguments. Unlike the indicator function, it is differentiable and closed, and it increases to ∞ as its argument increases to 0.

Subsequently, we replace the indicator function with this approximation in (30).

The objective function here is convex, since the approximating function is convex, increasing in its argument, and differentiable [29]. We thereby obtain the logarithmic barrier function for problem (15). Finally, we obtain the gradient and Hessian of the logarithmic barrier function:
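For linear inequality constraints, the gradient and Hessian of a logarithmic barrier have a simple closed form. The sketch below assumes constraints written as `A x <= b` row by row, which may differ from the constraint form of problem (15); the function name `log_barrier` and the example data are hypothetical.

```python
import math


def log_barrier(A, b, x):
    """Value, gradient, and Hessian of phi(x) = -sum_i log(b_i - a_i^T x).

    Requires strict feasibility, that is, a_i^T x < b_i for every row a_i of A."""
    slacks = [bi - sum(aij * xj for aij, xj in zip(ai, x)) for ai, bi in zip(A, b)]
    if min(slacks) <= 0.0:
        raise ValueError("x must be strictly feasible")
    n = len(x)
    value = -sum(math.log(s) for s in slacks)
    # Gradient: sum_i a_i / (b_i - a_i^T x)
    grad = [sum(ai[k] / s for ai, s in zip(A, slacks)) for k in range(n)]
    # Hessian: sum_i a_i a_i^T / (b_i - a_i^T x)^2
    hess = [
        [sum(ai[k] * ai[l] / (s * s) for ai, s in zip(A, slacks)) for l in range(n)]
        for k in range(n)
    ]
    return value, grad, hess


# Example constraints: x1 <= 1, x2 <= 1, and x1 + x2 >= 0 (written as -x1 - x2 <= 0).
A = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
b = [1.0, 1.0, 0.0]
value, grad, hess = log_barrier(A, b, [0.2, 0.3])
print(grad)
print(hess)
```

The Hessian is a sum of positive semidefinite rank-one terms, which is consistent with the convexity of the barrier stated in the text.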

3.4. Verifying Hypothesis

The hypothesis in our study is that the saddle point equation continues to hold as the payoff parameter changes and is not affected by other factors. After assigning a specific value to the payoff parameter in the simulation software, we can calculate the left part of the saddle point equation easily using the CVX software package, and the calculation result of the right part is the same. Assigning another specific value to the payoff parameter again gives identical results for the left part and the right part. This means that, for either value, the left part of the saddle point equation stays equal to the right part. As indicated in Figure 3, the optimal value of the EPR of one player equals that of the other player, and it increases with the payoff parameter, approaching 0.95. Thus, we believe the numerical results are consistent with our theoretical hypothesis.
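A check of this kind can be imitated without a solver by bounding both sides of a saddle point equation with an explicit strategy. The sketch below is a simplified stand-in, not the authors' CVX experiment: it assumes payoffs of 1 for a tie, 0 for a loss, and `a` for a win, treats one player's payoff matrix as a matrix game, and uses exact fractions. The resulting numbers belong to this sketch only and are not the values reported in the paper.

```python
from fractions import Fraction


def rps_matrix(a):
    """Row player's payoffs for actions ordered (R, P, S), as exact fractions."""
    a = Fraction(a)
    one, zero = Fraction(1), Fraction(0)
    return [[one, zero, a],
            [a, one, zero],
            [zero, a, one]]


def saddle_bounds(a):
    """Return (lower, upper) bounds on the game value from the uniform strategy.

    lower: what the maximizing row player guarantees by mixing uniformly.
    upper: the most the row player can get if the column player mixes uniformly.
    Since max-min never exceeds min-max, lower == upper forces both sides to agree."""
    A = rps_matrix(a)
    uniform = [Fraction(1, 3)] * 3
    lower = min(sum(uniform[i] * A[i][j] for i in range(3)) for j in range(3))
    upper = max(sum(A[i][j] * uniform[j] for j in range(3)) for i in range(3))
    return lower, upper


for a in (1, 2, 5, 50):
    lower, upper = saddle_bounds(a)
    print(a, lower, upper, lower == upper)
```

In this simplified setting the two bounds coincide at (1 + a)/3 for every tested value of `a`, so the left and right parts agree regardless of the payoff parameter.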

(a) The optimal value of EPR of player , when -50

(b) The optimal value of EPR of player , when -50

4. Conclusion

We have demonstrated in this paper how players can obtain the optimal payoff in the two-player iterated RPS game with the convex optimization method and the saddle point equation, and we have verified the hypothesis of the payoff equation with interior-point methods by using simulation software. Hence, it can be concluded that convex optimization is a feasible method to maximize the payoff in the two-player iterated RPS game regardless of other factors. Furthermore, the research method is operational, and the results of our study can be extended to game systems such as social cycling, species competition, elections, and economic issues and can provide insight into further related quantitative research.

Rock-paper-scissors (RPS) is not only a game popular with children but also a basic and classic model system for studying in depth the mechanism of decision-making in noncooperative strategic interactions. RPS is a topic of increasing interest and significance because it helps improve our understanding of many complex competition issues (species divergence, price cycling, human decision-making, rationality and cooperation, and so on). It is also a starting point for entering the interdisciplinary field between statistical physics and game theory [9].

As stated in Bi and Zhou’s paper [8], cooperation in a finite-population RPS game system with more than two players may be much more difficult and complex to achieve than in the two-player case. In this paper, we provide only the simplest model to probe the optimal strategy in the iterated RPS game; much more complicated related research needs to be carried out in the future.

Data Availability

The figures used to support the findings of this study are included in the article.

Conflicts of Interest

The authors declare no conflicts of interest.

Authors’ Contributions

W.J.L. was responsible for conceptualization. Y.F.L. was responsible for the software. Q.Q. and Y.F.L. were responsible for writing the original draft. Q.Q. was responsible for writing, review, and editing. All authors read and agreed to the published version of the manuscript.

Acknowledgments

The work was supported by the Medical-engineering Interdisciplinary Project funded by University of Shanghai for Science and Technology (No. 1020308412).

References

[1] Z. Wang, B. Xu, and H. J. Zhou, “Social cycling and conditional responses in the rock-paper-scissors game,” Scientific Reports, vol. 4, no. 7, pp. 1–7, 2014.
[2] J. M. Biernaskie, A. Gardner, and S. A. West, “Multicoloured greenbeards, bacteriocin diversity and the rock-paper-scissors game,” Journal of Evolutionary Biology, vol. 26, no. 10, pp. 2081–2094, 2014.
[3] R. A. Laird, “Population interaction structure and the coexistence of bacterial strains playing ‘rock–paper–scissors’,” Oikos, vol. 123, no. 4, pp. 472–480, 2014.
[4] S. Loertscher, “Rock-scissors-paper and evolutionarily stable strategies,” Economics Letters, vol. 118, no. 3, pp. 473-474, 2013.
[5] Q. He, M. Mobilia, and U. C. Täuber, “Spatial rock-paper-scissors models with inhomogeneous reaction rates,” Physical Review E Statistical Nonlinear & Soft Matter Physics, vol. 82, no. 1, pp. 1–11, 2010.
[6] B. Sinervo, B. Heulin, Y. Surget-Groba et al., “Models of density-dependent genic selection and a new rock-paper-scissors social system,” American Naturalist, vol. 170, no. 5, pp. 663–680, 2007.
[7] B. Kerr, M. A. Riley, M. W. Feldman, and B. J. M. Bohannan, “Local dispersal promotes biodiversity in a real-life game of rock-paper-scissors,” Nature, vol. 418, no. 6894, pp. 171–174, 2002.
[8] Z. Bi and H. J. Zhou, “Optimal cooperation-trap strategies for the iterated rock-paper-scissors game,” PLoS One, vol. 9, no. 10, pp. 1–6, 2014.
[9] H. J. Zhou, “The rock-paper-scissors game,” Contemporary Physics, vol. 57, no. 2, pp. 151–163, 2016.
[10] E. Bahel, “Rock-paper-scissors and cycle-based games,” Economics Letters, vol. 115, no. 3, pp. 401–403, 2012.
[11] A. V. D. Nouweland, “Rock-paper-scissors a new and elegant proof,” Department of Economics-Working Papers Series, vol. 86, no. 43, pp. 178–184, 2007.
[12] E. Bahel and H. Haller, “Cycles with undistinguished actions and extended rock-paper-scissors games,” Economics Letters, vol. 120, no. 3, pp. 588–591, 2013.
[13] P. W. Goldberg, “A survey of PPAD-completeness for computing Nash equilibria,” Surveys in Combinatorics, vol. 3, pp. 1–32, 2011.
[14] E. Wesson and R. Rand, “Hopf bifurcations in delayed rock-paper-scissors replicator dynamics,” Dynamic Games & Applications, vol. 6, no. 1, pp. 1–18, 2016.
[15] D. Semmann, H. J. Krambeck, and M. Milinski, “Volunteering leads to rock-paper-scissors dynamics in a public goods game,” Nature, vol. 425, no. 6956, pp. 390–393, 2003.
[16] I. Lubashevsky and S. Kanemoto, “Scale-free memory model for multiagent reinforcement learning. Mean field approximation and rock-paper-scissors dynamics,” European Physical Journal B, vol. 76, no. 1, pp. 69–85, 2010.
[17] C. M. Postlethwaite and A. M. Rucklidge, “Spirals and heteroclinic cycles in a spatially extended rock-paper-scissors model of cyclic dominance,” Europhysics Letters, vol. 17, no. 9, article 48006, 2016.
[18] G. Verma, K. Chan, and A. Swami, “Zealotry promotes coexistence in the rock-paper-scissors model of cyclic dominance,” Physics, vol. 92, no. 5, article 052807, 2015.
[19] A. Szolnoki and M. Perc, “Zealots tame oscillations in the spatial rock-paper-scissors game,” Physical Review E, vol. 93, no. 6-1, article 062307, 2016.
[20] D. P. Palomar, J. M. Cioffi, and M. A. Lagunas, “Joint Tx-Rx beamforming design for multicarrier MIMO channels: a unified framework for convex optimization,” IEEE Transactions on Signal Processing, vol. 51, no. 9, pp. 2381–2401, 2003.
[21] A. B. Gershman, N. D. Sidiropoulos, S. Shahbazpanahi, M. Bengtsson, and B. Ottersten, “Convex optimization-based beamforming,” Signal Processing Magazine IEEE, vol. 27, no. 3, pp. 62–75, 2010.
[22] S. Joshi and S. Boyd, “Sensor selection via convex optimization,” IEEE Transactions on Signal Processing, vol. 57, no. 2, pp. 451–462, 2009.
[23] F. Altenbach, S. Corroy, G. Böcherer, and R. Mathar, “Strategies for distributed sensor selection using convex optimization,” in 2012 IEEE Global Communications Conference (GLOBECOM), pp. 2367–2372, Anaheim, CA, USA, December 2012.
[24] J. Mattingley and S. Boyd, “CVXGEN: a code generator for embedded convex optimization,” Optimization and Engineering, vol. 13, no. 1, pp. 1–27, 2012.
[25] J. Mattingley and S. Boyd, “Real-time convex optimization in signal processing,” IEEE Signal Processing Magazine, vol. 27, no. 3, pp. 50–61, 2010.
[26] S. P. Boyd, “Real-time embedded convex optimization,” Ifac Proceedings Volumes, vol. 42, no. 11, p. 9, 2009.
[27] T. Wang, R. Jobredeaux, M. Pantel, P. L. Garoche, E. Feron, and D. Henrion, “Credible autocoding of convex optimization algorithms,” Optimization and Engineering, vol. 17, no. 4, pp. 781–812, 2016.
[28] Y. Ploengpit and T. Phienthrakul, “Rock-paper-scissors with Myo Armband pose detection,” in 2016 International Computer Science and Engineering Conference (ICSEC), pp. 1–5, Chiang Mai, Thailand, December 2016.
[29] S. Boyd and L. Vandenberghe, “Convex optimization,” IEEE Transactions on Automatic Control, vol. 51, no. 11, pp. 1859–1859, 2006.

Copyright

Copyright © 2022 Weijie Li et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
