Volume 2013 (2013), Article ID 754398, 12 pages
A Tree Formulation for Signaling Games
Department of Economics, City University London, Northampton Square, London EC1V 0HB, UK
Received 15 February 2013; Accepted 10 May 2013
Academic Editor: Dimitrios P. Tsomocos
Copyright © 2013 Xeni Dassiou and Dionysius Glycopantis. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
The paper has as a starting point the work of the philosopher Professor D. Lewis. We provide a detailed presentation and complete analysis of the sender/receiver Lewis signaling game using a game-theoretic, extensive-form decision tree formulation. It is shown that there are a number of Bayesian equilibria. We explain which equilibrium is the most likely to prevail. Our explanation provides an essential step for understanding the formation of a language convention. The informational content of signals is discussed and it is shown that a correct action is not always the result of a truthful signal. We allow for this to be reflected in the payoff of the sender. Further, concepts and approaches from neighbouring disciplines, notably economics, suggest themselves immediately for interpreting the results of our analysis (rational expectations, self-fulfilling prophecies).
The philosopher Professor Lewis, writing on the origins and process of formation of language, discusses signaling games between a sender, who sends a signal, and its receiver. In the Lewis formulation the sender is aware of the state of the world, but the receiver is not. There are a number of alternative states and nature chooses one at random, that is, with a certain probability. Once the sender knows the state chosen, there are various signals that he can use. There are a number of alternative actions that a receiver can take in response to the signal received.
Following the specific actions of the sender and the receiver there are payoffs awarded to both of them. These rewards express, for example, the utilities or money obtained from the combination of their actions. We note in particular the games of common interest, in which a resolution leads to optimal payoffs for both actors, as noted by Skyrms.
However this type of analysis is not always complete. Notably there is little discussion of the case where the action of the receiver may be appropriate to the state of nature even if the signal sent is not. There is no discussion of what happens to the payoffs of the two agents when this is the case. Lewis makes an attempt to discuss what constitutes “true” and “untrue” signals and responses in a signaling system. We discuss these issues below.
In terms of informal signaling conventions, Lewis offers the simple but illuminating example of a helper (H), referred to also as he, standing behind a truck gesturing to the driver (D), referred to also as she, to help her steer the truck into a narrow parking space. We can assume that Nature consists of the particular position of the truck and of the parking space and that there are two such combined alternatives, each with probability 1/2.
The signaling behaviour and the action that follows are what Lewis describes as a “conventional regularity” of unwritten rules of parking gestures based on experience to which all parties conform. In terms of the highest payoffs the common interest objective of H and D is to get the truck into the space. In this situation, as he puts it:
“The helper gestures as he does because he expects the driver to respond as he does, and the driver responds as he does because he expects the helper to gesture as he does.” (page 127)
This means that the expectations of the actors are self-fulfilling and they both receive their optimal payoffs. One can borrow concepts and approaches from neighbouring disciplines, notably game theory and economics. They can provide a formal interpretation of the outcome in terms of existing concepts in those areas (rational expectations, self-fulfilling prophecies) using a fixed-point theorem from mathematics. We return to this point below.
Of course this is not always the case. A deviation from this rule, perhaps based on lack of trust, could lead to one or both of them getting inferior payoffs. The issue is to investigate whether a “correct” interpretation of the signals is possible which would lead to such an equilibrium position being reached. In such a situation no one would wish to independently change his action if all the information was revealed and moreover the payoffs would be optimal.
More formally, in a signaling problem there are the following alternative states of nature . These are observed, let us say, by one sender, who will send a signal concerning the information received, that is, the state chosen by nature. The receiver then has to choose an action without knowing the state of nature. The sender compiles a set of alternative signals using a function . This is an encoding rule according to Blume . In other words, is the function that translates the states of nature into communicated signals. In order to be able to identify eventually the state with a signal we require .
Clearly there will be possible signaling systems, that is, alternative ’s functions. Suppose next that the set of actions available to are . Given a signal we define . This is a decoding rule, again according to Blume. The function is designed to translate actions based on a signal received into a maximum payoff for . The signal received and the action chosen combined with the state of nature implied by should lead to ’s maximum payoff. will be called a signaling convention. It is of course true that differently designed signaling systems combined with an association between actions and states of nature could produce the same result.
Rubinstein [4, Chapter 2] stresses the difficulty of formulating communication models as game theory models, since solutions in the latter are invariant to a change in the names of the actions that lead to these outcomes; that is, an alternative convention has the same outcome. In other words the “sender and the receiver are uncertain about each other’s language use” [3, page 515].
In the existing literature the prevailing assumption is that for each state there is one action that has to be selected. If the receiver takes the “correct” action then maximum payoffs will be received by both the sender and the receiver irrespective of what the signal sent by the sender was (see, e.g., Lewis, Pawlowitsch, and Huttegger).
A Nash equilibrium (NE) is a pair of strategies (actions) of the players which are in terms of payoffs best replies to each other’s. Huttegger distinguishes between strict Nash equilibria (i.e., with strictly greater payoffs) and nonstrict Nash equilibria. He argues that it is the former that lead to signaling systems as they ensure a one-to-one correspondence between signals and actions.
If there is more than one strict Nash equilibrium, then the appropriate system is selected either through salience (Lewis), through “cheap talk” (Crawford and Sobel, Farrell), or through evolution (van Rooij [9, 10]). The latter approach uses a system of replicator dynamics such as the one used by Taylor and Jonker, in which the frequency of agents with above average payoff increases. However, as Pawlowitsch shows, replicator dynamics will not always converge to an evolutionarily stable system.
In our paper we move one step further by distinguishing between a true signal and a correct action. Truthfulness leads to both the sender and the receiver receiving their maximum payoff. However a correct action is not always the result of a truthful signal. The receiver may still “correctly” guess the state of nature although the sender sends the “wrong” signal. Then in our model the receiver will receive the maximum payoff but the sender will not.
Huttegger hints at this possibility by discussing a case where the sender deliberates about what signal to send, but the receiver is nonetheless fast or lucky enough to choose the right act. However the payoffs in this model (and typically the payoffs of any Lewis-based signaling game) assume a common interest between senders of information and receivers. Intuitively one would expect that if there is a distinction between correctness and truthfulness, it would be easier to reach a unique conventional signaling system without any need to use salience or some other form of communication to establish the supremacy of a truthful signaling system. We reach the surprising conclusion that this is not the case. We then discuss ways in which a conventional signaling system may be achieved by reverting to notions of saliency, communication, or evolution through Bayesian updating.
The ideas and analysis in this area lend themselves to a game theoretical treatment and this is what we discuss below in the context of a specific game. The philosophical and the game theoretical economic approaches can profit from and enrich each other.
2. A Game Tree Approach
We introduce here the game theoretic analysis by concentrating on an H/D example which is a detailed elaboration of the discussion by Lewis. The analysis is in terms of an appropriate, extensive-form tree formulation of a noncooperative signaling game. The setup is one with asymmetric information. The player who is informed, that is, the helper here, moves first to signal, truthfully or not, his information to the receiver, that is, the driver. The relevant concepts and ideas are discussed briefly; for details, see, for example, Binmore, and Osborne and Rubinstein.
Figure 1 gives the simple tree formulation of our game. This is a game of imperfect information. In the information set , enclosed by a parallelogram, player does not know whether she is at nodes or . Similarly when enters information set , she does not know exactly where she is. Obviously the actions from the nodes of an information set are identical.
There are two equally probable states of nature; a truck must move Left () or Right (), each with probability 1/2, and then two signals can be made by . These are “left” and “right” and they are indicated by and from point 1 and by and from point 2. Point 1 appears if “Nature” plays and point 2 if it plays .
On the other hand, can choose from two alternative actions having heard the signal but without knowing the state of nature. Therefore she does not know whether she is at node or of , when she hears , neither if she is at node or of , when she hears . In she can choose from actions and and in from actions and . A play of the game starts from the initial node to a terminal point, where the game ends. The payoffs at the terminal nodes refer to the payoffs of , the upper number, and of , the lower number. The expected payoffs for and are denoted throughout the paper by and , respectively.
Pure strategies are rules that tell each agent what action to choose from each information set. They can be played with probabilities as mixed strategies. The pure strategies of are , where, for example, means that he plays , that is, “left” from point 1 and , that is, “right” from point 2.
The pure strategies of are from information sets and , respectively. For example means that plays from information set and from information set .
’s mixed strategy is , where, for example, means that plays from information set and from information set with probability . Of course for and they sum up to 1. The mixed strategies of are analogously defined.
We are looking at the idea of an equilibrium in such a game. The general proof of existence of an equilibrium in the setup of abstract mathematical spaces is based on a fixed-point theorem. Under appropriate technical assumptions, a function from a space to itself has a fixed point, that is, an element of the domain which is mapped to itself. There are a number of such theorems, depending on the generality of the mathematical conditions imposed. Of course in particular explicit examples, like the one we are investigating, we do not have to go through a formal proof of existence to confirm an equilibrium.
First we consider the existence of an NE. A number of players with their own individual sets of (mixed) strategies are given and payoffs depend on everybody’s actions. A set of (mixed) strategies, one for each player, is an NE if each player’s choice is a best response, in terms of payoff, given the other players’ strategies.
The payoffs show that when H reveals the “correct” signal and the correct action (the one corresponding to the actual state of the world) by D follows, then they both receive a payoff of 1. If H communicates an incorrect signal and this leads to an incorrect action by D, then both players get 0. If H chooses an incorrect signal and D reacts counter to the signal indicated, then D gets 1 for performing the correct action but H ends up with zero.
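The payoff rule just described can be written as a small function. What follows is a minimal sketch rather than the authors' own formulation. The names `payoffs` and `expected_payoffs` and the labels for states, signals, and actions are chosen here for illustration. One case is not spelled out in the paragraph above, namely a truthful signal followed by an incorrect action; the sketch assumes that both players then get 0.

```python
PRIOR = {"L": 0.5, "R": 0.5}  # two equally probable states of nature


def payoffs(state, signal, action):
    """Return (helper payoff, driver payoff) at a terminal node.

    state and action are "L" or "R"; signal is "l" or "r".
    """
    truthful = signal.upper() == state
    correct = action == state
    driver = 1 if correct else 0
    # Assumption: the helper is rewarded only when a truthful signal
    # is followed by the correct action.
    helper = 1 if (truthful and correct) else 0
    return helper, driver


def expected_payoffs(helper_rule, driver_rule):
    """Expected payoffs of a pair of pure strategies.

    helper_rule maps each state to a signal,
    driver_rule maps each signal to an action.
    """
    e_helper = e_driver = 0.0
    for state, prob in PRIOR.items():
        signal = helper_rule[state]
        action = driver_rule[signal]
        h, d = payoffs(state, signal, action)
        e_helper += prob * h
        e_driver += prob * d
    return e_helper, e_driver


truthful = {"L": "l", "R": "r"}
follow = {"l": "L", "r": "R"}
print(expected_payoffs(truthful, follow))
```

Passing a misreporting rule for the helper and an inverting rule for the driver gives the situation in which the action is correct although the signal is not.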
We set the probability that plays in as , while is the probability that plays in , is the probability that plays in , and finally is the probability that plays in . We note that and , shown in Figure 1, can be also thought of as behavioural strategies describing how chooses between the action from an information set. We shall return to this immediately below. At this stage and give the specific way the probabilities are combined.
We can use the probabilities attached to the choices from and , described in Figure 1, to fold the tree up, as seen in Figure 2, given the choices of the strategies and their probabilities and can calculate the expected payoffs of and from left to right of the new terminal nodes as , , and . We emphasize that these are expected payoffs of the two agents, conditional on the choice made by Nature and . We shall use this information in considering the possibilities for NE.
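The folding step can be illustrated numerically. The sketch below is only an illustration under stated assumptions: `p` and `q` are names chosen here for the probabilities that the driver moves left after hearing “left” and after hearing “right”, respectively, and the terminal payoffs assume that the helper gets 1 only when a truthful signal is followed by the correct action. Neither the names nor that last payoff case are taken from Figure 1.

```python
def payoffs(state, signal, action):
    """(helper, driver) payoffs; the truthful-but-wrong case is assumed to pay 0."""
    truthful = signal.upper() == state
    correct = action == state
    return (1 if (truthful and correct) else 0, 1 if correct else 0)


def fold_tree(p, q):
    """Expected payoffs at the four nodes that remain after folding.

    p: probability that the driver plays "L" after the signal "l"
    q: probability that the driver plays "L" after the signal "r"
    Returns a dict keyed by (state, signal) with (helper, driver) values.
    """
    prob_left = {"l": p, "r": q}
    folded = {}
    for state in ("L", "R"):
        for signal in ("l", "r"):
            e_helper = e_driver = 0.0
            for action, prob in (("L", prob_left[signal]),
                                 ("R", 1.0 - prob_left[signal])):
                h, d = payoffs(state, signal, action)
                e_helper += prob * h
                e_driver += prob * d
            folded[(state, signal)] = (e_helper, e_driver)
    return folded


for node, value in fold_tree(0.75, 0.25).items():
    print(node, value)
```

Each entry is conditional on the state chosen by Nature and on the signal sent, which is how the folded tree is read in the text.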
Next we consider a perfect Bayesian equilibrium (PBE). It consists of a set of players’ optimal behavioural strategies, that is, independent distributions over the actions available at each information set, and, consistent with these, a set of beliefs which attach a probability distribution to the nodes of each information set. Consistency requires that the decision from an information set is optimal given the particular player’s beliefs about the nodes of this set and the available information. If the optimal play of the game enters an information set, the updating of beliefs must be Bayesian. Otherwise appropriate beliefs are assigned arbitrarily, within limits, to the nodes of the set. The assignment of beliefs is a characteristic feature of a PBE.
The proof of existence of a PBE is also based on a fixed-point theorem. In our explicit example, we can simply check directly that a set of behavioural strategies satisfies the properties which characterize an equilibrium.
Player uses independent distributions to choose between the action of the information sets. For example, she spins a wheel to decide between and in and a wheel divided differently for choosing when she is in . The same principle applies if information sets are singletons, that is, consist of the single point. This applies to sets 1 and 2 belonging to . In order to avoid introducing too much notation, we use again and , for the behavioural strategies in and , respectively. For ’s behavioural strategies at points 1 and 2 we shall use the independent distributions and for choosing between and and and , respectively.
In calculating PBEs, the behavioural strategies of will be used to fold the tree up, as seen in Figure 2, and the expected payoffs of and from left to right of the new terminal nodes are , , , and . Of course, as mentioned above, these are expected payoffs following the choice of nature and .
Our detailed presentation and analysis of the model show that there exists more than one equilibrium. This forces the analysis into a further argument for the choice of the most reasonably expected one.
Finally we note that the structure of our model is such that there is a complete, one-to-one correspondence between NE and PBE. In a number of cases, for example, in Section 2.3, the Nash equilibria are in pure strategies and as such they imply behavioural strategies as well. Also in all other cases here the Nash strategies define beliefs and consistent optimal behavioural strategies.
2.1. Lack of an NE or a PBE
We now show that for , there is no NE and no PBE either. We point out that, as it is shown in Appendix B, any such pair can be realized by a set of feasible s. We consider combinations of strategies.
Proposition 1. For , there is no NE and therefore no PBE either.
Proof. The proof consists of considering the implications of a number of cases in which strategies of and are combined and it is given in Appendix A.1.
Next we consider the possibility of a PBE where the distributions on the choices between the information sets are taken to be independent of each other. We consider the case . Suppose that has chosen some pair of distributions referring, as explained previously, to decisions from points 1 and 2. As argued above, the pairs and are not optimal. Hence there is no PBE either. In general if there is no NE there is no PBE either. One of the requirements for obtaining a PBE is that the proposed strategies are optimal responses to each other.
2.2. The Existence of NE and PBE
We now characterize all equilibria.
Proposition 2. There exists an NE, and hence a PBE, with on the boundary sections of the feasible set .
Proof. The proof consists of considering the implications of a number of combinations of strategies of and and it is given in Appendix A.2.
Finally we note that under all possible circumstances and corresponding to is optimal. It gives to both players a maximum possible expected payoff. In particular it is precisely then that obtains this.
We want to consider all possible feasible pairs . It is instructive to look also at the reasons why certain combinations of strategies do not form an equilibrium. In Appendix A.3 we analyse a number of cases in which are on the boundary of the feasible set . Then in Section 2.3 we shall consider the realistic scenario where both and use pure strategies.
2.3. The Calculation of NE and PBE for Pure Strategies of
In the end, we are interested in the players making, from their information sets, decisions with probability one. In particular we want to analyse the case when H will instruct D, with probability 1, to turn “left” or “right” and D will also play a pure strategy. Eventually we want to know which combination of pure strategies is most likely to prevail; that is, whether the signal of H will be truthful and whether D will believe it.
The behavioural assumption is that every agent chooses his best strategy given the strategy of the other. That is, in effect, a reaction function is formed. If each player optimizes believing, prophesying, a particular strategy for the other, and the outcome is that there is no reason for anybody to feel they have predicted wrongly, then we have an equilibrium which has been obtained rationally. The confirmation of the predictions takes place where the reaction functions intersect.
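The intersection of reaction functions can be checked by brute force for pure strategies. The following is a hedged sketch, not the procedure used in the appendices: the function names are hypothetical, and the terminal payoffs again assume that the helper gets 1 only when a truthful signal is followed by the correct action, with the driver getting 1 whenever her action is correct.

```python
from itertools import product

STATES = ("L", "R")
SIGNALS = ("l", "r")
ACTIONS = ("L", "R")


def payoffs(state, signal, action):
    """(helper, driver) payoffs; the truthful-but-wrong case is assumed to pay 0."""
    truthful = signal.upper() == state
    correct = action == state
    return (1 if (truthful and correct) else 0, 1 if correct else 0)


def expected(h, d):
    """h = (signal in state L, signal in state R);
    d = (action after "l", action after "r")."""
    e_helper = e_driver = 0.0
    for i, state in enumerate(STATES):
        signal = h[i]
        action = d[SIGNALS.index(signal)]
        ph, pd = payoffs(state, signal, action)
        e_helper += 0.5 * ph
        e_driver += 0.5 * pd
    return e_helper, e_driver


H_STRATEGIES = list(product(SIGNALS, repeat=2))
D_STRATEGIES = list(product(ACTIONS, repeat=2))


def pure_nash():
    """Profiles where neither player gains from a unilateral deviation."""
    equilibria = []
    for h, d in product(H_STRATEGIES, D_STRATEGIES):
        e_helper, e_driver = expected(h, d)
        helper_ok = all(expected(h2, d)[0] <= e_helper for h2 in H_STRATEGIES)
        driver_ok = all(expected(h, d2)[1] <= e_driver for d2 in D_STRATEGIES)
        if helper_ok and driver_ok:
            equilibria.append((h, d))
    return equilibria


for h, d in pure_nash():
    print(h, d, expected(h, d))
```

Under these assumed payoffs the list contains the profile in which the helper always tells the truth and the driver follows the signal, alongside profiles in which the signal is misreported or uninformative, so a best-response check alone does not single out one convention.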
The analysis of the cases in Appendix A.3 implies we are only left to consider the four corner cases , , , and in Figure 3. We shall show that they can all form PBEs. However there is only one which can reasonably be expected to prevail. In order to obtain this we need to invoke an extra argument over and above the conditions for an NE or a PBE.
In analysing Cases 1–4, below, we take each with only one of the possible strategies for with which they form NE. This applies to Case 2 where there exists more than one such optimal strategy. This allows us to cast the analysis in terms of a corresponding figure as well. The figures enable us to explain the formation of the corresponding equilibrium beliefs, which are attached to the nodes of the information sets and which are part of the definition of the PBE. We also explain the arbitrary, within limits, beliefs when the equilibrium path does not enter an information set.
Case 1. . This is shown in Figure 4. These probabilities imply and . Hence and decides to play and the tree folds up into the smaller one in Figure 4. The best response for is to play from and from , hence the strategy . We write these pair of strategies as . To this, the best response by is to play . Hence the pair of pure strategies (), indicated with double lines, form a unique NE. This follows from the fact that can only be combined with , and vice versa, in order to form an NE.
Next we turn our attention to the existence of a PBE. We consider the set of pairs of behaviour strategies of and given by (), where, for example, means that at point 1 plays with probability 1 and at point 2 he chooses with probability 1. It is straightforward to see that these pairs are optimal in the sense of being a best response by an agent to the behavioural strategy of the other.
The calculations, through Bayesian updating, of the conditional probabilities, beliefs, attached to the nodes of and are based on these strategies and it is explained in Appendix A.4.
Finally, the optimality of the strategies given these beliefs can be easily checked. Hence () form a unique PBE. Of course it is connected to the NE because pure strategies imply that the implied behavioural strategies are played with probability 1 from the relevant information set.
The expected payoffs are calculated as follows: H always tells the truth and gets an expected payoff of 1. D always makes the correct move and also gets an expected payoff of 1.
We can provide some further explanation with respect to the expected payoffs. As explained earlier a folded-up tree can be obtained and now we are using the optimal strategies of given that . In the folded-up tree of Figure 4 it is clear the must use from point and from point 2. The NE and PBE follow.
In Case 2 below we consider the possibility that H might decide to send purposely the wrong signal to D. Now we want to consider the possibility that H receives a noisy signal. Let Nature select a state; the Sender receives this signal with a noise and this consequently will also be transmitted to the Driver. For example, if Nature chooses Left, the noise consists of H receiving Left with probability 0.95 and Right with probability 0.05. H sends to D a “left” or a “right” signal. All left signals end up in the same information set and all the right signals in a different one. D, in her turn, makes a left or a right move. If the final choice is correct, then D gets a payoff of 1 and H also gets 1 if he reported the correct signal and 0 if he reported an incorrect signal. In the latter case H receives an incorrect signal which he reports truthfully or he misreports a correct signal. For a small noise the outcome is the same as in Case 1 but for a large noise D will be punished for believing the false message.
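The effect of the noise level can be sketched numerically. This is an illustration under assumptions that go beyond the text: the noise is taken to be symmetric across the two states, the helper is assumed to report exactly what he observes, and the driver either follows or inverts the report. The function name and the `accuracy` parameter are hypothetical.

```python
def flip(side):
    """Return the opposite of "L" or "R"."""
    return "R" if side == "L" else "L"


def noisy_expected_payoffs(accuracy, driver_follows=True):
    """Expected (helper, driver) payoffs when the helper observes the state
    correctly with probability `accuracy` and reports his observation."""
    e_helper = e_driver = 0.0
    for state in ("L", "R"):
        for seen_correctly, prob in ((True, accuracy), (False, 1.0 - accuracy)):
            report = state if seen_correctly else flip(state)
            action = report if driver_follows else flip(report)
            correct = action == state
            weight = 0.5 * prob  # each state has prior probability 1/2
            if correct:
                e_driver += weight
                # The helper is paid only if his report matched the state.
                if report == state:
                    e_helper += weight
    return e_helper, e_driver


for accuracy in (0.95, 0.6, 0.4):
    print(accuracy,
          noisy_expected_payoffs(accuracy, True),
          noisy_expected_payoffs(accuracy, False))
```

In this sketch following the report pays the driver `accuracy` in expectation while inverting it pays `1 - accuracy`, so believing the message only hurts her once the report is wrong more often than it is right.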
Case 2. . This is shown in Figure 5. Hence and implying . decides to play and the full tree folds up into the smaller one in Figure 5. A best response, (we could have selected the best response or and Figure 5 would have been adjusted accordingly), by is to play and to this the best response of is . Hence the pair of pure strategies () form an NE.
With respect to the existence of a PBE, we consider the set of pairs’ behaviour strategies of and given by . It is easy to see that these pairs are optimal. Namely, they are the best response by an agent to the behavioural strategy of the other.
The calculations, through Bayesian updating, of the conditional probabilities, beliefs, attached to the nodes and are based on these strategies. The formulae used are analogous to the ones in Case 1 and the values are shown in Figure 5. The optimality of the strategies given these beliefs can be easily checked. Hence () is a PBE.
The expected payoffs are obtained as follows. H never tells the truth and ends up with an expected payoff of 0. D always makes the correct move, by playing the opposite of what H indicates, and ends with an expected payoff of 1. In the folded-up tree of Figure 5 it is clear that H can use from point 1 and from point 2. The NE and PBE follow.
We can provide some more detailed explanation with respect to the expected payoffs. In the folded tree, given ’s choices, a payoff of is indicated for irrespective of his strategies. The beliefs 0 and 1 in are consistent with playing . This gives expected payoff while playing gives . Hence action is preferable to .
Also, the beliefs 1 and 0 in are consistent with playing . This gives while playing gives . Hence action is preferable to .
Case 3. . This is shown in Figure 6. These probabilities imply and . Hence and decides to play and the full tree folds up into the smaller one in Figure 6. The best response for is to play and to this the best response of is . Hence the pair of pure strategies () form an NE.
With respect to the existence of a PBE, we consider the set of behaviour strategies of and given by (). It is again straightforward to see that these pairs are optimal in the sense of being the best response by an agent to the behavioural strategy of the other.
The calculations, through Bayesian updating, of the conditional probabilities, beliefs, attached to the nodes are based on these strategies. The formulae used are analogous to the ones in Case 1 and the values are shown in Figure 6. We note that the game never enters and, hence, for the indicated optimal strategy, , are arbitrary with . In the Bayesian formula for updating, since , we obtain 0/0.
The optimality of the strategies given the beliefs in can be easily checked. Hence is a PBE.
The expected payoffs are calculated as follows. H plays either or . Hence he tells the truth once, with probability 1/2, and gets an expected payoff of 1/2. D plays either or . That is, she makes the correct move once and also gets an expected payoff of 1/2. In the folded-up tree of Figure 6 it is clear that H must use “left” from point 1 and he can also use “left” from point 2. The NE and PBE follow.
It is important to note that in this equilibrium the informational content of H’s signal is zero and hence the (updated) beliefs of D at the information set that is reached are identical to the prior of the state of nature. As the signal “left” is used by H to communicate two states (both Left and Right), it is a homonymous signal using Pawlowitsch’s terminology. Professor Lewis refers in his book to inadmissible signals. The signal here, which conveys no information, could be considered an inadmissible one.
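The belief formation in the cases above can be sketched as a direct application of Bayes' rule. The function below is an illustrative assumption, not the formula of Appendix A.4: `beliefs` and its argument layout are hypothetical names, and it returns `None` when a signal is sent with probability zero, which is the 0/0 situation in which beliefs may be assigned arbitrarily.

```python
def beliefs(prior, helper_behaviour, signal):
    """Posterior over states at the information set reached by `signal`.

    prior: dict mapping state -> probability
    helper_behaviour: dict mapping state -> {signal: probability of sending it}
    Returns None if the signal has zero probability (Bayes' rule gives 0/0).
    """
    joint = {state: prior[state] * helper_behaviour[state].get(signal, 0.0)
             for state in prior}
    total = sum(joint.values())
    if total == 0:
        return None
    return {state: joint[state] / total for state in joint}


prior = {"L": 0.5, "R": 0.5}
truthful = {"L": {"l": 1.0}, "R": {"r": 1.0}}
always_left = {"L": {"l": 1.0}, "R": {"l": 1.0}}

print(beliefs(prior, truthful, "l"))
print(beliefs(prior, always_left, "l"))
print(beliefs(prior, always_left, "r"))
```

With the always-left behaviour the posterior after “left” coincides with the prior, and the information set after “right” is never reached.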
Case 4. . This is shown in Figure 7. These probabilities imply and . Hence and decides to play and the full tree folds up into the smaller one in Figure 7. The best response for is to play and to this the best response of is . Hence the pair of pure strategies () form an NE.
With respect to the existence of a PBE, we consider the set of behaviour strategies of and given by . It is again straightforward to see that these pairs are optimal in the sense of being the best response by an agent to the behavioural strategy of the other.
The calculations, through Bayesian updating, of the conditional probabilities, beliefs, attached to the nodes are based on these strategies. The formulae used are analogous to the ones in Case 1 and the values are shown in Figure 7. The game never enters and, hence, for the indicated optimal strategy, , are arbitrary with . In the Bayesian formula for updating, since , we obtain 0/0.
The optimality of the strategies given the beliefs in can be easily checked. Hence () is a PBE.
The expected payoffs are calculated as follows. H plays either or . Hence he tells the truth once, with probability 1/2, and gets an expected payoff of 1/2. D plays either or . That is, she makes the correct move once and also gets an expected payoff of 1/2. In the folded-up tree of Figure 7 it is clear that H must use “right” from point 2 and he can also use “right” from point 1. The NE and PBE follow.
Again, the informational content of H’s signal is zero in this Bayesian equilibrium, so the (updated) beliefs of D at the information set that is reached are identical to the prior of the state of nature, and we have ended up with a homonymous signal as described by Pawlowitsch. This completes the analysis of Case 4.
Next, as we mention in Case 5a in Appendix A.3 the pair and is an NE for any such , and hence a PBE can be formed. We now look at corresponding adjustments to Figure 6. In the smaller graph, the first payoff vector will now be and the third one . Correspondingly, in the bigger graph, coming out of information set there will also be double line on . This means that from point 1 will result in payoffs and from point 2 will imply payoffs . This will only reduce the payoff of . The players’ beliefs stay the same. This equilibrium requires that spins a wheel to decide between and from but equally well, for the point of view of her payoff, can play or as explained in Case 5a. So there is no advantage to her, at all, in taking a more complicated decision of spinning a wheel.
Finally, we show in Case 6b in Appendix A.3 that the pair and is an NE for any such and hence a PBE can be formed. The corresponding adjustments to Figure 7 will be as follows. In the smaller graph, the second payoff vector will now be and the fourth one . Correspondingly, in the bigger graph, coming out of information set there will also be double line on . This means that from point 1 will result in payoffs and from point 2 will imply payoffs . This will only reduce the payoff of . The players’ beliefs stay the same. As mentioned before, this equilibrium requires that spins a wheel to decide between and at , but equally well, for the point of view of her payoff, can play or as explained in Case 6b. So there is no advantage to her, at all, in taking a more complicated decision of spinning a wheel.
2.4. Comparing Equilibria
Examining Cases 1–4 established above, we see that we have an information revelation problem. It is only in Case 1, where and , that the signals of H reveal to D the true state of nature. In Case 2, where and , player H always misreports the state and D responds by doing exactly the opposite of what she is told to do. While one should stress that this also is a perfect equilibrium for the game, it does imply zero expected payoff for H. He is punished for lying.
In the equilibria of Cases 3 and 4 player H’s signals do not reveal anything about the state of nature and the driver sets her expectations in accordance with the prior probabilities. The resulting equilibrium expected payoff for H is inferior to that when he tells the truth.
Both players know that inconsistent announcements by H will lead to wasteful outcomes that will hurt him. Hence a truthful announcement may follow. This can result from cheap talk exchanges between the players, in which it was agreed that the message sent by the sender will be truthful (Binmore [13, 15], Rasmusen, and Barrett). The incentives of H and D are compatible. Player D can reasonably expect, and correctly guess, that H has an interest to observe the agreement and tell the truth. She comes to this conclusion on the basis of the structures of the payoffs and thus uses an extra argument for choosing among the four PBEs. This is over and above the arguments which establish equilibrium strategies. Hence we need, as Lewis argued, salient properties to establish the prevalence of a conventional signaling system, as the inferior payoffs to H are not enough to establish such a unique system.
The equilibrium in Case 1 is the most likely to prevail. knows that if he plays only when he sees and only when he sees , then will prefer to play rather than, for example, . This means that it is in the interests of for his signaling actions to be truthfully revealing of the state of nature.
It is important to note that an aligned interest between the Helper (sender) and the Driver (receiver), as found in much of the existing literature, would make the former careless in terms of the truth as he would consider his payoff dependent on the final outcome, which is determined by the actions of the Driver. This makes the formation of a convention less likely since Cases 1 and 2 would become equally credible. This can be seen by looking at Case 2 (Figure 5) and changing the payoffs in the folded tree for H to 1 in and 1 in . In other words the formation of a clearly distinguishing language convention becomes more problematic. Hence an aligned interest (payoff-wise) between H and D is less adequate for the formation of a conventional signaling system.
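The comparison between aligned and non-aligned payoffs can be made concrete with a short sketch. It is offered under assumptions: `helper_expected` is a hypothetical name, the aligned scheme pays the helper 1 whenever the driver's action is correct, and the non-aligned scheme pays him 1 only when his signal is also truthful.

```python
def helper_expected(helper_rule, driver_rule, aligned):
    """Expected payoff of the helper for a pair of pure strategies.

    aligned=True: the helper is paid whenever the action is correct.
    aligned=False: the helper is paid only if he also told the truth.
    """
    total = 0.0
    for state in ("L", "R"):
        signal = helper_rule[state]
        action = driver_rule[signal]
        correct = action == state
        truthful = signal.upper() == state
        if aligned:
            reward = 1 if correct else 0
        else:
            reward = 1 if (correct and truthful) else 0
        total += 0.5 * reward  # each state has probability 1/2
    return total


# Case 1: truthful signals, driver follows them.
case1 = ({"L": "l", "R": "r"}, {"l": "L", "r": "R"})
# Case 2: signals always misreport, driver does the opposite.
case2 = ({"L": "r", "R": "l"}, {"l": "R", "r": "L"})

for name, (h, d) in (("Case 1", case1), ("Case 2", case2)):
    print(name,
          "aligned:", helper_expected(h, d, aligned=True),
          "non-aligned:", helper_expected(h, d, aligned=False))
```

With aligned payoffs the helper earns the same in both cases, so his payoff no longer separates the truthful system from the inverted one.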
Therefore we can define what is meant by “truth” and a true signal. In our example, if the convention is set as the one where corresponds to and corresponds to then any other signaling system used is punishing at least one of the agents with a lower payoff, even if it results in a possible PBE. We return to this point in the following section.
The structure of the game, the payoffs, and the rationality of the players are common knowledge. Using the idea of rational expectations from economic theory one could argue as follows. Both players, the Helper and the Driver, make rational decisions taking fully into account all the information which is common knowledge. The implications of their various strategies are clearly laid out. The players make rational predictions, or prophecies, of each other’s actions and on this basis they act themselves. In the rational expectations equilibrium that results, that is, in the particular PBE, the predictions, that is, the players’ beliefs, are confirmed. The prophecies of the players are self-fulfilling.
Lewis argues that there is no a priori meaning to signals. In contrast, Farrell argues that words (or gestures) do have a fixed external meaning. A receiver will believe that the sender will send him true information unless the sender has good reason to deceive him. In cheap-talk games speech serves the purpose of reinforcing a particular action and provides the beginning of an evolutionary rationale for choosing a particular equilibrium. There are of course other approaches; for example, Parikh [18] argues that it is the flow of information in a game-theoretic context that leads to the determination of the meaning of words and gestures. This helps to select an equilibrium.
Of course, in deciding among the four PBEs examined above, the Helper must find a way to communicate to the Driver that he intends to be truthful. Clearly this is done on the basis of the expected payoff of the sender of the signal. The Helper will rightly assume that the Driver will want to do the best for herself in terms of payoffs and thus play . Hence there is an alignment of preferences as discussed in Crawford and Sobel.
3. Further Discussions and Conclusions
3.1. Discovering and Updating the Informational Content of Signals
We briefly place our analysis in the wider context of the literature. In the Lewis signaling game the sender and the receiver learn how to play through experiencing successes and failures in repeated rounds of the game. This leads to the evolution of a signaling system, for example, a “convention.” This signaling system evolves through reinforcement learning, which takes the form of rewards to both agents for correct choices by the receiver. The accumulation of these rewards leads to the updating by the receiver of the probabilities of the states of nature. Skyrms [19] follows Lewis’s work and develops, in effect, the dynamics of repeated games. He explains how individuals can converge to a common convention that indicates which signal is to be sent in a particular situation, as well as the receiver’s action for each type of signal communicated by the sender.
In a later article, Skyrms [2] stresses the importance of the informational content of transmitted signals in updating beliefs. Nature chooses the state and then sends signals to intermediary receivers (senders). They can convey the information received to other agents through actions in the form of communicated signals. The informational content of signals alters the prior probabilities of states, as a result of the receiver updating his/her beliefs. The size of this update, relative to the initial prior probability of a state of nature, determines the informational content carried by a signal. Intuitively, the lower the prior probability, the higher the informational content of an accurate signal, that is, one that is more likely to be representative of the true state of nature.
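As a minimal sketch of this intuition, one common way to quantify the information a signal carries about a state is the base-2 logarithm of the ratio of the posterior to the prior probability of that state. The function name and the numerical priors below are illustrative assumptions and are not taken from the analysis in this paper.

```python
import math


def information_content(prior, posterior):
    """Bits of information about a state: log2(posterior / prior).

    Hypothetical helper for illustration only.
    """
    if not (0 < prior <= 1 and 0 < posterior <= 1):
        raise ValueError("probabilities must lie in (0, 1]")
    return math.log2(posterior / prior)


# A fully accurate signal moves the belief in the true state to 1.
# The lower the prior, the more bits the same accurate signal carries.
for prior in (0.5, 0.25, 0.125):
    print(prior, information_content(prior, 1.0))
```

A signal that leaves the belief unchanged (posterior equal to prior) carries zero information under this measure, which corresponds to the uninformative signals of pooling equilibria.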
In Barrett’s article, updating takes the form not of rewards but of adding balls to the sender’s urn. These balls correspond to a signal that leads the receiver to take an action that matches the actual state of the world. Balls are also added to the receiver’s urn for the corresponding action. Clearly such a case makes no distinction between true signals and correct actions because, in contrast to Farrell’s approach, words or gestures have no predetermined external meaning. Hence the action alone determines the truthfulness of the signal. Originally the balls in the urns correspond to the prior probabilities of the different states of nature. The adding of balls to the signal and action urns changes the relative proportion of balls in each urn and we have a process of continuous updating. This results in the formation of a matching law (Herrnstein [20]).
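The urn process described above can be sketched as a small simulation. This is only an illustration under stated assumptions: two equiprobable states, two signals, two actions, and one ball of each type in every urn at the start. The function name, the seed, and these parameters are hypothetical and are not taken from Barrett’s article.

```python
import random


def simulate_urn_learning(rounds=2000, n=2, seed=0):
    """Urn-based reinforcement learning in an n-state signaling game.

    sender_urns[state][signal] and receiver_urns[signal][action] hold
    ball counts. Assumption: every urn starts with one ball per option.
    """
    rng = random.Random(seed)
    sender_urns = [[1] * n for _ in range(n)]
    receiver_urns = [[1] * n for _ in range(n)]
    successes = 0
    for _ in range(rounds):
        state = rng.randrange(n)  # nature draws a state uniformly
        # Each player draws a ball with probability proportional to counts.
        signal = rng.choices(range(n), weights=sender_urns[state])[0]
        action = rng.choices(range(n), weights=receiver_urns[signal])[0]
        if action == state:
            # Reinforce only the choices that produced a correct action.
            sender_urns[state][signal] += 1
            receiver_urns[signal][action] += 1
            successes += 1
    return sender_urns, receiver_urns, successes


sender, receiver, wins = simulate_urn_learning()
print("sender urns:", sender)
print("receiver urns:", receiver)
print("success rate:", wins / 2000)
```

Because a ball is added only after a correct action, the urn contents record which signal-action pairings succeeded, with no reference to any prior meaning of the signals.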
Here we adopt a more formal approach. In Appendix A.4, using the Bayesian formula, we saw the mathematical formulation of updating beliefs held by the receiver in light of the signal received. If one wanted, as an extension of the current analysis, to work within a framework of repeated games with signals received each period, one could use a Bayesian updating reiterating process. This would adjust beliefs regarding the evolving trustworthiness of a repetitive signal, concerning the probability of a specific state of nature, .
Consider, for example, the formula indicates the informational content of the signal for a given state of nature. It has prior , starting time “0,” a running time of periods, a learning speed convergence parameter , and a cumulative experience cardinality , where . When for each period, the signal leads to a correct action and if , then it does not.
This updating formula is in effect an extension of the Bayesian updating used in this paper and could be more appropriate for a mathematical formulation of a repeated game. This dynamic formula in which tends to 1 could be considered as a background to the completed process on which our presentation rests.
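The general idea of repeated Bayesian updating can be sketched as follows. This is a generic illustration in which each period’s posterior becomes the next period’s prior; it is not the dynamic formula discussed in the text. The `reliability` parameter, the assumed probability that a trustworthy signal leads to a correct action, and both function names are hypothetical.

```python
def bayes_update(prior, likelihood_if_true, likelihood_if_false):
    """Posterior probability of a hypothesis after one observation."""
    numerator = prior * likelihood_if_true
    return numerator / (numerator + (1 - prior) * likelihood_if_false)


def iterate_beliefs(prior, outcomes, reliability=0.8):
    """Update the belief that a signal is trustworthy, period by period.

    outcomes: iterable of booleans, True if the signal led to a correct
    action in that period. Assumption: a trustworthy signal is correct
    with probability `reliability`, an untrustworthy one with
    probability 1 - reliability.
    """
    belief = prior
    history = [belief]
    for correct in outcomes:
        if correct:
            belief = bayes_update(belief, reliability, 1 - reliability)
        else:
            belief = bayes_update(belief, 1 - reliability, reliability)
        history.append(belief)
    return history


print(iterate_beliefs(0.5, [True, True, True, False, True]))
```

Under these assumptions a run of correct actions pushes the belief towards 1, while a single failure partially reverses the accumulated trust.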
The simple example analysed in this paper leads to a number of different PBEs. As shown in our analysis it is sufficient to consider the four equilibria corresponding to pure strategies. The game suggests a “coding” that may imply a “true signal” mechanism (Case 1), or a “correct action” signaling mechanism (Case 2), or a mechanism where the messages do not convey any information and the receiver performs the same action regardless of the signal (Cases 3 and 4). While a true signal is mutually rewarding, a correctly guessed state of nature is not. Hence the Helper has an incentive to signal the true state of nature. This is what is referred to as a “truthful mechanism design” (Aggarwal et al. [21]).
As Barrett argues, one may think of even more complex games. For example, there may be four states of nature each occurring with an equal prior probability, two senders, one receiver, and two signals, for example, 0 and 1. The senders coordinate their actions and can send four types of binary signals (00, 01, 10, and 11). The receiver is aware of which sender each signal originates from. However, the receiver needs to understand (learn) the correspondence between states and signals.
An accumulation of confirmations of states through rewards and punishments can lead to the formation of a convention that takes the form of a language. This will accelerate and significantly improve the chances of a perfect signaling game. We could view the punishment reference as the equivalent to our sender not getting a payoff because his signal was not truthful.
One can note that there may be cases where there is plentiful information about the states in the signals, but zero information in the act that will be chosen by its receiver. This applies if the receiver always performs the same act irrespective of the signal. This case has been explored in depth in the economics literature of herding behaviour.
As an example of this, in the formation of investment cascades there is no longer reinforcement learning. The actions performed by the receiver of a private signal (for example, a signal whether or not to invest in a particular project which may be either a good (profitable) or bad (loss making) investment) are no longer an indicator of her private information. Instead, a potential investor follows the same act as his/her predecessors irrespective of what his/her private information indicates. (See, e.g., Banerjee [22], Bikhchandani et al. [23], Choi et al. [24], and Welch [25].) This is a case of signal jamming.
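A stylised sketch of how such a cascade can start is given below. It is a simplified counting rule in the spirit of the cascade literature, not the model of any of the cited papers. The assumptions are hypothetical: binary private signals of equal precision (+1 for “good”, -1 for “bad”), investors who observe all earlier actions, and ties broken in favour of one’s own signal.

```python
def run_cascade(private_signals):
    """Return each investor's action and whether she was in a cascade.

    lead is the net number of private signals that can be inferred from
    earlier actions. Once |lead| >= 2, one private signal cannot
    overturn the public evidence, so the investor imitates her
    predecessors and her action reveals nothing.
    """
    lead = 0
    actions, in_cascade = [], []
    for signal in private_signals:
        if signal not in (1, -1):
            raise ValueError("signals must be +1 or -1")
        if abs(lead) >= 2:
            actions.append(1 if lead > 0 else -1)
            in_cascade.append(True)
        else:
            # The action follows the private signal and so reveals it.
            actions.append(signal)
            lead += signal
            in_cascade.append(False)
    return actions, in_cascade


actions, flags = run_cascade([1, 1, -1, -1, -1])
print(actions)
print(flags)
```

In this run the first two investors reveal good signals, after which every later investor invests even though all later private signals are bad, so the later signals carry information that never shows up in the actions.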
3.2. Concluding Remarks
It is remarkable how neighbouring disciplines such as philosophy on the one hand and game theory and economics on the other can come close to the understanding and analysis of important issues. The philosophical and mathematical rigours complement each other. This is the motivation of our discussion.
In this paper we analyse in detail, from the point of view of game theory, the signaling game discussed by Professor Lewis. Our approach is different from, but complementary to, his. We place the model in a rigorous game theory, extensive-form decision tree framework and analyse the perfect Bayesian equilibria, as well as the Nash equilibria. We explain and then deploy well-established game theory ideas and concepts. The game tree that we set up appears, in terms of moves, information sets, and payoffs, to be simple but the complete analysis of equilibria is involved.
We provide a discussion of the informational content and significance of the signals and the formation of beliefs in each of the above equilibria. We invoke a further argument to explain that one particular equilibrium, out of the four existing ones, is the most likely to prevail. This is an essential step for understanding the formation of a language convention. Furthermore we distinguish between a true signal and a signal that leads to a correct action in terms of the payoffs received by the receiver and the sender. A true signal will lead to a correct action but a correct action is not necessarily the result of a true signal.
In the introduction we used a quote from the work of Professor Lewis, which sums up the expected reactions of the two players in a situation where signals are used. The interpretation and analysis of this quote lead us naturally to the well-known concepts of rational expectations and self-fulfilling prophecies of economic theory. These ideas refer to a situation in which the rational decisions of agents are locked in at a fixed point. As optimal reactions to each other’s actions they confirm themselves. We employ the specific terminology for equilibria used in game theory. Our detailed calculations of the equilibria bring out the complexities behind the statement by Professor Lewis and confirm it.
A.1. Proof of Proposition 1
We refer to Figure 1 and in particular Figure 2. The Cases 1–4 in the appendix are different from the cases with the same numbers discussed in the text. We shall consider first the combinations of the pure strategies of with the open set of in Figure 3.
Case 1. and .
There is no NE. will change to .
Case 2. and .
There is no NE. will change to .
Case 3. and .
There is no NE. will change to .
Case 4. and .
There is no NE. will change to .
Also it is clear that there is no NE with in which uses a mixed strategy. Any such strategy will be a combination of the above cases and therefore it will not be optimal for .
A.2. Proof of Proposition 2
We shall consider all cases in detail. We can argue considering Figure 2.
(i) Strategies and , form an NE. For it is not an NE. can play .
(ii) Strategies and , form an NE. There is no other NE which corresponds to .
(iii) Strategies and , form an NE. There is no other NE which corresponds to .
(iv) Strategies and , form an NE. For it is not an NE; can play .
Now we can consider the possibility of mixed strategies for . We note that cannot be mixed with any other strategy for . On the other hand , and can be mixed and form an NE with . We have for the individual strategies and the expected payoffs, and , of and respectively,
(i) and , is an NE with and ,
(ii) and , is an NE with and ,
(iii) and , is an NE with and .
From (i)–(iv) above, it follows that only pure strategies , , and can be mixed and this is possible only for , . The expected payoffs will be mixed accordingly.
Next we consider the possibility of mixing the strategies of for fixed strategies of .
(iv) , , and are all NE and in all circumstances . More explicitly, for probabilities we have the overall expectations
(v) , , and are all NE and in all circumstances . Explicitly, for probabilities we have the overall expectations
It remains to consider the possibility of an NE consisting both of a mixed strategy for and one for . The mixed strategy of will consist of a combination of her strategies, each taken with a positive probability. Every such strategy taken together with the given mixed strategy of will be optimal. But as we saw above mixed strategies of can only be combined with . So either this is the case or uses a pure strategy. Both circumstances have been analysed above.
The characterization of the PBEs follows easily from the NE above. It is possible to find the implied beliefs which will be consistent with the behavioural strategies. For the corner cases of this is done in detail in Section 2.3.
A.3. Lack of Equilibrium in Specific Cases
We want to consider all possible feasible pairs . First we analyse all cases in which are on the boundary of the feasible set .
Case 5a. .
will play either or ; that is, he must play from point 1. We distinguish between two subcases. If plays , then will change to and , and therefore there is no NE.
However, the pair and is an NE for any such . Neither nor can change his/her strategy unilaterally to improve his/her payoff. This equilibrium requires that plays from and spins a wheel to decide from information set between and ; equally well, from the point of view of her payoff, can play or . So there is no advantage to her, at all, in taking a more complicated decision of spinning a wheel. We also note that gives a higher payoff to , although this is not the concern of .
Next we consider the possibility of a mixed strategy for . No such strategy could contain either or with a positive probability because can change that part of his mixed strategy and become better off. So the only possibility is for strategies and to be mixed. But then can change to and become better off.
Case 5b. .
There is no NE. will play from point 1. We consider the pair of strategies and . Then will change to . Consider now the pair of strategies and . Then will change to .
Suppose now that forms a mixed strategy consisting, with some positive probabilities, of the pure strategies , , or . Then can change that part of his mixed strategy and become better off. Therefore there is no NE with mixed strategy for either.
Case 6a. .
There is no NE. If the pair of strategies is and , then will change to . If the strategies are and , then will change to . Finally if they are and , then will change to .
Suppose now that forms a mixed strategy consisting, with some positive probabilities, of , , or . But then can change that part of his mixed strategy and become better off. Therefore there is no NE with mixed strategy for either.
Case 6b. .
Now we distinguish among the following subcases.
If the strategies are and , then there is no NE because will change to . If the strategies are and , then will change to . If the strategies are and , then will change to .
However, the pair and is an NE for any such . Inspection of the graphs reveals that, given the strategy of the other player, neither nor can change strategy and improve his/her payoff. This equilibrium requires that plays from and spins a wheel to decide, from information set , between and ; equally well, from the point of view of her payoff, can play or . So there is no advantage to her, at all, in taking a more complicated decision of spinning a wheel. We also note that gives a higher payoff to , although this is not the concern of .
We now look at the possibility of a mixed strategy for . No such strategy could contain either or with a positive probability because can change that part of his mixed strategy and become better off. So the only possibility is for strategies and to be mixed. But then can change to and become better off.
A.4. The Bayesian Updating of Conditional Probabilities
The calculations, through Bayesian updating, of the conditional probabilities, beliefs, attached to the nodes of and in Figure 4 are based on these strategies.
Consider information set . The left-hand-side node is denoted by and the right-hand-side one by . We wish to calculate the beliefs attached to these nodes by . Using the Bayesian formula for updating beliefs (see, e.g., Glycopantis et al. [26]), we can calculate these conditional probabilities. We know that is entered only if plays or . Hence Similarly we obtain the conditional probability .
On the other hand is entered only if plays either or . The left-hand-side node is denoted by and the right-hand-side one by . Hence
Similarly we obtain the conditional probability .
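The type of calculation involved can be sketched generically. The belief attached to a node of an information set is the prior probability of the corresponding state times the probability that the sender sends the observed signal in that state, divided by the total probability of reaching the information set. The state and signal labels, the function name, and the numerical priors below are hypothetical placeholders rather than the notation or values of Figure 4.

```python
def beliefs_at_information_set(prior, strategy, signal):
    """Receiver's beliefs over states after observing `signal`.

    prior: dict mapping state -> prior probability.
    strategy: dict mapping state -> dict of signal -> probability,
        that is, the sender's behavioural strategy.
    Returns None if the information set is reached with probability
    zero, in which case Bayes' rule places no restriction on beliefs.
    """
    joint = {
        state: prior[state] * strategy[state].get(signal, 0.0)
        for state in prior
    }
    total = sum(joint.values())
    if total == 0:
        return None
    return {state: p / total for state, p in joint.items()}


prior = {"s1": 0.5, "s2": 0.5}
truthful = {"s1": {"m1": 1.0}, "s2": {"m2": 1.0}}
pooling = {"s1": {"m1": 1.0}, "s2": {"m1": 1.0}}

print(beliefs_at_information_set(prior, truthful, "m1"))
print(beliefs_at_information_set(prior, pooling, "m1"))
print(beliefs_at_information_set(prior, pooling, "m2"))
```

With a separating strategy the observed signal pins down the state, while with a pooling strategy the beliefs coincide with the prior and the unused signal leads to an information set that is off the equilibrium path.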
We consider here the following linear system: where and . The question is whether there is always a nonnegative solution for arbitrary and .
The family of solutions is given by where must be nonnegative and must also satisfy and . There is a range of that satisfies these relations as long as which is always true. It is straightforward to show that all solutions are of this form.
The result here is used in the text with explicit reference.
The authors wish to thank the referees of this journal for their incisive and very helpful comments which led to an improved version of our paper. They are also grateful for other comments received on an earlier draft. In particular they would like to thank Dr. Angelos Dassios for his comments. Of course responsibility for all shortcomings stays with the authors.
1. D. Lewis, Convention, Harvard University Press, Cambridge, Mass, USA, 1969.
2. B. Skyrms, “The flow of information in signaling games,” Philosophical Studies, vol. 147, no. 1, pp. 155–165, 2010.
3. A. Blume, “A class of strategy-correlated equilibria in sender-receiver games,” Games and Economic Behavior, vol. 75, pp. 510–517, 2012.
4. A. Rubinstein, Economics and Language, Cambridge University Press, Cambridge, UK, 2000.
5. C. Pawlowitsch, “Why evolution does not always lead to an optimal signaling system,” Games and Economic Behavior, vol. 63, no. 1, pp. 203–226, 2008.
6. S. M. Huttegger, “Evolution and the explanation of meaning,” Philosophy of Science, vol. 74, no. 1, pp. 1–27, 2007.
7. V. Crawford and J. Sobel, “Strategic information transmission,” Econometrica, vol. 50, pp. 1431–1451, 1982.
8. J. Farrell, “Meaning and credibility in cheap-talk games,” Games and Economic Behavior, vol. 5, no. 4, pp. 514–531, 1993.
9. R. van Rooij, “Conversational implicatures and communication theory,” in Current and New Directions in Discourse and Dialogue, J. van Kuppevelt and R. W. Smith, Eds., pp. 282–303, Kluwer Academic, Dordrecht, The Netherlands, 2004.
10. R. van Rooij, “Signaling games select Horn strategies,” Linguistics and Philosophy, vol. 27, pp. 493–527, 2004.
11. P. D. Taylor and L. B. Jonker, “Evolutionarily stable strategies and game dynamics,” Mathematical Biosciences, vol. 40, no. 1-2, pp. 145–156, 1978.
12. S. M. Huttegger, “Evolutionary explanations of indicatives and imperatives,” Erkenntnis, vol. 66, no. 3, pp. 409–436, 2007.
13. K. Binmore, Fun and Games: A Text on Game Theory, D.C. Heath and Company, 1992.
14. M. J. Osborne and A. Rubinstein, A Course in Game Theory, The MIT Press, Cambridge, Mass, USA, 1994.
15. K. Binmore, Playing for Real: A Text on Game Theory, Oxford University Press, New York, NY, USA, 2007.
16. E. Rasmusen, Games and Information, Blackwell Publishing Company, London, UK, 4th edition, 2007.
17. J. A. Barrett, “The evolution of coding in signaling games,” Theory and Decision, vol. 67, no. 2, pp. 223–237, 2009.
18. P. Parikh, “Radical semantics: a new theory of meaning,” Journal of Philosophical Logic, vol. 35, no. 4, pp. 349–391, 2006.
19. B. Skyrms, Evolution of the Social Contract, Cambridge University Press, Cambridge, UK, 1996.
20. R. J. Herrnstein, “On the law of effect,” Journal of the Experimental Analysis of Behavior, vol. 13, pp. 243–266, 1970.
21. G. Aggarwal, A. Fiat, A. V. Goldberg, J. Hartline, N. Immorlica, and M. Sudan, “Derandomization of auctions,” in Proceedings of the 37th ACM Symposium on Theory of Computing (STOC '05), 2005.
22. A. Banerjee, “A simple model of herd behaviour,” Quarterly Journal of Economics, vol. 107, pp. 797–817, 1992.
23. S. Bikhchandani, D. Hirshleifer, and I. Welch, “A theory of fads, fashion, custom, and cultural change as informational cascades,” Journal of Political Economy, vol. 100, pp. 992–1026, 1992.
24. C. J. Choi, X. Dassiou, and S. Gettings, “Herding behaviour and the size of customer base as a commitment to quality,” Economica, vol. 67, no. 267, pp. 375–398, 2000.
25. I. Welch, “Sequential sales, learning, and cascades,” Journal of Finance, vol. 47, pp. 695–732, 1992.
26. D. Glycopantis, A. Muir, and N. C. Yannelis, “On extensive form implementation of contracts in differential information economies,” Economic Theory, vol. 21, no. 2-3, pp. 495–526, 2003.