#### Abstract

We study a local interaction model where agents play a finite *n*-person game following a perturbed best-response process with inertia. We consider the concept of minimal *p*-best response set to analyze distributions of actions in the long run. We distinguish between two assumptions made by agents about the matching rule. We show that only actions contained in the minimal *p*-best response set can be selected provided that *p* is sufficiently small. We demonstrate that these predictions are sensitive to the assumptions about the matching rule.

#### 1. Introduction

There is an extensive literature on evolutionary game theory that investigates the long-run outcomes when a population of boundedly rational agents uses simple rules of adaptation to recurrently play a game in normal form. These models allow for sharp equilibrium selection results in some classes of games with multiple equilibria. Such an equilibrium can be interpreted as a convention, that is, a pattern of behavior that is customary and expected and that regulates much of economic and social life (for a survey, see, among others, Young [1]). A common assumption in this literature is that interactions among agents are local rather than global. Each individual may be matched according to some matching rule with a subgroup of the overall population to play a game (for a survey, see, e.g., Weidenholzer [2]). In this paper, we assume that agents have no information about this matching rule. As a consequence, we elaborate two alternative scenarios corresponding to different assumptions made by agents about the matching rule. This point can be interpreted as a consequence of bounded rationality: agents only have an imperfect representation of their environment. We use the concept of (minimal) *p*-best response set introduced by Tercieux [3] to make predictions for the long-run behavior of the population of agents (this concept is used by Durieu et al. [4] in a fictitious play model with bounded memory). We establish that such predictions are possible for the whole class of finite *n*-person games. We also study how assumptions about the matching rule affect these predictions.

Precisely, we consider a finite population of agents located at the vertices of a chordal ring, that is, a ring topology in which each vertex has additional links with other vertices. In particular, we focus on chordal rings of degree , constructed by adding at each vertex chords with its closest vertices in a ring network. In order to deal with asymmetric finite *n*-person games, we distinguish between *n* classes of agents. A specific role in the game is assigned to every class. Each agent of each class is located at exactly one vertex and two agents of the same class are located at different vertices. In this way, each vertex contains exactly *n* agents. The game is played recurrently. At each period, one agent of each class is drawn at random to play the game. However, players’ identities are not revealed. At the beginning of every period, each agent of each class observes the actions that the agents of other classes located at vertices linked with its own vertex have chosen in the previous period. He uses this information to estimate the probability distribution on the action profiles played by his potential opponents in the current period. These estimations depend on the assumptions made by agents about the matching rule. Each agent believes that he can be matched with agents located at vertices linked with his own location. Precisely, we consider the following two scenarios. In the first scenario, each agent assumes that his potential opponents are drawn jointly by location. In the second scenario, each agent assumes that his potential opponents are drawn independently. For both scenarios, agents may have the opportunity to choose a best response to their estimation. The concept of *p*-best response set allows us to study the way in which the distribution of actions that people take in the *n* classes evolves over time.
We show that, for both assumptions about the matching rule, only actions contained in the minimal *p*-best response set can be selected in the long run provided that *p* is sufficiently small. For each assumption, an explicit bound on *p* is given, and we analyze how this critical value evolves when increases.

Ellison [5, 6] considers a similar local interaction structure: agents are arranged on a ring augmented by chords added between vertices in a regular pattern. However, since Ellison [5, 6] focuses on symmetric two-player games, each vertex is associated with one agent and each agent has full information about the matching rule. Ellison [5] considers a symmetric coordination game. The model predicts that the risk-dominant equilibrium is selected in the long run. Ellison [6] shows that in a symmetric game, the -dominant equilibrium, if it exists, is selected in the long run. Alós-Ferrer and Weidenholzer [7] consider the weakened solution concept of globally pairwise risk-dominant equilibrium and show that this equilibrium, if it exists, is selected in a symmetric coordination game played by agents arranged on a ring. However, a generalization of this result to a larger class of games is not possible. Indeed, Alós-Ferrer and Weidenholzer [7] present an example of a coordination game in which the globally pairwise risk-dominant equilibrium is not selected in the long run. From this point of view, the concept of minimal *p*-best response set is interesting. Since a minimal *p*-best response set exists in every game, it allows us to investigate the long-run outcomes in the whole class of finite *n*-person games. Moreover, since the concept of minimal *p*-best response set generalizes the concept of *p*-dominant equilibrium [8], the result of Ellison [6] is a particular case of our result.

The paper is structured as follows. Section 2 introduces notation. Section 3 introduces the concept of *p*-best response set. Section 4 introduces the learning model. In Section 5 we present the main results. Section 6 concludes.

#### 2. Notations and Definitions

Let ⊆ denote weak set inclusion and let ⊂ denote proper set inclusion. We denote by ⌈*x*⌉ the smallest integer greater than or equal to *x*. For any finite set *X*, Δ(*X*) denotes the set of all probability distributions on *X*.

Let be a finite -person strategic-form game. Let be the finite set of pure strategies available to player . We write for the set of probability distributions over for each . Let denote the probability mass on strategy . Define the product set . Let be the set of probability distributions on . Let denote the set of all possible combinations of strategies for the players other than , with generic elements . Let be the set of probability distributions on with generic elements . We sometimes identify the element of that assigns probability one to a strategy in with this strategy in .

In this paper, a player’s belief about others’ strategies takes the form of a probability measure on the product of all opponents’ strategy sets. We assume that each player has expected payoffs represented by the function .

For each player and probability distribution , let be the set of pure best responses of against .

Let be a product set where each is a nonempty subset of . Let denote the set of probability distributions with support in . Finally, denotes the set of strategies in that are pure best responses of against some distribution with support in ; that is,

#### 3. *p*-Best Response Sets

We now introduce the concept of (minimal) *p*-best response set. Let and let be a nonempty subset of . We write for the subset of distributions such that . Let denote the set of strategies in that are pure best responses by to some distribution (regardless of the probability assigned to other possible combinations of strategies); that is,

Let us recall the definition of a strict *p*-dominant equilibrium, first introduced by Morris et al. [8] in two-person games and extended to *n*-person games by Kajii and Morris [9]. A profile is a strict *p*-dominant equilibrium if for each player

In the sequel, we focus on the case where for all . The concept of *p*-best response set extends the concept of strict *p*-dominant equilibrium to product sets of strategies. Formally, a (minimal) *p*-best response set is defined as follows.

*Definition 1.* Let . A *p*-best response set is a product set , where for each player
A *p*-best response set is a minimal *p*-best response set if no *p*-best response set is a proper subset of .
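As an illustration of the singleton case, the following is a minimal sketch of a check for strict *p*-dominance in a two-player game, based on the verbal definition above: each player's action must be the unique best response to every belief that puts probability at least *p* on the opponent's action. Because expected payoffs are linear in beliefs, the sketch only tests the extreme points of that belief set. The function name, the payoff layout `payoffs[i][own][opponent]`, and the example game are assumptions made for illustration and do not come from the paper.

```python
from fractions import Fraction


def is_strict_p_dominant(payoffs, profile, p):
    """Check strict p-dominance of `profile` in a two-player game.

    payoffs[i][a][b] is the payoff to player i from own action a against
    opponent action b (hypothetical layout chosen for this sketch).
    """
    for i in (0, 1):
        own, other = profile[i], profile[1 - i]
        u = payoffs[i]
        # Extreme beliefs: weight p on `other`, remaining weight on a single action b.
        for b in range(len(u[0])):
            def expected(a):
                return p * u[a][other] + (1 - p) * u[a][b]

            # Strictness fails if any other action does at least as well.
            if any(expected(a) >= expected(own) for a in range(len(u)) if a != own):
                return False
    return True


# Example: a 2x2 coordination game with payoff 2 on (0, 0) and 1 on (1, 1).
coordination = [[[2, 0], [0, 1]], [[2, 0], [0, 1]]]
print(is_strict_p_dominant(coordination, (0, 0), Fraction(1, 2)))
print(is_strict_p_dominant(coordination, (1, 1), Fraction(1, 2)))
```

Exact rational arithmetic is used so that ties are detected reliably; with floating-point payoffs a tolerance would be needed.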

Let be the collection of *p*-best response sets for some . The following lemma states some properties of minimal *p*-best response sets.

Lemma 2. *Let be a finite n-person game.*

(1) * has a minimal p-best response set for any .*

(2) *Fix . Then two distinct minimal p-best response sets of are disjoint.*

(3) *For , there exists a unique minimal p-best response set in .*

(4) *Let . Let and be the minimal -best response set and -best response set, respectively. Then, .*

For a proof of this lemma, we refer the reader to Durieu et al. [4] and Tercieux [3].

#### 4. Adaptive Processes

Following Samuelson [10], we extend the Darwinian process proposed by Kandori et al. [11] to the multipopulation case. This extension is a natural way to deal with asymmetric *n*-person games. We think of the game as having *n* roles. For each role , a nonempty class of individuals is eligible to play that role. We assume that each class is composed of identical agents, where is a finite integer.

We consider the possibility of local interactions. Let be a graph where is the set of vertices and is the set of edges. We assume that vertices are located increasingly in a clockwise direction around a ring. We focus on a chordal ring: vertex is adjacent to the vertices (modulo ) where . If , then each vertex is adjacent to each other vertex. For each vertex , denote by the open neighborhood of , that is, the set of vertices adjacent to . Since for each , is regular. The closed neighborhood of , denoted by , is defined by .
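The neighborhood structure can be sketched in a few lines. The snippet below is a minimal illustration, assuming vertices are labeled `0, ..., m - 1` and that each vertex is linked to its `k` closest vertices on each side; the names `m`, `k`, `open_neighborhood`, and `closed_neighborhood` are chosen here for illustration and are not the paper's notation.

```python
def open_neighborhood(i, m, k):
    """Vertices adjacent to vertex i on a ring of m vertices with chords
    to the k closest vertices on each side (labels taken modulo m)."""
    if not 1 <= k <= (m - 1) // 2:
        raise ValueError("this sketch requires 1 <= k <= (m - 1) // 2")
    return {(i + d) % m for d in range(-k, k + 1) if d != 0}


def closed_neighborhood(i, m, k):
    """Open neighborhood of i together with i itself."""
    return open_neighborhood(i, m, k) | {i}


# Example: 7 vertices, chords to the 2 closest vertices on each side.
print(sorted(open_neighborhood(0, 7, 2)))
print(sorted(closed_neighborhood(0, 7, 2)))
```

Under these assumptions every vertex has exactly `2 * k` neighbors, so the graph is regular, and when `k` reaches `(m - 1) // 2` with `m` odd each vertex is adjacent to every other vertex.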

We assume that each class of agents is distributed among the set of vertices . In other words, each agent of each class , , is located at exactly one vertex of and two agents of the same class are located at different vertices of . In this way, each vertex contains exactly *n* agents. From this point of view, each vertex can be interpreted as a location. We denote by the action profile chosen by agents located in . The collection of action profiles chosen in all locations in is .

Let denote successive time periods. The stage game is played once in each period. In period , one agent is drawn at random from each of the classes and assigned to play the corresponding role. We assume that, at every period, each agent has no information about who is selected to play each role in the game, only that a given action is played by someone. This lack of information gives rise to two kinds of assumptions made by each agent about his potential opponents, that is, about the matching rule. Fix a location . In a first scenario, each agent in assumes that his potential opponents belong to a unique location in . In other words, opponents would be drawn jointly by location in the closed neighborhood of his location. In a second scenario, each agent in assumes that his potential opponents may belong to different locations in . In other words, opponents would be drawn independently in the closed neighborhood of his location.

We give a formal description of both scenarios. Consider a time period . Actions chosen in by the whole population are described by . Fix an agent located in and an action profile . Denote by the number of locations in such that in . According to the first scenario, at the beginning of period , agent believes that the probability to be matched with agents playing is

In order to describe the second scenario, it is convenient for each class to identify with the set , where . For each such that , and each , denote by the number of where is chosen in . Consider . According to the second scenario, at the beginning of period , agent believes that the probability to be matched with agents playing is
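The two scenarios can be contrasted with a small sketch that follows their verbal description: under joint drawing, the belief is the empirical frequency of whole opponent profiles across the locations of the closed neighborhood; under independent drawing, it is the product of the per-class empirical frequencies. The data layout (a dictionary from location to a tuple holding one action per class) and the function names are assumptions made for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def joint_belief(state, locations, c):
    """Belief of a class-c agent when opponents are drawn jointly by location."""
    counts = Counter(
        tuple(a for j, a in enumerate(state[loc]) if j != c) for loc in locations
    )
    return {prof: Fraction(cnt, len(locations)) for prof, cnt in counts.items()}


def independent_belief(state, locations, c):
    """Belief of a class-c agent when each opponent is drawn independently."""
    n_classes = len(state[locations[0]])
    marginals = []
    for j in range(n_classes):
        if j == c:
            continue
        counts = Counter(state[loc][j] for loc in locations)
        marginals.append(
            {a: Fraction(cnt, len(locations)) for a, cnt in counts.items()}
        )
    belief = {}
    for combo in product(*(marg.items() for marg in marginals)):
        prob = Fraction(1)
        for _, q in combo:
            prob *= q
        belief[tuple(a for a, _ in combo)] = prob
    return belief


# Example: three classes, three locations in the closed neighborhood.
state = {0: ("A", "A", "A"), 1: ("A", "B", "B"), 2: ("A", "A", "A")}
print(joint_belief(state, [0, 1, 2], 0))
print(independent_belief(state, [0, 1, 2], 0))
```

In the example, joint drawing puts no weight on the mixed opponent profile `("A", "B")`, whereas independent drawing assigns it probability 2/9. With only two classes there is a single opponent, and the two functions return the same distribution.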

In period , every agent in each class chooses an action according to the following learning rule. Every agent might receive the opportunity to revise his choice. For the sake of simplicity, we assume that this adjustment probability does not depend on the agent nor on the actions chosen in the whole population. Whenever an agent does not receive a revision opportunity, he simply repeats the action he has taken in the past. Whenever an agent receives a revision opportunity, he switches to a myopic best response. That is, the agent assumes that the action choices of other agents will remain unchanged in the next period and adopts a pure best response against it. Precisely, agent chooses a pure best response against the probability distribution on computed using formula (6) or (7). In other words, we define two myopic best-response dynamics with inertia.
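The revision rule can be sketched as follows. This is a minimal illustration, assuming a belief given as a dictionary from opponent profiles to probabilities and a payoff function `payoff(action, opponent_profile)`; the names `best_responses`, `revise`, and `revision_prob`, as well as the random tie-breaking among best responses, are assumptions of the sketch rather than part of the model's specification.

```python
import random
from fractions import Fraction


def best_responses(actions, belief, payoff):
    """Pure best responses to a belief over opponent profiles."""
    expected = {
        a: sum(q * payoff(a, opp) for opp, q in belief.items()) for a in actions
    }
    best = max(expected.values())
    return [a for a in actions if expected[a] == best]


def revise(current, actions, belief, payoff, revision_prob, rng):
    """One revision step with inertia for a single agent."""
    if rng.random() >= revision_prob:
        return current  # no revision opportunity: repeat the past action
    # Revision opportunity: switch to a myopic pure best response
    # (ties broken uniformly at random, an assumption of this sketch).
    return rng.choice(best_responses(actions, belief, payoff))


def coordination_payoff(action, opponents):
    """Hypothetical payoff: 1 if every opponent plays the same action, else 0."""
    return 1 if all(o == action for o in opponents) else 0


belief = {("A", "A"): Fraction(2, 3), ("B", "B"): Fraction(1, 3)}
rng = random.Random(0)
print(best_responses(["A", "B"], belief, coordination_payoff))
print(revise("B", ["A", "B"], belief, coordination_payoff, 0.5, rng))
```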

If probability distributions are computed using (6), the dynamics results in a Markov chain on the state space , denoted by . We refer to it as the *best-response process with joint drawing by location*. Similarly, if probability distributions are computed using (7), the dynamics results in a Markov chain on the state space , denoted by . We refer to it as the *best-response process with independent drawing*. Let (resp., ) be the collection of recurrent sets of (resp., ). For each and each subset , let be the set of actions in that appear in . The product set of all actions that appear in is . Consider a minimal *p*-best response set . Denote by (resp., ) the collection of recurrent sets of (resp., ) such that (resp., ) if and only if . Observe that and since is a minimal *p*-best response set and thus a *p*-best response set.

Following the literature (see, e.g., [11, 12]), the model is completed by adding the possibility of rare mutations or experiments on the part of agents. With fixed probability , independent across players and across time, each agent chooses an action at random. The processes with mutations are called *perturbed processes* and are denoted by and . With these mutations as part of the processes, each state of is reachable with positive probability from every other state. Hence, the perturbed processes and are irreducible and aperiodic finite-state Markov chains on . Consequently, for each , (resp., ) has a unique stationary distribution (resp., ) satisfying (resp., ). The limit stationary distribution (as the rate of mutation tends to zero) (resp., ) exists and is a stationary distribution of the unperturbed process (resp., ). The states in the support of (resp., ) are called stochastically stable states and form a subset of (resp., ). The recurrent sets appearing in the support of (or ) are those which are the easiest to reach from all other recurrent sets, with “easiest” interpreted as requiring the fewest mutations (cf. Theorem 4 in [12]).

We rely on the identification of the set of stochastically stable states developed by Ellison [6]. This identification proceeds as follows. Ellison [6] introduces sufficient conditions to have a collection of recurrent sets that contains all stochastically stable states. The analysis uses three measures: the radius, the coradius, and the modified coradius. To illustrate these concepts, consider a collection of recurrent sets . Let be the basin of attraction of , that is, the set of states from which the unperturbed process converges to with probability one. In other words, is the set of states such that it is possible without mutation to build a path, that is, a sequence of distinct states, from to . The radius of (the basin of attraction of) , denoted by , is the minimum number of mutations necessary to leave from , that is, the minimum number of mutations contained in any path from a state to a state . The coradius of (a basin of attraction of) , denoted by , is the maximum over all states of the minimum number of mutations necessary to reach , that is, the maximum over all states of the minimum number of mutations contained in any path from to a state . To compute the modified coradius of (the basin of attraction of) , consider a state and a path from to a state belonging to . The modified number of mutations of this path is obtained by subtracting from the number of mutations of the path the radius of the intermediate recurrent sets through which the path passes. The modified coradius of (a basin of attraction of) , denoted by , is the maximum over all states of the minimum modified number of mutations necessary to reach , that is, the maximum over all states of the minimum modified number of mutations associated with any path from to a state . Note that for every . Ellison [6] establishes the following result.

Theorem 3 (see [6]). *Consider a perturbed process and fix . If , then the stochastically stable states are contained in .*

Since , an alternative condition to have all stochastically stable states contained in is that .
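For reference, the sufficient conditions described above can be written compactly as follows. The symbols are assumptions made for this sketch: $\Omega$ stands for the collection of recurrent sets under consideration, and $R$, $CR$, and $CR^{*}$ stand for its radius, coradius, and modified coradius.

```latex
% Notation assumed for illustration: R = radius, CR = coradius,
% CR^* = modified coradius of the collection \Omega.
\[
  R(\Omega) > CR^{*}(\Omega)
  \;\Longrightarrow\;
  \text{all stochastically stable states are contained in } \Omega .
\]
\[
  CR^{*}(\Omega) \le CR(\Omega)
  \quad\text{so that}\quad
  R(\Omega) > CR(\Omega) \text{ is also sufficient.}
\]
```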

#### 5. Selection Results

Fix a minimal *p*-best response set . First, consider the perturbed best-response process with joint drawing by location. We give a sufficient condition to have all stochastically stable states associated with .

Theorem 4. *Let be a finite n-person game and a chordal ring of degree . Let and let be the minimal p-best response set of . If is sufficiently large and is sufficiently small, the perturbed process puts arbitrarily high probability on a subset such that .*

*Proof.* Observe that if , then the result follows. In the sequel, we assume that . We break the proof into three parts.

We give a lower bound on . Fix a state , a location , and two distinct classes and . Consider agent located in and agent located in . Assume that agent mutates: he chooses at random a strategy not contained in . Then, agent believes that the probability of being matched with agents playing a vector of actions not contained in is . Now, assume that agents in located in locations belonging to mutate and choose an action outside . Then, agent believes that the probability of being matched with agents playing a vector of actions not contained in is . Let be a probability such that at least one class of agents has an action as a pure best response to , where . By definition of a minimal *p*-best response set, we have . A transition from to any requires at least mutations (in locations belonging to and outside ), where is such that . Otherwise, agent ’s best responses to belong to since . Hence, is such that .

We give an upper bound on . To do this, it is convenient to distinguish between two situations according to the values of and . Firstly, consider the cases such that . Fix a state and a class . Assume that, in a location , agents mutate and choose actions in . Then, agent located in believes that the probability of being matched with agents playing a vector of actions contained in is . Now, consider consecutive locations in denoted by . Assume that in each of these locations agents mutate and choose actions in . Then, agent located in such that for each believes that the probability of being matched with agents playing a vector of actions contained in is . By definition of a minimal *p*-best response set, agent has at least one strategy as a pure best response to once all agents located in consecutive locations in belonging to have mutated and chosen actions in , provided that is such that . Set . Since , if in consecutive locations, , every agent mutates and chooses an action in , then every agent located in each of these locations chooses an action in . By inertia, it is then possible to reach a state in which an action profile contained in is played in consecutive locations, . From such a state, it is possible to reach a state in without additional mutation. Indeed, consider agents located in . Since , every agent of each class, , believes that the probability of being matched with agents playing a vector of actions contained in is . Then, it is possible to reach a state in which an action profile in is chosen in consecutive locations. Continuing in this fashion, we can reach a state in without additional mutation. Hence, is such that .

Secondly, consider the cases such that . Observe that since , we necessarily have . Fix a state . By the above point, we know that mutations allow reaching a state, denoted by , where an action profile in is chosen in consecutive locations, . If the number of locations belonging to is at most equal to , then every agent of each class , located in these locations believes that the probability to be matched with an agent playing an action contained in is . A transition from to a state in is possible without additional mutation and is bounded as above. It remains to deal with situations where the number of locations in is strictly greater than . In , each agent , located in these locations believes that the probability to be matched with an agent playing an action contained in is . Thus, it is possible that does not belong to nor to the recurrent set containing . However, an additional mutation is sufficient to reach a state, say , in which an action profile in is chosen in consecutive locations. To see this, consider location . If agent located in mutates and chooses an action in , then agent located in believes that the probability to be matched with an agent playing an action contained in is . Using inertia, it is possible that an action profile in is chosen in location . By a similar argument, it is possible to establish that one additional mutation may allow reaching a state in which an action profile in is chosen in consecutive locations. This argument can be applied until the number of locations in which an action profile not contained in is equal to . From such a state, it is possible to reach a state in without additional mutation. Hence, the path from to a state in passes through a sequence of intermediate recurrent sets whose radius is equal to 1. Thus, the modified coradius of is such that .

Finally, since , it holds in all cases that

In order to apply Theorem 3, it is sufficient to show that

A sufficient condition to obtain inequality (9) is that
Inequality (10) is satisfied provided that is sufficiently large since by hypothesis , and, by definition of a minimal *p*-best response set, we have .

By definition, if a minimal *p*-best response set is a singleton set, then it is a strict *p*-dominant equilibrium. The following result is an immediate application of Theorem 4.

Corollary 5. *Let be a finite n-person game and a chordal ring of degree . Let be a strict p-dominant equilibrium of , where . If is sufficiently large and is sufficiently small, the perturbed process puts arbitrarily high probability on a subset such that .*

Observe that if , Corollary 5 is similar to Corollary 2 in Ellison [6] except that it also holds for the class of asymmetric two-player games with a -dominant equilibrium.

Second, we consider the perturbed best-response process with independent drawing. We give a sufficient condition to have all stochastically stable states associated with a minimal *p*-best response set .

Theorem 6. *Let be a finite n-person game and a chordal ring of degree . Let and let be the minimal p-best response set of . If is sufficiently large and is sufficiently small, the perturbed process puts arbitrarily high probability on a subset such that .*

*Proof.* Observe that if , then the result follows. In the sequel, we assume that . As in the proof of Theorem 4, we break the proof into three parts.

We give a lower bound on . By the same reasoning as in the proof of Theorem 4, it follows that is such that .

We give an upper bound on . Observe that if , both processes and coincide. By the same reasoning as in the proof of Theorem 4, it follows that if then . Now, assume that . Fix a state and a class . Assume that, in a location , agents mutate and choose actions in . Then, agent located in believes that the probability of being matched with agents playing a vector of actions contained in is . Now, consider consecutive locations in denoted by . Assume that in each of these locations agents mutate and choose actions in . Then, agent located in such that for each believes that the probability of being matched with agents playing a vector of actions contained in is . By definition of a minimal *p*-best response set, agent has at least one strategy as a pure best response to once all agents located in consecutive locations in belonging to have mutated and chosen actions in , provided that is such that . Set . Since , if in consecutive locations every agent mutates and chooses an action in , then every agent located in each of these locations chooses an action in . By inertia, it is then possible to reach a state in which an action profile contained in is played in consecutive locations in . From such a state, it is possible to reach a state in without additional mutation. Indeed, since and , we necessarily have . Hence, is such that .

In order to conclude, it is sufficient to show that

A sufficient condition to obtain inequality (11) is that
Inequality (12) is satisfied provided that is sufficiently large since by hypothesis , and, by definition of a minimal *p*-best response set, we have .

The following result is an immediate application of Theorem 6.

Corollary 7. *Let be a finite n-person game and a chordal ring of degree . Let be a strict p-dominant equilibrium of , where . If is sufficiently large and is sufficiently small, the perturbed process puts arbitrarily high probability on a subset such that .*

#### 6. Conclusion

This paper establishes that the concept of minimal *p*-best response set is useful to study the long-run outcomes of a process when agents are arranged on a chordal ring and follow a myopic best-response rule with inertia. In particular, it allows us to obtain results for the whole class of finite *n*-person games. Even if predictions are not necessarily sharp (since a minimal *p*-best response set may become large when *p* decreases), those results make the identification of stochastically stable states easier. The paper also highlights that predictions depend on the assumptions made by agents about the matching rule. From this point of view, it is possible to establish a connection between these results and the results obtained in Durieu et al. [4]. That paper considers a fictitious play model with bounded memory and sampling as in Young [12]. Two processes are studied. On the one hand, it is assumed that each agent believes that, in every period, his opponents play independently. On the other hand, each agent believes that, in every period, the play of his opponents is correlated. Durieu et al. show that the concept of *p*-best response set allows establishing predictions about the long-run outcomes of both processes. Furthermore, as in the present paper, there exists a similar gap between predictions obtained for each process. This conveys the idea that sampling in memory and believing that opponents correlate their actions (resp., play independently) has the same effect as believing that players are drawn jointly (resp., independently) in the neighborhood to play the game.

#### Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

#### Acknowledgments

The authors would like to thank two anonymous referees. Financial support from the National Agency for Research (ANR), through the research programs “Dynamite Matching and Interactions: Theory and Experiments” (DynaMITE) ANR BLANC and “Mathématiques de la décision pour l'ingénierie physique et sociale” (MODMAD), is gratefully acknowledged.