Interactions among selfish users sharing a common transmission channel can be modeled as a noncooperative game using the game theory framework. When selfish users choose their transmission probabilities independently without any coordination mechanism, Nash equilibria usually result in a network collapse. We propose a methodology that transforms the noncooperative game into a Stackelberg game. Stackelberg equilibria of the Stackelberg game can overcome the deficiency of the Nash equilibria of the original game. A particular type of Stackelberg intervention is constructed to show that any positive payoff profile feasible with independent transmission
probabilities can be achieved as a Stackelberg equilibrium payoff profile. We discuss criteria to select an operating point of the network and informational requirements for the Stackelberg game. We relax the requirements and examine the effects of relaxation on performance.
1. Introduction
In wireless
communication networks, multiple users often share a common channel and contend
for access. To resolve the contention problem, many different medium access
control (MAC) protocols have been devised and used. Recently, the selfish
behavior of users in MAC protocols has been studied using game theory. There
have been attempts to understand the existing MAC protocols as the local
utility maximizing behavior of selfish users by reverse engineering the current
protocols (e.g., [1]).
It has also been investigated whether existing protocols are vulnerable to the
existence of selfish users who pursue their self-interest in a noncooperative manner.
Noncooperative behavior often leads to inefficient outcomes. For example, in
the 802.11 distributed MAC protocol, DCF, and its enhanced version, EDCF,
competition among selfish users can lead to an inefficient use of the shared
channel in Nash equilibria [2]. Similarly, a prisoner's dilemma phenomenon arises in
a noncooperative game for a generalized version of slotted-Aloha protocols
[3].
In general, if a game has Nash equilibria yielding low
payoffs for the players, it will be desirable for them to transform the game to
extend the set of equilibria to include better outcomes [4]. The same idea can be
applied to the game played by selfish users who compete for access to a common
medium. If competition among selfish users brings about a network collapse,
then it is beneficial for them to design a device which provides incentives to behave cooperatively. Game theory [4] discusses three types of transformation: (1) games
with contracts, (2) games with communication, and (3) repeated games.
A game is said to be with contracts if the players of
the game can communicate and bargain with each other, and enforce the agreement
with a binding contract. The main obstacle to applying this approach to wireless
networking is the distributed nature of wireless networks. To reach an agreement, users should know
the network system and be able to communicate with each other. They should also
be able to enforce the agreed plan.
A game with communication is one in which players
can communicate with each other through a mediator but they cannot write a
binding contract. In this case, a correlated equilibrium is predicted to be
played. Altman et al. [5] study correlated equilibria using a coordination
mechanism in a slotted Aloha-type scenario. Unlike the first approach, this
does not require that the actions of players be enforceable. However, to apply
this approach to the medium access problem, signals need to be conveyed from a
mediator to all users, and users need to know the correct meanings of the
signals.
A repeated game is a dynamic game in which the same
game is played repeatedly by the same players over finite or infinite periods.
Repeated interactions among the same players enable them to sustain cooperation
by punishing deviations in subsequent periods. A main challenge in applying the
idea of repeated games to wireless networks is that the users must keep track of their past
observations, detect deviations, and coordinate their actions
in order to punish deviating users.
Besides the three approaches above, another approach
widely applied to communication networks is pricing [6]. A central entity charges
prices to users in order to control their utilization of the network. Nash
equilibria with pricing schemes in an Aloha network are analyzed in [7, 8]. Implementing a pricing
scheme requires the central entity to have relevant system information as well
as users' benefits and costs, which are often their private information.
Eliciting private information often results in an efficiency loss in the
presence of the strategic behavior of users as shown in [9]. Even in the case where the entity has all the
relevant information, prices need to be computed and communicated to the users.
In this paper, we propose yet another approach using a
Stackelberg game. We introduce a network manager as an additional user and
make him access the medium according to a certain rule. Unlike the Stackelberg
game of [10] in which
the manager (the leader) chooses a certain strategy before users (followers)
make their decisions, in the proposed Stackelberg game he sets an intervention
rule first and then implements his intervention after users choose their
strategies. Alternatively, the proposed Stackelberg game can be considered as a
generalized Stackelberg game in which there are multiple leaders (users) and a
single follower (the manager) and the leaders know the response of the follower
to their decisions correctly. With appropriate choices of intervention rules,
the manager can shape the incentives of users in such a way that their selfish
behavior results in cooperative outcomes.
In the context of cognitive radio networks, [11] proposes a related
Stackelberg game in which the owner of a licensed frequency band (the leader)
can charge a virtual price for using the frequency band to cognitive radios
(followers). The virtual price signals the extent to which cognitive radios can
exploit the licensed frequency band. However, since prices are virtual, selfish
users may ignore prices when they make decisions if they can gain by doing so.
In contrast, in the Stackelberg game of this paper, the intervention of the
manager is not virtual: it reduces throughput, which
selfish users certainly care about. Hence, the intervention method provides
better grounds for the network manager to deal with the selfish behavior of
users.
Chen et al. [12, 13] use game theoretic models to study random access.
Their approach is to capture the information and implementation constraints
using the game theoretic framework and to specify utility functions so that a
desired operating point is achieved at a Nash equilibrium. If conditions under
which a certain type of dynamic adjustment play converges to the Nash
equilibrium are met, such a strategy update mechanism can be used to derive a
distributed algorithm that converges to the desired operating point. However,
this control-theoretic approach to game theory assumes that users are obedient.
In this paper, our main concern is about the selfish behavior of users who have
innate objectives. Because we start from natural utility functions and affect
them by devising an intervention scheme, we are in a better position to deal
with selfish users. Furthermore, the idea of intervention can potentially lead
to a distributed algorithm to achieve a desired operating point.
By formulating the medium access problem as a
noncooperative game, we show the following main results.
(1) Because the Nash equilibria of the noncooperative game are inefficient and/or unfair, we transform the original game into a Stackelberg game, in which any feasible outcome with independent transmission probabilities can be achieved as a Stackelberg equilibrium.
(2) A particular form of a Stackelberg intervention strategy, called total relative deviation (TRD)-based intervention, is constructed and used to achieve any feasible outcome with independent transmission probabilities.
(3) The additional amount of information flows required for the transformation is relatively moderate, and it can be further reduced without large efficiency losses.
The rest of this paper is organized as follows.
Section 2 introduces the model and formulates it as a noncooperative game
called the contention game. Nash equilibria of the contention game are
characterized, and it is shown that they typically yield suboptimal
performance. In Section 3, we transform the contention game into another
related game called the Stackelberg contention game by introducing an
intervening manager. We show that the manager can implement any transmission
probability profile as a Stackelberg equilibrium using a class of intervention
functions. Section 4 discusses natural candidates for the target transmission
probability profile selected by the manager. In Section 5, we discuss the flows
of information required for our results and examine the implications of some
relaxations of the requirements on performance. Section 6 provides numerical
results, and Section 7 concludes the paper.
2. Contention Game Model
We consider a
simple contention model in which multiple users share a communication channel
as in [14]. A user
represents a transmitter-receiver pair. Time is divided into slots of the same duration.
Every user has a packet to transmit and can send the packet or wait. If there
is only one transmission, the packet is successfully transmitted within the
time slot. If more than one user transmits a packet simultaneously in a slot, a
collision occurs and no packet is transmitted.
We summarize the assumptions of our contention
model.
(1) A fixed set of users interacts over a given period of time (or a session).
(2) Time is divided into multiple slots, and slots are synchronized.
(3) A user always has a packet to transmit in every slot.
(4) The transmission of a packet is completed within a slot.
(5) A user transmits its packet with the same probability in every slot. There is no adjustment in the transmission probabilities during the session. This excludes coordination among users, for example, using time division multiplexing.
(6) There is no cost of transmitting a packet.
We formulate the medium access problem as a
noncooperative game to analyze the behavior of selfish users. We denote the set
of users by $N = \{1, \ldots, n\}$. Because we assume that a user uses the same
transmission probability over the entire session, the strategy of a user is its
transmission probability, and we denote the strategy of user $i$ by $p_i$ and the
strategy space of user $i$ by $P_i = [0, 1]$ for all $i \in N$.
Once the users decide their transmission
probabilities, a strategy profile can be constructed. The users transmit their
packets independently according to their transmission probabilities, and thus
the strategy profile determines the probability of a successful transmission by
user $i$ in a slot. A
strategy profile can be written as a vector $p = (p_1, \ldots, p_n)$ in $P = P_1 \times \cdots \times P_n$, the set of strategy profiles. The payoff function of
user $i$, $u_i : P \to \mathbb{R}$, is defined as
$$u_i(p) = k_i p_i \prod_{j \neq i} (1 - p_j), \tag{1}$$
where $k_i > 0$ measures the
value of transmission of user $i$ and $p_i \prod_{j \neq i} (1 - p_j)$ is the
probability of successful transmission by user $i$.
We define the contention game by the tuple $\Gamma = \langle N, (P_i), (u_i) \rangle$. If the users choose their transmission probabilities
taking others' transmission probabilities as given, then the resulting outcome
can be described by the solution concept of Nash equilibrium [4]. We first characterize the
Nash equilibria of the contention game.
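Proposition 1. A strategy profile $p \in P$ is a Nash equilibrium of the contention game $\Gamma$ if and only if $p_i = 1$ for at least one $i \in N$.
Proof.
In the contention game, the best response correspondence of user $i$ assumes two
sets: $b_i(p_{-i}) = \{1\}$ if $\prod_{j \neq i}(1 - p_j) > 0$ and $b_i(p_{-i}) = [0, 1]$ if $\prod_{j \neq i}(1 - p_j) = 0$. Suppose that user $i$ chooses $p_i = 1$. Then it is playing its best response, and because $\prod_{k \neq j}(1 - p_k) = 0$ for every $j \neq i$, the other users are also playing their best responses, which establishes the sufficiency part. To prove the
necessity part, suppose that $p$ is a Nash
equilibrium and $p_i < 1$ for all $i \in N$. Since $\prod_{j \neq i}(1 - p_j) > 0$, $p_i$ is not a best
response to $p_{-i}$, which is a contradiction.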
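As an illustration of Proposition 1, the following minimal sketch checks the Nash property numerically by grid search over unilateral deviations. It assumes the payoff form $u_i(p) = k_i p_i \prod_{j \neq i}(1 - p_j)$; the function names, the valuations, and the two test profiles are hypothetical choices made for this example.

```python
from math import prod

def payoffs(p, k):
    """Payoff of each user: k_i * p_i * prod_{j != i} (1 - p_j) (assumed form)."""
    n = len(p)
    return [k[i] * p[i] * prod(1 - p[j] for j in range(n) if j != i)
            for i in range(n)]

def is_nash(p, k, grid=101, tol=1e-12):
    """Return True if no user gains from a unilateral deviation on a grid of [0, 1]."""
    base = payoffs(p, k)
    for i in range(len(p)):
        for step in range(grid):
            q = list(p)
            q[i] = step / (grid - 1)
            if payoffs(q, k)[i] > base[i] + tol:
                return False
    return True

k = [1.0, 2.0, 0.5]  # hypothetical valuations
one_user_at_one = is_nash([1.0, 0.3, 0.6], k)   # some user transmits with probability 1
nobody_at_one = is_nash([0.4, 0.3, 0.6], k)     # every user transmits with probability < 1
print(one_user_at_one, nobody_at_one)
```

A grid search can only suggest, not prove, the equilibrium property, but it agrees with the proposition on these two profiles: the first passes and the second fails because any user can gain by raising its probability to 1.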
If a Nash equilibrium $p$ has only one user $i$ such that $p_i = 1$, then $u_i(p) > 0$ and $u_j(p) = 0$ for all $j \neq i$, where $u_i(p)$ can be as large
as $k_i$. If there are at least two users with the
transmission probability equal to 1, then we have $u_i(p) = 0$ for all $i \in N$. Let $U_i = \{u \in \mathbb{R}^n : 0 \leq u_i \leq k_i,\ u_j = 0 \text{ for all } j \neq i\}$. Then, the set of Nash equilibrium payoffs is given
by
$$U(E) = \bigcup_{i \in N} U_i. \tag{2}$$
Given the game $\Gamma$, we can define the set of feasible payoffs by
$$U = \{(u_1(p), \ldots, u_n(p)) : p \in P\}. \tag{3}$$
A payoff profile $u$ in $U$ is Pareto
efficient if there is no other element $v$ in $U$ such that $v \geq u$ and $v_i > u_i$ for at least
one user $i$. We also call a strategy profile $p$ Pareto
efficient if $u(p) = (u_1(p), \ldots, u_n(p))$ is a Pareto
efficient payoff profile. Let $U^{PE}$ be the set of
Pareto efficient payoffs.
There are $n$ points in $U(E) \cap U^{PE}$, namely, $\bar{u}^i$ such that $\bar{u}^i_i = k_i$ and $\bar{u}^i_j = 0$ for all $j \neq i$, for $i = 1, \ldots, n$. These are the corner points of $U^{PE}$ in which only
one user receives a positive payoff. Therefore, Nash equilibrium payoff profiles
are either inefficient or unfair. Moreover, since $p_i = 1$ is a weakly
dominant strategy for every user $i$, in the sense that $u_i(1, p_{-i}) \geq u_i(p_i, p_{-i})$ for all $p \in P$, the most likely Nash equilibrium is the one in which $p_i = 1$ for all $i \in N$. At the most likely Nash equilibrium, every user
always transmits its packet, and as a result no packet is successfully transmitted.
Hence, the selfish behavior of the users is likely to lead to a network
collapse, which gives zero payoff to every user, as argued also in [15].
Figure 1 presents the payoff spaces of two homogeneous
users with $k_1 = k_2 = 1$. If coordination between the two users is possible,
they can achieve any payoff profile in the dark area of Figure 1(a). For
example, $(1/2, 1/2)$ can be achieved
by arranging user 1 to transmit only in odd-numbered slots and user 2 only in
even-numbered slots. This kind of coordination can be supported through
direct communications among the users or mediated communications. However, if
such coordination is not possible and each user has to choose one transmission probability,
Nash equilibria yield the payoff profiles in Figure 1(b). The set of feasible
payoffs of the contention game is shown as the dark area of Figure 1(c). The
set of Pareto-efficient payoff profiles is the frontier of that area. The lack
of coordination makes the set of feasible payoffs smaller, reducing the area of
Figure 1(a) to that of Figure 1(c). Because the typical Nash equilibrium payoff
is $(0, 0)$, the next section develops a transformation of the
contention game, and the set of equilibria of the resulting Stackelberg game is
shown to expand to the entire area of Figure 1(c).
Figure 1: Payoff profiles with two homogeneous users with $k_1 = k_2 = 1$. (a) The set of feasible payoffs when coordination between two users is possible. (b) The set of Nash equilibrium payoffs. (c) The set of feasible payoffs with independent transmission probabilities.
Figure 2: Schematic
illustration of (a) the contention game and (b) the Stackelberg contention
game. (i), (ii), and (iii) represent the order of moves in the Stackelberg
contention game, and the dotted arrows represent the flows of information
required for the Stackelberg contention game.
3. Stackelberg Contention Game
We introduce a
network manager as a special kind of user in the contention game and call him
user 0. As a user, the manager can access the channel with a certain
transmission probability. However, the manager is different from the users in
that he can choose his transmission probability depending on the transmission
probabilities of the users. This ability of the manager enables him to act as
the police. If the users access the channel excessively, the manager can
intervene and punish them by choosing a high transmission probability, thus
reducing the success rates of the users.
Formally, the strategy of the manager is an intervention
function $g : P \to [0, 1]$, which gives his transmission probability when the
strategy profile of the users is $p$. $g(p)$ can be
interpreted as the level of intervention or punishment by the manager when the
users choose $p$. Note that the level of intervention by the manager
is the same for every user. We assume that the manager has a specific
“target” strategy profile $\tilde{p}$, that his transmission has no value to him (as well
as to others), and that he is benevolent. One representation of his objective
is the payoff function of the following form:
$$u_0(g, p) = \begin{cases} -g(p) & \text{if } p = \tilde{p}, \\ -\infty & \text{otherwise.} \end{cases}$$
This payoff function means that the manager wants the users to operate at the
target strategy profile $\tilde{p}$ with the
minimum level of intervention.
We call the transformed game the Stackelberg
contention game because the manager chooses his strategy before the
users make their decisions on the transmission probabilities. In this sense,
the manager can be thought of as a Stackelberg leader and the users as
followers. The specific timing of the Stackelberg contention game can be
outlined as follows.
(1) The network manager determines his intervention function.
(2) Knowing the intervention function of the manager, the users choose their transmission probabilities simultaneously.
(3) Observing the strategy profile of the users, the manager determines the level of intervention using his intervention function.
(4) The transmission probabilities of the manager and the users determine their payoffs.
Timing 1 happens before the session starts. Timing 2
occurs at the beginning of the session whereas timing 3 occurs when the manager
knows the transmission probabilities of all the users. Therefore, there is a
time lag between the time when the session begins and when the manager starts
to intervene. Payoffs can be calculated as the probability of successful
transmission averaged over the entire session, multiplied by valuation. If the
interval between timing 2 and timing 3 is short relative to the duration of the
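session, the payoff of user $i$ can be
approximated as the payoff during the intervention using the following payoff
function:
$$u_i(p_0, p) = k_i p_i (1 - p_0) \prod_{j \neq i} (1 - p_j),$$
where $p_0$ denotes the transmission probability of the manager.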
The transformation of the contention game into the
Stackelberg contention game is schematically shown in Figure 2. The figure
shows that the main role of the manager is to set the intervention rule and to
implement it. The users still behave noncooperatively, maximizing their payoffs,
and the intervention of the manager affects their selfish behavior even though
the manager neither directly controls their behavior nor continuously
communicates with the users to convey coordination or price signals.
In the Stackelberg routing game of [10], the strategy spaces of the
manager and a user coincide. If that is the case in the Stackelberg contention
game, that is, if the manager chooses a single transmission probability before
the users choose theirs, then this intervention only makes the channel lossy
but it does not provide incentives for users not to choose the maximum possible
transmission probability. Hence, in order to provide an incentive to choose a
smaller transmission probability, the manager needs to vary his transmission
probability depending on the transmission probabilities of the users.
A Stackelberg game is analyzed using a backward
induction argument. The leader predicts the Nash equilibrium behavior of the
followers given his strategy and chooses the best strategy for him. The same
argument can be applied to the Stackelberg contention game. Once the manager
decides his strategy $g$ and commits to
implement his transmission probability according to $g$, the rest of the Stackelberg contention game (timing
2–4) can be viewed as a noncooperative game played by the users. Given the
intervention function $g$, the payoff function of user $i$ can be written
as
$$u_i^g(p) = k_i p_i \bigl(1 - g(p)\bigr) \prod_{j \neq i} (1 - p_j).$$
In essence, the role of the
manager is to change the noncooperative game that the users play from the
contention game $\Gamma$ to a new game $\Gamma_g = \langle N, (P_i), (u_i^g) \rangle$, which we call the contention game with intervention $g$. Understanding the noncooperative behavior of the
users given the intervention function $g$, the manager will choose $g$ that maximizes
his payoff.
We now define an equilibrium concept for the
Stackelberg contention game.
Definition 1. An intervention function of the manager $g$ and a profile of the transmission probabilities of the users $p$ constitute a Stackelberg equilibrium if (i) $p$ is a Nash equilibrium of the contention game with intervention $\Gamma_g$ and (ii) $p = \tilde{p}$ and $g(\tilde{p}) = 0$.
Combining (i)
and (ii), an equivalent definition is that $(g, \tilde{p})$ is a Stackelberg equilibrium if $\tilde{p}$ is a Nash
equilibrium of $\Gamma_g$ and $g(\tilde{p}) = 0$. Condition (i) says that once the manager chooses his
strategy, the users will play a Nash equilibrium strategy profile in the
resulting game, and condition (ii) says that expecting the Nash equilibrium
strategy profile of the users, the manager chooses his strategy that achieves
his objective.
3.1. Stackelberg Equilibrium with TRD-Based Intervention
As we have
mentioned earlier, the manager can choose only one level of intervention that
affects the users equally. A question that arises is which strategy profile the
manager can implement as a Stackelberg equilibrium with one level of
intervention for every user. We answer this question constructively. We propose
a specific form of an intervention function with which the manager can attain
any strategy profile $\tilde{p}$ with $0 < \tilde{p}_i < 1$ for all $i \in N$. The basic idea of this result is that because the
strategy of the manager is not a single intervention level but a function whose
value depends on the strategies of the users, he can discriminate among the users by
reacting differently to their transmission probabilities in choosing the level
of intervention. Therefore, even though the realized level of intervention is
the same for every user, the manager can induce the users to choose different
transmission probabilities.
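To construct such an intervention function, we first
define the TRD of $p$ from $\tilde{p}$ by
$$h(p) = \sum_{i \in N} \frac{p_i - \tilde{p}_i}{\tilde{p}_i} = \frac{p_1}{\tilde{p}_1} + \cdots + \frac{p_n}{\tilde{p}_n} - n.$$
Since $g$ determines the
transmission probability of the manager, its range should lie in $[0, 1]$. To satisfy this constraint, we define the TRD-based
intervention function by
$$g^*(p) = [h(p)]_0^1,$$
where the operator $[x]_a^b = \min\{\max\{x, a\}, b\}$ is used to
obtain the “trimmed” value of TRD between 0 and 1.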
The TRD-based intervention can be interpreted in the
following way. The manager sets the target at $\tilde{p}$. As long as the users choose small transmission
probabilities so that the TRD of $p$ from $\tilde{p}$ does not exceed
zero, the manager does not intervene. If it is larger than zero, the manager will
respond to a one-unit increase in $p_i$ by increasing his transmission probability by $1/\tilde{p}_i$ units until the
TRD reaches 1. The manager determines the degree of punishment based on the
target transmission probability profile. If he wants a user to transmit with a
low probability, then his punishment against its deviation is strong.
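The interpretation above can be made concrete with a short sketch. It assumes the TRD is the sum of relative deviations $(p_i - \tilde{p}_i)/\tilde{p}_i$ and that the intervention level is this sum trimmed to $[0, 1]$; the function names and the target profile are hypothetical.

```python
def trd(p, target):
    """Total relative deviation: sum_i (p_i - target_i) / target_i (assumed form)."""
    return sum((pi - ti) / ti for pi, ti in zip(p, target))

def intervention(p, target):
    """TRD-based intervention level: the TRD trimmed to the interval [0, 1]."""
    return min(max(trd(p, target), 0.0), 1.0)

target = [0.25, 0.5]  # hypothetical target profile

at_target = intervention([0.25, 0.5], target)     # no deviation, no intervention
below = intervention([0.125, 0.25], target)       # under-use is not punished
user1_up = intervention([0.375, 0.5], target)     # user 1 exceeds its target by 0.125
user2_up = intervention([0.25, 0.75], target)     # user 2 exceeds its target by 0.25
saturated = intervention([0.75, 1.0], target)     # large deviations are trimmed to 1
print(at_target, below, user1_up, user2_up, saturated)
```

In this example, user 1 (the user with the lower target) triggers the same intervention level with a deviation of 0.125 as user 2 does with a deviation of 0.25, which matches the statement that punishment is stronger for users with low target probabilities.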
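Proposition 2. $(g^*, \tilde{p})$ constitutes a Stackelberg equilibrium.
Proof.
We need to check two things. First, $\tilde{p}$ is a Nash
equilibrium of $\Gamma_{g^*}$. Second, $g^*(\tilde{p}) = 0$. It is straightforward to confirm the second. To show
the first, the payoff function of user $i$ given others'
strategies $\tilde{p}_{-i}$ is
$$u_i^{g^*}(p_i, \tilde{p}_{-i}) = \begin{cases} k_i p_i \prod_{j \neq i} (1 - \tilde{p}_j) & \text{if } p_i \leq \tilde{p}_i, \\ k_i p_i \left(2 - \dfrac{p_i}{\tilde{p}_i}\right) \prod_{j \neq i} (1 - \tilde{p}_j) & \text{if } \tilde{p}_i < p_i < 2\tilde{p}_i, \\ 0 & \text{if } p_i \geq 2\tilde{p}_i. \end{cases}$$
It can be seen from the above expression that $u_i^{g^*}(p_i, \tilde{p}_{-i})$ is increasing
on $[0, \tilde{p}_i)$, reaches a peak at $p_i = \tilde{p}_i$, is decreasing on $(\tilde{p}_i, \min\{2\tilde{p}_i, 1\})$, and then stays at 0 on $[2\tilde{p}_i, 1]$ whenever $2\tilde{p}_i \leq 1$. Therefore, user $i$'s best
response to $\tilde{p}_{-i}$ is $\tilde{p}_i$ for all $i \in N$, and thus $\tilde{p}$ constitutes a
Nash equilibrium of the contention game with TRD-based intervention, $\Gamma_{g^*}$.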
Corollary 1. Any feasible payoff profile $u \in U$ of the contention game with $u_i > 0$ for all $i \in N$ can be achieved by a Stackelberg equilibrium.
Corollary 1 resembles the Folk theorem of repeated
games [4] in that it
claims that any feasible outcome can be attained as an equilibrium. Incentives
not to deviate from a certain operating point are provided by the manager's
intervention in the Stackelberg contention game, while in a repeated game players
do not deviate since a deviation is followed by punishment from other players.
3.2. Nash Equilibria of the Contention Game with TRD-Based Intervention
In Proposition
2, we have seen that $\tilde{p}$ is a Nash
equilibrium of the contention game with TRD-based intervention. However, if
other Nash equilibria exist, the outcome may be different from the one that the
manager intends. In fact, any strategy profile $p$ with $p_i = 1$ for at least
one $i$ is still a Nash
equilibrium of $\Gamma_{g^*}$. The following proposition characterizes the set of
Nash equilibria of $\Gamma_{g^*}$ that are
different from those of $\Gamma$.
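Proposition 3.
Consider a strategy profile $p$ with $p_i < 1$ for all $i \in N$. $p$ is a Nash
equilibrium of the contention game with TRD-based intervention if and only if
either
$$p = \tilde{p}$$
or
$$\sum_{j \neq i} \frac{p_j}{\tilde{p}_j} \geq n + 1 \quad \text{for all } i \in N.$$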
Proof.
See Appendix A.
Transforming $\Gamma$ to $\Gamma_{g^*}$ does not
eliminate the Nash equilibria of the contention game. Rather, the set of Nash
equilibria expands to include two classes of new equilibria. The first Nash
equilibrium of Proposition 3 is the one that the manager intends the users to
play. The second class of Nash equilibria are those in which the sum of
relative deviations of other users is already so large that no matter how
small a transmission probability user $i$ chooses, the
level of intervention stays the same at 1.
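The description of the second class of equilibria suggests a direct test: for every user, set that user's probability to zero and check whether the intervention level still equals 1. The sketch below implements this test under the assumption that the intervention is the trimmed sum of relative deviations; the function names and the numerical profiles are hypothetical.

```python
def intervention(p, target):
    """TRD-based intervention: sum of relative deviations trimmed to [0, 1]."""
    h = sum((pi - ti) / ti for pi, ti in zip(p, target))
    return min(max(h, 0.0), 1.0)

def stuck_at_full_intervention(p, target):
    """True if the intervention level stays at 1 even when any single user
    lowers its own transmission probability all the way to 0."""
    for i in range(len(p)):
        q = list(p)
        q[i] = 0.0
        if intervention(q, target) < 1.0:
            return False
    return True

target = [0.25, 0.25, 0.25]  # hypothetical target profile
aggressive = stuck_at_full_intervention([0.75, 0.75, 0.75], target)
moderate = stuck_at_full_intervention([0.5, 0.5, 0.25], target)
print(aggressive, moderate)
```

Because the TRD is increasing in each user's probability, checking the deviation to zero covers every unilateral reduction. In the first profile no single user can lower the intervention level, so every user is stuck with zero payoff; in the second profile a unilateral reduction is rewarded.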
Since $\tilde{p}$ is chosen to
satisfy $0 < \tilde{p}_i < 1$ for all $i$ and $g^*$ satisfies $g^*(\tilde{p}) = 0$, it follows that $u_i^{g^*}(\tilde{p}) > 0$ for all $i$. (Since we mostly
consider the TRD-based intervention function $g^*$, we will use $u_i$ instead of $u_i^{g^*}$ when there is
no confusion.) For the second class of Nash
equilibria in Proposition 3, $u_i(p) = 0$ for all $i$ because $g^*(p) = 1$. Therefore, the payoff profile of the second class of Nash
equilibria is Pareto dominated by that of the intended Nash equilibrium
in that the intended Nash equilibrium yields a higher payoff for every user
compared to the second class of Nash equilibria.
The same conclusion holds for Nash equilibria with
more than one user with transmission probability 1 because every user gets zero
payoff. Finally, the remaining Nash equilibria are those with exactly one user
with transmission probability 1. Suppose that $p_i = 1$. Then the highest payoff for user $i$ is achieved
when $p_j = 0$ for all $j \neq i$. Denoting this strategy profile by $e^i$, the payoff profile of $e^i$ is Pareto
dominated by that of $\tilde{p}$ if $\tilde{p}_i \prod_{j \neq i} (1 - \tilde{p}_j) \geq 1 - g^*(e^i)$.
3.3. Reaching the Stackelberg Equilibrium
We have seen
that there are multiple Nash equilibria of the contention game with TRD-based
intervention and that the Nash equilibrium $\tilde{p}$ in general
yields higher payoffs to the users than other Nash equilibria. If the users are
aware of the welfare properties of different Nash equilibria, they will tend to
select $\tilde{p}$.
Suppose that the users play the second class of Nash
equilibria in Proposition 3 for some reason. If the Stackelberg contention game
is played repeatedly and the users anticipate that the strategy profile of the
other users will be the same as that of the last period, then it can be shown
that under certain conditions there is a sequence of intervention functions
convergent to $g^*$ that the manager
can employ to have the users reach the intended Nash equilibrium $\tilde{p}$, thus approaching the Stackelberg equilibrium.
Proposition 4. Suppose that at
the manager
chooses the intervention function and that the
users play a Nash equilibrium of the second
class.
Without loss of generality, the users are enumerated
so that the following holds:
Suppose further that for each , either or holds.
At ; Define Assume that the manager employs
the intervention function where and that user chooses as a best
response to given .
Then for all and .
Proof.
See Appendix B.
The reason that
no user has an incentive to deviate from the second class of Nash equilibria is
that since others use high transmission probabilities, the TRD is over 1 no
matter what transmission probability a user chooses. Since the punishment level
is always 1, a reduction of the transmission probability by a user is not
rewarded by a decreased level of intervention. If the relative deviations of the users' current strategy profile from the target $\tilde{p}$ are not too
dispersed, the manager can successively adjust down the effective range of
punishment so that he can react to the changes in the strategies of the users.
Proposition 4 shows that this procedure succeeds in making the strategy profile
of the users converge to the intended Nash equilibrium.
4. Target Selection Criteria of the Manager
So far we have
assumed that the manager has a target strategy profile $\tilde{p}$ and examined
whether he can find an intervention function that implements it as a Stackelberg
equilibrium. This section discusses selection criteria that the manager can use
to choose the target strategy profile. To address this issue, we rely on
cooperative game theory because a reasonable choice of the manager should have
a close relationship to the likely outcome of bargaining among the users if
bargaining were possible for them [4]. The absence of communication opportunities among the
users prevents them from engaging in bargaining or from directly coordinating
with each other.
4.1. Nash Bargaining Solution
The pair $(F, v)$ is an $n$-person
bargaining problem where $F$ is a closed and
convex subset of $\mathbb{R}^n$, representing the set of feasible payoff allocations,
and $v = (v_1, \ldots, v_n)$ is the
disagreement payoff allocation. Suppose that there exists $y \in F$ such that $y_i > v_i$ for every $i$.
Definition 2. $x^*$ is the Nash bargaining solution for an $n$-person bargaining problem $(F, v)$ if it is the unique Pareto efficient vector that solves
$$\max_{x \in F,\; x \geq v} \; \prod_{i=1}^{n} (x_i - v_i).$$
Consider the contention game $\Gamma$. $(U, \mathbf{0})$ can be regarded
as an $n$-person
bargaining problem where $U$ is defined in
(3) and $\mathbf{0} = (0, \ldots, 0)$ is the
disagreement point. The vector $\mathbf{0}$ is the natural
disagreement point because it is a Nash equilibrium payoff as well as the
minimax value for each user. The only departure from the standard theory is
that the set of feasible payoffs is not convex.
(We do not allow public randomization among users,
which requires coordination among them.) However,
we can carry the definition of the Nash bargaining solution to our setting as
in [15].
Since the manager knows the structure of the
contention game, he can calculate the Nash bargaining solution $u^*$ for $(U, \mathbf{0})$ and find the
strategy profile $\tilde{p}$ that yields $u^*$. Then the manager can implement $\tilde{p}$ by choosing $g^*$ based on $\tilde{p}$. Notice that the presence of the manager does not
decrease the payoffs of the users because $g^*(\tilde{p}) = 0$. The Nash
bargaining solution for $(U, \mathbf{0})$ has the
following simple form.
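Proposition 5. The payoff profile $u^*$ with $u_i^* = \frac{k_i}{n} \left(1 - \frac{1}{n}\right)^{n-1}$ for all $i \in N$ is the Nash bargaining solution for $(U, \mathbf{0})$, and it is attained by $p_i = 1/n$ for all $i \in N$.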
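Proof.
The maximization problem in the definition of the Nash bargaining solution can be written
as
$$\max_{u \in U} \; \prod_{i=1}^{n} u_i.$$
Since any $u \in U$ satisfies $u_i = k_i p_i \prod_{j \neq i} (1 - p_j)$ for some $p \in P$, the above problem can be expressed in terms of $p$:
$$\max_{p \in P} \; \prod_{i=1}^{n} k_i p_i (1 - p_i)^{n-1}.$$
The logarithm of the objective
function is strictly concave in $p$, and the first-order optimality condition gives $p_i = 1/n$ for all $i \in N$.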
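The first-order condition used in the proof can be spelled out as follows. This is a worked sketch that assumes the payoff form $u_i(p) = k_i p_i \prod_{j \neq i}(1 - p_j)$, so that each factor $(1 - p_i)$ appears in the payoffs of the other $n - 1$ users.

```latex
\begin{align*}
\log \prod_{i=1}^{n} u_i(p)
  &= \sum_{i=1}^{n} \log k_i
   + \sum_{i=1}^{n} \bigl[ \log p_i + (n-1) \log (1 - p_i) \bigr], \\
\frac{\partial}{\partial p_i} \log \prod_{l=1}^{n} u_l(p)
  &= \frac{1}{p_i} - \frac{n-1}{1 - p_i} = 0
  \quad \Longrightarrow \quad p_i = \frac{1}{n}, \\
u_i\!\left(\tfrac{1}{n}, \ldots, \tfrac{1}{n}\right)
  &= \frac{k_i}{n} \left( 1 - \frac{1}{n} \right)^{n-1}.
\end{align*}
```

The valuations $k_i$ enter the logarithm only as additive constants, which is why they do not affect the maximizer.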
The Nash
bargaining solution for $(U, \mathbf{0})$ treats every
user equally in that it specifies the same transmission probability for every
user. Therefore, the manager does not need to know
the vector of the values of transmission $k = (k_1, \ldots, k_n)$ to implement
the Nash bargaining solution. The Nash bargaining solution
coincides with the Kalai-Smorodinsky solution [16] because the maximum payoff
for user $i$ is $k_i$ and the Nash
bargaining solution is the unique efficient payoff profile in which each user
receives a payoff proportional to its maximum feasible payoff.
If the manager wants to treat the users with discrimination,
he can use the generalized Nash product $\prod_{i=1}^{n} (x_i - v_i)^{\omega_i}$ as the maximand to find a nonsymmetric
Nash bargaining solution, where $\omega_i > 0$ represents the
weight for user $i$. One example of the weights is the valuation of the
users. (If $k_i$ is private
information, it would be interesting to construct a mechanism that induces users
to reveal their true values $k_i$.) The nonsymmetric Nash
bargaining solution for $(U, \mathbf{0})$ can be shown to
be achieved by $p_i = \omega_i / \sum_{j \in N} \omega_j$ for all $i \in N$ using a
method similar to the proof of Proposition 5.
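The weighted case can be sanity-checked numerically. The sketch below assumes the payoff form $u_i(p) = k_i p_i \prod_{j \neq i}(1 - p_j)$ and the candidate $p_i = \omega_i / \sum_j \omega_j$ obtained from the first-order condition of the log of the generalized Nash product; the valuations, weights, and function name are hypothetical, and random sampling is only an informal check rather than a proof of optimality.

```python
import math
import random

def log_weighted_nash_product(p, k, w):
    """Log of the generalized Nash product prod_i u_i(p)^{w_i} with disagreement point 0."""
    n = len(p)
    total = 0.0
    for i in range(n):
        u_i = k[i] * p[i] * math.prod(1 - p[j] for j in range(n) if j != i)
        total += w[i] * math.log(u_i)
    return total

k = [1.0, 2.0, 3.0]   # hypothetical valuations
w = [1.0, 2.0, 3.0]   # hypothetical weights (here equal to the valuations)

candidate = [wi / sum(w) for wi in w]
best = log_weighted_nash_product(candidate, k, w)

random.seed(0)
trials = [[random.uniform(0.01, 0.99) for _ in w] for _ in range(10000)]
beaten = any(log_weighted_nash_product(p, k, w) > best + 1e-12 for p in trials)
print(candidate, beaten)
```

None of the sampled profiles improves on the candidate, which is consistent with the strict concavity of the log objective. Note also that the candidate probabilities sum to 1.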
4.2. Coalition-Proof Strategy Profile
If some of the
users can communicate and collude effectively, the network manager may want to
choose a strategy profile which is self-enforcing even in the presence of
coalitions. Since we define a user as a transmitter-receiver pair, collusion
may occur when a single transmitter sends packets to several destinations and
controls the transmission probabilities of several users.
Given the set of users $N$, a coalition is any nonempty subset $S$ of $N$. Let $p_S$ be the strategy
profile of the users in $S$.
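Definition 3. $p$ is coalition-proof with respect to a coalition $S$ in a noncooperative game $\langle N, (P_i), (u_i) \rangle$ if there does not exist $p'_S$ such that $u_i(p'_S, p_{-S}) \geq u_i(p)$ for all $i \in S$ and $u_i(p'_S, p_{-S}) > u_i(p)$ for at least one user $i \in S$.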
By definition, $p$ is
coalition-proof with respect to the grand coalition $S = N$ if and only if $p$ is Pareto
efficient. If $p$ is a Nash
equilibrium, then it is coalition-proof with respect to any one-person
“coalition.” The noncooperative game of our interest is the contention game
with TRD-based intervention $\Gamma_{g^*}$.
Proposition 6. $\tilde{p}$ is coalition-proof with respect to a two-person coalition $S = \{i, j\}$ in the contention game with TRD-based intervention $\Gamma_{g^*}$ if and only if $\tilde{p}_i + \tilde{p}_j \leq 1$.
Proof.
See Appendix C.
The proof of
Proposition 6 shows that if $\tilde{p}_i + \tilde{p}_j > 1$ then users $i$ and $j$ can jointly
reduce their transmission probabilities to increase their payoffs at the same
time. For example, suppose that users 1 and 2 are controlled by the same
transmitter and that the manager selects the target with and . Then and . Suppose that the two users jointly deviate to . Then the new payoffs are and , which is strictly better for both users. A decrease
in and at the same
time also increases the payoffs of all the users not belonging to the
coalition, which implies that a target with is not Pareto
efficient. This observation leads to the following corollary.
Corollary 2. If
is Pareto
efficient in the contention game with TRD-based intervention , then it is coalition-proof with respect to any
two-person coalition.
In fact, we can generalize the above corollary and
provide a stronger statement.
Proposition 7.
is Pareto
efficient in the contention game with TRD-based intervention if and only if
it is coalition-proof with respect to any coalition.
Proof.
See Appendix D.
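The joint deviation discussed after Proposition 6 can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes the contention payoff to be u_i(p) = k_i p_i ∏_{j≠i} (1 − p_j) with k_i = 1, and it assumes that the intervention level remains zero when users lower their transmission probabilities below the target. The function name `payoffs` and the numbers are hypothetical and are not the values used in the paper's example.

```python
from math import prod

def payoffs(p, k=None):
    """Contention-game payoffs u_i = k_i * p_i * prod_{j != i} (1 - p_j).

    Assumes no intervention is triggered (hypothetical simplification).
    """
    k = k or [1.0] * len(p)
    return [
        k[i] * p[i] * prod(1 - p[j] for j in range(len(p)) if j != i)
        for i in range(len(p))
    ]

# Hypothetical target in which users 1 and 2 together exceed probability 1.
target = [0.6, 0.6, 0.1]
# Users 1 and 2 jointly lower their transmission probabilities.
deviation = [0.5, 0.5, 0.1]

before = payoffs(target)
after = payoffs(deviation)
for i, (b, a) in enumerate(zip(before, after), start=1):
    print(f"user {i}: {b:.3f} -> {a:.3f}")
```

In this instance the two colluding users and the outside user all gain, which is the pattern the discussion attributes to targets that are not Pareto efficient.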
5. Informational Requirement and Its Relaxation
We have
introduced and analyzed the contention game and the Stackelberg contention game
with TRD-based intervention. In this section, we discuss what the players of
each game need to know in order to play the corresponding equilibrium.
5.1. Contention Game and Nash Equilibrium
In a general
noncooperative game, each user needs to know, or predict correctly, the
strategy profile of others in order to find its best response strategy. In the
contention game with the payoff function , it suffices for user to know the
sign of , that is, whether it is positive or zero, to
calculate its best response. On the other hand, is a weakly
dominant strategy for any user , which means setting is weakly
better no matter what strategies other users choose. Hence, the Nash
equilibrium does not
require any knowledge of others' strategies.
5.2. Stackelberg Contention Game with TRD-Based Intervention and Stackelberg Equilibrium
Considering the
timing of the Stackelberg contention game outlined in Section 3, we can list
the following requirements on the manager and the users for the Stackelberg
equilibrium to be played.
Requirement M
Once the users choose the transmission probabilities, the manager
observes the strategy profile of the users.
The manager needs to decide the level of intervention
as a function of the transmission probabilities of the users. If the manager
can distinguish the access of each user and has sufficiently many observations
to determine the transmission probability of each user, then this requirement
will be satisfied. If the manager can observe the channel state (idle, success,
and collision) and identify the users of successfully transmitted packets, he
can estimate the transmission probability of each user in the following way.
First, he can obtain an estimate of by calculating
the frequency of idle slots, called . Second, he can obtain an estimate of by calculating
the frequency of slots in which user succeeds in
transmitting its packet, called . Finally, an estimate of can be obtained
by solving for .
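The three-step estimation procedure above can be sketched in code. This is a minimal illustration, assuming that the idle probability is ∏_j (1 − p_j) and that the success probability of user i is p_i ∏_{j≠i} (1 − p_j), so that the ratio of the two frequencies equals p_i / (1 − p_i) and p_i can be recovered as success frequency divided by (idle frequency + success frequency). The function names, the slot encoding, and the simulated data are hypothetical.

```python
import random

def simulate_slots(p, n_slots, rng):
    """Simulate slot outcomes: 'idle', 'collision', or the index of the lone sender."""
    slots = []
    for _ in range(n_slots):
        senders = [i for i, pi in enumerate(p) if rng.random() < pi]
        if not senders:
            slots.append("idle")
        elif len(senders) == 1:
            slots.append(senders[0])
        else:
            slots.append("collision")
    return slots

def estimate_probabilities(slots, n_users):
    """Estimate each p_i from the idle frequency and per-user success frequencies."""
    total = len(slots)
    idle_freq = sum(1 for s in slots if s == "idle") / total
    estimates = []
    for i in range(n_users):
        succ_freq = sum(1 for s in slots if s == i) / total
        denom = idle_freq + succ_freq
        # Solving succ_freq / idle_freq = p_i / (1 - p_i) for p_i.
        estimates.append(succ_freq / denom if denom > 0 else float("nan"))
    return estimates

true_p = [0.2, 0.3, 0.1]
observed = simulate_slots(true_p, 200_000, random.Random(0))
print(estimate_probabilities(observed, len(true_p)))
```

The estimator needs only the channel state and the identity of successful senders, which matches the observability assumed in the paragraph above; its accuracy depends on the number of observed slots.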
Requirement U
User knows (and thus ) and when it chooses
its transmission probability.
Requirement U is sufficient for the Nash equilibrium
of the contention game with TRD-based intervention to be played by the users.
User can find its
best response strategy by maximizing given and . In fact, a weaker requirement is compatible with the
Nash equilibrium of the contention game with TRD-based intervention. Suppose
that user knows the form of intervention function and the value of , and observes the intervention level . embedded in the
TRD-based intervention function can be thought
of as a recommended strategy profile by the manager (thus the communication
from the manager to the users occurs indirectly through the function ). Even though
user does not know
the recommended strategies to other users, that is, the values of , , it knows its recommended transmission probability.
From the form of the intervention function, user can derive that
it is in its best interest to follow the recommendation as long as all the
other users follow their recommended strategies. Observing confirms its
belief that other users play the recommended strategies, and it has no reason
to deviate.
The users can acquire knowledge of the intervention
function in one of
three ways: (i) known protocol, (ii) announcement, and (iii) learning. The
first method is effective in the case where a certain network manager operates
in a certain channel (e.g., a frequency band). The community of users will know
the protocol (or intervention function) used by the manager. This method does
not require any information exchange between the manager and the users. Neither
teaching of the manager nor learning of the users is necessary. However, there
is inflexibility in choosing an intervention function, and the manager cannot
change his target strategy profile frequently. Nevertheless, this is the method
most often used in current wireless networks, where users belong to a predetermined
class of known and homogeneous protocols.
The second method allows the manager to make the users
know directly, which
includes information on the target . The manager will execute his intervention according
to the announced intervention function because the Stackelberg equilibrium achieves his
objective. However, it requires explicit message delivery from the manager to
the users, which is sometimes costly or may even be impossible in practice.
Finally, if the Stackelberg contention game is played
repeatedly with the same intervention function, the users may be able to
recover the form of the intervention function chosen by the manager based on
their observations of , for example, using learning techniques developed in
[17–19]. However, this process may
take a long time, and the users may not be able to collect enough data to find out the
true functional form if their experimentation is limited.
Remark. If
users are obedient, the manager can use centralized control by communicating to user . Additional communication and estimation overhead
required for the Stackelberg equilibrium can be considered as a cost incurred
to deal with the selfish behavior of users, or to provide incentives for users
to follow .
5.3. Limited Observability of the Manager
The
construction of the TRD-based intervention function assumes that the manager
can observe or estimate the transmission probabilities of the users correctly.
In real applications, the manager may not be able to observe the exact choice
made by each user. We consider several scenarios under which the manager has
limited observability and examine how the TRD-based intervention function can
be modified in those scenarios.
5.3.1. Quantized Observation
Let be a set of
intervals which partition . We assume that each interval contains its right end
point. For simplicity, we will consider intervals of the same length. That is, , and we call and for all .
Suppose that the manager only observes which interval
in each belongs to. In
other words, the manager observes instead of such that . In this case, the level of intervention is
calculated based on rather than . It means that given , would be the
same for any if and belong to the
same . Since any is weakly
dominated by , the users will choose their transmission
probabilities at the right end points of the intervals in . This in turn will affect the choice of a target by
the manager. The manager will be restricted to choose such that for all . Then the manager can implement with the
intervention function , where is set equal to . In summary, the quantized observation on restricts the
choice of by the manager
from to .
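The effect of quantized observation on the users' choices can be sketched as follows. This is a minimal illustration, assuming that the intervals partition [0, 1] into m pieces of equal length 1/m, each of the form ((k − 1)/m, k/m], with the point 0 assigned to the first interval. The function name `right_endpoint` and the parameter `m` are hypothetical; exact rational arithmetic is used to avoid floating-point rounding at interval boundaries.

```python
from fractions import Fraction
from math import ceil

def right_endpoint(p, m):
    """Right end point of the length-1/m interval containing p, for 0 <= p <= 1.

    Intervals are ((k-1)/m, k/m]; p = 0 is assigned to the first interval
    (an assumption made for this sketch).
    """
    p = Fraction(p)
    k = max(1, ceil(p * m))
    return Fraction(k, m)

def feasible_targets(m):
    """Per-user target probabilities available to the manager under quantization."""
    return [Fraction(k, m) for k in range(1, m + 1)]

print(right_endpoint(Fraction(3, 10), 4))  # a user inside (1/4, 1/2] moves to 1/2
print(feasible_targets(4))
```

Because the manager cannot distinguish probabilities within the same interval, a user gains by moving to the right end point, so only the grid returned by `feasible_targets` remains available as per-user targets.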
Figure 3 shows the payoff profiles that can be
achieved by the manager with quantized observation. When the number of
intervals is moderately large, the manager has many options near or on the
Pareto efficiency boundary.
Figure 3: Payoffs that
can be achieved by the manager with quantized observation. (a) . (b) .
5.3.2. Noisy Observation
We modify the
Stackelberg contention game to analyze the case where the manager observes
noisy signals of the transmission probabilities of the users. Let be the strategy
space of user , where is a small
positive number. We assume that the users can observe the strategy profile , but the manager observes a noisy signal of . The manager observes instead of where is uniformly
distributed on , independently over . Suppose that the manager chooses a target such that . The expected payoff of user when the
manager uses an intervention function is
Hence, the intervention function
is effectively instead of when the
manager observes . If is a Nash
equilibrium of the contention game with intervention when is perfectly
observable to the manager and for all such that , then will
still be a Nash equilibrium of the contention game
with intervention when the
manager observes a noisy signal of the strategy profile of the users.
Consider the TRD-based intervention function . Since for all and with a positive
probability when , whereas . Since is kinked at , the noise in will distort
the incentives of the users to choose .
The manager can implement his target at the expense
of intervention with a positive probability. If the manager adopts the
following intervention function:
where , then is a Nash
equilibrium of the contention game with intervention , but the average level of intervention at is
which can be thought of as the
efficiency loss due to the noise in observations.
Figure 4 illustrates the set of payoff profiles that
can be achieved with the intervention function given by (20). As the size of
the noise gets smaller, the set expands to approach the Pareto efficiency
boundary.
Figure 4: Payoffs that
can be achieved by the manager with noisy observation. (a) . (b) .
5.3.3. Observation on the Aggregate Probability
We consider the
case where the manager can observe only the frequency of the slots that are not
accessed by any user. If the users transmit their packets according to , then manager observes only the
aggregate probability . In this scenario, the intervention function that the
manager chooses has to be a function of , and this implies that the manager cannot
discriminate among the users.
The TRD-based intervention function allows the
manager to use different reactions to each user's deviation. In the effective
region where the TRD is between 0 and 1, one unit increase in results in units increase
in . However, this kind of discrimination through the
structure of the intervention function is impossible when the manager cannot
observe individual transmission probabilities.
This limitation forces the manager to treat the users
equally, and the target has to be chosen such that for all . If the manager uses the following intervention
function:
then he can implement with as a
Stackelberg equilibrium. Hence, if the manager only observes the aggregate probability,
this prevents him from setting the target transmission probabilities
differently across users.
Figure 5 shows the payoff profiles achieved with
symmetric strategy profiles, which can be implemented by the manager who
observes the aggregate probability.
Figure 5: Payoffs that
can be achieved by the manager who observes only the aggregate probability. (a)
Homogeneous users with . (b) Heterogeneous users with and .
5.4. Limited Observability of the Users and Conjectural Equilibrium
We now relax
Requirement U and assume that user can observe
only the aggregate probability . Even though the users do not know the exact form of
the intervention function of the manager, they are aware of the dependence of on their
transmission probabilities and try to model this dependence based on their
observations . Specifically, user builds a
conjecture function , which means that user conjectures
that the value of will be if it chooses . The equilibrium concept appropriate in this context
is the conjectural equilibrium, first introduced by Hahn [20].
Definition 4. A strategy profile
and a profile of conjectures
constitutes a
conjectural equilibrium of the contention game with intervention
if
for all .
The first
condition states that is optimal
given user 's conjecture , and the second condition says that its conjecture is
consistent with its observation. It can be seen from this definition that the
conjectural equilibrium is a generalization of Nash equilibrium in that any
Nash equilibrium is a conjectural equilibrium with every user holding the
correct conjecture given others' strategies. On the other hand, it is quite
general in some cases, and in the game we consider, for any strategy profile , there exists a conjecture profile that
constitutes a conjectural equilibrium. For example, we can set if and 0 otherwise.
Since the TRD-based intervention function is linear in
each , it is natural for the users to adopt a conjecture
function of the linear form. Let us assume that conjecture functions are of the
following trimmed linear form:
for some .
We say that a conjecture function is linearly
consistent at if it is
locally correct up to the first derivative at , that is, and . Since the TRD-based intervention function is linear in
each , the conjecture function is linearly
consistent at , and and constitutes a
conjectural equilibrium. Therefore, as long as the users use linearly
consistent conjectures, limited observability of the users does not affect the
final outcome. To build linearly consistent conjectures, however, the users
need to experiment and collect data using local deviations from the equilibrium
point in a repeated play of the Stackelberg contention game. A loss in
performance may result during this learning phase.
6. Illustrative Results
6.1. Homogeneous Users
We assume that
the users are homogeneous with for all . Given a transmission probability profile , the system utilization ratio can be defined as the
probability of successful transmission in a given slot. Note that the maximum system
utilization ratio is 1, which occurs when only one user transmits with
probability 1 while others never transmit. Table 1 shows the individual payoffs
and the system utilization ratios for the number of users 3, 10, and 100 when
the manager implements the target at the symmetric efficient strategy profile .
Table 1: Individual payoffs and system
utilization ratios with homogeneous users.
We can see that
packets are transmitted in approximately 37% of the slots with a large number
of users even if there is no explicit coordination among the users. The system
utilization of our model converges to 1/e (approximately 0.37) as the number of users goes to
infinity, which coincides with the maximal throughput of a slotted Aloha system
with Poisson arrivals and an infinite number of users [21], but in our model users
maintain their selfish behavior, and we do not use any feedback information on
the channel state.
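The limiting utilization can be reproduced with a short computation. The sketch below is an illustration under stated assumptions: the system utilization ratio is taken to be Σ_i p_i ∏_{j≠i} (1 − p_j), and the symmetric efficient strategy profile is taken to be p_i = 1/n for every user, which gives a utilization of (1 − 1/n)^(n − 1). The function name `utilization` is hypothetical, and individual payoffs are not computed because they depend on the valuation.

```python
from math import exp, prod

def utilization(p):
    """Probability that exactly one user transmits in a slot."""
    return sum(
        p[i] * prod(1 - p[j] for j in range(len(p)) if j != i)
        for i in range(len(p))
    )

for n in (3, 10, 100):
    symmetric = [1.0 / n] * n  # assumed symmetric efficient profile
    print(n, round(utilization(symmetric), 4))

print("limit 1/e:", round(exp(-1), 4))
```

The computed values decrease toward 1/e as n grows, consistent with the 37% figure quoted for a large number of users.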
6.2. Heterogeneous Users
We now consider
users with different valuations. Specifically, we assume that for . We will consider three targets: , , and with which for all . assigns a
higher transmission probability to a user with a higher valuation. treats all the
users equally regardless of their valuations. is egalitarian
in that it yields the same individual payoff to every user, which implies that
a user with a low valuation is assigned a higher transmission probability.
Table 2 shows that a tradeoff between efficiency (measured
by the sum of payoffs) and equity exists when users are heterogeneous. A higher
aggregate payoff is achieved when users with high valuations are given
priority. At the same time, it limits access by users with low valuations,
which increases variations in individual payoffs. Also, the results in Table 2 are consistent with the fact that is a Nash
bargaining solution and that is a
nonsymmetric Nash bargaining solution with weights equal to valuations.
Table 2: Average individual payoffs, aggregate
payoffs, standard deviations of individual payoffs, system utilization ratios,
Nash products, and generalized Nash products with heterogeneous users.
7. Conclusion
We have
analyzed the problem of multiple users who share a common communication
channel. Using the game theory framework, we have shown that selfish behavior
is likely to lead to a network collapse. However, full system utilization
requires coordination among users using explicit message exchanges, which may
be impractical given the distributed nature of wireless networks. To achieve a
better performance without coordination schemes, users need to sustain
cooperation. We provide incentives for selfish users to limit their access to
the channel by introducing an intervention function of the network manager.
With TRD-based intervention functions, the manager can implement any outcome of
the contention game as a Stackelberg equilibrium. We have discussed the amount
of information required for implementation, and how the various kinds of
relaxations of the requirements affect the outcome of the Stackelberg
contention game.
Our approach of using an intervention function to
improve network performance can be applied to other situations in wireless
communications. Potential applications of the idea include sustaining
cooperation in multihop networks and limiting the attack of adversary users. An
intervention function may be designed to serve as a coordination device in
addition to providing selfish users with incentives to cooperate. Finally,
designing a protocol that enables users to play the role of the manager in a
distributed manner will be critical to ensure that our approach can be adopted
in completely decentralized communication scenarios, where no manager is
present.
Appendices
A. Proof of Proposition 3
Recall used to define . We examine whether a strategy profile with for all constitutes a
Nash equilibrium of by considering
four cases on the value of .
Case 1 (). Let . If user changes its
transmission probability from to , then its payoff increases because is still zero.
Hence cannot be a
Nash equilibrium if .
Case 2 (). Consider arbitrary user . If it deviates to , is still zero
and decreases. is differentiable
and strictly concave on . Since , and for all ,
There is no gain for user from deviating
to any if and only if , which is equivalent to . For to be a Nash
equilibrium, we need for all . To satisfy , all inequalities should be equalities. Hence, only is a Nash
equilibrium among such that .
Case 3 (). Since , there is no gain for user from deviating to such that . If there is a gain from deviation to such that , then there is another profitable deviation such that by using the
argument of Case 1. Therefore, we can restrict our attention to deviations that lead to . At such a deviation by user , is a best
response to if and only if . Using the first derivative given in Case 2, we
obtain
For to be a Nash
equilibrium, the above inequality should be satisfied for every , which in turn implies
and this contradicts the
initial assumption . Therefore, there is no with that constitutes
a Nash equilibrium.
Case 4 (). Since for every , there is a profitable deviation of user only if there
exists such that