#### Abstract

We find a simple, partially altruistic mechanism that can increase global gain for a community of selfish agents. The mechanism is implied in the phenomena found in the minority game. We apply the mechanism to a two-road traffic system to maximise traffic flow.

One of the central
problems in many social and biological systems is how the global waste of
resources by a community of agents can be reduced to a minimum. Recently, the minority
game [1], inspired by the *El Farol* bar problem [2], was introduced to capture certain essential and general features
of competition between adaptive agents. It has been shown that the global
average gain can be larger than that of a random process only if there is
enough information available to all agents [3]. Nevertheless, in
real systems such as financial markets, it is more often the case that
information is limited in comparison with the number of agents involved [4]. Here
we show that, even
when the available information is limited, a global
average gain larger than that in a random process can be achieved as long as a
small but sufficient portion of agents promises to make a sacrificeโto act without considering its own benefit. The application of this finding to
increase the traffic flow in a two-road system is then discussed.

The
minority game consists of agents
going to either room A or B based on predicting a strategy chosen from their strategies. The room with the smaller
number of agents is the winning room, and every agent in the winning room is
awarded one point. All strategies with the correct prediction (of which room will
win) also score one point. Each agent chooses his or her highest-score strategy
each time. A strategy is actually a list of many entries, each prescribing
which room to go to according to the information gathered from previous
experiences. The standard minority game assumes that the information available to
all agents is the winning history of the previous (*memory*) time steps; thus,
there are entries in each
strategy. The *history* is then updated
according to the current outcome of every time step.

Consider that is large. If no one uses a strategy, that is, everyone randomly chooses a room, then the time sequence of the number of people in room A will form a binomial distribution centred at 0.5โ with the standard deviation . The average of the difference between 0.5โ and the number of people in room A is about (the value can be easily calculated using normal distribution approximation), and the average score by this amount would be less than 0.5โ. Thus, the smaller is, the larger the global average gain is. We calculate the average gain for a random process of the minority game to be . This average gain approaches 0.5 when is large.

It is beyond question that people use strategies in games. Once a person has used a โgoodโ strategy and fared better than the rest, it is certain that others will begin to develop their own โgoodโ strategies. In a situation with limited information, however, it will soon be discovered that the average gain is less than when no one uses a strategy. The situation is much like that in the Prisonerโs Dilemma [5, 6], in which each agent makes the best choice for him or herself but ends up with bad results for both agents. In Figure 1, we plot the time sequence of the number of people in room A for the process in which everyone uses his or her best strategy, which is updated at each time step, to make a decision. Apparently, the small value of the average gain in this case is due to large splits at the beginning and end of every interval of the time steps. We explained recently [7] that this quasiperiodic structure is due to the form of the payoff function and the way in which everyone uses his or her strategy. In the case of a large number of agents, one way to raise the long-term global average gain is to introduce a learning mechanism [1], by which bad performers partially replace their strategies with better ones, or certain genetic adaptation schemes whereby each agent modifies parts of his or her bad strategies based on better ones [8]. Here we show two ways of achieving a high global average gain in the standard minority gameโall agents continue to use their best strategies from among the strategies they selected at the beginning of the game without modifying or substituting them.

First, assume that every agent is
reluctant to change strategy unless there is another at hand that has proved to
be much betterโhaving gained points more than the current
one. In this *conservative* community,
the reluctance to switch to another strategy offers strong stability against
variation. As a result, the number of people in room A gradually grows closer
to /2 so that the global gain
increases. One might think that the reason for this is that a strategy with
predictions distributed more evenly between rooms A and B will score better and
thus be favoured. However, this is not true. We have found that when the number
of agents converges to /2 as time
goes on, the variance of the number of predictions in room A for all of the best
strategies remains more or less constant. What really occurs is that, because
the winning sequence is generally set according to the persistent use of strategies,
there are only a few agents who change strategies at each time step, which
means that the variance of the predictions hardly changes. At the same time,
the act of switching to a new strategy increases by a small amount the number
of agents in the winning room when the same pattern occurs again, thus bringing
the number of agents in that room closer and closer to /2. Consequently, the global average gain increases as time goes
on (see the curve indicated by *conservative* in Figure 2). As we have pointed out in our previous work [9], the crucial
factor in reducing population variance is to decrease the number of agents who
switch strategies at the same time. The simplest way to accomplish this is to
allow only a certain number of agents to switch strategies at a given time [9].
In a real system, such as the stock market, such a limitation measure would be totally
unacceptable to agents. However, we have found that a society of conservative
agents automatically reduces the number of simultaneous switches. The method
works when agents are conservative only at the beginningโbefore their first strategy switch. In a simulation,
we allow one of an agentโs strategies to have points and the others to have 0 points at the beginning [10, 11], and
the game is played as usualโall agents use their highest-score strategy
each time. To understand why this method works, let us consider the simplest
case of = 2. There are at least time steps before the first strategy switch
is made. During this period, because everyoneโs favourite strategy is chosen
randomly at the beginning, the score differences between the two strategies
disperse among agents, and their distribution is given approximately by a
normal distribution centred at . The
probability of the score difference being zero is small because it is at one
tail of a normal distribution. This can be compared with the case of a standard
game ( = 0) in which the probability
of the score difference being zero is at the maximum of a normal distribution. It
is clear that, according to the rule of using the highest-score strategy, a
switch can be made at one particular time step only when the score difference
between the two strategies of a given agent was zero at the previous time step.
Consequently, switching probability is greatly reduced when is large.

There
is a second method of increasing global gain that works without requiring agents
to be conservative. Suppose that the prediction of each
entry in every strategy has a probability of being 0
(room A) and () of being 1 (room
B) [12, 13]. A random prediction means .
Now, choose to be slightly smaller
than .5, for example, .4825, in a case with and .
A typical time sequence for people in room A is shown in Figure 3. We see that after
a short period of time, the variation of this sequence begins to decrease so
that the global average gain increases (see the curve indicated by *partially altruistic* in Figure 2). Now,
let us consider to be the percentage
of 0โs in all entries of the strategies. Suppose that 3.5% of agents always choose room B, and the rest use their best strategies as usual [14].
This is equivalent to setting . In this way, the
mechanism does not require that all agents behave in the same particular way,
for example, by acting conservatively. What we need is for a small portion of
agents to act unselfishly (to forgo their best strategies so that their behaviour
is considered to be *altruistic*); then,
the global gain improves in a self-organised manner. Compared to the first
method, this mechanism is more likely to operate in real situations, either because
of the altruistic behaviour of some agents (which has been observed in many
biological species) or because of the careful arrangements of the game
coordinator (such as government). Note that the role of the agents who always
choose room B is similar to that of *producers* in the grand-canonical minority game [15]. Producers always use the same
strategy so that the information deduced from their fixed patterns can be utilized
by the speculators to exploit. It is possible to maximize global gain by
varying the number of producers in the grand-canonical minority game.

Two stages are required for the second mechanism to produce the result shown in Figure 3. First, the average number of people in room A has to shift upward from to become closer to /2. Second, the time sequence reduces its standard deviation as time goes on. The reason the average value shifts upward is as follows [7]. In the first time steps, room A always wins. Thus, at the end of time steps, the strategies that have the most 0โs score best and will be used in the following time steps. The average number of 0โs for each agentโs best strategies is larger than the average of all strategies, which is . Therefore, the average number of people in room A shifts upward [7]. If this shift is well-adjustedโfor the subsequent time steps, room A has about a 50% chance of winningโthen, because these best strategies have already accumulated more points than the others, the situation at time step is similar to that in the first method in which each agent is initially biased in favour of one strategy. (Setting will give the right shift for , . See [7] for details.) Consequently, the standard deviation of the time sequence begins to decrease.

It is worth mentioning that the mechanisms work even better when is larger. When is larger, the average gain will be smaller if everyone uses his or her best strategy in the standard game. However, the average gain can still be brought very close to 0.5 if a small but sufficient number of people promise to make a sacrifice. It is also important that the mechanism works for the whole range of values of (see Figure 4). (The proper percentage is dependent on the values of and ; see [7].) The effect of the gain-increasing mechanism is most efficient in the small case in which global gain is low in the standard game.

The *partially
altruistic* mechanism described above can be employed in a two-road traffic
system. There are two express highways running from south to north in Taiwan. People who
work in Taipei, the capital of Taiwan, normally drive south to their parentsโ
homes or recreation destinations on the first day of a long holiday and drive back
north on the last day. Traffic jams are a nightmare for everyone, and thus the good
choice of a highway from the two available is crucial to the holiday mood.

Highway traffic flow has been extensively studied [16]. Recently, the cellular automata approach [17] has obtained a number of interesting results. The fundamental diagram in traffic models is the plot of traffic flow as a function of vehicle density. A typical fundamental diagram shows that traffic flow reaches a maximum at a certain vehicle density and that the curve as a function of density is concave-upward. Thus, for a system with two similar roads, as in the case of Taiwan, the total traffic flow would be maximal when vehicles are distributed evenly on both. The problem of maximising traffic flow in a two-road system is therefore similar to maximising the global average gain in the minority game, which aims to distribute agents evenly in two rooms. Assume that every driver has a few strategies for determining which road to take, and that on every occasion he or she chooses the strategy that most often correctly predicted the road with the better flow in the past. (In fact, average personal speed, rather than average traffic flow, should be a more important factor for driver satisfaction and may thus be used when choosing the next strategy. Here we neglect the difference.) According to the second mechanism discussed above, one possible way of obtaining a better traffic flow for a given number of vehicles is to have a few percent of (The proper percentage is dependent on the values of and ; see [7].) drivers choosing the same highway every time, either voluntarily or by design.

Are those who do not adopt a strategy really sacrificing anything? In our simulation, they gain no less in the long run. What they have really sacrificed is the freedom to choose.

#### Acknowledgments

This work was supported by grants from the National Science Council (Grant no. NSC97-2811-M005-018) and the National Center for Theoretical Sciences in Taiwan.