Abstract
This paper is a theoretical investigation into the potential application of game-theoretic concepts to neural networks (natural and artificial). The paper relies on basic models, but the findings are more general in nature and should therefore apply to more complex environments. A major outcome of the paper is a game-theory-based learning algorithm for a paired neuron system.
1. Introduction
Individual neurons are the building blocks of more complex neural circuits. In natural systems these circuits interact with other components in manifold ways, thereby generating the compelling world of behavior around us. Although tireless and tedious efforts in various disciplines have culminated in fundamental insights, there are still many unknowns about individual neurons and about the processes by which individual neurons interact and organize themselves in neural circuits (e.g., [1]).
Recently, game theory has attracted some attention in the field of neuroscience. The field of neuroeconomics, for instance, combines the two fields in experiments with human and nonhuman players in order to better understand human decision-making (e.g., [2]). This paper has a different motivation and proposes a neural network model based on concepts from game theory in which individual neurons are assumed to behave optimally with respect to a given payoff matrix. The paper theoretically analyzes a paired neuron system and critically examines the value game theory may have as an organizing principle for such a system (in the sense of a guiding principle or mechanism involved in neural communication, organization, and synchronization). The paper also specifies a learning algorithm based on game theory for a paired neuron system, which is a major contribution of this text.
In the remainder of this text, Section 2 summarizes the motivation for this paper and introduces an intuitively appealing (though not unproblematic) relationship between game theory and biological/artificial neurons. Sections 3 and 4 investigate this relationship, the theory, and the major concepts and challenges involved in more detail, concentrating, among other things, on static and dynamic games of complete/perfect information. Section 5 applies game-theoretic constructs to artificial neural networks and presents a game-theory-based learning algorithm for network learning. The discussion in Section 6 revolves around related work, and Section 7 ends the paper with a summary.
2. Game Theory, Biological Neurons, and Artificial Neural Networks
Our previous work in various areas (e.g., artificial intelligence, soft computing, reasoning under uncertainty, and neuroscience) identified that many cooperative interactions between two agents (artificial or natural) can be interpreted as a game or bear some of its characteristic features. For example, the main concepts in a game are the players, a set of rules by which the game is played, and an outcome in the form of a reward or a punishment (more generally referred to as a payoff) for the players in the game. In addition, a so-called payoff matrix is a common scheme to represent the dynamic behavior of a game.
Figure 1 applies these key concepts to a coupled neuron system where the neurons are modeled to calculate their strategies according to their individual payoff matrix. (The scopes of game theory and neural networks are extremely wide. The paper therefore uses several abstractions and simplifications (e.g., the neuronal circuit models presented in this text are relatively basic, and in terms of game theory this paper concentrates on static games and dynamic games of complete/perfect information). At large, the paper does not suffer from this reductionism, as the findings mentioned in the paper are relevant in a wider sense. London and Häusser [1], for instance, emphasize that the contribution of single neurons to computation in the brain has long been underestimated and that there is a need to investigate novel mechanisms that allow individual neurons to implement elementary computations.) Imagine that the two neurons in Figure 1(a) shall generate the following global behavior: if Neuron1 fires, then Neuron2 shall fire, and if Neuron1 is at rest (not firing), then Neuron2 shall be at rest (it is possible to assume an information exchange, unidirectional or bidirectional, via biochemical substances or electrical signals between Neuron1 and Neuron2). Figure 1(b) presents this behavior in a payoff matrix. The payoff matrix assigns a payoff (a reward or a punishment) to each neuron for each combination of strategies (Fire, Rest). For instance, if Neuron1 fires and Neuron2 also fires, then each neuron obtains a rewarding payoff. (Traditionally, the payoff for Neuron1 would be the left value in a matrix cell, and the payoff for Neuron2 would be the right value in a cell. Note also that the payoffs in a cell need not be identical.) If the two neurons correspond with different strategies (e.g., Neuron1 fires and Neuron2 remains at rest or vice versa), then each neuron receives a punishment payoff.
Thus, if the goal for the two neurons in Figure 1(a) is to eventually demonstrate the global behavior Fire/Fire, Rest/Rest, then it is possible to assume the following: (i) if the two neurons demonstrate the desired behavior (Fire/Fire, Rest/Rest), then no action is required, and (ii) in case the two neurons do not demonstrate this desired way of interaction, then some corrective action has to be taken to achieve the desired global behavior. Again, this paper is not interested in the exact description of the biochemical processes (which are not known in their entirety anyway) that may achieve this mode of operation in biological neurons—the motivation here is to describe this global interaction via game-theoretic concepts, perhaps involving additional models and abstractions for the two neurons in Figure 1(a). (The book by Purves et al. [3] provides a comprehensive account of the state of the art of neuroscience, and Chapter 1 of this book, which is dedicated to neural signaling, is particularly informative about many of the issues mentioned in this text.) On the other hand, it is crucial to understand that the payoff matrix in Figure 1 is a crude generalization. In reality, it is very difficult to find and exactly specify a payoff function for a game, which is a critical task in game theory (i.e., approximations are the norm rather than the exception).
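To make the reward/punishment scheme concrete, the payoff matrix of Figure 1(b) can be sketched as a small lookup table. The numeric payoffs +1 and −1 are assumptions for illustration only (the paper keeps the payoffs symbolic), as is the helper `needs_correction`:

```python
# Hypothetical payoff matrix for the coupled neuron system of Figure 1(b).
# The values +1 (reward) and -1 (punishment) are illustrative assumptions;
# the paper only requires matching strategies to be rewarded.
PAYOFFS = {
    ("Fire", "Fire"): (1, 1),
    ("Fire", "Rest"): (-1, -1),
    ("Rest", "Fire"): (-1, -1),
    ("Rest", "Rest"): (1, 1),
}

def needs_correction(s1, s2):
    """Corrective action is required whenever either neuron is punished."""
    p1, p2 = PAYOFFS[(s1, s2)]
    return p1 < 0 or p2 < 0

print(needs_correction("Fire", "Fire"))  # matching strategies: no action
print(needs_correction("Fire", "Rest"))  # mismatch: corrective action needed
```

The table view also makes the symmetry of the desired Fire/Fire, Rest/Rest behavior explicit: only the two diagonal strategy combinations are rewarded.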
Laying this issue aside, it is possible to provide a rather straightforward mathematical description of the global behavior desired for the two neurons in Figure 1(a). To begin with, the communication between the two neurons in Figure 1(a) is a relatively simple, one-dimensional, linearly separable, supervised learning classification task. Neuron1 can either fire or be at rest, and Neuron2 has to respond accordingly. It is possible to imagine a function f(x) where a value x at or above a certain threshold value θ represents the firing state for Neuron1 and a value x below θ represents the resting state for this neuron:

f(x) = Fire if x ≥ θ, Rest if x < θ. (1)
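A minimal sketch of the threshold behavior just described; the threshold value θ = 0.5 is an arbitrary assumption, as the paper leaves the threshold unspecified:

```python
THETA = 0.5  # illustrative threshold; the paper leaves theta unspecified

def neuron_state(x, theta=THETA):
    """Classify a (normalized) input value as firing or resting."""
    return "Fire" if x >= theta else "Rest"

print(neuron_state(0.9))  # Fire
print(neuron_state(0.1))  # Rest
```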
Collectively, it is possible to think of Neuron1 and Neuron2 as a simple input-output unit that behaves similarly to a switch. In terms of its global behavior, a perceptron can be interpreted in exactly the same way. (It is not necessary to elaborate on the perceptron learning algorithm in great detail, as this information is widely available in the neural network literature (e.g., in [4, pages 43–54]).) This does not mean, however, that the payoff matrix in Figure 1(b) can be implemented by a traditional perceptron. Figure 1(c) illustrates a model that is similar to a perceptron but incorporates elements from game theory that may allow this model to demonstrate the behavior illustrated by the payoff matrix in Figure 1(b). It is clear from Figure 1(c) that the decision-making process for this model involves some form of an input, an output, a transfer function, and a reward/punishment mechanism, all based on concepts from game theory. The forthcoming Section 5 provides a more detailed description of this model and of the relationship illustrated in Figure 1 at large. The current focus is to describe in more detail the intuitive relationship between game theory, biological neurons, and artificial neural networks just mentioned and to elaborate on the various (fundamental) challenges involved in this relationship.
3. Game Theoretic Interpretations
In order to appreciate the forthcoming sections and to avoid unnecessary confusion, it is helpful to understand that game theory distinguishes between different types of games. At large, there are static or dynamic games with complete or incomplete information. If the payoffs and strategies available to other players are known and common knowledge to each player, as in Figure 1(b), then a game has complete information; otherwise, the game is classified as a game of incomplete information. Crucially, in a static game, players take their decisions simultaneously (individually and independently), they then move (not necessarily simultaneously but bound to the decisions they took), and they then receive their payoffs. That is, the players in a static game are unaware of the strategies the other players in the game may choose, but any player may hypothesize about the strategies other players may choose. (Marriage vows couples exchange during a wedding ceremony may be a good example; the decisions are taken independently, and the further proceedings of the ceremony unfold upon these decisions.) In a dynamic game, decisions are taken sequentially. In such a game, a player A may choose and act on a particular strategy, and another player B who has observed player A may use this information for an appropriate response. (Chess is a typical example of such a game.)
It is tempting now to immediately view and deal with Figure 1 as a dynamic game with complete information, where the payoffs in the matrix are common knowledge between the players and Neuron2 reacts (sequentially) to the signal arriving from Neuron1 (perhaps with other processes going on bidirectionally). There are several reasons, however, to initially treat Figure 1 as a static game with complete information. For one thing, Figure 1 is a rather extreme reduction, and it is relatively easy to envisage more complex scenarios. The two neurons in the figure could be exchanged with the brains of two humans or, for that matter, with complete computer simulations of two such brains, which is the dream of the Blue Brain Project at EPFL (École Polytechnique Fédérale de Lausanne). Another reason involves understanding and learning; it is better to begin with (somewhat simpler) games of complete information and then move on to more challenging games (in terms of the theory involved). In any case, the forthcoming text benefits from this bottom-up approach, as it helps to specify more clearly some of the subtleties involved in this investigation. In terms of these subtleties, it is important to understand that several of the fundamental assumptions in game theory can be challenged intellectually with relative ease. Some of the reasons for this not only relate to the current example but also reach deeper into the heart of game theory. These more sensitive (interrelated) concepts include rationality, simultaneity, equilibrium, and mixed strategies.
Rationality
Many of the formalisms in traditional game theory imply a degree of rationality by the players/agents involved in a game. As crucial as the notion of rationality is for the theory, the term is not without problems. For one thing, rationality is not universally defined, and for another, human agents are often not the hyper-rational agents the theory requires them to be. Many applications of game theory therefore involve abstractions and simplifications to various degrees. For instance, this happens when game theory is applied to the modeling of interactions in genes, viruses, or cells, as is the case in evolutionary game theory [5]. (Evolutionary game theory is an extension of classical game theory motivated by some of the more problematic issues discussed in this section. Though very interesting and with some relevance to this work, evolutionary game theory has not been dealt with in this text, mainly for the sake of brevity.) Another interesting contribution to this discussion may come from the observation that people usually associate biological brains with higher cognitive functions such as learning or rational decision-making. As true as this may be, many people also carry the common misconception that such tasks can only be achieved by organisms with highly developed nervous systems (i.e., with brains). For example, there are instances of predictive behavior within microbial genetic networks where bacteria anticipate changing environments [6]. Bacteria, however, have no brains or nervous systems. Instead, these microbes experience and learn through evolutionary changes in their complex networks of interacting genes and proteins (i.e., the problem-solving potential is encoded, in part, in the architectural configuration of the system) [7]. Although the specific mechanisms for this problem-solving ability are largely unknown today, many would agree that such tasks should involve some form of memory.
The recent euphoria devoted to so-called memristors (memory resistors) may shed some light on this topic in the future. In electronics, a memristor is a fundamental circuit element [8]. Importantly, through this element, nature seems to provide a form of memory for free. Naturally, the value memristors have for neural networks has already been identified in some of the aforementioned and other works (e.g., [9]).
Simultaneity and Equilibrium
These terms are problematic too and can quickly lead to a deep philosophical discussion. A root problem in Figure 1(a) seems to relate to the larger problem of existence and timing. The typical development process for artificial neural networks relates to this problem quite well. The learning process for such networks usually starts with a network configuration and a random weight assignment. But how does nature determine the configuration for a network or the degree of connectivity? And how does the network know about the point in time when operation begins? Are these tasks performed by a monitoring supervisory unit, or do the neurons involved act with a degree of autonomy (and rationality)? A more distant view magnifies this point even more. An outside observer looking at the complete neural activity of a human being, or a human being in its entirety for that matter, witnesses a multitude of processes running in parallel/simultaneously, and it is not at all clear to this observer how these processes may relate to each other or how they are coordinated in detail. A full discussion of this problem is beyond the scope of this preliminary investigation, but it is worthwhile to describe how the concept of equilibrium emerges in this context. It is difficult to imagine an observer that is able to grasp a human being in its entirety. It is possible, however, to imagine an observer witnessing the object under observation in a particular higher-level, abstract global state. Assume a state of equilibrium (e.g., defined by an energy minimum or some other form of optimization or stabilization). In nature, a system may naturally strive toward or converge to such an equilibrium. Game theory provides the concept of equilibrium too—the agents in a game acquire this equilibrium through rational thought. Whether such an equilibrium is a law in nature (e.g., similar to the concept of entropy in physics) is only a thought that shall be laid aside here.
Mixed Strategies
Imagine that for some reason Neuron1 and Neuron2 in Figure 1 have cooperated well over time. In this case the likelihood that Neuron2 fires when Neuron1 fires could be rather high. On the other hand, if for some reason their cooperation was relatively poor in the past, then the likelihood of a correct response may be low. It is important to understand that in both cases positive as well as negative responses are still possible (e.g., a relatively good cooperation over time may not entirely prevent undesired responses). Game theory uses mixed strategies for the modeling of such likelihoods, and from a purely theoretical point of view, they are rather important in game theory. For example, in any game in which a player has to outguess the behavior (strategy) of another player (e.g., in poker or in the childhood game rock-paper-scissors), there is no pure-strategy Nash equilibrium [10, pages 29–33]. In such a game a player may select a strategy according to some likelihood (e.g., motivated by a hint, a tip-off, or some other piece of information that may be difficult to quantify). Game theory expresses a mixed strategy for a player as a probability distribution over some or all strategies available to that player in a game. It is clear that in many cases probability distributions may not be available and that the exact quantification of likelihoods is a point of weakness in game theory. In such cases the term uncertainty is often more appropriate. This term, however, opens the door to various theories dedicated to the management of uncertainty and ultimately adds a touch of vagueness to the rigorous formal underpinnings game theory provides. Hampton et al. [11], for instance, present several update rules for mixed strategies in a neuroscience-related study with human players, and that paper mentions several other sources where this has happened in the past. Anyhow, Figure 2 illustrates a case with mixed strategies (r, 1 − r) for Player1 and (q, 1 − q) for Player2.
The hypothetical mixed strategy (0.8, 0.2) for Player2 may then be interpreted as Player1's uncertainty about Player2: Player1 believes that Player2 may play strategy Fire with probability/likelihood 0.8 and strategy Rest with probability/likelihood 0.2. (Note that the terms player and neuron can be used interchangeably in the figure. In addition, the payoff matrix in Figure 2, with its numeric values, is less general than that of Figure 1(b). This is for demonstration purposes only and does not impair the general conclusions presented in the forthcoming sections.)
The remaining text in Section 3 analyzes the static game with complete information illustrated in Figure 2 in more detail and starts with Player1's point of view of the game. (Gibbons's [10] book on game theory is a major resource in this work, and readers wishing to obtain further information about the game-theoretic elements mentioned in this text are referred to that book.)
3.1. Player1's (Neuron1's) Point of View
For simplicity, Figure 3 illustrates Player1's (Neuron1's) view only. Player2's payoff is irrelevant in this view; that is why it is omitted in Figure 3.
According to Figure 3, given that Player1 believes that Player2 will play the mixed strategy (q, 1 − q), the expected payoff for Player1 for playing the pure strategy Fire is

E1(Fire, (q, 1 − q)) = q · 1 + (1 − q) · (−1) = 2q − 1. (2)
Similarly, the expected payoff for Player1 for playing the pure strategy Rest is

E1(Rest, (q, 1 − q)) = q · (−1) + (1 − q) · 1 = 1 − 2q. (3)
Figure 4 illustrates (2) and (3) in a single diagram. In order to understand the forthcoming arguments, it is important to always bear in mind that the main goal for each player is to obtain a maximum payoff in the game. Figure 4 illustrates that if q > 1/2, then 2q − 1 > 1 − 2q, in which case Player1 should play strategy Fire (see also Figure 3). On the other hand, if q < 1/2, then 1 − 2q > 2q − 1, in which case Player1 should adopt strategy Rest. A special case exists for q = 1/2, which is the point where the two straight lines 2q − 1 and 1 − 2q intersect. In this case Player1 is indifferent about which strategy to play.
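Assuming the symmetric reward (+1) and punishment (−1) payoffs used for illustration here, the two expected-payoff lines of equations (2) and (3) can be checked numerically for the hypothetical value q = 0.8 from Figure 2; the function names are a convenience of this sketch:

```python
Q = 0.8  # hypothetical probability that Player2 plays Fire (Figure 2)

def expected_fire(q):
    """Player1's expected payoff for pure strategy Fire, eq. (2): 2q - 1."""
    return q * 1 + (1 - q) * (-1)

def expected_rest(q):
    """Player1's expected payoff for pure strategy Rest, eq. (3): 1 - 2q."""
    return q * (-1) + (1 - q) * 1

print(expected_fire(Q))  # 0.6 -> Fire is the better response for q = 0.8
print(expected_rest(Q))  # -0.6
print(expected_fire(0.5) == expected_rest(0.5))  # indifference at q = 1/2
```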
It is also possible to consider mixed strategy responses by Player1. Player1's expected payoff from playing the mixed strategy (r, 1 − r) when Player2 plays the mixed strategy (q, 1 − q) is the weighted sum of the expected payoffs for each of the pure strategies (Fire, Rest), where the weights are the probabilities r and 1 − r. According to Figure 3 this payoff amounts to

E1((r, 1 − r), (q, 1 − q)) = r(2q − 1) + (1 − r)(1 − 2q) = (2r − 1)(2q − 1). (4)
What exactly is at stake here? At stake is the goal to maximize the payoff for Player1 expressed by (4). The mixed strategy (r, 1 − r) is the parameter that provides Player1 with a handle to work towards this maximum. Consider three cases: q > 1/2, q < 1/2, and q = 1/2 (i.e., the problem is to determine which values for r maximize (4) in each case). For q > 1/2, (4) gives (2r − 1)(2q − 1) with 2q − 1 > 0. In this case r = 1 maximizes the term. For q < 1/2, (4) gives (2r − 1)(2q − 1) with 2q − 1 < 0, in which case r = 0 provides the maximum. Finally, for q = 1/2, (4) yields the payoff 0. This term is independent of r and indicates that any response by Player1 is a best response to Player2's assumed strategy. Figure 5 summarizes all best responses r*(q) by Player1 if Player2 plays mixed strategy (q, 1 − q) and mixed strategy (r, 1 − r) is available to Player1.
All in all, Figure 5 indicates that if Player2 plays mixed strategy (q, 1 − q), then Player1's best response is to play (i) strategy Fire (r = 1) if q > 1/2, (ii) strategy Rest (r = 0) if q < 1/2, and (iii) any strategy (0 ≤ r ≤ 1) if q = 1/2.
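This best-response correspondence can be sketched directly; the indifference point q = 1/2 again presumes the symmetric ±1 payoffs, and encoding indifference as `None` is a convenience of this sketch:

```python
def best_response_r(q):
    """Player1's best response r*(q) under assumed symmetric +1/-1 payoffs.

    Returns the pure strategy as a probability of firing, or None when
    Player1 is indifferent (any mixed strategy is a best response).
    """
    if q > 0.5:
        return 1.0   # pure strategy Fire
    if q < 0.5:
        return 0.0   # pure strategy Rest
    return None      # indifferent at q = 1/2

print(best_response_r(0.8))  # 1.0
print(best_response_r(0.2))  # 0.0
print(best_response_r(0.5))  # None
```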
3.2. Player2's (Neuron2's) Point of View
This section describes Player2's view of the game in Figure 2. Overall, the steps are similar to those performed in the previous section. Given that Player2 believes that Player1 will play the mixed strategy (r, 1 − r), the expected payoff for Player2 when playing the pure strategy Fire is

E2((r, 1 − r), Fire) = r · 1 + (1 − r) · (−1) = 2r − 1. (5)
The expected payoff for Player2 for playing the pure strategy Rest is

E2((r, 1 − r), Rest) = r · (−1) + (1 − r) · 1 = 1 − 2r. (6)
Figure 6 illustrates (5) and (6) in a single diagram. The interpretation of Figure 6 is similar to that of Figure 4. It is, however, important to look carefully at the labeling of the coordinate axes. In Figure 6 the two straight lines 2r − 1 and 1 − 2r intersect at r = 1/2, indicating that for r = 1/2, Player2 is indifferent about which strategy to play. Figure 6 then illustrates that Player2 should play strategy Fire for r > 1/2 (because 2r − 1 > 1 − 2r) and strategy Rest for r < 1/2 (because 1 − 2r > 2r − 1).
Further, Player2's expected payoff from playing the mixed strategy (q, 1 − q) when Player1 plays the mixed strategy (r, 1 − r) is (see Figure 2)

E2((r, 1 − r), (q, 1 − q)) = q(2r − 1) + (1 − q)(1 − 2r) = (2q − 1)(2r − 1). (7)
The interpretation of (7) is similar to that of (4). Here, Player2 has the mixed strategy (q, 1 − q) at its disposal in order to maximize (7). Consider the following three cases: r > 1/2, r < 1/2, and r = 1/2. For r > 1/2, (7) gives (2q − 1)(2r − 1) with 2r − 1 > 0, and q = 1 generates the maximum for this term. Next, r < 1/2 gives (2q − 1)(2r − 1) with 2r − 1 < 0, and q = 0 provides the maximum. Finally, r = 1/2 establishes the payoff 0. This term is independent of q, and so any response by Player2 is a best response to Player1's assumed strategy. Figure 7 summarizes all best responses q*(r) by Player2 if Player1 plays the mixed strategy (r, 1 − r).
Figure 7 illustrates that if Player1 plays mixed strategy (r, 1 − r), then Player2's best response is to play (i) strategy Fire (q = 1) if r > 1/2, (ii) strategy Rest (q = 0) if r < 1/2, and (iii) any strategy (0 ≤ q ≤ 1) if r = 1/2.
3.3. Nash Equilibrium for Player1 (Neuron1) and Player2 (Neuron2)
Figures 5 and 7 are quite similar, and it is possible to combine both figures in a single diagram. Figure 8 emerges if Figure 7 is flipped, rotated, and put on top of Figure 5.
The interesting features in Figure 8 include those points where the best-response correspondences r*(q) and q*(r) intersect (i.e., the points (0, 0), (1/2, 1/2), and (1, 1)). What makes these three points important is that at each of them the strategy chosen by either player is a best response to the strategy chosen by the other player, and this is the definition of a Nash equilibrium. Crudely, in a game played by n players, the strategies (s1*, …, sn*) are in a Nash equilibrium if, for each player i in the game, strategy si* is a best response to the strategies specified for the n − 1 other players in the game (e.g., see [10, pages 8–12 and 33–48]).
In the communicating neuron context of Figure 1(a), this means that if Neuron1 fires, then Neuron2's best response is to fire too. If Neuron1 is at rest, then Neuron2's best response is to be at rest too. An interesting situation exists for the point (1/2, 1/2). This situation may be interpreted as follows: if Neuron2 is unaware of the state (strategy) of Neuron1, then Neuron2 may play either strategy, and vice versa (i.e., the situation for each neuron/player is similar to the tossing of a coin).
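Under the same assumed ±1 payoffs, the three equilibrium points can be verified with a brute-force mutual-best-response check over a probability grid (a sketch for illustration, not the paper's method):

```python
def payoff1(r, q):
    """Player1's expected payoff, eq. (4), under assumed +1/-1 payoffs."""
    return (2 * r - 1) * (2 * q - 1)

def payoff2(r, q):
    """Player2's expected payoff, eq. (7); symmetric to payoff1 here."""
    return (2 * q - 1) * (2 * r - 1)

def is_nash(r, q, step=10):
    """(r, q) is a Nash equilibrium if neither player gains by deviating."""
    grid = [i / step for i in range(step + 1)]
    best1 = all(payoff1(r, q) >= payoff1(r2, q) - 1e-12 for r2 in grid)
    best2 = all(payoff2(r, q) >= payoff2(r, q2) - 1e-12 for q2 in grid)
    return best1 and best2

for point in [(0.0, 0.0), (0.5, 0.5), (1.0, 1.0), (1.0, 0.0)]:
    print(point, is_nash(*point))  # True for the three equilibria only
```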
At this moment, it may be useful to take a step back and evaluate the results mentioned so far a bit more carefully. The results are derived from a purely formal investigation of the (arbitrary) game illustrated in Figure 2. As discussed above, a neural behavior can be modeled under these game-theoretic concepts. Whether these concepts can be theoretically generalized to neural systems with other arbitrary payoff matrices is a matter of debate. For example, the assumption that natural systems organize themselves according to the predictions of game theory (e.g., converge to or exploit Nash equilibria) rather quickly leads back to the problems mentioned earlier in Section 3 (simultaneity, rationality, etc.). Consider a newly created or evolving biological neural network where new neurons emerge frequently (e.g., thousands of new neurons arise in the adult brain every day [12]). Some of these new neurons may be required to establish a way of communication with other neurons, and it is difficult to imagine how this may work if there is no previous history between these neurons. Theoretically, for artificial neural networks, the situation is similar. Imagine a supervised learning scenario and an untrained network just provided with an initial random weight assignment. How does such a network know about a correct/incorrect classification outcome in the first place? The simple answer is that it knows from its supervisor (the network designer, developer, programmer, etc.). But who is the supervisor in nature? In nature, scientists often search for a guiding principle or law. This text does not suggest at all that game theory provides such a guiding principle, but it is necessary to create an awareness of the wider issues this work touches upon. Forthcoming sections relate back to some of the problems mentioned in this section, but for the moment this text moves on to dynamic games.
4. Dynamic Games and Neural Circuit Dynamics
This section concentrates on dynamic games with complete and perfect information. Such games have three distinctive features: (i) the moves in the game occur sequentially, (ii) a move history exists (i.e., all previous moves are observed before a next move is chosen), and (iii) the payoffs in the payoff matrix are known to all players in the game. Remember, a game has complete information if the content of the payoff matrix is common knowledge to all players in the game. A game has perfect information if every player has a record of the complete history of the game so far; otherwise, the game has imperfect information. Backwards induction is a general problem-solving strategy for such games, and in many situations a game tree is a useful representation of a dynamic game. The game tree in Figure 9 represents an arbitrary dynamic two-move game played by two players (indicated as 1 and 2 in the figure).
The strategies for the players in Figure 9 are Left (L) and Right (R) for player one and Up (U) and Down (D) for player two. The numbers at the leaf nodes at the bottom of the tree represent the payoffs for the players after traversing a particular route through the tree. The top number represents the payoff for player one, and the bottom number represents the payoff for player two. The game follows three rules; taken together, these rules are referred to as the extensive-form representation of the game.
(1) Player one decides on one of the available strategies (here, L or R). (2) Player two observes this decision and decides on an appropriate strategy response (here, U or D). (3) The players receive their payoffs.

Backwards induction works its way up from the bottom of the tree. Assume the position at the bottom of Path1, where player one has decided to play strategy L and player two, who has observed this decision, is contemplating a response. The best response for player two is to play strategy D, in which case player two receives the payoff 2 (instead of payoff 1), and player one receives the payoff 1 (instead of payoff 2). Per definition, all information in the tree is available to all players (i.e., player one is aware that the response of player two is D if player one decides to play strategy L). Now assume the position of player two at the bottom of Path2. In this case the best response for player two is again to play strategy D, in which case player two receives the payoff 3 and player one the payoff 0. Player one can do some reasoning too. Between the two paths, and expecting best-response decisions by player two, player one can expect a payoff of 1 for Path1 and a payoff of 0 for Path2. Each player aims for a maximum payoff, and so player one decides to play strategy L. For player two, who is rational and aware of this thinking, the best response to this choice is to play strategy D. The pair of best responses for player one and player two is referred to as the backwards-induction outcome of the game. This text mentioned earlier that there are different types and definitions of Nash equilibrium. In the type of dynamic game just investigated, the backwards-induction outcome of the game is a Nash equilibrium for the game (note that a game may have more than one Nash equilibrium).
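The backwards-induction argument above can be mechanized. The Path1 leaf payoffs and the Path2 best-response leaf follow the values stated in the text; the unstated second leaf of Path2, here (2, 1), is a hypothetical filler, and the move labels L/R and U/D follow the strategy names of Figure 9:

```python
# Backwards induction on the two-move game of Figure 9.
# Each leaf stores (payoff_player1, payoff_player2). The leaf values for
# Path1 -- (2, 1) and (1, 2) -- and Path2's (0, 3) follow the text; the
# second Path2 leaf (2, 1) is a hypothetical filler.
TREE = {
    "L": {"U": (2, 1), "D": (1, 2)},   # Path1
    "R": {"U": (2, 1), "D": (0, 3)},   # Path2
}

def backwards_induction(tree):
    # Player two moves last and maximizes its own payoff in each subgame.
    subgame = {m1: max(leaves.items(), key=lambda kv: kv[1][1])
               for m1, leaves in tree.items()}
    # Player one anticipates this and maximizes its own resulting payoff.
    m1 = max(subgame, key=lambda m: subgame[m][1][0])
    m2, payoffs = subgame[m1]
    return m1, m2, payoffs

print(backwards_induction(TREE))  # ('L', 'D', (1, 2))
```

The result reproduces the reasoning in the text: player two answers D in both subgames, so player one prefers Path1 and the outcome is (L, D) with payoffs 1 and 2.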
Figure 10 applies these notions to the neuron communication example (see Figure 1(a) and the payoff matrix in Figure 2). The number 1 in the figure represents Neuron1, and the number 2 stands for Neuron2. The strategies for both neurons are Fire and Rest.
For the game tree in Figure 10, backwards induction produces two backwards-induction outcome pairs, namely, (Fire, Fire) and (Rest, Rest). Both pairs are a Nash equilibrium for the game. This result is not surprising and correlates with the results produced in Section 3. If Neuron1 fires, then the best response for Neuron2 is to fire too, and if Neuron1 is at rest, then the best response for Neuron2 is to be at rest too. It is necessary now to mention that game theory provides several possible extensions to the type of games presented in this section. A simple extension is games with longer sequences (perhaps an infinite number) of moves and more than two players. A complete treatment of all these features is well beyond the scope of this paper; the reference section may direct the interested reader to further relevant information on these topics. Overall, however, the section provides several important insights. First, the findings in this section associate game theory and neural network dynamics intuitively well, and second, the issue of repetitive, longer sequences involving updates naturally leads to the issue of learning.
5. Game Theory and Neural Network Learning
In order to acquire a capacity for decision-making, a network has to evolve from an unorganized state to an organized (synchronized) state, with the latter state demonstrating the desired problem-solving potential. The mechanism that drives artificial neural networks from an unorganized state to an organized state is typically realized by a learning algorithm. This section describes a learning algorithm based on game theory for artificial neural networks. The question marks in Figure 11 indicate that game theory provides two possible access points for a learning algorithm: (i) the payoffs in the payoff matrix (i.e., the payoff function) and (ii) the values for the mixed strategies.
5.1. Algorithm
For the algorithm, imagine a one-dimensional, linearly separable, supervised learning classification task. Figure 12 illustrates such a task. The classification scenario in Figure 12 takes place in an arbitrary real-valued coordinate system and involves a set of objects; together these objects represent the training set for the learning algorithm (e.g., an object may represent a measurement of membrane potential in a neuroscience experiment and indicate whether a neuron is firing or in a resting state). The values measured for these objects have been normalized to the interval [0, 1]. Let the black dots in Figure 12 represent objects of Class 1, and let the lined circles represent objects of Class 2; further, let Class 1 indicate the resting state of a neuron and Class 2 the firing state of a neuron. The two points in the figure (call them P1 and P2) are division points. In their current positions, P2 correctly separates all objects into their corresponding classes, whereas P1 incorrectly classifies three Class 2 objects. At the start of a learning scenario, P1 may have been positioned randomly, and in successive steps the learning algorithm may have moved this starting point (through various other points) until it finished in location P2, which is a solution to the problem. Figure 13 projects these ideas into a game-theoretic context.
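The division-point idea of Figure 12 can be sketched with a small hypothetical training set (the values and the class boundary below are invented for illustration; the scenario of three misclassified Class 2 objects mirrors the text):

```python
# Hypothetical training set: normalized measurements x in [0, 1] with
# Class 1 = Rest (low values) and Class 2 = Fire (high values).
TRAINING = [(0.05, "Rest"), (0.15, "Rest"), (0.25, "Rest"),
            (0.55, "Fire"), (0.60, "Fire"), (0.70, "Fire"), (0.90, "Fire")]

def misclassified(division, data):
    """Count objects that fall on the wrong side of a division point."""
    return sum(1 for x, label in data
               if (x >= division) != (label == "Fire"))

print(misclassified(0.4, TRAINING))  # 0: a solution, like point P2
print(misclassified(0.8, TRAINING))  # 3: three Class 2 objects wrong, like P1
```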
[Figure 13, panels (a) to (d)]
The figures in Figure 13 are similar to Figure 4 and represent Neuron1's point of view. Remember, the mixed strategy represents Neuron1's uncertainty about Neuron2, and the task for Neuron1 is to establish (in a learning process) a model of the expected behavior (mixed strategy , payoff function) of Neuron2. Further, every figure in Figure 13 includes two lines and either or , which are all payoff functions. (Note that the forthcoming discussion focuses on Figures 13(a) to 13(c).) Line is fixed and remains unaltered throughout the learning process. In addition, represents the payoff function for Class 1 and so, by definition, the resting state for Neuron1. The second line is determined by the angle , where degree. This line represents the payoff function for Class 2 (i.e., the firing state for Neuron1). The angle is derived by the function (e.g., the value corresponds to an angle ). The learning process for Figure 13 is similar to the scenario described for Figure 12. Figure 13(a) represents an initial random assignment for . Point in this figure is at the intersection of and . During the training phase, the learning algorithm will detect that this point does not separate the two classes correctly and take appropriate action. In this case, the algorithm will increase the angle , which moves the intersection point further to the left. Several such steps may follow until the algorithm arrives at point in Figure 13(b), which is a solution to the problem. (Note that from this position it is possible to determine (i) the mixed strategy and (ii) the payoff functions ( and ) for fire/rest for Neuron1, which is the goal of the learning algorithm.) Algorithm 1 presents the learning algorithm in pseudocode, where the positive constant in the algorithm represents the well-known learning rate. (Of course, the usual issues of learning-rate size, outliers, stopping criteria, etc. apply to this algorithm too. These issues, however, can be set aside in this text, as they are well documented elsewhere and not directly relevant to the focus of this work.)
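Under one geometric reading of the description above (a fixed horizontal payoff line at constant height c for the resting state, and a second line through the origin with slope tan θ, so that the division point lies at their intersection x_d = c / tan θ and increasing θ moves it left), the training loop might be sketched as follows. The names, the constant c, and the exact update rule are assumptions for illustration only and are not claimed to reproduce Algorithm 1:

```python
import math

def train_angle(data, c=0.5, theta=20.0, eta=2.0, max_epochs=200):
    """Perceptron-style sketch (illustrative, not the paper's Algorithm 1):
    adjust the angle theta (degrees) until the division point
    x_d = c / tan(theta) separates the two classes.
    Rest payoff is the constant c; fire payoff is tan(theta) * x.
    eta plays the role of the learning rate."""
    for _ in range(max_epochs):
        errors = 0
        x_d = c / math.tan(math.radians(theta))
        for x, label in data:
            predicted = 1 if x < x_d else 2   # rest left of x_d, fire right
            if predicted != label:
                # A Class 2 object left of x_d: move x_d left  -> raise theta.
                # A Class 1 object right of x_d: move x_d right -> lower theta.
                theta += eta if label == 2 else -eta
                errors += 1
                x_d = c / math.tan(math.radians(theta))
        if errors == 0:
            break                              # stopping criterion: no errors
    return theta
```

On a linearly separable training set such as the one sketched earlier for Figure 12, the loop moves the intersection point step by step (here by raising θ) until the division point lands between the two classes.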

Figure 13(c) captures how the learning algorithm classifies unknown objects (i.e., objects that were not involved in the training phase). Any unknown object to the left of point produces two intersections, one at and one at (white circles in the figure). However, any of these points yields . That is, the payoff for (rest) is always larger than the payoff for (fire). Therefore, Neuron1 chooses to stay at rest for any such value. For similar reasons, for any object to the right of , Neuron1 chooses to fire, because for any such value, the payoff . Equation (8) formalizes this outcome as follows:
where is the coordinate of intersection point and, in general, the separation point determined by the learning algorithm. For the sake of completeness, Figure 13(d) illustrates a possible scenario from the viewpoint of Neuron2. This scenario is similar to Figure 13(a), but here it is Neuron2 that has just received an initial random assignment for the angle . The task for the learning algorithm now is to establish a model for Neuron2 about the expected behavior (mixed strategy , payoff function) of Neuron1. Because of the general symmetry of the system, a detailed description of these processes for Neuron2 is not necessary. (Note that this does not necessarily mean that Neuron1 and Neuron2 learn on the same data. Many of the examples in this text are high-level abstractions of natural systems where (i) information exchange between two neurons can be unidirectional, bidirectional, inhibitory, or excitatory, and can affect neuronal differentiation, (ii) unconventional neurotransmitters can provide signaling from postsynaptic cells back to presynaptic cells, or (iii) chemical signaling is not limited to synapses (e.g., signaling may involve the secretion of chemical signals onto a group of nearby target cells). Thus, a measurement of data (e.g., a particular molecular concentration or a particular biochemical or electrical signal) at Neuron1 related to or may correspond to a related event involving the same or different components at Neuron2. For more detail, see [3, Unit 1, Neural Signalling].) It is important, however, to understand what Section 5 achieved. The section formalized a learning algorithm in the game theoretic framework such that a paired neuron system can establish a synchronized way of communication. The learning algorithm determines the payoff functions for the payoff matrix as well as the mixed strategies for the neurons involved. To the best of our understanding of the field, this is an interesting and novel outcome.
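The classification of an unknown object amounts to a payoff comparison at the input value. A minimal sketch, assuming a constant rest payoff c and a fire payoff tan θ · x (all names and constants are hypothetical, chosen only to illustrate the comparison):

```python
import math

def choose_action(x, theta, c=0.5):
    """Compare payoffs at input x: a constant rest payoff c versus a
    fire payoff tan(theta) * x; the neuron picks the larger payoff.
    Equivalently: rest iff x lies left of x_d = c / tan(theta)."""
    rest_payoff = c
    fire_payoff = math.tan(math.radians(theta)) * x
    return "rest" if rest_payoff > fire_payoff else "fire"
```

For example, with θ = 45 degrees the division point sits at x_d = c, so inputs left of c yield the rest action and inputs right of c yield the fire action, in line with the decision rule formalized in (8).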
(It is clear that the presented algorithm shares many similarities with traditional neural network algorithms (e.g., the perceptron learning algorithm). The presented model, however, goes beyond traditional models in which a neuron is modeled as an accumulator of multiple inputs (e.g., the McCulloch-Pitts neuron, which has been a basis of neural networks for some time). As mentioned earlier, there are, in reality, still many unknowns about individual neurons communicating with other neurons (e.g., an individual neuron is not just a neuronal membrane; it includes, for instance, complex molecular circuits and well-organized structures such as dendritic trees [1]). It is therefore necessary to develop the theoretical concept of individual neurons beyond the accumulative neuron model (e.g., by proposing a neural network model where individual neurons are assumed to behave optimally according to concepts from game theory). In addition, although the proposed model covers a paired neuron system only, it has the potential to be extended to more complex networks. For example, the angles , , etc. in Figure 13 lead to trigonometric functions (e.g., the division point in Figure 13(c) can be determined from and ). Learning algorithms for more complex multilayer networks (e.g., the backpropagation algorithm) rely heavily on derivatives (e.g., those of a transfer function). The derivatives of trigonometric functions are easy to obtain, which is certainly beneficial for potential extensions of the proposed approach to more complex network structures. A treatment of such extensions, however, is outside the scope of this paper.)
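On the remark about derivatives of trigonometric functions: if the division point is taken, purely for illustration, to be x_d(θ) = c / tan θ, then dx_d/dθ = -c / sin²θ, a closed form that a gradient-based extension could use directly. The fragment below (illustrative names only) checks this closed form against a central finite difference:

```python
import math

def x_d(theta, c=0.5):
    """Division point as a function of the angle theta (radians):
    intersection of the constant rest payoff c with the fire payoff
    line of slope tan(theta)."""
    return c / math.tan(theta)

def dx_d(theta, c=0.5):
    """Closed-form derivative d x_d / d theta = -c / sin(theta)**2,
    since c / tan(theta) = c * cot(theta)."""
    return -c / math.sin(theta) ** 2

# Numerical sanity check via a central difference at theta = 1.0 rad.
h = 1e-6
numeric = (x_d(1.0 + h) - x_d(1.0 - h)) / (2 * h)
```

The agreement between the analytic and numeric values illustrates why easily differentiable trigonometric payoff functions would be convenient in a backpropagation-style extension.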
6. Related Work
This section begins by repeating an important fact that has been mentioned several times in this text already, namely, that neuroscience and game theory are rich and rather complex fields in their own right, and that this paper, consequently, can only present a condensed view of the many challenges involved in the wider context of this investigation. A second important point is that although there is work combining game theory and neuroscience, to our understanding the two fields have not been combined in the way presented in this paper. For example, the relatively young field of neuroeconomics combines the two fields in experiments with human and nonhuman players (e.g., see Sanfey et al. [13] for a brief introduction to neuroeconomics, or Krüger et al. [2], who review the topic in detail). One assumption in the field is that one of the tasks of the human nervous system is to facilitate successful interaction in complex environments and that this process is, in essence, a decision-making process. Körding [14] describes the value that decision theory, which is formally well defined, may have for generating a better understanding of the processes going on in the nervous system during these interactions. The paper introduces the basic concepts of decision theory and emphasizes Bayesian decision theory because this theory, according to Körding, provides a compact and elegant formalism and has other properties (e.g., the ability to handle uncertainty) that may suit studies in neuroscience well. Works by Sanfey [13, 15] and Hampton et al. [11] indicate other interesting research directions in neuroeconomics.
A common feature of these papers is studies in which decision-making is based on game theoretic models that are mathematically well understood (e.g., the Prisoner's Dilemma, the Trust Game, or the Ultimatum Game) and in which the neural activity of participating players is recorded via established methods (e.g., functional magnetic resonance imaging). A major goal of these studies is to relate brain areas and fundamental brain mechanisms to decision-making tasks. Particularly interesting are findings in which outcomes disagree with theoretical predictions, as is the case when emotions such as anger, frustration, or greed come into play; such emotions are generally difficult to quantify and to describe mathematically, and the corresponding findings may challenge basic game theoretic assumptions and definitions. For example, one study [16] measuring activation in the anterior insula (a brain region involved in emotional processing) of players participating in the so-called Ultimatum Game contradicts the concept of rationality mentioned in Section 3. The results of this study indicate that players may act irrationally (in a game theoretic sense) if other players act in an antisocial or unacceptable way (e.g., a player may reject an indecent, unfair, or greedy offer). An outcome may also deviate from a predicted outcome if nonhuman players are involved (e.g., a program running on a desktop PC or a robot-like device), which may be of interest to people working in human-computer interaction.
The possible application of game theory to fields such as human-computer interaction indicates that game theory has long left its traditional environment of economics and human decision-making (the famous mathematician John Forbes Nash was awarded, jointly, the Nobel Prize in Economics in 1994 for his work in game theory). Today, the theory is widely applied in the natural sciences for the modeling of a rich variety of biological games involving agents of various types. Indeed, the principles of the theory are general enough to attract cutting-edge research in artificial intelligence and systems biology, in applications where web-based intelligent agents or robots may have to wrestle with complex decision-making problems [17] or where evolutionary game theory investigates the interplay between evolutionary dynamics and biological games [5]. For this work it is important to understand that the term rational cannot be used with ease in these domains and that the term uncertainty often softens stricter demands (e.g., those coming from probability theory). Applications in artificial intelligence and evolutionary game theory are therefore permeated by techniques from soft computing (genetic algorithms, fuzzy logic, etc.), which makes it tempting to foresee the inclusion of some of these techniques into the model proposed in this work.
Although several other interesting studies could be mentioned here, this review draws to a close with a brief comment on the timing of games. This paper dealt with static and dynamic games separately, and this treatment may have given the impression that a system, over time, always sticks to one type of game, which is questionable. Consider the timing of games in a different context. Take a tournament where the teams and are two teams among several others. Imagine not only that team and team meet in the early qualifying stages of the tournament and that team beats team during these stages, but also that both teams survive qualifying and later meet again in the final, which is won by team . (In the 2008 Olympic Games, for example, this was the case for the women's softball teams of Japan and the US: Japan lost in the early stages against the US but won the gold medal in the final against the US.) If the team coaches elaborate on their strategies in the qualifying stages, then this analysis may take the form of a static game, whereas in the final, both teams have met before, and so the coaches find themselves analyzing a dynamic game. How does this relate to neural networks? Take the case of an untrained neural network (natural or artificial) again. If the network is untrained (without history), then preliminary assumptions may come from a static game perspective. At a later point in time, some neurons in the network may have cooperated in the past in some way, and for their further interaction, dynamic game concepts may be applicable. A further treatment of this line of thought is beyond the scope of this paper, but we feel that the accumulated information in this review section provides several pointers for further research.
7. Summary
The paper presented a novel concept for describing individual neurons within the game theoretic framework. The paper established a firm understanding of some of the fundamental problems in game theory and emphasized that these problems are not unique to the domain of neural systems but extend deep into game theory, science, and the world around us. The paper demonstrated that various strategic game theoretic concepts and calculations appear naturally suited to modeling the behavior of a paired neuron system (and possibly of more complex networks). This finding was further solidified through the specification of a novel learning algorithm based on game theory for the purpose of neural learning.
Acknowledgment
The first author gratefully acknowledges the support of the Japan Society for the Promotion of Science (JSPS Fellowship no. S09168).