Neural Plasticity
Volume 2011 (2011), Article ID 579840, 21 pages
http://dx.doi.org/10.1155/2011/579840
Review Article

A Neural Correlate of Predicted and Actual Reward-Value Information in Monkey Pedunculopontine Tegmental and Dorsal Raphe Nucleus during Saccade Tasks

1Graduate School of Frontier Biosciences, Osaka University, 1-3 Machikaneyama, Toyonaka 560-8531, Japan
2Department of Physiology, Kansai Medical University, 10-15 Fumizono-cho, Moriguchi City, Osaka 570-8506, Japan
3ATR Computational Neuroscience Laboratories, 2-2-2 Hikaridai, Seika-cho, Kyoto 619-0288, Japan
4PRESTO, Japan Science and Technology Agency (JST), 4-1-8 Honcho Kawaguchi, Saitama 332-0012, Japan

Received 13 March 2011; Revised 13 July 2011; Accepted 4 August 2011

Academic Editor: Johannes J. Letzkus

Copyright © 2011 Ken-ichi Okada et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

Dopamine, acetylcholine, and serotonin, the main modulators of the central nervous system, have been proposed to play important roles in the execution of movement, control of several forms of attentional behavior, and reinforcement learning. While the response pattern of midbrain dopaminergic neurons and its specific role in reinforcement learning have been revealed, the role of the other neuromodulators remains rather elusive. Here, we review our recent studies using extracellular recording from neurons in the pedunculopontine tegmental nucleus, where many cholinergic neurons exist, and the dorsal raphe nucleus, where many serotonergic neurons exist, while monkeys performed eye movement tasks to obtain different reward values. The firing patterns of these neurons are often tonic throughout the task period, while dopaminergic neurons exhibited a phasic activity pattern to the task event. The different modulation patterns, together with the activity of dopaminergic neurons, reveal dynamic information processing between these different neuromodulator systems.

1. Introduction

Reinforcement learning algorithms, originally proposed in the machine learning field, successfully explain various types of adaptive behavioral changes, including the simple classical and operant conditioning of animals [1–6] as well as the complex social and economic behavior of humans [7]. During the reinforcement learning process, subjects choose a behavior that is expected to yield the maximal reward and then revise this prediction on the basis of the reward prediction error, which is the difference between the predicted and actual reward [8]. Numerous neurophysiological studies have shown that midbrain dopaminergic neurons, located in the substantia nigra pars compacta (SNc) and ventral tegmental area (VTA), encode the reward prediction error signal [1, 9–12]. Dopaminergic neurons exhibit phasic burst firing in response to external stimuli and rewards, and the response magnitude alters throughout the course of learning to match the reward prediction error signal [8]. Furthermore, the firing rate of dopaminergic neurons reflects the predicted reward value, which includes the possible reward magnitude, probability of reward delivery, and time delay for receiving the reward [10, 13, 14]. These dopaminergic neurons project to the striatum and cerebral cortices, and the release of dopamine in the projection sites induces synaptic plasticity that corresponds to the revision of reward prediction [6, 15–17] (see Figure 1, red arrows).
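
The prediction-revision step described above can be sketched with a simple delta-rule (Rescorla-Wagner-style) update. This is only an illustrative sketch: the learning rate `alpha`, the function names, and the reward values are assumptions introduced here and are not taken from the studies reviewed in this paper.

```python
def reward_prediction_error(actual_reward, predicted_reward):
    """Reward prediction error: actual reward minus predicted reward."""
    return actual_reward - predicted_reward


def update_prediction(predicted_reward, actual_reward, alpha=0.2):
    """Move the prediction toward the actual reward.

    alpha is a hypothetical learning rate (assumption, not from the source).
    """
    delta = reward_prediction_error(actual_reward, predicted_reward)
    return predicted_reward + alpha * delta


def learn(rewards, initial_prediction=0.0, alpha=0.2):
    """Return the per-trial prediction errors and the final prediction."""
    prediction = initial_prediction
    errors = []
    for reward in rewards:
        errors.append(reward_prediction_error(reward, prediction))
        prediction = update_prediction(prediction, reward, alpha)
    return errors, prediction
```

With a constant reward, for example `learn([1.0] * 30)`, the error is largest on the first trial and shrinks as the prediction approaches the delivered reward, which is the qualitative pattern attributed to the phasic dopaminergic response over the course of learning.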

Figure 1: Simplified cortico-basal ganglia circuitry with dopaminergic, cholinergic, and serotonergic innervation. The main cortico-basal ganglia circuit is highlighted by the dashed rectangle and the gray-shaded boxes. Midbrain dopaminergic neurons receive inhibitory input from basal ganglia nuclei and project to the striatum and cerebral cortices. The PPTN and DRN interact with dopaminergic neurons and basal ganglia nuclei. Here, we consider only the major routes by which the basal ganglia and neuromodulators are interconnected. NAc: nucleus accumbens, SNr: substantia nigra pars reticulata, GP: globus pallidus, VP: ventral pallidum, SNc: substantia nigra pars compacta, VTA: ventral tegmental area, DRN: dorsal raphe nucleus, PPTN: pedunculopontine tegmental nucleus.

Although a large body of experimental evidence has revealed the firing pattern of midbrain dopaminergic neurons and its specific role in reinforcement learning, there is considerable debate about the signal properties of these neurons. First, it was suggested that dopaminergic neurons transmit different types of signals that are related to salient or aversive events [18–22]. Second, in addition to phasic burst firing, a tonic firing pattern has also been observed in dopaminergic neurons [23, 24]. It was suggested that, in the tonic firing mode, dopaminergic neurons maintain a baseline concentration level of dopamine that is vital for motivational behavioral control and to enable the normal functions of the neural circuits. One key issue that remains unclear is the property of the input signal to the dopaminergic neurons. Therefore, several essential elements of reinforcement learning remain unresolved, namely, the mechanism for the computation of the reward prediction error and the mechanism for value formation from the interaction of different kinds of information such as the quantity, certainty, and timing of the reward.

Recent pathophysiological and pharmacological studies have suggested that there are mutual interactions between dopamine and other neuromodulators, including acetylcholine, serotonin, and noradrenaline [18, 25–29]. Together with the dopaminergic system, these neuromodulators are proposed to play an important role in gating movement, controlling several forms of attentional behavior [30], and the reinforcement process [28, 31]. The cholinergic pedunculopontine tegmental nucleus (PPTN) and laterodorsal tegmental nucleus (LDT) feed strong excitatory input to midbrain dopaminergic neurons and are reciprocally connected with various basal ganglia nuclei [32] (see Figure 1, green arrows). Additionally, the dorsal raphe nucleus (DRN) is the principal source of serotonergic innervation to the basal ganglia and dopaminergic neurons of rodents [33–37] and primates [38, 39] (see Figure 1, blue arrows). The noradrenergic locus coeruleus (LC) has widely distributed ascending projections to the neocortex [40]. The neurons for these different neuromodulators are plausible candidates as the source of input to dopaminergic neurons and also play an important role in the reinforcement process in parallel with dopaminergic neurons; however, their activity during motivated behavioral tasks remains rather elusive. Thus, in order to understand the network mechanisms underlying reinforcement learning and motivational behavioral control, it is important to elucidate the nature of the signals relayed from the neurons in these principal nuclei of neuromodulators.

We recently recorded the extracellular spike activity of PPTN and DRN neurons in behaving monkeys [41–45]. In this paper, we will compare the activity of neurons in the PPTN/DRN while monkeys performed eye movement tasks to obtain different reward values. We first summarize the growing literature on the PPTN/DRN in relation to the dopaminergic system (Section 2); we then discuss our recent single-unit recording studies from the PPTN/DRN in behaving monkeys (Section 3); and finally we assess the possible mechanisms for reward prediction error computation and its interaction with the motivational signal (Section 4). In short, PPTN and DRN neurons encode the reward prediction and actual reward signals, while dopaminergic neurons encode the reward prediction error signal. The firing patterns of PPTN/DRN neurons are often tonic: they start shortly after the presentation of the fixation target and are sustained throughout the waiting period and saccade phase until reward delivery, whereas dopaminergic neurons exhibit a phasic burst to the task event. The reward prediction signals of PPTN/DRN neurons are intermingled with the signals for task motivation.

2. Interactions between PPTN, DRN, and Dopaminergic Neurons

2.1. Anatomy: Reciprocal Interactions

The PPTN and DRN are heterogeneous nuclei in terms of their neurotransmitters. While the PPTN is the major source of cholinergic projections in the brainstem [46], it also contains glutamatergic and GABAergic [47–52] as well as dopaminergic [53] and noradrenergic [54] neurons. The DRN is the major source of serotonin in the brain [55], but it also contains neurons with GABA, dopamine, noradrenaline, substance P, and acetylcholine [56].

There are reciprocal anatomical connections between the PPTN, DRN, and dopaminergic systems (Figure 1). Neurons of the PPTN abundantly project to midbrain dopaminergic neurons in the SNc and VTA [57–60]. In rodents, the rostral PPTN projects to the SNc, while the caudal PPTN projects to the VTA [25, 61]. Dopaminergic neurons in the SNc project back to PPTN neurons and excite or inhibit them [62–64], even though the dopaminergic input to PPTN neurons is low compared with the massive cholinergic innervation of dopaminergic neurons. The PPTN also has reciprocal connections with the serotonergic DRN [65–67] and noradrenergic LC [30] monoamine systems. DRN neurons also project to midbrain dopaminergic neurons in the SNc and VTA [33, 36, 68], while dopaminergic neurons also project back to the DRN [69–72].

The PPTN and DRN also have reciprocal interactions with basal ganglia nuclei. The PPTN has massive reciprocal connections with the subthalamic nucleus, globus pallidus, and substantia nigra [73, 74]; thus, it was recently proposed to form a part of the basal ganglia [32]. The DRN projects to the basal ganglia, that is, the striatum, globus pallidus, and substantia nigra [34, 35], as well as to the cerebral cortex and limbic structures [56].

2.2. Possible Role of the PPTN/DRN in Controlling the Activity of Dopaminergic Neurons

The PPTN is one of the strongest sources of excitatory input for dopaminergic neurons [75]. PPTN neurons make glutamatergic and cholinergic synaptic connections with dopaminergic neurons [51, 76, 77]. The main effect of acetylcholine on the activity of dopaminergic neurons seems to be excitatory. In rats, electrical stimulation of the PPTN induces a time-locked burst of dopaminergic neurons [24, 78], while chemical or electrical stimulation of the PPTN increases the release of dopamine in the striatum [79–81]. Furthermore, dopaminergic neurons are dysfunctional following excitotoxic lesioning of the PPTN [82]. Other experiments have revealed the receptor level mechanisms underlying the burst firing of dopaminergic neurons induced by acetylcholine from the PPTN and LDT [25, 83, 84]. The burst firing of dopaminergic neurons depends on glutamatergic and cholinergic input [25, 85, 86]. Acetylcholine acts through nicotinic and muscarinic receptors to depolarize dopaminergic neurons and alter their firing pattern [87–90]. Thus, PPTN neuronal activity and acetylcholine provided by PPTN neurons can facilitate the burst firing of dopaminergic neurons [25] and appear to do so via muscarinic [91, 92] and nicotinic [90, 93–95] acetylcholine receptor activation.

Conversely, serotonin can exert either excitatory or inhibitory effects on the activity of midbrain dopaminergic neurons, depending on the subtypes of serotonergic receptors present and the location of the dopaminergic neurons [96]. The main mechanism controlling its action seems to be inhibition by serotonergic 2C/2B receptors [97–100]; however, several serotonergic receptor subtypes facilitate dopamine release [101]. In addition to the direct effect of serotonin via its receptors on dopaminergic neurons, it can also modulate their activity indirectly by modifying GABAergic and glutamatergic input to the VTA and SNc [102, 103].

2.3. Possible Role of the PPTN/DRN in Reinforcement Learning

The interactions between the neuromodulator systems are classically associated with wakefulness/sleep control, postural control, and several neuropsychiatric disorders [27, 66, 104, 105].

In addition to these numerous functional roles, recent studies have suggested that the PPTN is critically involved in various reinforcement processes [106–110]. Lesioning of the PPTN before operant training disrupted the acquisition of the self-administration response, while lesioning after training did not [111, 112]. Lesioning, stimulation, and reversible inactivation of the PPTN impaired the performance in several conditioned task behaviors, but they did not change simple behavior, including locomotion, feeding, and lever pressing [113–115].

Similarly, several lines of evidence suggest that the entire raphe or serotonin regulates motivated behavior [28, 31, 116–123]. The depletion of serotonin induces impulsive behavior, which might reflect a deficit of the valuation system. The systemic or local depletion of serotonin renders an animal likely to choose a small but immediate reward rather than a large but delayed reward [124–131]. The human DRN was activated when subjects learned to obtain large future rewards [119]. Long-lasting DRN activity may also have other functions because impulsivity has been associated with other serotonin-related behavioral tendencies such as aggression [132, 133] and obsession [134].

3. Responses of PPTN/DRN Neurons in Two-Valued Reward Saccade Tasks

Thus, abundant anatomical, electrophysiological, and pharmacological studies of slice and whole animal preparations indicate that PPTN/DRN neurons provide mutual inputs to dopaminergic neurons and basal ganglia nuclei and play an important role in reinforcement learning. However, the precise mechanism by which PPTN/DRN neurons cause these effects is unknown, partly because only a few studies have examined the activity of PPTN/DRN neurons during motivated behavioral tasks.

Classically, electrophysiological studies of PPTN neurons have shown their relationship with the sleep-wake cycle and locomotion [30]. Further, in a pioneering study of operant conditioned cats, PPTN neurons relayed either a reward or salient event signal by phasic firing [135]. A recent study in rats showed that the reward-related activity of PPTN neurons was affected by changes in the reward context [136]. Other studies have reported that PPTN neurons encoded the sensory or motor rather than reward information of task events in rats [137] and monkeys [138].

For DRN neurons, electrophysiological studies have mainly focused on the sleep-wake cycle and motor behavior [139], and recent studies in rats reported that DRN neurons showed transient changes in activity to sensorimotor events, including reward [140] and aversive foot shocks [141]. Recent studies in rats also reported that the efflux of serotonin was enhanced [142], and the tonic firing of DRN neurons was increased [143] while rats waited for a reward, which was related to their waiting behavior.

To examine the role of the PPTN/DRN in reward prediction error computation and adaptive behavioral control, we recorded the extracellular spike activity of PPTN, DRN, and putative dopaminergic neurons in monkeys performing saccade tasks to obtain a juice reward [41–45]. We used two-valued reward saccade tasks, that is, visually-guided and memory-guided saccade tasks, which are comparable to those used for electrophysiological recordings from basal ganglia nuclei and dopaminergic neurons. In the visually-guided saccade task, the animal maintained fixation on a central fixation target, and, immediately after the peripheral target appeared, it made a horizontal saccade. In the memory-guided saccade task, the animal made a saccade to a flashed target location after some delay.

To examine (1) the effect of the predicted reward value and (2) the effect of error in reward prediction on neuronal activity, we made two modifications to the tasks. First, in order to examine the effect of reward prediction, we made these saccade tasks two valued so that the reward magnitude (large or small) was cued by the property of the visual target (shape or location) in each trial. For recordings from PPTN neurons [42], the reward magnitude was cued by the shape of the initial central fixation target (Figure 2(a), square or circle). For recordings from DRN neurons and putative dopaminergic neurons in the SNc [44, 45], the location of the saccade target (left or right) was associated with large or small rewards, respectively (Figure 2(b)). In these conditions, the monkeys learned the relationship between the property of the cue and the reward magnitude, and the behavior of the monkeys was influenced by their expectation of the reward value.

Figure 2: Schematic diagrams for the two-valued reward saccade tasks. (a) The reward magnitude was cued by the shape of the initial central fixation target (square or circle) for recordings from PPTN neurons. (b) The location of the saccade target (left or right) was associated with large or small rewards, respectively, in recordings from DRN neurons. FT: fixation target, ST: saccade target.

Second, in order to examine the effect of the reward prediction error, which was the difference between the actual given reward and the predicted reward, we changed the contingency between the cue property and the reward value. Specifically, the contingency between the cue property (either fixation target shape or saccade target location) and the reward value was held constant for more than 20 consecutive trials, called a block. Because of the block design, once a block was started, the animal knew which cue property generated the larger reward, even before cue presentation. Then the contingency between the cue property and the reward value was switched without any additional cue; therefore, the animal received an unexpected reward magnitude only on the very first trial after contingency reversal.
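
The logic of the block design can be illustrated with a small simulation. The following is a sketch under stated assumptions, not a description of the actual experimental software: the reward magnitudes, the cue names, the strict alternation of cues, the block length of 24 trials, and the rule that a single surprise causes the simulated animal to swap both predictions are all hypothetical choices made for illustration.

```python
LARGE, SMALL = 3.0, 1.0  # hypothetical reward magnitudes (arbitrary units)


def run_blocks(n_blocks=3, trials_per_block=24):
    """Simulate uncued cue-reward contingency reversals between blocks.

    Returns a list of (block, trial, cue, prediction_error) tuples, where
    the prediction error is the actual minus the predicted reward.
    """
    # Contingency in the first block: the reward that each cue yields.
    contingency = {"square": LARGE, "circle": SMALL}
    # The simulated animal starts with the correct predictions.
    predicted = dict(contingency)
    log = []
    for block in range(n_blocks):
        if block > 0:
            # Uncued reversal: swap the rewards assigned to the two cues.
            contingency = {"square": contingency["circle"],
                           "circle": contingency["square"]}
        for trial in range(trials_per_block):
            # Cues alternate deterministically (assumption, for simplicity).
            cue = "square" if trial % 2 == 0 else "circle"
            error = contingency[cue] - predicted[cue]
            log.append((block, trial, cue, error))
            if error != 0:
                # Assumed one-trial learning: a single unexpected reward is
                # taken as evidence that both pairings have been switched.
                predicted = {"square": predicted["circle"],
                             "circle": predicted["square"]}
    return log


def surprising_trials(log):
    """(block, trial) pairs on which the reward differed from the prediction."""
    return [(block, trial) for block, trial, _, error in log if error != 0]
```

Under these assumptions, `surprising_trials(run_blocks())` contains only the first trial of each block that follows a reversal, which mirrors the statement that an unexpected reward magnitude occurs only on the very first trial after contingency reversal.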

For extracellular recording, the locations of the PPTN and DRN were estimated using magnetic resonance imaging and later verified histologically. Details of recording sites of the PPTN and DRN are shown in Figure 1 of Okada et al. [42] and Figure 1 of Nakamura et al. [44], respectively. Correct placement of the recording electrode was also confirmed by monitoring the neuronal activity in the surrounding structures, including the superior and inferior colliculi. For recordings from PPTN neurons, high-frequency tonic fiber activity in the cerebellar peduncle, close to the PPTN, was used as a landmark. For recordings from the DRN, which has a more medial location than the PPTN, the trochlear nucleus is the most prominent landmark in monkeys [144].

To record from putative dopaminergic neurons, we searched in and around the SNc. Dopaminergic neurons were identified by their irregular and tonic firing at ~5 spikes/s with broad spike potentials. The recording sites were estimated using magnetic resonance imaging and later verified histologically. In this experiment, we focused on those dopaminergic neurons that responded to reward-predicting stimuli with phasic excitation.

As noted above, although the PPTN and the DRN are centers of cholinergic and serotonergic neurons, respectively, they also contain neurons with other neurotransmitters. This heterogeneity makes it difficult to relate electrophysiological studies of PPTN/DRN neurons to their neurochemical identity. It was suggested that there are 2 types of neurons in slice preparations of the rat PPTN that generated broad and brief action potentials [145]. Recent extracellular recording studies also reported neurons that generated broad and brief action potentials; however, they exhibited a unimodal distribution and could not be classified into groups [41, 138]. For the DRN, previous studies estimated that a substantial proportion of DRN neurons are serotonergic: ~30% in rats [146], 70% of medium-sized DRN neurons in cats [55, 147], and 70% in humans [148]. Note that, in addition to serotonin, the DRN includes neurons with many kinds of neurotransmitters such as GABA, glutamate, and dopamine [56]. However, there are no reliable electrophysiological criteria (such as the baseline firing rate, spike shape, and spiking regularity) to identify the neurotransmitter of the recorded neuron. Therefore, we studied all well-isolated neurons in the PPTN/DRN whose activity changed during saccade tasks, rather than choosing neurons with specific electrophysiological properties.

3.1. Neuronal Activity of the PPTN

We recorded the extracellular spike activity of PPTN neurons during the performance of the two-valued saccade tasks in monkeys [42]. These tasks were comparable to those used in recordings from basal ganglia nuclei and dopaminergic neurons in which the shape of the fixation target (square or circle) indicated the reward magnitude (large or small, Figure 2(a)). We recorded a population of PPTN neurons that exhibited significant responses to one or more task events, including reward delivery, visual stimulus presentation, and saccade execution (153/185, 83%). The responses showed a rich variety of patterns: some exhibited a phasic response to task events, others exhibited tonic changes in activity throughout the trial, and we also observed a combination of these phasic and tonic responses.

In this section, we will describe the activity modulation of PPTN neurons for (1) the prediction of reward magnitude, (2) motivation to perform the task, and (3) actual reward magnitude. In short, two groups of PPTN neurons showed reward magnitude-dependent response modulation. A subset of neurons exhibited increased activity around the time of the onset of the fixation target that was sustained until the end of the trial, with a significant dependency on the magnitude of the predicted reward (fixation target neurons, Section 3.1.1), while the other neurons exhibited a phasic increase in activity only around the time of reward delivery, with a significant dependency on the reward magnitude of the current reward (reward delivery neurons, Section 3.1.3). All of these observed features of PPTN neuronal activity are suitable for its possible role in reward prediction error computation and appropriate action selection in a given situation.

3.1.1. Effect of the Predicted Reward Value on the Activity of PPTN Neurons

A subset of PPTN neurons exhibited increased activity around the time of the onset of the fixation target that was sustained until the end of the trial, with a significant dependency on the magnitude of the predicted reward (fixation target neurons, Figure 3). Figures 3(a) and 3(b) show raster displays and spike density functions for a representative fixation target neuron. This neuron showed elevated firing throughout the trial that was greater when the cued reward was large; compare the red raster lines and traces (large reward) with the blue ones (small reward). Differences in the responses to the large and small reward cues generally began to emerge at ~100 ms after the cue was presented. These differential responses extended throughout the working memory period following the offset of the fixation target/cue and lasted until, and even after, reward delivery (green bars), and they were almost unaffected by other task events, such as the onset of the peripheral saccade target (black bars) and the saccade to the saccade target (black triangles). Note that there were nondifferential responses before the onset of the fixation target, presumably in anticipation of its appearance. In the next section, we will discuss the relationship between these nondifferential responses and the monkeys’ motivation to perform the task. We used multiple analytical approaches, including receiver operating characteristic (ROC) analysis, mutual information, and correlation analyses, and all analyses consistently confirmed the dependency of the neuronal activity on the magnitude of the predicted reward [42]. Because some fixation target neurons maintained these differences in response even after reward delivery, we also tested their response to free-reward delivery, in which the large reward was given unexpectedly during the intertrial intervals. All of the tested fixation target neurons were totally unresponsive to free-reward delivery, consistent with the view that these neurons encode the predicted reward value instead of the actual reward or reward prediction error signals.
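
For readers unfamiliar with spike density functions, the following sketch shows one common way to turn the spike times of a single trial into a smooth firing-rate estimate. It is an illustration only: the Gaussian kernel, its width `sigma_ms`, the sampling step, and the function name are assumptions introduced here, since the kernel used in [42] is not specified in this review.

```python
import math


def spike_density(spike_times_ms, t_start_ms, t_stop_ms,
                  sigma_ms=10.0, step_ms=1.0):
    """Estimate the firing rate (spikes/s) from spike times of one trial.

    Each spike is replaced by a Gaussian kernel of width sigma_ms
    (assumed value) and the kernels are summed at every sample time.
    Returns (sample_times_ms, rate_in_spikes_per_s).
    """
    # Normalization so that each kernel integrates to 1 over time in ms.
    norm = 1.0 / (sigma_ms * math.sqrt(2.0 * math.pi))
    n_steps = int(round((t_stop_ms - t_start_ms) / step_ms)) + 1
    times = [t_start_ms + i * step_ms for i in range(n_steps)]
    density = []
    for t in times:
        total = sum(math.exp(-0.5 * ((t - s) / sigma_ms) ** 2)
                    for s in spike_times_ms)
        # Convert from spikes per millisecond to spikes per second.
        density.append(1000.0 * norm * total)
    return times, density
```

Averaging such single-trial estimates across trials of the same reward condition, aligned to a task event such as fixation target onset, would give traces comparable in form to the red and blue curves described above.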

Figure 3: Activity of fixation target neurons of the PPTN for the saccade task. (a, b) A rastergram and peritask event spike density function for the activity of a representative fixation target neuron over 10 successive trials, aligned to the onset of the fixation target. The red and blue rasters (a) and traces (b) indicate large and small reward trials, respectively. In (a), the green squares and circles indicate fixation target onset, the black bars indicate the onset of the saccade target, the black triangles indicate saccade onset, and the green lines indicate the times at which the large (3 bars) and small (1 bar) rewards were delivered. (c) Responses of fixation target neurons to fixation target (squares and circles) presentation (mean response of 200–600 ms after fixation target onset, fixation target/cue period) after reversal of cue-reward contingency. The left panel shows the large-to-small reward reversal, and the right panel shows the small-to-large reward reversal. Large-reward trials are indicated by the dark gray bars, while small-reward trials are indicated by the clear areas. Shown are the mean and standard error of the mean (SEM) of the normalized neuronal activity for the nth trial after contingency reversal. The asterisks (*) indicate the activity that was significantly different from the activity during the last 5 trials of the block with the reversed contingency (Mann-Whitney test). (d) Similar to (c) but for the responses after fixation target offset (working memory period, 200–600 ms after fixation target offset). (e–g) The activity of each fixation responsive neuron is presented as a row of pixels. (e, f) Changes in the neuronal firing rate from baseline are compared in the large- (e) and small- (f) reward trials. The color of each pixel indicates the ROC value based on the comparison of the firing rate between a control period just before fixation onset (400-ms duration) and a test window centered on the pixel (100-ms duration). Warm colors (ROC > 0.5) indicate increases in the firing rate relative to the control period, whereas cool colors (ROC < 0.5) indicate decreases in the firing rate. (g) Changes in reward-dependent modulation. The ROC value of each pixel was based on the comparison of the firing rate between the large- and small-reward trials. Warm colors (ROC > 0.5) indicate higher firing rates in the large-reward trials than in the small ones. In these 3 panels (e–g), the neurons have been sorted in order of their ROC values for the reward prediction effect during the task period. FTon: fixation target onset; STon: saccade target onset; RWon: reward onset. (Modified from [42].)
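
The ROC values in the caption above can be understood through the following minimal sketch, which computes the area under the ROC curve for two sets of firing rates as the probability that a value from one set exceeds a value from the other. The function name and the example numbers are hypothetical, and the exact procedure used in [42] may differ.

```python
def roc_area(control_rates, test_rates):
    """Area under the ROC curve comparing two samples of firing rates.

    Equals the probability that a randomly drawn test value exceeds a
    randomly drawn control value, with ties counted as one half.
    0.5 means no difference; values above 0.5 mean higher rates in the
    test sample, and values below 0.5 mean lower rates.
    """
    if not control_rates or not test_rates:
        raise ValueError("both samples must be non-empty")
    wins = 0.0
    for t in test_rates:
        for c in control_rates:
            if t > c:
                wins += 1.0
            elif t == c:
                wins += 0.5
    return wins / (len(test_rates) * len(control_rates))
```

For panels (e) and (f), `control_rates` would hold the per-trial rates in the control period and `test_rates` the rates in one 100-ms test window; for panel (g), the two samples would instead be the small- and large-reward trials for the same window.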

The tonic modulations in activity during the task period, as shown in the example neuron in Figures 3(a) and 3(b), were commonly observed in the PPTN neurons (Figures 3(e)–3(g)). After fixation target onset, but before reward delivery, approximately one-third of fixation target-responsive PPTN neurons showed significant reward-dependent modulation, with most of the neurons firing more strongly for large- than small-reward trials (Figure 3(g)). There was a small population of neurons that showed a weak negative reward magnitude dependency in which the response was smaller during the large-reward trials. For each neuron, the activity during the task period tended to increase and be sustained during both large- and small-reward trials, but the increase was greater during large-reward trials, thus leading to the differences in activity between the two reward conditions (Figures 3(e) and 3(f)).

Further insights were obtained by recording the activity in a contingency reversal paradigm, in which the meaning of the fixation target/cue was suddenly reversed during neuronal recording (Figures 3(c) and 3(d)). As a result of contingency reversal, there was a discrepancy between the predicted and actual reward, at least during the first trial, and we examined the trial-by-trial responses of the fixation target neurons around the contingency reversal period. The responses of the fixation target neurons during the fixation target period and the subsequent working memory period clearly reflected the contingency reversal with a delay of one trial. In the first reversed contingency trial, the animals could not predict the correct reward magnitude because they were unaware of the contingency reversal, and the target/cue and working memory period responses did not immediately follow the contingency reversal. The net result was that, by the second trial after contingency reversal, the cue predicting the larger reward was again associated with the higher discharge rate (i.e., one-trial learning).

3.1.2. Correlation of Fixation Target Response with Behavioral Performance

As shown above, a population of PPTN neurons showed tonic activity changes throughout the task period, and a subset showed reward value-dependent activity modulation. We then examined the relationship between the task- and reward-related modulations.

The population-averaged normalized activity of PPTN neurons is shown in Figure 4, separated according to the reward-related modulation pattern. As shown by the normalized activity modulation of each neuron in Figure 3, both reward value-dependent and -independent neurons showed elevated activity during the task period. The correlation between the neuronal activity and reward value was significant for reward value-dependent neurons; it peaked after the presentation of the fixation target and was sustained during the task period (Figure 4(c), black trace). Conversely, there was almost no correlation for reward value-independent neurons (Figure 4(d), black trace).

Figure 4: Correlations between PPTN neuronal responses with reward value and task performance. (a, b) Population spike density function of reward magnitude-dependent (a) and -independent (b) fixation target-responsive neurons averaged for large- (red) and small- (blue) reward trials, aligned to fixation target onset, saccade target onset, and reward delivery. The spike density is the population average normalized for the peaks of the individual neurons. The thick lines indicate the mean normalized activity, and the light-shaded areas are ± 1 SEM. (c, d) Correlation coefficient (absolute value) plots of the neuronal responses shown in (a) and (b) with the reaction time to fixate upon the fixation target (purple) and the reward magnitude (black). The horizontal dotted red line indicates the significance level of the correlations. FT: fixation target, ST: saccade target, RD: reward delivery. (Modified from [42].)

The increase in activity started even before the onset of the fixation target, presumably in anticipation of its appearance. Interestingly, the responses of the reward magnitude-independent neurons during the precue period were identical to those of the reward magnitude-dependent fixation target neurons (Figures 4(a) and 4(b)). To test whether the PPTN neurons encoded the motivation to fixate on the target, we analyzed the relationship between the activity during the precue period and the reaction time to fixate upon the initial fixation target (RTft).

Now, if the neurons encoded motivation in an integrated manner, then the neurons that showed reward value-dependent modulation should also show behavioral performance dependency, whereas neurons that showed no reward value dependency should also show no behavioral performance dependency. Conversely, if the neurons encoded the motivation to fixate on the target and the motivation to get the reward in an independent manner, then there should be no systematic relationship between behavioral performance dependency and reward value dependency.

The neuronal activity was correlated with RTft in a time-dependent manner in both the reward magnitude-dependent and -independent neuronal groups. This correlation became significant during the precue period, peaked shortly after the presentation of the fixation target, and declined back to baseline during the cue period (Figures 4(c) and 4(d), purple trace). Altogether, the reward magnitude-independent neurons shared with the reward magnitude-dependent neurons the response component that was correlated with RTft and related to the anticipation of cue onset. This finding indicates that the reward magnitude-independent neurons signal the early component of the motivational drive to fixate on the fixation target in almost the same manner as the reward magnitude-dependent fixation target neurons.
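The time-resolved correlation analysis described above can be illustrated with a minimal sketch. It assumes that firing rates have already been binned per trial and that a plain Pearson correlation is taken, bin by bin, between firing rate and RTft; the function names and the toy numbers are hypothetical and are not taken from the original analysis.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # correlation is undefined for constant input; report 0 here
    return sxy / sqrt(sxx * syy)

def timecourse_abs_correlation(rates_by_trial, reaction_times):
    """rates_by_trial[i][t] is the firing rate of trial i in time bin t.

    Returns |r| between firing rate and reaction time for every time bin,
    which mirrors the absolute-value plots in Figures 4(c) and 4(d).
    """
    n_bins = len(rates_by_trial[0])
    return [
        abs(pearson_r([trial[t] for trial in rates_by_trial], reaction_times))
        for t in range(n_bins)
    ]

# Hypothetical toy data: 5 trials x 3 time bins (spikes/s) and RTft (ms).
rates = [
    [10, 30, 12],
    [11, 26, 15],
    [9, 22, 11],
    [10, 18, 14],
    [12, 14, 13],
]
rt_ft = [180, 200, 220, 240, 260]

abs_r = timecourse_abs_correlation(rates, rt_ft)
print(abs_r)  # the middle bin carries the strongest correlation in this toy set
```

In this toy data set the middle bin was constructed so that higher firing rates accompany shorter reaction times, so its absolute correlation is the largest of the three bins.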

3.1.3. Effect of the Received Reward Value on the Activity of PPTN Neurons

Another group of PPTN neurons exhibited a phasic response to reward delivery, with a significant dependency on the magnitude of the delivered reward (reward delivery neurons). In contrast to the tonic activity of the fixation target neurons, the reward delivery neurons exhibited a transient response, reaching a peak discharge rate shortly after reward delivery and then rapidly declining back to baseline (Figures 5(a) and 5(b)); these neurons were almost unresponsive during the target/cue and working memory periods. In large-reward trials, the discharge rate of the transient response reached a higher peak at a slightly later time and took a few hundred milliseconds longer to decay back to baseline than in small-reward trials. Similar to the fixation target neurons, approximately half of the reward delivery neurons showed small nondifferential responses, even before reward delivery, presumably in anticipation of the timing of the reward.

Figure 5: Activity of reward delivery neurons of the PPTN for the saccade task. (a, b) A rastergram and peritask event spike density function for the activity of a representative reward delivery neuron over 10 successive trials, aligned to reward delivery. (c) Responses of the reward delivery neurons to reward delivery of large and small rewards after the reversal of cue-reward contingency. (d) Population response of reward delivery neurons to free (black) and large (red) rewards. The responses represent the average firing rate normalized for the peak responses of the individual neurons. The thick lines indicate the mean normalized activity, and the light-shaded areas are ± 1 SEM. (e–g) The activity of each reward-responsive neuron is presented as a row of pixels. (e, f) Changes in the neuronal firing rate from baseline are compared in the large- (e) and small- (f) reward trials. (g) Changes in reward-dependent modulation. In these 3 panels (e–g), the neurons have been sorted in order of their ROC values for the reward effect during the postreward delivery period. FTon: fixation target onset; STon: saccade target onset; RWon: reward onset. (Modified from [42].)

After actual reward delivery, approximately half of the reward-responsive PPTN neurons showed significant positive-reward-dependent modulation and fired more strongly during large- than small-reward trials (15/35, Figures 5(e)–5(g)). A small population of neurons showed a weak negative-reward-magnitude dependency. For each neuron, activity after reward delivery tended to increase in both large- and small-reward trials.

During the contingency reversal paradigm, there was a discrepancy between the predicted and actual reward. The responses of the reward delivery neurons changed immediately after the contingency reversal, so that larger rewards were still associated with larger neuronal responses, even on the first trial in which the monkeys predicted the small rewards (Figure 5(c)). Therefore, the reward delivery neurons convey information about the magnitude of the actual given reward, regardless of the monkeys’ prediction. We also tested the responses to free-reward delivery, and all of the tested reward delivery neurons responded briskly to the task- and free-reward delivery. The fact that the reward delivery neurons responded to the task and free rewards, given in either an expected or unexpected manner, suggests that reward delivery neurons encode the actual reward magnitude. This is fundamentally different from the reward response of dopaminergic neurons that exhibited burst firing only to an unexpectedly given reward and showed no response to the fully predicted reward (reward prediction error, see also Figure 8) [9, 149].

Overall, two different groups of PPTN neurons encode the reward prediction and actual reward signals, both of which are necessary for the computation of the reward prediction error signal in dopaminergic neurons. The reward prediction signal is encoded by the sustained tonic firing of one group of PPTN neurons (Figure 3) and is sometimes intermingled with the task motivation signal (Figure 4). The actual reward signal is encoded by the phasic response of the other group of PPTN neurons (Figure 5).

3.2. Neuronal Activity of the DRN

We also recorded extracellular spike activity from the neurons in the monkey DRN during the two-valued saccade tasks [44, 45]. The tasks were comparable to those used for the PPTN recordings, except that the location of the saccade target (left or right) indicated the reward magnitude (large or small, Figure 2(b)). We observed that, like PPTN neurons, DRN neurons also exhibited tonic changes in activity that would be ideal to encode sustained aspects of motivated behavior such as the predictive state of the upcoming reward. Detailed analyses indicated that a group of DRN neurons did indeed keep track of the predicted and/or given reward value.

3.2.1. Effect of the Predicted and Received Reward Value on the Activity of DRN Neurons

DRN neurons exhibited task-related activity that was modulated by the reward value. Figure 6(a) shows a representative example. The neuron exhibited an increase in activity after the onset of the fixation point (FPon) followed by regular and tonic firing until reward onset. The activity increased further after the onset of a large reward but ceased after the onset of a small reward, and this modulation lasted for more than 800 ms after reward onset. A subset of neurons, an example of which is shown in Figure 6(b), exhibited the opposite pattern; that is, the neuron showed small reward-dominant postreward activity that lasted until the start of the next trial. In some neurons, reward value-dependent modulation was also observed during the delay period, before reward onset, presumably reflecting the monkeys’ prediction of the reward. The neuron in Figure 6(b) exhibited stronger delay activity during small-reward trials than during large-reward trials, but only when leftward saccades were required. However, note that such directional selectivity was relatively rare among DRN neurons, and many neurons showed reward value-dependent modulation regardless of the direction of the saccade.

Figure 6: Activity of two example DRN neurons for the saccade task. For each neuron, (a) and (b), the rasters and histograms for the leftward and rightward saccades are shown separately. The changes in their firing rates are shown by the peritask event spike density function at the top. The activity in the large- and small-reward trials is shown in red and blue, respectively. The data are shown in 3 sections: the left section is aligned to the time of fixation point onset (FPon), the middle section is aligned to target onset (TGon) and fixation point offset (FPoff), and the right section is aligned to reward onset (RWon). Note that the reward offset (RWoff) applies only to the large-reward trials. The black dots indicate saccade onset (SACon), and the light blue dots indicate reward onset and offset. (Modified from [44].)

The reward-dependent modulations in activity before and after reward delivery, as shown in the example neurons in Figure 6, were commonly observed in DRN neurons (Figure 7). After target onset, but before reward delivery, approximately one-quarter of all analyzed DRN neurons showed significant reward-dependent modulation, with most of the neurons firing more strongly for large than small reward trials (Figure 7(c)). After reward delivery, more than 40% of neurons exhibited reward-dependent modulation, with half of them preferring large rewards and the other half preferring small rewards.

Figure 7: Population activity of DRN neurons. The activity of each neuron is presented as a row of pixels. (a, b) Changes in the neuronal firing rate from baseline are compared in the large- (a) and small- (b) reward trials. The color of each pixel indicates the ROC value based on the comparison of the firing rate between a control period just before fixation onset (400-ms duration) and a test window centered on the pixel (100-ms duration). Warm colors (ROC > 0.5) indicate increases in the firing rate relative to the control period, whereas cool colors (ROC < 0.5) indicate decreases in the firing rate. (c) Changes in reward-dependent modulation. The ROC value of each pixel was based on the comparison of the firing rate between the large- and small-reward trials. Warm colors (ROC > 0.5) indicate higher firing rates during the large-reward trials than during the small ones. In all panels (a–c), the neurons have been sorted in order of their ROC values for the reward effect during the postreward (400–800 ms) period (c). FPon: fixation point onset, TGon: target onset, FPoff: fixation point offset, RWon and off: reward onset and offset. (Modified from [44].)

Figure 8: Changes in the reaction times and activity of DRN and putative dopaminergic neurons with reward contingency reversal. The reaction times (a) and normalized neuronal activity (b) during the postreward period of DRN (400–800 ms after reward onset) and putative dopaminergic neurons (0–400 ms after reward onset) are plotted. In (b), the data are shown for DRN neurons with a large-reward preference (left), DRN neurons with a small-reward preference (middle), and putative dopaminergic neurons (right). Error bars, SEM. (Modified from [44].)

There was a notable difference in the reward-dependent modulation between the pre- and postreward periods. For each neuron, the changes in activity during the prereward period, compared with the baseline activity, tended to be in the same direction during large- and small-reward trials but tended to be greater during large-reward trials, thus leading to differences in the activity between the two reward conditions (Figures 7(a) and 7(b)). In contrast, the changes in activity during the postreward period, compared with the baseline activity, tended to be in opposite directions for the two reward conditions. For example, for the neuron shown in Figure 6(a), the prereward activity increased compared with the baseline during large- and small-reward trials. However, the postreward activity increased during large-reward trials, but it was inhibited during small-reward trials. Such a distinct effect on modulation indicates a different source for the modulation of DRN neuronal activity before and after reward delivery.
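The ROC values that underlie Figure 7 can be illustrated with a short sketch. It assumes the standard interpretation of the ROC area as the probability that a firing rate drawn from one set of trials exceeds a rate drawn from the other set, with ties counted as one half; the function name and the firing rates below are hypothetical and serve only to show how values above and below 0.5 arise.

```python
def roc_area(test_rates, control_rates):
    """Area under the ROC curve for two samples of firing rates.

    Computed as the fraction of (test, control) pairs in which the test
    value is larger, with ties counted as one half. A value above 0.5 means
    the test sample tends to be higher; below 0.5 means it tends to be lower.
    """
    wins = 0.0
    for t in test_rates:
        for c in control_rates:
            if t > c:
                wins += 1.0
            elif t == c:
                wins += 0.5
    return wins / (len(test_rates) * len(control_rates))

# Hypothetical firing rates (spikes/s) for one neuron.
baseline = [8, 10, 9, 11, 10]      # control period before fixation onset
post_large = [15, 18, 14, 16, 17]  # postreward window, large-reward trials
post_small = [4, 6, 5, 3, 7]       # postreward window, small-reward trials

roc_large_vs_base = roc_area(post_large, baseline)     # excitation -> above 0.5
roc_small_vs_base = roc_area(post_small, baseline)     # inhibition -> below 0.5
roc_large_vs_small = roc_area(post_large, post_small)  # large-reward preference
print(roc_large_vs_base, roc_small_vs_base, roc_large_vs_small)
```

With these toy numbers the postreward activity moves in opposite directions relative to baseline for the two reward sizes, which is the pattern described for the neuron in Figure 6(a).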

While recording from DRN neurons, the contingency between the target position and reward value was fixed during one block of trials but was then reversed with no external cue. This allowed us to examine how the monkeys’ performance and neuronal activity adapted to the new position-reward contingency. The saccadic reaction times changed quickly after the reversal of the position-reward contingency (Figure 8(a)). We therefore examined the time course of the changes in the mean normalized firing rates for DRN neurons (400–800 ms after reward onset) and for the putative dopaminergic neurons (0–400 ms after reward onset) as a function of the trial number after reversal.

There was a striking difference between the DRN neurons and dopaminergic neurons in their postreward activity. The activity of DRN neurons faithfully followed the size of the reward (Figure 8(b), left and middle). In other words, DRN neurons reliably coded the value of the received reward whether or not it was expected. In contrast, the activity of the dopaminergic neurons only changed transiently during the first trial and, thereafter, returned to a level close to baseline activity (Figure 8(b), right). Specifically, dopaminergic neurons decreased their postreward activity for large-to-small reward reversals and increased their activity for small-to-large reversals. These transient changes in postreward activity represent the “reward prediction error,” which is the difference between the value of the predicted (e.g., small reward) and the actual rewards (e.g., large reward). This progression in the postreward activity of dopaminergic neurons is consistent with the findings of other studies [9, 149]. Thus, the results indicate that DRN neurons encode the actual reward value and not the reward prediction error.
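The contrast between an actual reward signal and a reward prediction error signal after contingency reversal can be made concrete with a small simulation. This is only a sketch under stated assumptions: the prediction is updated with a simple delta rule, and the reward magnitudes, the learning rate, and the function name are hypothetical choices for illustration, not parameters estimated from the recordings.

```python
def simulate_after_reversal(rewards, initial_prediction, learning_rate=0.5):
    """Track two signals over trials following a contingency reversal.

    actual_signal follows the delivered reward itself, whereas error_signal
    is the difference between the delivered and the predicted reward. The
    prediction is updated by a delta rule (an assumption of this sketch).
    """
    prediction = initial_prediction
    actual_signal, error_signal = [], []
    for reward in rewards:
        error = reward - prediction
        actual_signal.append(reward)
        error_signal.append(error)
        prediction += learning_rate * error  # move the prediction toward the outcome
    return actual_signal, error_signal

SMALL, LARGE = 1.0, 4.0  # arbitrary reward magnitudes

# Small-to-large reversal: the animal still predicts SMALL on the first trial.
actual, errors = simulate_after_reversal([LARGE] * 4, initial_prediction=SMALL)
print(actual)  # [4.0, 4.0, 4.0, 4.0]
print(errors)  # [3.0, 1.5, 0.75, 0.375]
```

The actual reward signal stays at the delivered magnitude on every trial, while the error signal is largest on the first trial after reversal and then shrinks as the prediction catches up. How quickly it shrinks depends entirely on the assumed learning rate.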

3.2.2. Coding of the Task Reward Value in the DRN

As shown in Figure 6, the response of the DRN neurons often took the form of tonic activity changes throughout multiple task phases. This type of activity would be ideal to encode sustained aspects of motivated behavior such as the state of expectation for the upcoming reward.

To test this hypothesis, we analyzed the relationship between the tonic activity during the fixation period and the differential responses to reward cues and actual rewards. Note that during the fixation period (before target onset), the exact reward value the animal would receive for that trial was as yet unknown (Figure 2(b)). However, the overall value of the behavioral task would be between the large- and small-reward value, which may be expressed by the neuronal firing rate during the fixation period. Now, if the neurons encoded behavioral tasks primarily in terms of their reward value throughout a trial, then the neurons that were excited during the fixation period should preferentially be excited by the reward cues and the actual reward, whereas the neurons that were inhibited during the fixation period should be preferentially inhibited by the reward cues and the actual reward. On the contrary, if the neurons encoded the information (including the reward value) during the fixation period and after the reward cue and reward delivery in an independent manner, then there should be no systematic relationship between the fixation and reward-related activity.

The population-averaged normalized activity of DRN neurons is shown in Figure 9, separately for neurons with positive (Figure 9(a)), negative (Figure 9(b)), or no significant reward signals (Figure 9(c)) in response to reward delivery. Neurons with positive-reward signals for reward delivery (stronger activity for a large reward than for a small reward) had elevated activity during the fixation period (Figure 9(a)). If the large-reward target appeared, their activity was elevated further, whereas if the small-reward target appeared, they returned to near the baseline. Neurons with negative-reward signals (stronger activity for a small reward than for a large reward) had suppressed activity during the fixation period (Figure 9(b)). If the large-reward target appeared, their activity was further suppressed, whereas, if the small-reward target appeared, they returned to near the baseline. Neurons with no significant reward signals had a tendency for phasic responses to the fixation and saccade targets and slightly elevated activity during the fixation period (Figure 9(c)). Further analyses revealed that neurons with stronger task coding, that is, changes in their fixation period activity, also had stronger reward coding, that is, different activity between the large- and small-reward trials. Collectively, such equivalent changes in activity between the fixation and postreward periods suggest that the level of DRN activity continually tracks the predicted value.

Figure 9: Population-averaged activity of DRN neurons separated by their reward signals in response to the outcome. (a–c) Normalized activity is shown for the memory-guided saccade task (MGS, left) and visually-guided saccade task (VGS, right), shown separately for positive-reward neurons (a, top), negative-reward neurons (b, middle), and no-outcome response neurons (c, bottom). The colors indicate the average of all trials (black), large-reward trials (red), and small-reward trials (blue). The neurons were sorted into these categories on the basis of significant reward discrimination after outcome onset (gray bar on the x-axis; Wilcoxon rank-sum test). The histograms below (c) show the reward discrimination for each neuron, with the colors indicating positive-reward neurons (red) and negative-reward neurons (blue). For the plots of normalized activity, the activity of each neuron was normalized by computing its ROC area versus baseline activity during the intertrial interval. The thick lines indicate the mean normalized activity, and the light shaded areas are ±1 SEM. (Modified from [45].)

4. Circuit Mechanisms for the Computation of the Reward Prediction Error Signal

4.1. Summary of the Response Patterns of PPTN/DRN Neurons

Here we summarize and compare the temporal activity patterns of the dopaminergic, PPTN, and DRN neurons to the presentation of the reward-predicting cue and reward delivery in the two-valued reward task (Figure 10).

Figure 10: Schematic drawing of the activity changes of dopaminergic, PPTN, and DRN neurons for the two-valued saccade task. Cue and reward indicate the timing of reward-cue presentation (either fixation target shape or saccade target location) and large- and small-reward delivery, respectively. The colors indicate the responses in the large-reward trials (red) and small-reward trials (blue) and the responses of neurons with no significant reward modulation (black). (A) Dopaminergic neurons exhibited phasic burst firing to a reward-predictive cue and an unexpected reward (dashed lines). (B, D) Two different groups of PPTN neurons exhibited a tonic reward prediction response (B) and a phasic actual reward response (D). (C) PPTN neurons with no significant reward modulation often exhibited tonic activity during the task period. (E, G) DRN neurons exhibited correlated central fixation and reward modulation, preferring either larger (E) or smaller rewards (G). (F) DRN neurons with no significant reward modulation often exhibited a phasic response to target presentation.

In the earlier phases of the trial, the reward-predicting cue was presented. The dopaminergic neurons then exhibited a phasic burst of activity. The magnitude of their response was correlated with the predicted reward value, such that greater firing occurred in response to more valuable cues (Figure 10(A)) [150]. In contrast, a group of PPTN neurons exhibited an increase in activity to reward cue presentation, and this activity was sustained throughout the task period. Some neurons showed stronger activity when the predicted reward was larger (Figure 10(B)), while others did not show any reward magnitude-dependent modulation (Figure 10(C)). Both types of neurons showed behavioral performance-related modulation, even before cue onset. Similar to the PPTN, a group of DRN neurons also showed stronger activity for the cue predicting a larger reward (Figure 10(E)). In addition, another group of DRN neurons exhibited the opposite firing pattern, that is, decreased activity for the cue predicting a larger reward (Figure 10(G)). Unlike the PPTN, the DRN neurons with no significant reward modulation showed phasic responses to target presentation and slightly elevated activity during the fixation period (Figure 10(F)).

In the later phases of the trial, the monkeys received a juice reward. The dopaminergic neurons now exhibited a phasic burst or pause in activity immediately after cue-reward contingency reversal, in which the reward value was larger or smaller than expected, respectively (Figure 10(A), dashed line). The PPTN neurons that showed tonic firing to the cue ceased firing around the time of reward delivery (Figures 10(B) and 10(C)) and were totally unresponsive to an unpredictably given reward. A different group of PPTN neurons, which did not modulate their activity in response to the cue, now exhibited a phasic burst to reward delivery (Figure 10(D)), and the response magnitude was correlated with the given reward value. Tonic-firing DRN neurons also showed a prolonged modulation of activity after reward delivery (Figures 10(E) and 10(G)). The reward-related modulation tended to be correlated with the modulation in activity during the fixation period. Notably, the changes in activity for large and small rewards tended to be in opposite directions; for example, the postreward activity increased during large-reward trials but was inhibited during small-reward trials, or vice versa. When there was a reward prediction error, just after cue-reward contingency reversal, the response of the reward delivery neurons of the PPTN (Figure 10(D)) and DRN (Figures 10(E) and 10(G)) faithfully followed the actual magnitude of the reward.

Some limitations of these extracellular recording studies in monkeys have to be considered. First, the PPTN and DRN are heterogeneous nuclei and contain various kinds of neurons. In our current experiments, however, the neurochemical identity of the recorded neurons was hard to determine. To date, we have not found a significant relationship between the firing pattern of the neurons and their neurophysiological characteristics, such as spike width, firing regularity, and recording site. Second, the PPTN and DRN have massive reciprocal interconnections, not only with dopaminergic neurons but also with other brain areas; thus, the firing patterns of the neurons could be either input or output signals. While we found several types of representation, that is, tonic fixation and phasic reward modulation of PPTN neurons and positive and negative reward modulation of DRN neurons, the organization of these circuits and their interactions remain unclear. With due consideration given to these methodological limitations, we believe that the present study contributes to our understanding of the role of neuromodulator systems in reinforcement learning and motivational behavioral control.

4.2. PPTN/DRN Neurons Relay the Tonic Reward Prediction Signal

A prominent feature of PPTN/DRN neuronal activity is its tonic modulation pattern, and these tonic firing patterns during the task period resemble the short-term memory of the reward prediction for the current trial. Computational models [151–155] of dopaminergic neuronal firing have noted similarities between the response patterns of dopaminergic neurons and well-known learning algorithms, especially temporal difference reinforcement learning algorithms. However, there has been considerable debate regarding the circuit mechanisms underlying reward prediction error computation [154].

The temporal difference model uses fast-sustained excitatory reward prediction and delayed slow-sustained inhibitory signals in dopaminergic neurons to produce an onset burst to the cue followed by offset suppression to the reward. Previous studies have suggested that several structures might send the tonic inhibitory reward prediction signals to dopaminergic neurons, such as the striosome [154, 155] and ventral pallidum [156]. However, the crucial missing link between the learning algorithm and the reported neuronal activity is the excitatory tonic input to dopaminergic neurons, which resembles the memory of the predicted reward value maintained until the actual reward delivery. The classical model supposed that the neurons in the striatum (the striosome) might provide both signals to dopaminergic neurons via direct and double-inhibition mechanisms. Our present findings suggest that a group of PPTN/DRN neurons could send a direct tonic excitatory component to dopaminergic neurons. How are these tonic signals from PPTN/DRN neurons converted to the phasic signals observed in the dopaminergic neurons? The simplest model that matches the algorithm is the summation of the excitatory and inhibitory tonic signals, as follows. When the reward cue is presented, dopaminergic neurons receive the fast-sustained excitatory reward prediction signal that we propose and a delayed slow-sustained inhibitory signal from the basal ganglia. DRN neurons can play either an excitatory or inhibitory role because both excitatory and inhibitory types of neurons are present, and serotonin exerts excitatory and inhibitory effects via several subtypes of serotonergic receptors [96]. As a result of summation, dopaminergic neurons exhibit transient excitatory and inhibitory signals timed at reward cue presentation and reward delivery, respectively.

An alternative model for the computation suggests that the temporal differentiation of the tonic reward prediction signal, which increases at reward cue presentation and falls around the time of reward delivery, may produce the phasic signals of dopaminergic neurons. During the reward delivery phase, the inhibitory transients are summed with the excitatory actual reward signals from the other group of PPTN neurons, as we propose, for the computation of the reward prediction error; thus, dopaminergic neurons produce no response when the reward prediction matches the actual one [14, 157].
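The temporal differentiation account can be sketched in a few lines. The sketch assumes discrete time steps, no temporal discounting, and a rectangular tonic prediction signal that rises at cue onset and falls at reward delivery; the step numbers, magnitudes, and function names are hypothetical and are chosen only to show how a phasic signal can emerge from a tonic one.

```python
def tonic_prediction(n_steps, cue_step, reward_step, predicted_value):
    """Tonic reward prediction: on from cue onset until reward delivery."""
    return [
        predicted_value if cue_step <= t < reward_step else 0.0
        for t in range(n_steps)
    ]

def phasic_error(prediction, reward_step, actual_reward):
    """Temporal difference of the tonic prediction plus the actual reward.

    delta[t] = (prediction[t] - prediction[t - 1]) + reward[t]
    This undiscounted form is an assumption made for this sketch.
    """
    deltas = []
    previous = 0.0
    for t, value in enumerate(prediction):
        reward = actual_reward if t == reward_step else 0.0
        deltas.append(value - previous + reward)
        previous = value
    return deltas

N_STEPS, CUE, REWARD = 8, 2, 6
prediction = tonic_prediction(N_STEPS, CUE, REWARD, predicted_value=1.0)

matched = phasic_error(prediction, REWARD, actual_reward=1.0)  # reward as predicted
larger = phasic_error(prediction, REWARD, actual_reward=3.0)   # better than predicted
smaller = phasic_error(prediction, REWARD, actual_reward=0.5)  # worse than predicted

print(matched)  # burst at the cue only, no response at the reward
print(larger[REWARD], smaller[REWARD])  # positive and negative error at the reward
```

In this sketch the offset of the tonic prediction contributes a negative transient at reward time, which the actual reward signal cancels exactly when the reward matches the prediction, leaving a phasic response at the cue alone.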

Recent studies have emphasized the potential importance of the lateral habenula and rostromedial tegmental nucleus for the inhibition of dopaminergic neurons [158, 159]. Neurons in the lateral habenula are inhibited by a reward-predicting stimulus, but fire following a nonreward signal [160]. These structures are other possible candidates for the computation of the reward prediction error and are also interconnected with the PPTN and the DRN [65].

4.3. PPTN/DRN Neurons Relay the Task Motivation Signal

In addition to the reward prediction signal, an overlapping group of PPTN/DRN neurons showed task motivation-related activity modulation. The majority of PPTN neurons exhibited a tonic increase in activity regardless of their reward-related modulation. This tonic increase in activity occurred even before reward cue presentation, and part of these responses showed a significant dependency on the monkeys’ performance of the task, such that stronger activity was observed during a good-performance epoch than during a poor-performance epoch. The recruitment of the PPTN in motivational control is consistent with previous studies [30, 114, 161]. In contrast, task-related changes in DRN neurons included both excitation and inhibition of activity. Furthermore, the reward-related modulation tended to be correlated with the initial task-related modulation, such that neurons with elevated activity exhibited stronger activity for a large reward than for a small reward. This observation suggests that DRN neurons encode correlated task and reward information, while PPTN neurons encode these signals independently.

4.4. PPTN/DRN Neurons Relay the Actual Reward Signal

In the reward delivery phase, PPTN and DRN neurons encode the “actual reward signal,” while dopaminergic neurons encode the “reward prediction error signal.” The actual reward signal is necessary information to compute the error between the predicted and actual reward; however, there are several differences between the actual reward signals of PPTN and DRN neurons. First, in the PPTN, two different groups of neurons encode the reward prediction and actual reward signals, while an overlapping group of DRN neurons encode both signals. Thus, PPTN neurons exhibited phasic burst firing only to reward delivery and were almost silent during the task period, while DRN neurons exhibited tonic firing both before and after reward delivery. Second, the actual reward responses of PPTN neurons were phasic, while DRN neurons exhibited a tonic modulation pattern that was sometimes sustained until just before the next trial. Third, PPTN neurons exhibited an increase in firing to large- and small-reward delivery, while DRN neurons exhibited an opposite response to these rewards.

These observations suggest that PPTN neurons encode a simple reward value, while DRN neurons encode rather more complex information. The correlated coding of task and reward signals by DRN neurons might be related to the reported relationship of serotonin to impulsive behavior. One hypothesis is that DRN neurons integrate task-related reward prediction signals and actual received reward signals and have a role in time discounting for future rewards. A recent study in rats also reported that DRN neurons increased tonic firing while the rats waited for a reward, and this was related to the rats’ waiting behavior [143]. Another hypothesis is that the actual reward signal of DRN neurons might be biased by the possible reward value over a rather long time scale (across blocks of trials). As shown above, even when the delivered magnitude of the reward was as predicted, some DRN neurons showed a decrease in firing to small-reward delivery; thus, DRN neurons might encode the error between the actual reward and the average of all possible options over a rather long time scale. Such patterns of relative reward value coding would be useful in comparing and selecting reward options, including reward value and time delay for receiving a reward.
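The second hypothesis, coding of the reward relative to a slowly changing average of the possible outcomes, can be sketched as follows. The slow running average, its update rate, the reward magnitudes, and the function name are all assumptions of this illustration; the recordings do not specify how such an average would be computed.

```python
def relative_reward_signals(rewards, baseline, rate=0.1):
    """Reward measured against a slowly updated average of past rewards.

    A small update rate keeps the baseline close to the long-run mean of the
    available outcomes, so it barely changes from one trial to the next.
    """
    signals = []
    for reward in rewards:
        signals.append(reward - baseline)     # deviation from the long-run average
        baseline += rate * (reward - baseline)  # slow drift toward recent rewards
    return signals

SMALL, LARGE = 1.0, 3.0              # arbitrary reward magnitudes
start_baseline = (SMALL + LARGE) / 2  # average of the two possible outcomes

signals = relative_reward_signals([LARGE, SMALL, LARGE, SMALL], start_baseline)
print(signals)  # positive for large rewards, negative for small rewards
```

Under this scheme a small reward yields a negative signal even when it was fully predicted by the cue, because it falls below the long-run average, which would be in line with the decrease in firing that some DRN neurons showed to small-reward delivery.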

Overall, the activity patterns of PPTN and DRN neurons were different from those of dopaminergic neurons, which are well known to encode the reward prediction error signal. Furthermore, the reward prediction and actual reward signals of PPTN/DRN neurons that we propose are necessary for the computation of the reward prediction error and for appropriate action selection in a given situation. The different modulation patterns of the PPTN and DRN, together with the activity of dopaminergic neurons, reveal dynamic information processing between these different neuromodulator systems.

Acknowledgments

This study was supported by grants from the Ministry of Education, Culture, Sports, Science and Technology (854029, 17022027, 18020019, 20033013, 20300139) and by the Japan Science and Technology Agency PRESTO Program.

References

  1. P. Waelti, A. Dickinson, and W. Schultz, “Dopamine responses comply with basic assumptions of formal learning theory,” Nature, vol. 412, no. 6842, pp. 43–48, 2001.
  2. B. J. Richmond, Z. Liu, and M. Shidara, “Neuroscience. Predicting future rewards,” Science, vol. 301, no. 5630, pp. 179–180, 2003.
  3. T. Satoh, S. Nakai, T. Sato, and M. Kimura, “Correlated coding of motivation and outcome of decision by dopamine neurons,” Journal of Neuroscience, vol. 23, no. 30, pp. 9913–9923, 2003.
  4. K. Samejima, Y. Ueda, K. Doya, and M. Kimura, “Representation of action-specific reward values in the striatum,” Science, vol. 310, no. 5752, pp. 1337–1340, 2005.
  5. A. M. Graybiel, “The basal ganglia: learning new tricks and loving it,” Current Opinion in Neurobiology, vol. 15, no. 6, pp. 638–644, 2005.
  6. O. Hikosaka, K. Nakamura, and H. Nakahara, “Basal ganglia orient eyes to reward,” Journal of Neurophysiology, vol. 95, no. 2, pp. 567–584, 2006.
  7. P. R. Montague and G. S. Berns, “Neural economics and the biological substrates of valuation,” Neuron, vol. 36, no. 2, pp. 265–284, 2002.
  8. W. Schultz, “Predictive reward signal of dopamine neurons,” Journal of Neurophysiology, vol. 80, no. 1, pp. 1–27, 1998.
  9. J. R. Hollerman and W. Schultz, “Dopamine neurons report an error in the temporal prediction of reward during learning,” Nature Neuroscience, vol. 1, no. 4, pp. 304–309, 1998.
  10. C. D. Fiorillo, P. N. Tobler, and W. Schultz, “Discrete coding of reward probability and uncertainty by dopamine neurons,” Science, vol. 299, no. 5614, pp. 1898–1902, 2003.
  11. H. Nakahara, H. Itoh, R. Kawagoe, Y. Takikawa, and O. Hikosaka, “Dopamine neurons can represent context-dependent prediction error,” Neuron, vol. 41, no. 2, pp. 269–280, 2004.
  12. R. Kawagoe, Y. Takikawa, and O. Hikosaka, “Reward-predicting activity of dopamine and caudate neurons—a possible mechanism of motivational control of saccadic eye movement,” Journal of Neurophysiology, vol. 91, no. 2, pp. 1013–1024, 2004.
  13. S. Kobayashi and W. Schultz, “Influence of reward delays on responses of dopamine neurons,” Journal of Neuroscience, vol. 28, no. 31, pp. 7837–7846, 2008.
  14. C. D. Fiorillo, W. T. Newsome, and W. Schultz, “The temporal precision of reward prediction in dopamine neurons,” Nature Neuroscience, vol. 11, no. 8, pp. 966–973, 2008.
  15. M. Watanabe, “Reward expectancy in primate prefrontal neurons,” Nature, vol. 382, no. 6592, pp. 629–632, 1996.
  16. D. Lee and H. Seo, “Mechanisms of reinforcement learning and decision making in the primate dorsolateral prefrontal cortex,” Annals of the New York Academy of Sciences, vol. 1104, pp. 108–122, 2007.
  17. J. N. Reynolds, B. I. Hyland, and J. R. Wickens, “A cellular mechanism of reward-related learning,” Nature, vol. 413, no. 6851, pp. 67–70, 2001.
  18. E. S. Bromberg-Martin, M. Matsumoto, and O. Hikosaka, “Dopamine in motivational control: rewarding, aversive, and alerting,” Neuron, vol. 68, no. 5, pp. 815–834, 2010. View at Publisher · View at Google Scholar · View at Scopus
  19. K. C. Berridge, “‘Liking’ and “wanting” food rewards: brain substrates and roles in eating disorders,” Physiology and Behavior, vol. 97, no. 5, pp. 537–550, 2009. View at Publisher · View at Google Scholar · View at Scopus
  20. M. Joshua, A. Adler, R. Mitelman, E. Vaadia, and H. Bergman, “Midbrain dopaminergic neurons and striatal cholinergic interneurons encode the difference between reward and aversive events at different epochs of probabilistic classical conditioning trials,” Journal of Neuroscience, vol. 28, no. 45, pp. 11673–11684, 2008. View at Publisher · View at Google Scholar · View at Scopus
  21. P. Redgrave, K. Gurney, and J. Reynolds, “What is reinforced by phasic dopamine signals?” Brain Research Reviews, vol. 58, no. 2, pp. 322–339, 2008. View at Publisher · View at Google Scholar · View at Scopus
  22. M. Matsumoto and O. Hikosaka, “Two types of dopamine neuron distinctly convey positive and negative motivational signals,” Nature, vol. 459, no. 7248, pp. 837–841, 2009. View at Publisher · View at Google Scholar · View at Scopus
  23. A. A. Grace, S. B. Floresco, Y. Goto, and D. J. Lodge, “Regulation of firing of dopaminergic neurons and control of goal-directed behaviors,” Trends in Neurosciences, vol. 30, no. 5, pp. 220–227, 2007. View at Publisher · View at Google Scholar · View at Scopus
  24. S. B. Floresco, A. R. West, B. Ash, H. Moorel, and A. A. Grace, “Afferent modulation of dopamine neuron firing differentially regulates tonic and phasic dopamine transmission,” Nature Neuroscience, vol. 6, no. 9, pp. 968–973, 2003. View at Publisher · View at Google Scholar · View at Scopus
  25. J. Mena-Segovia, P. Winn, and J. P. Bolam, “Cholinergic modulation of midbrain dopaminergic systems,” Brain Research Reviews, vol. 58, no. 2, pp. 265–271, 2008. View at Publisher · View at Google Scholar · View at Scopus
  26. S. Kapur and G. Remington, “Serotonin-dopamine interaction and its relevance to schizophrenia,” American Journal of Psychiatry, vol. 153, no. 4, pp. 466–476, 1996. View at Scopus
  27. D. B. Lester, T. D. Rogers, and C. D. Blaha, “Acetylcholine-dopamine interactions in the pathophysiology and treatment of CNS disorders,” CNS Neuroscience & Therapeutics, vol. 16, no. 3, pp. 137–162, 2010. View at Publisher · View at Google Scholar · View at Scopus
  28. K. Doya, “Modulators of decision making,” Nature Neuroscience, vol. 11, no. 4, pp. 410–416, 2008. View at Publisher · View at Google Scholar · View at Scopus
  29. Y. Koyama and Y. Kayama, “Mutual interactions among cholinergic, noradrenergic and serotonergic neurons studied by ionophoresis of these transmitters in rat brainstem nuclei,” Neuroscience, vol. 55, no. 4, pp. 1117–1126, 1993. View at Publisher · View at Google Scholar · View at Scopus
  30. E. Garcia-Rill, “The pedunculopontine nucleus,” Progress in Neurobiology, vol. 36, no. 5, pp. 363–389, 1991. View at Publisher · View at Google Scholar · View at Scopus
  31. K. Doya, “Metalearning and neuromodulation,” Neural Networks, vol. 15, no. 4–6, pp. 495–506, 2002. View at Publisher · View at Google Scholar · View at Scopus
  32. J. Mena-Segovia, J. P. Bolam, and P. J. Magill, “Pedunculopontine nucleus and basal ganglia: distant relatives or part of the same family?” Trends in Neurosciences, vol. 27, no. 10, pp. 585–588, 2004. View at Publisher · View at Google Scholar · View at Scopus
  33. H. Moukhles, O. Bosler, J. P. Bolam et al., “Quantitative and morphometric data indicate precise cellular interactions between serotonin terminals and postsynaptic targets in rat substantia nigra,” Neuroscience, vol. 76, no. 4, pp. 1159–1171, 1997. View at Publisher · View at Google Scholar · View at Scopus
  34. H. Imai, D. A. Steindler, and S. T. Kitai, “The organization of divergent axonal projections from the midbrain raphe nuclei in the rat,” Journal of Comparative Neurology, vol. 243, no. 3, pp. 363–380, 1986. View at Scopus
  35. D. Van der Kooy and T. Hattori, “Dorsal raphe cells with collateral projections to the caudate-putamen and substantia nigra: a fluorescent retrograde double labeling study in the rat,” Brain Research, vol. 186, no. 1, pp. 1–7, 1980. View at Publisher · View at Google Scholar · View at Scopus
  36. E. J. Van Bockstaele, D. M. Cestari, and V. M. Pickel, “Synaptic structure and connectivity of serotonin terminals in the ventral tegmental area: potential sites for modulation of mesolimbic dopamine neurons,” Brain Research, vol. 647, no. 2, pp. 307–322, 1994. View at Publisher · View at Google Scholar · View at Scopus
  37. M. Waselus, J. P. Galvez, R. J. Valentino, and E. J. Van Bockstaele, “Differential projections of dorsal raphe nucleus neurons to the lateral septum and striatum,” Journal of Chemical Neuroanatomy, vol. 31, no. 4, pp. 233–242, 2006. View at Publisher · View at Google Scholar · View at Scopus
  38. B. Lavoie and A. Parent, “Immunohistochemical study of the serotoninergic innervation of the basal ganglia in the squirrel monkey,” Journal of Comparative Neurology, vol. 299, no. 1, pp. 1–16, 1990. View at Scopus
  39. S. N. Haber, “The primate basal ganglia: parallel and integrative networks,” Journal of Chemical Neuroanatomy, vol. 26, no. 4, pp. 317–330, 2003. View at Publisher · View at Google Scholar · View at Scopus
  40. C. W. Berridge and B. D. Waterhouse, “The locus coeruleus-noradrenergic system: modulation of behavioral state and state-dependent cognitive processes,” Brain Research Reviews, vol. 42, no. 1, pp. 33–84, 2003. View at Publisher · View at Google Scholar · View at Scopus
  41. Y. Kobayashi, Y. Inoue, M. Yamamoto, T. Isa, and H. Aizawa, “Contribution of pedunculopontine tegmental nucleus neurons to performance of visually guided saccade tasks in monkeys,” Journal of Neurophysiology, vol. 88, no. 2, pp. 715–731, 2002. View at Scopus
  42. K. I. Okada, K. Toyama, Y. Inoue, T. Isa, and Y. Kobayashi, “Different pedunculopontine tegmental neurons signal predicted and actual task rewards,” Journal of Neuroscience, vol. 29, no. 15, pp. 4858–4870, 2009. View at Publisher · View at Google Scholar · View at Scopus
  43. K. I. Okada and Y. Kobayashi, “Characterization of oculomotor and visual activities in the primate pedunculopontine tegmental nucleus during visually guided saccade tasks,” European Journal of Neuroscience, vol. 30, no. 11, pp. 2211–2223, 2009. View at Publisher · View at Google Scholar · View at Scopus
  44. K. Nakamura, M. Matsumoto, and O. Hikosaka, “Reward-dependent modulation of neuronal activity in the primate dorsal raphe nucleus,” Journal of Neuroscience, vol. 28, no. 20, pp. 5331–5343, 2008. View at Publisher · View at Google Scholar · View at Scopus
  45. E. S. Bromberg-Martin, O. Hikosaka, and K. Nakamura, “Coding of task reward value in the dorsal raphe nucleus,” Journal of Neuroscience, vol. 30, no. 18, pp. 6262–6272, 2010. View at Publisher · View at Google Scholar · View at Scopus
  46. M. M. Mesulam, E. J. Mufson, B. H. Wainer, and A. I. Levey, “Central cholinergic pathways in the rat: an overview based on an alternative nomenclature (Ch1-Ch6),” Neuroscience, vol. 10, no. 4, pp. 1185–1201, 1983. View at Publisher · View at Google Scholar · View at Scopus
  47. B. E. Jones and A. Beaudet, “Distribution of acetylcholine and catecholamine neurons in the cat brainstem: a choline acetyltransferase and tyrosine hydroxylase immunohistochemical study,” Journal of Comparative Neurology, vol. 261, no. 1, pp. 15–32, 1987. View at Scopus
  48. J. R. Clements and S. Grant, “Glutamate-like immunoreactivity in neurons of the laterodorsal tegmental and pedunculopontine nuclei in the rat,” Neuroscience Letters, vol. 120, no. 1, pp. 70–73, 1990. View at Publisher · View at Google Scholar · View at Scopus
  49. B. M. Spann and I. Grofova, “Cholinergic and non-cholinergic neurons in the rat pedunculopontine tegmental nucleus,” Anatomy and Embryology, vol. 186, no. 3, pp. 215–227, 1992. View at Scopus
  50. B. Ford, C. J. Holmes, L. Mainville, and B. E. Jones, “GABAergic neurons in the rat pontomesencephalic tegmentum: codistribution with cholinergic and other tegmental neurons projecting to the posterior lateral hypothalamus,” Journal of Comparative Neurology, vol. 363, no. 2, pp. 177–196, 1995. View at Publisher · View at Google Scholar · View at Scopus
  51. K. Takakusaki, T. Shiroyama, T. Yamamoto, and S. T. Kitai, “Cholinergic and noncholinergic tegmental pedunculopontine projection neurons in rats revealed by intracellular labeling,” Journal of Comparative Neurology, vol. 371, no. 3, pp. 345–361, 1996. View at Scopus
  52. H. L. Wang and M. Morales, “Pedunculopontine and laterodorsal tegmental nuclei contain distinct populations of cholinergic, glutamatergic and GABAergic neurons in the rat,” European Journal of Neuroscience, vol. 29, no. 2, pp. 340–358, 2009. View at Publisher · View at Google Scholar · View at Scopus
  53. D. B. Rye, C. B. Saper, H. J. Lee, and B. H. Wainer, “Pedunculopontine tegmental nucleus of the rat: cytoarchitecture, cytochemistry, and some extrapyramidal connections of the mesopontine tegmentum,” Journal of Comparative Neurology, vol. 259, no. 4, pp. 483–528, 1987. View at Scopus
  54. B. E. Jones, “Paradoxical sleep and its chemical/structural substrates in the brain,” Neuroscience, vol. 40, no. 3, pp. 637–656, 1991. View at Publisher · View at Google Scholar · View at Scopus
  55. L. Wiklund, L. Leger, and M. Persson, “Monoamine cell distribution in the cat brain stem. A fluorescence histochemical study with quantification of indolaminergic and locus coeruleus cell groups,” Journal of Comparative Neurology, vol. 203, no. 4, pp. 613–647, 1981. View at Scopus
  56. K. A. Michelsen, C. Schmitz, and H. W. M. Steinbusch, “The dorsal raphe nucleus—from silver stainings to a role in depression,” Brain Research Reviews, vol. 55, no. 2, pp. 329–342, 2007. View at Publisher · View at Google Scholar · View at Scopus
  57. R. M. Beckstead, V. B. Domesick, and W. J. H. Nauta, “Efferent connections of the substantia nigra and ventral tegmental area in the rat,” Brain Research, vol. 175, no. 2, pp. 191–217, 1979. View at Publisher · View at Google Scholar · View at Scopus
  58. A. Jackson and A. R. Crossman, “Nucleus tegmenti pedunculopontinus: efferent connections with special reference to the basal ganglia, studied in the rat by anterograde and retrograde transport of horseradish peroxidase,” Neuroscience, vol. 10, no. 3, pp. 725–765, 1983. View at Publisher · View at Google Scholar · View at Scopus
  59. M. Beninato and R. F. Spencer, “A cholinergic projection to the rat substantia nigra from the pedunculopontine tegmental nucleus,” Brain Research, vol. 412, no. 1, pp. 169–174, 1987. View at Scopus
  60. A. Charara, Y. Smith, and A. Parent, “Glutamatergic inputs from the pedunculopontine nucleus to midbrain dopaminergic neurons in primates: phaseolus vulgaris-leucoagglutinin anterograde labeling combined with postembedding glutamate and GABA immunohistochemistry,” Journal of Comparative Neurology, vol. 364, no. 2, pp. 254–266, 1996. View at Scopus
  61. S. A. Oakman, P. L. Faris, P. E. Kerr, C. Cozzari, and B. K. Hartman, “Distribution of pontomesencephalic cholinergic neurons projecting to substantia nigra differs significantly from those projecting to ventral tegmental area,” Journal of Neuroscience, vol. 15, no. 9, pp. 5859–5869, 1995. View at Scopus
  62. I. Grofova and M. Zhou, “Nigral innervation of cholinergic and glutamatergic cells in the rat mesopontine tegmentum: light and electron microscopic anterograde tracing and immunohistochemical studies,” Journal of Comparative Neurology, vol. 395, no. 3, pp. 359–379, 1998. View at Scopus
  63. N. Ichinohe, B. Teng, and S. T. Kitai, “Morphological study of the tegmental pedunculopontine nucleus, substantia nigra and subthalamic nucleus, and their interconnections in rat organotypic culture,” Anatomy and Embryology, vol. 201, no. 6, pp. 435–453, 2000. View at Publisher · View at Google Scholar · View at Scopus
  64. E. Scarnati, A. Proia, S. Di Loreto, and C. Pacitti, “The reciprocal electrophysiological influence between the nucleus tegmenti pedunculopontinus and the substantia nigra in normal and decorticated rats,” Brain Research, vol. 423, no. 1-2, pp. 116–124, 1987. View at Scopus
  65. T. L. Steininger, D. B. Rye, and B. H. Wainer, “Afferent projections to the cholinergic pedunculopontine tegmental nucleus and adjacent midbrain extrapyramidal area in the albino rat. I. Retrograde tracing studies,” Journal of Comparative Neurology, vol. 321, no. 4, pp. 515–543, 1992. View at Publisher · View at Google Scholar · View at Scopus
  66. Y. Kayama and Y. Koyama, “Control of sleep and wakefulness by brainstem monoaminergic and cholinergic neurons,” Acta Neurochirurgica Supplementum, vol. 87, pp. 3–6, 2003.
  67. T. Honda and K. Semba, “Serotonergic synaptic input to cholinergic neurons in the rat mesopontine tegmentum,” Brain Research, vol. 647, no. 2, pp. 299–306, 1994. View at Publisher · View at Google Scholar · View at Scopus
  68. F. Trent and J. M. Tepper, “Dorsal raphe stimulation modifies striatal-evoked antidromic invasion of nigral dopaminergic neurons in vivo,” Experimental Brain Research, vol. 84, no. 3, pp. 620–630, 1991. View at Scopus
  69. K. Kitahama, I. Nagatsu, M. Geffard, and T. Maeda, “Distribution of dopamine-immunoreactive fibers in the rat brainstem,” Journal of Chemical Neuroanatomy, vol. 18, no. 1-2, pp. 1–9, 2000. View at Publisher · View at Google Scholar · View at Scopus
  70. C. Peyron, P. H. Luppi, K. Kitahama, P. Fort, D. M. Hermann, and M. Jouvet, “Origin of the dopaminergic innervation of the rat dorsal raphe nucleus,” NeuroReport, vol. 6, no. 18, pp. 2527–2531, 1995. View at Scopus
  71. P. Kalen, R. E. Strecker, E. Rosengren, and A. Bjorklund, “Endogenous release of neuronal serotonin and 5-hydroxyindoleacetic acid in the caudate-putamen of the rat as revealed by intracerebral dialysis coupled to high-performance liquid chromatography with fluorimetric detection,” Journal of Neurochemistry, vol. 51, no. 5, pp. 1422–1435, 1988. View at Scopus
  72. A. Mansour, J. H. Meador-Woodruff, J. R. Bunzow, O. Civelli, H. Akil, and S. J. Watson, “Localization of dopamine D2 receptor mRNA and D1 and D2 receptor binding in the rat brain and pituitary: an in situ hybridization-receptor autoradiographic analysis,” Journal of Neuroscience, vol. 10, no. 8, pp. 2587–2600, 1990. View at Scopus
  73. B. Lavoie and A. Parent, “Pedunculopontine nucleus in the squirrel monkey: distribution of cholinergic and monoaminergic neurons in the mesopontine tegmentum with evidence for the presence of glutamate in cholinergic neurons,” Journal of Comparative Neurology, vol. 344, no. 2, pp. 190–209, 1994. View at Publisher · View at Google Scholar · View at Scopus
  74. S. M. Edley and A. M. Graybiel, “The afferent and efferent connections of the feline nucleus tegmenti pedunculopontinus, pars compacta,” Journal of Comparative Neurology, vol. 217, no. 2, pp. 187–215, 1983. View at Scopus
  75. M. Matsumura, “The pedunculopontine tegmental nucleus and experimental parkinsonism: a review,” Journal of Neurology, vol. 252, no. 4, pp. iv5–iv12, 2005. View at Publisher · View at Google Scholar · View at Scopus
  76. E. Scarnati, A. Proia, E. Campana, and C. Pacitti, “A microiontophoretic study on the nature of the putative synaptic neurotransmitter involved in the pedunculopontine-substantia nigra pars compacta excitatory pathway of the rat,” Experimental Brain Research, vol. 62, no. 3, pp. 470–478, 1986. View at Scopus
  77. T. Futami, K. Takakusaki, and S. T. Kitai, “Glutamatergic and cholinergic inputs from the pedunculopontine tegmental nucleus to dopamine neurons in the substantia nigra pars compacta,” Neuroscience Research, vol. 21, no. 4, pp. 331–342, 1995. View at Publisher · View at Google Scholar · View at Scopus
  78. S. J. Lokwan, P. G. Overton, M. S. Berry, and D. Clark, “Stimulation of the pedunculopontine tegmental nucleus in the rat produces burst firing in A9 dopaminergic neurons,” Neuroscience, vol. 92, no. 1, pp. 245–254, 1999. View at Publisher · View at Google Scholar · View at Scopus
  79. C. A. Chapman, J. S. Yeomans, C. D. Blaha, and J. R. Blackburn, “Increased striatal dopamine efflux follows scopolamine administered systemically or to the tegmental pedunculopontine nucleus,” Neuroscience, vol. 76, no. 1, pp. 177–186, 1997. View at Publisher · View at Google Scholar · View at Scopus
  80. G. L. Forster and C. D. Blaha, “Pedunculopontine tegmental stimulation evokes striatal dopamine efflux by activation of acetylcholine and glutamate receptors in the midbrain and pons of the rat,” European Journal of Neuroscience, vol. 17, no. 4, pp. 751–762, 2003. View at Publisher · View at Google Scholar · View at Scopus
  81. A. D. Miller and C. D. Blaha, “Nigrostriatal dopamine release modulated by mesopontine muscarinic receptors,” NeuroReport, vol. 15, no. 11, pp. 1805–1808, 2004. View at Publisher · View at Google Scholar · View at Scopus
  82. C. D. Blaha and P. Winn, “Modulation of dopamine efflux in the striatum following cholinergic stimulation of the substantia nigra in intact and pedunculopontine tegmental nucleus-lesioned rats,” Journal of Neuroscience, vol. 13, no. 3, pp. 1035–1044, 1993. View at Scopus
  83. U. Maskos, “Role of endogenous acetylcholine in the control of the dopaminergic system via nicotinic receptors,” Journal of Neurochemistry, vol. 114, no. 3, pp. 641–646, 2010. View at Publisher · View at Google Scholar · View at Scopus
  84. M. Mameli-Engvall, A. Evrard, S. Pons et al., “Hierarchical control of dopamine neuron-firing patterns by nicotinic receptors,” Neuron, vol. 50, no. 6, pp. 911–921, 2006. View at Publisher · View at Google Scholar · View at Scopus
  85. K. Chergui, P. J. Charlety, H. Akaoka et al., “Tonic activation of NMDA receptors causes spontaneous burst discharge of rat midbrain dopamine neurons in vivo,” European Journal of Neuroscience, vol. 5, no. 2, pp. 137–144, 1993. View at Scopus
  86. A. A. Grace and B. S. Bunney, “The control of firing pattern in nigral dopamine neurons: burst firing,” Journal of Neuroscience, vol. 4, no. 11, pp. 2877–2890, 1984. View at Scopus
  87. P. Calabresi, M. G. Lacey, and R. A. North, “Nicotinic excitation of rat ventral tegmental neurones in vitro studied by intracellular recording,” British Journal of Pharmacology, vol. 98, no. 1, pp. 135–140, 1989. View at Scopus
  88. B. Gronier and K. Rasmussen, “Activation of midbrain presumed dopaminergic neurones by muscarinic cholinergic receptors: an in vivo electrophysiological study in the rat,” British Journal of Pharmacology, vol. 124, no. 3, pp. 455–464, 1998. View at Publisher · View at Google Scholar · View at Scopus
  89. M. G. Lacey, P. Calabresi, and R. A. North, “Muscarine depolarizes rat substantia nigra zona compacta and ventral tegmental neurons in vitro through M1-like receptors,” Journal of Pharmacology and Experimental Therapeutics, vol. 253, no. 1, pp. 395–400, 1990. View at Scopus
  90. E. M. Sorenson, T. Shiroyama, and S. T. Kitai, “Postsynaptic nicotinic receptors on dopaminergic neurons in the substantia nigra pars compacta of the rat,” Neuroscience, vol. 87, no. 3, pp. 659–673, 1998. View at Publisher · View at Google Scholar · View at Scopus
  91. S. T. Kitai, P. D. Shepard, J. C. Callaway, and R. Scroggs, “Afferent modulation of dopamine neuron firing patterns,” Current Opinion in Neurobiology, vol. 9, no. 6, pp. 690–697, 1999. View at Publisher · View at Google Scholar · View at Scopus
  92. R. S. Scroggs, C. G. Cardenas, J. A. Whittaker, and S. T. Kitai, “Muscarine reduces calcium-dependent electrical activity in substantia nigra dopaminergic neurons,” Journal of Neurophysiology, vol. 86, no. 6, pp. 2966–2972, 2001. View at Scopus
  93. J. Grenhoff, G. Aston-Jones, and T. H. Svensson, “Nicotinic effects on the firing pattern of midbrain dopamine neurons,” Acta Physiologica Scandinavica, vol. 128, no. 3, pp. 351–358, 1986. View at Scopus
  94. V. I. Pidoplichko, M. DeBiasi, J. T. Wiliams, and J. A. Dani, “Nicotine activates and desensitizes midbrain dopamine neurons,” Nature, vol. 390, no. 6658, pp. 401–404, 1997. View at Publisher · View at Google Scholar · View at Scopus
  95. T. Yamashita and T. Isa, “Fulfenamic acid sensitive, Ca(2+)-dependent inward current induced by nicotinic acetylcholine receptors in dopamine neurons,” Neuroscience Research, vol. 46, no. 4, pp. 463–473, 2003. View at Publisher · View at Google Scholar · View at Scopus
  96. K. D. Alex and E. A. Pehek, “Pharmacologic mechanisms of serotonergic regulation of dopamine neurotransmission,” Pharmacology and Therapeutics, vol. 113, no. 2, pp. 296–320, 2007. View at Publisher · View at Google Scholar · View at Scopus
  97. R. Samanin and S. Garattini, “The serotonergic system in the brain and its possible functional connections with other aminergic systems,” Life Sciences, vol. 17, no. 8, pp. 1201–1209, 1975. View at Scopus
  98. G. Di Giovanni, P. De Deurwaerdére, M. Di Mascio, V. Di Matteo, E. Esposito, and U. Spampinato, “Selective blockade of serotonin-2C/2B receptors enhances mesolimbic and mesostriatal dopaminergic function: a combined in vivo electrophysiological and microdialysis study,” Neuroscience, vol. 91, no. 2, pp. 587–597, 1999. View at Publisher · View at Google Scholar · View at Scopus
  99. G. Di Giovanni, V. Di Matteo, M. Di Mascio, and E. Esposito, “Preferential modulation of mesolimbic vs. nigrostriatal dopaminergic function by serotonin(2C/2B) receptor agonists: a combined in vivo electrophysiological and microdialysis study,” Synapse, vol. 35, no. 1, pp. 53–61, 2000. View at Scopus
  100. L. Ugedo, J. Grenhoff, and T. H. Svensson, “Ritanserin, a 5-HT2 receptor antagonist, activates midbrain dopamine neurons by blocking serotonergic inhibition,” Psychopharmacology, vol. 98, no. 1, pp. 45–50, 1989. View at Scopus
  101. G. Di Giovanni, E. Esposito, and V. Di Matteo, “Role of serotonin in central dopamine dysfunction,” CNS Neuroscience & Therapeutics, vol. 16, no. 3, pp. 179–194, 2010. View at Publisher · View at Google Scholar · View at Scopus
  102. Q. S. Yan, S. Z. Zheng, and S. E. Yan, “Involvement of 5-HT1B receptors within the ventral tegmental area in regulation of mesolimbic dopaminergic neuronal activity via GABA mechanisms: a study with dual-probe microdialysis,” Brain Research, vol. 1021, no. 1, pp. 82–91, 2004. View at Publisher · View at Google Scholar · View at Scopus
  103. Q. S. Yan and S. E. Yan, “Activation of 5-HT(1B/1D) receptors in the mesolimbic dopamine system increases dopamine release from the nucleus accumbens: a microdialysis study,” European Journal of Pharmacology, vol. 418, no. 1-2, pp. 55–64, 2001. View at Publisher · View at Google Scholar · View at Scopus
  104. K. Takakusaki, J. Oohinata-Sugimoto, K. Saitoh, and T. Habaguchi, “Role of basal ganglia-brainstem systems in the control of postural muscle tone and locomotion,” Progress in Brain Research, vol. 143, pp. 231–237, 2004. View at Publisher · View at Google Scholar · View at Scopus
  105. P. A. Pahapill and A. M. Lozano, “The pedunculopontine nucleus and Parkinson's disease,” Brain, vol. 123, no. 9, pp. 1767–1783, 2000. View at Scopus
  106. A. Bechara and D. Van Der Kooy, “The tegmental pedunculopontine nucleus: a brain-stem output of the limbic system critical for the conditioned place preferences produced by morphine and amphetamine,” Journal of Neuroscience, vol. 9, no. 10, pp. 3400–3409, 1989. View at Scopus
  107. T. E. Kippin and D. Van Der Kooy, “Excitotoxic lesions of the tegmental pedunculopontine nucleus impair copulation in naive male rats and block the rewarding effects of copulation in experienced male rats,” European Journal of Neuroscience, vol. 18, no. 9, pp. 2581–2591, 2003. View at Publisher · View at Google Scholar · View at Scopus
  108. H. L. Alderson, M. P. Latimer, and P. Winn, “Intravenous self-administration of nicotine is altered by lesions of the posterior, but not anterior, pedunculopontine tegmental nucleus,” European Journal of Neuroscience, vol. 23, no. 8, pp. 2169–2175, 2006. View at Publisher · View at Google Scholar · View at Scopus
  109. P. Winn, “How best to consider the structure and function of the pedunculopontine tegmental nucleus: evidence from animal studies,” Journal of the Neurological Sciences, vol. 248, no. 1-2, pp. 234–250, 2006. View at Publisher · View at Google Scholar · View at Scopus
  110. D. I. G. Wilson, D. A. A. MacLaren, and P. Winn, “Bar pressing for food: differential consequences of lesions to the anterior versus posterior pedunculopontine,” European Journal of Neuroscience, vol. 30, no. 3, pp. 504–513, 2009. View at Publisher · View at Google Scholar · View at Scopus
  111. M. C. Olmstead, E. M. Munn, K. B. J. Franklin, and R. A. Wise, “Effects of pedunculopontine tegmental nucleus lesions on responding for intravenous heroin under different schedules of reinforcement,” Journal of Neuroscience, vol. 18, no. 13, pp. 5035–5044, 1998. View at Scopus
  112. H. L. Alderson, M. P. Latimer, C. D. Blaha, A. G. Phillips, and P. Winn, “An examination of d-amphetamine self-administration in pedunculopontine tegmental nucleus-lesioned rats,” Neuroscience, vol. 125, no. 2, pp. 349–358, 2004. View at Publisher · View at Google Scholar · View at Scopus
  113. W. L. Inglis, J. S. Dunbar, and P. Winn, “Outflow from the nucleus accumbens to the pedunculopontine tegmental nucleus: a dissociation between locomotor activity and the acquisition of responding for conditioned reinforcement stimulated by d-amphetamine,” Neuroscience, vol. 62, no. 1, pp. 51–64, 1994. View at Publisher · View at Google Scholar · View at Scopus
  114. H. Condé, J. F. Dormont, and D. Farin, “The role of the pedunculopontine tegmental nucleus in relation to conditioned motor performance in the cat. II. Effects of reversible inactivation by intracerebral microinjections,” Experimental Brain Research, vol. 121, no. 4, pp. 411–418, 1998. View at Publisher · View at Google Scholar · View at Scopus
  115. J. A. Ainge, T. A. Jenkins, and P. Winn, “Induction of c-fos in specific thalamic nuclei following stimulation of the pedunculopontine tegmental nucleus,” European Journal of Neuroscience, vol. 20, no. 7, pp. 1827–1837, 2004. View at Publisher · View at Google Scholar · View at Scopus
  116. P. P. Rompre and E. Miliaressis, “Pontine and mesencephalic substrates of self-stimulation,” Brain Research, vol. 359, no. 1-2, pp. 246–259, 1985. View at Scopus
  117. M. Diotte, C. Bielajew, M. Miguelez, and E. Miliaressis, “Factors that influence the persistence of stimulation-induced aversion,” Physiology and Behavior, vol. 72, no. 5, pp. 661–667, 2001. View at Publisher · View at Google Scholar · View at Scopus
  118. Z. H. Liu and S. Ikemoto, “The midbrain raphe nuclei mediate primary reinforcement via GABA(A) receptors,” European Journal of Neuroscience, vol. 25, no. 3, pp. 735–743, 2007. View at Publisher · View at Google Scholar · View at Scopus
  119. C. S. Tanaka, K. Doya, G. Okada, K. Ueda, Y. Okamoto, and S. Yamawaki, “Prediction of immediate and future rewards differentially recruits cortico-basal ganglia loops,” Nature Neuroscience, vol. 7, no. 8, pp. 887–893, 2004. View at Publisher · View at Google Scholar · View at Scopus
  120. P. Dayan and Q. J. M. Huys, “Serotonin in affective control,” Annual Review of Neuroscience, vol. 32, pp. 95–126, 2009. View at Publisher · View at Google Scholar · View at Scopus
  121. R. D. Rogers, A. J. Blackshaw, H. C. Middleton et al., “Tryptophan depletion impairs stimulus-reward learning while methylphenidate disrupts attentional control in healthy young adults: implications for the monoaminergic basis of impulsive behaviour,” Psychopharmacology, vol. 146, no. 4, pp. 482–491, 1999.
  122. N. D. Daw, S. Kakade, and P. Dayan, “Opponent interactions between serotonin and dopamine,” Neural Networks, vol. 15, no. 4–6, pp. 603–616, 2002.
  123. S. R. Chamberlain, U. Müller, A. D. Blackwell, L. Clark, T. W. Robbins, and B. J. Sahakian, “Neurochemical modulation of response inhibition and probabilistic learning in humans,” Science, vol. 311, no. 5762, pp. 861–863, 2006.
  124. M. A. Wogar, C. M. Bradshaw, and E. Szabadi, “Effect of lesions of the ascending 5-hydroxytryptaminergic pathways on choice between delayed reinforcers,” Psychopharmacology, vol. 111, no. 2, pp. 239–243, 1993.
  125. D. Brunner and R. Hen, “Insights into the neurobiology of impulsive behavior from serotonin receptor knockout mice,” Annals of the New York Academy of Sciences, vol. 836, pp. 81–105, 1997.
  126. A. A. Harrison, B. J. Everitt, and T. W. Robbins, “Central 5-HT depletion enhances impulsive responding without affecting the accuracy of attentional performance: interactions with dopaminergic mechanisms,” Psychopharmacology, vol. 133, no. 4, pp. 329–342, 1997.
  127. S. Mobini, T. J. Chiang, A. S. A. Al-Ruwaitea, M. Y. Ho, C. M. Bradshaw, and E. Szabadi, “Effect of central 5-hydroxytryptamine depletion on inter-temporal choice: a quantitative analysis,” Psychopharmacology, vol. 149, no. 3, pp. 313–318, 2000.
  128. S. Mobini, T. J. Chiang, M. Y. Ho, C. M. Bradshaw, and E. Szabadi, “Effects of central 5-hydroxytryptamine depletion on sensitivity to delayed and probabilistic reinforcement,” Psychopharmacology, vol. 152, no. 4, pp. 390–397, 2000.
  129. C. A. Winstanley, J. W. Dalley, D. E. H. Theobald, and T. W. Robbins, “Fractionating impulsivity: contrasting effects of central 5-HT depletion on different measures of impulsive behavior,” Neuropsychopharmacology, vol. 29, no. 7, pp. 1331–1343, 2004.
  130. C. A. Winstanley, D. E. H. Theobald, J. W. Dalley, R. N. Cardinal, and T. W. Robbins, “Double dissociation between serotonergic and dopaminergic modulation of medial prefrontal and orbitofrontal cortex during a test of impulsive choice,” Cerebral Cortex, vol. 16, no. 1, pp. 106–114, 2006.
  131. F. Denk, M. E. Walton, K. A. Jennings, T. Sharp, M. F. S. Rushworth, and D. M. Bannerman, “Differential involvement of serotonin and dopamine systems in cost-benefit decisions about delay or effort,” Psychopharmacology, vol. 179, no. 3, pp. 587–596, 2005.
  132. P. T. Mehlman, J. D. Higley, I. Faucher et al., “Low CSF 5-HIAA concentrations and severe aggression and impaired impulse control in nonhuman primates,” American Journal of Psychiatry, vol. 151, no. 10, pp. 1485–1491, 1994.
  133. A. M. Van Erp and K. A. Miczek, “Aggressive behavior, increased accumbal dopamine, and decreased cortical serotonin in rats,” Journal of Neuroscience, vol. 20, no. 24, pp. 9320–9325, 2000.
  134. T. R. Insel, J. Zohar, C. Benkelfat, and D. L. Murphy, “Serotonin in obsessions, compulsions, and the control of aggressive impulses,” Annals of the New York Academy of Sciences, vol. 600, pp. 574–586, 1990.
  135. J. F. Dormont, H. Condé, and D. Farin, “The role of the pedunculopontine tegmental nucleus in relation to conditioned motor performance in the cat. I. Context-dependent and reinforcement-related single unit activity,” Experimental Brain Research, vol. 121, no. 4, pp. 401–410, 1998.
  136. A. B. Norton, Y. S. Jo, E. W. Clark, C. A. Taylor, and S. J. Mizumori, “Independent neural coding of reward and movement by pedunculopontine tegmental nucleus neurons in freely navigating rats,” European Journal of Neuroscience, vol. 33, no. 10, pp. 1885–1896, 2011.
  137. W. X. Pan and B. I. Hyland, “Pedunculopontine tegmental nucleus controls conditioned responses of midbrain dopamine neurons in behaving rats,” Journal of Neuroscience, vol. 25, no. 19, pp. 4725–4732, 2005.
  138. M. Matsumura, K. Watanabe, and C. Ohye, “Single-unit activity in the primate nucleus tegmenti pedunculopontinus related to voluntary arm movement,” Neuroscience Research, vol. 28, no. 2, pp. 155–165, 1997.
  139. B. L. Jacobs and C. A. Fornal, “Activity of brain serotonergic neurons in the behaving animal,” Pharmacological Reviews, vol. 43, no. 4, pp. 563–578, 1991.
  140. S. P. Ranade and Z. F. Mainen, “Transient firing of dorsal raphe neurons encodes diverse and specific sensory, motor, and reward events,” Journal of Neurophysiology, vol. 102, no. 5, pp. 3026–3037, 2009.
  141. J. V. Schweimer and M. A. Ungless, “Phasic responses in dorsal raphe serotonin neurons to noxious stimuli,” Neuroscience, vol. 171, no. 4, pp. 1209–1215, 2010.
  142. K. W. Miyazaki, K. Miyazaki, and K. Doya, “Activation of the central serotonergic system in response to delayed but not omitted rewards,” European Journal of Neuroscience, vol. 33, no. 1, pp. 153–160, 2011.
  143. K. Miyazaki, K. W. Miyazaki, and K. Doya, “Activation of dorsal raphe serotonin neurons underlies waiting for delayed rewards,” Journal of Neuroscience, vol. 31, no. 2, pp. 469–479, 2011.
  144. B. L. Jacobs and E. C. Azmitia, “Structure and function of the brain serotonin system,” Physiological Reviews, vol. 72, no. 1, pp. 165–230, 1992.
  145. K. Takakusaki, T. Shiroyama, and S. T. Kitai, “Two types of cholinergic neurons in the rat tegmental pedunculopontine nucleus: electrophysiological and morphological characterization,” Neuroscience, vol. 79, no. 4, pp. 1089–1109, 1997.
  146. L. Descarries, K. Watkins, S. Garcia, and A. Beaudet, “The serotonin neurons in nucleus raphe dorsalis of adult rat: a light and electron microscope radioautographic study,” Journal of Comparative Neurology, vol. 207, no. 3, pp. 239–254, 1982.
  147. L. Leger and L. Wiklund, “Distribution and numbers of indoleamine cell bodies in the cat brainstem determined with Falck-Hillarp fluorescence histochemistry,” Brain Research Bulletin, vol. 9, no. 1–6, pp. 245–251, 1982.
  148. K. G. Baker, G. M. Halliday, J. P. Hornung, L. B. Geffen, R. G. H. Cotton, and I. Tork, “Distribution, morphology and number of monoamine-synthesizing and substance P-containing neurons in the human dorsal raphe nucleus,” Neuroscience, vol. 42, no. 3, pp. 757–775, 1991.
  149. Y. Takikawa, R. Kawagoe, and O. Hikosaka, “A possible role of midbrain dopamine neurons in short- and long-term adaptation of saccades to position-reward mapping,” Journal of Neurophysiology, vol. 92, no. 4, pp. 2520–2529, 2004.
  150. P. N. Tobler, C. D. Fiorillo, and W. Schultz, “Adaptive coding of reward value by dopamine neurons,” Science, vol. 307, no. 5715, pp. 1642–1645, 2005.
  151. P. R. Montague, P. Dayan, and T. J. Sejnowski, “A framework for mesencephalic dopamine systems based on predictive Hebbian learning,” Journal of Neuroscience, vol. 16, no. 5, pp. 1936–1947, 1996.
  152. G. S. Berns and T. J. Sejnowski, “A computational model of how the basal ganglia produce sequences,” Journal of Cognitive Neuroscience, vol. 10, no. 1, pp. 108–121, 1998.
  153. J. C. Houk, J. L. Adams, and A. G. Barto, “A model of how the basal ganglia generate and use neural signals that predict reinforcement,” in Models of Information Processing in the Basal Ganglia, J. C. Houk, J. L. Davis, and D. G. Beiser, Eds., pp. 249–270, MIT Press, Cambridge, Mass, USA, 1995.
  154. J. L. Contreras-Vidal and W. Schultz, “A predictive reinforcement model of dopamine neurons for learning approach behavior,” Journal of Computational Neuroscience, vol. 6, no. 3, pp. 191–214, 1999.
  155. J. Brown, D. Bullock, and S. Grossberg, “How the basal ganglia use parallel excitatory and inhibitory learning pathways to selectively respond to unexpected rewarding cues,” Journal of Neuroscience, vol. 19, no. 23, pp. 10502–10511, 1999.
  156. M. Wu, A. W. Hrycyshyn, and S. M. Brudzynski, “Subpallidal outputs to the nucleus accumbens and the ventral tegmental area: anatomical and electrophysiological studies,” Brain Research, vol. 740, no. 1-2, pp. 151–161, 1996.
  157. P. N. Tobler, A. Dickinson, and W. Schultz, “Coding of predicted reward omission by dopamine neurons in a conditioned inhibition paradigm,” Journal of Neuroscience, vol. 23, no. 32, pp. 10402–10410, 2003.
  158. T. C. Jhou, H. L. Fields, M. G. Baxter, C. B. Saper, and P. C. Holland, “The rostromedial tegmental nucleus (RMTg), a GABAergic afferent to midbrain dopamine neurons, encodes aversive stimuli and inhibits motor responses,” Neuron, vol. 61, no. 5, pp. 786–800, 2009.
  159. T. C. Jhou, S. Geisler, M. Marinelli, B. A. Degarmo, and D. S. Zahm, “The mesopontine rostromedial tegmental nucleus: a structure targeted by the lateral habenula that projects to the ventral tegmental area of Tsai and substantia nigra compacta,” Journal of Comparative Neurology, vol. 513, no. 6, pp. 566–596, 2009.
  160. M. Matsumoto and O. Hikosaka, “Lateral habenula as a source of negative reward signals in dopamine neurons,” Nature, vol. 447, no. 7148, pp. 1111–1115, 2007.
  161. T. Steckler, W. Inglis, P. Minn, and A. Sahgal, “The pedunculopontine tegmental nucleus: a role in cognitive processes?” Brain Research Reviews, vol. 19, no. 3, pp. 298–318, 1994.