Abstract

The basal ganglia (BG) are a collection of subcortical nuclei critical for voluntary behavior. According to the standard model, the output projections from the BG tonically inhibit downstream motor centers and prevent behavior. A pause in the BG output opens the gate for behavior, allowing the initiation of actions. Hypokinetic neurological symptoms, such as inability to initiate actions in Parkinson’s disease, are explained by excessively high firing rates of the BG output neurons. This model, widely taught in textbooks, is contradicted by recent electrophysiological results, which are reviewed here. I also introduce a new model, based on the insight that behavior is a product of closed loop negative feedback control using internal reference signals rather than sensorimotor transformations. The nervous system is shown to be a functional hierarchy comprising independent controllers occupying different levels, each level controlling specific variables derived from its perceptual inputs. The BG represent the level of transition control in this hierarchy, sending reference signals specifying the succession of body orientations and configurations. This new model not only explains the major symptoms in movement disorders but also generates a number of testable predictions.

1. Introduction

The basal ganglia (BG) have been implicated in functions as diverse as movement, learning, and motivation [1–5]. Damage to these nuclei impairs or even abolishes voluntary behavior. But after decades of research it remains unclear how the BG generate behavior.

I shall argue that the BG occupy a specific level in a functional hierarchy. Unlike traditional models, which are based on the linear causation paradigm [6], the proposed hierarchy is based on the principle of cascade control [7]. Unfortunately, control theory is currently misunderstood in neuroscience, mainly due to conceptual confusions introduced by cybernetics and engineering control theory. To understand the role of the BG in behavior, it is first necessary to explain the principles of control and the organization of the functional hierarchy.

I shall first discuss current models of the BG and recent results that begin to challenge these models. I shall then explain how control theory, correctly applied, can help us understand behavior, and how different control systems can be arranged in a hierarchy using the principle of cascade control. Finally, I shall discuss the neural implementation of cascade control and the distinct contributions of the BG in this functional hierarchy.

2. What Are the Basal Ganglia?

The rapid accumulation of facts on the BG has added new pieces of the puzzle without revealing how the pieces are to fit together. The facts are often isolated and, in the absence of a coherent theory, incomprehensible. Rather than giving a detailed review of the physiology and anatomy, I shall only outline the most salient features that are relevant to our understanding of behavior.

First, the terminology is confusing and daunting to beginning students. Conventional names for parts of the BG are usually Latin descriptions of their visual appearance, independent of the functional significance of the signals being transmitted by these parts. From a functional perspective, it would be more useful to classify brain regions on the basis of the neurotransmitter released by the projection neurons [8]. In any brain region there is typically one type of projection neuron with axons leaving the structure of origin and targeting other brain regions (and multiple types of interneurons whose axons stay within the region). For example, cortical pyramidal neurons, which release glutamate as a neurotransmitter, excite target structures. In contrast, the projection neurons in the striatum, the input nucleus of the BG, are the medium spiny projection neurons, which release GABA and inhibit their targets. The projection neurons in the pallidum, the output nucleus of the BG, are also GABAergic. The BG appear to be one of the few areas in the nervous system with inhibitory projection neurons (Figure 1). Although recent work has shown some exceptions [9], this rule appears to apply in most cases. It should also be noted that an area like the subthalamic nucleus, although traditionally considered to belong to the BG, is not classified as such here, because it contains glutamatergic projection neurons like the cerebral cortex.

Here the term BG refers to three general classes of nuclei: input, intrinsic, and output.

(1) The input nucleus is the striatum. The striatum receives projections from the entire cortical mantle and from multiple diencephalic regions, especially the intralaminar thalamus. Various terms, such as caudate, putamen, and nucleus accumbens core, have been used to describe various regions of the input nucleus. Most of these can be grouped on the basis of the region of origin of the massive corticostriatal projections. Just as diverse cortical regions are characterized by pyramidal projection neurons, which are glutamatergic and excitatory, the striatal regions share the spiny projection neurons, which are GABAergic and inhibitory. They possess large dendritic arbors with thousands of spines, which are the sites of glutamatergic synapses made by the cortical and thalamic inputs [10].

(2) The intrinsic nucleus is the globus pallidus, often called the globus pallidus external segment (GPe), to be distinguished from the internal segment (GPi), which is found in primates. The entopeduncular nucleus is often considered the rodent version of GPi [11]. Both are output nuclei, along with the substantia nigra pars reticulata (SNr). Only the GPe is considered the “intrinsic nucleus” here, as its inputs and outputs are largely restricted to other BG nuclei [12]. It is the major target of the striatopallidal projections, from striatal neurons that coexpress D2 class dopamine receptors and A2A adenosine receptors. Traditionally called the “indirect pathway,” this projection is the subject of a vast literature [13]. The output of the GPe can inhibit the SNr neurons [14]. The GPe also sends projections back to the striatum, but the functional significance of these pallidostriatal projections remains unknown [15].

(3) The output nuclei include the entopeduncular nucleus (rodents), GPi, and SNr. These nuclei generally inhibit downstream targets. The entopeduncular nucleus and GPi are both believed to be critical for limb movements, whereas the SNr may be more critical for movements of the head and trunk [16]. Throughout this review, the focus will be on the SNr, which projects to the tectum, thalamus, and brainstem.

The excitatory inputs to the BG reach the striatum from the cerebral cortex (corticostriatal projections) and the intralaminar thalamus (thalamostriatal projections). Ascending projections from the midbrain and brainstem also reach the striatum, releasing neuromodulators such as dopamine [17]. Other inputs can also reach the striatum indirectly, via the thalamus, for example, projections from the dentate nucleus in the cerebellum [18] and from the vestibular nucleus [19].

Most striatal outputs target the output nuclei of the BG, which are inhibitory and often fire at high rates (e.g., 20–80 Hz in SNr neurons). Thus the input nucleus (striatum) and the output nucleus (e.g., SNr) have distinct sets of connections with other brain regions, but the main connection between them is the GABAergic projection from the input nucleus to output nucleus [20]. When the striatum is activated, the output of the medium spiny projection neurons inhibits the SNr neurons. Consequently, the downstream targets of nigral outputs, for example, superior colliculus, can be disinhibited [14, 21, 22].

Interestingly, such a circuit organization is similar to the model proposed by Von Holst and Lorenz long ago, purely based on behavioral observations: “the basic central nervous organisation consists of a cell permanently producing endogenous stimulation, but prevented from activating its effector by another cell which, also producing endogenous stimulation, exerts an inhibiting effect. It is this inhibiting cell which is influenced by the receptor and ceases its inhibitory activity at the biologically “right” moment [23].” The inhibiting cell, in this case, is the nigral projection neuron that ceases at the right moment, allowing the target structures of the nigral output to be disinhibited [24, 25].

This model of disinhibition, widely taught in introductory courses, is the foundation of current models of BG function [26, 27]. For example, Hikosaka wrote: “GABAergic output acts as a gate for motor signals such that there should be no motor output as long as the gate is closed. For this gating function to work properly, the level of the GABAergic output must, by default, be maintained at a steady level” [16]. Normally the gate is locked, and the pause in nigral activity unlocks the behavior. The activation of the striatum, however, can inhibit the BG output neurons, thus disinhibiting behavior. As we shall see below, this model of the BG is inadequate for several reasons.

Much work has also been devoted to the understanding of the functional roles of the so-called direct and indirect pathways in the BG. The medium spiny projection neurons comprise over 90% of the neurons in the striatum. Two major populations have been identified based on their anatomical targets and on differential expression of various receptors [10, 13]. Dopamine can have different effects on striatonigral and striatopallidal neurons, depending on the class of dopamine receptors expressed. The striatonigral pathway originates from spiny neurons expressing D1-like receptors, whereas the striatopallidal pathway originates from those expressing D2-like receptors. The modulation is not restricted to glutamatergic transmission; it is also critical for the GABAergic lateral inhibition by the axon collaterals of striatal projection neurons [17, 28]. Such differences between the direct and indirect pathways are the focus of extensive research [17, 29–33]. Although much is now known regarding the properties of these pathways, so far their functional role remains controversial. There are a number of speculations on the functional distinction between the direct and indirect pathways [34, 35], but most of these are too qualitative and vaguely formulated to qualify as genuine models.

3. Opponent Output from the BG

We recently recorded from the SNr output neurons in mice performing an operant task [36]. We trained mice to hold down a lever for a minimum duration in order to receive a food reward. Once the lever is released, a food pellet is delivered if the press duration exceeds a minimum criterion determined by the experimenter (Figure 2). In this “temporal differentiation” procedure, antecedent stimuli (e.g., discriminative stimuli) are not manipulated. The mice must learn to generate behaviors that satisfy some arbitrary criterion in order to receive the reward [37, 38]. The action duration is used to tag neural activity related to the holding.

We found nigral neurons that increased their firing rate during the holding period. This increase lasted as long as the press duration and immediately returned to prepress baseline levels following the release of the lever (Figure 2). Such a sustained increase in firing rate appears to support the idea that an increase in the inhibitory output from the SNr should prevent movement. But, at the same time, other neurons exhibited the opposite pattern—pausing during the holding period. This result is more surprising. According to the gate model, a pause in nigral output disinhibits target structures in the tectum and thalamus and permits the initiation of movement. Yet when the mouse is holding down the lever, in the absence of any overt “movement,” there is a clear pause in nigral activity.

It could be argued that our results agree with the “focused selection” theory [3], according to which the BG output selects a motor program while inhibiting competing programs. The competing actions are presumably inhibited by an increase in nigral output, whereas the selected action of lever pressing is enabled by a decrease in nigral output. According to this model, an action is a change in position (e.g., reaching with one’s arm), which is achieved by inhibiting postural control that allows the arm to stay still. But, in our task, movement is only observed before the holding period, as the mouse presses down the lever or afterwards as the mouse releases it. During the holding period itself, there is no overt movement. Moreover, the increase and decrease in firing rate during the holding period appear to be similar in magnitude but opposite in polarity.

Why should there be such opponent activity in the BG output neurons? One interpretation is that these represent bidirectional signals from a tonic baseline. In electronics this is common in amplifier design. In neurobiology, one example of bidirectional outputs is found at the lowest level of the motor system, in the reciprocal inhibition circuit [39]. The primary or Ia afferents that excite the alpha motor neurons also project to inhibitory interneurons that inhibit alpha motor neurons innervating the antagonist muscle. A pair of push-pull signals is generated and sent to the final common path. As one muscle tightens, the opposing one relaxes. Of course, there is no pushing, only a reduction in pulling, because muscles can only pull by contracting. Different muscles pull the joint in different directions, so push-pull can be achieved with muscles that are in some antagonistic relationship, for example, biceps and triceps. The torque generated depends on the difference between the pair of signals. When the two signals are equal, there is no movement. The joint angle stops changing, but the opposing muscles still have “tone,” both being activated without producing any net torque. The balance between the antagonistic muscles, analogous to the “common mode” signal in electronics, can be reached at different values. With movement, the output signal turns either positive or negative relative to the common mode value.

To send a pair of push-pull signals, one cannot use a single neural signal varying from negative to positive, because spike rates can never be negative. Subtraction is possible in neural signaling mainly through inhibition—the reduction of a positive signal (spike rate).

The use of a common mode signal, akin to muscle tone, allows the increase and decrease from the baseline to represent a pair of push-pull signals. Just as the muscle tone allows a smooth transition from a force in one direction to a force in the opposite direction, the common mode signal from the BG outputs will permit bidirectional control of their target systems. Increases and decreases from the average rate of firing will provide a pair of opposite signals to activate antagonistic pairs of control systems.
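The push-pull scheme described above can be illustrated with a short sketch. It is only an illustration under stated assumptions, not a model of actual SNr coding: the function names `push_pull` and `decode`, the 50 Hz baseline (picked from within the 20–80 Hz range mentioned earlier), and the linear encoding are all hypothetical.

```python
def push_pull(command, baseline=50.0):
    """Encode a signed command as two non-negative rates around a shared baseline.

    baseline is a hypothetical common mode rate (Hz); rates are clamped at zero
    because a spike rate can never be negative.
    """
    up = max(baseline + command, 0.0)    # rises when the command is positive
    down = max(baseline - command, 0.0)  # falls by the same amount
    return up, down


def decode(up, down):
    """Recover the common mode (average) and the signed differential signal."""
    common_mode = (up + down) / 2.0
    differential = (up - down) / 2.0
    return common_mode, differential


for command in (-10.0, 0.0, 10.0):
    up, down = push_pull(command)
    print(command, (up, down), decode(up, down))
```

With a zero command the two rates are equal and the differential is zero, which corresponds to the balanced "tone" case. The signed command is recovered exactly only while both rates stay above zero, which is one way to see why a nonzero common mode is needed for smooth bidirectional signaling.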

With reciprocal inhibition, bidirectional signals to the alpha motor neurons clearly alter the contraction of the relevant muscles, but what can the BG outputs do? Since the BG do not innervate motor neurons directly, signals from the SNr cannot command individual muscles. What do the target systems represent, if not antagonistic muscles pulling a joint in different directions? That is the key question.

One clue is suggested by our results from the temporal differentiation task. That neither the “increasing” nor the “decreasing” neurons changed their rate of firing during the lever press suggests that their outputs, in fact, are correlated with the fixed position of the animal [40]. From the initiation of the press, the mouse reaches a new posture or body configuration and is required to hold it for a brief period. Because the animal is not moving during this period, the position is largely fixed. The BG output is also fixed at the same time.

But, to test this idea, it would be necessary to introduce disturbances to the posture in opposite directions. If indeed the opponent outputs represent signals sent to bidirectional control systems, they should reverse when the direction of disturbance is also reversed. Indeed, this is what we observed in a different set of studies. To test the hypothesis that BG output sends antiphasic signals for antagonistic lower systems, we recorded from SNr during continuous and cyclical postural disturbances (Figure 3). The mouse stood on an elevated and covered platform and experienced tilting disturbance in the roll plane (7 degrees of tilt to either side of the animal). To control posture, the mouse simply has to remain standing. In order to resist tilting to one side of the body, the mouse must produce the appropriate outputs. To avoid cables, which can introduce unexpected torques to the animal, we used wireless multielectrode recording to record single unit activity from many neurons simultaneously [41].

The output of most SNr neurons is quantitatively related to the tilt disturbance. Again we observed opponent outputs from the putative GABAergic projection neurons [42]. Some neurons were inhibited with tilt to the left and excited with tilt to the right, yet other neurons exhibited the opposite pattern (Figure 4). Neurons that reduced firing to tilt in one direction always increased firing to tilt in the opposite direction. These two groups of neurons appear to be roughly 180 degrees out of phase.

Moreover, the relationship between neural activity and postural disturbance is highly linear, at least for the range of disturbances used in our study. In the absence of the tilt disturbance, the signals from these two populations of neurons are balanced. This reflects the common mode signal. With z-score normalization, the mean firing rate is zero, which serves as the effective zero signal for the BG output. From this baseline an increase in one output signal is paired with a corresponding decrease in another signal. These two signals are presumably sent to antagonistic downstream systems.
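As a rough illustration of what z-score normalization does to a pair of opponent signals, the following sketch uses synthetic data only. The 7 degree tilt amplitude comes from the text; the 50 Hz baseline, the slope of 2 Hz per degree, the noise-free linear relation, and all function names are assumptions made for the example, not measured values.

```python
import math
import statistics


def zscore(xs):
    """Subtract the mean and divide by the (population) standard deviation."""
    mu = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    return [(x - mu) / sd for x in xs]


def pearson(xs, ys):
    """Pearson correlation computed from z-scored values."""
    zx, zy = zscore(xs), zscore(ys)
    return sum(a * b for a, b in zip(zx, zy)) / len(xs)


# One full cycle of a sinusoidal roll tilt, in degrees.
n = 200
tilt = [7.0 * math.sin(2 * math.pi * i / n) for i in range(n)]

# Hypothetical opponent neurons: same baseline, opposite slopes.
baseline, slope = 50.0, 2.0
neuron_a = [baseline + slope * t for t in tilt]
neuron_b = [baseline - slope * t for t in tilt]

mean_za = statistics.fmean(zscore(neuron_a))  # zero after normalization
r_a_tilt = pearson(neuron_a, tilt)            # in phase with tilt
r_b_tilt = pearson(neuron_b, tilt)            # antiphase with tilt
r_ab = pearson(neuron_a, neuron_b)            # the two neurons oppose each other
```

After normalization the baseline maps to zero, so increases and decreases in raw rate appear as positive and negative deviations of equal size in the two synthetic neurons.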

It is possible that these opponent signals are related to Newton’s third law of motion. Since for every action there is an equal reaction, the generation of force in any direction will also produce a disturbance to the body in the opposite direction, requiring posture control. When pushing on a wall, for example, one is also experiencing the force from the wall in the opposite direction. Without resisting the reactive force, the posture will either collapse or, like astronauts in space, the body will be pushed away.

Our demonstration of the continuous relationship between neural activity and postural disturbances questions the assumptions of the “focused selection” model. BG output is not used to generate movements while inhibiting postural control. Opponent signals are needed for any movement or posture. Movement and posture are not antagonistic but share the same mechanisms. Rather the antagonistic relationship exists between downstream systems that act in different directions.

But it could be argued that our results support the earlier idea that opponent BG pathways can scale the intended movements in a “push-pull” manner by grading the movement parameters such as speed and amplitude [43, 44]. Increased BG output results in hypokinesia (e.g., Parkinson’s disease). Reduced BG output, by contrast, results in hyperkinesia (e.g., hemiballismus). In Parkinson’s patients, the abnormally low velocity and amplitude of movements is thought to be a result of excessive BG output, which inhibits thalamocortical activation. This model actually assumes that there is a monolithic BG output and that the magnitude of this output is modulated by the direct (striatonigral) and indirect (striatopallidal) pathways, as water temperature (a single magnitude) is adjusted by the cold and hot water handles for a faucet. High BG output results in small and slow movements, whereas low BG output results in fast and large movements. But this model does not predict the pattern of bidirectional outputs we observed, and it neglects the role of the common mode signal or tonic activity in SNr neurons, which according to the present account represents the neutral position or body configuration. Push-pull signals are not used to adjust the amplitude and speed of movement but to command antagonistic downstream systems. Greater BG output does not result in larger or faster movements.

4. Behavior and Feedback

In both the temporal differentiation task and the posture control task, the critical data come from a period when there is no apparent movement of the animal. Yet lack of overt movement does not indicate a lack of neural activity. Whether holding the lever, fixating on a target, or standing, the nervous system must produce outputs to counter continuous disturbances to achieve position control. In studies that examined neural activity during rest, when the subject is not performing any task, it is common to call the neural activity the “default mode” to mask ignorance of the underlying processes [45].

Although it is easy for the naïve observer to ignore the continuous neural output in the absence of any overt movement, the role of posture control becomes abundantly clear when it fails. For example, as one dozes off, the head drops as the neck muscles become incapable of maintaining the upright posture. In neurological disorders simply maintaining a posture such as standing or keeping one’s arm raised seems an impossible task.

Anyone attempting to build a system that can maintain a standing posture in a skeleton on a tilting platform will appreciate the tremendous computational challenges in posture control. Unlike a tank or robots with a stable base designed to obviate the computational challenges of postural control, there is no inherent postural stability in the skeleton, which is balancing on ball-and-socket joints. No engineer has succeeded in building anything that can balance a skeleton in an environment with unpredictable disturbances. Yet any deviation from the vertical, in a living man, is corrected exactly by the pattern of muscle contractions needed to restore balance, with these corrections happening so quickly that they are almost imperceptible to the casual observer.

Based on his studies of patients with BG damage, Martin argued that the postural deficits found in BG-related disorders are also responsible for the movement deficits [46]. To understand BG function, it would be critical to understand, at a computational level, exactly what posture control entails and how it can be related to movement.

4.1. Negative Feedback as the Solution to the Calculation Problem

When someone is standing, to the casual observer there appears to be no behavior. But this appearance is misleading. Any perturbation, such as a push, is met with resistance from the organism. The disturbances include not only a push but also invisible and unpredictable influences that are present everywhere: gravity, wind, and changes in effector properties such as the spring properties of the muscles. These disturbances must be overcome by varying output. This is an example of position control.

The term “control” means that posture stays the same, despite environmental disturbances. The naïve assumption that whatever neural signals are sent to our muscles determine the effects we exert on the environment, that is, observable behavior, was demolished by Bernstein nearly a century ago [47]. Bernstein wrote: “There are no situations in which muscle shortening is the cause of a movement” [48]. The actual effect of the muscular contraction is not the product of our neural output. Behavior can never be equated with the output of the nervous system, because it is the joint product of unknown environmental influences and neural signals. To the motor neurons producing muscle contraction, even fatigue or slight changes in the properties of the muscles can become a major source of disturbance. Consequently, a measure of muscle contraction (e.g., electromyography) can never define the actual behavior or the posture. That the output does not equal behavior raises the question of how the neural output can be adjusted as unknown and unpredictable disturbances vary. This is the “calculation problem,” the key problem that the nervous system must solve [49].

It is often believed that the calculation problem can be solved by computing inverse kinematics and dynamics or by feedforward computation to predict the future effects of actions using sophisticated mathematics. If only we could calculate the needed force output, it would be possible to produce movements [50, 51]. This feedforward approach requires enormous computational power and completely accurate knowledge of the physical interactions in the environment, if not omniscience. This is never found in any biological organism. Yet the calculation problem, after all, is solved by virtually all organisms. The solution is closed loop negative feedback, the only known organization to reduce error between the desired and the actual. Unfortunately, feedback is widely misunderstood, even though the term is used frequently. Due to such misunderstanding, it is often considered a crude mechanism that has been replaced by modern developments. Because feedback is often incorrectly applied to the analysis of biological systems [52–54], it is useful to correct some common misconceptions at the outset.

4.2. Control of Input

A control system always controls its input, not output [7]. Only perceivable consequences of behavior can be controlled. The control system contains internal reference signals, which indicate the desired state of some input variable. It varies its outputs until the consequence matches the reference signal. The output is proportional to the difference between input and reference. It is not determined by either perceptual inputs or reference signal, but by both simultaneously.
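A minimal simulation may help make the idea of controlling input concrete. The sketch below assumes a very simple loop: the environment adds the controller's output to a disturbance the controller never sees, and the output follows gain times error with a short lag. The function name `simulate`, the gain of 100, the 0.1 s lag, and the unit feedback function are assumptions chosen for illustration, not parameters taken from any study.

```python
def simulate(reference, disturbance, gain=100.0, tau=0.1, dt=1e-4, steps=5000):
    """Run a proportional negative feedback loop and return (perception, output).

    Assumed environment: perception = output + disturbance.
    Assumed output function: a lagged signal that settles at gain * error.
    """
    output = 0.0
    for _ in range(steps):
        perception = output + disturbance             # input function: what the system senses
        error = reference - perception                # comparator: reference minus input
        output += (gain * error - output) * dt / tau  # output function with lag tau
    perception = output + disturbance
    return perception, output


# Same reference, opposite disturbances that are unknown to the controller.
p_push, o_push = simulate(reference=10.0, disturbance=5.0)
p_pull, o_pull = simulate(reference=10.0, disturbance=-5.0)
print(round(p_push, 2), round(o_push, 2))
print(round(p_pull, 2), round(o_pull, 2))
```

In this sketch the perceived variable stays close to the reference in both cases, while the output needed to keep it there differs by a factor of three. The output by itself therefore says little about what is being controlled; it depends jointly on the reference and the input.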

According to mainstream engineering control theory, a control system controls its outputs, not its input. This is perhaps the most common fallacy today, both in engineering and in the life sciences [49, 55, 56]. This fallacy, an unfortunate legacy of cybernetics, is the result of imposing the perspective of the observer rather than using the perspective of the organism or controller. The mistake is to assume that what the engineer perceives and records, the “objective” effect of the system, is the output of the system.

The goal of the engineer, when designing a controller, is to compute the output required—the “control signal” sent to the “motor plant” to move it in a certain way. For example, to move something to a preselected position, the engineer can compute the outputs that must be generated in order to produce the change in position, including inverse kinematics and dynamics, and send the signals to the transducers. To the organism, however, feedback through sensory channels is the only way it can know about the consequences of its behavior. There is no alternative way to discover the “objective” effects.

The common assumption that output is controlled ignores the perspective of the organism that is doing the controlling. By imposing his own desire and perspective, the engineer ignores the autonomy of the negative feedback controller, for he is always trying to make the machine do what he wants. He can only accomplish this by adjusting the reference signal, as the user operates a thermostat by adjusting the temperature setting. Since this is the signal generated by the user, it is usually labeled as the input to the system. In a biological organism, however, the reference signal is always internal to the organism.

The real input is the perceptual variable that can be affected by feedback [49]. In a temperature controller using negative feedback, the perceptual input is from the temperature sensors. Of course, in a man-made thermostat, the user can adjust the “set point,” but that is a unique feature of these systems, because that is the only way to use a negative feedback control system. The man-made controller, at least so far, is not designed to adjust its own references. Rather it is designed as a “servo,” to serve the needs of the user. A biological organism, in contrast, has reference signals of its own, not accessible to any user. It is autonomous, because it does not serve the needs of another, but those of itself only and always.

From the perspective of the engineer, negative feedback control is about injecting an error signal to get the desired output. In traditional cybernetic applications of control theory to the study of behavior, the comparison between input and reference is placed outside of the organism, where the engineer designing the system also performs the comparison function. Thus for decades such control systems have been treated as stimulus-response or input-output devices: error in, behavior out. The tendency to resort to linear causation is so strong that even closed loop controllers have been treated as devices that receive error signals and generate behaviors. This is only true of a component of the loop, namely, the output function when it is isolated [49]. It could never be true of the closed loop negative feedback controller.

In the end, the appropriate output signals must be computed somehow. The question is how. The negative feedback organization simply eliminates the effects of disturbance by subtracting them from the internal reference. The effect of its own output is monitored with its own sensors and actively controlled. This elegant solution to the calculation problem avoids calculations on the disturbances in advance. Whatever their effects, they are simply rejected by the negative feedback. The inverse kinematics and dynamics are realized by the physical interaction between organism and environment, in the forward equations describing how muscle contractions interact with the external environment. None of these calculations are performed inside the nervous system.

The misidentification of the inputs and outputs of a control system resulted in persistent mistakes in the application of control theory even when the correct mathematical equations were used. Worse, it has made it impossible to perform the appropriate experiments to measure the actual properties of the living control systems. Consequently, many myths have been propagated, for example, the idea that negative feedback controllers are slow [59], when speed is a chief advantage of such a system. Given a small error, the high loop gain can produce a rapid response, instantly removing any small deviation from the desired reference condition. High gain does not mean that a large response will be generated, as the response is always determined by the magnitude of the error. As error is self-reducing in a closed loop, the negative feedback necessarily limits the response, preventing it from becoming too large. But the time it takes for the output to reduce the error is greatly reduced by the high gain. When the input and output of the output function are incorrectly identified, as in all traditional diagrams illustrating the control loop, loop gain cannot be measured accurately [60–63].
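The relation between loop gain and speed can be sketched numerically. The example assumes a first-order loop with unit feedback and no disturbance, in which the output relaxes toward gain times error with a lag of 0.1 s; the function name `closed_loop_response`, the lag, the 5% settling criterion, and the three gain values are hypothetical choices for illustration.

```python
def closed_loop_response(gain, reference=10.0, tau=0.1, dt=1e-5, t_max=2.0, tol=0.05):
    """Return (settling_time, final_error, final_output) for a step in the reference.

    settling_time is the first time the perception comes within tol (as a
    fraction) of its steady-state value, or None if t_max is reached first.
    """
    final_perception = gain * reference / (1.0 + gain)  # analytic steady state of this loop
    final_error = reference - final_perception
    final_output = gain * final_error
    output, t = 0.0, 0.0
    while t < t_max:
        perception = output  # assumed environment: unit feedback, no disturbance
        if abs(perception - final_perception) <= tol * final_perception:
            return t, final_error, final_output
        error = reference - perception
        output += (gain * error - output) * dt / tau
        t += dt
    return None, final_error, final_output


for g in (1.0, 10.0, 100.0):
    print(g, closed_loop_response(g))
```

In this toy loop the settling time shrinks roughly in proportion to 1 / (1 + gain), and the residual error shrinks with it, while the final output never exceeds the reference value. This is consistent with the point that high gain shortens the time needed to reduce error without producing an outsized response.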

As a result of these conceptual confusions, in traditional models negative feedback is always misunderstood. Placing the comparator outside the organism has the unintended effect of inverting the inside and outside of the system (Figure 5). What should be part of the organism is considered to be a part of the environment, and what should be part of the environment, namely, the feedback function, is considered a part of the organism. Consequently, the equations that describe how forces act on loads and accelerations and decelerations of the loads are assumed to be computed by the nervous system [50]. These conceptual confusions have largely prevented any progress in the study of behavior for many decades.

5. Posture/Movement Problem and Cascade Control

Posture control is an example of negative feedback control. The controlled variable is the perception of the current body configuration. The relevant perceptual signals come from sensors distributed all over the body.

A body configuration may be defined as a collection of joint angles, but joint angles alone are not always sufficient to define a posture. The body configuration may be similar whether standing erect or lying supine, but the relation to environmental disturbances such as gravity is quite different. Perceptions like the sense of effort, related to proprioceptive perception of muscle tension, may also be involved.

The same effectors, the same final common path from motor neuron to muscle, must be used to defend a given posture and to change that posture. That postural control is a prerequisite for normal movements is commonly acknowledged [64, 65]. A fundamental question, first raised by Von Holst and Mittelstaedt, is how movement is possible when posture is in fact defended against environmental disturbances [66]. Clearly animals can maintain a particular posture. But movements require a change in posture. With a self-initiated movement, why is the current posture not also defended? Why are self-initiated movements not treated as disturbances to the controller?

Posture control is traditionally viewed as a result of “postural reflexes,” fast adjustments in muscle output in response to any disturbance. With voluntary movements such lower postural reflexes are assumed to be inhibited [66]. In a postural reflex, the output is highly correlated with the input. The high correlation between stimulus and response gave rise to the concept of the reflex. Yet students of behavior have often noticed the variability in such reflexes; the same stimulus sometimes produces one output and sometimes another, and sometimes opposite outputs can be produced (reflex reversal). Baffled by such variability, some attempted to eliminate it using techniques like decerebration, by removing the descending influence of the brain. Using decerebration, Sherrington was unwittingly forcing control systems to behave like input/output devices, though necessarily in vain [39, 67].

In an input/output device, the output is a function of input. If the output varies given the same input or if different inputs can produce the same output, the standard explanation is that the function relating sensory information to behavior is not fixed but somehow modifiable depending on “contexts,” that higher levels can turn off reflexes that get in the way of behavior, or that the processing of perceptual inputs can be “noisy.” The currently popular focused selection model of BG function, for example, assumes that BG output is needed to turn off the postural reflexes while selecting some action [3, 20].

Control theory, however, offers a very different explanation of the lack of correspondence between inputs and outputs. It shows that the output cannot possibly be a function of the input, even when it appears to be correlated, as in Sherrington’s decerebrate dogs [39]. Because the output is generated from the error signal, which is the difference between reference and input, there is simply no function relating the input to the output in this system. Anyone trying to find the function relating the temperature sensor reading to the output of the thermostat is simply wasting his time, because there is none [49]. To find the output, it is necessary to know both the input and the reference.
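The thermostat example can be made concrete with a few lines of code. This is a minimal sketch, assuming a proportional heater that cannot cool; the function name `heater_command` and the gain value are hypothetical.

```python
def heater_command(sensed_temp, set_point, gain=2.0):
    """Heater output computed from the error, not from the sensor reading alone."""
    error = set_point - sensed_temp      # comparator: reference minus input
    return max(0.0, gain * error)        # assumed heater: it can only add heat


if __name__ == "__main__":
    # Same sensor reading, different references: different outputs.
    print(heater_command(20.0, set_point=22.0))
    print(heater_command(20.0, set_point=18.0))
    # Different sensor readings, different references: same output.
    print(heater_command(18.0, set_point=20.0))
```

Knowing the sensor reading alone does not determine the heater output in this sketch; the reference must also be known.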

Consequently, there is no such thing as sensorimotor transformation. Concepts like sensory or neural noise are also irrelevant [51], since the circuit only processes signals and does not make any distinction between noise and nonnoise. In neuroscience, noise is just a term usually used to describe signals that the observer does not understand or want. This confusion arises largely because neither behavior nor neural signal makes sense when viewed from the perspective of input/output systems. They almost always appear to be more variable than what is acceptable to the experimenter. This is mainly because a critical determinant of behavior, namely the internal reference, is left out of the traditional paradigm. Behavior and neural activity vary, but this variation is not due to noise or inconsistencies in the sensorimotor transformations. It varies because of the attempt to control desired perceptual variables by canceling the effects of environmental disturbances.

5.1. Behavioral Illusion and the Myth of the Reflex

Gain in the output function in a negative feedback system is not used to convert input into output. The typical mistake is to measure gain by calculating the ratio between input and output, but input in this case is incorrectly identified: it is actually defined from the observer’s perspective, which is usually a disturbance to a controlled variable [61].

When the input/output correlation appears to be high, it produces a powerful illusion: the illusion that what is observed is the behavior of an input/output device, in which output is generated by the input [68]. In a control system, the effects of disturbance are rejected with output through a feedback function. But what counts as disturbance depends on the reference signal. As soon as the reference is altered, the definition of disturbance also changes, and the input that used to generate output suddenly ceases to do so.

Decerebration alters descending reference signals permanently, so that these signals can no longer influence lower systems. Even then, the lower systems can still have reference signals, whether by default or from some other still intact sources, and error signals can still be generated. With the feedback path still intact, the output can still alter the perceptual signal. The correlation between input and output may seem to be high, but one can easily change the output by altering the feedback path in the environment.

Although reflexes can create the illusion of high input/output correlation, of antecedent automatically “causing” responses from the organism, a closer examination shows this is an illusion [68]. A change in the environment can produce what appears to be a change in the sensorimotor transformation inside the organism. This is because the disturbance is reflected in the error (and output), which reduces the effect of the disturbance, but any manipulation of the feedback function will necessarily change how effective the output will be in rejecting the effects of disturbance on the controlled variable. Systematic manipulations of the feedback function will change how the system “responds” to the input, even when neither the stimulus nor the organism has changed. This behavioral illusion is the first trap that students of behavior must understand and avoid, though unfortunately so far it has victimized even the best investigators.
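The dependence of the apparent stimulus-response relation on the feedback function can be sketched numerically. The example below assumes a linear loop at steady state, with output o = G(r − p) and perception p = k·o + d, where G is the output gain, k the environmental feedback gain, d the disturbance, and r the reference. These symbols and the function names are assumptions introduced for illustration.

```python
def steady_state_output(disturbance, feedback_gain, output_gain=1000.0, reference=0.0):
    """Solve o = G*(r - p) together with p = k*o + d for the output o."""
    return output_gain * (reference - disturbance) / (1.0 + output_gain * feedback_gain)


def apparent_sr_slope(feedback_gain, output_gain=1000.0):
    """Slope of output against disturbance, as an outside observer would measure it."""
    o1 = steady_state_output(1.0, feedback_gain, output_gain)
    o2 = steady_state_output(2.0, feedback_gain, output_gain)
    return (o2 - o1) / (2.0 - 1.0)


if __name__ == "__main__":
    # The organism (output_gain) is identical in both cases; only the
    # environmental feedback function differs.
    for k in (1.0, 2.0):
        print(f"feedback gain {k}: apparent slope {apparent_sr_slope(k):.4f}")
```

With high output gain the measured slope is close to −1/k, so in this sketch it describes the environmental feedback path rather than any transformation inside the organism.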

5.2. The Control Hierarchy

If postural mechanisms are not turned off during voluntary movements, then how can the brain generate movement? In the present model, a change in body position is produced by changing the reference signal of the position controller. Instead of a user injecting this reference signal, as in adjusting the temperature setting of a thermostat, it must come from within the organism.

Where then does the reference signal come from? The answer is suggested by cascade control or hierarchical perceptual control [58], in which the reference signal comes from the output of another controller. Thus there is a hierarchical relationship between the higher controller that sends the reference and the lower controller that receives it, much as an order is given in a chain of command.

At every level of the hierarchy, only inputs can be controlled. When the output of a control system serves as the reference signal of another control system, it does not specify the output of the lower system, but its input. Altering the output directly without altering the reference would affect the controlled variable via the feedback path, creating error that would cancel the effect of the output. Outputs from higher levels determine the type of perceptions the lower levels should achieve [58]. The lower controller will vary its output to produce the input determined by the descending reference signal, serving as an extension of the output function of higher levels (Figure 5).

If the reference signal of the posture control system is altered, the current posture will not be defended. Rather the system will defend the new value of the reference signal at any moment. There will then be a transition from the old posture to new posture, a movement.
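The difference between defending a posture and moving can be sketched with a single loop. The code below is an illustration only, assuming an integrating output function and an additive disturbance; `run_position_controller` and its parameters are hypothetical names.

```python
def run_position_controller(references, disturbances, gain=50.0, dt=0.01):
    """Simulate one position loop and return the position at every time step."""
    output = 0.0
    positions = []
    for reference, disturbance in zip(references, disturbances):
        position = output + disturbance        # sensed position
        error = reference - position           # comparator
        output += gain * error * dt            # output function (integrator)
        positions.append(output + disturbance)
    return positions


if __name__ == "__main__":
    # Steps 0-99: fixed reference, no load. Steps 100-199: a load is applied.
    # Steps 200-399: the reference changes while the load remains.
    references = [0.0] * 200 + [10.0] * 200
    disturbances = [0.0] * 100 + [1.0] * 300
    positions = run_position_controller(references, disturbances)
    print(f"after load:             {positions[199]:.4f}")
    print(f"after reference change: {positions[-1]:.4f}")
```

The same loop resists the load while the reference is fixed and produces a transition to the new position when the reference changes. Nothing is switched off in either case.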

The nervous system comprises a hierarchy of negative feedback control systems, each controlling its own perceptual input [58]. The higher systems do not have direct access to the actual actions or most of the perceptual inputs and error signals from lower levels. Each senses only the variable it controls and generates error signals, which become the reference signals for lower levels. To see where the BG fit in this control hierarchy, we must first outline, if only briefly, the functions of the lower levels. This may appear to be a circuitous route to understanding, but as we shall see a major problem with existing theories of BG function is their false assumptions about what behavior is and about the functions of the hierarchically lower systems.

6. Control of Muscle Tension and Length

The lowest level of the neural hierarchy controls muscle tension. The output function of this controller is the muscle. Projections from alpha motor neuron to muscle fibers send error signals in the tension controller [7]. The alpha motor neuron, as a comparator, receives signals from multiple sources. The major source of negative feedback is the Golgi tendon organ, which detects muscle tension produced by contraction of extrafusal fibers. The tension signal is fed back to the alpha motor neuron through the inhibitory Ib interneuron that inverts the sign of the signal, so that it is the opposite of the excitatory Ia afferent to the alpha motor neurons. This inversion creates negative feedback, as the inhibitory effect is subtracted from the excitatory effect. When the muscle contracts, the negative feedback keeps the tension in check. This is traditionally called an inverse myotatic reflex or the Golgi tendon reflex. The contraction creates the feedback, which restricts the contraction. Additional rate feedback can come from Renshaw cells, inhibitory interneurons that are excited by the alpha motor neurons, but in turn inhibit alpha motor neurons [69, 70].

On the other hand, muscle length itself can be controlled independently while tension varies. The relationship between length and tension is hierarchical. The higher length level specifies the tension to be reached. Tension can be varied to maintain a desired length. The difference between desired length and actual length, the error in length control, is turned into a reference signal to the tension controller.
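The hierarchical relation between length and tension can be sketched as a two-level cascade. This is a linear toy model with assumed dynamics, not a model of real muscle: tension is treated as a signed quantity, the muscle shortens at a rate proportional to tension minus load, and all gains and names are hypothetical.

```python
def simulate_length_over_tension(load, length_ref=1.0, steps=5000, dt=0.001,
                                 length_gain=100.0, tension_gain=50.0, viscosity=1.0):
    """Return (length, tension) after the cascade has settled under a constant load."""
    length, tension = 1.2, 0.0
    for _ in range(steps):
        # Higher level: the length error becomes the tension reference.
        # Sign convention: a muscle longer than desired calls for more tension.
        tension_ref = length_gain * (length - length_ref)
        # Lower level: the tension controller moves sensed tension toward its reference.
        tension += tension_gain * (tension_ref - tension) * dt
        # Environment (assumed): tension above the load shortens the muscle.
        length -= (tension - load) / viscosity * dt
    return length, tension


if __name__ == "__main__":
    for load in (1.0, 5.0):
        length, tension = simulate_length_over_tension(load)
        print(f"load={load:.1f}  length={length:.3f}  tension={tension:.3f}")
```

In this sketch the settled tension follows the load, while the length stays near its reference. The small residual offset in length comes from the proportional length controller assumed here.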

The so-called myotatic or kneejerk reflex is a type of stretch reflex, in which the lengthening of the muscle is resisted by muscle contraction and shortening. This phenomenon reflects the action of a muscle length controller. A major signal driving the alpha motor neuron (and hence contraction of extrafusal muscle fibers) comes from the Ia afferent. This signal is often interpreted as representing muscle length. But the Ia afferent signal can be independent of muscle length. When the extrafusal muscle fibers are stretched, the parallel muscle spindle, a stretch sensor, is also stretched and activates the alpha motor neuron (i.e., stretch reflex). But the Ia afferent can also generate a signal as a result of gamma motor neuron output, which activates the contractile part of the spindle, thus “simulating” a stretch. To the alpha motor neuron, it does not matter how the Ia afferent signal is produced, by actual stretch or by gamma activation. The function of the gamma mechanism is not to keep the spindle taut and maintain sensitivity to changes in muscle length, as described in textbooks [26]. Rather the arrangement produces a comparison between current muscle length (via Ia and II fibers) and the length “demanded” by the reference signals from the gamma motor neuron. The muscle spindle does not directly contribute to the generation of muscle tension but functions as a mechanical comparator of desired and actual muscle length signals. The Ia afferent thus carries an error signal for the length controller, which in turn activates the alpha motor neurons and generates shortening of the extrafusal muscle fibers and muscle tension.

This arrangement is traditionally called a “follow-up servo” model, first proposed by Merton [71]. Yet, although Merton correctly identified the comparator, he failed to take into account the hierarchical relationship between length control and tension control, the key feature also neglected by subsequent models [53, 62, 71, 72]. This failure led to subsequent rejection of servo models of the motor system. Instead, it is common to claim, incorrectly, that the gamma motor neuron output functions simply to keep the muscle spindle sensitive to stretch [26].

According to the model presented here, the length controller achieves control of desired length specified by the gamma motor neurons by varying the reference signal to the tension controller, which varies muscle tension as needed. Tension control at the lowest level is always used for posture control and all other behaviors, but tension is not the controlled variable of the higher levels, which achieve their respective purposes by varying reference signals for tension. The higher levels all adjust muscle tension ultimately but not directly. Directly they all attempt to control their own respective perceptual variables, whether muscle length or joint angle. One possible exception is the direct projection to alpha motor neurons from the motor cortex or more commonly projections to the spinal interneurons. The importance of corticospinal (pyramidal tract) projections, especially for movements of the digits, cannot be denied. These descending projections can directly affect tension or force control, but their functions are poorly understood. In the present review, the focus is on movements of the whole body, rather than distal joints like digits.

6.1. Joint Angle and Body Configuration

In any movement, the length of the relevant muscles must be changed. This changes the angular position of the segments at the joint at which all the forces are balanced. The segments accelerate toward the new position with various damping factors, such as viscosity within muscles as well as negative rate feedback from proprioceptors to prevent overshoot. The new position changes during the movement, and the sensed joint angle smoothly approaches the angle set by the descending reference signal; the segments automatically decelerate as the desired position is approached. This behavior is the necessary result of how the hierarchy is organized. During movement, all the required variations in neural signals to the muscles are created by continuous feedback at various levels, not by forward planning or computation of inverse dynamics and kinematics.
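The smooth approach to a new joint angle can be illustrated with a damped segment driven by a position error. This is a sketch under assumed dynamics: a single segment with unit inertia, a proportional position loop, and a rate-feedback term chosen large enough to be overdamped. The parameter values and the name `move_joint` are hypothetical.

```python
def move_joint(target, steps=3000, dt=0.001, inertia=1.0,
               position_gain=100.0, rate_gain=25.0):
    """Return lists of joint angle and angular velocity over time."""
    angle, velocity = 0.0, 0.0
    angles, velocities = [], []
    for _ in range(steps):
        error = target - angle                               # position error
        torque = position_gain * error - rate_gain * velocity  # rate feedback damps motion
        velocity += torque / inertia * dt
        angle += velocity * dt
        angles.append(angle)
        velocities.append(velocity)
    return angles, velocities


if __name__ == "__main__":
    angles, velocities = move_joint(target=1.0)
    peak_index = velocities.index(max(velocities))
    print(f"final angle {angles[-1]:.5f}, peak velocity at step {peak_index}")
```

The segment speeds up and then slows down as the error shrinks, without any precomputed trajectory. The deceleration in this sketch follows from the feedback alone.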

Actions of the length controller (“stretch reflexes”) can facilitate posture control by bracing the knees, keeping the hip joints extended and the trunk upright, to minimize forces required for balance. But muscle length control is not sufficient for posture control. Patients with BG pathology often show intact stretch reflexes, yet they are still impaired in response to tilt [46].

Muscle length control in any controller is not sufficient to define posture. Multiple length controllers are needed just to define a joint angle, for example, biceps and triceps at the elbow joint. Moreover, any signal representing muscle length does not correspond exactly to the angle at the joint spanned by the muscle. The mechanical advantage changes with the angle, and loads can make the actual angle deviate from the angle represented by muscle length or tension. Without sensing the joint angle directly, controlling joint angle by relying only on muscle length would not be very effective. In addition to information from muscle length and tension sensors, joint angles can be perceived with specialized sensors located in the joints and stretch receptors that can detect rate of change.

It is also not sufficient to control posture simply by cocontraction of muscles, that is, “stiffness” or “impedance” control [73]. In muscles with spring constants that are an exponential function of tension, output gain in position control depends on the resting tension in opposing muscles. The common mode signal to both agonist and antagonist muscles can increase spring constants, thereby increasing the force applied to the tendons by a given amount of muscle shortening without generating net force. With continuous disturbances, the muscles across a joint can indeed appear to be stiff, but this apparent stiffness is mostly a result of rate feedback. It varies according to the disturbance applied to the sensed position. Without negative feedback, adjustment of stiffness alone is a poor method for achieving position control.

6.2. From Posture to Movement

At levels above joint angle control, the controlled variable is not the length of a single muscle or a single joint angle but a collection of joint angles coupled with the effort required to resist disturbances. With a complex body geometry, posture control requires a higher level that simultaneously adjusts joint angle in several joints at once, that is, body configuration. Movement can simply be defined as a change in body configuration, produced by a change in the reference signal to the comparator in the configuration controller. However, higher levels can produce movements by sending reference signals to any of the lower level controllers, for tension, length, or joint angle. The purpose of each movement will differ depending on which level is initiating the change in reference signal. For example, one can either activate the tension controller directly, via direct projections to the alpha motor neurons, or by activating the gamma motor neurons specifying muscle length, so that the error signal from the length controller activates the alpha motor neurons. Often both alpha and gamma motor neurons can be activated simultaneously [74].

The above description of the lower levels of the neural hierarchy shows three important features not found in any other model of the nervous system. Familiarity with these properties is necessary for understanding the contributions of the BG.

(1) Control of perceptual inputs is the key principle in the hierarchical organization. Typically the higher levels receive higher order transformations of perceptual inputs compared to the lower levels and control these more abstract and global variables by varying reference signals sent to lower levels.
(2) The higher levels can use the lower levels without turning them off. For example, in length control, tension is still controlled, except that the tension is specified by the reference signal sent from the length controller. The same is true of higher levels that control other variables.
(3) Hierarchical organization allows one control system to command another. But it does so not by adjusting the output of the lower control system directly, but by sending a reference signal. The effect of this descending reference signal is to tell the lower system to achieve a particular level of perception. The actual output generated by the lower level will vary according to the comparison between the new reference signal and lower level perceptions. Thus the command signal in a control hierarchy never contains information about the actual outputs to be generated.

7. Reticulospinal System and Posture Control

In disorders implicating the BG, abnormal postures are common (e.g., somersault postures, bending of the spine). These could simply reflect abnormal reference signals to body configuration systems [46]. Position control is intact, but the reference signals for positions have extreme and fixed values. A fixed reference signal to body configuration would produce a fixed body configuration. In the most extreme condition, it would produce complete freezing of the body [75]. Consequently, the patient is continuously controlling a fixed position, still varying neural outputs in downstream controllers appropriately, until the effectors are exhausted.

Patients with BG pathology are also impaired in response to tilting of the body [46]. In particular, Martin found that, during tilt disturbances, the reaction of the trunk was much reduced. This observation suggests the involvement of the reticulospinal pathway, the most primitive motor system in vertebrates, and a major pathway influenced by the BG outputs [76, 77]. Although the BG output to the thalamus also eventually activates cortical regions giving rise to the corticospinal pathway [43], the latter pathway in most organisms is not critical for posture control.

As shown by lesion studies, the reticulospinal pathway is especially important for axial movements, rather than hand and finger movements that require the corticospinal pathway [46, 78]. The reticulospinal pathway is a major source of descending reference signals for joint angles and simple body configurations. It is therefore critical to consider the functional organization of this pathway before discussing the contributions of the higher levels.

The reticulospinal pathway has been extensively studied in lampreys [79–83]. In the lamprey, movement of the axial musculature involves alternation of muscles on two sides of the body, for example, left-right alternation or dorsal-ventral alternation. Muscles on one side lengthen while muscles on the opposite side are shortened. Activation on one side will produce contraction and bending of the muscles and inhibition of the contralateral circuit, that is, relaxation or lengthening of the contralateral segment. The reticulospinal neurons innervating one side can be excited by sensory inputs, but the reticulospinal neurons on the other side are inhibited by the same input. The disturbances sensed by the comparator produce error signals in body configuration, which become reference signals for joint angle control.

The reticulospinal neurons can receive inputs from the vestibular and proprioceptive sensors, which report the current values of the relevant sensory variables, and send projections to motor neurons, which in turn produce movements that resist the effect of disturbances to postural reference signals. They are activated by disturbances in pitch, roll, and yaw planes [84, 85]. If a reticulospinal neuron is activated by a turn in a given plane, it is also involved in generating a torque opposing the turn. For example, a neuron excited by nose-up pitch tilt activates both left and right ventral muscles, which produce nose-down body bending, the output that compensates for the initial postural disturbance of nose-up tilt [84, 86, 87]. The reticulospinal pathway can generate movements that counteract the effects of postural disturbances in any direction [85]. Two complementary types of reticulospinal neurons were found to control posture in a particular axis of rotation: they were activated by rotation in opposite directions and produced movements generating torques counteracting the postural disturbances.

The reticulospinal system can control body orientation, in relation to gravity, by sending reference signals to joint angle controllers. There are obvious parallels between the reticulospinal activity and the nigral activity during postural disturbances [42]. As described below, the BG represent a higher level of the hierarchy.

It is sometimes claimed that the reticulospinal projections have general excitatory or inhibitory effects [88, 89], yet the evidence suggests otherwise. For example, with reticulospinal stimulation, the effects on posture show simultaneous action on pairs of muscles, for example, leg flexed or extended with reciprocal inhibition of the antagonists [90]. Ipsilateral flexion and contralateral extension could be produced with medial reticular stimulation, whereas the opposite pattern of ipsilateral extension and contralateral flexion could be produced by more lateral stimulation. It appears that multiple reciprocal inhibition circuits can be engaged, probably by the activation of spinal interneurons. These behavioral observations are in accordance with the known anatomy. Reticulospinal neurons send branching projections to multiple regions in the spinal cord; for example, axons traveling to the cervical enlargement also project to lumbar levels. A single reticulospinal axon can project to several different spinal levels corresponding to different body parts and to neurons on both sides of the spinal cord [91, 92]. These projections are capable of producing coordinated contraction or relaxation of muscles in several body parts [90, 93, 94].

The reticulospinal pathway, then, implements the body configuration control systems that can adjust references to multiple joint angle controllers. Inputs to this level come from multiple joint angle sensors and muscle length sensors; they are compared with references for body configuration, and error signals are in turn sent to joint angle controllers in different body parts. If the reference signals are fixed, then a stable posture or body configuration will be assumed. By changing reference signals to this level, movements can be created.

8. Orientation Control in the Midbrain

Given the role of the reticulospinal pathway in posture control, the obvious question is how higher order systems can vary descending reference signals in order to generate movements. BG outputs are certainly in a position to do so via direct projections, yet much of the BG output does not reach the reticulospinal pathway directly. Instead there are extensive projections to the midbrain and parts of the diencephalon, which in turn project to the reticulospinal pathway. Some of these areas, such as the tectum and pedunculopontine/mesencephalic locomotor region, project to reticulospinal neurons. I will focus on the tectum because more is known about its organization [95–97]. The nigrothalamic projections are not discussed, because the functions of the thalamocortical system remain obscure.

8.1. Organization of the Tectum

The tectum (superior and inferior colliculi) is chiefly concerned with orientation of the head and body and thereby with steering during locomotion. It is most commonly associated with the orienting reaction, in which any salient stimulus can result in orienting towards that stimulus. This reaction allows one to detect changes in the environment, in preparation for possible behavioral engagement, whether to approach or to avoid [98]. The superficial tectal layers receive perceptual inputs from multiple sensory modalities [95, 99], and stimulation of the tectum can produce a variety of movements (of eyes, ears, head, and muscles) [100–104].

Although only a subset of the output neurons from the intermediate and deep layers are related to eye movements [105], these neurons have been studied extensively. Studies of the monkey superior colliculus have shown a retinotopically organized map that receives inputs from the superficial layers above. Some deep layer collicular neurons fire just before the onset of a saccade that would bring the image to the center of the fovea [106].

It is believed that the tectum contains a map of angular deviations [107–109]. The tectal output somehow allows the organism to orient towards the distal target that is the source of the stimulus. For oculomotor behavior, the question is how to produce a sudden gaze shift to a target off center. The target is selected by moving attention away from the foveated part of the visual field.

The remarkable accuracy in final position of eye movements, despite variability in actual movements, is exactly what we would expect in a position control system. Yet, after decades of study, eye movements are still described in terms of sensorimotor transformation. The fact that negative feedback control and error terms are often mentioned in the oculomotor literature, as mechanisms for sensorimotor transformations, only betrays ignorance of how control works, since control and sensorimotor transformation are mutually exclusive.

Traditional analysis of eye movements has been misguided by the fallacy, discussed above, of misassigning components of a negative feedback system to the organism-environment interaction while missing the one critical ingredient, namely, internal reference signals. Robinson, a pioneer in the study of eye movements, was perhaps chiefly responsible for propagating this fallacy. As he wrote: “The retina senses the error between the eye (fovea) and the target, and the system turns the eye until the error is zero—a simple negative feedback scheme” [110]. The mistake here is to place the comparator function outside the organism. The actual movement is compared with disturbance, and the difference is considered to be the error that is fed into the controller. The controller then becomes an input-output device that transforms the retinal error into neural output. Unfortunately this mistake has dominated oculomotor research [57].

Although the controller reduces the error between fovea and target, it is important to determine where this error signal is generated, by the nervous system or elsewhere. The only relevant visual input is detected by the retina. That self-motion reduces the actual “slip” on the retina simply describes the feedback function, the effect of the behavioral output on the perceived variable. This is the negative feedback. But there is no such thing as “retinal error.” The retina cannot report the error between eye and target. That error is generated inside the brain, using internal comparison functions. Motion is only a disturbance to a control system with a reference signal representing zero motion or simply a particular position. Consequently, any detected motion can generate an error signal that results in movement of the eyes and body.

What is overlooked in the traditional analysis, then, is the internal reference that specifies how much perceived deviation of the target is tolerated. Object motion constitutes a disturbance to the organism precisely because it forces some perceptual inputs to deviate from the values specified by this internal reference signal. The key comparison is done inside the system, not at the retina. This behavioral illusion resulted in a complete reversal of the inside and outside of the control system, forcing theorists to use equations that describe physical processes in the environment to describe the computational processes inside the brain [57]. Robinson, for example, thought it was necessary to use internal computations to generate a signal that represents the target motion, that is, the disturbance [63]. But one of the chief features of the negative feedback controller is that the actual perceptual variable is protected from the effects of the disturbance by producing the appropriate behavior. In other words, its function is to reject the effects of disturbance, in order not to sense it directly.

8.2. The Tectal Orientation Controller

The optic tectum is critical for maintaining foveation. The angular deviation is the difference between the current eye position and the eye position needed to foveate on the visual target. The controlled variable is roughly the distance between target and fovea. The intermediate and deep tectal layers contain neurons that serve as comparators and send error signals in position control. The units at a particular location on the tectal map can activate the appropriate downstream controllers to reduce the position error. The reference level for this variable is close to zero. Any deviation is promptly corrected. This can be called orientation control. There are a few differences from posture control.

(1) As control systems are characterized by the input variables they control directly, orientation control involves different types of sensory inputs. Tectal controllers rely on perceptual inputs unavailable to the lower levels such as the reticulospinal pathway. These are primarily the sensory modalities (e.g., vision and audition) for the detection of distal stimuli away from the organism, whereas posture control relies more strongly on proximal kinesthetic senses. Exteroceptive inputs are therefore needed for orientation control. The sensors involved are usually visual and auditory but also include vibrissae in rodents. The main goal is to produce movements to receive the relevant signals, much like adjusting an antenna to optimize signal reception.
(2) The use of the distal senses creates a representation of the external environment and of the relationship between one’s own body and this environment (egocentric reference frame). This allows orientation and steering towards things in the environment at a distance from the organism [103, 106, 109, 111]. This level is where the sense of direction becomes relevant, as one is no longer simply changing the body configuration, but changing it in order to achieve some relationship with some other object locations in space. Without the distal senses, this would be nearly impossible. Imagine the difficulty of orienting or goal-directed behavior in complete darkness and in silence.
(3) This level is also where the head becomes extremely important. Because the head contains the distal sensory systems, orienting with the head is the equivalent of sensory target acquisition with the relevant receptors, hence the importance of the tectum in foveation control. But target acquisition is not limited to the visual modality. According to the present model, the tectum is also critical for acquiring targets in other sensory modalities. With the head orientation defined with respect to objects perceived by the distal senses, certain concepts used to describe behavior only become meaningful at this level of the hierarchy: straight ahead, towards, and away from. These descriptions cannot be applied to the lower body configuration controllers in the reticulospinal pathway precisely because perceptual signals representing the distal environment do not reach the lower levels, which only receive information about the body.

The tectum, then, controls the orientation of the head and body in relation to some target in the environment. In the orienting reaction, the target is just any salient stimulus. For such control to be possible, both the current location of the fovea and the location of the visual target are needed. In the retinotopically organized tectal map, the fovea represents the origin, that is, in polar or Cartesian coordinates. It also represents the default reference condition for visual tracking. Directions of movement are determined from this starting point. Without knowing the current position, it would be impossible to move towards any location in space.

In foveation control, the outputs vary to minimize deviation from fovea. By visually acquiring the target, the new target location becomes the origin. From the perspective of control theory, fixation, pursuit, and saccades are different modes of operation of the same controller. Fixation, for example, is pursuit tracking on a stationary target. Most differences can probably be attributed to descending reference signals from higher level controllers that have access to additional perceptual variables, such as representation of motion from cortical regions [112]. In smooth pursuit, when the target moves with a certain velocity, position control requires eye movement at a similar velocity as the rate of change in target position on the retina. The smoothness could be the result of additional velocity control, which requires velocity feedback not readily available at the level of the tectum. But current data cannot dissociate velocity control from position control. Both can produce the type of outputs observed in oculomotor studies.

In the rostral end of the tectal map, corresponding to the foveal representation, there are neurons that fire during fixation. Their activity appears to be proportional to position error, the mismatch between a parafoveal stimulus and the currently foveated location [113]. Krauzlis et al. concluded that the activity of these neurons represents a position error signal rather than a motor command [114]. What they failed to realize is that, in control systems, a position error signal is exactly what is needed to produce a motor output, being transformed into a descending reference signal for a lower level controller. These rostral neurons mediate microsaccades, small eye movements that maintain foveation [115]. In caudal tectal regions, neurons fire before and during saccades—large changes in eye position. The difference between rostral and caudal neurons seems to be one of degree, not of kind [116]. The function is to acquire visual targets by placing the light pattern on the fovea in the center of the visual field.

The deep layers of the superior colliculus contain tectoreticulospinal neurons that project to contralateral brainstem regions that generate eye and head movements [117–119]. In cats that are free to move their heads, these projections appear to be critical for gaze shifts, using coordinated movements of the eyes and head. During a gaze shift, there is evidence for a zone of activity moving across the tectal map [120]. The gaze is controlled throughout the trajectory of activity on the motor map. Tectal output seems to reflect instantaneous gaze error. Just before the shift, caudal neurons reflect the initial error. Selection of caudal neurons as the goal target initiates the movement, until the rostral fovea region “captures” the target. The location of the activity at any moment during the gaze shift reflects the remaining error to the target. As the gaze shifts, this zone moves towards the rostral pole. As the gaze shift terminates, the active zone enters the rostral pole. These findings suggest that the activation of any point off center in the deep tectal map produces the output needed for position control. The output reflects position error at any moment; this error signal can be computed by subtracting the current target position from the center of the visual field.

At the level of the tectum, the controlled variable is not body configuration per se, but body configuration in relation to some perceived distal stimulus. In other words, any number of body parts (eyes, neck, trunk, etc.) will vary their position in order to reduce this discrepancy. This amounts to varying multiple joint angles, the lengths of many more muscles, and ultimately the sensed tension of many muscles.

The known anatomy suggests that the superficial layer contains the input function and the deep layers contain the comparator function. The projections from the deep layers to the reticulospinal neurons send an error signal, which is turned into a reference signal for the lower level.

A key question is how the position of the neuron on the tectal map can determine the actual movement vector. In polar coordinates, the position of any point on the map can be defined as (r, θ), where r is the radial deviation and θ is the angle. Experiments using electrical stimulation have shown that, in the brainstem targets of tectal projections, there are independent controllers for horizontal and vertical movements [103, 121]. Movement in any direction can be determined by a combination of outputs from these distinct controllers. Since deep layer tectal neurons at any location can project to both horizontal and vertical movement controllers, the ratio between the synaptic weights of these projections can determine θ [106]. The tectal map, then, reflects a map of varying synaptic weights from the deep layer neurons to the independent controllers below. On the other hand, the degree of activation, that is, pulses injected into the system, can determine the amplitude of the movement. Since the radial deviation from the origin in the map is the position error signal, the number of pulses reflects the magnitude of the error.
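The mapping from map location to movement vector can be illustrated with a minimal sketch. It assumes, purely for illustration, that the weights onto the horizontal and vertical controllers are cos θ and sin θ and that the pulse count is a linear function of r. The names tectal_output, decode, and pulses_per_unit are hypothetical and do not come from the cited experiments.

```python
import math

def tectal_output(r, theta, pulses_per_unit=10.0):
    """Hypothetical output of one deep-layer site at polar map position (r, theta).

    The weights onto the horizontal and vertical controllers set the direction;
    the number of pulses, assumed proportional to r, sets the amplitude.
    """
    w_horizontal = math.cos(theta)   # assumed weight onto the horizontal controller
    w_vertical = math.sin(theta)     # assumed weight onto the vertical controller
    pulses = pulses_per_unit * r     # degree of activation scales with position error
    return pulses * w_horizontal, pulses * w_vertical

def decode(h, v, pulses_per_unit=10.0):
    """Recover the angle from the output ratio and r from the total drive."""
    theta = math.atan2(v, h)
    r = math.hypot(h, v) / pulses_per_unit
    return r, theta

h, v = tectal_output(r=2.0, theta=math.radians(30))
r, theta = decode(h, v)
```

Under these assumptions the ratio of vertical to horizontal output equals tan θ regardless of r, so direction is set by location-specific weights while amplitude is set by the degree of activation.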

Stimulation of the tectum can generate coordinated movements of the eyes, neck, and body [101, 120, 122]. The eyes are in the best position to correct this error, but all the relevant controllers probably generate outputs proportional to the error. Consequently, in unrestrained animals, manipulations of the tectum can create sequential activation of orienting movements in a rostral-to-caudal direction (eyes, neck, and body).

The reference signal in this system represents the goal position. Changing the reference signal for a foveation control system moves the eyes to the new kinesthetically sensed configuration, after which the tracking system is again locked onto the visual field. Injection of the GABA-A receptor agonist muscimol can mimic the effect of a change in reference signal. But because muscimol produces prolonged activation of GABA-A receptors, the change is not transient but sustained. That is, we should expect a long-lasting offset in the reference signal and eye position. When injected into the rostral tectum, muscimol indeed creates offset towards the location of injection [123]: instead of foveating on the target, the eye is locked on a nearby region close to the site of injection. Thus the position reference for the foveation system can be altered by injecting an inhibitory signal into the comparator. The injected reference becomes the new center.

9. Nigral Outputs

If enhancing inhibition of the tectum can artificially create an offset in the fixation position, then normally an inhibitory reference signal may be sent to the tectal comparator. Interestingly, the deep tectum is one of the major targets of the BG outputs [96, 97, 99, 124]. The SNr sends strong GABAergic projections to the deep layers of the tectum. The dorsolateral and ventromedial outflows from the SNr terminate in the rostrolateral and caudomedial intermediate layers of the optic tectum, respectively. Nigrotectal channels map onto tectal map output cells with distinct brainstem projection zones [125].

That the nigrotectal projection sends an inhibitory reference signal to the tectal controller provides a clue about tectal function, for control systems with inhibitory reference signals have unique properties. First, the controller does not produce any output unless the perceptual signal exceeds the reference, creating a threshold-like effect. Unlike input/output devices, however, once the “threshold” is reached, negative feedback is used. Second, an inhibitory reference signal does not allow the input to exceed the reference, so that the output of such a system will act to reduce its input to be less than or equal to its reference. Third, the perceptual input should have a positive sign, because the perceptual signal and the reference signal must be opposite in sign (Figure 6), in order to produce the error signal. In the tectal controller, for example, the perceptual signals coming from the superficial layers to the tectal comparator are excitatory [126], and nigrotectal projections provide the inhibitory reference signals. This arrangement enables the comparison function. The reference signal is then subtracted from the perceptual signal.

Any retinal input can activate the corresponding input function in the superficial tectum, which in turn activates the relevant comparator unit, but the error is determined by a comparison between reference and input. During fixation, perceptual input to the comparator for peripheral units is lower than the inhibitory nigral reference signal, generating little error. A salient stimulus from the periphery can generate a perceptual signal that exceeds the inhibitory reference, generating an error signal that results in orienting towards the new target. Whenever the input exceeds the reference, that input is acquired as the target of foveation.

With a constant reference signal, any salient stimulus off center in the visual field can generate a sufficiently strong input to elicit orienting behavior. This creates the illusion of an input/output device. But when the reference signal is altered, the orienting reaction will also change. Habituation, for example, is common when the salient input is repeatedly presented with no significant consequences. An increase in the inhibitory reference signal can explain habituation.

According to the present model, the “baseline” rate of the nigral output would reflect the position control reference. A constant rate of firing corresponds to a fixed body configuration and orientation. For the oculomotor system, the currently foveated target can be viewed as the origin of the map. A change in the nigral reference signal reestablishes the origin. The BG output is hypothesized to send a reference signal for the desired position in Cartesian coordinates (x, y). The reference signal indicates goal location, whereas the center of the tectal map, the current reference point, indicates the current position. More generally, the “baseline” BG output corresponds to the neutral position when the animal is not moving (or simply maintaining foveation). From this position, increase or decrease in firing rate of different types of nigral output neurons can generate movements in different directions. The rate of change in the reference signals determines movement velocity.

If the inhibitory reference signal is set to zero, any perceptual input can generate errors. This would be the case if the SNr is lesioned or inactivated, or when GABAergic transmission in the nigrotectal pathway is blocked completely [127, 128]. A reduction in nigrotectal inhibition increases the error signal from the tectal comparator even if the perceptual input does not change. Of course, to control antagonistic systems, both a decrease and an increase will be necessary, but pharmacological manipulations like muscimol are not specific enough to reveal the function of opponent outputs.
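The comparator with an inhibitory reference described in this section can be sketched in a few lines. This is only an illustration: the function name tectal_comparator, the numerical values, and the assumption that the error is floored at zero (because firing rates cannot be negative) are not taken from the cited studies.

```python
def tectal_comparator(perceptual_input, inhibitory_reference):
    """Error signal of a comparator with an excitatory input and an inhibitory reference.

    Assumption: signals are firing rates, so the error cannot fall below zero.
    """
    return max(0.0, perceptual_input - inhibitory_reference)

baseline_reference = 5.0   # hypothetical nigral reference, arbitrary units

# Input below the reference: little or no error, fixation is maintained.
weak_input = tectal_comparator(3.0, baseline_reference)
# Salient input exceeds the reference: an error signal drives orienting.
salient_input = tectal_comparator(8.0, baseline_reference)
# Habituation modeled as a raised inhibitory reference: same input, no error.
after_habituation = tectal_comparator(8.0, 9.0)
# Reference set to zero (e.g., SNr inactivation): even a weak input gives error.
nigral_inactivation = tectal_comparator(3.0, 0.0)
```

The same input produces different errors depending on the reference, which is why a constant reference can make the system look like an input/output device.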

According to the present model, the tectoreticulospinal system makes it possible to define body configuration and posture in relation to a distal target in the environment. At a lower level, the reticulospinal pathway is responsible for generating reference signals for multiple joint angles. The requisite changes in joint angles allow position control in three-dimensional space (x, y, z), using independent controllers for yaw, pitch, and roll. The reticulospinal system itself cannot achieve orientation control or steering, because it lacks inputs from the distal senses.

Nigral outputs are not the only projections that reach the tectum. Nor is the tectum the only target of the nigral output. The pedunculopontine/mesencephalic locomotor region and ventral thalamus also receive extensive nigral projections [129]. The lower levels below the BG output rely on perceptual inputs from multiple modalities to orient towards the critical aspect of the environment and to coordinate the movements of the relevant body parts. The ventral thalamus, a major target of the BG output, also appears to contain certain body configuration controllers [7, 130], but its functional role remains poorly understood. The mesencephalic locomotor region and pedunculopontine nucleus are critical for the alternation and modulation of locomotor patterns and relevant posture control [76, 77]. References reaching these regions can modulate pattern generators for locomotion. Thus, when the locomotor circuit is engaged, orientation control serves a steering function, so that locomotion becomes directed at specific targets in the environment.

10. BG Circuits and Transition Control

I hypothesize that the BG implement the level of transitions in the control hierarchy [7]. This possibility was first suggested nearly three decades ago by Cools [122], but it was unknown at the time how outputs from the BG can use lower levels, because the opponent outputs from the BG were not known and the functions of the lower levels were poorly defined [4, 49].

In transition control, the relevant variable is not configuration, but the rate of change in any configuration. An example of a visual configuration is the perception of a photograph or a drawing. An example of a kinesthetic configuration is a posture. Each is a unitary representation at the configuration level. Transition is simply a change in that particular configuration, as animation is a succession of images.

The inputs to the transition level represent changes in perceptual variables from multiple modalities. This is the level at which the perceptual variables are “objects” and “things”; for example, a rose is perceived as a rose no matter what the viewing angle may be. The cerebral cortices, especially the secondary cortices, contain “gnostic units,” invariant representations of lower level inputs, and the requisite higher order reference signals [98, 131]. These are sent to the transition controller in the BG via the corticostriatal projections.

The final outputs from the transition level are reference signals for orientation and body configuration controllers. The rate of change of this output signal, for example, a change in the firing rate of SNr neurons, represents movement velocity [40].

10.1. Distinct BG Networks Classified by Perceptual Input Signals

I hypothesize that the striatum serves as a comparator in the transition controller. Striatal output represents the error signal, while the pallidum (including the SNr) contains the output function of the transition controller.

The major projections to the striatum come from the cerebral cortex and intralaminar thalamus [132–135]. The corticostriatal and thalamostriatal projections are roughly organized in a topographical fashion. The sensorimotor cortex projects to the sensorimotor striatum. The associative cortex projects to the associative striatum and the limbic cortex, including basolateral amygdala and hippocampus, to the limbic striatum. This projection pattern is the basis for functional heterogeneity within the striatum [40, 136].

The complexity of the cerebral cortico-BG networks is due to the variety of perceptual variables, constructed from lower-order inputs. These higher order perceptual representations are achieved by the cerebral cortex. The organization of the cortex and the corticostriatal projections allows many perceptual variables to be controlled. At least in principle, any variable that can be perceived can also be controlled.

Different striatal regions are therefore associated with transition control of specific classes of perceptual variables [4]. The three major classes are exteroceptive (associative), interoceptive (limbic), and proprioceptive/somatosensory (sensorimotor). Exteroceptive inputs are primarily concerned with perceptions of objects and space. Interoceptive inputs are concerned with internal bodily sensors, which report the state of essential variables (e.g., thirst and hunger). On the other hand, proprioceptive/somatosensory perceptions come from the muscle and tendon sensors as well as sensors from the body surface.

Cortical areas are highly similar in their basic microcircuitry, with relatively minor variations [137]. Whether some cortical region is classified as visual or auditory, for example, is largely attributed to the ultimate source of its inputs. Striatal and pallidal regions, though often bearing many names, are also similar in their circuit organization [10, 138, 139]. To control transitions, different cortico-BG networks therefore perform similar computations on different types of perceptual variables [4, 40]. The computations performed by neural circuits are mathematical functions often used in analog computing. For example, a function f(x) is the same, regardless of the value of x. The content of the signal is independent of the computations performed.

10.2. Control of Movement Velocity

The simplest type of transition control is the control of succession of proprioceptive signals or movement velocity control [40]. In velocity control, the controlled variable is the rate of transition in body configurations, whether in locomotion or in postural transitions or orienting movements. The error signal in velocity control changes the reference signal of the body configuration controller. In velocity control, all changes of position, velocity, and acceleration are necessary consequences of how the control hierarchy is organized. The load accelerates toward the final position and then starts decelerating before it gets there, as if it knows it is about to reach the desired position. But it has no such knowledge. Even though nobody is telling the system when to accelerate or decelerate, it does so at the right moments with just the right amounts. This is an important yet surprising property of negative feedback control systems.

By integrating the error signal from the velocity comparator, the descending reference signal for body configuration and orientation can be obtained. According to this model, the magnitude (firing rate) of the velocity error signal is proportional to the rate of change of the BG output from the SNr. A larger signal produces a faster rate of change in the orientation/configuration reference. The neural implementation of the leaky integrator is the projection from the striatum to the BG output nuclei such as the SNr.
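The relationship between the velocity error signal and the descending position reference can be sketched as a small simulation. The leak rate, gains, time step, and the proportional lower-level controller are all assumptions chosen for illustration, not parameters estimated from data.

```python
def integrate_velocity_error(velocity_error, steps=200, dt=0.01,
                             leak=0.1, position_gain=20.0):
    """Turn a constant velocity error into a position reference and track it.

    Returns the final position reference and the final position.
    """
    position_ref = 0.0
    position = 0.0
    for _ in range(steps):
        # Output function: leaky integration of the velocity error signal.
        position_ref += dt * (velocity_error - leak * position_ref)
        # Lower level: the position error drives the effector toward the reference.
        position_error = position_ref - position
        position += dt * position_gain * position_error
    return position_ref, position

slow_ref, slow_pos = integrate_velocity_error(velocity_error=1.0)
fast_ref, fast_pos = integrate_velocity_error(velocity_error=2.0)
```

In this sketch, doubling the velocity error doubles the change in the position reference over the same interval, and the lower level follows the reference with a small lag.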

Velocity control is hypothesized to be a major function of the sensorimotor cortico-BG network. In neurological disorders implicating the BG, velocity control is often impaired. For example, in bradykinesia, a common symptom after dopamine depletion in Parkinson’s disease, movement is abnormally slow [140, 141], though position control is still effective. This deficit is a result of reduced rate of change in the body configuration reference signal. If the reference signal reaching the comparator is reduced, the movement will eventually correct the position error, but it will be slower [40]. If the reference signal is zero, there is akinesia. The effect is similar to playing a video in slow motion: the frame rate is reduced when the velocity reference signal is too low.

Bradykinesia could be a result of reduced velocity reference signal, though abnormalities in the input function or comparator are also possible [40]. The magnitude of the velocity reference signal could be determined by excitatory inputs to the striatum, from the cortex and perhaps thalamus, and by a modulatory signal from the midbrain dopamine neurons. With dopamine depletion in Parkinson’s disease, the velocity controller is impaired, reducing the peak output of the velocity controller. Consequently, the rate of change in the BG output will be reduced, leading to slower transitions in body configurations.

The effect of dopamine is “modulatory” in the engineering sense (not in the conventional neurophysiological sense, which just means change). Playing the role of “volume control,” dopamine is proposed to have a multiplicative effect on the glutamatergic signal arriving at the spines of the striatal projection neurons. The magnitude of the error signal entering the leaky integrator in the output function depends on both the glutamatergic input and the simultaneous dopamine signal. When dopamine is depleted, the glutamatergic signal has a reduced effect on the output. A reduced signal enters the leaky integrator that transforms the velocity reference signal into a rate of change in position reference. Position reference (from the SNr output), in turn, will change more slowly.
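The proposed multiplicative role of dopamine can be made concrete with a minimal sketch. It assumes a pure integrator (no leak) and arbitrary units for simplicity; the function time_to_target and its parameters are hypothetical.

```python
def time_to_target(glutamate, dopamine, target=1.0, dt=0.001, max_time=60.0):
    """Time for the position reference to reach `target`.

    The signal entering the integrator is the product dopamine * glutamate,
    an assumed form of the "volume control" described in the text.
    """
    drive = dopamine * glutamate
    if drive <= 0.0:
        return float("inf")        # no drive: the reference never moves
    position_ref = 0.0
    t = 0.0
    while position_ref < target and t < max_time:
        position_ref += dt * drive  # integrate the modulated error signal
        t += dt
    return t

normal = time_to_target(glutamate=2.0, dopamine=1.0)
depleted = time_to_target(glutamate=2.0, dopamine=0.25)
absent = time_to_target(glutamate=2.0, dopamine=0.0)
```

With the same glutamatergic input, a fourfold reduction in the dopamine factor makes the position reference take about four times as long to reach the same value, and a zero factor leaves it unchanged.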

Recent work suggests that the sensorimotor striatum is a key component of the velocity controller. The firing rate of sensorimotor striatal projection neurons is highly correlated with movement velocity, though it is still difficult to ascertain whether the signals they carry reflect velocity reference, input, or error [142].

It remains unclear what the role of the striatonigral (direct) and striatopallidal (indirect) pathways is in the transition control network. It has been argued that the direct pathway serves to select desired actions, while the indirect pathway suppresses competing actions. But this model makes a number of questionable assumptions about behavior, in particular the relationship between posture and movement. It cannot be defended in light of recent data on opponent BG outputs. An important question is how these opponent outputs, which are needed for downstream controllers that move in opposite directions, are generated by the intrinsic circuitry. One obvious possibility is that they are generated by the direct and indirect pathways [36, 42]. There are common inputs to the striatonigral and striatopallidal neurons, for at least a large proportion of corticostriatal projections. This circuit can function as a phase splitter, in which a uniform input signal to the BG (e.g., carried by the corticostriatal projection) is transformed into a pair of output signals, one increasing and the other decreasing at the same time (Figure 2). Accordingly the rate of change in these outputs will correspond to movement velocity in different directions. This possibility remains to be tested.
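The phase splitter idea can be sketched as follows. This is an illustration only: the baseline rate, the gain, and the function name phase_splitter are assumptions, and the sketch takes no position on which pathway produces which output.

```python
def phase_splitter(common_input, baseline=50.0, gain=1.0):
    """Split one input into two opponent outputs around a shared baseline rate.

    Rates are floored at zero because firing rates cannot be negative.
    """
    increasing = max(0.0, baseline + gain * common_input)
    decreasing = max(0.0, baseline - gain * common_input)
    return increasing, decreasing

# A ramping common input yields outputs that change at equal and opposite rates.
inputs = [0.0, 5.0, 10.0, 15.0]
outputs = [phase_splitter(x) for x in inputs]
```

As the common input ramps up, one output rises while the other falls by the same amount, so their rates of change are equal in magnitude and opposite in sign.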

11. Above Transition Control

The transition level is where voluntary or goal-directed behavior emerges. In traditional terms, this is where the will in the brain is translated into actions. According to the current model, the will can be viewed as a particular type of reference signal entering the comparator function of the transition controller. A common symptom after damage to the BG is abulia or lack of will [143]. This is a consequence of reduced reference signals to the transition control system.

A simple movement such as raising one’s hand can serve multiple purposes: to scratch the neck, to fix the hair, to ask a question, and so forth. The kinematics of the arm movement per se is ambiguous, for it does not tell us which level of the hierarchy is responsible for initiating the action. Nevertheless, despite the fundamental ambiguity in interpreting the purpose of actions, the purpose of any control system can be determined experimentally.

What is needed is an explicit test for the controlled variable in question [144], by manipulating feedback functions and assessing the consequent changes in behavioral output. Control systems share an important property: whenever a variable is controlled, disturbance to this variable will be resisted by its output. Thus the hypothetical controlled variable will change less than one would expect had there been no feedback at all.
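The logic of the test for the controlled variable can be illustrated with a toy simulation. It assumes a single loop with an integrating output function and a slow sinusoidal disturbance; the gain, time step, and disturbance frequency are arbitrary choices, and run is a hypothetical name.

```python
import math

def run(loop_gain, steps=2000, dt=0.01, reference=0.0):
    """Apply a slow sinusoidal disturbance and return the peak deviation.

    The variable is output + disturbance. With loop_gain = 0 there is no
    feedback and the variable simply follows the disturbance.
    """
    output = 0.0
    peak = 0.0
    for k in range(steps):
        disturbance = math.sin(2 * math.pi * 0.1 * k * dt)
        variable = output + disturbance
        error = reference - variable
        output += dt * loop_gain * error   # integrating output function
        peak = max(peak, abs(variable - reference))
    return peak

open_loop = run(loop_gain=0.0)
closed_loop = run(loop_gain=50.0)
```

If the variable is controlled, its deviation under disturbance is far smaller than the disturbance alone would produce; if it is not, the two are the same.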

The entire motor hierarchy can be viewed as the final common path for actions, just as the neuromuscular junction is the final common path for specific muscle contractions. When we analyze actions, we ignore all the details at lower levels (e.g., joint angle or muscle length control in the spinal cord and brainstem). By analyzing the output of the transition controller we can see how a particular action is performed, but an equally important question is why it is performed. That is a question about the higher levels that can alter the reference signal of the transition controllers. When we ascend the control hierarchy, we ask “why” certain outputs are generated by trying to identify the reference signal, which is proportional to the error signal of a higher level.

The transition level is the highest level of the motor hierarchy, but we can still ask why a particular action is performed. For example, the reference signal for the velocity controller comes from still higher levels. The rate at which the configurations are altered appears to be related to the motivational urgency, that is, magnitude of error at still higher systems that becomes the reference signal for the velocity control system. The presence of reward can significantly increase the velocity reference signal, which reduces the latency and increases the firing rate of striatal neurons [145]. How do goals of actions affect the actions themselves? This is a question to be addressed below.

11.1. Relationship Control

The feedback path between muscles and the loads they accelerate is short and relatively direct. Proprioceptive signals are automatically affected by the effectors. But this is not true of other types of transitions. For example, we perceive a cat running across the visual field. This perception is not automatically controlled by our behavior. The lowest levels of the oculomotor system can exert some effect on the perception, as the eyes track the running cat, but in a second the cat is gone. To have full control of the “cat perception,” some feedback path must be discovered, so that our own movements can alter this perception, for example, by chasing the cat.

“Chasing” is the output of a controller that controls one perceptual variable, namely, transition of a set of proprioceptive configurations. Yet in this case the transition level is in the service of a still higher level, with its own controlled variable, which can be described as closing the distance to the cat. The same behavior can be described in different ways. It can be described as a series of changes in muscle tension, in muscle length, in joint angle, or in posture or as running or chasing. Which of these descriptions is the appropriate one? Strictly speaking, all of them are true, but they describe the actions of the control hierarchy at different levels. The most appropriate one here is chasing, because that describes the appropriate controlled variable. If one is simply running with no target, then the controlled variable is not “closing the distance between self and target.” These two possibilities can be tested experimentally by manipulating the feedback function or introducing a disturbance to the controlled variable. If chasing is the appropriate description, then stopping the target would also stop the behavior, which would not be the case if running were the appropriate description. The key question is not only which perceptual variable is being controlled (as all levels of the hierarchy are controlling their local perceptual variables), but also which level of the control hierarchy is the “lead” level. This level controls its own perception by commanding the lower levels.
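The experimental distinction between chasing and running can be sketched with a one-dimensional simulation. In the "chase" mode a higher level controls the distance to the target and its error sets the velocity reference; in the "run" mode the velocity reference is fixed. All gains, speeds, and the time at which the target stops are arbitrary assumptions, and simulate_pursuit is a hypothetical name.

```python
def simulate_pursuit(mode, steps=3000, dt=0.01, gain=2.0, run_speed=1.0):
    """Return the pursuer's final speed, well after the target has stopped.

    mode="chase": the velocity reference is the error of a higher-level
                  distance controller (reference distance = 0).
    mode="run":   the velocity reference is fixed, with no target involved.
    """
    pursuer, target = 0.0, 5.0
    speed = 0.0
    for k in range(steps):
        target_speed = 1.0 if k * dt < 10.0 else 0.0   # target stops at about t = 10
        target += dt * target_speed
        if mode == "chase":
            distance_error = target - pursuer          # higher-level error
            speed = gain * distance_error              # becomes the velocity reference
        else:
            speed = run_speed
        pursuer += dt * speed
    return speed

chase_final_speed = simulate_pursuit("chase")
run_final_speed = simulate_pursuit("run")
```

Stopping the target brings the chaser to a halt once the distance is closed, whereas the runner continues at the same speed, which is the behavioral signature that separates the two descriptions.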

The level just above the transition is the relationship level, where the controlled variable is a relationship between at least two perceptual variables. In most cases, this relationship is between two transitions, that is, two changing configurations. In chasing a cat, the distance between the self and the cat is a relationship. Likewise, in a tracking task, one has to move the mouse cursor to follow a moving target [7], so the distance between the cursor and target is a relationship. In driving, the relationship between the car and various other perceptions, for example, the road, lane markers, or red lights, must also be controlled. Humans can readily choose any arbitrary distance, which means that the reference signal for the relationship control can be set at some arbitrary value. To control this value, the relationship level must have access to both perceptual variables and send some error signal that activates lower level controllers, that is, to initiate the appropriate types of transition control.

11.2. Sequence

Another type of controlled variable may be called “sequence” or “serial order.” An action such as “drinking a glass of water” can be broken down into multiple components: gaze shift, reaching, holding, moving the cup to the mouth, drinking, and so forth. It is necessary for these components to be ordered appropriately for the sequence to be effective. Serial order itself is a controlled variable. The sequence AB is different from BA, even though the same elements are involved.

Sequence, in this sense, is different from stereotyped alternation as in locomotor pattern generation, mediated by brainstem and diencephalic structures below the level of the BG [77]. The latter does not require learning of arbitrary serial order, relying instead on innately organized circuits for stereotyped sequences, for example, flexor-extensor alternation. Serial order, the arbitrary ordering of individual action primitives, requires learning. This is evident in the lack of proper serial order in the actions of infants, for whom an action as simple as “drinking a glass of water” can be impossible.

Unsurprisingly, the learning of serial order also depends on the sensorimotor cortico-BG network [136, 146]. Lesions of the sensorimotor striatum or of secondary motor cortical regions that project to this region can impair the learning of serial order. Mice were trained to perform two actions sequentially (e.g., press the left lever first and right lever second) in order to earn some food reward. Lesions of the sensorimotor striatum can impair learning of the serial order without impairing the learning of individual actions. In other earlier studies, it was found that dopamine antagonists can also impair sequence control [147, 148]. Exactly how serial order control is implemented by neural circuits remains unclear.

11.3. Learning and Recruitment

So far we have considered how the proposed hierarchy of neural circuits can implement cascade control. One important question that remains is how these systems can be modified through learning.

At the level of transition control, an important phenomenon is observed, traditionally called reinforcement. As Thorndike first stated in his “law of effect,” if a behavior is followed by a good consequence or effect, it is more likely to be repeated in the future; if it is followed by some bad effect, it is more likely to be eliminated or reduced in frequency [149]. This phenomenon is studied most commonly in the field of operant or instrumental conditioning, in which animals are trained to perform specific actions like pressing a lever in order to obtain food. The critical role of the cortico-BG networks in instrumental learning is supported by many studies [36, 150–153].

In relationship control, the rate of change in one variable is related to that of another. This is similar to the “related rates problem” in calculus, which is solved by implicit differentiation using the chain rule. Operant conditioning provides a good example. If one learns to press a lever for food, both the action of lever pressing and the outcome of food delivery are transitions in perceptual variables. There is a feedback function relating the rate of pressing to the rate of reward [154]. The organism can only become aware of this action-outcome contingency at the relationship level, where both perceptual variables (action transition and outcome transition) are available.
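The related-rates analogy can be written out for one simple case. The symbols here are illustrative assumptions, not notation from the text: P(t) is the cumulative number of lever presses, R(t) the cumulative number of rewards, g the feedback function, and n the number of presses required per reward on an assumed fixed-ratio schedule (ignoring the discreteness of individual rewards).

```latex
R(t) = g\bigl(P(t)\bigr)
\quad\Longrightarrow\quad
\frac{dR}{dt} = g'(P)\,\frac{dP}{dt},
\qquad\text{and for a fixed ratio } n:\quad
g(P) = \frac{P}{n}
\;\Longrightarrow\;
\frac{dR}{dt} = \frac{1}{n}\,\frac{dP}{dt}.
```

Under this assumption the rate of reward is simply the rate of pressing scaled by 1/n, so a change in the schedule changes the slope of the feedback function while leaving its form intact.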

The key feature of such relationships or contingencies is that they do not reflect stable physical dependencies, in the same way that, for example, joint angle depends on muscle length. Rather they reflect ever-changing and arbitrary relationships in the environment. Exploiting this type of relationship, the organism can generate output to control one variable in order to control another. Precisely because such relationships are fleeting properties of the environment, learning and experience will be needed to acquire them. Learning to control one transition variable in order to control another is therefore the most important type of learning. There is a hierarchical relationship between these two variables. Only one is directly under the control of the organism before learning, whereas the other is not. Control over the new variable is acquired. Using this indirect method, any variable can be controlled provided that it can be perceived and that a feedback path exists between it and a currently controllable variable.

In the absence of experience, for example, in a newborn infant, many types of control systems are still functional, and a rudimentary control hierarchy is already in place. The essential variables necessary for life are, by definition, already controlled using existing homeostatic controllers in the body and the autonomic nervous system. But the extent of control is limited. For example, despite extremely sophisticated body temperature control in the infant, he is quite unable to perform specific actions to put out a fire. His ability to defend the essential variable against environmental disturbances is limited. To extend that ability, learning is required, and such learning is initially driven by the error signals in primary controllers, when essential variables are disturbed.

11.4. Trouble with Reinforcement

In recent years, models of reinforcement learning have had a major impact on neuroscience, especially on researchers studying the function of the BG [35, 155, 156]. It is widely believed that the BG circuits implement specific models of reinforcement learning, which are largely based on Thorndike’s law of effect [157]. Reinforcement is what makes behavior repeat. Food reward, for example, is called a reinforcer when the preceding behavior can be reliably repeated.

What is lacking in reinforcement models is the internal reference [158–160]. Consequently, it is impossible to determine when to start or stop any behavior. When will a rat start pressing a lever for food? When will it stop pressing? Why is food reinforcing when the rat is hungry but not when it is sated?

The implicit assumption of the reinforcement model is that the organism maximizes rewards or good effects and that more reinforcement causes more behavior [155]. This belief persists partly because almost all studies in this field use food or water deprivation to generate behavior. The goal of the experimenter is to create conditions when the behavior in question can be observed. From the perspective of control theory, this means that error is high, and the animal strives to reduce error by performing the action. Drastic deprivation guarantees responding and creates the illusion of reward maximization during the period when the error is large. Yet the rate of reward in instrumental conditioning is a controlled variable. Changing the feedback function (i.e., reinforcement schedule) dramatically changes the rate of lever pressing, but in a predictable fashion because the rate of food delivery is relatively constant [49]. More reinforcement does not produce more behavior. In fact, when the schedule is leaner, as has been known for decades, the rate of pressing increases. The fluctuation in behavior may appear to be random, but it is understandable in light of what is happening to the variable being controlled, namely, the rate of reward delivery.
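The claim that a leaner schedule raises the rate of pressing while the rate of reward stays roughly constant can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the fixed-ratio feedback function, the integrating output function, and the parameter values (`reward_ref`, `gain`, `dt`). It is an idealized controller, not a fitted model of rat behavior, and it applies only within the range where control is maintained.

```python
def simulate_fixed_ratio(ratio, reward_ref=2.0, gain=5.0, dt=0.01, steps=20000):
    """Idealized controller that keeps reward rate near reward_ref.

    ratio      -- presses required per reward (assumed fixed-ratio schedule)
    reward_ref -- hypothetical internal reference for reward rate
    Returns (press_rate, reward_rate) at the end of the run.
    """
    press_rate = 0.0
    reward_rate = 0.0
    for _ in range(steps):
        reward_rate = press_rate / ratio      # environmental feedback function
        error = reward_ref - reward_rate      # comparator: reference minus perception
        press_rate += gain * error * dt       # integrating output function
        press_rate = max(press_rate, 0.0)     # a rate cannot be negative
    return press_rate, reward_rate


if __name__ == "__main__":
    for ratio in (5, 10, 20):
        presses, rewards = simulate_fixed_ratio(ratio)
        print(f"ratio={ratio:2d}  press rate={presses:6.2f}  reward rate={rewards:4.2f}")
```

In this sketch the press rate settles at `ratio * reward_ref`, so doubling the ratio doubles the pressing while the reward rate stays at the reference. A reward-maximizing stimulus-response account would predict the opposite trend.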

The reinforcement model also confounds learning and performance. The implicit model of the organism is a stimulus-response device. Behavior is a function of what happens to the organism. The only possible change in organization is in the strength of the bond connecting stimuli (or states) with responses, as originally proposed by Thorndike, regardless of how many intervening variables are inserted between these two. But clearly motivational state can also affect performance, as a sated rat stops pressing the lever. Consequently, whether a change in associative strength or motivational drive is responsible for the change in performance is impossible to ascertain. Early investigators like Hull at least attempted to solve this problem, but in recent years it has been ignored entirely [158, 159].

The absurdity of explaining behavior by its antecedent conditions has already been discussed above. Knowing how control systems function, it is impossible to define learning simply as a change in behavior, for an important property of control systems is that they can produce new behaviors without ever changing their parameters.

According to the model proposed here, deprivation creates large error signals in the essential variables [55]. The primary deficit in energy homeostasis is the ultimate source for the error signal that initiates the food seeking behavior. What is traditionally called reinforcement is a reduction in error signals in systems that control the essential variables. Behaviors are repeated because they reduce error signals created by deprivation and other disturbances.

In a rat that has already learned to press a lever for food, the action of lever pressing is the means by which the error is reduced as the rat becomes less hungry. But the question is how the rat ever learned to press the lever in the first place. Such learning requires a change in the properties of the control systems, such as the construction of new reference signals or the establishment or modification of links between levels in a labile hierarchy [4, 7].

Instrumental learning consists of multiple phases [153]. Initially, as a result of large error signals in controllers for the essential variables, the organism generates random variation in system parameters. This is manifested in behavioral variability, which leads, by chance, to the action that reduces the error [161]. The error reduction is what is traditionally called reinforcement. It reduces the rate of variation, preserving the effective set of parameters in the control system. Next time, when the error signal increases again, the system that has been reorganized to reduce it most quickly will be selected.
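The reorganization process described above can be sketched as a biased random walk over system parameters. The concrete rule used here is an assumption chosen for illustration, not a mechanism specified in the text: parameters keep changing in the current direction while the intrinsic error falls, a new random direction is drawn when the error rises, and the size of each change scales with the error, so variation slows as the error shrinks. The two-parameter system, the `target` values, and the `rate` are hypothetical.

```python
import math
import random


def reorganize(target=(3.0, -2.0), start=(0.0, 0.0), rate=0.1, steps=5000, seed=0):
    """Random reorganization of two hypothetical system parameters.

    The intrinsic error is the distance between the current parameters and an
    effective setting (target) that the system itself has no knowledge of.
    Returns (final_params, error_history).
    """
    rng = random.Random(seed)
    params = list(start)

    def intrinsic_error(p):
        return math.hypot(p[0] - target[0], p[1] - target[1])

    def random_direction():
        angle = rng.uniform(0.0, 2.0 * math.pi)
        return math.cos(angle), math.sin(angle)

    direction = random_direction()
    error = intrinsic_error(params)
    history = [error]
    for _ in range(steps):
        step = rate * error                   # large error -> fast variation
        params[0] += step * direction[0]
        params[1] += step * direction[1]
        new_error = intrinsic_error(params)
        if new_error >= error:                # error not reduced: vary at random again
            direction = random_direction()
        error = new_error
        history.append(error)
    return params, history


if __name__ == "__main__":
    final_params, history = reorganize()
    print("initial error:", history[0])
    print("final error:  ", history[-1])
```

No parameter change is ever "strengthened" by a reward in this sketch. The effective parameters are retained only because variation nearly stops once the error is small.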

This process of reorganization is the opposite of the reinforcement mechanism. The reinforcer does not strengthen some existing connection between sensory input and motor output. Rather the error signal in controllers for essential variables starts an active process of reorganization, during which the system parameters simply vary at a high rate. This process, however, is stopped by the error reduction. That saves the set of system parameters.

Performance always depends on the amount of error present, but learning explains which lower level systems are actually recruited to reduce the error and why. In an operant conditioning experiment, to satisfy its hunger the rat must press the lever in the operant chamber. The error signal from the food pellet controller is used as a reference signal for the action. Thus during reorganization the controller for food recruits the controller for lever pressing. This process of “recruitment” is critical in instrumental learning. It involves establishing or strengthening the connection between two independent controllers, so that one will serve the other. The relationship between them is hierarchical, so that the higher controller can use the lower one by sending a reference signal to the comparator function of the latter. This learning process explains what happens, in traditional terms, when an action is associated with an outcome [162]. The action-outcome link is established so that the error of the outcome control system can reliably set a reference for a lower system that specifies some action to be performed. The higher level, therefore, recruits a lower one to reduce its error.
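Recruitment can be sketched as a two-level cascade in which the error of a food controller sets the reference of a lever-pressing controller. The names and values below (`link_weight`, `presses_per_pellet`, `lower_gain`, the proportional link between levels) are illustrative assumptions. A `link_weight` of zero stands for the state before learning, when the higher controller has not yet recruited the lower one.

```python
def run_cascade(link_weight, food_ref=1.0, presses_per_pellet=4.0,
                lower_gain=5.0, dt=0.001, steps=5000):
    """Two-level cascade: the food controller's error sets the pressing reference.

    link_weight -- strength of the connection from the higher (food) controller
                   to the lower (lever pressing) controller; 0.0 = not recruited.
    Returns (food_error, press_rate) at the end of the run.
    """
    press_rate = 0.0
    food_error = food_ref
    for _ in range(steps):
        food_rate = press_rate / presses_per_pellet   # environment: presses -> food
        food_error = food_ref - food_rate             # higher-level comparator
        press_ref = link_weight * food_error          # higher error becomes lower reference
        press_error = press_ref - press_rate          # lower-level comparator
        press_rate += lower_gain * press_error * dt   # lower-level output function
    return food_error, press_rate


if __name__ == "__main__":
    for weight in (0.0, 400.0):
        err, presses = run_cascade(weight)
        print(f"link_weight={weight:5.1f}  food error={err:.4f}  press rate={presses:.3f}")
```

With no link the food error persists and no pressing occurs. With a strong link the same lower controller, unchanged, is put in the service of the higher one, and the food error falls to a small residual, as expected with a proportional connection between levels.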

There are important differences between the type of feedback function in operant conditioning and feedback function between, say, the output of motor neurons and muscle tension. There are no first-order sensors for “reward” or “reward rate” as there are for muscle tension. Rather these are highly abstract variables constructed from multiple perceptual signals from lower levels. By definition, to control a particular relationship it is necessary to perceive it. The detection of the instrumental contingency between action and outcome cannot be achieved by the lower levels below transition control. The lower levels are also incapable of instrumental control. Only at the highest levels can different transitions be related to each other and only there can such feedback functions be learned, so that the appropriate actions can be acquired to reach desired goals. The control of the outcome through instrumental actions, therefore, requires relationship control.

The action of pressing the lever, which is generated by a proprioceptive transition controller, can be used to serve many different purposes. There is no fixed relationship between the action and the variables the organism would like to control. One learns to control one variable in order to control another. But the latter could also be a variable that was acquired through experience. Its “value” was established very early on, through experience of error reduction in more primary control systems [163, 164].

The organism must form new goals or reference signals to reduce errors in essential variables corresponding to motivational states like hunger and thirst. It must also acquire specific actions to reach these goals. These secondary reference values explain the traditional notion of secondary reinforcement, and signals that predict primary reinforcement also obtain incentive value. Value, in this sense, is an attempt to explain how often a behavior is performed. Thus in the traditional literature, stimuli and actions are often assigned value, which merely attempts to explain performance. If, given a particular stimulus, the rate of some behavior is high, this stimulus is endowed with value. Likewise, if the animal chooses to perform one action rather than another, the preferred action is said to have value [165, 166]. Such values can be understood as acquired reference conditions in a control hierarchy.

11.5. Cortico-BG Networks and the Motivational Hierarchy

Above the transition control level, there is no fixed hierarchy. Rather there is a labile motivational hierarchy, in which the levels are defined by acquired controlled variables and relationships between these variables. I hypothesize that the cortico-BG networks can implement this labile hierarchy.

As discussed earlier, inputs to the BG can be roughly divided into interoceptive, exteroceptive, and proprioceptive. Each class of perceptual signals is carried by cortical and possibly thalamic projections to the striatum. These glutamatergic and excitatory projections send the main feedback signals to the level of transitions. With proprioceptive transitions in the body sensors during movement, both interoceptive transitions and exteroceptive transitions can also change.

Imagine a hungry rat exploring its environment. Proprioceptive feedback is sent to different levels of the hierarchy. At the same time, distal senses (e.g., visual and auditory) also detect transitions in space, and the interoceptive senses detect transitions in autonomic variables, including those related to hunger. Given its motivational state, there will be large error signals in controllers for essential variables such as blood glucose. When the rat learns to perform some action to obtain food, the parameters of the exteroceptive and proprioceptive transition controllers are saved (e.g., where food is found and how it can be obtained).

The relationship between interoceptive and exteroceptive transition controllers can therefore be hierarchical. The detection of distal changes usually occurs before the detection of proximal changes; for example, the sight and smell of food usually precede its digestion. Likewise, for the animal to exert instrumental control on food, it must first produce movements or proprioceptive transitions. Thus the order of dependency is as follows.
(1) Exteroceptive, associative network depends on proprioceptive, sensorimotor network. Distal perceptions of the environment change as one moves.
(2) Interoceptive, limbic network depends on exteroceptive network. Internal states can also change as distal perceptions change; for example, food is seen, smelled, heard, and then consumed.
(3) Interoceptive network depends on proprioceptive network. The feedback in terms of transitions from proprioceptive transitions is mainly exteroceptive but could also be interoceptive. Normally, however, the dependence is more indirect.

The labile motivational hierarchy allows the proprioceptive transition controller to be in the service of higher levels that control any perceptual variable, provided a feedback function is present. The cortico-BG networks are the neural implementations of this hierarchy, as the anatomical connections allow the limbic and associative networks to affect the sensorimotor network, possibly through the striato-midbrain-striatal loops [167–169].

The striatonigral projections, at least for the sensorimotor network, transform proprioceptive transition control error into reference signals for configuration and position control systems. The projections from the substantia nigra back to the striatum are less direct. They are not from the GABAergic output neurons but from the dopaminergic neurons, which receive projections from the output neurons and send projections to striatal comparators in a lower level on the motivational hierarchy [134, 167, 170]. The errors from interoceptive transition control can thus be used to alter the reference signals of the exteroceptive transition controller, which in turn uses the proprioceptive transition controller.

Interoceptive inputs such as taste are mediated by the limbic cortico-BG network, which is also important for orofacial movements [171, 172]. The inputs to the limbic striatum (nucleus accumbens and surrounding ventral striatum) come from limbic cortical regions such as medial and orbital frontal cortices and the basolateral amygdala [134, 173]. Its output through the ventral pallidum can affect the autonomic nervous system via the hypothalamus [8, 174, 175]. These connections may be sufficient to generate consummatory behaviors [176, 177]. Yet these outputs are not always sufficient for the control of interoceptive inputs; for example, chewing is not sufficient to make food appear. If, however, some arbitrary instrumental action is required to obtain the reward, then the taste control system must recruit the associative network and sensorimotor network to generate the appropriate actions. The limbic circuit by itself cannot acquire instrumental behaviors that lead to specific rewards [178]. But indirect projections to the associative and sensorimotor networks allow serial adaptation to recruit the requisite controllers to perform the task.

12. Summary and Conclusions

To understand the contributions of the BG to behavior, it is above all necessary to understand what behavior is. Here the traditional linear causation paradigm is the greatest obstacle to progress. Whenever behavior is conceived as the output of some input/output system with linear causation, as the result of sensorimotor transformation in multiple steps inside the organism, the attempt to understand its neural substrates is doomed at the outset.

I have argued instead that behavior is the outward manifestation of a more fundamental process of control, generated by a hierarchy of negative feedback control systems, each controlling its own perceptual inputs by varying outputs. It is not the result of sensorimotor transformations but is jointly determined by the perceptual input and the internal reference signal, in a mathematically precise way. Using cascade control, the output of a particular level specifies the input signal to be obtained by the level immediately below. The loop is closed in the environment, as the output function of the lowest level in the hierarchy (the muscles) acts on the environment to generate behavior. Although the basic unit of neural function, the closed loop negative feedback circuit, is simple, a hierarchy of these systems can generate exceedingly complex behavior. We are only now beginning to understand the properties of the control hierarchy.

The properties of negative feedback control systems are counterintuitive from the perspective of the linear causation paradigm. The striking failure to understand control theory in the life sciences so far only illustrates the fundamental difference between closed loop systems and input/output systems. Regardless of how many intervening variables are inserted between the stimulus and the response, an input/output system always lacks internal references, which are only found in negative feedback control systems. This is the crucial difference. The behavior of control systems is not caused by what happens to them. It can never be a function of inputs received or of internal representations of any kind.

For any control system to function, reference signals are necessary, and negative feedback makes it possible to obtain inputs matching the reference by reducing the discrepancy between the two. The reference signal is the representation of some unrealized future state, but the system makes it possible for this state to be realized by varying its behavior. In this sense, the reference is simply the purpose of the controller, though purposes and goals in ordinary language usually refer to higher level reference signals at the transition level because few lower reference signals are available to conscious awareness. We are not aware of the reference signal for muscle tension in hundreds of muscles in the body at any moment, though these are the signals that ultimately close the loop by causing muscle contraction to act on the environment. We are usually aware of the higher goals of our actions, the reference signals sent to the transition level, for example, to get a cup of coffee. The higher purpose is achieved by elaborations as one descends the hierarchy; for example, the desire to get coffee affects the reference signal for sequence control of the action, which changes the reference for rate of change in body configurations, which then alters references for joint angles, which then alters references for muscle length, which finally alters references for muscle tension.

I have identified the neural implementations of the basic levels of the hierarchy: muscle tension, muscle length, joint angle, body configuration and orientation, and transition. In the proposed neural hierarchy, the BG occupy the highest level, receiving inputs representing rate of change in different perceptual variables, comparing these signals with reference signals, and generating error signals that alter the reference signals for downstream position controllers. Such a model suggests a new view of the relationship between the inputs and outputs. The BG are neither sensory nor motor. Rather their function is to control certain types of higher order perceptual variables, above all relationship, sequence, and transition.

Because traditional studies in systems and behavioral neuroscience rely on input/output methods to understand behavior, without identifying the controlled variable, their results are of limited utility. Given the lack of useful data, the hierarchical model proposed here is still incomplete. No attempt has been made to elucidate the function of many brain regions, such as the cerebellum and the diencephalon, that work closely with the BG in generating behavior. Although the proposed model is still incomplete, even in its present form it generates a number of testable predictions which can be useful in guiding future experiments:
(1) The BG produce signals related to movement kinematics: velocity, acceleration, and position. The striatal output, for example, reflects velocity, whereas the nigral output reflects position. This suggests that operations like addition, subtraction, integration, and differentiation are the primary computations performed in these circuits. We would also expect both reference signals and perceptual signals representing these signals. These will be similar, so long as there is successful control. Perturbation experiments will be needed to distinguish between these signals.
(2) From the striatum to the SNr or any other BG output nucleus, the neural circuit performs the equivalent of mathematical integration. In the neural integrator, the rate of change in the output will be proportional to the magnitude of the input. The outputs of the SNr (and GPi/entopeduncular nucleus) will be proportional to the time integral of striatal outputs. The presence of the integrator will also produce a roughly 90-degree phase shift in the BG output signal when compared to the striatal output. Although the existence of neural integrators has been known for a long time, often integration is misleadingly called a mechanism for memory [179, 180]. The crucial function performed by integration in the nervous system is not memory but control, as integrators are often needed in building output functions of negative feedback controllers.
(3) Dopamine is a gain signal in the transition control system. It is neither a hedonic reward signal nor a reward prediction error signal [181, 182]. By modulating the glutamate signal, it can determine the velocity reference signal or velocity error. The primary function of dopamine is to alter the gain of different types of perceptual transitions. The sensorimotor striatum, which receives the strongest DA projections from the nigrostriatal pathway, is hypothesized to be critical for velocity control. But DA clearly can also be involved in the control of other types of transitions, transitions of any perceptual configuration.
(4) The output of the BG quantitatively determines posture and movement. The rate of firing in the output can determine position at any time. A change in firing rate represents a change in body configuration and orientation, that is, movement. From any stable position, opponent and antiphase signals are generated to create movement.
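The phase prediction in (2) follows from the mathematics of integration and can be checked numerically. The sketch below is only an illustration under stated assumptions: the "striatal" signal is a pure sinusoid, the integrator is an ideal running sum, and the sign inversion introduced by inhibitory projections is ignored. The function name and parameters are hypothetical.

```python
import cmath
import math


def phase_shift_of_integrator(freq_hz=1.0, dt=0.001, cycles=5):
    """Phase of an integrated sinusoid relative to its input, in degrees.

    A sinusoidal input stands in for striatal output; its running sum stands in
    for the BG output. Phases are read from the Fourier coefficient at freq_hz
    over a whole number of cycles. Negative values mean the output lags.
    """
    omega = 2.0 * math.pi * freq_hz
    n_samples = int(round(cycles / (freq_hz * dt)))
    integral = 0.0
    in_coeff = 0j
    out_coeff = 0j
    for n in range(n_samples):
        t = n * dt
        striatal = math.sin(omega * t)             # assumed input signal
        basis = cmath.exp(-1j * omega * t)
        in_coeff += striatal * basis               # Fourier coefficient of the input
        out_coeff += integral * basis              # Fourier coefficient of the output
        integral += striatal * dt                  # output rate of change follows input
    diff = math.degrees(cmath.phase(out_coeff) - cmath.phase(in_coeff))
    return (diff + 180.0) % 360.0 - 180.0          # wrap into [-180, 180)


if __name__ == "__main__":
    print("phase shift (degrees):", phase_shift_of_integrator())
```

The result is close to -90 degrees at any input frequency, which is why a roughly quarter-cycle lag between simultaneously recorded striatal and output-nucleus signals would be consistent with integration. Any real circuit with leak or delay would deviate from this ideal value.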

Conflict of Interests

The author declares that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The author is supported by NIH AA021074. The author would like to thank Joseph Barter, Peter Redgrave, and Mark Rossi for helpful discussions.