Applied Mathematics in Biomedical Sciences and Engineering 2014View this Special Issue
Neuronal Ensemble Decoding Using a Dynamical Maximum Entropy Model
As advances in neurotechnology allow us to access the ensemble activity of multiple neurons simultaneously, many neurophysiologic studies have investigated how to decode neuronal ensemble activity. Neuronal ensemble activity from different brain regions exhibits a variety of characteristics, requiring substantially different decoding approaches. Among various models, a maximum entropy decoder is known to exploit not only individual firing activity but also interactions between neurons, extracting information more accurately for the cases with persistent neuronal activity and/or low-frequency firing activity. However, it does not consider temporal changes in neuronal states and therefore would be susceptible to poor performance for nonstationary neuronal information processing. To address this issue, we develop a novel decoder that extends a maximum entropy decoder to take time-varying neural information into account. This decoder blends a dynamical system model of neural networks into the maximum entropy model to better suit for nonstationary circumstances. From two simulation studies, we demonstrate that the proposed dynamic maximum entropy decoder could cope well with time-varying information, which the conventional maximum entropy decoder could not achieve. The results suggest that the proposed decoder may be able to infer neural information more effectively as it exploits dynamical properties of underlying neural networks.
Ensemble data derived from neuronal population activities have been subject to numerous decoding attempts . Traditional methods mainly focused on neural information of a single neuron averaged over multiple trials, consequently suffering from intertrial variations and neglecting ensemble information stemming from interactions between multiple neurons. Thus, decoding methods that directly tackle neuronal ensemble activity as a whole are more desirable in those terms .
To this end, an ensemble decoding method called the population vector (PV) model was proposed where the ensemble state was represented as a weighted sum of the preferred directions of individual neurons’ firing rates . Originally aimed for the analysis of primary motor cortical activity, the PV model was also utilized for the analysis of data from various regions such as primary visual cortex [1, 4]. The idea of the vector representation of neuronal ensemble activity was further extended in a form of the optimal linear estimator (OLE) .
Both PV and OLE, categorized as structural analysis methods for ensemble data, can be applied when covariate events lie in a single space such as the direction of movement. But in many real cases, where such a condition is not met, a Bayesian decoding method may provide a better decoding solution. Rather than simply merging the ensemble data structurally, a Bayesian method chooses the event with the maximum a posteriori probability. This recursive Bayesian decoding method has been proposed for neuronal ensemble decoding in various forms such as the point process , the Kalman filter, or the Particle filter [7, 8]. By considering the fact that neuronal ensemble data contain vital information regarding correlations between multiple neurons , different Bayesian decoding approaches utilize such correlations directly for probability distribution estimation  via the statistical dynamics analysis [11–13], based on the maximum entropy principle. Being capable of extracting information from the data with a temporal resolution of millisecond , the maximum entropy decoding approach is known to be robust even to the neural data with a low firing rate [11, 14].
The maximum entropy decoding method directly draws upon the stability of neuronal states under certain condition and is well known for its simplicity and consistent performance. It is especially known to well represent collective behavior of neural networks. However, it does not take temporal patterns of neuronal ensemble activity into consideration. Therefore, we propose a novel maximum entropy decoder that can embrace information regarding temporal dynamics of neuronal ensemble activity and thus enhance decoding performance. Especially, our new model is expected to play a vital role in the decoding of neuronal ensemble signals in prefrontal regions of the brain where persistence firing activity with a low firing rate is often observed.
The paper is organized as follows. First we illustrate the basic concept of maximum entropy decoding for neuronal ensemble data and the computational models to implement this concept. Then, we describe an extended Ising decoder that has been recently proposed as an effective maximum entropy decoding method for various neuronal representations. Next, we propose a new decoder that incorporates temporal dynamics of neuronal ensemble into the maximum entropy decoding model. Simulation studies illustrating advantages of using the proposed model follow. Finally, we discuss advantages and limitations of the proposed model and possible future research directions.
2.1. Maximum Entropy Decoding
Let multivariate neuronal ensemble data be observed in response to stimuli . Our objective is to determine which stimulus was applied by decoding a given neuronal data . One possible method would be to find that maximizes the conditional probability of given :
Due to the difficulty of completely describing such a conditional probability, the Bayes Rule is applied to represent the conditional probability in terms of likelihood and prior,
In most cases, the likelihood of describing a generative relationship between a specific stimulus and its consequential response can be determined. Also, since is irrelevant to the inference of , the equation above can be reduced as follows: When estimating , not only individual neuronal activity in response to a specific stimulus but also correlations among neurons should carry important information. By utilizing a measure that maximizes entropy, which can be applied with the information regarding neuronal correlations, we obtain a Bayesian model called the Ising model  and a method of decoding neuronal ensemble activity based on this model is called the Ising decoder . The Ising decoder can be briefly described as follows.
Let us assume that the firing rate of the th neuron at a given time instant is represented in a binary form: when fired and when silent. With a correlation between a specific stimulus and the firing rate (Hz) of the th neuron, denoted as , together with as the Lagrange multiplier and as the constraint, the probability that yields the maximum entropy is given by [12, 14] Here Hamiltonian is defined by The denominator in (4) will be henceforth referred to as a partition function denoted as .
In order to solve (4) with the Ising model, the partition function regarding for given must be calculated in advance. A direct approach, however, would require calculations for neurons, rendering it intractable with a large number of neurons. Instead, one can use an approximation solution via the Markov chain Monte Carlo (MCMC) method [15, 16] or mean field approximation [17, 18]. In particular, mean field approximation of for given can be expressed as  Here, indicates the mean firing rate of the th neuron and .
The next problem would be to find that minimizes the Hamiltonian in order to obtain a maximum entropy distribution. The Thouless-Anderson-Palmer method can be utilized to approximate computation [11, 12, 18, 19]. Using this method we can find that satisfies where . Then, can be computed by
2.2. Extended Ising Decoder
One of the biggest advantages of using the Ising decoder is its temporal resolution of one-thousandth of a second . But there may be occasions where a bigger bin size than 1 ms is required. In such cases, there would be multiple firings of a neuron within a single bin and thus resulting in information loss in the simple binary representation of firing activity (i.e., ). Hence, we should consider a system where a state of a neuron is represented in more diverse classes. Since the Ising model only takes binary states into consideration, a model with an extended flexibility is required; one of such models can be found in the Potts model.
The Potts model is a generalized version of the Ising model that embraces a state space of , where an integer . Hamiltonian for an arbitrary is defined as Here, represents the kronecker delta function and is the firing state of a neuron in the neuronal ensemble . The subset of , denoted as , represents a set of neurons that are correlated with others. The Ising model is a special case of the Potts model with , , , and . We can generalize the equation even further to reflect differing levels of interaction depending on the class of state such as where .
Representation of a neuronal state as a variable for the Potts model can be realized in multiple approaches: counting the total number of firings within a bin or measuring the ratio of a current firing rate with respect to a maximum firing rate for each neuron. The state space size would be equal to the maximum number of firings within a bin for the former, whereas the latter would yield an arbitrary size depending on a way of quantizing the firing rates.
In the same line with the Ising model, the Potts model requires approximation of the partition function. Unfortunately, it is not easy to provide a closed-form solution for the partition function in the Potts model. Instead, we utilize the MCMC method as follows [15, 16, 22].
Let represent a Hamiltonian parameter pair and let be conditioned on ; we would like to approximate via the MCMC method. Suppose that we know the information of for and we want to estimate from it. If we define , , then Taking a closer look inside the integral term: We can apply the mean of Markov chain generated by a Gibbs sampler to obtain the expected value in (12). For a sufficiently large , the integral is approximated by the trapezoidal rule as Therefore,
The extended Ising decoder based on the Potts model will be henceforth referred to as the conventional maximum entropy decoder.
2.3. Dynamical Maximum Entropy Decoder
One of the neural network models exploiting statistical dynamics is the Hopfield network [23, 24]. It shows a distinct difference when compared to Perceptron [25, 26] models. Perceptron is aimed at physical systems and does not consider abstract population characteristics where the Hopfield model does. Moreover, while Perceptron is focused on synchronized systems, the Hopfield network can also be applied to more general dynamical systems.
In the Hopfield network, a state of each neuron is represented as , determined with a threshold and a synaptic connection strength : where and are index neurons and represents an external input. The synaptic strength is updated in a specific time range as follows: In case of , .
The most important feature is that if we assume symmetry such that , the Lyapunov function exists for this neural network. The Lyapunov function, also referred to as an energy function, is given by
The existence of the Lyapunov function is important because it analytically presents how the state of a system is likely to change. In such cases, the Lyapunov function is identical to Hamiltonian of the Ising model.
The Hopfield network has been generalized based on the Potts model . In this case, along with various degrees of interactions depending on diverse neural states, the Lyapunov function becomes where . Note that Hamiltonian in the Potts model can be obtained from this Lyapunov function.
Now, we describe how to build a dynamical maximum entropy decoder. First, let us consider some basic aspects of a dynamical system. Let a dynamical system receiving an external input be described by with a set of all possible neuronal states as . We then make two basic assumptions on this dynamical system as follows.
Assumption 1. The dynamical system is assumed to approach a stable state over time with no input. Then, for every , , and time ,
Assumption 2. Let be a pseudo-periodic set in response to an external input . Then, we assume that Here is an integer and can be sufficiently small. We also assume that an external stimulus is static during its stimulation period.
Next, we consider two cases when there is sufficient time for the system to be stabilized (Case 1) and when there is not (Case 2). For both cases, we need to estimate with the assumptions above, which allow us to conjecture that any and are close to each other if for . Here represents a pseudo-periodic discretized set belonging to .
2.3.1. Case 1
In this case, we first define a probability that a state belongs to certain : Using Hamiltonian based on the Lyapunov function, we can compute a probability that belongs to as
The next step is to define a statistical state transition function. We represent every possible pair of states given a change of input from to as ; that is, . Then, the transition function is defined as Decoding an input with the knowledge of from neuronal ensemble data is stated as the following problem: This problem can be solved using the transition function and probabilities described above:
Note that we can also use a simplified version of this formula given by
2.3.2. Case 2
Let be a set of all possible neuronal ensemble states. Suppose there is a function on such as can be seen as the restrictive function of a dynamical system and also as a reconstruction of with respect to . What this entails is the fact that can be found if is given for all . Since solving the problem with every possible and is difficult, we use a technique that limits the length of the sequence to .
Rather than estimating as defined, we estimate the neural trajectory of a specific . For a small enough , we substitute for , therefore acquiring the probability from (23). Now let , a temporary set that retains its elements but not further down the line. With this set, we redefine the function as and also in terms of as such that the condition of is satisfied. Hence, now our goal is to find a dynamical function over time. The initial value of can be assumed to follow the uniform distribution.
As time elapses, and turns into and . Then, we update with the result of (26) as follows: The “val” in above equation can be regarded as either 1 or 0. Once is properly normalized, we accept and normalize the transformation with a probability if the updated enhances the performance of (26). and introduced here are time-varying and converge to 0 when .
This method is partially in line with the simulated annealing technique. Hence, it is expected to approach to a right solution but, on the other hand, can consume longer computational time. The main characteristic of this method is that is not limited only to stable states. If not enough time is given to reach a stable state after each external input, it might be better to consider the temporary states generated during the process. The proposed method realizes such temporary states. Even the trajectory of state transition as a result of change in external input can be represented in . Although each does not necessarily represent the stable state of a neural network in such cases, it can be advantageous for the decoding system that involves temporal dynamics since it can represent the most likely state in response to the change.
2.4. Simulation Procedure
A synthetic neuronal ensemble data set was generated in order to test the new decoding method for both cases, “Case 1” and “Case 2,” discussed above.
2.4.1. Case 1
A model capable of representing persistence activity of neurons is generally complex ; thus, for the sake of the simplicity of experiment, we generated neuronal ensemble data with fixed firing rates of sixteen neurons for each external stimulus, based on Poisson process. Stimuli were limited to two classes, and , where two stable states existed for each. In short, for each stimulus, , stable states and exist (see Figure 2(a)).
While four out of sixteen neurons represented the actual state, the rest were set to have random mean firing rates between 0 Hz to 10 Hz to represent irrelevant activity in the ensemble data. Those four state-representing neurons were divided into two pairs with neurons that were assigned the same mean firing rate. Thus, each state could be expressed as the following pairs: Hz, Hz, Hz, and Hz.
2.4.2. Case 2
In order to generate neuronal ensemble data for Case 2, we utilized the orientation tuning model  that represents a biological neural network model for directional coding [29, 30]. Similar to an excitatory-inhibitory neural network model, the model contains an ensemble of neurons that respond to the angular direction and delivers excitatory or inhibitory signals among neurons depending on the directional similarity. In our simulation, we defined a signal passed on from one population of direction to another population of as Here, and as a balancing parameter were set as and for our simulation. The actual input that the neurons tuned to received from an external input with an angle was given by where and were parameters representing the degree of input focus and overall intensity of the input, respectively. The range of was given as ; indicated that every neuron received uniform input regardless of their inherent direction, and indicated that less input was given if was more distant from . Hence, a condition left neurons of unaffected. For this simulation, and were used (see Figure 1).
(a) State changes resulting from external stimuli
|(b) An example of estimated|
A temporal change of a neural network was expressed as a firing rate based on the neural network model. The firing rate of the th neuron was Here, was a time constant representing a synaptic delay between neurons, was the synaptic strength between the th and th neurons, and was a threshold function . This part of the simulation utilized a total of neurons as well as a synaptic delay constant, ms. A threshold function for the firing rate was defined with the scale of Hz: The number of neurons for each of eight directions was equal and the synaptic strength and the external input were identical to the setting of orientation tuning model of Case 1. The probability that neuronal connections between neurons exist was set below 0.5.
3. Results and Discussion
Performance of the decoders was assessed by means of “decoding error rate” and “decoding uncertainty” for the given dataset. Error rate calculated as a ratio of incorrectly classified samples to the total number of samples tested served as a direct measure of decoder performance. Decoding uncertainty was represented with the entropy of the normalized distribution of Bayesian probabilities for each class generated by the decoder. Small amount of entropy indicates less uncertainty of decoding performance, analogous to smaller variance of estimates of a model. Using these performance measures, we compared our proposed decoder (dynamical maximum entropy decoder) with the conventional decoder (the extended Ising decoder).
3.1. Case 1
The dynamical maximum entropy decoder estimated as the first step of decoding. We examin ed whether this transition estimation was correctly represented by the decoder. Figure 2(b) demonstrates that the estimation results well represent state changes for a given external event (see Figure 2(a)). When there were more than two stable states for each event, the proposed method outperformed the conventional maximum entropy decoder in terms of both error rate and uncertainty (see Figure 3). The proposed decoder generated approximately 0.25 error rate and 0.12 bit of entropy which is a significant improvement over 0.45 error rate and 0.78 bit entropy by the conventional decoder. Superior performance of the dynamical maximum entropy decoder was based on its ability to alleviate signals from those neurons which were regarded as irrelevant to the event.
(a) Decoding error
(b) Decoding uncertainty
3.2. Case 2
The second study considered a case where the external stimulus changes before the system converges to a stable state. For each of 8 uniformly separated directions, the stimulus was changed after the first 200 ms and then lasted for another 200 ms period. The succeeding direction could either be different from or same as the former. A small number of neurons were randomly chosen out of 512 total neurons and were discretized within a bin size of 10 ms (see Figure 4).
(a) Sampled simulation data
(b) Discretized simulation data
For these data with temporal changes of stimuli, the proposed decoder was evidently superior compared to the conventional decoder in terms of error rates with 0.07-, 0.07-, 0.01-, and 0.11-point advantage for respective scenarios (see Figure 5(a)). However, decoding uncertainty was also significantly (difference range 0.3~0.8 bits; -test, ) higher in the proposed decoder than the conventional one in all scenarios (Figure 5(b)). This issue will be further discussed below.
(a) Decoding error
(b) Decoding uncertainty
Decoding neuronal ensemble activity recorded from brain signals provides insights on how neural networks process information. Decoding models based on computational models for biological neural networks can provide these insights to deepen our understanding of neural network mechanisms. However, approaches directly using neural network models are not easily realized because it is challenging to know every synaptic strength between neurons and to obtain full information of all the neurons in a network. Therefore, the present study proposes an alternative approach to leveraging dynamical and collective behavior of an ensemble of neurons in response to various stimuli. From simple assumptions on the model, we blend neural dynamics into a decoder with which we are able to inspect the functions and roles of a neural network.
To the assessment of the performance of a decoder, the uncertainty of decoding outcome as well as the error rate is crucial. This uncertainty measures how robust the decision made by a decoder is regarding the information inferred from neuronal activity. Our proposed decoder exhibited a significantly low error rate and small uncertainty in the first case study (see Figure 3). However, it generated significantly larger uncertainty than the conventional maximum entropy decoder in the second case study despite its lower error rate (Figure 5). We speculate that such larger uncertainty may be due to model complexity in our decoder. More complex decoding procedure in our model includes uniformly distributed transition probabilities that may in turn equalize prior probabilities of individual stimuli, thus increasing uncertainty of a decoded stimulus. However, it may not be solely the result of model structure because the proposed decoder could also reduce uncertainty in the first case. Consequently, we suspect that increased uncertainty in the second case may indicate a particular outcome resulting from specific data properties, which need further investigation in the future.
Yet, we also recognize that there is plenty of room to improve in our model. In particular, more rigorous ways of obtaining information about neural dynamics may be necessary. For instance, our method to estimate stabilized states of a neuronal ensemble may be suboptimal to many different real data. Also, we need to apply our decoder to ensemble data from many different brain regions to generalize it further. Finally, continuous efforts to reduce computational loads in the proposed decoder will be required.
A number of methods have been continuously developed to decode neuronal ensemble activity. The design of decoding models draws upon the properties of a target brain region, a recording technique, neural signals, and the objectives of decoding. Among them, the maximum entropy decoding method takes into account correlations between neurons as well as firing rates of individual neurons. In particular, it produces good decoding performance when there exist marginal but clear interactions between neurons. However, the current maximum entropy decoder does not capture time-varying characteristics of neuronal ensemble activity, which often deliver essential information about underlying brain functions. Hence, the present study addresses this issue by developing a novel decoder that incorporates dynamical properties of a neuronal ensemble in the model while maintains the key functions of the maximum entropy decoder. We demonstrate that more information can be successfully decoded using the proposed decoder compared to the conventional maximum entropy decoder.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
This work was partly supported by the Components and Materials Technology Development Program (10043826) funded by the Korean Ministry of Trade, Industry, and Energy (Jinsoo Kim, Sung-Phil Kim) and the Technological Innovation R&D program (S2087577) funded by the Small and Medium Business Administration of Korea (Sung-Phil Kim). Also, Jee Hyun Choi was being supported by the Global Frontier R&D Program funded by the National Research Foundation of Korea Grant (no. 2011-0031525) (Jee Hyun Choi).
L. F. Abbott, “Decoding neuronal firing and modelling neural networks,” Quarterly Reviews of Biophysics, vol. 27, no. 3, pp. 291–331, 1994.View at: Google Scholar
F. Huang and Y. Ogata, “Comparison of two methods for calculating the partition functions of various spatial statistical models,” Australian and New Zealand Journal of Statistics, vol. 43, no. 1, pp. 47–65, 2001.View at: Google Scholar
T. Tanaka, “Mean-field theory of Boltzmann machine learning,” Physical Review E, vol. 58, no. 2, pp. 2302–2310, 1998.View at: Google Scholar
M. L. Minsky and S. A. Papert, Perceptrons, The MIT Press, 1969.