Research Article  Open Access
Prior Knowledge of Target Direction and Intended Movement Selection Improves Indirect Reaching Movement Decoding
Abstract
Objective. Previous studies have demonstrated that target direction information represented in the dorsal premotor cortex (PMd) during movement planning can be incorporated into a neural decoder to achieve better decoding performance. It is still unknown whether a neural decoder that incorporates only target direction can work in more complex tasks where obstacles impede direct reaching paths. Methods. In this study, spike activities were collected from the PMd of two monkeys while they performed a delayed obstacle-avoidance task. We examined how target direction and intended movement selection were encoded in neuronal population activities of the PMd during movement planning. The decoding performance for movement trajectory was compared across three neural decoders that integrated no prior knowledge, only target direction, or both target direction and intended movement selection into a mixture of trajectory models (MTM). Results. We found that not only target direction but also intended movement selection was represented in the neural activities of the PMd during movement planning, which was further confirmed by quantitative analysis. The trajectory decoder combined with prior knowledge of both kinds achieved the best performance among the three decoders. Conclusion. Incorporating prior knowledge about target direction and intended movement selection extracted from the PMd could enhance the decoding performance of hand trajectory in indirect reaching movements.
1. Introduction
Brain-machine interfaces (BMIs) establish a direct pathway between the brain and external devices, with the aim of helping amputees or patients with paralysis regain movement function [1, 2]. The decoding method, which maps neural activities to movement trajectories, is the essential part of a BMI. Numerous decoding methods have been proposed in recent decades, such as state-space models [3, 4], artificial neural networks [5], and reinforcement learning [6, 7]. These have been applied successfully in many BMIs, for example to control a robot arm [8–10] and to estimate computer cursor trajectories in two and three dimensions [11, 12]. In most studies, the task is a point-to-point, target-oriented center-out task or a variant of it [13–18], in which the target direction and the initial movement direction are correlated.
However, the environment of daily life is more complex. For example, an obstacle between a person and food forces the reaching trajectory to curve around it. Such cases, in which the target direction and the initial movement direction are decoupled, challenge the performance of decoding methods. Studying such complex tasks could push the limits of BMIs and accelerate clinical translation [19]. Researchers have in fact designed tasks that dissociate the target direction from the initial movement direction, using curved movements [20], environments with specific paths [19], or obstacles [2]. However, most decoding methods have been applied to point-to-point, target-oriented tasks. Work is still needed to extend the proposed methods to more complex tasks, which could extend the performance limits of BMIs.
Multiple cortical areas are involved cooperatively and hierarchically in processing complex tasks. The primary motor cortex (M1) passes neural impulses down to the spinal cord and controls the execution of movement. The dorsal premotor cortex (PMd) is responsible for higher-level movement control, including movement preparation, sensory and spatial guidance of reaching, and some direct control of reaching movement [21–23]. Planning can happen before movement onset, and a delay epoch can contribute to mature performance. In 2012, Pearce and Moran designed an obstacle-avoidance task in which the initial movement direction was constrained so as to induce curved center-out movements. They found that the population vectors (PVs) [15] of one monkey pointed to the target at first and then turned to the movement direction once the relevant visual cues showed up during the delay epoch [2]. Target direction and initial movement direction are thus two key instructions for completing an indirect reach. In 2007, Yu et al. extracted the target direction during planning as prior information for the subsequent trajectory estimation [16]. In 2013, Shanechi et al. estimated the target information from the PMd and SMA before movement initiation to improve trajectory decoding in a center-out task [18]. Similarly, based on the findings of Pearce and Moran [2], the movement direction could be estimated and integrated to improve trajectory decoding.
Several methods have been proposed to decode indirect reaching movement tasks [3, 24, 25]. In 2012, Gilja et al. proposed the recalibrated feedback intention-trained Kalman filter (ReFIT-KF) to improve the online decoding performance of target-oriented reaching movement tasks [3]. Researchers also applied ReFIT-KF to an obstacle-avoidance task with promising performance. However, the design of ReFIT-KF did not consider the properties of indirect reaching movement, and its obstacle-avoidance performance benefitted from visual feedback and modulation of neural activities, so the algorithm would not be expected to work well in an offline case. Similarly, in 2017, Shanechi et al. enhanced online reaching movement control through rapid control and feedback rates [24] and applied this method to an obstacle-avoidance task, but it has the same issue as ReFIT-KF. In our previous study, correntropy-based attention-gated reinforcement learning (CAGREL) was proposed to decode the obstacle-avoidance task by manually setting a secondary target to avoid the obstacle [25]. Because an obstacle-avoidance task involves more kinematic parameters, an algorithmic framework that integrates different kinds of information is needed [26]. In 2007, Yu et al. built the mixture of trajectory models (MTM) based on recursive Bayesian estimation (RBE) [18, 27–32] for neural decoding of goal-directed movements [16]. They combined several trajectory models, each of which was more accurate within a limited regime of movement (the trajectory to one specific target), with probabilistic weights predicted from planning activities. The probability of each target direction was estimated from the PMd during the planning period. However, for a more complex task such as an obstacle-avoidance task, it is still unknown whether a neural decoder that incorporates only target direction information could work.
In this study, we examined how target direction and intended movement selection were encoded in neuronal population activities and tried to improve indirect reach decoding by integrating more prior knowledge. Two monkeys were trained to perform a delayed obstacle-avoidance task. One Utah array was implanted in the PMd of each monkey. Population vector (PV) analysis and principal component analysis (PCA) were used to analyze neuronal encoding properties during the planning epoch. Movement trajectory estimates were then compared among decoders with no prior knowledge, with only target direction, and with both target direction and intended movement selection.
2. Experiments and Methods
2.1. System Setup and Training Tasks
In this study, two male rhesus monkeys (Macaca mulatta, labeled as monkeys B and C) were trained to perform a delayed obstacle-avoidance task using their upper limbs (right upper limb for monkey B and left upper limb for monkey C). During the task, the monkey was seated in a primate chair with a monitor placed vertically 50 cm in front of it. As shown in Figure 1(a), the monkey was trained to operate a 2D manipulator (20 × 20 cm range) to move a computer cursor (small blue ball) from the start position to the target (big yellow ball) without touching the obstacle (green bar) in order to get a water reward.
[Figure 1: panels (a), (b), and (c)]
The target could appear on the left, top, or right when the start position was at the bottom, as shown in Figure 1(b), which shows trajectories averaged across 20 trials. Trajectories to the same target are drawn in the same color, and the bold cyan trajectory is the case shown in Figure 1(a). There were six trajectories for a fixed start position. Cases in which the start position was located at the left, top, or right were also collected. Generally, the target position in the current trial was the start position for the next. Sometimes, monkeys moved the cursor away from the start position during rest, and those trials were discarded in our study. In total, data were collected for 24 conditions (6 trajectories × 4 start positions). This task partly simulated a complex environment by adding an obstacle between the start position and the target position, which challenged the performance of decoding methods.
Specifically, the task started with the appearance of the computer cursor and the target, as illustrated in Figure 1(a). In this example, the cursor was located at the bottom and surrounded by a red square, indicating that the monkey had to hold at this start position, and the target was located at the top. This epoch lasted for 300 ms (delay 1) so that the monkey could acquire the target direction. Then the obstacle appeared and remained for a further randomized period (delay 2, uniformly distributed from 500 ms to 800 ms), during which the monkey could plan how to avoid the obstacle. The disappearance of the red square served as the go cue. The monkey then moved the cursor from the bottom to the top along a curved trajectory to avoid the obstacle and was required to hold at the target position for 500 ms to get a liquid reward. A rest time of 500 ms was set between two trials.
2.2. Surgery Procedures and Data Acquisition
Neural data were collected with a 96-channel microelectrode array (arranged in a 10 × 10 matrix, 4.2 × 4.2 mm, Blackrock Microsystems, Salt Lake City, UT, USA) [33] implanted in the contralateral PMd of each monkey. Additionally, two head posts were fixed on the skull with titanium screws to stabilize the head and the array pedestal during neural recording [34]. The surgery was performed under general anesthesia induced by ketamine (10 mg/kg) and diazepam (1 mg/kg). Deep anesthesia was induced during the surgery by endotracheal administration of isoflurane (1%–2%) with a veterinary anesthesia ventilator (Matrx VME2, Midmark, Orchard Park, NY, USA). Vital signs were monitored with a physiological monitor, and body temperature was maintained with a heating pad (T/PUMP, Gaymar, Orchard Park, NY, USA). A craniotomy was performed over the premotor cortex, and the dura was incised to place the array. The array was rapidly inserted into the cortex with a pneumatic insertion device (Micro Implantable Systems, Salt Lake City, UT, USA). The surgical procedure has been detailed previously [5]. After the surgery, antibiotic therapy lasted for 5 days, and the monkeys had at least one week to recover. All procedures were approved by the Animal Care Committee at Zhejiang University and strictly complied with the Guide for Care and Use of Laboratory Animals (China Ministry of Health).
Neural activities acquired by the array were transmitted to a Cerebus data acquisition system (Blackrock Microsystems, Salt Lake City, UT, USA). Analog waveforms were amplified, band-pass filtered (Butterworth, 0.3 Hz to 7.5 kHz), digitized (16-bit resolution, 30 kHz sampling rate), and high-pass filtered (Butterworth, 250 Hz). Spikes were detected with a thresholding method (at −4.5 times the root mean square of the baseline) and sorted with commercial software (Offline Sorter, Plexon Inc., Dallas, TX, USA). Manipulator trajectories and task epochs were recorded simultaneously with the neural signals, as shown in Figure 1(c). Ten data sessions (from ten different days) were collected (five for monkey B and five for monkey C). Spikes were binned into 100 ms bins to predict the prior knowledge and the subsequent trajectory.
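The 100 ms binning step can be illustrated with a minimal sketch. The function name `bin_spikes`, its arguments, and the example spike times are hypothetical and chosen only for illustration; the source does not describe its binning code, and the half-open bin convention used here is an assumption.

```python
def bin_spikes(spike_times_ms, start_ms, stop_ms, bin_ms=100):
    """Count the spikes of one unit in consecutive, non-overlapping bins.

    Bins are half-open intervals [start, start + bin_ms), an assumed convention.
    Spikes outside [start_ms, stop_ms) are ignored.
    """
    n_bins = int((stop_ms - start_ms) // bin_ms)
    counts = [0] * n_bins
    for t in spike_times_ms:
        idx = int((t - start_ms) // bin_ms)
        if 0 <= idx < n_bins:
            counts[idx] += 1
    return counts


# Hypothetical spike times (ms) of one unit within a 300 ms window.
counts = bin_spikes([5, 95, 100, 250, 299, 300], start_ms=0, stop_ms=300)
print(counts)
```

Stacking such count vectors across units gives one observation vector per 100 ms time step, which is the form of input assumed by the decoding sketches in later sections.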
2.3. Mixture of Trajectory Model
To decode the continuous hand trajectory accurately, we employed the mixture of trajectory models (MTM) developed by Yu et al. [16], which is based on recursive Bayesian methods. Recursive Bayesian methods need a statistical model of hand trajectories for training. The MTM probabilistically subdivides the full set of trajectories into limited regimes of movement, so that the decoding model can be optimized within each specific regime. This idea fits our experiment well, in which there were three possible targets for each start point and two possible intended movement selections for each target. This results in 3 subregimes for targets and 2 subregimes for intended movement selection. According to the MTM framework, the decoding accuracy is boosted if the information about the regimes is known or partly known. We used Bayes’ method to estimate the probable target and intended movement selection during the planning epoch, as introduced in Section 2.4. In this section, we describe the MTM together with the recursive Bayesian estimation framework.
The probabilistic model in our study follows that of the Kalman filter [4], in which the kinematics follow a random walk model and the observation model is Gaussian [31]. They are formulated as follows:

$$x_t \mid x_{t-1}, m \sim \mathcal{N}\left(A_m x_{t-1} + b_m,\; Q_m\right), \qquad y_t \mid x_t, m \sim \mathcal{N}\left(C_m x_t + d_m,\; R_m\right), \tag{1}$$

where $m \in \{1, \dots, M\}$ is the label of the movement regime. Taking target direction and initial movement selection into account, there are six movement regimes ($M = 6$); considering target direction only, there are three ($M = 3$). $x_t$ represents the hand position at time step $t$, and $y_t \in \mathbb{R}^{N}$ represents the neural activities at time step $t$, where $N$ is the number of single units. The state transition matrix $A_m$, observation matrix $C_m$, covariance matrices $Q_m$ and $R_m$, and biases $b_m$ and $d_m$ were all fit to training data and remained fixed during testing.
The dependency relationships between movement regimes, kinematics, and neural activities are shown by the graphical model in Figure 2. Both the kinematics and the neural activities depend on the movement regime. The neural activities depend on the kinematics through the observation model, and the kinematics conform to the random walk model.
Based on the graphical model and Bayes’ method, the kinematic estimate at time step $t$ is given by the posterior distribution of the hand position $x_t$ conditioned on the neural activities from the initial time step to the current time step, $y_1^t = \{y_1, \dots, y_t\}$, which is defined as $P(x_t \mid y_1^t)$. Expanding the posterior over the movement regime $m$, we get

$$P(x_t \mid y_1^t) = \sum_{m} P(x_t \mid y_1^t, m)\, P(m \mid y_1^t), \tag{2}$$

where $P(m \mid y_1^t)$ is the probability of movement regime $m$ given the neural activities from the beginning to the current time step. Bayes’ rule was then applied to that term to obtain

$$P(m \mid y_1^t) = \frac{P(y_1^t \mid m)\, P(m)}{\sum_{m'} P(y_1^t \mid m')\, P(m')}, \tag{3}$$

where $P(m)$ is the prior probability of movement regime $m$. If prior knowledge is not available, a uniform distribution is substituted. To calculate the posterior distribution recursively, the one-step prediction is calculated as

$$P(x_t \mid y_1^{t-1}, m) = \int P(x_t \mid x_{t-1}, m)\, P(x_{t-1} \mid y_1^{t-1}, m)\, dx_{t-1}. \tag{4}$$

Then the posterior distribution conditioned on $m$ can be obtained by Bayes’ rule:

$$P(x_t \mid y_1^t, m) = \frac{P(y_t \mid x_t, m)\, P(x_t \mid y_1^{t-1}, m)}{P(y_t \mid y_1^{t-1}, m)}, \tag{5}$$

where the term $P(y_t \mid x_t, y_1^{t-1}, m)$ has been replaced by $P(y_t \mid x_t, m)$ because, given the current hand position and movement regime, the current neural activities are independent of the neural activities from the beginning to the last time step, as illustrated by the graphical model in Figure 2. The posterior distribution can be calculated recursively by substituting (4) into (5) and feeding (5) back into (4) at the next time step. In practice, $P(x_t \mid y_1^t, m)$ was calculated by a Gaussian approximation with parameters matched to the location and curvature of the two terms [35]. The expectation and covariance matrix were calculated from $P(x_t \mid y_1^t)$ to derive the estimate and credible intervals.
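The recursion described in this section (a one-step prediction, a Bayes update per regime, and a weighted mixture over regimes) can be sketched for a scalar toy case. This is a minimal illustration under stated assumptions, not the authors' implementation: the hand position and the observation are both scalars, each regime is a linear-Gaussian model so the per-regime update is an exact Kalman step, and the names `mtm_decode`, `log_gauss`, and the parameter keys `a`, `b`, `q`, `c`, `d`, `r` are hypothetical.

```python
import math


def log_gauss(x, mean, var):
    """Log density of a scalar Gaussian N(mean, var) evaluated at x."""
    return -0.5 * ((x - mean) ** 2 / var + math.log(2 * math.pi * var))


def mtm_decode(ys, regimes, prior, x0=0.0, p0=1.0):
    """Scalar mixture-of-trajectory-models decoder (illustrative toy version).

    regimes: one dict per regime with state model x_t = a*x_{t-1} + b + noise(q)
             and observation model y_t = c*x_t + d + noise(r).
    prior:   strictly positive regime probabilities P(m).
    Returns one (mixture mean, regime weights) pair per time step.
    """
    means = [x0] * len(regimes)
    variances = [p0] * len(regimes)
    log_w = [math.log(p) for p in prior]  # accumulates log P(m) + log P(y_1..t | m)
    estimates = []
    for y in ys:
        for m, g in enumerate(regimes):
            # One-step prediction under regime m.
            x_pred = g["a"] * means[m] + g["b"]
            p_pred = g["a"] ** 2 * variances[m] + g["q"]
            # Predictive likelihood of the new observation updates the regime weight.
            y_pred = g["c"] * x_pred + g["d"]
            s = g["c"] ** 2 * p_pred + g["r"]
            log_w[m] += log_gauss(y, y_pred, s)
            # Bayes update of the regime-conditioned posterior (Kalman step).
            gain = p_pred * g["c"] / s
            means[m] = x_pred + gain * (y - y_pred)
            variances[m] = (1.0 - gain * g["c"]) * p_pred
        # Normalize the weights in a numerically stable way.
        top = max(log_w)
        w = [math.exp(v - top) for v in log_w]
        total = sum(w)
        w = [v / total for v in w]
        estimates.append((sum(wi * mi for wi, mi in zip(w, means)), w))
    return estimates


# Two hypothetical regimes: drift of +1 per step versus drift of -1 per step.
regimes = [
    {"a": 1.0, "b": 1.0, "q": 0.01, "c": 1.0, "d": 0.0, "r": 0.1},
    {"a": 1.0, "b": -1.0, "q": 0.01, "c": 1.0, "d": 0.0, "r": 0.1},
]
estimates = mtm_decode([1.0, 2.0, 3.0, 4.0], regimes, prior=[0.5, 0.5])
final_mean, final_weights = estimates[-1]
print(final_mean, final_weights)
```

Replacing the uniform `prior` with probabilities estimated from planning activity is how prior knowledge would enter this sketch: a sharper prior moves the weights toward the correct regime before the movement observations have accumulated.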
2.4. Target Direction and Intended Movement Selection Prediction from Delay Epoch
Neural activities during planning (delay 1 and delay 2) contain key information about the forthcoming movement. In our application, there were two kinds of prior information, target direction and intended movement selection, which could be extracted in the delay epoch (delay 1 and delay 2, respectively). Let $P(g \mid z_{\mathrm{d1}})$ be the estimate of target direction $g$ given the neural activity $z_{\mathrm{d1}}$ in delay 1, and $P(s \mid z_{\mathrm{d2}})$ be the estimate of intended movement selection $s$ given the neural activity $z_{\mathrm{d2}}$ in delay 2. Under an independence assumption, the estimate of the final movement regime $m = (g, s)$ given the neural activity in the whole delay epoch can be calculated as

$$P(m \mid z_{\mathrm{d1}}, z_{\mathrm{d2}}) = P(g \mid z_{\mathrm{d1}})\, P(s \mid z_{\mathrm{d2}}). \tag{6}$$
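The regime estimate from the whole delay epoch, $P(m \mid z_{\mathrm{d1}}, z_{\mathrm{d2}})$, is substituted into the MTM as prior knowledge. In our study, two other estimates, a uniform $P(m)$ and the delay 1 target estimate $P(g \mid z_{\mathrm{d1}})$, were also used to represent decoding with no prior and with the target direction prior only, respectively. Their results were compared with those obtained by substituting $P(m \mid z_{\mathrm{d1}}, z_{\mathrm{d2}})$, which corresponds to decoding with both the target direction and the intended movement selection priors.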
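As a small worked example of the independence assumption, the sketch below multiplies a target posterior by a movement selection posterior to obtain a prior over six regimes. The function name `regime_prior`, the labels, and the probability values are hypothetical and not taken from the study.

```python
def regime_prior(p_target, p_selection):
    """Joint prior over (target, selection) regimes as a product of two posteriors.

    Assumes the two estimates are independent; the result is renormalized to
    guard against inputs that do not sum exactly to one.
    """
    joint = {}
    for target, pt in p_target.items():
        for selection, ps in p_selection.items():
            joint[(target, selection)] = pt * ps
    total = sum(joint.values())
    return {regime: p / total for regime, p in joint.items()}


# Hypothetical posteriors estimated from delay 1 and delay 2 activity.
p_target = {"left": 0.1, "top": 0.8, "right": 0.1}
p_selection = {"clockwise": 0.7, "counterclockwise": 0.3}
prior = regime_prior(p_target, p_selection)
print(prior[("top", "clockwise")])
```

In this sketch, the no-prior decoder would correspond to assigning each of the six regimes a probability of 1/6, and the target-only decoder to using `p_target` directly over three regimes.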
To obtain the probabilities of the movement regimes, Bayes’ rule was used in our study [16]. Supposing that the neural activities of all units are independent and that the distribution of spike counts for each movement regime is Gaussian [28, 36], the distribution of the neural activity $z = (z_1, \dots, z_N)$ of all $N$ units for each movement regime $m$ can be fitted as follows:

$$P(z \mid m) = \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi\sigma_{i,m}^{2}}} \exp\left(-\frac{(z_i - \mu_{i,m})^{2}}{2\sigma_{i,m}^{2}}\right), \tag{7}$$

where $\mu_{i,m}$ and $\sigma_{i,m}^{2}$ are the mean and variance for the $i$th unit and $m$th movement regime, both obtained during training by maximum likelihood. For the test trials, the probability that the movement regime is $m$ conditioned on the activity in the delay epoch can be calculated by Bayes’ rule:

$$P(m \mid z) = \frac{P(z \mid m)\, P(m)}{\sum_{m'} P(z \mid m')\, P(m')}, \tag{8}$$

where $P(m)$ is assumed to be uniform according to the task settings. The accuracy of the estimated prior information depends on the duration and location of the time window, as well as on the spike count model. Optimizing the prior information decoder is beyond the scope of our study. The time windows used to decode target direction and intended movement selection are shown in Figure 1(a).
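The Gaussian classifier described above can be sketched as follows, assuming independent units, a uniform regime prior, and spike counts already summed over the planning window. The names `fit_gaussians` and `regime_posterior`, the toy spike counts, and the variance floor of 1e-3 are assumptions added for illustration; the floor only prevents division by zero and is not described in the source.

```python
import math


def fit_gaussians(trials, labels, var_floor=1e-3):
    """Maximum-likelihood mean and variance of each unit's count, per regime."""
    params = {}
    for regime in sorted(set(labels)):
        rows = [x for x, lab in zip(trials, labels) if lab == regime]
        n_units = len(rows[0])
        mu = [sum(r[i] for r in rows) / len(rows) for i in range(n_units)]
        # var_floor is an assumed safeguard against zero variance.
        var = [
            max(sum((r[i] - mu[i]) ** 2 for r in rows) / len(rows), var_floor)
            for i in range(n_units)
        ]
        params[regime] = (mu, var)
    return params


def regime_posterior(x, params):
    """P(regime | x) under independent Gaussians and a uniform prior."""
    log_p = {}
    for regime, (mu, var) in params.items():
        log_p[regime] = sum(
            -0.5 * ((xi - u) ** 2 / v + math.log(2 * math.pi * v))
            for xi, u, v in zip(x, mu, var)
        )
    top = max(log_p.values())
    weights = {regime: math.exp(lp - top) for regime, lp in log_p.items()}
    total = sum(weights.values())
    return {regime: w / total for regime, w in weights.items()}


# Toy spike counts of two units over the planning window (hypothetical data).
trials = [[8, 2], [9, 1], [7, 3], [2, 8], [1, 9], [3, 7]]
labels = ["left", "left", "left", "right", "right", "right"]
params = fit_gaussians(trials, labels)
post = regime_posterior([8, 2], params)
print(post)
```

Working in log probabilities and subtracting the maximum before exponentiating keeps the normalization stable when many units are multiplied together.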
3. Results
In this study, two monkeys (monkeys B and C) were well trained to perform the delayed obstacle-avoidance task. The trial accuracy rates exceeded 95% and 93% for monkeys B and C, respectively. The PMd neural signals used in this study were recorded in 10 sessions (five for each monkey) spread over one month, and each data session contained 307 ± 43 and 321 ± 50 trials for monkeys B and C, respectively. With Offline Sorter, 45 ± 4.3 and 38 ± 3.8 units were isolated for monkeys B and C, respectively. Leave-one-out cross-validation was used for both target direction and movement selection prediction [18]: one trial was held out as the test sample, and the remaining trials were used to train the model parameters. For example, in one of the leave-one-out cases, all trials except the last were training samples and the last trial was the test sample.
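The leave-one-out procedure can be sketched generically. In this illustration, `fit` and `predict` are placeholders for any classifier, and the nearest-mean classifier on scalar features is only a stand-in so that the example is self-contained; it is not the Gaussian classifier used in the study, and the data are hypothetical.

```python
def leave_one_out_accuracy(trials, labels, fit, predict):
    """Hold out each trial once, train on the rest, and score the held-out trial."""
    hits = 0
    for i in range(len(trials)):
        train_x = trials[:i] + trials[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = fit(train_x, train_y)
        if predict(model, trials[i]) == labels[i]:
            hits += 1
    return hits / len(trials)


def fit_means(xs, ys):
    """Stand-in classifier: store the mean feature value of each class."""
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict_nearest(means, x):
    """Assign x to the class whose stored mean is closest."""
    return min(means, key=lambda y: abs(means[y] - x))


# Hypothetical scalar features for two well-separated classes.
features = [1.0, 1.2, 0.9, 3.0, 3.1, 2.8]
classes = ["a", "a", "a", "b", "b", "b"]
acc = leave_one_out_accuracy(features, classes, fit_means, predict_nearest)
print(acc)
```

Because the held-out trial never contributes to the fitted parameters, the resulting accuracy is an estimate of performance on unseen trials.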
3.1. Target Direction Encoding Properties in Delay 1 Epoch
We examined the target direction encoding properties during the rest and delay 1 epochs. The PV, defined as the sum of weighted preferred directions, was used to investigate the evolution of unit-encoding directions [15]. A velocity-based PV was used in our study [2]. Because all recorded units were analyzed regardless of their tuning depth, preferred directions were not normalized to unit length [2], so strongly tuned units had higher weights and weakly tuned units had lower weights. Target direction was estimated from 0 to 900 ms (the whole rest epoch, delay 1, and the first 100 ms of delay 2; to account for causal time delay, the first bin in delay 2 was also included in the prediction of target direction [37]), during which the moving cursor and target ball were shown to the monkeys.
Figure 3 shows how the velocity-based PVs changed over the course of the rest and delay epochs for two start position conditions for each animal. Five bins in rest, three bins in delay 1, and one bin in delay 2 are shown. Figure 3(a) shows the temporal evolution of PVs with the start position on the right. During the rest epoch, with nothing on the screen, the PVs showed no significant direction. During delay 1 and the first bin of delay 2, the PVs grew longer and pointed in the direction of the target. Figure 3(b) shows results consistent with (a) for the start position on the left. Some PVs had a direction preference during the rest epoch. For example, for monkey B in Figure 3(b), PVs tended to point to the right during the rest epoch, and the angles between the PVs and the rightward direction were within 45 degrees. One possible reason is that an overtrained monkey could predict that the target would appear on the right (top right, bottom right, or direct right) in trials where the initial position was on the left [38].
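A population vector with unnormalized preferred directions can be sketched as below. The weighting by firing rate minus baseline is a common convention and is an assumption here, since the source does not give the exact formula; the function names and all numbers are hypothetical.

```python
import math


def population_vector(rates, baselines, preferred_dirs):
    """Sum of preferred-direction vectors weighted by rate change from baseline.

    preferred_dirs are NOT normalized to unit length, so units with deeper
    tuning (longer vectors) contribute more to the sum.
    """
    px = sum((r - b) * d[0] for r, b, d in zip(rates, baselines, preferred_dirs))
    py = sum((r - b) * d[1] for r, b, d in zip(rates, baselines, preferred_dirs))
    return px, py


def direction_deg(vec):
    """Angle of a 2D vector in degrees, in the range [0, 360)."""
    return math.degrees(math.atan2(vec[1], vec[0])) % 360.0


# Three hypothetical units with preferred directions up, right, and down.
preferred_dirs = [(0.0, 2.0), (1.0, 0.0), (0.0, -1.0)]
baselines = [10.0, 10.0, 10.0]
rates = [15.0, 10.0, 7.0]  # the "up" unit fires more, the "down" unit less
pv = population_vector(rates, baselines, preferred_dirs)
print(pv, direction_deg(pv))
```

In this toy case the vector points straight up, and its length reflects both the rate changes and the tuning depth of the contributing units.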
[Figure 3: panels (a) and (b)]
Bayes’ rule with the Gaussian hypothesis was used to estimate the target direction quantitatively. Neural activities in the windows labeled by red bars in Figure 1(b) were analyzed. Tables 1 and 2 give the expectation of target direction for monkey B and monkey C, respectively. The leave-one-out technique was used here to train the Gaussian parameters. The accuracy rate was calculated as the expectation of selecting the correct target. Student’s t-test was performed between the expectations and the chance level.


3.2. Intended Movement Selection Encoding Properties in Delay 2 Epoch
The delay 2 epoch began with the appearance of the obstacle. There were two obstacle opening positions for each pair of start point and target, so during delay 2 the monkeys had two intended movement selections (clockwise and counterclockwise) for avoiding the obstacle. We were interested in whether the neural patterns differed between the two selections. Intended movement selection was estimated from 900 to 1300 ms (100–500 ms of delay 2), after the obstacle showed up. We used PCA to visualize the neural patterns during delay 2 by dimensionality reduction, as shown in Figure 4. Each dot (blue circles and red triangles) in Figure 4 represents the neural pattern in one 100 ms bin. Figure 4(a) shows the PCA projection results for monkey B with the start point at the bottom and the target at the top. The obstacle opening could appear on the left or the right. The two candidate obstacle-avoidance trajectories are represented by dashed curves and labeled by solid circles and triangles. The neural activities projected onto the top two principal components were clustered into two groups corresponding to the trajectory candidates. Although there was some overlap, the two clusters were distinguishable, which implies that the monkeys were engaged in intended movement selection during the delay 2 epoch. Figure 4(b) shows the results for monkey C, which were consistent with (a).
[Figure 4: panels (a) and (b)]
We further used Bayes’ rule with the Gaussian hypothesis to estimate the intended movement selection. Neural activities during movement planning in the windows labeled by blue bars in Figure 1(b) were analyzed. Tables 3 and 4 give the expectation of intended movement selection for monkey B and monkey C, respectively. The leave-one-out technique was used here as well. Student’s t-test was performed between the expectations and the chance level. The expectations of both monkeys were significantly above the chance level.


3.3. Decoding Results with Prior Knowledge
The prediction results for target direction and intended movement selection imply that the neural activities during the delay epoch contained information about the task. We integrated the information predicted during the delay epoch into the MTM framework to improve trajectory estimation during movement. To evaluate the effects of prior knowledge on decoding performance in the obstacle-avoidance task, decoders with three different levels of prior knowledge were compared: (1) no prior knowledge (the prior term in RBE follows a uniform distribution); (2) the estimated target direction integrated into RBE; and (3) the sequentially estimated target direction and intended movement selection integrated into RBE.
Figure 5(a) shows the decoding results for the horizontal and vertical positions separately, while Figure 5(b) shows the estimated trajectories in two-dimensional space. The decoder with no prior knowledge performed worst and had the largest 95% credible interval, as illustrated in Figure 5(a). Its estimated trajectory followed the real trajectory relatively well for the first half and may even have steered around the obstacle, but it lost direction in the second half and failed to reach the target. The estimate with target direction only tended to reach the target in a relatively direct way, which may run into the obstacle. The trajectory estimated with both target direction and intended movement selection curved around the obstacle and reached the target successfully, as shown in Figure 5(b).
[Figure 5: panels (a) and (b)]
Pearson’s correlation coefficient (CC) and mean square error (MSE) were used to evaluate the performance of trajectory regression. The success rate, defined as the proportion of trials in which the obstacle was steered around and the target was reached successfully, was used to evaluate decoding performance in terms of task completion. Figure 6 shows the estimation performance of the three decoders with different prior knowledge for both monkeys across the ten data sessions. Figure 6(a) shows the mean CC (top), MSE (middle), and success rate (bottom) of each data session, labeled by different colors, for monkey B. The histograms represent the mean CC, MSE, and success rate over all data sessions. They show that the decoder with no prior information had the smallest CC and largest MSE, while the decoder with both target direction and intended movement selection had the largest CC and smallest MSE. Specifically, the CC of the decoder with both target direction and intended movement selection exceeded those of the decoder without prior information and the decoder with only target direction by 15.9% and 5.3%, respectively (MSE reduction: 18.8% and 7.9%). Figure 6(b) shows the decoding performance for monkey C (CC increase: 14.4% and 7.7%; MSE reduction: 16.4% and 6.5%), which was consistent with the results in (a). The success rate obtained with both target direction and intended movement selection exceeded those obtained without prior information and with only target direction by 113.1% and 45.7% for monkey B and by 93.4% and 59.0% for monkey C, respectively. With more prior knowledge, the decoders obtained more instructive information, which improved both trajectory estimation and trial completion. The results demonstrate that both target direction and intended movement selection were essential in the obstacle-avoidance task, where more information is needed to estimate the trajectory.
[Figure 6: panels (a) and (b)]
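The two regression metrics can be written out for one position coordinate as in the sketch below. The helper `relative_change` assumes that the reported percentages are relative changes with respect to the baseline decoder, which the text does not state explicitly; all function names and the toy trajectories are hypothetical.

```python
import math


def pearson_cc(a, b):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def mse(a, b):
    """Mean square error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def relative_change(new, old):
    """Percentage change of `new` relative to `old` (assumed reporting convention)."""
    return (new - old) / old * 100.0


# Toy true and decoded positions along one axis (hypothetical values).
true_x = [0.0, 1.0, 2.0, 3.0]
decoded_x = [0.0, 1.0, 2.0, 4.0]
cc = pearson_cc(true_x, decoded_x)
err = mse(true_x, decoded_x)
print(cc, err, relative_change(0.6, 0.5))
```

For a two-dimensional trajectory, the same functions would be applied to the horizontal and vertical coordinates separately and the results averaged, which is one possible convention rather than the one confirmed by the source.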
4. Discussion and Conclusion
In this study, we predicted the target direction and intended movement selection during the delay epoch and integrated this planning information into the MTM framework to improve decoding performance in the movement epoch. The PV and PCA results demonstrated that units were tuned to the target direction and the initial movement direction during delay 1 and delay 2, respectively. We sequentially integrated these two kinds of prior knowledge into the MTM. Compared with the decoders with no prior and with only the estimated target direction, the CC of trajectory estimation increased by 15.9% and 5.3% for monkey B and by 14.4% and 7.7% for monkey C, while the MSE decreased by 18.8% and 7.9% for monkey B and by 16.4% and 6.5% for monkey C. The trial success rates improved significantly with both target direction and intended movement selection for both monkeys. The results imply that integrating target direction and intended movement selection could improve hand trajectory estimation in indirect reaching.
Indirect reaching is common in daily life. The environment animals live in is very complex and full of obstacles, which poses difficulties for decoding. This study proposes an approach for generalizing BMIs from point-to-point tasks to more complex tasks through a strategy of integrating planning information. The PMd is considered to be related to planning during the delay epoch [21–23]. Pearce and Moran visualized the planning activities of the PMd with PVs [2], and the evolution of PVs during the delay 1 epoch in this study agreed with their report. We also found that the neural activities during delay 2 were tuned to the intended movement selection. The neural activities during the rest epoch of some trials were unexpected: both monkeys appeared to make some prejudgment during the rest epoch based on the task settings. This implies that, to some extent, overtrained monkeys had a sense of the workspace and made the prejudgment based on the hand position [38]. Furthermore, the shape or location of the obstacle might influence the performance of trajectory estimation. Although our study mainly focused on incorporating prior information to improve decoding performance, it would be important to study the influence of obstacles on indirect reaching movements in future work.
The MTM framework was proposed to improve trajectory estimation by integrating target information [16]. The framework works well because it conforms to the timeline of performing a task: first planning and then moving. In more complex tasks, neural activities during planning likewise correspond to key information about the task, so the extracted planning information provides guidance for the subsequent estimation. In this study, we generalized this framework to a more complex task by integrating one more kind of prior knowledge. The framework can readily be extended to three or more kinds of prior knowledge with our methods. With more prior information included, the trajectories of more complex tasks could be estimated smoothly and accurately. As mentioned in the Introduction, the state-of-the-art ReFIT-KF improves online reaching movement estimation by retraining the parameters with intention-to-target information [3]. A comparison of ReFIT-KF and the methods here will be conducted in online BMIs in a further study.
There are some limitations in our study. In practice, real situations are more complex than the task performed in our study, so more complex models [39] should be used to extract more valid information. In addition, the time windows used to extract the planning information were fixed in this study, so uncorrelated neural activities were included. Several methods have been proposed to estimate the state evolution during the task [39–42]. However, detecting the time windows in which planning happens is still an open question. Only offline analysis was carried out here; more experiments for online validation [17] need to be done in future studies.
Conflicts of Interest
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This work is supported by the Natural Science Foundation of China (nos. 31371001, 31627802, and 61473261), International cooperation projects of MOST (no. 2014DFG32580), National Basic Research Program of China (no. 2013CB329506), and the Fundamental Research Funds for the Central Universities. The authors would like to thank Yiyi Yang, Shenglong Xiong, and Yimin Shen for their assistance with the animal experiments.
References
[1] L. R. Hochberg, D. Bacher, B. Jarosiewicz et al., “Reach and grasp by people with tetraplegia using a neurally controlled robotic arm,” Nature, vol. 485, no. 7398, pp. 372–375, 2012.
[2] T. M. Pearce and D. W. Moran, “Strategy-dependent encoding of planned arm movements in the dorsal premotor cortex,” Science, vol. 337, no. 6097, pp. 984–988, 2012.
[3] V. Gilja, P. Nuyujukian, C. A. Chestek et al., “A high-performance neural prosthesis enabled by control algorithm design,” Nature Neuroscience, vol. 15, no. 12, pp. 1752–1757, 2012.
[4] G. Welch and G. Bishop, An Introduction to the Kalman Filter, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA, vol. 8, no. 7, pp. 127–132, 2006.
[5] Q. Zhang, S. Zhang, Y. Hao et al., “Development of an invasive brain-machine interface with a monkey model,” Chinese Science Bulletin, vol. 57, no. 16, pp. 2036–2045, 2012.
[6] P. R. Roelfsema and A. V. Ooyen, “Attention-gated reinforcement learning of internal representations for classification,” Neural Computation, vol. 17, no. 10, pp. 2176–2214, 2005.
[7] Y. Wang, F. Wang, K. Xu, Q. Zhang, S. Zhang, and X. Zheng, “Neural control of a tracking task via attention-gated reinforcement learning for brain-machine interfaces,” IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 23, no. 3, pp. 458–467, 2015.
[8] J. Wessberg, C. R. Stambaugh, J. D. Kralik, P. D. Beck, and M. Laubach, “Real-time prediction of hand trajectory by ensembles of cortical neurons in primates,” Nature, vol. 408, no. 6810, pp. 361–365, 2000.
[9] J. M. Carmena, M. A. Lebedev, R. E. Crist et al., “Learning to control a brain-machine interface for reaching and grasping by primates,” PLoS Biology, vol. 1, no. 2, p. E42, 2003.
[10] M. Velliste, S. Perel, M. C. Spalding, A. S. Whitford, and A. B. Schwartz, “Cortical control of a prosthetic arm for self-feeding,” Neurosurgery, vol. 453, no. 63, pp. N8–N9, 2008.
[11] D. M. Taylor, S. I. Tillery, and A. B. Schwartz, “Direct cortical control of 3D neuroprosthetic devices,” Science, vol. 296, no. 5574, pp. 1829–1832, 2002.
[12] M. D. Serruya, N. G. Hatsopoulos, L. Paninski, M. R. Fellows, and J. P. Donoghue, “Brain-machine interface: instant neural control of a movement signal,” Nature, vol. 416, no. 6877, pp. 141–142, 2002.
[13] A. P. Georgopoulos, J. F. Kalaska, R. Caminiti, and J. T. Massey, “On the relations between the direction of two-dimensional arm movements and cell discharge in primate motor cortex,” Journal of Neuroscience, vol. 2, no. 11, pp. 1527–1537, 1982.
[14] M. Weinrich and S. P. Wise, “The premotor cortex of the monkey,” Journal of Neuroscience, vol. 2, no. 9, pp. 1329–1345, 1982.
[15] A. Georgopoulos, A. Schwartz, and R. Kettner, “Neuronal population coding of movement direction,” Science, vol. 233, no. 4771, pp. 1416–1419, 1986.
[16] B. M. Yu, C. Kemere, G. Santhanam et al., “Mixture of trajectory models for neural decoding of goal-directed movements,” Journal of Neurophysiology, vol. 97, no. 5, pp. 3763–3780, 2007.
[17] W. Bishop, B. M. Yu, G. Santhanam et al., “The use of a virtual integration environment for the real-time implementation of neural decode algorithms,” Conference Proceedings: Annual International Conference of the IEEE Engineering in Medicine and Biology Society, vol. 2008, pp. 628–633, 2008.
[18] M. M. Shanechi, Z. M. Williams, G. W. Wornell, R. C. Hu, M. Powers, and E. N. Brown, “A real-time brain-machine interface combining motor target and trajectory intent using an optimal feedback control design,” PLoS One, vol. 8, no. 4, pp. 1–15, 2013.
[19] P. T. Sadtler, S. I. Ryu, E. C. Tyler-Kabara, B. M. Yu, and A. P. Batista, “Brain-computer interface control along instructed paths,” Journal of Neural Engineering, vol. 12, no. 1, p. 016015, 2015.
[20] S. Hocherman and S. P. Wise, “Effects of hand movement path on motor cortical activity in awake, behaving rhesus monkeys,” Experimental Brain Research, vol. 83, no. 2, pp. 285–302, 1991.
[21] S. P. Wise, “The primate premotor cortex: past, present, and preparatory,” Annual Review of Neuroscience (Palo Alto, CA), vol. 8, pp. 1–19, 1985.
[22] P. Cisek and J. F. Kalaska, “Neural correlates of reaching decisions in dorsal premotor cortex: specification of multiple direction choices and final selection of action,” Neuron, vol. 45, no. 5, pp. 801–814, 2005.
[23] M. M. Churchland, “Neural variability in premotor cortex provides a signature of motor preparation,” Journal of Neuroscience, vol. 26, no. 14, pp. 3697–3712, 2006.
[24] M. M. Shanechi, A. L. Orsborn, H. G. Moorman, S. Gowda, S. Dangi, and J. M. Carmena, “Rapid control and feedback rates enhance neuroprosthetic control,” Nature Communications, vol. 8, p. 13825, 2017.
[25] H. Li, F. Wang, Q. Zhang, and J. C. Principe, “Maximum correntropy based attention-gated reinforcement learning designed for brain machine interface,” in Engineering in Medicine and Biology Society (EMBC), 2016 IEEE 38th Annual International Conference of the IEEE, pp. 3056–3059, 2016.
[26] G. Hotson, R. J. Smith, A. G. Rouse, M. H. Schieber, N. V. Thakor, and B. A. Wester, “High precision neural decoding of complex movement trajectories using recursive Bayesian estimation with dynamic movement primitives,” IEEE Robotics and Automation Letters, vol. 1, no. 2, pp. 676–683, 2016.
[27] E. N. Brown, L. M. Frank, D. Tang, M. C. Quirk, and M. A. Wilson, “A statistical paradigm for neural spike train decoding applied to position prediction from ensemble firing patterns of rat hippocampal place cells,” Journal of Neuroscience, vol. 18, no. 18, pp. 7411–7425, 1998.
[28] B. M. Yu, S. I. Ryu, G. Santhanam, M. M. Churchland, and K. V. Shenoy, “Improving neural prosthetic system performance by combining plan and peri-movement activity,” Conference Proceedings: Annual International Conference of the IEEE Engineering in Medicine and Biology Society, vol. 6, pp. 4516–4519, 2004.
[29] C. Kemere, G. Santhanam, B. M. Yu, S. Ryu, T. Meng, and K. V. Shenoy, “Model-based decoding of reaching movements for prosthetic systems,” Conference Proceedings: Annual International Conference of the IEEE Engineering in Medicine and Biology Society, vol. 6, pp. 4524–4528, 2004.
[30] A. E. Brockwell, A. L. Rojas, and R. E. Kass, “Recursive Bayesian decoding of motor cortical signals by particle filtering,” Journal of Neurophysiology, vol. 91, no. 4, pp. 1899–1907, 2004.
[31] W. Wu, Y. Gao, E. Bienenstock, J. P. Donoghue, and M. J. Black, “Bayesian population decoding of motor cortical activity using a Kalman filter,” Neural Computation, vol. 18, no. 1, pp. 80–118, 2006.
[32] M. M. Shanechi, G. W. Wornell, Z. M. Williams, and E. N. Brown, “Feedback-controlled parallel point process filter for estimation of goal-directed movements from neural signals,” IEEE Transactions on Neural Systems and Rehabilitation Engineering, vol. 21, no. 1, pp. 129–140, 2013.
[33] E. M. Maynard, C. T. Nordhausen, and R. A. Normann, “The Utah intracortical electrode array: a recording structure for potential brain-computer interfaces,” Electroencephalography and Clinical Neurophysiology, vol. 102, no. 3, pp. 228–239, 1997.
[34] P. Y. Chhatbar, L. M. von Kraus, M. Semework, and J. T. Francis, “A bio-friendly and economical technique for chronic implantation of multiple microelectrode arrays,” Journal of Neuroscience Methods, vol. 188, no. 2, pp. 187–194, 2010.
[35] D. J. C. MacKay, Information Theory, Inference, and Learning Algorithms, Cambridge University Press, vol. 50, no. 10, p. 640, 2003.
[36] E. M. Maynard, N. G. Hatsopoulos, C. L. Ojakangas et al., “Neuronal interactions improve cortical population coding of movement direction,” Journal of Neuroscience, vol. 19, no. 18, pp. 8083–8093, 1999.
[37] D. W. Moran and A. B. Schwartz, “Motor cortical representation of speed and direction during reaching,” Journal of Neurophysiology, vol. 82, no. 5, pp. 2676–2692, 1999.
[38] E. Hoshi and J. Tanji, “Integration of target and body-part information in the premotor cortex when planning action,” Nature, vol. 408, no. 6811, pp. 466–470, 2000.
[39] J. C. Kao, P. Nuyujukian, S. I. Ryu, and K. V. Shenoy, “A high-performance neural prosthesis incorporating discrete state selection with hidden Markov models,” IEEE Transactions on Biomedical Engineering, vol. 62, no. 1, pp. 21–29, 2015.
[40] V. Aggarwal, M. Mollazadeh, A. G. Davidson, M. H. Schieber, and N. V. Thakor, “State-based decoding of hand and finger kinematics using neuronal ensemble and LFP activity during dexterous reach-to-grasp movements,” Journal of Neurophysiology, vol. 109, no. 12, pp. 3067–3081, 2013.
[41] C. Kemere, G. Santhanam, B. M. Yu et al., “Detecting neural-state transitions using hidden Markov models for motor cortical prostheses,” Journal of Neurophysiology, vol. 100, no. 4, pp. 2441–2452, 2008.
[42] N. Achtman, A. Afshar, G. Santhanam, B. M. Yu, S. I. Ryu, and K. V. Shenoy, “Free-paced high-performance brain-computer interfaces,” Journal of Neural Engineering, vol. 4, no. 3, pp. 336–347, 2007.
Copyright
Copyright © 2017 Hongbao Li et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.