Computational Intelligence and Neuroscience

Volume 2015 (2015), Article ID 427829, 12 pages

http://dx.doi.org/10.1155/2015/427829

## Test Statistics for the Identification of Assembly Neurons in Parallel Spike Trains

European Centre for Soft Computing, Edificio Científico Tecnológico, Gonzalo Gutiérrez Quirós, s/n, 33600 Mieres, Spain

Received 13 September 2014; Revised 13 February 2015; Accepted 18 February 2015

Academic Editor: Jianwei Shuai

Copyright © 2015 David Picado Muiño and Christian Borgelt. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

In recent years numerous improvements have been made in multiple-electrode recordings (i.e., parallel spike-train recordings) and spike sorting, to the extent that it is now possible to monitor the activity of up to hundreds of neurons simultaneously. These improvements make it potentially possible to identify assembly activity (roughly understood as *significant* synchronous spiking of a group of neurons) from such recordings, which, if it can be demonstrated reliably, would significantly improve our understanding of neural activity and neural coding. However, several methodological problems remain. A principal one is the combinatorial explosion that one faces when considering all potential neuronal assemblies, since in principle every subset of the recorded neurons constitutes a candidate set for an assembly. We present several statistical tests to identify assembly neurons (i.e., neurons that participate in a neuronal assembly) from parallel spike trains, with the aim of reducing the set of neurons to a relevant subset and in this way easing the task of identifying neuronal assemblies in further analyses. These tests improve on those introduced by Berger et al. (2010) through additional features, such as spike weight or pairwise overlap, and through alternative ways of identifying spike coincidences (e.g., by avoiding time binning, which tends to lose information).

#### 1. Introduction

The principles of neural coding and information processing in biological neural networks are still not well understood and are the topic of ongoing debate. As a model of network processing, neuronal assemblies, intuitively understood as groups of neurons that tend to exhibit synchronous spiking, were proposed in [1].

In recent years considerable improvements have been made in multiple-electrode recordings and spike sorting (see, e.g., [2, 3]) that allow monitoring the activity of up to hundreds of neurons simultaneously. These improvements open the possibility of identifying neuronal assemblies from multiple-electrode recordings using statistical data analysis techniques. However, several methodological problems remain. A principal one is the combinatorial explosion that we face when considering all potential neuronal assemblies (since in principle every subset of the recorded neurons constitutes a candidate set for an assembly). For this reason, most studies that deal with temporal spike correlation still resort to analyzing only pairwise interactions (see, e.g., [4–7]), which considerably reduces the computational complexity of the task. Some approaches in the literature try to infer higher-order correlation and potential assembly activity by building primarily on these pairwise interactions (see, e.g., [8–11]). Although they can sometimes provide a hint of higher-order correlation and even closely identify assembly activity (provided it is sufficiently pronounced), higher-order correlations need to be checked directly in order to properly identify neuronal assemblies, for two main reasons: first, to make sure that the reported activity is actually that of an assembly and not just of several overlapping pairs and, second, to increase the sensitivity for assembly activity, since pairwise tests may not be affected sufficiently by it (see, e.g., [12, 13]). Some approaches already do so (see, e.g., [14–16]), yet they are generally limited to a small number of neurons. Others, presented in some of our recent companion papers (see, e.g., [17–19]), push this limitation by employing frequent item set mining methodology and algorithms to ease and speed up the search through all the candidate sets for potential assemblies. Yet combinatorial explosion remains a fundamental problem, especially since statistical tests aiming at identifying assembly activity often rely on randomization or surrogate data approaches, which drive up the computational complexity even further.

In this paper we present several statistical tests to identify individual assembly neurons (i.e., neurons that are part of an assembly). Our tests extend and considerably improve those presented in [20], which were based on time binning and were mostly intended to identify *exact* (or almost exact) spike synchrony, which is more a theoretical simplification for modelling purposes than a realistic assumption. The new tests improve on them in two ways: first, we introduce new features that make the tests more sensitive (e.g., spike weights or pairwise overlap of spikes) and, second, we introduce new ways to identify spike coincidences (i.e., alternatives to time binning that avoid the loss of detectable synchronous activity). The main motivation of our tests is to reduce the set of neurons to a relevant subset and in this way ease the task of identifying neuronal assemblies in further analyses (i.e., by restricting the analysis to the neurons that tested positive in our approach, the combinatorial explosion can be reduced significantly). The idea behind all tests that we present in this paper is fairly simple: we evaluate whether an individual neuron is involved *significantly* more often in some correlated-spiking event (which depends on the particular test) than would be expected by chance under the assumption of noncorrelation (i.e., independence). In order to assess significance we estimate the distribution of our test statistics by means of randomized trials (i.e., collections of parallel spike trains): modifications of our original data that are intended to keep all its essential features except synchrony for the neuron we are testing.
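The general scheme described above (compare an observed statistic for one neuron with its distribution over randomized versions of the data) can be sketched as follows. This is only an illustration under stated assumptions, not one of the test statistics of this paper: the statistic `count_coincidences`, the surrogate scheme (uniform dithering of the tested neuron's spikes by up to `dither`), and all function and parameter names are hypothetical choices made for the example.

```python
import random

def count_coincidences(target, others, width):
    """Illustrative statistic (an assumption, not the paper's): number of spikes
    of the tested neuron that have a spike of another neuron within `width`."""
    count = 0
    for t in target:
        if any(abs(t - s) <= width for train in others for s in train):
            count += 1
    return count

def surrogate_p_value(target, others, width, dither, n_surrogates=1000, seed=0):
    """Monte Carlo p-value: fraction of surrogates whose statistic reaches
    the observed one. Surrogates dither only the tested neuron's spikes,
    which destroys its fine synchrony but roughly keeps its firing rate."""
    rng = random.Random(seed)
    observed = count_coincidences(target, others, width)
    at_least = 0
    for _ in range(n_surrogates):
        surrogate = [t + rng.uniform(-dither, dither) for t in target]
        if count_coincidences(surrogate, others, width) >= observed:
            at_least += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least + 1) / (n_surrogates + 1)

# Toy data (times in ms): the tested neuron fires 0.5 ms after another neuron.
others = [[10.0 + 40.0 * i for i in range(10)]]
target = [t + 0.5 for t in others[0]]
p = surrogate_p_value(target, others, width=1.0, dither=20.0)
print(p)
```

A small p-value here indicates that the tested neuron coincides with other neurons more often than its dithered versions do. A real analysis would also have to decide how to treat dithered spikes that leave the recording interval, which this sketch ignores.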

The paper is structured as follows. In Section 2 we introduce the notation used throughout the paper and briefly discuss the notion of *spike synchrony*, central to our research. In Section 3 we introduce our test statistics to identify assembly neurons. First, in Section 3.1 we provide four statistical tests that rely on a window-based approach to identify spike coincidences. Technically speaking, different collections of windows provide different ways of counting spike coincidences and thus different tests. We consider two collections of windows in our evaluations. The first is a partition of the recording time of our spike data into equal intervals (i.e., time bins), on which the bin-based model (the almost exclusively applied model of synchrony in the neurobiology literature) relies in order to identify spike coincidences. The second, more in keeping with a time-continuous account of spiking activity, is a collection of sliding windows (one for each spike time) that accounts for all spike coincidences in our spike trains that fall within the window length. It is consistent with the common, intended characterization of spike synchrony in the field, which regards two or more spikes as synchronous if they lie within a certain distance from each other (to be determined by the modeller). Second, in Section 3.2, we offer a *graded*, continuous alternative to some of the previous tests. In Section 4 we briefly discuss the complexity of computing the test statistics presented in Section 3. In Section 5 we evaluate the performance of our new test statistics on artificially generated collections of spike trains based on parameters learned from typical real recordings, compare it to the performance of those in [20], and show that the former clearly outperform the latter. Finally, in Section 6 we summarize the results.

#### 2. Preliminary Definitions, Remarks, and Notation

Let $N$ be our set of items (i.e., in our context, neurons). We will be working with parallel spike trains, one for each neuron in $N$, formalized as spike-time sequences (i.e., point processes) of the form $\{t_1^{(a)}, \ldots, t_{k_a}^{(a)}\} \subseteq (0, T]$, for $a \in N$ and $T \in \mathbb{R}^{+}$ (the recording time), where $k_a$ is the number of times neuron $a$ fires in the interval $(0, T]$. We denote the set of all these sequences by $\mathcal{S}$. Sets of sequences like $\mathcal{S}$ constitute our raw data.

In order to identify (potential) assembly neurons and, ultimately, neuronal assemblies, we first need to determine what constitutes spike synchrony: exact spike coincidences cannot be expected, and thus an alternative, nontrivial characterization of synchrony is needed. Generally, two or more spikes are considered synchronous (or coincident), that is, they constitute a synchronous event, if they lie within a certain (user-defined) time distance from each other. We will assume this notion of spike synchrony throughout.

The bin-based method, the almost exclusively applied method for dealing with synchronous spiking in the neurobiology literature, builds on the notion of synchrony above: the recording time is partitioned into time bins (i.e., windows) of equal length (the time distance within which the modeller intends to define synchrony), and all spikes that lie in the same time bin are regarded as synchronous. Notice, though, that the bin-based method can fail to identify some synchronous events: two or more spikes can be separated by a time distance far smaller than the bin length and still lie in two distinct time bins. In other companion papers we called this the *boundary problem*. We addressed it in [17] by means of an alternative method to identify and count spike coincidences, which builds on an alternative window set (defined in the next section) that matches the intended characterization of spike synchrony given above. To illustrate the relevance of the boundary problem and the large impact that time-bin boundaries have on the identification of synchrony, Figure 1 shows the probability that spike coincidences of different sizes are cut by a time-bin boundary, for different ratios between the scatter of the spikes (the time span of the spikes in the coincidence) and the bin width.
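The difference between the distance-based notion of synchrony and the bin-based method can be made concrete with a minimal sketch. The function names and the Monte Carlo setup are assumptions made for illustration; in particular, the simulation assumes that the position of a coincidence is uniformly distributed relative to the bin grid and looks only at its first and last spike, so it is not meant to reproduce Figure 1. Under that assumption, a coincidence whose scatter does not exceed the bin width is cut with probability scatter / width.

```python
import random

def is_synchronous(spike_times, width):
    """Distance-based notion: all spikes lie within `width` of each other,
    i.e., the span between the first and last spike is at most `width`."""
    return max(spike_times) - min(spike_times) <= width

def same_bin(spike_times, width):
    """Bin-based notion: all spikes fall into the same bin
    [k * width, (k + 1) * width)."""
    return len({int(t // width) for t in spike_times}) == 1

def cut_probability(scatter, width, trials=100_000, seed=0):
    """Monte Carlo estimate of the probability that a coincidence with the
    given scatter is split by a bin boundary (uniform offset assumed)."""
    rng = random.Random(seed)
    cut = 0
    for _ in range(trials):
        first = rng.uniform(0.0, width)  # first spike, relative to the bin grid
        if not same_bin([first, first + scatter], width):
            cut += 1
    return cut / trials

# Two spikes only 0.2 ms apart, but on opposite sides of the boundary at 3 ms.
pair = [2.9, 3.1]
print(is_synchronous(pair, 3.0), same_bin(pair, 3.0))
print(cut_probability(scatter=1.5, width=3.0))
```

The pair in the example is synchronous in the distance-based sense but is missed by binning, which is exactly the boundary problem described above.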