Abstract

Artifactual sources of resting-state (RS) FMRI can originate from head motion, physiology, and hardware. Of these sources, motion has received considerable attention and was found to induce corrupting effects by differentially biasing correlations between regions depending on their distance. Numerous corrective approaches have relied on the identification and censoring of high-motion time points and the use of the brain-wide average time series as a nuisance regressor to which the data are orthogonalized (Global Signal Regression, GSReg). We replicate the previously reported head-motion bias on correlation coefficients and then show that while motion can be the source of artifact in correlations, the distance-dependent bias is exacerbated by GSReg. Put differently, correlation estimates obtained after GSReg are more susceptible to the presence of motion and by extension to the levels of censoring. More generally, the effect of motion on correlation estimates depends on the preprocessing steps leading to the correlation estimate, with certain approaches performing markedly worse than others. For this purpose, we consider various models for RS FMRI preprocessing and show that the local white matter regressor (WMeLOCAL), a subset of ANATICOR, results in minimal sensitivity to motion and reduces by extension the dependence of correlation results on censoring.

1. Introduction

Resting-State Functional Magnetic Resonance Imaging (RS FMRI) has become a popular methodology for studying brain function with FMRI and holds promise for understanding brain functions without a task or stimulus [1]. A commonly used approach employs the cross correlation between time series to estimate the strength of connection between a pair of voxels or regions of interest after possible artifacts are removed by linear regression (nuisance-removal regression) from the original echo planar imaging (EPI) time series data. Part of the appeal of RS FMRI is the relative ease with which the data can be acquired.

However, drawing valid inferences can be fraught with pitfalls, as illustrated in recent publications that have caused a considerable stir in the functional neuroimaging field. For example, Power et al. [2] showed that head movement differences between subjects might explain perceived differences in the spatial patterns of brain connectivity and suggested that these motion differences differentially bias short-range versus long-range correlations. This inference was reached by considering the change in interregional correlations after high-motion points were eliminated from the estimation of correlation. Removing high-motion samples differentially affected correlations depending on interregional distance, thus implicating motion as the source of this distance-dependent bias. As a result, the authors conclude that censoring is the recommended approach for reducing the effects of motion. While we agree with the notion that data censoring can be important, we find that the reported distance-dependent bias is not primarily induced by motion. It is strongly exacerbated by the inclusion of the global signal averaged over whole brain (GS) and related regressors derived by time series averaging over regions containing signals of interest.

In this work, we replicate the bias reported in [2] using the data the authors have very generously made public; we demonstrate how the exclusion of particular tissue-based regressors reduces the distance-dependent bias effect considerably and how the use of a variant on ANATICOR [3] almost entirely eliminates the effect, establishing that our recommended approach is less sensitive to motion induced artifacts. This result is yet another demonstration of why the GS and comparable time series averaged over large brain areas, a practice still widely used, should not be projected out of the data in RS-FMRI [35]. Finally, we provide an annotated flowchart that presents our recommended data preprocessing pipeline.

2. Materials and Methods

2.1. MRI Data

Image data used in [2] are open to the public at the FCON 1000 project website (http://fcon_1000.projects.nitrc.org/). We used children group data (cohort 1; ) that exhibited larger motion effects than the other two groups (adolescent and adult cohorts in the full data set). The details of the cohort are described in [2] and the website (http://fcon_1000.projects.nitrc.org/indi/retro/Power2012.html).

2.2. Preprocessing Pipeline

Overview of the preprocessing pipeline for this work is described in Figure 1. The recommended preprocessing steps for RS FMRI analysis are described towards the end of this work (see Figure 7). We deviate from our recommended pipeline to accommodate the particulars of the data at hand, as detailed in the text below and in the flowchart in Figure 1.

2.2.1. Segmentation of T1-Weighted Images

T1-weighted images of individual subjects were aligned to the first frame of FMRI echo planar imaging (EPI) data of resting scans and segmented into gray, white, and cerebrospinal fluid tissue classes using AFNI’s “3dSeg” program [6].

2.2.2. Despiking, Slice-Timing, and Head-Motion Correction

Despiking was done with AFNI’s “3dDespike” program for Figures 2 and 4(e), as the first step of the preprocessing pipeline. Each voxel’s time series is fit to a Fourier series of order , defaulting to 1/30 of the number of time points: where is the duration of time series, the parameters , , , , and are chosen to minimize the sum over of ( regression), and is the EPI time series of each voxel. The value of used herein is , where is the number of time points. The median absolute deviation (MAD) of the residuals is used to obtain a standard deviation estimate that is robust to outliers:

For each voxel value, define as follows: and values with greater than the threshold value of for a spike () are replaced with a value that yields a modified : where is the upper range of the allowed deviation from . is mapped to . By default parameters and are set to 2.5 and 4, respectively, although program “3dDespike” allows users to modify them. With the default parameters, despiking consists of transforming spike values from the range of [, ) to [, ). The purpose of this transformation is to make the output data be continuous in the input data: small changes in the input (e.g., a value going from slightly under a threshold to slightly over) will not produce large changes in the despiked output. Slice-timing correction was performed, and motion correction was done by rigid body registration of EPI images to a base image [7]. Alignment of EPI data to the T1 was accomplished via an affine transformation, as was the spatial normalization of the T1 to the MNI avg152 T1 template, in MNI stereotaxic coordinates. All 3 transformations were applied at once to the EPI data to prevent multiple resampling steps.

Despiking was skipped in [2]. In practice, however, we find that despiking appears to improve the results of volume registration over time as illustrated in Figure 2 (also see supplementary video S1 available on line at http://dx.doi.org/10.1155/2013/935154). With despiking, motion parameters are less variable and the alignment quality is superior when visually examined.

2.2.3. Nuisance-Removal Regression

Five types of nuisance regression models were compared in this study. All regressors were extracted from motion-corrected EPI data before spatial smoothing with an isotropic Gaussian smoothing kernel (full-width-at-half-maximum; FWHM = 6 mm). Extraction of tissue-based regressors prior to any spatial smoothing is essential, to avoid mixing data from different tissue types; this point, while obvious, is often not made in Methods sections of papers. Regressors in the first model, GS + MO, included the 6 motion estimates, the tissue-based averages (global signal, GS; white matter signal, WM; large ventricle signal, LV), and the first time difference of each of the aforementioned regressors. In addition, th-order Legendre polynomials were used to model slow baseline fluctuations. is automatically determined by the number of EPI time points in the AFNI program afni_proc.py and was set to 4 for the time series analyzed here (http://afni.nimh.nih.gov/pub/dist/doc/program_help/afni_proc.py.html) where is the EPI time series at a voxel , is a global signal (GS) calculated by averaging the time-series over all brain mask voxels, is the average signal of all white matter voxels, are the averaged time series of lateral ventricles (LV) masks, is the group of six regressors for motion correction parameters (three translation and three rotation), and is the group of detrending polynomials. The residual is the “cleaned” time series after subtracting the best-fit regression model of the nuisance variables from the original voxel time series. The second model, GS, excluded the 6 motion estimate regressors and their first difference terms as follows: The third model, MO, included motion estimates with their first difference terms but omitted any tissue-derived regressors and their first difference terms as follows: The fourth model was based on the model MO but included a localized and eroded WM regressor to form a local estimate of nuisance parameters while avoiding gray matter signals in the regions of interest as follows: where is a regressor of local WM signal for each voxel , which can be calculated by averaging signals in eroded WM with a local sphere mask ( mm) by the AFNI program 3dLocalStat. The fifth model Depike + MO + is based on the model MO +, but EPI data was despiked at the first stage of processing. These final two models are reduced variants of ANATICOR [3], lacking regressors of independently acquired physiological signals (not available in [2]) and the eroded large ventricles (LVe) regressors, as well as the despiking step for the fourth model (MO + ).

2.2.4. Censoring and Bandpass Filtering

We based the criterion for censoring on the Euclidian norm of the first time differences of motion estimates . This criterion has been part of the AFNI processing stream (afni_proc.py) and while not identical to the frame-wise displacement (FD) in [2], it serves the same function of eliminating data at time points when significant rapid motion is detected. At a threshold of 0.25 mm, we censored on average 17.6% of the time series (1.8% and 45.0% at minimum and maximum, resp.).

Though contrary to our recommendation in the Discussion section, we filtered the data with a bandpass filtering kernel ( Hz) after nuisance regression to avoid the degrees-of-freedom (DOFs) limitation for high movers because the EPI data had less samples than regression model parameters. This filtering was done via linear regression of sine/cosine basis functions, to avoid artifacts that would otherwise arise from the censoring process (e.g., assuming constant time steps or including censored data points).

2.2.5. Spatial Smoothing, and Orders of Preprocessing Steps

Figure 3 illustrates the effects of different processing steps on the spectral content of the time series. The preprocessing pipeline used in [2], shown to the left of Figure 3 as pipeline 1, included spatial smoothing (FWHM = 6 mm) and bandpass filtering (0.009–0.08 Hz) before regression. The first row shows the periodogram of slice-timing and head-motion corrected FMRI data, which were used as the common inputs to both pipelines 1 and 2. The other rows of each column are the periodograms of FMRI data as they are sequentially processed by subcomponents of the pipelines from top to bottom. Gray, black, blue, and red lines are spectral densities of GS, gray matter (GM), cerebrospinal fluid (CSF), and WM masks, respectively, which were averaged across the subjects. Not surprisingly, spatial smoothing can be done at any of these stages, as long as the tissue-based regressors are derived before spatial smoothing. The regression of nuisance parameters can also be carried out either before or after the bandpass filtering stage as long as the nuisance regressors are subject to the same bandpass filter. Otherwise, the regression step would reintroduce frequency components outside of the bandpass range as shown in the bottom row of column 1 [8].

2.3. Correlation Analysis for Seed Pairs

For each individual subject, the time series of 264 seed locations in standard brain space (MNI 152) were obtained from censored and uncensored data to produce two sets of 34,716 () Pearson correlation coefficients [2]. The uncensored correlation coefficients were subtracted from the motion-censored correlation coefficients. The correlation differences are plotted as a function of the Euclidean distance between the pairs of seed locations in Figure 4, and the nonlinear dependence on distance is referred to as the distance-dependent correlation bias. Note that for all models, we censored the same fraction of time points. We also examined the benefits of replacing Pearson correlation with Spearman’s rank correlation, which is more robust to the presence of outliers in the time series that may be induced by motion.

2.4. Fits of Nuisance Regressors

We examined the spatial distribution of variance captured by the nuisance regressors [4] and correlations between them. To this end, we computed (i) the marginal explained variance ( value) maps of the global signal (GS) and six head-motion estimates (MO) at each voxel in the brain (see Figure 5), and (ii) the cross-correlation matrix of the regressors to identify shared variance (see Figure 6). For these types of tests, the regressors were obtained from the time series that were despiked, slice-timing corrected, and volume registered.

3. Results

3.1. Distance-Dependent Correlation Bias after Different Preprocessing Steps

The distance-dependent correlation biases present after different preprocessing steps are shown in Figure 4, with results for the more standard Pearson correlation coefficient shown in the upper row. The distance-dependent bias with the GS + MO model (Figure 4(a)) mimic those obtained in [2]. The distance-dependent bias is captured by the curvilinear blue trace showing the average change in correlation after censoring. What this result indicates is that the correlation estimate can change considerably in the presence of motion and in a manner that depends on the interregional distance. In other words, GS + MO is sensitive to motion and by extension the censoring threshold, since eliminating points of high motion change the correlation values considerably. The desired trend for an estimate in these figures would be a flat line, preferably with zero mean and zero variance as a function of distance. With model GS, where motion estimates with their first difference terms were excluded, the bending of the mean curve was more pronounced than in Figure 4(a) (see Figure 4(b)). With model MO, which included motion estimates and their first differences but excluded tissue-based regressors, the bias was negative throughout and was more constant across interregional distances (Figure 4(c)). Figures 4(b) and 4(c) indicate that while the addition of GS makes the correlation estimate more sensitive to motion, the use of MO alone is not enough to yield a robust estimate of correlations. Most notably, however, when WMeLOCAL was added as a nuisance regressor to model MO (Figure 4(d)), the change in correlation became considerably less variable with distance and closer to zero. The addition of despiking further reduced the bias fluctuation as shown in Figure 4(e) where the nonlinear dependence of bias on distance was mostly eliminated; the mean bias was near zero and the variance of correlation change with censoring was the smallest of all five models tested. Thus of all models tested, Despike + MO + WMeLOCAL resulted in the correlation estimates with minimal sensitivity to the presence of motion.

The lower row of Figure 4 shows results when Spearman’s rank was used to compute the correlations. The trends are largely similar to those in Figure 4, with a small reduction in the scatter of correlation change for (a), (b), (c), and (d) panels where no despiking was performed. Not surprisingly, the use of Spearman’s rank had little effect when despiking was included in the processing stages (e).

3.2. GS + MO Regressor Fits
3.2.1. Explained Variances of Nuisance Regressors

The marginal explained variances ( values) of each regressor are presented in Figure 5. For the regression model GS + MO, the regressors (GS and MO) fit most brain regions and locations at the outer edge of the brain with high values () (see the column GS + MO in Figure 5). When we measured values for each regressor, a different pattern in the spatial locations fit by each regressor could be identified: (i) GS tended to fit GM, the sinuses, and mid-sagittal locations (yellow to red color overlays in the column GS in Figure 5; ), and (ii) MO captured variance more uniformly than GS throughout the brain, and the highest values were observed along the boundary between cortex and nonbrain areas (see the column MO in Figure 5). The areas with the high values of GS and MO seldom overlapped each other.

3.2.2. Cross-Correlation Matrix between Regressors

The cross-correlation matrix between regressors is shown in Figure 6. GS and its first-difference term () only have high correlations with one partial component (dP; the displacement along the anterior-to-posterior direction) of MO and its first difference term (), respectively.

4. Discussion

4.1. Correlation Bias Observed by Motion Censoring

It was reported in [2] that the presence of motion introduces a distance-dependent distortion of correlations, whereby correlations between neighboring voxels were biased differently than correlation between voxels that are more distant. The authors also proposed a version (dubbed “scrubbing”) of motion censoring as a method to mitigate the bias of motion on correlation estimates. The evidence that the distance-dependent bias was introduced by subject motion was summarized in graphs that show the change in correlation magnitude between a set of brain location pairs (regions-of-interest; ROIs) as time points affected by excessive motion were excluded from the correlation estimation. The censoring process reduced the distance-dependent bias. While we agree that censoring is a valid approach, we highlight the fact that the distance-dependent bias does not appear to be driven by the mere presence of motion, and that the particular choice of preprocessing stream considerably exacerbates this distance-dependent bias. To illustrate this effect, we began by reproducing the effects of data censoring on short- versus long-distance correlations. For Figure 4(a), preprocessing included regression of head motion parameters, tissue-based time series including the GS, and their first-order time differences. In a reproduction of the results in [2], we found that censoring differentially affects correlations between ROIs that are close together compared to those that are further apart. This bend in the distribution was considered in [2] as evidence that motion was behind this bias, since lessening the effects of motion through censoring in turn differentially affected correlation values between ROIs depending on their distance. However, this is not entirely the case. In Figure 4(b), we recomputed the correlation differences but without including the 6 motion estimates and their first differences, thereby amplifying the effect of censoring on the correlations. The scatter plot of the correlation difference increased in variance but the distance-dependent bias remained. The nonlinear trends in these scatter plots can be considered as a measure of the sensitivity of particular correlation estimates to motion. The ideal trend for a correlation estimate would be a flattened cloud with small constant variance and a constant bias of 0. In Figure 4(c), we brought back the motion estimates with their first differences but omitted any tissue-derived regressors and their first differences, most notable of which is the GS. With this model, the effect of censoring on the correlations became considerably less dependent on the inter-ROI distance. The correlation changes were also more uniformly negative, implying that sharp head motion tends to increase correlations prior to censoring (see also Gotts et al. [9]). Taking together the results of Figures 4(a)4(c), we can conclude that the addition of GS to the model exacerbates the distance-dependence of the correlation estimates on motion, with results that are more dependent on the level of motion censoring. For Figure 4(d), we repeated the analysis in Figure 4(c) with the additional inclusion of a local eroded white matter signal, a regressor that intends to measure local manifestations of artifacts (e.g., hardware artifacts resulting from faulty head coil channels, [3]) while avoiding regions with the (gray matter) signals of interest. Not only was the dependence on the inter-ROI distance much reduced, but also the mean and range of the correlation bias were closer to zero. Addition of a despiking step at the very beginning of the preprocessing pipeline (Figure 4(e); see also Satterthwaite et al. [10]) further improved these trends, resulting in correlation estimates that varied little with the censoring of high-movement data points. The despiking procedure is often used to dampen the effects of extreme signal deviations on motion correction and variance estimates, and it is essentially a mild form of censoring. While it is expected that despiked data will always result in smaller changes in correlation after censoring, the two operations are not interchangeable as despiking is performed independently for each voxel. In other terms, not all reduced spikes get flagged as high-motion points. In conclusion, with a preprocessing model including despiking as an initial processing step and WMeLOCAL, the correlation estimate was least sensitive to motion artifacts and, by extension, to censoring threshold levels. We emphasize that the fraction of time points censored was the same for all the models tested. Even when despiking was adopted in panel (e), we censored the same fraction of time points as in panel (a), (b), and (c). Therefore the fact that censoring had minimal effect on the correlation estimates suggests that the Despike + WMeLOCAL approach is more robust to motion contamination than all the other models and is consequently least sensitive to censoring threshold levels. These results suggest that ANATICOR, the physiological noise augmented form of Despike + WMeLOCAL, is not only useful for reducing local hardware artifacts, but also local manifestation of motion. While the basis of the benefit of WMeLOCAL in reducing the motion bias is not entirely clear, one possibility is that it provides some adaptation to small local changes in the magnetic field resulting from movement, which will affect the EPI time series. Lastly, we found that using Spearman rank instead of Pearson correlation was of little advantage for despiked time series but was of mild advantage for other conditions.

4.2. Suggested Preprocessing Pipeline

Figure 7 shows the pipeline we recommend for RS-FMRI analysis. Despiking FMRI data at the subject level is recommended to reduce the contribution of sudden spike signals to correlation estimates. Anecdotally, we also found it to improve the accuracy at the volume registration step (see the video in the Supplementary Material and Figure 2). Physiological denoising is carried out early in the processing pipeline because RETROICOR [11] nuisance models depend on the acquisition time of each slice relative to the cardiac and respiratory cycles. These nuisance regressors are projected from the time series immediately after the despiking step. Bandpass filtering should be applied to both data and regressors of no interest. Otherwise, frequency components in cut-off bands will be introduced back through the regressors of no interest. It is best to perform censoring, nuisance regression, and bandpass filtering simultaneously in one regression model. By simultaneously doing these three subprocesses in one general linear model, there is no conflict between bandpassing and censoring. Though not carried out in this work for lack of data (physiological measures were not taken in [2]), physiological denoising is highly recommended [11, 12], as physiological noise differences amongst the subjects can certainly lead to false inferences. In our recommended pipeline in Figure 7, we advocate bandpass filtering in the same model for nuisance regression. This manner of censoring can be handled readily, unlike in the pipelines 1 and 2 of Figure 3.

The regression model used here contains 6 motion estimates, their first difference terms, and WMeLOCAL only, since “global” tissue-based regressors (e.g., GS, average gray matter, GM, noneroded LV, and WM) can also cause group differences either by spreading hardware artifacts that are undetectable by visual inspection in FMRI data [3] or by corrupting the correlation matrix, as can be seen when using GSReg [5]. As long as care is taken to prevent the inclusion of gray matter signals of interest, tissue-based regressors such as eroded LV (LVe) and WMeLOCAL can be beneficial at reducing physiological and hardware artifacts and are part of our recommended ANATICOR [3] approach. The results presented here further demonstrate the utility of WMeLOCAL in helping to reduce head motion artifacts. In these data, we were not able to include time series from the LVe mask as a nuisance component because the erosion operation eliminated too many LV voxels in most subjects due to a combination of small brain size and relatively coarse EPI resolution.

4.3. Summary

In this work, we have demonstrated that the distance-dependent bias in correlations between ROIs reported by Power and colleagues [2] is not driven only by motion. It is considerably exacerbated by the regression of nonspecific, tissue-averaged time series such as the GS. Specifically, the use of GS as a nuisance regressor can increase the sensitivity of correlation estimates to motion and motion censoring levels. This constitutes another example of why GS and equivalent regressors should not be projected out of the data in RS-FMRI [5]. We also find that Despike + WMeLOCAL, a reduced version of our denoising approach dubbed ANATICOR [3], resulted in correlation estimates with minimal sensitivity to motion. While many in the field are rightfully concerned about the impact of motion on functional connectivity measures, these concerns can be effectively mitigated by the choice of appropriate preprocessing methods.

Acknowledgments

This study was greatly facilitated by the generous contribution of data by the authors of Power et al. 2012. The authors thank Kelly Barnes for helpful discussions. This research was supported by the NIMH and NINDS Intramural Research Programs of the NIH.

Supplementary Materials

S1. EPI timseries from an individual subject is shown in the upper row. Motion-corrected results without or with despiking are shown in the middle and lower rows, respectively. The movie corresponds to the motion estimates in Figure 2, and presents visible residual motion between the 60th and 65th time frames. This suggests that the more elevated motion estimates obtained without despiking are less accurate than those with despiking. The subject used is sub0015004 who had the largest head movements in the children group.

  1. Supplementary Material