Abstract

Various inverse algorithms have been proposed to estimate brain electrical activities with magnetoencephalography (MEG) and electroencephalography (EEG). To validate and compare the performances of inverse algorithms, many researchers have used artificially constructed EEG and MEG datasets. When the artificial sources are reconstructed on the cortical surface, accuracy of the source estimates has been difficult to evaluate. In this paper, we suggest a new measure to evaluate the reconstructed EEG/MEG cortical sources more accurately. To validate the usefulness of the proposed method, comparison between conventional and proposed evaluation metrics was conducted using artificial cortical sources simulated under different noise conditions. The simulation results demonstrated that only the proposed method could reflect the source space geometry regardless of the number of source peaks.

1. Introduction

Noninvasive measurements of brain electrical activity with electroencephalography (EEG) and magnetoencephalography (MEG) have enabled us to estimate the underlying cortical activity, thereby contributing to the rapid development of clinical and cognitive neuroscience. Estimating cortical electrical activity from EEG and MEG, a process often called EEG/MEG source imaging, requires solving a highly underdetermined inverse problem with linear or nonlinear inverse algorithms. Because the problem is ill-posed, the resulting source estimates are generally blurry or inaccurately positioned [1]. Many mathematical approaches have been proposed to estimate source locations and strengths more accurately. Among them, the minimum-norm estimate (MNE) has been the most widely studied inverse algorithm because it is simple and linear [2]. MNE chooses the source distribution that minimizes the $L_2$-norm of the current distribution. In contrast, the minimum current estimate (MCE) selects the source distribution that minimizes the $L_1$-norm of the current [3]. Beyond these two representative algorithms, there have been several modifications of norm minimization, for example, the low-resolution electromagnetic tomography (LORETA) algorithm [4] and the focal underdetermined system solution (FOCUSS) algorithm [5].

When a new source imaging algorithm is proposed, its performance needs to be verified and compared with that of existing algorithms. Since reconstructed source distributions are hard to verify with in vivo experiments, many researchers have used artificial EEG/MEG human skull phantoms [6] or realistically simulated EEG/MEG datasets. Because simulated EEG/MEG data allow noise levels and source configurations (that is, the number and size of source patches) to be readily adjusted and controlled, most inverse algorithms are verified using simulated EEG/MEG data [2–5]. In recent simulation studies, the source space is generally constrained to the interface between the white and gray matter of the cerebral cortex, commonly called the cortical surface, on neurophysiological grounds. The orientations of the cortical sources are also assumed to be perpendicular to the cortical surface [7]. In such simulations, both the original source patches and the reconstructed sources are distributed on the cortical surface, which is generally tessellated with triangular surface elements.

To evaluate the reconstructed sources, evaluation metrics (error metrics) are needed to quantify the similarity between the simulated and reconstructed sources. The well-known evaluation metrics are the root mean square error (RMSE), the shift of the maximum (SM), the shift of the center of mass (SCM), and the correlation coefficient (CC) [6]. Each metric has its own pros and cons. In contrast to the conventional geometric error metrics SM and SCM, RMSE and CC do not reflect the geometry of the cortical surface. However, compared to SM and SCM, RMSE and CC remain reliable when the source distribution is not concentrated in a single peak. For a more accurate and robust assessment of reconstructed EEG/MEG sources, we modified CC by applying geodesic distance weights to the reconstructed sources so that the geometric information of the cortical surface is reflected. To validate the new evaluation metric, named the weighted correlation coefficient (WCC), some representative examples were used.

2. Methods

2.1. EEG/MEG Inverse Problem

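When a set of possible source locations and sensor positions is given, thanks to the linearity of Maxwell's equations, an EEG/MEG forward model can be described as $\mathbf{b} = \mathbf{L}\mathbf{j} + \mathbf{e}$, where $\mathbf{L}$ is an $m$ by $n$ EEG/MEG lead field matrix, $\mathbf{j}$ is an $n$ by 1 unknown source vector, $\mathbf{b}$ is an $m$ by 1 vector of recorded EEG/MEG data, and $\mathbf{e}$ is the additive sensor noise. The inverse problem of estimating $\mathbf{j}$ from $\mathbf{b}$ has no unique solution. To select one solution, MNE adopts the following minimization problem:

$$\hat{\mathbf{j}} = \arg\min_{\mathbf{j}} \left\{ \lVert \mathbf{b} - \mathbf{L}\mathbf{j} \rVert_2^2 + \lambda \lVert \mathbf{j} \rVert_2^2 \right\}$$

Then, the estimated solution can be written as

$$\hat{\mathbf{j}} = \mathbf{L}^{T} \left( \mathbf{L}\mathbf{L}^{T} + \lambda \mathbf{I}_m \right)^{-1} \mathbf{b},$$

where $\mathbf{I}_m$ is the $m$ by $m$ identity matrix and $\lambda$ is a regularization parameter, which was determined using the generalized cross-validation method [8].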
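As an illustration, the regularized minimum-norm solution $\mathbf{L}^{T}(\mathbf{L}\mathbf{L}^{T} + \lambda \mathbf{I})^{-1}\mathbf{b}$ can be sketched in a few lines of NumPy. This is a minimal sketch, not the implementation used in this study: the function name, the random toy lead field, and the fixed value of the regularization parameter are assumptions made for illustration, and the generalized cross-validation step is not included.

```python
import numpy as np

def minimum_norm_estimate(lead_field, data, lam):
    """Tikhonov-regularized minimum-norm estimate (hypothetical helper).

    lead_field: (m, n) array, data: (m,) array, lam: regularization (> 0).
    Returns the (n,) estimate L^T (L L^T + lam I)^{-1} b.
    """
    m = lead_field.shape[0]
    gram = lead_field @ lead_field.T + lam * np.eye(m)
    # Solve the m x m system rather than forming an explicit inverse.
    return lead_field.T @ np.linalg.solve(gram, data)

# Toy problem: 8 sensors, 50 candidate sources (arbitrary random lead field).
rng = np.random.default_rng(0)
L = rng.standard_normal((8, 50))
j_true = np.zeros(50)
j_true[10] = 1.0
b = L @ j_true  # noise-free data
j_hat = minimum_norm_estimate(L, b, lam=1e-8)
```

Because the system is underdetermined (50 unknowns, 8 measurements), `j_hat` reproduces the data almost exactly but is spread over many nodes instead of matching the single active node in `j_true`, which is the blurring that the evaluation metrics are meant to quantify.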

2.2. Conventional Evaluation Metrics

We assume that both the simulated true sources $\mathbf{j}$ and the estimated sources $\hat{\mathbf{j}}$ are distributed on the 3D cortical surface. Both vectors are $n$ by 1, where $n$ is the number of nodes in the source space. We first summarize four conventional evaluation metrics that have been frequently used to assess the accuracy of source estimates.

2.2.1. Root Mean Square Error

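The root mean square error (RMSE) is the most well-known and convenient way to measure the error between the actual source $\mathbf{j}$ and the estimated source $\hat{\mathbf{j}}$. RMSE is formulated as

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( j_i - \hat{j}_i \right)^2},$$

where $j_i$ and $\hat{j}_i$ are the $i$th elements of $\mathbf{j}$ and $\hat{\mathbf{j}}$, respectively, and $n$ is the number of nodes in the source space.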

This metric is easy to implement and can be used regardless of the shapes of the source distributions. However, RMSE does not reflect the geometry of the cortical surface, since it is computed from the vector entries alone, without reference to the node positions.
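The insensitivity of RMSE to geometry can be seen in a small sketch. The six-node source space and the three source vectors below are hypothetical values chosen for illustration: one estimate is displaced by a single node and the other by five nodes, yet both receive the same RMSE.

```python
import numpy as np

def rmse(j_true, j_est):
    """Root mean square error between two source vectors of equal length."""
    j_true = np.asarray(j_true, dtype=float)
    j_est = np.asarray(j_est, dtype=float)
    return float(np.sqrt(np.mean((j_true - j_est) ** 2)))

# Six nodes; the true source is active at node 0 only.
j_true = [1, 0, 0, 0, 0, 0]
near = [0, 1, 0, 0, 0, 0]  # estimate displaced by one node
far = [0, 0, 0, 0, 0, 1]   # estimate displaced by five nodes

rmse_near = rmse(j_true, near)
rmse_far = rmse(j_true, far)  # same as rmse_near: node positions are never used
```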

2.2.2. Shift of the Maximum of the Estimate

The shift of the maximum of the estimate (SM) is the simplest measure that reflects the geometry of the source space. SM is the distance between the locations where the actual and reconstructed sources reach their maximum intensities. If these maxima are located at $\mathbf{r}_{p}$ and $\mathbf{r}_{\hat{p}}$, respectively, where $\mathbf{r}_i$ is the coordinate of the $i$th node, then SM is defined as

$$\mathrm{SM} = \lVert \mathbf{r}_{p} - \mathbf{r}_{\hat{p}} \rVert$$

and ranges from 0 to $d_{\max}$, the maximum distance within the brain.

This measure is reliable only when the actual source is concentrated around the location of the maximum source intensity, because it does not consider the distribution of the cortical sources. When the shift of the maximum (SM) is adopted as a measure, the merit of distributed source modeling disappears. For example, even when the extents of the true and reconstructed sources differ greatly, an identical maximum location makes the SM value 0.
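A short sketch makes this limitation concrete. The function name, the use of absolute intensity to locate the maximum, the Euclidean distance, and the five-node line of coordinates are all assumptions for illustration.

```python
import numpy as np

def shift_of_maximum(j_true, j_est, coords):
    """Euclidean distance between the nodes of maximum absolute intensity."""
    coords = np.asarray(coords, dtype=float)
    p = int(np.argmax(np.abs(j_true)))
    p_hat = int(np.argmax(np.abs(j_est)))
    return float(np.linalg.norm(coords[p] - coords[p_hat]))

# Five nodes on a line with unit spacing (hypothetical coordinates).
coords = [[0, 0, 0], [1, 0, 0], [2, 0, 0], [3, 0, 0], [4, 0, 0]]
focal = [0.0, 1.0, 0.0, 0.0, 0.0]      # true source: one active node
blurred = [0.6, 1.0, 0.6, 0.6, 0.6]    # widespread estimate, same peak node
shifted = [0.0, 0.0, 0.0, 0.0, 1.0]    # focal estimate, peak three nodes away

sm_blurred = shift_of_maximum(focal, blurred, coords)  # 0.0 despite the spread
sm_shifted = shift_of_maximum(focal, shifted, coords)  # 3.0
```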

2.2.3. Shift of the Center of Mass

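The center of mass has been widely used for evaluating various algorithms adopted not only in EEG and MEG but also in other functional brain imaging techniques such as functional magnetic resonance imaging (fMRI) and positron emission tomography (PET). The center of mass of the actual source, $\mathbf{r}_{\mathrm{cm}}$, and the center of mass of the reconstructed source, $\hat{\mathbf{r}}_{\mathrm{cm}}$, are computed as

$$\mathbf{r}_{\mathrm{cm}} = \frac{\sum_{i=1}^{n} \lvert j_i \rvert \, \mathbf{r}_i}{\sum_{i=1}^{n} \lvert j_i \rvert}, \qquad \hat{\mathbf{r}}_{\mathrm{cm}} = \frac{\sum_{i=1}^{n} \lvert \hat{j}_i \rvert \, \mathbf{r}_i}{\sum_{i=1}^{n} \lvert \hat{j}_i \rvert},$$

where $j_i$ and $\hat{j}_i$ are the actual and reconstructed source intensities at the $i$th of $n$ nodes and $\mathbf{r}_i$ is the coordinate of that node.

Treating each distributed source as a dipole source placed at its center of mass, the shift of the center of mass (SCM) is defined as the distance between $\mathbf{r}_{\mathrm{cm}}$ and $\hat{\mathbf{r}}_{\mathrm{cm}}$:

$$\mathrm{SCM} = \lVert \mathbf{r}_{\mathrm{cm}} - \hat{\mathbf{r}}_{\mathrm{cm}} \rVert$$

SCM is similar to the shift of the maximum (SM) in that the distributed source is treated as a point source placed at a single location. Therefore, SCM is also reliable only when the simulated source is concentrated around $\mathbf{r}_{\mathrm{cm}}$. If the source distribution is radially symmetric, SCM becomes equivalent to SM.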
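The following sketch illustrates why a center-of-mass measure can mislead for sources with more than one peak. It assumes intensity-weighted centers of mass using absolute values and Euclidean distance; the function names and the five-node example are hypothetical.

```python
import numpy as np

def center_of_mass(j, coords):
    """Intensity-weighted mean position (weights are absolute intensities)."""
    w = np.abs(np.asarray(j, dtype=float))
    coords = np.asarray(coords, dtype=float)
    return (w[:, None] * coords).sum(axis=0) / w.sum()

def shift_of_center_of_mass(j_true, j_est, coords):
    """Euclidean distance between the two centers of mass."""
    diff = center_of_mass(j_true, coords) - center_of_mass(j_est, coords)
    return float(np.linalg.norm(diff))

# Five nodes on a line with unit spacing (hypothetical coordinates).
coords = [[0, 0, 0], [1, 0, 0], [2, 0, 0], [3, 0, 0], [4, 0, 0]]
two_peaks = [1.0, 0.0, 0.0, 0.0, 1.0]    # true source: peaks at both ends
middle_peak = [0.0, 0.0, 1.0, 0.0, 0.0]  # estimate: single peak in the middle

scm = shift_of_center_of_mass(two_peaks, middle_peak, coords)
# scm is 0.0 although the estimate misses both true peaks.
```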

2.2.4. Correlation Coefficient

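The correlation coefficient (CC), a concept adopted from statistics, is a measure of linear dependency between two variables, and its value ranges between −1 and 1. It has been widely employed as a standard measure in various fields of engineering and science. The conventional CC is defined as the covariance of $\mathbf{j}$ and $\hat{\mathbf{j}}$ divided by the product of their standard deviations:

$$\mathrm{CC} = \frac{\mathrm{cov}(\mathbf{j}, \hat{\mathbf{j}})}{\sigma_{\mathbf{j}} \, \sigma_{\hat{\mathbf{j}}}},$$

where the covariance is defined as

$$\mathrm{cov}(\mathbf{j}, \hat{\mathbf{j}}) = \frac{1}{n} \sum_{i=1}^{n} \left( j_i - \bar{j} \right) \left( \hat{j}_i - \bar{\hat{j}} \right)$$

and $\bar{j}$ represents the mean value of the source $\mathbf{j}$:

$$\bar{j} = \frac{1}{n} \sum_{i=1}^{n} j_i$$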

If the distribution of the reconstructed sources is similar to that of the actual sources, the value of CC is close to 1. As the two distributions become less similar, CC decreases, approaching 0 for linearly unrelated distributions and −1 for inversely related ones. CC is reliable even when the source distribution is not concentrated in a single location or when the true source has many distinct peaks. However, similar to RMSE, CC cannot reflect the real geometry of the cortical surface.
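A brief sketch shows both properties: CC equals 1 for a rescaled copy of the true source, but it assigns the same value to a nearby and a distant non-overlapping estimate. The six-node vectors are hypothetical, and the population (1/n) normalization is assumed for both the covariance and the standard deviations.

```python
import numpy as np

def correlation_coefficient(j_true, j_est):
    """Pearson correlation of two source vectors (1/n normalization)."""
    a = np.asarray(j_true, dtype=float)
    b = np.asarray(j_est, dtype=float)
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).mean() / (a.std() * b.std()))

j_true = [1, 0, 0, 0, 0, 0]
scaled = [2, 0, 0, 0, 0, 0]  # same shape, different amplitude
near = [0, 1, 0, 0, 0, 0]    # non-overlapping, adjacent node
far = [0, 0, 0, 0, 0, 1]     # non-overlapping, most distant node

cc_scaled = correlation_coefficient(j_true, scaled)  # close to 1
cc_near = correlation_coefficient(j_true, near)
cc_far = correlation_coefficient(j_true, far)        # same as cc_near up to rounding
```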

2.3. New Algorithm: Weighted Correlation Coefficient

When the four conventional measures described in Section 2.2 are categorized in terms of geometric consideration, the shift of the maximum (SM) and the shift of the center of mass (SCM) reflect the geometry of the cortical surface, whereas RMSE and CC do not. However, RMSE and CC are more reliable than SM and SCM when the source distribution is not concentrated in a single peak. To combine the advantages of both types of conventional measures, we modified CC by giving the source vector a weight reflecting the geometric information of the cortical surface. The new evaluation measure, named the weighted correlation coefficient (WCC), is the correlation coefficient evaluated after the source vector is multiplied by an $n$ by $n$ weighting matrix $\mathbf{W}$. $\mathbf{W}$ is computed from the $n$ by $n$ identity matrix $\mathbf{I}$, an $n$ by $n$ distance matrix $\mathbf{D}$ whose element $d_{ik}$ is the distance between the $i$th and $k$th nodes, and $d_{\max}$, the maximum value in $\mathbf{D}$. Either the Euclidean distance or the geodesic distance can be employed to obtain the distance matrix. The geodesic distance was computed by solving the Eikonal equation on the tessellated cortical surface [9]. The main diagonal of the weight matrix is filled with 1, and the off-diagonal elements take values between 0 and 1. Multiplying the source vector by the weight matrix thus incorporates the geometric information of the cortical surface.

Either the Euclidean or the geodesic distance can be employed in the definition of the distance matrix. Since the cortical surface of the human brain is folded, the geodesic distance is more suitable than the Euclidean distance for reflecting the geometric information of the cortical surface. The Euclidean distance is computed from the Cartesian coordinates regardless of the geometric features of the cortical surface. The geodesic distance, in contrast, is the minimum distance along the surface, so the geodesic distance between two adjacent gyri is greater than the Euclidean distance between them. Figure 1 shows an example of the Euclidean and geodesic distances between each cortical surface vertex and a reference point located in the right dorsolateral prefrontal cortex, corresponding to one column of the distance matrix.
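The difference between the two distances can be illustrated on a tiny folded geometry. The sketch below is an approximation under stated assumptions: it measures surface distance as the shortest path along mesh edges (Dijkstra's algorithm), which is a substitute for, not an implementation of, the Eikonal-equation solver cited above. The function names and the four-vertex "U" shape, standing in for two gyral walls facing each other across a sulcus, are hypothetical.

```python
import heapq
import numpy as np

def euclidean_distances(coords):
    """All-pairs straight-line distances between vertices."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def edge_path_distances(coords, edges):
    """All-pairs shortest-path lengths along mesh edges (Dijkstra)."""
    n = len(coords)
    adj = [[] for _ in range(n)]
    for a, b in edges:
        w = float(np.linalg.norm(coords[a] - coords[b]))
        adj[a].append((b, w))
        adj[b].append((a, w))
    dist = np.full((n, n), np.inf)
    for s in range(n):
        dist[s, s] = 0.0
        heap = [(0.0, s)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist[s, u]:
                continue  # stale queue entry
            for v, w in adj[u]:
                nd = d + w
                if nd < dist[s, v]:
                    dist[s, v] = nd
                    heapq.heappush(heap, (nd, v))
    return dist

# A "U"-shaped fold: vertices 0 and 3 face each other across the gap.
coords = np.array([[0.0, 1.0], [0.0, 0.0], [1.0, 0.0], [1.0, 1.0]])
edges = [(0, 1), (1, 2), (2, 3)]
D_euc = euclidean_distances(coords)
D_geo = edge_path_distances(coords, edges)
# D_euc[0, 3] is 1.0, while D_geo[0, 3] is 3.0 (down, across, and up).
```

On a real tessellated cortex, edge-path lengths overestimate the true geodesic distance somewhat because paths are restricted to mesh edges, which is one reason an Eikonal solver may be preferred.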

Once the distance matrix is computed, the weighting matrix is determined from (2.13). Figure 2 shows the Euclidean and geodesic weights corresponding to the Euclidean and geodesic distance matrices illustrated in Figure 1. In contrast to the distances in Figure 1, the weight decreases with increasing distance from the reference point.
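To make the idea of distance weighting concrete, the sketch below builds one weight matrix that satisfies the properties stated in this section (1 on the diagonal, values between 0 and 1 elsewhere, decreasing with distance) and uses it in a weighted correlation. Two points are assumptions and may differ from the exact definition used in this paper: the weight is taken as $w_{ik} = 1 - d_{ik}/d_{\max}$, and the same weight matrix is applied to both the actual and the reconstructed source vectors. All names and the six-node example are hypothetical.

```python
import numpy as np

def weight_matrix(dist):
    """Assumed form: w_ik = 1 - d_ik / d_max (1 on the diagonal)."""
    dist = np.asarray(dist, dtype=float)
    return 1.0 - dist / dist.max()

def correlation(a, b):
    """Pearson correlation with 1/n normalization."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).mean() / (a.std() * b.std()))

def weighted_cc(j_true, j_est, dist):
    """Assumption: both source vectors are weighted before correlating."""
    w = weight_matrix(dist)
    return correlation(w @ np.asarray(j_true, dtype=float),
                       w @ np.asarray(j_est, dtype=float))

# Six nodes on a line with unit spacing; pairwise distances |i - k|.
idx = np.arange(6)
D = np.abs(idx[:, None] - idx[None, :])
j_true = np.array([1.0, 0.0, 0.0, 0.0, 0.0, 0.0])
near = np.array([0.0, 1.0, 0.0, 0.0, 0.0, 0.0])
far = np.array([0.0, 0.0, 0.0, 0.0, 0.0, 1.0])

cc_near, cc_far = correlation(j_true, near), correlation(j_true, far)
wcc_near, wcc_far = weighted_cc(j_true, near, D), weighted_cc(j_true, far, D)
# The unweighted values coincide, whereas the weighted value is higher
# for the estimate that lies closer to the true source.
```

Passing a geodesic distance matrix instead of `D` changes only the weights, so the same function covers both the Euclidean and the geodesic variants.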

The characteristics of the conventional and proposed measures are summarized in Table 1. Low values of RMSE, the shift of the maximum (SM), or the shift of the center of mass (SCM) and high values of CC and WCC indicate accurate reconstruction. Only WCC is applicable to multipeak sources while also considering the geometry of the source space.

3. Results

To compare and verify the conventional and proposed measures, a simple two-dimensional example was simulated as shown in Figure 3. The source space was defined as a two-dimensional rectangle. The actual source distribution is given in Figure 3(a), and five reconstructed sources are given in Figures 3(b)–3(f), each of which was denoted as , , , , and . The source current intensities are indicated with different colors. If we evaluate the reconstructed sources by visual inspection, anyone would agree that is the most accurate reconstruction and is the second best one. seems to be the worst reconstruction, as its peak location is farthest from the actual one and no part of the reconstructed source overlaps with the actual one. and seem to be better matched than , but it is difficult to judge which result is better. The result has no commonly activated region with the actual source, but its distribution is close to the actual source distribution, whereas has a slightly overlapping region, but its other regions are located far from the actual source location. If we take visual inspection () as a qualitative measure, the rank of the reconstructed sources can be expressed as .

We then employed the conventional and proposed quantitative measures to evaluate the reconstructions depicted in Figure 3 and summarized the results in Table 2. All measures commonly indicated that is the best reconstruction and is the worst reconstruction. However, the different metrics gave different evaluation results for , , and . In the case of RMSE, was evaluated as the worst reconstruction, and and had an identical RMSE value, because RMSE is affected by the commonly activated regions regardless of the source geometry. In the case of the shift of the maximum (SM), which considers only the maximum location of the source, the results of and were equivalent. Similar to RMSE, CC classified as the worst reconstruction, and and had an identical CC value. Both the shift of the center of mass (SCM) and WCC ranked the reconstruction results identically to the visual inspection. However, if the actual source has multiple peaks, SCM cannot evaluate the reconstruction accurately.

We then applied the conventional and proposed evaluation metrics to distributed sources on the cortical surface. The cortical surface was extracted from a structural MRI of a standard brain atlas provided by the Montreal Neurological Institute (MNI). To extract and tessellate the cortical surface, we used CURRY6 for Windows (Compumedics, Inc., El Paso, TX). The actual source defined on the cortical surface is given in Figure 4(a), and the reconstructed sources are given in Figures 4(b)–4(f), each of which was denoted as , , , and . By visual inspection, seems to be the most accurate reconstruction and seems to be the second best one. The two distributions and are very different from the actual source distribution. We can roughly estimate the rank of the reconstructed sources as .

The quantitative evaluation results with the conventional and proposed metrics are shown in Table 3. In the case of RMSE, the RMSE values corresponding to ~ were increasing, which coincided with the visual inspection results, but the increments were very small compared to the absolute values of RMSE. The results of the shift of the maximum (SM) and the shift of the center of mass (SCM) showed that is a better reconstruction than . Moreover, since the center of mass of was located near the actual source, SM and SCM of were less than those of . Subscripts $E$ and $G$ indicate that the Euclidean and geodesic distance matrices were adopted, respectively. CC also could not distinguish the difference between and . $\mathrm{WCC}_E$ and $\mathrm{WCC}_G$ are the WCC results when the Euclidean and geodesic distance matrices were adopted, respectively, to construct the weighting matrix. Both $\mathrm{WCC}_E$ and $\mathrm{WCC}_G$ proved reliable, since both results coincided well with the visual inspection results. However, compared to $\mathrm{WCC}_G$, $\mathrm{WCC}_E$ could not reflect the large difference between and , which are located in different hemispheres.

We performed extensive computer simulations to quantitatively compare the performance of the proposed measure with that of the conventional measures. We randomly selected 2,000 locations on the cortical surface and generated a single constant source patch at each location. Consequently, our computer simulations used 2,000 cortical patches with an average radius of 6 mm (±1.2 mm). After solving the MEG forward problem for each cortical patch, white Gaussian noise at different levels was added to the simulated MEG signals. The signal-to-noise ratio (SNR) was set from −10 dB to 30 dB. Each noisy MEG signal was reconstructed with the minimum norm method, and the reconstruction was then evaluated with the different measures. Table 4 shows the average accuracies of the conventional and proposed measures with respect to SNR. The reconstruction is expected to become more accurate as SNR increases. In the case of RMSE, although RMSE increases as SNR decreases, the differences among the 10, 20, and 30 dB cases are not as clear as those among the 0 and −10 dB cases. In the case of the geometric measures (the shift of the maximum and the shift of the center of mass, each computed with the Euclidean and geodesic distances), the relation between accuracy and SNR is not consistent with the expected tendency; the accuracy at low SNR is occasionally higher than that at high SNR. In the case of CC, the values at low SNR (10, 0, and −10 dB) differ only marginally from those at high SNR. Only the accuracy measured by WCC decreases consistently as SNR becomes lower.
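The noise-addition step of such a simulation can be sketched as follows. The SNR definition used here (ratio of mean signal power to mean noise power, in dB) is an assumption, since the exact definition is not stated above; the function name and the sinusoidal test signal are hypothetical stand-ins for simulated MEG data.

```python
import numpy as np

def add_white_noise(signal, snr_db, rng):
    """Add zero-mean white Gaussian noise to reach a target SNR in dB.

    Assumed definition: SNR = 10 * log10(signal power / noise power).
    """
    signal = np.asarray(signal, dtype=float)
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.standard_normal(signal.shape) * np.sqrt(noise_power)
    return signal + noise

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 100_000)
clean = np.sin(2.0 * np.pi * 10.0 * t)  # stand-in for a simulated sensor signal

measured_snr = {}
for snr_db in (-10, 0, 10, 20, 30):
    noisy = add_white_noise(clean, snr_db, rng)
    noise = noisy - clean
    measured_snr[snr_db] = 10.0 * np.log10(np.mean(clean ** 2) / np.mean(noise ** 2))
```

With this many samples, each value in `measured_snr` should lie close to its target; for short recordings the realized SNR fluctuates more from one noise draw to the next.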

4. Conclusion

The geometric measures (the shift of the maximum and the shift of the center of mass) could reflect the geometry of the source space, while the statistical measures (RMSE and CC) could be applied regardless of the distribution characteristics of the sources. In this paper, a new evaluation metric named WCC was proposed to combine the advantages of both types of evaluation metrics, and it showed better performance than the conventional metrics. From the extensive simulations, we conclude that the proposed measure is promising for evaluating the accuracy of reconstructed sources and of EEG/MEG inverse algorithms.