Abstract
Diffusion MRI (DMRI) plays an essential role in diagnosing brain disorders related to white matter abnormalities. However, it suffers from heavy noise, which restricts its quantitative analysis. The total variance (TV) regularization is an effective noise reduction technique that penalizes noiseinduced variances. However, existing TVbased denoising methods only focus on the spatial domain, overlooking that DMRI data lives in a combined spatioangular domain. It eventually results in an unsatisfactory noise reduction effect. To resolve this issue, we propose to remove the noise in DMRI using graph total variance (GTV) in the spatioangular domain. Expressly, we first represent the DMRI data using a graph, which encodes the geometric information of sampling points in the spatioangular domain. We then perform effective noise reduction using the powerful GTV regularization, which penalizes the noiseinduced variances on the graph. GTV effectively resolves the limitation in existing methods, which only rely on spatial information for removing the noise. Extensive experiments on synthetic and real DMRI data demonstrate that GTV can remove the noise effectively and outperforms stateoftheart methods.
1. Introduction
DMRI is a unique way in the in vivo characterization of anatomical connectivity in the human brain [1]. However, it is frequently impeded by pronounced thermal noise due to its echoplanar acquisition strategies and low signaltonoise ratio (SNR) [2]. One practical way to reduce the noise variance of DMRI data is averaging several acquisitions [3]. Therefore, it requires an extended period that is not suitable for clinical settings. On the other hand, many postprocessing algorithms have been applied in DMRI denoising to improve the data quality. Most of them can be roughly grouped into transformbased and spatial approaches.
Transformbased denoising methods are frequently used in signal processing because it is generally believed that noise and signal are more easily distinguished in the transform domain. Nowak [4] proposed the wavelet domain filter for reducing Rician noise in MRI. This method filters the squared magnitude of MRI in that distribution changes from Rician to noncentral Chi (nc). Wood and Johnson [5] proposed an MRI denoised strategy based on the wavelet packet. This method exceptionally performs well on removing Rician noise with low SNR since wavelet packet provides a more compact signal representation than single wavelet decomposition. Other transformbased approaches, including curvelet transform [6] and blockmatching and 3D (BM3D) [7] denoising schemes, have been proven to tackle Rician noise in MRI data efficiently.
Spatial denoising methods utilize the neighbor information around the pixel to reduce the variance caused by noise. Henkelman [8] presented the pioneering study on estimating Gaussian noise in MR images and smoothened them by convolution filtering, but this method blurs sharp edges that provide crucial clinical information. Gerig et al. [9] applied nonlinear anisotropic diffusion (AD) filters to 2D and 3D MRI denoising. AD filters can remarkably leverage the MRIs’ quality in terms of persevering boundaries. However, AD methods assume that the processed MRI are piecewise constant and the noise is Gaussian distributed with zero means. Both of those mentioned above are not applicable for MRI. Tong et al. [10] proposed an improved AD algorithm that used an adaptive scheme to select parameters to reduce Rician noise in MRI. To avoid tuning numbers of parameters carefully, Awate and Whitaker [11] proposed a nonparametric neighborhood statistics technique for MRI denoising. They inferred the prior from the degraded MRI and the knowledge of the Rician noise model. The uncorrupted MRI statistics are modeled in a nonparametric Markov Random Field (MRF) and estimated by the expectationmaximization algorithm. Experiments show that this method can well preserve important features in brain MRI. Fabio et al. [12] defined a 3D local Gaussian MRF (LGMRF) that allows tuning the filter in an unsupervised way. The LGMRF model can automatically implement the regularization to find the tradeoff between noise reduction and detail preservation.
TVbased regularization [13] is one of the most influential denoising algorithms since it can effectively reduce the variance while preserving shape edges. Martin et al. [14] proposed a TVRician denoising model for MRI data by solving a semiimplicit numerical scheme. Liu et al. [15] extracted the noise map through a twostep waveletdomain estimation method, then denoised the MRI data based on a generalized TV model. Later, the authors extended their work by combining none local mean (NLM) filters [16]. The aforementioned methods mainly process the data in the spatial domain (i.e., space) while ignoring that DMRI data consists of spatial and diffusion wavevector space. These denoising algorithms lead to new smoothing artifacts caused by averaging over differential oriented signals, particularly in the highly curved white matter structures.
Recently, Chen et al. have proposed two novel NLMbased denoise methodologies in joint space. Before NLM denoising, for each point in space, they (1) defined a spherical patch from which they extracted the rotationinvariant features patch matching [17] and (2) performed graph framelet transforms to extract robust rotationinvariant features after encoding the space sampling domain using a graph [18]. Graphs have the ability to well modeling the data with irregular and complex structures [19]. In this paper, we represent the joint space DMRI data with a graph and associate signal coefficients with graph nodes. Then, we consider the denoising DMRI data on the graph as an optimization problem, which seeks to minimize a TV function based on graph edge signals. The main contribution of our GTV denoising framework includes threefold: (i) the proposed method avoids computing rotationinvariant features; (ii) the minimizing variation procedure harnesses not only over spatial space but also angular space, allowing information to be shared across DW images for effective denoising; (iii) the information in angular space represents structures oriented in different directions, so our method is expected to be more effective for denoising without introducing new artifacts.
The remainder of this paper is organized as follows. We describe the detail of the proposed method in Section 2 and evaluate the effectiveness on synthetic data and public real data in Section 3. Section 4 shows our further discussion of this work. Finally, we conclude this work in Section 5.
2. Method
2.1. Graph Representation
Given a graph , where represents the set of nodes and is an affinity matrix, which characterizes the relationships between every pair of nodes on graph. The graph signal is a mapping function that associates signal coefficients to each node of the graph.
We show how to represent DMRI data using a graph. Specifically, given a sampling point in spatioangular domain, which is determined by a spatial location and a gradient direction and diffusion weighting , we consider it as a node of the graph and define the affinity weight with another node as [20] where , , and are three parameters controlling the commitments of spatial, angular, and diffusion weight, respectively [21].
2.2. GTV of DMRI
We define the GTV of DMRI signal with respect to graph as where is a diffusion signal associated with spatial location , gradient direction , and diffusion weighting . is the spatioangular neighborhood of . We set where ; then, (2) becomes the standard norm .
To denoise the DMRI data, we propose where the quadratic fidelity term is to force the underlying clean data to stay close to the acquired noisy DMRI data , and tuning parameter controls the tradeoff between two parts of (3) [22]. Since (1) assumes the nearer nodes have a higher weight, (3) ensures they have lower dissimilarities. It is worth noting that our GTV takes the neighborhood region compared to traditional TV, which only considers the neighbors of a sample along Cartesian axes.
2.3. Optimization
Graph gradient is a linear operator (hereafter denoted as ) which associates the DMRI signal to the corresponding node, and consequently, (3) is usually referred to the primal problem since it can be expressed as the following form: where represents the fidelity term in (3) and it is convex and differential with a Lipschitz continuous gradient . (4) is related to the following dual problem [23]: where is the conjugate of function . It is worth noting that only a small number of neighbors are taken into account in the proposed graph (i.e., the number of edges is much smaller than that of nodes); therefore, (5) is much easier to solve than (4). The second term where is convex and has proximity operator [24].
Note that is the step size in the iterative optimization process and is typically set to the inverse of the Lipschitz constant, and denotes the Hadamard product. With the aid of these tools, we can recover the unique solution of (3) via a forwardbackwardbased primaldual algorithm [23, 24].

3. Experiments
3.1. Datasets
We evaluate the proposed GTV based on a synthetic dataset and a subject from Traveling Human Phantom (THP) that was collected for a multisite neuroimaging reliability study [25]. The research was completed according to the endorsed guidelines.
In order to evaluate the denoising performance of the proposed method for various fiber structures, we use phantoms to produce a synthetic dataset [26]. The fiber geometric configuration is completely identical to that in ISBI 2013 HARDI challenge, and the gradient direction files are consistent with the THP dataset, i.e., using with a total of 30 noncollinear gradient directions. Image dimensions are with resolution.
THP data was acquired using the Siemens 3T TrioTim MR scanner with the following imaging protocol: imaging matrix, resolution, ms, and ms. values and gradient direction are the same as the synthetic data.
To evaluate the performance of our proposed approach, we add Gaussian distributed noise with zero mean and different deviation of and to the synthetic data. Some example images are shown in Figure 1.
(a) Ground truth
(b)
(c)
(d)
3.2. Parameter Settings
We have implemented Algorithm 1 with the help of opensource Python packages including PyGSP [27] and PyUNLocBox [28], which, respectively, facilitate a number of operations on graph and solvers on nondifferentiable convex optimization problems using proximal splitting methods [29]. The relative tolerance stopping criterion is set to , and the maximum number of iterations is set to [24]. Both of them are set automatically by PyUNLocBox. Larger values of tuning parameter lead to smoother solutions of (3) whereas smaller values emphasize better fitting to the required measurements. Hence, we set .
The proposed approach regards each sampling point in the spatioangular domain as a node. Therefore, DMRI data will lead to a large graph, requiring colossal memory, and GTV is difficult to converge. To overcome this weakness, we construct the graph by dividing the DMRI data into overlapping patches in spatial space, then optimize each patch simultaneously. Although a larger patch can capture more context information, we set the size of the patch to and the overlap step to according to our compute unit with 12 GB memory.
As mentioned in (1), construction of the graph involves three parameters: , , and . We normalize each exponent in exponential functions to as it was suggested in [21]. The maximum distance in a patch equals to 64, and therefore, we set . We fix parameters and , then evaluate the influence of parameter by measures of peak signaltonoise ratio (PSNR) and structural similarity index (SSIM) [30]. As indicated in Figure 2, the proposed method will achieve the best performance when .
3.3. Methods for Comparison
We compared GTV with the following stateoftheart methods: (1)Adaptive nonlocal means (ANLM): ANLM [31] is an improved version of the NLM algorithm, which is adaptive soft coefficient matching. Based on [31], we set the patch radius to 2(2)MarcenkoPastur PCA (MPPCA): MPPCA [32] classifies the principal components of the observed DMRI signal based on MarcenkoPastur distribution, then removes them as noise. Both ANLM and MPPCA are implemented by DIPY [33](3)XQ nonlocal means (XQNLM): XQNLM [17] denoises the signal via weighted averaging of selfsimilar information, which is defined in the spatialangular domain. The parameters are set to the default values as suggested in [17]
We transform the Rician signal to its Gaussiandistributed counterpart using Özarslan et al.’s method [34]. For a fair comparison, we estimate the nonstationary noise by MPPCA for ANLM, XQNLM, and GTV. Then, we determine the standard deviation of stationary noise via PIESNO [35].
3.4. Results
We first quantitatively show the denoise performance through measures including SSIM and PSNR in Figure 3 where the axial represents the noise level. Although GTV has achieved slightly higher SSIM than MPPCA and XQNLM, all of them are close to 1, meaning that denoised data structures closely resemble the original synthetic data. Regarding PSNR, GTV and XQNLM have outperformed MPPCA and ANLM, which indicates filtering in spatialangular space can remove more noise.
We randomly select two regions of interest (ROI) marked by red and blue rectangles, respectively. Closeup views of DW images and RMSE maps are presented in Figure 4. The first and third rows of DW images show that MPPCA, XQNLM, and GTV have outperformed ANLM. With the help of RMSE maps in the below rows, we can observe that the reconstruction error of GTV is smaller than MPPCA and XQNLM, which indicates GTV can restore the image more precisely.
One disadvantage of NLMbased methods is that the denoising process may add method noise which usually blurs the outputs. The region marked by the red square in Figure 5 is a part of the boundaries between white matter and the ventricles. Due to the consideration of information in the space, XQNLM and GTV have recovered the edges as MPPCA does. In contrast, ANLM has removed some noise, but the remaining is far from perfect. The region marked by the blue square in Figure 5 includes a contingent of boundaries, and it shows that GTV reconstructs the internal details while its boundary is the clearest.
Residual maps are used to evaluate whether structural information has been removed after denoising. In Figure 6, residual maps of MPPCA, XQNLM, and GTV have less structure information than that of ANLM. This observation provides evidence that ANLM has lower edgepreserving abilities in Figure 5 from another perspective.
We further evaluated the denoising performances through fiber ODFs. THP [25] is singleshell (with only a single image) DMRI data. Response functions for singlefiber white matter (WM), as well as gray matter (GM) and cerebrospinal fluid (CSF), were estimated from the denoised data using an unsupervised method [36]. Then, we performed SingleShell 3Tissue Constrained Spherical Deconvolution (SS3TCSD) [37] to obtain WMlike FODs as well as GMlike and CSFlike compartments in all voxels using MRtrix3Tissue, which is a fork of MRtrix3 [38]. The first row in Figure 7 shows a slice of FODbased directionally encoded color (DEC) maps [39]. The fiber ODFs in the region marked by the red rectangle indicate more coherence after denoising. The region marked by a blue rectangle indicates that the proposed GTV has more effectively reduced spurious fiber peaks that result from noise.
4. Discussion
GTV can remove the noise more effectively, which may be attributed to the following aspects: (1)DMRI uses space information to characterize the direction and scale of the diffusion for water molecules in the tissues. The proposed GTV takes advantage of neighborhood similarity information in the spatioangular domain while constructing a graph(2)Graph representation is a versatile model where nodes are associated with DMRI signal intensity, and edges reflect structural information
Although the proposed algorithm is very effective for denoising, it is not superior to the recent novel DMRI denoising method Patch2Self [40] which denoises DMRI with a selfsupervised deep learning strategy. Furthermore, there may be two limitations in this study. First, taking each sample in spatioangular space as an independent node leads to a big graph, which requires a large amount of memory for storing graph properties, including affinity weighted matrix and Fourier basis. It can be relieved by dividing the data into patches [21]. Second, optimization of total variation takes a relatively long time. We speed up the denoising by using the multicore CPUs to optimize each patch simultaneously. Possible solutions include refactoring the software using C++ and combining a twostep optimization approach [16] developed to solve the resulting convex denoising GTV model.
5. Conclusion
In this study, we formulated denoising processing as an optimization problem, finding the DMRI data with minimal graph total variation. Both spatioangular information of DMRI data were incorporated to construct the graph, which significantly contributed to this paper. The performances of our proposed method were assessed via experiments on synthetic and real data. Numerical results demonstrate that GTV outperforms various current stateoftheart approaches in terms of preserving edges and removing noise. Future works may extend GTV to patch GTV that associates values in a patch with a graph node.
Data Availability
The real data is downloaded from https://openneuro.org/datasets/ds000206/versions/1.0.0/download.
Conflicts of Interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Acknowledgments
The authors thank Dr. Geng Chen who provided the XQNLM source codes and guidelines for experiments and Dr. Hans J. Johnson who provided the data to OpenfMRI database. This work was supported in part by the National Science Fund of China under Grant No. 61806098.