Computational and Mathematical Methods in Medicine

Volume 2018, Article ID 4254189, 9 pages

https://doi.org/10.1155/2018/4254189

## Kernel Principal Component Analysis of Coil Compression in Parallel Imaging

^{1}Computer Science and Engineering Technology Department, University of Houston-Downtown, Houston, TX 77002, USA

^{2}Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong 518055, China

Correspondence should be addressed to Haifeng Wang; nc.ca.tais@1gnaw.fh

Received 26 November 2017; Accepted 7 March 2018; Published 19 April 2018

Academic Editor: Sheng-Yu Peng

Copyright © 2018 Yuchou Chang and Haifeng Wang. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Phased arrays with many coil elements have been widely used in parallel MRI to accelerate imaging. However, reconstructing the missing data from such a large number of channels increases memory usage and computational cost. A number of techniques have been developed that linearly combine the physical channels to produce fewer compressed virtual channels for reconstruction. In this paper, a new channel compression technique based on kernel principal component analysis (KPCA) is proposed. The proposed KPCA method uses a nonlinear combination of all physical channels to produce a set of compressed virtual channels. This method not only reduces the computational time but also improves the reconstruction quality relative to using all channels. Taking the traditional GRAPPA algorithm as an example, it is shown that the proposed KPCA method achieves better quality than both PCA-based compression and reconstruction with all channels, while the calculation time is almost the same as that of the existing PCA method.

#### 1. Introduction

Parallel imaging methods [1, 2] have been widely used to accelerate MRI acquisitions. Phased arrays with a large number of coils (e.g., 128 channels) have been developed for parallel magnetic resonance imaging to improve the image quality of reconstruction and the sampling speed of acquisition [3–5]. On the other hand, the computational cost increases with the number of coils, especially for coil-based reconstruction methods such as GRAPPA [2]. A number of coil compression methods have been proposed [6–19] to reduce computational time. They can be divided into two categories, one based on the hardware approach [5] and the other based on the software approach [6–19]. Software-based coil compression methods provide a more flexible way to reduce the computational workload. For example, principal component analysis (PCA) has been applied to compress large coil arrays [10, 19]. The coil compression process produces a smaller set of virtual channels, each of which can be represented as a linear combination of the physical channels. The method has been successfully applied to most existing reconstruction methods such as SENSE [1], GRAPPA [2], and SPIRiT [20]. Existing coil compression methods have demonstrated that the number of channels can be greatly reduced without significant loss of SNR or image degradation, thereby increasing computational efficiency. In addition to saving computing time, PCA-based channel reduction methods have been shown to have noise reduction effects [10, 19, 21]. However, the denoising effect discussed in [21] did not yield a significant improvement.

The purpose of this work is to study the noise reduction capability of software-based coil compression methods and to achieve noise suppression and channel reduction simultaneously. We present a kernel PCA-based approach, which is a nonlinear extension of the conventional PCA method [10, 19]. In contrast to the linear combination used in conventional PCA, the proposed channel reduction technique nonlinearly combines the physical channels to generate a new, reduced set of virtual channels. The concept of nonlinear reconstruction using kernel methods has already been studied in nonlinear GRAPPA [22], where the advantages of nonlinear combination over linear techniques were demonstrated. The proposed kernel PCA (KPCA) method exploits nonlinear combination to reduce the number of dimensions and to enhance the quality of the coil channels more effectively. In the experiments, GRAPPA [2] is used as the example reconstruction method to obtain the final images from the compressed channels. When generating the same small number of virtual channels, the proposed KPCA reduces the GRAPPA calculation time to about that of the previous PCA-GRAPPA reconstruction [10], while the signal-to-noise ratio (SNR) is higher than that of both conventional GRAPPA [2] and PCA-GRAPPA [10].

#### 2. Background

Generally, the GRAPPA reconstruction [2] can be represented as

$$S_j(k_y + r\Delta k_y, k_x) = \sum_{l=1}^{L}\sum_{b=B_1}^{B_2}\sum_{h=H_1}^{H_2} w_{j,r}(l,b,h)\, S_l(k_y + bR\Delta k_y, k_x + h\Delta k_x), \tag{1}$$

where the unacquired $k$-space signal $S_j$ (the left side of (1)) is calculated by a linear combination of acquired $k$-space signals $S_l$ (the right side of (1)). Here, $w_{j,r}$ represents the coefficient set, $R$ is the outer reduction factor (ORF), $r$ is the offset of the unacquired line, $j$ is the target coil, $l$ counts all coils, $b$ and $h$ index the neighboring acquired $k$-space data in the $k_y$ and $k_x$ directions, respectively, and the variables $k_x$ and $k_y$ represent the coordinates along the frequency- and phase-encoding directions, respectively. The GRAPPA formulation can be simplified as a matrix equation:

$$\mathbf{D}\mathbf{x} = \mathbf{b}, \tag{2}$$

where **D** denotes the matrix consisting of the acquired data, **b** represents the vector of the missing data, and **x** represents the coefficients.

In general, the coefficients depend on the coil sensitivities, which are not known a priori. In GRAPPA, autocalibration signal (ACS) data are acquired and used as the vector **b** to estimate the coefficient vector **x**. The least-squares method is usually used to calculate the coefficients:

$$\hat{\mathbf{x}} = \arg\min_{\mathbf{x}} \|\mathbf{b} - \mathbf{D}\mathbf{x}\|_2^2. \tag{3}$$

At higher reduction factors, the matrix **D** changes (it typically becomes more poorly conditioned), and the noise in the estimated coefficients can be greatly amplified.
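The calibration step can be illustrated with a small numerical sketch. The sizes, the random stand-in for the ACS data, and all variable names below are assumptions made for illustration only; the snippet simply solves a least-squares problem of the form **Dx** = **b** with NumPy and reports the condition number of **D**, which governs how strongly noise in **b** is amplified in the estimated coefficients.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 200 calibration equations, 24 unknown GRAPPA weights.
n_eq, n_w = 200, 24

# Random complex matrix standing in for the acquired ACS neighborhoods (D).
D = rng.standard_normal((n_eq, n_w)) + 1j * rng.standard_normal((n_eq, n_w))

# Synthetic "true" weights and the corresponding noise-free ACS targets (b).
x_true = rng.standard_normal(n_w) + 1j * rng.standard_normal(n_w)
b = D @ x_true

# Least-squares estimate of the coefficients.
x_hat, *_ = np.linalg.lstsq(D, b, rcond=None)

# A large condition number indicates strong noise amplification.
cond_number = np.linalg.cond(D)
print(f"condition number of D: {cond_number:.2f}")
```

With real data, **D** would be assembled from the acquired neighbors of each ACS point rather than drawn at random.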

As a dimension reduction technique, PCA has been successfully used to reduce the number of effective channels in GRAPPA reconstruction [8, 9]. PCA finds an orthogonal linear transformation that converts the data to a new coordinate system such that the largest variance of any projection of the data lies on the first coordinate, the second largest variance on the second coordinate, and so on. When applied to channel reduction, the ACS data are used to obtain the transformation, which is then applied to all acquired data to obtain a new dataset in the new coordinate system. Mathematically, the linear transformation **W** can be calculated by the eigen-decomposition of the covariance matrix of the ACS data:

$$\mathbf{A}^{H}\mathbf{A} = \mathbf{W}\boldsymbol{\Lambda}\mathbf{W}^{H}, \tag{4}$$

where $\mathbf{A} = [\mathbf{a}_1, \ldots, \mathbf{a}_L]$ consists of the vectors $\mathbf{a}_l$ generated from the ACS data of the $l$th channel (a total of *L* channels) after removing the average; **W** and $\boldsymbol{\Lambda}$ are, respectively, the eigenvectors and eigenvalues of the covariance matrix. The new coordinates based on the eigenvectors are called principal components. It is assumed that the directions of largest variance carry the information of interest and that the directions of smallest variance represent noise. For simplicity, only the first few eigenvectors, corresponding to the largest eigenvalues, are retained to form a linear transformation **T**. The transformation matrix is then applied to the acquired $k$-space data to obtain their orthogonal projection onto the retained eigenvectors, resulting in a new, reduced set of virtual channels. The undersampled data are then reconstructed in the transform domain by conventional GRAPPA. Note that, as in [9], the number of source channels and the number of target channels may differ after PCA reduction. Either may be larger than the other, so that the combination giving the best result for the same calculation time can be chosen. The final image is produced by combining the virtual channels with root sum-of-squares (SoS).

Clearly, the assumption behind PCA does not necessarily hold, because the signal of interest may lie along a direction of small variance, in which case useful information is lost after the reduction.
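As a minimal sketch of PCA-based channel compression, the following NumPy code estimates the transformation from calibration data and applies it to channel data. The function name `pca_compress`, the array layout (one column per channel), and the synthetic rank-3 data are assumptions for illustration and are not taken from the original implementation.

```python
import numpy as np

def pca_compress(acs, data, n_virtual):
    """Compress L physical channels into n_virtual virtual channels.

    acs:  (N_acs, L) calibration samples, one column per channel.
    data: (N, L) acquired samples to be compressed.
    Returns the compressed data (N, n_virtual) and the transform T (L, n_virtual).
    """
    # Remove the per-channel average from the calibration data.
    centered = acs - acs.mean(axis=0, keepdims=True)
    # L x L Hermitian covariance-type matrix of the calibration data.
    cov = centered.conj().T @ centered
    # eigh returns eigenvalues in ascending order for Hermitian matrices.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    # Keep the eigenvectors with the largest eigenvalues.
    T = eigvecs[:, order[:n_virtual]]
    return data @ T, T

# Synthetic example: 8 channels whose signals span only 3 dimensions.
rng = np.random.default_rng(1)
S = rng.standard_normal((50, 3)) + 1j * rng.standard_normal((50, 3))
M = rng.standard_normal((3, 8)) + 1j * rng.standard_normal((3, 8))
A = S @ M
virtual, T = pca_compress(A, A, n_virtual=3)
print(virtual.shape)
```

Because the synthetic data have rank 3, three virtual channels retain all of the signal here; measured data would lose the energy carried by the discarded eigenvectors.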

#### 3. Proposed Method

##### 3.1. Kernel PCA

The kernel method [23] is widely used in machine learning. Its main idea is that a set of points that cannot be linearly separated in a low-dimensional space is likely to become linearly separable when mapped into a high-dimensional space. For a given linear algorithm, the data are mapped from the input space to the feature space *H* through a nonlinear mapping Φ(·), and the algorithm is then applied to the vector representation of the data in *H*. When the linear algorithm is PCA, the approach becomes the kernel PCA (KPCA) method.

PCA is a process of removing the correlation between attributes, where the correlation mainly refers to linear correlation. For nonlinear situations, kernel PCA (KPCA) [24] is used instead. Intuitively, kernel PCA is PCA dimensionality reduction performed in the kernel space after the original samples have passed through the kernel mapping. The derivation of KPCA is very similar to that of PCA, with two differences. First, to deal with nonlinear data better, a nonlinear mapping function is introduced to map the data from the original space into a high-dimensional feature space. Second, any vector in that space, even a basis vector, can be expressed as a linear combination of all samples in the space. After the kernel mapping, a linear PCA is performed on the new data in the feature space constructed from products of vector elements, thus taking higher-order statistics into account. We applied kernel PCA to parallel imaging reconstruction methods such as GRAPPA [2].

##### 3.2. Nonlinear Mapping Function

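To achieve a smooth relationship, an inhomogeneous polynomial kernel is selected for the Φ mapping. It has the following form:

$$\kappa(\mathbf{a}, \mathbf{b}) = (\gamma\,\mathbf{a}^{T}\mathbf{b} + r)^{d}, \tag{5}$$

where $\gamma$ and $r$ are scalars and $d$ represents the degree of the polynomial. Because the nonlinear mapping Φ(**A**) has an explicit representation, the polynomial kernel is also suitable for mapping MRI data. For instance, when $d$ is 2, Φ(**A**) maps the original *L*-channel data **A** to

$$\Phi(\mathbf{A}) = \left[r,\ \sqrt{2\gamma r}\,\mathbf{a}_1, \ldots, \sqrt{2\gamma r}\,\mathbf{a}_L,\ \gamma\,\mathbf{a}_1^{2}, \ldots, \gamma\,\mathbf{a}_L^{2},\ \sqrt{2}\gamma\,\mathbf{a}_1 \odot \mathbf{a}_2, \ldots, \sqrt{2}\gamma\,\mathbf{a}_{L-1} \odot \mathbf{a}_L\right], \tag{6}$$

where $\mathbf{a}_1, \ldots, \mathbf{a}_L$ are vectors representing the different channels, the superscript 2 means element-wise square, and $\odot$ denotes element-wise multiplication. The vector includes the constant, linear, and second-order terms of the original data, and Φ(**A**) has $(L+1)(L+2)/2$ terms in total.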
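The explicit second-order mapping can be checked numerically. The sketch below assumes the standard inhomogeneous polynomial kernel $(\gamma\,\mathbf{a}^{T}\mathbf{b} + r)^{2}$ with real-valued inputs and evaluates it for a single sample across $L$ channels; the function names and the scaling of each term are illustrative assumptions. It verifies that the inner product of two mapped vectors equals the kernel value and that the mapped vector has $(L+1)(L+2)/2$ terms.

```python
import math
from itertools import combinations

def poly2_feature_map(a, gamma=1.0, r=1.0):
    """Explicit feature map for kappa(a, b) = (gamma * a.b + r)**2, real inputs."""
    n = len(a)
    const = [r]                                           # constant term
    linear = [math.sqrt(2 * gamma * r) * ai for ai in a]  # first-order terms
    squares = [gamma * ai * ai for ai in a]               # square terms
    cross = [math.sqrt(2) * gamma * a[i] * a[j]           # cross-product terms
             for i, j in combinations(range(n), 2)]
    return const + linear + squares + cross

def poly_kernel(a, b, gamma=1.0, r=1.0, d=2):
    """Inhomogeneous polynomial kernel evaluated directly."""
    return (gamma * sum(x * y for x, y in zip(a, b)) + r) ** d

a = [1.0, 2.0, 3.0]
b = [0.5, -1.0, 2.0]
phi_a = poly2_feature_map(a)
phi_b = poly2_feature_map(b)
inner = sum(x * y for x, y in zip(phi_a, phi_b))
print(len(phi_a), inner, poly_kernel(a, b))
```

For $L = 3$ the mapped vector has 10 terms, and both the inner product and the direct kernel evaluation give 30.25 for the inputs above.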

To avoid overfitting, some of the second-order terms are removed. Specifically, the second-order terms are arranged in the following order: the square terms within each coil come first, followed by the product terms between nearest neighbors, then those between next-nearest neighbors in $k$-space, and so on. The vector Φ(**a**) is then truncated along this sorted order according to the desired dimension of the feature space. If all second-order terms are truncated, the proposed method is identical to the linear PCA-based channel compression algorithm.
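The ordering and truncation of the second-order terms can be sketched as follows. This is only an illustration under a simplifying assumption: neighbor distance is measured here by the difference between term indices, which stands in for whatever neighborhood definition is used in $k$-space. The function names are hypothetical.

```python
def ordered_second_order_terms(n):
    """Index pairs (i, j) of second-order terms in the assumed priority order.

    Squares (i, i) come first, then products with |i - j| = 1,
    then |i - j| = 2, and so on.
    """
    terms = [(i, i) for i in range(n)]
    for dist in range(1, n):
        terms += [(i, i + dist) for i in range(n - dist)]
    return terms

def truncate_terms(n, n_keep):
    """Keep only the first n_keep second-order terms of the sorted list."""
    return ordered_second_order_terms(n)[:n_keep]

print(ordered_second_order_terms(3))
print(truncate_terms(4, 5))
```

Keeping zero second-order terms (`n_keep = 0`) leaves only the first-order terms, which corresponds to the linear PCA case.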

The target channels correspond to the data on the left side of (1), and the source channels to the data on the right side of (1). The original space is kept for the target channels to avoid the complexity of converting the data from the feature space back to the original space. The source channels are used for estimation only, so there is no need to convert them back to the original space. For building the source channels, the number of second-order terms is chosen to be three times the number of first-order terms. MRI noise is generated by a very complicated procedure and can be considered to follow a non-Gaussian distribution [25]. Noise also exists in the sensitivities of the acquired channel data, so noise and true signal can be described by an errors-in-variables model [22]. The traditional linear space is mapped to a nonlinear feature space to capture the noise characteristics present in the coil sensitivities, and nonlinearity is added to modulate the sensitivities in the channel compression procedure. The benefit of the proposed method is that channel compression and noise suppression are achieved simultaneously in the reconstruction procedure.

To balance linearity and nonlinearity in the new coordinate system, the kernel parameters are finely tuned. If the nonlinearity dominates the coordinates, the reconstructed image is distorted because the original channel information is lost and overridden by the nonlinear terms. By contrast, if the nonlinearity is too small, the reconstruction is almost equivalent to the original PCA-based channel reduction method, so the nonlinearity has no effect on suppressing noise. The adjustable parameter *λ* is set to obtain better performance. The maximum absolute value of the second-order terms is identified for building the feature space, and the value is set within a range in which, based on experience, the reconstruction is insensitive to the exact choice.

##### 3.3. Proposed Algorithm

The proposed method is presented in the following steps.

*Step 1.* The calibration data are extracted as the input of KPCA for the target channels and source channels, respectively. The calibration data in each channel are arranged into a vector; therefore, there are $L$ vectors in total, corresponding to the $L$ channels of the original $k$-space data.

*Step 2.* The nonlinear mapping Φ is applied to the random variable **V** to construct the covariance matrices $\mathbf{C}_t$ and $\mathbf{C}_s$ of the target channels and source channels, respectively. The new vectors **U** are constructed as follows:

$$\mathbf{U} = \left[\mathbf{a}_1, \ldots, \mathbf{a}_L,\ \lambda\,\mathbf{a}_1 \odot \mathbf{a}_1, \ldots, \lambda\,\mathbf{a}_i \odot \mathbf{a}_j, \ldots\right], \tag{7}$$

where $\mathbf{a}_1, \ldots, \mathbf{a}_L$ denote the vectors obtained from the original $k$-space ACS data, and the second-order terms are the vectors obtained by point-wise multiplication $\odot$. For example, $\mathbf{a}_1^{2} = \mathbf{a}_1 \odot \mathbf{a}_1$. Furthermore, each column of **U** has $N$ entries, where $N$ is the total number of $k$-space data points in the central strip that is fully sampled at the *Nyquist* rate ($N = N_y \times N_x$, where $N_y$ is the number of phase-encoding lines fully sampled at the Nyquist rate and $N_x$ is the number of points along the frequency-encoding direction). Once the ACS lines, which are the fully sampled $k$-space data at the central strip, are defined, $N$ can be derived. The nonlinearity added to the new coordinates is controlled by the parameter $\lambda$. Since the target channels are used for the final reconstruction, they cannot incorporate large nonlinearity, so separate parameters $\lambda_t$ and $\lambda_s$ are tuned for constructing **U**_{t} and **U**_{s}, respectively. Generally, $\lambda_t$ is much smaller than $\lambda_s$.

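*Step 3.* The target and source vectors **U**_{t} and **U**_{s} produced in Step 2 are centered to zero mean, so that the leading principal directions are the directions of maximal variance. The zero-mean data are calculated as follows:

$$\tilde{\mathbf{u}}_i = \mathbf{u}_i - \frac{1}{N}\sum_{n=1}^{N} u_i(n), \tag{8}$$

where $\mathbf{u}_i$ is the $i$th column of **U**_{t} or **U**_{s} and $u_i(n)$ is its $n$th entry.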

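*Step 4.* For the target channels, the covariance matrix is generated as follows:

$$\mathbf{C}_t = \frac{1}{N-1}\,\tilde{\mathbf{U}}_t^{H}\tilde{\mathbf{U}}_t, \tag{9}$$

where $\tilde{\mathbf{U}}_t$ is the zero-mean version of **U**_{t} and each component $c_{ij}$ of $\mathbf{C}_t$ represents the covariance between the random variables $\mathbf{u}_i$ and $\mathbf{u}_j$. The parameter $\lambda_t$ can be set to zero to remain consistent with PCA-based channel reduction.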

*Step 5.* As in Step 4, the covariance matrix $\mathbf{C}_s$ is constructed for the source channels. The difference is that the parameter $\lambda_s$ is used, which is generally larger than the parameter $\lambda_t$ used in Step 4.

*Step 6.* Calculate the eigenvalues and eigenvectors by applying singular value decomposition (SVD) to the target and source covariance matrices, respectively. Because the nonlinear mapping is applied explicitly, the kernel matrix of the kernel trick does not need to be computed, and SVD can be used directly to calculate the eigenvalues and eigenvectors, as in conventional linear PCA [23]. The transformation matrix **T** is composed of the eigenvectors of the covariance matrix and transforms the data in **U**_{t} and **U**_{s} into the new coordinates.

*Step 7.* Generate the transformed data in the new coordinates for the target channels and source channels, respectively. As in [26], the numbers of target and source virtual channels need not be equal. In the calibration step, the signals on the left side of (1) are taken from the target channels and those on the right side of (1) from the source channels to calculate the weights. In the synthesis step, the calculated weights and the acquired data on the source channels are combined to predict the missing values on the target channels, which are used for the final image reconstruction.
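The compression part of the steps above (Steps 2 to 7, without the GRAPPA calibration and synthesis) can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names, the random stand-in for the ACS data, the choice of second-order terms, the $1/(N-1)$ normalization, and the values of `lam` are all hypothetical.

```python
import numpy as np

def build_features(acs, lam, pairs):
    """Stack first-order channels with lam-scaled point-wise products.

    acs: (N, L) array, one column per channel; pairs: list of (i, j) indices.
    """
    second = [lam * acs[:, i] * acs[:, j] for i, j in pairs]
    if not second:
        return acs.copy()
    return np.column_stack([acs] + second)

def kpca_transform(U, n_virtual):
    """Center U, form its covariance matrix, and keep the leading eigenvectors."""
    mean = U.mean(axis=0, keepdims=True)
    Uc = U - mean
    C = Uc.conj().T @ Uc / (U.shape[0] - 1)
    # C is Hermitian positive semidefinite, so its SVD yields eigenvectors
    # (left singular vectors) and eigenvalues (singular values, descending).
    W, s, _ = np.linalg.svd(C)
    return W[:, :n_virtual], s[:n_virtual], mean

rng = np.random.default_rng(0)
N, L = 64, 6
acs = rng.standard_normal((N, L)) + 1j * rng.standard_normal((N, L))

# Target channels: lam = 0 (no second-order terms), i.e. the linear PCA case.
U_t = build_features(acs, 0.0, [])
# Source channels: square terms only, with a small hypothetical lam.
U_s = build_features(acs, 0.1, [(i, i) for i in range(L)])

T_t, ev_t, mean_t = kpca_transform(U_t, 3)
T_s, ev_s, mean_s = kpca_transform(U_s, 4)

# Transformed (virtual) channels in the new coordinates.
virtual_t = (U_t - mean_t) @ T_t
virtual_s = (U_s - mean_s) @ T_s
print(virtual_t.shape, virtual_s.shape)
```

The numbers of target and source virtual channels (3 and 4 here) are chosen independently, mirroring the statement that they need not be equal.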

#### 4. Results

We validated the performance of the proposed algorithm using three MRI datasets. First, a uniform water phantom was scanned using a gradient echo (GRE) sequence (15.63 kHz bandwidth, FOV = 250 mm^{2}, matrix size = 256 × 256, TE/TR = 10/100 ms, and slice thickness = 3.0 mm). Then, a coronal brain image was acquired using a 2D spin echo (SE) sequence (slice thickness = 3.0 mm, matrix size = 256 × 256, FOV = 240 mm^{2}, and TE/TR = 2.29/100 ms). The third dataset, of an axial brain, was acquired on a 3T scanner (Siemens AG, Erlangen, Germany) with a 32-channel head coil using a 2D gradient echo sequence (TE/TR = 2.29/100 ms, flip angle = 25°, matrix size = 256 × 256, slice thickness = 3 mm, and FOV = 24 cm^{2}). Conventional GRAPPA [2] and PCA-based GRAPPA [10] were implemented on the Matlab platform (Mathworks, Natick, MA, USA) for comparison with the proposed method. For the reference image, fully sampled data from all channels were reconstructed via root sum of squares (SoS).
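For readers unfamiliar with the root sum-of-squares combination used for the reference image, a minimal NumPy sketch is given below. It assumes fully sampled, centered Cartesian $k$-space stored as an array of shape (channels, $N_y$, $N_x$); the function name and the array layout are illustrative assumptions.

```python
import numpy as np

def root_sos(kspace):
    """Root sum-of-squares image from fully sampled multichannel k-space.

    kspace: complex array of shape (n_channels, ny, nx), with the k-space
    center in the middle of each 2D array.
    """
    axes = (-2, -1)
    # Inverse 2D Fourier transform of each channel, keeping images centered.
    images = np.fft.fftshift(
        np.fft.ifft2(np.fft.ifftshift(kspace, axes=axes), axes=axes), axes=axes
    )
    # Combine channels: square root of the sum of squared magnitudes.
    return np.sqrt(np.sum(np.abs(images) ** 2, axis=0))

# Synthetic check: build k-space from known channel images.
rng = np.random.default_rng(0)
imgs = rng.standard_normal((4, 8, 8)) + 1j * rng.standard_normal((4, 8, 8))
ksp = np.fft.fftshift(
    np.fft.fft2(np.fft.ifftshift(imgs, axes=(-2, -1)), axes=(-2, -1)),
    axes=(-2, -1),
)
combined = root_sos(ksp)
print(combined.shape)
```

The same function can be applied to compressed virtual channels, since it only requires one $k$-space array per channel.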

To measure the signal loss caused by channel compression, reconstructions based on KPCA and PCA channel reduction were first evaluated with fully sampled data. Both KPCA and PCA were applied to reduce the 32 channels to 10 channels without undersampling the $k$-space data. The compressed channels were then used to reconstruct the images with the inverse Fourier transform. Both reconstructed images were compared with the reference image obtained from the fully sampled data of all 32 channels. As shown in Figure 1, the reconstruction based on KPCA channel compression suppresses more noise in the region of interest (ROI) than the reconstruction based on PCA channel compression, as demonstrated in the difference images.