Computational and Mathematical Methods in Medicine

Volume 2015 (2015), Article ID 831790, 12 pages

http://dx.doi.org/10.1155/2015/831790

## A Model of Regularization Parameter Determination in Low-Dose X-Ray CT Reconstruction Based on Dictionary Learning

^{1}Medical Imaging Laboratory, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Suzhou 215163, China
^{2}Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun 130033, China
^{3}University of Chinese Academy of Sciences, Beijing 100049, China
^{4}PET Center, Huashan Hospital, Fudan University, Shanghai 200235, China

Received 13 March 2015; Accepted 11 June 2015

Academic Editor: Lin Lu

Copyright © 2015 Cheng Zhang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

In recent years, X-ray computed tomography (CT) has become widely used to reveal patients' anatomical information. However, the side effects of radiation, related to genetic and cancerous diseases, have caused great public concern. The problem is how to reduce the radiation dose significantly while maintaining image quality. As a practical application of compressed sensing theory, one category of methods takes total variation (TV) minimization as the sparse constraint, which makes it possible and effective to obtain a high-quality reconstructed image in the undersampling situation. On the other hand, a preliminary attempt at low-dose CT reconstruction based on dictionary learning seems to be another effective choice. However, some critical parameters, such as the regularization parameter, cannot be determined by inspecting the datasets. In this paper, we propose a reweighted objective function that leads to a numerical model for determining the regularization parameter. A number of experiments demonstrate that this strategy performs well, producing better reconstructed images while saving a large amount of time.

#### 1. Introduction

Nowadays, X-ray computed tomography (CT) remains an important biomedical imaging technology because the reconstructed image is of high spatial resolution and quality. Nevertheless, it has been confirmed that an overdose of radiation may increase the risk of genetic and cancerous diseases, making it urgent to develop effective reconstruction techniques that fit low-dose CT scanning protocols. Obviously, the X-ray flux cannot be reduced much, since the signal-to-noise ratio (SNR) of the measured data declines with the reduction of dose. Another approach is to decrease the number of projection angles, which leads to incomplete few-view data. In this case, analytic algorithms like FDK [1], which are derived from a continuous imaging model and require densely sampled projections, are sensitive to insufficient projection data and produce severe artifacts. However, algebraic algorithms like the simultaneous algebraic reconstruction technique (SART) [2] handle the problem better by transforming it into a series of linear equations.
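The algebraic approach can be sketched in a few lines. The following is a minimal SART-style iteration on a toy system of linear equations, written in pure Python for illustration; `sart` is a hypothetical helper name, not code from the paper, and real CT systems are large and sparse:

```python
def sart(A, b, n_iter=200, relax=1.0):
    """Minimal SART sketch: simultaneously apply relaxed, ray-normalized
    corrections to every pixel.  A is a dense nonnegative matrix given as a
    list of rows; b is the projection data."""
    m, n = len(A), len(A[0])
    x = [0.0] * n
    row_sum = [sum(row) for row in A]                              # sum_l a_il
    col_sum = [sum(A[i][j] for i in range(m)) for j in range(n)]   # sum_i a_ij
    for _ in range(n_iter):
        # residuals of the current estimate for every ray
        resid = [b[i] - sum(A[i][j] * x[j] for j in range(n)) for i in range(m)]
        for j in range(n):
            if col_sum[j] > 0:
                corr = sum(A[i][j] * resid[i] / row_sum[i] for i in range(m))
                x[j] += relax * corr / col_sum[j]
    return x

# consistent toy system whose solution is x = [1, 2]
A = [[1.0, 1.0], [1.0, 2.0], [2.0, 1.0]]
b = [3.0, 5.0, 4.0]
x = sart(A, b)
```

With enough iterations the estimate approaches the solution `[1, 2]` of the consistent system; on inconsistent (noisy) data the same scheme converges to a weighted least-squares compromise.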

Recently, Candès et al. [3, 4] have made compressed sensing theory popular in the information theory field. This theory indicates that a variety of signals can be represented sparsely in a certain transform domain. Therefore, the original signal can be recovered accurately from far fewer samples, with no need to follow the Shannon/Nyquist sampling theorem. A principle called the restricted isometry property (RIP) guarantees the perfect recovery of any sparse signal [5]. This novel theory has been applied to many fields, such as information technology [6], signal and image processing [7], and inverse filtering [8]. It has been reported that a data acquisition process with compression is good for enhancing image quality, because this method can increase imaging speed and suppress the artifacts caused by patient movement [9]. For these benefits, many compressed sensing based algorithms have been created to deal with the few-view CT reconstruction problem. One major group is based on the total variation (TV), which takes the TV of the image as the sparse constraint. The image is determined by minimizing the TV term under the constraints of the linear projection equations. Sidky and Pan presented an improved TV-based algorithm named adaptive steepest descent projection onto convex sets (ASD-POCS) in the circular cone-beam framework [10]. Another similar method, called gradient projection Barzilai-Borwein (GPBB), has a faster convergence speed [11]. Besides the TV minimization algorithms, dictionary learning is also helpful for sparse representation. During the reconstruction process, the image is divided into many overlapping patches, represented sparsely by overcomplete elements of a particular dictionary. Xu et al. combined statistical iterative reconstruction (SIR) with dictionary learning and obtained a better reconstruction result than TV-based methods in the low-dose CT condition [12].
According to Xu's paper, this method is robust to noise and obtains a better reconstructed image with more details than the TV-based methods do. Naturally, there are some parameters relevant to the final result. Some of them, like the sparse level and the scale of the dictionary, vary little across different scanning data and can therefore be selected empirically. However, there is a special parameter that changes according to the phantom, the scanning protocol, the noise level, and other factors. This parameter plays an important role in the reconstruction program, balancing the data fidelity term and the regularization term, while determining its value by trial and error is time consuming. Hence, there is no doubt that providing a model to select a proper value of this parameter according to the scanning data is essential for the algorithm based on dictionary learning, leading to both better results and time saving.

This paper is organized as follows. In Section 2, the problem of low-dose CT reconstruction is stated and the algorithm based on dictionary learning is reviewed. In Section 3, the model of regularization parameter determination is proposed by a function fitting method. In Section 4, a series of experiments are performed and corresponding discussions are given. Finally, conclusions are drawn at the end of this paper.

#### 2. Notation and Problem Description

##### 2.1. Background and Notation Interpretation

According to previous work by Xu et al. [12], SIR is combined with dictionary learning to derive the algorithm. SIR assumes that the measured data follow the Poisson distribution
$$y_i = \mathrm{Poisson}\{b_i e^{-l_i}\} + n_i, \quad i = 1, \ldots, M, \tag{1}$$
where $b_i$ is the entrance X-ray intensity, $y_i$ is the exit X-ray intensity, $l_i = [A\mu]_i$ is the integral of the linear attenuation coefficient along the $i$th ray with $A = (a_{ij}) \in \mathbb{R}^{M\times N}$ the system matrix, the reconstructed image $\mu$ is a linear attenuation coefficient distribution, obtained by stacking the initial image of $N$ pixels into a vector $\mu \in \mathbb{R}^{N}$, and $n_i$ represents the read-out noise.
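The measurement model in (1) can be simulated directly. The sketch below (pure Python, with illustrative helper names not taken from the paper) draws a Poisson photon count for each ray and forms the line-integral estimates $\hat{l}_i = \ln(b_i/y_i)$ used by SIR:

```python
import math
import random

def poisson_sample(lam, rng):
    """Poisson draw: Knuth's method for small means, normal approximation
    for large means (adequate for this illustration)."""
    if lam > 50.0:
        return max(0, int(round(rng.gauss(lam, math.sqrt(lam)))))
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def simulate_measurements(A, mu, b0, seed=0):
    """y_i ~ Poisson(b0 * exp(-[A mu]_i)); returns (y, l_hat).
    b0 is the entrance intensity: lowering it lowers the dose and
    raises the noise in l_hat = ln(b0 / y)."""
    rng = random.Random(seed)
    l = [sum(a_ij * m_j for a_ij, m_j in zip(row, mu)) for row in A]
    y = [poisson_sample(b0 * math.exp(-li), rng) for li in l]
    l_hat = [math.log(b0 / max(yi, 1)) for yi in y]  # guard against y = 0
    return y, l_hat
```

At high dose (large `b0`) `l_hat` is close to the true line integrals; at low dose its relative error grows, which is exactly the regime the regularization term must compensate for.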

The objective function of SIR is
$$\hat{\mu} = \arg\min_{\mu}\Big\{\sum_{i=1}^{M}\frac{w_i}{2}\big([A\mu]_i - \hat{l}_i\big)^2 + \lambda R(\mu)\Big\}, \tag{2}$$
where $\sum_i \frac{w_i}{2}([A\mu]_i - \hat{l}_i)^2$ is the data fidelity term, $\hat{l}_i = \ln(b_i/y_i)$ is the measured estimate of $l_i$ calculated from $y_i$, $w_i = y_i$ is the statistical weight, and $R(\mu)$ is the regularization term.

The regularization term usually contains prior information about the image, such as a sparse constraint. When the sparse representation is acquired by dictionary learning theory, we can replace $R(\mu)$ in the objective function accordingly. Therefore, the reconstruction problem is equivalent to the following minimization:
$$\min_{\mu, D, \alpha}\sum_{i=1}^{M}\frac{w_i}{2}\big([A\mu]_i - \hat{l}_i\big)^2 + \lambda\sum_{s}\Big(\|E_s\mu - D\alpha_s\|_2^2 + \nu_s\|\alpha_s\|_0\Big), \tag{3}$$
where $E_s$ is an operator extracting patches of $\sqrt{n}\times\sqrt{n}$ pixels from the image, $D$ is the training dictionary, whose columns, called atoms, have the same size as a patch, $\alpha_s$ has few nonzero entries as a sparse representation of the $s$th patch over the dictionary basis $D$, and the variables $\lambda$ and $\nu_s$ are regularization parameters. In this optimization problem, $\mu$, $D$, and $\alpha$ are all unknown; hence, a practical plan for minimizing the objective function is an alternating minimization scheme. The plan divides the primary problem into two recursive steps: update of the dictionary model and update of the image. The final result is acquired by operating the two steps alternately until a stopping criterion is reached.
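The two-step scheme can be sketched end to end on a toy 1D problem. In the sketch below (an illustrative stand-in, not the paper's implementation) the "dictionary" is fixed to the canonical basis, so the sparse-coding step reduces to keeping the largest-magnitude entry of each patch, and the image step has a closed form; since each step exactly minimizes the joint objective over its own variables, the objective never increases:

```python
def alternating_reconstruction(l_hat, lam, patch=2, sparsity=1, n_outer=5):
    """Toy instance of the alternating scheme: A = identity, w_i = 1,
    fixed canonical dictionary, non-overlapping patches of size `patch`."""
    n = len(l_hat)
    mu = [0.0] * n
    history = []
    for _ in range(n_outer):
        # --- step 1: sparse coding with the image fixed ---
        # best `sparsity`-term approximation of each patch in the
        # canonical basis = keep its largest-magnitude entries
        targets = []
        for p0 in range(0, n, patch):
            pat = mu[p0:p0 + patch]
            keep = sorted(range(len(pat)), key=lambda k: -abs(pat[k]))[:sparsity]
            targets.extend(pat[k] if k in keep else 0.0 for k in range(len(pat)))
        # --- step 2: image update with the codes fixed ---
        # per pixel, minimize (mu_j - l_hat_j)^2 + lam * (mu_j - target_j)^2
        mu = [(l_hat[j] + lam * targets[j]) / (1.0 + lam) for j in range(n)]
        # track the joint objective (fidelity + weighted patch penalty)
        obj = sum((mu[j] - l_hat[j]) ** 2 for j in range(n)) \
            + lam * sum((mu[j] - targets[j]) ** 2 for j in range(n))
        history.append(obj)
    return mu, history
```

The recorded objective is monotonically non-increasing, mirroring the convergence behavior expected from the alternating scheme.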

##### 2.2. Update of the Dictionary Model

During this procedure, the image is supposed to be fixed, meaning that the data fidelity term is a constant. The optimization problem is simplified to
$$\min_{D,\alpha}\sum_{s}\Big(\|E_s\mu^{(t)} - D\alpha_s\|_2^2 + \nu_s\|\alpha_s\|_0\Big), \tag{4}$$
where $\mu^{(t)}$ is the intermediate image from the last updating step. In the adaptive dictionary based statistical iterative reconstruction (ADSIR), the dictionary is defined dynamically based on the unknown image, while the dictionary in the global dictionary based statistical iterative reconstruction (GDSIR) is predefined beforehand [12]. Previous research has shown that the K-SVD algorithm performs well at training the dictionary [13]. Once the dictionary is determined, the OMP algorithm is used to update the sparse coding [14] with a predetermined sparse level, instead of solving the $\ell_0$-norm problem (4) directly.
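The sparse-coding step can be illustrated with a compact OMP routine. This is a pure-Python sketch under the assumption of a small dense dictionary; `omp` is an illustrative name, and real implementations use optimized linear algebra:

```python
def omp(atoms, x, sparsity, tol=1e-10):
    """Orthogonal Matching Pursuit sketch.  `atoms` is a list of unit-norm
    atoms (each a list of floats), x is the signal; returns a dict
    {atom_index: coefficient} with at most `sparsity` entries."""
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    def solve(M, rhs):
        # Gauss-Jordan elimination with partial pivoting for the small
        # least-squares normal equations on the current support
        k = len(M)
        M = [row[:] + [rhs[i]] for i, row in enumerate(M)]
        for c in range(k):
            piv = max(range(c, k), key=lambda r: abs(M[r][c]))
            M[c], M[piv] = M[piv], M[c]
            for r in range(k):
                if r != c and M[c][c] != 0:
                    f = M[r][c] / M[c][c]
                    M[r] = [a - f * b for a, b in zip(M[r], M[c])]
        return [M[i][k] / M[i][i] for i in range(k)]

    support, coeffs, resid = [], [], x[:]
    for _ in range(sparsity):
        if dot(resid, resid) < tol:
            break
        # greedy step: pick the atom most correlated with the residual
        best = max((i for i in range(len(atoms)) if i not in support),
                   key=lambda i: abs(dot(atoms[i], resid)))
        support.append(best)
        # orthogonal step: least-squares fit on the current support
        G = [[dot(atoms[i], atoms[j]) for j in support] for i in support]
        rhs = [dot(atoms[i], x) for i in support]
        coeffs = solve(G, rhs)
        resid = [xv - sum(c * atoms[i][d] for c, i in zip(coeffs, support))
                 for d, xv in enumerate(x)]
    return dict(zip(support, coeffs))
```

The fixed `sparsity` argument plays the role of the predetermined sparse level mentioned above, replacing an explicit $\ell_0$ penalty weight.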

##### 2.3. Update of the Image

While updating the image, the dictionary and the sparse coding remain fixed. In other words, the problem transforms to
$$\min_{\mu}\sum_{i=1}^{M}\frac{w_i}{2}\big([A\mu]_i - \hat{l}_i\big)^2 + \lambda\sum_{s}\|E_s\mu - D\alpha_s\|_2^2, \tag{5}$$
where $\lambda$ is the regularization parameter balancing the data fidelity term and the regularization term. The regularization term is already a separable quadratic function. By replacing the data fidelity term with a separable paraboloid surrogate [15], the optimization can be iteratively solved by
$$\mu_j^{(t+1)} = \mu_j^{(t)} - \frac{\sum_i a_{ij} w_i \big([A\mu^{(t)}]_i - \hat{l}_i\big) + 2\lambda\sum_s \big[E_s^{T}(E_s\mu^{(t)} - D\alpha_s)\big]_j}{\sum_i a_{ij} w_i \sum_k a_{ik} + 2\lambda\sum_s \big[E_s^{T}E_s\mathbf{1}\big]_j}. \tag{6}$$
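The surrogate-based update is a majorize-minimize step, so a single application never increases the objective (5). The toy check below (illustrative names; patches simplified to per-pixel targets so the patch sums reduce to $(\mu_j - t_j)^2$) implements one such update and lets the descent property be verified numerically:

```python
def objective(A, w, l_hat, t, lam, mu):
    """Objective (5) with the patch terms simplified to (mu_j - t_j)^2."""
    fid = sum(w[i] / 2.0 *
              (sum(A[i][j] * mu[j] for j in range(len(mu))) - l_hat[i]) ** 2
              for i in range(len(A)))
    reg = sum((mu[j] - t[j]) ** 2 for j in range(len(mu)))
    return fid + lam * reg

def update_step(A, w, l_hat, t, lam, mu):
    """One separable-paraboloid-surrogate step on the fidelity term plus
    the exact quadratic regularizer (the simplified form of (6))."""
    m, n = len(A), len(mu)
    Amu = [sum(A[i][j] * mu[j] for j in range(n)) for i in range(m)]
    new = []
    for j in range(n):
        grad_fid = sum(A[i][j] * w[i] * (Amu[i] - l_hat[i]) for i in range(m))
        beta = sum(A[i][j] * w[i] * sum(A[i]) for i in range(m))  # curvature
        new.append(mu[j] - (grad_fid + 2.0 * lam * (mu[j] - t[j]))
                   / (beta + 2.0 * lam))
    return new
```

Because the surrogate majorizes the fidelity term (it touches it at the current iterate and lies above it elsewhere, for a nonnegative system matrix), the objective value decreases monotonically across repeated calls to `update_step`.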

#### 3. Materials and Methods

##### 3.1. Effect of the Regularization Parameter

As mentioned above, the regularization parameter is of great importance during the update of the image (5). We consider an optimization problem of the form
$$\min_x f(x) = g(x) + \lambda h(x). \tag{7}$$
If $\lambda_1 < \lambda_2$ and $x_1$, $x_2$ denote the corresponding minimizers, then we can get $g(x_1) \le g(x_2)$ and $h(x_1) \ge h(x_2)$ by an easy derivation from the inequalities
$$g(x_1) + \lambda_1 h(x_1) \le g(x_2) + \lambda_1 h(x_2), \qquad g(x_2) + \lambda_2 h(x_2) \le g(x_1) + \lambda_2 h(x_1). \tag{8}$$
This shows that a smaller $\lambda$ makes the data fidelity term smaller and the regularization term bigger, which means that the sparse constraint has less effect on the optimization process and more noise will appear in the final image. On the other hand, a bigger $\lambda$ weakens the effect of the data fidelity term, generating a loss of some fine details in the image. For example, $\lambda$ should be increased to suppress the noise increment in the projection domain, since the data fidelity term is proportional to the noise standard deviation. In order to get an optimal result, previous work selects a great many values of $\lambda$ and picks out the best one by comparing the final images. This testing strategy is very time consuming, making the algorithm based on dictionary learning unfriendly to practical reconstruction tasks.
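The monotone behavior described above can be checked on a one-dimensional example where everything is available in closed form: for $f(x) = (x-1)^2 + \lambda x^2$ the minimizer is $x_\lambda = 1/(1+\lambda)$, so the fidelity term grows and the penalty term shrinks as $\lambda$ grows (an illustrative sketch, not code from the paper):

```python
def minimizer(lam):
    # argmin of (x - 1)^2 + lam * x^2, in closed form
    return 1.0 / (1.0 + lam)

def g(x):  # data fidelity term
    return (x - 1.0) ** 2

def h(x):  # regularization term
    return x ** 2

lambdas = [0.1, 0.5, 1.0, 5.0, 20.0]
fidelity = [g(minimizer(l)) for l in lambdas]
penalty = [h(minimizer(l)) for l in lambdas]
# fidelity is increasing in lambda, penalty is decreasing
```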

##### 3.2. Morozov’s Principle and the Balancing Principle

The research on the choice of regularization parameters in linear inverse problems appeared as early as 1998 [16]. The original objective function of the inverse problem is of the form
$$\min_x \Phi(x) + \lambda\Psi(x). \tag{9}$$
Take $\Phi(x) = \|Ax - y\|_2^2$, $\Psi(x) = \|x\|_2^2$ as an example. If the noise level $\delta$ of $y$ is known, one efficient tool for selecting the proper regularization parameter is the well-known Morozov discrepancy principle [16], which picks $\lambda$ such that $\|Ax_\lambda - y\|_2 = \delta$. To adaptively determine the regularization parameter, a model function is brought in [17]:
$$m(\lambda) = b + \frac{c}{t + \lambda}. \tag{10}$$
To find a solution of the discrepancy equation, (9) is rewritten in terms of its value function, which the model function approximates:
$$F(\lambda) = \min_x\{\Phi(x) + \lambda\Psi(x)\} \approx m(\lambda). \tag{11}$$
When $\lambda \to \infty$, it obviously shows that the solution of the minimization problem is $x = 0$, and then $b = m(\infty) = \Phi(0)$ is obtained easily. The other two variables $c$ and $t$ can be determined by the interpolation conditions $m(\lambda_k) = F(\lambda_k)$, $m'(\lambda_k) = F'(\lambda_k)$. To solve for the parameter iteratively, the balancing principle [18] is introduced as
$$\Phi(x_\lambda) = \gamma\lambda\Psi(x_\lambda), \tag{12}$$
where $\gamma$ controls the relative weight of the two terms. Equation (12) can be written as
$$\lambda_{k+1} = \frac{\Phi(x_{\lambda_k})}{\gamma\Psi(x_{\lambda_k})}, \tag{13}$$
which is a fixed point iteration; $\gamma$ is calculated from the fitted model function [18].
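Morozov's principle can be made concrete on a toy ridge problem where the residual norm is available in closed form and is monotonically increasing in $\lambda$, so the discrepancy equation can be solved by bisection (an illustrative sketch with hypothetical helper names, not the model-function machinery of [17]):

```python
def discrepancy_lambda(a, b, delta, lo=1e-8, hi=1e8, n_bisect=200):
    """Solve |a * x_lam - b| = delta for min_x (a x - b)^2 + lam * x^2,
    whose minimizer is x_lam = a b / (a^2 + lam); the residual
    |b| * lam / (a^2 + lam) increases monotonically from 0 to |b|."""
    def residual(lam):
        x = a * b / (a * a + lam)
        return abs(a * x - b)
    for _ in range(n_bisect):
        mid = (lo + hi) / 2.0
        if residual(mid) < delta:
            lo = mid   # residual too small: regularize harder
        else:
            hi = mid   # residual too large: regularize less
    return (lo + hi) / 2.0
```

For `a = 2`, `b = 1`, `delta = 0.1` the discrepancy equation reduces to $\lambda/(4+\lambda) = 0.1$, that is, $\lambda = 4/9$.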

Although the balancing principle behaves well in the inverse problem model, there is no direct way to introduce the method into the dictionary based algorithm. The regularization term $R(\mu) = \sum_s \|E_s\mu - D\alpha_s\|_2^2$ has a minimum value $R_{\min} > 0$, which leads to the derivation as follows:
$$\lim_{\lambda\to\infty} F(\lambda) \ge \lim_{\lambda\to\infty} \lambda R_{\min} = \infty, \tag{15}$$
so the value function is unbounded and cannot be approximated by the model function above, which tends to a finite limit.

Therefore, the strategy of determining the regularization parameter adaptively according to the last iterative result is not reasonable here. We should instead look for a selection strategy that determines the proper value of the regularization parameter by analyzing the projection data.

##### 3.3. Weight Modification of the Objective Function

In order to find an applicable selection model for the regularization parameter, we reconsider the minimization problem (5) and the updating formula (6). Following the former work, the data fidelity term is replaced with a separable surrogate [15]:
$$\sum_i \frac{w_i}{2}\big([A\mu]_i - \hat{l}_i\big)^2 \le \sum_i \frac{w_i}{2}\sum_j \pi_{ij}\Big(\frac{a_{ij}}{\pi_{ij}}\big(\mu_j - \mu_j^{(t)}\big) + [A\mu^{(t)}]_i - \hat{l}_i\Big)^2, \tag{16}$$
where
$$\pi_{ij} \ge 0, \quad \sum_j \pi_{ij} = 1, \quad \pi_{ij} = 0 \ \text{only if} \ a_{ij} = 0. \tag{17}$$
And one convenient choice is
$$\pi_{ij} = \frac{a_{ij}}{\sum_k a_{ik}}. \tag{18}$$
By making use of the surrogate, (5) becomes a separable form
$$\min_\mu \sum_j \frac{\beta_j}{2}\big(\mu_j - \hat{\mu}_j^{(t)}\big)^2 + \lambda\sum_s\|E_s\mu - D\alpha_s\|_2^2, \tag{19}$$
where $\beta_j = \sum_i a_{ij} w_i \sum_k a_{ik}$, $\hat{\mu}_j^{(t)} = \mu_j^{(t)} - \beta_j^{-1}\sum_i a_{ij} w_i ([A\mu^{(t)}]_i - \hat{l}_i)$, and both terms can be represented as sums of quadratic functions whose variables are the single pixels $\mu_j$. After some variable exchanges, (19) can be rewritten as
$$\min_\mu \sum_j \Big[\frac{\beta_j}{2}\big(\mu_j - \hat{\mu}_j^{(t)}\big)^2 + \lambda\sum_{s \ni j}\big(\mu_j - d_{s,j}\big)^2\Big], \tag{20}$$
where $s \ni j$ runs over the patches covering pixel $j$ and $d_{s,j}$ denotes the entry of $D\alpha_s$ corresponding to pixel $j$. From the above, the image updating formula is just the same as (6); the quadratic term coefficient $\beta_j$ depends only on the system matrix and the statistical weight. We make a weight modification on the regularization term:
$$R_w(\mu) = \sum_j \beta_j \sum_{s \ni j}\big(\mu_j - d_{s,j}\big)^2. \tag{21}$$
By eliminating the constant term, the image reconstruction process is equivalent to solving the following optimization problem:
$$\min_\mu \sum_j \beta_j\Big[\frac{1}{2}\big(\mu_j - \hat{\mu}_j^{(t)}\big)^2 + \lambda\sum_{s \ni j}\big(\mu_j - d_{s,j}\big)^2\Big], \tag{22}$$
which can be solved iteratively by
$$\mu_j^{(t+1)} = \frac{\hat{\mu}_j^{(t)} + 2\lambda\sum_{s \ni j} d_{s,j}}{1 + 2\lambda n_j}, \tag{23}$$
where $n_j$ is the number of patches covering pixel $j$. As shown in (23), $\lambda$ alone determines the relative impact on the updated image of the data fidelity term and the regularization term, respectively.
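The point of the reweighting can be seen directly: the per-pixel closed-form solution of (22) is independent of the curvature weight, so the same $\lambda$ produces the same balance regardless of the scanning statistics. The sketch below (hypothetical helper names, written for illustration) contrasts the unweighted and weighted per-pixel updates:

```python
def update_unweighted(beta, mu_hat, targets, lam):
    """Per-pixel minimizer of beta/2 * (u - mu_hat)^2
    + lam * sum_s (u - d_s)^2 -- the result depends on beta."""
    n = len(targets)
    return (beta * mu_hat + 2.0 * lam * sum(targets)) / (beta + 2.0 * lam * n)

def update_weighted(beta, mu_hat, targets, lam):
    """Per-pixel minimizer of beta * [1/2 * (u - mu_hat)^2
    + lam * sum_s (u - d_s)^2] -- beta cancels, as in (23)."""
    n = len(targets)
    return (mu_hat + 2.0 * lam * sum(targets)) / (1.0 + 2.0 * lam * n)
```

With the weighted form, two rays with very different statistical weights still trade off fidelity against the patch targets in the same proportion, which is what makes a single data-driven rule for $\lambda$ feasible.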

##### 3.4. Evaluation Model of Regularization Parameter

Before developing the evaluation model, some discussion of the reconstruction result is given first. Once a value of $\lambda$ is selected and (22) is solved iteratively, the relative error of the data fidelity term is
$$e(\lambda) = \sqrt{\frac{\sum_i w_i\big([A\mu_\lambda^*]_i - \hat{l}_i\big)^2}{\sum_i w_i\,\hat{l}_i^2}}, \tag{24}$$
where $\mu_\lambda^*$ denotes the reconstructed image. The relative error depends on the phantom image, the noise level, the regularization parameter, and so on. When it comes to the situation that the regularization parameter is infinite, $\lambda \to \infty$, (23) and (24) take the following form:
$$\mu_j^{(\infty)} = \frac{\sum_{s \ni j} d_{s,j}}{n_j}, \qquad e_\infty = \sqrt{\frac{\sum_i w_i\big([A\mu^{(\infty)}]_i - \hat{l}_i\big)^2}{\sum_i w_i\,\hat{l}_i^2}}. \tag{25}$$
It naturally follows that the relative error $e_\infty$ increases with the noise level in the projection domain. In addition, it has been mentioned above (in Section 3.1) that $\lambda$ should be increased with the noise increment. Therefore, the proper $\lambda$ has a monotonic relation with the parameter $e_\infty$. Since $e_\infty$ can easily be determined by running the reconstruction algorithm based on dictionary learning once with $\lambda \to \infty$, the proper value of $\lambda$ can be calculated if a reasonable function $\lambda = f(e_\infty)$ can be found.

With the help of a series of tests, the relationship between $\lambda$ and $e_\infty$ is fitted by a piecewise quadratic function of the form
$$\lambda = f(e_\infty) = a_k e_\infty^2 + b_k e_\infty + c_k, \quad e_\infty \in [\eta_{k-1}, \eta_k), \tag{26}$$
where the breakpoints $\eta_k$ and the coefficients $a_k$, $b_k$, and $c_k$ are determined from the fitting experiments.
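A piecewise quadratic fit of this kind can be evaluated as below; the knots and coefficients here are placeholders for illustration, not the values fitted in the paper:

```python
def lambda_from_einf(e_inf, knots, coeffs):
    """Evaluate a piecewise quadratic lambda = f(e_inf).
    knots: ascending breakpoints [k0, k1, ..., km]; coeffs[i] = (a, b, c)
    gives the quadratic a*e^2 + b*e + c used on [k_i, k_{i+1})."""
    for i in range(len(coeffs)):
        if knots[i] <= e_inf < knots[i + 1]:
            a, b, c = coeffs[i]
            return a * e_inf ** 2 + b * e_inf + c
    # clamp to the nearest piece outside the fitted range
    a, b, c = coeffs[-1] if e_inf >= knots[-1] else coeffs[0]
    return a * e_inf ** 2 + b * e_inf + c

# placeholder fit: two pieces, chosen to be continuous at e_inf = 0.1
knots = [0.0, 0.1, 1.0]
coeffs = [(0.0, 2.0, 0.0), (5.0, 1.0, 0.05)]
```

In practice the knots and coefficients would come from the fitting experiments; only the evaluation mechanics are shown here.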

Finally, taking ADSIR as an example, the workflow of the developed algorithm is exhibited in Algorithm 1. In addition, the ordered subsets convex (OSC) algorithm [19] is utilized to accelerate the convergence.