Machine Learning and Network Methods for Biology and MedicineView this Special Issue
A Model of Regularization Parameter Determination in Low-Dose X-Ray CT Reconstruction Based on Dictionary Learning
In recent years, X-ray computed tomography (CT) is becoming widely used to reveal patient’s anatomical information. However, the side effect of radiation, relating to genetic or cancerous diseases, has caused great public concern. The problem is how to minimize radiation dose significantly while maintaining image quality. As a practical application of compressed sensing theory, one category of methods takes total variation (TV) minimization as the sparse constraint, which makes it possible and effective to get a reconstruction image of high quality in the undersampling situation. On the other hand, a preliminary attempt of low-dose CT reconstruction based on dictionary learning seems to be another effective choice. But some critical parameters, such as the regularization parameter, cannot be determined by detecting datasets. In this paper, we propose a reweighted objective function that contributes to a numerical calculation model of the regularization parameter. A number of experiments demonstrate that this strategy performs well with better reconstruction images and saving of a large amount of time.
Nowadays, X-ray computed tomography (CT) is still an important part of biomedical imaging technologies for the reason that the reconstructed image is of high spatial resolution and quality. Nevertheless, it confirms that an overdose of radiation possibly increases the risk of genetic or cancerous diseases, making it urgent to develop creative and effective reconstruction techniques to fit low-dose CT scanning protocol. Obviously, the X-ray flux cannot be reduced much since the signal-to-noise ratio (SNR) of measured data declines with the reduction of dose. Another approach is to decrease the number of projection angles, which will lead to incomplete few-view data. In this case, analytic-based algorithms like FDK , which are derived from a continuous imaging model and in need of dense sampled projections, are sensitive to insufficient projection data and arrive at a terrible result. However, algebraic algorithms like the simultaneous algebraic reconstruction technique (SART)  solved the problem better by transforming it to a series of linear equations.
Recently, Candes et al. [3, 4] have made compressed sensing theory popular in information theory field. This theory indicates that a variety of signals can be represented sparsely in a certain transform domain. Therefore, original signal can be recovered accurately by far fewer samples while there is no need to follow the Shannon/Nyquist sampling theorem. A principle called restricted isometry property (RIP) guarantees the perfect recovery of any sparse signal . This novel theory has been applied to many regions, like information technology , signal and image processing , inverse filtering , and so on. It is said that the data acquisition process with compression is good for enhancing image quality because this method can increase imaging speed and suppress the artifacts caused by patients’ movement . For these benefits, many compressed sensing based algorithms are created to deal with few-view CT reconstruction problem. One major group is based on the total variation (TV), which takes the TV of the image as the sparse constraint. The image is determined by minimizing the TV term with the constraints of the linear projection equations. Sidky and Pan presented an improved TV-based algorithm named adaptive steepest descent projection onto convex sets (ASD-POCS) in circular cone-beam framework . Another similar method called gradient projection Barzilari Borwein (GPBB) has a faster convergence speed . Besides the TV minimization algorithms, dictionary learning is also helpful to sparse representation. During the reconstruction process, the image is divided into many overlapped patches, represented sparsely by overcomplete elements of a particular dictionary. Xu et al. combined statistical iterative reconstruction (SIR) with dictionary learning and got a better reconstruction result than TV-based methods in the low-dose CT condition . According to Xu’s paper, this method is robust to noise and obtains a better reconstructed image with more details than the TV-based methods do. Naturally, there are some parameters relevant to the final result. Some of them, like the sparse level, the scale of the dictionary, and so on, have less change due to different scanning data and then can be empirically selected. However, there is a special parameter changing according to the phantom, the scanning protocol, the noise level, and other factors. This parameter plays an important role in the reconstruction program to balance the data fidelity term and the regularization term while determining its value is time consuming with many attempts. Hence, there is no doubt that providing a model to select a proper value of this parameter according to the scanning data is essential for the algorithm based on dictionary learning, which leads to better result and time saving.
This paper is organized as follows. In Section 2, the problem of low-dose CT reconstruction is stated and the algorithm based on dictionary learning is reviewed. In Section 3, the model of regularization parameter determination is proposed by function fitting method. In Section 4, a series of experiments are performed and corresponding discussions are given. Finally, there is the conclusion at the end of this paper.
2. Notation and Problem Description
2.1. Background and Notation Interpretation
According to previous work by Xu et al. , SIR is united with dictionary learning to derive the algorithm. SIR assumes that the measured data can be regarded as the Poisson distributionwhere is the entrance X-ray intensity, is the exit X-ray intensity, is the integral of the linear attenuation coefficient with , is the system matrix, the reconstructed image is a linear attenuation coefficient distribution, which transforms the initial image of pixels to a vector , and represents the read-out noise.
The objective function of SIR is aswhere is the data fidelity term, is the measured data of calculated by , is the statistical weight, and is the regularization term.
The regularization term usually contains prior information of the image, like sparse constraint. When the sparse representation is acquired by dictionary learning theory, we can replace in the objective function. Therefore, the reconstruction problem is equivalent to the following minimization:where is an operator to extract patches with pixels from the image, is the training dictionary whose column is called an atom of the same size of a patch, has few nonzero entries as a sparse representation of patches by the dictionary basis , and the variables and are regularization parameters. In this optimization problem, , , and are all unknown; hence, a practical plan of minimizing the object function is an alternating minimization scheme. The plan divides the primary problem into two recursive steps: update of the dictionary model and update of the image. The final result is acquired by operating the two steps alternately until reaching a stopping criterion.
2.2. Update of the Dictionary Model
During this procedure, the image is supposed to be fixed, meaning that the data fidelity term is a constant. The optimization problem is simplified to the one aswhere is an intermediate image of the last updating step. In the adaptive dictionary based statistical iterative reconstruction (ADSIR), the dictionary is defined dynamically based on the unknown image while the dictionary in the global dictionary based statistical iterative reconstruction (GDSIR) is predefined beforehand . Previous researches have proved that the K-SVD algorithm performs well at training the dictionary . Once the dictionary is determined, the OMP algorithm is used to update the sparse coding  with a predetermined sparse level, instead of solving the -norm problem as (4) directly.
2.3. Update of the Image
While updating the image, the dictionary and sparse coding remain invariable. In other words, the problem transforms to the form aswhere is the regularization parameter balancing the data fidelity term and the regularization term . The regularization term is already a separable quadratic function. By replacing the data fidelity term with a separable paraboloid surrogate , the optimization can be iteratively solved by
3. Materials and Methods
3.1. Effect of the Regularization Parameter
As mentioned above, the regularization parameter is of great importance during the update of image (5). We consider the optimizing problem of the formIf there is , , , then we can get and by an easy derivation of the unequal relations:It shows that a smaller makes the data fidelity term smaller and the regularization term bigger, which means that the sparse constraint has less effect on the optimizing process and more noise will appear in the final image. On the other hand, a bigger weakens the effect of the data fidelity term, generating a loss of some fine details in the image. For example, should be increased to suppress the noise increment in the projection domain since the data fidelity term is proportional to the noise standard deviation. In order to get an optimal result, previous work selects a great many values of and picks out the best one by comparing the final images. This testing strategy is of great time consuming, making the algorithm based on dictionary learning not friendly to the reconstruction task.
3.2. Morozov’s Principle and the Balancing Principle
The research on the choices of regularization parameters in linear inverse problems appears early in 1998 . The original optimizing function of the inverse problem is likeTake , as an example. If the noise level of is known, one efficient tool for selecting the proper regularization parameter is the well-known Morozov discrepancy principle . To adaptively determine the regularization parameter, a model function is brought in :To find a solution of , (9) is rewritten asWhen , it obviously shows that the solution of the minimization problem is , and then is obtained easily. The other two variables and can be determined by the equations , . To solve the parameter iteratively, the balancing principle  is introduced aswhere controls the relative weight of the two terms. Equation (12) can be written aswhich is a fixed point iteration. is calculated by the formula
Although the balancing principle behaves well in the inverse problem model, there is no direct way introducing the method to the dictionary based algorithm. The regularization term has a minimum value greater than zero, which leads to the derivation as follows:
Therefore, the strategy that determining the regularization parameter adaptively accords to the last iterative result is not reasonable. We should look for a selecting strategy which can determine the proper value of the regularization parameter by making an analysis of the projection data.
3.3. Weight Modification of the Objective Function
In order to find out an applicable selecting model of the regularization parameter, we reconsider the minimizing problem (5) and the updating formula (6). According to the former work, the data fidelity term is replaced with a separable surrogate where And one convenient choice isBy making use of the surrogate, (5) becomes a separable formwhere and both can be represented as the sum of a series of quadratic functions, whose variables are . After some variable exchanges, (19) can be rewritten asFrom the above, the image updating formula is just the same as (6); the quadratic term coefficient only depends on the system matrix and the statistical weight . We make a weight modification on the regularization termBy eliminating the constant term, the image reconstruction process is equivalent to solving the following optimization problem:which can be solved iteratively by As shown in (23), determines the relative impact on the updating image by the data fidelity term and the regularization term, respectively.
3.4. Evaluation Model of Regularization Parameter
Before developing the evaluation model, some discussions about the reconstruction result are displayed firstly. Once a value of is selected randomly, by solving (22) iteratively, it infers that the relative error of the data fidelity term is asThe relative error depends on the phantom image, the noise level, the regularization parameter, and so on. When it comes to the situation that the regularization parameter is infinite as , (23) and (24) will be like the following form:It is naturally derived that the relative error increases with the increment of the noise level in projection domain. In addition, it has been mentioned above (in Section 3.1) that should be increased with the noise increment. Therefore, the proper has a monotonous relation with the parameter . Since can be easily determined by operating the reconstruction algorithm based on dictionary learning once with , the proper value of can be calculated if a reasonable function as can be found.
With the help of a series of tests, the relationship between and is fitted by a piecewise quadratic function as follows:
Finally, taking ADSIR as an example, the workflow of the developed algorithm is exhibited in Algorithm 1. In addition, the ordered subsets convex (OSC) algorithm  is utilized as an acceleration of the convergence.
4. Experimental Results and Discussion
To make the evaluation of the regularization parameter possible, the developed algorithm improves ADSIR with a modified weight of the regularization term while the weight is adaptive to the data fidelity term. So the proposed algorithm is named adaptive weight regularized ADSIR (AWR-ADSIR). In this section, a series of reconstruction experiments are exhibited to validate that the regularization selecting principle in AWR-ADSIR is practical. The simulation numerical phantoms are Shepp-Logan phantom, human head slice image, and human abdomen slice image. The Shepp-Logan phantom is a numeric phantom with pixels intensities ranging from 0 to 1. The sample images of human head slice and human abdomen slice are the FBP reconstruction results based on full-sampling scanning data, which are obtained from our collaborator. All of these phantom images are of pixels presented in Figure 1. The scanning data are simulated as an undersampling situation with different noise levels. Firstly, different regularization parameters are selected to demonstrate that the one chosen by the algorithm leads to the best reconstruction result. Secondly, by comparing the quality of images reconstructed by diverse algorithms, which are SART, GPBB, ADSIR, and AWR-ADSIR, it can be confirmed that AWR-ADSIR is of remarkable performance among these algorithms. Finally, the selecting principle is used in the GDSIR model, proving that it also works. All the algorithms above are coded in MATLAB and run on a dual-core PC with 3.10 GHz Intel Core i5-2400 and 4 GB RAM.
4.1. Comparison of Different Regularization Parameters
In the following experiments, all the parameters except the regularization parameter keep invariant for the same phantom with the same projection noise level. Three values of are selected, of which one is calculated by the proposed algorithm, another one is multiplied by 0.1, and the third one is multiplied by 10. The distance from the X-ray source to the center point of the phantom is twice the length of the image edge. The iteration of the algorithm is stopped when the relative error is less than a stopping value ( is calculated by (24)).
To compare the difference between different selections of the regularization parameter, the human abdomen slice image is tested as an example. The projection data are simulated by 180 views of 2° step length over a 360° range, and 512 detector elements are distributed in fan-beam geometry covering the phantom. The noise levels added to the projection data are 0.0% and 0.1% Gaussian random noise. The results are displayed in Figures 2 and 3. For the reason that biomedical images are often observed by a proper window to find more details, the images are displayed with a window HU. The difference between the reconstructed image and the phantom image is displayed by a window HU.
There are two criterions to evaluate the reconstructed image. One is the normalized mean absolute deviation (NMAD), defined as
The other one is the signal-to-noise ratio (SNR), defined as
The values of the two criterions are presented in Table 1. Comparing the results with the same noise level of the situation and the situation, the NMAD of the situation is smaller and the SNR of the situation is larger mostly, which proves that the image reconstructed by choosing the regularization parameter as is more close to the sample image. When it comes to the situation, although the values of the two criterions are a little better, there are some artifacts appearing in the reconstructed image, leading to a decline of the image quality. In the middle column of Figures 2 and 3, some horizontal line artifacts appear in the images (regions D, E, and F in Figure 2 and the ellipse regions in Figure 3). It seems that there are more horizontal artifacts in the 0.0% noise level image. The probable reason is the noise added to the projection data since the noise covers the inconspicuous artifact in the 0.1% noise level image. So what is the reason for the fact that the NMAD and SNR of the situation are better? One reasonable explanation might be the smoothing effect of the dictionary learning algorithm. When the iterative image is updated by (6) or (23), the sparse constraint added by dictionary learning method smoothes not only the noise but also the margin details. This effect becomes more significant when the regularization parameter is becoming larger. Therefore, the difference between the reconstructed image and the sample image in the margin area becomes more obvious, which is displayed in the left and right columns of Figure 3. By discovering this effect making the criterion worse, future work should be devoted to improving the reconstruction algorithm based on dictionary learning in order to smooth the noise and preserve the margin details meanwhile.
4.2. Comparison of Different Reconstruction Algorithms
To present the advantage of AWR-ADSIR, the images reconstructed by AWR-ADSIR and some other algorithms (SART, GPBB, and ADSIR) are compared with the same projection data, initial conditions, and stopping criterions. The testing examples are the Shepp-Logan phantom and the human head slice image. The Shepp-Logan phantom is simulated by 120 views of 3° step length over a 360° range, and 512 detector elements are distributed in fan-beam geometry with three different noise levels while the head slice sample is simulated by 180 views of 2° step length over a 360° range. The reconstructed results are displayed in the four figures (Figure 4 to Figure 7).
With the comparative results calculated by (28) and (29) presented in Tables 2 and 3, the quality of all the reconstructed images is decreased with the noise level increasing. Among these four algorithms, SART generates the worst results. GPBB behaves well when the noise level is very low but when the noise level is beyond 0.1%, the quality of reconstructed image degenerates quickly. The regularization parameter in ADSIR is empirically selected according to . The fact that the image quality and comparing criterions of ADSIR and AWR-ADSIR are mostly the same proves that the parameter selecting model in AWR-ADSIR is practical and efficient. In addition, the marginal details appearing in Figures 5 and 7 of algorithms ADSIR and AWR-ADSIR indicate the smoothing effect on the margins again. Since the high-contrast edge is smoothed by the algorithm, the structural boundaries appear in the difference image in Figure 7. Obviously, when the empirical regularization parameter is unknown, the ADSIR algorithm costs a large amount of time to determine the proper value by repeatedly operating the iterative process (usually more than ten times) while the AWR-ADSIR algorithm only operates the iterative process twice as Algorithm 1 shows. The model indeed reduces the time consumption to determine the value of the regularization parameter.
4.3. Adaptive Weight Regularized GDSIR
The last experiment is applying the parameter selecting strategy into the GDSIR model. The dictionary displayed in Figure 8 is learned from the overlapping patches of the original image in Figure 1. The reconstruction process of AWR-GDSIR is almost the same as AWR-ADSIR except that the dictionary has been constructed in advance and dose not change during the reconstruction process. The result shown in Figure 8 indicates that the proposed model is also suitable for the general dictionary condition.
In most optimization problems, the determination of the regularization parameter is still a problem. In this paper, aiming to determine the regularization parameter of the algorithm based on dictionary learning, one model function, whose independent variable can be calculated by the known projection data, is proposed depending on some modification on the objective function. When compared to some other algorithms, the images reconstructed by AWR-ADSIR and ADSIR are of similar quality, better than the one reconstructed by SART, and the proposed algorithm is much more robust to noise than GPBB. This indicates that the modification of the objective function does not degrade the performance of ADSIR. What is more, the parameter selection model is demonstrated to be rational by the fact that the image quality of the situation is better than the ones of and situations. However, when some other parameters (like the scale of the dictionary, the scale of the patch, and so on) change, the model function might result in some difference. However, it still works to look for the function relationship between these two parameters since the monotonous relation between and remains.
By validating the proposed selecting principle, the smoothing effect on image margins is discovered. Our future work will focus on improving the dictionary learning method, with the expectation to maintain the smoothing effect on noise regions and preserve marginal information better.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
This work was partially supported by the National Natural Science Foundation of China (NSFC) through Grants no. 61201117 and no. 61301042, the Natural Science Foundation of Jiangsu Province (NSFJ) through Grant no. BK2012189, and Science and Technology Program of Suzhou (no. ZXY2013001). The authors are very grateful for the CT images provided by the PET Center, Huashan Hospital, Fudan University, China.
X. Han, J. Bian, E. L. Ritman, E. Y. Sidky, and X. Pan, “Optimization-based reconstruction of sparse images from few-view projections,” Physics in Medicine and Biology, vol. 57, no. 16, pp. 5245–5273, 2012.View at: Publisher Site | Google Scholar
B. Song, J. Park, and W. Song, “SU-E-J-14: a novel, fast, variable step size gradient method for solving simultaneous algebraic reconstruction technique (SART)-type reconstructions: an example application to CBCT,” Medical Physics, vol. 38, no. 6, article 3444, 2011.View at: Publisher Site | Google Scholar
E. J. Candes, J. Romberg, and T. Tao, “Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information,” IEEE Transactions on Information Theory, vol. 52, no. 2, pp. 489–509, 2006.View at: Publisher Site | Google Scholar | MathSciNet
E. J. Candes and T. Tao, “Near-optimal signal recovery from random projections: universal encoding strategies?” IEEE Transactions on Information Theory, vol. 52, no. 12, pp. 5406–5425, 2006.View at: Publisher Site | Google Scholar | MathSciNet
J. Wang and B. Shim, “On the recovery limit of sparse signals using orthogonal matching pursuit,” IEEE Transactions on Signal Processing, vol. 60, no. 9, pp. 4973–4976, 2012.View at: Publisher Site | Google Scholar | MathSciNet
P. T. Lauzier, J. Tang, and G.-H. Chen, “Prior image constrained compressed sensing: implementation and performance evaluation,” Medical Physics, vol. 39, no. 1, pp. 66–80, 2012.View at: Publisher Site | Google Scholar
W. W. Hager, D. T. Phan, and H. Zhang, “Gradient-based methods for sparse recovery,” SIAM Journal on Imaging Sciences, vol. 4, no. 1, pp. 146–165, 2011.View at: Publisher Site | Google Scholar | Zentralblatt MATH | MathSciNet
R. Volz and S. Close, “Inverse filtering of radar signals using compressed sensing with application to meteors,” Radio Science, vol. 47, no. 4, Article ID RS0N05, 2012.View at: Publisher Site | Google Scholar
G. Wang, Y. Bresler, and V. Ntziachristos, “Guest editorial compressive sensing for biomedical imaging,” IEEE Transactions on Medical Imaging, vol. 30, no. 5, pp. 1013–1016, 2011.View at: Publisher Site | Google Scholar
E. Y. Sidky and X. C. Pan, “Image reconstruction in circular cone-beam computed tomography by constrained, total-variation minimization,” Physics in Medicine and Biology, vol. 53, no. 17, pp. 4777–4807, 2008.View at: Publisher Site | Google Scholar
J. C. Park, B. Song, J. S. Kim et al., “Fast compressed sensing-based CBCT reconstruction using Barzilai-Borwein formulation for application to on-line IGRT,” Medical Physics, vol. 39, no. 3, pp. 1207–1217, 2012.View at: Publisher Site | Google Scholar
Q. Xu, H. Y. Yu, X. Q. Mou, L. Zhang, J. Hsieh, and G. Wang, “Low-dose X-ray CT reconstruction via dictionary learning,” IEEE Transactions on Medical Imaging, vol. 31, no. 9, pp. 1682–1697, 2012.View at: Publisher Site | Google Scholar
M. Aharon, M. Elad, and A. Bruckstein, “K-SVD: an algorithm for designing overcomplete dictionaries for sparse representation,” IEEE Transactions on Signal Processing, vol. 54, no. 11, pp. 4311–4322, 2006.View at: Publisher Site | Google Scholar
J. A. Tropp and A. C. Gilbert, “Signal recovery from random measurements via orthogonal matching pursuit,” IEEE Transactions on Information Theory, vol. 53, no. 12, pp. 4655–4666, 2007.View at: Publisher Site | Google Scholar | MathSciNet
I. A. Elbakri and J. A. Fessler, “Statistical image reconstruction for polyenergetic X-ray computed tomography,” IEEE Transactions on Medical Imaging, vol. 21, no. 2, pp. 89–99, 2002.View at: Publisher Site | Google Scholar
K. Karl and Z. Jun, “Iterative choices of regularization parameters in linear inverse problems,” Inverse Problems, vol. 14, no. 5, pp. 1247–1264, 1998.View at: Publisher Site | Google Scholar | MathSciNet
J. Feng, C. Qin, K. Jia et al., “An adaptive regularization parameter choice strategy for multispectral bioluminescence tomography,” Medical Physics, vol. 38, no. 11, pp. 5933–5944, 2011.View at: Publisher Site | Google Scholar
C. Clason, B. T. Jin, and K. Kunisch, “A semismooth Newton method for L1 data fitting with automatic choice of regularization parameters and noise calibration,” SIAM Journal on Imaging Sciences, vol. 3, no. 2, pp. 199–231, 2010.View at: Publisher Site | Google Scholar
C. Kamphuis and F. J. Beekman, “Accelerated iterative transmission CT reconstruction using an ordered subsets convex algorithm,” IEEE Transactions on Medical Imaging, vol. 17, no. 6, pp. 1101–1105, 1998.View at: Publisher Site | Google Scholar