Abstract

Medical image fusion plays an important role in the diagnosis and treatment of diseases, for example in image-guided radiotherapy and surgery. A modified local contrast measure is proposed to fuse multimodal medical images. First, the adaptive manifold filter is applied to the source images, and the filtered result serves as the low-frequency part of the modified local contrast. Second, the modified spatial frequency of the source images is adopted as the high-frequency part. Finally, at each position the pixel with the larger modified local contrast is selected into the fused image. The presented scheme outperforms a guided-filter method in the spatial domain, a dual-tree complex wavelet transform-based method, a nonsubsampled contourlet transform-based method, and four classic fusion methods in terms of visual quality. Furthermore, over the six pairs of source images, the mutual information values of the presented method are on average 55%, 41%, and 62% higher than those of the three state-of-the-art methods, and its edge-based similarity measure values are on average 13%, 33%, and 14% higher.

1. Introduction

With the development of medical technology, computer science, and biomedical engineering, medical imaging can provide clinical diagnosis with a variety of multimodal images, such as computed tomography (CT), magnetic resonance imaging (MRI), single photon emission computed tomography (SPECT), positron emission tomography (PET), and ultrasound images [1]. Different modalities display different information about the same organ. For example, MRI expresses soft tissue information better than CT, whereas CT provides better information about tissue calcification and bone segments than MRI. In clinical application, a single-modality image often cannot provide doctors with enough information to make a correct diagnosis [2, 3]. It is therefore necessary to combine different modalities into one image that retains the essential information of the sources. The fused image gathers the vital information from the several modalities to present a comprehensive view of the diseased tissue or organ, while redundant information in the source images is discarded. Hence, the doctor can more easily make an accurate diagnosis or determine an accurate therapeutic scheme.

Generally, medical image fusion algorithms are divided into two categories: spatial domain methods and multiscale decomposition methods [4]. Spatial domain methods combine pixels or regions from the source images directly in the spatial domain [5]. Multiscale methods instead rely on sparse transforms such as the traditional wavelet pyramid, the contourlet, and the nonsubsampled contourlet transform [6]. Compared with spatial domain methods, multiscale decomposition methods have higher time complexity because of their redundant decompositions, especially the nonsubsampled contourlet transform-based approaches. Spatial domain methods, on the other hand, can be introduced into clinical applications and surgical procedures because of their low complexity: they can generally run in real time and thus support real-time diagnosis during surgery. Therefore, this paper focuses on multimodal medical image fusion in the spatial domain.

In recent years, edge-preserving filters have been an active research topic in image processing, including the bilateral filter, weighted least squares [7], the guided filter [8], the domain transform filter [9], and the cost-volume filter [10]. Because edge-preserving filters avoid ringing artifacts and preserve edge structure well, they have been widely used in image matching, dehazing, denoising, and classification [11]. The guided filter assumes that the filtered output is a linear transformation of the guidance image; owing to this local linear model, Kang [11] first introduced the guided filter into spatial domain image fusion. The domain transform filter preserves the geodesic distance between points on a curve, adaptively warping the input signal so that 1D edge-preserving filtering can be performed efficiently in linear time; however, the recursive filter it uses is not effective for complex edge structures with many discontinuities. The cost-volume filter is a discrete optical flow approach that handles both fine (small-scale) motion structure and large displacements, leading to a generic and fast framework that is widely applicable to computer vision problems. The adaptive manifold filter [12], which has good global diffusion and edge-preserving ability, is a real-time high-dimensional filter built on iterated filtering; moreover, it produces high-quality results and requires less memory. In this paper, the adaptive manifold filter is introduced for the first time into the image fusion field, in particular multimodal medical image fusion.

2. Methods

2.1. Adaptive Manifold Filter

The adaptive manifold filter is the first filter to perform high-dimensional filtering of images and videos in real time [13]. It is quite flexible and can produce responses that approximate either standard Gaussian filters or non-local-means filters. The filtering process can be divided into three stages: projection, blurring, and gathering.

Let $f$ be a signal associating each point $x$ from its $d_S$-dimensional spatial domain $\mathcal{S}$ to a value in its $d_R$-dimensional range $\mathcal{R}$. For a gray image, $d_S$ and $d_R$ are equal to 2 and 1, respectively [14].

The number of manifolds $K$ is independent of the filter dimensionality; it is computed from a linear correction term derived from the range standard deviation $\sigma_r$ and a height term derived from the spatial standard deviation $\sigma_s$ (the exact expression is given in [12]). Let $\{x_i\}$ be the set of samples obtained by sampling $f$ on a regular grid; we refer to each $x_i$ as a pixel. The $k$th $d_S$-dimensional adaptive manifold can be described by a graph $\eta_k$, and the manifold value associated with pixel $x_i$ is defined by evaluating the function $\eta_k$ at $x_i$ [15]. The first manifold is generated by low-pass filtering the input signal $f$:
$$\eta_1 = h_{\Sigma_s} * f,$$
where $*$ is the convolution operation and $h_{\Sigma_s}$ is a low-pass filter with covariance matrix $\Sigma_s$. Based on the first manifold $\eta_1$, a Gaussian distance-weighted projection of the pixel values of the image onto the manifold is performed:
$$w_k(x_i) = \exp\!\left(-\tfrac{1}{2}\,\big(f(x_i)-\eta_k(x_i)\big)^{T}\,\Sigma_r^{-1}\,\big(f(x_i)-\eta_k(x_i)\big)\right),$$
where $\Sigma_r$ is a diagonal covariance matrix of size $d_R \times d_R$ that controls the decay of the Gaussian kernel. Gaussian filtering is then performed over each manifold, mixing the values from all sampling points $x_i$. Mathematically, the blurred values can be expressed as
$$\psi_k(x_i) = \sum_{x_j} h_{\Sigma}\big(\hat{x}_i - \hat{x}_j\big)\, w_k(x_j)\, f(x_j),$$
where $\hat{x}_i = \big(x_i, \eta_k(x_i)\big)$ and $h_{\Sigma}$ is the Gaussian filtering on the $(d_S + d_R)$-dimensional space. The final filter response for each pixel is generated by interpolating blurred values gathered from all adaptive manifolds:
$$g(x_i) = \frac{\sum_{k=1}^{K} w_k(x_i)\, \psi_k(x_i)}{\sum_{k=1}^{K} w_k(x_i)},$$
where $K$ is the total number of adaptive manifolds used to filter the signal and $w_k$ is the weight corresponding to manifold $\eta_k$.
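To make the three-stage pipeline concrete, the following Python sketch approximates the filter with a single splitting step instead of the full recursive manifold tree of [12, 13]; the manifold construction, the blurring performed in the image plane, and the parameter defaults are simplifying assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def amf_sketch(f, sigma_s=14.0, sigma_r=0.10):
    """Single-split sketch of adaptive-manifold filtering (grayscale in [0, 1])."""
    f = np.asarray(f, dtype=np.float64)
    # Projection seed: the first manifold is a low-pass version of the signal.
    eta1 = gaussian_filter(f, sigma_s)
    # One splitting step: two child manifolds follow the signed residual,
    # standing in for the recursive tree of the real algorithm.
    r = f - eta1
    manifolds = [
        eta1,
        gaussian_filter(np.where(r > 0, f, eta1), sigma_s),
        gaussian_filter(np.where(r <= 0, f, eta1), sigma_s),
    ]
    num = np.zeros_like(f)
    den = np.zeros_like(f)
    for eta in manifolds:
        # Projection: Gaussian distance weight of each pixel to the manifold.
        w = np.exp(-0.5 * ((f - eta) / sigma_r) ** 2)
        # Blurring: smooth the weighted values (approximated in the image plane).
        num += gaussian_filter(w * f, sigma_s)
        den += gaussian_filter(w, sigma_s)
    # Gathering: normalize the accumulated responses over all manifolds.
    return num / np.maximum(den, 1e-12)
```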

2.2. Modified Local Contrast

The contrast feature of an image evaluates the difference between the intensity at a pixel and the intensities of its neighbors. The human visual system is highly sensitive to intensity contrast rather than to the intensity value itself; in general, the same intensity value appears different depending on the intensities of the neighboring pixels. According to [16], local luminance contrast can be defined as
$$C = \frac{L - L_B}{L_B},$$
where $L$ is the local brightness of the image and $L_B$ is the brightness of the local background. In general, $L_B$ is regarded as the local low-frequency information of the image and $L - L_B$ as the local high-frequency information. Hence, a proper way to select the high-frequency and low-frequency information is necessary to ensure better information interpretation. The modified spatial frequency (MSF) [17] is calculated from the row frequency, column frequency, and diagonal frequency of the image; a larger MSF indicates salient features such as edges, lines, and region boundaries. The MSF of an image can therefore be used as its high-frequency information, while the result of filtering the image with the adaptive manifold filter can be used as its low-frequency information. Mathematically, the modified local contrast in the spatial domain is given by
$$\mathrm{MLC}(i,j) = \frac{\mathrm{MSF}(i,j)}{\mathrm{AMF}(i,j)},$$
where $\mathrm{MSF}(i,j)$ is the modified spatial frequency of the image at row $i$ and column $j$, and $\mathrm{AMF}(i,j)$ is the result of filtering the image with the adaptive manifold filter. The MSF is capable of capturing the fine details present in the image because it incorporates the diagonal frequency in addition to the row and column frequencies:
$$\mathrm{MSF} = \sqrt{\mathrm{RF}^2 + \mathrm{CF}^2 + \mathrm{DF}^2},$$
where the row and column frequencies are calculated as [18, 19]
$$\mathrm{RF} = \sqrt{\frac{1}{MN}\sum_{i=1}^{M}\sum_{j=2}^{N}\big[I(i,j)-I(i,j-1)\big]^2}, \qquad \mathrm{CF} = \sqrt{\frac{1}{MN}\sum_{i=2}^{M}\sum_{j=1}^{N}\big[I(i,j)-I(i-1,j)\big]^2},$$
with $M$ and $N$ denoting the number of rows and columns of image $I$, respectively. The diagonal frequency can be expressed as
$$\mathrm{DF} = \sqrt{\frac{w_d}{MN}\sum_{i=2}^{M}\sum_{j=2}^{N}\big[I(i,j)-I(i-1,j-1)\big]^2 + \frac{w_d}{MN}\sum_{i=2}^{M}\sum_{j=1}^{N-1}\big[I(i,j)-I(i-1,j+1)\big]^2},$$
where $w_d = 1/\sqrt{2}$ is a distance weight for the diagonal neighbors.
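The following sketch computes a per-pixel MSF over a local window and the resulting modified local contrast; the window size and the use of a sliding local mean are assumptions consistent with the formulas above, not values prescribed by the paper.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def modified_local_contrast(img, low_freq, win=7):
    """Per-pixel modified local contrast: local MSF (high-frequency part)
    divided by the adaptive-manifold-filtered image (low-frequency part)."""
    I = np.asarray(img, dtype=np.float64)
    wd = 1.0 / np.sqrt(2.0)  # distance weight for diagonal neighbors
    # Accumulate squared first differences along rows, columns, and diagonals.
    d = np.zeros_like(I)
    d[:, 1:] += (I[:, 1:] - I[:, :-1]) ** 2            # row frequency term
    d[1:, :] += (I[1:, :] - I[:-1, :]) ** 2            # column frequency term
    d[1:, 1:] += wd * (I[1:, 1:] - I[:-1, :-1]) ** 2   # main diagonal term
    d[1:, :-1] += wd * (I[1:, :-1] - I[:-1, 1:]) ** 2  # secondary diagonal term
    # Local mean of the squared differences, then the square root, gives a
    # windowed modified spatial frequency at each pixel.
    msf = np.sqrt(uniform_filter(d, size=win))
    return msf / np.maximum(low_freq, 1e-12)
```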

2.3. Summary of Fusion Method

Figure 1 demonstrates the schematic diagram of the proposed fusion algorithm. The proposed approach can be briefly summarized in the following five steps (a minimal implementation sketch follows the list):
(1) The source medical images $A$ and $B$ are registered.
(2) The source images $A$ and $B$ are filtered by the adaptive manifold filter to obtain the low-frequency parts of the modified local contrast:
$$\mathrm{AMF}_A = \mathrm{AMF}(A), \qquad \mathrm{AMF}_B = \mathrm{AMF}(B).$$
(3) The modified spatial frequencies of the source images are adopted as the high-frequency parts of the modified local contrast according to (7). The modified local contrasts of $A$ and $B$ are then defined as
$$\mathrm{MLC}_A(i,j) = \frac{\mathrm{MSF}_A(i,j)}{\mathrm{AMF}_A(i,j)}, \qquad \mathrm{MLC}_B(i,j) = \frac{\mathrm{MSF}_B(i,j)}{\mathrm{AMF}_B(i,j)},$$
where $\mathrm{MSF}_A$ and $\mathrm{MSF}_B$ represent the high-frequency information of the modified local contrast of $A$ and $B$, respectively.
(4) A decision map is built by comparing the modified local contrasts of the source images:
$$D(i,j) = \begin{cases} 1, & \mathrm{MLC}_A(i,j) \ge \mathrm{MLC}_B(i,j), \\ 0, & \text{otherwise}. \end{cases}$$
(5) Finally, the fused medical image is merged according to the decision map:
$$F(i,j) = D(i,j)\,A(i,j) + \big(1 - D(i,j)\big)\,B(i,j).$$
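The sketch below chains the pieces defined earlier (`amf_sketch` and `modified_local_contrast`, both simplified stand-ins for the paper's components) into the five-step fusion rule; it assumes registered, same-size grayscale inputs.

```python
import numpy as np

def fuse(a, b, sigma_s=14.0, sigma_r=0.10):
    """Pixel-wise fusion by comparing modified local contrasts."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    low_a = amf_sketch(a, sigma_s, sigma_r)    # step 2: low-frequency parts
    low_b = amf_sketch(b, sigma_s, sigma_r)
    mlc_a = modified_local_contrast(a, low_a)  # step 3: modified local contrasts
    mlc_b = modified_local_contrast(b, low_b)
    d = mlc_a >= mlc_b                         # step 4: decision map
    return np.where(d, a, b)                   # step 5: merge
```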

3. Results

3.1. Experimental Setup

To evaluate the performance of the proposed fusion method, experiments have been performed on the six pairs of images shown in Figure 2. These images fall into four categories: (1) CT and MRI, (2) T1-weighted MRI (T1-MRI) and T2-weighted MRI (T2-MRI), (3) MRI and magnetic resonance angiography (MRA) images, and (4) Gadolinium-Diethylenetriamine Pentaacetic Acid MRI (GD-MRI) and T1-weighted MRI. Groups (a) and (b) in Figure 2 are CT and MRI images, whereas groups (e) and (f) are T1-MRI and T2-MRI images. Group (c) contains T1-MRI and GD-MRI images, and group (d) contains T1-MRI and MRA images. The corresponding pixels of each pair of input images have been perfectly matched. All images have the same size of 256 × 256 pixels with 256 gray levels. On the one hand, the proposed method is compared with classic image fusion methods, namely principal components analysis (PCA), the Laplacian pyramid, the Gradient pyramid, and the shift-invariant discrete wavelet transform (SIDWT), which are compared in many works [4, 20]. On the other hand, its performance is compared with the modified-spatial-frequency-motivated PCNN method on NSCT coefficients proposed by Sudeb [17] and the dual-tree complex wavelet transform method combined with the nonsubsampled direction filter bank (NSDFB) by Liu [21]. In Sudeb's NSCT-based scheme, the pyramid filter and the direction filter are set to "pyrexc" and "vk," respectively, and the decomposition levels of the NSCT are set to [1, 2, 4] in accordance with [17]. In Liu's method, three levels of the dual-tree complex wavelet transform are adopted together with the NSDFB, whose direction filter is set to "cd." Furthermore, because the proposed method belongs to the spatial domain, it is also compared with the spatial domain guided filter method proposed by Kang [11]. In Kang's method, the source images are decomposed into a base layer and a detail layer by average filtering, and a guided-filtering-based weighted average technique makes full use of spatial consistency to fuse the base and detail layers; the parameters of [11] are adopted directly in this comparison. In the adaptive manifold filter, the spatial standard deviation and the range standard deviation are set to 14 and 0.10, respectively.

3.2. Evaluation Metrics
3.2.1. Mutual Information

Mutual information (MI), proposed by Piella [22], demonstrates how much information the fused image conveys about the reference images. The MI is defined as
$$\mathrm{MI} = \mathrm{MI}_{AF} + \mathrm{MI}_{BF},$$
where $\mathrm{MI}_{XF}$ can be calculated by
$$\mathrm{MI}_{XF} = \sum_{u=1}^{L}\sum_{v=1}^{L} h_{X,F}(u,v)\, \log_2 \frac{h_{X,F}(u,v)}{h_X(u)\, h_F(v)},$$
where $X$ and $F$ denote a source image ($A$ or $B$) and the fused image, respectively. $h_{X,F}$ is the joint gray-level histogram of $X$ and $F$, $h_X$ and $h_F$ are the normalized gray-level histograms of $X$ and $F$, and $L$ is the number of bins. Hence, a larger MI value indicates that the fused image acquires more information from images $A$ and $B$.
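A histogram-based computation of one MI term is sketched below; the bin count of 256 matches the 256-level images used here, and the full metric is the sum of the terms for both source images.

```python
import numpy as np

def mutual_information(x, f, bins=256):
    """MI between a source image x and the fused image f, computed from the
    joint gray-level histogram; the full metric is MI_AF + MI_BF."""
    h, _, _ = np.histogram2d(x.ravel(), f.ravel(), bins=bins)
    pxy = h / h.sum()                      # normalized joint histogram
    px = pxy.sum(axis=1, keepdims=True)    # marginal histogram of x
    pf = pxy.sum(axis=0, keepdims=True)    # marginal histogram of f
    nz = pxy > 0                           # skip empty bins to avoid log(0)
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ pf)[nz])))
```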

3.2.2. Edge Based Similarity Measure

The edge-based similarity measure $Q^{AB/F}$ [23] gives the similarity between the edges transferred into the fused image during the fusion process. Mathematically, $Q^{AB/F}$ is defined as
$$Q^{AB/F} = \frac{\sum_{i=1}^{M}\sum_{j=1}^{N}\big[Q^{AF}(i,j)\,w^{A}(i,j) + Q^{BF}(i,j)\,w^{B}(i,j)\big]}{\sum_{i=1}^{M}\sum_{j=1}^{N}\big[w^{A}(i,j) + w^{B}(i,j)\big]},$$
where $A$ and $B$ represent the input images and $F$ is the fused image. $Q^{AF}$ and $Q^{BF}$ are defined in the same way, as
$$Q^{XF}(i,j) = Q_g^{XF}(i,j)\, Q_{\alpha}^{XF}(i,j),$$
where $Q_g^{XF}(i,j)$ and $Q_{\alpha}^{XF}(i,j)$ are the edge strength and orientation preservation values at location $(i,j)$, respectively, and $X$ denotes image $A$ or image $B$. The dynamic range of $Q^{AB/F}$ is $[0, 1]$, and it should be as close to 1 as possible for better fusion.
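A simplified sketch of the measure is given below; it replaces the sigmoid mappings of the original metric [23] with a ratio-based strength term and a linear orientation term, and uses Sobel gradients, so the scores are illustrative rather than exact $Q^{AB/F}$ values.

```python
import numpy as np
from scipy.ndimage import sobel

def qabf_sketch(a, b, f):
    """Edge-based similarity in the spirit of Q^{AB/F} (range [0, 1])."""
    def grad(img):
        img = np.asarray(img, dtype=np.float64)
        gx, gy = sobel(img, axis=1), sobel(img, axis=0)
        return np.hypot(gx, gy), np.arctan2(gy, gx)

    ga, aa = grad(a)
    gb, ab = grad(b)
    gf, af = grad(f)

    def q_xf(g, alpha):
        # Edge-strength preservation: ratio of weaker to stronger gradient.
        qg = np.minimum(g, gf) / np.maximum(np.maximum(g, gf), 1e-12)
        # Orientation preservation: 1 for equal angles, 0 beyond 90 degrees.
        qa = np.clip(1.0 - np.abs(alpha - af) / (np.pi / 2), 0.0, 1.0)
        return qg * qa

    wa, wb = ga, gb  # edge strength serves as the importance weight
    num = np.sum(q_xf(ga, aa) * wa + q_xf(gb, ab) * wb)
    return float(num / np.maximum(np.sum(wa + wb), 1e-12))
```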

3.3. Subjective Evaluation Analysis

To evaluate the performance of the proposed method on multimodal medical image fusion, extensive experiments were performed on the six groups of images; the results are shown in Figures 3–8. The fused images in the first row of Figures 3–8 are the results of the classic methods (PCA, Laplacian pyramid, Gradient pyramid, and SIDWT); those in the second row are the results of the three recent methods and the proposed method. It can be clearly seen that the images fused by the three recent methods and the proposed method reach a higher contrast than those of the classic methods in most cases. However, the proposed method also differs markedly from the recent methods. Specifically, the contrast of the images fused by the presented method is higher than that of the other three recent methods, as a careful look at Figures 3(e)–3(h) and 7(e)–7(h) shows. Figures 4(e)–4(h) demonstrate that the other three methods cannot preserve the edge information well, as shown in the blue labeled regions of Figures 4(e)–4(g). Figures 5(e)–5(h) illustrate that Kang's method and Sudeb's method introduce many artifacts into the fused images, while Liu's method loses useful information, as shown in the blue region of Figure 5(e). The contrast of Figure 6(g), produced by Kang's method, is the lowest among Figures 6(e)–6(h), and the labeled regions in Figure 6(h) are clearer than the corresponding parts of Figures 6(e) and 6(f). From Figures 8(e)–8(h), it can be concluded that Liu's method is not effective for fusing the images of group (f) in Figure 2 and that the proposed method transfers more information from the source images than Kang's and Sudeb's methods do. In summary, the proposed algorithm transfers more accurate and necessary information into the fused images than the other methods, while introducing less useless information such as block effects and artifacts.

3.4. Objective Evaluation Analysis

Apart from the subjective evaluation, objective metrics are necessary to quantify the differences among the fused images. Tables 1, 2, and 3 report the MI and $Q^{AB/F}$ values of the images fused by the different methods; the bold values indicate the best results in Tables 1–3. The MI and $Q^{AB/F}$ values of the proposed algorithm are the largest among the eight methods, except that the values of Kang's method (guided filter) are the largest for the fusion results of groups (b) and (f). These results indicate that, apart from these exceptions, the proposed algorithm transfers the most useful information into the fused result among the eight approaches. Moreover, the objective evaluation agrees very well with the visual assessment, with minor exceptions; for those exceptions, the visual effect of the proposed method is still better than that of the methods with better objective scores. From the above subjective and objective comparisons, it may be concluded that the proposed algorithm works well for combining CT with MRI, MRI with GD-MRI, MRI with MRA, and T1-weighted MRI with T2-weighted MRI, and that it is more effective than several state-of-the-art works and four classic methods.

4. Conclusion

To improve multimodal medical image fusion and increase diagnostic accuracy, a novel and effective medical image fusion algorithm in the spatial domain is presented in this paper. The modified local contrast information is proposed as the decision map for fusing the multimodal medical images. In view of the good global diffusion and edge-preserving ability of the adaptive manifold filter, the filtered result of the source images is introduced as the low-frequency part, while the modified spatial frequency of the source images is adopted as the high-frequency part. The experimental results clearly illustrate that the presented scheme outperforms many other fusion methods, including the guided filter method in the spatial domain, the NSCT-based method in the transform domain, the dual-tree complex wavelet method combined with the NSDFB, and several classic image fusion methods, in both subjective and objective evaluation.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

Some of the images adopted in these experiments are downloaded from the website of http://www.med.harvard.edu/AANLIB/home.html. This work was supported in part by National Natural Science Fund under Grants 61572063 and 61401308, Natural Science Fund of Hebei Province under Grants F2013210094, F2013210109, and Z9904427, and Natural Science Foundation of Hebei University under Grant no. 2014-303.