Abstract

A new multifocus image fusion method is proposed. Two image blocks at the same position are selected by sliding a window over the two source images, the discrete cosine transform (DCT) is applied to each block, and the alternating component (AC) energies of the two blocks are compared to decide which one is well focused. Block matching is then used to collect a group of image blocks that are all similar to the well-focused reference block. Finally, all the blocks are returned to their original positions by weighted averaging, where the weight is determined by the AC energy of the well-focused block. Experimental results demonstrate that, unlike other spatial methods, the proposed method effectively avoids block artifacts, and that it significantly improves on the objective evaluation results obtained by several transform domain methods.

1. Introduction

In many cases, defocused parts exist in an acquired image because the camera is focused on the scenery of interest or because environmental factors restrict the acquisition. Fusing two or more images is an effective way to remove this defocusing and obtain a fully focused image; that is, the well-focused parts of the source images are used to compose a new image in which no defocusing remains.

Image fusion methods fall into two categories: spatial domain and transform domain. Transform-based image fusion algorithms have produced many favorable results, such as pyramid decomposition fusion [1] and wavelet transform fusion [2–5]. Wavelet transform is one of the most commonly used methods. However, the conventional wavelet transform represents only point singularities well and therefore poorly represents the linear singularities that are common in images, such as line or curve edges. To represent linear singularities effectively, a series of super-wavelets has been proposed and applied to image fusion, including the ridgelet [6], curvelet [7], contourlet [8], bandlet [9], and shearlet [10].

Although image fusion techniques based on wavelet and super-wavelet transforms have achieved significant results, all of these transforms are local and inevitably introduce artifacts into the fused images: wavelet transforms often introduce ringing effects, whereas super-wavelet transforms often introduce linear artifacts. In recent years, nonlocal methods, initially developed for image denoising [11–13], have largely relieved such artifacts. As a successful nonlocal method, the block-matching 3D (BM3D) transform [14, 15] has achieved significant progress in various image processing applications, such as image denoising, image enhancement, and image super-resolution.

An effective multifocus image fusion algorithm is proposed in this paper that exploits the enhanced sparse representation of BM3D. The experimental results show that the proposed algorithm effectively avoids all kinds of artifacts and thus obtains excellent subjective visual quality; we call it BM3D fusion (BM3DF). However, the objective evaluation results of the BM3D-based method are not ideal. To further improve them, we remove the 3D transform from the BM3D algorithm while preserving the block-matching and aggregation operators. Since the transform is removed, this simple method is a purely spatial one, yet it achieves ideal fusion results; we call it block-matching spatial fusion (BMSF). Experiments show that its multifocus fusion results, in both subjective visual quality and objective evaluation, are better than those of several state-of-the-art image fusion algorithms.

2. Image Fusion Algorithm Based on BM3D

Because the purposes of image fusion and image denoising differ, only the first stage of BM3D [14] is conducted in this paper; in other words, the second stage of BM3D need not be implemented. The image fusion algorithm is as follows.

Step 1. Given two source images A and B that are well registered but differently focused, divide A and B into overlapping blocks $\{A_x\}_{x \in X}$ and $\{B_x\}_{x \in X}$, respectively, where $X$ is the set of the coordinates of the blocks; $A_{x_R}$ and $B_{x_R}$ are both called reference blocks. Grouping operations are implemented on $A_{x_R}$ and $B_{x_R}$ according to the method in [14]; that is, the image blocks similar to $A_{x_R}$ and to $B_{x_R}$ are stacked and two 3D matrices are formed:
$$\mathbf{G}_A(x_R) = \{A_x : d(A_{x_R}, A_x) \le \tau,\ x \in X\}, \qquad \mathbf{G}_B(x_R) = \{B_x : d(B_{x_R}, B_x) \le \tau,\ x \in X\}, \tag{1}$$
where $d(\cdot,\cdot)$ is the block-distance measure of [14] and $\tau$ is the similarity threshold.
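To make the grouping step concrete, the following Python sketch stacks the blocks similar to a reference block into a 3D array. It assumes a sum-of-squared-differences block distance and a fixed search window; the names block_size, search_radius, tau, and max_blocks are illustrative choices, not the exact settings of [14].

```python
# A minimal sketch of the grouping step (Step 1), under assumed parameters.
import numpy as np

def group_similar_blocks(img, x_r, y_r, block_size=8, search_radius=16,
                         tau=2500.0, max_blocks=16):
    """Stack blocks similar to the reference block at (y_r, x_r) into a 3D array."""
    ref = img[y_r:y_r + block_size, x_r:x_r + block_size].astype(np.float64)
    candidates = []
    h, w = img.shape
    for y in range(max(0, y_r - search_radius),
                   min(h - block_size, y_r + search_radius) + 1):
        for x in range(max(0, x_r - search_radius),
                       min(w - block_size, x_r + search_radius) + 1):
            blk = img[y:y + block_size, x:x + block_size].astype(np.float64)
            d = np.sum((ref - blk) ** 2)  # SSD block distance
            if d <= tau:
                candidates.append((d, y, x))
    # Keep the most similar blocks; the reference block (d = 0) is always first.
    candidates.sort()
    kept = candidates[:max_blocks]
    group = np.stack([img[y:y + block_size, x:x + block_size].astype(np.float64)
                      for _, y, x in kept], axis=0)
    coords = [(y, x) for _, y, x in kept]
    return group, coords
```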

Step 2. The two 3D matrices grouped in (1) are processed with a separable 3D transform $\mathcal{T}_{3D}$ to obtain their sparse representations:
$$\mathbf{T}_A(x_R) = \mathcal{T}_{3D}\big(\mathbf{G}_A(x_R)\big), \qquad \mathbf{T}_B(x_R) = \mathcal{T}_{3D}\big(\mathbf{G}_B(x_R)\big). \tag{2}$$
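As a sketch of Step 2, a separable 3D DCT can serve as $\mathcal{T}_{3D}$: applying a 2D transform to each block and then a 1D transform along the stacking dimension is equivalent to one separable dctn over all three axes. BM3D itself admits several transform choices, so the all-DCT choice below is illustrative only.

```python
# A minimal sketch of the separable 3D transform and its inverse.
import numpy as np
from scipy.fft import dctn, idctn

def forward_3d(group):
    """Orthonormal separable 3D DCT of a stacked block group (K x N x N)."""
    return dctn(group, type=2, norm="ortho", axes=(0, 1, 2))

def inverse_3d(coeffs):
    """Inverse of forward_3d (used later in Step 5)."""
    return idctn(coeffs, type=2, norm="ortho", axes=(0, 1, 2))
```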

Step 3. Let $L_A$ and $L_B$ be the low-frequency coefficients of $\mathbf{T}_A(x_R)$ and $\mathbf{T}_B(x_R)$, respectively, and let $H_A$ and $H_B$ be their high-frequency coefficients.
Defocused patches lack detailed information such as texture and contour, so after the 3D transform the high-frequency coefficients of larger magnitude usually come from the well-focused patches. Since the purpose of image fusion is precisely to recover this detailed information, we select the larger-magnitude coefficients as the high-frequency coefficients of the fusion result. On the other hand, the low-frequency transform coefficients of well-focused and defocused patches are very similar, so we simply average the low-frequency coefficients from both as the low-frequency coefficient of the fusion result.

Step 4. The following coefficient fusion rules are used in this algorithm:
$$L_F = \frac{L_A + L_B}{2}, \qquad H_F = \begin{cases} H_A, & |H_A| \ge |H_B|, \\ H_B, & \text{otherwise}. \end{cases} \tag{3}$$
Note. Each 3D transform produces only one low-frequency (direct) coefficient, whereas the rest are all high-frequency coefficients. The high-frequency coefficients to be fused are taken from the same location in each 3D transform.
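A minimal sketch of the fusion rule in (3), assuming both coefficient arrays have the same shape (i.e., the two groups contain the same number of blocks) and that the direct coefficient of the orthonormal 3D DCT sits at index [0, 0, 0]:

```python
# Fuse two 3D coefficient arrays: average the DC term, take the
# larger-magnitude coefficient everywhere else.
import numpy as np

def fuse_coefficients(tA, tB):
    fused = np.where(np.abs(tA) >= np.abs(tB), tA, tB)  # max-magnitude rule
    fused[0, 0, 0] = 0.5 * (tA[0, 0, 0] + tB[0, 0, 0])  # average the DC term
    return fused
```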

Step 5. The inverse 3D transform is applied to the coefficient matrix $\mathbf{T}_F(x_R)$ that consists of $L_F$ as the low-frequency coefficient and $H_F$ as the high-frequency coefficients:
$$\mathbf{G}_F(x_R) = \mathcal{T}_{3D}^{-1}\big(\mathbf{T}_F(x_R)\big). \tag{4}$$

Step 6. The fused image is obtained by aggregating all the image blocks in each group $\mathbf{G}_F(x_R)$. The particular aggregation formula is as follows:
$$F(x) = \frac{\sum_{x_R \in X} \sum_{x_m \in \mathbf{G}_F(x_R)} \chi_{x_m}(x)\, \hat{F}_{x_m}(x)}{\sum_{x_R \in X} \sum_{x_m \in \mathbf{G}_F(x_R)} \chi_{x_m}(x)}, \tag{5}$$
where $F$ is the fused image, $\chi_{x_m}$ is the characteristic function of the block at $x_m$, $\hat{F}_{x_m}$ is the fused block estimate at $x_m$, and $X$ is the set of coordinates of all blocks. Figure 1 illustrates the algorithm flow chart.
The flowchart presents a direct BM3D-based algorithm for image fusion, which we call BM3DF in this paper.
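The aggregation of Step 6 can be sketched as the standard accumulator-and-weight-buffer average that returns overlapping blocks to their positions. Uniform weights are assumed here for simplicity, whereas BM3D-style aggregation weights each group:

```python
# A minimal sketch of block aggregation (Step 6) with uniform weights.
import numpy as np

def aggregate(blocks, coords, image_shape, block_size=8):
    """Average all overlapping blocks placed back at their coordinates."""
    acc = np.zeros(image_shape, dtype=np.float64)  # sum of block values
    wgt = np.zeros(image_shape, dtype=np.float64)  # sum of weights
    for blk, (y, x) in zip(blocks, coords):
        acc[y:y + block_size, x:x + block_size] += blk
        wgt[y:y + block_size, x:x + block_size] += 1.0
    wgt[wgt == 0] = 1.0  # avoid division by zero at uncovered pixels
    return acc / wgt
```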
The experimental results show that the fused result of BM3DF is not ideal in terms of objective evaluation, although its subjective visual quality remains better than that of some existing algorithms. There may be the following reasons. First, the high-frequency coefficient selection strategy of BM3DF is somewhat rough. Second, the image details from the well-focused image patches are inevitably smoothed to some degree by the inverse 3D transform after coefficient fusion and by the final aggregation procedure. Thus, the fused result is not ideal in objective evaluation. If, instead, a good strategy is used to judge in advance which image blocks come from the well-focused parts, block matching is implemented only on those well-focused blocks, and the 3D transform is removed, better objective evaluation may be obtained. Therefore, a simpler but more effective image fusion algorithm than BM3DF is proposed, namely, block-matching spatial fusion (BMSF).

3. Block-Matching Spatial Image Fusion

In this algorithm, we remove the 3D transform, but block matching and block aggregation are preserved. Thus, the algorithm can still avoid introducing artifacts and can obtain excellent objective evaluation results. Block aggregation is preserved to avoid abrupt edge transitions in the fused image, while the image details in the well-focused patches are hardly affected by the defocused ones. Although DCT does not represent image textures or contours as well as DWT or some super-wavelets, its alternating component (AC) energy is sufficient to judge which patches are better focused. We therefore use the AC energy of the DCT to decide in advance which patches are well focused. The BMSF algorithm is as follows.

Step 1. Block matching is conducted in the same way as in Step 1 of the BM3DF algorithm.

Step 2. The DCT is implemented only on the two reference blocks, and the AC energies of these blocks are calculated as follows:
$$E_A = \sum_{u,v} C_A^2(u,v) - C_A^2(0,0), \qquad E_B = \sum_{u,v} C_B^2(u,v) - C_B^2(0,0),$$
where $C_A(u,v)$ and $C_B(u,v)$ are the DCT coefficients of the two reference blocks and $C_A(0,0)$ and $C_B(0,0)$ are their direct components.
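A minimal sketch of this AC energy measure: with an orthonormal DCT, subtracting the squared direct component from the total coefficient energy leaves exactly the AC energy of the block:

```python
# AC energy of a block: total DCT coefficient energy minus the DC term.
import numpy as np
from scipy.fft import dctn

def ac_energy(block):
    c = dctn(block.astype(np.float64), type=2, norm="ortho")
    return np.sum(c ** 2) - c[0, 0] ** 2
```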

Step 3. If $E_A \ge E_B$, then we consider that $A_{x_R}$, rather than $B_{x_R}$, comes from the well-focused image. Using this selection rule, we obtain all the well-focused image block groups, denoted $\mathbf{G}_F(x_R)$.

Step 4. The fused image is obtained by aggregating all the image blocks in each group $\mathbf{G}_F(x_R)$, using the same aggregation formula as (5), where $F$ is the fused image, $\chi_{x_m}$ is the characteristic function, and $X$ is the set of coordinates of all blocks. Figure 2 illustrates the algorithm flow chart, and Figure 3 presents the algorithm diagram to show the proposed algorithm clearly.
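Combining the helpers sketched above (group_similar_blocks, ac_energy, aggregate), the per-position core of BMSF might look as follows. The sketch follows the ordering suggested at the end of Section 2, matching blocks only within the well-focused source; window stepping, border handling, and the AC-energy aggregation weights mentioned in the abstract are omitted for brevity, and fuse_at_position is a hypothetical name:

```python
# A simplified per-position core of BMSF, reusing the earlier sketches.
def fuse_at_position(img_a, img_b, x_r, y_r, block_size=8):
    """Select the better-focused source at (y_r, x_r) and group its blocks."""
    blk_a = img_a[y_r:y_r + block_size, x_r:x_r + block_size]
    blk_b = img_b[y_r:y_r + block_size, x_r:x_r + block_size]
    # Steps 2-3: the DCT AC energy decides which reference block is well focused.
    src = img_a if ac_energy(blk_a) >= ac_energy(blk_b) else img_b
    # Block matching (no 3D transform) is then run on the well-focused source.
    return group_similar_blocks(src, x_r, y_r, block_size=block_size)
```

Collecting the returned blocks and coordinates over all reference positions and passing them to aggregate then yields the fused image.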

4. Experimental Results

In this section, multifocus image fusion experiments are conducted with the two proposed algorithms on the "clock," "disk," and "lab" images, which are commonly used in image fusion research. Two fusion criteria, mutual information (MI) [16] and $Q^{AB/F}$ [17], are used to evaluate fusion performance [8, 18, 19]. MI$(X;Y) = 0$ if and only if $X$ and $Y$ are independent random variables; in image fusion, MI measures the amount of information shared between the source images and the fused image, and a higher MI value indicates that more information is shared. $Q^{AB/F}$ represents the amount of image detail preserved in the fused image, and a higher $Q^{AB/F}$ value means the fused image contains more image details. The evaluation criteria listed in Table 1 demonstrate that the proposed BMSF algorithm achieves the fused images most consistent with the source images and best preserves the image edges.
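As an illustration of the MI criterion, the following sketch estimates MI from a joint histogram, assuming 8-bit grayscale images and 256 bins; the usual fusion score is then the sum of the MI between each source image and the fused image:

```python
# Mutual information between two images, estimated from a joint histogram.
import numpy as np

def mutual_information(x, y, bins=256):
    hist, _, _ = np.histogram2d(x.ravel(), y.ravel(), bins=bins)
    pxy = hist / hist.sum()                    # joint distribution
    px = pxy.sum(axis=1, keepdims=True)        # marginal of x
    py = pxy.sum(axis=0, keepdims=True)        # marginal of y
    nz = pxy > 0                               # avoid log(0)
    return np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz]))

# Fusion score: mutual_information(src_a, fused) + mutual_information(src_b, fused)
```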

We compare the two proposed algorithms with the classical PCNN-based image fusion algorithm [8], sharpness statistics (SS) [18], and the state-of-the-art guided filtering (GF) [19] to show the advantages of the proposed algorithms. For the proposed methods, all the parameters of BM3DF are the same as those of BM3D. For BMSF, the image block size is slightly larger than in BM3DF, and all other parameters are the same as those of BM3D. As shown in Figures 4–6, PCNN leads to obviously blurred edges; SS generates very sharp edges but introduces artifacts around boundaries; GF produces good results but still suffers blurring around boundaries. The proposed approach achieves minimal artifacts around object boundaries and produces the sharpest images. However, BM3DF cannot achieve good objective evaluation results.

5. Conclusions

Block-matching and nonlocal-means algorithms were initially proposed to improve image denoising performance and have obtained a series of significant achievements. The most significant advantage of nonlocal methods is that they can relieve, and even avoid, the artifacts introduced by traditional transform domain methods. Inspired by the nonlocal approach, this paper proposed two multifocus image fusion algorithms, called BM3DF and BMSF, respectively.

Although the subjective visual quality of the image fused by BM3DF is better than that of some existing algorithms, its objective evaluation is not yet ideal. In fact, the image details from the well-focused image patches are inevitably smoothed to some degree by the inverse 3D transform after coefficient fusion and by the final aggregation procedure. To obtain good visual quality and ideal objective evaluation simultaneously, we use a strategy that judges in advance which image blocks come from the well-focused parts, implement block matching only on those well-focused blocks, and remove the 3D transform, thus obtaining better objective evaluation; that is, a simpler but more effective image fusion algorithm than BM3DF is proposed, called BMSF in this paper.

Some related references have even pointed out that the BM3D algorithm is too complex, but the running time of the BM3D program is not long at all compared with many existing image denoising algorithms; it usually takes less than 5 seconds to process a common grayscale image on a Core 2 Duo 2.66 GHz computer. Because BM3DF uses only the first stage of BM3D, its running time is significantly shortened; it takes only about 2 seconds to fuse the "clock" images in our experiments. Furthermore, since BMSF removes the 3D transform of BM3D, its running time should be shorter still; however, the image block size in BMSF is slightly larger than in BM3DF, so the running time of BMSF is nearly the same as that of BM3DF.

Experimental results have shown that improved image fusion results can be achieved even though the proposed algorithms use simple fusion rules.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work was supported by the National Science Foundation of China under Grants nos. 61379015 and 61233011, the Natural Science Foundation of Shandong Province under Grant no. ZR2011FM004, and the Science and Technology Development Project of Tai’an City under Grant no. 20113062.