Abstract

Multifocus image fusion is a process that integrates a partially focused image sequence into a single fused image that is in focus everywhere; many methods have been proposed for it over the past decades. The Dual Tree Complex Wavelet Transform (DTCWT) is one of the most precise, eliminating two main defects of the Discrete Wavelet Transform (DWT). Q-shift DTCWT was proposed afterwards to simplify the construction of the filters in DTCWT, producing better fusion effects. A different image fusion strategy based on Q-shift DTCWT is presented in this work. In this strategy, each image is first decomposed into low and high frequency coefficients, which are fused using different rules; various fusion rules are then innovatively combined with Q-shift DTCWT, such as the Neighborhood Variant Maximum Selectivity (NVMS) and the Sum Modified Laplacian (SML). Finally, the fused coefficients are extracted from the source images and reconstructed to produce one fully focused image. The strategy is compared visually and quantitatively against several existing fusion methods in extensive experiments and yields good results both on standard images and on microscopic images. We conclude that the NVMS rule outperforms the others when combined with Q-shift DTCWT.

1. Introduction

Since an optical lens has a limited depth of focus (DOF), it is difficult to obtain an image of an object with every part of the field in focus. Because the human eye is insensitive to unsharpness or blurriness within the DOF of the lens, only the objects within the DOF appear to be in focus, while objects outside the DOF are blurred [1]. The solution is multifocus image fusion, which extracts the in-focus information from each partially focused image into one fully focused image that is better for visual perception and further processing. This technology has been extended to various fields, such as microscopic imaging, visual inspection, 3D shape recovery, and measurement [2]. It has therefore attracted the attention of many researchers, who have presented a variety of fusion algorithms.

Existing image fusion methods can be categorized into pixel-, feature-, and decision-level methods [3]. In this paper, only pixel-level methods are discussed, because their high accuracy and low information loss suit microscopic imaging. Pixel-level fusion methods are subdivided into spatial domain and transform domain algorithms [2, 4]. Transform domain algorithms, namely, multiresolution algorithms, are more robust, since the human visual system also processes information in a multiresolution way, in line with their processing principle [5]. Multiresolution algorithms therefore achieve higher precision, while spatial domain methods are commonly used when speed is required. Transform domain fusion methods have two major research hotspots: pyramid algorithms [2, 6, 7] and DWT algorithms [2]. Presently, DWT based fusion is the more active topic, with better fusion effect [5]. The DTCWT [8] is an improvement on the DWT with several important additional properties: approximate shift invariance, good directional selectivity, and limited data redundancy. In the Q-shift solution [8], the construction of the filters is largely simplified. Image fusion with guided filtering was presented by Li et al. [9]. Yin et al. presented a method of image fusion and superresolution using sparse representation [10]. Yang and Li reported a pixel-level image fusion method with simultaneous orthogonal matching pursuit [11].

In this work, we have proposed an effective fusion rule for multifocus image fusion after using Q-shift DTCWT and performed this method over partially focused image sequences blurred by Gaussian operators and microscopic image sequences. The proposed method is compared with DWT and different pyramid algorithms. Some common objective metrics [12, 13] are used to analyze the performance of these algorithms and different fusion rules.

The remainder of this paper is organized as follows. Section 2 presents the preliminaries of Q-shift DTCWT. Section 3 describes the proposed image fusion approach. Section 4 covers the fusion rules applied in the proposed approach. Sections 5 and 6 give the experimental results, performance comparisons, and time consumption. Finally, conclusions are drawn in Section 7.

2. Q-shift DTCWT

Aiming to overcome the drawbacks of the DWT, Kingsbury proposed the Complex Wavelet Transform (CWT) for image processing. The CWT operates on complex-valued coefficients, and perfect reconstruction filter banks must be built in this complex form. It is relatively easy to construct such filter banks for the first decomposition level, but much harder for higher levels. In 1999, Kingsbury proposed the DTCWT, which keeps the advantages of the CWT while achieving perfect reconstruction [8]. The DTCWT is a wavelet transform that uses a pair of binary-structured filter trees to compute the real and imaginary parts of the wavelet coefficients in parallel. The principle of one level of the DTCWT decomposition of a two-dimensional signal is shown in Figure 1; the reconstruction process is not shown, as it is simply the inverse of the decomposition.

The DTCWT gains a series of advantages from this structure: approximate shift invariance, good directional selectivity, and limited data redundancy. Owing to its shift invariance, images fused by DTCWT are smooth and continuous, while images fused by DWT contain irregular edges. Another major advantage of DTCWT is good directional selectivity: it produces six subbands at each scale, for both real and imaginary parts, oriented at ±15°, ±45°, and ±75°, while DWT only resolves the limited directions 0°, 45°, and 90°; this improves the transform precision and retains more detailed information. However, designing DTCWT filters is complicated, since they must simultaneously satisfy the biorthogonality and phase conditions [8]. Q-shift DTCWT addresses this problem: its quarter-shift filters produce complex wavelets with exactly linear phase, which greatly simplifies filter construction.

The transform uses two parallel treelike filter banks (Tree A and Tree B in Figure 1, generating the real and imaginary parts, resp.) to realize complex wavelet filtering of the input signal. To achieve shift invariance, the filters are designed with different delays, ensuring that each filter can sample the values discarded by the other filters because of downsampling; in this way, aliasing is minimized. The trees are applied to the rows and columns of the signal, respectively, and generate two low frequency coefficients, $L_1$ and $L_2$, containing rough information, and six high frequency coefficients, $H_1$ to $H_6$, containing detailed information in different directions. In Figure 1, the character image “wood” is decomposed into eight subimages; the two low frequency coefficients appear similar to the original image, while the other six contain the extracted detailed information in different directions. The two low frequency images serve as the input signal of the next decomposition level.
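To make the decomposition concrete, the following is a minimal sketch using the open-source Python dtcwt package, which implements Kingsbury's Q-shift transform (the authors' own code is in MATLAB, so the package choice, image, and parameters here are our assumptions):

```python
# Minimal Q-shift DTCWT decomposition sketch with the Python "dtcwt" package.
import numpy as np
import dtcwt

img = np.random.rand(256, 256)  # stand-in for one grayscale source image

# 'qshift_a' selects the quarter-shift filters used from level 2 upwards.
transform = dtcwt.Transform2d(biort='near_sym_a', qshift='qshift_a')
pyramid = transform.forward(img, nlevels=5)

print(pyramid.lowpass.shape)  # coarse, low frequency image
for level, hp in enumerate(pyramid.highpasses, start=1):
    # hp has shape (H, W, 6): six complex subbands at +/-15, +/-45, +/-75 deg.
    print(level, hp.shape)

# Perfect reconstruction: the inverse transform recovers the input signal.
recon = transform.inverse(pyramid)
assert np.allclose(recon, img)
```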

3. The Proposed Fusion Methods

The general image fusion scheme using DTCWT for both grayscale and color images is shown in Figure 2. Since the DTCWT can only be computed on monochrome images, color images must be decomposed into component subimages. Because the commonly used RGB color model causes color distortion due to the high correlation among the R, G, and B components, we transform it into the YIQ color model. Since the Y (luminance) component contains much more information than the I and Q (chrominance) components [14], we perform the DTCWT fusion only on the Y components, while the I and Q components are fused through a mapping table corresponding to the Y fusion. Finally, the fused image in the YIQ color model is transformed back into the RGB color model for display. For grayscale images, these steps are omitted and DTCWT fusion is performed directly on the source images.
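The color handling can be sketched as follows, using the standard NTSC RGB-to-YIQ matrix (the helper names are ours; the fusion itself runs only on the Y plane):

```python
# Hedged sketch of the RGB <-> YIQ step around the Y-channel fusion.
import numpy as np

RGB2YIQ = np.array([[0.299,  0.587,  0.114],
                    [0.596, -0.274, -0.322],
                    [0.211, -0.523,  0.312]])

def rgb_to_yiq(rgb):
    """rgb: (H, W, 3) float array in [0, 1] -> (H, W, 3) YIQ array."""
    return rgb @ RGB2YIQ.T

def yiq_to_rgb(yiq):
    """Inverse conversion for display after fusion."""
    return yiq @ np.linalg.inv(RGB2YIQ).T
```

In use, the Y planes of all source images are fused with DTCWT, while the I and Q planes are carried along according to the same per-pixel selection, as the mapping-table step above describes.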

The detailed steps of the DTCWT fusion process are as follows:

(1) Perform DTCWT decomposition on the input images $a$ to $k$, which are focused on different parts. Each image is decomposed into two low frequency coefficients, $(L_{a1}, L_{a2})$ to $(L_{k1}, L_{k2})$, and a set of high frequency coefficients, $(H_{a1}, H_{a2}, \ldots)$ to $(H_{k1}, H_{k2}, \ldots)$. The high frequency coefficients represent detailed information of the input image in different directions, and their number depends on the total decomposition level.

(2) Fusion rules are applied to the low frequency and high frequency coefficients separately. More details are presented in the next section.

(3) The selected high frequency and low frequency coefficients are then adjusted with a consistency check. In multifocus images, blurred and sharp areas are generally regionally connected: for example, if the surrounding pixels are in focus, then a single pixel inside must be in focus too. The consistency check [15] picks out such isolated pixels and changes them to be consistent with their surroundings.

(4) Finally, DTCWT reconstruction is applied to the final low frequency coefficients $L_1$ and $L_2$ and the high frequency coefficients $H_1$ to $H_{6N}$ (with $N$ the total decomposition level) to construct the fused image.
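Putting the four steps together, the following is a compact two-image sketch built on the Python dtcwt package. The fusion rules here are deliberately simple placeholders (averaging for low frequency, absolute-maximum for high frequency; Section 4 gives the actual rules), and the consistency check is approximated by a 3×3 median (majority) filter on the decision mask, which is our simplification of [15]:

```python
# Sketch of steps (1)-(4) for two source images (placeholder fusion rules).
import numpy as np
import dtcwt
from scipy.ndimage import median_filter

def fuse_dtcwt(img_a, img_b, nlevels=5):
    t = dtcwt.Transform2d(biort='near_sym_a', qshift='qshift_a')
    pa, pb = t.forward(img_a, nlevels), t.forward(img_b, nlevels)

    # (2) Low frequency: weighted average (equal weights here).
    low = 0.5 * (pa.lowpass + pb.lowpass)

    high = []
    for ha, hb in zip(pa.highpasses, pb.highpasses):
        # (2) High frequency: keep the coefficient with the larger modulus.
        mask = np.abs(ha) >= np.abs(hb)  # True -> take image a
        # (3) Consistency check, approximated by a 3x3 majority vote: isolated
        # decisions are flipped to agree with their neighborhood.
        mask = median_filter(mask.astype(np.uint8), size=(3, 3, 1)).astype(bool)
        high.append(np.where(mask, ha, hb))

    # (4) Reconstruct the fused image from the selected coefficients.
    return t.inverse(dtcwt.Pyramid(low, tuple(high)))
```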

4. Fusion Rules

The fusion rule is the most important ingredient of the proposed fusion methods, as it is what distinguishes focused regions from defocused regions. In the proposed method, the low frequency and high frequency coefficients represent rough and detailed information, respectively, and thus require different fusion rules [16]. Several fusion rules are described in this section, including pixel-based rules such as the Weighted Average (WA) method [17] and the Synthesis image Module Value Maximum Selectivity (SMVMS), and region-based rules such as the Neighborhood Variant Maximum Selectivity (NVMS) [18], the Neighborhood Gradient Maximum Selectivity (NGMS) [19], and the SML [20]. Details of these fusion rules are discussed below.

4.1. The Weighted Average (WA) Method

WA is normally used for the low frequency coefficients, as the differences in low frequency information are comparatively small. The final coefficient is defined as

$$L_F(x,y)=\sum_{i=1}^{k}w_i L_i(x,y),$$

where $L_1(x,y)$ to $L_k(x,y)$ are the pixel values of the source low frequency coefficients at location $(x,y)$ and $w_1$ to $w_k$ are the weights for $L_1$ to $L_k$, generally $w_1=\cdots=w_k=1/k$.
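A one-function sketch of the WA rule (function and argument names are ours):

```python
# Weighted average of k low frequency coefficient maps, as in the WA rule.
import numpy as np

def weighted_average(lows, weights=None):
    """lows: list of k same-sized 2-D arrays; weights default to 1/k each."""
    stack = np.stack(lows)                       # (k, H, W)
    if weights is None:
        weights = np.full(len(lows), 1.0 / len(lows))
    return np.tensordot(weights, stack, axes=1)  # sum_i w_i * L_i
```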

4.2. The Neighborhood Variant Maximum Selectivity (NVMS)

NVMS is defined as follows: calculate the neighborhood variance of the pixel of the $k$th picture at location $(x,y)$:

$$\sigma_k(x,y)=\frac{1}{M\times N}\sum_{m}\sum_{n}\left[C_k(x+m,y+n)-\bar{C}_k(x,y)\right]^2,$$

where $M\times N$ is the window size, generally $3\times3$ or $5\times5$, and the sums run over the window. $\bar{C}_k(x,y)$ is the neighborhood average, which is given by

$$\bar{C}_k(x,y)=\frac{1}{M\times N}\sum_{m}\sum_{n}C_k(x+m,y+n).$$

The final coefficient $C_F(x,y)$ is taken from the $k$th source whose variance is the largest among all computed values. Hence,

$$C_F(x,y)=C_{k^*}(x,y),\qquad k^*=\arg\max_{k}\sigma_k(x,y).$$
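A vectorized sketch of NVMS (names are ours; the window statistics are computed with a uniform filter):

```python
# NVMS sketch: per-pixel neighborhood variance per source, argmax selection.
import numpy as np
from scipy.ndimage import uniform_filter

def nvms(coeff_maps, window=3):
    stack = np.stack(coeff_maps)                           # (k, H, W)
    mean = uniform_filter(stack, size=(1, window, window))
    var = uniform_filter(stack ** 2, size=(1, window, window)) - mean ** 2
    best = np.argmax(var, axis=0)                          # winner per pixel
    return np.take_along_axis(stack, best[None], axis=0)[0]
```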

4.3. The Neighborhood Gradient Maximum Selectivity (NGMS)

NGMS is defined by the neighborhood gradient $G_k(x,y)$, which is formed as follows:

$$G_k(x,y)=\sum_{m}\sum_{n}\sqrt{\left[C_k(x+m,y+n)-C_k(x+m-1,y+n)\right]^{2}+\left[C_k(x+m,y+n)-C_k(x+m,y+n-1)\right]^{2}},$$

where the sums run over the same neighborhood window as in NVMS.

$C_F(x,y)$ is taken from the $k$th source with the largest $G_k(x,y)$. Hence,

$$C_F(x,y)=C_{k^*}(x,y),\qquad k^*=\arg\max_{k}G_k(x,y).$$
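A sketch of NGMS under the gradient definition assumed above (forward differences; this is our reading, since the exact published formula was not recoverable):

```python
# NGMS sketch: windowed gradient magnitude per source, argmax selection.
import numpy as np
from scipy.ndimage import uniform_filter

def ngms(coeff_maps, window=3):
    stack = np.stack(coeff_maps)                           # (k, H, W)
    gx = np.diff(stack, axis=2, append=stack[:, :, -1:])   # horizontal diff
    gy = np.diff(stack, axis=1, append=stack[:, -1:, :])   # vertical diff
    grad = uniform_filter(np.hypot(gx, gy), size=(1, window, window))
    best = np.argmax(grad, axis=0)
    return np.take_along_axis(stack, best[None], axis=0)[0]
```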

4.4. The Sum Modified Laplacian (SML) Method

The Sum Modified Laplacian is given by

$$\nabla^{2}_{ML}C_k(x,y)=\left|2C_k(x,y)-C_k(x-s,y)-C_k(x+s,y)\right|+\left|2C_k(x,y)-C_k(x,y-s)-C_k(x,y+s)\right|,$$

$$SML_k(x,y)=\sum_{m}\sum_{n}\nabla^{2}_{ML}C_k(x+m,y+n),$$

where $s$ denotes the step length. $C_F(x,y)$ is taken from the $k$th source with the largest $SML_k(x,y)$. Hence,

$$C_F(x,y)=C_{k^*}(x,y),\qquad k^*=\arg\max_{k}SML_k(x,y).$$
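A sketch of the SML rule (names ours; `step` is the step length $s$, and the windowed mean used here ranks pixels identically to the windowed sum):

```python
# SML sketch: modified Laplacian with step s, summed in a window, argmax pick.
import numpy as np
from scipy.ndimage import uniform_filter

def sml(coeff_maps, step=1, window=3):
    stack = np.stack(coeff_maps)                           # (k, H, W)
    p = np.pad(stack, ((0, 0), (step, step), (step, step)), mode='edge')
    c = p[:, step:-step, step:-step]                       # centered view
    ml = (np.abs(2 * c - p[:, :-2 * step, step:-step] - p[:, 2 * step:, step:-step])
          + np.abs(2 * c - p[:, step:-step, :-2 * step] - p[:, step:-step, 2 * step:]))
    score = uniform_filter(ml, size=(1, window, window))   # windowed mean of ML
    best = np.argmax(score, axis=0)
    return np.take_along_axis(stack, best[None], axis=0)[0]
```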

4.5. Synthesis Image Module Value Maximum Selectivity (SMVMS)

SMVMS is used for the high frequency coefficients, as high frequency coefficient magnitudes are generally larger in focused areas. To keep the computation time short, a synthesis image $S_k^l$ (where $l$ is the decomposition level) is formed by combining the coefficients of the six directions, and the fused coefficient is taken from the source whose synthesis value is maximal, calculated as

$$S_k^{l}(x,y)=\sqrt{\sum_{d=1}^{6}\left|H_{k,d}^{l}(x,y)\right|^{2}},\qquad H_F^{l}(x,y)=H_{k^*}^{l}(x,y),\quad k^*=\arg\max_{k}S_k^{l}(x,y).$$
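A sketch of SMVMS over the six directional subbands (names ours):

```python
# SMVMS sketch: collapse six directional subbands per source into a synthesis
# map, then pick the source with the larger synthesis value at every pixel.
import numpy as np

def smvms(highpasses):
    """highpasses: list of complex arrays, each (H, W, 6), one per source."""
    stack = np.stack(highpasses)                        # (k, H, W, 6)
    synth = np.sqrt((np.abs(stack) ** 2).sum(axis=-1))  # (k, H, W)
    best = np.argmax(synth, axis=0)                     # winner per pixel
    return np.take_along_axis(stack, best[None, ..., None], axis=0)[0]
```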

5. Experimental Results and Analysis

The proposed algorithm was run in Matlab 2014a on an Intel i5 4590 processor with 4 GB RAM. Traditional fusion methods based on DWT [16] and on pyramid algorithms, namely, Gradient Pyramid (GRAD), Morphology Pyramid (MOP), Ratio Pyramid (RAT), and Laplacian Pyramid (LAP) [2, 13], are used for comparison with the proposed one. The DWT method, the pyramid methods, and the proposed Q-shift DTCWT methods all use 5 decomposition levels. Regarding fusion rules, the traditional methods all use WA for the low frequency and SMVMS for the high frequency coefficients. For the proposed method, the five fusion rules described in the last section are used, and the different combinations are shown in Table 1. DTCWT1 serves to compare Q-shift DTCWT against the traditional methods; the remaining four are improved variants with better performance. Low frequency information is highly correlated with its neighborhood, so it is natural to infer that region-based fusion rules could improve its performance.

In DTCWT2 to DTCWT4, three region-based fusion rules are applied to the low frequency coefficients to test this inference, and, in DTCWT5, the region-based NVMS is used on both the low frequency and the high frequency coefficients for potential further improvement.

5.1. Experiments on Standard Images

In the first experiment, the performance of the proposed fusion method is demonstrated by fusing 20 pairs of blurred images, generated by filtering the source images shown in Figure 3 with a Gaussian filter. For reasons of space, only six pairs are displayed, but the others are similar to those presented. In each pair, complementary regions of the source images are blurred. The source images are standard grayscale or color images and are taken as the ground truth images, serving as templates for comparison.

We have found that it is difficult to evaluate the quality of fused images visually, especially when the differences are too small to be observed; hence, objective methods are used for a more scientific and accurate evaluation. The first objective evaluation method is pixel value subtraction. The difference image is obtained by subtracting the ground truth image from the fused image. The subtraction results for “Peppers” are shown in Figure 4, consisting of the ground truth image (Figure 4(a)), the partially blurred images (Figure 4(b), blurred in the middle pepper, and Figure 4(c), blurred in the surroundings), the subtraction results of the blurred images (Figures 4(d) and 4(e)), and the subtraction results of Gradient Pyramid (Figure 4(f)), Morphology Pyramid (Figure 4(g)), Ratio Pyramid (Figure 4(h)), Laplacian Pyramid (Figure 4(i)), DWT (Figure 4(j)), and DTCWT1 to DTCWT5 (Figures 4(k)–4(o)).

By analyzing the colors of these subtraction images against the color bar, we can judge the deviation between each fused image and the ground truth image. Clearly, the subtraction results of GRAD and MOP are relatively poor, and those of RAT and LAP are better overall but extremely poor at edges. DWT performs better than the pyramid methods, and, by comparing DWT and DTCWT1, it can be concluded that DTCWT is better than DWT under the same fusion rules. With region-based fusion rules on the low frequency coefficients, as in DTCWT2 to DTCWT4, the fused results are much better still. Finally, in DTCWT5, where both the low frequency and the high frequency coefficients use the region-based NVMS fusion rule, the deviation is confined mainly to the edge of the middle pepper.

The second objective evaluation approach uses quantitative metrics [21]. Quantitative metrics are objective indicators that overcome the influence of inaccurate human visual judgment and evaluate the effectiveness of image fusion mathematically. Three quantitative metrics are applied here: mutual information (MI), peak signal-to-noise ratio (PSNR), and root mean square error (RMSE), which are described as follows.

5.1.1. Mutual Information (MI)

Let $h_A$, $h_B$, and $h_F$ denote the normalized histograms of source image $A$, source image $B$, and the fused image $F$, respectively. $h_{AF}$ and $h_{BF}$ denote the joint histograms between the fused image and each source image, and $M$ and $N$ denote the row and column sizes of the image. The mutual information between source image $A$ and the fused image $F$ is

$$MI_{AF}=\sum_{a}\sum_{f}h_{AF}(a,f)\log_{2}\frac{h_{AF}(a,f)}{h_{A}(a)\,h_{F}(f)},$$

and $MI_{BF}$ is defined analogously.

The total mutual information is defined as

$$MI=MI_{AF}+MI_{BF}.$$

Larger values imply better image quality.
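A histogram-based sketch of the MI metric (function name and bin count are ours):

```python
# Mutual information between one source and the fused image via histograms.
import numpy as np

def mutual_information(src, fused, bins=256):
    joint, _, _ = np.histogram2d(src.ravel(), fused.ravel(), bins=bins)
    joint /= joint.sum()                    # joint histogram h_AF(a, f)
    pa = joint.sum(axis=1, keepdims=True)   # marginal h_A(a)
    pf = joint.sum(axis=0, keepdims=True)   # marginal h_F(f)
    nz = joint > 0                          # avoid log(0)
    return float((joint[nz] * np.log2(joint[nz] / (pa @ pf)[nz])).sum())

# Total score for two sources A and B: MI = MI_AF + MI_BF.
```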

5.1.2. Peak Signal-to-Noise Ratio (PSNR)

$$PSNR=10\log_{10}\frac{L^{2}}{\frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}\left[R(i,j)-F(i,j)\right]^{2}},$$

where $R(i,j)$ and $F(i,j)$ denote, respectively, the $(i,j)$th pixel values of the reference image (ground truth image) and the fused image, and $L$ is the number of gray levels in the image. The PSNR is higher when the ideal and fused images are closer; a higher value indicates better fusion.

5.1.3. Root Mean Square Error (RMSE)

The RMSE between the reference image $R$ and the fused image $F$ can be calculated as

$$RMSE=\sqrt{\frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}\left[R(i,j)-F(i,j)\right]^{2}}.$$

RMSE approaches zero when the reference and fused images are nearly identical and increases as their similarity decreases.
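Both reference-based metrics can be sketched in a few lines ($L=255$ assumes 8-bit images; names are ours):

```python
# PSNR and RMSE against a ground truth image, following the definitions above.
import numpy as np

def rmse(ref, fused):
    diff = ref.astype(float) - fused.astype(float)
    return float(np.sqrt(np.mean(diff ** 2)))

def psnr(ref, fused, levels=255):
    e = rmse(ref, fused)
    return float('inf') if e == 0 else 20.0 * np.log10(levels / e)
```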

The values of MI, PSNR, and RMSE of the fused images under the various fusion methods are shown in Tables 2–4, with the best values marked in bold. DTCWT2, DTCWT4, and DTCWT5 show the best performance under MI. The PSNR values of the LAP, DTCWT1, DTCWT2, and DTCWT3 methods are superior to the others. Under RMSE, the best values belong to RAT, DTCWT3, and DTCWT4. Since the DTCWT based fusion methods achieve the best evaluation values in most situations, we conclude that the proposed fusion methods are superior to the existing pyramid and DWT methods, while the fusion rule can be chosen according to the specific conditions, which will be studied in future work.

5.2. Experiments on Microscopic Images

In the second experiment, the fusion process is performed on two groups of images captured by a metallographic microscope. As described in Section 1, only the objects within the DOF of the microscope's objective lens appear in focus. In practice, the observed specimens sometimes have rough surfaces that cannot be captured entirely within the DOF in one shot. To solve this problem, several partially focused pictures are captured by adjusting the height of the lens so that different parts fall within the DOF in turn; the fusion methods presented above are then applied to the image sequence to obtain an entirely clear image.

Since metallographic microscopes are widely used in the machinery, electronics, chip, chemical, precision instrument, and other industries for observing and analyzing the surface quality of opaque materials, two examples are presented in this paper: the inspection of a Printed Circuit Board (PCB) and of worn external turning tools, using a metallographic microscope at 5x magnification and a CCD camera connected to a computer as the experimental devices. The fusion results are analyzed both visually and objectively.

In the first, PCB experiment, since the components and their joints are uneven, it is hard to capture a totally clear image in one shot; with the methods proposed in this article, however, completely focused images can be acquired. Three image sequences of the PCB are shown in Figures 5, 6, and 7, but, for reasons of space, only sequence 1 is displayed in detail. The PCB we used is shown in Figure 5(a), with the inspected area marked in red blocks. In Figure 5, we see three source images, focused on the board (Figure 5(b)), the solder paste (Figure 5(c)), and the component pin (Figure 5(d)), and ten fused images (Figures 5(e)–5(n)) obtained with the various fusion methods GRAD, MOP, RAT, LAP, DWT, and DTCWT1 to DTCWT5. For better observation, the small part marked by the red dotted-line block in Figure 5(b) is magnified in Figures 5(o)–5(y), in which Figure 5(o) is extracted from the source image as a ground truth image and Figures 5(p)–5(y) are extracted from the fused images. It is obvious that the images fused with the pyramid methods (Figures 5(p)–5(s)) show severe distortion. The DWT result is better but contains many vertical light beams, which do not appear in the DTCWT results. DTCWT1 to DTCWT4 contain some degree of light dispersion, but DTCWT5 performs almost perfectly.

For objective evaluation, we use only PSNR and RMSE as quantitative metrics, since MI is not applicable when an experiment has more than two source images. As it is impossible to obtain a completely clear reference image, we crop clear parts of the source images as the ground truth images, as shown in the red blocks in Figures 5–7. The fused images are cropped at the same positions for objective evaluation.

The PSNR and RMSE values of these areas are shown in Tables 5 and 6, with the best values marked in bold. DTCWT4 and DTCWT5 show the best performance when evaluated with PSNR, while RAT, DTCWT3, DTCWT4, and DTCWT5 are better under RMSE. Overall, the results match the subjective evaluation well, and the proposed methods perform better: their fused images preserve more details from the source images and are thus clearer, with less distortion, than those of the other methods. Using the proposed DTCWT fusion methods, we can obtain totally clear images of the PCB for component failure detection, solder paste inspection, and other applications.

For the second, cutting tool experiment, image sequences of a worn external turning tool are shown in Figures 8–10, with the turning tool shown in Figure 8(a) and the inspected area marked in red blocks. Because of its worn surface, the turning tool appears partially focused and partially defocused in any single image. We analyze the fusion results both subjectively and objectively; the fused images are not presented for reasons of space. The clear parts of the different source images are cropped as the ground truth images, as shown in the red blocks in Figures 8–10.

The PSNR and RMSE values of the fusion results are shown in Tables 7 and 8. We can conclude that the DTCWT based methods also perform outstandingly in the fusion of turning tool images.

6. Time Consumption

This part discusses the computation time of the above experiments. From Tables 9 and 10, it can be concluded that the pyramid algorithms cost very little time, the DWT algorithm costs a little more, and the proposed Q-shift DTCWT methods cost the most. The different fusion rules used in this paper appear to have little effect on the computation time.

7. Conclusion

This paper provides an effective method for fusing various images using Q-shift DTCWT. Since DTCWT is approximately shift invariant and resolves six directions (±15°, ±45°, ±75°), more than DWT, it preserves more detail and edge information from the source images, and the Q-shift solution simplifies its filter construction. Since the fusion rule is another significant factor in image fusion, five different fusion rules are presented in this article and evaluated both visually and objectively. From the extensive experimental data, we conclude that the Q-shift DTCWT methods are somewhat better than the other multiresolution fusion methods under the same fusion rules, and that with region-based fusion rules on the low frequency coefficients, or on both the low frequency and the high frequency coefficients, the DTCWT methods perform outstandingly. The proposed methods were applied to microscopic image fusion, including a PCB and a worn external turning tool, and the results are consistent with those of the ideal experiments. The proposed methods cost more time than the pyramid and DWT algorithms, while changing the fusion rule does not add to the computation time. Future work will focus on finding better fusion rules and on determining whether different fusion rules suit different situations.

Competing Interests

The authors declare that they have no competing interests.

Acknowledgments

This research work was supported by the National Key Technology Support Program of China (Grant no. 2015BAF10B01) and Science and Technology Commission of Shanghai Municipality (Grants nos. 15111104002 and 15111106302).