Abstract

In this study, we introduce a novel preprocessing approach for multifocus image fusion. Multifocus image fusion generates a highly informative image by merging two source images with different areas or objects in focus. Here, preprocessing means sharpening performed on the images before the fusion techniques are applied. Along with this novel concept, a new sharpening technique, Laplacian filter + discrete Fourier transform (LF + DFT), is also proposed. The LF is used to recognize the meaningful discontinuities in an image, and the DFT captures the rapid changes in the image, which appear as sudden changes from low to high frequencies. The aim of image sharpening is to highlight the key features, identify the minor details, and sharpen the edges, tasks for which previous methods are not very effective. To validate the effectiveness of the proposed method, fusion is performed with two advanced techniques, stationary wavelet transform (SWT) and discrete wavelet transform (DWT), on both grayscale and color images. The experiments are performed on nonmedical and medical (breast CT and MRI images) datasets. The experimental results demonstrate that the proposed method outperforms the compared methods on all evaluated qualitative and quantitative metrics. Quantitative assessment is performed with eight well-known metrics, each of which captures its own feature, from which it can readily be concluded that the proposed method is superior. The experimental results of the proposed technique SWT (LF + DFT) on the clock dataset are summarized by the evaluation metrics RMSE (5.6761), PFE (3.4378), MAE (0.4010), entropy (9.0121), SNR (26.8609), PSNR (40.1349), CC (0.9978), and ERGAS (2.2589).

1. Introduction

In the field of image fusion, the subfield of multifocus image fusion is one of the most significant and valuable approaches to handle the problem of defocusing, where some parts of the image are out of focus and blurred due to the limited depth of focus of the optical lens in traditional cameras, large-aperture cameras, and microscopes. In multifocus image fusion, various images of the same scene but with different focus settings are merged into a single image with more information, in which all parts of the image are entirely in focus. A practical multifocus image fusion technique needs to satisfy the requirement that all the information of the focused regions in the source images is preserved in the resultant image [1]. As a result, the resulting image is well informative and complete. Multifocus image fusion is applicable in a wide range of applications such as environmental monitoring, image analysis [2], military technology, medical imaging [3], remote sensing, hyperspectral image analysis [4], computer vision, object recognition [5], and image deblurring [6].

A large number of multifocus image fusion techniques have been introduced over the past couple of decades; some of them are very popular and achieve high accuracy, such as stationary wavelet transform (SWT) [7], discrete wavelet transform (DWT), dual-tree complex wavelet transform (DT-CWT), and discrete cosine transform (DCT) [2]. Most multifocus image fusion techniques fall into four major classes [1, 8]. The first category comprises multiscale decomposition or frequency-domain techniques such as wavelet transformation [8, 9], complex wavelet transformation [1, 10], the nonsubsampled contourlet transform [11], DWT [2], and SWT [12]. The second category comprises sparse representation techniques, such as the adaptive SR model proposed in [13] for simultaneous image fusion and denoising and the multitask sparse representation technique [14]. The third category of techniques is based on computational photography, such as light-field rendering [15]. This kind of technique models the physical formation of multifocus images and reconstructs the all-in-focus image. The last category of techniques operates in the spatial domain, which can make full use of the spatial context and provide spatial consistency; spatial-domain techniques include averaging [2, 16], minimum [2, 17], intensity hue saturation (IHS) [2, 18], principal component analysis (PCA) [2, 19], and Gram–Schmidt [20] techniques.

In this paper, a new concept is proposed in the image fusion environment for multifocus image fusion. The key contributions of this work are summarized as follows:
(i) The new concept is that image enhancement or image sharpening techniques are used before image fusion; in other words, a preprocessing step is performed before applying the image fusion techniques.
(ii) The preprocessing step is beneficial before image fusion because the sharpening methods help recognize the meaningful discontinuities in an image, i.e., edge information or edge detection.
(iii) All the standard image fusion techniques fuse the images directly to generate the resultant image. In this work, the source images were first enhanced using the proposed hybrid enhancement method, LF + DFT (Laplacian filter + discrete Fourier transform), and other popular enhancement methods (Laplacian filter (LF) and unsharp masking (UM)).
(iv) Second, the enhanced images were fused by popular fusion methods such as DWT and SWT, generating more informative and meaningful resultant images, as demonstrated in Figure 1. The proposed method outperforms the state-of-the-art methods.

The rest of the paper is organized as follows. Section 2 briefly describes the related work of multifocus image fusion. Section 3 describes the proposed methodology, such as Laplacian filter + discrete Fourier transform with DWT and SWT. Section 4 shortly describes the performance measures. Section 5 gives the experimental results and discussion, and the paper is concluded in Section 6.

2. Literature Study

Multifocus image fusion is one of the most significant areas of image processing, and many advanced techniques have been proposed over the past couple of decades. Several works have been carried out in the spatial domain. Principal component analysis (PCA) is the most frequently used method and is specially designed to generate visible results with sharp edges and highly preserved spatial characteristics [21]. The intensity hue saturation (IHS) technique effectively transforms the image from the red, green, and blue (RGB) domain into spatial (I) and spectral (H, S) information [22]. PCA and IHS have one significant advantage: both can use an arbitrary number of channels [23]. The Brovey transform (BT), introduced by the American scientist Bob Brovey, is a simple technique for merging information captured from different sources. The Brovey transform is also called the color normalization transform (CNT) because it involves a red, green, blue (RGB) color transform approach [24]. Average and maximum/minimum selection are also spatial-domain methods [25]. Many spatial-domain methods are complicated and time-consuming, and these techniques produce poor results because they usually introduce spectral distortions into the fused images; the produced image has low contrast and contains comparatively less information.

Image fusion is also based on frequency-domain techniques such as the discrete cosine transform (DCT). The frequency information is very effective in obtaining the details and outlines of an image, and the DCT is a proper working mechanism for frequencies. It provides a fast and noncomplex solution because it uses only cosine components for the transformation. The IDCT reconstructs the original pixel values from the frequencies acquired by the DCT [26]. The discrete cosine harmonic wavelet transform (DC-HWT) is an advanced version of the DCT. In DC-HWT, the signal is decomposed by grouping the DCT coefficients similarly to DFT coefficients, except for the conjugate operations in laying the coefficients symmetrically (as in the DCT).

Further, symmetric placement is also not significant due to the definition of the DCT [27]. The inverse DCT (IDCT) of these groups results in discrete cosine harmonic wavelet coefficients (DC-HWCs). The DCT of these processed sub-bands (DC-HWCs) yields sub-band DCT coefficients, which are repositioned in their corresponding positions to retrieve the overall DCT spectrum at the original sampling rate. Details of DC-HWT are provided in reference [28]. The dual-tree complex wavelet transform (DT-CWT) is based on a pair of parallel trees, the first of which processes the odd samples and the second the even samples generated at the first level. The parallel trees provide the signal delays necessary for each level and therefore eliminate aliasing effects and attain shift invariance [29]. The discrete wavelet transform (DWT) is a mathematical tool introduced in the 1980s, and it is an instrumental technique for image fusion in the wavelet transformation process [1], but it has the following drawbacks: it retains only vertical and horizontal features, it lacks shift invariance, it suffers from ringing artifacts that reduce the quality of the resultant fused image, it lacks directionality, and it is not suitable for edge regions because edges are missed during the process. The DWT is not a time-invariant transform, which means that "with periodic signal extension, the DWT of a translated version of a signal X is not, in general, the translated version of the DWT of X."

The stationary wavelet transform (SWT) is a wavelet transform developed to overcome the lack of translation invariance of the DWT. The SWT is a fully shift-invariant transform, which up-samples the filters by inserting zeros between the filter coefficients instead of performing the down-sampling step of the decimated approach [2]. It provides improved time-frequency localization, and its design is simple. Appropriate high-pass and low-pass filters are applied to the data at each level, producing two sequences at the next level. In the decimated approach, the filters are applied first to the rows and then to the columns [7, 30]. The SWT filter bank structure is given in Figure 2.

The images are broken down into horizontal and vertical approximations by employing column-wise and row-wise low-pass and high-pass filters [31]. The same filtering decomposes the elements row-wise and column-wise to acquire the vertical, horizontal, and diagonal approximations. The low-pass and high-pass filters preserve the low and high frequencies and provide detailed information at the respective frequencies.
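The row-wise and column-wise filtering described above can be sketched in a few lines. The following is a minimal, illustrative single-level undecimated (SWT-style) decomposition with Haar filters; the function names, the periodic boundary handling, and the averaging/differencing taps are our own simplifications for exposition, not the implementation used in the paper.

```python
# Single-level undecimated (SWT-style) Haar decomposition of a small
# grayscale image. All sub-bands keep the input size, since the SWT
# omits the down-sampling step of the decimated DWT.

def filt_rows(img, taps):
    """Apply a 2-tap filter along each row with periodic extension."""
    h, w = len(img), len(img[0])
    return [[sum(taps[k] * img[r][(c + k) % w] for k in range(len(taps)))
             for c in range(w)] for r in range(h)]

def filt_cols(img, taps):
    """Apply a 2-tap filter along each column with periodic extension."""
    h, w = len(img), len(img[0])
    return [[sum(taps[k] * img[(r + k) % h][c] for k in range(len(taps)))
             for c in range(w)] for r in range(h)]

LO = [0.5, 0.5]   # Haar low-pass (averaging)
HI = [0.5, -0.5]  # Haar high-pass (differencing)

def swt_level(img):
    """Return approximation, horizontal, vertical, and diagonal sub-bands."""
    lo_r, hi_r = filt_rows(img, LO), filt_rows(img, HI)
    return (filt_cols(lo_r, LO),  # LL: approximation
            filt_cols(lo_r, HI),  # LH: horizontal detail
            filt_cols(hi_r, LO),  # HL: vertical detail
            filt_cols(hi_r, HH := HI))  # HH: diagonal detail

img = [[10, 10, 80, 80],
       [10, 10, 80, 80],
       [10, 10, 80, 80],
       [10, 10, 80, 80]]
ll, lh, hl, hh = swt_level(img)
```

For this image, which is constant along columns, the horizontal-detail band is zero everywhere, while the vertical-detail band responds only at the 10-to-80 step, which is exactly the behavior a fusion rule exploits when selecting the in-focus coefficients.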

3. Proposed Methodology

In this article, a novel idea is proposed that, for the first time, introduces preprocessing into multifocus image fusion to increase accuracy (the visibility of objects). The novel concept is a preprocessing step applied to the images before fusion. Fusion is then performed by two standard methods, DWT and SWT, to validate the proposed techniques. The complete process is demonstrated in Figure 3, and the proposed techniques are elaborated as follows.

3.1. Laplacian Filter (LF)

The Laplacian filter of an image highlights areas of rapid intensity change; hence, the LF is used for edge sharpening [27, 30, 32]. This operator is exceptionally good at identifying the critical information in an image. Any feature with a sharp discontinuity will be sharpened by the LF. The Laplacian operator, also known as a derivative operator, is used to identify an image's key features. The critical difference between the Laplacian filter and other filters such as Prewitt, Roberts, Kirsch, Robinson, and Sobel [27, 33] is that all those filters use first-order derivative masks, whereas the LF uses a second-order derivative mask. The LF sharpens the "Knee MRI medical image," which demonstrates the difference between the source and the LF-sharpened image. The Laplacian equation is as follows:
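As a concrete illustration of the second-order derivative mask, the sketch below applies the standard 3 × 3 Laplacian kernel and then subtracts the response to sharpen an edge. This is a minimal pure-Python sketch under our own assumptions (border pixels left unchanged, scaling factor k = 1); it is not the paper's MATLAB implementation.

```python
# Laplacian edge detection and sharpening on a tiny grayscale image.
# The kernel has a negative centre, so subtracting the response from
# the image boosts intensity steps (classic Laplacian sharpening).

KERNEL = [[0,  1, 0],
          [1, -4, 1],
          [0,  1, 0]]

def laplacian(img):
    """Second-derivative response; borders are skipped for brevity."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            out[r][c] = sum(KERNEL[i][j] * img[r + i - 1][c + j - 1]
                            for i in range(3) for j in range(3))
    return out

def sharpen(img, k=1.0):
    """Subtract the negative-centre Laplacian to emphasise edges."""
    lap = laplacian(img)
    return [[img[r][c] - k * lap[r][c] for c in range(len(img[0]))]
            for r in range(len(img))]

img = [[10, 10, 10, 90, 90],
       [10, 10, 10, 90, 90],
       [10, 10, 10, 90, 90]]
edges = laplacian(img)   # nonzero only at the 10-to-90 discontinuity
sharp = sharpen(img)     # the step is widened: pixels overshoot on each side
```

In flat regions the response is zero, so only the discontinuity is modified, which is why the LF preserves smooth areas while exaggerating edges.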

3.2. Unsharp Mask (UM)

An "unsharp mask" is a simple image sharpening operator, contrary to what its name might lead you to believe. The name is derived from the fact that it sharpens edges through a process that subtracts an unsharp (blurred) version of a picture from the reference picture, detecting the presence of edges and making the unsharp mask effectively a high-pass filter [19]. Sharpening can bring out the texture and detail of the image. This is probably the most common type of sharpening and can be applied to nearly any image. The unsharp mask cannot add artifacts or additional detail to the image, but it can greatly enhance the appearance by increasing small-scale acutance [33, 34] and making important details easier to identify. The unsharp mask method is usually used in photographic and printing industry applications for crispening edges. In image sharpening, the image size does not change; an unsharp mask improves the sharpness of an image by increasing the acutance only. In the unsharp masking technique, the sharper image a(x, y) is produced from the input image b(x, y) as follows, where c(x, y) is the correction signal calculated as the output of a high-pass filter, scaled by a positive factor that controls the level of contrast enhancement achieved at the output [32, 35]. Unsharp masking sharpens the "Knee MRI medical image," demonstrating the difference between the source, LF, and unsharp-masking-sharpened images.
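The procedure above can be sketched directly: blur the image, treat the difference between the original and the blur as the high-pass correction signal c(x, y), and add it back with a positive scale factor. The box blur, the clamped border handling, and the names below are our own illustrative choices, not the paper's exact formulation.

```python
# Unsharp masking: a(x, y) = b(x, y) + k * (b(x, y) - blur(b)(x, y)),
# where (b - blur(b)) acts as the high-pass correction signal.

def box_blur(img):
    """3x3 box blur with clamped (replicated) borders."""
    h, w = len(img), len(img[0])
    def px(r, c):
        return img[min(max(r, 0), h - 1)][min(max(c, 0), w - 1)]
    return [[sum(px(r + i, c + j)
                 for i in (-1, 0, 1) for j in (-1, 0, 1)) / 9.0
             for c in range(w)] for r in range(h)]

def unsharp_mask(img, k=1.0):
    """Add back the scaled high-pass residual to increase acutance."""
    blurred = box_blur(img)
    return [[img[r][c] + k * (img[r][c] - blurred[r][c])
             for c in range(len(img[0]))] for r in range(len(img))]

flat = [[50] * 4 for _ in range(4)]          # uniform region: unchanged
step = unsharp_mask([[10, 10, 90, 90]])      # edge: contrast is boosted
```

Note that a uniform region is returned unchanged, confirming that the operator raises acutance at edges only rather than adding detail, exactly as described above.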

3.3. LF + DFT Method

A hybrid sharpening technique (LF + DFT) is proposed in this study for multifocus image fusion. The hybrid approach merges the advantages of the LF and DFT methods. The LF is used to recognize the meaningful discontinuities in an image, i.e., edge information or edge detection. In other words, the LF is a derivative operator used to find regions of rapid change in the picture. A rapid change in the image appears as a sudden change in frequency, from low frequency to high frequency [36]. The DFT is a common approach used to compute the frequency information discretely. Frequency information is considered an important avenue for picture enhancement [33, 37]. Therefore, to obtain a beneficial form of sharpening, the frequency information of the Fourier transform is combined with the second-derivative masking of the Laplacian filter in the novel technique. The method involves conversion from the spatial domain to the frequency domain and back (see equations (4) and (5)); for this reason, it is called a cross-domain method.

For a two-dimensional square image of size N × N, the DFT equation is given as follows, where f(m, n) is the spatial-domain image and the exponential term is the basis function corresponding to each point F(x, y) in the Fourier space. The formulation can be construed as follows: the value of every point F(x, y) is acquired by multiplying the spatial image by the corresponding basis function and summing the results.

The basis functions are cosine and sine waves with increasing frequencies, i.e., F(0, 0) represents the DC component of the image, which corresponds to the average brightness, and F(N − 1, N − 1) represents the highest frequency.

Similarly, the frequency-domain image can be transformed back (inverse transform) to the spatial domain, as shown in Figure 4. The inverse Fourier transform is as follows:

In the proposed technique, for a two-dimensional square image of N × N resolution, the Laplacian equation (2) and the Fourier equation (4) are combined as given below:

The apparent sharpness of an image is determined by a combination of two factors, resolution and acutance. Resolution is straightforward and not subjective; it is the size of the image in terms of the number of pixels. With all other factors equal, the higher the resolution of the image (the more pixels it has), the sharper it can be. Acutance, a measure of the contrast at an edge, is subjective and comparatively complicated. There is no unit for acutance; you either judge that an edge has contrast or that it does not. Edges that have more contrast appear more defined to the human visual system. LF + DFT sharpens the "Knee MRI medical image," which demonstrates the difference between the source, LF, unsharp-masking, and LF + DFT-sharpened images in Figure 5.
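The cross-domain idea of LF + DFT can be sketched end to end: take the image to the frequency domain with a 2D DFT, emphasize the high frequencies where edges live, and return with the inverse DFT. The high-frequency weighting used below is a hypothetical stand-in for the paper's combined Laplacian/Fourier formulation, and a small even N is assumed; the sketch only illustrates the spatial-to-frequency-and-back pipeline.

```python
# Minimal 2D DFT / inverse DFT (direct O(N^4) sums, fine for tiny images)
# plus an illustrative high-frequency boost, mimicking the cross-domain
# flow of LF + DFT. The boost function is our own assumption.
import cmath

def dft2(f):
    n = len(f)
    return [[sum(f[m][k] * cmath.exp(-2j * cmath.pi * (x * m + y * k) / n)
                 for m in range(n) for k in range(n))
             for y in range(n)] for x in range(n)]

def idft2(F):
    n = len(F)
    return [[sum(F[x][y] * cmath.exp(2j * cmath.pi * (x * m + y * k) / n)
                 for x in range(n) for y in range(n)) / (n * n)
             for k in range(n)] for m in range(n)]

def boost_high_freq(F, gain=0.5):
    """Scale each coefficient by 1 + gain * (wrap-around distance from DC)."""
    n = len(F)
    def d(i):
        return min(i, n - i) / (n // 2)
    return [[F[x][y] * (1 + gain * max(d(x), d(y)))
             for y in range(n)] for x in range(n)]

img = [[10, 10, 80, 80],
       [10, 10, 80, 80],
       [10, 10, 80, 80],
       [10, 10, 80, 80]]
F = dft2(img)                         # F[0][0] is N^2 times the mean brightness
sharp = [[v.real for v in row]        # back to the spatial domain
         for row in idft2(boost_high_freq(F))]
```

Because the DC coefficient F(0, 0) is left untouched (its distance from DC is zero), the average brightness of the output matches the input, while the boosted high frequencies steepen the 10-to-80 step, which is the sharpening effect sought.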

4. Performance Metrics

The quantitative evaluation aims to identify the performance of the proposed methods and existing methods on various measures, and every measure has its properties. Table 1 briefly describes the well-known statistical metrics.

5. Experimentation

5.1. Datasets

In this paper, the experiments are performed on four image sets; two are grayscale image sets, "Clocks" and "Books," and the other two are color image sets, "Toys" and "Building and card." The grayscale image sets are provided by the authors, and the color image sets are acquired from the "Lytro multifocus datasets" [43]. These image sets are used as multifocus test images for the experimental evaluation of the novel techniques. The size of the grayscale image sets (test images) is 512 × 512 pixels, and the size of the color image sets is 520 × 520 pixels.

5.2. Experimental Results and Discussion

In this section, the experimentation is conducted on different multifocus image sets for the proposed hybrid methods. The proposed hybrid methods, DWT + LF, DWT + unsharp masking, DWT + (LF + DFT), SWT + LF, SWT + unsharp masking, and SWT + (LF + DFT), are compared with traditional methods such as the average method (a spatial-domain method), the minimum method, DWT (a frequency-domain method), and SWT. The algorithms are implemented, and the simulations are performed, using the MATLAB 2016b software. The resultant images are evaluated in two ways, i.e., quantitatively and qualitatively. For quantitative evaluation, eight well-known performance metrics, i.e., percentage fit error (PFE), entropy (E), correlation coefficient (CORR), peak signal to noise ratio (PSNR), relative dimensionless global error (ERGAS), mean absolute error (MAE), signal to noise ratio (SNR), and root mean square error (RMSE), are used to measure the performance of the resultant images of the existing and proposed methods. The quantitative results of the new approaches are improved for the "Clocks," "Books," "Toys," "Building and card," and "Breast Medical (CT and MRI images)" image sets, as shown in Tables 2–6. All the performance metrics show better results for the proposed approaches on all image sets, which demonstrates the capability of the new approaches in the fusion environment.

RMSE indicates the difference between the true image and the resultant image; the smallest values indicate excellent results. PFE computes the norm of the difference between the corresponding pixels of the true and resultant images relative to the norm of the true image; low values indicate superior results. MAE is the absolute error used to calculate and validate the difference between the resultant and reference images. Here, the MAE values are small for the proposed methods on both image sets, which is promising. A large entropy value expresses good results; for the "Books" image set, the DWT technique has a larger value, while for the "Clocks" image set, the proposed methods demonstrate impressive results. CORR is a quantitative measure that demonstrates the correlation between the true image and the resultant image; when the true and resultant images are similar, the value will be near one. PSNR is specifically used for the measurement of spatial quality in the image. SNR is the performance measure used to find the ratio between the information and the noise of the resultant image. ERGAS is used to calculate the quality of the resultant image in terms of the normalized average error of each channel of the processed image. The quantitative results of the proposed methods are good compared with the traditional methods. According to the results shown in Figures 6–10, the SWT + (LF + DFT) method is superior among all the proposed methods.
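Two of the metrics above, RMSE and PSNR, can be made concrete with a short sketch. The definitions below are the standard ones for 8-bit images (peak value 255); the variable names are ours, and the example images are illustrative rather than drawn from the paper's datasets.

```python
# RMSE and PSNR between a reference image and a fused image,
# both given as nested lists of equal size.
import math

def rmse(ref, fused):
    """Root mean square error: lower is better, 0 for identical images."""
    n = len(ref) * len(ref[0])
    se = sum((ref[r][c] - fused[r][c]) ** 2
             for r in range(len(ref)) for c in range(len(ref[0])))
    return math.sqrt(se / n)

def psnr(ref, fused, peak=255.0):
    """Peak signal to noise ratio in dB: higher is better,
    infinite when the images are identical."""
    e = rmse(ref, fused)
    return math.inf if e == 0 else 20 * math.log10(peak / e)

ref   = [[100, 100], [100, 100]]
fused = [[100, 110], [ 90, 100]]
err = rmse(ref, fused)   # sqrt(200 / 4) ≈ 7.07
db  = psnr(ref, fused)   # ≈ 31.1 dB
```

The same pair of functions generalizes directly to the other error-based metrics in Table 1 (MAE replaces the squared difference with an absolute one, and PFE normalizes the error norm by the reference norm).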

Qualitative analysis is a significant evaluation approach in multifocus image fusion. Previously, researchers performed fusion on plain multifocus images: all the fusion methods were employed directly on the multifocus images to improve the results. In this article, however, a new concept is introduced, a preprocessing step before fusion, proposed for the first time in the fusion environment. The preprocessing step sharpens the images. Three image sharpening techniques are used as the preprocessing step: Laplacian filter, unsharp masking, and LF + DFT. In Figures 11–15, (a) and (b) are the source images, (c) and (d) are the images sharpened by the Laplacian filter, (e) and (f) are the images sharpened by unsharp masking, and (g) and (h) are the images sharpened by LF + DFT for the "Clocks," "Books," "Toys," and "Building and Cards" image sets, respectively.

6. Conclusions

In this paper, we mainly address the problem of the out-of-focus, blurred parts of an image. To achieve this goal, we introduced a new concept of sharpening the edges, or enhancing the image, before fusing the multifocus source images. The preprocessing step (edge sharpening) is performed by the Laplacian filter, unsharp masking, and the newly proposed Laplacian filter + discrete Fourier transform (LF + DFT) sharpening method. The sharpening concept is proposed for the first time in a fusion environment, and the experimental results demonstrate the superiority of the new concept. After sharpening the images, fusion is performed by the stationary wavelet transform (SWT) and discrete wavelet transform (DWT) techniques. The experiments are conducted on color and grayscale datasets to validate the effectiveness of the proposed technique. Five datasets, "Clock," "Book," "Toy," "Building and Card," and "Breast Medical CT and MRI images," are used for experimentation. The proposed technique is evaluated visually and statistically; for statistical assessment, we used eight well-known metrics, namely percentage fit error, entropy, correlation coefficient, peak signal to noise ratio, relative dimensionless global error, mean absolute error, signal to noise ratio, and root mean square error, which indicate that the new method outperforms all the state-of-the-art methods. One major future challenge is that the proposed scheme is not time efficient compared with simple fusion methods because of the preprocessing step before image fusion.

Data Availability

The datasets used in this research are taken from UCI ML Learning Repository available at https://archive.ics.uci.edu/.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this article.