Abstract

This paper proposes an effective multifocus color image fusion algorithm based on the nonsubsampled shearlet transform (NSST) and pulse coupled neural networks (PCNN); the algorithm can be used in different color spaces. In this paper, we take the HSV color space as an example: the H component is clustered by an adaptive simplified PCNN (S-PCNN) and then fused according to the oscillation frequency graph (OFG) of the S-PCNN; at the same time, the S and V components are decomposed by NSST, and different fusion rules are utilized to fuse the resulting subbands. Finally, the inverse HSV transform is performed to get the fused RGB color image. The experimental results indicate that the proposed color image fusion algorithm is more efficient than other common color image fusion algorithms.

1. Introduction

Video technology is one of the important technologies for coastal monitoring, and image fusion is the basis of video technology. Color images contain both color information and brightness information, so color images are more suitable for coastal monitoring than gray images [1]. Besides, human vision can identify color information more readily than gray levels [2]. The whole procedure of image fusion is to extract the significant and representative information from source images of the same scene, which may come from different types of image sensors or from the same sensor acting in different modes, and then to fuse it into a final composite image that describes the scene better than any of the individual source images. Thus, the study of suitable fusion technology for multisensor images is necessary and valuable [3].

A color image is the combination of different brightness and colors. Because a color image is composed of several components and the fused image is the fusion of each color space component, there are some common algorithms, such as averaging; intensity, hue, and saturation (HIS); and principal component analysis (PCA) [4, 5], which are easy to implement but do not perform well. Recently, image fusion methods based on multiresolution analysis have been widely studied: the first step is the image transform, then the coefficients of the transformed images are recombined, and finally the fused image is obtained by the inverse transform. According to the different ways of decomposition, these algorithms can be divided into pyramid transform, wavelet transform [6], curvelet [7], and contourlet [8]. In 2005, Labate et al. proposed a new multidimensional representation algorithm called shearlet [9]. One advantage of this algorithm is that it can be constructed using generalized multiresolution analysis and efficiently implemented using a classical cascade algorithm, so shearlet has good performance in both the time domain and the frequency domain [10]. In order to combine the superiorities and overcome the defects of the nonsubsampled contourlet transform (NSCT) and the shearlet transform (ST), [11] proposed the theory of the nonsubsampled shearlet transform (NSST), combining the nonsubsampled Laplacian pyramid transform with several different shearing filters. In comparison with current multiresolution geometric analysis (MGA) tools, NSST absorbs some recent developments in the MGA field and shows satisfactory fusion performance, such as better sparse representation ability and much lower computational cost. Besides, NSST also has the shift-invariance property that ST lacks. Therefore, further research on image fusion in the NSST domain is promising and competitive [12]. In recent years, image fusion methods based on PCNN have been receiving more and more attention from experts and scholars because of PCNN's biological background. Compared with other traditional artificial neural networks, PCNN has incomparable advantages, so it has been widely used in image processing fields and shows extremely superior performance [12–15].

In this paper, a new multifocus color image fusion algorithm is proposed based on NSST and PCNN, absorbing some advantages of both. The algorithm first converts the RGB color image to an HSV color image; then the H component is input into an adaptive simplified PCNN (S-PCNN) model to get the oscillation frequency graph (OFG) of the S-PCNN, and a new fused H component is obtained by comparing the OFGs. The S and V components are decomposed into low frequency subbands and high frequency subbands by NSST, and these subbands are fused by different methods to get the new fused S and V components. At last, the inverse HSV transform is performed to obtain a new fused RGB color image. The experimental results indicate that the proposed algorithm is more effective at preserving the color information of the source images than other common algorithms, and the fused image contains more edges, texture, and detail.

This paper is arranged as follows. Section 2 introduces related theories of NSST and PCNN model. Section 3 explains the proposed algorithm, including framework and workflow. Section 4 presents the experimental results and analysis. Section 5 concludes this paper.

2. Related Theories

2.1. PCNN

The PCNN model has three fundamental parts: the receptive field, the modulation field, and the pulse generator [13, 14]. In the receptive field, which consists of the $F$ and $L$ channels and is described by (1), the neuron receives the neighboring neurons' coupling input $Y_{kl}$ and the external stimulus input $S_{ij}$. In the $F$ and $L$ channels of the neuron, the neuron links with its neighborhood neurons via the synaptic linking weights $M$ and $W$, respectively; the two channels accumulate input and decay exponentially at the same time; the decay exponents are $\alpha_F$ and $\alpha_L$, respectively, while the channel amplitudes are $V_F$ and $V_L$, respectively:

$$\begin{aligned} F_{ij}[n] &= e^{-\alpha_F} F_{ij}[n-1] + V_F \sum_{kl} M_{ijkl} Y_{kl}[n-1] + S_{ij}, \\ L_{ij}[n] &= e^{-\alpha_L} L_{ij}[n-1] + V_L \sum_{kl} W_{ijkl} Y_{kl}[n-1]. \end{aligned} \quad (1)$$

In the modulation field, described by (2), the linking input is formed by adding a bias to the linking channel and is then multiplied by the feeding input; the bias is unitary, $\beta$ is the linking strength, and the total internal activity $U_{ij}$ is the result of the modulation:

$$U_{ij}[n] = F_{ij}[n]\left(1 + \beta L_{ij}[n]\right). \quad (2)$$

The pulse generator consists of a threshold adjuster, a comparison organ, and a pulse generator. The adjustable threshold $\theta_{ij}$ decays exponentially with the threshold coefficient $\alpha_\theta$ and is raised by the amplitude $V_\theta$ whenever the neuron fires, as described by (3). When the internal activity is larger than the threshold, that is, the neuron satisfies the condition $U_{ij}[n] > \theta_{ij}[n]$, a pulse $Y_{ij}[n] = 1$ is produced by the neuron; we call this an ignition, which is described by (4):

$$\theta_{ij}[n] = e^{-\alpha_\theta} \theta_{ij}[n-1] + V_\theta Y_{ij}[n-1], \quad (3)$$

$$Y_{ij}[n] = \begin{cases} 1, & U_{ij}[n] > \theta_{ij}[n], \\ 0, & \text{otherwise}, \end{cases} \quad (4)$$

where the subscripts $i$ and $j$ represent the neuron location in the PCNN and $n$ denotes the current iteration (discrete time step), where $n$ varies from 1 to $N$ ($N$ is the total number of iterations). In particular, "a neuron ignition" means that a PCNN neuron generates a pulse. The total number of ignitions after $N$ iterations represents the image information carried by the corresponding pulse sequences.

When PCNN is used for image processing, each pixel is connected to a unique neuron. The number of neurons in the network is equal to the number of pixels of the input image; namely, there is a one-to-one correspondence between image pixels and network neurons, and the pixel value is taken as the external input stimulus of the neuron in the $F$ channel. A neuron outputs one of two states, namely, pulse (status 1) and nonpulse (status 0), so the output statuses of the neurons compose a binary image. More information about PCNN can be found in [12–15].
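To make the update equations concrete, the following is a minimal sketch of the PCNN iteration of (1)–(4) in Python/NumPy. The 3×3 synaptic kernel and all parameter values are illustrative assumptions, not settings from this paper.

```python
import numpy as np
from scipy.ndimage import convolve

def pcnn(stim, n_iter=30, alpha_f=0.1, alpha_l=0.3, alpha_t=0.2,
         v_f=0.5, v_l=0.2, v_t=20.0, beta=0.2):
    """Minimal PCNN per (1)-(4); stim is the image normalized to [0, 1]."""
    w = np.array([[0.5, 1.0, 0.5],   # illustrative synaptic weights, M = W
                  [1.0, 0.0, 1.0],
                  [0.5, 1.0, 0.5]])
    F = np.zeros_like(stim); L = np.zeros_like(stim)
    Y = np.zeros_like(stim); T = np.ones_like(stim)
    fire_count = np.zeros_like(stim)                       # per-pixel ignition count
    for _ in range(n_iter):
        coupling = convolve(Y, w, mode='constant')         # neighbors' pulses
        F = np.exp(-alpha_f) * F + v_f * coupling + stim   # feeding channel, (1)
        L = np.exp(-alpha_l) * L + v_l * coupling          # linking channel, (1)
        U = F * (1.0 + beta * L)                           # modulation, (2)
        T = np.exp(-alpha_t) * T + v_t * Y                 # threshold update, (3)
        Y = (U > T).astype(float)                          # pulse output, (4)
        fire_count += Y
    return fire_count
```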

2.2. S-PCNN

The simplified PCNN (S-PCNN) model [15] has the same composition as the original PCNN model, but the input of the $F$ channel is related only to the image gray value and has no external coupling or exponential decay characteristics; it has fewer parameters than the original PCNN model, and the input channel of the receptive field is simple and effective. In the S-PCNN model, the variables of a neuron satisfy the following:

$$\begin{aligned} F_{ij}[n] &= S_{ij}, \\ L_{ij}[n] &= e^{-\alpha_L} L_{ij}[n-1] + V_L \sum_{kl} W_{ijkl} Y_{kl}[n-1], \\ U_{ij}[n] &= F_{ij}[n]\left(1 + \beta L_{ij}[n]\right), \\ \theta_{ij}[n] &= e^{-\alpha_\theta} \theta_{ij}[n-1] + V_\theta Y_{ij}[n-1], \\ Y_{ij}[n] &= \begin{cases} 1, & U_{ij}[n] > \theta_{ij}[n], \\ 0, & \text{otherwise}. \end{cases} \end{aligned} \quad (5)$$

2.3. The OFG of PCNN

The capture characteristic of PCNN means that when a neuron fires, it causes neighboring neurons of similar brightness to be captured and fire as well; through this characteristic, information is coupled and transmitted automatically. In this paper, we use PCNN to extract image features; PCNN can extract information about the image's texture, edges, and regional distribution and performs well in image processing. In each iteration of PCNN, a binary image is obtained by recording whether each neuron fires or not. The binary images effectively express features of the image such as texture, edges, and regional distribution; a binary map and an OFG are shown in Figures 1(b) and 1(c). By accumulating the binary images of the neurons over all iterations, we get the oscillation frequency graph (OFG), defined in (6) and shown in Figures 1(d) and 1(e):

$$T_{ij} = \sum_{n=1}^{N} Y_{ij}[n], \quad (6)$$

where $N$ denotes the total number of iterations, $Y_{ij}[n]$ denotes the pulse output of the neuron at $(i,j)$, and $n$ is the current iteration.
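A hedged sketch of the S-PCNN iteration of (5) together with the OFG accumulation of (6); as with the PCNN sketch above, the kernel and parameter values are placeholder assumptions.

```python
import numpy as np
from scipy.ndimage import convolve

def spcnn_ofg(stim, beta, n_iter=30, alpha_l=0.3, alpha_t=0.2,
              v_l=0.2, v_t=20.0):
    """S-PCNN per (5): the feeding channel is just the stimulus.
    Returns the OFG of (6), i.e. the per-pixel firing count."""
    w = np.array([[0.5, 1.0, 0.5],            # illustrative linking weights W
                  [1.0, 0.0, 1.0],
                  [0.5, 1.0, 0.5]])
    L = np.zeros_like(stim); Y = np.zeros_like(stim)
    T = np.ones_like(stim);  ofg = np.zeros_like(stim)
    for _ in range(n_iter):
        L = np.exp(-alpha_l) * L + v_l * convolve(Y, w, mode='constant')
        U = stim * (1.0 + beta * L)           # F = S: no decay, no external coupling
        Y = (U > T).astype(float)             # pulse output
        T = np.exp(-alpha_t) * T + v_t * Y    # threshold update
        ofg += Y                              # OFG accumulation, (6)
    return ofg
```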

2.4. Nonsubsampled Shearlet Transform

Shearlet transform was proposed based on wavelet theory by Labate et al. [9, 11]. In dimension $n = 2$, the affine system with composite dilations is

$$\Psi_{AB}(\psi) = \left\{ \psi_{j,l,k}(x) = |\det A|^{j/2}\, \psi\!\left(B^{l}A^{j}x - k\right) : j, l \in \mathbb{Z},\ k \in \mathbb{Z}^{2} \right\}, \quad (7)$$

where $\Psi_{AB}(\psi)$ is a collection of basis functions and satisfies the tight frame condition $\sum_{j,l,k} |\langle f, \psi_{j,l,k} \rangle|^{2} = \|f\|^{2}$; $A$ represents the anisotropy matrix for multiscale partitions; $B$ is a shear matrix for directional analysis. $j$, $l$, and $k$ are the scale, direction, and shift parameters, respectively. $A$ and $B$ are both invertible $2 \times 2$ matrices and $|\det B| = 1$. For $a > 0$ and $s \in \mathbb{R}$, the matrices $A$ and $B$ are given by

$$A = \begin{pmatrix} a & 0 \\ 0 & \sqrt{a} \end{pmatrix}, \qquad B = \begin{pmatrix} 1 & s \\ 0 & 1 \end{pmatrix}. \quad (8)$$

Let $a = 4$ and $s = 1$; then $A$ and $B$ from (8) can be further specified as

$$A = \begin{pmatrix} 4 & 0 \\ 0 & 2 \end{pmatrix}, \qquad B = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}. \quad (9)$$

For $j \geq 0$ and $-2^{j} \leq l \leq 2^{j} - 1$, the mathematical expression of the basic function for the shearlet transform can be given according to [11] as

$$\hat{\psi}^{(0)}(\xi) = \hat{\psi}^{(0)}(\xi_1, \xi_2) = \hat{\psi}_1(\xi_1)\, \hat{\psi}_2\!\left(\frac{\xi_2}{\xi_1}\right), \quad (10)$$

where $\hat{\psi}$ is the Fourier transform of $\psi$; $\psi_1$ and $\psi_2$ are both wavelets, with $\operatorname{supp} \hat{\psi}_1 \subset [-1/2, -1/16] \cup [1/16, 1/2]$ and $\operatorname{supp} \hat{\psi}_2 \subset [-1, 1]$. It implies that $\hat{\psi}^{(0)}$ is compactly supported with $\operatorname{supp} \hat{\psi}^{(0)} \subset [-1/2, 1/2]^{2}$. In addition, we assume that

$$\sum_{j \geq 0} \left| \hat{\psi}_1\!\left(2^{-2j}\omega\right) \right|^{2} = 1 \quad \text{for } |\omega| \geq \frac{1}{8}, \quad (11)$$

and, for each $j \geq 0$, $\hat{\psi}_2$ satisfies

$$\sum_{l=-2^{j}}^{2^{j}-1} \left| \hat{\psi}_2\!\left(2^{j}\omega - l\right) \right|^{2} = 1 \quad \text{for } |\omega| \leq 1. \quad (12)$$

From the conditions on the support of $\hat{\psi}_1$ and $\hat{\psi}_2$, one can obtain that the function $\psi_{j,l,k}$ has the frequency support listed in

$$\operatorname{supp} \hat{\psi}^{(0)}_{j,l,k} \subset \left\{ (\xi_1, \xi_2) : \xi_1 \in \left[-2^{2j-1}, -2^{2j-4}\right] \cup \left[2^{2j-4}, 2^{2j-1}\right],\ \left| \frac{\xi_2}{\xi_1} + l\,2^{-j} \right| \leq 2^{-j} \right\}. \quad (13)$$

That is, each element $\hat{\psi}_{j,l,k}$ is supported on a pair of trapeziform zones, whose sizes all approximate $2^{2j} \times 2^{j}$. In this way the shearlets form a directional tiling of the frequency plane, with the size of the frequency support determined by the scale.

In the NSST algorithm, in order to remove the influence of upsampling and subsampling, nonsubsampled Laplacian pyramid filters are used as a substitute in the shearlet transform, so NSST has excellent performance in terms of shift-invariance, multiscale, and multidirectional properties. The discretization process of NSST has two phases: multiscale factorization and multidirectional factorization. Nonsubsampled Laplacian pyramid filters complete the multiscale factorization: the first phase uses $k$ classes of two-channel nonsubsampled filter banks to get one low frequency image and $k$ high frequency images. The multidirectional factorization in NSST is realized via the improved shearlet transform, whose filters are formed by avoiding subsampling so as to satisfy the shift-invariance property. The shearlet transform allows direction decomposition with $l$ stages in the high frequency images from the nonsubsampled Laplacian pyramid at each level and produces $2^{l}$ directional subimages with the same size as the source image [16, 17]. Figure 2 shows the two-level NSST decomposition of an image.
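No NSST implementation ships with the standard Python scientific stack, so the following is only a rough sketch of the nonsubsampled multiscale stage, with Gaussian lowpass filters standing in for the nonsubsampled Laplacian pyramid filters; the directional shearing stage is omitted. Every subband keeps the source image size, which is what gives the transform its shift-invariance.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def nonsubsampled_pyramid(img, levels=2):
    """Undecimated multiscale split: every subband keeps the image size.
    Gaussian smoothing stands in for the nonsubsampled pyramid filters."""
    highs, current = [], img.astype(float)
    for k in range(levels):
        low = gaussian_filter(current, sigma=2.0 ** (k + 1))
        highs.append(current - low)   # detail (high frequency) at level k
        current = low                 # pass the approximation down
    return current, highs             # one low band + `levels` high bands

# Perfect reconstruction holds by construction:
# img == current + sum(highs)  (up to floating-point precision)
```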

3. The Proposed Algorithm

In this section, the proposed multifocus color image fusion algorithm is presented in detail. The framework of the proposed algorithm is shown in Figure 3. In this algorithm, the RGB color images are transformed into HSV color space [18]; NSST is used to decompose the images, and PCNN is used to extract the features, which are fused using different rules. Besides, it is important to note that two kinds of PCNN model are used in the algorithm: adaptive S-PCNN is used to fuse the H component, and the original PCNN is used to fuse the high frequency coefficients of NSST.

3.1. RGB Color Image Transform to HSV

RGB color images contain almost all basic colors that can be perceived by human vision; however, the correlation among the R, G, and B components is very strong, as shown in Figures 4(b), 4(c), and 4(d). This makes RGB color images difficult to process directly, because the color of the image changes whenever a single component changes. An HSV image can be obtained from the RGB image by the HSV transform. The values of R, G, and B correspond to unique H, S, and V values, as the values of the H, S, and V components depend on the values of R, G, and B in RGB color space. This color system is closer than the RGB color system to human experience and perception of color, as shown in Figures 4(e), 4(f), and 4(g).
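As a concrete illustration (not the paper's code), the forward and inverse transforms can be performed with matplotlib's color utilities; any RGB/HSV routine with the same conventions would serve.

```python
import numpy as np
from matplotlib.colors import rgb_to_hsv, hsv_to_rgb

rgb = np.random.rand(4, 4, 3)   # stand-in for a source image in [0, 1]
hsv = rgb_to_hsv(rgb)           # channels: H, S, V, each in [0, 1]
h, s, v = hsv[..., 0], hsv[..., 1], hsv[..., 2]

# ... fuse H, S, and V separately, then invert the transform:
fused_rgb = hsv_to_rgb(np.stack([h, s, v], axis=-1))
```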

The distribution of the H component is concentrated, and its pixel values are generally small; see Figure 4(h). So its edges are obvious, and S-PCNN is sensitive to the edges and regional distribution of the image. In this algorithm, the H component is partitioned into blocks and input into adaptive S-PCNN to get the OFG of the H component; we fuse the H component according to the OFG. The distributions of the S and V components are dispersive (see Figures 4(i) and 4(j)) and contain lots of details of the image at different grayscales, so the S and V components are decomposed into multiscale and multidirectional subbands by NSST to capture different detailed information of the images; then, according to the characters of the subbands, the new S and V components are fused using different rules.

3.2. H Component Fusion Using Adaptive S-PCNN

$\beta$ is the linking strength of S-PCNN, and it is a key determinant of the ignition behavior of S-PCNN. Spatial frequency (SF) and average gradient (AG) are very important indicators of image definition, which represent the quality of the image. So $\beta$ should be adaptively adjusted by SF and AG to make S-PCNN work well. In the S-PCNN model, the H component is divided into several blocks, and then the SF and AG of each block are calculated and used as linking strengths of S-PCNN, respectively; this adaptively adjusts $\beta$ of S-PCNN; see (14) and (15). The block images are input into S-PCNN to get two kinds of OFG according to the different $\beta$ values, which can effectively express the quality of the block images. The two types of indexes change with the content of the image within a certain range. The index itself and its change range are suitable for adjusting the parameter $\beta$ of S-PCNN, which affects the OFG of S-PCNN, so that the two types of OFG reflect different details of the image. If only one type of OFG were used, pixels of different quality could receive the same OFG value, so appropriate pixels could not be selected. However, the combination of the two types of OFG reflects the image information from different angles; this reduces the possibility of different pixels having the same OFG and makes the algorithm more effective.

The linking strengths are obtained by scaling SF and AG with an adjusting factor:

$$\beta_1 = \lambda \cdot \mathrm{SF}, \quad (14)$$

$$\beta_2 = \lambda \cdot \mathrm{AG}, \quad (15)$$

where $\lambda$ is the adjusting factor. SF and AG of the image are defined in (16) to (19).

SF is composed of row frequency (RF) and column frequency (CF), where $M$ is the number of rows of the image, $N$ is the number of columns of the image, and $F(i,j)$ is the grey level of the image at pixel $(i,j)$:

$$\mathrm{RF} = \sqrt{\frac{1}{MN}\sum_{i=1}^{M}\sum_{j=2}^{N}\left[F(i,j) - F(i,j-1)\right]^{2}}, \quad (16)$$

$$\mathrm{CF} = \sqrt{\frac{1}{MN}\sum_{i=2}^{M}\sum_{j=1}^{N}\left[F(i,j) - F(i-1,j)\right]^{2}}, \quad (17)$$

$$\mathrm{SF} = \sqrt{\mathrm{RF}^{2} + \mathrm{CF}^{2}}, \quad (18)$$

$$\mathrm{AG} = \frac{1}{(M-1)(N-1)}\sum_{i=1}^{M-1}\sum_{j=1}^{N-1}\sqrt{\frac{\left[F(i+1,j) - F(i,j)\right]^{2} + \left[F(i,j+1) - F(i,j)\right]^{2}}{2}}. \quad (19)$$
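A direct NumPy transcription of (16) to (19) as reconstructed above:

```python
import numpy as np

def spatial_frequency(img):
    """SF per (16)-(18): combines row and column frequencies."""
    img = img.astype(float)
    rf = np.sqrt(np.mean((img[:, 1:] - img[:, :-1]) ** 2))   # row frequency
    cf = np.sqrt(np.mean((img[1:, :] - img[:-1, :]) ** 2))   # column frequency
    return np.sqrt(rf ** 2 + cf ** 2)

def average_gradient(img):
    """AG per (19): mean magnitude of the local intensity gradient."""
    img = img.astype(float)
    dx = img[:-1, 1:] - img[:-1, :-1]    # horizontal difference
    dy = img[1:, :-1] - img[:-1, :-1]    # vertical difference
    return np.mean(np.sqrt((dx ** 2 + dy ** 2) / 2.0))
```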

3.3. S and V Components Fusion Using NSST and PCNN

The S and V components contain lots of details of the image, and the gray values of the S and V components are dispersive. The S and V components are decomposed by NSST to make the details easy to extract, yielding one low frequency subband and several high frequency subbands over multiple scales and directions. The higher values of the low frequency coefficients are selected for the new fused low frequency subband. The high frequency subbands contain abundant detailed information of the S and V components; these subbands are input into the original PCNN model to get OFGs, which record the statistics of the ignition times of the pixels. By comparing the OFGs of the corresponding high frequency subbands of the two source images, we get the new fused high frequency subbands. At last, inverse NSST is performed on the new fused low frequency and high frequency subbands to get the fused S and V components.

3.4. Algorithm Steps

The proposed fusion algorithm processes are shown in Figure 3, and the steps are as follows.

Step 0. Given source images $A$ and $B$.

Step 1. The source images in RGB color space are converted to HSV color space to get the three components H, S, and V.

Step 2. PCNN is utilized to deal with the H component (a sketch follows this list).

(a) The H components of images $A$ and $B$ are divided into subblocks, and then the SF and AG of the subblocks are calculated using (16) to (19); the linking strengths $\beta_1$ and $\beta_2$ are obtained using (14) and (15) and act as the $\beta$ value of S-PCNN, respectively.

(b) The subblocks are input into the S-PCNN model twice with the different $\beta$ values to get two OFGs of the H components.

(c) Get the average value of the two OFGs using (20), and then the fused H component can be decided by (21):

$$\overline{T}(i,j) = \frac{1}{2}\left[T_{\beta_1}(i,j) + T_{\beta_2}(i,j)\right], \quad (20)$$

$$H_F(i,j) = \begin{cases} H_A(i,j), & \overline{T}_A(i,j) \geq \overline{T}_B(i,j), \\ H_B(i,j), & \text{otherwise}, \end{cases} \quad (21)$$

where $\overline{T}_A$ and $\overline{T}_B$ are the average OFGs of images $A$ and $B$.
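A compact sketch of Step 2, reusing the spcnn_ofg, spatial_frequency, and average_gradient helpers sketched earlier; for brevity it applies the selection per pixel rather than per block, and lambda_ stands for the assumed adjusting factor of (14) and (15).

```python
import numpy as np

def fuse_h(h_a, h_b, lambda_=1.0):
    """Fuse two H components by comparing average S-PCNN OFGs, per (20)-(21)."""
    def avg_ofg(h):
        b1 = lambda_ * spatial_frequency(h)    # beta from SF, (14)
        b2 = lambda_ * average_gradient(h)     # beta from AG, (15)
        return 0.5 * (spcnn_ofg(h, b1) + spcnn_ofg(h, b2))   # average OFG, (20)
    ofg_a, ofg_b = avg_ofg(h_a), avg_ofg(h_b)
    return np.where(ofg_a >= ofg_b, h_a, h_b)  # pixel-wise selection, (21)
```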

Step 3. Perform decomposition of the source S and V components of images $A$ and $B$ using NSST to obtain the low-pass subband coefficients and the high-pass directional subband coefficients; different rules are utilized to deal with the S and V components (see the sketch after this list).

(a) The fused low-pass subband coefficients can be decided by

$$C_F^{0}(i,j) = \max\left(C_A^{0}(i,j), C_B^{0}(i,j)\right), \quad (22)$$

where $C_A^{0}$ and $C_B^{0}$ are the low-pass subband coefficients of images $A$ and $B$.

(b) The high-pass directional subband coefficients are input into the PCNN model to get the corresponding OFGs ($T_A^{l,d}$, $T_B^{l,d}$) according to the statistics of the ignition times of the pixels; the fusion rule for the S and V components can be decided by

$$C_F^{l,d}(i,j) = \begin{cases} C_A^{l,d}(i,j), & T_A^{l,d}(i,j) \geq T_B^{l,d}(i,j), \\ C_B^{l,d}(i,j), & \text{otherwise}. \end{cases} \quad (23)$$
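The two fusion rules of Step 3 reduce to element-wise selections once the NSST coefficients and PCNN OFGs are available; a sketch under the reconstructions of (22) and (23) above:

```python
import numpy as np

def fuse_lowpass(low_a, low_b):
    """(22): keep the larger low-frequency coefficient."""
    return np.maximum(low_a, low_b)

def fuse_highpass(high_a, high_b, ofg_a, ofg_b):
    """(23): pick the coefficient whose PCNN OFG fired more often."""
    return np.where(ofg_a >= ofg_b, high_a, high_b)
```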

Step 4. Reconstruct the fused S and V components by inverse NSST to obtain the fused HSV image.

Step 5. Obtain the fused RGB color image by inverse HSV.

4. Experimental Results and Analysis

To verify the validity of the algorithm presented in this paper, we test several groups of color images with different focus positions. The first group of color images is the cups, shown in Figure 5. Image $A$ focuses on the left, image $B$ focuses on the right, and there are many words as details.

4.1. Evaluation Index System

In order to verify the effectiveness of this method, we consider the quantitative assessment of the fused images using several common fusion performance metrics defined in this section. The final fused color images are composed of three components, namely, R, G, and B. Each component can be regarded as a grayscale image, and the quality of the fused color image strongly depends on the grayscale image quality. In this paper, we take the average of the three components' evaluation indexes as the final color image evaluation index. Tables 1 and 2 show the evaluation of the fused image quality with spatial frequency (SF), average gradient (AG), entropy (EN), mean value (MV), standard deviation (SD), mutual information (MI), and the edge preservation index $Q^{AB/F}$, which indicates how much edge information is preserved in the fused image [19–23].

The spatial frequency (SF) is defined by (16) to (18), and the average gradient (AG) is defined by (19).

4.1.1. Entropy

The entropy (EN) of an image is defined by

$$\mathrm{EN} = -\sum_{i=0}^{255} p_i \log_2 p_i, \quad (24)$$

where $p_i$ is the probability of gray level $i$ in the image and $i$ ranges over the gray levels from 0 to 255.

4.1.2. Mean Value

The mean value (MV) of the image is defined by

$$\mathrm{MV} = \frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N} F(i,j), \quad (25)$$

where $F(i,j)$ is the pixel value of the fused image at the position $(i,j)$. MV represents the average brightness of the whole image.

4.1.3. Standard Deviation

The standard deviation (SD) of an image is defined by

$$\mathrm{SD} = \sqrt{\frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}\left[F(i,j) - \mathrm{MV}\right]^{2}}, \quad (26)$$

where $F(i,j)$ is the pixel value of the fused image at the position $(i,j)$ and MV is the mean value of the image. The larger the SD is, the better the result is.

4.1.4. Mutual Information

The mutual information between the source images $A$ and $B$ and the fused image $F$ is defined by

$$\mathrm{MI} = \sum_{a,f} h_{AF}(a,f)\log_2\frac{h_{AF}(a,f)}{h_A(a)\,h_F(f)} + \sum_{b,f} h_{BF}(b,f)\log_2\frac{h_{BF}(b,f)}{h_B(b)\,h_F(f)}, \quad (27)$$

where $h_{AF}$ is the normalized joint grey level histogram of images $A$ and $F$, $h_{BF}$ is the normalized joint grey level histogram of images $B$ and $F$, $h_A$, $h_B$, and $h_F$ are the normalized grey level histograms of $A$, $B$, and $F$, and $a$, $b$, and $f$ represent the pixel values of images $A$, $B$, and $F$, respectively.
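For reference, a compact NumPy implementation of EN, MV, SD, and one MI term of (24) to (27), assuming 8-bit grayscale components:

```python
import numpy as np

def entropy(img):
    """EN, (24): Shannon entropy of the gray-level histogram (uint8 input)."""
    hist = np.bincount(img.ravel(), minlength=256) / img.size
    p = hist[hist > 0]
    return -np.sum(p * np.log2(p))

def mean_value(img):
    """MV, (25): average brightness."""
    return img.mean()

def standard_deviation(img):
    """SD, (26): spread of pixel values around the mean."""
    return img.std()

def mutual_information(src, fused):
    """One MI term of (27); the full metric is mi(A, F) + mi(B, F)."""
    joint, _, _ = np.histogram2d(src.ravel(), fused.ravel(),
                                 bins=256, range=[[0, 256], [0, 256]])
    joint /= joint.sum()                       # normalized joint histogram
    px = joint.sum(axis=1, keepdims=True)      # marginal of the source image
    py = joint.sum(axis=0, keepdims=True)      # marginal of the fused image
    nz = joint > 0
    return np.sum(joint[nz] * np.log2(joint[nz] / (px @ py)[nz]))
```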

4.1.5. Edge Based on Similarity Measure

The metric $Q^{AB/F}$ evaluates the sum of edge information preservation values and is defined by

$$Q^{AB/F} = \frac{\sum_{i=1}^{M}\sum_{j=1}^{N}\left[Q^{AF}(i,j)\,w^{A}(i,j) + Q^{BF}(i,j)\,w^{B}(i,j)\right]}{\sum_{i=1}^{M}\sum_{j=1}^{N}\left[w^{A}(i,j) + w^{B}(i,j)\right]}, \quad (28)$$

where $Q^{AF}(i,j) = Q_g^{AF}(i,j)\,Q_\alpha^{AF}(i,j)$; $Q_g^{AF}$ and $Q_\alpha^{AF}$ are the edge strength and orientation preservation values, respectively; $Q^{BF}$ is similar to $Q^{AF}$; and $w^{A}$ and $w^{B}$ are weights measuring the importance of $Q^{AF}$ and $Q^{BF}$, respectively. The dynamic range of $Q^{AB/F}$ is $[0, 1]$, and it should be as close to 1 as possible, since $Q^{AB/F} = 1$ corresponds to no loss of edge information. In addition, $(i,j)$ represents the pixel location, and $M$ and $N$ are the sizes of the images.

4.2. Experiment One

For comparison, the proposed fusion scheme is evaluated against several common fusion algorithms: taking the average of the source images pixel by pixel (average), principal component analysis (PCA), pulse coupled neural network (PCNN), PCNN with Laplacian pyramid transform (PCNN + LP), and PCNN with discrete wavelet transform (PCNN + WT). The fusion images obtained using the different methods are shown in Figure 6. According to the comparison experiments, the fusion image of this paper is better than the others; see Figure 6. The algorithm of this paper does well in extracting the characteristics of the source images, and the fused image is closer to the natural color and contains more edges, texture, and detail, so it is the closest to the source images. We can conclude that the method in this paper is an effective method.

From Table 1, the fusion image of this paper contains much more information. The SF, AG, MV, and $Q^{AB/F}$ of this paper's method are larger than those of the other methods, EN and MI are slightly better than the other methods, and only SD is less than the other methods.

4.3. Experiment Two

Other color spaces such as NTSC, YUV, YCbCr, HIS, and LAB all have histogram distributions similar to those of the H, S, and V components in HSV color space, so the proposed algorithm can be applied to all of them. In order to determine the color space in which the method works best, we carry out the following experiments. The fusion images in the different color spaces are shown in Figure 7; all achieve the goal of image fusion, but the fusion effects in the HSV, HIS, and LAB color spaces are better than the others.

Table 2 indicates that the fusion image of this paper in LAB color space contains more information, and its SF, AG, MI, and $Q^{AB/F}$ are better than the others. Overall, the evaluation indexes in HSV, HIS, and LAB are better than those in the other color spaces. In practical application, we suggest that the color image fusion of the proposed algorithm should focus on the HSV, HIS, and LAB color spaces.

4.4. Experiment Three

More experimental results for coastal images in HSV color space are shown in Figure 8; the source images have different focus positions, and there are a lot of textures in the source images.

It can be seen in Figure 8 that the edge of the fusion image is clear, and it retains most of the textures in the source images; besides, the details are also well preserved. This method can extract the main features from the source images; it shows that the method in this paper also achieved effective results in these groups of color images.

The same conclusion as that drawn from Figure 6 can be reached. Overall, the method presented in this paper is clearly better than the traditional methods. Compared with the other methods, this method shows better performance and visual effect in terms of definition and detail.

5. Conclusions

We propose an effective multifocus color image fusion algorithm based on NSST and PCNN. The capture characteristic of PCNN neurons causes surrounding neurons with similar brightness to be captured into ignition; through this characteristic, information is coupled and transmitted automatically. Nonsubsampled Laplacian pyramid filters are used in NSST to remove the influence of upsampling and subsampling, so NSST has excellent performance in terms of shift-invariance, multiscale, and multidirectional properties. In the proposed algorithm, the RGB color image is converted into an HSV color image. The H component is fused by adaptive S-PCNN; the S and V components are decomposed into different frequency subbands according to different scales and directions by NSST and are fused by different rules. The experimental results show that the proposed color image fusion algorithm can fuse color images with different focus positions, and the fused image contains more information about color, texture, and detail. Compared with the traditional algorithms, this method embodies better fusion performance in many aspects. The paper also discusses the effect of the proposed algorithm in other color spaces, and the experiments show that the algorithm achieves better effects in the HSV, HIS, and LAB color spaces; we recommend these three color spaces for practical application.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The authors thank the editors and the anonymous reviewers for their careful works and valuable suggestions for this paper. The authors wish to express their gratitude to Zijun Wang, Shanshan Du, Jingxue Xu, and Shujuan Zhang for providing coastal images. The authors’ work is supported by the National Natural Science Foundation of China (no. 61365001 and no. 61463052), the Natural Science Foundation of Yunnan Province (no. 2012FD003), and the Science and Technology Plan of Yunnan Province (no. 2014AB016).