Abstract

In this work, we propose a novel pansharpening method based on a multidirection tree ridgelet dictionary. Pansharpened images serve a wide range of applications, such as object detection, image segmentation, and feature extraction. Remote sensing (RS) imagery contains abundant information on surface features. To represent different object information, we use three classes of dictionaries, which can reveal the latent structure of RS images. First, the RS image is divided into several blocks, and each block is classified into the smooth, irregular, or multidirection category. The different categories are sparsely represented in different dictionaries. Second, the smooth blocks are sparsely represented in a discrete cosine transform (DCT) dictionary, while the irregular and multidirection blocks are sparsely represented in a KSVD dictionary and a multidirection tree ridgelet (MDTR) dictionary, respectively. Finally, the fused image is obtained by reconstructing these blocks. Experiments are conducted on three datasets acquired by the QuickBird, GeoEye, and IKONOS satellites. The results show that the proposed method reduces spectral distortion and enhances spatial information, and its numerical scores outperform those of several related methods.

1. Introduction

Pansharpening refers to the fusion of a panchromatic (PAN) image and a multispectral (MS) or hyperspectral (HS) image. The PAN image has high spatial resolution but low spectral resolution, whereas the MS or HS image has high spectral resolution but low spatial resolution. The fused image, unachievable by a single sensor, has both high spatial and high spectral resolution and preserves more detailed information for downstream object detection, image segmentation, and feature extraction. Many fusion methods have been proposed. Generally, they fall into four main categories [1]:
(1) Component substitution-based (CS) methods [2–4]
(2) Multiscale analysis-based (MRA) methods [5, 6]
(3) Degradation model-based (DM) methods [7–17]
(4) Deep neural network-based (DNN) methods [18–21]

CS methods include principal component analysis (PCA) and intensity-hue-saturation (IHS). They are computationally efficient; however, because of the discrepancies between the spectral coverage of the MS and PAN images, spectral distortions can usually be found in the fused image. In the MRA category, the spatial information missing from the MS image is extracted from the PAN image via multiscale analysis tools; however, these methods can introduce spatial artifacts.

In the DM category, image fusion is treated as a degradation problem, and these methods require additional priors to regularize the solution space. Li and Yang [7] proposed a fusion method based on compressed sensing, which processed the MS and PAN images through a KSVD [22] dictionary. Zhu and Bamler proposed the SparseFI method [8], in which the atoms of the coupled dictionaries come from the PAN image. In Ref. [9], the PN-TSSC method, a two-step sparse coding method, was proposed following SparseFI. In Ref. [10], the fused image was reconstructed by using identical sparse representation coefficients on the coupled dictionaries. Wang et al. [11] proposed a hybrid dictionary to fuse RS imagery. However, the above methods ignore the internal structural diversity of RS images. Li et al. [12] proposed a pansharpening method with NSCT and HSAE, in which detailed information is hierarchically injected into the MS image. In Ref. [13], Zhang et al. put forward a fusion method in which multiscale convolutional sparse decomposition is used to extract more subtle features. Similarity is important information in image processing, so similarity-based pansharpening has also been proposed [14–16]. Li et al. [14] proposed a local geometrical similarity-based method to capture detailed information. In Ref. [15], the similarity is obtained by a local adaptive sparse representation metric. Zhang et al. [16] proposed a spatially weighted neighborhood embedding method that shares a similar manifold structure. Nonnegative matrix factorization (NMF) is another noteworthy approach in RS image fusion; nonnegativity also reduces the ill-posedness of the spectral and spatial degradation models. In Ref. [17], semi-NMF-based pansharpening was proposed to improve image quality. DM methods enhance image quality, but their computational complexity is high.

Recently, DNN methods have been attracting more attention. In Ref. [18], salient features are extracted through a two-branch DNN, and the image features extracted from the convolutional layers yield the fusion result. In Ref. [19], the Pan-GAN model was proposed, which does not rely on ground truth. Zhang et al. [20] proposed an SSE network-based pansharpening method in which AFFMs are used to merge image features according to information content. NLRNet was proposed for RS image fusion [21]; its authors introduced the ENLA mechanism and the ReZero technique to ease signal propagation, and the SpecAM is used to adjust the spectral information. However, DNN methods require long training times and large training sets.

An RS image contains various types of ground objects. Multiscale analysis tools have been used to capture image orientation information, and researchers have demonstrated their excellent performance in processing geometric information. Based on this, we propose a novel pansharpening method with a multidirection tree ridgelet dictionary to represent the diverse information of an RS image. First, the RS image is divided into several blocks, and each block is classified into the smooth, irregular, or multidirection category. Different categories are sparsely represented in different dictionaries. Second, we construct the discrete cosine transform (DCT), KSVD, and multidirection tree ridgelet (MDTR) dictionaries. Smooth and irregular blocks are sparsely represented in the DCT and KSVD dictionaries, respectively, and multidirection blocks are sparsely represented in the MDTR dictionary.

The contributions of this work are as follows: (1) a multidirection ridgelet dictionary is constructed by discretizing the parameters of the ridgelet function; (2) spatial details are captured by sparsely encoding patches in the DCT, KSVD, or MDTR dictionary. The proposed approach is evaluated on three datasets acquired by the QuickBird, GeoEye, and IKONOS satellites. The experimental results show that the proposed MDTR method outperforms its counterparts.

The remainder of this paper is structured as follows. The construction of the MDTR dictionary is described in Section 2. Section 3 presents the pansharpening method based on the MDTR dictionary. Experiments on different datasets are reported in Section 4. Finally, Section 5 concludes the paper.

2. Construction of the Multidirection Tree Ridgelet Dictionary

One of the major problems in sparse representation is the construction of the dictionary. In view of the inherent diversity of RS images, we construct three different dictionaries, namely, the DCT [12], KSVD [22], and MDTR dictionaries. The DCT dictionary represents smooth regions, the KSVD dictionary adaptively learns the information of irregular blocks, and the MDTR dictionary sparsely represents multidirection blocks.

The ridgelet is generated from an admissible one-dimensional wavelet $\psi(t)$, i.e., one satisfying $\int_{\mathbb{R}} |\hat{\psi}(\xi)|^{2}/|\xi|^{2}\,\mathrm{d}\xi < \infty$, where $\hat{\psi}$ denotes the Fourier transform of $\psi$.

The prototype atom of the ridgelet dictionary is
$$\psi_{\gamma}(x) = a^{-1/2}\,\psi\!\left(\frac{x_{1}\cos\theta + x_{2}\sin\theta - b}{a}\right),$$
where $\gamma = (a, b, \theta)$, $a^{-1/2}$ is a normalization factor, $a > 0$ is the scale parameter, $b$ is the location parameter, and $\theta$ is the direction parameter.

We can obtain the ridgelet dictionary via discretization of $a$, $b$, and $\theta$. Figure 1 shows the ridgelet function with different parameters: Figure 1(a) is the basic ridgelet function, and Figures 1(b)–1(e) are the results for different parameter settings.

The ridgelet dictionary is obtained by discretizing the three parameters ($a$, $b$, and $\theta$). Lin et al. [23] proposed a collaborative compressed sensing reconstruction method in which natural images are reconstructed using an overcomplete ridgelet dictionary. The overcomplete ridgelet dictionary thus shows its advantages in image processing, as it can maintain the structure and edge information of the image. However, a large-scale overcomplete ridgelet dictionary increases the complexity of the algorithm. In our work, we propose a novel fusion method with the multidirection tree ridgelet dictionary, which yields better fusion results than the global ridgelet dictionary at a lower time complexity. Figure 2 shows the ridgelet dictionaries: Figure 2(a) is the global ridgelet dictionary, and Figures 2(b)–2(f) are the ridgelet dictionaries for different directions.
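As an illustration of the discretization above, the following sketch builds a small ridgelet dictionary in Python. The Mexican-hat generating wavelet and the specific sampling grids for $a$, $b$, and $\theta$ are assumptions made for illustration; the paper does not specify them.

```python
import numpy as np

def mexican_hat(t):
    """Mexican-hat (Ricker) wavelet, used here as an example 1-D generating wavelet."""
    return (1.0 - t ** 2) * np.exp(-t ** 2 / 2.0)

def ridgelet_atom(n, a, b, theta):
    """Sample one ridgelet atom psi((x1*cos(theta) + x2*sin(theta) - b) / a) on an n x n grid."""
    xs = np.linspace(-1.0, 1.0, n)
    x1, x2 = np.meshgrid(xs, xs)
    t = (x1 * np.cos(theta) + x2 * np.sin(theta) - b) / a
    atom = (a ** -0.5) * mexican_hat(t)
    atom = atom.ravel()
    return atom / np.linalg.norm(atom)   # unit-norm column

def ridgelet_dictionary(n=8, scales=(0.25, 0.5), n_shifts=5, n_angles=18):
    """Build a dictionary by discretizing scale a, location b, and direction theta."""
    atoms = [ridgelet_atom(n, a, b, theta)
             for a in scales
             for b in np.linspace(-0.5, 0.5, n_shifts)
             for theta in np.linspace(0.0, np.pi, n_angles, endpoint=False)]
    return np.stack(atoms, axis=1)   # shape (n*n, number of atoms); columns are atoms

D = ridgelet_dictionary()
print(D.shape)   # (64, 180)
```

Grouping the `theta` loop into 18 direction bins, rather than pooling all atoms, is what distinguishes the multidirection tree organization from a single global ridgelet dictionary.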

3. Image Fusion Using the Multidirection Tree Ridgelet Dictionary

To represent the structural diversity, image blocks are classified into different categories: smooth, irregular, or multidirection. Figure 3 shows the classification results. Figure 3(a) shows a PAN image, and Figures 3(b)–3(h) show blocks of the PAN image: Figures 3(b)–3(c) show smooth and irregular blocks, respectively, and Figures 3(d)–3(h) show multidirection blocks.

3.1. Smooth Category

Set a threshold $T_{1}$ and calculate the variance $\sigma^{2}$ of each block. If $\sigma^{2} < T_{1}$, the block is classified into the smooth category. For smooth blocks, the DCT dictionary $D_{\mathrm{DCT}}$ is used for sparse representation.
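The smooth test can be sketched as follows; the threshold value `T1` is an illustrative assumption, since the paper does not report its setting.

```python
import numpy as np

T1 = 20.0  # illustrative variance threshold; the paper's value is not specified

def is_smooth(block, threshold=T1):
    """A block is classified as smooth when its pixel variance falls below the threshold."""
    return float(np.var(block)) < threshold

flat = np.full((8, 8), 128.0)              # constant block: variance 0 -> smooth
edge = np.zeros((8, 8))
edge[:, 4:] = 255.0                        # step edge: large variance -> not smooth
print(is_smooth(flat), is_smooth(edge))    # True False
```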

3.2. Irregular Category

For each pixel $i$ of a block, we calculate the gradients $g_{x}(i)$ and $g_{y}(i)$ in the horizontal and vertical directions. The gradient of pixel $i$ is then defined as $g_{i} = [g_{x}(i), g_{y}(i)]$.

The gradient matrix of the block is denoted as $G = [g_{1}^{T}, g_{2}^{T}, \ldots, g_{n}^{T}]^{T}$, where $n$ is the number of pixels. Then, the gradient matrix is decomposed by singular value decomposition:
$$G = U S V^{T},$$
where $S = \mathrm{diag}(s_{1}, s_{2})$ with singular values $s_{1} \ge s_{2} \ge 0$.

The dominance measure $R$ of the block can be calculated as
$$R = \frac{s_{1} - s_{2}}{s_{1} + s_{2}}.$$

Set a threshold $T_{2}$. If $R < T_{2}$, the block is classified into the irregular category; otherwise, it is considered a multidirection block. For irregular blocks, the KSVD dictionary $D_{K}$ is used for sparse representation.
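A minimal sketch of the gradient-SVD classification, assuming the standard dominance measure $R = (s_1 - s_2)/(s_1 + s_2)$; the function name and test blocks are illustrative.

```python
import numpy as np

def dominance_measure(block):
    """R = (s1 - s2) / (s1 + s2), from the SVD of the stacked per-pixel gradients."""
    gy, gx = np.gradient(block.astype(float))       # finite-difference gradients
    G = np.column_stack([gx.ravel(), gy.ravel()])   # n x 2 gradient matrix
    s = np.linalg.svd(G, compute_uv=False)          # singular values, s[0] >= s[1]
    return float((s[0] - s[1]) / (s[0] + s[1] + 1e-12))

# A block with one dominant orientation gives R near 1; an isotropic block gives a small R.
striped = np.tile(np.sin(np.arange(8)), (8, 1))             # vertical stripes
noisy = np.random.default_rng(0).standard_normal((8, 8))    # no dominant direction
print(dominance_measure(striped) > dominance_measure(noisy))   # True
```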

3.3. Multidirection Category

In the singular value decomposition $G = U S V^{T}$, the first column $v_{1}$ of the matrix $V$ is called the first principal direction, and the second column is the second principal direction. The angle of the block is determined as
$$\theta = \arctan\!\left(\frac{v_{1}(2)}{v_{1}(1)}\right).$$

The MDTR dictionary is represented as $D_{R} = \{D_{R_{1}}, D_{R_{2}}, \ldots, D_{R_{K}}\}$, where $K$ is the number of direction groups. The angle range is set as $[0^{\circ}, 180^{\circ})$; that is to say, there are $K = 18$ groups of ridgelet dictionaries, and the interval of each group is $10^{\circ}$. The angles of $D_{R_{1}}$ are from $0^{\circ}$ to $10^{\circ}$, those of $D_{R_{2}}$ from $10^{\circ}$ to $20^{\circ}$, those of $D_{R_{3}}$ from $20^{\circ}$ to $30^{\circ}$, and so forth. For multidirection blocks, the subdictionary $D_{R_{k}}$ whose angle range contains the block angle is used for sparse representation.
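The mapping from a block's first principal direction to one of the 18 subdictionaries can be sketched as follows (the function name is illustrative):

```python
import math

NUM_GROUPS = 18                   # direction groups over [0, 180) degrees
INTERVAL = 180.0 / NUM_GROUPS     # 10 degrees per group

def direction_group(v1x, v1y):
    """Map the first principal direction (v1x, v1y) to a subdictionary index in 1..18."""
    angle = math.degrees(math.atan2(v1y, v1x)) % 180.0   # fold the angle into [0, 180)
    return int(angle // INTERVAL) + 1

print(direction_group(1.0, 0.0))    # 1   (0 degrees)
print(direction_group(0.0, 1.0))    # 10  (90 degrees)
print(direction_group(-1.0, 1.0))   # 14  (135 degrees)
```

Folding the angle into $[0^{\circ}, 180^{\circ})$ reflects that a direction and its opposite select the same subdictionary.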

The fusion method is summarized in Algorithm 1.

input: the MS image and the PAN image
output: the fused MS image
Step 1: The first principal component of the resampled MS image and the PAN image are partitioned into blocks, yielding $\{M_{i}\}$ and $\{P_{i}\}$, $i = 1, \ldots, N$, where $N$ is the number of blocks.
Step 2: The blocks $\{P_{i}\}$ are classified by the measures of Section 3 into the smooth, irregular, or multidirection category. Correspondingly, each $M_{i}$ is assigned the same category as its $P_{i}$.
Step 3: The dictionaries $D_{\mathrm{DCT}}$ and $D_{K}$ are used for smooth and irregular blocks, respectively. According to the angle of the block, the subdictionary $D_{R_{k}}$ is selected for the multidirection category.
Step 4: Finally, each fused block is obtained by sparse coding, and the fused MS image is assembled from the fused blocks.
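The sparse coding in Step 4 can be illustrated with a minimal orthogonal matching pursuit (OMP); this is a generic sketch of sparse coding over a dictionary, not the paper's exact fusion formulation, and the toy dictionary here is random.

```python
import numpy as np

def omp(D, y, k):
    """Minimal orthogonal matching pursuit: approximate y with at most k atoms of D."""
    residual, support = y.copy(), []
    coef = np.zeros(0)
    for _ in range(k):
        idx = int(np.argmax(np.abs(D.T @ residual)))   # atom most correlated with residual
        if idx not in support:
            support.append(idx)
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)   # refit on the support
        residual = y - D[:, support] @ coef
    x = np.zeros(D.shape[1])
    x[support] = coef
    return x

# Toy check: a signal synthesized from two atoms is recovered by two OMP iterations.
rng = np.random.default_rng(1)
D = rng.standard_normal((64, 96))
D /= np.linalg.norm(D, axis=0)                 # unit-norm atoms
y = 3.0 * D[:, 5] + 2.0 * D[:, 40]
x = omp(D, y, k=2)
print(np.linalg.norm(D @ x - y) < 1e-6)        # True
```

In the fusion setting, `D` would be the category-matched dictionary ($D_{\mathrm{DCT}}$, $D_{K}$, or $D_{R_{k}}$) and `y` a vectorized image block.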

4. Experimental Results

Three different datasets are used to demonstrate the effectiveness of the proposed algorithm. The experimental results are compared with six related methods: the PCA method [2], GIHS method [3], CT method [5], SparseFI method [8], PN-TSSC method [9], and NMF method [17]. The parameters of each method are set according to the corresponding papers. The sizes of $D_{\mathrm{DCT}}$ and $D_{K}$ are 256 and 1024 atoms, respectively. There are 18 groups of ridgelet dictionaries, and the interval of each group is $10^{\circ}$. The objective quality evaluation indexes include CC [24], UIQI [25], RMSE, Q4 [26], SAM [24], and ERGAS [24]. The best results are highlighted in bold. Table 1 shows the quality evaluation indexes.
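For reference, two of the listed indexes, RMSE and SAM, can be sketched as follows; these are common textbook definitions and may differ in detail from the cited implementations.

```python
import numpy as np

def rmse(ref, fused):
    """Root-mean-square error between the reference and fused images."""
    diff = ref.astype(float) - fused.astype(float)
    return float(np.sqrt(np.mean(diff ** 2)))

def sam(ref, fused, eps=1e-12):
    """Mean spectral angle mapper in radians; spectral bands are on the last axis."""
    r = ref.reshape(-1, ref.shape[-1]).astype(float)
    f = fused.reshape(-1, fused.shape[-1]).astype(float)
    denom = np.maximum(np.linalg.norm(r, axis=1) * np.linalg.norm(f, axis=1), eps)
    cos = np.clip(np.sum(r * f, axis=1) / denom, -1.0, 1.0)
    return float(np.mean(np.arccos(cos)))

# Sanity check: identical images give zero error and zero spectral angle.
ref = np.ones((4, 4, 3))
print(rmse(ref, ref), sam(ref, ref))   # 0.0 0.0
```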

4.1. QuickBird Dataset

The QuickBird dataset was acquired over Xi’an, China. The resolution is 0.61 m for the PAN image and 2.44 m for the MS image. Figure 4 shows all the fused results. Figures 4(a)–4(c) are the MS, PAN, and reference MS images, respectively, and Figures 4(d)–4(j) are the results of the seven fusion methods. In Figures 4(d)–4(j), the first row is the fused image, and the second row is the difference image between the fused image and the reference image. In addition, we display a magnified area for each fused image.

We can see that the spatial information is improved in all the fused images. The fusion result of the PCA method is dark. From the difference images, the result of CT is better than that of GIHS. In the magnified area of the PN-TSSC result, the color information is seriously lost. In the SparseFI, NMF, and proposed methods, the spatial information is close to the ground truth; however, the spectral improvement is most evident for our method. Table 2 shows the numerical values of the fusion results: the best Q4 is obtained by the NMF method, while the MDTR method provides the best CC, UIQI, RMSE, SAM, and ERGAS.

4.2. GeoEye Dataset

The fused results for the GeoEye dataset are presented in this section. The PAN image has 0.5 m resolution, and the MS image has 2.0 m resolution. Figure 5 shows the results of the different methods. Figures 5(a)–5(b) are the MS and PAN images, respectively, and Figure 5(c) is the reference MS image. In addition, we analyze the difference images. In Figure 5(d), the spatial details are well preserved, but the result is dark. In Figures 5(e) and 5(f), there are slight spectral distortions and ringing artifacts. In Figure 5(g), the color information is poor. In Figures 5(h) and 5(i), the spatial details are enhanced, although the spectral information is slightly distorted. Visually, the result of the MDTR method is closest to the ground truth and loses the least information; its difference image also contains less residual information than those of the compared methods.

Table 3 shows the results for each assessment index. The best UIQI, SAM, and ERGAS are given by the MDTR method. For CC and RMSE, the best values are produced by SparseFI, and the best Q4 is achieved by the NMF method.

4.3. IKONOS Dataset

In this section, the IKONOS dataset is tested. Its resolution is 1 m for the PAN image and 4 m for the MS image. The fused results are shown in Figures 6(d)–6(j); the enlarged area in the red rectangles appears in each fused result, and the difference image is illustrated in the second row. All the fused images provide enhanced spatial information, but the result of the PCA method is dark. The GIHS, CT, PN-TSSC, and NMF methods show inferior performance in terms of spatial information. For the proposed method, the spatial details are well maintained, and the fused image is close to the reference image; in the enlarged region, the fused image of the MDTR method is better than the others. From Table 4, the best SAM is obtained by the NMF method, and all other best indexes are obtained by the MDTR method.

5. Conclusions

In this paper, we propose a new pansharpening method with the multidirection tree ridgelet dictionary and assess it on three datasets. The contribution of our work is the construction of the multidirection tree ridgelet dictionary, which can capture the directional information of image blocks. The spatial and spectral quality of the fused images is evaluated by six different indexes. Experimental results show that the proposed method suppresses color distortions in the fused image and produces satisfactory performance in both visual comparison and numerical evaluation. Future work will explore DNN- and tensor-based methods to further exploit spatial information and maintain image spatial relationships.

Data Availability

The datasets were obtained from the website https://resources.maxar.com/.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the Program for Outstanding Young Talents of Xianyang Normal University (No. XSYBJ202001), the Academic leader of Xianyang Normal University (No. XSYXKDT202107), the Natural Science Basic Research Plan in Shaanxi Province of China (No. 2017JM6086), the Science Basic Research Program in Shaanxi Province of China (No. 16JK1823), the Education Scientific Program of the 13th Five-Year Plan in Shaanxi Province of China (No. SGH18H350), the Science Basic Research Program in Xianyang Normal University (No. XSYK20025), and Innovation and Entrepreneurship Training Program for College Students (No. 202110722023).