Abstract

Hyperspectral imaging is a crucial technique for military and environmental monitoring. However, limited equipment hardware resources severely affect the transmission and storage of a huge amount of data for hyperspectral images. This limitation has the potentials to be solved by compressive sensing (CS), which allows reconstructing images from undersampled measurements with low error. Sparsity and incoherence are two essential requirements for CS. In this paper, we introduce surfacelet, a directional multiresolution transform for 3D data, to sparsify the hyperspectral images. Besides, a Gram-Schmidt orthogonalization is used in CS random encoding matrix, two-dimensional and three-dimensional orthogonal CS random encoding matrixes and a patch-based CS encoding scheme are designed. The proposed surfacelet-based hyperspectral images reconstruction problem is solved by a fast iterative shrinkage-thresholding algorithm. Experiments demonstrate that reconstruction of spectral lines and spatial images is significantly improved using the proposed method than using conventional three-dimensional wavelets, and growing randomness of encoding matrix can further improve the quality of hyperspectral data. Patch-based CS encoding strategy can be used to deal with large data because data in different patches can be independently sampled.

1. Introduction

Typical hyperspectral imaging (HSI) is acquired on satellites and aerospace probes and then transmitted to grounds. The imaging spectrometer can provide tens to hundreds of narrow-band spectral information for each spatial location. The huge amounts of data but scarce equipment hardware resources on satellites and aerospace severely limit the transmission and storage of hyperspectral images [1, 2]. However, the ground receiving side holds very strong processing capability. If we transfer the system complexity from the satellites and aerospace probes to the ground, it is expected to potentially solve the limitation in traditional high spectral sampling.

Traditionally, super-resolution reconstruction is used to improve the spatial resolution of HSI [3, 4] while compression can improve the transmission efficiency [58]. We acquire “all” data and then “throw away” most of it in transmission process. Can we just directly measure the part that we need to save the cost of storage? Compressed sensing (CS) has been proposed to solve these contradictions in remote sensing. The theory of CS shows that a sparse signal can be recovered from a relatively small number of linear measurements [9, 10]. The difference between the conventional method and CS based transmission method is shown in Figure 1.

CS has been widely used in medical imaging [11, 12] and wireless sensor network [13, 14] to reduce the sampled data in recent years which has been applied in remote sensing. Duarte et al. [15] proposed a single-pixel imaging method. Ma [2] applied CS in single frame imaging. Aravind et al. [16] used ten spectral bands to compare orthogonal matching pursuit with simultaneous orthogonal matching pursuit in reconstruction. Fowler [17] presented compressive-projection PCA to effectively shift the computational burden of PCA from the encoder to the decoder.

According to the CS theory, the reconstruction error is bounded by the sparse approximation error [18]. For a signal , let be the solution to obey CS conditions, a forward transform, and the sparse approximation of in transform domain by retaining only the -largest entries of ; then, the reconstruction error is where , are small positive constants and is noise power [19]. Equation () implies that a sparser representation will reduce the reconstruction error. An optimal sparsifying transform is always important for sparse image reconstruction to reduce the reconstruction error.

In this study, we introduce the surfacelet transform (ST) to sparsely represent hyperspectral images by making use of the spatial and spectral information. The advantage of ST over wavelet on sparse HSI data reconstruction is demonstrated, where the data are reconstructed by using a fast iterative shrinkage-thresholding algorithm [20]. To further improve the reconstruction performance, a 3-dimensional (3D) random encoding is designed and the Gram-Schmidt orthogonalization is adopted. Finally, a patch-based CS encoding scheme is designed to deal with large size data.

The remainder of this paper is organized as follows. First, ST-based compressive sensing HSI reconstruction method is introduced in Section 2. The experimental results are presented in Section 3. Finally, conclusions are given in Section 4.

2. ST-Based Compressive Sensing HSI Reconstruction

The CS theory comes up with two principles: sparsity, which asserts that the reconstructed signal is sparse with a transform , and incoherence, which requires that the encoding matrix is incoherent with [10].

2.1. Sparse Representation of HSI Using Surfacelet Transform

Wavelet is commonly used as a typical sparse transform for HSI [2124]. However, wavelet sometimes fails in sparsely representing HSI because it has only 7 directions [2527]. Surfacelet transform (ST), proposed by Lu and Do [25], can efficiently capture the surface intrinsic geometrical structure within -dimensional signals [25]. It offers directional subbands with decomposition level by combining the multiscale pyramid with the 3-dimensional directional filter banks (3D-DFB). Thus, surfacelet with more directions may help reducing the blocky artifacts caused by orthogonal wavelets [24, 25].

ST is a multiscale version of the 3D-DFB. The input signal first goes through the 3D hourglass filter , which is a three-channel undecimated filter bank. One branch of the three-channel structure of 3D-DFB is given by Figure 2. The output is then fed into a 2D filter bank, denoted by , which operates on the (, ) planes. The tree-structured filter bank produces output subbands, denoted by for . Each output is then fed into another 2D filter bank operating on the (, ) planes. In the end, we get outputs, represented by for and [25]. spatial domain basis images of ST with 2 levels of decomposition are shown in Figure 3.

ST is optimal for the singularities, thus providing sparser representation of smooth curves and surface singularities than wavelet. Theoretically, for the singularities of an image , the best -term approximation error using ST has error decay rate of , while this error decay rate is for typical wavelets [28, 29]. Figure 4 shows that a sparser representation is achieved using ST. According to the CS theory [19], the reconstruction error bound is proportional to , where is the forward transform and means the number of preserved coefficients in transform domain. Therefore, ST is expected to reduce the image reconstruction error of HSI.

2.2. Incoherence and Gaussian Random Encoding Matrix

Incoherence means a column of the CS encoding matrix must be trying the proliferation in the corresponding sparsity basis . Since this paper focuses on investigating the spatial-spectral sparsity of HSI, a Gaussian random matrix is chosen as because it is incoherent with the entire existing basis [9, 10]. If each spatial image is encoded separately, can be represented by block matrix: where is an encoding matrix for the band of HSI. When , that is, each 2D image is encoded with the same encoding matrix, then is called 2D random encoding matrix; when , that is, each 2D image is encoded differently, then is called 3D random encoding matrix. Physically, 3D random encoding means that digital micromirror device (DMD) arrays are different for different bands. The performance of both encoding schemes will be discussed in Section 3.

In order to improve the performance of the recovered signal, a Gram-Schmidt orthogonalization (GSO) is used in CS encoding matrix. Given one band random encoding matrix and the th column denote as , the GSO is expressed by

The new matrix is adopted as the CS encoding matrix of one spectral band. The randomness of the matrix is not affected by GSO. In the following, columns of each are processed with GSO.

2.3. Reconstruction Algorithm

For HSI , where represents the th spectral band image, represent the spatial dimensions, and represents the spectral depth of HSI. Let , where is a column vector of the th band HSI with size . The data acquisition model for CS is given by where is a random encoding matrix for the th spectral band image and is the acquired undersampled data. The sampling ratio is defined as where , which means the HSI are undersampled.

In this paper, ST is adopted to provide sparser representation of HSI data and is expected to reduce the reconstruction error. Let represent inverse ST and let denote forward ST; CS recovers by solving [18, 19] where , stands for -norm, and is the regularization parameter which decides the tradeoff between the sparsity and the data fidelity. Many researchers seek for a simple and fast algorithm to solve (), such as the conjugate gradient [30], Bregman iteration [31, 32], low rank reconstruction [33], and other methods. In this paper, we choose FISTA to solve () because of its simplicity and fast convergence, whose convergence rate is , where is an iteration counter [20].

To solve (), the smallest Lipschitz constant [20] of the gradient is . The Lipschitz constant ensures the convergence of algorithm [34]. Taking and as the initial values, the threshold is . For , solutions are found by iterating from () as follows: The maximum number of iterations of FISTA is set as 200 to achieve stable solutions. The final output is .

2.4. Patch-Based Compressed Sensing with Surfacelet Reconstruction (PCSST)

For a larger HSI data , bigger encoding matrix may exceed the memory of computer or leads long computation time. In this case, a patch-based sampling operation can be used. Let be a patch operator and let divide the th band of into patches, and then the data acquisition model for CS is given by where denotes measurements of patches in th band of and satisfies , where means random encoding on each block. In our scheme, the encoding on different patches of each band is different and these patches are nonoverlapped.

PCSST allows implementing the proposed method on a larger dataset. It has the potential to allow for parallel computing as shown in Figure 5 and also provides the possibility to reduce the complexity of sensor arrays since the sensing detectors independently sample the data in different patches.

3. Experimental Results

Experiments are conducted in three aspects. First, better reconstructed signal using ST than using wavelet is demonstrated. Second, improving the randomness of encoding matrix for each spectral band is shown to improve the recovery quality and the reconstruction performance of two encoding schemes is compared. Third, a patch-based CS encoding scheme is designed to deal with large data.

The HSI data is obtained from U.S. AVRIS website, including Moffet Field and Lunar Lake [35]. These data contain 224 spectral bands with spatial size , and every pixel is encoded with 16 bits. Linear interpolation is employed to fix the junk bands [36] and maintain the consistency of spectral lines, and all data are normalized.

To evaluate the performance, the mean-square-error (MSE), peak signal-to-noise ratio (PSNR), and structural similarity (SSIM) [37] are adopted as criteria to measure the reconstruction error. Their definitions are where denotes the original image, and are mean and standard deviation of , stands for the recovered image, and are mean and standard deviation of ; , are small constant, and is adopted in our work, which is used to avoid instability when either or is very close to zero [37]. Mean SSIM index is adopted in this paper. Simulations run on a dual core 2.5 GHz CPU laptop with 4 GB RAM. A fast ST implementation in C language is adopted.

3.1. Surfacelet and Wavelet with 2D Encoding

To simulate the CS data acquisition, the HSI are undersampled by random encoding matrix . 50% sampled data means for . In order to improve the quality of the recovered signal, Gram-Schmidt orthogonalization is used to obtain the CS encoding matrix. With the 2D encoding matrix (i.e., each band has the same encoding matrix), the ST-based reconstruction is compared with the 3D wavelets-based reconstruction. Daubechies filters “db4” with 2 levels of decomposition is used, and there are 7 directional subbands for 3D wavelet. The multiscale pyramid with 2 decomposition levels is chosen, and there are 12 directional subbands for ST. The complexity of ST is and wavelet is .

In order to well discuss the relationship between and PSNR, we compared the PSNR performance from to as shown in Figure 6. is empirically chosen to give optimal PSNR in our work. As shown in Figure 7, ST significantly improves the PSNR of each spectral band more than 3D wavelet. Edges and curves are better reconstructed using ST than 3D wavelets as shown in Figure 8. At a given spatial location, the spectral line reconstructed using ST is much more consistent with the ground truth than wavelets as shown in Figure 9. Under different sampling ratios, ST achieves much better PSNRs than wavelet as shown in Figure 10, and more advantage of ST is seen at low ratios. Besides, ST shows better performance for the Lunar Lake dataset which has richer geometric textures than Moffet Field dataset. Running time (unit: second) based on ST and WT with different sampled rate is shown in Table 1.

3.2. 2D and 3D Encoding Matrix

The performance of 2D encoding and 3D encoding schemes is compared in this section. The recovered spatial images with two encoding schemes are shown in Figure 11. Edges and curves are better reconstructed using 3D encoding than 2D encoding. The recovered spectral lines using 3D encoding are more consistent with ground truth than using 2D encoding as shown in Figure 12. The improvement is more obvious for Moffet Field dataset with textures in low intensities as shown in Figures 11(a) and 12(a). These results imply that increasing the encoding randomness among different bands will achieve better reconstruction.

3.3. Reconstruction with PCSST

We tested the methods on Lunar Lake data with size . A reconstructed spectral band is shown in Figure 13. The image structures recovered by surfacelet are much sharper than wavelet.

4. Conclusions

Compressive sensing is a new sampling theorem. In this paper, the surfacelet transform is introduced into hyperspectral image reconstruction from compressive sampled data. The surfacelet transform is a directional multiresolution transform for 3D data, which is applied to sparsify the hyperspectral images for the first time. Simulations are conducted in three aspects. First, better reconstructed signal using surfacelet than using wavelet is demonstrated. Second, improving the randomness of encoding matrix for each spectral band is shown to improve the recovery quality. Third, a patch-based CS encoding scheme is designed to deal with large data. It provides the possibility to reduce the complexity of sensor arrays because sensing detectors in different patches independently sample data. Experiments demonstrate that reconstruction of spectral lines and spatial images is significantly improved using the proposed method than using conventional three-dimensional wavelets.

In the future, our work includes the following two aspects.

The first aspect is optimizing sparse representation of hyperspectral imaging, for example, combining adaptive sparse representation [3840] and surfacelet transform. An adaptive dictionary may provide a sparser representation leading to a lower reconstruction error. Therefore, it is meaningful to try a low-complexity training method in the future.

The second aspect is overlapping patches compressed sensing reconstruction. Overlapping patches can reduce the “block artifacts.” But overlapping also introduces more encoded data and higher computations on image reconstructions. Trading the compressive sampling rate with image qualities using overlapping and speeding up the reconstruction algorithm [41] will be discussed in the future.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The authors wish to express their thanks to all reviewers for their comments. This work was supported in part by the National Natural Science Foundation of China (no. 61201045, no. 40971206, no. 61302174), Fundamental Research Funds for the Central Universities (no. 2013SH002), Open Fund from Key Lab of Digital Signal and Image Processing of Guangdong Province (no. 54600321, no. 2013GDDSIPL-07), Academic Innovation Team Building Project of Shantou University (no. ITC12002), and Scientific Research Foundation for the Introduction of Talent at Xiamen University of Technology (no. 90030606).