Abstract
Compressed hyperspectral imaging is a powerful technique for satelliteborne and airborne sensors that can effectively shift the complex computational burden from the resourceconstrained encoding side to a presumably more capable basestation decoder. Reconstruction algorithms play a pivotal role in compressive imaging systems. Traditional modelbased reconstruction approaches are computationally burdensome and achieve limited success. Deep learningbased approaches, while improving in reconstruction accuracy and speed, depend heavily on data, which is a major challenge for satelliteborne hyperspectral compressed imaging. In this article, we combine the respective advantages of modelbased and learningbased approaches in a distributed compressed hyperspectral sensing framework, employing linear mixed model assumptions and spectral library learning to simultaneously improve the reconstruction speed and accuracy without using a large amount of additional hyperspectral data. First, the relationship between the CS band and the key band is learned from the spectral library to ensure that the key band endmembers can be accurately predicted. Then, the joint horizontal and vertical difference operators are proposed to enhance the estimation of the initial values of abundance. Finally, the CS band endmembers and residuals are updated in the reconstruction module to deal with the endmember and abundance mismatch. Extensive experimental results on five real hyperspectral datasets demonstrate that the proposed spectral library learning, abundance initialization, and reconstruction strategy can effectively improve the compressed sensing reconstruction accuracy, outperforming the existing stateoftheart methods.
1. Introduction
Rich detail information endows hyperspectral images (HSIs) with a wide range of applications, such as mineral exploration, agriculture, military, etc. However, the huge volume of data puts great pressure on the storage and transmission of the onboard hyperspectral imaging system. Compressed sensing (CS) [1] theory is one of the key technologies to address massive data imaging.
CS theory points out that if the signal is compressible, we can collect the signal at a rate much lower than Nyquist and restore the original signal with high probability. CS technology perfectly realizes the combination of compression and sampling and effectively reduces the computational burden at the sampling end. Two key issues are inevitable in the application of CS technology. One is the sampling mode, which can optically collect data with compression. The other is how to reconstruct the original data accurately from a small number of observations.
Hyperspectral images have a variety of sampling modes due to their special 3D data structure, that is 2D spatial information and 1D spectral information. For example, we can sample the compressed hyperspectral images of each band using the same observation matrix as the gray image, which is usually called spatial compressed sampling [2]. Of course, we can also perform spectral compressed sampling for each pixel [3, 4]. Spatial and spectral compressed sampling alone is not conducive to data reconstruction. Therefore, a variety of spatialspectral joint sampling modes is proposed. Coded aperture snapshot spectral imaging (CASSI) [5] captured a compressed 2D measurement on the detector by multiplexing spatial and spectral information from the scene and modulating each spectral channel with different masks. In the spectral domain, however, the multiplexer of CASSI undergoes a deterministic uniform transformation. 3DCS approach [6], consisting of optical random convolution, random permutation, and spectralvarying subsampling on a sensor, was proposed to reduce the required sampling rate. Paired with an effective decoding algorithm, 3DCS enables the recovery of largescale HSIs and videos. Compressive hyperspectral imaging by separable spatial and spectral operators (CHISSS) [7] employed randomly encoding both the spatial and the spectral domains. Separable sensing architecture not only improves the compression efficiency but also reduces the computational complexity, making it especially suitable for largescale hyperspectral imaging. In recent years, random permuted Hadamard transform as the compressive operator is often used for theoretical reconstruction algorithm development [8–12]. However, random permuted Hadamard transform is difficult to implement with optical hardware, especially for onboard imaging spectrometers.
Image reconstruction has always been a key issue in hyperspectral compressed imaging. Traditional modelbased reconstruction methods exploit HSI structures by employing structureinducing regularizers through handcrafting. Sparsity in some orthogonal transformation domains and total variation (TV) minimization of images are the two most commonly used prior knowledge [6, 8, 9, 13–17]. Typically, the cost functions of such modelbased reconstruction methods consist of a data fidelity term and a single constraint or combination of multiple regularization terms capturing various assumed objection properties. Due to the 3D data structure of HSIs, a priori constraints on the multidimensional tensor space have been widely exploited in recent years. For example, sparse tensor and nonlinear compressed sensing (STNCS) [8] extended the sparse constraint on orthogonal transformation to the sparse constraint on a nonlinear mapping of tensor space. However, it is difficult to effectively constrain the optimal subspace for data recovery with a single prior. Therefore, hybrid spatiospectral total variation (HSSTV) [17] constructed a TV regularization item containing various difference operators of HSIs, such as spatial difference operators in horizontal and vertical directions, spatial difference operator, and spatialspectral difference operator. Although excellent reconstruction performance was achieved, determining the proportion of each difference in the regularization item for different data is very difficult. For HSIs, the lowrank (LR) constraint is also an important and extensive handcrafting regularization [10–12, 18–21]. Joint tensor/reweight 3DTV norm minimization (JT3DTV) [9] proposed a regularization by jointing a reweight 3DTV norm and LR matrix factorizations by Tucker decomposition to exploit TV and LR priories. Enhanced 3DTV (E3DTV) [12], on the other hand, interprets the LR properties of HSIs from the perspective of matrix factorization and then adds sparse and orthogonal constraints to the submatrices for reconstruction. However, the reconstruction accuracy of E3DTV needs to be further improved. LR matrix factorization is an important tool in HSI processing. The famous linear and nonlinear mixing model was developed. How to effectively use the linear mixing model (LMM) in compressed reconstruction is the main focus.
In recent years, with the continuous improvement in deep learning technology, datadriven or deep learningbased methods have gained much interest for compressed reconstruction. Classical deep networks, such as autoencoders [22], convolutional neural networks (CNNs) [23], and generative adversarial networks (GANs) [24, 25], were employed for endtoend learning of CS. To integrate the prior knowledge and the structure of the operators, deep unfolding maps motivated by the iterative algorithm put iterative restoration algorithms onto deep neural networks [26–28]. Deep learningbased methods are also extended to the compressed reconstruction of HSIs [29–32]. Although the learningbased methods have surpassed the traditional modelbased methods in terms of reconstruction speed and accuracy, the learningbased approach is heavily dependent on data, which is a major challenge for onboard hyperspectral compressed imaging. On the one hand, collecting large amounts of airborne hyperspectral data is costly. On the other hand, different imaging spectrometers may lead to inconsistent data structures due to the different spectral resolutions. In addition, since the huge size of HSIs, the computational complexity of reconstruction training on large datasets is very high.
In this article, a distributed compressed sensing (DCS) framework suitable for pushbroom and whiskbroom remote sensing imaging [7, 33] of onboard hyperspectral imaging spectrometer is built. With the help of the spectral library, we can more conveniently mine the LR priors of HSIs by spectral unmixing. For convenience, the proposed compressed imaging scheme is dubbed DCHI_SUL. In summary, the main contributions of this article include the following three aspects:(1)A distributed compressed sampling and reconstruction framework is proposed, which utilizes the LMM of HSIs to effectively exploit the LR prior. The framework incorporates modelbased and learningbased CS reconstruction methods. With the aid of a small number of collected spectral libraries, the signatures of key bands can be accurately learned without large amounts of data and heavy calculations.(2)A novel joint horizontal and vertical difference operator for abundance TV constraint is proposed, coupled with a welldesigned initialization scheme, which effectively improves the reconstruction quality.(3)The alternating direction method of multipliers (ADMM) [34] algorithms used for initialization and reconstruction are readily designed and implemented efficiently to solve the abundance optimization and HSI reconstruction problem. Comprehensive experiments compared with other stateoftheart methods substantiate the superiority of the proposed framework.
The rest of the article is organized as follows: Section 2 introduces our DCS framework. In Section 3, we comprehensively describe the reconstruction of compressed data, including endmember estimation, initialization strategy, ADMMbased reconstruction algorithm, etc. Extensive experiments compared with other stateoftheart methods are reported in Section 4, where we also discuss some ablation studies in our proposed CS framework. Section 7 concludes the article.
2. Distributed Compressed Sensing Imaging Framework
Distributed compressed sensing originated from distributed source coding [35], a solution for distributed video coding (DVS) [36] in video compression and coding to reduce encoding complexity and preserve coding efficiency. Incorporating CS technology into the DVS framework, distributed compressed video sensing (DCVS) [37, 38] first grouped video frames, called groupofpictures (GOP). The keyframe and CS frame are selected from each frame and compressed sampled with different sampling rates. In our previous work, we followed the framework of DCVS and proposed distributed compressed hyperspectral sensing (DCHS) [39]. DCHS grouped equal parts according to the spectrum of HSIs. However, the GOP is grouped differently for different sample rates and is also not applicable to higher sample rates. Therefore, DCS according to spectral library matching (DCS_SLM) [40] proposed the strategy of randomly extracting key bands, but it also brought difficulties to reconstruction. DCS_SLM assumed that all endmembers contained in the specified scene correspond to some signatures in the spectral library. Then the closest endmember is matched from the spectrum library. However, due to environmental conditions and instrumental configurations during data collecting, the real spectrum may be inconsistent with the spectrum library. This inconsistency is fatal to DCS_SLM.
In this article, we follow the compressed sampling scheme of DCS_SLM, that is, a small number of key band images are randomly selected from hyperspectral data. During reconstruction, however, instead of directly matching the corresponding spectrum, we learn the relationship between key bands and CS bands through the spectrum library. Figure 1 shows the proposed DCHI_SUL sampling and reconstruction process. At the sampling end, we abandon the GOP of DCVS and instead randomly extract some images from the hyperspectral data as key bands. Let denotes HSIs, where is the number of pixels and is the number of bands. is the randomly selected key band images. denotes the number of images in the key band. The remaining images are used as CS bands , where is the number of CS band images and .
The CS bands’ compressed sampling realized by optical hardware can be mathematically expressed as folows:where is the observed data, is the spatial compressed measurement matrix, achieves the extraction of CS bands, and denotes the number of a small number of spatial observations. So that the observed data still meet the requirements of spectral unmixing, the measurement matrix adopts the structure of [41], that is, 1 is the only nonzero element in each row of AC, and the other elements are 0. Fortunately, this structure is more conducive to the coding of digital micromirror devices (DMD) [42]. is similar to the identity matrix, but only the diagonal position corresponding to the extracted band is 1.
Equation (1) is a typical spatialspectral compressed sampling mode, which can be easily implemented by CHISSS [7]. Similarly, the transmission of key bands can be expressed as
In order to improve the reconstruction performance as much as possible, the sampling of key bands is not compressed. This means that the measurement matrix is a square matrix with full rank. Therefore, the observation data and are equivalent. We can consider that is transmitted directly from the sampling end to the reconstruction end. is used to extract key bands, which is complementary to , , where is an identity matrix.
Now let us analyze the sampling rate of the proposed distributed compressed sampling. First, the key band images are not compressed and only bands are extracted from the original bands, so its sampling rate is . For the CS band, the sampling rate is due to the spatially compressed sampling. Therefore, the total equivalent sampling rate of distributed compressed sampling satisfies . In our experiments, we find that only a very small amount of spatial observations can achieve excellent reconstruction performance. Therefore, in the following experiments, we fix the value of to 0.01.
From the reconstruction end of Figure 2, the task of the proposed scheme is to recover CS band data given the key band and the spectral library. Spectral unmixing and ADMM [34] algorithm are used to solve this problem (please refer Section 3 for details).
3. Recovery of CS Band
In this section, we will describe the reconstruction method of CS band images in detail. Recovering CS band images directly from spatially compressed data is an underdetermined inverse problem, which is inherently illposed. Especially when the sampling rate is low and is a binary matrix, achieving competitive reconstruction is practically impossible. Therefore, we expect to improve reconstruction accuracy by means of the LMM of HSIs.
3.1. LMM
LMM is a single and popular prior model in HSI processing. According to LMM, HSIs can be decomposed into endmembers and their corresponding abundances. In short, LMM can be formulated as follows:where is defined as an endmember matrix containing various spectral signatures, is the abundance distribution matrix corresponding to the endmember, and is the number of endmembers contained in a particular scene.
According to the LMM, equations (1) and (2) can be rewritten aswhere we directly replace with for convenience, since the key bands are not compressed. Of course has also been eliminated. Noting that and in equation (4) have the same . Because the key bands and CS bands correspond to the same ground truth, they have the same signatures and the same distribution. However, the spectral range of signatures is different, so we denote them by and .
We can easily utilize the endmember extraction algorithm in the spectral unmixing domain, thanks to the special measurement matrix , which does not change the nonnegativity and sumtoone constraint of abundance in [41]. In this article, we employ vertex component analysis (VCA) [43] algorithm to extract endmember from compressed observation .where denotes the VCA algorithm.
Now, the reconstruction of the CS bands is transformed into an abundance estimation problem for a given , , and .where is the prior constraint term of abundance. Obviously, the appropriate abundance prior constraint is helpful to improve the reconstruction accuracy. However, when is very small, the estimate of in equation (6) tends to fall into a local optimum. There are usually two ways to solve the local optimization problem in the optimization process: one is the initialization scheme, which makes the starting point of the optimization process around the global optimal point, and the other is a more appropriate constraint regularization term. The next part of this article focuses on these two issues.
3.2. Initialization Module
Compared with the large number of hyperspectral training samples required by deep learningbased methods [29–31], the collection of spectral libraries is much easier. Many spectral libraries have been established in successions, such as the United States Geological Survey (USGS)^{1} and Jet Propulsion Laboratory (JPL)^{2}. In this subsection, the initial value of abundance is estimated mainly through the key band data and spectral library, as well as the joint difference operator for abundance TV constraints.
3.2.1. Endmember Prediction of Key Bands by Spectral Library Learning
For key band data , if we want to get the abundance, we must first estimate the endmember matrix . Since a small number of key bands were collected, accurately extracting by the VCA algorithm is very difficult. Now that the CS band endmember has been extracted, it is very reasonable to assume that there is some relationship between the endmembers of the key bands and the endmembers of the CS bands. Next, we focus on how to predict this relationship. Therefore, our next main focus is on how to predict the relationship.
First, we assume that any point in a spectral signature can be a linear combination of all or part of other points, that iswhere refers to the point in the endmember signature of the key bands; refers to the point in the endmember signature of the CS bands, and can be achieved by adding the first column of all 1 to ; is the weighting coefficient, which is obtained through spectral library training; and superscript indicates transpose operation.
For a spectral curve in a given spectral library , we define the following cost function to calculate the optimal value of :where and correspond to the values of the CS bands and key bands of the spectral curve in the spectral library, respectively, and is the total number of spectral curves in the spectral library. can be calculated by minimizing , and the gradient descent method is applied to iteratively update and obtain the optimal solution of .
After learning the relationship between the key bands and CS bands from the spectral library, the results of using the relationship to predict the key band endmembers from the CS bands are presented in Figure 2. The testing process is as follows: first, the VCA algorithm is applied to extract the endmembers (shown in Figure 2 by solid lines) from the Yellowstone 0 hyperspectral dataset, then 43 bands (corresponding to an equivalent sampling rate of about 0.2 for our method) are randomly selected as the real key band endmembers and the other 181 bands as real CS band endmembers. That is, these 181 band endmembers are the inputs of the neural network during testing. In the training phase, the spectral library is divided into CS band signatures and key band signatures according to the same rules to form training sample pairs. After successfully training the parameter , equation (7) is applied to predict the key band endmembers by the real CS band endmembers. The predicted values of 1st and 2nd endmembers are labeled with “x” and “o,” respectively, in Figure 2. It can be seen that the proposed prediction method can effectively predict the key band endmembers through the CS bands with the aid of a spectral library.
3.2.2. Joint Horizontal and Vertical for Initial Abundance Estimation
After successfully predicting the endmember matrix with the assistance of a spectral library, the initial abundance value can be easily estimated from the key band data by the least square method.where the superscript denotes the inverse operation of the matrix.
However, when the sampling rate is low, such as below 0.1, the number of endmembers may exceed the number of key bands, that is, . At this point, estimating the initial value of abundance from the key band is still an underdetermined problem, and the introduction of suitable prior information is necessary to improve the estimation accuracy. TV minimization is a widely used abundance smoothing constraint in hyperspectral unmixing [2, 44, 45]. However, previous works usually calculate the horizontal and vertical differences independently and pay little attention to their correlation in the abundance TV constraints. In the experiments, we found that the vertical difference of the horizontal gradient map (defined as ) has stronger sparsity than the gradient map in a single direction ( or ), where and denote the difference operator along the horizontal and vertical directions, respectively. Figure 3 shows the abundance corresponding to the first endmember of Yellowstone 0 data, the horizontal gradient map of abundance, the vertical gradient map, the vertical difference of the horizontal gradient map, and the sparsity comparison of various TVs.
(a)
(b)
(c)
(d)
(e)
For clarity, the abundance difference maps in Figure 3 are magnified by a factor of 5 in magnitude. , , and can describe spatial local smooth prior of abundance. However, the map is sparser and more efficient, as shown in Figure 3(e). Although the simultaneous employment of multiple TV operators is beneficial to improve the optimization performance, the increase of operators leads to the increase of weight hyperparameters, which affects the practicality of the algorithm. In this article, only is employed as the constraint term of the objective function. The initial estimate of the abundance can be described aswhere ( is the column vector of , and denotes the norm of vector ). Problem (10) can be rewritten as the following objective functionwhere is referred to the Frobenius norm of matrix , and are regularization parameters that can tradeoff between fidelity terms and prior constraint term. We employ ADMM to solve the optimization problem given by (11), which is detailed in Algorithm 1, and the solution to each subproblem is given in Appendix A.
3.3. Reconstruction Module
With the endmember extraction in Subsection 3.1 and the initial abundance estimation in Subsection 3.2, we can obtain better reconstructed HSIs based on LMM. In this subsection, in addition to the proposed horizontal and vertical joint gradient of abundance, the LMM residual term is added to further improve the reconstruction performance. For the CS band, it is not easy to improve the reconstruction accuracy through a small amount of observation data. More appropriate regularization terms should be explored.
Now, our task is to recover the CS band images, not just reconstruct the abundance. In contrast to problem (6), several changes are made in the minimization of the objective function as follows. First, the goal now is to reconstruct the CS band images. Considering that both the CS band endmember estimated by the VCA algorithm and the key band endmember learned from the spectral library may be deviated, the endmember matrix is also updated together during the minimization of the objective function. Spectral variabilities caused by environmental conditions and instrumental configurations, as well as material nonlinear mixing effects, are inevitable. Therefore, this subsection introduces the residual term of LMM to describe the spectral variability. Then, rewrite the compressed sampling of the CS band as follows:where denotes the residual caused by spectral variability. Finally, is added to the objective function to constrain the optimal solution of the residuals. In summary, the reconstruction objective function of the component in CS band images is constructed as follows:where and are regularization parameters, which adjust the proportion of abundance TV regularization term and residual constraint term. Similarly, the ADMM algorithm is employed to solve the problem (13). For algorithm details, please refer Algorithm 2 in Appendix B.
To sum up, the pseudocode of the proposed reconstruction scheme for recovering the CS band images from the observed values is described in Algorithm 3.
In this section, we report the performance of the proposed DCHI_SUL framework compared with 6 recently developed conventional and stateoftheart hyperspectral compressed imaging methods, including STNCS [8], JT3DTV [9], E3DTV [12], HSSTV [17], DCHS [39], and DCS_SLM [40]. DCHS and DCS_SLM are the DCS framework. However, DCHS requires GOP processing during compressed sampling of HSIs. DCS_SLM roughly matches endmembers directly through the spectrum library, resulting in low prediction accuracy of endmembers in key bands. To ensure the fairness of the comparison, involved parameters are set according to the rules in their papers or finetuned using default settings.
Experimental results are reported in terms of sampling rates, mean peak signaltonoise ratio (MPSNR), mean structure similarity (MSSIM), mean spectral angle mapper (MSAM), and runtime. MPSNR and MSSIM are two conventional and effective evaluation metrics for spatial information reconstruction quality, while MSAM is a spectralbased evaluation measure. MPSNR evaluates the average peak signaltonoise ratio of the reconstructed image, and a larger value indicates that the reconstructed image is closer to the original image in terms of peak signaltonoise ratio. MSSIM evaluates the structural similarity of the reconstructed image to the original image. The closer the reconstructed image to the original image in structure, the higher the value of MSSIM is. MSAM evaluates the distance between the reconstructed and the original spectral curve. When the reconstructed spectrum is closer to the original spectrum, the MSAM value is smaller. In short, higher MPSNR and MSSIM and lower MSAM indicate better reconstruction performance of the algorithm. MPSNR, MSSIM, and MSAM are defined as follows:where and are the band image vectors of original and reconstructed hyperspectral data, respectively; and corresponding to the original and reconstructed spectral signature vectors, respectively. is the peak value of . is defined as the structure similarity of between and . The details of SSIM function can be found in [46].
In this article, to evaluate the performance of the algorithm objectively, all tested data are the raw data that were acquired by airborne or spaceborne hyperspectral imaging platforms without any preprocessing. The five tested data employed for performance evaluation are referred to Yellowstone 0, Yellowstone 3, Yellowstone 18, Hawaii 1, and Low altitude acquired by airborne visible infrared imaging spectrometer (AVIRIS)^{3}. Yellowstone and Hawaii hyperspectral datasets were collected by the AVIRIS sensor over Yellowstone, WY, and Mt. St. Helens, Hawaii in 2006, with a spatial resolution of 20 m. The datasets have a spatial size of 512 × 614 with 224 spectral bands ranging from 0.4 to 2.5 μm with a 10 nm bandwidth. The subsequent numbers 0, 1, 3, and 18 indicate the 0th, 1st, 3rd, and 18th scene of the dataset, respectively. Low altitude scenes containing mainly agricultural areas are the free standard data products downloadable from the AVIRIS website. The spatial size of all tested images was cropped to 256 × 256, with 224 spectral bands, where the pseudo color images composed of 40, 30, and 10 bands are shown in Figure 4. In our experiments, USGS digital spectral library was selected to assist in reconstruction. Spectral signatures cover 224 bands without removing water absorption and noise channels.
All algorithms were performed with MATLAB (R2021a) on a mobile workstation and Windows 10 OS with a dualcore Inter 2.6GHz CPU and 32GB RAM.
3.4. Quantitative Comparison
The performance evaluation of the proposed DCHI_SUL framework is reflected by the sampling rates ranging from 0.1 to 0.5 with a step size of 0.1. Table 1 compares the reconstruction MPSNR by all the compared methods, over all the spectral bands in five hyperspectral datasets. We highlight the best results for each case in bold in the current and following tables. The MPSNR of the proposed DCHI_SUL framework outperforms that of all other competing approaches, in particular the MPSNRs are higher than the DCHS method. DCHS, DCS_SLM, and DCHI_SUL all use the DCS framework. However, DCHS can predict the key band endmembers more accurately according to GOP, with the disadvantage that it does not apply to higher sampling rates. Although DCS_SLM also predicts key band endmember by the spectral library, the rough direct selecting spectral signatures from the spectral library as the real endmember in the scene has slightly lower matching accuracy. This is because the standard spectrums of different objects provided by the spectral library are obviously different from the spectrum in the actual HSI due to the weather influence, imaging conditions, etc. However, the relationship between bands remains unchanged. DCHI_SUL employs the neural network to predict this relationship, so as to accurately predict the key band endmember. Therefore, the reconstructed MPSNR of DCHS is higher than that of DCS_SLM. The reason for the superiority of DCHI_SUL over DCHS is mainly due to the refining strategies such as accurate initialization and spectral variability introduction during the reconstruction process.
Another issue revealed by the results in Table 1 is that the joint optimization of multiple prior constraints is usually better than a single prior. For example, a single sparse prior, even for tensor space sparsity, gives the worst reconstruction performance of STNCS among all methods. The HSSTV has also been able to achieve a very high MPSNR by combining multiple TVs. E3DTV is similar to spectral unmixing, which adds constraints in subspace for reconstruction. However, due to the improper reconstruction framework, its accuracy is not high. Overall, the subspace reconstruction scheme based on spectral unmixing outperforms the direct reconstruction of the entire HSI.
Table 2 provides the MSSIM comparison of all competing methods. SSIM is an index to measure the similarity between the reconstructed image and the original image. DCHI_SUL also largely outperforms other algorithms on MSSIM, except for three cases that are slightly below DCHS, but the difference is also very small. Comparing HSSTV and DCS_SLM in Tables 1 and 2, the MPSNR of DCS_SLM is sometimes higher than that of HSSTV, but the MSSIM is lower instead, which indicates that HSSTV can protect the structure of the image better than DCS.
One worth mentioning is that DCS has a low number of key bands at low sampling rates and high endmember prediction accuracy, but poor abundance estimation. Conversely, the endmember prediction accuracy decreases at high sampling rates, while the abundance estimation depends on the endmember prediction, resulting in no significant improvement in estimation performance. Therefore, the MPSNR and MSSIM of DCS_SLM sometimes decrease after the sampling rate exceeds 0.3. However, the accuracy of DCHI_SUL does not decay or decays little due to a series of refining strategies.
We also provide the MSAM comparison results of all methods in Table 3. In most cases, DCHI_SUL still accomplishes the best reconstructing performance. Compared to the results of MPSNR, however, the minimum MSAM cases outside of DCHI_SUL are more frequent, and the relative advantage is also a little weaker. This indicates that DCHI_SUL is better at maintaining spatial rather than spectral information for HSI. It is worth mentioning that the MSAM of the HSSTV algorithm at high sampling rates reaches or is even lower than the optimal value obtained by the DCS framework, which may benefit from the application of its multiTV constrained prior consisting of spectral difference and spatialspectral difference operators.
The last one of the quantitative experiments compares the reconstruction speed of all algorithms. Table 4 shows the reconstruction times for all methods under different sampling rates on the Yellowstone 0 dataset. Obviously, the reconstruction methods based on spectral unmixing significantly reduce the computational complexity. For example, DCS_SLM improves the reconstruction time by 2 orders of magnitude over JT3DTV, and also by nearly 5 times over STNCS, which is the fastest of the conventional algorithms in our experiments. Spectral unmixing converts the reconstruction of hyperspectral data into endmember and abundance estimates rather than directly recovering the original data. The optimization of two small matrices during the iterative process is certainly much less computationally complex than the direct optimization of the entire dataset. The reconstruction time of DCHI_SUL is about 2–3 times longer than that of DCS_SLM, mainly due to the initialization and reconstruction modules performed.
3.5. Visual Quality Comparison
To visually demonstrate the DCHI_SUL performances of the proposed method, we build the pseudo color images composed of 40, 30, and 10 bands corresponding to the five reconstructed HSIs obtained by all methods under SR = 0.2 in Figure 4. We observe that all the competing methods can complete relatively excellent reconstruction results, with little visual discrimination between the superior and inferior. Only the reconstructed images of the STNCS algorithm are slightly blurred.
In order to clearly distinguish the reconstruction results of different methods visually, the comparison of residual images between the original images and the reconstructed images achieved by all the competing methods with a 0.2 sampling rate is provided in Figure 5. For the convenience of observation, the residuals of different methods are scaled by the same degree. For example, the Low altitude dataset was scaled up by a factor of 100, and the other datasets were scaled up by a factor of 50 for the higher reconstruction accuracy of the low altitude dataset. The better the reconstruction performance of the algorithm, the smaller the residuals and the darker the residual images; conversely, the brighter the residual images. We can clearly observe in Figure 5 that the residual images of STNCS and E3DTV are brighter, implying that the reconstruction quality of these two methods is poor. Next is the JT3DTV algorithm. The image brightness of HSSTV and DCS_SLM are similar. Although the DCS_SLM achieves a reconstructed MPSNR of 57.318 on the Low altitude dataset under 0.2 sampling rate, its reconstructed residuals are large from the image in Figure 5. This may be related to the low accuracy of the endmember prediction in band 30. DCHI_SUL is slightly darker than DCHS and better than the other methods. However, we observe a strange phenomenon where two bright spots appear on the reconstructed residual images of the DCHI_SUL on the Yellowstone 3 and Yellowstone 18 datasets, respectively. By observing the corresponding position of the original image, we found that there is a small anomalous target at that position. This means that the proposed DCHI_SUL algorithm cannot handle the small anomalous targets contained in the scene. In the DCHI_SUL CS framework, on the one hand, spatial random observation and low sampling rate may lead to the ignoring of small anomalous targets; on the other hand, when the number of endmembers is small, the endmember extraction algorithm preferentially extracts the main component endmembers and ignores the small anomalous targets. However, due to the low percentage of anomalous targets in the whole scene, the CS reconstruction performance (MPSNR, MSSIM, and MSAM) is hardly unaffected. Yellowstone 18 outperforms other algorithms in all metrics.
To further illustrate the superiority of the proposed DCHI_SUL on spectrum reconstruction, reconstructed and original spectral curves across all bands are compared from the pixel (100, 100) of all the five datasets. We provide the comparisons of the spectral curves in Figure 6 and enlarge the local area for facilitating observation. The MSAM in Table 3 is the average measurement of all pixels. Here, only one pixel is used to visualize the difference between the original and the reconstructed spectral curves. The relationship between Figure 6 and Table 3 is similar to Figure 4 and Table 1. The comparison of spectral curves shows that DCS frameworks reconstruct the spectral curve closer to the original one. While DCS_SLM performs somewhat poorly than the other two DCS methods, it still significantly outperforms the other four conventional algorithms. STNCS, JT3DTV, and E3DTV performed poorly on all datasets, especially the Yellowstone 18 and Low altitude datasets. This is consistent with the results in Table 3. Visual comparison experiments show that the DCS framework can perfectly recover the spatial and spectral details of the original images.
3.6. Ablation Studies and Discussion
As noted above, the proposed DCHI_SUL framework achieves higher reconstruction quality. In this subsection, we mainly discuss the contribution of the initialization module and reconstruction module of the proposed framework. For easy reference, DCHI_SUL_E is named as DCHI_SUL only with the endmember prediction module. That is, the least square solution of equation (9) estimates the abundance after the key band endmembers are predicted by the spectral library, and then the product of the extracted CS band endmember matrix and the estimated abundance matrix recovers the HSI. DCHI_SUL_EI is DCHI_SUL with the abundance initialization module replacing the least square solution of equation (9) after the endmember prediction. DCHI_SUL_EIR is DCHI_SUL with the endmember prediction module, the abundance initialization module, and the reconstruction module. Figure 7 shows the reconstructed MPSNR with different modules.
(a)
(b)
(c)
(d)
(e)
Overall, with the enhancement of the 3 modules, DCHI_SUL_EIR significantly improves the reconstruction accuracy. Moreover, the MPSNR curves of DCHI_SUL_E and DCHI_SUL_EI fluctuate severely, which may be caused by random sampling of key bands. All the data used in our experiments are raw hyperspectral data without removing water absorption and noise bands. The quality of randomly selected bands with different sampling rates varies, which leads to fluctuating accuracy of endmember prediction in key bands. By adopting appropriate regularization terms, however, DCHI_SUL_EIR gradually increases MPSNR with increasing sampling rate, except for the Yellowstone 18 dataset, which also further validates the robustness of the proposed method.
At 0.1 sampling rate, the key band endmembers can be predicted more accurately with the aid of the spectral library, which has been confirmed in Figure 2 in the previous section. However, it is difficult to reconstruct only by the endmember prediction module, for example, the MPSNR of DCHI_SUL_E on the Yellowstone 3 dataset is only 1.28 dB. When the abundance initialization module was added, the MPSNR increased to 37.62 dB, which confirmed the effectiveness of the initialization module. On other datasets, DCHI_SUL_EI can also be boosted by 15–20 dB at 0.1 sampling rate. Nevertheless, when the sampling rate is large, the initialization module can only marginally improve the accuracy, sometimes by less than 1 dB. This is because when the sampling rate is below 0.1, the endmember number is greater than the key band number , and the abundance estimation is an underdetermined optimization problem. A proper prior constraint has a pivotal role in the reconstruction. As the sampling rate increases, the abundance estimation transforms from an underdetermined to an overdetermined problem. The least square method then yields excellent exact solutions.
On the Yellowstone 0, Yellowstone 3, and Yellowstone 18 datasets, the MPSNR of both DCHI_SUL_E and DCHI_SUL_EI have a clearly decreasing trend under 0.5 sampling rate, while DCHI_SUL_EIR can still maintain a slight increase. As the sampling rate increases, the number of key bands increases, leading to a decrease in the accuracy of end element prediction. Furthermore, the deviation of abundance estimation depending on endmember prediction increases. Then the match between the estimated abundance and the endmembers of the CS band will be broken and the LMM formed by their product is not applicable. Therefore, iterative updating of endmember and residual in the reconstruction module of DCHI_SUL_EIR plays a prominent role.
Finally, we discuss the training dataset of the proposed DCHI_SUL framework. In our framework, we need to learn the endmember projection relationship from CS band to key band. Spectral library is a convenient learning data. On the one hand, the spectral library contains the endmembers in the real scene. On the other hand, compared with HSIs, the amount of data in the spectral library is much smaller. For example, the USGS digital spectral library used in this article contains spectral curves for 498 materials, each with 224 bands, and occupies less than 500 KB of total storage space. This means that the amount of training data for DCHI_SUL is even less than a HSI, which is hardly different from the traditional modelbased reconstruction methods. However, deep learningbased reconstruction usually requires training on hundreds or even thousands of HSIs. For example, deep compressed sensing network (DCSN) [32] applied a total of 2537 HSIs sized of 256 × 256 with 172 bands and randomly selected 90% as the training sets.
4. Conclusions
In this article, we propose a novel DCS framework with an auxiliary prediction by the spectral library for HSIs, which utilizes spectral learning, fine initialization, and strategic reconstruction to improve the reconstruction quality. Specifically, in the endmember estimation, the relationship between the key band and CS band endmembers is efficiently learned by the spectral library, so that the key band endmembers can be effectively predicted from the CS band endmembers. In the initialization module, A TV difference operator combining horizontal and vertical directions is proposed to improve the estimation of initial abundance under the low sampling rate. Analogically, in the final reconstruction iteration, the CS band endmembers and residuals are updated to compensate for the mismatch between endmember and abundance. Experimental results demonstrate that the proposed DCS framework achieves much higher reconstruction performance and better perceptual image quality than other stateoftheart CS methods.
In the future, applying spectral unmixing and spectral library learning to process the CS of HSIs containing small anomalous targets will be the focus of our research.
Appendix
A. Algorithm to Estimate Initial Abundance
In Appendix A, we provide a detailed solution to the unconstrained optimization problem (11) in the main text. By introducing the auxiliary variables , , and , problem (11) can be rewritten as the following constrained optimization problem:
The augmented Lagrangian function for the problem (A.1) is as follows:where , , and denote the Lagrange multipliers and is a positive penalty constant.
Each iteration of ADMM fixes the other variables and updates only one variable. For the update of , the optimality subproblem is as follows:where subscript denotes the iteration. Problem (A.3) has the analytical solution ofwhere is a identity matrix.
The subproblem of is as follows:which has the analytical solution ofwhere indicate the adjoint of , and is an identity matrix.
Similarly, the objective function with respect to minimization subproblem is as follows:which is the wellknown softthreshold problem [47], and the solution to the softthreshold problem (A.7) is given by:where denotes the componentwise application of the softthreshold function.
The subproblem of is as follows:
Whose optimization solution is
At the end of each iteration, the Lagrange multipliers and are updated by the gradient descent method.
After the iteration, the residual is defined as follows:
The iteration stopping criterion is defined as .
In summary, the initialization module to estimate is described in Algorithm 1.
B. Algorithm to Reconstruct CS Bands
Similarly, in Appendix B, we provide an ADMM solution for the optimization problem (13). Introducing the auxiliary variables and , problem (13) is transformed into
And its augmented Lagrangian function is as follows:where , , and denote the Lagrange multipliers.
By minimizing the augmented Lagrangian function with respect to , we can obtain
The optimization problem to compute at iteration is a typical least square problem with the solution
Similarly, the solution of and at the iteration can be obtained by the following expressions:
The auxiliary variables and are updated as follows:
Finally, the Lagrange multiplier is updated.
The reconstruction module to obtain is described in Algorithm 2.

Data Availability
The AVIRIS hyperspectral image datasets used to support the findings of this study are available at https://aviris.jpl.nasa.gov/data/get_aviris_data.html. The USGS spectral library used to support the findings of this study is available at https://speclab.cr.usgs.gov/spectral.lib06.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this article.
Acknowledgments
This research was funded by the Key Projects of Natural Science Research of Universities of Anhui Province (Grant nos. KJ2018A0483, KJ2019A0709, and KJ2020A0703), Overseas Visiting and Research Project for Excellent Young Key Talents in Higher Education Institutions in Anhui Province (Grant no. gxgwfx2019056), and Shaoguan City Scientific Program (Grant no. 2019sn075).