Abstract

The wavelet transform is widely used in the task of hyperspectral image compression (HSIC). It has achieved outstanding performance in the compression of hyperspectral (HS) images, which has attracted great interest. However, transform based hyperspectral image compression algorithms (HSICAs) have lower coding gain than other state-of-the-art HSIC algorithms. To solve this problem, this manuscript proposes a curvelet transform based HSIC algorithm. The curvelet transform is a multiscale mathematical transform that represents the curves and edges of the HS image more efficiently than the wavelet transform. The experimental results show that the proposed compression algorithm has high coding gain, low coding complexity, and an at-par coding memory requirement, and works for both lossy and lossless compression. Thus, it is a suitable contender for the compression process in HS image sensors.

1. Introduction

Due to recent advancements in imaging spectrometry, HS image sensors capture the reflectance (intensity) of a scene with increasingly higher spatial and spectral resolution [1]. This abundant information makes HS images useful in multiple applications, from remote sensing and anomaly detection to land surface classification and environmental monitoring [2]. Processing the obtained information is a huge task; it also reduces sensor performance and requires a lot of bandwidth along with a huge sensor memory. Thus, HS image compression is a necessary step before the transmission of the image data [3]. Different compression techniques have been used to remove the redundancy existing in the HS image. There are two types of redundancy in any HS image: spatial redundancy (which exists due to the intraband dependency present in the spatial domain) and spectral redundancy (which exists due to dependency among pixels of different frequency frames at the same spatial location). These redundancies need to be removed to achieve compression [4].

In the last couple of decades, many HSICAs have been proposed, categorized by data loss (lossless, lossy, and near-lossless) or by coding process (prediction, vector quantization, compressive sensing, tensor decomposition, transform, and learning based hybrid). In terms of data loss, the HS image is reconstructed exactly through lossless compression, or within a bounded error through near-lossless compression. The compression ratio (CR) is a unitless parameter, and its numeric value for such HSICAs is approximately 2–3. Lossy HSICAs achieve a higher CR at the cost of some data loss. The CR value is managed by changing the bit rate of the encoding process [5].
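As a small worked example of these two parameters, the sketch below computes the CR and the bit rate in bits per pixel per band (bpppb) for a hypothetical coefficient block; the block dimensions and target bit rate are illustrative only.

```python
def compression_ratio(original_bits: int, compressed_bits: int) -> float:
    """Unitless ratio of original size to compressed size."""
    return original_bits / compressed_bits

def bit_rate_bpppb(compressed_bits: int, m: int, n: int, p: int) -> float:
    """Bits per pixel per band of the encoded stream."""
    return compressed_bits / (m * n * p)

# Hypothetical 512 x 512 x 16 block of 16-bit samples encoded at 2 bpppb.
m, n, p, depth = 512, 512, 16, 16
original = m * n * p * depth               # 67,108,864 bits
compressed = int(m * n * p * 2.0)          # 8,388,608 bits
print(compression_ratio(original, compressed))   # 8.0
print(bit_rate_bpppb(compressed, m, n, p))       # 2.0
```

Lowering the bit rate of the encoder raises the CR, which is how lossy HSICAs trade reconstruction quality for compression.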

Predictive coding algorithms have low complexity with moderate coding gain. Entropy coding methods (Huffman coding, unary coding, or arithmetic coding) are used to encode the prediction error [6]. Vector quantization based HSICAs use a codebook (the same one available at the encoder and decoder ends) to achieve compression. These algorithms only work with lossless compression, and if any data loss happens in the communication channel, the coding gain reduces significantly [7]. Compressive sensing based HSICAs shift the complexity from the encoder to the decoder. They divide the whole HS image data into blocks, compress the data, and transmit it. These HSICAs require a low coding memory during the computation process [8]. Tensor based HSICAs use an n-dimensional tensor to store HS data. These HSICAs work alongside other types of HSICA to achieve compression [9]. Learning based HSICAs use machine learning algorithms, neural networks, and support vector machines to perform the compression of an HS image. These algorithms have high coding gain, but it comes at the cost of very high complexity. They are suitable when large amounts of data need to be compressed [10]. Transform based algorithms use a mathematical transform (Fourier transform, cosine transform, or wavelet transform) to achieve compression. They have moderate coding gain with moderate coding complexity. The mathematical transform converts pixel values into the frequency domain [11]. Hybrid HSICAs are the union of any two techniques mentioned above. These algorithms are designed only for the requirements of a particular application [12]. Though machine learning based and neural network based HSICAs have high coding efficiency, this comes at the cost of coding complexity and the requirement of a huge HS image sensor memory. These algorithms work only for the lossless image compression process, and learning algorithms have varied performance for every HS image sensor. The mathematical transform based HSICAs, in contrast, have low complexity and can work with any type of compression.

The selection of an HSICA for a certain application depends on a number of variables. The coding complexity, coding memory, coding gain, and data loss are the major factors in the determination of the HSICA [13]. The rest of the manuscript is organized as follows. Section 2 provides an overview of related works. Section 3 gives a detailed description of the proposed compression algorithm, 3D-Listless Embedded Zerotree Set Partitioning Coding (3D-LEZSPC). Section 4 provides the experimental details. Section 5 provides the simulation results and analysis, along with detailed performance evaluations. The work is concluded in Section 6.

2. Related Works

The different coding techniques are described as follows.

2.1. Transform Based Coding Techniques

The transform-based HSICAs work with all types of compression (lossy, lossless, and near-lossless), and the CR can be adjusted according to the bit rate. As aforementioned, the mathematical transform converts the HS image to the frequency domain, and the encoder compresses the transformed HS image. The output bit stream is transmitted to the decoder. The decoder follows the same overall procedure as the encoder with some low-level logical or mathematical calculations. The inverse mathematical transform is applied to the received bit stream [13, 14]. The performance of the wavelet transform is better than that of the cosine transform, as it provides information about both scale and location [15].
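The transform coding pipeline described above (forward transform, quantization/encoding, transmission, inverse transform) can be illustrated with a minimal sketch. For brevity the example uses a one-level 1D Haar transform with a uniform quantizer standing in for the bit plane encoder; the manuscript itself uses the curvelet transform and a set partitioning coder.

```python
import numpy as np

def haar_forward(x: np.ndarray) -> np.ndarray:
    """One-level 1D Haar transform: pairwise averages, then differences."""
    a = (x[0::2] + x[1::2]) / 2.0   # approximation (low-pass) band
    d = (x[0::2] - x[1::2]) / 2.0   # detail (high-pass) band
    return np.concatenate([a, d])

def haar_inverse(c: np.ndarray) -> np.ndarray:
    """Exact inverse of haar_forward."""
    h = len(c) // 2
    a, d = c[:h], c[h:]
    x = np.empty(2 * h)
    x[0::2] = a + d
    x[1::2] = a - d
    return x

# Encoder side: transform, then coarse quantization (the lossy step).
x = np.array([10.0, 12.0, 9.0, 7.0, 20.0, 22.0, 5.0, 5.0])
q = 2.0
quantized = np.round(haar_forward(x) / q)   # transmitted as small integers
# Decoder side: dequantize and apply the inverse transform.
x_hat = haar_inverse(quantized * q)
print(np.max(np.abs(x - x_hat)))            # error bounded by the quantization step
```

With the quantization step removed (or made fine enough that all coefficients are represented exactly), the same pipeline becomes lossless, which is how one framework can serve both compression modes.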

The set partitioning HSICAs are bit plane coding algorithms and use set structures to represent a large number of insignificant coefficients at the high bit planes. These algorithms have lower coding complexity, bit stream embeddedness, lower coding memory demand, and higher coding gain than other transform-based HSICAs. The insignificant coefficients are represented in block cube (3D-SPECK, 3D-LSK, and 3D-ZM-SPECK), tree (3D-SPIHT and 3D-NLS), or block cube tree fashion (3D-WBTC, 3D-LMBTC, and 3D-LCBTC) [16–23].

2.2. Curvelet Transform

In the last couple of years, a lot of transform-based HSICAs have been proposed that use different variants of the wavelet transform (Haar, dyadic, and continuous). The curvelet transform captures curve-like singularities and gives information about scale, location, and orientation, while the wavelet transform can recognize only point-like singularities and gives information about scale and location [24].

The curvelet transform has been used in feature extraction [25], image denoising [26], image classification [27], and image fusion [28] for the HS image. The 2D curvelet transform has been used for the compression of grayscale images, where it shows a higher coding gain than wavelet transform-based compression algorithms [29]. The first generation curvelet transform was proposed by Candès and Donoho in 1999, but it suffered from high complexity and output image redundancy. The second generation curvelet transform is determined by a frequency partitioning technique without the use of other transforms, which makes it a fast image analysis technique; it is also known as the fast discrete curvelet transform. It is realized by either the unequispaced fast Fourier transform (USFFT) based or the frequency wrapping based curvelet transform. The USFFT technique is slightly slower than the other in the calculation of the transform matrix. The selection between the aforementioned techniques differs from application to application [30]. In the last couple of decades, computational devices have generated a lot of data; thus, compression of the data is required before the transmission process [23, 31].

3. 3D-Listless Embedded Zerotree Set Partitioning Coding (3D-LEZSPC)

The proposed compression algorithm has a partition rule similar to 3D-SPIHT [17] and 3D-NLS [20] and produces an almost identical embedded output bit stream. The functionality of the lists in 3D-SPIHT is performed by fixed-size markers in 3D-LEZSPC. 3D-LEZSPC uses two types of markers: the state marker of coefficient (SMC) and the state marker of zerotree (SMZ). These markers are used for tracking the significance or insignificance of coefficients and sets. The information about the significance of a coefficient is represented by the "α" marker. There are four possibilities for this marker, which are given in Table 1. These markers require a memory of 2 bits per coefficient. In the same way, the information about a node of the zerotree is held by the second set of markers, the state marker of zerotree (SMZ). The SMZ executes the same function as the LIS in 3D-SPIHT. Each node of the zerotree is assigned an SMZ numeric value defined in Table 2. It is known that around 75% of the coefficients of the transform matrix are present in the seven lower sub-bands. These sub-bands of the pyramid have no direct descendants (they hold the leaf nodes of the zerotrees), so they are not assigned any SMZ marker value. During the significance test of a zerotree, a node may be in any of the four aforementioned possible states, which can be identified by a 2-bit marker. A simple calculation suggests that on average 0.25 bits per coefficient will be sufficient to mark the states of all trees or subtrees in the transformed HS image.
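The marker budget above can be verified with a line of arithmetic. The sketch below reproduces the 2 + 0.25 = 2.25 bits-per-coefficient figure quoted later in Section 5.2; the eight-coefficient node size follows from the 2 × 2 × 2 offspring cube of the 3D zerotree.

```python
# Marker memory arithmetic: a 2-bit SMC per coefficient plus a 2-bit SMZ
# per zerotree node, where one node covers 8 coefficients.
smc_bits_per_coeff = 2                  # four SMC states (Table 1) -> 2 bits
smz_bits_per_node = 2                   # four SMZ states (Table 2) -> 2 bits
coeffs_per_node = 8                     # a node's offspring form a 2 x 2 x 2 cube

smz_bits_per_coeff = smz_bits_per_node / coeffs_per_node
total_bits_per_coeff = smc_bits_per_coeff + smz_bits_per_coeff
print(smz_bits_per_coeff)               # 0.25 bits per coefficient for the SMZ
print(total_bits_per_coeff)             # 2.25 bits per coefficient in total
```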

Like other set partitioned compression algorithms, the proposed compression algorithm has two passes, named the Initialization Pass and the Bit Plane Pass. The bit plane pass has three subpasses, named the Insignificant Coefficient Pass (ICP), the Insignificant Set Pass (ISP), and the Refinement Pass (RP).

3.1. Initialization Pass

The compression algorithm is initialized from the top bit plane "n." The transformed 3D HS image is converted to a 1D array. The size of the LLL sub-band is decided by the number of levels of the curvelet transform performed on the HS image. All the coefficients in the LLL sub-band are marked with the SMC marker α = 1, while the rest of the coefficients are marked with α = 0. The tree nodes present in the LLL sub-band are marked with β = 1, and the other tree nodes are marked with β = 0. This pass runs only once, at the initiation of the encoding process.
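A minimal sketch of this pass on the linearized coefficient array is given below. The assumption that the LLL sub-band occupies the first entries of the Morton-mapped array is illustrative, and the markers are stored here in full bytes for clarity (2 bits each would suffice in practice).

```python
import numpy as np

def initialize_markers(n_coeffs: int, lll_size: int):
    """Sketch of the 3D-LEZSPC initialization pass on the 1D array.

    Assumes the LLL sub-band occupies the first `lll_size` entries of the
    linearized array (an illustrative layout, not the paper's exact one).
    """
    alpha = np.zeros(n_coeffs, dtype=np.uint8)      # SMC markers
    alpha[:lll_size] = 1                            # LLL coefficients: alpha = 1
    n_nodes = n_coeffs // 8                         # one tree node per 8 coefficients
    beta = np.zeros(n_nodes, dtype=np.uint8)        # SMZ markers
    beta[:lll_size // 8] = 1                        # LLL tree nodes: beta = 1
    return alpha, beta

def top_bit_plane(coeffs: np.ndarray) -> int:
    """Topmost bit plane n such that 2**n <= max |coefficient| < 2**(n+1)."""
    return int(np.floor(np.log2(np.abs(coeffs).max())))

coeffs = np.array([37, -5, 12, 0, -64, 3, 9, 1])
print(top_bit_plane(coeffs))    # 6, since the largest magnitude is 64 = 2**6
```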

3.2. Bit Plane Pass

The bit plane pass executes from the top bit plane and runs until the last bit plane (n = 0) or until the bit budget is exhausted. The proposed compression algorithm utilizes a depth-first search encoding approach, one tree after the other. The bit plane pass has three subpasses, explained as follows.

3.2.1. Insignificant Coefficient Pass

This is the first subpass in the bit plane pass; all coefficients that have remained insignificant up to the current bit plane are tested against the current threshold. If any coefficient is found significant, then the significance is encoded along with a sign bit, and the marker also changes.

3.2.2. Insignificant Set Pass

The insignificant set pass is applied to test the significance of the tree nodes against the current threshold. If a tree node (Type A) is significant against the current threshold, then the tree is divided into eight coefficients, and each coefficient is tested for significance. If a coefficient is found significant against the current threshold, then the significance is encoded along with the sign bit, and the coefficient is allotted the SMC marker α = 2. If the coefficient is not significant, then it is allotted the SMC marker α = 1. For a tree node of Type B, if the node is significant against the current threshold, it is split into eight new zerotrees and its SMZ marker is set to β = 3, while if the tree node is not significant against the current threshold, the marker value remains β = 2.

3.2.3. Refinement Pass

The refinement pass is applied to those coefficients that are already significant from a previous bit plane. For the coefficients that became significant in the current bit plane, the SMC marker changes from α = 2 to α = 3, so their refinement starts from the next bit plane. The pseudo code for 3D-LEZSPC is given in Algorithm 1.

Input: four-level curvelet transform applied to the input HS image (spatial dimensions are M × N and spectral dimension is P)
 The transformed HS image is converted to the array Г through linear indexing (Morton mapping)
Output: embedded bit stream
 Initialization pass
  Set topmost bit plane n from the threshold of the transformed HS image
  Set all coefficients in the LLL band with α = "1" and the rest of the coefficients in the transformed HS image with α = "0"
  Set the tree nodes of the LLL band with β = "1" and the rest of the tree nodes with β = "0"
  Set the total number of bits (bit budget)
 Bit plane pass
 while n ≥ 0
  Insignificant coefficient pass
   for i = 1 : λ
    if {α (i) = "1" && 2^n ≤ |Г(i)| < 2^(n+1)}
     Output = 1 and sign bit of Г (i)
     Set α (i) = "2"
    end
   end
  Insignificant set pass
   for j = 1 : λ/8
    if {β (j) = "1"}
     if {Max_SoT_A ≥ 2^n}
      Output = "1"
      for η = (8j − 7 : 8j)
       if {2^n ≤ |Г(η)| < 2^(n+1)}
        Output = 1 and sign bit of Г (η)
        Set α (η) = "2"
       else
        Set α (η) = "1"
       end
      end
      if {Max_SoT_B ≥ 2^n}
       Output = "1"
       Set β (8j − 7) = 1; Set β (8j − 6) = 1
       Set β (8j − 5) = 1; Set β (8j − 4) = 1
       Set β (8j − 3) = 1; Set β (8j − 2) = 1
       Set β (8j − 1) = 1; Set β (8j) = 1
       Set β (j) = "3"
      else
       Set β (j) = "2"
      end
     end
    else if {β (j) = "2"}
     if {Max_SoT_B ≥ 2^n}
      Output = "1"
      Set β (8j − 7) = 1; Set β (8j − 6) = 1
      Set β (8j − 5) = 1; Set β (8j − 4) = 1
      Set β (8j − 3) = 1; Set β (8j − 2) = 1
      Set β (8j − 1) = 1; Set β (8j) = 1
      Set β (j) = "3"
     end
    end
   end
  Refinement pass
   for k = 1 : λ
    if {α (k) = "2"}
     Set α (k) = "3"
    else if {α (k) = "3"}
     Output = nth bit of Г (k)
    end
   end
  n = n − 1
 end
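A simplified, runnable sketch of the bit plane passes is given below. It models only the insignificant coefficient pass and the refinement pass on a 1D array, using the SMC marker values of Table 1; the zerotree set pass is omitted, so every coefficient is tested individually. It illustrates the marker mechanics rather than the full 3D-LEZSPC.

```python
import numpy as np

def encode_bitplanes(coeffs: np.ndarray, n_top: int):
    """Simplified bit-plane encoder in the spirit of Algorithm 1.

    Models the ICP and RP only (no zerotree set pass), returning the
    embedded output bit stream as a list of 0/1 bits.
    """
    mag = np.abs(coeffs).astype(np.int64)
    sign = (coeffs < 0).astype(np.uint8)
    alpha = np.ones(len(coeffs), dtype=np.uint8)    # all start as alpha = 1
    stream = []
    for n in range(n_top, -1, -1):
        t = 1 << n
        # Insignificant coefficient pass: test against the current threshold.
        for i in range(len(coeffs)):
            if alpha[i] == 1:
                significant = t <= mag[i] < 2 * t
                stream.append(1 if significant else 0)
                if significant:
                    stream.append(int(sign[i]))     # sign bit
                    alpha[i] = 2                    # newly significant
        # Refinement pass: promote alpha 2 -> 3, refine alpha 3.
        for k in range(len(coeffs)):
            if alpha[k] == 2:
                alpha[k] = 3                        # refined from the next plane
            elif alpha[k] == 3:
                stream.append(int((mag[k] >> n) & 1))  # nth magnitude bit
    return stream
```

Because coefficients with larger magnitudes are emitted first, truncating the stream at any point still yields a decodable approximation, which is the embedded (progressive) property mentioned in Section 2.1.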

4. Experimental Details

The proposed HSICA is compared with the other transform-based HSICAs [16–23] on the basis of coding gain, coding memory, and coding complexity. All HSICAs are simulated on the same software platform (64-bit Windows 8.1 operating system) and hardware platform (8 GB RAM and Intel i3 CPU at 1.6 GHz). Six HS images are used: Washington DC Mall (Hyperspectral Image I), Yellowstone Scene 0 (Hyperspectral Image II), Yellowstone Scene 3 (Hyperspectral Image III), Yellowstone Scene 18 (Hyperspectral Image IV), Pavia Centre (Hyperspectral Image V), and Pavia University (Hyperspectral Image VI). The HYDICE sensor (400 nm to 2500 nm with a 10 nm bandwidth) captured the "Washington DC Mall" HS image (spatial and spectral dimensions are 1280 × 307 × 191), the AVIRIS sensor (380 nm to 2500 nm with a 10 nm bandwidth) captured the "Yellowstone" data set (spatial and spectral dimensions are 512 × 680 × 224), and the ROSIS sensor (430 nm to 860 nm with a 4 nm bandwidth) captured Pavia Centre (spatial and spectral dimensions are 1096 × 1096 × 102) and Pavia University (spatial and spectral dimensions are 610 × 610 × 103) [32, 33]. The pixel depth (bit depth) of Hyperspectral Image I is 14 bits, the pixel depth of Hyperspectral Images II, III, and IV is 16 bits, and the pixel depth of Hyperspectral Images V and VI is 13 bits.
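For reference, the uncompressed data volumes implied by these dimensions and bit depths can be computed directly; the sketch below uses only the figures quoted above.

```python
# Raw data volume of three of the test images (dimensions and bit depths
# taken from the text: width, height, bands, bits per pixel).
images = {
    "Washington DC Mall": (1280, 307, 191, 14),
    "Yellowstone Scene 0": (512, 680, 224, 16),
    "Pavia Centre": (1096, 1096, 102, 13),
}
for name, (m, n, p, depth) in images.items():
    megabytes = m * n * p * depth / 8 / 1024 / 1024
    print(f"{name}: {megabytes:.1f} MB uncompressed")
```

Volumes above 100 MB per scene are typical, which underlines why on-board compression is needed before transmission.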

5. Simulation Results and Discussion

The four-level curvelet transform is used to transform the HS image, and the transformed 3D HS image is converted to a 1D array through Morton mapping [23]. The coefficients in the 1D array are encoded with the HSICAs. The Peak Signal to Noise Ratio (PSNR) and the Bjøntegaard delta peak signal-to-noise rate (BD-PSNR) are calculated in decibels (dB), while the Structural Similarity (SSIM) index and Feature Similarity (FSIM) index are unitless parameters. The coding memory is calculated in kilobytes (kB) or megabytes (MB), and the coding complexity (encoding time and decoding time) is calculated in seconds (sec) [22].
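Morton (Z-order) mapping interleaves the bits of the three coordinates, so the eight coefficients of a 2 × 2 × 2 cube occupy consecutive positions in the 1D array. A small sketch is given below; the exact index convention used in the manuscript may differ.

```python
def morton3(x: int, y: int, z: int, bits: int = 10) -> int:
    """Interleave the bits of (x, y, z) into a single Z-order (Morton) index."""
    code = 0
    for b in range(bits):
        code |= ((x >> b) & 1) << (3 * b)
        code |= ((y >> b) & 1) << (3 * b + 1)
        code |= ((z >> b) & 1) << (3 * b + 2)
    return code

# Neighbouring coefficients in a 2 x 2 x 2 cube map to 8 consecutive indices,
# which is what lets a zerotree node address its children as 8j - 7 ... 8j.
cube = sorted(morton3(x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1))
print(cube)  # [0, 1, 2, 3, 4, 5, 6, 7]
```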

5.1. Coding Efficiency

The coding gain or coding efficiency is a measure of the rate-distortion (RD) performance of any HSICA. The PSNR is used to assess the quality of image reconstruction. Comprehensively, it can be deduced that the objective (PSNR) and perceptual subjective (SSIM) quality metrics show significant improvement with increasing bit rates [21, 34]. From Table 3, it can be noticed that 3D-LEZSPC has higher coding gain than the state-of-the-art 3D-SPIHT and 3D-NLS, as 3D-LEZSPC encodes more significant coefficients than the other compression algorithms. Due to the significant coefficients along with the refinement coefficients, the coding gain increases [34–36].

The PSNR and SSIM are calculated as in equations (1) and (2):

PSNR = 10 log₁₀(Peak² / MSE), where MSE = (1 / (M · N · P)) Σᵢ Σⱼ Σₖ [I(i, j, k) − D(i, j, k)]²  (1)

SSIM = [(2 μ_I μ_D + C₁)(2 σ_ID + C₂)] / [(μ_I² + μ_D² + C₁)(σ_I² + σ_D² + C₂)]  (2)

The original HS image and reconstructed HS image are represented as "I (i, j, k)" and "D (i, j, k)", while M × N × P is the dimension of the HS image and Peak is the maximum pixel value allowed by the bit depth. The means of the input HS images "I" and "D" are μ_I and μ_D, the variances are σ_I² and σ_D², and the covariance between the original HS image and the reconstructed HS image is σ_ID. C₁ and C₂ are correction factors. From Table 4, it is clear that the SSIM index is almost similar for all HSICAs.
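A direct implementation of equations (1) and (2) can be sketched as follows. The SSIM here is computed globally over the whole image for brevity, whereas the standard metric averages the statistic over local windows; the constants c1 and c2 are illustrative defaults.

```python
import numpy as np

def psnr(I: np.ndarray, D: np.ndarray, peak: float) -> float:
    """PSNR in dB, following equation (1)."""
    mse = np.mean((I.astype(np.float64) - D.astype(np.float64)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

def ssim_global(I: np.ndarray, D: np.ndarray,
                c1: float = 1e-4, c2: float = 9e-4) -> float:
    """Single-window (global) SSIM, following equation (2)."""
    I = I.astype(np.float64)
    D = D.astype(np.float64)
    mu_i, mu_d = I.mean(), D.mean()
    var_i, var_d = I.var(), D.var()           # sigma_I^2, sigma_D^2
    cov = ((I - mu_i) * (D - mu_d)).mean()    # sigma_ID
    return ((2 * mu_i * mu_d + c1) * (2 * cov + c2)) / (
        (mu_i ** 2 + mu_d ** 2 + c1) * (var_i + var_d + c2))
```

For identical images the SSIM evaluates to 1, and the PSNR grows without bound as the MSE approaches zero, which matches the interpretation of both metrics above.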

From Figure 1, it is inferred that 3D-LEZSPC outperforms 3D-SPIHT and 3D-NLS at every bit rate. It is observed that the variation of PSNR between 3D-SPIHT and 3D-LEZSPC lies in the range of −0.22 dB to −0.02 dB for the Washington DC Mall HS image, −0.32 dB to −0.09 dB for Yellowstone Scene 0, −0.28 dB to −0.22 dB for Yellowstone Scene 3, −0.26 dB to −0.15 dB for Yellowstone Scene 18, −1.62 dB to −0.51 dB for Pavia Centre, and −1.14 dB to −0.49 dB for the Pavia University hyperspectral image.

In the same way, the variation of PSNR between 3D-NLS and 3D-LEZSPC lies in the range of −0.69 dB to −0.26 dB for the Washington DC Mall HS image, −0.48 dB to −0.24 dB for Yellowstone Scene 0, −0.49 dB to −0.22 dB for Yellowstone Scene 3, −0.49 dB to −0.06 dB for Yellowstone Scene 18, −1.88 dB to −0.51 dB for Pavia Centre, and −1.42 dB to −0.74 dB for the Pavia University hyperspectral image. The Bjontegaard metric (BD-PSNR), tabulated in Table 5, is used to compare the rate-distortion performance of two different HSICAs on the same HS image over a range of bit rates (bpppb).

The relationship between the different features of the original HS image and the reconstructed HS image is measured by the feature similarity (FSIM) index. The value of FSIM lies between 0 and 1; a value near 1 represents the best reconstruction of the HS image. Table 6 shows a quantitative comparison of the different HSICAs with the proposed compression algorithm for seven bit rates.

5.2. Coding Memory

The proposed HSICA does not use any list for tracking the coefficients or zerotrees; it uses only two different types of markers (SMC and SMZ) for the same task. 3D-LEZSPC requires on average 2.25 bits per coefficient for its markers, while the state-of-the-art 3D-LSK and 3D-NLS require 4 bits and 8 bits per coefficient, respectively. The listless HSICAs require constant coding memory at any bit rate, depending only on the size of the HS image, while the list-based HSICAs have varying coding memory requirements that depend on the bit rate. It is clear from Table 7 that for high bit rates the listless HSICAs have a lower coding memory requirement, while for low bit rates the list-based HSICAs perform best.
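The per-coefficient figures can be put on a common scale as follows. Only the bits-per-coefficient values come from the text; the 512 × 512 × 16 block size is hypothetical.

```python
# Illustrative marker-memory totals for the listless coders compared in the
# text (bits per coefficient as quoted; block size is hypothetical).
bits_per_coeff = {"3D-LEZSPC": 2.25, "3D-LSK": 4, "3D-NLS": 8}
n_coeffs = 512 * 512 * 16
for name, bpc in bits_per_coeff.items():
    kib = n_coeffs * bpc / 8 / 1024
    print(f"{name}: {kib:.0f} KiB of marker memory")  # constant at every bit rate
```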

5.3. Coding Complexity

The coding complexity of any compression algorithm is calculated based on the time required for the encoding and decoding processes. More complex algorithms require more time to conclude the computation. The time required to encode the coefficients is always greater than the time required for the decoding process, because the encoding process takes extra time to fetch each zerotree and test its significance at each threshold. From Table 8, it is clear that 3D-LEZSPC outperforms 3D-SPIHT but has similar performance to 3D-NLS. 3D-LEZSPC has inferior performance compared with the other listless HSICAs (3D-LSK, 3D-LMBTC, 3D-LCBTC, and 3D-ZM-SPECK) because significance testing of a zerotree requires more time than testing of block cubes or block cube trees. From the table, it is clear that 3D-LEZSPC requires ∼70% less computational time than 3D-SPIHT (combining all computational time), while for 3D-NLS it is almost the same.

The correlation coefficient between the original HS image and the reconstructed HS image after the compression process is represented in Table 9.

The visual comparison of the original HS image and the reconstructed HS image after the compression process for Yellowstone Scene 0 (frames 50, 100, and 150) is shown in Figure 2.

6. Conclusion

From the findings of the research, conclusions are drawn regarding the coding gain, coding memory, and coding complexity of the proposed algorithm. First, the coding gain is improved for almost all bit rates, and it can be observed that the proposed 3D-LEZSPC has a positive Bjontegaard delta PSNR gain for all HS images except one case. The coding gain increases because the curvelet transform requires fewer coefficients to represent the edges in the HS image. The coding memory is also significantly reduced compared with the other zerotree HSICAs, 3D-SPIHT and 3D-NLS. The coding complexity increases slightly; this slightly higher complexity of 3D-LEZSPC is paid off by the low coding memory requirements. In future work, the use of the bandlet or ripplet transform may further improve the coding gain, and the demand for coding memory can be reduced by using fewer markers.

Data Availability

The data used to support the findings of this study are included within the article.

Ethical Approval

Not applicable.

All authors agreed on the final approval of the version to be published.

Conflicts of Interest

The authors declare that they have no conflicts of interest.
