Research Article  Open Access
Yongjun Zhu, Wenbo Liu, Qian Shen, Yin Wu, Han Bao, "JPEG Lifting Algorithm Based on Adaptive Block Compressed Sensing", Mathematical Problems in Engineering, vol. 2020, Article ID 2873830, 17 pages, 2020. https://doi.org/10.1155/2020/2873830
JPEG Lifting Algorithm Based on Adaptive Block Compressed Sensing
Abstract
This paper proposes a JPEG lifting algorithm based on adaptive block compressed sensing (ABCS), which resolves the mismatch between the ABCS algorithm, which processes one-dimensional vector data, and the JPEG compression algorithm, which processes two-dimensional image data, and improves the compression rate at equal image quality compared with existing JPEG-like image compression algorithms. Specifically, mean information entropy and multi-feature saliency indexes provide the basis for adaptive blocking and observing, respectively; a joint model and curve fitting are adopted for bit rate control; and a noise analysis model is introduced to improve the noise immunity of the current JPEG decoding algorithm. Experimental results show that the proposed method achieves good fidelity and noise immunity, especially at medium compression ratios.
1. Introduction
Image processing technology has always been a research hotspot in computer science. In recent years especially, with the emergence of high-definition, large-scale images and the impact of massive video information, image compression technology has become particularly prominent. Image compression can store a larger proportion of image data in limited storage space; at the same time, it reduces the data size of images of the same quality, which effectively improves the efficiency of network data transmission. Traditional image compression technology comprises two independent parts, image acquisition and image compression, and this separation limits joint improvement of the two correlated parts. The emergence of compressed sensing (CS) theory breaks this frame: it completes image acquisition and compression synchronously in a single sparse-observation step. On the one hand, this simplifies the image processing pipeline; on the other hand, it opens new research directions for fused acquisition and compression of images.
Image compression handles many types of images; this article selects still images as the research object. Common still image compression formats include JPEG, JPEG 2000, JPEG XR, TIFF, GIF, and PCX. This paper focuses on image compression algorithms with a JPEG-similar structure and improves them by combining them with CS technology. Algorithms with a principle architecture similar to JPEG are collectively referred to as JPEG-like algorithms, including traditional JPEG, JPEG-LS, JPEG 2000, and JPEG XR. Data redundancy is essential to the compression of a still image. JPEG-like algorithms use time-frequency transforms and entropy coding as the main methods to eliminate data redundancy [1–3]. Although they achieve certain compression effects, these algorithms give insufficient consideration to the three types of data redundancy (coding redundancy, interpixel redundancy, and psychovisual redundancy) [4]. Firstly, the simple, unguided image blocking in existing JPEG-like algorithms cannot support efficient coding to eliminate coding redundancy. Secondly, a uniform time-frequency transform of fixed dimension cannot exploit the a priori information between pixels of different sub-image blocks to reduce interpixel redundancy. Finally, earlier JPEG-like algorithms fail to eliminate psychovisual redundancy because they do not consider overall and local saliency. CS technology breaks through the limitations of the Nyquist sampling theorem and provides innovative ideas for sparse reconstruction of signals [5]. In particular, adaptive block compressed sensing (ABCS), which combines adaptive partitioning and adaptive sampling, provides a feasible route for optimizing JPEG-like algorithms [6, 7].
That is, the block compression measurement matrix can take the place of the forward discrete cosine transform (FDCT) matrix in JPEG coding, and the inverse discrete cosine transform (IDCT) is replaced by sparse reconstruction. In addition, multi-feature saliency and noise analysis are introduced to implement adaptive control of the observation matrix and minimal-error iterative reconstruction [8, 9].
In this article, we propose a JPEG lifting algorithm based on ABCS, named JPEG-ABCS. The proposed algorithm focuses on the following aspects: (1) guiding the best morphological blocking by minimizing mean information entropy (MIE); (2) generating an element vector of sub-image pixels using texture features and the two-dimensional directional DCT; (3) selecting the dimension of the measurement matrix by variance and local saliency factors; (4) controlling the rate by matching the overall sampling rate and the quantization matrix; (5) realizing minimum-error iterative reconstruction under noise by analyzing a noise influence model.
The remainder of this paper is organized as follows. In Section 2, the basic theories of JPEG-like algorithms and the ABCS algorithm are illustrated. Section 3 introduces the JPEG-ABCS algorithm. The implementation of the proposed JPEG-ABCS algorithm is analyzed in Section 4. In Section 5, experiments and result analysis show the benefits of JPEG-ABCS. The paper concludes in Section 6.
2. Preliminary Knowledge
2.1. Background of the Existing JPEG-Like Algorithms
The existing JPEG-like algorithms are similar in structure, mainly comprising blocking, a forward time-frequency transform, quantization, entropy coding, and the inverses of these four processes. As the most basic of the JPEG-like algorithms, the JPEG model's structure is shown in Figure 1.
It can be seen from Figure 1 that the entire JPEG model treats the original image I as two-dimensional data, and its key step is the two-dimensional DCT. Generally, the block size is square, such as 8 × 8, and the recommended luminance quantization matrix (light table) is given in equation (1) [10]. Based on Huffman coding, the encoding part adopts differential pulse code modulation (DPCM) for DC coefficients and run-length coding (RLC) for AC coefficients:

        | 16  11  10  16  24  40  51  61 |
        | 12  12  14  19  26  58  60  55 |
        | 14  13  16  24  40  57  69  56 |
    Q = | 14  17  22  29  51  87  80  62 |          (1)
        | 18  22  37  56  68 109 103  77 |
        | 24  35  55  64  81 104 113  92 |
        | 49  64  78  87 103 121 120 101 |
        | 72  92  95  98 112 100 103  99 |
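As a concrete illustration of the JPEG core steps described above, the following is a minimal NumPy sketch (our own rendering, not the paper's code) that runs one 8 × 8 block through level shifting, a separable 2-D DCT, and uniform quantization with the standard JPEG luminance table; all function names are illustrative.

```python
import numpy as np

# Standard JPEG luminance quantization table (quality 50).
Q50 = np.array([
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
], dtype=np.float64)

def dct_matrix(n):
    """Orthonormal 1-D DCT-II matrix D, so D @ x transforms a column."""
    k = np.arange(n)
    D = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    D[0, :] = np.sqrt(1.0 / n)
    return D

def jpeg_block_encode(block, Q=Q50):
    """Level shift by 128, separable 2-D DCT, then uniform quantization."""
    D = dct_matrix(8)
    coeffs = D @ (block - 128.0) @ D.T
    return np.round(coeffs / Q).astype(int)

def jpeg_block_decode(q, Q=Q50):
    """Dequantize, inverse 2-D DCT, undo the level shift."""
    D = dct_matrix(8)
    return D.T @ (q * Q) @ D + 128.0
```

A uniform block quantizes to all zeros (only the shifted DC remains, and it is zero), while a smooth gradient block round-trips with only a small quantization error.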
Compared with the fixed bit rate of the JPEG algorithm, the JPEG-LS algorithm adds rate control through a quality factor. JPEG 2000 adopts non-fixed square blocking (tiles) and the discrete wavelet transform (DWT) to improve restored image quality. JPEG XR introduces a lapped orthogonal transform (LOT) to reduce blocking artifacts at low bit rates.
2.2. Basic Theory of CS Algorithm
CS theory was originally proposed by Candès et al. in 2006; it proved that the original signal can be accurately reconstructed from partial Fourier transform coefficients. The advent of CS solves the problem that image sampling and compression cannot be performed simultaneously. In general, research on CS theory covers sparse representation, compressive observation, and optimization-based reconstruction [11]. Firstly, the main task of sparse representation is to find a basis under which the signal has a sparse representation; this is the premise and foundation of the entire CS theory. Secondly, the primary task of compressive observation is to design a linear measurement matrix uncorrelated with the basis vectors so as to obtain dimension-reduced observation data; this is the key content of CS theory. Lastly, optimization-based reconstruction is the difficult problem in CS theory: its goal is to recover the original signal by solving an inverse optimization problem over the sparse vector, typically via constrained optimization.
The CS mathematical model is based on the assumption of signal sparsity. Let x ∈ R^N be the original signal of dimension N. Suppose the sparse matrix Ψ ∈ R^(N×N) gives the sparse representation coefficient of x as α = Ψᵀx, where α contains only K (K ≪ N) nonzero elements. The original signal x is then called K-sparse under the sparse basis Ψ. The number of nonzero elements in the coefficient vector can be calculated by K = ‖α‖₀, where ‖·‖₀ denotes the ℓ₀ norm.
CS theory states that the information content in sparse signals can be captured effectively by a small number of observations. Let Φ ∈ R^(M×N) be the measurement matrix, where M ≪ N. The linear dimension-reducing acquisition of the original signal is given by y = Φx, where y ∈ R^M represents the CS observation signal. In addition, CS theory points out that, to accurately recover the original signal from the observation signal, the observation dimension must obey M ≥ cK log(N/K), where c is an adjustment constant.
Since M < N, reconstructing the sparse signal from the measurement vector is ill-posed: it requires solving an underdetermined system of equations that has many solutions. Common practice is to achieve effective signal reconstruction by using signal sparsity as an additional constraint. Accurate signal reconstruction is accomplished by solving the following optimization problem:

    α̂ = arg min ‖α‖_p   subject to   y = Aα,

where A = ΦΨ is the sensing matrix, ‖·‖_p denotes the ℓp norm, and p is usually 0, 1, or 2 according to the optimization goal. For p = 0 this is an NP-hard problem, and to ensure the stability and robustness of the reconstruction process, the measurement matrix must satisfy the restricted isometry property (RIP).
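The sparsity-constrained recovery above is commonly solved greedily. As a sketch (one standard solver choice, not necessarily the paper's), the following is a minimal orthogonal matching pursuit that recovers a k-sparse x from y = Φx:

```python
import numpy as np

def omp(Phi, y, k):
    """Orthogonal matching pursuit: greedily build a k-atom support,
    refitting the coefficients by least squares at every step."""
    n = Phi.shape[1]
    residual = y.copy()
    support = []
    for _ in range(k):
        # pick the column most correlated with the current residual
        j = int(np.argmax(np.abs(Phi.T @ residual)))
        if j not in support:
            support.append(j)
        # least-squares fit of y on the currently selected columns
        coef, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
        residual = y - Phi[:, support] @ coef
    x = np.zeros(n)
    x[support] = coef
    return x
```

With an orthonormal Φ the correlations equal the coefficients themselves, so a K-sparse signal is recovered exactly in K steps; with a random compressive Φ, recovery succeeds when the columns are sufficiently incoherent.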
The above describes the three key problems of the CS algorithm, which resolves the separation of traditional image acquisition and compression. However, when CS is applied to large-scale, high-definition images and video, the large amount of information in a two-dimensional image means that whole-image projection requires a very large measurement matrix, which inevitably leads to two major problems: excessive storage and reconstruction complexity. These problems limit the application of CS in image processing. The emergence of block compressed sensing (BCS) theory solves them well: the whole image is cut into several small unit blocks, each block is observed and reconstructed independently, and the reconstructed blocks are then stitched together to restore the original image.
Traditional block compressed sensing (BCS) introduces the idea of blocking into CS theory to overcome the dimensionality curse of data processing and thus improve processing speed [12]. Its basic model is shown in the following equation:

    y_i = Φ_B x_i,   i = 1, 2, …, L,          (2)

where x_i and y_i are the ith sub-blocks of the original signal and observation signal, Φ_B ∈ R^(M_B×n_B) is the block measurement matrix, and L is the number of blocks. In addition, a coefficient r = M_B / n_B is often defined in BCS, called the mean sampling rate. Although the blocking strategy solves the problems of dimensionality and computational complexity, this model uses a single unified measurement matrix, which can neither reflect the inherent differences between sub-images nor achieve differentiated blocking.
To overcome these shortcomings, non-uniform blocking and observation are introduced into BCS and combined with an adaptive strategy, yielding the ABCS algorithm. The ABCS algorithm in this article introduces adaptivity into BCS mainly through adaptive blocking and adaptive observation [13, 14]. The ABCS model is as follows:

    y_i = Φ_i x_i,   i = 1, 2, …, L,          (4)

where x_i, y_i, and Φ_i are the ith sub-blocks of the original signal, observation signal, and measurement matrix. The difference from the BCS algorithm is that ABCS frees the dimensions of each sub-block and its measurement matrix, which provides the conditions for reasonable use of the correlations among the elements of the original signal.
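The per-block freedom of equation (4) can be sketched as follows (a hypothetical helper, with Gaussian matrices standing in for whatever per-block measurement matrices are used): each block x_i gets its own M_i × n_i matrix, with M_i set by that block's subsampling rate r_i.

```python
import numpy as np

def abcs_observe(x_blocks, rates, rng):
    """Observe each block x_i with its own M_i x n_i measurement matrix,
    where M_i = round(r_i * n_i) and r_i is the per-block sampling rate."""
    ys, Phis = [], []
    for x_i, r in zip(x_blocks, rates):
        n_i = x_i.size
        M_i = max(1, int(round(r * n_i)))
        # Gaussian matrix as a generic stand-in for the block's Phi_i
        Phi_i = rng.standard_normal((M_i, n_i)) / np.sqrt(M_i)
        Phis.append(Phi_i)
        ys.append(Phi_i @ x_i)
    return ys, Phis
```

The mean sampling rate is then the total number of measurements divided by the total number of samples, which the adaptive scheme can distribute unevenly across blocks.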
3. Fusion of JPEG Model and ABCS Algorithm
3.1. Workflow of JPEG Lifting Algorithm
As described above, the JPEG image compression model mainly includes blocking, FDCT, quantization, coding, and the inverses of these four parts. The focus of this section is how to embed the advantages of the ABCS algorithm into the JPEG model. The basis for fusing the JPEG model and the ABCS algorithm is their shared purpose: image compression. The former compresses the image mainly by reducing the number of bits occupied by each pixel, while the latter compresses the data mainly by reducing the amount of sampled data, so the ABCS algorithm is suited to the data acquisition stage of the JPEG model; that is, ABCS is fused into the blocking and FDCT processes to reduce the amount of input data to the quantization process. In addition, note the distinction between data processing in JPEG and in ABCS: JPEG processes image data as two-dimensional arrays, which preserves the two-dimensional structural characteristics of the image, whereas the input to ABCS is a simple one-dimensional vector without two-dimensional characteristics. The proposed JPEG-ABCS algorithm therefore solves two main problems: (a) the conversion between the two-dimensional time-frequency transform of JPEG and the one-dimensional measurement model of ABCS; (b) the specific method of applying ABCS within the JPEG compression algorithm.
In typical JPEG image compression, after the preprocessing stage, an R × C input image is divided into 8 × 8 sub-images X_i, i = 1, 2, …, L. Each sub-image is passed to a 2-D DCT, which, by the separability of the DCT, can be computed with two one-dimensional DCTs. In addition, the blocking method designed in this paper adopts variable-shape blocks of a unified dimension (n_r × n_c = 64), so the FDCT process can be described as follows:

    X̃_i = D_r X_i D_cᵀ,          (5)

where X̃_i is the sub-image in the DCT domain, D_r and D_c are the 1-D vertical and horizontal DCT orthogonal matrices, respectively, and n_r and n_c are the numbers of rows and columns of each sub-image [15].
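The separable FDCT of equation (5) applies directly to variable-shape blocks. A minimal sketch (illustrative names, orthonormal DCT-II matrices) on a 4 × 16 block:

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal 1-D DCT-II matrix."""
    k = np.arange(n)
    D = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    D[0, :] = np.sqrt(1.0 / n)
    return D

def fdct_block(X):
    """Separable FDCT of a variable-shape block: D_r @ X @ D_c.T,
    i.e. a 1-D DCT down the columns followed by a 1-D DCT along the rows."""
    r, c = X.shape
    return dct_matrix(r) @ X @ dct_matrix(c).T

def idct_block(Xt):
    """Inverse transform, using the orthogonality of the DCT matrices."""
    r, c = Xt.shape
    return dct_matrix(r).T @ Xt @ dct_matrix(c)
```

Because both 1-D matrices are orthonormal, the transform preserves energy (Parseval) and the inverse is exact.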
Block sparse representation and flexible unified-dimension blocking are introduced into the ABCS algorithm. Equation (4) can be rewritten as follows:

    y_i = Φ_i x_i = Φ_i Ψ_i α_i = A_i α_i,          (6)

where x_i, y_i, and α_i are the ith sub-blocks of the original signal, observation signal, and sparse signal; Φ_i, Ψ_i, and A_i are the ith sub-blocks of the measurement matrix, sparse matrix, and sensing matrix; and r_i = M_i / n is the subsampling rate of the ith sub-block.
To retain the two-dimensional characteristics of the image signal in JPEG-ABCS, the two-dimensional DCT in JPEG and the compressive observation in ABCS must be analyzed together. A one-dimensional vector generated directly from a sub-image by column/row scanning cannot by itself carry two-dimensional structural characteristics. The inverse solution of the reconstructed signal in the ABCS algorithm is generally denoted x̂_i = Ψ_i α̂_i; that is, the reconstruction of the original signal depends only on the sparse representation coefficient α̂_i. If the sparse representation coefficient carries two-dimensional structure information, that is equivalent to the original signal carrying it. Therefore, equivalent two-dimensional block-vector generation can be achieved by taking the sparse matrix of ABCS to be the matrix corresponding to the two-dimensional DCT transform:

    x_i = Ψ_i α_i,  with  x_i = vec(X_i)  and  α_i = vec(X̃_i).          (7)
Analyzing equation (7), the relation between the sparse matrix and the DCT orthogonal matrices is established as follows:

    Ψ_i = D_cᵀ ⊗ D_rᵀ,          (8)

where ⊗ represents the Kronecker product. The original signal vector x_i is obtained by scanning the pixel values of the sub-image vertically (column by column). In addition, if the texture of the image is not in the vertical or horizontal direction, a directional DCT is used instead of the horizontal and vertical DCT orthogonal matrices.
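The Kronecker relation in equation (8) can be checked numerically: with vertical (column-major) scanning, vec(X) = (D_cᵀ ⊗ D_rᵀ) vec(X̃) is exactly the identity vec(AXB) = (Bᵀ ⊗ A) vec(X). A short verification sketch:

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal 1-D DCT-II matrix."""
    k = np.arange(n)
    D = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    D[0, :] = np.sqrt(1.0 / n)
    return D

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 16))       # a 4 x 16 variable-shape block
Dr, Dc = dct_matrix(4), dct_matrix(16)
Xt = Dr @ X @ Dc.T                     # 2-D DCT-domain block (equation (5))

vec = lambda A: A.flatten(order="F")   # vertical (column-major) scanning
Psi = np.kron(Dc.T, Dr.T)              # sparse matrix Psi = Dc^T (x) Dr^T
```

Applying Psi to the vectorized DCT coefficients reproduces the vectorized pixel block, and Psi is orthonormal because the Kronecker product of orthogonal matrices is orthogonal.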
Replacing the FDCT and IDCT in JPEG with adaptive sparse observation and sparse restoration, respectively, replacing JPEG blocking with adaptive blocking and vectorization, and accounting for noise introduced during data storage or transmission, the workflow of the proposed JPEG-ABCS algorithm is shown in Figure 2. Comparing the two image compression models in Figures 1 and 2, the key points of the JPEG-ABCS model are (1) adaptive blocking, that is, replacing a fixed block with a variable block; (2) adaptive vectorization, that is, a matching vector generation method based on image orientation characteristics; (3) adaptive observation, that is, replacing uniform observation with non-uniform observation; (4) adding a controllable variable to the rate control process to improve the JPEG algorithm; (5) designing a denoising method in adaptive restoration to reduce the impact of noise on restored data.
3.2. Innovation of JPEG Lifting Algorithm
The innovations of the above JPEG lifting algorithm are as follows: (1) adding the mean sampling rate to overcome the deficiency that traditional JPEG-like algorithms can eliminate redundant information only through the time-frequency transform and the quantization matrix; (2) establishing an optimal OMP iterative algorithm, based on the correlation between sparsity and error, to enhance the noise immunity of JPEG-like algorithms; (3) in adaptive block observation, the MIE-based adaptive blocking reduces the information entropy of the sub-image set to lower the bpp, the ASM-based adaptive vectorization maximally avoids loss of image information, and the adaptive observation based on multi-feature saliency ensures a reasonable distribution of the total number of measurements.
4. Implementation of JPEG-ABCS
This section describes the implementation of the JPEG-ABCS algorithm from four aspects: adaptive blocking, adaptive vectorization, adaptive observation, and denoising by optimizing the number of iterations.
4.1. Adaptive Blocking Method Based on MIE
The adaptive blocking method proposed in this paper is variable partitioning at the same dimension, that is, n = n_r × n_c, where n is a fixed value (typically 64) and n_r and n_c are the numbers of rows and columns of the variable block (for example, 8 × 8, 4 × 16, 16 × 4, 2 × 32, or 32 × 2). Specifically, the optimized block shape is based on minimizing the mean information entropy (MIE) of the block observation signal set. Since the blocking process must be completed before observation, the as-yet-ungenerated observation set cannot guide the blocking optimization. Therefore, an alternative method is introduced: guide the blocking by minimizing the MIE of the original signal's block set:

    MIE = (1/L) Σ_{i=1}^{L} H_i,          (9)
    H_i = −Σ_{j=g_min}^{g_max} p_{ij} log₂ p_{ij},          (10)

where MIE represents the mean information entropy of the pixel set, H_i represents the information entropy of the ith sub-image in the pixel domain, p_{ij} represents the proportion of pixels with gray value j in the ith sub-image, L is the number of blocks, and g_min and g_max are the minimum and maximum pixel gray values of the original signal, respectively. However, the effectiveness of this method rests on consistency between the observation signal's MIE and the original signal's MIE under the same partitioning. To verify this consistency, a test experiment was conducted on multiple standard images; the results are shown in Figure 3. The experimental data show that, under the constraint of minimizing MIE, the optimal blocks of the original signal and the observed signal coincide, which verifies the feasibility and rationality of the proposed blocking optimization. Specifically, Figure 3 shows that, under the minimum-MIE constraint, the optimal block shape depends only on the test image itself, not on the sampling rate.
In addition, a large number of other standard test images verify that the search for the best block shows the same trend whether applied to the observation signal or the original signal, and that an extreme (minimum) point always exists.
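The MIE-guided shape search can be sketched as follows (a simplified stand-in for the paper's procedure: per-block histogram entropy, averaged over the partition, minimized over a fixed set of 64-pixel shapes):

```python
import numpy as np

def block_entropy(block):
    """Shannon entropy (bits) of the gray-level histogram of one block."""
    _, counts = np.unique(block, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def mie(img, shape):
    """Mean information entropy of the partition of img into shape blocks."""
    h, w = shape
    R, C = img.shape
    ents = [block_entropy(img[i:i + h, j:j + w])
            for i in range(0, R, h) for j in range(0, C, w)]
    return float(np.mean(ents))

def best_block_shape(img, shapes=((8, 8), (4, 16), (16, 4), (2, 32), (32, 2))):
    """Pick the 64-pixel block shape minimizing mean information entropy."""
    return min(shapes, key=lambda s: mie(img, s))
```

On a horizontal gray ramp (each column constant), tall narrow blocks see the fewest distinct gray values, so the search selects 32 × 2; transposing the image flips the choice to 2 × 32, matching the intuition that the block shape should follow the image structure.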
4.2. Adaptive Vectorization Based on ASM
The basis of adaptive vectorization is identifying the directional characteristics of the image. Many methods exist for identifying directional features in array signal processing, especially in DOA estimation research, such as the Capon, MUSIC, maximum likelihood, subspace fitting, and ESPRIT algorithms [16, 17]. In this article, the angular second moment (ASM) of the gray-level co-occurrence matrix (GLCM) is used to characterize the saliency of each direction [18]:

    ASM(d, θ) = Σ_i Σ_j [p(i, j | d, θ)]²,          (11)

where P(i, j | d, θ) is the GLCM function, p(i, j | d, θ) is its normalized form, each entry of P counts the adjacent pixel pairs in the image at distance d, direction θ, and gray values (i, j), and the normalization divides by the maximum possible number of pixel pairs under the selected conditions.
Combined with the rectangular shape of adaptive blocking, ASM values in four directions are defined for adaptive vectorization, namely, ASM(0°), ASM(45°), ASM(90°), and ASM(135°) [19]. In addition, the maximum of these four values is defined as ASM_max.
The specific method of adaptive vectorization is based on the relationship among the four ASM values. If ASM_max = ASM(0°), the vectorization set of the original sub-image set is generated using horizontal scanning and vertical linking; if ASM_max = ASM(90°), it is generated using vertical scanning and horizontal linking; if ASM_max = ASM(45°), it uses zigzag generation along the main diagonal; and if ASM_max = ASM(135°), it uses zigzag generation along the counter-diagonal. It should be noted that the adaptive vectorization of each sub-image must be tied to the design of the sparse matrix so as to jointly realize a one-dimensional vectorization that preserves the two-dimensional structural characteristics of the sub-image data.
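The direction test above can be sketched with a plain GLCM/ASM computation (distance 1, four standard angles, coarse gray quantization; the level count and offsets are our illustrative choices):

```python
import numpy as np

# Distance-1 pixel offsets for the four standard GLCM angles.
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}

def glcm_asm(img, angle, levels=8):
    """Angular second moment of the distance-1 GLCM at the given angle."""
    q = img.astype(int) * levels // 256      # quantize gray levels
    di, dj = OFFSETS[angle]
    R, C = q.shape
    P = np.zeros((levels, levels))
    for i in range(R):
        for j in range(C):
            ii, jj = i + di, j + dj
            if 0 <= ii < R and 0 <= jj < C:
                P[q[i, j], q[ii, jj]] += 1
    P /= P.sum()                             # normalized co-occurrence matrix
    return float((P ** 2).sum())             # ASM = sum of squared entries

def dominant_direction(img):
    """Angle whose GLCM is most concentrated (largest ASM)."""
    return max(OFFSETS, key=lambda a: glcm_asm(img, a))
```

For an image whose gray value depends only on the column, vertically adjacent pixels are always identical, so the 90° GLCM is perfectly diagonal and its ASM is the largest; transposing the image makes 0° dominant.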
4.3. Adaptive Observing Based on Multi-Feature Saliency and Bit Rate Control
The key point of the non-uniform measurement matrix is determining M_i. Considering that different sub-images contain different amounts of information and that the human visual attention mechanism has different sensitivities to different image content, this paper proposes an adaptive measurement matrix based on multi-feature saliency and the orthogonal symmetric Toeplitz matrix (OSTM), with the number of measurements per block of the form

    M_i = round( γ · (σ_i²)^α · (S_i)^β ),          (12)

where γ is the adjustment factor that matches the total measurement budget, σ_i² stands for the overall variance function, S_i is the local saliency function according to Weber's law [20] evaluated over the salient domain determined by the optimal bounding box, Φ_i is formed by randomly taking M_i rows of the n-dimensional OSTM [21], and α = 2 and β = 1 are the recommended values. The purpose of this adaptive measurement matrix is to rationalize the sampling process: detail blocks are sampled more and smooth blocks less.
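The allocation idea can be sketched as follows. This is a simplified stand-in: mean gradient magnitude replaces the paper's Weber-law local-saliency term, and the budget-matching factor is folded into a normalization; the rounding means the per-block counts need not sum exactly to the budget.

```python
import numpy as np

def allocate_measurements(blocks, total_M, alpha=2.0, beta=1.0, m_min=1):
    """Split a measurement budget across blocks in proportion to
    variance**alpha * edge_energy**beta (edge energy is an illustrative
    proxy for the Weber-law local-saliency term)."""
    weights = []
    for b in blocks:
        var = b.var()                          # overall variance feature
        gy, gx = np.gradient(b.astype(float))  # local detail proxy
        edge = np.mean(np.hypot(gx, gy))
        weights.append((var ** alpha) * (edge ** beta) + 1e-12)
    w = np.array(weights) / np.sum(weights)
    # each block keeps at least m_min measurements
    return np.maximum(m_min, np.round(w * total_M).astype(int))
```

A flat block receives only the minimum number of measurements, while a textured block absorbs most of the budget, which is exactly the "more sampling of detail blocks, less sampling of smooth blocks" behavior described above.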
Traditional JPEG-like algorithms control the bit rate (bits per pixel, bpp) through the quantization matrix, encoding, and bitstream organization [22]. In this paper, the mean sampling rate r is additionally used to improve the compression performance of the JPEG-ABCS algorithm. The bit rate for an 8-bit, 256-level grayscale image is controlled as follows:

    bpp = 8 · r · η_q · η_e,          (13)

where r is the mean sampling rate, which also corresponds to the information decay ratio caused by sparse measurement, η_q (η_q < 1) represents the information decay ratio caused by quantization, and η_e (η_e < 1) is the bit compression ratio of the entropy encoding and bitstream organization.
Analysis of these three factors in the JPEG-ABCS model shows that once the encoding method is determined, the only factors that can be optimized are r and η_q, while η_e is fixed. To reduce the bit rate of restored images at the same quality, these two factors must be set reasonably. This article focuses on analyzing the effect of different r on image performance at the same bpp, together with a matching design of the quantization matrix, which determines η_q. In analyzing the impact of r on compressed image performance, a synthetic indicator composed of the peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) is used as the evaluation criterion to find the best r at each bpp. At the same time, comparing different r at the same bpp requires different quantization matrixes; this article designs them using the quality factor (QF) [23]. Because the data quantized in the JPEG-ABCS algorithm form a one-dimensional vector and are a normalized measurement of the frequency-domain sparse coefficients of the original signal, the quantization matrix degenerates into a quantization vector whose elements no longer carry frequency-domain meaning and are equally important. Therefore, the elements of the quantization matrix for each sub-image designed in this paper share the same value. The goal of the quantization matrix matching design is then only to find a fitting function that approximates the relationship between bpp and QF. Figure 4 shows the experimental data of this test process using Lena. According to Figure 4(a), under the constraint of maximizing the synthetic indicator, the optimal mean sampling rate increases with bpp.
Meanwhile, a function for obtaining r can be summarized, as shown in equation (14); the typical values of its two bpp thresholds are 0.15 and 0.3. It should be noted that equation (14) can be applied directly only to images whose MIE is similar to Lena's. For other images, the threshold conditions in the equation should be corrected according to the MIE of the image block set: specifically, a coefficient defined as the MIE ratio of the given image to the Lena image is introduced to correct the two threshold conditions. The fitting function f is designed by cubic curve fitting, with data derived from measured values of QF and bpp:

    QF = ⌊f(bpp)⌋,          (15)

where ⌊·⌋ is the floor function and the coefficients of f are obtained by curve fitting. Figure 4(b) compares the actual light table's QF with the design value obtained from equation (15); the QF obtained by equation (15) satisfies the actual requirements well.
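The QF–bpp fit of equation (15) can be sketched as follows. The (bpp, QF) measurement pairs below are hypothetical placeholders standing in for the actual calibration data; only the fitting mechanics (least-squares cubic plus floor and range clipping) are illustrated.

```python
import numpy as np

# Hypothetical (bpp, QF) calibration pairs; the paper fits measured pairs.
bpp_samples = np.array([0.25, 0.5, 0.75, 1.0, 1.5, 2.0])
qf_samples = np.array([12.0, 28.0, 41.0, 52.0, 68.0, 80.0])

# Cubic least-squares fit QF ~ f(bpp).
coeffs = np.polyfit(bpp_samples, qf_samples, 3)

def quality_factor(target_bpp):
    """QF = floor(f(bpp)), clipped to the valid quality-factor range [1, 100]."""
    q = np.floor(np.polyval(coeffs, target_bpp))
    return int(np.clip(q, 1, 100))
```

A cubic fitted to smooth monotone calibration data stays close to the measurements and yields a monotone QF over the working bpp range, which is all the matching design requires.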
4.4. Denoising by Optimizing the Number of Iterations
Consider the noisy observation model:

    y_n = Φ(x + e) = Φ x_n,          (16)

where e is additive white Gaussian noise with zero mean and standard deviation σ and x_n = x + e is the equivalent noisy original signal.
Since the reconstructed original signal x̂ is recovered from the noisy observation signal y_n, the reconstruction error ε is caused mainly by the noise and the reconstruction algorithm, and it can be defined with the ℓ₂ norm as

    ε = ‖x − x̂‖₂²,          (17)

where x̂ is restored by the pseudo-inverse operation, that is, x̂ = Ψ α̂ with α̂ = A⁺ y_n, and α̂ is the reconstructed sparse signal. Here A⁺ is the pseudo-inverse of the sensing matrix A, usually A⁺ = (AᵀA)⁻¹Aᵀ, and it also represents the reconstruction algorithm in CS. The reconstruction algorithm of CS is based on the sparse representation of the signal [24]; that is, the reconstruction sparsity K_r of α̂ satisfies the inequality K_r ≤ M, so the pseudo-inverse operation to obtain α̂ can be rewritten as

    α̂ = (A_Sᵀ A_S)⁻¹ A_Sᵀ y_n,          (18)

where A_S is the matrix formed from the K_r column vectors of A that have the greatest correlation with y_n.
Because equation (17) cannot be calculated directly, we introduce the projection matrix P = A_S (A_Sᵀ A_S)⁻¹ A_Sᵀ to aid analysis and calculation, where P is a projection matrix of rank K_r and I − P is a projection matrix of rank M − K_r. Since the subspaces projected by P and I − P are orthogonal, the inner product of the corresponding error components is zero. Therefore, equation (19) can be transformed into the following form:

    ε = ε_A + ε_N,          (21)
Equation (21) reveals that the reconstruction error ε is composed of the algorithm error ε_A and the noise error ε_N. ε_A decreases as the reconstruction sparsity K_r increases, whereas ε_N increases with K_r [25, 26]. Reconstruction error versus reconstruction sparsity is therefore a bias-variance trade-off, and there must be an optimal reconstruction sparsity K_opt that minimizes the reconstruction error:

    K_opt = arg min_{K_r} ε(K_r).          (22)
Figure 5 shows the relationship between reconstruction sparsity and reconstruction error under different noise conditions, using a modified Lena test image generated by keeping 60 sparse coefficients under a discrete cosine basis; that is, its original sparsity K is 60. The noise added in the test is zero-mean Gaussian white noise whose standard deviation σ (noise-std) represents the noise intensity. The PSNR indicator is used to characterize the size of the reconstruction error. It can be seen from Figure 5 that the optimal reconstruction sparsity decreases as the noise intensity increases and is less than the original sparsity K.
The verification experiment in Figure 5 shows that an optimal reconstruction sparsity does exist in reconstruction under noise. However, equation (22) cannot be used directly to optimize the reconstruction process: in actual reconstruction, only the observation data at the receiving end are available. Therefore, this paper designs a solution that uses the observation data to optimize the reconstruction sparsity.
According to the definition of CS, the measurement matrix Φ obeys the RIP criterion, and therefore

    (1 − δ) ‖x − x̂‖₂² ≤ ε_y ≤ (1 + δ) ‖x − x̂‖₂²,          (23)

where δ is a coefficient related to Φ and K_r, and ε_y = ‖y_n − Φ x̂‖₂² is the reconstruction error of the observation data. Transforming formula (23) gives the bounds of the original-data reconstruction error:

    ε_y / (1 + δ) ≤ ‖x − x̂‖₂² ≤ ε_y / (1 − δ).          (24)
These two equations show that the reconstruction errors of the original data and the observation data are consistent, so the reconstruction sparsity can be optimized by minimizing the error of the observation data:

    K_opt = arg min_{K_r} ε_y(K_r).          (25)
It is known from the above equation that the reconstruction error of the observation signal satisfies a chi-square distribution, so the upper and lower bounds of ε_y can be derived from chi-square distribution probabilities. In addition, when calculating the minimum of ε_y, the worst case is considered; that is, the minimum of the upper bound of ε_y is calculated.
In the ℓ₀-norm reconstruction algorithm of CS, the reconstruction sparsity equals the number of iterations. Therefore, optimizing the number of iterations can reduce the impact of noise on image quality when orthogonal matching pursuit (OMP) is used as the signal recovery algorithm [27, 28]. According to the Bayesian information criterion (with the tuning parameters of confidence probability and effective probability taken as 1 and 0, respectively) [29], the optimal number of iterations is achieved by minimizing the noise influence:

    k_opt = arg min_k | ε_y(k) − ε_n |,          (26)

where ε_n is the noise error of the observation data.
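A practical reading of this criterion is an OMP whose iteration count is bounded by the noise floor rather than fixed in advance. The sketch below uses the common surrogate "stop once the residual energy falls to E‖e‖² = Mσ²"; this is our simplified stand-in for the exact BIC-based rule of equation (26).

```python
import numpy as np

def omp_denoise(Phi, y, sigma, k_max):
    """OMP that stops once the residual energy reaches the expected noise
    energy M * sigma**2, instead of running to a fixed sparsity."""
    M, n = Phi.shape
    residual, support = y.copy(), []
    coef = np.zeros(0)
    for _ in range(k_max):
        if residual @ residual <= M * sigma ** 2:
            break                    # further atoms would mostly fit noise
        j = int(np.argmax(np.abs(Phi.T @ residual)))
        if j not in support:
            support.append(j)
        coef, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
        residual = y - Phi[:, support] @ coef
    x = np.zeros(n)
    if support:
        x[support] = coef
    return x, len(support)
```

In the noiseless case the loop runs exactly to the true sparsity and recovers the signal; when the assumed noise level dominates the observation energy, the rule selects zero atoms, reflecting the optimal reconstruction sparsity shrinking as noise grows.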
4.5. Pseudocode of JPEG-ABCS
The JPEG lifting algorithm (JPEG-ABCS) described in this article consists mainly of the above four components, excluding the entropy codec, and its full pseudocode is shown in Algorithm 1.
