Abstract
An optimal bit allocation is presented for classification-based source-dependent transform coding. A vector of transform coefficients is considered to have been produced by a mixture of processes. The available bit resource is distributed optimally in two stages: (1) bit allocation is performed for each class of coefficient vectors, and (2) bit allocation is performed for each vector coefficient. A solution for low bit rates, which imposes a nonnegativity constraint on the allocated bits, is also presented. The rate-distortion bound of the classification-based source coding is derived.
1. Introduction
Source coding techniques are used to reduce data in digital systems to meet realistic transmission and storage constraints. In many such systems, data is reduced using quantizers and bit allocation methods which encode quantizer outputs for further processing. Quantizers apply to original or transformed data and their outputs can be compactly represented for communication or storage. Popular compressed image formats such as JPEG [1] and JPEG2000 [2] employ such techniques to reduce data sizes.
In particular, in the so-called lossy transform/subband coding techniques [1–20], original signal data are transformed to a vector source, that is, a set of coefficients. Each coefficient is then quantized to a symbol, which is encoded using a certain number of bits. Coarsely quantized outputs are encoded with fewer bits, but the reconstruction error (distortion) is higher.
The bit allocation problem for the vector source has been widely addressed in the literature, and many different strategies have been proposed. An extensive overview of bit allocation approaches in the context of transform/subband coding methods is given in [9].
The early transform coders were based on i.i.d. models of transform coefficients, and bit allocation was performed in proportion to their importance as defined by the variance of their distributions [4, 5, 8]. In [4], all of the quantizers are described by the same exponential quantizer function (QF). The method may result in negative bit rates, which may be corrected using an iterative procedure. In [5], the QF is the same for all of the quantizers and is strictly convex, although it is not assumed to be exponential. In addition, a constraint on the nonnegativity of bit rates is imposed. Both methods are described by closed-form solutions. More general approaches have been suggested in [6, 7], with an optimal bit allocation strategy found for an arbitrary set of quantizers and an integer bit allocation constraint, but the solutions are not closed-form.
The performance of these coders was later improved by more accurate modeling of coefficients. One approach simply assumes that the vector source may come from a finite set of several possible processes with different distributions [10–19]. The coder decides to which distribution the coefficients belong using a classification process and encodes them accordingly. The classification process may include additional transformations [10–14] and may be optimized for signals of interest, not just typical ones. The classification approach in [10] doubles the compression ratio at a fixed distortion, while up to 2 dB improvement in reconstruction quality is reported for another approach [13].
Practical still image compression techniques [1–3] rely on simple quantizers, but finely allocate bit resources adaptively for each transformed image fragment in an “input-by-input” manner [9], as opposed to the classification approach described above. Each image fragment is coded in a deterministic way with coder parameters optimized for that particular input and not for the ensemble of inputs. The input-by-input, or realization-adaptive, approach is not well explained by classical rate-distortion theory, as many practical encoders operate in the low bit rate or high distortion region.
Even though the realization-adaptive approach is in general superior for still image compression, extensions of classical bit allocation and classification techniques have been used successfully in various applications [15–20].
In this paper, a bit allocation technique is presented for classification-based methods. In Section 2, we systematize the presentation of our results in [13, 14], which were later rederived in [17, 18] in another context. Then, in Section 3, we extend this solution by imposing the nonnegativity constraint on bit rates for low-rate coding and estimate the rate-distortion bound for the classification approach.
2. Bit Allocation for Classification-Based Method
In the remainder of the paper, we operate on blocks (vectors) of data to be quantized. Our purpose here is to present results for transform coders where quantization is applied to the transformed coefficients. However, the method could be applied to other vector sources as well.
The classical bit allocation techniques assume the same number of bits allocated to different blocks, and they address optimal bit resource allocation between coefficients. In our model, the blocks of data may come from different sources, and the goal is to optimally distribute the bits among the blocks and block coefficients [13, 14].
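Let $b$ be the average number of bits per sample for a block of $N$ coefficients. If $b_k$ bits are allocated to the $k$th coefficient, then the quantization error in the output of an optimal quantizer can be modeled as
$$\sigma_{q_k}^2 = \varepsilon^2\, 2^{-2 b_k}\, \sigma_k^2, \tag{1}$$
where $\varepsilon^2$ is the coefficient which depends on the pdf of the input signal, and $\sigma_k^2$ is the variance of the input to quantizer $k$. By definition, $b = \frac{1}{N}\sum_{k=1}^{N} b_k$. The following optimal classical bit allocation minimizes the overall distortion:
$$b_k = b + \frac{1}{2}\log_2 \frac{\sigma_k^2}{\left(\prod_{j=1}^{N}\sigma_j^2\right)^{1/N}}. \tag{2}$$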
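The classical log-variance rule can be illustrated with a minimal Python sketch. It assumes the standard form of the rule, in which each coefficient receives the average rate plus half the base-2 logarithm of the ratio of its variance to the geometric mean of all variances, and the exponential error model with a unit pdf-dependent coefficient. The function names and the variance values are hypothetical and chosen only for illustration.

```python
import math

def log_variance_allocation(variances, avg_bits):
    """Classical rule: b_k = b + 0.5 * log2(var_k / geometric mean of variances)."""
    n = len(variances)
    log_gm = sum(math.log2(v) for v in variances) / n  # log2 of the geometric mean
    return [avg_bits + 0.5 * (math.log2(v) - log_gm) for v in variances]

def total_distortion(variances, bits, eps2=1.0):
    """Sum of modeled quantizer errors eps2 * 2^(-2 b_k) * var_k."""
    return sum(eps2 * 2.0 ** (-2.0 * b) * v for v, b in zip(variances, bits))

variances = [16.0, 4.0, 1.0, 0.25]  # hypothetical coefficient variances
bits = log_variance_allocation(variances, avg_bits=2.0)
uniform = [2.0] * len(variances)    # same rate for every coefficient

print(bits)                                  # [3.5, 2.5, 1.5, 0.5]
print(total_distortion(variances, bits))     # 0.5
print(total_distortion(variances, uniform))  # 1.328125
```

In this example every coefficient ends up with the same modeled error (0.125), which is the usual signature of the log-variance rule. Lowering `avg_bits` to 0.5 would assign a negative number of bits to the last coefficient, which is the low-rate problem addressed in Section 3.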
Let $B$ be the average number of bits per sample, $B_m$ the average number of bits per sample for a vector (block) belonging to class $m$, $m = 1, \ldots, M$, and $b_{mk}$ the number of bits assigned to coefficient $k$ of a vector from class $m$. Let $p_m$ be the probability of a vector (block) from class $m$. The probabilities $p_m$ can be estimated based on the block classification approach used, for example, from a training signal; no constraints are imposed on the classification approach.
The bit allocation problem is a two-stage process. In the first stage, the average bit resource $B$ is distributed among the classes so that $\sum_{m=1}^{M} p_m B_m = B$. In the second stage, for blocks assigned to each class $m$, the optimal bit allocation among the quantized coefficients is found for the overall bit resource $\sum_{k=1}^{N} b_{mk} = N B_m$.
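For the second stage, the bit allocation is simply the classical solution:
$$b_{mk} = B_m + \frac{1}{2}\log_2 \frac{\sigma_{mk}^2}{\rho_m^2}, \tag{3}$$
with the resulting distortion:
$$D_m = \sum_{k=1}^{N} \varepsilon^2\, 2^{-2 b_{mk}}\, \sigma_{mk}^2 = N \varepsilon^2\, 2^{-2 B_m}\, \rho_m^2, \tag{4}$$
where $\sigma_{mk}^2$ is the variance of coefficient $k$ in class $m$ and
$$\rho_m^2 = \left(\prod_{k=1}^{N} \sigma_{mk}^2\right)^{1/N}. \tag{5}$$
Average distortion over all classes, with $p_m$ the probability of class $m$:
$$D = \sum_{m=1}^{M} p_m D_m = N \varepsilon^2 \sum_{m=1}^{M} p_m\, 2^{-2 B_m}\, \rho_m^2. \tag{6}$$
An optimal bit allocation for the first stage can now be formulated as
$$\min_{B_1, \ldots, B_M} D \quad \text{subject to} \quad \sum_{m=1}^{M} p_m B_m = B. \tag{7}$$
Observe that the resulting distortion for the whole block (4) is similar to the distortion function in the output of a single quantizer (1), with the difference that $B_m$ represents the average number of bits per sample for blocks from class $m$. One can expect that the optimization problem will result in the classical log-variance rule, and this is indeed the case. We use the method of Lagrange multipliers to solve this problem [14]:
$$J = N \varepsilon^2 \sum_{m=1}^{M} p_m\, 2^{-2 B_m}\, \rho_m^2 + \lambda \left(\sum_{m=1}^{M} p_m B_m - B\right), \qquad \frac{\partial J}{\partial B_m} = -2 \ln 2 \cdot N \varepsilon^2 p_m\, 2^{-2 B_m}\, \rho_m^2 + \lambda p_m = 0, \tag{8}$$
from which we find expressions for $B_m$:
$$B_m = \frac{1}{2}\log_2 \frac{2 \ln 2 \cdot N \varepsilon^2 \rho_m^2}{\lambda}. \tag{9}$$
From the constraint $\sum_{m} p_m B_m = B$ and the condition $\sum_{m} p_m = 1$ one can obtain
$$\frac{1}{2}\log_2 \frac{2 \ln 2 \cdot N \varepsilon^2}{\lambda} = B - \frac{1}{2}\sum_{j=1}^{M} p_j \log_2 \rho_j^2, \tag{10}$$
and finally
$$B_m = B + \frac{1}{2}\log_2 \frac{\rho_m^2}{\prod_{j=1}^{M} \left(\rho_j^2\right)^{p_j}}. \tag{11}$$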
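The two-stage rule can be sketched in a few lines of Python. The sketch assumes the log-variance form for both stages: the class rate is the average rate plus half the log-ratio of the class geometric-mean variance to the probability-weighted geometric mean over classes, and the coefficient rates follow the classical rule inside each class. The two classes, their variances, and the probabilities are hypothetical, and the pdf-dependent coefficient is set to one.

```python
import math

def mean_log2(variances):
    """log2 of the geometric mean of a list of variances."""
    return sum(math.log2(v) for v in variances) / len(variances)

def two_stage_allocation(class_variances, probs, avg_bits):
    """Return (class_bits, coeff_bits) for the two-stage log-variance rule."""
    log_rho = [mean_log2(var) for var in class_variances]
    weighted = sum(p * lr for p, lr in zip(probs, log_rho))
    # Stage 1: average rate per class.
    class_bits = [avg_bits + 0.5 * (lr - weighted) for lr in log_rho]
    # Stage 2: classical rule inside each class with the class budget.
    coeff_bits = [[b_m + 0.5 * (math.log2(v) - lr) for v in var]
                  for var, lr, b_m in zip(class_variances, log_rho, class_bits)]
    return class_bits, coeff_bits

def average_distortion(class_variances, probs, coeff_bits, eps2=1.0):
    """Expected per-block distortion under the exponential error model."""
    return sum(p * sum(eps2 * 2.0 ** (-2.0 * b) * v for v, b in zip(var, bits))
               for p, var, bits in zip(probs, class_variances, coeff_bits))

# Hypothetical two-class source with four coefficients per block.
class_variances = [[16.0, 4.0, 1.0, 0.25], [1.0, 1.0, 0.25, 0.25]]
probs = [0.75, 0.25]
class_bits, coeff_bits = two_stage_allocation(class_variances, probs, avg_bits=2.0)

# Baseline: every class gets the same average rate (classical rule per class).
equal_bits = [[2.0 + 0.5 * (math.log2(v) - mean_log2(var)) for v in var]
              for var in class_variances]

print(class_bits)  # [2.25, 1.25]
print(average_distortion(class_variances, probs, coeff_bits))
print(average_distortion(class_variances, probs, equal_bits))
```

For these values the probability-weighted class rates average back to 2 bits per sample, and the two-stage allocation yields a lower expected distortion than the equal-rate baseline.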
If the quantization is performed in the transform domain and an inverse orthogonal transform is used to reconstruct the data, then the average distortion in the output is equal to the distortion produced by the quantizers in the transform domain, and the bit allocation rule obtained above applies. In certain cases, the distortion of the reconstructed source is not equal to the distortion introduced by the quantizers. This is, for example, the case when additional weighting is applied prior to quantization to account for the human visual system, or when the inverse transformation/filterbank is not orthogonal. In such scenarios, the coefficient variances in our solutions are multiplied by the corresponding weighting factors. More details for filterbanks can be found in our earlier work [13].
3. Bit Allocation for a General Quantization Function and Low Bit Rates
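In this section, we generalize the results from [5] for a classification-based approach to account for a general quantizer function and low bit rates, for which the result of Section 2 may produce negative bit allocations. Assume that $h(b)$ is a quantizer function defining the average distortion on the quantizer output as a function of the allocated bits $b$, defined for unit variance input. The distortion that results from quantizing the input with variance $\sigma^2$ is $\sigma^2 h(b)$. Let $h$ be strictly convex with a continuous first derivative $h'$, $h'(\infty) = 0$, and let $g$ be the inverse function of $h'$. Let us denote $B$ as the average rate per sample, and $b_{mk}$ as the number of bits assigned to the $k$th component of the quantized block belonging to class $m$, where blocks have $N$ components and class $m$ occurs with probability $p_m$, $m = 1, \ldots, M$. Similarly, $\sigma_{mk}^2$ is the variance of the $k$th component from class $m$. Then the allocation of bits that minimizes the average total distortion per block,
$$D = \sum_{m=1}^{M} p_m \sum_{k=1}^{N} \sigma_{mk}^2\, h(b_{mk}), \tag{12}$$
subject to the constraints
$$\sum_{m=1}^{M} p_m \sum_{k=1}^{N} b_{mk} = N B, \qquad b_{mk} \ge 0, \tag{13}$$
is given by
$$b_{mk} = \begin{cases} g\!\left(\dfrac{\theta\, h'(0)}{\sigma_{mk}^2}\right), & \sigma_{mk}^2 > \theta, \\ 0, & \sigma_{mk}^2 \le \theta, \end{cases} \tag{14}$$
where $\theta > 0$ is the unique root of the equation
$$\sum_{m=1}^{M} p_m \sum_{k:\, \sigma_{mk}^2 > \theta} g\!\left(\frac{\theta\, h'(0)}{\sigma_{mk}^2}\right) = N B, \tag{15}$$
and the value of the minimum distortion is
$$D_{\min} = \sum_{m=1}^{M} p_m \left[\, \sum_{k:\, \sigma_{mk}^2 > \theta} \sigma_{mk}^2\, h\!\left(g\!\left(\frac{\theta\, h'(0)}{\sigma_{mk}^2}\right)\right) + h(0) \sum_{k:\, \sigma_{mk}^2 \le \theta} \sigma_{mk}^2 \right]. \tag{16}$$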
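Proof.
This allocation rule follows from [5, Proposition 2.1]. Let us consider the joint quantization of a large number ($L$) of transform coefficient blocks. The vector is now of larger size ($LN$ components), and let us apply the classical bit allocation to this extended vector. Then the problem is to minimize the total distortion
$$D_L = \sum_{l=1}^{L} \sum_{k=1}^{N} \sigma_{lk}^2\, h(b_{lk}), \tag{17}$$
subject to the constraints
$$\sum_{l=1}^{L} \sum_{k=1}^{N} b_{lk} = L N B, \qquad b_{lk} \ge 0, \tag{18}$$
where $\sigma_{lk}^2$ and $b_{lk}$ are the variance and the number of bits of the $k$th component of block $l$, $B$ is the average rate per sample, $h$ is the quantizer function, and $g$ is the inverse of its derivative $h'$. According to [5, Proposition 2.1], the solution to this problem is
$$b_{lk} = \begin{cases} g\!\left(\dfrac{\theta\, h'(0)}{\sigma_{lk}^2}\right), & \sigma_{lk}^2 > \theta, \\ 0, & \sigma_{lk}^2 \le \theta, \end{cases} \tag{19}$$
where $\theta$ is the unique root of the equation
$$\sum_{l=1}^{L} \sum_{k:\, \sigma_{lk}^2 > \theta} g\!\left(\frac{\theta\, h'(0)}{\sigma_{lk}^2}\right) = L N B, \tag{20}$$
and the value of the minimum overall distortion is
$$D_{L,\min} = \sum_{l=1}^{L} \left[\, \sum_{k:\, \sigma_{lk}^2 > \theta} \sigma_{lk}^2\, h\!\left(g\!\left(\frac{\theta\, h'(0)}{\sigma_{lk}^2}\right)\right) + h(0) \sum_{k:\, \sigma_{lk}^2 \le \theta} \sigma_{lk}^2 \right]. \tag{21}$$
Recall that there are only $M$ possible classes. Let the blocks from the same classes be grouped together and denote the number of blocks belonging to class $m$ as $L_m$, so that all blocks of class $m$ share the variances $\sigma_{mk}^2$. Then (20), (21), after division by $L$, will become
$$\sum_{m=1}^{M} \frac{L_m}{L} \sum_{k:\, \sigma_{mk}^2 > \theta} g\!\left(\frac{\theta\, h'(0)}{\sigma_{mk}^2}\right) = N B, \qquad \frac{D_{L,\min}}{L} = \sum_{m=1}^{M} \frac{L_m}{L} \left[\, \sum_{k:\, \sigma_{mk}^2 > \theta} \sigma_{mk}^2\, h\!\left(g\!\left(\frac{\theta\, h'(0)}{\sigma_{mk}^2}\right)\right) + h(0) \sum_{k:\, \sigma_{mk}^2 \le \theta} \sigma_{mk}^2 \right]. \tag{22}$$
For an ensemble of realizations, $L_m / L$ is the probability $p_m$ of the transformed block belonging to class $m$. The bit allocation (19) applies here as well. This concludes the proof of the above statement.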
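The nonnegative allocation can be computed numerically by searching for the threshold on the variances. The following Python sketch assumes the exponential quantizer function (two to the power of minus twice the rate) as one admissible choice, normalizes the threshold `theta` by the derivative of the quantizer function at zero, and finds it by bisection. The function names, the bisection search, and the two-class example are illustrative assumptions and are not taken from [5].

```python
import math

LN2 = math.log(2.0)

def h(b):
    """Assumed quantizer function for unit-variance input."""
    return 2.0 ** (-2.0 * b)

def h_prime(b):
    """Derivative of h."""
    return -2.0 * LN2 * 2.0 ** (-2.0 * b)

def g(x):
    """Inverse of h_prime, valid for h_prime(0) <= x < 0."""
    return -0.5 * math.log2(-x / (2.0 * LN2))

def bits_for(theta, class_variances):
    """Components with variance above theta get g(theta*h'(0)/var) bits, others 0."""
    return [[g(theta * h_prime(0.0) / v) if v > theta else 0.0 for v in var]
            for var in class_variances]

def spent(theta, class_variances, probs):
    """Expected number of bits per block for a given threshold."""
    return sum(p * sum(row)
               for p, row in zip(probs, bits_for(theta, class_variances)))

def allocate(class_variances, probs, avg_bits, iters=100):
    """Bisection on theta so that the expected bits per block equal N * avg_bits."""
    target = len(class_variances[0]) * avg_bits
    lo, hi = 0.0, max(max(var) for var in class_variances)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if spent(mid, class_variances, probs) > target:
            lo = mid   # too many bits spent: raise the threshold
        else:
            hi = mid
    return hi, bits_for(hi, class_variances)

# Hypothetical two-class source at a low rate of 0.5 bit per sample.
class_variances = [[16.0, 4.0, 1.0, 0.25], [1.0, 1.0, 0.25, 0.25]]
probs = [0.75, 0.25]
theta, bits = allocate(class_variances, probs, avg_bits=0.5)
distortion = sum(p * sum(v * h(b) for v, b in zip(var, row))
                 for p, var, row in zip(probs, class_variances, bits))
print(theta, bits, distortion)
```

For this example the threshold settles near the cube root of 2, only the two largest-variance coefficients of the first class receive bits, and the second class receives none. The unconstrained log-variance rule at the same rate would assign negative rates to several coefficients.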
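Remark on rate-distortion bound for the classification-based source-dependent quantization. The rate-distortion bound for a unit variance source is given by the formula $h(b) = 2^{-2b}$ with $h'(b) = -2 \ln 2 \cdot 2^{-2b}$, $h'(0) = -2 \ln 2$, and $g(x) = -\frac{1}{2}\log_2\!\left(-\frac{x}{2 \ln 2}\right)$ [5]. For the classification-based source-dependent quantization, the bit allocation rule described above with these $h$, $h'$, and $g$ becomes
$$b_{mk} = \begin{cases} \dfrac{1}{2}\log_2 \dfrac{\sigma_{mk}^2}{\theta}, & \sigma_{mk}^2 > \theta, \\ 0, & \sigma_{mk}^2 \le \theta, \end{cases}$$
where $\theta$ is obtained from
$$\sum_{m=1}^{M} p_m \sum_{k:\, \sigma_{mk}^2 > \theta} \frac{1}{2}\log_2 \frac{\sigma_{mk}^2}{\theta} = N B,$$
and the minimum distortion is
$$D_{\min} = \sum_{m=1}^{M} p_m \sum_{k=1}^{N} \min\!\left(\theta, \sigma_{mk}^2\right).$$
The side information, that is, the class index of each block, is bounded by the entropy, and the minimum rate for its lossless compression is $H = -\sum_{m=1}^{M} p_m \log_2 p_m$, which is counted per transformed block. The overall rate (per sample) is counted as $R = B + H/N$.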
Figure 1 presents a simulation example. The input signal is assumed to consist of blocks of 16 samples drawn from four stationary, unit-variance, zero-mean, first-order Markov processes with covariance function $r(k) = \rho^{|k|}$. Blocks of data are transformed by the discrete cosine transform (DCT). The parameters ($\rho$) and probabilities ($p$) for these processes are (0.95, 0.7), (0.75, 0.1), (0.65, 0.1), and (0.55, 0.1). For the rate-distortion bound estimation, it is assumed that an ideal classification is performed, and a comparison is made with conventional bit allocation, which is derived under a single-process assumption. The rate-distortion figure demonstrates that, at equal quantization distortions, the bit rate bound achieved with the source classification method is lower for most of the rates. When the bit rate approaches zero, the rate-distortion characteristics are occasionally worse than with conventional quantization because of the side information. The reason for this effect is that, under coarse quantization, some of the classes are not assigned any bits.
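The setup of this example can be approximated with the Python sketch below. It is not the code used to produce Figure 1 and rests on several assumptions: the covariance between samples at lag k is the correlation parameter raised to the power |k|, the transform is an orthonormal DCT-II, the bound uses the exponential quantizer function, and the conventional scheme is modeled by the probability-weighted (mixture) coefficient variances. All function names are hypothetical.

```python
import math

N = 16
RHOS = [0.95, 0.75, 0.65, 0.55]
PROBS = [0.7, 0.1, 0.1, 0.1]

def dct_matrix(n):
    """Orthonormal DCT-II matrix as a list of rows."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def dct_variances(rho, n=N):
    """Variances of DCT coefficients for covariance rho ** |i - j|."""
    c = dct_matrix(n)
    return [sum(c[k][i] * rho ** abs(i - j) * c[k][j]
                for i in range(n) for j in range(n))
            for k in range(n)]

def rd_point(class_variances, probs, theta, side_bits_per_block=0.0):
    """Rate and distortion per sample for a variance threshold theta."""
    n = len(class_variances[0])
    rate = sum(p * sum(0.5 * math.log2(v / theta) for v in var if v > theta)
               for p, var in zip(probs, class_variances))
    dist = sum(p * sum(min(theta, v) for v in var)
               for p, var in zip(probs, class_variances))
    return (rate + side_bits_per_block) / n, dist / n

class_vars = [dct_variances(rho) for rho in RHOS]
# Single-process model: one set of variances averaged over the classes.
mixture_vars = [sum(p * var[k] for p, var in zip(PROBS, class_vars))
                for k in range(N)]
# Side information: entropy of the class index, in bits per block.
entropy = -sum(p * math.log2(p) for p in PROBS)

print("side information: %.4f bits/block, %.4f bits/sample" % (entropy, entropy / N))
for theta in (0.5, 0.1, 0.01):
    r_cls, d_cls = rd_point(class_vars, PROBS, theta, entropy)
    r_conv, d_conv = rd_point([mixture_vars], [1.0], theta)
    print(theta, (r_cls, d_cls), (r_conv, d_conv))
```

With the stated probabilities the side information is about 1.357 bits per block, or roughly 0.085 bit per sample, which is the overhead that can outweigh the classification gain near zero rate. Sweeping `theta` traces the two curves; the points for the two schemes share a threshold, not necessarily a distortion, so they should be compared as curves.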
4. Conclusion
This paper presents an optimal bit allocation technique for compression methods using the classification of vector sources. First, we generalized the well-known log-variance rule using the exponential quantizer function. The quantizer function is often approximated more accurately with other functions and the log-variance rule may produce negative bit quotas at low bit rates. For this reason, the solution for a more general quantizer function and low bit rates is also presented. The rate-distortion bound is calculated for the model of source-dependent quantization, and it illustrates improvement in coding performance.
Acknowledgment
The author would like to thank the reviewers for their useful comments and corrections.