Abstract

An optimal bit allocation is presented for classification-based source-dependent transform coding. A vector of transform coefficients is considered to have been produced by a mixture of processes. The available bit resource is distributed optimally in two stages: (1) bit allocation is performed for each class of coefficient vectors, and (2) bit allocation is performed for each coefficient within a vector. A solution for low bit rates, which imposes a nonnegativity constraint on the allocated bits, is also presented. The rate-distortion bound of classification-based source coding is derived.

1. Introduction

Source coding techniques are used to reduce data in digital systems to meet realistic transmission and storage constraints. In many such systems, data is reduced using quantizers and bit allocation methods which encode quantizer outputs for further processing. Quantizers apply to original or transformed data and their outputs can be compactly represented for communication or storage. Popular compressed image formats such as JPEG [1] and JPEG2000 [2] employ such techniques to reduce data sizes.

In particular, in the so-called lossy transform/subband coding techniques [1–20], the original signal data are transformed to a vector source, that is, a set of coefficients. Each coefficient is then quantized to a symbol, which is encoded using a certain number of bits. Coarse quantization outputs are encoded with fewer bits, but the reconstruction error (distortion) is higher.

The bit allocation problem for the vector source has been widely addressed in the literature and many different strategies have been proposed. [9] contains an extensive overview of various bit allocation approaches in the context of transform/subband coding methods.

The early transform coders were based on i.i.d. models of transform coefficients, and bit allocation was performed in proportion to their importance as defined by the variance of their distributions [4, 5, 8]. In [4], all of the quantizers are described by the same exponential quantizer function (QF). The method may result in negative bit rates, which may be corrected using an iterative procedure. In [5], the QF is the same for all of the quantizers and is strictly convex, although it is not assumed to be exponential. In addition, a nonnegativity constraint on the bit rates is imposed. Both methods are described by closed-form solutions. More general approaches have been suggested in [6, 7], where an optimal bit allocation strategy is found for an arbitrary set of quantizers under an integer bit allocation constraint, but the solutions are not closed-form.

The performance of these coders was later improved by more accurate modeling of coefficients. One approach simply assumes that the vector source may come from a finite set of several possible processes with different distributions [10–19]. The coder decides to which distribution the coefficients belong using a classification process and encodes them accordingly. The classification process may include additional transformations [10–14] and may be optimized for signals of interest, not just typical ones. The classification approach in [10] doubles the compression ratio at a fixed distortion, while an improvement of up to 2 dB in reconstruction quality is reported for another approach [13].

Practical still image compression techniques [13] rely on simple quantizers, but finely allocate bit resources adaptively for each transformed image fragment in an "input-by-input" manner [9], as opposed to the classification approach described above. Each image fragment is coded in a deterministic way with coder parameters optimized for that particular input and not for the ensemble of inputs. The input-by-input, or realization-adaptive, approach is not well explained by classical rate-distortion theory, as many practical encoders operate in the low bit rate (high distortion) region.

Even though the realization-adaptive approach is in general superior for still image compression, extensions of classical bit allocation and classification techniques have been successfully applied in various applications [15–20].

In this paper, a bit allocation technique is presented for classification-based methods. In Section 2, we systematize the presentation of our results in [13, 14], which were later rederived in [17, 18] in another context. In Section 3, we extend this solution by imposing a nonnegativity constraint on the bit rates for low-rate coding and estimate the rate-distortion bound for the classification approach.

2. Bit Allocation for Classification-Based Method

Subsequently in the paper, we operate on blocks (vectors) of data to be quantized. Our purpose here is to present results for transform coders where quantization is applied to the transformed coefficients. However, the method could be applied to other vector sources as well.

The classical bit allocation techniques assume the same number of bits allocated to different blocks, and they address optimal bit resource allocation between coefficients. In our model, the blocks of data may come from different sources, and the goal is to optimally distribute the bits among the blocks and block coefficients [13, 14].

Let $R$ be the average number of bits per sample. If $R_i$ ($i = 0, \ldots, M-1$) bits are allocated to the $i$th coefficient, then the quantization error at the output of an optimal quantizer can be modeled as

$$\sigma_{q_i}^2 = \epsilon^2\, 2^{-2R_i}\, \sigma_i^2, \tag{1}$$

where $\epsilon$ is a coefficient that depends on the pdf of the input signal and $\sigma_i^2$ is the variance of the input to quantizer $i$. By definition, $\sum_{i=0}^{M-1} R_i = MR$. The following optimal classical bit allocation minimizes the overall distortion:

$$R_i = R + \frac{1}{2}\log_2 \frac{\sigma_i^2}{\left[\prod_{j=0}^{M-1}\sigma_j^2\right]^{1/M}}. \tag{2}$$
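
The log-variance rule (2) can be illustrated with a minimal sketch. The function name `classical_bit_allocation` and the variance values are assumptions chosen for illustration; they do not come from the paper.

```python
import math

def classical_bit_allocation(variances, avg_rate):
    """Log-variance rule (2): R_i = R + 0.5*log2(sigma_i^2 / geometric mean)."""
    m = len(variances)
    # log2 of the geometric mean of the coefficient variances
    log_gm = sum(math.log2(v) for v in variances) / m
    return [avg_rate + 0.5 * (math.log2(v) - log_gm) for v in variances]

# Illustrative variances (hypothetical, not taken from the paper)
variances = [16.0, 4.0, 1.0, 0.25]
rates = classical_bit_allocation(variances, avg_rate=2.0)

# Per-coefficient distortion from model (1), with epsilon^2 = 1 assumed
distortions = [v * 2.0 ** (-2.0 * r) for v, r in zip(variances, rates)]
print(rates, distortions)
```

Under model (1), the rule equalizes the distortion across coefficients, and the rates sum to $MR$. Nothing in the rule prevents a rate from becoming negative when $R$ is small, which is the case treated in Section 3.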

Let $R$ be the average number of bits per sample, $R_k$ the average number of bits per sample for a vector (block) belonging to class $k$, $k \in \{0, \ldots, K-1\}$, and $R_{k,i}$ the number of bits assigned to coefficient $i$ of a vector from class $k$. Let $w_k$ be the probability of a vector (block) from class $k$. The probabilities can be estimated according to the block classification approach used, for example from a training signal; no constraints are imposed on the classification approach.

The bit allocation problem is a two-stage process. In the first stage, the average bit resource $R$ is distributed among the classes so that $\sum_{k=0}^{K-1} w_k R_k = R$. In the second stage, for blocks assigned to each class $k$, the optimal bit allocation among the quantized coefficients is found for the overall bit resource $R_k$.

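For the second stage, the bit allocation is simply the classical solution:

$$R_{k,i} = R_k + \frac{1}{2}\log_2 \frac{\sigma_{k,i}^2}{\left[\prod_{j=0}^{M-1}\sigma_{k,j}^2\right]^{1/M}} \tag{3}$$

with the resulting distortion

$$D_k = \epsilon^2\, 2^{-2R_k}\left[\prod_{j=0}^{M-1}\sigma_{k,j}^2\right]^{1/M} = A_k\, 2^{-2R_k}, \tag{4}$$

where

$$A_k = \epsilon^2\left[\prod_{j=0}^{M-1}\sigma_{k,j}^2\right]^{1/M}. \tag{5}$$

The average distortion over all classes is

$$D = \sum_{k=0}^{K-1} w_k D_k. \tag{6}$$

An optimal bit allocation for the first stage can now be formulated as

$$\min_{R_0, \ldots, R_{K-1}} D \quad \text{under the constraint} \quad \sum_{k=0}^{K-1} w_k R_k = R. \tag{7}$$

Observe that the resulting distortion for the whole block (4) is similar to the distortion function at the output of a single quantizer (1), with the difference that $R_k$ represents the average number of bits per sample for blocks from class $k$. One can expect that the optimization problem will result in the classical log-variance rule, and this is indeed the case. We use the method of Lagrange multipliers to solve this problem [14]:

$$\frac{\partial}{\partial R_k}\left[D - \lambda\left(R - \sum_{l=0}^{K-1} w_l R_l\right)\right] = 0, \quad k = 0, 1, \ldots, K-1,$$
$$-2\ln 2\; A_k\, 2^{-2R_k}\, w_k + \lambda w_k = 0, \quad k = 0, 1, \ldots, K-1, \tag{8}$$

from which we find expressions for $R_k$:

$$R_k = -\frac{1}{2}\log_2 \frac{\lambda}{2\ln 2} + \frac{1}{2}\log_2 A_k = \Lambda + \frac{1}{2}\log_2 A_k, \quad k = 0, 1, \ldots, K-1. \tag{9}$$

From the constraint $\sum_{l=0}^{K-1} w_l R_l = R$ and the condition $\sum_{l=0}^{K-1} w_l = 1$, one can obtain

$$\Lambda = R - \sum_{l=0}^{K-1} w_l \frac{1}{2}\log_2 A_l, \qquad R_k = R - \sum_{l=0}^{K-1} w_l \frac{1}{2}\log_2 A_l + \frac{1}{2}\log_2 A_k, \tag{10}$$

and finally

$$R_k = R + \frac{1}{2}\log_2 \frac{A_k}{\prod_{l=0}^{K-1} A_l^{w_l}}. \tag{11}$$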
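
The two stages can be sketched numerically as follows. This is a minimal illustration of (3), (5), and (11), not a reference implementation: the function names, the two-class example, and the choice $\epsilon^2 = 1$ are assumptions made here.

```python
import math

def geometric_mean(values):
    """Geometric mean of positive values, computed in the log domain."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

def two_stage_allocation(class_variances, class_probs, avg_rate, eps2=1.0):
    """Return (class_rates, coeff_rates) following (11) and (3)."""
    # A_k from (5): eps^2 times the geometric mean of the class variances
    a = [eps2 * geometric_mean(v) for v in class_variances]
    # log2 of the weighted geometric mean prod_l A_l^{w_l}
    log_wgm = sum(w * math.log2(ak) for w, ak in zip(class_probs, a))
    # Stage 1, equation (11): rate per class
    class_rates = [avg_rate + 0.5 * (math.log2(ak) - log_wgm) for ak in a]
    # Stage 2, equation (3): rate per coefficient within each class
    coeff_rates = []
    for rk, v in zip(class_rates, class_variances):
        gm = geometric_mean(v)
        coeff_rates.append([rk + 0.5 * math.log2(s / gm) for s in v])
    return class_rates, coeff_rates

# Hypothetical example: K = 2 classes, M = 2 coefficients per block
class_variances = [[16.0, 1.0], [4.0, 0.25]]
class_probs = [0.5, 0.5]
class_rates, coeff_rates = two_stage_allocation(class_variances, class_probs, 2.0)
print(class_rates, coeff_rates)
```

In this example the class with the larger geometric-mean variance receives more bits, and the weighted average of the class rates equals $R$, as required by the constraint in (7).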

If the quantization is performed in the transform domain and an inverse orthogonal transform is used to reconstruct the data, then the average distortion at the output is equal to the distortion produced by the quantizers in the transform domain, and the bit allocation rule obtained above applies. In certain cases, the distortion of the reconstructed source is not equal to the distortion introduced by the quantizers. This is the case, for example, when additional weighting is applied prior to quantization to account for the human visual system, or when the inverse transformation/filterbank is not orthogonal. In such scenarios, the variances $\sigma_{k,i}^2$ are multiplied by weighting factors $\gamma_{k,i}$, that is, in our solutions $\sigma_{k,i}^2 \rightarrow \gamma_{k,i}\sigma_{k,i}^2$. More details for filterbanks can be found in our earlier work [13].

3. Bit Allocation for a General Quantization Function and Low Bit Rates

In this section, we generalize the results from [5] for a classification-based approach to account for a general quantizer function and low bit rates, for which the result of Section 2 may produce negative bit allocations. Assume that $Q(R)$ is a quantizer function defining the average distortion at the quantizer output as a function of the allocated bits, defined for a unit-variance input. The distortion that results from quantizing an input with variance $\sigma^2$ is $\sigma^2 Q(R)$. Let $Q(R)$ be strictly convex with a continuous first derivative $Q'(R)$ and $Q(\infty) = 0$, and let $h(\cdot)$ be the inverse function of $Q'(\cdot)$. Let us denote by $R$ the average rate per sample and by $R_{k,j}$ the number of bits assigned to the $j$th component of a quantized block belonging to class $k$. Similarly, $\sigma_{k,j}^2$ is the variance of component $j$ from class $k$. Then the allocation of bits that minimizes the average total distortion per block,

$$D = \sum_{k=0}^{K-1}\sum_{j=0}^{M-1} w_k\, \sigma_{k,j}^2\, Q(R_{k,j}), \tag{12}$$

subject to the constraints

$$\sum_{k=0}^{K-1} w_k \sum_{j=0}^{M-1} R_{k,j} = MR, \qquad R_{k,j} \ge 0, \quad k = 0, \ldots, K-1, \quad j = 0, \ldots, M-1, \tag{13}$$

is given by

$$R_{k,j} = R_{k,j}(\theta) = \begin{cases} h\!\left(\dfrac{\theta}{\sigma_{k,j}^2}\, Q'(0)\right), & \text{if } 0 < \theta < \sigma_{k,j}^2, \\[2mm] 0, & \text{if } \theta \ge \sigma_{k,j}^2, \end{cases} \tag{14}$$

where $\theta$ is the unique root of the equation

$$\sum_{k,j:\ \sigma_{k,j}^2 \ge \theta} w_k\, h\!\left(\frac{\theta}{\sigma_{k,j}^2}\, Q'(0)\right) = MR, \tag{15}$$

and the value of the minimum distortion is

$$D(\theta) = \sum_{k,j:\ \sigma_{k,j}^2 \ge \theta} w_k\, \sigma_{k,j}^2\, Q(R_{k,j}) + \sum_{k,j:\ \sigma_{k,j}^2 < \theta} w_k\, \sigma_{k,j}^2. \tag{16}$$

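Proof. This allocation rule follows from [5, Proposition 2.1]. Let us consider the joint quantization of a large number ($N$) of transform coefficient blocks. The vector is now of larger size, and we apply the classical bit allocation to this extended vector. The problem is then to minimize the total distortion

$$D_t = \sum_{n=0}^{N-1}\sum_{j=0}^{M-1} \sigma_{n,j}^2\, Q(R_{n,j}), \tag{17}$$

subject to the constraints

$$\sum_{n=0}^{N-1}\sum_{j=0}^{M-1} R_{n,j} = NMR, \qquad R_{n,j} \ge 0. \tag{18}$$

According to [5, Proposition 2.1], the solution to this problem is

$$R_{n,j} = R_{n,j}(\theta) = \begin{cases} h\!\left(\dfrac{\theta}{\sigma_{n,j}^2}\, Q'(0)\right), & \text{if } 0 < \theta < \sigma_{n,j}^2, \\[2mm] 0, & \text{if } \theta \ge \sigma_{n,j}^2, \end{cases} \tag{19}$$

where $\theta$ is the unique root of the equation

$$\sum_{n,j:\ \sigma_{n,j}^2 \ge \theta} h\!\left(\frac{\theta}{\sigma_{n,j}^2}\, Q'(0)\right) = NMR, \tag{20}$$

and the value of the minimum overall distortion is

$$D_t(\theta) = \sum_{n,j:\ \sigma_{n,j}^2 \ge \theta} \sigma_{n,j}^2\, Q(R_{n,j}) + \sum_{n,j:\ \sigma_{n,j}^2 < \theta} \sigma_{n,j}^2. \tag{21}$$

Recall that there are only $K$ possible classes. Let the blocks from the same class be grouped together, and denote the number of blocks belonging to class $k$ by $N_k$, $k = 0, \ldots, K-1$. Then (20) and (21) become

$$\sum_{k,j:\ \sigma_{k,j}^2 \ge \theta} \frac{N_k}{N}\, h\!\left(\frac{\theta}{\sigma_{k,j}^2}\, Q'(0)\right) = MR, \qquad \frac{D_t(\theta)}{N} = \sum_{k,j:\ \sigma_{k,j}^2 \ge \theta} \frac{N_k}{N}\, \sigma_{k,j}^2\, Q(R_{k,j}) + \sum_{k,j:\ \sigma_{k,j}^2 < \theta} \frac{N_k}{N}\, \sigma_{k,j}^2. \tag{22}$$

For an ensemble of realizations, $N_k/N \rightarrow w_k$, the probability of a transformed block belonging to class $k$. The bit allocation (19) applies here as well. This concludes the proof of the above statement.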

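Remark on the rate-distortion bound for classification-based source-dependent quantization. The rate-distortion bound for a unit-variance source is given by the formula $R(D) = (1/2)\log_2(1/D)$, $D \le 1$, with $Q(R) = 2^{-2R}$, $Q'(R) = -2 \cdot 2^{-2R}\ln 2$, and $h(x) = (1/2)\log_2[-2\ln 2/x]$, $x < 0$ [5]. For classification-based source-dependent quantization, the bit allocation rule described above gives, for $k = 0, \ldots, K-1$ and $j = 0, \ldots, M-1$,

$$R_{k,j} = \max\!\left(0,\ \frac{1}{2}\log_2 \frac{\sigma_{k,j}^2}{\theta}\right), \tag{23}$$

where $\theta$ is obtained from

$$\sum_{k,j} w_k \max\!\left(0,\ \frac{1}{2}\log_2 \frac{\sigma_{k,j}^2}{\theta}\right) = MR, \tag{24}$$

and the minimum distortion is

$$D = \sum_{k,j:\ \theta \le \sigma_{k,j}^2} w_k\, \theta + \sum_{k,j:\ \theta > \sigma_{k,j}^2} w_k\, \sigma_{k,j}^2 = \sum_{k,j} w_k \min\!\left(\theta,\ \sigma_{k,j}^2\right). \tag{25}$$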
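
Equation (24) has no closed-form solution for $\theta$ in general, but its left-hand side decreases monotonically in $\theta$, so a simple bisection suffices. The sketch below is one possible way to evaluate (23) to (25); the bisection strategy, the function names, and the example values are assumptions made for illustration and are not prescribed by the paper.

```python
import math

def total_rate(theta, class_variances, class_probs):
    """Left-hand side of (24) for a given theta."""
    return sum(w * max(0.0, 0.5 * math.log2(s / theta))
               for w, v in zip(class_probs, class_variances) for s in v)

def solve_theta(class_variances, class_probs, avg_rate, iters=200):
    """Find theta satisfying (24) by bisection on a geometric scale."""
    target = len(class_variances[0]) * avg_rate   # M * R
    hi = max(max(v) for v in class_variances)     # total rate is 0 here
    lo = hi
    while total_rate(lo, class_variances, class_probs) < target:
        lo /= 2.0                                 # expand bracket downward
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if total_rate(mid, class_variances, class_probs) > target:
            lo = mid                              # theta too small
        else:
            hi = mid
    return math.sqrt(lo * hi)

def allocate(class_variances, class_probs, avg_rate):
    """Return theta, the rates (23), and the per-block distortion (25)."""
    theta = solve_theta(class_variances, class_probs, avg_rate)
    rates = [[max(0.0, 0.5 * math.log2(s / theta)) for s in v]
             for v in class_variances]
    distortion = sum(w * min(theta, s)
                     for w, v in zip(class_probs, class_variances) for s in v)
    return theta, rates, distortion

# Hypothetical low-rate example: K = 2 classes, M = 2, R = 0.5 bit/sample
class_variances = [[16.0, 1.0], [4.0, 0.25]]
class_probs = [0.5, 0.5]
theta, rates, distortion = allocate(class_variances, class_probs, 0.5)
print(theta, rates, distortion)
```

In this example the two low-variance coefficients fall below $\theta$ and receive zero bits, whereas the log-variance rule of Section 2 would assign them negative rates at the same average rate.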

The side information is bounded by the entropy, and the minimum rate for its lossless compression is $R_{\text{side}} = -\sum_{k} w_k \log_2 w_k$, which is counted per transformed block. The overall rate (per sample) is counted as

$$R = \frac{1}{M} R_{\text{side}} + \frac{1}{M}\sum_{k=0}^{K-1}\sum_{j=0}^{M-1} w_k R_{k,j}. \tag{26}$$
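
The rate accounting in (26) is straightforward to compute. The following sketch uses hypothetical function names and a small two-class example chosen so that the numbers are easy to check by hand.

```python
import math

def side_information_rate(class_probs):
    """Entropy of the class index in bits per block."""
    return -sum(w * math.log2(w) for w in class_probs if w > 0)

def overall_rate(class_probs, coeff_rates):
    """Overall rate per sample, equation (26)."""
    m = len(coeff_rates[0])                       # block size M
    r_side = side_information_rate(class_probs)
    r_coeff = sum(w * sum(r) for w, r in zip(class_probs, coeff_rates))
    return (r_side + r_coeff) / m

# Hypothetical example: two equiprobable classes, M = 2
class_probs = [0.5, 0.5]
coeff_rates = [[1.5, 0.0], [0.5, 0.0]]
r_total = overall_rate(class_probs, coeff_rates)
print(side_information_rate(class_probs), r_total)
```

Here the class index costs 1 bit per block, which equals the average number of coefficient bits per block. This shows why the side information can dominate at very low rates.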

Figure 1 presents a simulation example. The input signal is assumed to consist of blocks of 16 samples from four stationary, unit-variance, zero-mean, first-order Markov processes with covariance function $r(n) = \rho^{|n|}$. Blocks of data are transformed by the discrete cosine transform (DCT). The parameters ($\rho_k$) and probabilities ($w_k$) for these processes are (0.95, 0.7), (0.75, 0.1), (0.65, 0.1), and (0.55, 0.1). For the rate-distortion bound estimation, it is assumed that an ideal classification is performed, and a comparison is made with conventional bit allocation, which is derived under a single-process assumption. The rate-distortion figure demonstrates that, at equal quantization distortion, the bit rate bound achieved with the source classification method is lower for most rates. When the bit rate approaches zero, the rate-distortion characteristics are occasionally worse than with conventional quantization because of the side information. The reason for this effect is that some of the classes are not assigned any bits due to coarse quantization.
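
A rough sketch of how such a comparison could be set up is given below. It is not the code used to produce Figure 1. The orthonormal DCT-II, the modeling of the conventional coder by pooled (mixture) variances, the operating point of 1 bit per sample, and all function names are assumptions made here for illustration.

```python
import math

def dct_variances(rho, m=16):
    """Variances of orthonormal DCT-II coefficients of a unit-variance
    first-order Markov block with covariance r(n) = rho**|n|."""
    out = []
    for j in range(m):
        scale = math.sqrt((1.0 if j == 0 else 2.0) / m)
        basis = [scale * math.cos(math.pi * (2 * n + 1) * j / (2 * m))
                 for n in range(m)]
        out.append(sum(basis[a] * basis[b] * rho ** abs(a - b)
                       for a in range(m) for b in range(m)))
    return out

def water_fill(variances, weights, target):
    """Solve sum_i w_i*max(0, 0.5*log2(var_i/theta)) = target by bisection
    and return (theta, sum_i w_i*min(theta, var_i)), as in (24)-(25)."""
    def rate(theta):
        return sum(w * max(0.0, 0.5 * math.log2(v / theta))
                   for v, w in zip(variances, weights))
    hi = max(variances)
    lo = hi
    while rate(lo) < target:
        lo /= 2.0
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if rate(mid) > target:
            lo = mid
        else:
            hi = mid
    theta = math.sqrt(lo * hi)
    return theta, sum(w * min(theta, v) for v, w in zip(variances, weights))

M = 16
rhos = [0.95, 0.75, 0.65, 0.55]     # parameters quoted in the text
probs = [0.7, 0.1, 0.1, 0.1]
class_var = [dct_variances(r, M) for r in rhos]
coeff_rate = 1.0                    # bits/sample for coefficients (assumed)

# Classified coder: every (k, j) pair is weighted by w_k
flat_var = [v for row in class_var for v in row]
flat_w = [w for w, row in zip(probs, class_var) for _ in row]
_, d_class = water_fill(flat_var, flat_w, M * coeff_rate)

# Conventional coder (assumption): one allocation from pooled variances
pooled = [sum(w * row[j] for w, row in zip(probs, class_var)) for j in range(M)]
_, d_conv = water_fill(pooled, [1.0] * M, M * coeff_rate)

# Side information in bits per sample, first term of (26)
side_rate = -sum(w * math.log2(w) for w in probs) / M
print(d_class / M, d_conv / M, side_rate)
```

At equal coefficient rate, the classified distortion cannot exceed the pooled one, because the pooled allocation is a feasible point of the classified problem. The classified coder additionally pays `side_rate` bits per sample, which is what can make it worse near zero rate.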

4. Conclusion

This paper presents an optimal bit allocation technique for compression methods using the classification of vector sources. First, we generalized the well-known log-variance rule using the exponential quantizer function. The quantizer function is often approximated more accurately with other functions and the log-variance rule may produce negative bit quotas at low bit rates. For this reason, the solution for a more general quantizer function and low bit rates is also presented. The rate-distortion bound is calculated for the model of source-dependent quantization, and it illustrates improvement in coding performance.

Acknowledgment

The author would like to thank the reviewers for their useful comments and corrections.