Research Letters in Signal Processing
Volume 2008 (2008), Article ID 421650, 5 pages
doi:10.1155/2008/421650
Research Letter

On Optimal Bit Allocation for Classification-Based Source-Dependent Transform Coding

Department of Electrical and Computer Engineering, The University of Texas at San Antonio, San Antonio, TX 78249, USA

Received 27 December 2007; Accepted 3 March 2008

Academic Editor: Jarmo Takala

Copyright © 2008 David Akopian. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

An optimal bit allocation is presented for classification-based source-dependent transform coding. A vector of transform coefficients is considered to have been produced by a mixture of processes. The available bit resource is distributed optimally in two stages: (1) bit allocation is performed for each class of coefficient vectors, and (2) bit allocation is performed for each vector coefficient. A solution for low bit rates, which imposes a nonnegativity constraint on the allocated bits, is also presented. The rate-distortion bound of the classification-based source coding is derived.

1. Introduction

Source coding techniques are used to reduce data in digital systems to meet realistic transmission and storage constraints. In many such systems, data is reduced using quantizers and bit allocation methods which encode quantizer outputs for further processing. Quantizers apply to original or transformed data and their outputs can be compactly represented for communication or storage. Popular compressed image formats such as JPEG [1] and JPEG2000 [2] employ such techniques to reduce data sizes.

In particular, in the so-called lossy transform/subband coding techniques [1–20], original signal data are transformed to a vector source, that is, a set of coefficients. Each coefficient is then quantized to a symbol, which is encoded using a certain number of bits. Coarser quantization outputs can be encoded with fewer bits, but the reconstruction error (distortion) is higher.

The bit allocation problem for the vector source has been widely addressed in the literature, and many different strategies have been proposed. Reference [9] contains an extensive overview of various bit allocation approaches in the context of transform/subband coding methods.

The early transform coders were based on i.i.d models of transform coefficients and bit allocation was performed in proportion to their importance as defined by the variance of their distributions [4, 5, 8]. In [4], all of the quantizers are described by the same exponential quantizer function (QF). The method may result in negative bit rates, which may be corrected using an iterative procedure. In [5], the QF is the same for all of the quantizers and is strictly convex, although it is not assumed to be exponential. In addition, a constraint on the nonnegativity of bit rates is imposed. Both methods are described by closed-form solutions. More general approaches have been suggested in [6, 7] with an optimal bit allocation strategy found for an arbitrary set of quantizers and integer bit allocation constraint, but the solutions are not closed-form.

The performance of these coders was later improved by more accurate modeling of coefficients. One approach simply assumes that the vector source may come from a finite set of several possible processes with different distributions [10–19]. The coder decides to which distribution the coefficients belong using a classification process and encodes them accordingly. The classification process may include additional transformations [10–14] and may be optimized for signals of interest, not just typical ones. The classification approach in [10] doubles the compression ratio at a fixed distortion, while up to 2 dB improvement in reconstruction quality is reported for another approach [13].

Practical still image compression techniques [1–3] rely on simple quantizers, but finely allocate bit resources adaptively for each transformed image fragment in an "input-by-input" manner [9], as opposed to the classification approach described above. Each image fragment is coded in a deterministic way with coder parameters optimized for that particular input and not for the ensemble of inputs. The input-by-input, or realization-adaptive, approach is not well explained by classical rate-distortion theory, as many practical encoders operate in the low bit rate (high distortion) region.

Even though the realization-adaptive approach is in general superior for still image compression, extensions of classical bit allocation and classification techniques have been successfully applied in various applications [15–20].

In this paper, a bit allocation technique is presented for classification-based methods. In Section 2, we systematize the presentation of our results in [13, 14], which were later rederived in [17, 18] in another context. Then, in Section 3, we extend this solution by imposing the nonnegativity constraint on bit rates for low-rate coding and estimate the rate-distortion bound for the classification approach.

2. Bit Allocation for Classification-Based Method

In the remainder of the paper, we operate on blocks (vectors) of data to be quantized. Our purpose here is to present results for transform coders where quantization is applied to the transformed coefficients. However, the method could be applied to other vector sources as well.

The classical bit allocation techniques assume the same number of bits allocated to different blocks, and they address optimal bit resource allocation between coefficients. In our model, the blocks of data may come from different sources, and the goal is to optimally distribute the bits among the blocks and block coefficients [13, 14].

Let $R$ be the average number of bits per sample. If $R_i$ ($i = 0, \ldots, M-1$) bits are allocated to the $i$th coefficient, then the quantization error at the output of an optimal quantizer can be modeled as

$$\sigma_{q_i}^2 = \epsilon^2 \, 2^{-2R_i} \, \sigma_i^2, \tag{1}$$

where $\epsilon$ is a coefficient that depends on the pdf of the input signal and $\sigma_i^2$ is the variance of the input to quantizer $i$. By definition, $\sum_{i=0}^{M-1} R_i = MR$. The following classical bit allocation minimizes the overall distortion:

$$R_i = R + \frac{1}{2}\log_2 \frac{\sigma_i^2}{\left[\prod_{j=0}^{M-1}\sigma_j^2\right]^{1/M}}. \tag{2}$$
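As an illustration of rule (2), the following minimal Python sketch computes the classical log-variance allocation. The function name `classical_allocation` and the example variances are assumptions made for this sketch and do not come from the paper.

```python
import math

def classical_allocation(variances, avg_rate):
    """Log-variance rule (2): R_i = R + 0.5*log2(sigma_i^2 / geometric mean)."""
    m = len(variances)
    # The mean of the log2 variances equals log2 of their geometric mean.
    log_gm = sum(math.log2(v) for v in variances) / m
    return [avg_rate + 0.5 * (math.log2(v) - log_gm) for v in variances]

# Illustrative variances (hypothetical values, not taken from the paper).
variances = [16.0, 4.0, 1.0, 0.25]
rates = classical_allocation(variances, avg_rate=2.0)

# Per-coefficient distortion from model (1) with epsilon^2 = 1.
distortions = [2.0 ** (-2.0 * r) * v for r, v in zip(rates, variances)]
```

For these values the rates are 3.5, 2.5, 1.5, and 0.5 bits, which sum to $MR = 8$, and model (1) gives the same distortion, 0.125, for every coefficient. For a smaller average rate the same rule would return negative rates for the low-variance coefficients, which is the situation addressed in Section 3.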

As before, $R$ is the average number of bits per sample. Let $R_k$ be the average number of bits per sample for a vector (block) belonging to class $k$, $k \in \{0, \ldots, K-1\}$, and let $R_{k,i}$ be the number of bits assigned to coefficient $i$ of a vector from class $k$. Let $w_k$ be the probability that a vector (block) belongs to class $k$. These probabilities can be estimated according to the block classification approach used, for example, from a training signal; no constraints are imposed on the classification approach.

The bit allocation problem is solved in two stages. In the first stage, the average bit resource $R$ is distributed among the classes so that $\sum_{k=0}^{K-1} w_k R_k = R$. In the second stage, for the blocks assigned to each class $k$, the optimal bit allocation among the quantized coefficients is found for the overall bit resource $R_k$.

For the second stage, the bit allocation is simply the classical solution:

$$R_{k,i} = R_k + \frac{1}{2}\log_2 \frac{\sigma_{k,i}^2}{\left[\prod_{j=0}^{M-1}\sigma_{k,j}^2\right]^{1/M}} \tag{3}$$

with the resulting distortion

$$D_k = \epsilon^2 \, 2^{-2R_k} \left[\prod_{j=0}^{M-1}\sigma_{k,j}^2\right]^{1/M} = A_k \, 2^{-2R_k}, \tag{4}$$

where

$$A_k = \epsilon^2 \left[\prod_{j=0}^{M-1}\sigma_{k,j}^2\right]^{1/M}. \tag{5}$$

The average distortion over all classes is

$$D = \sum_{k=0}^{K-1} w_k D_k. \tag{6}$$

An optimal bit allocation for the first stage can now be formulated as

$$\min_{R_0, \ldots, R_{K-1}} D \quad \text{under the constraint} \quad \sum_{k=0}^{K-1} w_k R_k = R. \tag{7}$$

Observe that the resulting distortion for the whole block (4) is similar to the distortion function at the output of a single quantizer (1), with the difference that $R_k$ represents the average number of bits per sample for blocks from class $k$. One can expect that the optimization problem will result in the classical log-variance rule, and this is indeed the case. We use the method of Lagrange multipliers to solve this problem [14]:

$$\frac{\partial}{\partial R_k}\left[D - \lambda\left(R - \sum_{l=0}^{K-1} w_l R_l\right)\right] = 0, \quad k = 0, 1, \ldots, K-1,$$

$$-2\ln 2 \, A_k \, 2^{-2R_k} w_k + \lambda w_k = 0, \quad k = 0, 1, \ldots, K-1, \tag{8}$$

from which we find expressions for $R_k$:

$$R_k = -\frac{1}{2}\log_2 \frac{\lambda}{2\ln 2} + \frac{1}{2}\log_2 A_k = \Lambda + \frac{1}{2}\log_2 A_k, \quad k = 0, 1, \ldots, K-1. \tag{9}$$

From the constraint $\sum_{l=0}^{K-1} w_l R_l = R$ and the condition $\sum_{l=0}^{K-1} w_l = 1$, one can obtain

$$\Lambda = R - \sum_{l=0}^{K-1} \frac{w_l}{2}\log_2 A_l, \qquad R_k = R - \sum_{l=0}^{K-1} \frac{w_l}{2}\log_2 A_l + \frac{1}{2}\log_2 A_k, \tag{10}$$

and finally

$$R_k = R + \frac{1}{2}\log_2 \frac{A_k}{\prod_{l=0}^{K-1} A_l^{w_l}}. \tag{11}$$
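The two-stage rule given by (11) and (3) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `two_stage_allocation`, the parameter `eps2` (standing for $\epsilon^2$), and the example variances and class probabilities are assumptions, not values from the paper.

```python
import math

def two_stage_allocation(class_variances, weights, avg_rate, eps2=1.0):
    """Return (class_rates, coeff_rates) following rules (11) and (3)."""
    # log2 of A_k from (5): eps^2 times the geometric mean of the class variances.
    log_a = [math.log2(eps2) + sum(math.log2(v) for v in vs) / len(vs)
             for vs in class_variances]
    # Stage 1, rule (11): compare A_k with the weighted geometric mean of all A_l.
    log_wgm = sum(w * la for w, la in zip(weights, log_a))
    class_rates = [avg_rate + 0.5 * (la - log_wgm) for la in log_a]
    # Stage 2, rule (3): classical log-variance rule inside each class.
    coeff_rates = []
    for vs, rk in zip(class_variances, class_rates):
        log_gm = sum(math.log2(v) for v in vs) / len(vs)
        coeff_rates.append([rk + 0.5 * (math.log2(v) - log_gm) for v in vs])
    return class_rates, coeff_rates

# Two hypothetical classes with two coefficients each (illustrative values).
class_variances = [[16.0, 1.0], [4.0, 1.0]]
weights = [0.5, 0.5]
class_rates, coeff_rates = two_stage_allocation(class_variances, weights, avg_rate=2.0)
```

For this toy input the class rates are 2.25 and 1.75 bits per sample, whose weighted average equals $R = 2$, and the coefficient rates are (3.25, 1.25) and (2.25, 1.25). Note that $\epsilon^2$ cancels in (11), so it does not affect the allocation.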

If the quantization is performed in the transform domain and an inverse orthogonal transform is used to reconstruct the data, then the average distortion at the output is equal to the distortion produced by the quantizers in the transform domain, and the bit allocation rule obtained above applies. In certain cases, the distortion of the reconstructed source is not equal to the distortion introduced by the quantizers. This is the case, for example, when additional weighting is applied prior to quantization to account for the human visual system, or when the inverse transformation/filterbank is not orthogonal. In such scenarios, the variances $\sigma_{k,i}^2$ are multiplied by weighting factors $\gamma_{k,i}$, that is, in our solutions $\sigma_{k,i}^2 \to \gamma_{k,i}\,\sigma_{k,i}^2$. More details for filterbanks can be found in our earlier work [13].

3. Bit Allocation for a General Quantization Function and Low Bit Rates

In this section, we generalize the results from [5] to the classification-based approach to account for a general quantizer function and low bit rates, for which the result of Section 2 may produce negative bit allocations. Assume that $Q(R)$ is a quantizer function defining the average distortion at the quantizer output as a function of the allocated bits, defined for a unit-variance input. The distortion that results from quantizing an input with variance $\sigma^2$ is $\sigma^2 Q(R)$. Let $Q(R)$ be strictly convex with a continuous first derivative $Q'(R)$, $Q'(\infty) = 0$, and let $h(\cdot)$ be the inverse function of $Q'(\cdot)$. Let $R$ denote the average rate per sample and $R_{k,j}$ the number of bits assigned to the $j$th component of a quantized block belonging to class $k$. Similarly, $\sigma_{k,j}^2$ is the variance of component $j$ from class $k$. Then the allocation of bits that minimizes the average total distortion per block,

$$D = \sum_{k=0}^{K-1}\sum_{j=0}^{M-1} w_k \, \sigma_{k,j}^2 \, Q(R_{k,j}), \tag{12}$$

subject to the constraints

$$\sum_{k=0}^{K-1} w_k \sum_{j=0}^{M-1} R_{k,j} = MR, \qquad R_{k,j} \ge 0, \quad k = 0, \ldots, K-1, \ j = 0, \ldots, M-1, \tag{13}$$

is given by

$$R_{k,j} = R_{k,j}^{*} = \begin{cases} h\!\left(\dfrac{\theta}{\sigma_{k,j}^2}\, Q'(0)\right), & \text{if } 0 < \theta < \sigma_{k,j}^2, \\[2mm] 0, & \text{if } \theta \ge \sigma_{k,j}^2, \end{cases} \tag{14}$$

where $\theta$ is the unique root of the equation

$$\sum_{k,j:\ \sigma_{k,j}^2 \ge \theta} w_k \, h\!\left(\frac{\theta}{\sigma_{k,j}^2}\, Q'(0)\right) = MR, \tag{15}$$

and the value of the minimum distortion is

$$D_\theta = \sum_{k,j:\ \sigma_{k,j}^2 \ge \theta} w_k \, \sigma_{k,j}^2 \, Q\!\left(R_{k,j}^{*}\right) + \sum_{k,j:\ \sigma_{k,j}^2 < \theta} w_k \, \sigma_{k,j}^2. \tag{16}$$

The second sum in (16) collects the components that receive no bits; each contributes its full variance, which corresponds to $Q(0) = 1$.

Proof. This allocation rule follows from [5, Proposition 2.1]. Consider the joint quantization of a large number ($N$) of transform coefficient blocks. The blocks form one extended vector, to which the classical bit allocation is applied. The problem is then to minimize the total distortion

$$D_t = \sum_{n=0}^{N-1}\sum_{j=0}^{M-1} \sigma_{n,j}^2 \, Q(R_{n,j}), \tag{17}$$

subject to the constraints

$$\sum_{n=0}^{N-1}\sum_{j=0}^{M-1} R_{n,j} = NMR, \qquad R_{n,j} \ge 0. \tag{18}$$

According to [5, Proposition 2.1], the solution to this problem is

$$R_{n,j} = R_{n,j}^{*} = \begin{cases} h\!\left(\dfrac{\theta}{\sigma_{n,j}^2}\, Q'(0)\right), & \text{if } 0 < \theta < \sigma_{n,j}^2, \\[2mm] 0, & \text{if } \theta \ge \sigma_{n,j}^2, \end{cases} \tag{19}$$

where $\theta$ is the unique root of the equation

$$\sum_{n,j:\ \sigma_{n,j}^2 \ge \theta} h\!\left(\frac{\theta}{\sigma_{n,j}^2}\, Q'(0)\right) = NMR, \tag{20}$$

and the value of the minimum overall distortion is

$$D_{t\theta} = \sum_{n,j:\ \sigma_{n,j}^2 \ge \theta} \sigma_{n,j}^2 \, Q\!\left(R_{n,j}^{*}\right) + \sum_{n,j:\ \sigma_{n,j}^2 < \theta} \sigma_{n,j}^2. \tag{21}$$

Recall that there are only $K$ possible classes. Let the blocks from the same class be grouped together, and denote the number of blocks belonging to class $k$ by $N_k$, $k = 0, \ldots, K-1$. Then, after division by $N$, (20) and (21) become

$$\sum_{k,j:\ \sigma_{k,j}^2 \ge \theta} \frac{N_k}{N}\, h\!\left(\frac{\theta}{\sigma_{k,j}^2}\, Q'(0)\right) = MR, \qquad \frac{D_{t\theta}}{N} = \sum_{k,j:\ \sigma_{k,j}^2 \ge \theta} \frac{N_k}{N}\, \sigma_{k,j}^2 \, Q\!\left(R_{k,j}^{*}\right) + \sum_{k,j:\ \sigma_{k,j}^2 < \theta} \frac{N_k}{N}\, \sigma_{k,j}^2. \tag{22}$$

For an ensemble of realizations, $N_k/N \to w_k$, the probability that a transformed block belongs to class $k$. The bit allocation (19) applies here as well. This concludes the proof of the above statement.

Remark on the rate-distortion bound for classification-based source-dependent quantization. The rate-distortion bound for a unit-variance source is given by the formula $R(D) = (1/2)\log_2(1/D)$, $D \le 1$, with $Q(R) = 2^{-2R}$, $Q'(R) = -2^{-2R}\, 2\ln 2$, and $h(x) = (1/2)\log_2[-2\ln 2 / x]$, $x < 0$ [5]. For classification-based source-dependent quantization, the bit allocation rule described above becomes, for $k = 0, \ldots, K-1$, $j = 0, \ldots, M-1$,

$$R_{k,j} = \max\!\left(0, \ \frac{1}{2}\log_2 \frac{\sigma_{k,j}^2}{\theta}\right), \tag{23}$$

where $\theta$ is obtained from

$$\sum_{k,j} w_k \max\!\left(0, \ \frac{1}{2}\log_2 \frac{\sigma_{k,j}^2}{\theta}\right) = MR, \tag{24}$$

and the minimum distortion is

$$D = \sum_{k,j:\ \theta \le \sigma_{k,j}^2} w_k \, \theta + \sum_{k,j:\ \theta > \sigma_{k,j}^2} w_k \, \sigma_{k,j}^2 = \sum_{k,j} w_k \min\!\left(\theta, \ \sigma_{k,j}^2\right). \tag{25}$$
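Equation (24) has no closed-form solution for $\theta$ in general, but its left-hand side decreases monotonically in $\theta$, so a simple bisection suffices. The following Python sketch is one possible way to evaluate (23)–(25); the bisection scheme, the function names, and the example values are assumptions made for illustration and are not the procedure used in the paper.

```python
import math

def rate_at(theta, class_variances, weights):
    """Left-hand side of (24) for a given theta."""
    return sum(w * max(0.0, 0.5 * math.log2(v / theta))
               for vs, w in zip(class_variances, weights) for v in vs)

def solve_theta(class_variances, weights, total_rate, iters=100):
    """Bisection (in the log domain) for theta such that rate_at == total_rate."""
    hi = max(v for vs in class_variances for v in vs)  # rate is zero here
    lo = hi
    while rate_at(lo, class_variances, weights) < total_rate:
        lo /= 2.0                                      # expand until bracketed
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if rate_at(mid, class_variances, weights) > total_rate:
            lo = mid
        else:
            hi = mid
    return hi

def allocate(class_variances, weights, total_rate):
    """Return theta, the rates of (23), and the distortion of (25)."""
    theta = solve_theta(class_variances, weights, total_rate)
    rates = [[max(0.0, 0.5 * math.log2(v / theta)) for v in vs]
             for vs in class_variances]
    distortion = sum(w * min(theta, v)
                     for vs, w in zip(class_variances, weights) for v in vs)
    return theta, rates, distortion

# Hypothetical example: K = 2 classes, M = 2 coefficients, MR = 0.5 bits per block.
class_variances = [[4.0, 1.0], [1.0, 0.25]]
weights = [0.5, 0.5]
theta, rates, distortion = allocate(class_variances, weights, total_rate=0.5)
```

In this toy example $\theta \approx 1$, so only the coefficient with variance 4 receives bits (about 1 bit), the second class receives none, and the distortion (25) is about 1.625 per block. All rates are nonnegative by construction.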

The side information that identifies the class of each block is bounded from below by the entropy of the class index, so the minimum rate for its lossless coding is $R_{\text{side}} = -\sum_{k=0}^{K-1} w_k \log_2 w_k$, counted per transformed block. The overall rate (per sample) is then

$$R = \frac{1}{M} R_{\text{side}} + \frac{1}{M}\sum_{k=0}^{K-1}\sum_{j=0}^{M-1} w_k R_{k,j}. \tag{26}$$
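The rate accounting in (26) can be sketched as follows. The helper names `side_rate` and `overall_rate` are hypothetical; the class probabilities are those of the simulation example in this letter, and the block size $M = 16$ is taken from the same example.

```python
import math

def side_rate(weights):
    """Entropy of the class index in bits per block (R_side)."""
    return -sum(w * math.log2(w) for w in weights if w > 0.0)

def overall_rate(weights, coeff_rates):
    """Overall rate per sample from (26); coeff_rates[k][j] holds R_{k,j}."""
    m = len(coeff_rates[0])
    coeff_bits = sum(w * sum(rs) for w, rs in zip(weights, coeff_rates))
    return (side_rate(weights) + coeff_bits) / m

weights = [0.7, 0.1, 0.1, 0.1]
r_side = side_rate(weights)      # bits per block
r_side_per_sample = r_side / 16  # blocks of M = 16 samples
```

With these probabilities the side information costs about 1.357 bits per block, or roughly 0.085 bit per sample for 16-sample blocks. This fixed overhead is small at moderate rates but becomes noticeable as the coefficient rate approaches zero.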

Figure 1 presents a simulation example. The input signal is assumed to consist of blocks of 16 samples from four stationary, unit-variance, zero-mean, first-order Markov processes with covariance function $r(n) = \rho^{|n|}$. Blocks of data are transformed by the discrete cosine transform (DCT). The parameters $\rho_k$ and probabilities $w_k$ for these processes are $(\rho_k, w_k)$ = (0.95, 0.7), (0.75, 0.1), (0.65, 0.1), and (0.55, 0.1). For the rate-distortion bound estimation, it is assumed that an ideal classification is performed, and a comparison is made with conventional bit allocation, which is derived under a single-process assumption. The rate-distortion figure demonstrates that, at equal quantization distortion, the bit rate bound achieved with the source classification method is lower for most of the rates. When the bit rate approaches zero, the rate-distortion characteristics are occasionally worse than with conventional quantization because of the side information. The reason for this effect is that some of the classes are not assigned any bits due to coarse quantization.

Figure 1: Example of rate-distortion bound improvement for source-dependent quantization. Data are assumed to come from 4 different sources and are transformed by the discrete cosine transform (DCT). Ideal classification is assumed after the DCT and prior to quantization. Class probabilities $w_k$ are 0.7, 0.1, 0.1, and 0.1. Original sources are stationary, unit-variance, zero-mean, first-order Markov processes with $\rho_k$ from the set {0.95, 0.75, 0.65, 0.55}.
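The coefficient variances $\sigma_{k,j}^2$ needed to set up such an experiment can be computed directly from the covariance model. The sketch below assumes an orthonormal DCT-II (the letter only says "DCT") and uses hypothetical function names; it reproduces the inputs of the example, not the curves of Figure 1.

```python
import math

def dct_matrix(m):
    """Orthonormal DCT-II matrix (assumed DCT type)."""
    rows = []
    for k in range(m):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / m)
        rows.append([scale * math.cos(math.pi * (2 * n + 1) * k / (2 * m))
                     for n in range(m)])
    return rows

def dct_variances(rho, m=16):
    """Variance of each DCT coefficient for the covariance r(n) = rho**|n|."""
    c = dct_matrix(m)
    out = []
    for k in range(m):
        row = c[k]
        # k-th diagonal entry of C * Cov * C^T.
        out.append(sum(row[a] * rho ** abs(a - b) * row[b]
                       for a in range(m) for b in range(m)))
    return out

# (rho_k, w_k) pairs from the simulation example.
params = [(0.95, 0.7), (0.75, 0.1), (0.65, 0.1), (0.55, 0.1)]
class_variances = [dct_variances(rho) for rho, _ in params]
weights = [w for _, w in params]
```

Because the transform is orthonormal and the sources have unit variance, the 16 variances of each class sum to 16; the more strongly correlated the source, the more of that energy is concentrated in the low-order coefficients.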

4. Conclusion

This paper presents an optimal bit allocation technique for compression methods using the classification of vector sources. First, we generalized the well-known log-variance rule using the exponential quantizer function. The quantizer function is often approximated more accurately with other functions and the log-variance rule may produce negative bit quotas at low bit rates. For this reason, the solution for a more general quantizer function and low bit rates is also presented. The rate-distortion bound is calculated for the model of source-dependent quantization, and it illustrates improvement in coding performance.

Acknowledgment

The author would like to thank the reviewers for their useful comments and corrections.

References

  1. G. K. Wallace, “The JPEG still picture compression standard,” Communications of the ACM, vol. 34, no. 4, p. 30, 1991.
  2. D. Taubman, “High performance scalable image compression with EBCOT,” in Proceedings of the IEEE International Conference on Image Processing (ICIP '99), vol. 3, p. 344, Kobe, Japan, October 1999.
  3. A. Said and W. A. Pearlman, “A new, fast, and efficient image codec based on set partitioning in hierarchical trees,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 6, no. 3, p. 243, 1996.
  4. J. Huang and P. Schultheiss, “Block quantization of correlated Gaussian random variables,” IEEE Transactions on Communications, vol. 11, no. 3, p. 289, 1963.
  5. A. Segall, “Bit allocation and encoding for vector sources,” IEEE Transactions on Information Theory, vol. 22, no. 2, p. 162, 1976.
  6. Y. Shoham and A. Gersho, “Efficient bit allocation for an arbitrary set of quantizers,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 36, no. 9, p. 1445, 1988.
  7. H. Xie and A. Ortega, “Entropy- and complexity-constrained classified quantizer design for distributed image classification,” in Proceedings of the IEEE Workshop on Multimedia Signal Processing (MMSP '02), p. 77, St. Thomas, Virgin Islands, USA, December 2002.
  8. P. H. Westerink, J. Biemond, and D. E. Boekee, “An optimal bit allocation algorithm for sub-band coding,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '88), vol. 2, p. 757, New York, NY, USA, April 1988.
  9. A. Ortega and K. Ramchandran, “Rate-distortion methods for image and video compression,” IEEE Signal Processing Magazine, vol. 15, no. 6, p. 23, 1998.
  10. R. D. Dony and S. Haykin, “Neural network approaches to image compression,” Proceedings of the IEEE, vol. 83, no. 2, p. 288, 1995.
  11. M. Effros, “Optimal modeling for complex system design,” IEEE Signal Processing Magazine, vol. 15, no. 6, p. 51, 1998.
  12. M. Helsingius, S. Atourian, D. Akopian, and J. Astola, “Multibase transform coding of images,” in Proceedings of the IEEE Nordic Signal Processing Symposium (NORSIG '96), p. 255, Helsinki, Finland, September 1996.
  13. D. Akopian, M. Helsingius, and J. Astola, “Multibase/wavelet transform coding of still images without blocking artifacts,” in Proceedings of the Conference Record of the 32nd Asilomar Conference on Signals, Systems & Computers, vol. 1, p. 154, Pacific Grove, Calif, USA, November 1998.
  14. D. Akopian, M. Helsingius, and J. Astola, “An optimized multiscanning approach for still image compression,” in Proceedings of the 3rd IEEE Workshop on Multimedia Signal Processing, p. 401, Copenhagen, Denmark, September 1999.
  15. J. K. Su and R. M. Mersereau, “Coding using Gaussian mixture and generalized Gaussian models,” in Proceedings of the IEEE International Conference on Image Processing (ICIP '96), vol. 1, p. 217, Lausanne, Switzerland, September 1996.
  16. K. K. Paliwal and S. So, “Low-complexity GMM-based block quantisation of images using the discrete cosine transform,” Signal Processing: Image Communication, vol. 20, no. 5, p. 435, 2005.
  17. A. D. Subramaniam and B. D. Rao, “PDF optimized parametric vector quantization of speech line spectral frequencies,” IEEE Transactions on Speech and Audio Processing, vol. 11, no. 2, p. 130, 2003.
  18. S. Jana and P. Moulin, “Optimality of KLT for high-rate transform coding of Gaussian vector-scale mixtures: application to reconstruction, estimation, and classification,” IEEE Transactions on Information Theory, vol. 52, no. 9, p. 4049, 2006.
  19. C. Archer and T. K. Leen, “A generalized Lloyd-type algorithm for adaptive transform coder design,” IEEE Transactions on Signal Processing, vol. 52, no. 1, p. 255, 2004.
  20. P. Kechichian, D. Tran, and F. Labeau, “Improved bit allocation for transform coding of images,” in Proceedings of the Conference Record of the 39th Asilomar Conference on Signals, Systems & Computers, p. 879, Pacific Grove, Calif, USA, October-November 2005.