Abstract

This paper describes a generic multiplication scheme for low-power VLSI implementation of the discrete cosine transform (DCT). The scheme concurrently processes blocks of cosine coefficients and pixel values during multiplication, with the aim of reducing the total switched capacitance within the multiplier circuit. Within each block, the cosine coefficients are manipulated so that some can be processed using shift operations only. The remaining coefficients are presented to the multiplier inputs as a sequence ordered according to the bit correlation between successive coefficients. The paper describes the multiplication scheme and the power evaluation environment used, and presents results on a number of standard benchmark examples that demonstrate power savings of up to 50%.
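
The two ideas in the scheme, shift-only coefficients and correlation-based ordering, can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not state: coefficients are taken to be positive fixed-point integers, a coefficient is treated as shift-only when it is a power of two, Hamming distance between successive values stands in for bit correlation, and a greedy nearest-neighbour pass stands in for the ordering procedure. The function names are hypothetical and the example values are arbitrary, not DCT coefficients from the paper.

```python
def is_shift_only(c):
    """Assumption: a positive power of two can be applied as a single shift."""
    return c > 0 and (c & (c - 1)) == 0


def hamming(a, b):
    """Number of bit positions that toggle when the input changes from a to b."""
    return bin(a ^ b).count("1")


def total_toggles(seq):
    """Total bit toggles at a multiplier input fed with seq in order."""
    return sum(hamming(x, y) for x, y in zip(seq, seq[1:]))


def split_and_order(coeffs):
    """Separate shift-only coefficients, then greedily order the remainder.

    Greedy nearest-neighbour ordering is an illustrative heuristic only;
    it does not guarantee the minimum number of toggles.
    """
    shift_only = [c for c in coeffs if is_shift_only(c)]
    remaining = [c for c in coeffs if not is_shift_only(c)]
    if not remaining:
        return shift_only, []

    ordered = [remaining.pop(0)]
    while remaining:
        # Pick the coefficient whose bits differ least from the last one issued.
        nxt = min(remaining, key=lambda c: hamming(ordered[-1], c))
        remaining.remove(nxt)
        ordered.append(nxt)
    return shift_only, ordered


# Arbitrary 4-bit example values (hypothetical, for illustration only).
coeffs = [11, 4, 10, 5, 8, 7, 3]
shift_only, ordered = split_and_order(coeffs)
unordered = [c for c in coeffs if not is_shift_only(c)]
print(shift_only, ordered)
print(total_toggles(unordered), total_toggles(ordered))
```

For these example values the reordered sequence toggles fewer input bits than the original order, which is the effect the scheme exploits to reduce switched capacitance. Toggle count at the inputs is only a rough proxy for multiplier power.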