High Performance Discrete Cosine Transform Operator Using Multimedia Oriented Subword Parallelism

This paper proposes an efficient two-dimensional discrete cosine transform (DCT) operator for multimedia applications. It is based on the DCT operator proposed by Kovac and Ranganathan (1995). Speed-up is obtained through multimedia oriented subword parallelism (SWP): rather than operating on a single pixel at a time, the SWP-based DCT operator computes in parallel on multiple pixels packed into word-size input registers. Special emphasis is placed on matching subword sizes to pixel sizes so as to maximize the resource utilization rate. Accordingly, instead of the classical subword sizes (8, 16, and 32 bits), the proposed DCT operator uses multimedia oriented subword sizes (8, 10, 12, and 16 bits). The proposed SWP DCT operator unit can be used as a coprocessor for multimedia applications.
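To make the idea of subword parallelism concrete, the following is a minimal software sketch of a lane-wise addition on pixels packed into one word. It is only an illustration under stated assumptions, not the proposed hardware operator: the helper names (`pack`, `unpack`, `swp_add`), the choice of four lanes, and the wrap-around behaviour on overflow are all hypothetical and are not taken from the paper.

```python
def pack(values, subword_bits):
    """Pack a list of pixel values into one integer, lane 0 in the low bits."""
    mask = (1 << subword_bits) - 1
    word = 0
    for i, v in enumerate(values):
        word |= (v & mask) << (i * subword_bits)
    return word


def unpack(word, subword_bits, lanes):
    """Split a packed word back into its individual subword values."""
    mask = (1 << subword_bits) - 1
    return [(word >> (i * subword_bits)) & mask for i in range(lanes)]


def swp_add(a, b, subword_bits, lanes):
    """Add two packed words lane by lane (assumed: wrap-around per lane).

    The top bit of every lane is cleared before the addition so that a
    carry can never cross into the neighbouring lane; the top bits are
    then restored with an XOR.
    """
    high = 0
    for i in range(lanes):
        high |= 1 << (i * subword_bits + subword_bits - 1)
    full = (1 << (lanes * subword_bits)) - 1
    low = full & ~high
    partial = (a & low) + (b & low)        # carries stay inside each lane
    return partial ^ ((a ^ b) & high)      # fix up the top bit of each lane


# Hypothetical example: four 10-bit pixels processed by a single addition.
a = pack([100, 1023, 512, 7], 10)
b = pack([23, 1, 512, 8], 10)
result = unpack(swp_add(a, b, 10, 4), 10, 4)
print(result)
```

In this sketch a single word-wide addition replaces four separate pixel additions. It also hints at why the subword size matters: a 10-bit pixel stored in a classical 16-bit subword leaves 6 of every 16 bits unused, whereas a 10-bit subword uses all of them.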