A Rate-Distortion-Based Merging Algorithm for Compressed Image Segmentation

Juang, Ying-Shen; Hsin, Hsi-Chin; Sung, Tze-Yun; Shieh, Yaw-Shih; Cattani, Carlo

doi:https://doi.org/10.1155/2012/648320

Computational and Mathematical Methods in Medicine

Special Issue

Biomedical Signal Processing and Modeling Complexity of Living Systems 2013

Research Article | Open Access

Volume 2012 | Article ID 648320 | https://doi.org/10.1155/2012/648320

Ying-Shen Juang,¹ Hsi-Chin Hsin,² Tze-Yun Sung,³ Yaw-Shih Shieh,³ and Carlo Cattani⁴

Academic Editor: Sheng-yong Chen

Received 06 Aug 2012

Accepted 05 Sept 2012

Published 15 Oct 2012

Abstract

Original images are often compressed for communication applications. To avoid the computational burden of decompression, it is desirable to segment images directly in the compressed domain. This paper presents a simple rate-distortion-based scheme to segment images in the JPEG2000 domain. It is based on a binary arithmetic code table used in the JPEG2000 standard, which is available at both encoder and decoder; thus, there is no need to transmit the segmentation result. Experimental results on the Berkeley image database show that the proposed algorithm is preferable in terms of the running time and two quantitative measures: the probabilistic Rand index (PRI) and the boundary displacement error (BDE).

1. Introduction

Data segmentation is important in many applications [1–6]. Early research work on image segmentation is mainly at a single scale, especially for medical images [7–9]. In the human visual system (HVS), the perceived image is decomposed into a set of band-pass subimages by means of filtering with simple visual cortical cells, which can be well modeled by Gabor filters with suitable spatial frequencies and orientations [10]. Other state-of-the-art multiscale techniques are based on wavelet transform (WT), which provides an efficient multiresolution representation in accord with the property of HVS [11]. Specifically, the higher-detail information of an image is projected onto a shorter basis function with higher spatial resolution. Various WT-based features and algorithms were proposed in the literature for image segmentation at multiple scales [12–14].

For communication applications, original images are compressed in order to make good use of memory space and channel bandwidth. Thus, it is desirable to segment a compressed image directly. The Joint Photographic Experts Group (JPEG) standard adopts the discrete cosine transform for subband image coding. In order to improve the compression performance of JPEG and to offer more coding advantages, for example, embedded coding and progressive transmission, the JPEG2000 standard adopts WT as the underlying transform algorithm. Specifically, embedded coding encodes an image into a single code stream, from which the decoded image at any bit rate can be obtained. The embedded code stream of an image is organized in decreasing order of significance for progressive transmission over band-limited channels. This property is particularly desirable for Internet streaming and database browsing applications [15–17]. Zargari proposed an efficient method for JPEG2000 image retrieval in the compressed domain [18]. Pi proposed a simple scheme to estimate the probability mass function (PMF) of wavelet subbands by counting the number of 1-bits and used the global PMF as features to retrieve similar images from a large database [19]. For image segmentation, however, the local PMF is needed. In [20], we proposed a simple method to compute the local PMF of wavelet coefficients based on the MQ table. It can be applied to a JPEG2000 code stream directly, and the local PMF can be used as features to segment a JPEG2000 image in the compressed domain.

Motivated by the idea behind the postcompression rate distortion (PCRD) algorithm [15], we propose a simple algorithm called the rate-distortion-based merging (RDM) algorithm for JPEG2000 image segmentation. It can be applied to a JPEG2000 code stream instead of the decoded image. As a result, the burden of decoding computation can be avoided. In addition, the RDM algorithm is based on the MQ table, which is available at both encoder and decoder; thus, no transmission overhead is added from a segmentation viewpoint. The remainder of the paper proceeds as follows. In Section 2, the JPEG2000 standard is reviewed briefly. In Section 3, the MQ-table-based rate distortion slope (MQRDS) is proposed to examine the significance of wavelet segments; on this basis, the RDM algorithm is proposed to merge wavelet segments with similar characteristics. Experimental results on the Berkeley color image database are given in Section 4. Conclusions can be found in Section 5.

2. Review of the JPEG2000 Standard

The core module of the JPEG2000 standard is the embedded block coding with optimized truncation (EBCOT) algorithm [15], which adopts wavelet transform (WT) as the underlying method to decompose an image into multiresolution subbands. WT has many desirable properties, for example, the self-similarity of wavelet coefficients across subbands of the same orientation, the joint space-spatial frequency localization with orientation selectivity, and the energy clustering within each subband [11]. The fundamental idea behind EBCOT is to take advantage of the energy clustering property of wavelet coefficients. EBCOT is a two-tier algorithm; tier-1 consists of bit plane coding (BPC) followed by arithmetic coding (AC); tier-2 is primarily for optimal rate control. Three coding passes, namely, the significance propagation (SP) pass, the magnitude refinement (MR) pass, and the clean-up (CU) pass, are involved with four primitive coding operations, namely, the significance coding operation, the sign coding operation, the magnitude refinement coding operation, and the clean-up coding operation. For a wavelet coefficient that is currently insignificant, if any of the 8 neighboring coefficients are already significant, it is coded in the SP pass using the significance coding operation; otherwise, it is coded in the CU pass using the clean-up coding operation. If this coefficient becomes significant, its sign is then coded using the sign coding operation. The magnitude of the significant wavelet coefficients that have been found in the previous coding passes is updated using the magnitude refinement coding operation in the MR pass. The resulting code streams of coding passes can be compressed further by using a context-based arithmetic coder known as the MQ coder. JPEG2000 defines 18 context labels for the MQ coder and stores their respective probability models in the MQ table. 
Specifically, 10 context labels are used for the significance coding operation and the clean-up coding operation, 5 context labels are used for the sign coding operation, and 3 context labels are used for the magnitude refinement coding operation.
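
The pass-selection rule described above can be illustrated with a short sketch. It is a simplification made under stated assumptions: the function name `classify_passes` and the array layout are hypothetical, and the significance state is frozen at the start of the bit plane, whereas a real EBCOT coder updates it while scanning. The sketch only shows which pass each coefficient would fall into for one bit plane.

```python
import numpy as np

def classify_passes(significant):
    """Assign each coefficient of one bit plane to a coding pass.

    significant: 2-D boolean array, True where the coefficient was already
    found significant in an earlier coding pass.
    Returns an object array holding 'MR', 'SP', or 'CU' per coefficient.
    Assumption: significance is frozen at the start of the bit plane.
    """
    sig = np.asarray(significant, dtype=bool)
    rows, cols = sig.shape
    padded = np.pad(sig, 1, mode="constant", constant_values=False)

    # Count significant coefficients among the 8 neighbors.
    neighbor_count = np.zeros(sig.shape, dtype=int)
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            neighbor_count += padded[1 + dr:1 + dr + rows, 1 + dc:1 + dc + cols]

    passes = np.full(sig.shape, "CU", dtype=object)   # default: clean-up pass
    passes[(~sig) & (neighbor_count > 0)] = "SP"      # insignificant, significant neighbor
    passes[sig] = "MR"                                # already significant: refinement
    return passes

# Toy 4x4 code block with a single significant coefficient in the corner.
state = np.zeros((4, 4), dtype=bool)
state[0, 0] = True
print(classify_passes(state))
```

In this toy block, the corner coefficient goes to the MR pass, its three neighbors go to the SP pass, and all remaining coefficients are left for the CU pass.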

In JPEG2000, a large image can be partitioned into nonoverlapping subimages called tiles for computational simplicity. WT is then applied to the tiles of an image for subband decomposition, and each wavelet subband is further divided into small blocks called code blocks. The code blocks of an image are independently coded from the most significant bit plane (MSB) to the least significant bit plane (LSB). Based on the true rate-distortion slope (RDS) of code blocks, JPEG2000 concatenates the significant code streams with large RDS using the postcompression rate distortion (PCRD) algorithm for optimal rate control. More specifically, let $\{B_i\}$ be the set of code blocks in the whole image. The code stream of $B_i$ can be terminated at the end of a coding pass, say $n_i$, with the bit rate denoted by $R_i^{n_i}$; all the end points of coding passes are possible truncation points. The distortion incurred by discarding the coding passes after $n_i$ is denoted by $D_i^{n_i}$. PCRD selects the optimal truncation points to minimize the overall distortion, $D = \sum_i D_i^{n_i}$, subject to the rate constraint $R = \sum_i R_i^{n_i} \le R_c$, where $R_c$ is a given bit rate. It is noted that the coding passes with nonincreasing RDS are candidates for the optimal truncation points. Motivated by this idea, a new technique is proposed to segment images in the JPEG2000 domain; the details are given in the following section.
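
The constrained minimization above is commonly handled with a Lagrange multiplier: each code block independently picks the truncation point that minimizes distortion plus a multiplier times rate, and the multiplier is adjusted until the rate budget is met. The following is a minimal sketch of that idea, not the procedure of any particular codec. The function names, the bisection search, and the rate-distortion numbers are all assumptions for illustration. Each block is assumed to list a zero-rate point first so that any nonnegative budget is feasible.

```python
def pick_truncation(points, lam):
    """Index of the (rate, distortion) point minimizing D + lam * R."""
    return min(range(len(points)),
               key=lambda k: points[k][1] + lam * points[k][0])

def select(blocks, lam):
    """Per-block choices for a given multiplier, with the total rate."""
    choice = [pick_truncation(p, lam) for p in blocks]
    rate = sum(p[k][0] for p, k in zip(blocks, choice))
    return rate, choice

def pcrd_truncation(blocks, rate_budget, iters=60):
    """Bisect on the multiplier so that the total rate fits the budget."""
    lo, hi = 0.0, 1.0
    for _ in range(64):                      # grow hi until the budget is met
        if select(blocks, hi)[0] <= rate_budget:
            break
        hi *= 2.0
    else:
        raise ValueError("rate budget cannot be met")
    for _ in range(iters):                   # shrink the bracket [lo, hi]
        mid = 0.5 * (lo + hi)
        if select(blocks, mid)[0] > rate_budget:
            lo = mid
        else:
            hi = mid
    return select(blocks, hi)[1]

# Hypothetical (rate, distortion) pairs at the ends of coding passes.
blocks = [
    [(0, 100), (10, 40), (20, 10)],   # block whose passes reduce distortion a lot
    [(0, 50), (10, 45), (20, 44)],    # block whose passes help very little
]
print(pcrd_truncation(blocks, rate_budget=20))
```

With a budget of 20, the sketch spends the whole budget on the first block, because its coding passes have much steeper rate-distortion slopes. A Lagrangian search of this kind only reaches truncation points on the convex hull of each block's rate-distortion curve, so it can leave part of the budget unused.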

3. Image Segmentation in the JPEG2000 Domain

This section presents a simple merging algorithm for JPEG2000 image segmentation. It merges wavelet segments with similar characteristics based on the change of the estimated RDS in the JPEG2000 domain. Thus, the proposed algorithm can be applied to a JPEG2000 code stream without the computational cost of decompression.

3.1. MQ Table-Based Probability Mass Function

In JPEG2000, the wavelet coefficients of an image are quantized with bit planes, and binary wavelet variables are almost independent across bit planes. The probability mass function (PMF) known as the wavelet histogram [19] can be approximated by

$$P(|c| = x) \approx \prod_{i=0}^{n-1} P_i(x_i), \tag{1}$$

where $x$ is the magnitude of a wavelet coefficient $c$, $x = \sum_{i=0}^{n-1} x_i 2^i$, $P_i(x_i)$ is the PMF of the binary wavelet variable, $x_i$, on the $i$th bit plane, and $n$ is the number of bit planes. For image segmentation, the local PMF is needed. We previously proposed a simple method to estimate the local PMF based on the MQ table [20]. Specifically, the probability of 1-bit, $P_i(x_i = 1)$, is given by

$$P_i(x_i = 1) = \begin{cases} Q_e, & \text{if MPS} = 0, \\ 1 - Q_e, & \text{if MPS} = 1, \end{cases} \tag{2}$$

where $Q_e$ is the probability of the less probable symbol (LPS), which is stored in the MQ table, and MPS denotes the more probable symbol. The set $\{P_i(x_i = 1)\}$ obtained from the MQ table can be used to compute the local PMF. As the MQ table is also available at the decoder, no transmission overhead is needed for the computation of the PMF. In addition, JPEG2000 defines only 18 context labels to model the binary wavelet variables; thus, the computation of the PMF is simple.
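
As an illustration of the bit-plane model in this subsection, the sketch below turns an LPS probability and an MPS value into the probability of a 1-bit and then builds the PMF of the coefficient magnitude as a product over independent bit planes. The function names and the numeric probabilities are assumptions chosen for readability. In practice the LPS probabilities would be read from the MQ coder state for each context rather than typed in by hand.

```python
def prob_one(q_lps, mps):
    """Probability that a bit equals 1.

    q_lps: probability of the less probable symbol (LPS).
    mps:   the more probable symbol, 0 or 1.
    If the MPS is 0, then 1 is the LPS; otherwise 1 is the MPS.
    """
    return q_lps if mps == 0 else 1.0 - q_lps

def magnitude_pmf(p_one):
    """PMF of the magnitude under independent bit planes.

    p_one[i] is the probability of a 1-bit on bit plane i (i = 0 is the LSB).
    Returns a list pmf where pmf[x] approximates P(|c| = x).
    """
    n = len(p_one)
    pmf = []
    for x in range(2 ** n):
        p = 1.0
        for i, p1 in enumerate(p_one):
            bit = (x >> i) & 1               # bit of x on plane i
            p *= p1 if bit else 1.0 - p1
        pmf.append(p)
    return pmf

# Two bit planes with made-up LPS probabilities and MPS values.
p_one = [prob_one(0.25, 0), prob_one(0.25, 1)]   # [0.25, 0.75]
pmf = magnitude_pmf(p_one)
print(pmf)
```

The resulting list has one entry per possible magnitude and sums to one, which is a quick sanity check on any set of per-plane probabilities.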

3.2. MQ Table-Based Rate Distortion Slope and Merging Algorithm

Motivated by the post compression rate distortion (PCRD) algorithm [15], we propose the MQ table-based rate distortion slope (MQRDS) for image segmentation in the JPEG2000 domain as follows: where is the distortion of wavelet segment: defined as is a wavelet coefficient at location: in wavelet segment, represented by The estimate of can be computed by in which can be obtained from the binary arithmetic code table known as the MQ table as follows: The estimate of code length can be efficiently obtained by using [2] where denotes the bit plane index, is the binary variable of on bit plane , which are independent across bit planes, is the number of bit planes, is the feature space dimension, is the number of wavelet coefficients in segment , is the total number of wavelet coefficients, and is an entropy operation. After merging two wavelet segments, say and n, the change of MQRDS is given by where and are the MQRDS of wavelet segments, and , with sizes and , respectively, and is the MQRDS of the merged wavelet segment. As one can see, the change of MQRDS is likely to be increased significantly for wavelet segments with similar characteristics. Thus, we propose a simple algorithm called the rate-distortion-based merging (RDM) algorithm for JPEG2000 image segmentation, which is presented in the steps below.

The RDM Algorithm
Step 1. Given a JPEG2000 code stream, compute the MQ table-based local PMF of wavelet coefficients using (2).
Step 2. As mentioned in [2], a set of oversegmented regions known as superpixels is in general needed for any merging algorithm; this low-level initial segmentation can be obtained by coarsely clustering the local PMF as features.
Step 3. For all pairs of superpixels, compute their respective changes of MQRDS using (12), and merge the pair with the maximum change of MQRDS.
Step 4. Repeat the merging process in Step 3 until the change of MQRDS is insignificant.
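
The merging loop in Steps 3 and 4 can be sketched as a generic greedy procedure. This is only an outline under stated assumptions: `greedy_merge` and `toy_gain` are hypothetical names, and `toy_gain` is a placeholder score based on segment means. It is not the MQRDS change used by RDM, which would be supplied in its place.

```python
def greedy_merge(segments, gain, threshold):
    """Repeatedly merge the pair of segments with the largest gain.

    segments:  iterable of lists (e.g., feature values per superpixel).
    gain:      function (a, b) -> float scoring a candidate merge.
    threshold: merging stops once the best gain falls below this value.
    """
    segments = [list(s) for s in segments]
    while len(segments) > 1:
        best = None
        for i in range(len(segments)):
            for j in range(i + 1, len(segments)):
                g = gain(segments[i], segments[j])
                if best is None or g > best[0]:
                    best = (g, i, j)
        g, i, j = best
        if g < threshold:                    # change is insignificant: stop
            break
        merged = segments[i] + segments[j]
        segments = [s for k, s in enumerate(segments) if k not in (i, j)]
        segments.append(merged)
    return segments

def toy_gain(a, b):
    """Placeholder score: larger when the segment means are closer."""
    return -abs(sum(a) / len(a) - sum(b) / len(b))

superpixels = [[1.0, 1.1], [0.9], [5.0], [5.2, 5.1]]
print(greedy_merge(superpixels, toy_gain, threshold=-1.0))
```

Evaluating every pair at every iteration is quadratic in the number of segments, which is acceptable for a few dozen superpixels. A larger initial segmentation would likely call for caching the pairwise scores.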

In order to reduce the computation time, the following equation can be used to approximate (6): Moreover, the cross terms of the previous equation are not significant and can be discarded for computational simplicity. Figure 1 depicts the flowchart of the RDM algorithm. Note that the MQ table defined in JPEG2000 is finite; thus, (10) can be obtained from a look-up table (LUT), which reduces the computation time further. As shown in Figure 2, RDM can be applied to a JPEG2000 code stream directly; this is one of the advantages of RDM.

4. Experimental Results

In the first experiment, the potential of the MQ table-based local PMF (LPMF) is shown by segmenting images with Brodatz textures. As noted, the essential characteristics of textures are mainly contained in the middle-high-frequency wavelet subbands; thus, we applied a simple clustering algorithm known as K-means to the LPMF of wavelet coefficients to generate an initial segmentation. The number of superpixels was set to 30, and the superpixels were then merged using the RDM algorithm. Figure 3(a) shows the test image with two Brodatz textures, namely, wood and grass. The segmentation result and the error image, with white pixels representing misclassifications, are shown in Figure 3(b) and Figure 3(c), respectively. Figure 3(d) shows the percentages of errors at various rates of bits per pixel (bpp). It is noted that the segmentation results are still satisfactory even at low-middle bpp rates. Hence, a small portion of the JPEG2000 code stream is sufficient for the segmentation task.

The RDM algorithm has also been extensively evaluated on the Berkeley image database [21]. We adopted the Waveseg algorithm [14] to compute the initial superpixels of a natural color image. In order to avoid decoding a JPEG2000 code stream, the Waveseg algorithm was applied to the estimated wavelet coefficients instead of the decoded wavelet coefficients. More specifically, the estimated wavelet coefficient of $c$ using the MQ table-based LPMF is $\hat{c} = \sum_{i=0}^{n-1} P_i(x_i = 1)\, 2^i$, where $P_i(x_i = 1)$ is the probability of 1-bit on the $i$th bit plane, which can be obtained from the MQ table, and $n$ is the number of bit planes. The resulting superpixels were then merged by RDM with the threshold set to 0.1. We compared the RDM algorithm with two other state-of-the-art algorithms known as Mean-shift [22] and CTM [2]. In Mean-shift, the parameters, $h_s$ and $h_r$, were set to 13 and 19, respectively; in CTM, the threshold was set to 0.1, as suggested in [2]. The original images shown at the top of Figure 4 are natural images contained in the Berkeley database, namely, Pyramids, Landscape, Horses, and Giraffes. Their respective segmentation results using RDM, CTM, and Mean-shift are shown in the second, third, and fourth rows. Visual inspection shows that RDM and Mean-shift have similar performances for the first three images, while RDM and CTM perform similarly in detecting the giraffes shown in the fourth image.

In addition to visual inspection [23, 24], two commonly used measures, namely, the probabilistic Rand index (PRI) and the boundary displacement error (BDE) [25], were adopted for quantitative comparisons. Table 1 gives the average PRI performance on the Berkeley database. PRI ranges from 0 to 1, and higher is better. BDE measures the average displacement error of boundaries between segmented images, which is nonnegative, and lower is better. The average BDE performance is given in Table 2. It is noted that RDM outperforms CTM and Mean-shift in terms of the PRI and BDE measures.
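
For readers unfamiliar with PRI, the sketch below shows one common way to compute it: when the pairwise co-labeling probabilities are estimated empirically from the ground-truth set, PRI reduces to the mean of the ordinary Rand index between the test segmentation and each ground-truth segmentation. The function names and the toy labelings are assumptions for illustration, and labels are taken as flattened per-pixel lists. This is not the evaluation code used in the experiments.

```python
from collections import Counter
from math import comb

def rand_index(labels_a, labels_b):
    """Fraction of pixel pairs on which two labelings agree."""
    n = len(labels_a)
    if n != len(labels_b) or n < 2:
        raise ValueError("labelings must have equal length of at least 2")
    pairs = comb(n, 2)
    # Pairs grouped together in both, in A only counted via same_a, etc.
    same_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    same_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    same_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    # Agreements: together in both, plus apart in both.
    apart_both = pairs - same_a - same_b + same_both
    return (same_both + apart_both) / pairs

def probabilistic_rand_index(test_labels, ground_truths):
    """Mean Rand index of a test labeling against several ground truths."""
    scores = [rand_index(test_labels, g) for g in ground_truths]
    return sum(scores) / len(scores)

# Toy example: four pixels, two hand-made ground-truth labelings.
test = [0, 0, 1, 1]
truths = [[0, 0, 1, 1], [0, 1, 0, 1]]
print(probabilistic_rand_index(test, truths))
```

Because only the grouping of pixels matters, relabeling the segments of either segmentation leaves the score unchanged.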

The running times on a PC are given in Table 3. It shows that RDM is faster than CTM and Mean-shift due largely to the simple computations of (8) and (13). Moreover, RDM can be applied to a JPEG2000 code stream directly while most algorithms such as Mean-shift and CTM are primarily applied to the original or decoded image and it takes more time to decode a compressed image.

5. Conclusions

The MQ table defined in the JPEG2000 standard provides useful information that can be used to compute the local probability mass function (LPMF) of wavelet coefficients. A simple LPMF-based scheme has been proposed to estimate the rate distortion slope (RDS) of a wavelet segment. It is noted that the RDS increases significantly after merging a pair of wavelet segments with similar characteristics into a single segment. Similar ideas can be used to improve the rate control performance of JPEG2000 [26–28]. In this paper, we propose the rate-distortion-based merging (RDM) algorithm to segment images in the framework of JPEG2000. RDM has been evaluated on images with Brodatz textures and on the Berkeley color image database. Experimental results show that the segmentation performance is promising even at low-middle bpp rates. For natural images with high-detail contents, RDM is preferable in terms of the average PRI and BDE measures. In addition, the total running time of RDM, which includes the computation of superpixels and the merging process, is shorter than those of Mean-shift and CTM.

As RDM is based on the MQ table, which is available at both encoder and decoder, no overhead transmission is needed to compute the LPMF of wavelet coefficients. RDM can be applied to a JPEG2000 code stream directly; thus, the burden of decompressing computation can be avoided, and memory space that is required to store the decompressed image is no longer necessary from the segmentation point of view.

Acknowledgments

The authors are grateful to the maintainers of the Berkeley image database. The National Science Council of Taiwan, under Grants NSC100-2628-E-239-002-MY2 and NSC100-2410-H-216-003, supported this work.

References

[1] Y. Xia, D. Feng, and R. Zhao, “Adaptive segmentation of textured images by using the coupled Markov random field Model,” IEEE Transactions on Image Processing, vol. 15, no. 11, pp. 3559–3566, 2006.
[2] A. Y. Yang, J. Wright, Y. Ma, and S. Shankar Sastry, “Unsupervised segmentation of natural images via lossy data compression,” Computer Vision and Image Understanding, vol. 110, no. 2, pp. 212–225, 2008.
[3] N. A. M. Isa, S. A. Salamah, and U. K. Ngah, “Adaptive fuzzy moving K-means clustering algorithm for image segmentation,” IEEE Transactions on Consumer Electronics, vol. 55, no. 4, pp. 2145–2153, 2009.
[4] S. Xiang, C. Pan, F. Nie, and C. Zhang, “Turbopixel segmentation using eigen-images,” IEEE Transactions on Image Processing, vol. 19, no. 11, pp. 3024–3034, 2010.
[5] M. Li and W. Zhao, “Quantitatively investigating locally weak stationarity of modified multifractional gaussian noise,” Physica A, vol. 391, no. 24, pp. 6268–6278, 2012.
[6] M. Li and W. Zhao, “Variance bound of ACF estimation of one block of fGn with LRD,” Mathematical Problems in Engineering, vol. 2010, Article ID 560429, 14 pages, 2010.
[7] S. Chen and X. Li, “Functional magnetic resonance imaging for imaging neural activity in the human brain: the annual progress,” Computational and Mathematical Methods in Medicine, vol. 2012, Article ID 613465, 9 pages, 2012.
[8] Z. Teng, J. He, A. J. Degnan et al., “Critical mechanical conditions around neovessels in carotid atherosclerotic plaque may promote intraplaque hemorrhage,” Arteriosclerosis, Thrombosis, and Vascular Biology, vol. 223, no. 2, pp. 321–326, 2012.
[9] S. Y. Chen and Q. Guan, “Parametric shape representation by a deformable NURBS model for cardiac functional measurements,” IEEE Transactions on Biomedical Engineering, vol. 58, no. 3, pp. 480–487, 2011.
[10] D. E. Ilea and P. F. Whelan, “CTex—an adaptive unsupervised segmentation algorithm based on color-texture coherence,” IEEE Transactions on Image Processing, vol. 17, no. 10, pp. 1926–1939, 2008.
[11] S. Mallat, A Wavelet Tour of Signal Processing, Academic Press, San Diego, Calif, USA, 1999.
[12] M. K. Bashar, N. Ohnishi, and K. Agusa, “A new texture representation approach based on local feature saliency,” Pattern Recognition and Image Analysis, vol. 17, no. 1, pp. 11–24, 2007.
[13] C. M. Pun and M. C. Lee, “Extraction of shift invariant wavelet features for classification of images with different sizes,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 26, no. 9, pp. 1228–1233, 2004.
[14] C. R. Jung, “Unsupervised multiscale segmentation of color images,” Pattern Recognition Letters, vol. 28, no. 4, pp. 523–533, 2007.
[15] T. Acharya and P. S. Tsai, JPEG2000 Standard for Image Compression: Concepts, Algorithms and VLSI Architectures, John Wiley & Sons, New York, NY, USA, 2005.
[16] C. Cattani, “Harmonic wavelet approximation of random, fractal and high frequency signals,” Telecommunication Systems, vol. 43, no. 3-4, pp. 207–217, 2010.
[17] S. Y. Chen and Z. J. Wang, “Acceleration strategies in generalized belief propagation,” IEEE Transactions on Industrial Informatics, vol. 8, no. 1, pp. 41–48, 2012.
[18] F. Zargari, A. Mosleh, and M. Ghanbari, “A fast and efficient compressed domain JPEG2000 image retrieval method,” IEEE Transactions on Consumer Electronics, vol. 54, no. 4, pp. 1886–1893, 2008.
[19] M. H. Pi, C. S. Tong, S. K. Choy, and H. Zhang, “A fast and effective model for wavelet subband histograms and its application in texture image retrieval,” IEEE Transactions on Image Processing, vol. 15, no. 10, pp. 3078–3088, 2006.
[20] H. C. Hsin, “Texture segmentation in the joint photographic expert group 2000 domain,” IET Image Processing, vol. 5, no. 6, pp. 554–559, 2011.
[21] http://www.eecs.berkeley.edu/~yang/software/lossy_segmentation/.
[22] D. Comaniciu and P. Meer, “Mean shift: a robust approach toward feature space analysis,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 5, pp. 603–619, 2002.
[23] H. C. Hsin, T.-Y. Sung, Y.-S. Shieh, and C. Cattani, “MQ Coder based image feature and segmentation in the compressed domain,” Mathematical Problems in Engineering, vol. 2012, Article ID 490840, 14 pages, 2012.
[24] S. Chen, M. Zhao, G. Wu, C. Yao, and J. Zhang, “Recent advances in morphological cell image analysis,” Computational and Mathematical Methods in Medicine, vol. 2012, Article ID 101536, 10 pages, 2012.
[25] R. Unnikrishnan, C. Pantofaru, and M. Hebert, “Toward objective evaluation of image segmentation algorithms,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 6, pp. 929–944, 2007.
[26] H. C. Hsin and T. Y. Sung, “Context-based rate distortion estimation and its application to wavelet image coding,” WSEAS Transactions on Information Science and Applications, vol. 6, no. 6, pp. 988–993, 2009.
[27] H.-C. Hsin and T.-Y. Sung, “Image segmentation in the JPEG2000 domain,” in Proceedings of the International Conference on Wavelet Analysis and Pattern Recognition (ICWAPR '11), pp. 24–28, 2011.
[28] H.-C. Hsin, T.-Y. Sung, Y.-S. Shieh, and C. Cattani, “Adaptive binary arithmetic coder-based image feature and segmentation in the compressed domain,” Mathematical Problems in Engineering, vol. 2012, Article ID 490840, 14 pages, 2012.

Copyright

Copyright © 2012 Ying-Shen Juang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
