Abstract

We describe an authentication and recovery scheme for color image protection based on adaptive encoding. The image blocks are categorized based on their contents and different encoding schemes are applied according to their types. Such adaptive encoding results in better image quality and more robust image authentication. The approximations of the luminance and chromatic channels are carefully calculated, and for the purpose of reducing the data size, differential coding is used to encode the channels with variable size according to the characteristic of the block. The recovery data which represents the approximation and the detail of the image is embedded for data protection. The necessary data is well protected by using error correcting coding and duplication. The experimental results demonstrate that our technique is able to identify and localize image tampering, while preserving high quality for both watermarked and recovered images.

1. Introduction

The revolution of digital technologies has brought many conveniences to our daily lives. For example, it has become very easy to create, duplicate, transmit, or modify digital products. Accompanying these advances, however, unauthorized use, illegal copying, and malicious modification of digital products have become serious problems. A common approach to tackling such problems is digital watermarking, and one of its applications is image content authentication, in which the integrity of an image is considered very important and therefore requires protection.

Researchers have developed various image authentication techniques to detect whether an image has undergone unauthorized modifications. Some of them can only detect whether a certain part of the image has been tampered with, whereas others have the additional capability of recovering the tampered regions. Lin et al. [1] embedded an image block’s average intensity and parity check into its corresponding block, which is pseudorandomly determined. The tampered regions are identified based on a three-level hierarchical structure, and the extracted data is used to recover the tampered blocks. With such a hierarchy, the inspection can be performed at different levels and a tamper detection precision close to 100% can be achieved. Wang and Tsai [2] used fractal codes to generate the approximation for the region of interest (ROI) of the image. Such approximation acts as the recovery data and is embedded into the least significant bits of the image pixels. During the recovery process, the tampered regions in the ROI are recovered using the extracted data, and the blocks outside the ROI are recovered by the inpainting technique instead. One of the problems of their method is that the quality of the recovered blocks outside the ROI is not as good as that in the ROI. Moreover, the original image is required for tamper detection. Zhang and Wang [3] used the five most significant bits of the pixels in each block to generate the reference bits. The check bits are derived by applying a hash function to the block, and they are embedded using a reversible method. The tampered regions are then identified with the extracted check bits. In their method, a tampered region can be perfectly recovered as long as its total size is less than 3.2% of the whole image. Lee and Lin [4] generated the recovery and verification data of each block and then embedded two copies of them into two other blocks. Such a strategy provides more opportunities to recover the block in case one of the two corresponding blocks is corrupted. However, this method is vulnerable to the collage attack. In He et al.’s method [5], an image is divided into fixed-size blocks, followed by the discrete cosine transform (DCT), and the leading 11 DCT coefficients of each block are embedded into the corresponding block. During authentication, a statistics-based rule is used to determine the validity of a block by considering its adjacent and mapping blocks. This method provides a high detection rate for tampered regions, but they are recovered with only ordinary quality. Qin et al. [6] applied the non-sub-sampled contourlet transform (NSCT) on each image block and selected the low-frequency coefficients to generate the recovery bits. The size of the recovery data of each block is determined by the block’s characteristics, and the smooth blocks are encoded with less data than the textured ones. Their method maintains high quality for watermarked images by effectively controlling the size of the embedding data. Qian et al. [7] also divided an image into fixed-size blocks, followed by the DCT, and each block is classified into one of several types according to the index of its last nonzero DCT coefficient. Several DCT coefficients are encoded with variable-length coding and embedded into the three least significant bits (LSBs), and the length of the code is determined by the type of the block. During the recovery process, the data is extracted using dequantization and the inverse DCT. Their method recovers the tampered areas with high image quality.

While most of the existing techniques deal with grayscale images, others are designed specifically for color images. Wang and Chen [8] computed the means of the Y, Cb, and Cr channels as the recovery data, and the authentication data is derived from the global and local features. During authentication, all the extracted authentication data are taken into consideration and a majority-voting strategy is used to determine the validity of the image block. Their method recovers the tampered regions with acceptable quality. Liu [9] utilized the block-edge pattern to describe the details of the luminance channel and to preserve more complete luminance information. After pattern matching, the best index is recorded. The positive gradient is obtained by calculating the difference between the mean and the bright pixels, whereas the negative gradient is obtained from the difference between the mean and the dark pixels. The method can achieve good quality for both watermarked and recovered images.

In this paper, we describe an authentication and recovery scheme for color image protection based on adaptive encoding. The image blocks are categorized according to their contents and different encoding schemes are applied according to their types. Such adaptive encoding results in better image quality and more robust image authentication. The approximations of the luminance and chromatic channels are carefully calculated, and for the purpose of reducing the data size, differential coding is used to encode the channels with variable size according to the characteristic of the block. The recovery data, which represents the approximation and detail of the image, is embedded for data protection. The necessary data is well protected by using error correcting coding (ECC) and duplication. The experimental results demonstrate that our technique is able to identify and localize image tampering, while preserving high quality for both watermarked and recovered images. This paper is organized as follows. The embedding and authentication processes are described in Section 2. Several experimental results are presented in Section 3, and Section 4 gives some concluding remarks.

2. The Proposed Method

2.1. Watermark Embedding

Let the host color image be of size M × N pixels, with red, green, and blue components. Figure 1 delineates the watermark embedding process, which consists of the following steps.

(1) The original image is divided into nonoverlapping blocks of 2 × 2 pixels; for an image of M × N pixels, this produces (M/2) × (N/2) blocks in total. Each block is identified by its row and column indices, and its red, green, and blue channels are referred to as the R, G, and B channels of the block, respectively.

(2) Because the human visual system is more sensitive to luminance changes than to chromatic changes, transforming the RGB color model into another in which the luminance and chromatic channels are separated results in better image analysis. One such model is the YCbCr model, in which Y represents the luminance and Cb and Cr represent the chromatic information. Each block is transformed accordingly, and its three channels are referred to as the Y, Cb, and Cr blocks, respectively.
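The color transform in step (2) can be illustrated with a minimal sketch. The conversion matrix is not stated in the text, so the full-range ITU-R BT.601 (JFIF) weights used below are an assumption, as is the function name rgb_to_ycbcr.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range YCbCr.

    ASSUMPTION: BT.601 / JFIF weights; the paper does not name the matrix.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    # Round to integers and clamp to the 8-bit range.
    return tuple(max(0, min(255, round(v))) for v in (y, cb, cr))


def block_to_ycbcr(block):
    """Convert a 2x2 block given as four (R, G, B) tuples."""
    return [rgb_to_ycbcr(*pixel) for pixel in block]


if __name__ == "__main__":
    print(block_to_ycbcr([(255, 255, 255), (0, 0, 0), (255, 0, 0), (0, 0, 255)]))
```

Gray pixels map to Cb = Cr = 128, which is why the chromatic channels of smooth regions can be approximated by a single value per block.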

(3) Next, we analyze the intensity and texture (i.e., contrast and edge) features of the image. The four pixels in each Y block are labeled as illustrated in Figure 2. The top-left pixel of each block is recorded, and the collection of all top-left pixels forms a set of vectors, with the length of each vector being 8 bits. The vector set is further processed using the LBG vector quantization technique [10, 11] to generate a codebook consisting of 64 codewords. An example of this codebook is illustrated in Figure 3.

The intensity feature of a block is obtained by computing the similarities between its top-left pixel and the codewords in the codebook. The function min( ) returns two values: the codeword most similar to the top-left pixel and the index of that codeword. The index, whose size is 6 bits (one of 64 codewords), is recorded, and the block is modified by replacing the top-left pixel with the selected codeword.
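The codeword search in step (3) amounts to a nearest-neighbor lookup. The sketch below is illustrative only: the evenly spaced 64-entry codebook is a hypothetical stand-in for a trained LBG codebook, and the name nearest_codeword is not from the paper.

```python
def nearest_codeword(value, codebook):
    """Return (index, codeword) of the entry closest to value.

    Ties are resolved in favor of the lower index.
    """
    index = min(range(len(codebook)), key=lambda k: abs(codebook[k] - value))
    return index, codebook[index]


# HYPOTHETICAL codebook: 64 evenly spaced gray levels (2, 6, ..., 254),
# used here only so the example is self-contained.
CODEBOOK_Y = [2 + 4 * k for k in range(64)]

if __name__ == "__main__":
    index, codeword = nearest_codeword(131, CODEBOOK_Y)
    # The index always fits in 6 bits because the codebook has 64 entries.
    print(index, format(index, "06b"), codeword)
```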

(4) Next, the remaining three pixels of the block are encoded. Because adjacent pixels in a natural image are usually highly correlated, recording their relationship is an effective way of reducing the encoding size. This idea is realized by the differential coding technique, in which the differences between pairs of pixels are calculated first. The maximum of the absolute differences is then used to classify the block into four types according to Table 1. Blocks of different types are encoded with different codebooks consisting of different numbers of codewords. A uniform block is encoded with a small codebook, whereas a block with large differences is encoded with a large codebook in order to reduce the error.

The average difference is obtained for each block. The average differences of type-3 blocks are collected to form a dataset, and the K-means clustering technique [12, 13] is used to separate the dataset into five clusters (subtypes). That is, type-3 blocks are further divided into five subtypes, types 3.1 to 3.5, and for each subtype the three differences of every block are collected to form a training set. The vector quantization technique is then used to generate, for each subtype, a codebook consisting of 16 representative codewords (i.e., differences), so that different average magnitudes are served by different codebooks. An example of the block types and their codebooks is shown in Table 2, in which the codebooks of types 1 and 2 are predefined without training and the codebooks of types 3.1 to 3.5 are generated using the vector quantization technique. For each block, we encode the three differences with the corresponding codebook, and three bits are used to encode the block type. Furthermore, two, three, and four bits are used to encode each difference of the blocks of types 1, 2, and 3, respectively.

To better explain the encoding process, Figure 4 illustrates two examples; the pixel values and differences involved are given in the figure. In the first example, the block is of type 0 according to Table 1. Because the neighbors are almost the same, their differences are not recorded, and the only thing to encode is the block type (3 bits). In the second example, the block is of type 2. The first difference is 3, and its most similar codeword is also 3. Likewise, the codewords most similar to the second difference are 1 and −1, and the smaller codeword, −1, is chosen. To compensate for the error introduced by this choice, the remaining difference is updated before it is encoded, and its appropriate codeword is then −1. Therefore, the data to be recorded are the block type and the indices of the codewords of the three differences. The size of each index is 2, 3, and 4 bits for types 1, 2, and 3, respectively, so the total size of the data to be recorded for types 0, 1, 2, and 3 is 3, 9 (3 + 3 × 2), 12 (3 + 3 × 3), and 15 (3 + 3 × 4) bits, respectively. The codebooks of type 3 are also recorded.

We have conducted several experiments to understand the distribution of the block types in a few images, as shown in Figure 5. Most blocks belong to types 0 to 2, which indicates that neighboring pixels are very similar, as expected, and that these blocks can be encoded with smaller sizes.
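The classification and the bit budget of step (4) can be sketched as follows. Only the bit counts (3 bits for the type and 2, 3, or 4 bits per difference) come from the text. The thresholds standing in for Table 1 and the pairing of each pixel with the top-left pixel are assumptions made purely for illustration.

```python
TYPE_BITS = 3                           # size of the block-type field
NUM_DIFFS = 3                           # three non-top-left pixels per 2x2 block
INDEX_BITS = {0: 0, 1: 2, 2: 3, 3: 4}   # bits per difference index, by type

# HYPOTHETICAL thresholds standing in for Table 1, which is not reproduced here:
# max |d| <= 1 -> type 0, <= 3 -> type 1, <= 8 -> type 2, otherwise type 3.
THRESHOLDS = (1, 3, 8)


def differences(block):
    """block = (p0, p1, p2, p3) with p0 the top-left pixel.

    ASSUMPTION: each difference is taken against p0; the paper's exact
    pixel pairing is not recoverable from the text.
    """
    p0, p1, p2, p3 = block
    return (p1 - p0, p2 - p0, p3 - p0)


def classify(diffs, thresholds=THRESHOLDS):
    """Map the maximum absolute difference to a block type 0..3."""
    d_max = max(abs(d) for d in diffs)
    for block_type, limit in enumerate(thresholds):
        if d_max <= limit:
            return block_type
    return 3


def luminance_detail_bits(block_type):
    """Bits needed for the type field plus the three difference indices."""
    return TYPE_BITS + NUM_DIFFS * INDEX_BITS[block_type]


if __name__ == "__main__":
    for t in range(4):
        print("type", t, "->", luminance_detail_bits(t), "bits")
```

Running the loop reproduces the sizes of 3, 9, 12, and 15 bits quoted for types 0 to 3.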

(5) For each Cb block, the mean of the four Cb pixels is calculated. The vector quantization technique is then used to generate a codebook consisting of 16 representative codewords for the means. The index of the best codeword for a Cb mean is recorded with a size of 4 bits. The processing of Cr blocks is identical to that of Cb blocks. Figure 6 gives examples of the Cb and Cr codebooks.
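A rough sketch of step (5) is given below: block means are computed and a small scalar codebook is trained on them. The training loop is a plain one-dimensional generalized Lloyd iteration with quantile initialization, which is only an approximation of the vector quantization procedure cited in the paper; all names are illustrative.

```python
def block_mean(pixels):
    """Mean of the four chroma samples of a 2x2 block."""
    return sum(pixels) / len(pixels)


def train_codebook(samples, size, iterations=20):
    """Train a scalar codebook with a 1-D generalized Lloyd iteration.

    ASSUMPTION: quantile initialization and a fixed iteration count;
    the paper does not describe these details.
    """
    ordered = sorted(samples)
    n = len(ordered)
    codebook = [ordered[(2 * k + 1) * n // (2 * size)] for k in range(size)]
    for _ in range(iterations):
        cells = [[] for _ in range(size)]
        for s in samples:
            nearest = min(range(size), key=lambda k: abs(codebook[k] - s))
            cells[nearest].append(s)
        # Move each codeword to the centroid of its cell; keep it if the cell is empty.
        codebook = [sum(c) / len(c) if c else codebook[k] for k, c in enumerate(cells)]
    return codebook


def encode_mean(mean, codebook):
    """Index of the codeword closest to a block mean (4 bits for 16 codewords)."""
    return min(range(len(codebook)), key=lambda k: abs(codebook[k] - mean))


if __name__ == "__main__":
    means = [block_mean(b) for b in ([120, 122, 124, 126], [90, 90, 91, 93], [200, 201, 199, 200])]
    book = train_codebook(means, 2)
    print(book, [encode_mean(m, book) for m in means])
```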

(6) For an RGB block, the three LSBs of channel R, the two LSBs of channel G, and the three LSBs of channel B are cleared before hashing, and a random value is generated as the seed of the hash function. The LSB-cleared block (12 bytes) is hashed using the SHA-1 algorithm [14], which produces a 160-bit hash value. The XOR of the even bits of the hash value gives the first bit of the authentication data, and the XOR of the odd bits gives the second bit. The result is the 2-bit authentication data of the block.
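Step (6) can be sketched with Python's standard hashlib module. SHA-1 yields a 160-bit digest. How the seed enters the hash and whether bit positions are counted from zero are not specified in the text, so prepending the seed bytes and using zero-based positions are assumptions here.

```python
import hashlib


def clear_lsbs(pixel):
    """Clear 3 LSBs of R, 2 LSBs of G, and 3 LSBs of B."""
    r, g, b = pixel
    return (r & 0xF8, g & 0xFC, b & 0xF8)


def authentication_bits(block, seed):
    """Compute the 2-bit authentication data of a 2x2 RGB block.

    block: four (R, G, B) tuples; seed: bytes.
    ASSUMPTION: the seed is prepended to the 12 block bytes before hashing.
    """
    data = bytes(c for pixel in block for c in clear_lsbs(pixel))  # 12 bytes
    digest = hashlib.sha1(seed + data).digest()                    # 20 bytes = 160 bits
    bits = [(byte >> (7 - i)) & 1 for byte in digest for i in range(8)]
    first = 0
    for bit in bits[0::2]:   # even positions (zero-based, an assumption)
        first ^= bit
    second = 0
    for bit in bits[1::2]:   # odd positions
        second ^= bit
    return first, second


if __name__ == "__main__":
    block = [(200, 100, 50), (201, 101, 51), (202, 99, 49), (198, 100, 52)]
    print(authentication_bits(block, b"example-seed"))
```

Because the embedding LSBs are cleared before hashing, writing the payload into those LSBs later does not change the authentication bits.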

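(7) Now, the global data consists of four codebooks: the luminance codebook of step (3), the type-3 difference codebooks of step (4), and the Cb and Cr codebooks of step (5). To protect the data, two techniques are used: error correcting coding (ECC) and duplication (with majority voting). Because Reed-Solomon coding [15] can correct multiple symbol errors within a codeword, it is used to encode the global data, and the encoded data are further duplicated into several copies. The duplication dramatically improves the robustness: even if some copies (fewer than half) are corrupted, a bit can still be restored correctly. The number of copies is determined by the number of blocks and the length of the ECC-encoded data. The duplicated string is zero padded so that its total size equals the number of blocks, and the final data is separated into single bits, one for each block. Thus, the payload of each block consists of the luminance codeword index, the block type, the Cb and Cr indices, the authentication data, one bit of global data, and the indices of the three differences (if any).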

(8) The block into which each block’s payload is embedded (its corresponding block) is determined by using the Torus Automorphism [16]. A prime number is chosen by the user and treated as a private key. The LSBs of the corresponding block are replaced with the variable-sized payload, which has four possible sizes: 20, 26, 29, and 32 bits. The bit allocation is shown in Figure 7, and to maintain imperceptibility, the embedding order is the B, R, and then G channel.
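The block mapping in step (8) can be illustrated with a one-dimensional modular mapping of the kind often used in block-based self-embedding schemes. Whether the paper uses this exact form of the Torus Automorphism is an assumption; the function name and the zero-based indexing are illustrative.

```python
from math import gcd


def block_mapping(num_blocks, key):
    """Return a list where entry i is the block that stores block i's payload.

    ASSUMPTION: 1-D form x' = (key * x) mod num_blocks with zero-based indices.
    The mapping is a permutation only if key and num_blocks are coprime.
    """
    if gcd(key, num_blocks) != 1:
        raise ValueError("key must be coprime with the number of blocks")
    return [(key * i) % num_blocks for i in range(num_blocks)]


if __name__ == "__main__":
    mapping = block_mapping(16, 7)   # 16 blocks, secret prime key 7
    print(mapping)
```

In this simplified form block 0 maps to itself, so a practical variant would add an offset or use a two-dimensional mapping; the sketch only shows how a secret prime scatters the payloads across the image.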

2.2. Authentication and Recovery

During authentication, the same process is applied to the received image, as shown in Figure 8. The image is divided into blocks, and for each block, the position (index) of its corresponding block is computed using the Torus Automorphism with the same key. Because there are four different payload sizes, we first extract the 20-bit data from the LSBs of the corresponding block. The last three bits of the extracted data give the block type, which determines the size of the remaining data. We then extract the following 0, 6, 9, or 12 bits. After obtaining the payload of each block, we have the complete data: the luminance codeword index, the block type, the Cb and Cr indices, the authentication data, the global-data bit, and the indices of the three differences (if any).
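The two-stage extraction described above can be sketched as follows. The 20-bit head, the position of the type field in its last three bits, and the 0, 6, 9, or 12 additional bits come from the text. The numeric coding of the eight block types (0 to 2, then 3 to 7 for subtypes 3.1 to 3.5) is an assumption, since Figure 7 and Table 2 are not reproduced here.

```python
BASE_BITS = 20   # fixed part of every payload


def extra_bits(type_code):
    """Number of bits following the 20-bit head.

    ASSUMPTION: type codes 0, 1, 2 denote types 0-2 and codes 3..7 denote
    subtypes 3.1-3.5.
    """
    if type_code == 0:
        return 0
    if type_code == 1:
        return 6     # 3 differences x 2 bits
    if type_code == 2:
        return 9     # 3 differences x 3 bits
    return 12        # 3 differences x 4 bits


def split_payload(lsb_bits):
    """Split the LSB string of a block (up to 32 '0'/'1' characters)."""
    head = lsb_bits[:BASE_BITS]
    type_code = int(head[-3:], 2)
    tail = lsb_bits[BASE_BITS:BASE_BITS + extra_bits(type_code)]
    return head, type_code, tail


if __name__ == "__main__":
    bits = "0" * 17 + "010" + "1" * 12   # type code 2, followed by filler bits
    print(split_payload(bits))
```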

The encoded string of the global data is obtained by combining the global-data bits collected from each block. The string is separated into its duplicated copies, and each copy is decoded by ECC decoding. Majority voting is then applied to each bit of the decoded results: if the count of ones for a bit exceeds half the number of copies, the bit is 1; otherwise, the bit is 0. The results of this processing are the four codebooks: the luminance codebook, the type-3 difference codebooks, and the Cb and Cr codebooks.
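The majority-voting stage can be sketched in a few lines. Reed-Solomon decoding is omitted here, so the copies are assumed to be the already decoded bit strings, and the threshold of half the number of copies is the usual reading of majority voting.

```python
def majority_vote(copies):
    """Return the per-bit majority over equal-length bit lists.

    A bit is 1 when strictly more than half of the copies have a 1 there.
    """
    n = len(copies)
    return [1 if sum(column) > n / 2 else 0 for column in zip(*copies)]


if __name__ == "__main__":
    original = [1, 0, 1, 1, 0]
    copies = [list(original) for _ in range(5)]
    copies[0][1] ^= 1   # corrupt bit 1 in two copies
    copies[3][1] ^= 1
    copies[2][0] ^= 1   # corrupt bit 0 in one copy
    print(majority_vote(copies) == original)
```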

A binary authentication map with one entry per block is initialized to zero. For each block, the authentication data is recomputed with the correct seed and then compared with the extracted (original) authentication data. If a mismatch occurs, the corresponding pixel on the map is set to one to mark an unauthentic block. If there are no mismatches after checking all blocks, the image is authentic and the authentication process terminates. Otherwise, the binary map is further processed to generate the final map, which shows the positions of the tampered blocks. Because an attacker usually tries to alter the semantics of the image, tampered pixels tend to cluster together (possibly in several locations). Randomly altering the pixels is meaningless and hence not likely to happen. Morphological operations can be used to remove the (noise-like) false positives and consolidate the shapes of meaningful tampered regions. Based on this analysis, dilation and erosion operations are applied to the map, and according to the processed map, the tampered regions (which may consist of many blocks) are localized.
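The morphological cleanup of the authentication map can be sketched in plain Python. The structuring element and the order of the operations are not given in the text; a 3 × 3 square and a closing (dilation, then erosion) followed by an opening (erosion, then dilation) are assumptions chosen for illustration.

```python
def _neighbourhood(mask, y, x):
    """3x3 neighbourhood of (y, x); positions outside the map count as 0."""
    h, w = len(mask), len(mask[0])
    return [mask[j][i] if 0 <= j < h and 0 <= i < w else 0
            for j in (y - 1, y, y + 1) for i in (x - 1, x, x + 1)]


def dilate(mask):
    """Binary dilation with a 3x3 square structuring element."""
    return [[max(_neighbourhood(mask, y, x)) for x in range(len(mask[0]))]
            for y in range(len(mask))]


def erode(mask):
    """Binary erosion with a 3x3 square structuring element."""
    return [[min(_neighbourhood(mask, y, x)) for x in range(len(mask[0]))]
            for y in range(len(mask))]


def clean_map(mask):
    """ASSUMED pipeline: closing fills small holes, opening removes isolated hits."""
    closed = erode(dilate(mask))
    return dilate(erode(closed))


if __name__ == "__main__":
    mask = [[0] * 7 for _ in range(7)]
    for y in range(1, 4):
        for x in range(1, 4):
            mask[y][x] = 1
    mask[2][2] = 0   # missed detection inside a tampered region
    mask[5][5] = 1   # isolated false positive
    for row in clean_map(mask):
        print(row)
```

In the example, the hole inside the 3 × 3 region is filled and the isolated detection is removed.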

After localizing the tampered blocks, their three channels, Y, Cb, and Cr, are recovered by the following process.

(1) By using the table-lookup process, the top-left pixel of the Y block is obtained by looking up its extracted index in the luminance codebook. The codebooks of types 0, 1, and 2 are predefined, and the codebooks of type 3 are taken from the recovered global data. According to the extracted block type, we select the corresponding codebook, and the three differences are obtained from their extracted indices. The other three pixels of the block are decoded by adding the decoded differences back to the pixels from which they were computed. Thus the Y block is recovered.

(2) Through table lookup, the codeword for Cb is obtained by looking up the extracted Cb index in the Cb codebook. The Cb block is constructed by duplicating the codeword four times. The Cr block is obtained in exactly the same way as the Cb block. Thus the Cb and Cr blocks are recovered.

(3) The R, G, and B blocks are obtained by transforming the recovered Y, Cb, and Cr blocks from the YCbCr to the RGB color model. The tampered block is recovered by replacing its R, G, and B channels with the recovered ones.

The chromatic channels of the block are recovered with the approximation, and the luminance channel is recovered with fine details. Such a mechanism preserves the image quality very well.
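The recovery steps can be sketched as follows. The decoding equations are not recoverable from the text, so the sketch assumes that each difference was taken against the top-left pixel, and it assumes the inverse full-range BT.601 (JFIF) transform for the conversion back to RGB. Both choices are illustrative.

```python
def ycbcr_to_rgb(y, cb, cr):
    """Convert one full-range YCbCr sample to 8-bit RGB (ASSUMED BT.601 / JFIF)."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return tuple(max(0, min(255, round(v))) for v in (r, g, b))


def recover_block(top_left, diffs, cb_codeword, cr_codeword):
    """Rebuild a 2x2 RGB block from decoded recovery data.

    top_left: luminance codeword of the top-left pixel.
    diffs: three decoded differences (all zero for a type-0 block).
    ASSUMPTION: every difference is relative to the top-left pixel.
    """
    d1, d2, d3 = diffs
    luminance = (top_left, top_left + d1, top_left + d2, top_left + d3)
    # The chroma codeword is duplicated for all four pixels of the block.
    return [ycbcr_to_rgb(y, cb_codeword, cr_codeword) for y in luminance]


if __name__ == "__main__":
    print(recover_block(100, (3, -1, 1), 128, 128))
```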

3. Experimental Results

Four color images were tested in our experiments: airplane, fruits, school, and lighthouse. The number of blocks is determined by the image size, and the number of duplications of the global data is 18. The original and watermarked images produced by the proposed method are shown in Figure 9. As can be seen, these images all have high visual quality. The performance of our method is compared against that of Liu’s method [9]. Table 3 lists the PSNR values of the watermarked images; it shows that our PSNR levels are quite acceptable and that our method outperforms Liu’s.
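For reference, the PSNR used in such comparisons can be computed as in the sketch below. Treating a color image as one flat sequence of 8-bit samples across all channels is an assumption; the text does not say how the channels are combined.

```python
import math


def psnr(original, distorted, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length sample sequences."""
    if len(original) != len(distorted):
        raise ValueError("inputs must have the same number of samples")
    mse = sum((a - b) ** 2 for a, b in zip(original, distorted)) / len(original)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)


if __name__ == "__main__":
    reference = [52, 55, 61, 59, 79, 61, 76, 61]
    watermarked = [53, 55, 60, 59, 78, 62, 76, 60]
    print(round(psnr(reference, watermarked), 2), "dB")
```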

To demonstrate the effectiveness of our technique, we performed various attacks on the watermarked image including removing the words from airplane, placing a strawberry on the bottom, removing the clock from the building, and placing a boat on water. The tampered images and their detection results are shown in Figure 10. As is obvious in the figure, the tampered regions (blocks) are all correctly detected and localized. The tampered regions (blocks) are further recovered. Figure 11 shows the local areas of the results recovered by the proposed and Liu’s methods. In addition, Table 4 lists the PSNR values of the recovered images produced by the proposed and Liu’s methods. Compared with the original images, we can see that the images recovered by the proposed method have high fidelity both for colors and textures. The PSNR values also verify such results. Comparing the PSNR values between the watermarked and recovered images, the proposed method only produces tiny degradations. However, we can see apparent chromatic degradation in the images produced by Liu’s method. From both Figure 11 and Table 4, it is obvious that our method outperforms Liu’s.

4. Conclusion

An authentication and recovery scheme for color images is proposed in this paper. The approximations of the luminance and chromatic channels are properly calculated, and to reduce the data size, differential coding is used to encode the details of the luminance channel with variable size according to the characteristic of the block. The recovery data which represents the approximation and the details of the image is embedded in the image for data protection. The important data is well protected by using the ECC technique and duplication. The experimental results demonstrate that our technique is able to identify and localize image tampering, while preserving high quality for both watermarked and recovered images.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.