#### Abstract

One of the main objectives of watermarking is to achieve a better tradeoff between robustness and high visual quality of the host image. In recent years, there has been significant development in gray-level image watermarking using fractal-based methods. This paper presents a human visual system (HVS) based fractal watermarking method for color images. In the proposed method, a color pixel is considered as a 3-D vector in *RGB* space, and a general 3 × 3 matrix is used as the scaling operator. Meanwhile, the luminance offset vector is replaced by the range block mean vector. An orthogonalization fractal color coding method is thereby obtained that achieves very high image quality. We also show that the orthogonalization fractal color decoding is a mean vector-invariant iteration, so the range block mean vector is a good place for hiding the watermark. Furthermore, for consistency with the characteristics of the HVS, we carry out the embedding process in the *CIE* $L^*a^*b^*$ space and incorporate a just noticeable difference (JND) profile to ensure watermark invisibility. Experimental results show that the proposed method is robust against various typical attacks while causing an imperceptible change in image quality.

#### 1. Introduction

With the rapid development of the Internet, digital multimedia data such as images, audio, and video are easily reproduced and distributed. Owing to these properties of digital media and the popularity of the Internet, many watermarking methods have been developed for copyright protection. Typically, a watermark can be a random signal, a meaningful message, or a logo. Watermarking is viewed as an effective way to protect a user’s content from illegal distribution [1]. Figure 1 shows a general scheme for digital watermarking. First, an encrypted watermark is inserted into an original image by an embedding algorithm. Then, the watermarked image passes through the transmission channel, which may include attacks such as lossy compression, low-pass filtering, noise addition, and geometric distortion. When the watermark is to be extracted, the corresponding extraction algorithm is applied, usually with the aid of a secret key and the original image. Only the owner of the image knows the key, and no one else can identify the watermark without it. A good watermarking scheme should possess several important properties, for example, invisibility, robustness to various types of attacks, and accurate detection.

In general, digital image watermarking algorithms, which address a wide variety of applications, can be classified into two categories depending on the domain in which the watermark is embedded. The first group works in the spatial domain [3, 4], while the second works in a transform domain, such as the discrete cosine transform (DCT) [5, 6], discrete Fourier transform (DFT) [7], or discrete wavelet transform (DWT) [8, 9]. In spatial domain methods, the watermark is inserted by directly modifying a preselected set of pixels in the image. In transform domain methods, the watermark is embedded by modifying the transformed frequency coefficients. The advantage of watermarking in the frequency domain is that the characteristics of the human visual system (HVS) are better captured by the spectral coefficients, since the HVS is more sensitive to low-frequency coefficients and less sensitive to high-frequency coefficients [10, 11]. For color image watermarking, many methods embed watermarks in the image luminance or process each color channel separately. Observing that human eyes are insensitive to changes in the blue component, Kutter et al. [12] proposed embedding a watermark in the blue component. Barni et al. [13] introduced a color image watermarking scheme based on the cross-correlation of *RGB* channels. In Barni’s scheme, a watermark is embedded within the host image by modifying a subset of full-frame DCT coefficients of each color channel, and the existence of the watermark is verified by calculating the correlation of *RGB* channels. Tsai et al. [14] provided a scheme for embedding a watermark in a quantized color image. Tsui et al. [15] introduced a solution that embeds a watermark in the frequency domain of the chromatic channels by using the spatiochromatic discrete Fourier transform (SCDFT). In Tsui’s method, the chromatic content of a color image is encoded into *CIE* chromaticity coordinates while the achromatic content is encoded as a *CIE* tristimulus value. Kuo and Cheng [16] presented a color watermarking method that combines color edge detection and color quantization using principal axes analysis in 3-D color space. Chou and Liu [17] introduced a color image watermarking scheme that hides watermark signals in the most distortion-tolerable signals within the three color channels of the host image, in the wavelet domain, without resulting in perceivable distortion. Vahedi et al. [18] also proposed a wavelet-based watermarking scheme for color images, in which the strength of the embedded watermark is controlled locally with the aid of the visual characteristics of the host image.

The idea of fractal image coding (FIC) was originally introduced by Barnsley and Demko [19], and in 1992 Jacquin [20] achieved the first practical fractal image coding method. In the past decades, fractal image coding has mainly been developed for image compression [21–33], with some work on image denoising [34–37] and image indexing [38]. Recently, it has also been well studied for gray-level image watermarking [39–43]. Davern and Scott [39] described a steganographic method for inserting secret information into images using fractal image coding. The method allows a user to specify a visual key when hiding the secret information; the same visual key must then be used when retrieving the hidden data. Čandik et al. [40] investigated two possible approaches to embedding digital watermarks into the fractal codes of images: embedding into the parameters for the position of similar blocks and embedding into the coefficients of block similarity. Kiani and Moghaddam [41] presented a method that embeds a binary watermark into an image using a special type of fractal coding whose parameters are the contrast scaling and the mean of the range block, and that uses fuzzy C-means clustering to address the watermark bits. Pi et al. [42] studied a watermarking method utilizing a special type of orthogonalization fractal coding. In that method, a permuted pseudorandom binary sequence used as the watermark is embedded into the range block means, and detection is carried out by computing the correlation coefficient between the original and the extracted watermark. Shahraeini and Yaghoobi [43] proposed a blind watermarking algorithm based on a fractal model in the discrete wavelet domain.

However, few known papers have considered fractal coding for watermarking of color images. In this paper, motivated by [42], we propose an HVS-based fractal color image watermarking algorithm, together with an analogous *RGB* version of orthogonalization fractal coding. As with other watermarking methods, good robustness and high image quality are our goals. In the proposed method, based on the observation that the $R$, $G$, and $B$ components of a natural image are correlated, we consider a pixel as a 3-D vector in *RGB* space. Then, instead of using three independent linear predictions for the pixel intensities of the three planes (namely, the $R$, $G$, and $B$ planes) as in classic fractal color coding [2], we adopt a general form that uses a 3 × 3 matrix as the scaling factor. In addition, the range block mean vector replaces the luminance offset vector. In terms of the quality of the fractal representation measured by PSNR, the proposed fractal coding performs significantly better than traditional fractal color image coding. We also show that orthogonalization fractal color decoding is a mean-invariant iteration, so we use the range block mean vectors to hide the watermark. Furthermore, to accord with the characteristics of the HVS, we convert the range mean vectors from *RGB* space to *CIE* $L^*a^*b^*$ space before the insertion process, which is carried out with the aid of the JND model [44]. The experimental results show that the proposed method is very robust against different attacks while retaining very high image quality.

#### 2. Foundations of Fractal Image Coding

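The theory of fractal-based image coding using the iterated function system (IFS) and the collage theorem was proposed by Barnsley and Demko [19]. Let $(X, d)$ be a complete metric space. An iterated function system (IFS) is a finite set of contraction mappings $\{w_i\}_{i=1}^{N}$, where $w_i : X \to X$ has contraction factor $s_i$ on $(X, d)$. In general, the system is denoted by $\{X; w_1, \ldots, w_N\}$ and its contraction factor is $s = \max\{s_1, \ldots, s_N\}$. Furthermore, on the space $\mathcal{H}(X)$ of nonempty compact subsets of $X$, we define $W : \mathcal{H}(X) \to \mathcal{H}(X)$ as $W(B) = \bigcup_{i=1}^{N} w_i(B)$, $B \in \mathcal{H}(X)$. Then, the transformation $W$ is a contraction mapping on the complete metric space $(\mathcal{H}(X), h)$, where $h$ is the Hausdorff metric. So, according to the contractive mapping fixed-point theorem, there exists a unique fixed point $A \in \mathcal{H}(X)$ satisfying

$$A = W(A) = \bigcup_{i=1}^{N} w_i(A),$$

which is known as the attractor of the IFS. The fixed point can be obtained by the iteration

$$A = \lim_{n \to \infty} W^{\circ n}(B),$$

where $B \in \mathcal{H}(X)$ and $W^{\circ n} = W \circ W^{\circ (n-1)}$. Let $\varepsilon > 0$ be a small given real number. With the above hypotheses, the collage theorem says that if $B \in \mathcal{H}(X)$ satisfies $h(B, W(B)) \le \varepsilon$, then

$$h(B, A) \le \frac{\varepsilon}{1 - s}.$$

The collage theorem implies that if a set can be approximated well by contracted copies of itself under a set of contractive mappings, then the corresponding attractor is close to the set. Thus, only the IFS code for the set needs to be stored, since the approximation can be obtained by iterating the IFS on any initial set.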
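As a small numerical illustration of the contraction property behind the collage theorem, the sketch below iterates a two-map IFS on the real line and measures the Hausdorff distance between successive iterates. The middle-thirds Cantor maps and all function names are assumptions chosen for illustration; they do not come from the paper.

```python
def hausdorff(a, b):
    """Hausdorff distance between two finite, nonempty sets of real numbers."""
    d_ab = max(min(abs(x - y) for y in b) for x in a)
    d_ba = max(min(abs(x - y) for x in a) for y in b)
    return max(d_ab, d_ba)


# Assumed example IFS: the middle-thirds Cantor maps, contraction factor s = 1/3.
MAPS = [lambda x: x / 3.0, lambda x: x / 3.0 + 2.0 / 3.0]


def apply_ifs(points):
    """One application of W(B) = union of w_i(B) over all maps."""
    return sorted({w(x) for x in points for w in MAPS})


def successive_distances(start, steps):
    """Hausdorff distances h(W^n(B), W^(n+1)(B)) for n = 0 .. steps - 1."""
    current = list(start)
    dists = []
    for _ in range(steps):
        nxt = apply_ifs(current)
        dists.append(hausdorff(current, nxt))
        current = nxt
    return dists


# Starting from the single point 0, each distance shrinks by the factor s.
distances = successive_distances([0.0], 5)
```

Because every map contracts by 1/3, each successive distance is at most one third of the previous one, which is why the iteration converges to the attractor from any starting set.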

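However, in practice it is unlikely that “collages” can be found for most natural images using reasonably simple transforms. The well-known practical method for fractal image coding, namely the partitioned iterated function system (PIFS), was proposed by Jacquin [20]. Let $I$ be a given gray-level image, regarded as a subset of three-dimensional Euclidean space; that is, $I \subset \mathbb{R}^3$. The image is first segmented into nonoverlapping range blocks of size, say, $B \times B$, denoted by $\{R_i\}_{i=1}^{N}$. Note that $I = \bigcup_{i=1}^{N} R_i$ and $R_i \cap R_j = \emptyset$ for any $i \neq j$. Let $\mathcal{D}$ be the pool of all domain blocks, which are of size $2B \times 2B$ and are all extracted from $I$; that is, $D \subset I$ for $D \in \mathcal{D}$. Let $w$ denote a transformation of a domain block, where $w$ is usually an affine contractive transformation. For each range block $R_i$, let $d(\cdot, \cdot)$ be a distance measure. Our goal is to find a domain block $D_{k(i)} \in \mathcal{D}$ as well as an appropriate transformation $w_i$ such that

$$d\bigl(R_i, w_i(D_{k(i)})\bigr) = \min_{D \in \mathcal{D},\, w} d\bigl(R_i, w(D)\bigr),$$

where the minimum is taken over all domain blocks in the pool and all admissible transformations. When all the range blocks are processed, the resulting set of maps $\{w_i\}_{i=1}^{N}$ is called the PIFS code of the image $I$.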

In classic fractal coding, the distance always uses the $\ell_2$-norm, $\|\cdot\|_2$, as the so-called collage distance measure. The affine contractive map $w$ consists of three parts, defined as

$$w = G \circ \iota \circ S.$$

Here, $S$ is a uniform scaling from the domain block dimension to the range block dimension; $\iota$ is one of the eight isometries that map a block onto itself (identity, 90°, 180°, and 270° rotations around the center and the four reflections over the symmetry axes); and $G$ is a linear transform of the gray level, defined as

$$G(x) = s\,x + o,$$

where $s$ and $o$ are known as the contrast and luminance offset parameters, respectively, and $s$ is usually restricted to satisfy $|s| < 1$ for contractivity. (Some authors, e.g., [21], have used a different bound on $|s|$ with no noticeable effects on the contractivity.)
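The three parts of the classic map can be sketched on small blocks stored as lists of rows. The 2 × 2 averaging used for the spatial contraction is a common choice assumed here, and the function names are hypothetical.

```python
def contract(block):
    """Spatial contraction S: average disjoint 2x2 cells (2B x 2B -> B x B)."""
    n = len(block) // 2
    return [[(block[2 * i][2 * j] + block[2 * i][2 * j + 1]
              + block[2 * i + 1][2 * j] + block[2 * i + 1][2 * j + 1]) / 4.0
             for j in range(n)] for i in range(n)]


def rot90(block):
    """Rotate a square block by 90 degrees clockwise."""
    return [list(row) for row in zip(*block[::-1])]


def flip_h(block):
    """Reflect a block left to right."""
    return [row[::-1] for row in block]


def eight_isometries(block):
    """Identity, three rotations, and the four reflections of a square block."""
    out = []
    current = [list(row) for row in block]
    for _ in range(4):
        out.append(current)
        out.append(flip_h(current))
        current = rot90(current)
    return out


def gray_map(block, s, o):
    """Gray-level transform G(x) = s * x + o applied to every pixel."""
    return [[s * v + o for v in row] for row in block]
```

A candidate approximation of a range block is then `gray_map(iso, s, o)` for each `iso` in `eight_isometries(contract(domain_block))`.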

#### 3. Fractal Color Image Coding

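The traditional fractal color image coding method in *RGB* space discussed in [2] for the purpose of compression uses three gray-level linear predictions to encode the $R$, $G$, and $B$ image components independently. It is defined as

$$\begin{pmatrix} \hat{R} \\ \hat{G} \\ \hat{B} \end{pmatrix} = \begin{pmatrix} s_R & 0 & 0 \\ 0 & s_G & 0 \\ 0 & 0 & s_B \end{pmatrix} \begin{pmatrix} R \\ G \\ B \end{pmatrix} + \begin{pmatrix} o_R \\ o_G \\ o_B \end{pmatrix},$$

where $(R, G, B)^T$ represents the $R$, $G$, $B$ values of a pixel and $(o_R, o_G, o_B)^T$ denotes the corresponding offset values. However, for most natural images, the components $R$, $G$, and $B$ are strongly correlated. For instance, for the Lena image, the pairwise correlations among the $R$, $G$, and $B$ components are 0.8786, 0.6764, and 0.9106, and the joint correlation of $R$, $G$, and $B$ is 0.9881, estimated as in [45] from the covariance matrix $\Sigma$ of the vectors $R$, $G$, and $B$. Likewise, for the Barbara image, the pairwise correlations are 0.8860, 0.8061, and 0.9582, and the joint correlation is 0.9921. Based on this observation, in this work we consider the $R$, $G$, $B$ components as a whole and adopt a more general functional form:

$$\hat{\mathbf{r}} = A\,\mathbf{d} + \mathbf{o},$$

where $\mathbf{d} = (d^R, d^G, d^B)^T$ is a pixel of the (contracted) domain block, $\mathbf{o}$ is the offset vector, and the linear operator $A$ is defined by a 3 × 3 matrix:

$$A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}.$$

In general, the distance between two pixels $\mathbf{u}$ and $\mathbf{v}$ is defined by using the $\ell_2$-norm (Euclidean distance), that is,

$$d(\mathbf{u}, \mathbf{v}) = \|\mathbf{u} - \mathbf{v}\|_2 = \sqrt{(u^R - v^R)^2 + (u^G - v^G)^2 + (u^B - v^B)^2}.$$

Let $\mathbf{r}_i$ and $A\mathbf{d}_i + \mathbf{o}$, $i = 1, \ldots, n$, where $n$ denotes the number of pixels of the range block, be the ($R$, $G$, $B$) values on the range block and on the transformed domain block, respectively. Then the distance between the range block and the transformed domain block can be determined by

$$E(A, \mathbf{o}) = \sum_{i=1}^{n} \bigl\| \mathbf{r}_i - (A\mathbf{d}_i + \mathbf{o}) \bigr\|_2^2. \tag{13}$$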

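The squared *norm* is used in (13) to facilitate the estimation of the matrix $A$. According to matrix theory, formula (13) can be represented using the well-known (squared) *Frobenius norm*; that is,

$$E(A, \mathbf{o}) = \bigl\| \mathbf{R} - A\mathbf{D} - \mathbf{o}\mathbf{1}^T \bigr\|_F^2,$$

where $\mathbf{D} = (\mathbf{d}^R, \mathbf{d}^G, \mathbf{d}^B)^T$ and $\mathbf{R} = (\mathbf{r}^R, \mathbf{r}^G, \mathbf{r}^B)^T$ are $3 \times n$ matrices, $A$ is the 3 × 3 scaling matrix, $\mathbf{o}$ is the offset vector, and $\mathbf{1} = (1, \ldots, 1)^T$ is an $n$-dimensional vector. Here $\mathbf{d}^R$, $\mathbf{d}^G$, and $\mathbf{d}^B$ denote the respective $R$, $G$, $B$ values of the pixels in the domain block. Likewise, $\mathbf{r}^R$, $\mathbf{r}^G$, and $\mathbf{r}^B$ denote the respective $R$, $G$, and $B$ values of the pixels of the range block.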

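In order to minimize the residual $E$, we rewrite (13) as

$$E(A, \mathbf{o}) = \sum_{i=1}^{n} (\mathbf{r}_i - A\mathbf{d}_i - \mathbf{o})^T (\mathbf{r}_i - A\mathbf{d}_i - \mathbf{o}),$$

differentiate it with respect to $\mathbf{o}$ and $A$, and set the derivatives to zero; that is,

$$\frac{\partial E}{\partial \mathbf{o}} = -2 \sum_{i=1}^{n} (\mathbf{r}_i - A\mathbf{d}_i - \mathbf{o}) = \mathbf{0} \tag{16}$$

and

$$\frac{\partial E}{\partial A} = -2 \sum_{i=1}^{n} (\mathbf{r}_i - A\mathbf{d}_i - \mathbf{o})\,\mathbf{d}_i^T = 0,$$

yielding

$$A \sum_{i=1}^{n} \mathbf{d}_i \mathbf{d}_i^T + \mathbf{o} \sum_{i=1}^{n} \mathbf{d}_i^T = \sum_{i=1}^{n} \mathbf{r}_i \mathbf{d}_i^T. \tag{18}$$

Inserting (16) into (18), we have

$$A \sum_{i=1}^{n} (\mathbf{d}_i - \bar{\mathbf{d}})(\mathbf{d}_i - \bar{\mathbf{d}})^T = \sum_{i=1}^{n} (\mathbf{r}_i - \bar{\mathbf{r}})(\mathbf{d}_i - \bar{\mathbf{d}})^T. \tag{19}$$

For a test image, let the $R$, $G$, $B$ values $\mathbf{d}_i$ and $\mathbf{r}_i$, $i = 1, \ldots, n$, be random samples of the random variables representing the pixel value distributions of the parent subblock and the child subblock, respectively. Then formula (19) can be written compactly as

$$A\,\Sigma_{DD} = \Sigma_{RD},$$

where $\Sigma_{DD}$ is the covariance matrix of the parent subblock pixels and $\Sigma_{RD}$ is the cross-covariance matrix between the child and parent subblock pixels. As a result, if $\Sigma_{DD}$ is an invertible matrix, which can easily be ensured by adding extremely small positive random numbers to each of its elements, then

$$A = \Sigma_{RD}\,\Sigma_{DD}^{-1}.$$

Moreover, summing in (16) gives

$$\mathbf{o} = \bar{\mathbf{r}} - A\,\bar{\mathbf{d}}, \tag{23}$$

where $\bar{\mathbf{r}} = \frac{1}{n}\sum_{i=1}^{n} \mathbf{r}_i$ and $\bar{\mathbf{d}} = \frac{1}{n}\sum_{i=1}^{n} \mathbf{d}_i$.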
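The least-squares estimate described above (a 3 × 3 matrix from the covariances, then the offset from the block means) can be sketched in plain Python. The function names and the sample pixel values are hypothetical; the range pixels are generated from a known map so that the fit can be checked.

```python
def mean_vec(pixels):
    """Component-wise mean of a list of 3-component pixels."""
    n = len(pixels)
    return [sum(p[c] for p in pixels) / n for c in range(3)]


def cross_cov(xs, ys):
    """3x3 matrix (1/n) * sum_i (x_i - mean_x)(y_i - mean_y)^T."""
    mx, my, n = mean_vec(xs), mean_vec(ys), len(xs)
    return [[sum((x[j] - mx[j]) * (y[k] - my[k]) for x, y in zip(xs, ys)) / n
             for k in range(3)] for j in range(3)]


def inv3(m):
    """Inverse of a 3x3 matrix via the adjugate; raises if nearly singular."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("domain covariance is singular")
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[v / det for v in row] for row in adj]


def matmul3(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[j][t] * b[t][k] for t in range(3)) for k in range(3)]
            for j in range(3)]


def fit_color_map(domain, rng):
    """Least-squares A (3x3) and o (3-vector) with rng_i ~ A domain_i + o."""
    A = matmul3(cross_cov(rng, domain), inv3(cross_cov(domain, domain)))
    d_bar, r_bar = mean_vec(domain), mean_vec(rng)
    o = [r_bar[j] - sum(A[j][k] * d_bar[k] for k in range(3)) for j in range(3)]
    return A, o


# Hypothetical data: range pixels produced by a known map, then re-estimated.
A_TRUE = [[0.5, 0.1, 0.0], [0.2, 0.3, 0.1], [-0.1, 0.0, 0.6]]
O_TRUE = [5.0, -3.0, 12.0]
DOMAIN = [(10, 20, 30), (40, 10, 5), (7, 90, 12),
          (60, 60, 200), (1, 2, 3), (100, 50, 25)]
RANGE = [[sum(A_TRUE[j][k] * d[k] for k in range(3)) + O_TRUE[j]
          for j in range(3)] for d in DOMAIN]
A_fit, o_fit = fit_color_map(DOMAIN, RANGE)
```

This sketch raises an error for a singular covariance instead of perturbing it; the text's suggestion of adding tiny positive values would be an alternative.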

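According to *Banach’s fixed point theorem*, in order to guarantee the contractivity of the fractal transformation with respect to the $\ell_2$ *norm*, $A$ should satisfy the restriction $\|A\|_2 < 1$, which implies that the maximum eigenvalue of $A^T A$ must be smaller than one. There is no simple relationship between the coefficients of $A$ and $\|A\|_2$. However, in the $\ell_\infty$ norm, contractivity is guaranteed if all $a_{jk}$ satisfy the condition $\sum_{k=1}^{3} |a_{jk}| < 1$ for each $j$. The resulting fractal transforms satisfying this condition are always contractive in $\ell_\infty$, hence in $\ell_2$, due to the equivalence of the norms in finite pixel space. For this reason, we “clamp” the coefficients of $A$ by checking whether each row of $A$ satisfies $\sum_{k} |a_{jk}| < 1$. If $\sum_{k} |a_{jk}| \ge 1$ for some $j$, we re-encode the $j$th component of the child subblock by “discarding” one or two unimportant coefficient(s) in the $j$th row of $A$. First, the correlation coefficients between the $j$th component of the child subblock and the $R$, $G$, $B$ components of the corresponding parent subblock are calculated and denoted by $\rho_1$, $\rho_2$, $\rho_3$, respectively. Second, the magnitudes of the three correlations are sorted in ascending order, for example, $|\rho_1| \le |\rho_2| \le |\rho_3|$, which means that the third parent component has maximum correlativity. Finally, we “clamp” the coefficients in the following way:

(i) If $\rho_2$ has the same sign as $\rho_3$, then set $a_{j1} = 0$ and recompute $a_{j2}$ and $a_{j3}$ from the second and third parent components only (equation (24)). If the row-sum condition is still violated, go to (iv).

(ii) If $\rho_2$ and $\rho_3$ have different signs whereas $\rho_1$ and $\rho_3$ have the same sign, then set $a_{j2} = 0$; $a_{j1}$ and $a_{j3}$ are computed with an equation similar to (24), with the second parent component replaced by the first. If the row-sum condition is still violated, go to (iv).

(iii) If both $\rho_1$ and $\rho_2$ differ in sign from $\rho_3$, then go to (iv).

(iv) Set $a_{j1} = a_{j2} = 0$ and recompute $a_{j3}$ from the third parent component alone. If $|a_{j3}| \ge 1$, then $a_{j3}$ is “clamped” to a value of magnitude below one.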
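The row-sum test that triggers clamping can be sketched as follows. The matrix values and helper names are hypothetical, and the sketch only detects the offending rows; it does not reproduce the re-encoding steps (i) to (iv).

```python
def abs_row_sums(A):
    """Sum of absolute values in each row of A."""
    return [sum(abs(v) for v in row) for row in A]


def rows_to_clamp(A):
    """Indices j whose absolute row sum is >= 1, i.e. rows failing the test."""
    return [j for j, s in enumerate(abs_row_sums(A)) if s >= 1.0]


def sup_norm(v):
    """Largest absolute component of a vector."""
    return max(abs(x) for x in v)


def apply_matrix(A, v):
    """Matrix-vector product A v for a 3x3 matrix and a 3-vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


# Hypothetical scaling matrix: only the middle row violates the bound.
A_EXAMPLE = [[0.5, 0.2, 0.1],
             [0.9, 0.3, 0.0],
             [0.0, 0.0, 0.99]]
bad_rows = rows_to_clamp(A_EXAMPLE)
```

The largest absolute row sum is the operator norm of the matrix in the sup norm, so keeping every row sum below one bounds how much any pixel difference can grow under the map.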

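Once the “clamped” $A$ is determined, $\mathbf{o}$ can be calculated by (23). Accordingly, the (nonorthogonalization) fractal decoding algorithm with $A$ and $\mathbf{o}$ is given by

$$\mathbf{r}_i^{(k+1)} = A\,\mathbf{d}_i^{(k)} + \mathbf{o}.$$

If $A$ and $\bar{\mathbf{r}}$ are taken as the fractal parameters, the orthogonalization decoding algorithm is

$$\mathbf{r}_i^{(k+1)} = A\bigl(\mathbf{d}_i^{(k)} - \bar{\mathbf{d}}^{(k)}\bigr) + \bar{\mathbf{r}}, \tag{26}$$

where $\bar{\mathbf{r}}$ denotes the mean of the $R$, $G$, and $B$ components of the range block, that is, $\bar{\mathbf{r}} = (\bar{r}^R, \bar{r}^G, \bar{r}^B)^T$; $\mathbf{d}_i^{(k)}$ are the pixels of the matched domain block taken from the $k$th iteration, and $\bar{\mathbf{d}}^{(k)}$ is their mean. From (26), it can be readily verified that the decoding is a mean vector-invariant iteration; that is,

$$\frac{1}{n} \sum_{i=1}^{n} \mathbf{r}_i^{(k+1)} = \bar{\mathbf{r}} \quad \text{for all } k.$$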
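The mean-invariance claim can be checked numerically with one orthogonalized decoding step on a single block. This is a sketch with hypothetical names and made-up pixel values, not the full decoder.

```python
def decode_step(A, domain_pixels, range_mean):
    """One orthogonalized step: r_i = A (d_i - mean(d)) + range_mean."""
    n = len(domain_pixels)
    d_bar = [sum(p[c] for p in domain_pixels) / n for c in range(3)]
    out = []
    for p in domain_pixels:
        centered = [p[c] - d_bar[c] for c in range(3)]
        out.append([sum(A[j][k] * centered[k] for k in range(3)) + range_mean[j]
                    for j in range(3)])
    return out


def block_mean(pixels):
    """Component-wise mean of the decoded block."""
    n = len(pixels)
    return [sum(p[c] for p in pixels) / n for c in range(3)]


# Hypothetical block: arbitrary domain pixels, scaling matrix, and stored mean.
A_DEMO = [[0.4, 0.1, 0.0], [0.0, 0.5, 0.2], [0.1, 0.0, 0.3]]
DOMAIN_DEMO = [(12, 200, 31), (90, 14, 77), (5, 5, 250), (130, 60, 60)]
MEAN_DEMO = [101.0, 87.5, 64.25]
decoded = decode_step(A_DEMO, DOMAIN_DEMO, MEAN_DEMO)
decoded_mean = block_mean(decoded)
```

Since the centered domain pixels sum to zero, the decoded block mean equals the stored range mean whatever the domain content is, which is what makes the mean a stable carrier for the watermark.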

In Figure 2, we test the proposed method on the 512 × 512 color Lena (Figure 2(a)), Barbara (Figure 2(d)), and Peppers (Figure 2(g)) images using a full search scheme. We set the range block size to 4 × 4 and the search step size to 2. In terms of PSNR, the images in Figures 2(b), 2(e), and 2(h), encoded by the proposed method with PSNRs of 34.71 dB, 30.37 dB, and 32.01 dB, are clearly better than those in Figures 2(c), 2(f), and 2(i), encoded by the traditional fractal image coding in [2] with PSNRs of 32.54 dB, 28.57 dB, and 30.35 dB; the gain is roughly 1.7 to 2.2 dB.

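**Figure 2:** (a) Lena; (b) PSNR 34.71 dB; (c) PSNR 32.54 dB; (d) Barbara; (e) PSNR 30.37 dB; (f) PSNR 28.57 dB; (g) Peppers; (h) PSNR 32.01 dB; (i) PSNR 30.35 dB. Panels (b), (e), and (h) are encoded by the proposed method; panels (c), (f), and (i) by the traditional method in [2].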

#### 4. Fractal Color Watermarking

In general, there are two types of watermark that can be embedded into an image. One is a pseudorandom sequence, used for objective detection; the other is a binary or gray-level image, used for subjective detection [1]. In this paper, a binary logo image is used so that the ownership of an image can be verified subjectively from the extracted watermark; each watermark bit is either 1 or 0. One of the main challenges of watermarking is to achieve a tradeoff between robustness and imperceptibility. In general, increasing the strength of the embedded watermark improves robustness but also increases the visible distortion, and vice versa. Since the orthogonalized fractal decoding is a mean-invariant iteration, the range block mean is a robust place to hide a watermark, as discussed in [42]. After fractal decoding, the embedded watermark diffuses throughout the reconstructed image.

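In order to gain high robustness as well as low visibility in color image watermarking, knowledge of human visual perception of color stimuli must be well utilized in designing embedding algorithms. It is well known that the human visual system (HVS) is not a perfect sensor of color visual information. Although the HVS is particularly sensitive to changes in image hue, it is less sensitive to the yellow-blue component. *CIE* $L^*a^*b^*$ is the most complete color model conventionally used to describe all of the colors that are visible to the human eye [10]. The parameter $L^*$ represents the lightness of the color, whereas $a^*$ and $b^*$ carry the chromatic information; $a^*$ is usually called the magenta-green axis and $b^*$ the yellow-blue axis. Based on these observations, in the method here we convert the $R$, $G$, and $B$ values of the range block means to $L^*a^*b^*$ space before the watermark is inserted. The transformation from *RGB* space to the *CIE* $L^*a^*b^*$ space goes through the $XYZ$ space. The *RGB* space is first converted to the $XYZ$ space through a linear transformation

$$(X, Y, Z)^T = M\,(R, G, B)^T,$$

where $M$ is the 3 × 3 conversion matrix fixed by the *RGB* primaries and the reference white, and the $XYZ$ space is converted to the *CIE* $L^*a^*b^*$ space through a nonlinear transformation

$$L^* = 116\, f\!\left(\frac{Y}{Y_n}\right) - 16, \qquad a^* = 500\left[ f\!\left(\frac{X}{X_n}\right) - f\!\left(\frac{Y}{Y_n}\right) \right], \qquad b^* = 200\left[ f\!\left(\frac{Y}{Y_n}\right) - f\!\left(\frac{Z}{Z_n}\right) \right],$$

where

$$f(t) = \begin{cases} t^{1/3}, & t > 0.008856, \\ 7.787\,t + \dfrac{16}{116}, & \text{otherwise}, \end{cases}$$

and $X_n$, $Y_n$, $Z_n$ represent the reference white. Conversely, the reverse transformation is easily expressed by using the inverse of the function $f$ above:

$$X = X_n\, f^{-1}\!\left(\frac{L^* + 16}{116} + \frac{a^*}{500}\right), \qquad Y = Y_n\, f^{-1}\!\left(\frac{L^* + 16}{116}\right), \qquad Z = Z_n\, f^{-1}\!\left(\frac{L^* + 16}{116} - \frac{b^*}{200}\right),$$

where

$$f^{-1}(t) = \begin{cases} t^{3}, & t > 0.206893, \\ \dfrac{t - 16/116}{7.787}, & \text{otherwise}. \end{cases}$$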

The color difference at each pixel is defined as

$$\Delta E = \sqrt{(\Delta L^*)^2 + (\Delta a^*)^2 + (\Delta b^*)^2}.$$

Under a specific viewing condition, there exists a certain amount of information that is not perceivable by the human eye, the so-called perceptual redundancy. Experiments have shown that a sufficiently small color difference $\Delta E$ is not detectable by the HVS and that a somewhat larger one is still not apparent [46]; the corresponding threshold is called the uniform just-noticeable color difference (UJNCD). Since the sensitivity of the HVS to the yellow-blue component is approximately 1/5 of that to the luminance component [10], the embedding offset is distributed over the $L^*$, $a^*$, and $b^*$ components accordingly while $\Delta E$ is kept within the UJNCD. Assume a randomly scrambled watermark $W = \{w_i\}$ with $w_i \in \{0, 1\}$, one bit per range block. The watermark insertion procedure is illustrated in Figure 3 and includes the following steps.

(1) Fractally encode the original image (using the full search scheme) to produce the fractal codes in *RGB* space, which consist, for each range block, of the scaling matrix, the range block mean, and the position of the optimal domain block.

(2) Convert the range block means from *RGB* space to $L^*a^*b^*$ space.

(3) Embed the permuted watermark into the converted means. (i) If the bit to be embedded is $w_i = 1$, replace the mean with its bit-1 value, obtained by shifting it by the JND-bounded offset. (ii) Otherwise, if $w_i = 0$, replace the mean with the corresponding bit-0 value.

(4) Convert the modified means back to *RGB* space.

(5) Hide the watermark by performing fractal decoding using the modified means.
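The color-space conversion used in step (2) and the color difference $\Delta E$ can be sketched as follows. The *RGB* to $XYZ$ matrix below (linear sRGB primaries with a D65 white, *RGB* values scaled to [0, 1]) is an assumption, since the paper's own matrix is not recoverable from the text; the function names are hypothetical.

```python
import math

# Assumed RGB -> XYZ matrix (linear sRGB primaries, D65 white). Illustrative
# only: the matrix actually used in the paper is not given in the text.
M = [[0.4124, 0.3576, 0.1805],
     [0.2126, 0.7152, 0.0722],
     [0.0193, 0.1192, 0.9505]]
WHITE = [sum(row) for row in M]  # XYZ of RGB = (1, 1, 1), used as reference white


def rgb_to_xyz(rgb):
    """Linear map from RGB (components in [0, 1]) to XYZ."""
    return [sum(m * c for m, c in zip(row, rgb)) for row in M]


def f(t):
    """CIE L*a*b* companding function."""
    return t ** (1.0 / 3.0) if t > 0.008856 else 7.787 * t + 16.0 / 116.0


def f_inv(t):
    """Inverse of the companding function."""
    return t ** 3 if t > 0.206893 else (t - 16.0 / 116.0) / 7.787


def xyz_to_lab(xyz):
    """XYZ to (L*, a*, b*) relative to WHITE."""
    fx, fy, fz = (f(v / w) for v, w in zip(xyz, WHITE))
    return [116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz)]


def lab_to_xyz(lab):
    """(L*, a*, b*) back to XYZ relative to WHITE."""
    L, a, b = lab
    fy = (L + 16.0) / 116.0
    fx, fz = fy + a / 500.0, fy - b / 200.0
    return [w * f_inv(v) for v, w in zip((fx, fy, fz), WHITE)]


def delta_e(lab1, lab2):
    """Euclidean color difference in L*a*b* space."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(lab1, lab2)))
```

With this choice of reference white, `xyz_to_lab(rgb_to_xyz((1.0, 1.0, 1.0)))` gives a lightness of 100 with zero chroma, which is a quick sanity check on the conversion.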

To extract the embedded watermark, the original image and the watermarked (possibly attacked) image are needed. The two candidate means, corresponding to bit 1 and bit 0, are calculated from the original image in $L^*a^*b^*$ space, followed by a distance comparison with the range block mean obtained from the watermarked image. If the extracted mean is closer to the candidate for bit 1, we regard “1” as embedded; otherwise, “0” was inserted. The flow diagram is illustrated in Figure 4. Watermark extraction is very fast since it requires neither fractal encoding nor decoding.
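The embed-then-compare logic for one range block mean can be sketched as below. The symmetric plus/minus offset and its numeric values are assumptions made for illustration; the text only states that two candidate means are formed from the original and that the nearer one decides the bit.

```python
def shifted(mean_lab, bit, delta):
    """Candidate mean for a bit: mean + delta for 1, mean - delta for 0.

    The symmetric offset is an assumed form, not taken from the paper.
    """
    sign = 1.0 if bit == 1 else -1.0
    return [m + sign * d for m, d in zip(mean_lab, delta)]


def sq_dist(p, q):
    """Squared Euclidean distance between two L*a*b* triples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def extract_bit(original_mean, observed_mean, delta):
    """Return 1 if the observed mean is nearer the bit-1 candidate, else 0."""
    d1 = sq_dist(observed_mean, shifted(original_mean, 1, delta))
    d0 = sq_dist(observed_mean, shifted(original_mean, 0, delta))
    return 1 if d1 < d0 else 0


# Hypothetical values: a range block mean in L*a*b* and an offset that is
# larger on the b* axis than on the L* axis.
MEAN_LAB = [60.0, 10.0, -20.0]
DELTA = [0.5, 0.0, 2.5]
NOISE = [0.1, -0.2, 0.3]  # stands in for a mild attack

recovered = []
for bit in (0, 1):
    marked = shifted(MEAN_LAB, bit, DELTA)
    attacked = [m + e for m, e in zip(marked, NOISE)]
    recovered.append(extract_bit(MEAN_LAB, attacked, DELTA))
```

As long as the attack moves the mean by less than the gap between the two candidates, the nearest-candidate rule returns the embedded bit.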

#### 5. Experimental Results

To verify that the proposed algorithm is robust against attacks, a series of experiments was conducted by applying attacks to the watermarked 512 × 512 color Lena, Barbara, and Peppers images, whose originals are shown in Figures 2(a), 2(d), and 2(g). The size of the color range block is 4 × 4, so the original image is partitioned into 128 × 128 (i.e., 16384) color range blocks. For convenience, the binary logo image to be embedded is also of size 128 × 128. The logo image shown in Figure 5(a), a panda, is encrypted by the block shifting method [47]. That is, we first divide the logo image into 32 × 32 small blocks, each with 4 × 4 pixels. Then, the positions of the small blocks are randomly shifted according to a random matrix, which is generated by randomly scrambling the integers $1, 2, \ldots, 1024$. As an example, the resulting image is shown in Figure 5(b). To recover the original information, the same random matrix must be used; otherwise, the correct information cannot be obtained.
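The block shifting step can be sketched as a keyed permutation of image blocks. The seeded shuffle used as the key and all names below are assumptions made for illustration; the method in [47] may build its random matrix differently.

```python
import random


def split_blocks(image, block):
    """Cut an image (list of rows) into block x block tiles, row-major order."""
    h, w = len(image), len(image[0])
    return [[row[bx:bx + block] for row in image[by:by + block]]
            for by in range(0, h, block) for bx in range(0, w, block)]


def join_blocks(blocks, h, w, block):
    """Reassemble tiles (row-major order) into an h x w image."""
    image = [[0] * w for _ in range(h)]
    per_row = w // block
    for k, tile in enumerate(blocks):
        by, bx = (k // per_row) * block, (k % per_row) * block
        for dy, row in enumerate(tile):
            image[by + dy][bx:bx + block] = row
    return image


def make_key(n_blocks, seed):
    """Assumed key: a seeded random permutation of the block indices."""
    perm = list(range(n_blocks))
    random.Random(seed).shuffle(perm)
    return perm


def scramble(image, block, perm):
    """Place original tile perm[k] at tile position k."""
    tiles = split_blocks(image, block)
    return join_blocks([tiles[p] for p in perm], len(image), len(image[0]), block)


def unscramble(image, block, perm):
    """Invert scramble() when given the same permutation."""
    tiles = split_blocks(image, block)
    restored = [None] * len(perm)
    for k, p in enumerate(perm):
        restored[p] = tiles[k]
    return join_blocks(restored, len(image), len(image[0]), block)
```

For the 128 × 128 logo with 4 × 4 tiles, the key would be a permutation of 1024 block indices, for example `make_key(1024, seed)` with a secret seed.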

**Figure 5:** (a) original logo image; (b) scrambled logo image.

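The attacked image quality is measured by the *peak signal-to-noise ratio* (PSNR), defined as

$$\mathrm{PSNR} = 10 \log_{10} \frac{255^2}{\mathrm{MSE}},$$

where

$$\mathrm{MSE} = \frac{1}{HW} \sum_{i=1}^{H} \sum_{j=1}^{W} \bigl( f(i, j) - f'(i, j) \bigr)^2.$$

Here, $H$ and $W$ are the height and width of the image, respectively, and $f(i, j)$ and $f'(i, j)$ are the values located at coordinates $(i, j)$ of the original image and the attacked image, respectively.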
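A minimal PSNR computation, assuming 8-bit samples stored as lists of rows, is sketched below. How the three color channels are pooled is not stated in the text; here a color image is assumed to be passed one channel at a time.

```python
import math


def psnr(original, attacked, peak=255.0):
    """PSNR in dB between two equally sized 2-D arrays of sample values."""
    h, w = len(original), len(original[0])
    mse = sum((a - b) ** 2
              for row_o, row_a in zip(original, attacked)
              for a, b in zip(row_o, row_a)) / (h * w)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)
```

For example, an error of exactly one gray level at every pixel gives an MSE of 1 and a PSNR of about 48.13 dB.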

When extracting the watermark, the *normalized correlation coefficient* (NC) is calculated from the original watermark and the extracted watermark to judge the existence of the watermark and to measure the correctness of the extracted watermark. It is defined as

$$\mathrm{NC} = \frac{1}{H_w W_w} \sum_{i=1}^{H_w} \sum_{j=1}^{W_w} w(i, j)\, w'(i, j),$$

where $H_w$ and $W_w$ are the height and width of the watermark, respectively, and $w(i, j)$ and $w'(i, j)$ are the watermark bits located at $(i, j)$ of the original watermark and the extracted watermark. Here $w(i, j)$ is set to 1 if the watermark bit is 1; otherwise, it is set to −1, and likewise for $w'(i, j)$. The PSNRs of the three watermarked images are 33.30 dB, 29.05 dB, and 30.59 dB, respectively. For brevity, only the Lena image is shown. Figure 6(a) shows that the unattacked watermarked image has no perceptible visual difference from Figure 2(b). The extracted logo image shown in Figure 6(b), with $\mathrm{NC} = 1$, implies that the extracted watermark is exactly the same as the original watermark.
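The NC measure with bits mapped to ±1 can be sketched as follows; the function name is hypothetical and the watermarks are assumed to be equally sized lists of rows of 0/1 bits.

```python
def normalized_correlation(original_bits, extracted_bits):
    """NC of two binary watermarks, with bit 1 -> +1 and bit 0 -> -1."""
    total = 0
    count = 0
    for row_o, row_e in zip(original_bits, extracted_bits):
        for a, b in zip(row_o, row_e):
            total += (1 if a == 1 else -1) * (1 if b == 1 else -1)
            count += 1
    return total / count
```

With this mapping, NC equals 1 for a perfect extraction, −1 when every bit is flipped, and is near 0 when the extracted bits are unrelated to the original.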

**Figure 6:** (a) watermarked Lena image; (b) extracted logo image.

Now, various classic image-processing operations, including JPEG compression, median filtering, mean filtering, sharpening, blurring, noise addition, cropping, and scaling, are simulated to investigate the robustness of the proposed watermarking method. JPEG is one of the most frequently used image compression formats. The JPEG quality factor is a number between 0 and 100. As the quality factor decreases, the compression ratio increases, whereas the quality of the resulting image is significantly reduced. Table 1 lists the results of applying various JPEG quality factors to the test images. The proposed method can detect the existence of a watermark even for a quality factor of 15, and the NC values still exceed 0.49 for the three images. Other nongeometric attacks, such as median filtering, mean filtering, sharpening, and blurring, leave the resulting images blurred or sharpened at the edges. The extracted watermarks are illustrated in Table 2. Here, both the median and mean filters are applied with masks of size 3 × 3, 5 × 5, and 7 × 7. As can be seen, the proposed method is very robust against these nongeometric attacks, and most of the extracted logo images are only slightly contaminated.

The results of noise addition and some common geometric attacks, for example, rotation, scaling, and cropping, are illustrated in Table 3. The Gaussian noise variance is varied from 5 to 15 in steps of 5. As shown, the proposed method can effectively resist the attacks, and all of the extracted watermarks can be clearly recognized. The scaling attack is performed by downsampling a watermarked color image by a factor of 2 in both the row and column directions. Since the image is shrunk from 512 × 512 to 256 × 256, all range blocks are contracted from 4 × 4 to 2 × 2, and the range block means should be calculated over 2 × 2 blocks. For the cropping attack in our test, three types, namely, Type I, Type II, and Type III, are considered. In Type I, a quarter-size region at the top-left of the image is cut away (i.e., 75% remains), as shown in Figure 7(a); in Type II, the positions of the left-top and the right-bottom of the remaining subimage are (31,471) and (81,461), respectively, so that the remaining region is about 64%, as shown in Figure 7(b); in Type III, two rectangular regions are removed and the remaining part is about 79%, as shown in Figure 7(c). In practical use, the position of the clipped block usually needs to be located, especially in the case of Type II. Similar to the method in [42], this can be easily achieved by moving the clipped block point by point from the top-left of the watermarked image and calculating the correlation coefficient. Clearly, the clipped block should be located at the position where the correlation coefficient is maximum (theoretically, it equals 1 in the absence of any other attack). Then, to facilitate obtaining the subwatermark of the clipped block correctly, the vacant region is filled with 0 (black). Hence, the size of the patched color image is still 512 × 512. Finally, the watermark extraction procedure is performed and the NC value of the watermark is calculated, as described in Section 4.
The results are also listed in Table 3. It is worth noting that the NC value is closely related to the size of the clipped block: a smaller clipped block generally means a smaller NC value. So, a more reasonable method for detecting the existence of a watermark in the clipped block is to compute the NC value of only the subwatermark of the clipped region (it is almost 1 in theory).

**Figure 7:** Cropping attacks. (a) Type I; (b) Type II; (c) Type III.

We next compare the proposed method with three traditional methods, those of Tsui et al. [15], Kutter et al. [12], and Cox et al. [5], using the Lena image. The results are shown in Table 4. In the comparisons, all the methods still use the panda as the test binary logo image, but its size becomes 64 × 64, obtained by downsampling the original 128 × 128 logo image. In our proposed method, the logo must be expanded to 128 × 128 by filling with 0 on its three neighboring regions (left, bottom, and left-bottom) before it is embedded into the Lena image. Correspondingly, at the end of the extraction process, the logo is obtained by clipping the 64 × 64 region at the left-top of the extracted logo image. As shown in Table 4, the proposed method generally outperforms the other three methods for various attacks.

#### 6. Conclusion

In this work, we have developed a fractal color image watermarking method and assessed its performance. The main objectives of watermarking are to achieve good robustness against attacks while retaining high image quality. In the proposed method, we consider a pixel as a 3-D vector, and an *RGB* version of orthogonalization fractal coding is performed in *RGB* space. To obtain higher image quality, instead of using three independent linear functions as in classic fractal color coding, we use a general form of the fractal affine transform, with the range block mean vector as the luminance offset and a 3 × 3 matrix as the contrast scaling. In terms of image quality measured by PSNR, the proposed fractal color coding significantly outperforms the traditional one. Since the compression ratio is not the main concern for watermarking, we do not discuss the quantization of the fractal parameters. We also show that the orthogonalization fractal color decoding is a mean vector-invariant iteration; hence, the range block mean vector is a good place for embedding the watermark. For consistency with the human visual system and to keep the watermark as invisible as possible, we further implement the hiding procedure in the *CIE* $L^*a^*b^*$ space and incorporate a JND scheme. Our experiments show that the proposed watermarking method has good robustness against various distortions, such as JPEG compression and median filtering, and against geometric distortions such as scaling and cropping, with an imperceptible change in image quality.

#### Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

#### Acknowledgments

The authors would like to thank the anonymous referees for their helpful comments and suggestions that led to a significant improvement of the paper. This work was supported by the National Natural Science Foundation of China (nos. 61003178, 61070087, 61373087, 11201312, 61272252, and 11071150), and by the Municipal Science and Technology Plan of Shenzhen in China (JC201105170615A, JC201005280508A).