Abstract

Current research on reversible watermarking focuses on reducing image distortion. Addressing this issue, this paper presents an improved method to lower the embedding distortion of the prediction-error expansion (PE) technique. First, the extreme learning machine (ELM), which has good generalization ability, is utilized to enhance the accuracy of the pixel-value prediction during watermark embedding; the lower prediction error results in reduced image distortion. Moreover, an optimization of the ELM is carried out to further lessen the embedding distortion. With two popular predictors, namely, the median edge detector (MED) predictor and the gradient-adjusted predictor (GAP), experimental results on classical images and the Kodak image set indicate that the proposed scheme reduces image distortion compared with the classical PE scheme of Thodi et al. and outperforms the improvement method presented by Coltuc as well as other existing approaches.

1. Introduction

Digital watermarking has been extensively applied in fields such as digital libraries, fingerprinting, and secret communication. Conventional watermarking algorithms [1–3] introduce irreversible distortion into digital works, which makes them unsuitable for military and medical domains. In contrast, reversible watermarking, also known as lossless watermarking, can restore the original signal and has been an active research area over the last decade.

Currently, reversible watermarking schemes mainly operate in the spatial domain and fall into three categories: difference expansion-based methods [4–7], histogram shifting-based methods [8–12], and prediction error-based methods [13–17]. The difference expansion-based method was first proposed by Tian [4], who used the difference and average values of neighboring pixels to embed watermark bits. Alattar [5] embedded the watermark information by calculating the difference expansion of an integer transformation. Chen and Tsai [6] presented an adaptive block-sized reversible image watermarking scheme with difference expansion, which has higher capacity than the conventional fixed block-sized method. Gu and Gao [7] used a chaotic logistic map both to randomly select the positions for watermark embedding and to search the threshold space of reversibility; their method achieves a balance between reversibility and robustness with the help of the chaotic system.

In [8], a breakthrough idea of histogram shifting was proposed by Ni et al., in which the watermark bits are embedded by shifting the zero-peak pairs of the image histogram. Ni’s method is nonblind, requiring the encoder to transmit side information to the decoder. To solve this issue, several blind watermarking schemes based on histogram shifting have been presented. Wang et al. [9] presented a multilevel embedding method using histogram shifting without side information, in which a synchronization mechanism ensures the selection of the optimal zero-peak pairs at each level. Coatrieux et al. [10] contributed a dynamic histogram shifting modulation that adaptively accounts for the local specificities of the image content and inserts data in textured areas. Moreover, some reversible watermarking algorithms combining histogram shifting with prediction techniques have been presented to achieve both high embedding capacity and good visual quality [11, 12].

The prediction-error expansion (PE) algorithm was developed by Thodi and Rodríguez [13, 14] and is essentially a particular form of difference expansion. Thodi and Rodríguez employed a pixel’s three-neighbor context to predict the pixel value and used the expansion of the prediction error between the original pixel value and the estimated one to embed the message. The PE algorithm achieves a maximal embedding rate of 1 bit per pixel (bpp). Aiming at reducing the embedding distortion of the PE algorithm, Coltuc [15] proposed an improvement scheme: instead of embedding the entire expanded difference into the current pixel, the expanded difference is split between the current pixel and its prediction context with global optimization, and the scheme achieves improvements with popular predictors. Sachnev et al. [16] first proposed using the rhombus-context predictor (RCP), whose context is composed of the four horizontal/vertical close neighbors, to predict the centered pixel. Later, improvement schemes based on RCP were proposed by Ou et al. [17], Dragoi and Coltuc [18], and Li et al. [19].

Recently, some reversible watermarking schemes in the frequency domain have been presented [20–22]. Lei et al. [20] applied a two-level wavelet transform to each subblock of an image and then performed singular value decomposition (SVD) on the low-frequency wavelet coefficients of each block to generate the singular values; the watermark bits were embedded by quantizing the first singular values using the recursive dither modulation (RDM) approach. In [21], an intelligent reversible watermarking approach for medical images, GA-RevWM, is proposed. GA-RevWM adopts a block-based embedding strategy using the integer wavelet transform (IWT), and an intelligent threshold-selection method based on a genetic algorithm (GA) is applied to increase the imperceptibility of the marked image.

To improve the performance of reversible watermarking, this paper proposes a scheme to lower the embedding distortion of PE. The main idea of the presented method is to enhance the accuracy of the predicted pixel values by means of the extreme learning machine (ELM) [23, 24], which has good generalization ability. Moreover, an optimized version of ELM is utilized to further diminish the prediction error. The improved PE scheme is tested with two popular predictors, namely, the median edge detector (MED) predictor [25] and the gradient-adjusted predictor (GAP) [26]. The experimental results demonstrate that the proposed scheme reduces image distortion compared with the classical PE scheme proposed by Thodi and Rodríguez [13, 14]. In addition, experimental comparison and theoretical analysis between the proposed approach and the noted improvement embedding scheme proposed by Coltuc [15] show that the proposed approach outperforms Coltuc’s.

The remainder of the paper is organized as follows. The basic principle of the PE scheme is presented in Section 2. Section 3 describes the improved PE-based reversible watermarking using the optimized ELM. The improvement schemes with the MED and GAP predictors are given in Section 4. Experimental results and analyses are shown in Section 5. Finally, Section 6 draws the conclusion.

2. Basic Principle of PE

In the PE algorithm [13], the prediction error between the original pixel value and its estimated value is used to embed the watermark. The watermark embedding procedure of the PE algorithm is as follows (a code sketch of the core expansion step is given after Step 5).

Step 1. Scan the image in a fixed order; then, starting with the first pixel of the image, the prediction value x̂ of pixel x is computed from a neighborhood of x by a predictor (e.g., the MED predictor of Section 4.1).

Step 2. With the prediction error e = x − x̂, the prediction-error expansion is defined as e′ = 2e + b, where b ∈ {0, 1} is the watermark bit, and the watermarked pixel is then given by x′ = x̂ + e′ = x̂ + 2e + b. If 0 ≤ x′ ≤ 255, the pixel is considered extensible.

Step 3. Select a threshold T; if the prediction error satisfies |e| ≤ T and the pixel is extensible, mark the pixel with “1”; otherwise, mark it with “0.” The resulting matrix of “0”s and “1”s, which has the same size as the original image, is called the location map (LM). LM is then compressed by arithmetic encoding (AE) or run-length encoding (RLE), generating a bit stream of length L_s.

Step 4. The least significant bits (LSBs) of the first L_s pixels of the image form a sequence denoted S_LSB, which is kept for lossless image restoration; then, using the expansion of Step 2, the watermark W and the sequence S_LSB are embedded into the extensible image pixels, excluding the first L_s ones.

Step 5. Embed the compressed location-map bit stream into the LSBs of the first L_s pixels, and generate the final watermarked image.
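As a hedged illustration of Steps 2 and 3, the following Python sketch embeds a single bit into one 8-bit pixel. The variable names (x, x_hat, bit, T) are ours rather than the paper's notation, and the 0–255 range check stands in for the extensibility test.

```python
def pe_embed_pixel(x, x_hat, bit, T):
    """Prediction-error expansion for one 8-bit pixel.

    Returns (marked_value, embedded_flag); the pixel is skipped when the
    error exceeds the threshold or expansion would overflow/underflow."""
    e = int(x) - int(x_hat)                # prediction error (Step 2)
    if abs(e) > T:                         # threshold test (Step 3)
        return int(x), False
    x_marked = int(x_hat) + 2 * e + bit    # expanded error carries the bit
    if 0 <= x_marked <= 255:               # extensible pixel: stays in range
        return x_marked, True
    return int(x), False                   # not extensible, marked "0" in LM
```

For example, with x = 100, x_hat = 98, bit = 1, and T = 4, the prediction error is 2 and the marked value is 98 + 4 + 1 = 103.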

The watermark extraction procedure of the PE algorithm is as follows (a matching code sketch is given after Step 4).

Step 1. Scan the image in the same order as in the embedding procedure; then extract the LSBs of the first L_s pixels and decompress them by AE or RLE to obtain LM.

Step 2. Starting with the final pixel of LM, if the LM value is “1,” the expanded prediction error is calculated as e′ = x′ − x̂, and the watermark W and the sequence S_LSB are extracted from the embedded bits b = e′ mod 2.

Step 3. The LSBs of the first L_s pixels of the image are replaced with S_LSB.

Step 4. Compute the original prediction error e = ⌊e′/2⌋, and generate the restored pixel value x = x̂ + e.
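A matching sketch of the extraction side (Steps 2 and 4), under the same assumed notation: the embedded bit is the LSB of the expanded error, and the original pixel is restored from its floor-halved value.

```python
def pe_extract_pixel(x_marked, x_hat):
    """Recover the embedded bit and the original pixel from a marked pixel."""
    e_exp = int(x_marked) - int(x_hat)     # expanded error, equals 2*e + b
    bit = e_exp & 1                        # embedded bit = LSB of the expanded error
    e = e_exp >> 1                         # original prediction error (floor of e_exp / 2)
    return bit, int(x_hat) + e             # extracted bit and restored pixel value
```

Round-tripping pe_embed_pixel and pe_extract_pixel with the same prediction returns the original pixel and the embedded bit exactly.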

3. Proposed Scheme

In this section, the basic principle of ELM is first introduced. Then the optimized ELM, which offers better prediction performance than the basic ELM, is described. Finally, a PE-based improvement scheme using the optimized ELM is presented.

3.1. Extreme Learning Machine

The extreme learning machine (ELM) was proposed by Huang et al. [23, 24] on the basis of generalized inverse theory, by which the output weights of the learning network are obtained in a single calculation step. Compared with neural networks (NN) [27, 28] and support vector machines (SVM) [29–31], ELM greatly improves the generalization ability and learning speed of the network [23]. The ELM training model uses the structure of a single-hidden-layer feed-forward neural network, shown in Figure 1, where n, L, and m are the numbers of nodes in the input, hidden, and output layers, respectively, g(·) is the activation function, and b_i is the threshold of the ith hidden node. (x_j, t_j) is a training sample, where x_j ∈ R^n and t_j ∈ R^m, j = 1, …, N.

The mathematical expression of the ELM network model is given by
o_j = ∑_{i=1}^{L} β_i g(w_i · x_j + b_i), j = 1, …, N,
where w_i denotes the input weight vector connecting the input layer nodes to the ith hidden node, β_i is the output weight vector connecting the ith hidden node to the output layer nodes, and o_j represents the network output vector.

The cost function of ELM is defined by
E = ∑_{j=1}^{N} ‖o_j − t_j‖.
Huang et al. [23] indicate that seeking the optimal w_i, b_i, and β_i is the training objective of ELM; with the optimal parameters, the smallest error between the network outputs and the corresponding real values is achieved. The cost function can be written further as
ε = ‖Hβ − T‖,
where H, β, and T are the output matrix of the hidden layer, the output weight matrix, and the target matrix of the training sample set, respectively.

When the activation function of the hidden layer is infinitely differentiable, the input weights and the hidden node thresholds can be assigned randomly, and the matrix H is then a constant matrix. The learning process of ELM is equivalent to computing the minimum-norm least-squares solution of the linear system Hβ = T, which is given by
β = H†T,
where H† is the Moore–Penrose generalized inverse of H. Once β is obtained, the network training of ELM ends [23].
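The single-step training described above can be sketched in a few lines of Python/NumPy. The sigmoid activation and the uniform random initialization are illustrative choices, and np.linalg.pinv supplies the Moore–Penrose pseudoinverse.

```python
import numpy as np

def train_elm(X, t, n_hidden, seed=0):
    """Basic ELM regression: random input weights/thresholds, analytic output weights."""
    rng = np.random.default_rng(seed)
    n_samples, n_inputs = X.shape
    W = rng.uniform(-1.0, 1.0, size=(n_inputs, n_hidden))  # random input weights
    b = rng.uniform(-1.0, 1.0, size=n_hidden)              # random hidden-node thresholds
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))                 # hidden-layer output matrix
    beta = np.linalg.pinv(H) @ t                           # beta = H^+ T (minimum-norm LSQ)
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Network output for inputs X."""
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta
```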

3.2. Optimized ELM

Because the input weights and hidden node thresholds are selected randomly, the generalization ability and stability of the ELM regression model are often not ideal. To address this issue, the optimized ELM (OELM) is developed through mutative-scale chaotic search and adjustment [32], by which the optimal input weights and hidden node thresholds are obtained.

In OELM, the fitting error ε = ‖Hβ − T‖ is regarded as a function of the input weights and hidden node thresholds, since β is determined by the pseudoinverse solution of Section 3.1 once they are fixed. For convenience of description, this fitting error is simply written as Q(θ), where θ = (w, b) collects the input weights and hidden node thresholds of OELM. The optimization objective of OELM is to seek the input weights and hidden node thresholds that minimize Q(θ). However, it is not appropriate to use the fitting error as the only measure of training quality: a trained network with a relatively smaller fitting error does not always have a smaller error on the test dataset [33]. Bartlett [34] pointed out that, when two networks achieve comparable fitting, the one with the smaller output weights has better generalization performance. Therefore, we add the norm of the output weights, ‖β‖, as an auxiliary criterion of network selection: when the fitting errors of two adjacent trainings are very close during the ELM optimization, the network with the smaller ‖β‖ is selected.

The concrete steps of the OELM scheme are described as follows (a simplified code sketch follows Step 8).

Step 1. Initialize the variables: the iteration counter k, the search intervals of the input weights and hidden node thresholds, and the initial values of the chaotic variables of a chaotic system. The optimal objective function value Q* and the optimal output weight norm ‖β‖* are initialized to large positive values.

Step 2. The chaotic variables are mapped into the current definition domains of the input weights and hidden node thresholds, yielding a candidate parameter vector θ = (w, b); that is, each component of θ is obtained by linearly scaling the corresponding chaotic variable into its search interval.

Step 3. For the given training set, β is solved by the pseudoinverse formula of Section 3.1, and the fitting error Q(θ) is evaluated from the cost function. The current optima Q* and ‖β‖* are then updated: if Q(θ) is smaller than Q* by more than a predefined threshold ξ (a small positive number), or if Q(θ) and Q* differ by no more than ξ and ‖β‖ < ‖β‖*, then θ becomes the new optimal solution θ*.

Step 4. Iterate the chaotic system once to generate the next chaotic variables, and let k = k + 1.

Step 5. Repeat Steps 2–4 until Q* remains unchanged for a prescribed number of loops, and then go to Step 6.

Step 6. The search regions of the input weights and hidden node thresholds are reduced; that is, each search interval is shrunk and recentered on the corresponding component of the current optimal solution θ*.

Step 7. The chaotic variables are reassigned as a weighted combination of the current optimal solution (mapped back into the chaotic domain) and the current chaotic variables, where the weight of the chaotic part is a small positive number, for example, 0.1.

Step 8. If k reaches the appointed maximum number of iterations or Q* falls below an acceptable value, end the optimization process and generate the network with the optimal parameters; otherwise, go to Step 2 and continue.
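The following Python sketch condenses Steps 1–8 into a simplified loop. The logistic map, the candidate count, the shrink factor, and the tolerance xi are illustrative assumptions rather than the paper's exact settings; the tie-breaking rule follows the ‖β‖ criterion described above.

```python
import numpy as np

def train_oelm(X, t, n_hidden, rounds=4, candidates=200, shrink=0.5, xi=1e-3):
    """Simplified OELM sketch: a logistic-map chaotic sequence searches for input
    weights and thresholds; near-equal fitting errors are broken by smaller ||beta||."""
    n_samples, n_inputs = X.shape
    n_params = (n_inputs + 1) * n_hidden
    lo = -np.ones(n_params)                      # current search region for theta = (W, b)
    hi = np.ones(n_params)
    gamma = np.random.default_rng(1).uniform(0.05, 0.95, n_params)  # chaotic state in (0, 1)
    best = None                                  # (error, |beta|, W, b, beta, theta)
    for _ in range(rounds):
        for _ in range(candidates):
            gamma = 4.0 * gamma * (1.0 - gamma)  # logistic-map iteration (Step 4)
            theta = lo + gamma * (hi - lo)       # map chaos into the search region (Step 2)
            W = theta[:n_inputs * n_hidden].reshape(n_inputs, n_hidden)
            b = theta[n_inputs * n_hidden:]
            H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
            beta = np.linalg.pinv(H) @ t         # output weights by pseudoinverse (Step 3)
            err = np.linalg.norm(H @ beta - t)
            bnorm = np.linalg.norm(beta)
            better = best is None or err < best[0] - xi or (
                abs(err - best[0]) <= xi and bnorm < best[1])
            if better:                           # keep the best (error, then ||beta||) so far
                best = (err, bnorm, W, b, beta, theta)
        half = shrink * (hi - lo) / 2.0          # shrink the region around the optimum (Step 6)
        lo, hi = best[5] - half, best[5] + half
    return best[2], best[3], best[4]             # optimal W, b, beta
```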

3.3. Improvement Method Using OELM

For a PE-based reversible watermarking algorithm, the embedding distortion can be reduced by decreasing the prediction error. In our improvement method, OELM is adopted to generate a more precise prediction value for each image pixel, so that the prediction error is kept within a small range. The detailed procedure of the improvement scheme is as follows (a sketch of Steps 1–3 is given after Step 5).

Step 1. Following the scanning order from top to bottom and left to right, the preassigned neighbor pixels of all image pixels are collected as the input part of the OELM training set, and the corresponding image pixels are taken as the output part of the training set.

Step 2. The training data are normalized, that is, transformed into a fixed normalization domain. After the training and learning process of OELM ends, the final OELM regression model is generated.

Step 3. The input part of the training set is fed into the OELM model, and the corresponding prediction value of each image pixel is obtained from the model output. The final prediction value is generated by transforming the output back into the pixel-value domain and rounding.

Step 4. The rest of the improvement scheme follows Steps 2–5 of the watermark embedding procedure of the PE algorithm in Section 2.

Step 5. At the detection side, watermark extraction and image restoration are achieved by Steps 1–4 of the watermark extraction procedure of the PE algorithm in Section 2; note that, in Step 2, the prediction value of each pixel is generated by the OELM model, which is obtained through a secure channel.
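Putting Steps 1–3 together, the following is a hedged sketch of the pixel-prediction pipeline. Here context_fn and train_fn are placeholders for a context extractor (e.g., the MED context of Section 4.1) and a trainer (e.g., train_oelm above); normalization to [0, 1] is an assumption, not the paper's stated range.

```python
import numpy as np

def predict_pixels(img, context_fn, train_fn, n_hidden=20):
    """Collect (context -> pixel) pairs, normalize, train, and return rounded predictions."""
    contexts, targets, coords = [], [], []
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            ctx = context_fn(img, i, j)            # neighbour values, or None at borders
            if ctx is None:
                continue
            contexts.append(ctx)
            targets.append(img[i, j])
            coords.append((i, j))
    X = np.asarray(contexts, dtype=float) / 255.0  # Step 2: normalize inputs
    t = np.asarray(targets, dtype=float) / 255.0   # and outputs
    W, b, beta = train_fn(X, t, n_hidden)          # Step 2: train the (O)ELM model
    h = 1.0 / (1.0 + np.exp(-(X @ W + b)))         # Step 3: model output for every pixel
    pred = np.clip(np.rint(h @ beta * 255.0), 0, 255).astype(int)
    return dict(zip(coords, pred))                 # (i, j) -> predicted pixel value
```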

4. Improvement Schemes with MED and GAP Predictors

4.1. Improvement Scheme with MED Predictor

Any predictor can be applied to the PE algorithm of Section 2. As a high-performance predictor, MED has been utilized in several reversible watermarking schemes [13–15] and also in the JPEG-LS standard [25]. With MED, the prediction value x̂ of pixel x is computed from the template shown in Figure 2 as
x̂ = min(a, b), if c ≥ max(a, b),
x̂ = max(a, b), if c ≤ min(a, b),
x̂ = a + b − c, otherwise,
where a, b, and c are the left, above, and upper-left neighbors of pixel x. In effect, the MED predictor is a combination of two interpretations, edge detection and planar interpolation [15, 25].
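A direct Python transcription of the MED rule above, assuming a is the left, b the above, and c the upper-left neighbor:

```python
def med_predict(a, b, c):
    """MED prediction from the left (a), above (b), and upper-left (c) neighbours."""
    if c >= max(a, b):          # edge detected: pick the smaller neighbour
        return min(a, b)
    if c <= min(a, b):          # opposite edge: pick the larger neighbour
        return max(a, b)
    return a + b - c            # smooth region: planar interpolation
```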

The proposed improvement scheme with OELM using the MED predictor (denoted OELM-MED) basically follows Section 3.3; note that the neighbor pixels a, b, and c of all image pixels are collected as the input part of the OELM training set in Step 1 of Section 3.3.

4.2. Improvement Scheme with GAP Predictor

As a major part of the context-based, adaptive, lossless image coding (CALIC) algorithm [26], the GAP predictor is more complex than MED, and its prediction context is enlarged to 7 pixels (Figure 3), denoted W, N, NE, NW, WW, NN, and NNE. The prediction is based on the vertical and horizontal gradients given by
d_v = |W − NW| + |N − NN| + |NE − NNE|,
d_h = |W − WW| + |N − NW| + |N − NE|.

Let D = d_v − d_h. Then the prediction value x̂ of pixel x is computed by
x̂ = W, if D > 80,
x̂ = N, if D < −80,
x̂ = (x̃ + W)/2, if 32 < D ≤ 80,
x̂ = (3x̃ + W)/4, if 8 < D ≤ 32,
x̂ = x̃, if −8 ≤ D ≤ 8,
x̂ = (3x̃ + N)/4, if −32 ≤ D < −8,
x̂ = (x̃ + N)/2, if −80 ≤ D < −32,
where x̃ is defined by x̃ = (W + N)/2 + (NE − NW)/4.

It is noted that the GAP prediction value x̂ should be rounded to an integer so that it can be applied to the PE-based reversible watermarking algorithm. Since GAP has better estimation performance than MED, reversible watermarking methods with GAP outperform those using MED [35, 36].
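For reference, a hedged Python transcription of the GAP rule above, with the final rounding applied as noted; the 80/32/8 gradient thresholds are the standard CALIC values.

```python
def gap_predict(w, n, ne, nw, ww, nn, nne):
    """GAP prediction from the 7-pixel context of Figure 3, rounded to an integer."""
    dv = abs(w - nw) + abs(n - nn) + abs(ne - nne)   # vertical gradient
    dh = abs(w - ww) + abs(n - nw) + abs(n - ne)     # horizontal gradient
    d = dv - dh
    if d > 80:                                       # sharp horizontal edge
        p = float(w)
    elif d < -80:                                    # sharp vertical edge
        p = float(n)
    else:
        p = (w + n) / 2.0 + (ne - nw) / 4.0          # base estimate
        if d > 32:
            p = (p + w) / 2.0
        elif d > 8:
            p = (3.0 * p + w) / 4.0
        elif d < -32:
            p = (p + n) / 2.0
        elif d < -8:
            p = (3.0 * p + n) / 4.0
    return int(round(p))
```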

The proposed improvement scheme with OELM using the GAP predictor (denoted OELM-GAP) basically follows Section 3.3; note that the seven preassigned context pixels of all image pixels are collected as the input part of the OELM training set in Step 1 of Section 3.3.

4.3. An Example for Prediction Error Improvement

As an example, a randomly selected 8 × 8 block of the Lena image is used to illustrate the prediction error improvement produced by OELM. Figure 4(a) shows the original block data, and Figures 4(b) and 4(c) give the corresponding versions predicted by MED and OELM-MED, respectively. It is observed that OELM-MED yields lower prediction errors than MED. For the whole Lena image, the average absolute prediction error produced by MED is 3.105 per pixel, while that of OELM-MED is 2.908 per pixel, a prediction improvement rate of 6.34%. For GAP, the average absolute prediction error on Lena is 2.957 per pixel, that of OELM-GAP is 2.619 per pixel, and the improvement rate is 11.43%. The higher improvement rate of OELM-GAP compared with OELM-MED is due to the fact that the prediction context of GAP is larger than that of MED, which is conducive to the learning and prediction of OELM.

5. Experimental Results and Analyses

In our experiments, the embedding capacity is measured by the pure hiding rate, that is, the ratio between the number of embedded watermark bits (excluding the overhead) and the number of image pixels. The location map for the overflow/underflow pixels is compressed by arithmetic encoding (AE).

5.1. Test Images

Both our scheme and Coltuc’s are improvement methods based on the PE algorithm proposed by Thodi and Rodríguez. To allow a convenient comparison with Coltuc’s improvement scheme, the experiments use the same two image sets as in [15]. Three classical gray-level images with distinct statistical features, Lena, Jet, and Mandrill (Figure 5), form the first set: Mandrill contains large textured areas, Jet has a big consistent region, and Lena combines consistent regions with texture.

The second set, the Kodak test set, consists of 24 true-color images of size 512 × 768. These images have better visual quality than the classical ones usually used in reversible watermarking algorithms. The portable network graphics (PNG) versions of the Kodak test set can be downloaded at http://www.r0k.us/graphics/kodak/. The gray-level versions of the color images are obtained by a weighted luminance combination of the red, green, and blue components R(i, j), G(i, j), and B(i, j) of the pixel located at coordinates (i, j). Figure 6 shows the gray-level versions of the Kodak test set.
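A small conversion helper is sketched below, assuming the standard ITU-R BT.601 luminance weights; the paper's exact coefficients are not reproduced here.

```python
import numpy as np

def to_gray(rgb):
    """Convert an H x W x 3 uint8 RGB image to its gray-level version."""
    r, g, b = (rgb[..., k].astype(float) for k in range(3))
    return np.clip(np.rint(0.299 * r + 0.587 * g + 0.114 * b), 0, 255).astype(np.uint8)
```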

5.2. Experiments for the Embedding Threshold and Reversibility

The embedding threshold is an important factor influencing the embedding capacity and the visual quality of the marked image. The larger the embedding threshold is, the more bits are embedded and the smaller the PSNR is, as can easily be observed in Figures 7 and 8. Moreover, as the embedding threshold increases, the growth of the pure hiding rate slows down; when the threshold reaches a certain value, the pure hiding rate no longer rises. The maximum pure hiding rates differ between images: those of Lena, Jet, and Mandrill are 0.953, 0.954, and 0.935, respectively, with corresponding thresholds of 70, 80, and 93. If a larger payload is required, it can be embedded through multiple runs of the watermark embedding process.

Figure 9 illustrates the reversibility of the proposed scheme. The original and watermarked images are listed in columns (a) and (b) of Figure 9, respectively. Column (c) shows the difference between the original and marked images, which is imperceptible to the human eye; the difference is therefore enhanced with the MATLAB function imadjust to make it visible. The original image content, shown in column (d), is restored after the embedded bits are extracted at the receiving side. Column (e) shows the differences between the original and restored images, which are entirely black regions, demonstrating that the restored images are identical to the original ones. We also compute the SSIM (structural similarity index measure) between the original and restored images and obtain SSIM = 1, confirming that the proposed scheme is fully reversible.

5.3. Comparison with Congeneric Algorithms

First, experiments are carried out on the three classical images. The average absolute prediction errors with the original MED used by Thodi and Rodríguez are 3.105 per pixel for Lena, 3.647 for Jet, and 12.251 for Mandrill; the corresponding values with the proposed improvement (OELM-MED) are 2.908, 3.360, and 11.416 per pixel, respectively. Likewise, the average absolute prediction errors with the original GAP are 2.957 per pixel for Lena, 3.497 for Jet, and 11.661 for Mandrill, while those of OELM-GAP are 2.619, 3.247, and 11.142 per pixel, respectively. Figures 10 and 11 show the comparison of prediction errors, from which it is observed that the average absolute prediction errors of the proposed scheme are lower than those of the original predictors and that the prediction performance of GAP is better than that of MED.

In terms of PSNR with respect to the pure hiding rate, the comparisons between the original MED and OELM-MED are presented in Figure 12, from which it is observed that the proposed scheme achieves a performance gain over the scheme of Thodi and Rodríguez. The average improvements, that is, the PSNR gains averaged over all pure hiding rates, are 0.47 dB for Lena, 0.41 dB for Jet, and 0.61 dB for Mandrill. For OELM-GAP, the average improvements are 0.45 dB for Lena, 0.28 dB for Jet, and 0.30 dB for Mandrill. The comparisons between the original GAP and OELM-GAP are shown in Figure 13.

In addition, Figures 12 and 13 show that the improvements at small hiding rates are smaller than those at large hiding rates. This is because a small hiding rate corresponds to a small embedding threshold, which already introduces small prediction errors and small image distortion, so the achievable improvement is relatively small compared with that at a large hiding rate. We also find that the improvement on Jet is smaller than those on Lena and Mandrill: Jet has large consistent regions, which already give rise to small prediction errors in the original PE scheme with MED/GAP. In other words, the proposed scheme achieves a better improvement for images with large textured areas.

Next, comparison experiments between Coltuc’s improvement scheme and the proposed one are carried out. Coltuc’s scheme is designed to decrease the embedding distortion; its basic idea is to share the expanded difference between the current pixel and its prediction context, so that the marked images have better visual quality than when the expanded difference is simply embedded into the current pixel. Meanwhile, the modification of the current pixel’s context may enlarge the distortion induced by the prediction of its successors; to address this, Coltuc’s scheme also optimizes the global embedding error by varying a parameter and performing a global search. Table 1 shows that, for the three classical images, both Coltuc’s scheme and the proposed one with the MED or GAP predictor improve on Thodi’s scheme, and the proposed scheme overall outperforms Coltuc’s. Table 2 lists the experimental results of Coltuc’s and the proposed schemes on the Kodak test images. With MED, Coltuc’s scheme achieves an average improvement of 0.54 dB over the 24 test images and a best average improvement of 1.10 dB for image 5; the proposed scheme achieves an average improvement of 0.67 dB over the 24 test images and a maximum of 1.15 dB for image 1. With GAP, the corresponding figures for Coltuc’s scheme are 0.09 dB and 0.43 dB, and those of the proposed scheme are 0.21 dB and 0.68 dB. Therefore, for these high-quality images, the proposed improvement scheme also outperforms Coltuc’s.

Moreover, for Coltuc’s scheme, while the embedding distortion is improved, the modification of the current pixel’s context may lead to an incorrect location map of overflow/underflow pixels, which affects the stability of the algorithm. In contrast, the proposed method decreases the embedding distortion only by reducing the prediction error of the current pixel using the optimized ELM, without modifying the context; thus the generated location map is stable and exact. In terms of computational complexity, Coltuc’s scheme is similar to the proposed one, as both make use of a global search; in real-world applications, parallel processing can be employed to lower the running time. Also, further analysis of Figures 12 and 13 and Tables 1 and 2 shows that the improvements achieved with MED by the two schemes are larger than those with GAP, which again indicates that GAP has better estimation performance than MED.

5.4. Expanded Experiments with Rhombus-Context

Actually, any context can be combined with OELM in PE-based schemes. Here, expanded experiments are provided using the well-known rhombus context composed of the four horizontal/vertical close neighbors. The rhombus-context predictor (RCP) proposed by Sachnev et al. predicts the centered pixel by the average value of its four horizontal/vertical close neighbors [16]. The prediction performance of RCP is superior to those of MED and GAP, which is reflected in the prediction errors: the average absolute prediction errors achieved by RCP are 2.303 per pixel for Lena, 2.938 per pixel for Jet, and 11.640 per pixel for Mandrill. We then use OELM with the rhombus context (denoted OELM-RCP) instead of RCP to predict the centered pixel; the average absolute prediction errors achieved by OELM-RCP are 1.939 per pixel for Lena, 2.460 per pixel for Jet, and 10.231 per pixel for Mandrill.
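For completeness, the rhombus-context prediction reduces to an average of the four neighbours; the rounding convention below is an assumption.

```python
def rcp_predict(north, south, west, east):
    """Rhombus-context prediction: integer average of the four close neighbours."""
    return (north + south + west + east) // 4
```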

Figure 14 intuitively shows the average absolute prediction errors achieved by RCP and OELM-RCP; OELM-RCP yields lower prediction errors than RCP. Moreover, Figures 10, 11, and 14 together show that the prediction performance of OELM-RCP is better than those of OELM-MED and OELM-GAP.

For PSNR with respect to the pure hiding rate, Figure 15 compares the original embedding scheme with RCP proposed by Sachnev et al. and the same embedding scheme with OELM-RCP; under the same pure hiding rate, the PSNR achieved by OELM-RCP is higher than that achieved by RCP. In addition, the improvement schemes proposed by Ou et al. [17], Dragoi and Coltuc [18], and Li et al. [19] are all based on the same RCP; therefore, similar performance gains can be expected when OELM-RCP is applied to these schemes instead of RCP.

5.5. Comparison with Other Non-PE Based Schemes

Figure 16 compares the proposed scheme with other non-PE based schemes [21, 22, 37] in terms of the pure hiding rate and PSNR for the Lena image. The proposed method performs better than these non-PE based schemes; in particular, when the pure hiding rate is larger than 0.3, the proposed method clearly outperforms the other schemes.

6. Conclusion

Reversible watermarking has become an active research area owing to its ability to recover the watermarked digital work to its original state, and its main focus is the reduction of embedding distortion while keeping a certain embedding capacity. The noted improvement embedding approach developed by Coltuc splits the expanded difference between the current pixel and its prediction context and obtains lower distortion than simply embedding the expanded difference into the current pixel, but the change of the prediction context may cause an inexact location map. In this paper, the proposed improvement scheme generates the prediction value of the current pixel using an optimized ELM while making no changes to its context, and it obtains better distortion reduction than Coltuc’s approach, which benefits from the generalization ability of the optimized ELM.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported in part by the Humanity and Social Science Youth foundation of Ministry of Education, China (Grant no. 13YJC870007), the National Natural Science Foundation of China (Grant nos. 61362032 and 61462048), and the Natural Science Foundation of Jiangxi Province, China (Grant nos. 20151BAB207003 and 20151BAB217015). The authors would like to thank Dr. Liya Xu for helping them with English writing.