Abstract

Recent reversible data hiding (RDH) work tends to realize adaptive embedding by modifying pixels selectively according to image content. However, further optimization and computational complexity remain major challenges. This paper proposes a new RDH scheme that better combines pixel value ordering (PVO) prediction with pairwise prediction-error expansion (PEE). The largest and smallest three pixels of each block are used to generate error pairs. To optimize the distribution of error pairs, two-layer embedding is introduced so that the full-enclosed pixels of each block can be used to decide how the spatial location of pixels within the block is defined. Then, to modify error pairs with less distortion, a shifted error in a pair serves as the context for recalculating the other error of the pair. Since this recalculation is equivalent to selecting expansion bins, various extensions of the original pairwise PEE are designed, parameterized, and combined into the so-called multiple pairwise PEE, with which the 2D histogram can be divided into a set of subhistograms for more accurate modification. Experimental results verify the superiority of the proposed scheme over several PVO-based schemes. On the Kodak image database, the average PSNR gains over the original PVO-based pairwise PEE are 0.83 and 0.99 dB for capacities of 10,000 and 20,000 bits, respectively.

1. Introduction

Reversible data hiding (RDH) is a kind of data hiding technology in which a secret message is embedded into the cover media for purposes such as secret communication and integrity authentication, and the cover media can be exactly recovered after the hidden data is extracted. Compared with traditional data hiding technologies such as digital watermarking, whose main concern is robustness against various attacks, RDH is categorized as a fragile watermarking technology, and its specific property is the exact restoration of the cover media. RDH for uncompressed gray-scale images [1] is currently the most investigated RDH subject. Furthermore, RDH in the encrypted domain [2] has also gained increasing attention.

In evaluating an RDH scheme, two aspects, fidelity and capacity, are usually considered. To achieve a desirable trade-off between them, many technologies, such as lossless compression [3, 4], histogram shifting [5–7], difference expansion [8], prediction-error expansion [9–12], and integer-to-integer transform [13–16], have been devised. Among them, prediction-error expansion (PEE) has attracted much attention since it better exploits the spatial redundancy in an image. In PEE-based RDH, the target pixel is first predicted to generate a prediction error. Then, the obtained errors are expanded or shifted to fulfill data embedding. So far, many advanced prediction methods have been devised to optimize the distribution of errors, i.e., the prediction-error histogram (PEH) [17–22]. For prediction-error modification or PEH manipulation, advanced modification methods such as [23–29] have also been proposed to embed the payload while modifying prediction errors as slightly as possible.

Adaptive embedding and high-dimensional PEE are two remarkable achievements of recent years. For example, pixels are adaptively selected for data embedding in [17]. In [29], the obtained errors are classified and embedded with different amounts of data. In [26], the expansion bins are adaptively selected so as to achieve optimal embedding performance. In [23], a histogram sequence is first derived by decomposing the PEH; then, for each subhistogram, the expansion bins are adaptively selected. High-dimensional PEE additionally exploits the correlation of errors by modifying them jointly [25, 27]. For example, pairwise PEE [27] outperforms conventional PEE by introducing more flexible modification of error pairs; i.e., a pair of expandable errors can only be embedded with one of the bit combinations “00”, “01”, and “10”. Pairwise PEE was first combined with rhombus prediction [27] and afterwards with pixel value ordering (PVO) prediction [30]. To enhance pairwise PEE by combining expandable errors into a pair, the strategy of adaptive pairing was later proposed [28]. Based on the observation that adaptive pairing actually aims to optimize the distribution of error pairs, similar optimization is achieved in [31, 32], where pairwise PEE is better incorporated with PVO prediction. Existing pairwise PEE-based schemes have demonstrated the advantage of pairwise PEE in distortion reduction. On the other hand, PVO prediction has been verified to have high accuracy. Therefore, PVO-based pairwise PEE is of great significance to high-fidelity RDH.

In this paper, a new RDH scheme is proposed by extending PVO-based pairwise PEE to adaptive embedding in two respects: adaptive error-pair generation and adaptive error-pair modification. The cover image is first divided into shadow and blank blocks to enable two-layer embedding. In this way, the local complexity of each block can be computed more precisely using full-enclosed pixels, which yields a better selection of pixel blocks for data embedding. For error-pair generation, the largest and smallest three pixels of each block are used by improved PVO (IPVO) [33] to generate error pairs. For two correlated pixels, the principle of IPVO can be interpreted as predicting the one having the smaller location with the other one. Since errors valued 0 or 1 are expandable in IPVO, it is necessary to ensure that the to-be-predicted pixel with the smaller location has the larger gray value. Based on this analysis, full-enclosed pixels are additionally used to estimate the distribution of pixels within a block and to define the spatial location accordingly. As for error-pair modification, a key observation is that most secret data is embedded into error pairs containing one or more shiftable errors. For these error pairs, we propose to use the shifted error to adaptively recalculate the other error of the pair. Such recalculation is equivalent to expansion bins selection. In this way, multiple pairwise PEE is designed and parameterized by the selection of expansion bins. Finally, an efficient mechanism for determining the expansion bins with minimized distortion is also proposed.

The rest of the paper is organized as follows. Section 2 briefly introduces some related work including PVO, IPVO, and PVO-based pairwise PEE. The proposed scheme is presented in detail in Section 3. The experimental results are given and discussed in Section 4. This paper is concluded in Section 5.

2. Related Work

2.1. PVO-Based PEE

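PVO-based PEE [34] exploits image redundancy in a blockwise manner. After dividing the cover image into nonoverlapped blocks consisting of $n$ pixels, the pixels of each block are collected to generate a pixel sequence $(x_1, \ldots, x_n)$. Next, they are sorted to obtain $(x_{\sigma(1)}, \ldots, x_{\sigma(n)})$, where $\sigma: \{1, \ldots, n\} \to \{1, \ldots, n\}$ is the unique one-to-one mapping such that $x_{\sigma(1)} \le \cdots \le x_{\sigma(n)}$, with $\sigma(i) < \sigma(j)$ if $x_{\sigma(i)} = x_{\sigma(j)}$ and $i < j$. In this way, the prediction error is computed as

$$e_{\max} = x_{\sigma(n)} - x_{\sigma(n-1)}. \tag{1}$$

To fulfill data embedding, $e_{\max}$ is modified as

$$e'_{\max} = \begin{cases} e_{\max}, & \text{if } e_{\max} = 0, \\ e_{\max} + b, & \text{if } e_{\max} = 1, \\ e_{\max} + 1, & \text{if } e_{\max} > 1, \end{cases} \tag{2}$$

where $b \in \{0, 1\}$ is the data bit to be embedded. Accordingly, $x_{\sigma(n)}$ is modified to

$$x'_{\sigma(n)} = x_{\sigma(n-1)} + e'_{\max}. \tag{3}$$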
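The max-side PVO embedding described in this subsection can be sketched in a few lines of Python. This is a minimal illustration, assuming the standard rule in which an error of 1 carries one bit, larger errors are shifted by 1, and an error of 0 is left untouched; the function names `pvo_embed_max` and `pvo_extract_max` are hypothetical and do not come from the cited work.

```python
def pvo_embed_max(block, bit):
    """Embed at most one bit into the largest pixel of a block (max side only).

    Returns (marked_block, embedded); embedded is True if `bit` was consumed.
    """
    # Stable sort of indices: equal values keep their original order, which
    # mirrors the tie-breaking rule sigma(i) < sigma(j) for i < j.
    order = sorted(range(len(block)), key=lambda i: block[i])
    largest, second = order[-1], order[-2]
    e = block[largest] - block[second]       # prediction error, always >= 0
    marked = list(block)
    if e == 1:                               # expandable error: carries the bit
        marked[largest] += bit
        return marked, True
    if e > 1:                                # shiftable error: moved by 1, no data
        marked[largest] += 1
    return marked, False                     # e == 0 is left unchanged


def pvo_extract_max(marked):
    """Invert pvo_embed_max. Returns (original_block, bit or None)."""
    order = sorted(range(len(marked)), key=lambda i: marked[i])
    largest, second = order[-1], order[-2]
    e = marked[largest] - marked[second]
    block = list(marked)
    if e in (1, 2):                          # original error was 1
        block[largest] -= e - 1
        return block, e - 1
    if e > 2:                                # original error was shifted
        block[largest] -= 1
    return block, None
```

Because only the largest pixel is ever increased, the sorted order of the block is the same before and after embedding, which is what makes the extraction step possible without side information.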

2.2. IPVO-Based PEE

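In PVO-based PEE, smooth blocks are preferentially selected and a large number of errors valued 0 are thus generated. However, these errors are ignored according to (2), which implies insufficient utilization of smooth blocks. To solve this problem, pixel location is introduced into IPVO-based PEE [33] and the prediction error is redefined as

$$d_{\max} = x_u - x_v, \tag{4}$$

where $u = \min(\sigma(n), \sigma(n-1))$ and $v = \max(\sigma(n), \sigma(n-1))$. If $\sigma(n) < \sigma(n-1)$, there is $d_{\max} > 0$; otherwise, there is $d_{\max} \le 0$. Obviously, the relative order of $x_u$ and $x_v$ remains unchanged after enlarging $x_{\sigma(n)}$. In this way, the marked prediction error is computed as

$$d'_{\max} = \begin{cases} d_{\max} + b, & \text{if } d_{\max} = 1, \\ d_{\max} + 1, & \text{if } d_{\max} > 1, \\ d_{\max} - b, & \text{if } d_{\max} = 0, \\ d_{\max} - 1, & \text{if } d_{\max} < 0, \end{cases} \tag{5}$$

and accordingly $x_{\sigma(n)}$ is modified to

$$x'_{\sigma(n)} = x_{\sigma(n-1)} + |d'_{\max}|. \tag{6}$$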
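To make the role of pixel location concrete, the following Python sketch implements a max-side IPVO embedding and its inverse. It assumes the commonly used IPVO rule in which the signed error is taken between the two largest pixels ordered by their position, errors valued 0 or 1 carry one bit, and all other errors are shifted; the helper names are hypothetical.

```python
def _top_two(block):
    """Return (index of largest, smaller index u, larger index v) of the top two pixels."""
    order = sorted(range(len(block)), key=lambda i: block[i])  # stable: ties keep position order
    s_n, s_n1 = order[-1], order[-2]
    return s_n, min(s_n, s_n1), max(s_n, s_n1)


def ipvo_embed_max(block, bit):
    """Embed at most one bit; errors 0 and 1 are both expandable."""
    s_n, u, v = _top_two(block)
    d = block[u] - block[v]          # signed error: the pixel at the smaller location is predicted
    marked = list(block)
    if d in (0, 1):                  # expandable: the largest pixel grows by the bit
        marked[s_n] += bit
        return marked, True
    marked[s_n] += 1                 # d > 1 or d < 0: shifted, no data
    return marked, False


def ipvo_extract_max(marked):
    """Invert ipvo_embed_max. Returns (original_block, bit or None)."""
    s_n, u, v = _top_two(marked)
    d = marked[u] - marked[v]
    block = list(marked)
    if d in (1, 2):                  # original error was 1
        bit = d - 1
    elif d in (0, -1):               # original error was 0
        bit = -d
    else:                            # original error was shifted
        block[s_n] -= 1
        return block, None
    block[s_n] -= bit
    return block, bit
```

Compared with plain PVO, a block whose two largest pixels are equal now carries a bit instead of being skipped, which is the source of the extra capacity in smooth regions.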

2.3. PVO-Based Pairwise PEE

Despite the superiority over PVO-based PEE, the performance of IPVO-based PEE is still limited by considering only the largest two pixels. By involving the 3rd largest pixel, is reinterpreted as in PVO-based pairwise PEE [30] where and are computed as

Obviously, expandable errors in IPVO-based PEE would turn to error pairs . In this way, the modification rule of IPVO-based PEE is described as a 2D mapping in Figure 1(a), for which only one pairing error is modifiable.

To better utilize the largest three pixels, another 2D mapping is proposed as shown in Figure 1(b). Correspondingly, and will be modified after embedding. For this new mapping, six types of error pairs or error-pair transforms are defined as shown in Figure 2. Among them, Type-D error pair is the most valuable one since it introduces the least distortion per embedding one bit, followed by Type-A, Type-B, Type-C, Type-E, and Type-F error pairs, respectively. Based on such consideration, it is quite necessary to optimize the distribution of various types of error pairs. For example, Type-D error pair consists of two expandable errors. As we can learn from IPVO, more expandable errors can be generated with pixel location involved.

3. Proposed Scheme

The proposed method is presented in this section. Firstly, the framework of two-layer embedding is introduced and a new calculation of local complexity is presented. Then, error-pair generation and adaptive definition of spatial location are introduced. Finally, multiple pairwise PEE and implementation details are presented.

3.1. Two-Layer Embedding

Existing PVO-based schemes are generally implemented by processing blocks in raster-scan order, i.e., from top to bottom and left to right. For each block, its right and bottom neighbors are always recovered before it at decoder such that they can serve as its context. To obtain a better context so as to achieve better pixel block selection, two-layer embedding is introduced into the proposed scheme. As shown in Figure 3(a), all pixels except boundary ones are divided into nonoverlapped blocks denoted by “shadow” and “blank,” respectively. During data embedding, shadow blocks are first embedded with half of the secret data and blank blocks are embedded with the remaining half later. At decoder, blank blocks are first recovered after data extraction. Then, shadow blocks are similarly processed later.
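The two-layer partition can be illustrated with a short Python sketch. Figure 3(a) is not reproduced here, so the sketch assumes a checkerboard arrangement of shadow and blank blocks (consistent with each shadow block having four nearest blank neighbors) and a one-pixel boundary that is left out of the partition; `partition_blocks` and its parameters are hypothetical names.

```python
def partition_blocks(height, width, rows, cols, border=1):
    """Split the interior of an image into shadow and blank blocks.

    Returns two lists of (top, left) block coordinates. Blocks alternate like a
    checkerboard (an assumption), and `border` rows/columns on every side are
    excluded so that each block has context pixels around it.
    """
    shadow, blank = [], []
    n_block_rows = (height - 2 * border) // rows
    n_block_cols = (width - 2 * border) // cols
    for bi in range(n_block_rows):
        for bj in range(n_block_cols):
            top_left = (border + bi * rows, border + bj * cols)
            if (bi + bj) % 2 == 0:
                shadow.append(top_left)
            else:
                blank.append(top_left)
    return shadow, blank
```

While the shadow layer is being embedded, the blank blocks are still unmodified, and at the decoder they have already been restored by the time the shadow layer is processed, so both sides see the same context for every shadow block.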

Referring to Figure 3(a), with the nearest four blank blocks serving as the context of a shadow block, a full-enclosed context is thus constructed. The first advantage of the new framework is to more precisely compute the local complexity of each block. In this paper, a complexity measurement is computed considering the absolute difference between two consecutive context pixels. As shown in Figure 3(b), is computed as the sum of all these differences.
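A minimal Python sketch of this complexity measure is given below. It assumes that the context is the one-pixel-wide ring surrounding the block, traversed clockwise from the top-left corner, and that the ring is not closed (the last and first pixels are not compared); the exact context shape and traversal of Figure 3(b) may differ, and both function names are hypothetical.

```python
def ring_pixels(image, top, left, rows, cols):
    """Collect the one-pixel-wide ring around a block, clockwise from the top-left corner."""
    t, b = top - 1, top + rows           # rows just above and just below the block
    l, r = left - 1, left + cols         # columns just left and just right of the block
    ring = [image[t][j] for j in range(l, r + 1)]             # top edge, left to right
    ring += [image[i][r] for i in range(t + 1, b + 1)]        # right edge, downwards
    ring += [image[b][j] for j in range(r - 1, l - 1, -1)]    # bottom edge, right to left
    ring += [image[i][l] for i in range(b - 1, t, -1)]        # left edge, upwards
    return ring


def local_complexity(context):
    """Sum of absolute differences between consecutive context pixels."""
    return sum(abs(a - b) for a, b in zip(context, context[1:]))
```

A smooth neighborhood yields a small value, so thresholding this measure selects blocks whose sorted pixels are likely to produce expandable errors.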

3.2. Adaptive Error-Pair Generation

For each selected shadow or blank block, the largest three pixels are utilized to generate an error pair. It has been verified by IPVO that more expandable errors can be derived from two correlated pixels by predicting the one having relatively smaller location. Therefore, with , two pairing errors and are computed aswhere . According to IPVO we have expandable errors and here. By combining them into a pair for pairwise PEE, the initial 2D mapping is obtained and shown in Figure 4.

To make a comparison between the new 2D mapping in Figure 4 (denoted by ) and the original one in Figure 1(b) (denoted by ), the image Lena is divided into blocks with the size of . Then, the numbers of all types of error pairs are computed and presented in Table 1. As it is shown, the new 2D mapping not only achieves capacity improvement, but also introduces less distortion per embedding one bit.

According to , all of (0, 0), (1, 0), and (1, 1) are Type-D error pairs. Specifically, the correlation between and is exploited to generate while the one between and is exploited to generate . To generate more expandable or such that more Type-D error pairs can be obtained, the probability of should be promoted. To accomplish this, we propose to adaptively define the spatial location of pixels within block. Suppose the cover image (denoted by ) is divided into a set of sized blocks, for each of which we propose to use horizontal and vertical gradients to estimate the distribution of pixels within block and thus determine how to collect them. With the upper-left pixel located at , horizontal gradient and vertical gradient are computed as

For example, every pixel is quite likely to have a right neighbor having smaller gray value when . In this case, we should collect pixels from left to right so as to ensure pixel having relatively smaller location has relatively larger gray value. Similarly, every pixel is quite likely to have a bottom neighbor having smaller gray value when . In that case, we should collect pixels from top to bottom. By combining and , eight modes of spatial location definition are summarized and presented in Figure 5, taking block with the size of 3 × 4 as an example. Moreover, how to determine the optimal mode is also given in Figure 5.

Table 2 presents the performance comparison after applying adaptive mode. According to Tables 1 and 2, obtains more Type-A error pairs but fewer Type-E error pairs or specifically more error pairs (1, 0) but fewer error pairs (0, 1). This can be explained by the higher probability of brought by adaptive mode. As a result, the capacity increases whereas the distortion decreases. It is also seen that adaptive mode is even more beneficial for . Specifically, we have more Type-B, Type-C, and Type-D error pairs but fewer Type-F error pairs. So far, with the improvement in error-pair generation, the capacity, which used to be 9,198 bits, rises to 11,219 bits. Although the total distortion is larger, the proposed scheme introduces less distortion per embedded bit; i.e., the measure drops from 2.338 originally to 2.213.
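The figures quoted in this paragraph can be put side by side with a short calculation. The relative changes follow directly from the stated numbers; the last line additionally assumes that the per-bit measure is simply total distortion divided by capacity, which the text suggests but does not state explicitly.

```latex
\begin{align*}
\text{capacity gain} &= \frac{11219 - 9198}{9198} \approx 22.0\%, \\
\text{per-bit distortion reduction} &= \frac{2.338 - 2.213}{2.338} \approx 5.3\%, \\
\text{implied total distortion ratio} &= \frac{11219 \times 2.213}{9198 \times 2.338} \approx 1.155.
\end{align*}
```

Under that assumption, total distortion grows by roughly 15.5% while capacity grows by about 22%, which is why the per-bit cost still falls.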

3.3. Multiple Pairwise PEE

Notice that the correlation between and is neglected so far. For , , and , their order can be one of the following three: (1) , (2) , and (3) . Then, in Case (3) we have and . By additionally considering the correlation between and , one can see that is shifted in case of . In this situation how is utilized for data embedding should be related to considering the similarity between them; i.e., the larger is, the more likely is shifted. In this paper, we propose to adaptively determine the predicted value of and recalculate aswith a parameter . To fulfill data embedding, is modified asand accordingly is modified to

Given that (i.e., ), we have . That is to say, if the correlation between and is likely to produce expandable error. Otherwise, we turn to utilize the correlation between and with serving as the predicted value of . By varying to cover every similarity between pairing errors, pairwise PEE will become more comprehensive. Similarly, is shifted in case of . In that situation serves as another candidate predicted value of and we recalculate as

Then, is obtained according to (11) and is similarly modified according to (12). Given that (i.e., ), we have expandable error  = 1. That is to say, one data bit will be embedded into error pairs . This can be equivalent to expansion bins selection. Table 3 presents the evolution of all types of error pairs from Case (3). The same evolution can be performed on error pairs from Cases (1) and (2) as well. In this way, the original 2D mapping is extended into for . One can verify that includes the original 2D mapping as a special case for taking .

3.4. Multiple Pairwise PEE Embedding

For each block, the minimum pixels are also utilized for data embedding. Let ; another two pairing errors and are calculated aswhere , . is similarly utilized like , and only and are reduced. By counting the frequency of error pairs and derived from blocks with a given complexity, a 2D histogram sequence for is defined as

Input: capacity requirement CR, activated mappings , performance measurement and ;
Output: parameters ;
(1) Function Optimization ()
(2)  
(3)  Update ()
(4)  While CR is not met do
(5)   
(6)   
(7)   for … do
(8)    if … and … then
(9)     
(10)     Update ()
(11)    End if
(12)   End for
(13)   If … then
(14)    
(15)    Update ()
(16)   End if
(17)  End while
(18)  Return
(19) End function
(20)
(21) Function Update ()
(22)  
(23)  
(24)  If … then
(25)   
(26)   
(27)  End if
(28) End function

So far we have histogram sequence and mapping sequence . The final step is the selection of histograms and mappings and the determination of parameters which indicates that the histogram is to be processed by the mapping . For histograms selection, the one with low complexity is preferentially selected. For mappings selection, any combination of various mappings can be attempted. Suppose that mappings are activated; takes values from to or 0 indicating that histogram is excluded. The determination of optimal is described as follows. Firstly, the capacity-distortion performance of in manipulating histogram is considered. Let denote the types of error pair defined by . The capacity and distortion, denoted by and , can be formulated as

Then, the complete determination procedure is presented in Algorithm 1.

An example is given here for better illustration. As shown in Figure 6, for such a sized block with , and are calculated according to (9) and (10). As a result, mode-6 is applied to obtain . After sorting, there is . Then, prediction errors are calculated as , , , and . Suppose that is determined as 4; error pairs and are to be processed by . Apparently, both error pairs are categorized as Type-C by . Finally, is enlarged to 189 while remains unchanged for embedding one bit valued 0, and is reduced to 176 while is reduced to 177 for embedding one bit valued 1.

3.5. Implementation of the Proposed Scheme

With specific block size and parameter , all pixels except boundary ones are divided into shadow blocks and blank blocks. Then, the embedding and extraction procedures for the shadow layer are given below. Notice that the shadow and blank layers are embedded equally; i.e., each layer is embedded with half of the secret message.

Firstly, to solve the problem of overflow/underflow, pixels are processed in raster-scan order by changing those valued 0 (255) into 1 (254). A location map is used to record whether a pixel has been changed or not. Then, the location map is compressed using arithmetic coding, and the length of the compressed map is recorded as part of the auxiliary information.
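The preprocessing step can be sketched as follows. The sketch assumes that the location map only needs one entry for each pixel whose preprocessed value is 1 or 254 (a flag of 1 if it was changed from 0 or 255, and 0 if it already had that value), since other pixels are never ambiguous; the arithmetic-coding step is omitted, and the function names are hypothetical.

```python
def preprocess(pixels):
    """Change 0 -> 1 and 255 -> 254 in raster-scan order.

    Returns (safe_pixels, location_map). The map holds one flag per pixel whose
    value after preprocessing is 1 or 254: 1 if it was changed, 0 otherwise.
    """
    safe, location_map = [], []
    for p in pixels:
        if p in (0, 255):
            safe.append(1 if p == 0 else 254)
            location_map.append(1)           # changed pixel
        else:
            safe.append(p)
            if p in (1, 254):
                location_map.append(0)       # same value, but originally so
    return safe, location_map


def postprocess(safe, location_map):
    """Undo preprocess using the (decompressed) location map."""
    flags = iter(location_map)
    restored = []
    for p in safe:
        # next(flags) is only consumed for the ambiguous values 1 and 254.
        if p in (1, 254) and next(flags) == 1:
            restored.append(0 if p == 1 else 255)
        else:
            restored.append(p)
    return restored
```

Note that `postprocess` has to be applied to the recovered pixel values, i.e., after data extraction has undone the embedding changes; for natural images the map is usually very sparse, so it compresses to a short bit string.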

Next, optimal is determined by capacity requirement which refers to the embedding of secret message and auxiliary information. Suppose that histograms with are selected; the transmission of parameters requires a capacity of bits. Then, the least significant bits (LSBs) of pixels from the first few blocks are recorded. These LSBs and the location map will be embedded along with half of the secret message as the payload.

With the first few ones skipped, shadow blocks are successively processed to embed the payload. For each block, local complexity and two error pairs , are first calculated. If , these two error pairs are processed by the 2D mapping ; otherwise, this block is skipped. After embedding the whole payload, the auxiliary information is embedded by replacing the LSBs of the first pixels. The auxiliary information consists of the following:
(i) Block size (4 bits)
(ii) Parameters (4 bits) and (10 bits)
(iii) Index of the last processed block (16 bits)
(iv) Length of the location map (16 bits)
(v) Parameters ( bits)
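The fixed-width part of this auxiliary information can be serialized as a plain bit string. The Python sketch below packs items (i) to (iv) using the bit widths listed above (50 bits in total); the field names, the big-endian bit order, and the way each quantity is encoded as an integer are assumptions made for illustration, and the variable-length item (v) is left out.

```python
# (name, width in bits) for items (i)-(iv); the names are hypothetical labels.
FIELDS = (
    ("block_size", 4),
    ("param_a", 4),
    ("param_b", 10),
    ("last_block_index", 16),
    ("location_map_length", 16),
)


def pack_header(values):
    """Serialize the fixed-width fields into a list of bits, most significant bit first."""
    bits = []
    for name, width in FIELDS:
        v = values[name]
        if not 0 <= v < (1 << width):
            raise ValueError(f"{name}={v} does not fit in {width} bits")
        bits.extend((v >> k) & 1 for k in range(width - 1, -1, -1))
    return bits


def unpack_header(bits):
    """Parse the bit list produced by pack_header back into a dict."""
    values, pos = {}, 0
    for name, width in FIELDS:
        v = 0
        for b in bits[pos:pos + width]:
            v = (v << 1) | b
        values[name] = v
        pos += width
    return values
```

Keeping the header at a fixed, known length lets the decoder read it before it knows anything else about the embedding.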

At the decoder, the auxiliary information is first retrieved from the LSBs of pixels of the first few shadow blocks. Next, with the first few ones skipped, shadow blocks are successively processed. For each block, local complexity and two marked error pairs , are calculated. If , these two error pairs are processed by the inverse 2D mapping of to extract the embedded data and recover the original pixel values; otherwise, this block is skipped. Finally, the LSB sequence, the compressed location map, and the hidden message are retrieved from the extracted data bits. With the LSB sequence, the LSBs of pixels of the first few shadow blocks are recovered. The location map is decompressed and used to restore the pixels that were changed to 1 (254) back to their original value 0 (255).
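The LSB substitution used for the auxiliary information, and its reversal at the decoder, can be sketched as below. The sketch treats the pixels as a flat list and uses hypothetical function names; which pixels of the first few blocks are used, and in what order, is not specified here and is an assumption.

```python
def replace_lsbs(pixels, bits):
    """Overwrite the LSBs of the first len(bits) pixels.

    Returns (marked_pixels, saved_lsbs); the saved LSBs must travel with the
    payload so that the decoder can put them back.
    """
    saved = [p & 1 for p in pixels[:len(bits)]]
    marked = [(p & ~1) | b for p, b in zip(pixels, bits)]
    marked += list(pixels[len(bits):])       # remaining pixels are untouched
    return marked, saved


def read_lsbs(pixels, count):
    """Read back the first `count` LSBs (the decoder's first step)."""
    return [p & 1 for p in pixels[:count]]
```

Restoring the original pixels is the same operation applied with the saved LSBs, e.g. `replace_lsbs(marked, saved)[0]`, which is why the original LSBs are embedded as part of the payload.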

4. Experimental Results

To demonstrate the performance of the proposed scheme, several experiments are conducted in this section. The test images are several gray-scale images with the size of . Except for Barbara, they are all downloaded from the USC-SIPI image database. To achieve the best performance, the embedding procedure is conducted for various block sizes taking . In terms of activated mappings, we consider only the combination of mappings with where is set as 2, 4, and 8.

To verify the performance of the proposed scheme, some embedding parameters are presented in Table 4. As it is shown, for a given EC of 10,000 bits, the image Lena is divided into sized blocks. At error generation stage, mode-8 becomes the most popular one and it is chosen by about of blocks in shadow layer and about of blocks in blank layer. During shadow-layered embedding histograms with are selected, of which 34 are processed by and 19 are processed by . During blank-layered embedding histograms with are selected, of which 45 are processed by and 11 are processed by . As a result, the highest PSNR reaches up to 61.15 dB.
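For reference, the PSNR values reported in this section follow the usual definition for 8-bit images, which can be computed as in the Python sketch below. The function name and the flat-list representation of the images are choices made for illustration only.

```python
import math


def psnr(cover, marked, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    if len(cover) != len(marked) or not cover:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between cover and marked pixels.
    mse = sum((c - m) ** 2 for c, m in zip(cover, marked)) / len(cover)
    if mse == 0:
        return math.inf                      # identical images
    return 10 * math.log10(peak ** 2 / mse)
```

Since each embedded or shifted pixel typically changes by 1, PSNR at a fixed capacity mainly reflects how many pixels had to be modified to carry the payload.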

Then, the proposed scheme is compared with five state-of-the-art schemes [17, 27, 30, 31, 35], as shown in Figure 7. Notice that rhombus prediction is adopted by [17, 27] while PVO prediction is adopted by [30, 31, 35] and the proposed scheme. Figure 7 shows that larger maximum EC is always guaranteed by rhombus prediction. However, PVO-based schemes are generally better in fidelity when capacity is moderate. Rhombus prediction is firstly utilized in 1D PEE [17]. Later, it is incorporated with pairwise PEE to realize 2D PEE [27]. As shown in Figure 7, the comparison of [17, 27] indicates that significant performance improvement is brought by the incorporation with pairwise PEE.

PVO-based pairwise PEE is firstly proposed in [30]. Figure 7 shows that, for some smooth images with large capacity (e.g., Lena with an EC larger than 35,000 bits and Barbara with an EC larger than 27,000 bits), [27] outperforms [30]. In other cases, [30] always outperforms [27]. The proposed scheme can also be regarded as an improved PVO-based pairwise PEE. As shown in Figure 7, the proposed scheme always outperforms [30]. According to Tables 5 and 6, the average gains of the proposed scheme over [30] are 0.52 and 0.63 dB for capacities of 10,000 and 20,000 bits, respectively. The proposed scheme also outperforms [27] in almost all cases. Rhombus prediction guarantees very large capacity on smooth images Lena and Barbara. As a result, on these images the proposed scheme only outperforms [27] with EC not larger than 44,000 and 35,000 bits, respectively.

An improved incorporation of pairwise PEE and PVO prediction is proposed in [31] and it motivates the strategy of multiple pairwise PEE. In [31], the original 2D mapping of [30] is extended to two new ones according to local complexity. However, the gain is limited and the reason mainly lies in that only error pairs consisting of shiftable error are involved in the 2D mapping evolution. The proposed scheme, by contrast, achieves rather significant performance improvement over [30]. Referring to Tables 5 and 6, the proposed scheme outperforms [31] by 0.33 dB on average for an EC of 10,000 bits and 0.43 dB on average for an EC of 20,000 bits.

The spatial correlation of pixels in a block is similarly exploited in [35]. Although the obtained errors are modified individually, the advantage of [35] is to fully exploit block redundancy by predicting as many pixels as possible. As shown in Figure 7, the proposed scheme outperforms [35] in almost all cases. Referring to Tables 5 and 6, the proposed scheme outperforms [35] by 0.24 dB on average for an EC of 10,000 bits and 0.4 dB on average for an EC of 20,000 bits. Furthermore, there is also significant superiority of the proposed scheme over [35] in capacity.

In summary, the proposed scheme mainly aims to present a better incorporation of pairwise PEE and PVO prediction technologies. Specifically, the original PVO-based pairwise PEE [30] is extended to adaptive embedding in two respects: adaptive error-pair generation and adaptive error-pair modification. To better verify the advantage of adaptive embedding, the comparison of the proposed scheme with [30] on the Kodak image database which contains several or sized images is shown in Figure 8. It shows that the proposed scheme achieves significant superiority over [30] on all kinds of images.

Finally, Table 7 gives the main notations used in this paper. It is worth mentioning that, although improvement is achieved through adaptive error-pair generation and adaptive error-pair modification, the proposed scheme still uses only the largest/smallest three pixels in a block for data embedding. In future work, we will focus on how to predict more pixels in a block and obtain more error pairs.

5. Conclusion

In this paper, a novel RDH scheme is proposed by presenting a better incorporation of pairwise PEE and PVO prediction. For error-pair generation, two-layer embedding is applied to obtain full-enclosed pixels. Then, full-enclosed pixels are used to estimate the distribution of pixels within a block and to realize adaptive spatial location definition. In this way, the error-pair generation becomes adaptive, and the distribution of error pairs is thus optimized. For error-pair modification, pairwise PEE is extended to multiple pairwise PEE, with which the 2D histogram can be divided into a set of subhistograms for more accurate modification. The experimental results have shown that the proposed scheme outperforms previous state-of-the-art schemes by improving the marked image quality.

Data Availability

Data are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported in part by National Natural Science Foundation of China under Grant 61802074 and in part by Guangdong Basic and Applied Basic Research Foundation (2020A1515010760).