#### Abstract

In this paper, an efficient reversible data hiding method for encrypted images based on neighborhood prediction is proposed, which includes image encryption, reversible data hiding in the encrypted domain, and hidden data extraction. The cover image is first partitioned into non-overlapping blocks, and the pixel values in each block are then encrypted by a modulo operation. As a result, the linear prediction difference in any block that satisfies a specific condition is the same before and after encryption, which ensures that data extraction is completely separable from image decryption. In addition, predicting the current pixel by a linear weighting of the three adjacent pixels in the block improves the prediction accuracy. The data-hider, who does not know the original image content, embeds additional data by modifying the prediction difference histogram. Data extraction and image recovery are free of any error. Experimental results demonstrate the feasibility and efficiency of the proposed scheme.

#### 1. Introduction

Cloud computing is revolutionizing the way digital media is stored and processed. However, the privacy and security of the digital media that resides on the cloud server may be questionable, since the cloud data center is managed by a third party. One of the best ways to ensure security and confidentiality is to encrypt the media. The user first converts the sensitive content into an unintelligible form before uploading it to the cloud, so that no information is revealed. All processing and calculation in the cloud are performed in the ciphertext domain, and the processing result is provided to the user [1]. An authorized terminal user who has the decryption key can obtain the plaintext data after decryption. Under this circumstance, the cloud service provider is not authorized to access the plaintext content. However, the effective management and reliable protection of massive ciphertext data in the cloud has become an urgent problem. Data hiding in the encrypted domain is a new research field in which additional messages, such as owner identity or authentication data, are embedded directly into encrypted data for effective management or tamper detection.

In the past few years, a considerable number of data hiding schemes for encrypted images or videos have been reported in the literature [2–10]. However, in these schemes, the original cover cannot be recovered without distortion after data embedding. Strictly speaking, cloud service providers are not entitled to introduce permanent distortion, especially for medical and military images. Consequently, many researchers have shown interest in developing reversible data hiding in encrypted images (RDH-EI). Due to reversibility, the original image can be fully recovered after the secret information is extracted [11]. In general, an RDH-EI framework has three parties, i.e., the content owner, the data-hider, and the receiver. To preserve privacy, the content owner encrypts the original image before sending it to the data-hider. The data-hider embeds some additional information into the encrypted image and has no privilege to access the original content. At the receiver end, the authorized user can extract the hidden information and losslessly recover the original image. This can be used in many privacy-preserving applications such as medical cloud storage and image management.

Generally, existing RDH-EI methods can be classified into three categories, namely, methods that vacate room after encryption (VRAE) [12–22], methods that reserve room before encryption (RRBE) [23–25], and methods based on homomorphic encryption [26–33]. The early VRAE framework was proposed by Zhang [12, 13] and Hong et al. [14]. The entire data of an uncompressed image are encrypted directly by a stream cipher, and the data-hider then embeds the additional data by modifying a small portion of the encrypted data. The advantage is that the operations of the end user are simple and efficient. However, the embedding capacity is relatively small and, more importantly, the accuracy of data extraction and the lossless recovery of the original image are not satisfactory. Later, sparse coding was applied in RDH-EI to achieve high image quality [15]. Qian and Zhang [16] proposed an RDH-EI scheme using distributed source coding (DSC). Huang et al. [17] designed a new framework for RDH in the encrypted domain, which integrates previous RDH approaches based on difference histogram shifting via a new encryption strategy. Zhou et al. [18] used a public key modulation mechanism to embed additional data without access to the encryption key. Recently, a high-capacity reversible data hiding approach based on MSB (most significant bit) prediction was presented in [19]. In addition, several other VRAE schemes are reported in [20–22].

In the RRBE framework, the embedding room is vacated in the plaintext domain. Ma et al. [23] proposed an RRBE method that reserves room in the original image before encryption, so that secret information can then be embedded into the reserved space directly. Later, other RRBE methods were proposed that reserve the space using different techniques [24, 25]. The advantages of this framework are mainly reflected in two aspects: relatively large embedding capacity and pure reversibility. However, the data-hider must know the room vacated by the content owner before encryption; otherwise he/she cannot perform information embedding with RRBE. This will undoubtedly lead to information leakage. In addition to VRAE and RRBE, another type of method is based on homomorphic encryption. Chen et al. [26] first proposed an RDH-EI approach based on the Paillier cryptosystem. Later, Shiu et al. [27] improved Chen et al.'s method [26] by transplanting difference expansion into homomorphic encryption. More RDH methods in the homomorphic encrypted domain have been investigated in [28–33]. However, the most important problem of homomorphic encryption, such as the Paillier cryptosystem, is that it causes data expansion after encryption.

In this paper, we develop an effective and reliable framework for RDH-EI. The proposed method belongs to the first category (VRAE). Its main contribution is the combination of modular addition and prediction error histogram modification, and its advantages are reflected in the following aspects. First, room for data hiding does not need to be vacated before encryption. Second, the scheme is fully separable and fully reversible. Third, a modular addition operation with additive homomorphism is used for image encryption; unlike the public key cryptosystems in [26–29], it does not result in data expansion. More importantly, the linear prediction difference in an image block that satisfies certain conditions remains the same before and after encryption, which ensures that data extraction is completely separated from image decryption. Finally, unlike the prediction technology in [31, 32], the proposed loop prediction technique obtains the prediction difference of every pixel in the block, which greatly increases the number of carriers for information embedding. In addition, the prediction accuracy is improved by using the linear weighting of the three adjacent pixel values, which effectively improves the embedding capacity.

The rest of the paper is organized as follows. In Section 2, we present the proposed scheme, which includes image encryption, data embedding in the encrypted image, data extraction, and original image recovery. Experimental results and analysis are given in Section 3. Finally, conclusions and future work are given in Section 4.

#### 2. Proposed Scheme

The framework of the proposed scheme has been outlined in Figure 1, which shows the various steps that are performed in RDH-EI. It is composed of three phases, i.e., image encryption, data embedding in encrypted image, and data extraction and image recovery. First, the content owner encrypts the original image using an encryption key and sends the encrypted version to the data-hider. On the server side, the data-hider embeds some additional data into the encrypted image using an embedding key. Here, the data-hider is not authorized to access the original content (i.e., plaintext). At the receiving end, an authorized user can extract the hidden data and losslessly recover the original image.

##### 2.1. Image Encryption

Let the size of an original image be $M \times N$, and let each pixel value $I(i,j)$ lie in the range $[0, 255]$, with $1 \le i \le M$ and $1 \le j \le N$. The original image is first divided into non-overlapping blocks. If both *M* and *N* are multiples of 2, the cover image is divided into $M/2 \times N/2$ image blocks of size 2 × 2, as shown in Figure 2. If *M* or *N* is not a multiple of 2, the cover image is divided into $\lceil M/2 \rceil \times \lceil N/2 \rceil$ blocks, including $\lfloor M/2 \rfloor \times \lfloor N/2 \rfloor$ blocks of size 2 × 2. Here, $\lceil x \rceil$ denotes the smallest integer greater than or equal to $x$, and $\lfloor x \rfloor$ is the floor function, which gives the greatest integer less than or equal to $x$.

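To ensure that pixels in the same block are encrypted with the same random value, the encryption matrix $E$ is obtained using the following equation:

$$E(i,j) = R\left(\lceil i/2 \rceil, \lceil j/2 \rceil\right), \tag{1}$$

where $R$ is a pseudo-random matrix generated using a pseudo-random number generator (PRNG) with the encryption key $K_e$. After the encryption matrix is obtained, image encryption is performed using modulo-256 addition as follows:

$$C(i,j) = \left(I(i,j) + E(i,j)\right) \bmod 256, \tag{2}$$

where $I(i,j)$ is the original pixel value and $C$ represents the encrypted image. The corresponding decryption is done in the following manner:

$$I(i,j) = \left(C(i,j) - E(i,j)\right) \bmod 256. \tag{3}$$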

According to equation (2), the same random integer is added to every pixel within an image block. Thus, spatial correlations are kept within small image blocks, and they can be exploited to embed secret data. Moreover, the main advantages of this algorithm are its simplicity and high speed.
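The block-wise modulo-256 encryption described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: NumPy's `default_rng` stands in for the unspecified PRNG, and the function names `block_key_matrix`, `encrypt`, and `decrypt` are hypothetical.

```python
import numpy as np

def block_key_matrix(shape, key, block=2):
    """Expand one pseudo-random value per block into a full-size matrix."""
    rows, cols = shape
    rng = np.random.default_rng(key)  # stand-in PRNG, seeded by the encryption key
    n_block_rows = -(-rows // block)  # ceiling division
    n_block_cols = -(-cols // block)
    r = rng.integers(0, 256, size=(n_block_rows, n_block_cols), dtype=np.int64)
    # Repeat each random value over its block, then crop partial edge blocks.
    return np.kron(r, np.ones((block, block), dtype=np.int64))[:rows, :cols]

def encrypt(image, key):
    """Modulo-256 addition of the block-wise key matrix."""
    e = block_key_matrix(image.shape, key)
    return ((image.astype(np.int64) + e) % 256).astype(np.uint8)

def decrypt(cipher, key):
    """Modulo-256 subtraction of the same key matrix."""
    e = block_key_matrix(cipher.shape, key)
    return ((cipher.astype(np.int64) - e) % 256).astype(np.uint8)

# Small synthetic 5 x 6 "image" with odd height, so edge blocks are partial.
image = (np.arange(30, dtype=np.int64).reshape(5, 6) * 8).astype(np.uint8)
cipher = encrypt(image, key=2019)
restored = decrypt(cipher, key=2019)
print(np.array_equal(restored, image))
```

Because every pixel of a block receives the same offset, differences between pixels inside a block survive encryption whenever none or all of them wrap around modulo 256.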

In order to achieve error-free data extraction and complete reversibility, we also need to determine during the encryption process whether the current block is embeddable. If the current block is 2 × 2, the current random value is added to each of its four pixel values. We denote the four pixels in the *m*-th original image block as $p_{m,1}$, $p_{m,2}$, $p_{m,3}$, and $p_{m,4}$. After the modulo operation with the random value $r_m$, the four pixels in the new image block are calculated by $c_{m,k} = (p_{m,k} + r_m) \bmod 256$ (for $k = 1, 2, 3, 4$). A block is identified as an embeddable block if one of the following conditions is satisfied:

$$p_{\max} + r_m \le 255 \tag{4}$$

or

$$p_{\min} + r_m \ge 256, \tag{5}$$

where $p_{\max} = \max_k p_{m,k}$ and $p_{\min} = \min_k p_{m,k}$. In other words, the modulo operation wraps around either for none of the four pixels or for all of them. Otherwise, the block is identified as a non-embeddable block.

A binary location map is used to record the locations of the embeddable blocks. Specifically, if the current block is an embeddable block, the corresponding element of the map is marked as “0”; otherwise, it is marked as “1”. Since the map is mainly composed of zeros, it can be compressed with a lossless compression algorithm. Subsequently, it can be embedded in the marginal area by using LSB replacement. Alternatively, it can be saved as part of the side information and transmitted to the receiver side [34].
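As a hedged illustration of how such a location map could be built, the sketch below marks a 2 × 2 block as embeddable when the sums of pixel value and random value are either all below 256 or all above 255, which is the property used by the proof in Section 2.2. The function name `location_map` and the `key_matrix` argument (the per-pixel random values, constant within each block) are assumptions made for this example.

```python
import numpy as np

def location_map(image, key_matrix):
    """Return one entry per full 2x2 block: 0 = embeddable, 1 = non-embeddable."""
    rows, cols = image.shape
    sums = image.astype(np.int64) + key_matrix   # values before the modulo step
    wrapped = sums >= 256                        # True where modulo 256 wraps around
    br, bc = rows // 2, cols // 2                # partial edge blocks are ignored
    # Rearrange into (block_row, block_col, 4 pixels of the block).
    blocks = (wrapped[:2 * br, :2 * bc]
              .reshape(br, 2, bc, 2)
              .transpose(0, 2, 1, 3)
              .reshape(br, bc, 4))
    uniform = blocks.all(axis=2) | (~blocks).all(axis=2)
    return np.where(uniform, 0, 1).astype(np.uint8)

# Two blocks side by side, both encrypted with the random value 3.
image = np.array([[250, 251, 10, 12],
                  [252, 253, 11, 13]], dtype=np.uint8)
key_matrix = np.full((2, 4), 3, dtype=np.int64)
print(location_map(image, key_matrix))
```

In the left block only the pixel 253 wraps around (253 + 3 = 256), so that block is marked non-embeddable, while the right block stays entirely below 256 and is embeddable.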

##### 2.2. Data Embedding in Encrypted Image

At this phase, the data-hider can embed some secret data into the encrypted image without knowing the image content. The whole process includes difference histogram generation and difference histogram modification.

*(1) Difference Histogram Generation.* To generate the histogram of prediction errors in the encrypted domain, an efficient loop prediction technique is adopted. The detailed procedure is described as follows.

*Step 1.* After obtaining the encrypted image, the data-hider divides it into non-overlapping 2 × 2 blocks in the same way as in image encryption. If the width or height of the image is not a multiple of 2, the right edge blocks or bottom edge blocks whose size is not 2 × 2 are ignored during data embedding.

*Step 2.* According to the location map, it can be determined whether the current block is an embeddable block. If it is an embeddable block, go to Step 3. If it is a non-embeddable block, the next block is taken as the current block and Step 2 is repeated.

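*Step 3.* For the encrypted pixel value $c_{m,1}$ in the upper left corner of the *m*-th image block, the prediction difference is calculated according to the following equation:

$$d_{m,1} = \left(c_{m,1} - \lfloor w_1 c_{m,2} + w_2 c_{m,3} + w_3 c_{m,4} \rfloor\right) \bmod 256, \tag{6}$$

where $c_{m,2}$, $c_{m,3}$, and $c_{m,4}$ are the three remaining encrypted pixels located in the same row, column, and diagonal directions with respect to $c_{m,1}$, and $d_{m,1}$ is the prediction difference. The weight coefficients $w_1$, $w_2$, and $w_3$ satisfy $w_1 + w_2 + w_3 = 1$ and $0 \le w_1, w_2, w_3 \le 1$.

Although $c_{m,1}$, $c_{m,2}$, $c_{m,3}$, and $c_{m,4}$ are encrypted values, the following equation is easily proved for an embeddable block:

$$\left(c_{m,1} - \lfloor w_1 c_{m,2} + w_2 c_{m,3} + w_3 c_{m,4} \rfloor\right) \bmod 256 = \left(p_{m,1} - \lfloor w_1 p_{m,2} + w_2 p_{m,3} + w_3 p_{m,4} \rfloor\right) \bmod 256, \tag{7}$$

where $p_{m,1}$ is the original pixel value in the upper left corner of the *m*-th image block, $p_{m,2}$ is the horizontally adjacent pixel value, $p_{m,3}$ is the vertically adjacent pixel value, and $p_{m,4}$ is the diagonal pixel value.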

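*Proof.* According to Section 2.1, if the current block is embeddable, the four sums of an original pixel value and the block's random value are either all greater than 255 or all less than 256. The modulo operation therefore shifts all four pixels of the block by the same amount, and because the weights sum to one this common shift cancels in the prediction difference. Consequently, equation (6) can be simplified to the form of equation (7). The setting of the weighting coefficients has a certain influence on the prediction accuracy; for the sake of simplicity, fixed values of the three coefficients are used.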
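The invariance argued in the proof can be checked numerically. The sketch below is only illustrative: the weights (2/5, 2/5, 1/5) are hypothetical values chosen to sum to one, the floor in the predictor is an assumption about the rounding rule, and exact fractions are used to avoid floating-point effects.

```python
from fractions import Fraction
from math import floor

# Hypothetical weights that sum to 1; not the values used in the paper.
WEIGHTS = (Fraction(2, 5), Fraction(2, 5), Fraction(1, 5))

def encrypt_block(pixels, r):
    """Add the same random value r to all four pixels, modulo 256."""
    return tuple((p + r) % 256 for p in pixels)

def prediction_difference(block, weights=WEIGHTS):
    """Top-left pixel minus the floored weighted sum of its three neighbours, mod 256."""
    x1, x2, x3, x4 = block
    w1, w2, w3 = weights
    predicted = floor(w1 * x2 + w2 * x3 + w3 * x4)
    return (x1 - predicted) % 256

plain = (100, 102, 98, 101)
d_plain = prediction_difference(plain)

for r in (50, 200, 156):
    sums = [p + r for p in plain]
    embeddable = all(s < 256 for s in sums) or all(s > 255 for s in sums)
    d_cipher = prediction_difference(encrypt_block(plain, r))
    print(r, embeddable, d_plain == d_cipher)
```

With `r = 50` no pixel wraps and with `r = 200` all four wrap, so the difference is preserved in both cases. With `r = 156` only three of the four sums reach 256, the block is not embeddable, and the difference changes.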

It can be seen from the above proof that the prediction difference remains unchanged before and after encryption. All other 2 × 2 blocks can be processed in the same manner. There is a high degree of correlation between adjacent pixels in a local region of an image; that is, they have similar or even identical gray values. Thus, the resulting difference histogram has a higher peak than the histogram of the original image. The difference histogram is usually defined as $h(v) = \#\{m \mid d_m = v\}$, where # denotes the cardinal number of a set, $d_m$ is the prediction difference obtained in a block, and *m* represents the block number. The difference histograms of some gray images in Figure 3 illustrate the distribution of the prediction errors.

**Figure 3:** Difference histograms of the test images: (a) Lena, (b) Baboon, (c) Barbara, (d) Peppers, (e) Truck, (f) Boat, (g) Man, (h) Airplane.

*(2) Difference Histogram Modification.* In the data embedding procedure, equation (6) is used for each block to calculate the prediction error, which is then utilized for secret data embedding. After the first round of embedding, the first pixel in each block is changed to its marked value, as shown in Figure 4(b). In the second round of embedding, this modified value is used together with the other two remaining pixels to predict the second pixel, which is then changed to its marked value, as given in Figure 4(c). The remaining two rounds of modification are similar. After four rounds of data embedding, all pixels in each block have been modified, as shown in Figure 4(e).

**Figure 4:** (a) The first round of embedding, (b) the second round of embedding, (c) the third round of embedding, (d) the fourth round of embedding, (e) the final results.

Without loss of generality, we use the first round of embedding to describe the data embedding procedure. First, the highest bins on the left and right sides of the difference histogram are found and taken as the two peak points. The embedding zone, which determines where the message bits are embedded, is then defined around these two peak points, with its width controlled by a scale factor *β*. In this case, the capacity equals the number of prediction differences that fall inside the embedding zone.

The difference histogram is then modified in the encrypted domain: each difference inside the embedding zone is expanded to carry one bit of the secret data, while the differences outside it are shifted to make room, which yields the modified prediction error. In general, to achieve higher security, a stream cipher with the data-hiding key is used to encrypt the secret data before it is embedded into the encrypted image. Unauthorized users will have difficulty recovering the original message because they do not have the key. Finally, the modified pixel values are obtained from the predicted values and the modified prediction errors.

According to the statistical distribution of the difference histograms in Figure 3, a prediction difference is more likely to occur the closer it is to 0 or 255. For simplicity, the two peak points are therefore fixed at 0 and 255. For better illustration, a graphical representation of histogram shifting is shown in Figure 5. Although the middle part of the difference histogram is usually empty, ambiguities arise when the bins from the two sides overlap in the middle after expansion. To avoid this, the differences at the two boundary values in the middle are not shifted. However, ambiguities still arise when a difference is moved onto one of these boundary values during the embedding process. This overlapping problem is resolved by using a second location map: a binary array with one element for every difference that equals one of the boundary values after embedding, 0 for genuine and 1 for pseudo. This location map and the additional information are embedded together in the encrypted domain.
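To make the shifting idea concrete, here is a deliberately simplified sketch of two-sided histogram shifting on modulo-256 differences. It is not the paper's embedding rule: it assumes peaks at 0 and 255, omits the scale factor *β*, assumes the two middle bins (127 and 128) are empty so that no boundary location map is needed, and uses the hypothetical function names `embed` and `extract`.

```python
def embed(differences, bits):
    """Embed one bit into every difference equal to 0 or 255; shift the rest by 1."""
    bit_iter = iter(bits)
    marked = []
    for d in differences:
        if d in (127, 128):
            raise ValueError("middle bins must be empty in this simplified sketch")
        if d == 0:
            marked.append(d + next(bit_iter, 0))   # 0 -> 0 or 1
        elif d == 255:
            marked.append(d - next(bit_iter, 0))   # 255 -> 255 or 254
        elif d < 127:
            marked.append(d + 1)                   # left side shifts right
        else:
            marked.append(d - 1)                   # right side shifts left
    return marked

def extract(marked):
    """Recover the embedded bits and the original differences."""
    bits, restored = [], []
    for d in marked:
        if d in (0, 1):
            bits.append(d)
            restored.append(0)
        elif d in (254, 255):
            bits.append(255 - d)
            restored.append(255)
        elif d < 128:
            restored.append(d - 1)
        else:
            restored.append(d + 1)
    return bits, restored

differences = [0, 3, 255, 250, 0, 1, 254]
payload = [1, 0, 1]                  # one bit per difference equal to 0 or 255
marked = embed(differences, payload)
bits, restored = extract(marked)
print(marked, bits, restored == differences)
```

In this simplified form every difference changes by at most one, and extraction restores both the payload and the original differences exactly. If the payload is shorter than the number of peak differences, the sketch pads with zero bits.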

**Figure 5:** Graphical representation of histogram shifting, panels (a) to (c).

##### 2.3. Data Extraction and Original Image Recovery

At the receiver side, data extraction and image decryption are completely separable. In other words, the hidden data can be extracted before or after decryption. Next, we will introduce these two schemes of data extraction in detail.

*(1) Scheme I: Data Extraction in the Encrypted Domain.* With the encrypted image containing the secret information and the data-hiding key, the receiver can extract the secret information directly by reversing the data embedding process.

*Step 1.* Divide the marked and encrypted image into non-overlapping 2 × 2 blocks as in the data embedding phase.

*Step 2.* According to the location map, if the current block is an embeddable block, go to Step 3. If it is a non-embeddable block, the next block is taken as the current block and Step 2 is repeated.

*Step 3.* Note that data extraction is performed in the reverse order of the embedding procedure, i.e., from the pixel modified in the last round back to the pixel modified in the first round. For each block, the remaining three pixels are utilized to predict the pixel being processed, and the obtained prediction error is used for data extraction. After the first round of data extraction, this pixel is changed back to its value before embedding. The restored pixel is then used together with the other two pixels to predict the next pixel, which is restored after data extraction in the same way. The remaining two pixels are obtained similarly. For simplicity, the pixel in the upper left corner is still taken as an example.

For the marked encrypted pixel value in the upper left corner of the *m*-th image block, the prediction difference is calculated from the marked encrypted block in the same form as equation (6). According to the proof in Section 2.2, this difference is equal to the corresponding difference computed from the plaintext pixel values.

*Step 4.* According to the previous embedding process, the secret information is carried by the differences in the two parts of the embedding zone, on the left and right sides of the histogram. Depending on which part a marked difference falls in, the hidden bit is extracted according to equation (17) or equation (18).

*Step 5.* The extracted bits are further decrypted by using the data-hiding key. Thus the original secret data are obtained.

Since the whole process is operated entirely in the encrypted domain, it effectively avoids leakage of the original content. After the secret data are obtained from the marked encrypted image, the prediction difference value can be further restored by reversing the histogram modification (equation (19)). It should be noted that the boundary differences are restored according to the location map. The encrypted pixel values are then obtained from the restored differences and the predicted values.

Thus, the encrypted image without the hidden data is obtained. With the encryption key, the original cover image can be accurately restored by performing the decryption operation in equation (3). As each recovery step is reversible, the final decrypted image is exactly the same as the original one.

*(2) Scheme II: Data Extraction in the Decrypted Domain.* In Scheme I, both data embedding and extraction are performed in the encrypted domain. However, in some scenarios, users want to decrypt the image first and then extract the hidden data from the decrypted image when it is needed. For example, after the image has been decrypted, the recipient may still wish to track the source of the image. Thus, Scheme II is introduced to perform data extraction after image decryption. The detailed process of decryption and data extraction comprises the following steps.

*Step 1.* With the encrypted image containing the secret information and the encryption key, image decryption is accomplished by the modulo-256 subtraction of equation (3), applied to the marked encrypted image. No visible distortions can be observed in the marked decrypted images, as will be demonstrated in the experimental results.

*Step 2.* Divide the marked and decrypted image into non-overlapping 2 × 2 blocks, exactly as in Section 2.1.

*Step 3.* According to the location map, if the current block is an embeddable block, go to Step 4. If it is a non-embeddable block, the next block is taken as the current block and Step 3 is repeated.

*Step 4.* Calculate the prediction difference between the basic pixel and the remaining pixels in each 2 × 2 block. For simplicity, the pixel in the upper left corner is still taken as an example.

For the marked decrypted pixel value in the upper left corner of the *m*-th image block, the prediction difference is calculated from the marked decrypted block in the same form as equation (6). According to the proof in Section 2.2, this difference is equal to the one computed in the encrypted domain.

*Step 5.* The hidden data can be extracted in a manner similar to equation (17) and equation (18); it is only necessary to replace the prediction difference from the encrypted domain in these equations with the one computed from the decrypted image.

*Step 6.* The prediction difference can also be restored in the same way as in equation (19). The original pixel values are then obtained from the restored differences and the predicted values.

Therefore, the original image is successfully recovered. Due to the reversibility, the original image and the secret data can be completely restored without any error.

#### 3. Experimental Results and Analysis

In this section, the experimental results obtained with our method are presented. Eight standard images of size 512 × 512 × 8, i.e., *Aerial*, *Barbara*, *Lena*, *Lighthouse*, *Tank*, *Truck*, *Zelda*, and *Boats* [35], are used to compare our proposed method with related state-of-the-art works. In addition, 80 images selected from a popular gray-scale image database [36] are used to further demonstrate the effectiveness of our method. The secret data is a binary sequence generated by a pseudo-random number generator. For data hiding in encrypted images, three aspects of performance have to be measured: the scrambling effect, the payload (i.e., embedding rate), and the reconstructed image quality.

##### 3.1. Scrambling Effect and Security Analysis

In RDH-EI, unauthorized users are not allowed to access the original image or the secret data, so both should be protected. To protect the secret data, a stream cipher is applied to change the bits; without the data-hiding key, it is extremely difficult to reveal the secret data. To protect the image, a pseudo-random matrix generated by a PRNG is used for encryption. A statistical analysis of histograms is employed to verify the security level. Generally, the histogram of an image shows the distribution of the pixels over the intensity values. Figure 6 illustrates the histograms of the original images, and the corresponding histograms after encryption are shown in Figure 7. It can be observed that the histograms of the encrypted images obtained with our approach are close to uniformly distributed in comparison with those of the original images, so they are difficult to exploit to obtain information about the original content. In addition, the histogram statistics of the original images and the corresponding encrypted images are given in Table 1.

**Figure 6:** Histograms of the original images: (a) Lena, (b) Baboon, (c) Barbara, (d) Peppers, (e) Truck, (f) Boat, (g) Man, (h) Airplane.

**Figure 7:** Histograms of the encrypted images: (a) Lena, (b) Baboon, (c) Barbara, (d) Peppers, (e) Truck, (f) Boat, (g) Man, (h) Airplane.

Perceptual security refers to the encrypted image being unintelligible. The original images are given in Figure 8, and their corresponding encrypted results are shown in Figure 9. As can be observed, the visual content of the plaintext images is completely obscured by the proposed encryption scheme. In addition, for the standard gray images *Lena*, *Baboon*, *Barbara*, *Peppers*, *Truck*, *Boat*, *Man*, and *Airplane*, the PSNR (peak signal-to-noise ratio) values of the encrypted images are 9.53 dB, 9.53 dB, 7.85 dB, 8.45 dB, 9.61 dB, 8.96 dB, 7.56 dB, and 8.05 dB, respectively. These PSNR values are low, which indicates that the scrambling performance of the encryption scheme is adequate for visual security. To validate the proposed method more comprehensively, another 80 images are selected from [36] for testing. The PSNR values of the 80 encrypted images are shown in Figure 11. In [37], bit plane disordering, block scrambling, and pixel scrambling are performed, which may provide a reference for further enhancing security in our future work.

**Figure 8:** Original images: (a) Lena, (b) Baboon, (c) Barbara, (d) Peppers, (e) Truck, (f) Boat, (g) Man, (h) Airplane.

**Figure 9:** Encrypted images with their PSNR values: (a) Lena (9.53 dB), (b) Baboon (9.53 dB), (c) Barbara (7.85 dB), (d) Peppers (8.45 dB), (e) Truck (9.61 dB), (f) Boat (8.96 dB), (g) Man (7.56 dB), (h) Airplane (8.05 dB).

##### 3.2. Visual Quality of Marked and Decrypted Image

When both the encryption key and the data-hiding key are available, the original image can be losslessly recovered by decrypting the image and extracting the hidden information, as indicated by a PSNR that tends to $+\infty$. The equality between the original and restored images confirms the reversibility of the proposed scheme. Note that the hidden message is always extracted without any error. In some scenarios, the authorized user decrypts the marked encrypted image directly to get an approximation of the original image. Therefore, the visual quality of the decrypted image containing the hidden data is also expected to be equivalent or very close to that of the original image. Since the change made to each pixel is bounded by a small amount that depends on the scale factor *β*, the introduced distortion is not perceptible when *β* is small. To verify this, the original images and their corresponding decrypted versions containing the hidden data are shown in Figures 8 and 10, respectively. As shown, the decrypted versions are visually almost identical to the original images.

**Figure 10:** Decrypted images containing the hidden data, with their PSNR values: (a) Lena (48.84 dB), (b) Baboon (48.90 dB), (c) Barbara (49.03 dB), (d) Peppers (48.74 dB), (e) Truck (48.69 dB), (f) Boat (48.83 dB), (g) Man (49.12 dB), (h) Airplane (49.00 dB).

To quantitatively evaluate the reconstructed image quality in comparison to the original one, the PSNR values at different embedding rates are given in Table 2. For the smaller embedding range, the PSNR values are all above 48 dB. Even for the larger embedding range, the PSNR values are all around 40 dB. In addition, the PSNR values for the other 80 images [36] are given in Figure 12 and show similar results. These test results indicate that the degradation in image quality caused by data hiding is almost impossible to detect.
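For reference, PSNR for 8-bit images is conventionally computed as $10 \log_{10}(255^2 / \mathrm{MSE})$. The helper below is a minimal sketch of that standard definition; the function name `psnr` is chosen for this example and is not taken from the authors' MATLAB code.

```python
import numpy as np

def psnr(reference, test):
    """Peak signal-to-noise ratio in dB for 8-bit images (peak value 255)."""
    ref = np.asarray(reference, dtype=np.float64)
    tst = np.asarray(test, dtype=np.float64)
    mse = np.mean((ref - tst) ** 2)
    if mse == 0:
        return float("inf")      # identical images: lossless recovery
    return 10.0 * np.log10(255.0 ** 2 / mse)

original = np.full((4, 4), 100, dtype=np.uint8)
changed = original + 1           # every pixel altered by exactly one gray level
print(psnr(original, original), round(psnr(original, changed), 2))
```

When every pixel differs by exactly one gray level the MSE is 1 and the PSNR is about 48.13 dB, so an image whose pixels change by at most one level cannot fall below that value. Identical images give an infinite PSNR, which corresponds to lossless recovery.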

##### 3.3. Embedding Capacity

In our experiments, the embedding capacity is measured in bits per pixel (bpp), which is expected to be as large as possible in order to embed the maximum amount of information. The embedding capacity of each image is determined by the number of prediction errors within the embedding zone. As the side information is extremely small with respect to the capacity, the side bits are not counted in the following experimental results. For the eight standard gray images, the embedding rates are shown in Table 2. It can be observed that the embedding capacity of the proposed scheme depends strongly on the characteristics of the original cover image, as each image has a different number of prediction errors available for embedding. As expected, images with less texture (e.g., *Lena* and *Airplane*) have higher prediction accuracy and thus contribute a higher number of differences at the peak points. Therefore, more data can be embedded in them, giving a larger embedding rate. On the other hand, images with higher spatial activity (e.g., *Baboon* and *Truck*) achieve a lower embedding rate.

As can be seen from equation (14), the embedding capacity is related to the parameter *β*, which can be used to adjust the embedding capacity flexibly. When a low capacity is needed, e.g., for content authentication purposes, the embedding range can be narrowed by choosing a small *β*. In this way, a higher quality of the stego-image is achieved, as depicted in Figure 12. As the parameter *β* increases, more prediction errors can be exploited for embedding, and the embedding capacity becomes higher. On the other hand, more shifting and embedding operations make more changes to the pixel values, which leads to a lower PSNR.

Table 2 illustrates the embedding capacity for three settings of *β*, which demonstrates the effectiveness of *β* in improving the embedding capacity. There is no perfect solution that achieves high payload and low distortion simultaneously. The flexible capacity control in our framework helps to make a tradeoff between capacity and visual quality according to different practical requirements. Besides the eight standard images, the embedding rates of the other 80 test images [36] are plotted in Figure 13.

##### 3.4. Comparison and Discussion

As mentioned in Section 1, all methods in [12–14] may introduce some errors in data extraction and/or image recovery, while complete reversibility and error-free extraction are achieved in the proposed method. For the methods in [23–25], error-free data extraction and lossless image recovery can be obtained, but histogram shifting must be done prior to encrypting the image. In contrast, in our method the image is encrypted directly, which is more reasonable. In addition, comparisons are made between our proposed method and several previous methods [16, 25, 31] in terms of embedding rate. Eight standard images are taken as examples, and the comparison of the embedding capacity is shown in Figure 14. It can be seen that our method has a larger payload than the others. The maximal payload among the compared methods, obtained by Xu et al. [25], is 0.3565 bpp, whereas the proposed method reaches a maximal payload of 0.5935 bpp. In terms of reconstructed image quality, our method can perfectly reconstruct the original image (PSNR = $+\infty$) using both the encryption key and the data-hiding key.

#### 4. Conclusions and Future Work

In this paper, an efficient framework for RDH-EI is presented. A specific modulo operation is utilized to encrypt the image, which preserves some correlation between neighboring pixels. With the preserved correlation, the data-hider can embed additional data into the encrypted image by using difference histogram modification. Since the embedding process is performed on encrypted data, our scheme preserves the confidentiality of the content. Data extraction is separable from image decryption; i.e., the additional data can be extracted either in the encrypted domain or in the decrypted domain. Experimental results show that the visual quality of the marked decrypted image is very high and that the achieved payload is sufficient to embed additional data. Moreover, real reversibility is achieved, which means that the secret data and the original image can be restored without any error. Future work will focus on determining the optimal modification of the histogram to achieve the best rate-distortion performance.

#### Data Availability

The [.xlsx] data and MATLAB source code used to support the findings of this study are available from the corresponding author upon request.

#### Conflicts of Interest

The authors declare that they have no conflicts of interest.

#### Acknowledgments

This work is supported by the National Natural Science Foundation of China (61771270), Zhejiang Provincial Natural Science Foundation of China (LY17F020013), and Ningbo Natural Science Foundation (2018A610054).