Abstract

An efficient method of completely separable reversible data hiding in encrypted images is proposed. The cover image is first partitioned into nonoverlapping blocks and specific encryption is applied to obtain the encrypted image. Then, image difference in the encrypted domain can be calculated based on the homomorphic property of the cryptosystem. The data hider, who does not know the original image content, may reversibly embed secret data into image difference based on two-dimensional difference histogram modification. Data extraction is completely separable from image decryption; that is, data extraction can be done either in the encrypted domain or in the decrypted domain, so that it can be applied to different application scenarios. In addition, data extraction and image recovery are free of any error. Experimental results demonstrate the feasibility and efficiency of the proposed scheme.

1. Introduction

With the rapid development of mobile internet and cloud storage, the privacy and security of personal data have gained significant attention. There is no guarantee that stored data will not be accessed by unauthorized entities, such as the cloud provider itself or malicious attackers. Under these circumstances, sensitive images, such as medical and personal images, need to be encrypted before outsourcing for privacy-preserving purposes [1, 2]. In other words, consumers would like to give the untrusted cloud server only an encrypted version of the data instead of the original content. The cloud service provider (who stores the data) is not authorized to access the original content (i.e., plaintext). However, in some application scenarios, the cloud servers or database managers need to embed additional messages, such as authentication or notation data, directly into the encrypted data for tamper detection or ownership declaration purposes. For example, a patient’s information can be embedded into his/her encrypted medical image to avoid unwanted exposure of confidential information.

To address this problem, researchers have been studying the possibility of hiding data directly in the encrypted domain. Over the past few years, a considerable number of schemes for data hiding in encrypted images or videos have been reported in the literature [3–10]. However, in these schemes, the host image/video is permanently distorted by data embedding. In general, the cloud service provider has no right to introduce permanent distortion. This implies that, for a legal receiver, the original plaintext content should be recovered without any error after image decryption and data extraction. To meet this requirement, reversible data hiding (RDH) in the encrypted domain is preferred.

RDH is a technique that slightly alters digital media (e.g., images or videos) to embed secret data while allowing the original media to be recovered without any error after the hidden messages have been extracted [11]. This technique has been found useful in important and sensitive areas, such as military communication, medical science, law enforcement, and error concealment [12, 13], where the original media must be reconstructed without any distortion. So far, three major approaches, namely, lossless compression [14], histogram modification [11, 15], and difference expansion [16], have been developed for RDH. For more details of these and other RDH methods, refer to the latest review of recent research [17]. Although RDH techniques have been studied extensively, these techniques are designed for plaintext rather than ciphertext.

RDH in the encrypted domain has emerged as a new and challenging research field. In recent years, some RDH methods for encrypted images have been proposed. In general, these methods can be divided into three categories, namely, methods by vacating room after encryption (VRAE) [18–24], methods by reserving room before encryption (RRBE) [25–28], and methods based on homomorphic encryption [29–34]. In the VRAE framework, the original signal is encrypted directly by the content owner, and the data hider embeds the additional bits by modifying some bits of the encrypted data. The advantage of this framework is that the operation of the end user is simple and efficient. However, as the entropy of an encrypted image has been maximized, the embedding capacity is limited. Moreover, the accuracy of data extraction and the quality of the restored image are not satisfactory. In the RRBE framework, the embedding room is created in the plaintext domain, that is, room is reserved before encryption. The advantages of this framework are mainly reflected in two aspects: the embedding capacity is relatively large and pure reversibility is achieved. But this framework might be impractical because it requires the content owner to perform extra preprocessing before content encryption [17]. In general, the content owner expects to send only an encrypted image to the manager without extra information. In addition to VRAE and RRBE, another type of method based on homomorphic encryption has recently been proposed. Using the additive homomorphic property of the Paillier cryptosystem, Chen et al. [29] first proposed a homomorphic encryption based RDH approach. Shiu et al. [31] improved Chen et al.’s method [29] by adopting the concept of difference expansion in homomorphic encryption. Moreover, RDH in the homomorphic encrypted domain has also been investigated in [32, 33]. However, the public-key cryptosystems used lead to data expansion after image encryption. In [30, 34], the additive homomorphic property of the modulo operation is utilized to realize RDH in the encrypted domain. The advantage is that encryption does not cause data expansion.

In this paper, we develop an effective and reliable framework for RDH in the encrypted domain. The proposed method belongs to the third category. Its main contribution is the combination of modular addition and two-dimensional (2D) histogram modification. Its advantages are mainly manifested in four aspects. First, room for data hiding does not need to be vacated before encryption, which is more practical than the methods in [25–28]. Second, complete separability and complete reversibility are achieved, which is more reliable than the methods in [18–21]. Third, the modular addition operation, which is additively homomorphic, is utilized for image encryption. It does not cause data expansion, unlike the public-key cryptosystems in [29, 31–33]. Finally, since data embedding in the encrypted domain is accomplished by pairwise coefficient modification, the embedding capacity is greatly improved compared with the methods in [30, 34]. The rest of the paper is organized as follows. In Section 2, we describe the proposed scheme, which includes image encryption, data embedding in the encrypted image, data extraction, and original image recovery. Experimental results and analysis are presented in Section 3. Finally, in Section 4, conclusions are drawn and future work is discussed.

2. Proposed Scheme

In this section, an RDH method for encrypted images is described. It is composed of three parts, namely, generation of the encrypted image, generation of the marked encrypted image, and data extraction with image recovery. First, the content owner encrypts the original image with the encryption key to produce an encrypted image. Then, the data hider, without knowing the actual content of the original image, can embed some additional data into the encrypted image. Here, the data hider can be a third party, for example, a database manager or a cloud provider, who is not authorized to access the original content of the signal (i.e., plaintext). At the receiving end, the content owner or an authorized third party can extract the hidden data from either the encrypted or the decrypted image. For illustrative purposes, the framework of the proposed scheme is given in Figure 1.

2.1. Image Encryption

Assume the original image $I$ is an 8-bit gray-scale image of size $M \times N$ with pixels $I(i,j) \in [0, 255]$, $1 \le i \le M$, $1 \le j \le N$. In a plaintext image, the correlation between two pixels gradually decreases as the distance between them increases. To make good use of the correlation among pixels for RDH, the cover image is divided into a number of nonoverlapping blocks of size $3 \times 3$, as shown in Figure 2. If both $M$ and $N$ are divisible by 3, the number of nonoverlapping blocks is $(M/3) \times (N/3)$. If $M$ or $N$ is not divisible by 3, the image is divided into $\lceil M/3 \rceil \times \lceil N/3 \rceil$ blocks, including $\lfloor M/3 \rfloor \times \lfloor N/3 \rfloor$ blocks of size $3 \times 3$. Here, $\lceil x \rceil$ denotes the smallest integer greater than or equal to $x$, and $\lfloor x \rfloor$ denotes the greatest integer less than or equal to $x$.
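The block counts described above can be checked with a short sketch. It assumes a block side of 3, as the divisibility condition suggests; the function name `block_counts` and the sample image sizes are illustrative and not taken from the paper.

```python
def block_counts(height, width, block=3):
    """Return (total_blocks, full_blocks) for a height x width image.

    total_blocks uses the ceiling of each dimension divided by the block
    side; full_blocks uses the floor, i.e. only complete blocks.
    """
    rows_ceil = -(-height // block)   # integer ceiling division
    cols_ceil = -(-width // block)
    total = rows_ceil * cols_ceil
    full = (height // block) * (width // block)
    return total, full


print(block_counts(512, 512))  # dimensions not divisible by 3
print(block_counts(510, 510))  # dimensions divisible by 3
```

When both dimensions are multiples of 3 the two counts coincide; otherwise the difference is the number of incomplete edge blocks.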

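To ensure that pixels in the same block are encrypted with the same random value, the encryption matrix $S$ is obtained using the following equation:

$$S(i,j) = R(\lceil i/3 \rceil, \lceil j/3 \rceil), \quad (1)$$

where $R$ is a pseudo-random matrix generated with the encryption key. After getting the encryption matrix $S$, image encryption is done as follows:

$$E(i,j) = (I(i,j) + S(i,j)) \bmod 256, \quad (2)$$

where $I(i,j)$ is the original pixel at position $(i,j)$ and $E$ represents the encrypted image. The corresponding decryption can be done in the following manner:

$$I(i,j) = (E(i,j) - S(i,j)) \bmod 256. \quad (3)$$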
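The block-wise modular encryption described above can be sketched as follows. This is a minimal illustration under stated assumptions: the block side is 3, the modulus is 256 for 8-bit pixels, and Python's `random.Random` stands in for the keyed pseudo-random generator (it is not cryptographically secure). All function names are hypothetical.

```python
import random

BLOCK = 3  # block side, assumed from the divisibility-by-3 condition


def block_key_matrix(height, width, key):
    """One pseudo-random value per block, repeated for each pixel of the block."""
    rng = random.Random(key)  # stand-in generator, NOT cryptographically secure
    rows = -(-height // BLOCK)
    cols = -(-width // BLOCK)
    per_block = [[rng.randrange(256) for _ in range(cols)] for _ in range(rows)]
    return [[per_block[i // BLOCK][j // BLOCK] for j in range(width)]
            for i in range(height)]


def encrypt(image, key):
    """Add the block key to every pixel, modulo 256."""
    h, w = len(image), len(image[0])
    k = block_key_matrix(h, w, key)
    return [[(image[i][j] + k[i][j]) % 256 for j in range(w)] for i in range(h)]


def decrypt(cipher, key):
    """Subtract the same block key, modulo 256."""
    h, w = len(cipher), len(cipher[0])
    k = block_key_matrix(h, w, key)
    return [[(cipher[i][j] - k[i][j]) % 256 for j in range(w)] for i in range(h)]


image = [[(7 * i + 3 * j) % 256 for j in range(7)] for i in range(5)]
cipher = encrypt(image, key=2024)
restored = decrypt(cipher, key=2024)
```

Because every value stays in the range 0 to 255, the encrypted image has the same size as the original, which matches the claim that modular addition causes no data expansion.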

2.2. Data Embedding in Encrypted Image

After receiving the encrypted image, the data hider can embed some additional information into it for the purpose of media notation or integrity authentication. In order to achieve reversibility, the idea of histogram shifting is introduced in ciphertext based on homomorphic encryption. The whole process consists of two parts, namely, difference histogram generation and difference histogram modification.

(1) Difference Histogram Generation. Before performing the data embedding operation, a two-dimensional difference histogram of the encrypted image needs to be generated. The detailed procedure can be described as follows.

Step 1. Divide the encrypted image into nonoverlapping blocks, which is the same as Figure 2. If the width or height of the image is not a multiple of 3, then the edge block will be ignored during the data embedding process.

Step 2. Calculate the difference between the basic pixel and the remaining pixels in each block. Here, the pixel located in the center coordinate is taken as the basic pixel for prediction. Then the difference can be calculated by using the following equation:where . Note that the values of and cannot be zero at the same time. Obviously, eight differences can be obtained in each block.
Although is the encrypted value, it is easy to prove the following equation:

Proof. One has

According to the above proof, the correlation between the neighboring pixels in the local area of the plaintext image is preserved; that is, the difference remains unchanged even after encryption. All other blocks can be processed in the same manner.
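The reasoning behind this invariance can be written out in a few lines. The notation here is an assumption made for illustration: $c$ and $p$ are the center pixel and one neighbor in the same block, $r$ is the single random value shared by that block, and the difference is taken as neighbor minus center modulo 256 (the opposite sign convention works the same way).

```latex
\begin{aligned}
c_e &= (c + r) \bmod 256, \qquad p_e = (p + r) \bmod 256,\\
(p_e - c_e) \bmod 256 &= \bigl((p + r) - (c + r)\bigr) \bmod 256 = (p - c) \bmod 256 .
\end{aligned}
```

The shared value $r$ cancels, so the modular difference computed on ciphertext equals the one computed on plaintext. This only holds because both pixels lie in the same block and therefore receive the same $r$.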

Step 3. Generate the difference histogram using differences in each block. There is a high degree of correlation between adjacent pixels in a local region of an image. That is, they have similar gray values, or even the same gray value. Thus, the resulting difference histogram has a higher peak than the histogram of the original image. To demonstrate the distribution of the image difference, the histograms of some residual images are shown in Figure 3. It is clearly seen that the distribution is approximately symmetrical. The methods in [30, 34] mainly focus on exploiting one-dimensional (1D) coefficient histogram for RDH. The 1D coefficient histogram is usually defined aswhere # denotes the cardinal number of a set, is an integer, and represents the block number. By considering every two differences together, the associated two-dimensional (2D) histogram can be defined aswhere denotes the th difference in the th block. More specifically, , , , and . The distribution of the two-dimensional histogram is presented in Figure 4.
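As a rough illustration of Step 3, the sketch below builds a 1D and a 2D difference histogram from a list of 3×3 blocks. Several details are assumptions, not taken from the paper: differences are computed as (neighbor − center) mod 256, neighbors are visited in row-major order, and consecutive differences are paired as (first, second), (third, fourth), and so on.

```python
from collections import Counter


def block_differences(block):
    """Eight mod-256 differences between the neighbors and the center of a 3x3 block."""
    center = block[1][1]
    return [(block[r][c] - center) % 256
            for r in range(3) for c in range(3) if (r, c) != (1, 1)]


def histogram_1d(blocks):
    """Count how often each single difference value occurs."""
    return Counter(d for b in blocks for d in block_differences(b))


def histogram_2d(blocks):
    """Count how often each pair of differences occurs."""
    counts = Counter()
    for b in blocks:
        d = block_differences(b)
        # Pairing consecutive differences is an illustrative assumption.
        counts.update(zip(d[0::2], d[1::2]))
    return counts


blocks = [
    [[10, 10, 11], [10, 10, 9], [10, 10, 10]],
    [[200, 201, 200], [200, 200, 200], [199, 200, 200]],
]
h1 = histogram_1d(blocks)
h2 = histogram_2d(blocks)
```

In these smooth sample blocks almost all differences are 0, with a few at 1 and 255 (that is, −1 modulo 256), which mirrors the concentration near 0 and 255 that the text describes.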

(2) Difference Histogram Modification. Once the difference histogram is generated, reversible data hiding can be accomplished by histogram shifting. In [30], the conventional 1D histogram shifting technique is adopted. If the highest bin is located on the left side of the difference histogram, for example, at 0, the graphical representation of data embedding is shown in Figure 5(a). Otherwise, if the highest bin is located on the right side of the difference histogram, for example, at 255, its graphical representation is shown in Figure 5(b). The conventional 1D RDH [30] can also be implemented in an equivalent way by modifying the 2D histogram [35].

For example, the histogram modification in Figure 5(a) is in fact equivalent to the one shown in Figure 6. To further illustrate this case, some examples are provided below.

(i) For the coefficient pair $(d_1, d_2) = (0, 0)$, in the method of 1D RDH shown in Figure 5(a), $d_1$ is expanded to 0 or 1 for embedding a data bit $b_1$, and $d_2$ is expanded similarly for a data bit $b_2$. Consequently, in the method of 2D RDH shown in Figure 6, the coefficient pair will be expanded to $(0, 0)$, $(0, 1)$, $(1, 0)$, and $(1, 1)$ when the to-be-embedded bits $(b_1, b_2)$ are $(0, 0)$, $(0, 1)$, $(1, 0)$, and $(1, 1)$, respectively.

(ii) For $(d_1, d_2) = (0, 1)$, in the method of 1D RDH, $d_1$ is expanded to 0 or 1 for embedding a data bit $b$, and $d_2$ is shifted to 2. Correspondingly, in the method of 2D RDH, the pair is expanded to $(0, 2)$ if $b = 0$, and $(1, 2)$ if $b = 1$.

(iii) For $(d_1, d_2) = (2, 1)$, in the method of 1D RDH, $d_1$ and $d_2$ are shifted to 3 and 2, respectively. Accordingly, in the method of 2D RDH, the pair is shifted to $(3, 2)$.
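The three cases above can be reproduced with a small sketch of conventional 1D histogram shifting applied to each coordinate of a pair. It assumes the peak bin is at 0, handles only the left side of the histogram, and ignores overflow; the function names are illustrative.

```python
def shift_1d(d, bits):
    """Conventional 1D histogram shifting with the peak at 0.

    A difference of 0 carries one bit (taken from the front of `bits`);
    any larger difference is shifted right by one to make room.
    """
    if d == 0:
        return d + bits.pop(0)
    return d + 1


def embed_pair(pair, bits):
    """Apply the 1D rule to both coordinates, which acts as a 2D mapping."""
    bits = list(bits)  # work on a copy so the caller's list is untouched
    return shift_1d(pair[0], bits), shift_1d(pair[1], bits)


print(embed_pair((0, 0), [1, 0]))  # (1, 0): both coordinates carry a bit
print(embed_pair((0, 1), [1]))     # (1, 2): first carries a bit, second is shifted
print(embed_pair((2, 1), []))      # (3, 2): both coordinates are only shifted
```

Viewing the same rule on pairs makes it visible that a pair at the peak can only move in four directions under the 1D scheme, which is the limitation a dedicated 2D mapping can relax.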

In particular, various histogram modification strategies can be designed based on 2D histogram. A reasonable histogram modification strategy directly contributes to the superior performance. The purpose of our design is to provide high embedding capacity while maintaining good visual quality. According to the statistical distribution of the difference histogram in Figure 4, we find that the probability of occurrence is larger when the difference is closer to 0 or 255. Based on this, a novel RDH technology is presented as shown in Figure 7.

Suppose the message to be embedded is a binary sequence denoted as . In order to enhance the security, a stream cipher is used to encrypt the message according to the data-hiding key . Thus, the to-be-embedded binary information, that is, , is an encrypted version of . It is difficult for anyone who does not retain the data hiding key to recover the message. The 2D histogram modification in the encrypted domain can be described as follows. According to the symmetry in Figure 4, only the modification in the lower-left quadrant is described for simplicity.
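The message encryption step can be sketched as an XOR with a keyed bit stream. This is only an illustration of the idea: the paper does not specify the stream cipher, and `random.Random` is used here as a stand-in keystream generator that is not cryptographically secure. The names `keystream` and `xor_bits` are hypothetical.

```python
import random


def keystream(key, length):
    """Generate `length` pseudo-random bits from an integer key."""
    rng = random.Random(key)  # illustrative only; not a secure stream cipher
    return [rng.getrandbits(1) for _ in range(length)]


def xor_bits(bits, key):
    """XOR with a keyed bit stream; applying it twice restores the input."""
    return [b ^ k for b, k in zip(bits, keystream(key, len(bits)))]


message = [1, 0, 1, 1, 0, 0, 1, 0]
hidden = xor_bits(message, key=12345)      # bits that would actually be embedded
recovered = xor_bits(hidden, key=12345)    # what a holder of the key gets back
```

Since encryption and decryption are the same XOR operation, a receiver who extracts the embedded bits needs only the data-hiding key to recover the message.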

If , it has eight candidate directions for modification. In this case, three bits can be embedded. Specifically, the marked coefficient pair is determined as follows:

If , 1 bit can be embedded in each coefficient pair. Then, the marked coefficient pair is determined as followswhere . Although the middle part of the difference histogram is usually empty, ambiguities arise when the bins from two sides overlapped in the middle after expansion. To avoid it, the differences of 126 and 127 will not be expanded. However, ambiguities still arise when difference is changed from 125 to 127 or from 124 to 126 during the embedding process. The overlapping problem can be resolved by using a location map. It is a binary array with its every element corresponding to 126 and 127, 0 for genuine, and 1 for pseudo. The location map and the secret information will be embedded together in the encrypted domain.

If , the marked coefficient pair is determined as follows:where .

If and , the coefficient pair is shifted to as follows:

According to the characteristic of modulus function, the following equation can be established.
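A property of this kind can be checked numerically. The sketch below assumes differences are taken as (neighbor − center) mod 256 and shows that changing a neighbor pixel by a small amount, with wrap-around, changes the modular difference by the same amount. It illustrates the general behavior of modular arithmetic and is not a transcription of the paper's equation.

```python
def diff(pixel, center):
    """Modular difference between a neighbor pixel and the block center."""
    return (pixel - center) % 256


def bump(pixel, delta):
    """Change a pixel by delta with wrap-around, as modular addition would."""
    return (pixel + delta) % 256


center = 250
for pixel in (0, 3, 255):
    for delta in (-2, -1, 1, 2):
        # Modifying the pixel is equivalent to modifying the difference.
        assert diff(bump(pixel, delta), center) == (diff(pixel, center) + delta) % 256
```

The check passes even when the pixel wraps from 255 to 0, which is why a difference can be modified by editing the encrypted pixel directly, without knowing the plaintext.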

According to (13), the operation of can be accomplished by replacing with . Thus, in (10)~(12), the modification of the difference is equivalent to the modification of the pixel value. Then the marked and encrypted image of the proposed scheme is obtained. The embedding capacity in the lower-left quadrant denoted as can be computed by

2.3. Data Extraction and Original Image Recovery

In this scheme, data extraction and image decryption are completely separable. In other words, the hidden data can be extracted either in encrypted or in decrypted domain. Furthermore, our method is also reversible, where the hidden data could be removed to obtain the original image. We will first discuss the extraction in the encrypted domain followed by the decrypted domain.

(1) Scheme I: Data Extraction in the Encrypted Domain. In order to protect the users’ privacy, the database manager (e.g., a cloud server) does not have sufficient permissions to access the original image content due to the absence of the encryption key. But the manager sometimes needs to note and mark the personal information in the corresponding encrypted images as well as verify their integrity. In this case, both data embedding and extraction should be performed in the encrypted domain. In the encrypted domain, the hidden data extraction can be accomplished by the following steps. According to the symmetry in Figure 4, only the extraction in the lower-left quadrant is described for simplicity.

Step 1. Divide the encrypted image into nonoverlapping blocks, which is the same as Figure 2. The center pixel in each block is selected as the basic pixel for prediction.

Step 2. Calculate the difference between the basic pixel and the remaining pixels in each block by using the following equation:

Step 3. The associated 2D histogram can be generated in the same way as in Section 2.2.

Step 4. According to the previous embedding rules, the hidden data can be extracted aswhere denotes the extracted message bits. Since the whole process is entirely operated in encrypted domain, it effectively avoids the leakage of original content.

Step 5. With the data-hiding key, , the extracted hidden bits could be further decrypted to obtain the original message. In addition, the image difference value can be further restored as follows:It should be noted that the boundary difference can be restored according to the location map. Similarly, according to (13), the operation of can be accomplished by replacing with . Thus the encrypted image without the hidden data, that is, , is obtained.

Step 6. With the encryption key, , the original cover image can be accurately restored by performing the decryption operation as in (3).

(2) Scheme II: Data Extraction in the Decrypted Domain. In Scheme I, both data embedding and extraction are performed in the encrypted domain. However, in some cases, users want to decrypt the image first and then extract the hidden data from the decrypted image when needed. For example, with the encryption key, an authorized user may want to obtain the decrypted image containing the hidden data, which can be used to trace the source of the data. In this case, data extraction after image decryption is suitable. The whole process of decryption and data extraction comprises the following steps.

Step 1. Image decryption can be accomplished according to the following equation:No visible distortions can be observed in the marked and decrypted images, as will be shown in later experimental results.

Step 2. Divide the marked and decrypted image into nonoverlapping blocks, which is the same as Figure 2. The center pixel in each block is selected as the basic pixel for prediction.

Step 3. Calculate the difference between the basic pixel and the remaining pixels to form the prediction errorAccording to (5), the following equation is established:

Step 4. The hidden data can be extracted in a manner similar to (17). That is, it is only necessary to replace in (17) with .

Step 5. The image difference can also be restored in the same manner as in (18). The only thing that needs to be adjusted is to replace and with and , respectively. Similarly, the operation of can be accomplished by replacing with . Therefore, the original image, that is, , is successfully restored.
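The reason the two schemes give the same extracted data can be illustrated with one block. The sketch assumes a single random value per 3×3 block, modular addition for encryption, and differences taken as (neighbor − center) mod 256; the "embedding" here is just an arbitrary small change to two neighbor pixels, standing in for the actual modification rules.

```python
import random


def differences(block):
    """Eight mod-256 differences between the neighbors and the center of a 3x3 block."""
    center = block[1][1]
    return [(block[r][c] - center) % 256
            for r in range(3) for c in range(3) if (r, c) != (1, 1)]


rng = random.Random(7)
plain = [[rng.randrange(256) for _ in range(3)] for _ in range(3)]
key = rng.randrange(256)  # one random value shared by the whole block
cipher = [[(p + key) % 256 for p in row] for row in plain]

# Stand-in for embedding: nudge some non-center pixels of the encrypted block.
marked = [row[:] for row in cipher]
marked[0][0] = (marked[0][0] + 1) % 256
marked[2][1] = (marked[2][1] - 2) % 256

# Decrypt the marked block without removing the embedded changes.
marked_decrypted = [[(c - key) % 256 for c in row] for row in marked]

same = differences(marked) == differences(marked_decrypted)
```

Because the differences agree before and after decryption, any extraction rule that depends only on the differences yields the same bits in either domain.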

3. Experimental Results and Analysis

Eight well-known standard gray images, that is, Aerial, Barbara, Lena, Lighthouse, Tank, Truck, Zelda, and boats [36], are considered for experimental purposes. The size of first 7 images is , and the size of “Boats” is . The secret data is a binary sequence created by pseudo-random number generator.

3.1. Scrambling Effect and Security Analysis

For an image encryption scheme, the security depends on cryptographic security and perceptual security. Cryptographic security denotes the security against cryptographic attacks, which relies on the underlying cipher. In the proposed scheme, a pseudo-random sequence is used to encrypt the image. Figure 8 illustrates the histogram of the original image. After encryption, the corresponding histogram is shown in Figure 9. By comparing Figures 8 and 9, it can be observed that the distribution after encryption appears to be uniform, which suggests that statistical analysis of the histogram would not be effective for inferring the original content.

Perceptual security refers to the encrypted image being unintelligible. The original images are given in Figure 10, and their corresponding encrypted results are shown in Figure 11. As can be observed, the marked and encrypted image is a noise-like image. The visual information of the original image is destroyed, which means that the data hider would have extreme difficulty obtaining any useful information from it. In addition, for the standard gray images Aerial, Barbara, Lena, Lighthouse, Tank, Truck, Zelda, and Boats, the PSNR (peak signal-to-noise ratio) values are 8.17 dB, 7.87 dB, 9.53 dB, 8.82 dB, 10.17 dB, 9.95 dB, 8.90 dB, and 9.11 dB, respectively. Such low PSNR values indicate that the encryption scrambles the image content effectively.
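For reference, PSNR for 8-bit images is conventionally computed from the mean squared error as $10 \log_{10}(255^2 / \mathrm{MSE})$. A minimal sketch of that standard definition follows; the function name and the tiny sample images are illustrative, and the paper's own measurement code is not shown in the text.

```python
import math


def psnr(original, other, peak=255):
    """PSNR in dB between two equally sized grayscale images (lists of rows)."""
    n = sum(len(row) for row in original)
    mse = sum((a - b) ** 2
              for row_a, row_b in zip(original, other)
              for a, b in zip(row_a, row_b)) / n
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / mse)


a = [[100, 100], [100, 100]]
b = [[101, 99], [100, 100]]
value = psnr(a, b)
```

Values below about 10 dB, like those reported for the encrypted images, correspond to a mean squared error in the thousands, that is, pixel values that are essentially unrelated to the original.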

3.2. Visual Quality of Marked and Decrypted Image

Since the embedding scheme is reversible, the original cover content can be perfectly recovered after extracting the hidden data. In some scenarios, the encrypted image containing the hidden data provided by the server needs to be decrypted by the authorized user. Therefore, the visual quality of the decrypted image containing the hidden data is also expected to be equivalent or very close to that of the original image. In other words, the degradation of the image quality should be kept within an acceptable range, even if the hidden data has not been removed. In the proposed method, since the maximum change in pixel value is 2, the artifacts introduced will not be perceptible. To verify this, a series of tests have been conducted. The original images and their corresponding decrypted versions containing the hidden data are shown in Figures 10 and 12, respectively. From our subjective examination, it is concluded that the marked content cannot be visually distinguished from nonmarked content. In addition to subjective observation, PSNR values are also given in Figure 8. Except for Zelda, the PSNR values of the images are all above 47 dB. Generally, it is almost impossible to detect the degradation in image quality caused by data hiding.
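The stated maximum pixel change of 2 implies a worst-case bound on PSNR for one embedding layer. As a rough check using the standard PSNR definition for 8-bit images, and assuming the extreme case in which every pixel changes by the full amount:

```latex
\mathrm{MSE} \le 2^2 = 4
\quad\Longrightarrow\quad
\mathrm{PSNR} = 10 \log_{10}\frac{255^2}{\mathrm{MSE}}
\ge 10 \log_{10}\frac{65025}{4} \approx 42.1\ \mathrm{dB}
```

Measured values sit above this floor because center pixels are never modified and most other pixels change by less than 2.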

3.3. Embedding Capacity

According to the embedding process described in Section 2.2, the embedded capacity can be calculated as follows:

For the standard gray images Aerial, Barbara, Lena, Lighthouse, Tank, Truck, Zelda, and Boats, the maximal embedding capacities of the one-layer embedding strategy are 0.1432 bpp (bits per pixel), 0.1417 bpp, 0.1942 bpp, 0.1164 bpp, 0.1160 bpp, 0.1428 bpp, 0.1570 bpp, and 0.2136 bpp, respectively. It can be observed that the embedding capacity of the proposed scheme depends strongly on the characteristics of the original cover image. As expected, images with high spatial activity (e.g., Lighthouse, Tank) achieve a low embedding rate, whereas images with lower spatial activity (e.g., Lena, Boats) achieve a higher embedding rate. The main reason is that most adjacent pixels have similar values in a smooth region. Therefore, they contribute a higher number of differences associated with the peak point than those in a complex region.

In our experiments, the size of the encrypted block is set to $3 \times 3$. In general, as the block size increases, the embedding capacity increases whereas the security of the encryption algorithm decreases. According to our analysis in Section 2.2, within any block the differences between pixel pairs remain unchanged after encryption. With a larger block size, more correlation between neighboring pixels is preserved, and thus the embedding capacity increases. On the other hand, the difference between any pixel pair within a block of the plain image can be recovered in the encrypted domain, so larger blocks reveal more about the plaintext. In addition, higher capacities can also be achieved by applying a multiple-layer embedding strategy, at the cost of lower perceptual quality.

3.4. Comparison and Discussion

As mentioned in Section 1, the methods in [18–21] may introduce errors in data extraction and/or image recovery, while complete reversibility is achieved in the proposed method. More importantly, these methods are designed to carry only small payloads. Taking Zhang’s method [18] as an example, the embedding rate is 0.0156 bpp with block size $8 \times 8$ (one bit per 64-pixel block). If an error correction mechanism is introduced, the actual embedding rate is further decreased. Our method achieves a significantly higher embedding rate. For the methods in [25–28], completely error-free data extraction and image recovery can be obtained, but the content owner is required to perform extra preprocessing before content encryption, which might be impractical. The proposed method overcomes both problems.

Furthermore, Figure 13 shows the comparison of the embedding capacity between the proposed method and the methods in [30, 34]. Here, the maximum embedding capacity in one-layer embedding strategy is provided. As can be seen, in one-layer embedding strategy, the embedding capacity has been greatly improved. In fact, it can also be seen from Figures 6 and 7 that the embedding capacity of the proposed method can certainly be larger in one-layer embedding strategy. For example, when the coefficient pair is (0, 0), two bits can be embedded in the method of [30]. However, three bits can be embedded in the proposed method. The direct benefit is that a larger capacity can be achieved by one-layer histogram shifting. If two-layer embedding is used, the visual quality reduction is relatively large. Taking Lena and Boats as an example, the performance comparison of different embedding rates is given in Figure 14. Obviously, the proposed method can provide better performance when the embedding capacity exceeds the maximum capacity of one-layer embedding strategy.

4. Conclusions and Future Work

In this paper, an algorithm to reversibly embed secret data in encrypted images is presented. A specific modulo operation is utilized to encrypt the image, which can preserve some correlation between the neighboring pixels. With the preserved correlation, the data hider can embed the secret data into the encrypted image by using 2D histogram modification, even though he does not know the original image content. Since the embedding process is done on encrypted data, our scheme preserves the confidentiality of content. Data extraction is separable from image decryption; that is, the additional data can be extracted either in the encrypted domain or in the decrypted domain. Furthermore, this algorithm can achieve real reversibility and high quality of marked and decrypted images. One of the possible applications of this method is image annotation in cloud computing where high image quality and reversibility are greatly desired.

Although RDH technology and cryptography have each been studied extensively, RDH in the encrypted domain is a highly interdisciplinary area in which research has only just begun, and many questions remain open. In the future, considerable effort is needed to determine the optimal modification of the histogram for achieving the best rate-distortion performance. Moreover, future work also aims at designing more efficient schemes for RDH in encrypted videos [37].

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (61771270, 61672302), Zhejiang Provincial Natural Science Foundation of China (LY17F020013, LZ15F020002), and Public Welfare Technology Application Research project of Zhejiang Province (2015C33237).