Abstract

With the advent of the information age, digital multimedia data are mainly transmitted over the Internet, and images are among the most popular kinds of such data. This paper therefore proposes a block-based key embedding method and, built on it, a color image encryption scheme based on deep learning. A neural network model is used to predict from the initial chaotic sequence, and the key data generated by prediction are embedded into the color image by layer and by block. The paper also derives the time complexity of the proposed scheme, which is simple and effective. Simulation results show that the scheme exhibits good statistical properties, with mean information entropy close to eight and correlation coefficients close to zero; good robustness to common image attacks such as noise addition and cropping; and good encryption performance, including an enormous key space and low PSNR.

1. Introduction

With the birth and development of fifth-generation mobile communication technology, people can obtain all kinds of data they need through the Internet anytime and anywhere and realize resource sharing and information interaction. Information security has therefore consistently attracted much attention. Image encryption is the process of hiding images with keys to prevent unauthorized access. Considering the characteristics of image data, many image encryption algorithms have been proposed based on different technologies, including vector quantization, fractional wavelet transformation, and chaos [1].

In recent years, deep learning models have been widely used in the field of information security, and they show clear advantages in image processing in particular (see Figure 1). For instance, Li et al. used convolutional neural networks (CNNs) to optically encrypt iris images [2]. Chen et al. proposed a method to improve the robustness of 2D/3D optical image encryption by using an extended deep CNN [3]. Ni et al. applied a compressed sensing (CS) reconstruction algorithm based on deep learning to image encryption [4]. Zhang and Li encrypted images using optics combined with deep learning models [5].

From the work summarized above (see Table 1), it is easy to see that most current encryption schemes based on deep learning still stay at the stage of encrypting gray images. Building on the research of Zhao et al. [6], this paper proposes a method of embedding keys in blocks and a color image encryption scheme that relies on a deep learning model, in order to extend research in this direction. The deep learning model is mainly used to predict the initial sequences generated by chaotic systems, and the predicted data are embedded into the upper left, lower right, and middle regions of the three components of the color image. The final encrypted image is obtained by using scrambling and diffusion algorithms.

The main contributions of this paper are as follows:
(1) This is a recent attempt to combine deep learning models with color image encryption.
(2) The implementation is simple and effective and improves the practical efficiency of encryption algorithms.
(3) It has excellent encryption performance, good statistical characteristics, and strong robustness to common image attacks.

2. Preparatory Knowledge

2.1. Chen’s Chaotic System

A chaotic system exhibits seemingly erratic, irregular motion that is nonrepeating, unpredictable, and uncertain, even though it is generated by deterministic rules. Common three-dimensional chaotic systems include the Lorenz and Chen systems.

In 1999, Chen and Ueta of the University of Houston put forward the Chen chaotic system [7]. Its equations are given in formula (1).

For suitable parameter values, Chen’s system is in a chaotic state. It is similar to the Lorenz system but has a more complex topological structure. Chen’s system is therefore widely used in the encryption field, and this paper uses it to generate the initial key sequence.
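As a minimal sketch of how such a key sequence might be produced, the following Python code integrates Chen's system with a fourth-order Runge-Kutta step. The parameter values a = 35, b = 3, c = 28 are the classic chaotic setting from the literature and are an assumption here, as are the step size, the burn-in length, and the hypothetical `to_bytes` quantization; none of them is taken from this paper.

```python
def chen_rhs(state, a, b, c):
    """Right-hand side of Chen's system:
    dx/dt = a(y - x), dy/dt = (c - a)x - xz + cy, dz/dt = xy - bz."""
    x, y, z = state
    return (a * (y - x), (c - a) * x - x * z + c * y, x * y - b * z)


def rk4_step(state, h, a, b, c):
    """Advance the state by one fourth-order Runge-Kutta step of size h."""
    def shifted(k, scale):
        return tuple(s + scale * ki for s, ki in zip(state, k))

    k1 = chen_rhs(state, a, b, c)
    k2 = chen_rhs(shifted(k1, h / 2), a, b, c)
    k3 = chen_rhs(shifted(k2, h / 2), a, b, c)
    k4 = chen_rhs(shifted(k3, h), a, b, c)
    return tuple(
        s + h / 6 * (p + 2 * q + 2 * r + w)
        for s, p, q, r, w in zip(state, k1, k2, k3, k4)
    )


def chen_sequence(x0, y0, z0, n, h=0.001, burn_in=1000,
                  a=35.0, b=3.0, c=28.0):
    """Return n consecutive (x, y, z) samples after discarding a transient.

    The defaults (a, b, c, h, burn_in) are assumed values for illustration.
    """
    state = (x0, y0, z0)
    for _ in range(burn_in):
        state = rk4_step(state, h, a, b, c)
    samples = []
    for _ in range(n):
        state = rk4_step(state, h, a, b, c)
        samples.append(state)
    return samples


def to_bytes(values):
    """Hypothetical quantization of real samples to integers in 0..255."""
    return [int(abs(v) * 1e6) % 256 for v in values]
```

A call such as `to_bytes([s[0] for s in chen_sequence(1.0, 1.0, 1.0, 49)])` would give 49 byte-valued key elements from the x component; the initial values act as the secret seed.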

2.2. BiLSTM Neural Network

The encryption scheme proposed in this paper can use an arbitrary recurrent neural network to predict the initial sequence generated by a chaotic system. Here, we choose bidirectional long short-term memory (BiLSTM). BiLSTM is a deep learning model built on long short-term memory (LSTM) that learns more from the data by traversing the input sequence twice, once in each direction [8]. It is composed of a forward LSTM and a backward LSTM. Each time step contains an LSTM unit that selectively memorizes, forgets, and outputs information. The LSTM cell is described by formula (2).

BiLSTM concatenates the outputs of the forward LSTM and the backward LSTM at each time step (t − 1, t, t + 1, and so on). Because it can use both past and subsequent time steps when forecasting, it can provide a better prediction effect than LSTM.
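For reference, a commonly used formulation of the LSTM cell and of the BiLSTM output is sketched below. This is the standard textbook form and may differ in notation from the paper's own formula; the weight matrices W, biases b, gate names, and the superscript Bi are assumptions introduced here.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f\,[h_{t-1}, x_t] + b_f\right) \\
i_t &= \sigma\left(W_i\,[h_{t-1}, x_t] + b_i\right) \\
\tilde{c}_t &= \tanh\left(W_c\,[h_{t-1}, x_t] + b_c\right) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
o_t &= \sigma\left(W_o\,[h_{t-1}, x_t] + b_o\right) \\
h_t &= o_t \odot \tanh(c_t) \\
h_t^{\mathrm{Bi}} &= \left[\overrightarrow{h}_t \,;\, \overleftarrow{h}_t\right]
\end{aligned}
```

Here f, i, and o are the forget, input, and output gates, c is the cell state, h is the hidden state, and the last line concatenates the hidden states of the forward and backward passes at time t.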

2.3. Arnold Transformation

Arnold transformation, also known as cat mapping, was first introduced by Russian mathematician Vladimir Arnold in the study of ergodic theory. This transformation involves recutting and splicing the matrix of digital images [9].

The two-dimensional Arnold transformation of a digital image with M = N (equal length and width) is defined in formula (3).

In formula (3), the input coordinate pair is the position of a pixel in the digital image before the transformation, and the output coordinate pair is its position afterward. The remaining quantities are the two parameters of the map, the number of transformations applied so far, the length (or width) of the image, and the modulo operation.
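The scrambling step can be illustrated with a minimal Python sketch. It assumes the widely used generalized form of the map, (x, y) → (x + a·y, b·x + (a·b + 1)·y) mod N, which may differ in detail from formula (3); the function names and the nested-list image representation are hypothetical.

```python
def _target(x, y, a, b, n):
    """Position to which pixel (x, y) moves under the assumed Arnold map."""
    return (x + a * y) % n, (b * x + (a * b + 1) * y) % n


def arnold(img, a, b, rounds=1):
    """Scramble a square image (list of equal-length rows) `rounds` times."""
    n = len(img)
    for _ in range(rounds):
        out = [[0] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                nx, ny = _target(x, y, a, b, n)
                out[nx][ny] = img[x][y]
        img = out
    return img


def inverse_arnold(img, a, b, rounds=1):
    """Undo `arnold` by reading each pixel back from its target position."""
    n = len(img)
    for _ in range(rounds):
        out = [[0] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                nx, ny = _target(x, y, a, b, n)
                out[x][y] = img[nx][ny]
        img = out
    return img
```

Because the map matrix has determinant 1, it permutes pixel positions without collisions, so the inverse can simply read every pixel back from where the forward map placed it.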

2.4. The Proposed Block Embedding Algorithm
2.4.1. Theoretical Derivation

A good image encryption algorithm should have statistical characteristics such as low correlation and high entropy. Hosny et al. obtained encrypted images with negative correlation and high entropy by dividing the three color subimages of a color image into blocks, scrambling them, and mapping them with one-dimensional chaotic logistic maps [10]. Wang et al. employed an improved zigzag method to scramble the subimages after the color image was divided into blocks, and the combination with a chaotic system made the encrypted image more difficult to crack [11]. Younes and Aman demonstrated experimentally that dividing the original image into an arbitrary number of blocks can effectively reduce the correlation between the pixels of the encrypted image [12].

Summarizing the above research results, we can infer that, given the three color channels of a color image, splitting the image by channel and applying chaotic scrambling to the resulting subimages can in theory effectively reduce the correlation of adjacent pixels and weaken the structure of the initial image, thereby improving the security of the encryption scheme. We therefore designed a key embedding method that embeds the generated key matrix, block by block, into different regions of the different color channels. The embedding idea is shown in Figure 2.

To simplify the explanation, Figure 2 uses a 7 × 7 matrix as an example. After the color image with equal length and width is split into R, G, and B color matrices, the deep learning model predicts key sequences of the required length. The key sequences are combined with the M × N matrices R, G, and B by XOR, element by element, starting from the upper left corner, the lower right corner, and the middle position, respectively, and following the directions marked in the figure. Finally, the matrix data with the embedded key are obtained (Algorithm 1).

2.4.2. Pseudocode for the Proposed Block Embedding Algorithm
Input: R, G, B channels of image F and keys
Output: matrix e_Im.
Step 1: the R channel is embedded using its key sequence, starting from the upper left element.
Step 2: the G channel is embedded using its key sequence, starting from the lower right element.
Step 3: the B channel is embedded using its key sequence, starting from the middle element.
Step 4: the matrix e_Im is obtained by concatenating the three color channels.
Step 5: encapsulate Steps 1 to 4 as an embedding method.
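Algorithm 1 can be sketched in Python as follows. The exact traversal directions are defined by the figure and are not reproduced here, so this sketch assumes a row-major walk that starts at the upper left element for R, a reversed row-major walk that starts at the lower right element for G, and a row-major walk that starts at the middle element and wraps around for B. The function names and the one-key-element-per-pixel layout are also assumptions.

```python
def embed_channel(channel, key, start, step):
    """XOR key[k] into the pixel visited at the k-th step of the walk.

    `start` is a flat (row-major) index and `step` is +1 or -1; the walk
    wraps around so that every pixel is visited exactly once.
    """
    n_rows, n_cols = len(channel), len(channel[0])
    total = n_rows * n_cols
    out = [row[:] for row in channel]
    for k in range(total):
        pos = (start + step * k) % total
        r, c = divmod(pos, n_cols)
        out[r][c] = channel[r][c] ^ key[k]
    return out


def block_embed(R, G, B, key_r, key_g, key_b):
    """Embed one key sequence per channel, each from a different start."""
    total = len(R) * len(R[0])
    return (
        embed_channel(R, key_r, 0, 1),            # from the upper left
        embed_channel(G, key_g, total - 1, -1),   # from the lower right
        embed_channel(B, key_b, total // 2, 1),   # from the middle
    )
```

Since XOR is its own inverse, calling `block_embed` again on the embedded channels with the same keys restores the original channels, which is what the decryption side relies on in this sketch.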
2.5. The Proposed Encryption Scheme
2.5.1. Theory and Steps

Deep learning refers to artificial neural networks (ANNs) with complex multilayers [13]. LSTM is a specific recurrent neural network (RNN) that solves the long-term dependency problem on RNN [14] so that it can achieve good predictive performance at large time intervals. There have been studies to apply LSTM to the prediction of chaotic time series. Sangiorgio and Dercole have experimentally checked that LSTM networks are superior to feed-forward competitors in predicting chaotic time series and have good robustness [15]. BiLSTM is formed by a combination of LSTMs in the forward and backward directions, which can be seen as a two-layer neural network.

From the above research results, it can be seen that generating robust chaotic sequences with BiLSTM is theoretically effective. In an actual application scenario, even if the attacker cracks the chaotic system that generates the initial key, the attacker cannot obtain the final encryption key by reverse inference through the deep learning model. This improves the security of the encryption scheme to a certain extent and helps it resist some common image attacks.

In the process of using deep learning for color image encryption, we found a tension between keeping the scheme easy to use in practical settings and keeping it secure. To this end, we introduce the block embedding method to reduce complex calculations while preserving the safety and robustness of the algorithm.

The encryption scheme proposed in this paper splits the color image into R, G, and B color matrices, predicts three new chaotic key sequences with BiLSTM, embeds the key sequences into the three color matrices in blocks, and generates the final encrypted image by scrambling and diffusion. The main encryption process is shown in Figure 3, and the specific encryption steps are as follows (Algorithm 2):
(1) Let the original image be F, with size M × N. Given the initial values of the Chen chaotic system, the initial chaotic sequences are obtained.
(2) The initial chaotic sequences of length s are taken as the dataset of BiLSTM, and the parameters of the deep learning model are set to generate the predicted sequences.
(3) The image is divided into three color matrices R, G, and B. The predicted sequences are embedded into the three color matrices by using the block embedding algorithm (see Section 2.4), which yields the embedded matrix.
(4) Given two pseudo-random integers within the allowed range, the embedded matrix is transformed by the Arnold transformation to obtain the color matrices after the first scrambling.
(5) After the initial chaotic sequences and the predicted sequences are spliced, new sequences are obtained according to formula (4). These sequences are used for forward and backward diffusion of the three color matrices according to formula (5), which yields the diffused matrices.
(6) Arnold transformations are carried out on the diffused matrices, and the final encrypted image E is obtained by splicing the three transformed matrices.

2.5.2. Pseudocode for the Proposed Image Encryption Algorithm
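Input: original image F
Output: encrypted image E and keys
Step 1: generate the initial chaotic sequences from the Chen chaotic system.
Step 2: generate the predicted sequences with BiLSTM from the initial chaotic sequences.
Step 3: get the embedded matrix using the custom embedding method.
Step 4: scramble the embedded matrix with the Arnold transformation.
Step 5: obtain the diffused matrices by diffusing the scrambled matrices in the forward and backward directions with the spliced key sequences.
Step 6: apply the Arnold transformation to the diffused matrices and splice them into the encrypted image E.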
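The forward and backward diffusion of step (5) can be sketched as follows. The exact diffusion rule of formula (5) is not reproduced here, so this example assumes a simple XOR chain in each direction; the rule, the `seed` value, and the function names are illustrative assumptions rather than the scheme's actual definition.

```python
def diffuse(pixels, key_fwd, key_bwd, seed=0):
    """Forward then backward XOR-chain diffusion of a flat pixel list."""
    n = len(pixels)
    # Forward pass: each output depends on the previous forward output.
    fwd = [0] * n
    prev = seed
    for i in range(n):
        prev = pixels[i] ^ key_fwd[i] ^ prev
        fwd[i] = prev
    # Backward pass: each output depends on the next backward output.
    out = [0] * n
    nxt = seed
    for i in range(n - 1, -1, -1):
        nxt = fwd[i] ^ key_bwd[i] ^ nxt
        out[i] = nxt
    return out


def undiffuse(cipher, key_fwd, key_bwd, seed=0):
    """Invert `diffuse`: undo the backward pass, then the forward pass."""
    n = len(cipher)
    fwd = [0] * n
    for i in range(n - 1, -1, -1):
        nxt = cipher[i + 1] if i + 1 < n else seed
        fwd[i] = cipher[i] ^ key_bwd[i] ^ nxt
    plain = [0] * n
    for i in range(n):
        prev = fwd[i - 1] if i > 0 else seed
        plain[i] = fwd[i] ^ key_fwd[i] ^ prev
    return plain
```

In the scheme, each color matrix would be flattened before such a step and reshaped afterward. The inverse runs the two passes in the opposite order, which mirrors the diffusion recovery step of the decryption scheme.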
2.6. Decryption Scheme

The decryption scheme in this paper is the reverse of the encryption scheme. The detailed steps are as follows (Algorithm 3):
(1) The inverse Arnold transformation is applied to the encrypted image E to obtain the diffused image.
(2) After the diffusion is reversed, the inverse Arnold transformation is carried out again to obtain the embedded image.
(3) The embedding operation is inverted on the embedded image to obtain the restored image.

2.6.1. Pseudocode for the Image Decryption Algorithm
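Input: encrypted image E and keys
Output: restored image
Step 1: apply the inverse Arnold transformation to E to obtain the diffused image.
Step 2: perform diffusion recovery.
Step 3: apply the inverse Arnold transformation to obtain the embedded image.
Step 4: invert the embedding to obtain the restored image.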

3. Simulation Experiment Results and Analysis

A good encryption scheme should give the encrypted image good statistical characteristics, be robust against common image attacks, and have an enormous key space. In this paper, MATLAB is used to simulate the proposed encryption scheme. The analysis and discussion of the experimental results indicate that the scheme can effectively encrypt and decrypt color images.

The main pictures used in this section come from the open-source website (https://sipi.usc.edu/database/). The image sizes are 256 × 256, 512 × 512, and 1024 × 1024, respectively. The encryption and decryption effects for the three groups of experimental color images are shown in Figure 4.

3.1. Statistical Characteristic Analysis
3.1.1. Histogram

An image histogram is a graphical representation of the intensity distribution of the pixels in an image. Put differently, it counts the number of pixels at each intensity level, which reflects the most essential statistical characteristics of the image [16]. The flatter the histogram of the encrypted image, the more uniform the distribution of pixel values, the smaller the analysis space left for attackers, and the better the encryption performance.

In this paper, three groups of experimental images are encrypted, and the histograms of the encrypted images are examined. The results are shown in Figure 5. The histograms show that the proposed encryption scheme makes the pixel distribution of the encrypted image uniform and effectively reduces the analysis space available to attackers.

3.1.2. Adjacent Pixel Correlation

Adjacent pixel correlation is examined with a scatter plot drawn by randomly selecting N pixels in an image and using the values of each selected pixel and its neighbor as the horizontal and vertical coordinates, respectively. The more the points concentrate near the diagonal of the plot, the stronger the correlation between adjacent pixels of the image. Conversely, the more the points are dispersed over the whole plot, the weaker the correlation. A good encryption scheme should effectively reduce the correlation between adjacent pixels of the image [17]. In the field of image encryption, the correlation of adjacent pixels is usually analyzed in three directions: horizontal, vertical, and diagonal.

The correlation analysis of adjacent pixels for this encryption scheme is shown in Figure 6. The plots show that the correlation between adjacent pixels of the encrypted image is clearly reduced in all three directions, i.e., horizontal, vertical, and diagonal. The plotted points are evenly distributed over the whole plot, which indicates that the scheme can effectively resist statistical attacks.

3.1.3. Correlation Coefficients

To measure the correlation of adjacent pixels of an encrypted image more precisely, the correlation coefficient is frequently used to describe its size quantitatively. Its definition is given in formula (6), where the two inputs are vectors of adjacent pixel values and the result is their correlation coefficient. The lower the correlation of adjacent pixels, the closer the coefficient is to 0.

In the simulation experiments (see Table 2), the correlation coefficients in all three directions are close to zero, indicating that adjacent pixels of images encrypted with the proposed algorithm are almost uncorrelated.
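The coefficient can be computed with a short Python sketch. It assumes the usual Pearson definition (covariance divided by the product of the standard deviations) and, for simplicity, uses every adjacent pair in the chosen direction rather than a random sample of N pairs; the function names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length value lists.

    Raises ZeroDivisionError if either list is constant.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    return cov / math.sqrt(var_x * var_y)


def adjacent_correlation(img, direction):
    """Correlation between each pixel and its neighbor in one direction.

    `img` is one channel as a list of rows; `direction` is "horizontal",
    "vertical", or "diagonal".
    """
    dr, dc = {"horizontal": (0, 1), "vertical": (1, 0), "diagonal": (1, 1)}[direction]
    rows, cols = len(img), len(img[0])
    xs, ys = [], []
    for r in range(rows - dr):
        for c in range(cols - dc):
            xs.append(img[r][c])
            ys.append(img[r + dr][c + dc])
    return pearson(xs, ys)
```

A smooth gradient gives a coefficient of 1 in every direction, whereas a well-encrypted channel is expected to give values near 0.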

3.1.4. Information Entropy

Information entropy is mainly used in the field of image processing to describe the information contained in an image, that is, the distribution of grayscale values in the image. Its formula is defined in terms of the frequency with which each grayscale value occurs.

For an image with 256 gray levels, the maximum information entropy is eight, and the closer the value is to eight, the more uniform the distribution is. Comparison with other studies (see Table 3) shows that, after the scheme proposed here is applied, the encrypted image possesses a marked degree of randomness.
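As an illustration, the entropy of one channel can be computed as follows. The sketch assumes the standard Shannon definition, the negated sum of p·log2(p) over the gray levels that occur, with p taken as the relative frequency of each level; the function name is hypothetical.

```python
import math
from collections import Counter


def information_entropy(pixels):
    """Shannon entropy, in bits, of a flat list of gray-level values."""
    n = len(pixels)
    counts = Counter(pixels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())
```

A channel in which all 256 levels occur equally often reaches the maximum of eight bits, while a constant channel has entropy zero.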

3.2. Robustness Analysis

Information loss and attacks may occur in the process of image transmission [21]. A robust encryption scheme should resist common image attacks to a certain extent and should ensure that the main information of the original image can still be recovered by decryption from encrypted images that are incomplete or contaminated.

3.2.1. Cropping Attack

A cropping attack removes part of the image, which causes the image to lose some information. In image encryption, part of the encrypted image is cut away before decryption, and the decrypted, reconstructed image is compared with the original plaintext image [22].

We tested 25% and 50% cropping attacks on the three groups of experimental images with different resolutions; the test results are shown in Figure 7. The results show that the decrypted image still displays the main image information after the cropping attack, which indicates that the encryption scheme has good resistance to cropping attacks.

3.2.2. Gaussian Noise Attack

Images are often degraded by random errors, which are called noise. Idealized noise is modeled as white noise, which has the same intensity at all frequencies. As a particular case of white noise, Gaussian white noise can be used to approximate the noise in many real scenes.

In this paper, Gaussian white noise at levels 0.08 and 0.2 is added to the encrypted images of the three groups, which are then decrypted. The experimental results are shown in Figure 8. The figure shows that the main information of the original image can still be recovered by decryption even if the encrypted image is polluted by noise, which indicates good robustness to noise attacks.

3.3. Encryption Performance Analysis
3.3.1. Peak Signal-to-Noise Ratio

The peak signal-to-noise ratio (PSNR) is the ratio between the maximum possible power of a signal and the power of the noise that distorts it, and it is typically used for image and video signals. PSNR is calculated from the total number of samples in the signal, the maximum possible value of a sample, and, for each index n, the nth sample of the original signal and the nth sample of the encrypted signal.

PSNR measures how much two images differ relative to the peak signal value. For identical images, PSNR is infinite, and for images that differ by the full sample range at every pixel, its value is zero. The PSNR between each encrypted image produced by the proposed algorithm and its original image was calculated (see Table 4). The values are lower than the PSNR values reported for other research schemes, which shows that the images encrypted by the algorithm in this paper are quite different from the original images and that the algorithm has good encryption performance.
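A minimal Python sketch of the computation is given below. It assumes the standard definition PSNR = 10·log10(MAX² / MSE), where MSE is the mean squared difference between corresponding samples; the function name and the default peak value of 255 (8-bit samples) are assumptions.

```python
import math


def psnr(original, encrypted, max_value=255):
    """PSNR in dB between two equal-length flat lists of sample values."""
    mse = sum((a - b) ** 2 for a, b in zip(original, encrypted)) / len(original)
    if mse == 0:
        return math.inf  # identical signals
    return 10 * math.log10(max_value ** 2 / mse)
```

For encryption, a lower value is better because it means the cipher image is farther from the plain image, which is the opposite of how PSNR is read in compression or denoising.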

3.3.2. Key Space Analysis

The size of the key space affects the encryption performance of the encryption scheme. A robust image encryption scheme requires a sufficiently large key space [26]. As described above, the encryption scheme in this paper uses six encryption keys. When M = N, the key lengths are obtained as follows:

Meanwhile, the key length of can be expressed as

From what has been mentioned above, we can conclude that the key space size is

According to formula (12), we calculate the key space for the three groups of experimental images, and the resulting values are all greater than the reference bound. The experimental results show that this encryption scheme has a large key space and high security.

3.3.3. Time Complexity

Time complexity is a function that qualitatively describes the running time of an algorithm and is an important indicator of its quality. It is usually written in big-O notation, which omits the lower-order terms and the leading coefficient of the function.

The algorithm proposed in this paper mainly encrypts images with equal length and width, so the estimates below are expressed in terms of that side length. Separate estimates are made for generating the initial key sequence with the chaotic system, for the deep learning model (based on its training theory), for the proposed key embedding method, and for the scrambling and diffusion algorithms.

Combining the previous estimates, we can conclude that the time complexity function of the cryptographic algorithm proposed in this paper is

Formula (13) is simplified by order of magnitude to give the time complexity of the cryptographic algorithm presented herein.

4. Discussion

The simulation results above show that the cryptographic algorithm proposed in this paper performs well in terms of histogram, correlation, information entropy, and other statistical measures, and that it shows good robustness to common cropping and noise attacks. Overall, compared with other research schemes, it obtains an enormous key space, lower time complexity, and better encryption performance as measured by the PSNR value.

The encryption algorithm proposed in this paper mainly uses a chaotic system combined with a deep learning model to generate an encryption key and then embeds the key through an embedding method of our own design. In theory, deep learning models may introduce errors when generating keys. However, this paper focuses on improving the algorithmic security of color image encryption, so such errors are not considered in this encryption scheme. Naturally, any algorithm has certain limitations. During the experiments, we tested the number of pixels change rate (NPCR) and the unified average changing intensity (UACI) and found that the resistance to differential attack was weaker than that of other research results. In addition, the scheme currently only supports the encryption of color images of equal length and width.

5. Conclusion

In this paper, we proposed a color image encryption scheme based on deep learning and block embedding. Because of its multilevel complex structure, the deep learning model is combined with a chaotic system to ensure the complexity of the generated key. In addition, we design a block embedding method, which combines the encryption key with the color image organically. Simulation experiments show that the proposed encryption scheme obtains good performance and robustness to general attacks.

MATLAB uses double-precision data types by default. To reduce the gap between the simulation accuracy and the actual situation in the encryption algorithm design, we reduce the frequency of conversions between data types in the experiment. Meanwhile, the complex calculation that generates the encryption key is separated from the key embedding operation to reduce the complex operations on the color image itself. At present, this paper only realizes the encryption of color images of equal length and width. In the future, we will study the encryption of color images with unequal length and width and improve the robustness of the algorithm to other image attack methods.

Data Availability

The data that support the findings of this study are openly available at [https://sipi.usc.edu/database/].

Conflicts of Interest

The authors declare that they do not have any commercial or associative interest that represents conflicts of interest in connection with the work submitted.