Security Measure for Image Steganography Based on High Dimensional KL Divergence
Steganographic security is the research focus of steganography. Current steganography research emphasizes the design of steganography algorithms, while theoretical research on steganographic security measures lags behind. This paper proposes a feasible image steganographic security measure based on high dimensional KL divergence. It is proved that a steganographic security measure based on higher dimensional KL divergence is more accurate. The correlation between neighborhood pixels is analyzed from the principles of the imaging process and from content characteristics, and it is concluded that 9-dimensional probability statistics are effective enough to be used as a steganographic security measure. Then, to reduce the computational complexity of high dimensional probability statistics and improve the feasibility of the security measure method, a dimension reduction scheme is proposed that uses the gradient to describe image textures. Experiments show that the proposed steganographic security measure method is feasible and effective, and more accurate than the measure method based on 4-dimensional probability statistics.
Steganography provides an effective way for covert communication by hiding secret messages in public multimedia covers. Steganography based on image covers attracts more attention because images exist widely, are easy to acquire, and are complex enough. Among the three key properties of steganography (robustness, security, and capacity), security is the most valued by researchers. Adaptive steganography has therefore become the mainstream because of its high security, and many research results have been achieved. However, research on the corresponding steganographic security theory lags behind, and there is currently no effective steganographic security measure method. An accurate steganographic security measure method could help to design more secure algorithms, enrich steganographic security theory, and promote the further development of steganography.
At present, there are mainly two types of steganographic security measure methods. The first type takes the attacker's perspective through blind steganalysis [2, 3], which relies on large training sets, many high-quality features, and a good classifier. These security measure methods suffer from a degree of uncertainty and poor availability. The other type is based on information theory, which is currently the main research direction of steganographic security measure.
Zollner first introduced Shannon's information theory to study steganographic security, providing ideas for research on steganographic security theory. The steganographic security theory widely used at present was constructed by Cachin based on statistical security theory and KL divergence. By assuming that covers were independent and identically distributed, Cachin’s theory described the changes of the 1-dimensional probability statistical distribution between the cover object and its stego object. However, Cachin’s theory ignored the correlation between cover elements, resulting in poor accuracy and overestimation of security.
In order to improve the accuracy of the steganographic security measure, Sullivan proposed a method based on the Markov chain model. Then, Zhang proposed a high-order Markov chain model. The “chain” scanning method made these two methods less accurate when applied to images, which are 2-dimensional. Image covers (unless specifically noted, “image” refers to “spatial image” in this paper) are special because there is correlation between neighborhood pixels, and the correlation in 2-dimensional space cannot be described by a Markov chain. On the basis of Cachin’s theory, Sun proposed a steganographic security measure method of n-dimensional KL divergence, but failed to propose an effective solution for its high computational complexity.
Therefore, the accuracy of an image steganographic security measure relies on the accuracy of the statistical distribution of pixels. Based on the above research, this paper proposes an image steganographic security measure method based on high dimensional KL divergence. First, it is analyzed and proved that using higher dimensional discrete probability statistics can effectively improve the accuracy of steganographic security measures. Second, by analyzing the correlation of neighborhood pixels, we conclude that 9-dimensional probability statistics can accurately measure steganographic security, and we analyze the complexity problem of 9-dimensional probability statistics. Third, to solve this complexity problem, a dimension reduction scheme is proposed that keeps the accuracy of the measure method, and the complete steganographic security measure method is then described. Finally, experimental results show the feasibility, validity, and accuracy of the proposed steganographic security measure.
2. Related Work
The steganographic security measure is used to make a reasonable quantification of steganographic security by means of calculation. In the study of steganographic security, the classical method was proposed by Cachin, which defines steganographic security based on statistical security and KL divergence. Assuming that $P$ and $Q$ are two probability distributions of values of variable $x$, KL divergence (also known as relative entropy) is an asymmetric measure of the difference between the two probability distributions $P$ and $Q$. In general, $P$ is the true distribution, and $Q$ represents the theoretical or approximate distribution. The basic form of KL divergence is
$$D(P \| Q) = \sum_{x \in X} P(x) \log \frac{P(x)}{Q(x)}, \tag{1}$$
where the logarithmic function defaults to base 2 and $X$ is the value space of variable $x$. For grayscale images, usually $X = \{0, 1, \dots, 255\}$.
We can use $P_C$ to describe the probability distribution of the original cover image and use $P_S$ to describe the probability distribution of its corresponding stego image. If the KL divergence of the two probability distributions is at most $\varepsilon$,
$$D(P_C \| P_S) \le \varepsilon, \tag{2}$$
then the steganography is $\varepsilon$-secure. If $\varepsilon = 0$, the steganography is absolutely secure. In practice, the probability distributions $P_C$ and $P_S$ are difficult to obtain accurately, and an accurate steganographic security measure cannot be achieved. In actual application, it is assumed that the pixels are independently and identically distributed, and the probability distribution of the cover can be described by probability statistics.
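As an illustration of the 1-dimensional measure described above, the following minimal sketch estimates the cover and stego distributions from gray-level frequencies and evaluates the base-2 KL divergence. It is only a sketch under stated assumptions: images are plain lists of rows, and the function name `kl_divergence_1d` and the smoothing constant `eps` (used when a gray level present in the cover is absent from the stego image) are hypothetical choices, not part of the paper.

```python
import math
from collections import Counter


def kl_divergence_1d(cover, stego, eps=1e-12):
    """Base-2 KL divergence between the gray-level histograms of two images.

    cover, stego: lists of rows of integer pixel values (e.g. 0..255).
    eps: hypothetical floor for stego bins that are empty (an assumption).
    """
    cover_counts = Counter(v for row in cover for v in row)
    stego_counts = Counter(v for row in stego for v in row)
    cover_total = sum(cover_counts.values())
    stego_total = sum(stego_counts.values())

    divergence = 0.0
    # Only gray levels that occur in the cover contribute (0 * log 0 = 0).
    for value, count in cover_counts.items():
        p = count / cover_total
        q = max(stego_counts.get(value, 0) / stego_total, eps)
        divergence += p * math.log2(p / q)
    return divergence
```

Because the pixels are treated as independent, this estimate is blind to any modification that preserves the gray-level histogram, which is the weakness the higher dimensional statistics are meant to address.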
3. Theoretical Analysis
KL divergence calculated from 1-dimensional probability statistics of pixels is not accurate enough to evaluate steganographic security. In order to improve measure accuracy, we naturally get the idea that using high dimensional probability statistics can compensate for the lack of accuracy caused by low dimensional probability statistics. The $n$-dimensional probability statistics used for the steganographic security measure are shown as follows:
$$D_n(P_C \| P_S) = \sum_{(x_1, \dots, x_n) \in X^n} P_C(x_1, \dots, x_n) \log \frac{P_C(x_1, \dots, x_n)}{P_S(x_1, \dots, x_n)}, \tag{3}$$
where $P_C$ and $P_S$ are the $n$-dimensional probability distributions of pixel groups in the cover image and the stego image, respectively. $D_n$ is treated as the final result of the steganographic security measure, and the smaller the value, the higher the steganographic security.
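The n-dimensional measure can be sketched by counting pixel groups instead of single pixels. The sketch below is an assumption-laden illustration rather than the paper's implementation: it uses square `size` x `size` groups collected with a sliding window of stride 1, stores only the groups that actually occur (a sparse histogram), and reuses a hypothetical smoothing constant `eps` for groups missing from the stego image.

```python
import math
from collections import Counter


def group_counts(image, size=2):
    """Count every size-by-size pixel group (sliding window, stride 1)."""
    height, width = len(image), len(image[0])
    counts = Counter()
    for i in range(height - size + 1):
        for j in range(width - size + 1):
            group = tuple(image[i + di][j + dj]
                          for di in range(size) for dj in range(size))
            counts[group] += 1
    return counts


def kl_divergence_groups(cover, stego, size=2, eps=1e-12):
    """Base-2 KL divergence between pixel-group histograms (size*size dims)."""
    cover_counts = group_counts(cover, size)
    stego_counts = group_counts(stego, size)
    cover_total = sum(cover_counts.values())
    stego_total = sum(stego_counts.values())

    divergence = 0.0
    for group, count in cover_counts.items():
        p = count / cover_total
        # eps is an assumed floor for groups that never occur in the stego image.
        q = max(stego_counts.get(group, 0) / stego_total, eps)
        divergence += p * math.log2(p / q)
    return divergence
```

With `size=1` this reduces to the 1-dimensional histogram measure; `size=2` and `size=3` correspond to 4-pixel and 9-pixel groups.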
3.1. Theoretical Proof
In this paper, an n-pixel group is defined as a square region composed of a center pixel and the n-1 pixels closely adjacent to it in the spatial image. In the statistical analysis of images, as shown in Figure 1, the 2-pixel group, 4-pixel group, and 9-pixel group are often used. When an image is statistically analyzed using n-pixel groups, it is accordingly assumed that the n-pixel groups are independently and identically distributed.
Theorem 1. Steganography is more secure if embedding modifications occur on pixel groups with smaller statistical probability.
Proof. When making probability statistical analysis of n-pixel groups, we must use n-dimensional KL divergence as a criterion for measuring steganographic security. In the proof process, the representation is simplified.
The above formula “greater than or equal to” is derived according to the nature of the convex function. if and only if .
We can get from the above formula that when , , and when . if and only if , which means that there is no embedding modification.
So, , and the equal sign is established if and only if .
Therefore, the KL divergence is a monotonically increasing function of the statistical probability of the modified pixel groups. If pixel groups with higher statistical probability are modified, the KL divergence is higher and the steganography is less secure; if pixel groups with lower statistical probability are modified, the KL divergence is lower and the steganography is more secure.
Theorem 1 is consistent with the empirical conclusions of current adaptive steganography algorithms. The statistical probability of pixel groups in the image texture areas is relatively low, and modifications of adaptive steganography algorithms with high security occur mostly in the texture pixels. This theorem provides theoretical basis for adaptive steganography algorithms.
Theorem 2. Steganographic security measure would be more accurate when using higher dimensional probability statistics.
Proof. Assume that pixel groups are independently and identically distributed. (Unlike the assumption that individual pixels are independently and identically distributed, this assumption is consistent with the strong correlation that actually exists between neighborhood pixels.) For the same cover image and stego image , the N-dimensional KL divergence and the n-dimensional KL divergence ($N > n$) are analyzed, respectively, as steganographic security measures, and we have
As obtained in the proof process of Theorem 1, when
When we continue to modify the existing stego image to obtain another stego image , and use m-dimensional KL divergence to describe differences between and , and differences between and , the change of KL divergence is
When , .
Therefore, steganographic security measure adopting higher-dimensional probability statistics can be more sensitive to embedding modification and be more accurate.
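A standard identity from information theory, stated here as supporting context rather than as the paper's own proof, points in the same direction. For joint distributions $P$ and $Q$ over $(x_1, \dots, x_N)$ and any $n < N$, the chain rule for KL divergence gives the following, where the symbols $P$, $Q$, and $x_k$ are notation assumed for this illustration:

```latex
D\bigl(P(x_1,\dots,x_N)\,\|\,Q(x_1,\dots,x_N)\bigr)
  = D\bigl(P(x_1,\dots,x_n)\,\|\,Q(x_1,\dots,x_n)\bigr)
  + \sum_{x_1,\dots,x_n} P(x_1,\dots,x_n)\,
    D\bigl(P(\cdot \mid x_1,\dots,x_n)\,\|\,Q(\cdot \mid x_1,\dots,x_n)\bigr)
  \;\ge\; D\bigl(P(x_1,\dots,x_n)\,\|\,Q(x_1,\dots,x_n)\bigr)
```

The inequality holds because the conditional term is a weighted sum of KL divergences, each of which is nonnegative. It applies exactly when the lower dimensional distribution is a marginal of the higher dimensional one; for statistics counted on finite images this is only approximately the case.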
3.2. Determination of Dimensions Used in Steganographic Security Measure
From Theorem 2, it can be concluded that the higher the dimension of probability statistics, the more accurate the corresponding steganographic security measure. However, when security, computational complexity, and practical feasibility are considered together, this conclusion does not hold unconditionally. The dimension used in the steganographic security measure must be determined according to the following principles.
Principle 1. Probability statistics should adopt high dimensions to ensure the accuracy of steganographic security measures.
Principle 2. Computational complexity must not be too high, so that the steganographic security measure remains feasible in practice.
Principle 3. The dimension should be determined to conform to actual characteristics of the cover.
The characteristics of the image are mainly reflected in the correlation between the central pixel and its neighborhood pixels. Principle 1 clearly conflicts with Principle 2. The key to a compromise between the two principles is to consider the actual characteristics of the image, as stated in Principle 3.
The image library Bossbase 1.01 is often used in image steganography research. It contains 10,000 uncompressed spatial images of size 512 × 512. These images are taken from 8 different cameras, and the captured natural images in raw format are processed (without any compression) to obtain the grayscale image library. Most natural images are interpolated during the imaging process, which gives the central pixel and its neighborhood pixels a strong correlation. In addition, the inherent content characteristics of natural images also make neighborhood pixels correlated. It can be found that the closer a pixel is to the central pixel, the stronger its correlation with the central pixel. Due to the interpolation operation and inherent content characteristics, the central pixel has a strong correlation with its 8-neighbor pixels. The correlation between the central pixel and its neighborhood pixels is shown in Figure 2.
From the above analysis, it is accurate and reasonable to determine the dimension based on the strong correlation between the central pixel and its 8-neighbor pixels. This means that the steganographic security measure should use 9-dimensional probability statistics, which conforms to Principles 1 and 3. However, with respect to Principle 2, the number of possible values of a 9-pixel group, and hence the size of the probability statistics, is
$$256^9 = 2^{72}.$$
Therefore, security measure using 9-dimensional probability statistics is not feasible in practical applications. In order to solve this problem, this paper proposes a dimension reduction scheme based on image texture features, which can convert the 9-dimensional probability statistics to 4-dimensional probability statistics.
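The scale of the problem can be checked with simple arithmetic. The sketch below assumes 8-bit gray levels (256 values per pixel) and the 512 × 512 image size mentioned above; the helper name `histogram_bins` is hypothetical.

```python
GRAY_LEVELS = 256  # assumed 8-bit grayscale


def histogram_bins(dimensions, levels=GRAY_LEVELS):
    """Number of distinct values an n-pixel group can take."""
    return levels ** dimensions


bins_9d = histogram_bins(9)   # 256**9 = 2**72 possible 9-pixel groups
bins_4d = histogram_bins(4)   # 256**4 = 2**32 possible 4-pixel groups
pixels_per_image = 512 * 512  # 2**18 pixels, an upper bound on groups per image

print(bins_9d, bins_4d, bins_9d // bins_4d, pixels_per_image)
```

Reducing from 9 to 4 dimensions shrinks the space of possible groups by a factor of $2^{40}$, while a single image supplies at most $2^{18}$ pixel groups in either case.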
4. Steganographic Security Measure
Probability statistics usually ignore image texture features and do not distinguish between pixels within a pixel group (9-dimensional probability statistics use the 9-pixel group). The empirical conclusions of adaptive steganography and the deterministic conclusion of Theorem 1 indicate that local texture features are the focus of most adaptive steganography algorithms, and that the way textures are processed is related to steganographic security. Image texture features make dimension reduction in the steganographic security measure possible.
The gradient is a vector with amplitude and direction. The image gradient indicates that, at a given pixel, the pixel value changes fastest along the gradient direction. The image gradient can therefore describe image texture features clearly, and it is widely used in edge detection and image enhancement. In this paper, we use the gradient to describe texture features in 9-pixel groups and design a “dimension reduction scheme” that maintains the correlation within 9-pixel groups while preserving accuracy, reducing computational complexity, and increasing the availability of the steganographic security measure.
The overall structure of steganographic security measure method proposed in this paper is shown in Figure 3. The dotted line shows the key content of the proposed method.
4.1. Gradient and Texture Pixels
The preferred choice for obtaining the gradient of a 9-pixel group is the Sobel operator [10, 11]. As shown in (11), the Sobel operator is convolved with a 9-pixel group $A$ to calculate the horizontal approximate gradient and vertical approximate gradient of the central pixel:
$$G_x = \begin{bmatrix} -1 & 0 & +1 \\ -2 & 0 & +2 \\ -1 & 0 & +1 \end{bmatrix} * A, \qquad G_y = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ +1 & +2 & +1 \end{bmatrix} * A, \tag{11}$$
where $G_x(i,j)$ and $G_y(i,j)$ are the horizontal gradient and vertical gradient of the central pixel at the position $(i,j)$, respectively. The amplitude and direction of the gradient can then be calculated easily. Gradient amplitude describes the relative change of the gray value of image pixels. Using gradient amplitude to judge whether the central pixel is a texture pixel is simple, intuitive, and effective. The usual way to decide whether a pixel is a texture pixel is to set a threshold $T$ (if the gradient amplitude is greater than or equal to $T$, then the pixel studied is a texture pixel). Gradient amplitude is calculated as follows:
$$G(i,j) = \sqrt{G_x(i,j)^2 + G_y(i,j)^2}. \tag{12}$$
Because image area where modification occurs and the number of pixels modified differ with different embedding payload, a reasonable threshold should be closely related to embedding payload. needs to meet the constraints shown in the following equation:
where is a set of texture pixels obtained according to threshold , denotes cover image, and denotes embedding payload.
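The gradient computation and threshold test of this subsection can be sketched as follows. This is a minimal illustration under stated assumptions: the standard 3 × 3 Sobel kernels are applied to a single 9-pixel group given as three rows, the amplitude is taken as the Euclidean norm of the two gradient components, and the threshold is passed in as a plain number because the payload-dependent constraint on it is not reproduced here. The function names are hypothetical.

```python
import math

# Standard Sobel kernels (applied as a correlation; flipping the kernels for a
# true convolution only changes the sign, not the amplitude).
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def sobel_amplitude(group):
    """Gradient amplitude at the centre of a 3x3 pixel group (3 rows of 3)."""
    gx = sum(SOBEL_X[r][c] * group[r][c] for r in range(3) for c in range(3))
    gy = sum(SOBEL_Y[r][c] * group[r][c] for r in range(3) for c in range(3))
    return math.hypot(gx, gy)


def is_texture_pixel(group, threshold):
    """Centre pixel counts as a texture pixel when amplitude >= threshold."""
    return sobel_amplitude(group) >= threshold
```

For example, a vertical step edge such as `[[0, 0, 255], [0, 0, 255], [0, 0, 255]]` yields a purely horizontal gradient, whereas a flat group yields zero amplitude and is classified as nontexture for any positive threshold.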
4.2. Dimension Reduction Scheme
Dimension Reduction Scheme is different for texture pixels and nontexture pixels.
(1) Dimension Reduction Scheme for Texture Pixels. If $G \ge T$, then the central pixel is a texture pixel. In this case, in order to reduce 9-dimensional probability statistics to 4-dimensional probability statistics, we must pick out the pixel that deviates most from the 9-pixel group so as to minimize the overall impact.
It is necessary to determine which pixel deviates most from the 9-pixel group. The easiest way is to compare $d_k$, the difference between each pixel $x_k$ and the mean $\bar{x}$ of the pixels in the group. The pixel numbering within a 9-pixel group is shown in Figure 4.
The mean can be obtained in the following way:
$$\bar{x} = \frac{1}{9} \sum_{k} x_k. \tag{14}$$
Based on $\bar{x}$, look for the target pixel with the largest deviation,
$$d_k = |x_k - \bar{x}|. \tag{15}$$
By comparison we obtain the maximum value of $d_k$ and its corresponding target pixel $x_t$. If more than one pixel satisfies the requirement, the pixel having the weaker correlation with the central pixel is selected as the target pixel (the correlation of a D-neighbor pixel is weaker than that of a 4-neighbor pixel); if there is still more than one candidate, the target pixel is selected at random. According to the position of the target pixel $x_t$, we obtain the dimension reduction scheme for texture pixels, which is shown in Figure 5.
If the target pixel is the upper left pixel of the 9-pixel group, the lower right 4 pixels are counted, as shown in Figure 5(a).
If the target pixel is the upper pixel of the 9-pixel group, the lower left 4 pixels are counted, as shown in Figure 5(b).
If the target pixel is the upper right pixel of the 9-pixel group, the lower left 4 pixels are counted, as shown in Figure 5(c).
If the target pixel is the left pixel of the 9-pixel group, the lower right 4 pixels are counted, as shown in Figure 5(d).
If the target pixel is the central pixel of the 9-pixel group, no statistics are performed.
If the target pixel is the right pixel of the 9-pixel group, the upper left 4 pixels are counted, as shown in Figure 5(e).
If the target pixel is the lower left pixel of the 9-pixel group, the upper right 4 pixels are counted, as shown in Figure 5(f).
If the target pixel is the lower pixel of the 9-pixel group, the upper right 4 pixels are counted, as shown in Figure 5(g).
If the target pixel is the lower right pixel of the 9-pixel group, the upper left 4 pixels are counted, as shown in Figure 5(h).
The purpose of the above processing is to weaken the influence of texture pixels on the steganographic security measure and to reduce the loss of measure accuracy caused by reducing 9-dimensional probability statistics to 4-dimensional probability statistics.
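The case list above can be sketched as a lookup from the target pixel position to the 2 × 2 block that is counted. The sketch rests on several assumptions that are not confirmed by the paper: positions are written as (row, column) within the 9-pixel group rather than with the numbering of Figure 4, deviations are compared as integers (nine times the deviation from the mean) to avoid floating-point ties, and the names `BLOCK_FOR_TARGET`, `pick_target`, and `reduced_group` are hypothetical.

```python
import random

# (row, col) of the target pixel -> top-left corner of the 2x2 block to count.
BLOCK_FOR_TARGET = {
    (0, 0): (1, 1),  # upper left  -> lower right 4 pixels, Figure 5(a)
    (0, 1): (1, 0),  # upper       -> lower left 4 pixels,  Figure 5(b)
    (0, 2): (1, 0),  # upper right -> lower left 4 pixels,  Figure 5(c)
    (1, 0): (1, 1),  # left        -> lower right 4 pixels, Figure 5(d)
    (1, 1): None,    # centre      -> no statistics are performed
    (1, 2): (0, 0),  # right       -> upper left 4 pixels,  Figure 5(e)
    (2, 0): (0, 1),  # lower left  -> upper right 4 pixels, Figure 5(f)
    (2, 1): (0, 1),  # lower       -> upper right 4 pixels, Figure 5(g)
    (2, 2): (0, 0),  # lower right -> upper left 4 pixels,  Figure 5(h)
}


def pick_target(group, rng=random):
    """Position of the pixel deviating most from the group mean."""
    positions = [(r, c) for r in range(3) for c in range(3)]
    total = sum(group[r][c] for r, c in positions)
    # |9*x - total| equals 9 * |x - mean|, so the ordering is unchanged.
    deviation = {(r, c): abs(9 * group[r][c] - total) for r, c in positions}
    largest = max(deviation.values())
    tied = [p for p in positions if deviation[p] == largest]

    def rank(pos):
        # Diagonal neighbours are preferred over 4-neighbours; centre is last.
        if pos == (1, 1):
            return 2
        return 0 if pos[0] != 1 and pos[1] != 1 else 1

    best_rank = min(rank(p) for p in tied)
    return rng.choice([p for p in tied if rank(p) == best_rank])


def reduced_group(group, rng=random):
    """The 4-pixel group counted for a texture pixel, or None for no count."""
    origin = BLOCK_FOR_TARGET[pick_target(group, rng)]
    if origin is None:
        return None
    r0, c0 = origin
    return (group[r0][c0], group[r0][c0 + 1],
            group[r0 + 1][c0], group[r0 + 1][c0 + 1])
```

Passing a seeded `random.Random` instance as `rng` makes the final random tie-break reproducible.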
(2) Dimension Reduction Scheme for Nontexture Pixels. If $G < T$, then the central pixel is a nontexture pixel. In this case, it is necessary to determine whether the central pixel is a singular pixel, that is, whether the central pixel is far away from its neighborhood pixels:
If (setting a threshold , it is equivalent to ) meaning the central pixel is a singular pixel, the lower right 4 pixels are counted, as shown in Figure 6. The probability of this situation is extremely low.
If , then according to the method shown in Figure 7, four groups of 4 pixels are counted. Through counting multiple times, it can compensate for the disadvantages of low dimensional statistics and emphasize that embedding modification occurring on nontexture pixels has a great influence on steganographic security.
Associate threshold with threshold defined in Section 4.1, . Generally, .
4.3. Algorithm Description for Proposed Measure Method
See Algorithm 1.
|Input: Payload , Cover image , Stego image|
|Output: Value of steganographic security measure|
|1: Initialize target-image ;|
|2: [m,n]= size of ()|
|3: calculate gradient using Equ.(11)|
|5: while ()|
|9: for i=1:511|
|10: for j=1:511|
|11: if //texture pixels|
|12: calculate using Equ.(14)|
|14: look for make established|
|15: if end|
|16: if end|
|17: if end|
|18: if end|
|19: else if //non-texture pixels|
|31: let repeat step 2-30 to get|
|32: return using Equ.(3).|
The experiments use BOSSBase 1.01 as the image library and MATLAB R2016a as the experimental platform. The processor is an Intel(R) Pentium(R) CPU G2130 @ 3.20 GHz. Three spatial image steganography algorithms are selected for experimental analysis: two adaptive steganography algorithms (HUGO and HILL) and LSBM (Least Significant Bit Matching). In addition, a JPEG image steganography algorithm, J-UNIWARD, is selected for experimental analysis. Sections 5.1–5.3 cover experiments on spatial image steganography, and Section 5.4 covers J-UNIWARD.
5.1. Effectiveness of the Proposed Measure Method
Two different images (Image 1 and Image 2, as shown in Figure 8) are selected from the image library. We make three comparative experiments to demonstrate the effectiveness of the proposed measure method.
Comparing the differences of the steganographic security measures of the two images with the same embedding payload (0.4) and the same spatial image steganography algorithm, the results are shown in Figures 8(a) and 8(b).
Comparing the differences of the steganographic security measures of the same image with the same embedding payload (0.4) but with different steganography algorithms (HUGO, HILL, LSBM), the results are shown in Figure 8(a) or Figure 8(b).
Comparing the differences of the steganographic security measures of the same image with the same steganography algorithm (HILL) but with different embedding payloads ( and ), the result is shown in Figure 8(c).
Figures 8(a) and 8(b) show that when the payload is 0.4, the value of the steganographic security measure of Image 2 is lower than that of Image 1 for all three steganography algorithms, so the steganographic security of Image 2 is higher. Although both images have a high texture complexity, the texture pixels in Image 1 are distributed more widely, which makes the embedding modifications to Image 1 more scattered. Therefore, more pixel groups are modified and the final measure values are higher. It can be seen from the results of Figure 8 that the proposed steganographic security measure method can effectively quantify steganographic security and can accurately quantify the relationship of steganographic security with algorithm, cover image features, and embedding payload.
5.2. Accuracy of the Proposed Measure Method
To show that the proposed steganographic security measure method is more accurate, we compare it with the measure method that calculates KL divergence directly from 4-dimensional probability statistics. 50 images were randomly selected from the image library, and both measure methods were used to obtain the KL divergence between each cover image and its stego image produced by HILL or HUGO. Since the total number of 4-pixel groups used in the probability statistics differs between the two measure methods, the measure values cannot directly reflect each method’s accuracy. The change ratio avoids this problem and can be used to evaluate measure accuracy. In the experiment, we adopt the change ratio when the payload changes from 0.2 to 0.4 as the direct data for evaluating the accuracy of each measure method. The change ratio can be calculated in the following way:
where denotes change ratio and indicates measure value at a certain payload.
In the experiment, we compared the differences in measure value and in change ratio between the two measure methods under the algorithms HILL, HUGO, and LSBM. Some experimental results for the HILL algorithm are shown in Table 1, for the HUGO algorithm in Table 2, and for the LSBM algorithm in Table 3 (only 20 of them are listed due to space limitations).
The effectiveness of the steganographic security measure method proposed in the paper can also be demonstrated from the data in Tables 1, 2, and 3. From the comparison between the proposed method and the method based on 4-dimensional probability statistics in change ratio when embedding payload is changed from 0.2 to 0.4, it can be clearly found that change ratio of the proposed method is larger. And it indicates that the proposed method is more sensitive to embedding modifications and is more accurate. At the same time, the two measure methods sometimes show different conclusions due to the data shown in Tables 1 and 2. For example, as for the data and data in Table 2, the proposed method shows that steganography of image is more secure while the method based on 4-dimensional probability statistics shows that steganography of image is more secure.
5.3. Changes of Steganographic Security Measure with Embedding Payload
This experiment is designed to compare the values of the proposed steganographic security measure method and the method based on 4-dimensional probability statistics at different embedding payload. 100 images in Bossbase 1.01 image library were randomly selected as experimental database, and average measure values of HILL, HUGO, and LSBM were calculated when the embedding payloads were 0.01, 0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, and 0.5. Average measure values at selected payloads are shown in Table 4 and Figure 9.
Both the proposed method and the method based on 4-dimensional probability statistics use frequency statistics to approximate probability. The total number of frequency counts (the total number of pixel groups) differs between the two methods. For the same cover and the same measure method, however, the total number of frequency counts is the same, so we can take a “relative value” to show the accuracy of the security measure method. The “relative value” can be calculated as follows:
where represents relative value at payload using steganography algorithm and measure method ; represents measure value at payload using steganography algorithm and measure method .
Changes of relative values with embedding payload are shown in Figure 10.
Figure 9 shows that both methods can describe the change of steganographic security with the embedding payload. Figure 10 shows that, for each steganographic scheme, the relative values of the proposed method are larger than those of the method based on 4-dimensional probability statistics, which indicates that the measure method based on low dimensional (4-dimensional) probability statistics overestimates steganographic security and is less accurate. At the same time, the results indicate that the proposed measure method is more sensitive to changes in the embedding payload and is more accurate.
The above results show that the proposed method is more accurate and performs much better.
5.4. The Performance of Steganographic Security Measure on JPEG Image Steganography
The experiment in this section again uses Bossbase 1.01 as the image library. Images are JPEG-compressed to obtain the corresponding JPEG images (Quality=75, bitDepth=8). Using the typical JPEG image steganography algorithm J-UNIWARD, the compressed JPEG images are modified to obtain stego images in JPEG format. Since the proposed security measure is carried out through statistical analysis of pixel values, the cover images and stego images must be read as spatial discrete pixel values when evaluating the steganographic security of JPEG image steganography. In short, JPEG images need to be inversely transformed into spatial images before the steganographic security measure is applied. We perform experiments similar to those in the previous sections, and the results are shown in Table 5 and Figure 11.
From the data in Table 5, it can be seen that measure value of J-UNIWARD is higher than HILL or HUGO. This is because JPEG image steganography modifies the DCT coefficients of the image block, and each embedding modification affects more pixels. It can be seen from Table 5 and Figure 11 that the proposed security measure can effectively measure the changes of steganographic security of JPEG images with image features and payload, and it is slightly more accurate than 4-dimensional probability statistics.
In this paper, a feasible steganographic security measure method based on high dimensional KL divergence is proposed. The proposed measure is considered from the perspective of the steganographer (the sender) and provides a reference steganographic security measure. It is proved that embedding modifications to pixel groups with small statistical probability yield higher steganographic security, and that higher dimensional probability statistics are more accurate for the steganographic security measure. By analyzing the imaging principle, it is shown to be reasonable to use 9-dimensional probability statistics in the security measure. A dimension reduction scheme is then proposed to obtain a feasible steganographic security measure method. Experiments on spatial image steganography and JPEG image steganography show the effectiveness and accuracy of the proposed measure method. However, the threshold determination of the proposed method needs to be further improved in terms of complexity and rationality. We will further study the relationship between image texture complexity and steganographic security.
The image data supporting this Systematic Review are from previously reported studies and datasets, which have been cited. The processed data are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that they have no conflicts of interest regarding the publication of this paper.
This work was supported by the National Natural Science Foundation under grant 61601517.
J. Fridrich, “Content-adaptive pentary steganography using the multivariate generalized Gaussian cover model,” Proceedings of SPIE - The International Society for Optical Engineering, 2015.
Z. Zhang, F. Qu, G.-J. Liu, J.-W. Wang, Y.-W. Dai, and Z.-Q. Wang, “A novel security evaluation method for digital image steganography based on higher-order markov chain model,” Information and Control, vol. 39, no. 4, pp. 455–461, 2010.
X. Ma and Y. Nie, “Optimized approach of Sobel operator of image edge detection using model-based design,” RISTI: Revista Iberica de Sistemas e Tecnologias de Informacao, vol. e6, pp. 401–412, 2016.
N. Mathur, S. Mathur, and D. Mathur, “A novel approach to improve sobel edge detector,” Procedia Computer Science, pp. 431–438, 2016.