Abstract
Least significant bit (LSB) substitution is a method of information hiding. The secret message is embedded into the least significant bits of a cover-image in order to evade the notice of hackers. Security and stego-image quality are the two main limitations of the LSB substitution method. Therefore, some researchers have proposed an LSB substitution matrix to address these two issues. Finding the optimal LSB substitution matrix can be formulated as a combinatorial optimization problem. In this paper, building on earlier work, we adopt a different heuristic method, called enhanced differential evolution (EDE), to construct an optimal LSB substitution matrix. Differing from other researchers, we adopt a measurement based on the human visual system (HVS) as the fitness function and embed the secret by modifying each pixel to the closest suitable value rather than simply substituting the LSBs. Our scheme extracts the secret by modular operations, just as simple LSB substitution does. The experimental results show that the proposed embedding algorithm substantially improves the imperceptibility of stego-images.
1. Introduction
The Internet provides an easy way to exchange information with others. However, information is also prone to eavesdropping by hackers. Several methods can be employed to protect secret information, such as cryptography, steganography, and secret sharing schemes. These methods differ in their basic principles. Cryptography scrambles the content with a key. Without the appropriate key, unauthorized users cannot decode the secret within limited time and resources. Unlike cryptography, steganography, also called information hiding, conceals the secret rather than scrambling it. That is, the secret is covered by innocent-looking information that does not attract the attention of hackers, who therefore pass over it. Owing to its simplicity and efficiency, steganography remains a popular method [1–4].
Among the many steganographic methods, simple least significant bit (LSB) substitution is the most common one [5]. The secret message is decomposed and embedded into the least significant bits of each pixel of the cover-image. The modified cover-image is called a stego-image. The secret message can be extracted by performing a modulo operation on each pixel of the stego-image. This method is very simple and easy to implement; however, the stego-image may not be imperceptible enough when more least significant bits are substituted. Wang et al. proposed the idea of a substitution matrix to improve the quality and security of the stego-image [6]. The substitution matrix can be seen as a mapping function, which maps each secret value into another value. Different substitution matrices represent different mappings and result in different stego-images. Among these stego-images, some are closer to the original cover-image than others. Obviously, the optimal substitution matrix is the one that produces the stego-image closest to the cover-image. Owing to the huge number of possible substitution matrices, Wang et al. utilized genetic algorithms (GA) [7] to find the optimal substitution matrix. According to the representation of chromosomes, GA can be classified into two types: binary GA and real-parameter GA. Binary GA has to encode the original problem into binary chromosomes, and the encoding method may influence the efficiency of problem-solving. However, some problems, such as combinatorial optimization problems, are not easy to encode into binary chromosomes. Furthermore, the chromosomes of such problems may be too long for the problems to be solved efficiently. Although real-parameter GA can encode a problem with shorter chromosomes, its efficiency and the quality of its final solution are not as good as those of binary GA. Therefore, each kind of GA has its limitations.
Later, other researchers adopted different heuristic methods, such as tabu search [8], the ant colony algorithm [9], and cat swarm optimization [10], to construct an optimal substitution matrix. All of these researchers adopted simple LSB substitution to embed the secret. However, even when an optimal substitution matrix is adopted, the modification to the pixels of the cover-image may still be large because of the intrinsic features of simple LSB substitution. Consequently, the improvement in imperceptibility may be limited. Besides, these researchers adopted the peak signal-to-noise ratio (PSNR) as the fitness function to measure how close the stego-image is to the cover-image. The PSNR value reflects the average difference between pixels; however, the difference between pixels does not always correspond to human perception. Generally speaking, human eyes tolerate modifications to textured areas better than modifications to smooth areas [11]. Since images are ultimately viewed by human eyes, it is more suitable to use a measurement based on the human visual system (HVS) to evaluate the imperceptibility of the stego-image.
This paper also proposes a method to construct an optimal substitution matrix. Nevertheless, the proposed scheme differs from similar studies in three respects. First, secrets are embedded by changing the pixel values rather than by substituting the least significant bits directly; extraction, however, remains as easy as in simple LSB substitution and is performed in the same way. Second, another heuristic method, called enhanced differential evolution (EDE) [12], is adopted to search for the optimal substitution matrix. Differential evolution (DE) was first introduced by Storn and Price [13] and copes with problems whose feasible solutions are continuous values. Later, Onwubolu and Babu extended the capability of DE to handle problems with discrete solutions. Third, an HVS-based fitness function, called structural similarity (SSIM) [14], is employed to measure the difference between the stego-image and the cover-image. Therefore, our stego-image is not only numerically close to the cover-image but is also perceived as similar to it by human eyes. The rest of this paper is organized as follows. Section 2 provides preliminary knowledge for our work and reviews related literature. Section 3 explains in detail how an optimal substitution matrix is constructed and how the embedding and extraction methods work. The experimental results and comparisons with other researchers' methods are presented in Section 4. Finally, Section 5 gives some conclusions.
2. Literature Review
2.1. Simple LSB Substitution
The simple LSB substitution is the earliest steganographic technique. The so-called least significant bits are the least important part of a pixel. Therefore, modifying the LSBs of pixels does not change an image much. Embedding and extracting secrets are very simple and easy to implement. Suppose that $p$ denotes a pixel of the cover-image and is expressed as

$$p = q \times 2^{k} + r. \tag{1}$$

That is, $q$ and $r$ are the quotient and the remainder when $p$ is divided by $2^{k}$. Suppose that $s$ is a $k$-bit secret. The secret $s$ can be embedded into $p$ by means of the following:

$$p' = q \times 2^{k} + s. \tag{2}$$

Performing a modulo operation on the stego-pixel $p'$, as shown in (3), can extract the secret $s$:

$$s = p' \bmod 2^{k}. \tag{3}$$

Simply speaking, the secret is embedded by directly substituting the last $k$ bits of $p$ and is retrieved from the last $k$ bits of $p'$. Take a pixel (00100000)_{2} and a 3-bit secret message (111)_{2} as an example. Since the length of the secret message is three, we can substitute the last three bits of the pixel with the secret. Using (1) and (2), we get the stego-pixel (00100111)_{2}. If the secret is very large, it is divided into segments of fixed length, which are distributed evenly over the pixels of the cover-image.
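The embedding rule (2) and the extraction rule (3) can be sketched in a few lines of Python. This is only an illustrative sketch; the function names `lsb_embed` and `lsb_extract` are hypothetical and not taken from the paper. It reproduces the worked example of the pixel (00100000)_{2} and the 3-bit secret (111)_{2}.

```python
def lsb_embed(pixel: int, secret: int, k: int) -> int:
    """Replace the k least significant bits of pixel with secret, as in (2)."""
    if not 0 <= secret < (1 << k):
        raise ValueError("secret must fit in k bits")
    q = pixel >> k              # quotient of pixel divided by 2^k
    return (q << k) + secret    # q * 2^k + secret


def lsb_extract(stego_pixel: int, k: int) -> int:
    """Recover the k-bit secret with a modulo operation, as in (3)."""
    return stego_pixel % (1 << k)


stego = lsb_embed(0b00100000, 0b111, 3)
print(format(stego, "08b"))     # 00100111
print(lsb_extract(stego, 3))    # 7
```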
There are two problems with this method. First, the larger the secret message, the more bits of the cover-image have to be modified. Hence the stego-image may become too different from the cover-image to conceal the secret message inside. Second, the simplicity is a two-edged sword: the receiver can recover the secret easily, and so can hackers. Therefore, the security of this method has to be enhanced.
2.2. Substitution Matrix
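In 2001, Wang et al. introduced the substitution matrix to improve the quality and security of the stego-image [6]. Briefly speaking, a substitution matrix is used to replace each secret value with another value. Wang et al.'s method can be summarized as follows. First, the secret $S$ is divided into segments of $k$-bit length, and then the order of the segments is randomly permuted. Suppose that $S'$ denotes the set of reordered segments of $S$ and that $S' = \{s'_1, s'_2, \ldots, s'_n\}$, where $n$ is the total number of segments of $S$. Let $M$ denote a substitution matrix, and $M = [m_{ij}]$, where $m_{ij} \in \{0, 1\}$, $0 \le i \le 2^{k} - 1$, and $0 \le j \le 2^{k} - 1$. According to $M$, every element $s'_t$ of $S'$ is changed into another value $s''_t$ as shown in

$$s''_t = j \quad \text{if } s'_t = i \text{ and } m_{ij} = 1.$$

Note that there is exactly one "1" in each row and each column. In short, the substitution matrix can be seen as a one-to-one mapping function from $V$ to $V$, where $V = \{0, 1, \ldots, 2^{k} - 1\}$ is the set of all possible integer values of $s'_t$. Then, the mapping result $S''$ is embedded into the cover-image by means of simple LSB substitution. The following serves as an example with $k = 2$:

$$M = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}.$$

The elements $m_{00}$, $m_{12}$, $m_{21}$, and $m_{33}$ of $M$ indicate that the possible values 0, 1, 2, and 3 of $s'_t$ are substituted with 0, 2, 1, and 3, respectively. Therefore, $S'$ is changed into another set $S''$ in which every 1 is replaced by 2 and every 2 by 1, while 0 and 3 are left unchanged.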
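As a minimal sketch of this mapping, the following Python snippet applies the example substitution (0, 1, 2, 3 replaced by 0, 2, 1, 3) to a list of 2-bit segments. The function name `apply_substitution` and the sample segment values are hypothetical and chosen only for illustration.

```python
def apply_substitution(matrix, segments):
    """Map each segment value i to the column j for which matrix[i][j] == 1."""
    mapping = [row.index(1) for row in matrix]   # row i -> substituted value j
    return [mapping[s] for s in segments]


# Substitution matrix of the example: 0 -> 0, 1 -> 2, 2 -> 1, 3 -> 3.
M = [
    [1, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
]

segments = [1, 3, 0, 2, 2, 1]            # hypothetical reordered 2-bit segments
print(apply_substitution(M, segments))   # [2, 3, 0, 1, 1, 2]
```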
Obviously, there are many possible substitution matrices. Different substitution matrices produce different $S''$ and hence different stego-images. Wang et al. defined an optimal substitution matrix as the one that produces a stego-image with maximal peak signal-to-noise ratio (PSNR), where

$$\mathrm{PSNR} = 10 \log_{10} \frac{255^{2}}{\mathrm{MSE}}, \qquad \mathrm{MSE} = \frac{1}{W \times H} \sum_{i=1}^{W} \sum_{j=1}^{H} \left(p_{ij} - p'_{ij}\right)^{2}. \tag{8}$$

In (8), $W$ and $H$ denote the width and height of the cover-image, respectively, and $p_{ij}$ and $p'_{ij}$ denote the pixels of the cover-image and the stego-image, respectively. Essentially, finding an optimal substitution matrix is a combinatorial optimization problem with $(2^{k})!$ possible solutions in total. Moreover, the solution space grows rapidly with $k$. If $k = 4$, for example, the number of possible solutions becomes $16! = 20{,}922{,}789{,}888{,}000$. When solving optimization problems with a large solution space, heuristic algorithms are generally more practical than deterministic algorithms. Therefore, Wang et al. utilized a genetic algorithm (GA) to find a near-optimal substitution matrix.
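The PSNR computation and the size of the solution space can be checked with a short Python sketch. It assumes the standard PSNR definition for 8-bit gray-level images (peak value 255, mean squared error over all pixels); the function name `psnr` and the tiny 2 × 2 sample images are hypothetical.

```python
import math


def psnr(cover, stego):
    """PSNR in dB between two equal-sized 8-bit gray-level images (lists of rows)."""
    height, width = len(cover), len(cover[0])
    mse = sum(
        (c - s) ** 2
        for row_c, row_s in zip(cover, stego)
        for c, s in zip(row_c, row_s)
    ) / (width * height)
    if mse == 0:
        return math.inf          # identical images
    return 10 * math.log10(255 ** 2 / mse)


cover = [[60, 61], [62, 63]]     # hypothetical cover-image
stego = [[59, 61], [63, 63]]     # hypothetical stego-image
print(round(psnr(cover, stego), 2))

# Number of candidate substitution matrices for k = 4, i.e. (2^4)! = 16!
print(math.factorial(2 ** 4))    # 20922789888000
```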
2.3. Optimal LSB-Based Steganography
Some other researchers have applied Wang et al.'s substitution matrix to improve the quality of stego-images. The main difference is that they adopt different optimization algorithms, especially bio-inspired algorithms. In 1992, Dorigo proposed the ant colony optimization (ACO) algorithm [15] in his Ph.D. thesis, which is well suited to combinatorial optimization problems. Since finding an optimal substitution matrix is such a problem, Hsu and Tu [9] adopted ACO to find an optimal substitution matrix. In 2007, Chu and Tsai introduced the cat swarm optimization (CSO) algorithm [16], which is derived from the behavior of cats. Valuing its performance in finding globally best solutions, Wang et al. [10] revised CSO to generate an optimal substitution matrix. When using bio-inspired algorithms, one needs to provide a fitness function to evaluate a solution so that the algorithm can guide the virtual creatures, such as cats or ants, toward the optimal solution. Hsu and Tu and Wang et al. utilized the pixel difference between the cover-image and the stego-image as the fitness of a solution.
Some researchers adopt different embedding strategies to keep the distortion of the cover-image as small as possible. Xu et al. [17] adopted Mielikainen's pairwise LSB matching method [18] and changed the matching order between the secret bits and cover pixels to decrease the distortion of the cover-image. They designed a three-tiered score system to evaluate the performance of a matching order and utilized immune programming to find the best matching order. Considering that the difference measured in pixels is not necessarily the same as that perceived by human eyes, a few researchers take the human visual system into consideration. Lee and Tsai [19] determined the number of bits used to carry the secret in a pixel according to the principle of just noticeable difference (JND). Further, they utilized dynamic programming to divide the secret data into segments so as to minimize the modification of the cover-image when embedding secret data. Instead of using every pixel of a block, Bedi et al. [20] chose a subset of the pixels in a block to carry secret data. In view of the image quality, the choice is not made at random. For each block of pixels, they utilized the particle swarm optimization (PSO) algorithm [21] to determine the best pixels in which to embed secret data sequentially. The distortion between the cover-image and the stego-image is measured with a quality index based on the human visual system. Since the pixels used to embed secret data vary from block to block, the pixel positions have to be recorded as the key in order to extract the secret data successfully. If the size of the cover-image is $W \times H$ pixels, the minimal required space for the key is $W \times H$ bits. Obviously, the required space grows with the size of the cover-image. Another concern about Bedi et al.'s scheme is its hiding capacity. Not all of the pixels in a block can be used to embed secret data; otherwise, the pixel selection in Bedi et al.'s scheme becomes meaningless. In fact, in their experiments, only eight pixels of a block are used to embed secret data. The highest possible payload is only 0.5 bits per pixel if the last four bits of a pixel are used to embed data.
2.4. HVS-Based Measurement
Human eyes are complex biological organs. The way human eyes perceive the difference between two images is not the same as that measured by PSNR. Sometimes, a difference that is perceptible to human eyes does not correspond to a large difference between pixels. After some observations, Barni and Bartolini [11] listed the following three rules of thumb.
(1) Disturbances are much less visible on highly textured regions than on smooth areas.
(2) Contours are more sensitive to noise addition than highly textured regions but less so than flat areas.
(3) Disturbances are less visible over dark and bright regions.
Based on the characteristics of the human visual system (HVS), some researchers proposed different methods to evaluate image quality or to estimate the acceptable change to an image [22–26].
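Combining the three components of luminance, contrast, and structure, Wang et al. proposed a structural similarity (SSIM) index [14] to measure the similarity between two images in light of the HVS. Suppose that $x$ and $y$ denote two gray-level images. The luminance comparison function is

$$l(x, y) = \frac{2\mu_x \mu_y + C_1}{\mu_x^{2} + \mu_y^{2} + C_1}, \tag{9}$$

where $\mu_x$ and $\mu_y$ denote the average pixel values of images $x$ and $y$, respectively, and $C_1 = (K_1 L)^{2}$, where $L$ is the dynamic range of pixel values and $K_1 \ll 1$. The contrast comparison function is

$$c(x, y) = \frac{2\sigma_x \sigma_y + C_2}{\sigma_x^{2} + \sigma_y^{2} + C_2}, \tag{10}$$

where $\sigma_x$ and $\sigma_y$ denote the standard deviations of the pixel values of images $x$ and $y$, respectively, and $C_2 = (K_2 L)^{2}$, where $K_2 \ll 1$. The structure comparison function is

$$s(x, y) = \frac{\sigma_{xy} + C_3}{\sigma_x \sigma_y + C_3}, \tag{11}$$

where $\sigma_{xy}$ is the covariance of the pixel values of images $x$ and $y$. Combining (9), (10), and (11), we get the following SSIM index:

$$\mathrm{SSIM}(x, y) = [l(x, y)]^{\alpha} \cdot [c(x, y)]^{\beta} \cdot [s(x, y)]^{\gamma}. \tag{12}$$

For simplicity, Wang et al. set $\alpha = \beta = \gamma = 1$ and $C_3 = C_2 / 2$. Consequently, SSIM can be transformed into the following specific form:

$$\mathrm{SSIM}(x, y) = \frac{(2\mu_x \mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^{2} + \mu_y^{2} + C_1)(\sigma_x^{2} + \sigma_y^{2} + C_2)}. \tag{13}$$

The SSIM metric is calculated on various windows of the image, and hence we can use the following mean SSIM (MSSIM) index to evaluate the overall image quality:

$$\mathrm{MSSIM}(x, y) = \frac{1}{N_w} \sum_{j=1}^{N_w} \mathrm{SSIM}(x_j, y_j), \tag{14}$$

where $x_j$ and $y_j$ are the image contents in the $j$th window and $N_w$ is the number of windows.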
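The simplified SSIM form and its mean over windows can be sketched in plain Python as follows. This is a sketch under stated assumptions, not the reference implementation: each window is passed as a flat list of pixel values, statistics are unweighted population statistics (the original SSIM work uses a Gaussian-weighted window), and the constants K1 = 0.01, K2 = 0.03, and L = 255 are the commonly used defaults rather than values given in this paper. The names `ssim_window` and `mssim` are hypothetical.

```python
def ssim_window(x, y, L=255, K1=0.01, K2=0.03):
    """Simplified SSIM of two equal-length windows given as flat pixel lists."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    c1 = (K1 * L) ** 2
    c2 = (K2 * L) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def mssim(windows_x, windows_y):
    """Mean SSIM over corresponding windows of two images."""
    scores = [ssim_window(x, y) for x, y in zip(windows_x, windows_y)]
    return sum(scores) / len(scores)


cover_windows = [[60, 61, 62, 63], [10, 200, 30, 180]]   # hypothetical windows
stego_windows = [[59, 61, 63, 63], [11, 199, 31, 179]]
print(mssim(cover_windows, stego_windows))
```

A value of 1 is obtained only when the two images are identical in every window; lower values indicate greater perceived distortion.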
2.5. Enhanced Differential Evolution
Differential evolution (DE) was first introduced by Storn and Price [13]. DE is a population-based optimization method, and candidate solutions are represented as vectors. For each individual (called a target vector) in the current population, an offspring (called a trial vector) is generated by adding a scaled, random vector difference to a randomly selected population vector. The trial vector competes with its corresponding target vector on fitness, and the winner survives to the next generation. Although simple, DE performs well on a wide variety of test problems [12, 13, 27, 28]. Figure 1(a) is the flowchart of DE. DE was originally devised for solving continuous-space optimization problems. Later, some researchers modified DE to attack permutative-based combinatorial optimization problems, that is, problems whose candidate solutions are permutations of a sequence of integers. Among these modifications, Onwubolu and Babu's approach, called enhanced differential evolution (EDE), is intuitive and easy to implement [12]. The main idea of this approach is to transform the permutative population into a continuous population. The forward transformation given in [12] maps each discrete parameter $x$ of a vector to a continuous value by means of a scaling factor $\alpha$. After being transformed into continuous form, the population can be handled by the canonical DE strategy to generate the child population. However, the individuals of the child population consist of continuous values and cannot be evaluated by the fitness function. Therefore, they have to be transformed back into discrete solutions by the corresponding backward transformation in [12], which uses a function that rounds a real value to the nearest integer. The backward transformation may produce infeasible solutions, so the offspring population has to be repaired. To generate better offspring, Onwubolu and Babu proposed two improvement strategies for the repaired offspring: one is swap mutation and the other is insertion mutation. The final offspring then compete with the parents. Figure 1(b) shows the flowchart of EDE.
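To make the two improvement strategies more concrete, the following Python sketch shows the generic textbook forms of swap mutation and insertion mutation on a permutation. It is an assumption that these generic operators match the ones used in [12] in every detail; the function names and the seeded `random.Random` generator are hypothetical choices for illustration.

```python
import random


def swap_mutation(perm, rng):
    """Return a copy of perm with two randomly chosen positions exchanged."""
    child = list(perm)
    i, j = rng.sample(range(len(child)), 2)   # two distinct positions
    child[i], child[j] = child[j], child[i]
    return child


def insertion_mutation(perm, rng):
    """Return a copy of perm with one element removed and reinserted elsewhere."""
    child = list(perm)
    value = child.pop(rng.randrange(len(child)))
    child.insert(rng.randrange(len(child) + 1), value)
    return child


rng = random.Random(2024)        # fixed seed only for reproducibility
parent = [0, 2, 1, 3]
print(swap_mutation(parent, rng))
print(insertion_mutation(parent, rng))
```

Both operators keep the result a valid permutation, so the mutated offspring never need to be repaired again.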
Figure 1: Flowcharts of (a) DE and (b) EDE.
3. The Proposed Method
With the help of a substitution matrix, imperceptibility can be improved, but the embedding method of simple LSB substitution may limit the improvement. Several studies have been devoted to constructing an optimal substitution matrix. Some utilized deterministic algorithms [29, 30], while others utilized heuristic algorithms [6, 8–10] to search for the optimal matrix. Whatever algorithm they employed, they adopted PSNR as the objective function (also called the fitness function in heuristic algorithms) to guide the search. As mentioned above, PSNR measures the absolute difference of pixel values, not the difference perceived by human eyes. Therefore, we adopt MSSIM (14) as the fitness function. The heuristic algorithm we use to search for the optimal substitution matrix is EDE. In addition, to overcome the intrinsic limitation of simple LSB substitution, we design a Mod-Embedding algorithm, which makes the stego-image as close to the cover-image as possible.
Before moving on to the main task, it is helpful to give an overview of our scheme. Figure 2 is the flowchart of the proposed scheme. In the embedding process, a secret $S$ is split into $n$ segments, each of which is of $k$-bit length. Then, the segments are randomly permuted using a pseudorandom number generator seeded by a key, which yields the set of reordered segments $S'$. According to $S'$ and the cover-image $C$, EDE constructs a near-optimal substitution matrix $M$. With $M$, $S'$ is transformed into $S''$ and embedded into $C$. Finally, we get the stego-image $C'$ with the secret inside. In the extraction process, $S''$ is extracted from the stego-image $C'$ and is reversely transformed into $S'$ with the same substitution matrix $M$. Using a pseudorandom number generator seeded by the same key, we can rearrange the elements of $S'$ into the original order of the segments of $S$. Finally, combining these segments, we recover the secret $S$.
Figure 2: Flowchart of the proposed scheme: (a) the embedding process; (b) the extraction process.
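The segmentation and key-driven permutation steps of this overview, together with their inverse on the receiver side, can be sketched as follows. This is a minimal sketch under assumptions: the secret is given as a bit string, Python's `random.Random` stands in for the pseudorandom number generator (it is not cryptographically secure), and all function names are hypothetical.

```python
import random


def split_into_segments(bits: str, k: int):
    """Split a bit string into k-bit segments and return them as integers."""
    if len(bits) % k:
        raise ValueError("bit string length must be a multiple of k")
    return [int(bits[i:i + k], 2) for i in range(0, len(bits), k)]


def permute(segments, key):
    """Reorder the segments with a pseudorandom permutation derived from key."""
    order = list(range(len(segments)))
    random.Random(key).shuffle(order)
    return [segments[i] for i in order]


def unpermute(permuted, key):
    """Regenerate the same permutation from key and undo it."""
    order = list(range(len(permuted)))
    random.Random(key).shuffle(order)
    original = [0] * len(permuted)
    for position, source in enumerate(order):
        original[source] = permuted[position]
    return original


segments = split_into_segments("1101001110", 2)
print(segments)                                   # [3, 1, 0, 3, 2]
shuffled = permute(segments, key=42)
print(unpermute(shuffled, key=42) == segments)    # True
```

Because sender and receiver derive the permutation from the same key, only the key needs to be shared, not the permutation itself.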
With the overview in mind, we can now look deeper into the details of the proposed scheme.
3.1. EDE Subroutine
In this paper, EDE is employed to construct a near-optimal substitution matrix. Since initialization and selection are the problem-dependent parts of EDE, we concentrate on these two parts.
Initialization. EDE is a population-based evolutionary algorithm. Therefore, a population size $N_P$ has to be predefined. Initially, users have to randomly generate a set of $N_P$ distinct candidate solutions. Starting from the initial population, EDE generates offspring and evolves continuously to find the optimal solution until the termination condition is satisfied. In order to use EDE, we first need to represent solutions as vectors. In the proposed scheme, a solution is a substitution matrix. More precisely, as mentioned before, it is a one-to-one mapping from the set $V$ to itself, where $V = \{0, 1, \ldots, 2^{k} - 1\}$. Consequently, a substitution matrix can be represented simply as a permutation of the integers 0 to $2^{k} - 1$. We now explain how a substitution matrix is encoded into a vector in EDE. Suppose that the substitution matrix is $M = [m_{ij}]$, where $m_{ij} \in \{0, 1\}$, $0 \le i \le 2^{k} - 1$, and $0 \le j \le 2^{k} - 1$. Then, the corresponding vector is $v = (v_0, v_1, \ldots, v_{2^{k} - 1})$, where $v_i = j$ if $m_{ij} = 1$. For example, the substitution matrix of Section 2.2, which substitutes the values 0, 1, 2, and 3 with 0, 2, 1, and 3, respectively, corresponds to the vector $v = (0, 2, 1, 3)$.
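The encoding of a substitution matrix as a permutation vector, and the reverse decoding, can be sketched as below. The function names are hypothetical, and the matrix used is the one that substitutes 0, 1, 2, and 3 with 0, 2, 1, and 3.

```python
def matrix_to_vector(matrix):
    """Encode a substitution matrix as a vector v with v[i] = j where matrix[i][j] == 1."""
    vector = []
    for row in matrix:
        if sum(row) != 1:
            raise ValueError("each row must contain exactly one 1")
        vector.append(row.index(1))
    if sorted(vector) != list(range(len(matrix))):
        raise ValueError("each column must contain exactly one 1")
    return vector


def vector_to_matrix(vector):
    """Decode a permutation vector back into a 0/1 substitution matrix."""
    n = len(vector)
    return [[1 if vector[i] == j else 0 for j in range(n)] for i in range(n)]


M = [
    [1, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
]
v = matrix_to_vector(M)
print(v)                          # [0, 2, 1, 3]
print(vector_to_matrix(v) == M)   # True
```

Working with the vector form keeps each individual at length $2^{k}$ instead of $2^{k} \times 2^{k}$ entries and makes the permutation constraint easy to check.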
Selection. In the selection phase, each individual (i.e., vector) of the current population has to compete with its offspring. The competition is based on their quality; hence users have to provide a fitness function to score vectors. Here we adopt (14) as our fitness function. Figure 3 illustrates the detailed process of computing the fitness of a vector $v$. First, $S'$ is transformed into $S''$ using the substitution matrix corresponding to the vector $v$. Next, the transformed result $S''$ is embedded into the cover-image $C$ to get the stego-image $C'$. The fitness of the vector $v$ is then $\mathrm{MSSIM}(C, C')$.
3.2. Embedding
Before explaining the proposed embedding method, let us consider the following example. Suppose that a cover-pixel is 60 and a 2-bit secret is 3. According to (1), 60 is expressed as $60 = 15 \times 2^{2} + 0$. In simple LSB substitution, the secret directly replaces the remainder, which results in the stego-pixel $15 \times 2^{2} + 3 = 63$. Performing $63 \bmod 2^{2}$, we can extract the secret. Now consider another situation. If we change the cover-pixel to 59 instead of 63, we can still extract the secret by performing the same modulo operation (i.e., $59 \bmod 2^{2}$) because $59 = 14 \times 2^{2} + 3$. It is clear that 59 is closer to the cover-pixel 60 than 63 is. This example shows that we can evaluate (2) with the three quotients $q - 1$, $q$, and $q + 1$ to see which one results in the stego-pixel closest to the cover-pixel. The complete embedding algorithm is given as Algorithm 1.
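The idea of testing the three neighbouring quotients can be sketched in Python as follows. This is a hedged reconstruction of the idea described in this subsection, not the paper's Algorithm 1 verbatim; the function name `mod_embed`, the 8-bit range limit `max_value`, and the tie-breaking rule (the smaller candidate wins) are assumptions.

```python
def mod_embed(pixel: int, secret: int, k: int, max_value: int = 255) -> int:
    """Return the valid stego-pixel closest to pixel whose value mod 2^k equals secret."""
    base = 1 << k
    q = pixel // base
    candidates = [
        c * base + secret
        for c in (q - 1, q, q + 1)               # the three quotients to test
        if 0 <= c * base + secret <= max_value   # stay inside the pixel range
    ]
    return min(candidates, key=lambda p: abs(p - pixel))


stego = mod_embed(60, 3, 2)
print(stego)            # 59
print(stego % 2 ** 2)   # 3
```

The candidate built from the unchanged quotient is exactly what simple LSB substitution would produce, so the distortion of this sketch is never larger than that of simple LSB substitution.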

3.3. Extraction
Extracting the secret is very simple. Algorithm 2 illustrates the extraction algorithm. Each $k$-bit secret $s''_t$ is extracted and concatenated to form the whole secret $S''$. However, $S''$ is not the original secret. We have to convert each $k$-bit value of $S''$ back to the original value according to the substitution matrix $M$.
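The two extraction steps, the modulo operation followed by the inverse substitution, can be sketched as below. This is an illustrative sketch rather than the paper's Algorithm 2; the substitution is given in vector form (value `i` is substituted with `vector[i]`), and the function names and sample pixels are hypothetical.

```python
def extract_segments(stego_pixels, k):
    """Read the k-bit value carried by each stego-pixel with a modulo operation."""
    return [p % (1 << k) for p in stego_pixels]


def invert_substitution(vector, values):
    """Undo a substitution given in vector form (value i was replaced by vector[i])."""
    inverse = [0] * len(vector)
    for original, substituted in enumerate(vector):
        inverse[substituted] = original
    return [inverse[v] for v in values]


vector = [0, 2, 1, 3]            # 0 -> 0, 1 -> 2, 2 -> 1, 3 -> 3
stego_pixels = [59, 61, 62]      # hypothetical stego-pixels, k = 2
extracted = extract_segments(stego_pixels, 2)
print(extracted)                                  # [3, 1, 2]
print(invert_substitution(vector, extracted))     # [3, 2, 1]
```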

4. Experimental Results and Discussions
This section presents some experimental results of the proposed method. In addition, the proposed scheme is compared with several simulated experiments. The experiments in this section were carried out on a PC with an Intel Core 2 Duo CPU at 2.8 GHz, 4 GB RAM, the Windows 7 Professional operating system, NetBeans IDE, and JDK 6. Before turning to a closer examination of the experimental results, we outline our assumptions here.
(1) The cover-image is a gray-level, uncompressed image.
(2) The stego-image is not modified by any form of signal processing.
(3) The key is kept secret.
(4) The size of a secret image is $(W/2) \times H$ pixels, where $W$ and $H$ are the width and the height of the cover-image, respectively.
Having clarified the assumptions, we may now go into details about our experiments. Here we have three distinct types of simulations as follows.
(a) Experiment I. The secret was embedded and extracted by means of simple LSB substitution.
(b) Experiment II. The secret was transformed with an optimal substitution matrix and then embedded and extracted by means of simple LSB substitution. The optimal substitution matrix was constructed by EDE with PSNR as the fitness function.
(c) Experiment III. The secret was transformed with an optimal substitution matrix and then embedded and extracted by means of simple LSB substitution. The optimal substitution matrix was constructed by EDE with MSSIM as the fitness function.
As regards the proposed scheme, we transformed the secret with an optimal substitution matrix and then embedded it by the proposed Mod-Embedding algorithm. The way to extract the secret is the same as that of simple LSB substitution. The optimal substitution matrix was constructed by EDE with MSSIM as the fitness function. For clarity, Table 1 summarizes the similarities and differences between the proposed scheme and the above simulations.
Figure 4(a) is our secret image of 256 × 512 pixels, and Figures 4(b) to 4(f) are our cover-images of 512 × 512 pixels. The secret image is embedded into the four least significant bits of the pixels of the cover-image (i.e., $k = 4$). The window size of MSSIM is 11 × 11. Table 2 lists the parameters of EDE, and Table 3 lists the PSNR and MSSIM values of the stego-images of the three simulated experiments and our method. To summarize, these values are plotted as bar charts in Figure 5.
Figure 4: (a) Secret image; cover-images (b) Sailboat, (c) Boat, (d) Pepper, (e) Bridge, and (f) Gold.
Figure 5: Bar charts of the (a) PSNR and (b) MSSIM values of the stego-images.
Several observations from these experimental results are discussed as follows.
(1) The proposed scheme outperforms simple LSB substitution (i.e., Experiment I) in both PSNR and MSSIM.
(2) As mentioned before, some researchers use heuristic algorithms to construct a near-optimal substitution matrix. Because we were not able to acquire their experimental results, we use Experiment II to simulate their methods for comparison. The MSSIM values indicate that our stego-images are more visually imperceptible than those of other researchers who adopt PSNR as the fitness function. The PSNR values indicate that the improvement in the absolute difference between the cover-image and the stego-image is limited if the embedding method is simple LSB substitution. Therefore, the proposed embedding algorithm indeed overcomes this limitation.
(3) We wondered what the result would be if other researchers changed their fitness function from PSNR to MSSIM. Therefore, we use Experiment III to simulate that situation. The MSSIM values indicate that our method performs better. We may, therefore, reasonably conclude that simple LSB substitution also limits the improvement in visual imperceptibility.
As a whole, the merits of our work are summarized as follows.
(1) The extra space for the substitution matrix is small. It depends on the number of bits used to carry the secret, that is, the parameter $k$ in our scheme. If $k = 4$, the required space is $2^{4} \times 2^{4} = 256$ bits. In Bedi et al.'s scheme [20], the extra space depends on the size of the cover-image. If the size of the cover-image is 512 × 512 pixels, the required space is 512 × 512 bits, which is 1024 times that of our scheme.
(2) The payload is high, yet the image quality is not degraded much. Generally speaking, at most the last four bits of a pixel can be modified; otherwise, the image quality is unacceptable. Hence the highest possible payload of a steganographic scheme is four bits per pixel. The experimental results show that our scheme achieves this highest payload, which is eight times that of Bedi et al.'s scheme. In addition, the average MSSIM of our scheme in the experiments is 0.9183, while that of Bedi et al.'s scheme is 0.9124 according to their experimental results.
(3) We consider image quality at the pixel level and at the visual level simultaneously. The previous studies on the optimal substitution matrix consider image quality only at the pixel level [6, 9, 10]. Our scheme takes the human visual system into account and adopts MSSIM as the fitness function. Besides, we design the embedding algorithm so that the difference between the cover-image and the stego-image at the pixel level is as small as possible. Though Bedi et al. also adopt MSSIM as their fitness function, the space required for their key is too large.
(4) Our extraction method is as simple as that of simple LSB substitution. One of the merits of simple LSB substitution is its simple way of extracting the secret, that is, the modular operation. Like simple LSB substitution, we extract the secret through the modular operation only.
5. Conclusions
As mentioned in Section 2.2, the number of possible solutions becomes 20,922,789,888,000 when $k = 4$. In this paper, we adopt EDE to construct a near-optimal substitution matrix. The experimental results show that EDE can construct a good substitution matrix within a few iterations. Considering the features of human eyes, we adopt an HVS-based measurement, MSSIM, instead of PSNR as the fitness function. The experimental results show that adopting MSSIM as the fitness function indeed improves visual imperceptibility. Besides, the proposed embedding algorithm improves the stego-image quality considerably; at the same time, extraction is as simple as in the traditional LSB substitution method. Many researchers have utilized different methods to solve the problem of constructing an optimal substitution matrix, so we believe that this is an interesting problem. So far as we know, no one has previously attempted to apply discrete DE to this problem. Therefore, this paper provides an efficient method to construct a substitution matrix and successfully extends the applications of the DE algorithm.
In future work, we intend to address the issue of steganalysis [31]. We will design a sophisticated embedding strategy against statistical steganalysis. In addition, we may compare the results obtained from different bio-inspired algorithms.
Acknowledgment
This work was supported in part by a Grant from the National Science Council of the Republic of China under Project NSC 1022221E034011.