Abstract

Recently, deep learning-based models have been extensively utilized for steganalysis. However, deep learning models suffer from overfitting and hyperparameter tuning issues. Therefore, in this paper, an efficient nondominated sorting genetic algorithm III (NSGA-III)-based densely connected convolutional neural network (DCNN) model is proposed for image steganalysis. NSGA-III is utilized to tune the initial parameters of the DCNN model. It controls the accuracy and f-measure of the DCNN model by utilizing them as the multiobjective fitness function. Extensive experiments are conducted on the STEGRT1 dataset. The proposed model is also compared with competitive steganalysis models. Performance analyses reveal that the proposed model outperforms the existing steganalysis models in terms of various performance metrics.

1. Introduction

With the advancement in Internet technology and communication, a substantial number of images are transferred over public networks. Recently, it has been found that many criminal groups utilize images to transfer dangerous data. Generally, they utilize steganography approaches to hide their harmful contents in the images [1]. Therefore, researchers have started utilizing steganalysis models to recognize the images which contain embedded data. Thus, image steganalysis is an approach for recognizing data embedded in images; it classifies a given image as a stego-embedded image or a normal image [2].

Zhou et al. [3] designed an ensemble learning model (ELM)-based image steganalysis approach. SRNet and RESDET were utilized as base models. Fusion of the base models was then achieved to classify the embedded images. Zhang et al. [4] designed a CNN model by using kernels, and the optimization of convolution kernels was achieved during the preprocessing layer. The minimal convolution kernels were utilized to minimize the initial parameters. Spatial pyramid pooling was also used to integrate the local features. Gowda et al. [5] designed an ensemble color space model (ECSM) to evaluate a weighted activation map. It can extract various features specific to each color space. Levy-flight grey wolf optimization was utilized to minimize the number of features selected in the map.

Boroumand et al. [6] proposed a deep residual model (DRM) to reduce the heuristics and externally enforced elements. This model computes the noise residuals by disabling the pooling to overcome the suppression of the stego signal. Yedroudj et al. [7] designed a truncation activation-based ensemble model (TREM) trained with Rich features. It utilizes a truncation activation function and batch normalization on a scale layer. Ye et al. [8] utilized high-pass filter-based CNN (HCNN) to achieve steganalysis. The weights of the initial layer were computed using a high-pass filter for evaluation of residual maps in a spatial rich model. It was utilized as a regularizer to suppress the image content efficiently. A truncated linear unit was also utilized. Wu et al. [9] utilized CNN and deep residual network for steganalysis. It contains a substantial number of network layers, which are significant for evaluating the complex statistics of images.

Yang et al. [10] designed a thirty-two-layer CNN that integrates all features to enhance both the performance of the features and the gradient flow. The bottleneck layers enhance the feature propagation and minimize CNN parameters dramatically. Li et al. [11] designed a novel CNN model to evaluate embedded artifacts in an efficient manner. Information diversity was also achieved. A parallel subnet module was also designed utilizing numerous filters. Subnets were trained independently to improve computational speed. Zhang et al. [12] designed a novel CNN model to enhance the classification accuracy of spatial-domain steganography. A spatial pyramid pooling was utilized to integrate the local features. Sharma et al. [13] designed an aggregated residual transformation-based CNN model to obtain significant features for steganalysis. This model has limited initial parameters for enhancing the classification rate. The residual skip connections were also utilized.

Liu et al. [14] showed the similarity and dissimilarity between SRM-EC and CNN models. An ensemble model was designed to integrate SRM-EC with CNN by averaging their resultant probabilities. Zeng et al. [15] utilized CNN for a Rich model feature set. A bottom-up strategy was utilized for training the output of each subnetwork to the actual output. Yang et al. [16] designed a max CNN for steganalysis. It allocates significant weights to features learned from the complex texture regions. Yang et al. [17] proposed image steganalysis using a transfer learning model with structure preservation. The discriminant projection matrix was utilized for building the model. Frobenius-norm-based regularization was also utilized to achieve better results. Ren et al. [18] designed an efficient selection channel network and steganalysis model. The steganalysis model combined with the trained selection channels estimates the final steganalysis outcomes.

From the extensive review, it has been observed that deep learning-based models can be utilized for steganalysis [19]. However, deep learning models suffer from overfitting and hyperparameter tuning issues. Therefore, in this paper, an efficient NSGA-III-based densely connected convolutional neural network (DCNN) model is proposed for image steganalysis. This is the principal difference from the existing models available in the literature.

The main contributions of this paper are as follows:
(1) An efficient NSGA-III-based DCNN model is proposed for image steganalysis.
(2) NSGA-III is utilized to tune the initial parameters of the DCNN model.
(3) Accuracy and f-measure performance metrics are used as a multiobjective fitness function.
(4) Extensive experiments are conducted on the STEGRT1 dataset. The proposed model is also compared with competitive steganalysis models.

The remaining paper is organized as follows: Section 2 presents the proposed NSGA-III-based DCNN model for steganalysis. Experimental results and comparative analysis are presented in Section 3. Section 4 concludes the paper.

2. Proposed Model

In this paper, an efficient NSGA-III-based DCNN model is proposed for image steganalysis. The following section discusses the working of DCNN and NSGA-III.

2.1. Densely Connected Convolutional Neural Network

The diagrammatic flow of the DCNN is shown in Figure 1.

Assume a stego/normal image $x_0$, which is assigned to the CNN. The model has $L$ layers, each of which utilizes a nonlinear transformation $H_\ell(\cdot)$, where $\ell$ shows the layer's index [20]. $H_\ell(\cdot)$ shows a set of operators like pooling, rectified linear units (ReLU), convolution (Conv), and batch normalization (BN). $x_\ell$ shows the outcome of the $\ell$th layer. The existing CNN joins the outcome of the $\ell$th layer as an input of the $(\ell+1)$th layer. It achieves the layer transition as $x_\ell = H_\ell(x_{\ell-1})$. ResNets utilize a skip join which avoids the nonlinear transformations utilizing an identity operator such as

$$x_\ell = H_\ell(x_{\ell-1}) + x_{\ell-1}. \qquad (1)$$

ResNets achieve better gradient flow compared to CNN. However, the summation of the identity operator with the output of $H_\ell$ may hinder the data flow in the model.

Therefore, to enhance the data flow, a DenseNet was designed. It contains direct links from a given layer to every subsequent layer. The $\ell$th layer takes the feature maps of all previous layers, $x_0, \ldots, x_{\ell-1}$, as input:

$$x_\ell = H_\ell([x_0, x_1, \ldots, x_{\ell-1}]). \qquad (2)$$

Here, $[x_0, x_1, \ldots, x_{\ell-1}]$ shows the integration (concatenation) of the feature maps obtained from layers $0, \ldots, \ell-1$.

$H_\ell(\cdot)$ is defined as a group operator. It contains BN, ReLU, and a Conv.
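The difference between the plain, residual, and dense connectivity patterns described above can be illustrated with a small sketch. It is only a toy under stated assumptions: each feature map is reduced to a list of scalar channels, and `toy_layer` is a hypothetical stand-in for the BN-ReLU-Conv group operator, not the operator used in the proposed model.

```python
# Toy comparison of plain, residual, and dense connectivity.
# A "feature map" is simplified to a list of scalar channels; real layers
# operate on 2-D maps, which is omitted here for brevity.

def toy_layer(channels, k):
    """Hypothetical stand-in for the group operator (BN-ReLU-Conv):
    returns k channels, each a ReLU of the mean of the input channels."""
    mean = sum(channels) / len(channels)
    return [max(0.0, mean + i) for i in range(k)]


def plain_forward(x0, num_layers, k):
    """Plain CNN: each layer sees only the output of the previous layer."""
    x = x0
    for _ in range(num_layers):
        x = toy_layer(x, k)
    return x


def resnet_forward(x0, num_layers):
    """ResNet: the layer output is summed with its input (identity skip),
    so the channel count has to stay unchanged."""
    x = x0
    for _ in range(num_layers):
        fx = toy_layer(x, len(x))
        x = [a + b for a, b in zip(fx, x)]
    return x


def densenet_forward(x0, num_layers, k):
    """DenseNet: each layer sees the concatenation of all earlier outputs."""
    features = [x0]
    for _ in range(num_layers):
        concatenated = [c for fmap in features for c in fmap]
        features.append(toy_layer(concatenated, k))
    return features


feature_maps = densenet_forward([1.0, 2.0, 3.0], num_layers=3, k=2)
print([len(f) for f in feature_maps])
```

In the dense variant the earlier outputs are kept unchanged and concatenated, whereas the residual variant overwrites them through summation, which is the data-flow concern raised above.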

The integration operator utilized in equation (2) is not applicable if the size of the feature maps varies. However, the downsampling layers of a CNN change the size of the feature maps. To achieve downsampling, the model is divided into various densely connected dense blocks. Layers between the blocks are represented as transition layers. In this paper, the transition layer utilizes BN and 1 × 1 Conv followed by a 2 × 2 average pooling layer. There are no links across dense blocks except the transition layer.

If every $H_\ell$ generates $k$ feature maps, the $\ell$th layer has $k_0 + k \times (\ell - 1)$ input feature maps, where $k_0$ defines the channels of the input layer. The main significance of DenseNet over CNN is that it can have confined (narrow) layers, i.e., a small $k$. $k$ represents the growth rate of the DenseNet. Every layer merges its own $k$ feature maps into the global state. The growth rate regulates how much every layer contributes to the global state. The global state is accessible from every layer; therefore, it does not need to be replicated in every layer.

Every layer will compute $k$ feature maps, but it may have many more inputs. 1 × 1 Conv is utilized as the bottleneck layer prior to every 3 × 3 Conv to minimize the number of input feature maps and enhance the computational speed. This design is efficient for DenseNet, and the DenseNet with such a bottleneck layer, i.e., the BN-ReLU-Conv (1 × 1)-BN-ReLU-Conv (3 × 3) version of $H_\ell$, is referred to as DenseNet-B. In this paper, 1 × 1 Conv provides feature maps.

To enhance the model compactness, the number of feature maps is reduced at the transition layers. If a dense block has $m$ feature maps, then the transition layer computes $\lfloor \theta m \rfloor$ output feature maps. $\theta$, with $0 < \theta \le 1$, is represented as a compression factor. If $\theta = 1$, then the number of feature maps through the transition layer stays constant.
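The channel bookkeeping implied by the growth rate and the compression factor can be sketched as follows. The function names are hypothetical, and the example values (a growth rate of 12, four layers per block, a compression factor of 0.5) are illustrative assumptions rather than the settings tuned in this paper; only the 16 input channels come from the initial Conv described in the text.

```python
import math


def dense_block_channels(in_channels, num_layers, growth_rate):
    """Return the input channel count seen by each layer of one dense block
    and the channel count at the block output.

    Layer i (0-based) receives the block input plus the growth_rate
    channels produced by each of the i preceding layers.
    """
    per_layer_inputs = [in_channels + growth_rate * i for i in range(num_layers)]
    out_channels = in_channels + growth_rate * num_layers
    return per_layer_inputs, out_channels


def transition_channels(channels, compression):
    """Channels after a transition layer with the given compression factor."""
    if not 0.0 < compression <= 1.0:
        raise ValueError("compression factor must lie in (0, 1]")
    return math.floor(compression * channels)


# Illustrative values only (assumptions, see the note above).
inputs, out = dense_block_channels(in_channels=16, num_layers=4, growth_rate=12)
print(inputs, out)                    # per-layer input channels, block output
print(transition_channels(out, 0.5))  # compressed by the transition layer
print(transition_channels(out, 1.0))  # unchanged when the factor equals 1
```

The sketch makes visible why a bottleneck and a compression factor matter: the number of inputs per layer grows linearly with depth inside a block.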

DenseNet contains four dense blocks. Each dense block contains an equal number of layers. Initially, a Conv with 16 output channels is implemented on the input images. For Conv layers having a kernel size of 3 × 3, every side of the inputs is zero-padded to maintain a fixed-size feature map. 1 × 1 Conv is followed by 2 × 2 average pooling between two connecting dense blocks. Finally, a global average pooling is implemented, and a softmax activation function is used. The feature map sizes in the dense blocks are 32 × 32, 16 × 16, and 8 × 8, respectively. The DenseNet with configurations , , and are computed. The size of the input image is 256 × 256. The Conv layer uses a kernel of size 5 × 5 with a stride of 2.

The exact network configurations and other hyperparameters of the DenseNet are tuned using NSGA-III.

2.2. Nondominated Sorting Genetic Algorithm-III

NSGA-III [21] has been extensively utilized to optimize many engineering applications. It achieves good convergence speed, and it does not suffer from the premature convergence issue [22–24].

Table 1 represents the nomenclature of NSGA-III. Algorithm 1 illustrates the generation of an initial population of NSGA-III-based DCNN. Initially, a random population is computed by utilizing the normal distribution. The computed solutions are then mapped to the group of initial parameters of DCNN.

Algorithm 1: Generation of the initial population of the NSGA-III-based DCNN.
Optimal number of layers.
Implement an optimal number of layers based on the DCNN model.
while do
    Consider DCNN with maximum performance
    if then
    else
    end if
end while
select a random group of solutions from using a normal distribution
compute a set of random solutions
return
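The idea of drawing a random population from a normal distribution and mapping each solution to a group of initial DCNN parameters can be sketched as follows. This is a minimal illustration under stated assumptions: the search space (parameter names and ranges), the mean and standard deviation of the distribution, and the function names are all hypothetical and are not taken from the paper.

```python
import random

# Hypothetical search space: the parameter names and ranges below are
# illustrative assumptions, not values reported in this paper.
SEARCH_SPACE = {
    "growth_rate": (8, 32),
    "layers_per_block": (4, 16),
    "compression": (0.3, 1.0),
    "learning_rate": (1e-4, 1e-1),
}
INTEGER_PARAMS = {"growth_rate", "layers_per_block"}


def sample_gene(rng, mean=0.5, std=0.15):
    """Draw one gene from a normal distribution and clip it to [0, 1]."""
    return min(1.0, max(0.0, rng.gauss(mean, std)))


def initial_population(size, seed=None):
    """Return `size` solutions, each a list with one gene per parameter."""
    rng = random.Random(seed)
    return [[sample_gene(rng) for _ in SEARCH_SPACE] for _ in range(size)]


def decode(genes):
    """Map genes in [0, 1] onto the hypothetical DCNN parameter ranges."""
    params = {}
    for gene, (name, (low, high)) in zip(genes, SEARCH_SPACE.items()):
        value = low + gene * (high - low)
        params[name] = int(round(value)) if name in INTEGER_PARAMS else value
    return params


population = initial_population(size=5, seed=0)
candidates = [decode(genes) for genes in population]
print(candidates[0])
```

Keeping the genes in a normalized range and decoding them separately lets the same crossover and mutation operators act on every parameter, regardless of its unit or scale.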

Algorithm 2 demonstrates the proposed NSGA-III-based DCNN model. Initially, the DCNN is trained and tested on a chunk of the steganography dataset by using the random population. The fitness of each solution is then obtained. Dominated and nondominated groups are then evaluated. Thereafter, mutation and crossover operations are used to compute the child solutions. Nondominated sorting is used to sort the obtained nondominated solutions. If the number of fitness evaluations exceeds the maximum allowed, then the tuned parameters of the DCNN are returned. Finally, the NSGA-III-based DCNN is trained on the steganalysis dataset.

Algorithm 2: NSGA-III-based DCNN model.
select randomly solutions from the elite
for all do
    decode as initial parameters of DCNN
    for to do
        compute DCNN with random initial parameters in
        if then
        end if
    end for
    if then
    end if
    for to do
        select randomly an
        if then
        else
        end if
        if then
        end if
    end for
end for
if then
    select solutions computed using NSGA-III
end if
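The nondominated sorting step on the two objectives (accuracy and f-measure, both maximized) can be sketched as follows. This covers only the Pareto ranking; the reference-point-based niching that distinguishes NSGA-III, as well as crossover and mutation, is deliberately omitted. The fitness values are made-up illustrations, not results from the experiments.

```python
def dominates(a, b):
    """True if objective vector `a` Pareto-dominates `b` (all maximized):
    `a` is no worse in every objective and strictly better in at least one."""
    no_worse = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return no_worse and strictly_better


def nondominated_sort(fitness):
    """Split solution indices into successive Pareto fronts.

    fitness: list of (accuracy, f_measure) tuples, one per solution.
    Returns a list of fronts; front 0 holds the nondominated solutions.
    """
    remaining = set(range(len(fitness)))
    fronts = []
    while remaining:
        front = sorted(
            i for i in remaining
            if not any(dominates(fitness[j], fitness[i])
                       for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


# Made-up (accuracy, f-measure) pairs for five candidate DCNN configurations.
fitness = [(0.80, 0.70), (0.85, 0.60), (0.70, 0.50), (0.90, 0.75), (0.60, 0.40)]
print(nondominated_sort(fitness))
```

Solutions 0 and 1 in the toy data end up in the same front because each is better in one objective and worse in the other, which is the situation a multiobjective fitness function is meant to handle.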

3. Performance Analysis

3.1. Dataset

Rezaei et al. [25] designed a reference dataset for image steganalysis. It is called Real version 1 (STEGRT1), and it contains both JPEG and BITMAP images. It has 8000 cover and stego images with different sizes and characteristics. These images were obtained using various steganographic approaches with different payloads and quality factors.

3.2. Experimental Set-Up

The experiments with the proposed and the existing models are conducted on the MATLAB online server with the help of the deep learning toolbox. Additionally, to increase the size of the dataset, the BitMix data augmentation [26] is also implemented. The performance of the proposed model is compared with HCNN [8], TREM [7], CNN [4], ELM [3], ECSM [5], and DRM [6].

3.3. Comparative Analysis

In this section, the comparison between the proposed and the existing CNN-based steganalysis models is presented.

Figure 2 shows the performance analysis of the proposed model. It is found that the best performance is found at epoch 8 and iteration. Therefore, the proposed model converges efficiently with good convergence speed.

Figures 3 and 4 represent the confusion matrices obtained by using the proposed model with and without NSGA-III. It has been found that the majority of the obtained results lie in the true classes (i.e., on the diagonals of the matrices). Therefore, it leads to good performance results such as accuracy, f-measure, precision, recall, and area under the curve (AUC). In Figure 4, every diagonal value corresponds to the correctly classified samples (true positives or true negatives) of the corresponding class. It helps in evaluating the various performance metrics. Assume that the stego-embedded image is the positive class; it means the normal image belongs to the negative class. Overall, the analysis indicates that the proposed model with NSGA-III achieves better performance than the model without NSGA-III.
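The way accuracy, precision, recall, and f-measure follow from a binary confusion matrix, with the stego-embedded image as the positive class, can be sketched as follows. The function name and the counts in the example are hypothetical; the counts are not taken from Figures 3 and 4.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute metrics from the four cells of a binary confusion matrix.

    tp: stego images classified as stego    fp: normal images classified as stego
    fn: stego images classified as normal   tn: normal images classified as normal
    """
    def safe_div(num, den):
        # Avoid division by zero when a class is never predicted or present.
        return num / den if den else 0.0

    accuracy = safe_div(tp + tn, tp + fp + fn + tn)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    f_measure = safe_div(2 * precision * recall, precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Made-up counts for illustration only.
print(binary_metrics(tp=90, fp=10, fn=5, tn=95))
```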

Figures 5 to 9 show the comparative analysis between the existing and the proposed models. In these figures, notched boxplots are shown. The box shows the interquartile range (IQR). The red line shows the median of the computed performance. The notch indicates a confidence interval around the median, whose width depends on the interquartile range divided by the square root of the number of experiments ($n$). Here, we have considered . If the size of a notch is smaller, then the steganalysis model achieves more consistent results. To evaluate the significant improvement or reduction, we select the average computed value of the proposed model and that of the best existing steganalysis model (i.e., the one showing the best average value among the existing models). Thereafter, we evaluate their absolute difference, which gives the mean improvement or reduction. To express it as a percentage, we divide the absolute difference by the maximum possible value and multiply the result by 100.
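The notch interval and the percentage improvement described above can be sketched as follows. Two assumptions are made: the scale factor 1.57 is the constant conventionally used for notched boxplots and is not stated in the text, and the quartiles follow the default method of Python's `statistics.quantiles`, which may differ slightly from other tools. The sample values are made up.

```python
import math
import statistics


def notch_interval(values, scale=1.57):
    """Approximate confidence interval around the median:
    median +/- scale * IQR / sqrt(n).

    scale=1.57 is the conventional notched-boxplot constant (an assumption
    here); quartile conventions vary between software packages.
    """
    n = len(values)
    q1, median, q3 = statistics.quantiles(values, n=4)
    half_width = scale * (q3 - q1) / math.sqrt(n)
    return median - half_width, median + half_width


def improvement_percent(proposed_mean, best_existing_mean, max_value=1.0):
    """Absolute difference of the two averages, divided by the maximum
    possible metric value and expressed as a percentage."""
    return abs(proposed_mean - best_existing_mean) / max_value * 100.0


# Made-up accuracy values from repeated runs, for illustration only.
runs = [0.90, 0.91, 0.92, 0.93, 0.94]
print(notch_interval(runs))
print(improvement_percent(0.95, 0.94))
```

With metrics reported on a 0 to 1 scale the maximum possible value is 1, so the percentage equals the absolute difference multiplied by 100.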

Figure 5 represents the comparison between the existing and proposed steganalysis models in terms of accuracy. It reveals that the proposed model achieves better accuracy than the existing steganalysis models. The proposed model outperforms the existing steganalysis models in terms of accuracy by 1.2643%.

Figure 6 represents the precision analysis of the proposed model and the existing steganalysis models. It is observed that the proposed model achieves more consistent values of precision than the existing models. The proposed model outperforms the existing models by 1.1438%.

Figure 7 demonstrates the recall analysis of the proposed steganalysis model. It is observed that the proposed model outperforms the competitive models in terms of recall values. The proposed model shows an average enhancement in recall values of 1.2832%.

Figure 8 represents the f-measure analysis of the proposed model and the existing steganalysis models. It is observed that the proposed model achieves more consistent values of f-measure than the existing models. The proposed model outperforms the existing models by 1.0245%.

Figure 9 demonstrates the AUC analysis of the proposed steganalysis model. It is observed that the proposed model outperforms the competitive models in terms of AUC values. The proposed model shows an average enhancement in AUC values of 1.2913%.

4. Conclusion

From the extensive review, it has been found that deep learning-based models have been extensively utilized for steganalysis. However, these models suffer from overfitting and hyperparameter tuning issues. Therefore, an NSGA-III-based DCNN model was proposed for image steganalysis. NSGA-III was utilized to optimize the initial parameters of the DCNN model. The accuracy and f-measure were utilized to design a multiobjective fitness function. Extensive experiments were conducted on the STEGRT1 dataset. The proposed model was also compared with competitive steganalysis models. Performance analyses have shown that the proposed model outperforms the existing steganalysis models in terms of accuracy, f-measure, precision, recall, and AUC by 1.2643%, 1.0245%, 1.1438%, 1.2832%, and 1.2913%, respectively. The results show that the proposed model can capture even small changes in image features.

In the near future, one may extend the proposed work by designing a novel deep learning model to enhance the results further. Additionally, one may test the proposed model on other steganography datasets.

Data Availability

No data were used to support this study.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This research work was funded by Institutional Fund Projects under grant no. (IFPRC-027-135–2020). Therefore, the authors gratefully acknowledge technical and financial support from the Ministry of Education and King Abdulaziz University, Jeddah, Saudi Arabia.