This study proposes a novel framework to improve intrusion detection system (IDS) performance based on the data collected from the Internet of things (IoT) environments. The developed framework relies on deep learning and metaheuristic (MH) optimization algorithms to perform feature extraction and selection. A simple yet effective convolutional neural network (CNN) is implemented as the core feature extractor of the framework to learn better and more relevant representations of the input data in a lower-dimensional space. A new feature selection mechanism is proposed based on a recently developed MH method, called Reptile Search Algorithm (RSA), which is inspired by the hunting behaviors of the crocodiles. The RSA boosts the IDS system performance by selecting only the most important features (an optimal subset of features) from the extracted features using the CNN model. Several datasets, including KDDCup-99, NSL-KDD, CICIDS-2017, and BoT-IoT, were used to assess the IDS system performance. The proposed framework achieved competitive performance in classification metrics compared to other well-known optimization methods applied for feature selection problems.

1. Introduction

The Internet of Things (IoT) has evolved rapidly over the last few years, enabling communication and interaction among many devices over a network and thereby driving new business-process technology [1]. At the same time, the exponential growth of cybersecurity attacks has raised challenges in several areas, including finance, credibility, enforcement, and business operations [2]. Cloud computing is commonly used for IoT data storage; it is formulated as a model that supplies various resources and services to the customer on demand. Typically, cloud computing minimizes the human intervention between users and providers [3]. Due to its impressive features, it has received serious attention from organizations and users. However, moving from a current platform to a cloud computing platform raises several issues related to operation mechanisms and security. The vulnerability of cloud computing stems from the valuable data stored remotely on servers. This makes it a target for many cybercriminals and intruders and therefore discourages many people from adopting the cloud computing platform. There are several reasons why cyberattacks are growing substantially. One of the main reasons is the availability of easy-to-use hacking tools, which allow inexperienced attackers to quickly attack cloud storage without advanced skills or specific knowledge [4–8].

In the last decades, a wide range of research communities has paid considerable attention to issues in the cyberattack domain, such as intrusion detection systems (IDSs) [9]. Various machine learning (ML) algorithms have been used to address these issues, including the decision tree (DT) algorithm [10, 11], support vector machine (SVM) models [12, 13], k-means [14, 15], k-nearest neighbor (kNN) [16, 17], and many other machine learning algorithms [18–20]. More recently, many deep neural network solutions have been applied to IDSs in fog, cloud, and other IoT-based systems, notably the convolutional neural network (CNN) model [21], the deep recurrent neural network (RNN) model [22], restricted Boltzmann machines (RBMs) [23], the multilayered perceptron neural network [24], and many others [25].

The IDS is modeled as a feature selection problem and has been successfully addressed by various traditional classifiers. With the rise of metaheuristic (MH) optimization algorithms, these methods have been used to tackle a wide range of complex optimization problems. MH algorithms applied to IDSs include the particle swarm optimization (PSO) algorithm [26], crow search algorithm (CSA) [27], genetic algorithm [28–30], random harmony search algorithm [31], and grey wolf optimizer (GWO) algorithm [32, 33].

In this article, we propose a novel IDS model that combines deep learning (DL) with metaheuristic optimization. The features are first extracted by a convolutional neural network (CNN) model composed of consecutive convolution blocks designed to extract informative features. The CNN is used only in the feature extraction phase, where it produces meaningful features that represent the raw data in a lower-dimensional space. In addition, CNNs are well known for their ability to learn complex features with relatively simple architectures and fast training. Following the convolution blocks, fully connected layers are built to extract relevant features and detect malicious or intruder activities. Thereafter, a new version of the reptile search algorithm (RSA) [34] is proposed as a feature selection tool to improve the classification results of the IDS. The RSA is chosen because it is a recent and efficient algorithm with several attractive properties: it has few parameters to initialize, it does not require derivative information in the initial search, it is simple and easy to use, it is scalable and admissible, and it is sound and complete. It has been tested against several benchmark functions and engineering problems [34]. The RSA has also been used to improve a neuro-fuzzy inference system for predicting the swelling potential of fine-grained soils [35]. Although the RSA has several advantages, its performance, as with other MH algorithms, can be affected by the problem size and complexity. In particular, the RSA can suffer from premature convergence due to a lack of balance between exploration and exploitation during the search. Therefore, the problem-specific knowledge embedded in the search space should be considered, and a suitable adjustment to the RSA optimization structure should be adopted.

The proposed model starts by preparing the IoT dataset for feature extraction. The feature extractor is a CNN model trained on the preprocessed dataset. The outputs of the CNN model, which are the extracted features, are filtered, and the most relevant features are selected by the RSA. To evaluate the proposed model, four public datasets were used: KDDCup-99, NSL-KDD, Industrial IoT (IIoT) traffic data (BoT-IoT), and CICIDS-2017. Furthermore, the results of the proposed RSA-based model are compared against seven other well-established algorithms. The comparative results demonstrate the viability of the developed model, which performs competitively on all datasets.

The main objective of this study is to propose a novel and efficient IDS model that exploits the strengths of deep learning and MH algorithms. To achieve this objective, the article makes the following contributions:
(i) Design a CNN model as a feature extractor with the goal of extracting the features from the mentioned IoT datasets.
(ii) Propose an adapted version of the RSA as a feature selection technique for selecting the most relevant and informative features.
(iii) Assess the model by comparing its results against seven state-of-the-art models over four well-known public datasets.

The remaining parts of this article are organized as follows: Section 2 reviews the related research works on IDS models. Section 3 elaborates on the fundamentals of the RSA. The proposed IoT security model is presented in Section 4. The results and discussion are given in Section 5. Finally, Section 6 concludes the article and recommends possible future works.

2. Related Works

This section summarizes previous IDS works that utilize metaheuristic algorithms. The deep learning model and swarm intelligence approaches are combined by Saljoughi et al. [36] to address the IDS scheme for cloud computing. The authors used multilayer perceptron (MLP) neural networks as a feature extractor and the particle swarm optimization (PSO) as a feature selection method. Two datasets are used for evaluation purposes: KDD-CUP and NSL-KDD. In the experimental validation, their proposed method yielded a significant performance in detecting intruders and cyberattacks.

Also, in [37], denial-of-service (DoS) attack detection in cloud computing is tackled using an enhanced version of the artificial bee colony metaheuristic, which is utilized for boosting the classifier’s performance. The developed system achieves a 72.4% average detection rate when compared to QPSO. In [38], Dash suggests two IDS methods that combine artificial neural networks (ANNs) with metaheuristic algorithms. The first method utilizes the gravitational search (GS) algorithm, while the second combines GS with PSO. The two methods (GS and GS-PSO) are used as trainers for the ANN. Their performance is validated through comparative evaluation against several well-established algorithms such as gradient descent, PSO, and GA.

The literature indicates the significant use of various metaheuristics in line with machine learning classifiers for security protection applications, where the metaheuristic algorithms will be utilized as feature selection optimizers and the classifiers as improper action detectors. For instance, the authors of [39] reported significant outcomes for KDD-CUP 99 datasets, where an intrusion detection system is composed of a genetic algorithm and fuzzy support vector machine (SVM). Similarly, Nazir and Khan built a Tabu Search Random Forest (TS-RF), which is a strong intrusion detection system (IDS) in [40], such that the TS algorithm was integrated with the RF classifier. The performance of the system was tested using the UNSW-NB15 dataset, where the results revealed an improvement in the classification accuracy compared to several other methods.

In addition, an improved intrusion detection system was proposed by Mayuranathan et al. in [31], where the feature selection mechanism was optimized by applying the random harmony search algorithm (RHS) and the distributed DoS (DDoS) detection was performed by implementing the restricted Boltzmann machine classifier. The system was tested utilizing the KDD’99 datasets, and the results denoted a considerable detection performance.

On the other hand, other authors utilize neural network classifiers in their systems. In particular, an intrusion detection system for Internet of Medical Things applications [33] was built by integrating a hybrid of principal component analysis (PCA), the grey wolf optimizer (GWO) algorithm, and a deep neural network (DNN). The PCA-GWO was used to optimize the performance of the classifier (DNN). The feature selection optimization was reflected in the results, which indicated a respectable classification accuracy.

Furthermore, a new denial-of-service (DoS) detection system was proposed in [27] by SaiSindhuTheja and Shyam. The system optimizes the feature selection mechanism using a modified crow search algorithm (CSA), in which the CSA is integrated with opposition-based learning (OBL) to enhance the optimization performance. The second component of the system is the classifier, for which a recurrent neural network (RNN) is utilized. This design allowed the system to compete with and outperform other detection systems.

3. Background

This section covers the background of the proposed model. Section 3.1 gives a brief introduction to CNN-based models and their applications. The two main aspects of the RSA then follow: its inspirations are illustrated in Section 3.2, while the procedural steps of the RSA are described in detail in Section 3.3.

3.1. Convolutional Neural Network

Nowadays, AI-based algorithms such as CNNs are widely exploited in fields such as computer vision. For instance, CNNs were extensively used to identify COVID-19 and to quickly diagnose it from image data. This section briefly covers recent advances and existing literature on using CNNs in different applications. Depending on the CNN architecture and building blocks, CNN models can be applied to various data, including time-series data, textual data, images, and videos [41]. The crucial component of such a model is the convolution operation applied to the input data. The convolution operation extracts features from the input data using several convolutional filters with the same or different filter sizes. In addition, the convolution operation relies on the local correlation of the information, which helps extract more complex features and learn more meaningful feature representations. CNNs can be sensitive to variations in the data, such as translation, rotation, and scaling in image data. Thus, CNNs use a pooling operation to downsample the feature map extracted from the previous layer. Depending on the task, fully connected (FC) layers can be placed after a convolution block (convolution and pooling) or at the end of the network to classify or detect the input data.

Several CNN architectures have been proposed based on several criteria, such as the network depth or width, the type of the convolution operation, the number of convolutional filters and their corresponding size, the pooling operation and its size, the number of fully connected layers, and the deployment environment of the model. Many CNN-based models have been proposed, including MobileNet, ResNet, NASNet, EfficientNet, MnasNet, and AlexNet [42–46]. For instance, MobileNet has three versions, where MobileNetV3 implements the inverted residual block inherited from EfficientNet and ResNet [47]. MobileNetV3 uses a different type of convolution layer, the depthwise separable convolution, which was proposed to replace the standard convolution operation and lower the computation cost, facilitating model deployment in embedded and edge systems. In addition, MobileNetV3 includes a building block named the squeeze-and-excite block [44]. The depthwise separable convolution uses the inverted residual connection to reduce the number of training parameters and improve the representations learned from the input data. The architectures mentioned above have been employed in a variety of computer vision tasks, such as image recognition, classification, image segmentation, face detection, and video classification [48]. CNNs have shown a great ability to extract features automatically, even when using simple networks. Thus, in our study, we propose a simple yet effective CNN architecture and adapt it to the network intrusion detection task.

3.2. Inspiration of RSA

The RSA is a metaheuristic algorithm developed by Abualigah et al. in 2021 [34]. It mimics the hunting behavior of crocodiles in their natural habitat. Crocodiles belong to the family “Crocodylinae” and prefer to live in environments where water and food are available. They are semi-aquatic reptiles capable of hunting in the water as well as out of it. The living behaviors of crocodiles are as follows:
(i) Vision: Crocodiles have a penetrating night vision that many other animals lack. They exploit the poor night vision of most other animals by hunting at night.
(ii) Eating: Crocodiles are predators residing at the top of the food chain. They feed on animals in the environment surrounding their habitat, such as fish, deer, cows, zebras, baby elephants, and small crocodiles. In addition, large crocodiles are not afraid to add other predators, such as sharks and cats, to their food sources. Crocodiles can also live for long periods without food if the surrounding environment lacks any food source. Some of them have been reported to feed on fruits.
(iii) Locomotion: Crocodiles have the ability to swim, walk, and run. In swimming, they use the tail for steering, and the legs are not used. In walking, they use their legs to carry their bodies and facilitate their movement, and the tail is used for balancing and steering. Finally, crocodiles can run short distances out of the water to attack prey; the energy is transmitted from the tail to the body to move forward at high speed.
(iv) Cognition: Crocodiles have the ability to recognize the patterns of prey; for example, they know which animals frequently come to the water to drink.
(v) Hunting: Crocodiles set ambushes inside the water to catch animals that come to drink from the water’s edge or that dive in the water. At the right moment, crocodiles stealthily attack their prey from the water. Once the crocodile catches its prey, it drags it into the water and drowns it. Finally, the crocodile cuts its prey into large pieces and devours it completely. Frequently, crocodiles fight each other over prey.
(vi) Cooperation: Crocodiles are animals that prefer to live in groups. This pattern helps crocodiles cooperate in preparing ambushes. Every member of the group has a role in accomplishing the task of predation. For example, one crocodile attacks an animal drinking at the riverbank in order to push it towards the water, and then the crocodiles hiding in the water attack the prey.

3.3. Procedural Steps of RSA

Figure 1 illustrates the procedural steps of the RSA, and a detailed description of these steps is given below.

3.3.1. Phase 1: Initialization of RSA’s Parameters

The control parameters and the algorithmic parameters should be initialized before executing the RSA. The list of control parameters includes $N$, which represents the number of crocodiles, and $T$ as the maximum number of iterations. Furthermore, two algorithmic parameters are used in the RSA, namely $\alpha$ and $\beta$. These two algorithmic parameters are used to control exploitation and exploration abilities, respectively, in order to reach the right balance between the two abilities during the search process.

3.3.2. Phase 2: RSA Population Initialization

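During this phase, we randomly generate a set of initial solutions using the following equation [34]:

$$x_{i,j} = rand \times (UB_j - LB_j) + LB_j, \quad j = 1, 2, \ldots, n \tag{1}$$

where $x_{i,j}$ represents the decision variable of the $i$th solution at the $j$th position. The upper and the lower bounds of the decision variable at the $j$th position are $UB_j$ and $LB_j$. $rand$ is a randomly generated value between 0 and 1, while $n$ indicates the total number of decision variables in each solution. The set of solutions, as many as $N$, are generated and stored in $X$ as follows [34]:

$$X = \begin{bmatrix} x_{1,1} & \cdots & x_{1,j} & \cdots & x_{1,n} \\ \vdots & & \vdots & & \vdots \\ x_{i,1} & \cdots & x_{i,j} & \cdots & x_{i,n} \\ \vdots & & \vdots & & \vdots \\ x_{N,1} & \cdots & x_{N,j} & \cdots & x_{N,n} \end{bmatrix} \tag{2}$$

where the $i$th row holds the $i$th solution.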
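The random initialization described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation; the function name `init_population` and its argument names are assumptions introduced here.

```python
import numpy as np

def init_population(n_solutions, n_vars, lower, upper, rng=None):
    """Draw every decision variable uniformly between its lower and upper bound."""
    rng = np.random.default_rng() if rng is None else rng
    # Scalars are broadcast so that every position shares the same bounds.
    lower = np.broadcast_to(np.asarray(lower, dtype=float), (n_vars,))
    upper = np.broadcast_to(np.asarray(upper, dtype=float), (n_vars,))
    rand = rng.random((n_solutions, n_vars))  # uniform values in [0, 1)
    return lower + rand * (upper - lower)

# Five candidate solutions with three decision variables each, bounded in [-1, 1].
population = init_population(5, 3, lower=-1.0, upper=1.0,
                             rng=np.random.default_rng(0))
```

Each row of `population` is one candidate solution, matching the row-per-solution layout of the stored population.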

3.3.3. Phase 3: Fitness Evaluation

The fitness value (i.e., quality) of each solution $x_i$ in the population should be calculated as $f(x_i)$.

3.3.4. Phase 4: Encircling Phase

This is the exploration behavior of crocodiles in the RSA. This phase is introduced to find a better solution by exploring new regions in the search space of the problem following two strategies, namely, high walking and belly walking, as shown in (3). The high walking strategy is applied when $t \le T/4$, while the belly walking strategy is applied when $T/4 < t \le 2T/4$ [34]:

$$x_{i,j}(t+1) = \begin{cases} Best_j(t) \times \left(-\eta_{i,j}(t)\right) \times \beta - R_{i,j}(t) \times rand, & t \le \frac{T}{4} \\ Best_j(t) \times x_{r_1,j} \times ES(t) \times rand, & \frac{T}{4} < t \le \frac{2T}{4} \end{cases} \tag{3}$$

where $x_{i,j}$ represents the decision variable of the $i$th solution at the $j$th position. $Best_j(t)$ is the $j$th position in the best solution obtained at iteration $t$. $t+1$ is the new iteration, while the previous iteration is $t$. The hunting operator of the $j$th position in the $i$th solution is denoted as $\eta_{i,j}$, which is calculated using (4). The parameter $\beta$ controls the exploration capability of the high walking strategy. The value of $\beta$ is set to 0.1 according to [34]. $rand$ is a randomly generated value ranging between zero and one. $x_{r_1,j}$ is the decision variable at the $j$th position in the $r_1$th solution, where $r_1 \in [1, N]$. $\eta_{i,j}$, $P_{i,j}$, and $M(x_i)$ are calculated, respectively, as follows:

$$\eta_{i,j} = Best_j(t) \times P_{i,j} \tag{4}$$

$$P_{i,j} = \alpha + \frac{x_{i,j} - M(x_i)}{Best_j(t) \times (UB_j - LB_j) + \epsilon} \tag{5}$$

$$M(x_i) = \frac{1}{n} \sum_{j=1}^{n} x_{i,j} \tag{6}$$

where $P_{i,j}$ is the percentage difference between the decision variable at the $j$th position of the best solution and the decision variable at the same position of the current solution $x_i$. $\alpha$ is set to 0.1 according to [34]; it is also used to control the exploration ability of the RSA during the hunting cooperation. $\epsilon$ is a random value between 0 and 2. $M(x_i)$ is the average value of all decision variables of the current solution $x_i$. $R_{i,j}$ is a factor used to reduce the search area of the $j$th position in the $i$th solution, and $ES(t)$ is the evolutionary sense probability, which takes a randomly decreasing value from 2 to -2 [34]. They are calculated, respectively, as follows:

$$R_{i,j} = \frac{Best_j(t) - x_{r_2,j}}{Best_j(t) + \epsilon} \tag{7}$$

$$ES(t) = 2 \times r_3 \times \left(1 - \frac{t}{T}\right) \tag{8}$$

where $r_2$ is a randomly generated value ranging between 1 and $N$, which refers to the index of one randomly chosen solution in the population. $r_3$ is a random integer value equal to -1, 0, or 1.

3.3.5. Phase 5: Hunting Phase

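This is the exploitation behavior of crocodiles in the RSA. This phase is designed in the RSA to exploit the current search regions in order to find the optimal solutions according to two strategies: hunting coordination and hunting cooperation, as shown in (9). The hunting coordination strategy is applied when $2T/4 < t \le 3T/4$, while the hunting cooperation is applied when $3T/4 < t \le T$ [34]:

$$x_{i,j}(t+1) = \begin{cases} Best_j(t) \times P_{i,j}(t) \times rand, & \frac{2T}{4} < t \le \frac{3T}{4} \\ Best_j(t) - \eta_{i,j}(t) \times \epsilon - R_{i,j}(t) \times rand, & \frac{3T}{4} < t \le T \end{cases} \tag{9}$$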

3.3.6. Phase 6: Stop Criterion

Repeat Phases 3 to 5 until the maximum number of iterations $T$ is reached.
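To make the six phases concrete, the following Python sketch strings them together for a continuous minimization problem. It follows the update rules of the RSA as published in [34] as far as we can state them, and several details are assumptions of this sketch rather than claims about the original implementation: `eps` is treated as a small positive constant that guards the divisions, the evolutionary sense shrinks linearly with the iteration counter, candidates are clipped to the bounds, and a candidate replaces its parent only when it improves the fitness. All function and variable names are hypothetical.

```python
import numpy as np

def rsa_minimize(fitness, n_vars, lower, upper, n_solutions=20, n_iters=50,
                 alpha=0.1, beta=0.1, eps=1e-10, seed=0):
    """Minimize `fitness` over the box [lower, upper]^n_vars with an RSA-style search."""
    rng = np.random.default_rng(seed)
    # Phase 2: random initial population inside the bounds.
    X = lower + rng.random((n_solutions, n_vars)) * (upper - lower)
    # Phase 3: fitness of every solution.
    fit = np.array([fitness(x) for x in X])
    best = X[fit.argmin()].copy()
    best_fit = float(fit.min())
    history = [best_fit]

    for t in range(1, n_iters + 1):
        # Evolutionary sense: random sign in {-1, 0, 1}, shrinking as t grows (assumption).
        es = 2 * rng.integers(-1, 2) * (1 - t / n_iters)
        X_new = np.empty_like(X)
        for i in range(n_solutions):
            mean_i = X[i].mean()  # average of the current solution's variables
            for j in range(n_vars):
                r1, r2 = rng.integers(0, n_solutions, size=2)
                p = alpha + (X[i, j] - mean_i) / (best[j] * (upper - lower) + eps)
                eta = best[j] * p                             # hunting operator
                red = (best[j] - X[r2, j]) / (best[j] + eps)  # search-area reduction
                if t <= n_iters / 4:          # Phase 4: high walking
                    X_new[i, j] = best[j] * (-eta) * beta - red * rng.random()
                elif t <= n_iters / 2:        # Phase 4: belly walking
                    X_new[i, j] = best[j] * X[r1, j] * es * rng.random()
                elif t <= 3 * n_iters / 4:    # Phase 5: hunting coordination
                    X_new[i, j] = best[j] * p * rng.random()
                else:                         # Phase 5: hunting cooperation
                    X_new[i, j] = best[j] - eta * eps - red * rng.random()
        X_new = np.clip(X_new, lower, upper)  # keep candidates inside the bounds
        # Greedy replacement (assumption): keep a candidate only if it improves.
        new_fit = np.array([fitness(x) for x in X_new])
        improved = new_fit < fit
        X[improved] = X_new[improved]
        fit[improved] = new_fit[improved]
        if fit.min() < best_fit:
            best = X[fit.argmin()].copy()
            best_fit = float(fit.min())
        history.append(best_fit)  # Phase 6: loop ends after n_iters iterations
    return best, best_fit, history


def sphere(x):
    """Toy objective used only for illustration: sum of squares."""
    return float(np.sum(x ** 2))

best, best_fit, history = rsa_minimize(sphere, n_vars=4, lower=-5.0, upper=5.0)
```

The four branches correspond to the four quarters of the iteration budget, so the search moves from exploration (high and belly walking) to exploitation (hunting coordination and cooperation) as `t` grows. Because of the greedy replacement, `history` never increases.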

4. Proposed Model

In this part, the phases of the proposed IoT security model are presented. The model extracts features from the data using a CNN and then selects the relevant features using a modified RSA. In general, the IoT security model consists of four phases, as given in Figure 2; each phase is described as follows.

4.1. First Phase: Prepare IoT Dataset

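In this stage, the IoT dataset is prepared to make it suitable for the subsequent feature extraction stage. This is achieved by normalizing the dataset using the min-max approach. For clarity, the collected IoT traffic samples are represented as a matrix $D$ [34]:

$$D = \begin{bmatrix} d_{11} & d_{12} & \cdots & d_{1N_f} \\ d_{21} & d_{22} & \cdots & d_{2N_f} \\ \vdots & \vdots & \ddots & \vdots \\ d_{N_s 1} & d_{N_s 2} & \cdots & d_{N_s N_f} \end{bmatrix} \tag{10}$$

Using the min-max approach to normalize $D$, $DN_{ij}$ is formulated as [34]

$$DN_{ij} = \frac{d_{ij} - \min(d_j)}{\max(d_j) - \min(d_j)} \tag{11}$$

where $d_{ij}$ indicates the value of feature $j$ at sample $i$, and the minimum and maximum are taken over all samples of feature $j$. So, the normalization of $D$ is represented as

$$DN = \begin{bmatrix} DN_1 \\ DN_2 \\ \vdots \\ DN_{N_s} \end{bmatrix} \tag{12}$$

where $DN_i$ stands for the features of the $i$th traffic sample, represented as $DN_i = [DN_{i1}, DN_{i2}, \ldots, DN_{iN_f}]$. $N_s$ is the number of samples, and $N_f$ stands for the number of features.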
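As a small illustration of the min-max preparation step, the sketch below scales every feature column of a toy traffic matrix to the range [0, 1]. The function name, the toy values, and the handling of constant columns are assumptions made for this example.

```python
import numpy as np

def min_max_normalize(data):
    """Scale each feature (column) to [0, 1] using its own minimum and maximum."""
    data = np.asarray(data, dtype=float)
    col_min = data.min(axis=0)
    col_range = data.max(axis=0) - col_min
    # Assumption: a constant feature is mapped to 0 instead of dividing by zero.
    col_range[col_range == 0] = 1.0
    return (data - col_min) / col_range

# Three traffic samples (rows) with three features (columns); values are made up.
traffic = np.array([[10.0, 200.0, 1.0],
                    [20.0, 400.0, 1.0],
                    [30.0, 300.0, 1.0]])
normalized = min_max_normalize(traffic)
```

In a train/test setting, one common choice is to compute the column minima and maxima on the training split only and reuse them for the test split; the text does not state which variant was used.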

4.2. Second Phase: CNN for Feature Extraction

The CNN is a widely used automatic feature extractor in various applications [49, 50], such as image classification, text classification, and speech recognition. In our study, we implemented a CNN model whose core building blocks are the convolution layer (Conv), the ReLU activation function, the fully connected layer (FC), and the pooling layer (Pool). The CNN learns complex representations as features from the network traffic samples and classifies them based on their intrusion type. Using a convolution operation, the CNN extracts local and position-invariant patterns while sharing the weights across the layers and channels [51]. In our case, the CNN was designed by trial and error, with the objective of building a simple yet powerful model that maximizes the classification accuracy on the tackled task. In addition, the best-trained model, selected by its performance on the test data, is used to extract the learned features for the feature selection stage. The proposed CNN is illustrated in Figure 3.

In the implemented CNN architecture, the Conv block is followed by a rectified linear unit (ReLU) [52], defined in (13), to prevent negative values from being propagated, while the pooling operator is used for reducing the dimensionality of the activation map of the input data $x$:

$$ReLU(x) = \max(0, x) \tag{13}$$

To reduce the model complexity and prevent overfitting, dropout layers with a rate of 0.5 are used to randomly drop neurons during training. Furthermore, the Conv1 layer [53] is defined by its kernel size, 64 filters, and a stride. The 1D convolution operation applied to the output of the previous layer is defined in the following equation:

$$y^{l} = W^{l} * y^{l-1} + b^{l} \tag{14}$$

where $y^{l}$ is the output of the $l$-th layer, $*$ denotes the 1D convolution, and $W^{l}$ and $b^{l}$ represent the weight matrix and the bias corresponding to the $l$-th layer, respectively. Meanwhile, two types of pooling were used: max-pooling and adaptive average pooling [54].
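The convolution, ReLU, and pooling operations of one block can be traced by hand on a tiny input. The NumPy sketch below is only illustrative: the kernel size, stride, pooling window, and all numeric values are assumptions, not the settings of the proposed CNN, and the convolution is written as the sliding dot product that deep learning frameworks typically compute.

```python
import numpy as np

def conv1d(x, weights, bias, stride=1):
    """Valid 1D convolution (sliding dot product).

    x: (in_channels, length); weights: (out_channels, in_channels, kernel).
    """
    out_ch, in_ch, k = weights.shape
    out_len = (x.shape[1] - k) // stride + 1
    out = np.empty((out_ch, out_len))
    for pos in range(out_len):
        window = x[:, pos * stride: pos * stride + k]  # (in_channels, kernel)
        out[:, pos] = np.tensordot(weights, window, axes=([1, 2], [0, 1])) + bias
    return out

def relu(x):
    """Replace negative activations with zero."""
    return np.maximum(x, 0.0)

def max_pool1d(x, size):
    """Non-overlapping max pooling along the length axis."""
    usable = (x.shape[1] // size) * size
    return x[:, :usable].reshape(x.shape[0], -1, size).max(axis=2)

x = np.array([[1.0, -2.0, 3.0, 0.0, -1.0, 2.0]])  # one channel, six input values
w = np.array([[[1.0, 0.0, -1.0]]])                # one filter with kernel size 3
b = np.array([0.0])

conv_out = conv1d(x, w, b)          # [[-2., -2., 4., -2.]]
features = max_pool1d(relu(conv_out), size=2)  # [[0., 4.]]
```

The filter responds to local differences, ReLU removes the negative responses, and pooling halves the length of the activation map.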

As Figure 3 shows, the feature maps extracted after the last pooling operation are fed to a sequence of FC layers. The layers FC1, FC2, and FC3 are employed for feature extraction, whereas FC4 is used for the classification task. FC4 uses the Softmax function to output the probability that a traffic sample belongs to a specific type. As a regularization method, the CNN model uses batch normalization (BN) to normalize the input features fed to FC4. The feature vector of each sample is extracted from FC3. The extracted features are fed into the FS algorithm, which selects only the most relevant features to boost the overall performance of the intrusion detection task.

4.3. Third Phase: Feature Selection

During this phase, the proposed model selects the relevant features based on their quality. Thus, this process has a significant impact on IDS detection in IoT environments.

The proposed RSA as an FS approach (see Figure 4) begins by initializing the population with $N$ agents. After that, it converts each agent into its binary version and reduces the number of features by excluding those corresponding to zeros in the binary version. Thereafter, the proposed RSA approach assesses the quality of the chosen features by computing the classification error of the KNN classifier. Then, the best solutions (agents) are updated until the stopping criterion is reached.

4.3.1. Create Population

The proposed RSA begins by dividing the given datasets into training and testing subsets, with 80% and 20% of the samples, respectively. After that, the following equation is applied to construct the initial values of the population $X$ with $N$ agents:

$$X_i = LB + rand(1, Dim) \times (UB - LB), \quad i = 1, 2, \ldots, N \tag{15}$$

where $Dim$ represents the dimension of each agent, which means the number of features. More so, $rand(1, Dim)$ refers to a random vector, and $LB$ and $UB$ indicate the limits of the search space.

4.3.2. Updating Population

In the updating phase, each agent $X_i$ is converted into its Boolean version $BX_i$, as in the following equation:

$$BX_{ij} = \begin{cases} 1, & X_{ij} > 0.5 \\ 0, & \text{otherwise} \end{cases} \tag{16}$$

Accordingly, the number of features in the training set can be decreased by eliminating the features that correspond to zeros. After that, the fitness value for each agent is computed as follows:

$$Fit_i = \lambda \times \gamma_i + (1 - \lambda) \times \frac{|BX_i|}{Dim} \tag{17}$$

where $\gamma_i$ refers to the classification error, which is computed using the KNN classifier on the training set. More so, $\lambda$ represents a random weight that balances the classification error against the ratio of relevant features $|BX_i| / Dim$. To illustrate this process, suppose that applying (16) to an agent $X_i$ yields a binary vector $BX_i$ with ones at the first, fourth, sixth, and seventh positions. Accordingly, the first, fourth, sixth, and seventh features are treated as relevant, and the training set is reduced to them. Then, (17) is used to evaluate the quality of this selection.

The next stage is to obtain $X_b$, the agent with the best fitness value $Fit_b$. Thereafter, $X_b$ is used for updating the current agents using the operators of the RSA.
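The binarization and fitness evaluation of a single agent can be sketched as follows. This is a hedged illustration, not the authors' code: the 0.5 threshold, the weight `lam = 0.9`, the value `k = 3`, the use of a separate validation split for the error, the score given to an empty feature subset, and the toy data are all assumptions, and the k-nearest-neighbor classifier is a plain NumPy version written for self-containment.

```python
import numpy as np

def binarize(agent, threshold=0.5):
    """Map a real-valued agent to a 0/1 feature mask (threshold is an assumption)."""
    return (np.asarray(agent) > threshold).astype(int)

def knn_error(train_x, train_y, val_x, val_y, k=3):
    """Error rate of a simple k-nearest-neighbor classifier (Euclidean distance)."""
    dists = np.linalg.norm(val_x[:, None, :] - train_x[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    # Majority vote among the k nearest training labels (non-negative integers).
    pred = np.array([np.bincount(train_y[idx]).argmax() for idx in nearest])
    return float(np.mean(pred != val_y))

def fitness(agent, train_x, train_y, val_x, val_y, lam=0.9, k=3):
    """Weighted sum of classification error and ratio of selected features."""
    mask = binarize(agent).astype(bool)
    if not mask.any():
        return 1.0  # assumption: an empty feature subset gets the worst score
    err = knn_error(train_x[:, mask], train_y, val_x[:, mask], val_y, k)
    return lam * err + (1 - lam) * mask.sum() / mask.size

# A hypothetical agent whose mask keeps the first, fourth, sixth, and seventh features.
agent = np.array([0.9, 0.1, 0.3, 0.8, 0.2, 0.7, 0.6])
mask = binarize(agent)  # [1, 0, 0, 1, 0, 1, 1]

# Toy data in which only the first feature separates the two classes.
train_x = np.zeros((6, 7))
train_x[3:, 0] = 10.0
train_y = np.array([0, 0, 0, 1, 1, 1])
val_x = np.zeros((2, 7))
val_x[1, 0] = 10.0
val_y = np.array([0, 1])
score = fitness(agent, train_x, train_y, val_x, val_y)
```

Here the selected subset classifies both validation samples correctly, so the score reduces to the feature-ratio term, (1 - 0.9) × 4/7. Lower scores are better, so the agent with the smallest score plays the role of the best agent.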

4.3.3. Stop Learning Phase

During this phase, the termination criteria are checked. If they are not met, the updating process is carried out again. Otherwise, $X_b$ is returned as the output, and it is applied to reduce the testing set that is used in the next phase.

4.4. Fourth Phase: Evaluation Performance

To evaluate the performance of the developed RSA, the best agent is used to remove from the testing set those features that correspond to zeros and are considered irrelevant. Then, the classification performance is computed using several evaluation measures. Algorithm 1 presents the full steps of the proposed RSA.

Input: $T$: number of generations, and $N$: number of agents.
Use equation (11) to normalize the IoT data.
Apply the CNN-based feature extraction (see Section 4.2).
Divide the dataset into training and testing sets according to the extracted features.
Generate the initial population $X$ by applying (15).
Set $t = 1$.
While $t \le T$ do
  Apply (16) to find the Boolean form of each solution $X_i$.
  Use (17) to calculate the fitness value $Fit_i$ for each $X_i$.
  Find the best solution $X_b$.
  Update $X$ using (3)–(9).
  Set $t = t + 1$.
End while
Use the relevant features (corresponding to ones) in $X_b$ to reduce the testing set.
Output: Return $X_b$ and the values of the evaluation indicators.

The complexity of the developed method RSA is .

5. Experimental Series and Results

This section presents the experiments used to evaluate the developed IoT security approach. The evaluation relies on several metrics, real-world datasets, and extensive comparisons with other feature selection techniques.

5.1. Evaluation Measures

Several evaluation indicators are used to assess the quality of the proposed approach and all comparative methods.

We define those indicators according to the concept of the confusion matrix (see Table 1).

5.1.1. Average Accuracy

It refers to the rate of correct detection of intrusions. It can be calculated as

$$AVG_{Acc} = \frac{1}{N_r} \sum_{k=1}^{N_r} Acc^{k}, \quad Acc = \frac{TP + TN}{TP + TN + FP + FN} \tag{18}$$

where $N_r$ refers to the number of runs.

5.1.2. Average Recall

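This is also known as the true-positive rate (TPR), and it refers to the percentage of intrusions that are correctly predicted as positive. Averaged over $N_r$ runs, it is calculated as

$$AVG_{Rec} = \frac{1}{N_r} \sum_{k=1}^{N_r} Rec^{k}, \quad Rec = \frac{TP}{TP + FN} \tag{19}$$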

5.1.3. Average Precision

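It represents the percentage of TP cases among all cases predicted as positive. Averaged over $N_r$ runs, it can be computed as

$$AVG_{Prec} = \frac{1}{N_r} \sum_{k=1}^{N_r} Prec^{k}, \quad Prec = \frac{TP}{TP + FP} \tag{20}$$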

5.1.4. Performance Improvement Rate (PIR)

It quantifies the rate of improvement obtained by the developed method. It is computed from the values of a given measure (i.e., precision, accuracy, recall, or F1-measure) obtained by the RSA and by another algorithm, respectively.
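The confusion-matrix measures of this section (accuracy, recall, and precision) and their averaging over independent runs can be sketched for the binary case as follows. The function names and the toy label vectors are assumptions for illustration; returning zero when a denominator is empty is also a choice made here.

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Accuracy, recall, and precision from the binary confusion matrix."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    recall = tp / (tp + fn) if tp + fn else 0.0      # true-positive rate
    precision = tp / (tp + fp) if tp + fp else 0.0
    return accuracy, recall, precision

def average_over_runs(runs):
    """Average each measure over a list of (y_true, y_pred) pairs, one per run."""
    per_run = np.array([binary_metrics(t, p) for t, p in runs])
    return tuple(float(v) for v in per_run.mean(axis=0))

# Two hypothetical runs: label 1 marks an intrusion, label 0 normal traffic.
run_1 = ([1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 1, 0])  # TP=2, TN=2, FP=1, FN=1
run_2 = ([1, 0, 1, 0], [1, 0, 1, 0])              # all predictions correct
single = binary_metrics(*run_1)
averaged = average_over_runs([run_1, run_2])
```

For the first run, all three measures equal 2/3; averaging with the perfect second run gives 5/6 for each measure.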

5.2. Experiments Setup

The proposed CNN model in this study was trained for 100 epochs with early stopping, using 2024 samples in each training batch. The best model during training is saved, which results in good performance on each dataset. The Adam [55] optimizer was used with a learning rate of 0.005. The CNN model was trained on an Nvidia GTX 1080 GPU and implemented using the PyTorch framework. The complexity of the CNN can be measured by the total number of parameters updated during training, which is equal to 63,432. The proposed RSA was evaluated and compared to the following optimization algorithms: multiverse optimization algorithm (MVO) [56], whale optimization algorithm (WOA) [57], moth flame optimization (MFO) [58], grey wolf optimizer (GWO) [59], transient search optimization (TSO) [60], Bat (BAT) algorithm [61], and firefly algorithm (FFA) [62]. The parameters of each of these algorithms are set according to its original implementation. However, the common parameters, namely the number of iterations and the number of agents, are set to 50 and 20, respectively.

5.3. Dataset Description

To validate the proposed framework, we used the KDDCup-99, NSL-KDD, CICIDS-2017, and BoT-IoT datasets. These are well-known datasets used to assess IDS techniques; the KDDCup-99 and NSL-KDD datasets share the same source of data and the same intrusion type labels. Both KDDCup-99 and NSL-KDD were used to compare the proposed framework with other methods. Tables 2–4 list the datasets, the corresponding labels, and the distribution of samples in the training and testing sets.

The NSL-KDD dataset was built from KDDCup-99 and represents a refined version without duplicated network traffic samples. KDDCup-99 was created from the intrusion detection challenge held by DARPA (Defense Advanced Research Projects Agency) in 1998. The KDDCup-99 data were gathered from MIT Lincoln Laboratory experiments, where network traffic data were recorded over a period of 10 weeks. The experimental setup comprised around 1000 UNIX machines and 100 users. The collected network traffic amounted to around 5 million records stored in a raw transmission control protocol/Internet protocol (TCP/IP) dump format. Due to the enormous size of the dataset, the data collectors released a smaller version representing only 10% of the total connection records, with 41 features per record and the following types of attack: denial-of-service (DoS), probing, remote-to-local (R2L), and user-to-root (U2R). Meanwhile, the BoT-IoT dataset [63] consists of more than 72 million connection records gathered from many IoT devices. The dataset was collected by the Cyber Range Lab (at the UNSW Institute for Cyber Security) in Australia. We used only 5% of the entire dataset in our experiments, consisting of around 3.5 million records with ten features. The CICIDS-2017 dataset [64] consists of 79 network flow features extracted from gathered network traffic using the CICFlowMeter tool. It was collected by the CIC (Canadian Institute for Cybersecurity) to emulate real-world data (PCAPs). In addition, the collected connection records cover a variety of network protocols, including SSH, e-mail, HTTP, and FTP, generated by 25 users on machines with varying operating systems.

5.4. Results and Discussion

The results of the IoT security model that integrates the CNN and RSA, compared with other models, are given in this section. Tables 5 and 6 report the average of each performance measure over 25 independent runs for both the binary and multiclass cases.
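As a minimal sketch of how such per-run averages could be computed (this is not the authors' implementation), the following Python code derives accuracy, precision, recall, and F1 for each run and then averages them over the runs. The function names `weighted_metrics` and `average_over_runs`, the toy labels, and the use of support-weighted averaging for the per-class scores are all assumptions; the source does not state which averaging scheme was used. Under support weighting, recall coincides with accuracy by construction.

```python
from statistics import mean


def weighted_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall, and F1 (assumed scheme)."""
    n = len(y_true)
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / n
    precision = recall = f1 = 0.0
    for c in sorted(set(y_true)):
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        p_c = tp / (tp + fp) if tp + fp else 0.0
        r_c = tp / (tp + fn) if tp + fn else 0.0
        f_c = 2 * p_c * r_c / (p_c + r_c) if p_c + r_c else 0.0
        weight = (tp + fn) / n  # share of samples whose true class is c
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def average_over_runs(runs):
    """runs: list of (y_true, y_pred) pairs, one pair per independent run."""
    per_run = [weighted_metrics(t, p) for t, p in runs]
    return {name: mean(m[name] for m in per_run) for name in per_run[0]}


# Two hypothetical runs on a toy multiclass problem.
runs = [
    (["dos", "normal", "dos", "probe"], ["dos", "normal", "normal", "probe"]),
    (["dos", "normal", "dos", "probe"], ["dos", "normal", "dos", "probe"]),
]
averages = average_over_runs(runs)
print(averages)
```

In an actual experiment, each entry of `runs` would hold the test labels and the predictions produced by one independent execution of the feature selection and classification pipeline.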

The multiclass results can be summarized in two points. First, during the learning stage, the developed RSA outperforms the competing algorithms on all performance measures for KDD99, NSL-KDD, and CIC2017. On BIoT, however, the RSA ranks second, behind the MFO. Second, the ability of the RSA to detect the attack type on the testing samples is higher than that of the other methods on all four datasets.

Furthermore, in the binary classification case, the RSA performs well on the four datasets in both the learning and evaluation stages. The highest quality is achieved on KDD99 and NSL-KDD. On the other two datasets (i.e., BIoT and CIC2017), the results of the competing methods are nearly the same, with slightly better performance for the developed method.

Moreover, Figure 5 depicts the average of each method over all the tested datasets in terms of each performance measure. The figure shows that the RSA has a high average on all performance metrics in the training and testing stages of both binary and multiclass classification. In the multiclass case, it is followed by MVO, which provides better accuracy than the remaining algorithms. BAT has a better recall value in the training and testing stages and provides a better F1-measure value in the testing stage. In training, MFO and GWO each have higher precision and F1-measure values than the other algorithms, whereas in testing FFA has a higher precision value than the other methods. The same observation for MVO holds in the binary classification case. MFO and GWO perform better in terms of F1-measure and precision, respectively, in the training and testing stages. BAT provides a better recall value across the tested datasets in both the training and testing stages.

For further analysis of the obtained results, we used the Friedman test [65] to check whether the differences between the competing methods are significant. The Friedman test provides a mean rank for each method, as given in Table 7. From these mean ranks, we can conclude that the RSA has the highest mean rank on the performance measures in both classification scenarios (binary and multiclass), followed by MVO, FFA, MFO, and BAT, which have high mean ranks according to accuracy, precision, F1-measure, and recall, respectively.
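To illustrate how Friedman mean ranks arise, the sketch below ranks the methods within each block (for example, one dataset) and averages the ranks per method. It follows the convention of the text, in which a higher mean rank indicates better performance. The function names, the toy score table, and the omission of a tie correction in the statistic are assumptions made for illustration; the scores are not taken from Table 7.

```python
def average_ranks(scores):
    """Rank one row of scores in ascending order (best score gets the
    highest rank); tied scores share the average of their positions."""
    order = sorted(range(len(scores)), key=lambda j: scores[j])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared
        i = j + 1
    return ranks


def friedman(table):
    """table[b][m]: score of method m on block b.
    Returns (mean rank per method, Friedman chi-square without tie correction)."""
    n, k = len(table), len(table[0])
    rank_rows = [average_ranks(row) for row in table]
    mean_ranks = [sum(row[m] for row in rank_rows) / n for m in range(k)]
    chi2 = 12 * n / (k * (k + 1)) * sum(r * r for r in mean_ranks) - 3 * n * (k + 1)
    return mean_ranks, chi2


# Hypothetical accuracies of three methods (columns) on three datasets (rows).
table = [
    [0.92, 0.90, 0.88],
    [0.95, 0.93, 0.94],
    [0.89, 0.85, 0.87],
]
mean_ranks, chi2 = friedman(table)
print(mean_ranks, chi2)
```

The statistic is compared against a chi-square distribution with k - 1 degrees of freedom, where k is the number of methods. Where SciPy is available, `scipy.stats.friedmanchisquare` computes the statistic together with its p-value.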

The previous results show the high ability of the developed method to improve attack prediction in the IoT environment. However, the developed method has some limitations, such as the time consumed by learning the model. This limitation can be addressed by using transfer learning techniques.

6. Conclusion

This article presented a new method for intrusion detection systems (IDSs) in Internet of things (IoT) and cloud environments. The main idea is to exploit the advances in deep learning and metaheuristic optimization algorithms to build robust feature extraction and selection techniques. First, a one-dimensional convolutional neural network (CNN) is suggested to extract the relevant features. Second, the reptile search algorithm (RSA) is employed to select an optimal feature subset to reduce data dimensionality and boost classification accuracy. Several well-known public datasets were used to assess the performance of the suggested techniques. Moreover, extensive experimental comparisons were carried out to confirm the quality of the RSA as a feature selection technique. The outcomes revealed that the RSA obtained better performance than several optimization approaches, such as PSO, FA, GWO, WOA, TSO, BAT, and MVO. It recorded over 99% in all training scenarios on all datasets. It also recorded high results in the testing scenario; for example, for multiclass classification, the RSA obtained 92.040%, 89.684%, 89.985%, and 92.040% of accuracy, precision, F1, and recall, respectively, on the KDD99 dataset. In binary classification, the proposed method also recorded high results; for example, it recorded 92.344%, 94.335%, 92.763%, and 92.344% of accuracy, precision, F1, and recall, respectively, on the KDD99 dataset in the testing scenario. For the other datasets, the proposed RSA also recorded superior results in all evaluation tests using several classification indicators. We conclude that applying the CNN with the RSA has a significant impact on the IDS classification process. For future work, other issues could be addressed; for example, the convergence speed of the RSA needs to be improved, and other search mechanisms could be integrated with the RSA to tackle this problem. Also, in future work, we may consider applying the RSA to train deep learning models to boost the classification process for different applications, including IDS.

Data Availability

The data used to support the findings of this study are available from the authors upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this article.


Acknowledgments

The authors would like to thank the Deanship of Scientific Research at Princess Nourah bint Abdulrahman University for its support. This research was funded by Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2022R239), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.