Abstract

Existing wireless network intrusion detection algorithms based on supervised learning face several challenges, such as a high false detection rate, difficulty in finding unknown attack behaviors, and the high cost of obtaining labeled training data sets. This paper presents an improved K-means clustering algorithm for detecting intrusions on wireless networks based on Federated Learning. The proposed algorithm allows multiple participants to train a global model without sharing their private data, which expands the amount of data available for training while protecting the local data of each participant. Furthermore, a multiple-perspective cosine distance is introduced to measure the similarity between network data objects in the improved K-means clustering process, making the clustering results more reasonable and the judgment of network data behavior more accurate. AWID, an open wireless network attack data set, is selected as the experimental data set, and its dimensionality is reduced by principal component analysis (PCA). Experimental results show that the improved K-means clustering intrusion detection algorithm based on Federated Learning has better performance in detection rate, false detection rate, and detection of unknown attack types.

1. Introduction

With the rapid development of wireless LAN and mobile communication technologies, the WiFi network has become an indispensable part of people's daily work and life, bringing great convenience. At the same time, security threats on these networks continually endanger people's property and information security. Therefore, it is of great significance to study network information security and related technologies. At present, as an important direction of network security research, network intrusion detection has attracted the interest of many researchers [1].

Network intrusion detection, as an important dynamic security technology, is currently the most widely used and effective active network security defence method, and it makes up for the deficiencies of static security technology. Intrusion detection technology is mainly divided into two categories: misuse intrusion detection and anomaly intrusion detection [2]. Misuse intrusion detection relies on an established database of known intrusion behavior characteristics. It monitors network data flows in real time by pattern matching against this database and thereby determines whether a network behavior, or a variant of it, is abnormal. When the characteristics of the data traffic match any of the detection rules in the database, it can be concluded that an intrusion has occurred. Because it relies on a characteristic library of known intrusion behaviors, misuse intrusion detection can detect known intrusion behaviors quickly and accurately and determine which type they belong to. Unfortunately, it cannot detect intrusion behaviors of an unknown attack type. Anomaly intrusion detection addresses this problem by establishing a database of normal behavior characteristics. When the behavior characteristics of network data do not conform to the rules of the normal behavior characteristic database, the behavior is determined to be a network intrusion. This technique can detect intrusion behaviors of unknown attack types, but its false detection rate and missed detection rate are high [3]. With the increasing diversification and complexity of network intrusion behaviors, network intrusion detection systems based on anomaly detection adapt better to the changeable network environment, which makes them more popular at present.

In a network intrusion detection system with supervised anomaly detection, a large amount of normal behavior data needs to be labeled in practice to establish the normal behavior characteristic library. However, it is very difficult and costly to obtain pure and accurate data sets of normal behavior in a real network environment [4]. To solve this problem, unsupervised anomaly detection methods were proposed, which do not rely on labeled data and do not require the training data set to be marked and classified manually or by other means [5]. Accordingly, the shortage of training data for the detection model is alleviated to a certain extent. However, the major challenge remains the lack of a large amount of effective training data. How to train detection models with network traffic data from different data sources while protecting the privacy of those sources is an urgent problem to be solved.

Federated Learning [6], first proposed by Bernd, is an effective way to train a model jointly on multisource data; its purpose is to carry out collaborative training without sharing private data. The model is not trained centrally on the aggregated multisource data. Instead, it is trained cooperatively, with the data sources transmitting only their relevant encrypted parameters. With the wide application of Federated Learning in intrusion detection systems, researchers have proposed a series of effective detection algorithms based on it. For example, Wang et al. [7] proposed a method that uses a DCNN to extract features under the federated learning mechanism and then uses a Softmax classifier to carry out intrusion detection. Zhao et al. [8] proposed a network intrusion detection classification model (CNN-FL) that integrates Federated Learning and a convolutional neural network (CNN) and uses multisource data to cotrain the same model to improve the classification accuracy of the classifier. Wei et al. [9] proposed a cross-platform malicious user detection method for social networks based on vertical federated learning, which combines multiparty data for modeling and analysis and achieves more accurate detection of malicious users.

In order to further improve the detection rate of wireless network intrusion detection systems, reduce the false detection rate, flexibly discover attack behaviors of unknown attack types, and reduce the training time of the model, this paper proposes an improved K-means clustering intrusion detection algorithm for wireless networks based on Federated Learning. Instead of taking Euclidean distance as the measure between wireless network data objects, the algorithm uses cosine distance [10], which is more suitable for high-dimensional network data, to describe the similarity between objects, and it measures the similarity between any two data objects from multiple perspectives, making the measurement results more reasonable and accurate. At the same time, the algorithm improves K-means clustering by incorporating the idea of three-way decisions: it dynamically adjusts the value of K by setting a threshold α, and it delays the decision on uncertain network data by using the neighborhood of data objects, thus further improving the detection rate of the intrusion detection system and reducing the false detection rate.

In this paper, we use the open wireless network data set AWID [11] for the experiments and apply principal component analysis (PCA) [12] to reduce the dimensionality of the experimental data, which significantly decreases the number of data features and improves the performance of the algorithm. Experimental results show that, compared with traditional wireless network intrusion detection algorithms, the proposed algorithm not only expands the amount of training data but also improves the detection rate, the false detection rate, and the discovery of unknown attack types while ensuring data privacy.

2. Intrusion Detection Model Based on Federated Learning

As an artificial intelligence algorithm, Federated Learning is designed to ensure the security of private data while still making use of local private data. It allows multiple computing nodes to train a global model together without exchanging the original data. In Federated Learning, the global model is trained among a large number of participants in a distributed manner. To prevent the server from accessing local data, the participants train the model only locally and update the global model only by passing model parameters to the server. In this paper, Federated Learning is applied to a wireless network intrusion detection system that lacks a large amount of effective training data, and it makes full use of local network data to assist in training the detection model. Accordingly, the overall performance of the detection model can be improved, and the leakage of private data during data transmission can be avoided [13]. The intrusion detection model based on Federated Learning is shown in Figure 1, where it is assumed that there are $M$ participants with the same goal of jointly training the global classifier. In each iteration, the server passes the global model to each participant, and each participant trains it separately on local wireless network data. After the local training is completed, each participant passes its model parameters back to the server, and the server updates the global parameters by averaging the model parameters of all participants and sends the new parameters to each participant. The core function adopted by the classifier model in this paper is the improved K-means clustering function, and the model parameter passed in the iteration process is the K-means clustering threshold α. The updating process of the global model is

$$\alpha_{t+1} = \frac{1}{M}\sum_{i=1}^{M}\alpha_{t}^{i} \quad (1)$$

where $\alpha_{t+1}$ is the global model parameter at iteration $t+1$, and $\alpha_{t}^{i}$ is the model parameter uploaded by the $i$-th participant after $t$ iterations.

3. Improved K-Means Clustering Algorithm

3.1. Traditional K-Means Clustering Algorithm

The traditional K-means clustering algorithm takes the Euclidean distance between data objects as the basis for measuring how closely they are related and considers that the closer the feature attributes are, the smaller the distance between data objects [14, 15]. It relies on two clustering assumptions: (1) the number of normal data objects in the whole test data set is far greater than that of abnormal data objects; (2) there are obvious differences between normal data objects and abnormal data objects. The traditional K-means clustering algorithm is as follows.

1. Input: data sample set X = {x1, x2, …, xn} and the number of clusters k.
2. Initialize: randomly select k data objects from the data set X to form the first cluster center point set U = {u1, u2, …, uk}.
3. Repeat:
4.   Calculate the distance between all xi in X and the k cluster center points uj.
5.   Update the clusters C = {C1, C2, …, Ck}.
6.   Recalculate the new mean in each cluster.
7.   Update the cluster center point set U.
8.   Calculate the objective function J.
9. Until: J converges.
10. Output: k clusters.
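The listing above can be sketched in plain Python as follows. This is a minimal illustration rather than the implementation used in the experiments: the function names `euclidean` and `kmeans`, the tolerance `tol`, the fixed random seed, and the choice of squared distances in the objective are all assumptions made for the example.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric sequences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def kmeans(X, k, max_iter=100, tol=1e-9, seed=0):
    """Plain K-means on a list of points; returns (centers, labels, J)."""
    rng = random.Random(seed)
    # Randomly pick k data objects as the initial cluster centers.
    centers = [list(x) for x in rng.sample(X, k)]
    labels = [0] * len(X)
    prev_J = float("inf")
    J = prev_J
    for _ in range(max_iter):
        # Assign every object to its nearest center.
        labels = [min(range(k), key=lambda j: euclidean(x, centers[j])) for x in X]
        # Recompute each center as the mean of its cluster.
        for j in range(k):
            members = [x for x, lab in zip(X, labels) if lab == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
        # Objective: sum of squared distances to the assigned centers (assumed form).
        J = sum(euclidean(x, centers[lab]) ** 2 for x, lab in zip(X, labels))
        if abs(prev_J - J) <= tol:  # stop when J no longer changes
            break
        prev_J = J
    return centers, labels, J
```

Calling `kmeans(points, 2)` on two well-separated groups of points returns two centers and a label list in which each group shares one label.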

It is a typical two-way clustering method based on the idea of two-way decisions, that is, a data object either belongs to a certain cluster or does not. With the increasing diversity and complexity of wireless network intrusions, the two-way clustering algorithm has some shortcomings when dealing with network data sets. On the one hand, clustering results cannot fully reflect the properties of all data objects. For example, in Figure 2, the traditional two-way clustering method assigns two outlying data objects to two clusters, even though these objects differ significantly from the other data objects in the corresponding clusters. Therefore, adopting the two-way clustering method will inevitably reduce the intrusion detection rate and increase the false detection rate. On the other hand, the traditional K-means clustering algorithm adopts a fixed value of K, the number of categories of data behavior, which is difficult to predict exactly in advance. Therefore, a fixed K will affect the accuracy of classification and judgment of wireless network data.

3.2. The Improved K-Means Clustering Algorithm by Combining Three-Way Decisions
3.2.1. Idea of the Three-Way Clustering

To solve the problems that arise when traditional clustering algorithms are applied in intrusion detection systems, many scholars have improved the two-way clustering algorithm by introducing the idea of three-way decisions, which yields the three-way clustering method. The core idea is to extend the decision options into a positive domain decision, a negative domain decision, and a boundary domain decision [16, 17]. If there is a full and comprehensive understanding of an object, a judgment of acceptance or rejection can be made directly; otherwise, further investigation is needed, which manifests as a delayed decision [18, 19]. With the traditional two-way clustering method, whose results are shown in Figure 3, each of the two outlying data objects can only be assigned to one of the two classes, although there are significant differences between these objects and the other members of those classes. The results of the three-way clustering method are shown in Figure 4: the two objects are placed in the boundary regions of the two classes, respectively, and can be treated as uncertain data objects for further processing. Compared with the traditional two-way clustering method, the three-way method has obvious structural advantages and can further cluster and judge outlier data points according to their particularity.

The three-way clustering method is an extension of the traditional two-way clustering method and offers a reasonable way to classify uncertain data objects. If the category of a data object is difficult to determine immediately, the object is assigned to the boundary region. Data objects whose category can be determined accurately are assigned to the core region.

For data set $X = \{x_1, x_2, \ldots, x_n\}$, assume that $C = \{C_1, C_2, \ldots, C_k\}$ is the result of using the two-way clustering method to cluster $X$. Each category $C_i$ is refined according to the three-way decision idea and represented by a pair of sets, $C_i = (Co(C_i), Fr(C_i))$, where $Co(C_i)$ and $Fr(C_i)$ are called the core region and the boundary region of the class, respectively. The data objects in the core region are determined to belong to the class $C_i$, while the data objects in the boundary region may belong to the class $C_i$.

In the process of intrusion detection, the core region data are directly classified as intrusion data or normal data, and the decision on the boundary region data is deferred to reduce misjudgment.

3.2.2. Three-Way Dynamic Threshold K-Means Clustering Algorithm

In general, behavior data of the same kind in a network intrusion detection system are highly similar [20], so the vast majority of the data in a boundary region produced by the three-way clustering algorithm can be taken to be behavior data that are not of the same kind as the data in the corresponding core region. Building on the traditional neighborhood-based three-way clustering algorithm, the following improved K-means clustering algorithm with a three-way dynamic threshold is proposed to eliminate the influence of the manually chosen K value on the clustering effect of the K-means algorithm.

In the process of K-means clustering, a distance threshold α is introduced, which is predicted by the hard clustering algorithm and dynamically optimized during the algorithm execution. The algorithm adopts distance as the similarity evaluation index, and the introduction of α makes it possible to place outlier data objects in clusters of their own, each of which serves as a new cluster center in subsequent training. The dynamic adjustment of the cluster centers can, to a certain extent, eliminate the influence of the manually chosen K value on the clustering effect of the K-means algorithm.
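The effect of the threshold can be illustrated with a short sketch of one assignment pass. It is a simplified reading of the idea, not the paper's procedure: the function name `assign_with_threshold` is hypothetical, Euclidean distance stands in for the similarity measure, and α is passed in as a fixed number rather than being predicted and optimized.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric sequences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def assign_with_threshold(X, centers, alpha):
    """One assignment pass in which far-away objects open new clusters.

    Each object joins its nearest center if that center lies within alpha;
    otherwise the object itself becomes a new center, so the number of
    clusters can grow beyond the initial k.
    """
    centers = [list(c) for c in centers]  # work on a copy
    labels = []
    for x in X:
        dists = [euclidean(x, c) for c in centers]
        j = min(range(len(centers)), key=dists.__getitem__)
        if dists[j] <= alpha:
            labels.append(j)
        else:
            centers.append(list(x))          # outlier becomes a stand-alone cluster
            labels.append(len(centers) - 1)
    return centers, labels
```

Starting from a single center, a pass over two distant groups of points ends with two centers, which shows how the cluster count adapts to the data instead of being fixed in advance.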

The K-means clustering algorithm based on three-way dynamic thresholds is as follows.

1. Input: data sample set X = {x1, x2, …, xn} and the number of clusters k.
2. Initialize: randomly select k data objects from the data set X to form the first cluster center point set U = {u1, u2, …, uk}.
3. Loop:
4.   Repeat:
5.     Calculate the distance between all remaining xi in X and the k cluster center points uj.
6.     Update the clusters C = {C1, C2, …, Ck}.
7.     Calculate the threshold α.
8.     Traverse all xi in X = {x1, x2, …, xn}:
         IF d(xi, uj) ≤ α, xi belongs to Cj;
         ELSE let uk+1 = xi and update the center point set to U = {u1, u2, …, uk, uk+1}, where Ck+1 is the stand-alone cluster in X and the cluster number is updated to k + 1.
9.     Recalculate the new mean in each cluster Cj.
10.     Update the cluster center point set U.
11.     Calculate the objective function J.
12.   Until: J converges.
13. Obtain the two-way clustering result: C = {C1, C2, …, Ck}.
14. Calculate .
15. Traverse every class Cj in C:
     ∀xi ∉ Cj, consider the q neighborhood of xi, i.e., Neig_q(xi), which is the set of q data points closest to xi. IF Neig_q(xi) ∩ Cj ≠ ∅, we have xi ∈ Fr(Cj);
     ∀xi ∈ Cj, IF Neig_q(xi) ⊆ Cj, xi ∈ Co(Cj); ELSE xi ∈ Fr(Cj).
16. Obtain the core region set Co(C) and the boundary region set Fr(C).
17. Let X = Fr(C), then randomly select k data objects from X to form the first cluster center point set.
18. Do "Loop" on the set X.
19. Obtain the two-way clustering result C'.
20. Output: the final clustering result {Co(C), C'}.

The final clustering result consists of two data sets: the first contains all the core region data objects obtained through deterministic division, and the second contains the deterministic data objects obtained from the data in the uncertain boundary region by a secondary deterministic division. The accuracy of the resulting clustering set is significantly higher than that of the traditional two-way clustering algorithm.
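The split into core and boundary regions can be sketched as below. The rule used here is an assumption chosen for illustration: an object stays in the core region only if all of its q nearest neighbours share its two-way cluster label, and it is sent to the boundary region otherwise. The function name `three_way_split` and the use of squared Euclidean distance are likewise hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance (enough for ranking neighbours)."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def three_way_split(X, labels, q):
    """Split objects into core and boundary index lists.

    X      : list of points
    labels : two-way cluster label of each point
    q      : neighbourhood size
    """
    n = len(X)
    core, boundary = [], []
    for i in range(n):
        # Indices of the q objects closest to X[i], excluding X[i] itself.
        others = [j for j in range(n) if j != i]
        neigh = sorted(others, key=lambda j: sq_dist(X[i], X[j]))[:q]
        if all(labels[j] == labels[i] for j in neigh):
            core.append(i)       # neighbourhood agrees: decide immediately
        else:
            boundary.append(i)   # neighbourhood disagrees: defer the decision
    return core, boundary
```

The boundary indices returned here correspond to the objects that would be clustered a second time, while the core indices keep their first-pass labels.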

3.3. Similarity Measure of the Multiple Perspective Cosine Distance

Euclidean distance is a common measure of the distance between samples used in clustering algorithms. As shown in equations (4) and (5), the traditional K-means clustering method achieves the purpose of clustering by minimizing the sum of the distances between each sample and the center of its class.
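For reference, a common way to write this objective is sketched below. The symbols $J$, $x_i$, and $u_j$ follow the algorithm listing in Section 3.1, while $C_j$ (the $j$-th cluster), $D$ (the data dimension), and the squared form of the distance are assumptions of this sketch rather than a quotation of equations (4) and (5).

```latex
J = \sum_{j=1}^{k} \sum_{x_i \in C_j} d(x_i, u_j)^2,
\qquad
d(x_i, u_j) = \lVert x_i - u_j \rVert_2
            = \sqrt{\sum_{l=1}^{D} \left(x_{il} - u_{jl}\right)^2}
```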

Among the methods of measuring similarity between samples, Euclidean distance focuses on the numerical differences of attribute values between samples, while cosine distance ignores the numerical differences and focuses on whether the values of the samples point in a consistent direction across dimensions. For wireless network data with higher dimensions, both traditional measurement methods have their limitations. In this paper, an improved cosine distance measurement method is introduced into the K-means clustering algorithm for wireless network data, and the similarity between data objects in a wireless network is measured from multiple perspectives to obtain a more reasonable and realistic similarity between two data objects, thus making the clustering results more ideal. The distance based on cosine can be expressed as:

$$d(x_i, x_j) = 1 - \cos(x_i, x_j) \quad (6)$$

where $\cos(x_i, x_j)$ is the cosine value of the angle between $x_i$ and $x_j$, measuring the similarity between the data objects [18]. According to equation (6), the cosine distance can be regarded as the angle between two objects observed from the perspective of the origin. Therefore, the cosine distance can also be expressed as:

$$d(x_i, x_j) = 1 - \cos(x_i - 0,\; x_j - 0) \quad (7)$$

Equation (7) only takes 0 as the reference point, and the angle between two objects is only the angle from the origin, as shown in Figure 5(a). However, if two data objects are approximately in line with the origin, the cosine distance measurement with the origin as the only reference point will lose its effect, as shown in Figure 5(b). Therefore, cosine distance measurement from multiple perspectives will be effective in solving this problem.

Introduce a third nonorigin point $h$ as the reference point, and the distance between the data can be expressed as:

$$d(x_i, x_j \mid h) = 1 - \cos(x_i - h,\; x_j - h) \quad (8)$$

When measuring the similarity between two data objects, we can observe the angle between two data objects from each point $h$ in the reference point set $H$, that is, the angle between vectors $x_i - h$ and $x_j - h$. The distance between data $x_i$ and $x_j$ can be expressed by the mean of the cosine distance observed from multiple reference points:

$$d(x_i, x_j) = \frac{1}{|H|} \sum_{h \in H} \left[ 1 - \cos(x_i - h,\; x_j - h) \right] \quad (9)$$

where $|H|$ is the cardinality of the set $H$.
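The mean over reference points can be sketched in a few lines of Python. The function names `cosine` and `multi_view_cosine_distance` are hypothetical, the distance from each reference point is taken as one minus the cosine, and the convention of returning a cosine of 0 for a zero-length vector is an assumption added so the sketch never divides by zero.

```python
import math


def cosine(u, v):
    """Cosine of the angle between vectors u and v (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return dot / (nu * nv)


def multi_view_cosine_distance(x, y, refs):
    """Mean cosine distance between x and y as seen from every reference point."""
    total = 0.0
    for h in refs:
        u = [a - c for a, c in zip(x, h)]  # x observed from h
        v = [b - c for b, c in zip(y, h)]  # y observed from h
        total += 1.0 - cosine(u, v)
    return total / len(refs)
```

For two points on the same ray from the origin, such as (1, 1) and (2, 2), the origin alone gives a distance of about zero, whereas adding a second reference point such as (1, 0) yields a positive distance, which is the situation the multiple-perspective measure is meant to handle.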

The main idea of selecting reference points for multiple perspectives is as follows.

Assume that is the point on the outer hypersphere of the unit hypercube in the -dimensional space, and is the center of the sphere. When point is selected on the unit hypersphere according to the equal angular step of the spherical coordinates, in the Cartesian coordinate system , the Cartesian coordinates of point can be calculated as follows:

In particular, suppose that is any a point on the unit sphere in three-dimensional space, is the angle between and the -axis, and is the angle between the projection of on the plane and the -axis, as shown in Figure 6(a). When choosing point on the unit hypersphere by equal angular step, let , where is the radian step length of whose value varies with . is the coordinate of on the -axis, and is the projection of on the plane . Let , and we can obtain the coordinates of the vector on the -axis and -axis: where is the spatial dimension, and the value of in three-dimensional space is 3. Thus, the coordinates of point in the spatial Cartesian coordinate system are . For example, in three-dimensional space, when is selected, the datum point coordinates obtained by the multiple perspective method are shown in Table 1 and Figure 6(b).
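Equal-angular-step sampling in three dimensions can be sketched as follows. This is only an illustration of the general idea: the function name `sphere_reference_points`, the convention that the polar angle and the azimuth share one step of π/n_steps, and the rounding used to merge duplicate points are assumptions, so the resulting points need not match Table 1.

```python
import math


def sphere_reference_points(n_steps):
    """Reference points on the unit sphere taken at equal angular steps.

    theta is the angle from the z-axis (0..pi) and phi the angle of the
    projection from the x-axis (0..2*pi); both advance by pi / n_steps.
    """
    step = math.pi / n_steps
    points = set()
    for i in range(n_steps + 1):
        theta = i * step
        for j in range(2 * n_steps):
            phi = j * step
            raw = (
                math.sin(theta) * math.cos(phi),  # x coordinate
                math.sin(theta) * math.sin(phi),  # y coordinate
                math.cos(theta),                  # z coordinate
            )
            # Round away floating-point noise so repeated poles collapse;
            # adding 0.0 turns -0.0 into 0.0.
            points.add(tuple(round(v, 12) + 0.0 for v in raw))
    return sorted(points)
```

With `n_steps = 2`, the sketch returns the six points where the coordinate axes meet the unit sphere.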

The reference point set contains points from all angles, so the cosine distance can more reasonably measure the similarity between two high-dimensional data objects when multiple perspectives are used. In this paper, the multiple perspective cosine distance is used as the distance measurement method of the improved K-means clustering and applied to the wireless network intrusion detection algorithm, and more accurate detection results are obtained. Compared with the traditional Euclidean distance, the cosine distance used to calculate the distance between high-dimensional data objects ensures a higher detection rate and a lower false detection rate. Unfortunately, its time complexity increases significantly, which decreases the detection efficiency. Therefore, the PCA method is used to reduce the dimensionality of the wireless network data set, reducing the impact of the time complexity on the efficiency of intrusion detection.

4. Improved K-Means Clustering Intrusion Detection Algorithm for the Wireless Networks Based on Federated Learning

In the improved K-means clustering intrusion detection algorithm for wireless networks based on Federated Learning, each participant holding a local training data set does not need to share its private data and preprocesses its local data by itself. All participants use the processed data sets to train the classifier model and transmit the related parameters in a timely manner. In the next iteration, each participant trains the classifier model after downloading the latest global model parameters. The iteration is repeated until the overall model reaches the optimum, so that the classifier achieves a good detection effect on all data sets, as shown in Figure 7.

During the actual training, participants and the server exchange relevant parameters at the proper time. In this paper, we assume that the participants are independent and equal and that their training data sets are equal or nearly equal in size. Therefore, the server takes the arithmetic average of the model parameters uploaded by the participants. The algorithm is as follows:

1. Input: data sample set X, the step length N of the multiple perspectives, the initial cluster number k, the PCA weight vector, and the largest number of iterations T.
2. Initialize: reduce the dimensionality of all the data objects in the local data set by use of PCA; set t = 0.
3. Repeat:
4.   Each participant calculates αt according to the improved K-means algorithm.
5.   Each participant passes the K-means clustering threshold αt to the server.
6.   The server averages the received thresholds and sends the new threshold αt+1 to each participant.
7.   t = t + 1.
8. Until: t = T.
9. Output: clustering result sets C.
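The exchange between participants and server can be sketched as a simple loop. The sketch assumes that the only shared parameter is the scalar threshold and that the server uses an unweighted mean; the names `federated_average` and `run_rounds` are hypothetical, and each participant is modelled as a callable that takes the current global threshold and returns its locally updated one, standing in for local clustering on private data.

```python
def federated_average(local_params):
    """Server step: arithmetic mean of the thresholds uploaded by participants."""
    return sum(local_params) / len(local_params)


def run_rounds(participants, alpha0, T):
    """Run T communication rounds and return the final global threshold.

    participants : list of callables, each mapping the global threshold to
                   a locally trained threshold (local data never leaves them)
    alpha0       : initial global threshold
    T            : number of rounds
    """
    alpha = alpha0
    for _ in range(T):
        # Every participant trains locally, starting from the global threshold.
        local = [train(alpha) for train in participants]
        # The server only ever sees the uploaded thresholds.
        alpha = federated_average(local)
    return alpha
```

If, for example, each participant's local update moves halfway toward its own preferred threshold, the global value settles at the mean of those preferences after enough rounds.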

5. Experiments and Result Analysis

The experimental equipment in this paper consists of 11 laptop computers, each with the Windows 10 operating system, an Intel i5 CPU, and 8 GB of memory; one of them acts as the server. The experimental data set is the wireless network data set AWID. The development environment is Python 3.7. The comparative tests are as follows.
(1) For detection rate and false detection rate, the proposed algorithm is compared with the intrusion detection algorithms based on traditional KNN classification and the density clustering DBSCAN.
(2) For detecting unknown attack types, the proposed algorithm is compared with the intrusion detection algorithms based on traditional KNN classification and the density clustering DBSCAN.

5.1. Experimental Data Set

The AWID data set, collected by Kolias et al., is the largest and most comprehensive network attack data set collected in a real WiFi network environment [21, 22]. According to the attack type level, the data set is divided into two subsets: the CLS data set with four major attack types and the ATK data set with 16 sub attack types. The 16 sub attack types of the latter are included in the four major attack types of the former. For example, the Caffe-Latte, Hirte, Honeypot, and EvilTwin attack types in the ATK data set belong to the camouflage attack type in the CLS data set. The AWID data set is also available in two versions: a complete data set and a condensed data set. This paper uses the condensed version of the CLS data set. The distribution of data types in the data set is shown in Table 2, and a normal data record in the data set is

(0,0,0,1393661303,0.055325,0.055325,0.081227,159,159,0,0,0,0,26,1,1,1,1,0,1,0,0,0,0,0,1,0,0,1,0,0,0,0,0x00000000,0,0,0,2101680214,0,0,0,0,1,0,0,0,1,2437,0,1,0,1,0,0,0,0,0,0,0,0,-32,1,0,0x08,0,0,8,0x00,0,0,0,0,0,0,0,ff:ff:ff:ff:ff:ff,ff:ff:ff:ff:ff:ff,00:13:33:87:62:6d,00:13:33:87:62:6d,00:13:33:87:62:6d,0,3684,0,0,0,0,0,0,1,1,0,0x0000,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0x00000044a18671bc,100,0,0,0,0,0,0,0,0,0,1,OTE29224e,6,0,1,0,0x00,0,1,2,2,1,2,0,0,0x0000,0x0000,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0).

The process of data set preprocessing includes data completion, data rationalization, numerization of character data, data normalization, and reduction of the data attribute dimensionality.

5.1.1. Data Clipping

In the AWID data set, some attributes of a few network data records are missing. To ensure the effectiveness of the results, we delete the attributes with a missing rate of 80% or more and fill the remaining missing attribute values with 0.
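This clipping rule can be sketched in plain Python. The function name `clip_missing` is hypothetical, and the sketch assumes that records are equal-length lists in which a missing value is represented by `None`; the actual marker used in the data files may differ.

```python
def clip_missing(rows, max_missing_rate=0.8, missing=None):
    """Drop mostly-empty attributes and zero-fill the remaining gaps.

    rows : list of equal-length records
    Returns (cleaned_rows, kept_column_indices).
    """
    n_rows = len(rows)
    n_cols = len(rows[0])
    # Keep a column only if its missing rate is strictly below the limit.
    keep = [
        c for c in range(n_cols)
        if sum(1 for r in rows if r[c] is missing) / n_rows < max_missing_rate
    ]
    # Replace any remaining missing value with 0.
    cleaned = [[0 if r[c] is missing else r[c] for c in keep] for r in rows]
    return cleaned, keep
```

A column that is missing in four of five records sits exactly at the 80% limit and is therefore removed, while a column with a single gap is kept and the gap is filled with 0.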

5.1.2. Data Selection

The number of normal behavior records in AWID is far greater than the number of attack behavior records, as is the case in a real network environment. Therefore, we select normal behavior records and attack behavior records at a ratio of 10:1 as the training data set to construct the classifier. To fully verify the detection performance of the proposed algorithm on data behaviors of different attack types, we also select normal behavior records and attack behavior records at a ratio of 10:1 to build the test data set. In addition, the training and test data sets allocated to each participant contain different numbers of attack behaviors. This reproduces, as far as possible, the situation in a real wireless network environment in which the local data sets of some users contain few or no records of certain attack behaviors, while still allowing those users to detect such attacks through the training of the Federated Learning detection model.

5.1.3. Numerization of Character Data

The hexadecimal attribute values in AWID are converted to decimal values, each MAC address attribute value is converted to the number of its occurrences in the whole data set, and attribute values in character form are numerically encoded by the one-hot encoding method [23, 24]. Character attributes encoded in this way retain the influence of the original attribute on the clustering result more reasonably.
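The three conversions can be sketched with the standard library alone. The helper names `hex_to_decimal`, `mac_to_counts`, and `one_hot` are hypothetical, and sorting the categories before one-hot encoding is an assumption made so that the column order is reproducible.

```python
from collections import Counter


def hex_to_decimal(value):
    """Convert a hexadecimal string such as '0x08' to its decimal integer."""
    return int(value, 16)


def mac_to_counts(macs):
    """Replace each MAC address by how often it occurs in the whole column."""
    counts = Counter(macs)
    return [counts[m] for m in macs]


def one_hot(values):
    """One-hot encode a column of character values.

    Returns (encoded_rows, categories), where categories gives the column order.
    """
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    encoded = [
        [1 if index[v] == i else 0 for i in range(len(categories))]
        for v in values
    ]
    return encoded, categories
```

For instance, the field `0x08` from the sample record becomes 8, and a MAC address that appears twice in a column of three is replaced by 2.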

5.1.4. Dimensionality Reduction of Data Attributes

The wireless network data in AWID have 154 attributes. In this paper, we delete all attributes whose values are identical across the data set before the experiment and use PCA to extract the attributes with a greater contribution rate, which reduces the dimensionality of the wireless network data and, in turn, the time complexity of the detection algorithm.

5.1.5. Data Standardization

Different attributes in the data set have different ranges. To reduce the impact of such differences on the detection model, we apply z-score standardization [25, 26] to the data so that each feature has zero mean and unit variance. When distance is used to measure similarity and PCA is used to reduce dimensions, z-score normalization works better than min-max normalization in classification and clustering algorithms. The standardization is

$$x_i' = \frac{x_i - \mu}{\sigma}$$

where $x_i'$ represents the normalized value of $x_i$, $x_i$ represents the $i$-th value of the feature, $\mu$ represents the mean value of the feature, and $\sigma$ represents the standard deviation of the feature.
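A minimal sketch of z-score standardization for a single feature column is given below. The function name `z_score` is hypothetical, the population standard deviation is used (the sample version would be an equally valid choice), and mapping a constant column to zeros is an assumption added to avoid division by zero.

```python
import statistics


def z_score(column):
    """Standardize one feature column to zero mean and unit variance."""
    mu = statistics.mean(column)
    sigma = statistics.pstdev(column)  # population standard deviation
    if sigma == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0] * len(column)
    return [(x - mu) / sigma for x in column]
```

Applied to each attribute separately, this puts features with very different ranges, such as timestamps and flag bits, on a comparable scale before distances are computed.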

5.2. PCA Method for Dimensionality Reduction of Wireless Network Data

In wireless network data, each piece of network data often involves dozens or even hundreds of attribute variables. Too many attribute variables will not only increase the time complexity of the detection algorithm but also bring difficulties to the reasonable analysis of detection results. Although each attribute variable of the network data provides a certain amount of information, its importance and contribution vary. Moreover, in most cases, there is a certain correlation between various attribute variables of network data, which makes the information provided by these attribute variables overlap to a certain extent and affects the accuracy of detection results. Therefore, we adopt the PCA method to deal with these attribute variables and replace the original attribute variables with a small number of variables, so as to achieve dimensionality reduction of wireless network data [27]. The dimensionality reduction process is as follows.

1. Input: the wireless network data set and the principal component cumulative variance contribution threshold R.
2. Initialize: Construct the initial forward transformation weight vector , the initial backward transformation weight vector , and the backward transformation data set .
3. i =1.
4. Repeat
   IF reaches its maximum, get
   
5.Until
6. Determine m, the number of selected principal components, according to R and then obtain .
7. IF , the mean square error, reaches its minimum, get the best and .
8. Obtain the final according to the best .
9. Output: the final principal component data set .

From the wireless network data set AWID (154 attributes) [28], the 77 attributes that influence the clustering results were extracted, and their dimensionality was reduced by the PCA method. The variance contribution rate and cumulative variance contribution rate of the principal components are shown in Table 3. When PCA is used to reduce the dimensionality of wireless network data sets, the appropriate number of principal components can be selected by adjusting the threshold of the cumulative variance contribution rate. The number of principal components directly affects how well the original network data are characterized. Choosing too few principal components to replace the original data may lead to poor clustering results and greatly reduce the detection performance of the intrusion detection algorithm, while choosing too many defeats the purpose of dimensionality reduction. Therefore, the number of principal components needs to be chosen according to the specific algorithm and its function, so as to reduce the data dimensionality as far as possible while preserving the high performance of the algorithm. After many experiments, this paper selects the first 16 attributes after dimensionality reduction to carry out the intrusion detection experiments, which gives the best detection results. When additional attributes were added, the time complexity gradually increased, but the intrusion detection results did not change significantly, so the first 16 attributes were selected in this paper.
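Selecting the number of principal components from a cumulative contribution threshold can be sketched as follows. The function name `select_components` is hypothetical, and the sketch assumes that the per-component variance contribution rates are already available in descending order, as they would be in a table such as Table 3; the values in the usage example are made up.

```python
def select_components(variance_ratios, R):
    """Smallest number of leading components whose cumulative
    variance contribution reaches the threshold R.

    variance_ratios : contribution rate of each component, largest first
    R               : cumulative contribution threshold, e.g. 0.85
    """
    total = 0.0
    for m, ratio in enumerate(variance_ratios, start=1):
        total += ratio
        if total >= R:
            return m
    # Threshold not reachable: keep every component.
    return len(variance_ratios)
```

With illustrative contribution rates of 0.5, 0.3, 0.1, and 0.1 and a threshold of 0.85, the first three components are kept.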

5.3. Analysis of Experimental Results

In this paper, the detection rate ACC and the false detection rate FAR are used as the performance evaluation indexes of the wireless network intrusion detection algorithm [29]. They are defined as follows:

(1) Detection rate ACC is the ratio of the amount of network data whose category is judged correctly to the total amount of network data. The higher the detection rate, the better the performance of the intrusion detection algorithm.

$$\mathrm{ACC} = \frac{TP + TN}{TP + TN + FP + FN}$$

(2) False detection rate FAR is the ratio of the amount of normal behavior data wrongly judged as attack behavior to the total amount of normal behavior data. The lower the false detection rate, the better the detection performance of the algorithm.

$$\mathrm{FAR} = \frac{FP}{FP + TN}$$

where $TN$ (true negative) is the amount of network data correctly identified as normal; $TP$ (true positive) is the amount of network data in which attack behavior is correctly identified as the attack type; $FN$ (false negative) is the amount of network data in which attack behavior is misidentified as normal; and $FP$ (false positive) is the amount of network data in which normal behavior is wrongly identified as some attack behavior.
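As a small illustration of the two indexes defined above, the following sketch computes ACC and FAR from the four counts TP, TN, FP, and FN. The function name `detection_metrics` and the example counts are hypothetical and are not taken from the experiments.

```python
def detection_metrics(tp, tn, fp, fn):
    """Return (ACC, FAR) from the four confusion counts.

    ACC = (TP + TN) / (TP + TN + FP + FN)
    FAR = FP / (FP + TN)
    """
    counts = (tp, tn, fp, fn)
    if any(count < 0 for count in counts):
        raise ValueError("counts must be non-negative")

    total = tp + tn + fp + fn   # all network data records
    normal = fp + tn            # all records that are truly normal
    if total == 0 or normal == 0:
        raise ValueError("need at least one record and one normal record")

    acc = (tp + tn) / total
    far = fp / normal
    return acc, far


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    acc, far = detection_metrics(tp=80, tn=90, fp=10, fn=20)
    print(f"ACC = {acc:.3f}, FAR = {far:.3f}")
```

The two indexes are complementary: ACC is computed over all records, whereas FAR looks only at normal traffic, so a detector can score a high ACC on attack-heavy data while still raising many false alarms.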

To avoid chance results caused by testing the intrusion detection algorithm on a single experimental data set, two experimental data sets of different sizes and with different attack behavior classes are randomly selected from the CLS data set. The structure of the data sets used in the experiments is shown in Tables 4 and 5, according to which the sample data are extracted from the CLS data set. The attack behavior data in each data set contain several unknown attack behaviors for the corresponding number of classes (disguised by known attack behavior), which are used in the comparative experiments on the ability of the intrusion detection algorithms to detect unknown attack behavior.

5.3.1. Experiments of ACC and FAR

Intrusion detection algorithms based on traditional KNN classification and density clustering DBSCAN are compared with the proposed algorithm. When the number of participants is 2, 4, 6, 8, and 10, ten test data sets are randomly selected from the test data packets for comparative experiments. The experimental results are shown in Figures 8-12.

5.3.2. Experiments of Detecting Unknown Attack Types

For the detection of unknown attack types, the proposed algorithm based on Federated Learning is compared with intrusion detection algorithms based on traditional KNN classification and density clustering DBSCAN. When the number of participants is 2, 4, 6, 8, and 10, ten test data sets are randomly selected from the test data packets for comparative experiments. The experimental results are shown in Figures 13-17.

The above comparative experiments show that, as the number of participants increases, the detection performance of the proposed algorithm based on Federated Learning, in terms of detection rate, false detection rate, and other aspects, remains at a relatively stable level. This supports the feasibility of the detection algorithm in a real network environment where local data are protected and training data are scarce. Compared with the intrusion detection algorithms based on traditional KNN classification and density clustering DBSCAN, the proposed algorithm shows a significant improvement in detection rate, false detection rate, and detection of unknown attack behavior. After dimensionality reduction by the PCA method, the AWID data set still represents the characteristics of the original attributes well, which reduces the time complexity and improves the detection efficiency while maintaining a high detection rate and a low false detection rate.

6. Conclusion

To allow multiple participants to cooperatively train a global model without sharing their private data, to protect the participants' local data, and to expand the amount of data available for training the model, this paper proposes an improved k-means clustering intrusion detection algorithm based on Federated Learning for wireless networks. The algorithm incorporates the idea of three-way decisions and introduces a multiple-perspective cosine distance as the similarity measure between data objects to improve the k-means clustering algorithm. As a result, the clustering result is more reasonable and the behavior of network data is determined more accurately, so the detection rate of the algorithm increases and the false detection rate decreases. The algorithm, however, assumes the ideal condition that the parameters transmitted between the participants and the server are not interfered with, which may differ from the real network environment. Moreover, the reference point set selected by the multiple-perspective method in this paper is large enough to affect the overall performance of the algorithm. In the future, we will continue to improve the structure of the algorithm and use more appropriate data sets to train the classifier, while ensuring the security of the interaction between the participants and the server. We will also look for a more reasonable and effective way to select the reference points and to reduce the dimensionality of the experimental data, in order to further reduce the time complexity and improve the overall performance of the algorithm.

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors would like to thank the anonymous reviewers for their valuable suggestions, which significantly improved the quality of the manuscript. This work was supported by the National Natural Science Foundation of China under Grant no. 62076088, the Natural Science Foundation of Hebei Province of China under Grants no. F2019205163 and no. F2021205004, the Science Foundation of Returned Overseas of Hebei Province of China under Grant no. C2020342, and the Technological Innovation Foundation of Hebei Normal University under Grant no. L2020K09.