Abstract
Federated learning is a machine learning framework proposed in recent years. In horizontal federated learning, multiple participants cooperate to train a common final model, transmitting only their locally updated models instead of their local datasets. Some participants do not use valid local datasets but instead provide disguised model parameters to take part in federated training and obtain the jointly trained model. This attack is called the Free-rider attack. To the best of our knowledge, prior research has proposed several Free-rider attack strategies with theoretical support, but there is little research on Free-rider attack detection. Moreover, the models disguised by attackers using certain attack strategies are similar to real models in terms of convergence and weights, so it is difficult to detect the models provided by such attackers as abnormal data. Based on DAGMM, a high-dimensional abnormal-data detection model, this paper optimizes the sample processing and the compression network and proposes an improved detection algorithm called Delta-DAGMM. Two types of large datasets are used for experiments. The experimental results show that Delta-DAGMM achieves higher precision and F1 score than DAGMM: on average, the Delta-DAGMM algorithm achieves a precision of 98.42% and an F1 score of 98.36%.
1. Introduction
Data security and privacy protection have gradually become a focus of attention for major Internet companies and research institutions. With the continuous introduction of relevant laws and regulations in various countries, how to conduct deep learning without infringing on the privacy of others has become a key research question, and a framework called federated learning [1] was proposed to address it. In federated learning, participating entities do not need to share their local datasets; they only transmit their locally trained model updates, thereby protecting their own data privacy.
According to the characteristics of the data distribution across training participants, federated learning can be divided into three types: vertical federated learning, horizontal federated learning, and federated transfer learning [2]. Horizontal federated learning is also known as sample-based federated learning: the features of the participants' datasets largely overlap, but the sample sources of the datasets are different. For example, two tumor hospitals in different regions may each use their patients' tumor image information as samples for horizontal federated training.
In horizontal federated learning, a parameter server coordinates all clients for iterative training. The parameter server first initializes a global model. In each round of training, the parameter server distributes the global model to each participant client. Each participant client trains on its local data starting from the global model to obtain a local updated model and uploads it to the parameter server. The parameter server receives the local models of all clients and uses the federated averaging algorithm (FedAvg) [3] for model aggregation to obtain the global model for the next round of training. However, some malicious or dishonest clients upload fake local models to the parameter server without performing local training, as shown in Figure 1; this attack is known as the Free-rider attack [4].
There has been little research on Free-rider attacks in federated learning. Among the few existing studies, Fraboni et al. [5] summarized several Free-rider attack methods and provided theoretical support. Lin et al. [4] proposed that the DAGMM model can be used to detect abnormal data, but did not provide experimental results such as a formal defense and detection method or its accuracy. DAGMM is a detection model for high-dimensional abnormal data [6]. Through experiments, we found that this model achieves high precision in detecting two Plain Free-rider attack strategies. The reason is that the local model parameters generated by attackers using these strategies are filled with random or fixed numbers, so these parameters can be detected as high-dimensional anomalies compared with the local model parameters provided by fair participants. For attack strategies such as directly copying the global model or adding differential perturbation to the global model, the detection precision of DAGMM is not high: the attackers' model parameters differ little from those of fair participants in terms of gradient and convergence, so it is difficult for DAGMM to detect them as abnormal data.
To overcome the limitations of existing methods, we improve the DAGMM model and design an optimized Free-rider attack detection method, Delta-DAGMM. To effectively detect attackers using disguised Free-rider attack strategies, this detection method introduces a sample-treatment step on the input samples, which are the model parameters transmitted by the participants. We calculate the increment of each participant's model parameters relative to the global model parameters of the current round, and then feed the samples, after linear processing, into the compression network, where sample features are extracted. Finally, we input the sample features into the estimation network to compute the energy/likelihood, and a threshold is set to determine whether a participant is a Free-rider attacker. To verify the generality of the method, we selected a large number of samples of different types for experiments. We also compared this method with DAGMM; the experimental results show that Delta-DAGMM achieves higher precision and F1 score.
The major contributions of this paper include:
(i) An optimized detection algorithm for high-dimensional abnormal model parameters, Delta-DAGMM, is proposed to detect Free-rider attackers using various attack strategies.
(ii) For the disguised Free-rider attack strategies, the sample-treatment step applied to the detection model's input is optimized: the increments of the local updated models relative to the global model are flattened to obtain the input samples of the detection model.
(iii) To capture the sample features accurately, we optimize the feature extraction of the DAGMM compression network so that the output energy/likelihood of the Free-rider attackers' model parameters becomes larger and they can be more easily evaluated as abnormal data in the estimation network.
The rest of this paper is organized as follows. In Section 2, previously proposed methods of attack, defense, and detection in horizontal federated learning are reviewed. Section 3 explains preliminary knowledge, including the Free-rider attack and the DAGMM model. Section 4 details the Delta-DAGMM detection method we propose. Section 5 presents the complexity and convergence analysis of the model. Section 6 presents the experimental results and discussion. Section 7 concludes this paper.
2. Related Work
Many scholars have researched and analyzed attack and detection methods in federated learning, and their work is valuable to us and to later researchers. This section introduces the related work.
2.1. Attacks in federated learning
Since the framework of federated learning was proposed, research on its security has been very active. The known types of attacks in federated learning are as follows:
(i) Attackers maliciously modify the dataset to degrade model performance, for example by flipping one label of the model to another, wrong label. This type of attack is called a poisoning attack [7]. There are also distributed poisoning attacks in federated learning [8]. For example, Xie et al. [9] proposed DBA (Distributed Backdoor Attack), which has a higher success rate, better convergence, and more flexibility than centralized backdoor attacks [10], and can evade two robust FL detection methods.
(ii) While participating in federated training, attackers infer the model parameters of other participants from their own local updated model parameters and the received global model parameters, and then infer the dataset information of the other participants. This type of attack is called an inference attack or privacy attack [11]. For example, Wang et al. [12] proposed an attack method that uses a multi-task discriminator to identify sample class, client, identity, and other information. Nasr et al. [13] designed a white-box inference attack exploiting the properties of the stochastic gradient descent algorithm.
(iii) The attacker pretends to train, but instead of using its own dataset to participate in training, it uploads disguised model parameters. This is the Free-rider attack that we study. Fraboni et al. [5] proposed theoretical and experimental analyses of the Free-rider attack, providing a formal guarantee that such attacks converge to the aggregation model of the fair participants.
2.2. Attack detection in federated learning
Reputation methods have been proposed to detect these attacks. For example, Kang et al. [14] proposed a decentralized consortium-blockchain approach for efficient reputation management of participants. Kang et al. [15] also proposed a reputation-based federated learning security scheme designed with a multi-weight model, which can significantly improve learning accuracy. In addition, some game-theoretic methods have been proposed to prevent attacks while enforcing fair contributions. For example, Hu et al. [16] proposed a collective extortion strategy for the incomplete-information multi-player FEL game, which can effectively help the server stimulate full contributions from all devices without risking economic loss.
In the field of high-dimensional and multi-dimensional abnormal data detection, traditional detection methods usually first extract features and then input the reduced-dimensional features into other available models, such as GMM [17]. Yang et al. [18] proposed an unsupervised dimensionality-reduction method combining deep learning and GMM. Zong et al. [6] proposed a deep autoencoding Gaussian mixture model (DAGMM) for detecting high-dimensional abnormal data.
DAGMM shows the best precision on public benchmark datasets and performs outstandingly in unsupervised anomaly detection of multi-dimensional or high-dimensional data. Lin et al. [4] conducted a preliminary study on Free-rider attacks and detection and proposed several Free-rider attack strategies and a detection method, but did not provide a theoretical basis for the attack types or a normative, formal detection method. Fraboni et al. [5] theoretically analyzed and formalized Free-rider attacks and mentioned that high-dimensional anomaly detection models (such as DAGMM) can be used for attack detection, but they did not conduct in-depth research or experiments on attack detection.
On the basis of DAGMM, we propose a new detection algorithm called Delta-DAGMM.
3. Preliminaries
3.1. Free-rider attack method
The research of Fraboni et al. [19] showed that there are two types of Free-rider attacks in horizontal federated learning. One is the Plain Free-rider attack, whose strategy is to directly return the global model parameters obtained in each round, or to replace them with random numbers. The other is the disguised Free-rider attack, whose strategy is to add differential perturbation to the global model parameters obtained in each round.
3.2. Plain Free-rider attack
There are three main strategies of the Plain Free-rider attack:
(i) The attackers first obtain the shape of the output-layer matrix of the global model, then define a new high-dimensional matrix of the same shape and fill it with a fixed value. Finally, they return this matrix to the parameter server as the local updated model.
(ii) The attackers first obtain the shape of the output-layer matrix of the global model, then define a new high-dimensional matrix of the same shape and fill it with random numbers generated in a given range. Finally, they return this matrix to the parameter server as the local updated model.
(iii) The attackers directly return the global model parameters of the current round to the parameter server as the local updated model; that is, the local update equals the received global model.
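The three plain strategies above can be sketched as follows; the fill value, random range, and array shape are illustrative choices, not values prescribed by the paper.

```python
import numpy as np

def plain_freerider_update(global_weights, strategy, fill_value=0.1, rand_range=(-1.0, 1.0)):
    """Forge a local update without any training (illustrative sketch).

    strategy "fixed":  type (i)  -- fill a matrix of the global model's shape
                                    with a fixed value
    strategy "random": type (ii) -- fill it with random numbers in a given range
    strategy "copy":   type (iii)-- return the received global model unchanged
    """
    shape = global_weights.shape
    if strategy == "fixed":
        return np.full(shape, fill_value)
    if strategy == "random":
        low, high = rand_range
        return np.random.uniform(low, high, size=shape)
    if strategy == "copy":
        return global_weights.copy()
    raise ValueError(strategy)
```

Types (i) and (ii) produce samples statistically unrelated to honest updates, which is why DAGMM already flags them easily; type (iii) is indistinguishable from the global model itself.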
3.3. Disguised Free-rider attack
During training, we assume that a Free-rider has prior knowledge of the training process, i.e., it knows in advance the approximate standard deviation of the fair clients' local updated models and the global model in each round. The attacker processes the obtained global model parameters by adding differential time-varying perturbations, which makes the disguised model exhibit convergence similar to that of the fair clients.
In horizontal federated learning without attackers, the curves of the output-layer gradient and of the convergence of the global model parameters converge smoothly with the number of training rounds t. Therefore, the local updated model of the disguised Free-rider can be modeled as the global model plus a time-varying noise perturbation, where the noise is unit-variance Gaussian white noise modulated by a time-dependent coefficient.
The disguised Free-rider attack strategy is divided into the following two types:
(i) Linear time-varying disturbance. The perturbation is scaled by an attenuation coefficient that decays with the training round, and the Free-rider attacker's local model is updated as the global model plus this decaying noise, where a separate coefficient controls the noise level.
(ii) Exponential time-varying disturbance. The perturbation is scaled by an attenuation coefficient that decays exponentially with the training round, and the Free-rider attacker's local model is updated as the global model plus this exponentially decaying noise.
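A minimal sketch of the two disguised strategies, assuming the perturbation is unit-variance Gaussian white noise scaled by a decaying coefficient; the decay schedules and the hyper-parameters `sigma0` and `gamma` are illustrative, not the paper's values.

```python
import numpy as np

def disguised_freerider_update(global_weights, t, sigma0=0.3, gamma=1.0, mode="linear"):
    """Disguised Free-rider update: global model plus time-decaying Gaussian noise.

    mode "linear":      noise level sigma0 * t**(-gamma)   (power-law decay)
    mode "exponential": noise level sigma0 * exp(-gamma*t) (exponential decay)
    """
    if mode == "linear":
        scale = sigma0 * t ** (-gamma)
    elif mode == "exponential":
        scale = sigma0 * np.exp(-gamma * t)
    else:
        raise ValueError(mode)
    # unit-variance white noise modulated by the time-dependent coefficient
    noise = scale * np.random.randn(*global_weights.shape)
    return global_weights + noise
```

Because the noise shrinks with t, the forged updates converge at roughly the same rate as honest updates, which is what defeats vanilla DAGMM.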
Fraboni et al. [5] explained the rationality of this perturbation-based attack and proposed a method to optimize the attack effect in experiments.
3.4. DAGMM
Density estimation is one of the core methods in anomaly detection of high-dimensional data. DAGMM is a Gaussian mixture model that efficiently combines dimensionality reduction and density estimation. It mainly consists of two parts: a compression network and an estimation network.
The process of DAGMM is as follows. The deep autoencoder in the compression network reduces the dimensionality of the input samples, and the reduced-dimensional samples are then fed to the estimation network. The estimation network takes the low-dimensional sample data produced by the compression network and estimates their energy under the framework of the Gaussian Mixture Model (GMM). A high energy indicates that the data may be anomalous.
In the research of Free-rider attacks, Fraboni et al. [5] proposed that DAGMM could be used as a means to detect Free-rider attackers.
DAGMM is an end-to-end trained, unsupervised, high-dimensional anomaly detection model. Through the joint optimization of the compression network and the estimation network, it solves the problem of large reconstruction errors for anomalous samples that affects DSEBM and other detection methods. After a large number of experiments, we found that DAGMM achieves high precision in detecting Plain attack types (i) and (ii). However, its detection precision is not high for Plain Free-rider attack type (iii) and the disguised Free-rider attack types. Such sample data are based on the real model parameters plus time-varying perturbations that match the convergence rate; they are likely to be treated as real samples obtained by training, can be reconstructed well by the estimation network in DAGMM, and the output energy may not be high enough for them to be detected as anomalous data.
Therefore, we optimize this detection model and propose a new attack detection method called the Delta Deep Autoencoding Gaussian Mixture Model (Delta-DAGMM).
4. Research Methodology Introduction
The purpose of this paper is to detect Free-rider attackers. Therefore, we propose Delta-DAGMM, a Free-rider attack detection method that consists of three steps, as shown in Figure 2. First, in each round of horizontal federated learning, we calculate the increment of each client's model parameters relative to the current global model to obtain the input samples of the detection model. Then we extract the sample features in the compression network and input them into the estimation network to obtain the energy/likelihood. Finally, we use the energy as the basis of attack detection and set a threshold to determine whether a participant is a Free-rider attacker.
4.1. Sample treatment
Sample treatment is a very important part of Delta-DAGMM's detection. We know that DAGMM can effectively detect Plain Free-rider attack types (i) and (ii), but it is not effective against the other attack strategies. Through sample treatment (ST), we can reduce the remaining three attack strategies to Plain Free-rider attack types (i) and (ii), which DAGMM detects easily.
Specifically, sample treatment is divided into two steps: data collection and incremental processing.
4.1.1. Data collection
In the training of horizontal federated learning, we assume that there are M clients participating in multiple rounds of iterative training. In round t, we denote the global model transmitted by the parameter server to all participant clients by w_t, and the participant clients' local updated models by w_t^1, w_t^2, …, w_t^M. After all participant clients have transmitted their local model updates in round t, the parameter server receives the local updated models of all participant clients. The global model of round t+1 is generated by the federated averaging algorithm (FedAvg); with n_m local samples at client m and n samples in total, it can be expressed as w_{t+1} = Σ_{m=1}^{M} (n_m / n) · w_t^m.
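The aggregation step above can be sketched as follows; the weighting by per-client sample counts is the standard FedAvg rule, with uniform weights as a fallback when the counts are not given.

```python
import numpy as np

def fed_avg(local_models, sample_counts=None):
    """Weighted federated averaging (FedAvg) over the clients' local updates.

    local_models:  list of M weight arrays of identical shape.
    sample_counts: per-client dataset sizes n_m; uniform weights if omitted.
    """
    models = np.stack(local_models)
    if sample_counts is None:
        weights = np.full(len(local_models), 1.0 / len(local_models))
    else:
        counts = np.asarray(sample_counts, dtype=float)
        weights = counts / counts.sum()
    # global model for round t+1: sum_m (n_m / n) * w_t^m
    return np.tensordot(weights, models, axes=1)
```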
The parameter server sends the obtained global model to all participant clients as the beginning of the training of round t+1, and this iteration continues until the end of the training.
Assume there are n rounds of training. In each round, we put the local updated models into a set, obtaining n sets in total, W_1, W_2, …, W_n, together with a global model set. Before the end of each round of horizontal federated training, we collect the global model parameters and the set of the clients' local updated models as the input samples for Free-rider attack detection in round t.
When horizontal federated learning uses different training models, the dimensions of the training model's output-layer parameters also differ. According to the training model used in horizontal federated learning, we divide the input samples of the Delta-DAGMM detection model into the following two categories:
(i) MLP-Federate. In horizontal federated training, we use an MLP as the training model; the set of local update model parameters obtained by participants, whether trained or disguised, is MLP-Federate. The parameters of each participant's local update model are the weight matrix of the MLP's output layer, a tensor array of length 64∗10.
(ii) CNN-Federate. In horizontal federated training, we use a CNN as the training model; the set of local update model parameters obtained by participants, whether trained or disguised, is CNN-Federate. The parameters of each participant's local update model are the weight matrix of the CNN's output layer, a tensor array of length 50∗10.
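Collecting one round's samples amounts to flattening each client's output-layer weight matrix into a feature vector; a minimal sketch, with the 64×10 MLP shape as the example:

```python
import numpy as np

def collect_samples(local_output_layers):
    """Flatten each client's output-layer weight matrix into one sample row.

    For the MLP model the output layer is 64x10 (640 features per sample);
    for the CNN model it is 50x10 (500 features per sample).
    """
    return np.stack([w.reshape(-1) for w in local_output_layers])
```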
4.1.2. Incremental treatment
In the disguised Free-rider attack, the model parameters disguised by the attacker are based on the global model parameters, with the current training round used as the parameter of the added differential perturbation. To make the model parameters exhibit convergence similar to that of the fair clients overall, the effect of the differential perturbation set by the attacker decreases round by round, but with some discreteness. The parameter server calculates the increment of the Free-rider attacker's local model parameters relative to the global model parameters of the current round; this difference is exactly the value of the differential perturbation added in the attack. Since the attacker's input sample is based on a random process and has a certain fluctuation, it is very likely to be detected as abnormal data.
To avoid evaluation errors caused by elements of the input sample having too small an absolute value, we linearly process the model increment to obtain the final input sample: x_t^m = (w_t^m − w_t) + C, where C denotes a matrix of the global model's shape filled with a preset constant. For Plain Free-rider attack type (iii), since the attacker's local model parameters equal the global model's, the sample after incremental processing is exactly the constant matrix C; this attack strategy is thus converted to filling the global model with fixed values, i.e., Plain Free-rider attack type (i). When the attacker uses the disguised Free-rider attack strategies, whether with linear or exponential time-varying disturbance, the sample after incremental processing actually consists of the time-varying disturbance values, shifted by the constant, filling a sample of the global model's shape. This is close to Plain Free-rider attack type (ii).
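The incremental treatment above can be sketched in a few lines; the offset constant `c` is an illustrative choice, not the paper's preset value.

```python
import numpy as np

def incremental_treatment(local_model, global_model, c=1.0):
    """Delta-DAGMM sample treatment (sketch).

    Subtract the current global model from the local update, then shift by a
    preset constant so element values are not vanishingly small:
        x = (w_local - w_global) + c
    A plain copy attack (type iii) then becomes a constant-filled sample,
    i.e. it reduces to the easily detected fixed-value attack (type i).
    """
    return (local_model - global_model) + c
```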
The final input samples can be divided into the following two categories according to the model selected for horizontal federated training:
(i) Delta-MLP-Federate. In the horizontal federated training experiment, the participants and the parameter server use the MLP model, and we obtain the samples by incremental processing of the local update model parameter set of each training round. The length of each input sample array is 64∗10.
(ii) Delta-CNN-Federate. In the horizontal federated training experiment, the participants and the parameter server use the CNN model, and we obtain the samples by incremental processing of the local update model parameter set of each training round. The length of each input sample array is 50∗10.
4.2. Compression network
In Delta-DAGMM, once the high-dimensional samples are generated, the compression network uses a deep autoencoder to reduce the dimensionality of each input sample and extract three kinds of features, which are then merged to obtain the compressed sample. The Delta-DAGMM we propose differs from DAGMM in the compression network: Delta-DAGMM additionally extracts the mean of all elements of the input sample as a feature, making it easier for abnormal data to ultimately produce high energy values and be detected.
4.2.1. Feature extraction
The autoencoder neural network used by the compression network is an unsupervised learning model. It uses a backpropagation algorithm to make the target value match the input value as closely as possible. It is generally used for feature extraction from high-dimensional data.
Feature extraction in the compression network has three sources:
(i) The simplified representation of the sample learned by the deep autoencoder.
(ii) The features extracted from the reconstruction error.
(iii) The mean of all the elements in the input sample.
Given the input sample x, the three features extracted by the compression network are z_c = h(x; θ_e), x̂ = g(z_c; θ_d), and z_r = f(x, x̂), where θ_e and θ_d denote the parameters of the deep autoencoder, x̂ denotes the reconstructed counterpart of sample x, h denotes the encoding function, g denotes the decoding function, and f denotes the function that computes the reconstruction-error features.
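A sketch of the three feature sources, using the relative Euclidean distance and cosine similarity that DAGMM employs as reconstruction-error features; the caller-supplied `encode`/`decode` functions stand in for a trained autoencoder, and the element mean is Delta-DAGMM's added feature.

```python
import numpy as np

def extract_features(x, encode, decode):
    """Compression-network features (illustrative sketch).

    z_c : low-dimensional code from the autoencoder
    z_r : reconstruction-error features -- relative Euclidean distance
          and cosine similarity between x and its reconstruction
    z_m : mean of all elements of the input sample (Delta-DAGMM's addition)
    """
    z_c = encode(x)
    x_hat = decode(z_c)
    rel_dist = np.linalg.norm(x - x_hat) / (np.linalg.norm(x) + 1e-12)
    cos_sim = float(x @ x_hat) / (np.linalg.norm(x) * np.linalg.norm(x_hat) + 1e-12)
    z_m = x.mean()
    return np.concatenate([z_c, [rel_dist, cos_sim, z_m]])
```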
4.2.2. Feature merging
We merge the three features z_c, z_r, and the element mean z_m as the output of the compression network and input them into the estimation network. The low-dimensional representation finally provided by the compression network is z = [z_c, z_r, z_m].
4.3. Estimation network
The estimation network estimates the density of the low-dimensional representation z under the framework of GMM, which is achieved by using a multi-layer neural network (MLN) to predict the mixture membership of each sample. The membership prediction is p = MLN(z; θ_m), γ̂ = softmax(p), where z denotes the compressed sample, the integer K is the number of mixture components in the GMM, γ̂ is the K-dimensional vector used for soft mixture-component membership prediction, and p is the output of the multi-layer network parameterized by θ_m.
Given a batch of N samples, for any 1 ≤ k ≤ K we can further estimate the important parameters of the GMM: the mixing probability φ̂_k, mean μ̂_k, and covariance Σ̂_k of GMM component k. This step is the same as the parameter-updating process of the conventional Gaussian mixture model [6]: φ̂_k = (1/N) Σ_{i=1}^{N} γ̂_{ik}, μ̂_k = Σ_i γ̂_{ik} z_i / Σ_i γ̂_{ik}, Σ̂_k = Σ_i γ̂_{ik} (z_i − μ̂_k)(z_i − μ̂_k)^T / Σ_i γ̂_{ik}.
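The mixture statistics above can be computed from the soft memberships as follows; this is the standard DAGMM-style estimation step, written here as a self-contained sketch.

```python
import numpy as np

def gmm_parameters(z, gamma):
    """Mixture statistics from soft memberships.

    z:     (N, D) compressed samples
    gamma: (N, K) soft mixture-membership predictions (rows sum to 1)
    Returns mixing probabilities phi (K,), means mu (K, D),
    and covariances sigma (K, D, D).
    """
    N = z.shape[0]
    phi = gamma.sum(axis=0) / N                       # phi_k = (1/N) sum_i gamma_ik
    mu = gamma.T @ z / gamma.sum(axis=0)[:, None]     # membership-weighted means
    diff = z[:, None, :] - mu[None, :, :]             # (N, K, D) deviations
    sigma = np.einsum('nk,nki,nkj->kij', gamma, diff, diff) / gamma.sum(axis=0)[:, None, None]
    return phi, mu, sigma
```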
With the above parameters, the sample energy is E(z) = −log( Σ_{k=1}^{K} φ̂_k · exp(−(1/2)(z − μ̂_k)^T Σ̂_k^{−1} (z − μ̂_k)) / √|2π Σ̂_k| ), where |·| denotes the determinant of a matrix. In round t of training, the input samples of the M participants are evaluated by the estimation network, yielding sample energies E_1, …, E_M. We calculate the average of these sample energies and set the threshold to W times this average, choosing a good value of W according to the experimental results. We predict the high-energy samples that exceed the threshold as Free-rider attackers in training round t. After the federated training is over, each participant is ultimately determined to be a Free-rider if it has been detected as one more than ⌈n/2⌉ times, where ⌈·⌉ denotes the smallest integer not less than its argument and n is the number of training rounds.
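The energy computation and per-round thresholding can be sketched as follows; the threshold multiplier `w` and the covariance regularizer `eps` are illustrative choices, not the paper's tuned values.

```python
import numpy as np

def sample_energy(z, phi, mu, sigma, eps=1e-6):
    """GMM energy E(z) = -log sum_k phi_k N(z | mu_k, sigma_k); high = anomalous."""
    K, D = mu.shape
    total = 0.0
    for k in range(K):
        cov = sigma[k] + eps * np.eye(D)   # regularize for invertibility
        diff = z - mu[k]
        expo = -0.5 * diff @ np.linalg.solve(cov, diff)
        norm = np.sqrt(((2 * np.pi) ** D) * np.linalg.det(cov))
        total += phi[k] * np.exp(expo) / norm
    return -np.log(total + 1e-12)

def flag_freeriders(energies, w=1.5):
    """Flag clients whose energy exceeds w times the round's mean energy."""
    e = np.asarray(energies)
    return e > w * e.mean()
```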
If the energy of the automatically encoded sample features computed by the estimation network is low, the reconstruction error is low, and the original sample can be considered normal high-dimensional data that is easy to restore. For samples obtained from the model parameters provided by a Free-rider attacker, however, the deviation from the original data after reconstruction by the compression network is large, and the computed energy is high. The Delta-DAGMM algorithm is illustrated in Algorithm 1.

5. Method analysis
5.1. Complexity analysis
We analyze the time complexity of the sample treatment, compression network, and estimation network in Delta-DAGMM and compare it with DAGMM. We ignore the communication cost here because it is determined by the federated learning and training framework, and the Delta-DAGMM detection model incurs no additional communication cost.
For the horizontal federated training in this paper, we set the size of the transmitted model to a two-dimensional tensor of J ∗ K and the number of participants to M, so the time complexity of the sample treatment (ST) is O(M∗J∗K). In the compression network (CN), the time complexity of computing the simplified representation (SR) of the samples is O(M∗J∗K), the time complexity of the feature extraction (FE) from the reconstruction error is O(M∗J), and the time complexity of computing the mean of all elements (ME) of each sample is O(M∗J∗K). Assuming that the number of GMM mixture components in the estimation network (EN) is G, the time complexity of the estimation network computation (ENC) is O(M∗G∗J∗K). Table 1 describes the time complexity of Delta-DAGMM and DAGMM.
According to Table 1, the total time complexity of DAGMM and Delta-DAGMM is basically the same, and the largest time cost is concentrated in the estimation network's energy computation. In fact, most of the time in federated training is spent on communication between the server and the participants.
5.2. Convergence analysis
We need to explain the convergence of the federated learning model containing the Free-rider attack to prove the effectiveness and stealthiness of this attack.
Taking Plain Free-rider attack type (iii) as an example, the differences between the global models with and without Plain Free-rider attackers are calculated, as shown in expressions (11)–(15).
Among the terms, one is the minimum value of the local model parameters, which is related to the initial hyperparameter settings, including the number of training rounds, the learning rate, and the number of samples in each mini-batch; one noise term is delta-correlated Gaussian white noise, while another is a time-varying noise; and two further terms represent two different stochastic processes related to the federated global model.
In the absence of Free-rider attackers, the second term of expression (11) is the difference between two different stochastic processes associated with the federated training global model. In the presence of Free-rider attackers, the convergence of the federated training global model depends on the ratio of the number of the Free-rider attackers' samples to the total number of all participants' samples.
6. Experiments and Discussion
To verify the effectiveness of Delta-DAGMM for detecting Free-rider attacks in horizontal federated learning, we designed and implemented experiments and compared it with the existing attack detection method DAGMM.
The experiment simulates the parameter server and all participant nodes in horizontal federated learning on a computer device. The hardware used in the experiment is AMD R74800H 2.9GHz, the memory is 16GB, the graphics card used for local training is NVIDIA GeForce RTX 2060 6GB, and the operating system is Windows 10.
We set up 10 participants, including 1 Free-rider attacker, and conduct attack detection experiments on two different types of input samples, MLP-Federate and CNN-Federate, for five attack strategies. We repeated the experiment 50 times for each strategy to rule out chance results.
6.1. Experimental Datasets
We use two different training models, CNN and MLP, in the horizontal federated learning training process. In each round of training, we take the participants' local model parameter set as the input sample.
Because different training models are used in the training process, we obtain two kinds of high-dimensional samples of different lengths, MLP-Federate and CNN-Federate, so as to better judge the precision of the detection algorithm. Table 2 summarizes the specific information of the Free-rider attack detection datasets. The total number of the two experimental samples is 50,000.
6.2. Experimental Metrics
In this experiment, we adopt precision and F1 score as the metrics. Precision (16) denotes the proportion of samples detected as attackers and actually being attackers among all samples detected as attackers, which reflects whether the detection algorithm can accurately find positive samples and avoid false positives. Recall (17) denotes the proportion of samples detected as and actually being attackers among the actual attacker samples. F1 score (18) is a metric used to measure the accuracy of a binary classification model; it takes into account both the precision and the recall of the model, so it can be regarded as the harmonic mean of precision and recall. In (16), (17), and (18), TP denotes the number of samples that are predicted to be attackers and actually are attackers, FP denotes the number of samples that are predicted to be attackers but actually are not, and FN denotes the number of samples that actually are attackers but have not been detected. P denotes the precision of the detection model, R denotes its recall, and F1 denotes the F1 score: P = TP / (TP + FP) (16), R = TP / (TP + FN) (17), F1 = 2 · P · R / (P + R) (18).
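The three metrics above reduce to a few lines; a minimal sketch computing them from the detection counts:

```python
def detection_metrics(tp, fp, fn):
    """Precision, recall, and F1 score from detection counts, per eqs. (16)-(18)."""
    precision = tp / (tp + fp)                            # (16)
    recall = tp / (tp + fn)                               # (17)
    f1 = 2 * precision * recall / (precision + recall)    # (18) harmonic mean
    return precision, recall, f1
```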
For each experiment, we count the precision and F1 score of the Free-rider attack detection on each round of horizontal federated learning training samples, called the single-time precision and F1 score. After all rounds of the experiment end, we set a threshold to obtain a final attack detection result based on the number of times each participant was inferred to be a Free-rider across all rounds, and the overall detection precision and F1 score are counted.
6.3. Experimental Result
This article conducts experiments on the five attack strategies above. We select the two datasets MLP-Federate and CNN-Federate for the experiments and calculate the single-time precision and F1 score as well as the overall precision and F1 score under the five attack strategies. According to Table 3, the single-time attack detection precisions of the Delta-DAGMM algorithm proposed in this paper exceed 83% for all five attack strategies, and the overall precisions exceed 95%. The single-time precisions for Plain Free-rider attack types (i), (ii), and (iii) are above 90%, and the overall precisions are above 97%. The single-time and overall precisions for Plain Free-rider attack type (iii) are both 100%. According to Table 4, the single-time attack detection F1 scores of the five attack strategies exceed 85%, and the overall F1 scores exceed 96%. The single-time F1 scores for Plain Free-rider attack types (i), (ii), and (iii) are above 95%, and the overall F1 scores are above 95%. The single-time and overall F1 scores for Plain Free-rider attack type (iii) are both 100%.
6.4. Experimental discussion
Table 5 and Table 6 respectively show the precision and F1 score comparisons between DeltaDAGMM and other Freerider attack detection methods. DAGMM(ST) denotes DAGMM with sample treatment.
As shown in Figures 3 and 4, under the five attack strategies, the single-time and overall precisions of DeltaDAGMM are slightly improved compared with those of DAGMM and DAGMM with sample treatment for the detection of Plain Freerider attack types (i) and (ii): the single-time precisions increase by 1.6% and 0.2% respectively, and the overall precisions increase by 0.4% and 0.1% respectively. For the detection of Plain Freerider attack type (iii) and disguised Freerider attack strategies (i) and (ii), the precisions of DeltaDAGMM are significantly higher than those of DAGMM and DAGMM with sample treatment: the single-time precisions increase by 20.3% and 7.7% respectively, and the overall precisions increase by 15.8% and 6.2% respectively.
As shown in Figures 5 and 6, under the five attack strategies, the single-time and overall F1 scores of DeltaDAGMM are slightly improved compared with those of DAGMM and DAGMM with sample treatment for the detection of Plain Freerider attack types (i) and (ii): the single-time F1 scores increase by 1.5% and 0.3% respectively, and the overall F1 scores increase by 0.6% and 0.2% respectively. For the detection of Plain Freerider attack type (iii) and disguised Freerider attack strategies (i) and (ii), the F1 scores of DeltaDAGMM are significantly higher than those of DAGMM and DAGMM with sample treatment: the single-time F1 scores increase by 21.2% and 8.5% respectively, and the overall F1 scores increase by 15.9% and 6.4% respectively.
Since our previous statistics combine the detection results of all training rounds, they cannot show how the detection precision changes as training proceeds. Therefore, we also record how the precision of DeltaDAGMM in detecting the five Freerider attack strategies changes with the training rounds of horizontal federated learning. As shown in Figures 7 and 8, as the number of training rounds increases, the detection precision and F1 score of DeltaDAGMM gradually increase.
Since we set 1 Freerider attacker among 10 participants in all previous experiments, we also try setting more attackers among the participants. As shown in Figures 9 and 10, when 2–3 Freerider attackers are set, the precision of all three attack detection algorithms decreases, but DeltaDAGMM still maintains a precision of more than 75%, which is better than DAGMM and DAGMM with sample treatment.
For larger-scale trials, we use existing distributed computing techniques to simulate the involvement of a larger number of users in training, setting 1000 participants and 10 attackers. As shown in Table 7, the detection precision of DeltaDAGMM remains high.
6.5. Experimental conclusion
Because the DeltaDAGMM proposed in this paper adds sample processing compared with DAGMM, the disguised Freerider attack is in effect transformed into the plain Freerider attack. In addition, DeltaDAGMM adds a feature representation to the compression network, making it easier to reconstruct the low-dimensional samples of fair participants and to find Freerider attackers among the participants. We conducted experiments under different conditions and found that the precision and F1 score of DeltaDAGMM were significantly higher than those of DAGMM, for both single-time detection and overall detection. In the large-scale simulation experiments with more participants, the precision of DeltaDAGMM was also higher.
7. Conclusions
In horizontal federated learning, there may be a Freerider who does not use a local data set to participate in training, but disguises the parameters of the local updated model in order to participate in training and steal the global model. To detect Freerider attackers, we propose DeltaDAGMM, an improved attack detection algorithm based on the DAGMM model. Compared with DAGMM, this algorithm is optimized in sample treatment and feature extraction: an incremental processing method is used to optimize the samples, so that the more critical features in the samples can be extracted. We also set an appropriate threshold to finally detect the attackers.
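The incremental sample processing mentioned above can be sketched as follows: instead of feeding each participant's raw uploaded weights to the detector, feed the per-round update delta (this round's weights minus the previous round's). This is our reading of the paper's sample treatment step, written under assumed names; it is not the authors' exact implementation.

```python
import numpy as np

def to_delta_samples(weight_history):
    """Turn a participant's per-round weight vectors into incremental
    delta samples: delta_t = w_t - w_{t-1}.

    weight_history: list of flattened model weight vectors, one per round.
    Returns a list with one delta vector per consecutive pair of rounds.
    """
    return [np.asarray(w_t, dtype=float) - np.asarray(w_prev, dtype=float)
            for w_prev, w_t in zip(weight_history, weight_history[1:])]
```

Working on deltas is what makes a disguised Freerider resemble a plain one: a participant who merely perturbs the previous global model produces update deltas with a different distribution from those of participants actually training on local data.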
The experimental results show that, compared with DAGMM, DeltaDAGMM achieves higher precision and F1 score. The average precision of single-time detection is 92.1%, 20.3% higher than that of DAGMM, and the average precision of overall detection is 98.4%, 15.8% higher than that of DAGMM. The average F1 score of single-time detection is 93.4%, 21.4% higher than that of DAGMM, and the average F1 score of overall detection is 98.4%, 16.5% higher than that of DAGMM. These results confirm that DeltaDAGMM is a more effective Freerider attack detection algorithm than DAGMM.
However, in our experiments, the model parameters that the parameter server and the participants of horizontal federated learning transmit to each other are in plain text. The challenge for DeltaDAGMM is that future federated training will use methods such as homomorphic encryption [19–23] or differential privacy [24] to protect the model parameters transmitted by users, so the parameters sent by the clients will no longer be plaintext. Next, we will consider how to detect Freerider attacks under ciphertext.
Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
The authors thank Mr. Yu Haining for his guidance on this paper and the National Natural Science Foundation of China (62172123) for its support.