Interface Detector Based on Vaccination Strategy for Anomaly Detection

Liu, Yinghui; Li, Dong; Wei, Yuan; Zhang, Hongli

doi:https://doi.org/10.1155/2020/2106576

Mathematical Problems in Engineering

Research Article | Open Access

Volume 2020 | Article ID 2106576 | https://doi.org/10.1155/2020/2106576

Yinghui Liu,¹ Dong Li,² Yuan Wei,³ and Hongli Zhang³

Academic Editor: Ioannis Kostavelis

Received 17 Sept 2019

Accepted 08 Jul 2020

Published 26 Jul 2020

Abstract

The interface detector is an enhanced negative selection algorithm that learns adaptively online from small training sets for anomaly detection. Its detection performance is good when the self-radius is appropriate; otherwise, overfitting or underfitting occurs. This paper proposes an improved interface detector based on a vaccination strategy. During the testing stage, a negative vaccine counters overfitting and thereby raises the detection rate, while a positive vaccine counters underfitting and thereby lowers the false alarm rate. The experimental results show that, for the same dataset, self-radius, and training samples, the detection rate of the interface detector with negative vaccine is much higher than that of the plain interface detector, SVM, and BP neural network. Moreover, the false alarm rate of the interface detector with positive vaccine is much lower than that of the plain interface detector and PSA.

1. Introduction

The negative selection algorithm (NSA), proposed by Forrest et al. in 1994 [1], is an important algorithm in artificial immune systems. It is inspired by the maturation of T-cells in the thymus and has attracted widespread interest in anomaly detection and fault diagnosis [2–7].

The original NSA represents self-samples and nonself-samples as binary strings [3], which makes its mechanism easy to understand. A real-valued NSA was presented soon after [8], since many application problems are naturally described in real-valued space. It initially used constant-size hyperspheres as detectors. Other detector types were proposed later, such as the variable-sized detector [9], hypercube detector [10], hyperellipsoid detector [11], and multishaped detector [12].

To improve the detection rate and reduce the number of detectors, several improved NSA variants were proposed. Boundary detectors [13] are allowed to cover part of the self-space, which lets them eliminate the holes on the boundary and gives them a chance to detect deceiving anomalies hidden in the self-space. The further training negative selection algorithm (FtNSA) [14] generates V-detectors in the self-space and the nonself-space, respectively, and can classify testing samples that lie within the holes. The self-adaptive negative selection algorithm (ANSA) [15] builds an appropriate profile of the system from a subset of self-samples and adaptively adjusts the self-radius, the detection radius, and the number of detectors to amend that profile. It can therefore adapt to variations in the self-/nonself-space.

Although the methods mentioned above can improve the detection rate or reduce the number of detectors, little attention has been paid to detectors with online adaptive learning. The interface detector [16–18] is based on the outer-layer samples of the self-space, which is one or more closed hyperspheres (shown in Figure 1). It can be built from a small number of training samples, and sometimes one sample is enough. It can adapt itself to real-time changes in the self-space during the testing stage. With an appropriate self-radius, it completely surrounds the self-space, so that self-samples lie inside it and nonself-samples lie outside it.

Figure 1: panels (a)–(d).

The learning ability of the interface detector depends on the self-radius r_s. When r_s is relatively large, the interface detector can classify a nonself-sample as a boundary sample; overfitting then occurs, and the detection rate decreases. When r_s is relatively small, the interface detector cannot surround all of the self-space; underfitting then occurs, and the false alarm rate increases.

The purpose of the present work is to further improve the detection performance of the interface detector by introducing a vaccination strategy. In the immune system, vaccination generates a strong immune response that provides long-term protection against infection [19–21]. By analogy, samples whose classes are known can be used as vaccines to improve the learning ability of the interface detector.

2. Overfitting of the Interface Detector and Negative Vaccine

When the minimum distance d_o between self-samples and nonself-samples is smaller than r_s, overfitting occurs.
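The condition above can be checked directly when labeled samples are available. The following is a minimal sketch, assuming Euclidean distance; the function names and the coordinates are hypothetical and are not taken from the article.

```python
import math

def min_cross_distance(self_samples, nonself_samples):
    """Smallest Euclidean distance d_o between any self/nonself pair."""
    return min(math.dist(s, n) for s in self_samples for n in nonself_samples)

def overfitting_risk(self_samples, nonself_samples, r_s):
    """True when d_o < r_s, the overfitting condition stated in the text."""
    return min_cross_distance(self_samples, nonself_samples) < r_s

# Hypothetical 2-D coordinates (illustration only, not the paper's data).
self_pts = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
nonself_pts = [(2.8, 0.0), (3.6, 0.0)]

d_o = min_cross_distance(self_pts, nonself_pts)
print(d_o, overfitting_risk(self_pts, nonself_pts, r_s=1.0))
```

With these made-up points, d_o is about 0.8, so a self-radius of 1.0 satisfies the overfitting condition while a self-radius of 0.5 does not.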

Figure 2 shows the overfitting process of the interface detector on a 2-dimensional synthetic dataset. There are 3 self-samples and 2 nonself-samples (t₁, t₂, t₃ ∈ S; t₄, t₅ ∈ N). Select t₂ as the training sample and the others as testing samples; the testing sequence is t₃, t₄, t₁, and t₅ (shown in Figure 2(a)). The interface detector built by t₂ (shown in Figure 2(b)) recognizes t₃ ∈ S and t₃ ∈ B (right). The new interface detector built by t₂ and t₃ (shown in Figure 2(c)) recognizes t₄ ∈ S and t₄ ∈ B (wrong), because d₃₄ < r_s, where d₃₄ is the distance between t₃ and t₄.

Figure 2: panels (a)–(f).

The new interface detector built by t₂, t₃, and t₄ (shown in Figure 2(d)) recognizes t₁ ∈ S and t₁ ∈ B (right). When the new interface detector built by t₂, t₃, t₄, and t₁ (shown in Figure 2(e)) recognizes t₅ ∈ S and t₅ ∈ B (wrong), overfitting occurs. Because nonself-sample t₄ is wrongly recognized as a boundary sample, the new interface detector built by boundary samples which include t₄ can wrongly recognize other nonself-samples such as t₅. When t₅ is wrongly recognized as a boundary sample, the interface detector built by these boundary samples (shown in Figure 2(f)) can enhance overfitting, leading to the rapid decrease in the detection rate.
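The chain reaction described above can be imitated with a much simpler stand-in for the interface detector. The sketch below assumes that a testing sample is accepted as a self/boundary sample whenever it lies within r_s of its nearest known boundary sample, and that accepted samples join the boundary set. This rule, the function name `online_detect`, and the coordinates are illustrative assumptions, not the algorithm of [16].

```python
import math

def online_detect(train, tests, r_s):
    """Simplified online learner (assumption, not the method of [16]):
    accept a sample if it is within r_s of the nearest boundary sample,
    then add it to the boundary set so later decisions depend on it."""
    boundary = list(train)
    labels = []
    for t in tests:
        nearest = min(math.dist(t, b) for b in boundary)
        if nearest < r_s:
            labels.append("self")
            boundary.append(t)
        else:
            labels.append("nonself")
    return labels

# Hypothetical coordinates: t1, t2, t3 are self; t4, t5 are nonself.
t1, t2, t3 = (-1.0, 0.0), (0.0, 0.0), (1.0, 0.0)
t4, t5 = (1.8, 0.0), (2.6, 0.0)

# d34 = 0.8 < r_s = 1.2, so t4 is absorbed and then pulls t5 in as well.
with_t4 = online_detect([t2], [t3, t4, t1, t5], r_s=1.2)
# If t4 never appears, t5 is too far from every true self-sample.
without_t4 = online_detect([t2], [t3, t1, t5], r_s=1.2)
print(with_t4, without_t4)
```

In this toy setting t₅ is misclassified only because t₄ was accepted first, which mirrors how one wrongly recognized boundary sample propagates the error.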

Taking a small r_s (d₃₄ > r_s) is a way to avoid overfitting of the interface detector, but once r_s is relatively small, other new problems will appear [16]. Negative vaccine can balance this problem without modifying r_s.

Whether a testing sample t is recognized as a boundary sample is determined by the position information of the boundary sample nearest to t, not by the other boundary samples [16]. Negative vaccines are nonself-samples that revise the position information of boundary samples, including those newly recognized during the training or testing stage. Figure 3 shows the process by which a negative vaccine adjusts the interface detector on the 2-dimensional synthetic dataset.

Figure 3: panels (a)–(d).

The interface detector built by t₂ and t₃ wrongly recognizes t₄ as a boundary sample because of the position information of t₃. If t₄ is used as a negative vaccine (shown in Figure 3(a)), it revises the position information of t₃, and the interface detector built by t₂ and t₃ adjusts itself to the form shown in Figure 3(b). The new interface detector recognizes t₁ ∈ S and t₁ ∈ B (shown in Figure 3(c)), and t₁ adjusts the interface detector as shown in Figure 3(d). The resulting interface detector recognizes t₅ ∈ N, so overfitting does not occur.
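One way to picture how a negative vaccine "revises the position information" of a boundary sample is to shrink that sample's effective radius. The sketch below makes that assumption explicit: each boundary sample uses the radius min(r_s, distance to the nearest negative vaccine). This capping rule, the helper names, the coordinates, and the extra nonself point t6 are all hypothetical and are not the article's algorithm.

```python
import math

def capped_radius(b, r_s, vaccines):
    """Effective radius of boundary sample b (assumed rule): never reach
    as far as the nearest negative vaccine."""
    if not vaccines:
        return r_s
    return min(r_s, min(math.dist(b, v) for v in vaccines))

def detect_with_negative_vaccine(train, tests, r_s, vaccines=()):
    """Accept a sample only if it falls inside the capped radius of its
    nearest boundary sample; accepted samples join the boundary set."""
    boundary = list(train)
    labels = []
    for t in tests:
        nearest = min(boundary, key=lambda b: math.dist(t, b))
        if math.dist(t, nearest) < capped_radius(nearest, r_s, vaccines):
            labels.append("self")
            boundary.append(t)
        else:
            labels.append("nonself")
    return labels

# Hypothetical coordinates: t1..t3 self, t4 (the vaccine), t5, t6 nonself.
t1, t2, t3 = (-1.0, 0.0), (0.0, 0.0), (1.0, 0.0)
t4, t5, t6 = (1.8, 0.0), (2.6, 0.0), (1.9, 0.3)

plain = detect_with_negative_vaccine([t2], [t3, t1, t6, t5], r_s=1.2)
vaccinated = detect_with_negative_vaccine(
    [t2], [t3, t1, t6, t5], r_s=1.2, vaccines=[t4]
)
print(plain, vaccinated)
```

Under these assumptions the vaccine leaves r_s itself unchanged and only tightens the boundary sample next to it (t₃ here), which is the behavior the text attributes to the negative vaccine.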

3. Underfitting of the Interface Detector and Positive Vaccine

When r_s is relatively small, the interface detector cannot recognize other new boundary samples to adjust itself. As a result, the interface detector cannot surround all the self-spaces and underfitting occurs.

Figure 4 shows the underfitting process of the interface detector on the 2-dimensional synthetic dataset. There are 5 self-samples. Select t₂ as the training sample and others as testing samples; testing sequence is t₃, t₄, t₁, and t₅ (shown in Figure 4(a)). The interface detector built by t₂ (shown in Figure 4(b)) recognizes t₃ ∈ S and t₃ ∈ B (right). The new interface detector built by t₂ and t₃ (shown in Figure 4(c)) recognizes t₄ ∈ N (wrong), for d₃₄ > r_s, where d₃₄ is the distance between t₃ and t₄. It recognizes t₁ ∈ S and t₁ ∈ B (right) (shown in Figure 4(d)). The interface detector is adjusted by t₁ to be what is shown in Figure 4(e). It recognizes t₅ ∈ N (wrong), and underfitting occurs (shown in Figure 4(f)). Because r_s is relatively small, the interface detector cannot completely surround self-space.

Figure 4: panels (a)–(f).

Taking a large r_s (d₃₄ < r_s) is a way to avoid underfitting of the interface detector, but once r_s is relatively large, other new problems will appear [16]. Positive vaccine can balance this problem without modifying r_s.

Positive vaccines are new boundary samples that adjust the interface detector so that it surrounds more of the self-space. Figure 5 shows the process by which a positive vaccine adjusts the interface detector on the 2-dimensional synthetic dataset. Figure 4(f) shows that the interface detector built by t₁, t₂, and t₃ wrongly recognizes t₄ ∈ N and t₅ ∈ N. Take t₅ as a positive vaccine (shown in Figure 5(a)); the interface detector is then adjusted by t₅ to the form shown in Figure 5(b). It now recognizes t₄ ∈ S and t₄ ∈ B (right). Finally, the interface detector completely surrounds the self-space (shown in Figure 5(c)).
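The effect of a positive vaccine can be sketched with a simplified nearest-boundary rule: a testing sample is accepted when it lies within r_s of a known boundary sample, and a positive vaccine is simply an extra known self-sample supplied alongside the training sample. The rule, the name `run_detector`, and the coordinates are assumptions for illustration, not the algorithm of the article.

```python
import math

def run_detector(seeds, tests, r_s):
    """Simplified online learner (assumed rule): seeds are the training
    sample plus any positive vaccines; accepted samples extend the set."""
    boundary = list(seeds)
    labels = []
    for t in tests:
        if min(math.dist(t, b) for b in boundary) < r_s:
            labels.append("self")
            boundary.append(t)
        else:
            labels.append("nonself")
    return labels

# Hypothetical coordinates: all five samples are self, in two groups.
t1, t2, t3 = (-0.5, 0.0), (0.0, 0.0), (0.5, 0.0)
t4, t5 = (1.5, 0.0), (2.0, 0.0)
r_s = 0.6  # smaller than d34 = 1.0, so the gap cannot be bridged

# Without a vaccine, t4 and t5 are rejected (false alarms).
no_vaccine = run_detector([t2], [t3, t4, t1, t5], r_s)
# With t5 supplied as a positive vaccine, t4 is reached from t5.
with_vaccine = run_detector([t2, t5], [t3, t4, t1], r_s)
print(no_vaccine, with_vaccine)
```

The self-radius is the same in both runs; only the set of known self-samples changes.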

Figure 5: panels (a)–(c).

4. Experiment and Results

The interface detector based on vaccination strategy is used to overcome these problems, because it can adapt itself to real-time changes in the self-space by continually learning from the testing samples during the testing stage.

To evaluate the performance and possible advantages of the proposed approach, we performed experiments on 2-dimensional synthetic datasets (shown in Figures 6(a) and 7(a)). The algorithm of the interface detector based on vaccination strategy is shown in Figure 8.

Figure 6: panels (a)–(c).

Figure 7: panels (a)–(d).

4.1. Interface Detector with Negative Vaccine

To determine the advantages of the interface detector with negative vaccine, it is compared with the interface detector, support vector machine (SVM), and BP neural network on a 2-dimensional synthetic dataset (shown in Figure 6(a)) that contains 81 self-samples and 81 nonself-samples.

4.1.1. Results of the Interface Detector

Take , where , and d_ij is the distance between s_i and s_j.

Randomly select one self-sample as the training sample and use the others as testing samples. The interface detector adapts itself during the testing stage and finally takes the form shown in Figure 6(b). The detection rate is 0%, and the false alarm rate is 0%.

Because the minimum distance between self-samples and nonself-samples is shorter than r_s, the interface detector wrongly recognizes a nonself-sample as a boundary sample, which leads to overfitting. In the end, the interface detector surrounds not only all of the self-space but also all of the nonself-space.
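The two reported figures can be reproduced from label counts. The article does not spell out its formulas, so the sketch below assumes the usual definitions: the detection rate is the fraction of nonself testing samples labeled nonself, and the false alarm rate is the fraction of self testing samples labeled nonself. The function name `rates` is hypothetical.

```python
def rates(true_labels, predicted_labels):
    """Return (detection_rate, false_alarm_rate) under the assumed
    definitions: DR over nonself samples, FAR over self samples."""
    pairs = list(zip(true_labels, predicted_labels))
    nonself_pred = [p for t, p in pairs if t == "nonself"]
    self_pred = [p for t, p in pairs if t == "self"]
    detection_rate = nonself_pred.count("nonself") / len(nonself_pred)
    false_alarm_rate = self_pred.count("nonself") / len(self_pred)
    return detection_rate, false_alarm_rate

# 81 self-samples minus the one training sample, plus 81 nonself-samples.
truth = ["self"] * 80 + ["nonself"] * 81
# Overfitted detector: everything ends up inside the detector.
overfit = ["self"] * 161

print(rates(truth, overfit))  # (0.0, 0.0)
print(rates(truth, truth))    # (1.0, 0.0)
```

A detector that surrounds every sample never raises an alarm, so both rates are zero, which is why a 0% false alarm rate alone says little about detection quality.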

4.1.2. Results of the Interface Detector with Negative Vaccine

Negative vaccine can be used to overcome overfitting of the interface detector and improve the detection rate. For this problem, the nonself-sample nearest to the self-samples is selected as the negative vaccine (shown in Figure 6(c)).

Randomly select one self-sample as the training sample and use the others, except the negative vaccine, as testing samples. The interface detector with negative vaccine raises the detection rate to 100%, while the false alarm rate remains 0%. The final interface detector is shown in Figure 6(c).

Compared with the results of SVM and BP neural network shown in Table 1, the interface detector with negative vaccine has better detection performance.

For SVM and BP neural network, one randomly selected self-sample and the nonself-sample used as the negative vaccine serve as training samples, and the others serve as testing samples. The results are averaged over 81 repeated experiments, since each self-sample takes a turn as the training sample.

4.2. Interface Detector with Positive Vaccine

To determine the advantages of the interface detector with positive vaccine, it is compared with the interface detector and the positive selection algorithm (PSA) on a 2-dimensional synthetic dataset (shown in Figure 7(a)) that contains 136 self-samples and 213 nonself-samples. The self-samples form two groups, S₁ and S₂, with 68 samples each.

4.2.1. Results of the Interface Detector

Take , where and d_ij is the distance between s_i and s_j.

Randomly select one self-sample as the training sample and use the others as testing samples. The interface detector adapts itself during the testing stage and finally takes the form shown in Figures 7(b) and 7(c). The detection rate is 100%, and the false alarm rate is 50%.

Because the minimum distance between self-samples and nonself-samples is larger than r_s, the interface detector cannot surround any nonself-space. Because the minimum distance between S₁ and S₂ is also larger than r_s, the interface detector cannot recognize any new boundary samples in the other group and so cannot adapt itself to surround all of the self-space. In the end, the interface detector surrounds only half of the self-space.

4.2.2. Results of the Interface Detector with Positive Vaccine

Positive vaccine can be used to overcome underfitting and reduce the false alarm rate of the interface detector.

Randomly select one sample in S₁ and one sample in S₂ as the training sample and the positive vaccine, and use the others as testing samples. The interface detector with positive vaccine reduces the false alarm rate to 0%, while the detection rate remains 100%. The final interface detector is shown in Figure 7(d).

Compared with the results of the positive selection algorithm (PSA) shown in Table 2, the interface detector with positive vaccine has better detection performance.

In PSA, the radius of the detectors is the same as r_s. Randomly select one sample in S₁ and one in S₂ as training samples and use the others as testing samples. The results are averaged over 4624 repeated experiments, since every self-sample takes a turn as training data.
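The number of repetitions is consistent with pairing every sample of S₁ with every sample of S₂, since 68 × 68 = 4624. The short sketch below enumerates such pairs; the index ranges stand in for the actual samples, and this exhaustive-pairing reading is an assumption rather than a procedure stated in the article.

```python
from itertools import product

# Placeholder indices for the 68 samples in each self group.
s1_indices = range(68)
s2_indices = range(68)

# One experiment per (S1 sample, S2 sample) pair.
experiments = list(product(s1_indices, s2_indices))
print(len(experiments))  # 4624
```

The same reasoning gives the 81 repetitions of the negative-vaccine experiment, where a single training sample is drawn from 81 self-samples.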

The interface detector based on vaccination strategy can overcome the drawbacks of the interface detector during the testing stage.

Underfitting of the interface detector is overcome by positive vaccine, which reduces the false alarm rate. Positive vaccines are self-samples and are easy to obtain, so this method gives better anomaly detection performance whether the experiment is conducted on synthetic datasets or standard datasets.

Overfitting of the interface detector is overcome by negative vaccine, which improves the detection rate. Negative vaccines are nonself-samples and are difficult to obtain. How to obtain negative vaccines efficiently is left for future work.

5. Conclusions

In this work, a modified interface detector is developed by introducing a vaccination strategy. The interface detector based on vaccination strategy can overcome the drawbacks of the interface detector during the testing stage. Overfitting is overcome by negative vaccine, which improves the detection rate. Underfitting is overcome by positive vaccine, which reduces the false alarm rate. The experimental results demonstrate that the proposed method is effective in anomaly detection. For the same dataset, self-radius, and training samples, the detection rate of the interface detector with negative vaccine is much higher than that of the interface detector, SVM, and BP neural network. In addition, the false alarm rate of the interface detector with positive vaccine is much lower than that of the interface detector and PSA.

The interface detector based on vaccination strategy can adapt itself to real-time changes in the self-space by continually learning from the testing samples during the testing stage. This paper does not consider computational complexity. In future work, we plan to run experiments on actual fault data and to take computational complexity into account.

Nomenclature

r_s:	Self-radius
t₁, t₂, and t₃:	Self-samples
t₄ and t₅:	Nonself-samples
d₃₄:	The distance between t₃ and t₄
t:	A testing sample
s_i:	A single self-sample
d_ij:	The distance between s_i and s_j
S:	The set of self-samples
B:	The set of boundary samples
N:	The set of nonself-samples
P:	The set of sample positions.

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This research was supported by the National Natural Science Foundation of China (Grant nos. 11802168, 61603238, and 51575331) and the project funded by China Postdoctoral Science Foundation (No. 2019M661458).

References

[1] S. Forrest, A. S. Perelson, and L. Allen, “Self-nonself discrimination in a computer,” in Proceeding of the IEEE Symposium on Research in Security and Privacy, pp. 202–212, Oakland, CA, USA, May 1994.
[2] D. Dasgupta, S. Yu, and F. Nino, “Recent advances in artificial immune systems: models and applications,” Applied Soft Computing, vol. 11, no. 2, pp. 1574–1587, 2011.
[3] F. Esponda, S. Forrest, and P. Helman, “A formal framework for positive and negative detection schemes,” IEEE Transactions on Systems, Man and Cybernetics, Part B (Cybernetics), vol. 34, no. 1, pp. 357–373, 2004.
[4] A. Abid, M. T. Khan, and C. W. de Silva, “Layered and real-valued negative selection algorithm for fault detection,” IEEE Systems Journal, vol. 12, no. 3, pp. 2960–2969, 2018.
[5] Y. Wei and S. Liu, “Numerical analysis of the dynamic behavior of a rotor-bearing-brush seal system with bristle interference,” Journal of Mechanical Science and Technology, vol. 33, no. 8, pp. 3895–3903, 2019.
[6] D. Li, S. Liu, F. Gao, and X. Sun, “Continual learning classification method with new labeled data based on the artificial immune system,” Applied Soft Computing, vol. 94, Article ID 106423, 2020.
[7] R. Chikh and S. Chikhi, “Clustered negative selection algorithm and fruit fly optimization for email spam detection,” Journal of Ambient Intelligence and Humanized Computing, vol. 10, no. 1, pp. 143–152, 2019.
[8] F. A. Gonzalez and D. Dasgupta, “An immunogenetic technique to detect anomalies in network traffic,” in Proceeding of the international Conference Genetic and Evolutionary Computation, pp. 1081–1088, New York, NY, USA, July 2002.
[9] J. Zhou and D. Dasgupta, “V-detector: An efficient negative selection algorithm with ‘probably adequate’ detector coverage,” Information Sciences, vol. 179, no. 10, pp. 1390–1406, 2009.
[10] D. Dasgupta and F. González, “An immunity-based technique to characterize intrusions in computer networks,” IEEE Transactions on Evolutionary Computation, vol. 6, no. 3, pp. 281–291, 2002.
[11] J. M. Shapiro, G. B. Lamont, and G. L. Peterson, “An evolutionary algorithm to generate ellipsoid network intrusion detectors,” in Proceeding of the 2005 Workshops on Genetic and Evolutionary Computation, pp. 178–180, Washington, DC, USA, June 2005.
[12] S. Balachandran, D. Dasgupta, and F. Nino, “A framework for evolving multi-shaped detectors in negative selection,” in Proceeding of the IEEE Symposium on Computational intelligence, pp. 401–408, Hawaii, HI, USA, April 2007.
[13] D. Wang, F. Zhang, and L. Xi, “Evolving boundary detector for anomaly detection,” Expert Systems with Applications, vol. 38, no. 3, pp. 2412–2420, 2011.
[14] M. Gong, J. Zhang, J. Ma, and L. Jiao, “An efficient negative selection algorithm with further training for anomaly detection,” Knowledge-Based Systems, vol. 30, pp. 185–191, 2012.
[15] J. Zeng, X. Liu, T. Li, C. Liu, L. Peng, and F. Sun, “A self-adaptive negative selection algorithm used for anomaly detection,” Progress in Natural Science, vol. 19, no. 2, pp. 261–266, 2009.
[16] D. Li, S. Liu, and H. Zhang, “A negative selection algorithm with online adaptive learning under small samples for anomaly detection,” Neurocomputing, vol. 149, pp. 515–525, 2015.
[17] D. Li, S. Liu, and H. Zhang, “Negative selection algorithm with constant detectors for anomaly detection,” Applied Soft Computing, vol. 36, pp. 618–632, 2015.
[18] D. Li, S. Liu, and H. Zhang, “A method of anomaly detection and fault diagnosis with online adaptive learning under small training samples,” Pattern Recognition, vol. 64, pp. 374–385, 2017.
[19] J. C. Aguilar and E. G. Rodríguez, “Vaccine adjuvants revisited,” Vaccine, vol. 25, no. 19, pp. 3752–3762, 2007.
[20] A. D. Rocha, P. Lima-Monteiro, M. Parreira-Rocha, and J. Barata, “Artificial immune systems based multi-agent architecture to perform distributed diagnosis,” Journal of Intelligent Manufacturing, vol. 30, no. 4, pp. 2025–2037, 2019.
[21] A. Borghesi, A. Bartolini, M. Lombardi, M. Milano, and L. Benini, “A semisupervised autoencoder-based approach for anomaly detection in high performance computing systems,” Engineering Applications of Artificial Intelligence, vol. 85, pp. 634–644, 2019.

Copyright

Copyright © 2020 Yinghui Liu et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
