Abstract
Emitter identification is widely recognized as a crucial issue for communication, electronic reconnaissance, and radar intelligence analysis. However, the measurements of emitter signal parameters typically take the form of uncertain intervals rather than precise values. In addition, the measurements are generally accumulated dynamically and continuously. As a result, a pressing task is to carry out discriminant analysis of interval-valued parameters incrementally for emitter identification. Existing machine learning approaches for interval-valued data analysis are unfit for this purpose, as they generally assume a uniform distribution and are usually restricted to static data analysis. To address these problems, we propose an incremental discriminant analysis method on interval-valued parameters (IDAIP) for emitter identification. Extensive experiments on both synthetic and real-life data sets have validated the efficiency and effectiveness of our method.
1. Introduction
It is widely recognized that emitter identification is indispensable for communication, electronic reconnaissance, and radar intelligence analysis. Class discriminant analysis of emitter signal parameters plays a central role in emitter identification. For instance, emitter types can be inferred from the discriminating signal parameters. The emitter working modes, detection range, angle resolution, and Doppler measurement of the target can also be estimated from the collected measurements of discriminative signal parameters. Discriminant analysis of emitter signal parameters is therefore crucial for both civil and military applications.
However, the measurements of emitter signal parameters are typically characterised by uncertainty and continuous growth. These two problems pose great challenges for class discriminant analysis of emitter signal parameters.
Firstly, the parameter measurement typically takes the form of uncertain intervals. Such uncertainty can result from the unstable working state of the transmitter circuit, environmental noise, or other unknown interference sources. Secondly, the interval-valued signal parameter measurements are accumulated dynamically and continuously. In practical applications, the amount of received measurements from various kinds of emitters can be explosive. By a conservative estimate, when the channel width is 1 GHz, the sampling rate is 2.5 GHz, and each sample is allocated two bytes of storage, the amount of emitter signal parameter measurements received per hour can reach 18 TB, and the volume per day can approach 432 TB.
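The estimate above can be checked with a few lines of arithmetic. The sketch below assumes decimal units (1 TB = 10^12 bytes) and reads the volumes in the text as terabytes; the variable names are illustrative only.

```python
# Back-of-the-envelope check of the data-volume estimate.
SAMPLING_RATE_HZ = 2.5e9   # 2.5 GHz sampling rate, as stated in the text
BYTES_PER_SAMPLE = 2       # two bytes of storage per sample
TB = 1e12                  # assumption: decimal terabytes

bytes_per_second = SAMPLING_RATE_HZ * BYTES_PER_SAMPLE
tb_per_hour = bytes_per_second * 3600 / TB
tb_per_day = tb_per_hour * 24

print(f"{tb_per_hour:.0f} TB per hour, {tb_per_day:.0f} TB per day")
```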
Unfortunately, few machine learning methods are fit for incremental discriminant analysis of interval-valued emitter signal parameters, as they either deal with precise-valued data only or process interval-valued data under an assumption of uniform distribution. The uniform assumption does not hold for emitter signal parameters, whose distribution is usually assumed to be approximately normal instead. Therefore, the pressing challenge is how to perform discriminant analysis incrementally on interval-valued signal parameters that follow a normal distribution.
An example of an interval-valued emitter data set is illustrated in Figure 1. In data set , each observation has one interval-valued parameter measurement, and each interval-valued measurement follows a certain normal distribution determined by the emitter type it belongs to. As can be seen, the four observations , , , and are collected at time point while observations and are collected at time point . New observations are gathered continuously at successive time points.
Figure 1: (a) Interval normal distribution; (b) continuous accumulation.
Motivated by the above problems, we propose an incremental discriminant analysis method on interval-valued parameters (IDAIP) for emitter identification. The emitter signal parameters include radio frequency (RF), pulse repetition interval (PRI), pulse amplitude (PA), pulse width (PW), and so on. Our IDAIP method is not only robust to the uncertainty of interval-valued parameter measurements but also able to carry out the emitter parameter analysis incrementally for emitter identification. To the best of our knowledge, little effort has yet been made in incremental interval-valued emitter parameter analysis. Experimental results validate the efficiency and effectiveness of our IDAIP method for potential applications in communication, electronic reconnaissance, and radar intelligence analysis.
The rest of the paper is organized as follows. We briefly review related work in interval-valued data analysis in Section 2. Our IDAIP method is formally proposed in Section 3. In Section 4, we present the experimental results. We conclude in Section 5.
2. Related Work
A large number of machine learning methods have been put forward to address the uncertainty of interval-valued data. For example, symbolic data analysis [1, 2] has been proposed to extend the classical data models to take interval-valued information into account. Representative interval-valued data analysis approaches include point value replacement [3–5], p-box [6–11], and Hausdorff distance methods [12, 13].
The point value replacement approaches replace the interval values by precise values, such as the midpoints or ranges of the intervals [3–5]. In this way, they transform the interval-valued data into classical point-valued data. These approaches fit a linear regression model to the midpoints, lower bounds, or upper bounds of the intervals and then apply the model to independent symbolic intervals. The model optimization principle is generally to minimize the midpoint error, the lower bound error, the upper bound error, or a combination of these errors.
Alternatively, the p-box approaches describe the uncertainty over an interval-valued variable by a pair of lower and upper cumulative probability distributions [6]. P-boxes are among the simplest and most popular models; they directly extend the cumulative distributions used in the precise case and can be derived from small samples [7] and expert opinions. Owing to their simplicity, p-box methods have been widely used in many applications, such as estimation of future climate change [8], engineering design [9], soil screening level estimation [10], and reliability analysis [11]. However, p-box methods require that some characteristic values be known in advance, such as the mode, mean, and other fractiles of the distributions, while these values are unknown in the case of normally distributed interval-valued emitter parameter measurements. As a result, the p-box approaches are unfit for class discriminant analysis.
The Hausdorff distance is regarded as a natural way to measure the dissimilarity between interval-valued data [12, 13]. Other distance measures for interval-valued data have been applied as well, such as the Euclidean distance [14], the taxicab distance [15], the Mahalanobis distance, and the Wasserstein distance [16]. However, none of these distance metrics is suited to capturing the fine structure of normal class distributions.
Existing interval-valued data analysis approaches typically assume that observations are independent of each other and that the variable values are uniformly distributed within the interval. This uniform-distribution hypothesis does not hold for emitter signal parameters, which follow an approximately normal distribution. In addition, existing methods are restricted to static data and have no incremental learning ability. Although a fuzzy-set-based incremental learning algorithm on interval variables [17] has been explored, it assumes that detailed prior knowledge about the fuzzy set definition is available, which is not the case in the practical application of emitter identification.
3. Method
In this section, we formally present our IDAIP method.
Suppose the interval-valued emitter data set is composed of a set of continuously accumulating observations. The current number of observations, signal parameters, and emitter types are denoted as , , and , respectively. Assume there are number of observations in each emitter type , . We also assume that each observation , , consists of number of interval-valued parameter measurements, , an associated emitter type , and a time stamp indicating the collection time. We denote the set of observations within the same emitter type as .
Each interval-valued measurement consists of a lower bound and an upper bound . The lower bound and upper bound correspond to the minimum and maximum measurement, respectively, among independent measurements of parameter from observation . Each of the independent measurements of signal parameter from observation , , is assumed to follow the same interval normal distribution . We also assume that mean values from observations in the same class , , where and , follow the same class normal distribution and that . Consider
Given the current interval-valued emitter data set , our IDAIP method consists of four major steps:
(1) Interval Distribution Estimation: estimate the underlying normal distribution for each individual signal parameter of each observation according to the interval-valued measurement .
(2) Class Distribution Inference: infer the underlying class normal distribution of each signal parameter for each emitter type , based on the individual parameter distributions estimated in step (1).
(3) Class Discriminant Analysis: evaluate the class discriminating power of each signal parameter between each emitter type pair .
(4) Incremental Learning: update the class discriminant parameter analysis incrementally as new observations arrive.
3.1. Interval Distribution Estimation
As discussed above, traditional symbolic interval data analysis typically assumes that the measurements in an interval are uniformly distributed. However, the true measurement distributions of emitter signal parameters are assumed to follow a normal distribution instead. The difference between the traditional uniform assumption and our normal distribution assumption is illustrated in Figure 2.
Given an interval-valued measurement of signal parameter from observation , we assume the corresponding parameter measurements follow a certain normal distribution, , where and correspond to the minimum and maximum value within . Then we can estimate the interval distribution by Lemma 1.
Lemma 1. Assume an interval-valued measurement of signal parameter from observation satisfies that , , and that and correspond to the minimum and maximum value within ; then we can infer that
Proof. (1) could be inferred according to the minimum square error (MSE) criterion.
(2) Under the above normal distribution assumption, the lower bound measurement in interval corresponds to the smallest order statistic while the upper bound measurement corresponds to the largest order statistic. The order statistics of standard normal random variables have been approximated [18]. One approximation for the th highest order statistic out of is given as
where and and it is recommended that . Therefore, given the interval-valued measurement of signal parameter from observation , we have
where and are the mean value and standard deviation of the normal distribution for signal parameter from observation . As a result, and . Therefore, the conclusion holds.
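As a rough illustration of the idea behind Lemma 1, the sketch below estimates a mean and standard deviation from one interval. It is not the paper's exact formula: it assumes Blom's well-known approximation for the expected normal order statistics, with an assumed constant `alpha = 0.375`, and takes the interval bounds to be the minimum and maximum of `q` independent measurements. The function names are hypothetical.

```python
from statistics import NormalDist


def expected_max_z(q, alpha=0.375):
    """Approximate expected maximum of q standard normal samples.

    Assumption: Blom's approximation, E[Z_(r:q)] ~ Phi^-1((r - alpha) / (q - 2*alpha + 1)),
    evaluated at r = q (the largest order statistic).
    """
    return NormalDist().inv_cdf((q - alpha) / (q - 2 * alpha + 1))


def estimate_interval_distribution(lower, upper, q, alpha=0.375):
    """Estimate (mean, std) of the normal distribution behind one interval.

    By symmetry the smallest and largest order statistics sit at equal
    distances from the mean, so the midpoint estimates the mean and the
    half-width, divided by the expected maximum z-score, estimates the std.
    """
    if q < 2:
        raise ValueError("need at least two measurements to form an interval")
    if upper < lower:
        raise ValueError("upper bound must not be below lower bound")
    mean = (lower + upper) / 2
    std = (upper - lower) / (2 * expected_max_z(q, alpha))
    return mean, std


mean, std = estimate_interval_distribution(4.0, 6.0, q=5)
print(mean, round(std, 3))
```

A wider interval or a smaller `q` both lead to a larger estimated standard deviation, which matches the intuition that few samples rarely reach far into the tails.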
3.2. Class Distribution Inference
We assume that, in the ideal case, the standard deviations of interval distributions and the associated class distribution are the same; under the condition that . We also assume that the interval means follow the underlying normal class distribution, , and that the interval-valued measurements in each emitter type are independent of each other. Then, based on the estimated mean values and variances for signal parameters in emitter type , we can further infer the normal class distribution for each signal parameter in emitter type , , as shown in (5) and proved by Lemma 2.
Lemma 2. If the interval-valued measurements of signal parameter from emitter type , and from observations and of emitter type satisfy that (1) , , and (2) , then the class normal distribution for signal parameter in emitter type , , could be inferred as below:
Proof. The conclusion could be inferred according to the minimum square error (MSE) criterion.
3.3. Class Discriminant Analysis
The discriminating power of signal parameter for each emitter type pair could be evaluated by the probability that a parameter measurement from one emitter type is misclassified into another emitter type according to the inferred class distribution and .
We define and . Based on that, we further define the function , , , , as indicated in
We denote the intersection set as the set of intersection points between the class distribution curves and for emitter types and , respectively, on signal parameter , or . Then, the mutual classification error , the probability that observations of emitter type are misclassified into emitter type according to signal parameter , can be adapted from [19] and divided into the following three cases:
(1) : emitter types and can be discriminated perfectly, where is the lower bound of the misclassification rate between classes for any signal parameter.
(2) : emitter types and overlap completely, where is the upper bound of the misclassification rate between classes for any signal parameter.
(3) or : emitter types and overlap partially.
In addition, the maximum mutual classification error between emitter type and on signal parameter could be defined as
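The following sketch illustrates how intersection points and a mutual classification error can be computed for two normal class distributions. It is an interpretation under stated assumptions, not the paper's exact definition: a measurement from class `a` is counted as misclassified when the density of class `b` is higher at that point (equal priors), and identical classes are assigned an error of 0.5. All function names are hypothetical.

```python
import math
from statistics import NormalDist


def intersections(a, b):
    """Sorted x values where the densities of normals a and b are equal."""
    ma, sa, mb, sb = a.mean, a.stdev, b.mean, b.stdev
    # Equating the log-densities yields A*x^2 + B*x + C = 0.
    A = 1 / (2 * sb**2) - 1 / (2 * sa**2)
    B = ma / sa**2 - mb / sb**2
    C = mb**2 / (2 * sb**2) - ma**2 / (2 * sa**2) + math.log(sb / sa)
    if A == 0:  # equal standard deviations: at most one crossing
        return [] if B == 0 else [-C / B]
    disc = B * B - 4 * A * C
    if disc < 0:
        return []
    root = math.sqrt(disc)
    return sorted([(-B - root) / (2 * A), (-B + root) / (2 * A)])


def _log_pdf(d, x):
    # Log-density up to a shared constant; avoids underflow in the tails.
    return -math.log(d.stdev) - (x - d.mean) ** 2 / (2 * d.stdev**2)


def _cdf(d, x):
    if x == -math.inf:
        return 0.0
    if x == math.inf:
        return 1.0
    return d.cdf(x)


def misclassification(a, b):
    """Probability mass of a lying where b has the higher density."""
    if a.mean == b.mean and a.stdev == b.stdev:
        return 0.5  # assumption: fully overlapping classes are a coin flip
    cuts = [-math.inf] + intersections(a, b) + [math.inf]
    step = max(a.stdev, b.stdev)
    error = 0.0
    for lo, hi in zip(cuts, cuts[1:]):
        if lo == -math.inf:
            probe = hi - step
        elif hi == math.inf:
            probe = lo + step
        else:
            probe = (lo + hi) / 2
        if _log_pdf(b, probe) > _log_pdf(a, probe):
            error += _cdf(a, hi) - _cdf(a, lo)
    return error


def max_mutual_error(a, b):
    """Larger of the two directed misclassification probabilities."""
    return max(misclassification(a, b), misclassification(b, a))


print(max_mutual_error(NormalDist(0, 1), NormalDist(4, 1)))
```

Splitting the real line at the intersection points keeps the computation exact for normal densities, since each segment then has a single winning class.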
Following the definition given by Shannon and Hartley [20, 21], the representation of the maximum mutual classification error information can be expressed in terms of a logarithmic scale of base 10 as
Likewise, the minimum information required for discriminating between two emitter classes and for signal parameter is given by
If , the two emitter classes and can be discriminated by signal parameter with the classification error smaller than a predefined threshold . In that case, signal parameter is assumed to be discriminating for type pair . The results of class discriminant analysis are stored in an upper triangular discriminating power matrix . The element () of matrix would be specified as the index of the most discriminating signal parameter for the pair of emitter types , as defined below:
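A minimal sketch of how such an upper triangular discriminating power matrix could be filled is given below. It assumes the maximum mutual classification errors have already been computed and stored in a hypothetical nested list `max_error[p][i][j]`, and it applies the threshold directly to the error rather than to its logarithmic information form, which is equivalent only if the information measure is monotone in the error. Entries with no sufficiently discriminating parameter are left as `None`.

```python
def discriminating_power_matrix(max_error, threshold):
    """Build the upper triangular matrix of most discriminating parameters.

    max_error[p][i][j] is the maximum mutual classification error between
    emitter types i and j on signal parameter p (only i < j is read).
    """
    n_params = len(max_error)
    n_classes = len(max_error[0])
    matrix = [[None] * n_classes for _ in range(n_classes)]
    for i in range(n_classes):
        for j in range(i + 1, n_classes):
            # Parameter with the smallest error for this pair of types.
            best = min(range(n_params), key=lambda p: max_error[p][i][j])
            if max_error[best][i][j] < threshold:
                matrix[i][j] = best
    return matrix


# Illustrative errors for two parameters and three emitter types.
errors = [
    [[0, 0.30, 0.02], [0, 0, 0.40], [0, 0, 0]],  # parameter 0
    [[0, 0.01, 0.20], [0, 0, 0.45], [0, 0, 0]],  # parameter 1
]
for row in discriminating_power_matrix(errors, threshold=0.05):
    print(row)
```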
3.4. Incremental Learning
For the interval-valued signal parameters, such as radio frequency (RF), pulse repetition interval (PRI), pulse width (PW), and pulse amplitude (PA), we define the data description model as a mean value matrix , a variation matrix , and a class distribution vector accordingly as follows.
Definition 3 (mean value matrix ). One defines each element of the two-dimensional mean value matrix as the sum of estimated mean values of signal parameter from all the observations in type , where and . Mathematically speaking,
Definition 4 (variation matrix ). One defines each element of variation matrix as the sum of estimated variations of signal parameter on all the observations in emitter type , where and . Mathematically speaking,
Based on the above two definitions and Lemma 2, the mean value and standard deviation of the normal distribution for each emitter type and each signal parameter could be formalized as
Similarly, the intersection set , the mutual classification error , the maximum mutual classification error , the maximum mutual classification error information , and the discriminating power matrix could be updated accordingly.
In the incremental interval-valued parameter analysis process, once new observations at time are collected, class distribution vector , mean value matrix , and variation matrix are updated sequentially. The discriminating power matrix is then updated from the updated matrices and vector. The outline of our IDAIP method is illustrated in Algorithm 1.
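The incremental update can be sketched as a small class that keeps only running sums per emitter type and signal parameter. This is an illustration under assumptions, not the paper's Algorithm 1: the "variation" is interpreted as the estimated variance, and the class standard deviation is taken as the square root of the average variance, in line with the assumption that interval and class standard deviations coincide. The class and method names are hypothetical.

```python
import math


class IncrementalClassModel:
    """Running sums that describe each class without storing observations."""

    def __init__(self, n_classes, n_params):
        self.mean_sum = [[0.0] * n_params for _ in range(n_classes)]  # mean value matrix
        self.var_sum = [[0.0] * n_params for _ in range(n_classes)]   # variation matrix
        self.counts = [0] * n_classes                                 # class distribution vector

    def update(self, emitter_type, means, stds):
        """Fold one new observation's per-parameter estimates into the sums."""
        for j, (m, s) in enumerate(zip(means, stds)):
            self.mean_sum[emitter_type][j] += m
            self.var_sum[emitter_type][j] += s * s
        self.counts[emitter_type] += 1

    def class_distribution(self, emitter_type, param):
        """Return the current (mean, std) estimate for one class and parameter."""
        n = self.counts[emitter_type]
        if n == 0:
            raise ValueError("no observations for this emitter type yet")
        mean = self.mean_sum[emitter_type][param] / n
        std = math.sqrt(self.var_sum[emitter_type][param] / n)
        return mean, std


model = IncrementalClassModel(n_classes=2, n_params=1)
model.update(0, means=[10.0], stds=[1.0])
model.update(0, means=[12.0], stds=[1.0])
print(model.class_distribution(0, 0))
```

Because each update touches only one row of each matrix, its cost depends on the number of signal parameters and not on how many observations have been seen so far.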

4. Results
We evaluated our IDAIP method on a series of synthetic data sets and one real-life data set.
In the synthetic data, the mean values of measurements for the same signal parameter from observations in the same emitter type all follow the same normal distribution, while those from observations in different emitter types may not. The standard deviations of measurements for the same parameter from observations in the same emitter type are kept the same. Each interval-valued parameter measurement was obtained by randomly generating five () samples from the corresponding parameter normal distribution and selecting the minimum and maximum measurements as the lower and upper bound, respectively, to form an interval.
During experiments, we fixed the number of observations in each emitter type to be the same, denoted as . The original synthetic data set was composed of () observations from five () different emitters, denoted as and , () observations per type. The original number of signal parameters was initialized as ten (), denoted as and .
The real-life emitter data set consisted of observations from three different emitter classes, denoted as , , and , respectively. Each emitter class had around observations. Each observation was composed of eight interval-valued measurements. In addition, an independent test emitter data set of observations was provided to validate the emitter identification model constructed from the emitter training data.
All the experiments were conducted on a Dell PC running Microsoft Windows XP with a 2.6 GHz Pentium dual-core CPU and 2 GB of RAM.
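The interval generation described above can be sketched as follows. The class mean, standard deviation, and seed are illustrative values only, and the sketch assumes that the same standard deviation is used both for drawing observation means and for drawing the individual measurements, following the ideal-case assumption of Section 3.2. The function names are hypothetical.

```python
import random


def simulate_interval(mean, std, q, rng):
    """Draw q normal samples and keep only their minimum and maximum."""
    samples = [rng.gauss(mean, std) for _ in range(q)]
    return min(samples), max(samples)


def simulate_class(class_mean, class_std, n_obs, q=5, seed=0):
    """Simulate n_obs interval-valued measurements for one emitter type."""
    rng = random.Random(seed)
    intervals = []
    for _ in range(n_obs):
        # Each observation's mean is itself drawn from the class distribution.
        obs_mean = rng.gauss(class_mean, class_std)
        intervals.append(simulate_interval(obs_mean, class_std, q, rng))
    return intervals


intervals = simulate_class(class_mean=100.0, class_std=2.0, n_obs=3)
for lower, upper in intervals:
    print(f"[{lower:.2f}, {upper:.2f}]")
```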
4.1. Evaluation on Synthetic Data
We evaluated both the efficiency and effectiveness of our IDAIP method on the synthetic data sets. The effectiveness of our method was evaluated in terms of class distribution inference and class discriminant analysis.
4.1.1. Evaluation of Efficiency
During the experiments, we compared the runtime of our IDAIP method versus the batch one without incremental learning.
Firstly, we compared the efficiency of our method against the batch one when varying the number of observations of each emitter type, , between to . The number of emitter types was set as 5, , and the number of signal parameters was fixed at ten, . During the experiment, our method incrementally updates the mean value matrix, the variation matrix, and the class distribution vector once new observations arrive and performs the class discriminant analysis based on the updated matrices, while the batch method conducts the class discriminant analysis using the complete set of available observations. As can be seen from Figure 3(a), the runtime of incremental learning remains approximately constant while that of the batch method increases linearly.
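Figure 3: Runtime of IDAIP versus the batch method. (a) Varying the number of observations per emitter type; (b) varying the number of emitter types; (c) varying the number of signal parameters.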
Secondly, we varied the number of available emitter types, , from four to twelve while we fixed the number of observations per emitter type as and the number of signal parameters as ten. Again, our method incrementally updated the three matrices once observations from a new emitter type were collected. Our method is orders of magnitude faster than the batch method, as illustrated in Figure 3(b).
Finally, we varied the number of signal parameters, , from nine to 18, while fixing the number of observations per emitter type as and the number of emitter types as five. Our IDAIP method incrementally performed the class discriminant analysis when two new signal parameters were available. Again, our method is significantly more efficient than the batch method, as indicated in Figure 3(c).
4.1.2. Evaluation of Class Distribution Inference
Given an estimated value of mean or standard deviation of a class normal distribution for a certain signal parameter from a certain emitter type, we define the absolute error as the absolute difference between the estimated value and the underlying true value. During experiments, we varied the number of observations per emitter type, , and calculated the corresponding absolute errors for each parameter from each emitter type. Under each parameter setting, we simulated the experiments 1000 times and generated a boxplot for the absolute errors. It turned out that the absolute errors of signal parameters from each emitter type all tend to converge to zero. For instance, the absolute errors of parameter on emitter type converge to zero with the increase of the number of observations per emitter type from to , as shown in Figure 4(a). A similar trend could be observed in the boxplot of absolute errors of standard deviations of parameter on emitter type , as shown in Figure 4(b). Similar results could be obtained on other parameters. As can be seen, these results were rather consistent with Lemma 2.
Figure 4: (a) Estimated mean value; (b) estimated standard deviation.
4.1.3. Evaluation of Class Discriminant Analysis
We simulated a series of interval-valued synthetic data sets with different numbers of signal parameters while varying the number of observations per emitter type. The number of emitter types was fixed at ten during experiments. Under each signal parameter number setting, we simulated the synthetic data set one thousand times. We reported the average percentage of correctly identified discriminating signal parameters in the discriminating power matrix . Again, as the number of observations per emitter type increases, the average percentage of correctly identified discriminating signal parameters in the discriminating power matrix rises, as shown in Figure 5.
4.2. Evaluation on Real-Life Data
We also evaluated our IDAIP method on a real-life emitter data set of signal parameter measurements. We compared our method against the benchmark class interval discriminant analysis method [22] and the point value replacement methods [3–5] in terms of class discriminant analysis, emitter identification scalability, and accuracy.
4.2.1. Evaluation of Class Discriminant Analysis
We report the class discriminant results of parameter only as the results of the other two signal parameters were similar.
During experiments, the class interval discriminant analysis method inferred the class intervals and estimated the maximum mutual classification error between each class pair for parameter , . Instead of adopting a weighted interval estimation and class inference strategy as our method does, the point value replacement method simply picked the lower bounds and upper bounds, which were treated equally for normal class distribution inference. For fair comparison, the associated maximum mutual classification error between each class pair for parameter , , was calculated as well for the point value replacement methods and our method. We applied a common discrimination threshold for the three methods. The class pair would be considered discriminable if the corresponding is below threshold and indiscriminable otherwise.
As can be seen from Figure 6(a), the three emitter classes could not be discriminated under the uniform assumption, as their measurements overlap heavily with each other. For this reason, the class interval method was unable to discriminate any of the three class pairs. The point value replacement method, on the other hand, was able to discriminate one class pair . By comparison, our IDAIP method discriminated between two class pairs, and , fairly well, as shown in Figure 6(b). The maximum mutual classification errors in class discriminant analysis of the three methods are illustrated in Figure 6(c). As can be observed, our method outperformed the other two methods by always achieving the smallest maximum mutual classification error.
Figure 6: (a) Class distributions under the uniform distribution assumption; (b) class distributions under the normal distribution assumption; (c) comparison of class discriminant analysis.
4.2.2. Evaluation of Emitter Identification
To evaluate the emitter identification scalability and accuracy of our IDAIP method, we first designed a naive emitter identification approach after the incremental discriminant analysis. Given a test instance consisting of individual parameter measurements, its emitter type could be inferred as shown in
where indicates the weight of each signal parameter , which could be calculated as below:
where denotes the occurrence count of parameter in the discriminating power matrix .
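One possible reading of this naive identification scheme is sketched below. Only the use of occurrence counts in the discriminating power matrix comes from the text; the normalization of the weights and the scoring rule (a weighted sum of class densities, with the highest score winning) are assumptions made for illustration, and the function names are hypothetical.

```python
from collections import Counter
from statistics import NormalDist


def parameter_weights(power_matrix, n_params):
    """Weight each parameter by how often it appears in the matrix."""
    counts = Counter(p for row in power_matrix for p in row if p is not None)
    total = sum(counts.values())
    if total == 0:
        # Assumption: fall back to equal weights if nothing is discriminating.
        return [1 / n_params] * n_params
    return [counts.get(p, 0) / total for p in range(n_params)]


def identify(measurements, class_models, weights):
    """Return the index of the emitter type with the highest weighted score.

    class_models[k][p] is the inferred NormalDist of parameter p in type k.
    """
    scores = [
        sum(w * dists[p].pdf(x) for p, (x, w) in enumerate(zip(measurements, weights)))
        for dists in class_models
    ]
    return max(range(len(scores)), key=scores.__getitem__)


power_matrix = [[None, 1, 0], [None, None, 1], [None, None, None]]
weights = parameter_weights(power_matrix, n_params=2)
class_models = [
    [NormalDist(0, 1), NormalDist(0, 1)],
    [NormalDist(5, 1), NormalDist(5, 1)],
]
print(weights, identify([4.8, 5.1], class_models, weights))
```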
(i) Evaluation of Scalability. We compared our incremental emitter identification method based on incremental discriminant analysis against the benchmark point value replacement methods. Specifically, we transformed the original interval-valued real-life data into the middle value and range format. The original size of the emitter data set was set as , each emitter type with observations. Then, the size of emitter data was incrementally expanded to , , and . We compared the performance of our interval method against that of the benchmark point value methods: logistic regression, Multilayer Perceptron, Naive Bayes, SVM, NN, AdaBoostM1, and decision tree.
As shown in Figure 7, our incremental emitter identification method was much more scalable than the benchmark ones. The majority of benchmark point value methods were either unable to finish running within five minutes or exited due to memory problems when the data size was beyond . Specifically, the runtime of NN and Multilayer Perceptron exceeded five minutes when the data size reached . When the data size reached , the decision tree method ran into a memory allocation problem and terminated before it finished. So did the AdaBoostM1 method when the data size reached . Though the logistic regression, SVM, and Naive Bayes methods scaled linearly with the increase in data size, they were not adaptable for incremental learning.
(ii) Evaluation of Accuracy. For the above reason, we only report the emitter identification accuracy when the training size was below . Figure 8(a) compares the performance of our emitter identification approach against the benchmark point value methods when the training data size was , while Figure 8(b) shows the identification accuracy boxplots of our method and the benchmark point value methods when the training data size varied between and . As can be observed, our emitter identification approach outperformed the benchmark point value methods significantly. This is because it makes better use of the underlying normal parameter measurement distribution.
Figure 8: (a) Performance at a fixed training data size; (b) boxplots of accuracy with varying data size.
5. Conclusion
In this work, we have proposed an incremental discriminant analysis method on interval-valued parameters for emitter identification (IDAIP) to address the problems of uncertainty in emitter signal parameter measurement and rapid growth in emitter data volumes. Our method is not only robust to the uncertainty of interval-valued parameter measurements but also able to carry out the emitter parameter analysis incrementally for emitter identification. Extensive experiments have indicated the efficiency and effectiveness of our method. The runtime of our method is approximately linear with respect to the number of newly arrived observations and the number of signal parameters. Our method has outperformed benchmark interval-valued machine learning methods that assume uniform distributions. These merits make our IDAIP method promising for emitter identification in both military and civil applications.
Notations
:  The number of emitter types 
:  The set of observations in emitter type 
:  The number of signal parameters 
:  The total number of observations 
:  The number of independent measurements for each parameter and each observation 
:  An interval-valued measurement 
:  The estimated mean value of measurements for parameter from observation 
:  The estimated standard deviation of measurements for parameter from observation 
:  The estimated mean value of measurements for parameter from emitter type 
:  The estimated standard deviation of measurements for parameter from emitter type 
:  The mean value matrix 
:  The variation matrix 
:  The class distribution vector. 
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This work was supported by National Natural Science Foundation of China (nos. 61402426 and 61373129) and partially supported by Collaborative Innovation Center of Novel Software Technology and Industrialization.