Abstract

The autonomous decision-making of a UAV is based on rapid and accurate threat assessment of the target. Accordingly, modeling of threat assessment under the condition of a small data set is studied in this paper. First, the operational scenario of manned/unmanned aerial vehicles is constructed, and feature selection and data preprocessing are performed. Second, to obtain the structure, a modeling method for threat assessment is proposed based on an improved BIC score. Finally, the obtained model is applied to compute the threat probability using the junction tree algorithm. The experimental results show that the proposed method is a viable method for threat assessment under the condition of small data sets.

1. Introduction

Threat assessment is an important component of operational decision-making. The complex and varying battlefield environment places an enormous burden on the decision-maker. To lighten this burden, several methods for threat assessment have been proposed based on machine learning, such as the analytic hierarchy process (AHP) [1], multiple attribute decision-making [2], fuzzy inference [3], rough set [4], neural network [5], support vector machine (SVM), and Bayesian network (BN) methods. These methods display advantages and disadvantages for different problems. Bayesian networks are used for threat assessment in this paper. The Bayesian network is grounded in probability theory and is therefore suitable for handling uncertain information, is a “white-box” model compared with the neural network and support vector machine approaches, and has the ability to handle small data sets. The Bayesian network was used to construct an air combat threat assessment model in [6], and the junction tree algorithm was applied to obtain the results of inference. Based on the external environmental information, a threat assessment model was built to obtain the threat level of the target in [7]. A dynamic Bayesian network [5] was used to model the threat situation of a battlefield environment, and the model was applied to path planning for a UAV. A threat assessment framework [8] relying on Bayesian networks was proposed, which exploits the expressiveness of first-order logic for semantic relations and the strength of Bayesian networks in handling uncertainty. A Bayesian network model to assess the public health risk associated with wet weather sewer overflows is presented in [9]. Bayesian networks [10] are employed to model the worst-case accident scenario and to assess the risks of a natural gas station.
Reference [11] presents a method to integrate and quantify the uncertainties in the warning process by evaluating a tsunami warning level probability distribution with a Bayesian network. An approach [12] for evaluating land-use adaptations was proposed for biofuel production, using Bayesian networks and integrating research on the food, water, and energy sectors. The Bayesian networks mentioned above are built using expertise only, without considering the data. Therefore, the obtained models are strongly subjective. With the wide application of statistics in machine learning and artificial intelligence, obtaining a Bayesian network from data has become a mainstream method. For structure learning of the Bayesian network, the main methods are divided into three categories, namely, methods based on the conditional independence test, methods based on scoring, and a mixed approach combining the two. Typical algorithms based on the conditional independence test include the Peter-Clark (PC) [13] algorithm and the three-phase dependency analysis (TPDA) [14] algorithm. Certain representative algorithms, such as K2 [15], ant colony optimization (ACO) [16], particle swarm optimization (PSO) [17], ant colony optimization (ACO) [18, 19], and bacterial foraging optimization (BFO) [20], belong to the methods based on score-and-search. The max-min hill climbing (MMHC) [21] and max-min parents and children (MMPC) approaches are typical algorithms of the mixed approach based on the two types of methods described above. Analysis shows that the method based on score-and-search is the most popular method. For parameter learning in the Bayesian network, maximum likelihood estimation (MLE) can obtain high-quality parameters. Unfortunately, the previously mentioned research studies were performed under the condition of sufficient data and are not applicable to a small data set.
The reason for this observation is that both the score of the structure and the maximum likelihood function correlate closely with the statistical information of the data, and small data sets lead to inaccurate statistical information.

Selected work has been conducted on the modeling method using the Bayesian network with a small data set. The primary objective is to introduce the constraint of the structure and the constraint of the parameter into the learning process. For structure learning, the existing structural constraints include the node order [22], causal link [23, 24], edge existence [25-28], and ancestral constraints [29]. The problem of structure learning is translated into the problem of objective optimization, and the various types of constraints [22, 26, 27] are directly used to limit the search space. Although this method can improve the results of structure learning under the condition of small data sets, the small data set makes the score inaccurate, so the score loses the ability to evaluate the structure of the Bayesian network, which is why the proposed algorithms could not produce satisfactory results. To improve the precision of the score, several research studies have used the structural constraints to obtain the prior distribution of the structure. G. Borboudakis [23, 24] used the edge existence to represent the expert knowledge, which is integrated into the score by means of a nonuniform prior of the structure. In [30], the edge existence is used to represent the expert knowledge and the confidence level is used to represent the uncertainty of the expert knowledge; the prior distribution of the structure is then obtained based on the edge existence and the confidence level. However, these methods have two serious shortcomings. One is that the expert knowledge plays a decisive role and is always regarded as completely correct. The other is that the number of nodes in the structural constraint is no more than 4. If the number is more than 4, the prior distribution cannot be computed precisely. Therefore, this paper proposes a new method for obtaining the prior distribution for a number of nodes greater than 4.
Combined with the new method, the threat assessment model is constructed using the K2 algorithm, and the parameters of the Bayesian network are obtained by the method in [28]. Finally, the threat assessment is measured using the model learned by the proposed method and the junction tree algorithm.

2. Preliminary

2.1. Cooperative Combat with Manned/Unmanned Aerial Vehicles

Cooperative combat with manned/unmanned aerial vehicles is an important combat mode that combines the advantages of manned aerial vehicles and unmanned aerial vehicles. The division of responsibilities is vital in cooperative combat. In the course of combat, the manned aerial vehicle sends the decision and control information to the unmanned aerial vehicle and commands the unmanned aerial vehicle to finish the combat task. The unmanned aerial vehicle sends the battlefield information obtained by its sensors to the manned aerial vehicle to aid the manned aerial vehicle in making decisions. The unmanned aerial vehicle acts as the “eyes” of the manned aerial vehicle, and the manned aerial vehicle acts as the “brain” of the unmanned aerial vehicle. The collaborative task process is illustrated in Figure 1.

2.2. Bayesian Network

The Bayesian network contains the structure and the parameters. The structure is a directed acyclic graph consisting of nodes and edges. The parameters are a set of conditional probability distributions. For each variable $X_i$, we denote the parents of $X_i$ as $\pi(X_i)$. The parameter is the conditional probability $P(X_i \mid \pi(X_i))$.

If all conditional distributions are specified, then the joint probability can be calculated as follows:

$$P(X_1, X_2, \dots, X_n) = \prod_{i=1}^{n} P(X_i \mid \pi(X_i)). \tag{1}$$

The decomposition of the joint probability makes possible a compact representation of the joint probability and also efficient implementation of probabilistic reasoning.
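
As an illustration of this decomposition, the following minimal Python sketch evaluates the joint probability of a small network as a product of conditional probabilities. The three-node chain, the node names, and all probability values are hypothetical and serve only to show the computation.

```python
from itertools import product

# Hypothetical three-node chain A -> B -> C with binary states 0/1.
parents = {"A": (), "B": ("A",), "C": ("B",)}

# cpt[node][parent_values][value] = P(node = value | parents = parent_values)
cpt = {
    "A": {(): [0.7, 0.3]},
    "B": {(0,): [0.9, 0.1], (1,): [0.4, 0.6]},
    "C": {(0,): [0.8, 0.2], (1,): [0.25, 0.75]},
}


def joint_probability(assignment):
    """Multiply P(X_i | parents(X_i)) over all nodes of the network."""
    prob = 1.0
    for node, pa in parents.items():
        pa_values = tuple(assignment[p] for p in pa)
        prob *= cpt[node][pa_values][assignment[node]]
    return prob


# Sanity check: the factorized joint sums to one over all 2^3 assignments.
total = sum(
    joint_probability(dict(zip(parents, values)))
    for values in product([0, 1], repeat=len(parents))
)
```

Only seven free probabilities are stored here, whereas a full joint table over three binary variables has seven free entries as well; the saving grows quickly as the number of nodes increases and each node keeps few parents.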

2.3. Model of Edge Existence in the Bayesian Network

In general, the edge existence is the probability of the edge or arc between two nodes. More concretely, consider two nodes denoted as variables $x$ and $y$ in a Bayesian network; there are three ways to connect $x$ and $y$. If $x$ is a parent of $y$, an arc exists from $x$ to $y$. If $y$ is a parent of $x$, an arc exists from $y$ to $x$. If $x$ has nothing to do with $y$, there is no arc. To express these relationships conveniently, $x \to y$ is used if $x$ is a parent of $y$, $x \leftarrow y$ is used if $y$ is a parent of $x$, and $x \nleftrightarrow y$ is used if $x$ has nothing to do with $y$. By probability theory, the probabilities of these three states satisfy

$$P(x \to y) + P(x \leftarrow y) + P(x \nleftrightarrow y) = 1. \tag{2}$$

If $P(x \to y) = p$ is given and the distribution of the remaining states is uniform, formula (3) is obtained as follows:

$$P(x \leftarrow y) = P(x \nleftrightarrow y) = \frac{1 - p}{2}. \tag{3}$$

If there are three nodes in a Bayesian network, the diagram of the edge existence is represented as shown in Figure 2. In Figure 2, when the probability of one state is specified for a pair of nodes, the probabilities of the other two states follow from formula (3); if there is no knowledge about a pair of nodes, the distribution over the three states of that pair is regarded as a uniform distribution.
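
The completion rule of formula (3) can be illustrated with a minimal Python sketch. The function name `edge_existence` and its keyword arguments are hypothetical. The sketch assumes that any probability mass not assigned by the expert is shared equally among the unspecified states, which also covers the case in which no knowledge is available.

```python
def edge_existence(forward=None, backward=None, absent=None):
    """Complete the three-state distribution for a node pair (x, y).

    States: x -> y (forward), x <- y (backward), no arc (absent).
    States left as None share the remaining probability mass equally.
    """
    states = {"forward": forward, "backward": backward, "absent": absent}
    given = [p for p in states.values() if p is not None]
    if any(p < 0.0 for p in given) or sum(given) > 1.0 + 1e-12:
        raise ValueError("expert probabilities must be >= 0 and sum to <= 1")
    missing = [name for name, p in states.items() if p is None]
    if not missing:
        if abs(sum(given) - 1.0) > 1e-9:
            raise ValueError("fully specified probabilities must sum to 1")
        return states
    share = (1.0 - sum(given)) / len(missing)
    return {name: (share if p is None else p) for name, p in states.items()}


# Expert knowledge only about x -> y: the other two states get (1 - p) / 2.
partial = edge_existence(forward=0.8)
# No expert knowledge: all three states are equally likely.
uninformed = edge_existence()
```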

3. Threat Assessment Method Based on Improved BIC Score

3.1. Improved BIC Score

The decomposition of the score is an important property of the scoring function. To understand the decomposition of the score, the BIC score is used as an example in

$$\mathrm{BIC}(G \mid D) = \sum_{i=1}^{n} \sum_{j=1}^{q_i} \sum_{k=1}^{r_i} m_{ijk} \log \frac{m_{ijk}}{m_{ij}} - \frac{\log m}{2} \sum_{i=1}^{n} q_i (r_i - 1), \tag{4}$$

where $m$ is the number of data, $m_{ijk}$ is the number of data when the node $X_i$ is $k$ and $\pi(X_i)$ (the parents of $X_i$) is $j$, $m_{ij} = \sum_{k=1}^{r_i} m_{ijk}$, $n$ is the number of all nodes, $q_i$ is the number of the parents' states, and $r_i$ is the number of $X_i$'s states. The family BIC score is represented by

$$\mathrm{BIC}(X_i, \pi(X_i) \mid D) = \sum_{j=1}^{q_i} \sum_{k=1}^{r_i} m_{ijk} \log \frac{m_{ijk}}{m_{ij}} - \frac{\log m}{2} q_i (r_i - 1). \tag{5}$$

The BIC score can be represented as shown in

$$\mathrm{BIC}(G \mid D) = \sum_{i=1}^{n} \mathrm{BIC}(X_i, \pi(X_i) \mid D). \tag{6}$$

In addition to the BIC score, the CH, AIC, and MDL scores all have the property of decomposition. The BIC score is a Laplace approximation based on a large sample. Therefore, if the BIC score is directly used to learn the structure with a small data size, the ideal results are hard to obtain. To obtain a satisfactory structure using the BIC score, an improved BIC score [31] is proposed based on the property of the decomposition and the model of edge existence. The improved BIC score is referred to as the I_BIC score. In formula (7), the probability of the edge existence is given by the domain expert, and the score contains an adjustable parameter with a prescribed range. As the amount of data increases, the expert-knowledge term makes the significance of the data in the score gradually stronger and that of the expert knowledge weaker. This term was proposed based on the properties and the decomposability of the BIC score.
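
To make the decomposition concrete, the following Python sketch computes the standard family BIC score from counts and sums the family scores into a network score. It covers only the standard BIC score, not the I_BIC score. The function names, the data layout (a list of dictionaries), and the toy data set are assumptions made for illustration.

```python
import math
from collections import Counter


def family_bic(data, child, parents, arity):
    """Standard family BIC score of `child` given the tuple `parents`.

    data  : list of dicts mapping variable name -> state index
    arity : dict mapping variable name -> number of states
    """
    m = len(data)
    # Counts of (parent configuration j, child state k) and of j alone.
    m_ijk = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    m_ij = Counter(tuple(row[p] for p in parents) for row in data)
    log_lik = sum(n * math.log(n / m_ij[j]) for (j, _k), n in m_ijk.items())
    q_i = math.prod(arity[p] for p in parents)  # number of parent configurations
    r_i = arity[child]                          # number of child states
    return log_lik - 0.5 * math.log(m) * q_i * (r_i - 1)


def bic(data, structure, arity):
    """Network BIC score as the sum of the family scores."""
    return sum(family_bic(data, x, pa, arity) for x, pa in structure.items())


# Toy data (hypothetical): B always copies A.
data = [{"A": 0, "B": 0}] * 4 + [{"A": 1, "B": 1}] * 4
arity = {"A": 2, "B": 2}
with_parent = family_bic(data, "B", ("A",), arity)
without_parent = family_bic(data, "B", (), arity)
```

In this toy case the family score of B is higher with A as a parent, because the gain in log-likelihood outweighs the larger complexity penalty.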

3.2. Algorithm of Structure Learning Based on I_BIC Score

The I_BIC score is provided in Section 3.1. In this section, the I_BIC_K2 algorithm is obtained by combining the I_BIC score with the K2 algorithm. The K2 algorithm is a greedy heuristic method. For each node $X_i$, the algorithm starts by assuming an empty parent set. In each step, the variable preceding $X_i$ in the node order whose addition maximally increases the family score is added to the parent set. The search stops when the number of variables in the parent set reaches the predetermined maximum number of parents or when the addition of variables cannot further increase the score. The details of the I_BIC_K2 algorithm are presented in Algorithm 1.

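Algorithm 1: I_BIC_K2($\rho$, $u$, $D$, $C$)
Input: $\rho$ - the order of nodes; $u$ - an upper bound of the number of the parents;
     $D$ - data set; $C$ - expert constraint for edge existence
Output: $G$ - the learned structure
   Initialize graph $G$ with no edge from variable set $X = \{X_1, \dots, X_n\}$;
   for i = 1 to $n$
     $\pi_i = \emptyset$;
     $V_{old} = \mathrm{I\_BIC}(X_i, \pi_i \mid D, C)$;
     while $|\pi_i| < u$ is true
       $X_z = \arg\max_{X_z} \mathrm{I\_BIC}(X_i, \pi_i \cup \{X_z\} \mid D, C)$, taken over the nodes that precede $X_i$ in $\rho$ and are not in $\pi_i$;
       $V_{new} = \mathrm{I\_BIC}(X_i, \pi_i \cup \{X_z\} \mid D, C)$;
       if $V_{new} > V_{old}$
        $V_{old} = V_{new}$;
        $\pi_i = \pi_i \cup \{X_z\}$;
        Add an edge $X_z \to X_i$ to graph $G$;
       else
        break;
       end if
     end while
   end for
   return $G$;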
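
The greedy search of Algorithm 1 can be sketched in Python as follows. The score is passed in as a callable so that the search does not depend on a particular scoring function; the example plugs in the standard family BIC score as a stand-in, not the I_BIC score. The names `k2` and `make_bic_score` and the toy data are assumptions made for illustration.

```python
import math
from collections import Counter


def make_bic_score(data, arity):
    """Return a family score function (standard BIC, used as a stand-in)."""
    m = len(data)

    def score(child, parents):
        n_jk = Counter((tuple(r[p] for p in parents), r[child]) for r in data)
        n_j = Counter(tuple(r[p] for p in parents) for r in data)
        log_lik = sum(n * math.log(n / n_j[j]) for (j, _k), n in n_jk.items())
        q = math.prod(arity[p] for p in parents)
        return log_lik - 0.5 * math.log(m) * q * (arity[child] - 1)

    return score


def k2(order, max_parents, family_score):
    """Greedy K2 search: parents of a node are chosen among its predecessors."""
    parent_sets = {}
    for i, child in enumerate(order):
        parents = ()
        v_old = family_score(child, parents)
        while len(parents) < max_parents:
            candidates = [z for z in order[:i] if z not in parents]
            if not candidates:
                break
            best = max(candidates, key=lambda z: family_score(child, parents + (z,)))
            v_new = family_score(child, parents + (best,))
            if v_new > v_old:
                v_old, parents = v_new, parents + (best,)
            else:
                break  # no candidate improves the family score
        parent_sets[child] = parents
    return parent_sets


# Toy data (hypothetical): B copies A, and C is independent of both.
rows = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)] * 5
data = [dict(zip("ABC", r)) for r in rows]
arity = {"A": 2, "B": 2, "C": 2}
structure = k2(["A", "B", "C"], 2, make_bic_score(data, arity))
```

Replacing the callable with a score that adds an expert-knowledge term to each family score would give the search used in this section, since only the family score changes and the search loop stays the same.
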
3.3. Algorithm of Parameter Learning Based on Monotonic Constraint Estimation Algorithm

If the parameters of Bayesian network satisfy

the is considered to be consistent with the monotonic constraint, and then the parameter learning process is given as follows.

Step 1. Establish the monotonic constraint based on the expert knowledge.

Step 2. Obtain the interval of every parameter using (9); the scope of the first parameter is obtained first.

Step 3. The parameter whose scope is given by (9) can be presumed to conform to a uniform distribution over that scope. Then convert the uniform distribution into a Beta distribution using (10) and obtain the two virtual sample counts.

Step 4. Merge the virtual sample information into the Bayesian estimation and obtain the estimates of the parameters using (11), where $m_{ijk}$ is the number of samples in which the value of node $X_i$ is $k$ and the value of its parents is $j$.

Step 5. Take the obtained parameter as the lower bound of the next parameter, return to Step 2, and repeat Steps 2-5 until all the parameters are obtained.
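
Steps 3 and 4 can be illustrated with a short Python sketch. The conversion used here is moment matching, which is an assumption: it selects the Beta distribution with the same mean and variance as the uniform distribution and is not necessarily the conversion of formula (10). The function names and the numbers in the example are hypothetical.

```python
def uniform_to_beta(lower, upper):
    """Moment-match U(lower, upper), with 0 <= lower < upper <= 1, to Beta(alpha, beta).

    Assumption: the virtual samples are the hyperparameters of the Beta
    distribution that has the same mean and variance as the uniform one.
    """
    if not 0.0 <= lower < upper <= 1.0:
        raise ValueError("need 0 <= lower < upper <= 1")
    mean = (lower + upper) / 2.0
    var = (upper - lower) ** 2 / 12.0
    nu = mean * (1.0 - mean) / var - 1.0  # equivalent sample size alpha + beta
    return mean * nu, (1.0 - mean) * nu


def bayes_estimate(m_ijk, m_ij, alpha, beta):
    """Posterior mean of a parameter with a Beta(alpha, beta) prior."""
    return (m_ijk + alpha) / (m_ij + alpha + beta)


# Hypothetical case: the constraint bounds the parameter to [0.6, 0.8],
# and only 5 samples are available, 3 of which fall in the state of interest.
alpha, beta = uniform_to_beta(0.6, 0.8)
theta = bayes_estimate(3, 5, alpha, beta)
```

With so few samples, the estimate stays close to the centre of the expert interval instead of the raw frequency 3/5; as the sample counts grow, the virtual samples carry less weight.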

4. Experiments

4.1. Results of Structure Learning Using the I_BIC_K2 Algorithm

The Asia model is used as the simulation model. The Asia network has 8 variables and 8 edges. To represent the Asia model easily, the nodes are indicated using numbers from one to eight. The model is given in Figure 3. For the evaluation criterion, in the score-and-search paradigm, the aim of structure learning is to find the structure with the maximal score. Therefore, we choose the standard BIC score as one criterion: the higher the BIC score is, the better the network approximates the distribution of the data set. At the same time, the Hamming distance is chosen as the other criterion to evaluate the structure: the smaller the Hamming distance is, the more accurate the structure is. Formula (4) expresses the standard BIC score, and formula (12) describes the Hamming distance:

$$H(G) = A(G) + M(G) + W(G). \tag{12}$$

In formula (12), $A(G)$ represents the number of redundant edges, $M(G)$ represents the number of missing edges, and $W(G)$ represents the number of wrong edges. The experimental data sets are generated using the Bayes Net Toolbox (BNT) supplied by K. Murphy. Using these data sets and the selected criteria, we perform a comprehensive analysis of the improvements proposed in Section 3. All algorithms are programmed using the Matlab 2010 software package.
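
The Hamming distance between a learned structure and the true structure can be sketched in Python as follows. The sketch assumes that a "wrong" edge is an edge that is present in both structures but with opposite direction. The edge list is the standard Asia network written with descriptive node names instead of the numbers one to eight, and the learned structure is hypothetical.

```python
def hamming_distance(learned, true):
    """Count redundant, missing, and wrong (reversed) edges.

    learned, true : sets of directed edges (parent, child)
    """
    wrong = {(a, b) for (a, b) in learned if (b, a) in true}
    redundant = {(a, b) for (a, b) in learned
                 if (a, b) not in true and (b, a) not in true}
    missing = {(a, b) for (a, b) in true
               if (a, b) not in learned and (b, a) not in learned}
    return len(redundant) + len(missing) + len(wrong)


# The eight edges of the standard Asia network.
asia = {
    ("asia", "tub"), ("smoke", "lung"), ("smoke", "bronc"),
    ("tub", "either"), ("lung", "either"),
    ("either", "xray"), ("either", "dysp"), ("bronc", "dysp"),
}

# Hypothetical learned structure: one edge missing, one reversed, one redundant.
learned = (asia - {("either", "dysp"), ("smoke", "lung")}) | {
    ("lung", "smoke"),  # reversed
    ("asia", "xray"),   # redundant
}
distance = hamming_distance(learned, asia)
```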

The Asia model in Figure 3 is selected as the experimental model. The experiments are divided into three groups: the expert constraint is correct, the expert constraint is incompletely correct, and the expert constraint is wrong. The proposed algorithm is compared with the traditional K2 algorithm under the three conditions. To reduce random error, the experiments are repeated 100 times. The experimental results are presented in Figures 4-12. It is concluded that the proposed method produces more accurate results when the expert constraint is correct or incompletely correct. In particular, the advantage is quite obvious when the data size is less than 500. As the data size increases, the advantage of the proposed method decreases. When the expert constraint is wrong, the results indicate that the proposed method is slightly weaker than the traditional K2 algorithm. However, with increasing data size, the results from the proposed method approach the results from the traditional K2 algorithm. In other words, the proposed method has good adaptability to the wrong expert constraint.

4.2. The Influence of the Adjustable Parameter on the Results of Structure Learning Using the I_BIC_K2 Algorithm

To study the influence of the adjustable parameter on the results of structure learning using the I_BIC_K2 algorithm, the Hamming distance is computed for different values of the parameter. The experimental results are presented in Figures 13-14 and Tables 1 and 2. The parameter is set to values from two different discrete intervals. Several conclusions can be drawn from the experimental results. First, for one of the two intervals, all the Hamming distances obtained by the proposed method are less than the Hamming distance obtained by the K2 algorithm. However, for the other interval, the Hamming distance obtained by the proposed method is not always less than the Hamming distance obtained by K2 as the data size increases. Second, in most cases the results under one setting are better than the results under the other. These observations are the reason for the parameter value selected in this paper.

The Asia model is selected in the experiments. However, the conclusion is of general significance, because the parameter is used to optimize the family BIC score and its selection has little relation with the number of nodes in the whole network. Therefore, the experimental results in Figures 13-14 can be regarded as a guide for selecting the parameter. In this paper, the parameter value that gives better results in most cases is used to compute the BIC score.

4.3. Collaborative Threat Assessment by Manned/Unmanned Aerial Vehicles
4.3.1. Simulation Scenario

The battlefield is composed of red and blue groups. The red group contains one manned aircraft and three unmanned aerial vehicles. The blue group contains two missile positions, two flak positions, and one radar position. The military task is for the manned aircraft and the unmanned aerial vehicles to make a cooperative attack on the blue group. At a certain point, the communication between the manned aircraft and the unmanned aerial vehicles is interrupted or the manned aircraft is destroyed, and the unmanned aerial vehicles must execute the tasks independently. The combat mission scenario is shown in Figure 15. The parameters of the red group are given in Table 3, and the parameters of the blue group are given in Table 4.

4.3.2. Pretreatment of Threat Factors

In the course of combat, the battlefield information contains both information from the sensors and decision information from the manned aerial vehicle. The battlefield information is too complicated to be used directly. For example, certain information is continuous and other information is discrete. Therefore, pretreatment of the battlefield information must be applied. The threat factors are described as follows:
(1) Sharp (Sp).
(2) Ele-radiation (Er).
(3) Size (Si).
(4) Velocity (V): velocity of the target, defined with reference to the velocity of the UAV.
(5) Height (H).
(6) Angle (A): defined from the angle between the velocity of the target and the target line and the angle between the velocity of the UAV and the target line.
(7) Value (Vl): greater value, lesser value.
(8) Distance (Dis).
(9) Identification friend or foe (IFF).
(10) Target number (Tn).
(11) Attack scope (As).
(12) Anti-attack capability (Aac).
(13) Target type (Ty).
(14) Attack intent (Ai).
(15) Attack capability (Ac): strong, middle, weak.
(16) Threat (T).

4.3.3. Simulation Conditions and Process

The simulation environment is Windows 10, and the compilation environment is Matlab 2010.

Before the communication interruption occurs, the manned aerial vehicle makes decisions based on the information from the sensors and sends the decision information to the UAV. The UAV completes the mission based on the decision information. All information is handled by the method described in Section 4.3.2. A portion of the processed information is listed in Table 5.

After the communication interruption occurs, the UAV uses the processed information to construct the structure of the threat assessment model using the I_BIC_K2 algorithm and obtains the parameters using the method described in [28].

Under the same condition, the traditional K2 algorithm is used to construct the structure of the threat assessment model, and maximum likelihood estimation is used to learn the parameters of the model.

After obtaining the structure and parameters, the junction tree algorithm is used to compute the probability of the threat. The junction tree algorithm as implemented in the BNT is used.
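
The quantity computed in this step is the posterior probability of a query node given evidence. The junction tree algorithm computes it exactly and efficiently; for a very small network, the same posterior can be obtained by brute-force enumeration, which the following Python sketch uses as a stand-in. The three-node fragment (target type and attack intent as parents of threat) and all probability values are hypothetical and are not the learned model.

```python
from itertools import product

# Hypothetical fragment: Ty (target type) and Ai (attack intent) -> T (threat).
parents = {"Ty": (), "Ai": (), "T": ("Ty", "Ai")}
cpt = {
    "Ty": {(): [0.6, 0.4]},
    "Ai": {(): [0.5, 0.5]},
    "T": {(0, 0): [0.9, 0.1], (0, 1): [0.5, 0.5],
          (1, 0): [0.4, 0.6], (1, 1): [0.1, 0.9]},
}


def posterior(query, evidence):
    """Exact posterior P(query | evidence) by summing the joint probability."""
    nodes = list(parents)
    scores = [0.0, 0.0]  # all nodes are binary in this sketch
    for values in product([0, 1], repeat=len(nodes)):
        assignment = dict(zip(nodes, values))
        if any(assignment[k] != v for k, v in evidence.items()):
            continue  # inconsistent with the evidence
        p = 1.0
        for node in nodes:
            pa_values = tuple(assignment[q] for q in parents[node])
            p *= cpt[node][pa_values][assignment[node]]
        scores[assignment[query]] += p
    z = sum(scores)
    return [s / z for s in scores]


threat_given_inputs = posterior("T", {"Ty": 1, "Ai": 1})
type_given_threat = posterior("Ty", {"T": 1})
```

Enumeration grows exponentially with the number of nodes, which is why an algorithm such as the junction tree is used for the 16-node model.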

4.3.4. Simulation Results

To construct the threat assessment model based on the Bayesian network with small data size, the expert knowledge for the edge existence probability is given first. To represent the model more easily, the nodes are numbered as follows: the threat node is node 1, the attack capability node is node 2, the target type node is node 3, the attack intent node is node 4, the target number node is node 5, the attack scope node is node 6, the anti-attack capability node is node 7, the sharp node is node 8, the ele-radiation node is node 9, the size node is node 10, the velocity node is node 11, the height node is node 12, the angle node is node 13, the value node is node 14, the distance node is node 15, and the identification friend or foe node is node 16. The constraint of the edge existence probability is given as a set of edge probabilities, where, for example, $P(4 \to 10) = 0.05$ indicates that the probability of the edge from node 4 to node 10 is 0.05. To illustrate the superiority of the proposed method, the BN structure learned by the traditional K2 algorithm for a data size of 3000 is treated as the correct structure. The proposed method is compared with the traditional K2 algorithm for data sizes of 30, 100, 500, and 800.

The experimental results show that the expert constraint is useful in enhancing the accuracy of the structure learning. Figure 16 is obtained using the traditional K2 algorithm with 3000 samples, and Figure 16 is treated as the true structure. The experimental results are given in Figures 23 and 24 when the data size is 800. For a data size of 800, the I_BIC_K2 algorithm obtains the same structure as the structure in Figure 16, and the traditional K2 algorithm cannot obtain the accurate structure. The experimental results are given in Figures 17 and 18 when the data size is 30. For a data size of 30, the structure obtained by the I_BIC_K2 algorithm has more edges that are correct than the traditional K2 algorithm. The structure obtained by the I_BIC_K2 algorithm has no edges that are wrong, and the structure obtained by the traditional K2 algorithm has certain edges that are wrong. The experimental results are given in Figures 19-22 when the data size is 100 and 500. For data sizes of 100 and 500, although the structure obtained by the traditional K2 algorithm has no edges that are wrong, it still has fewer edges that are correct than the structure obtained by the I_BIC_K2 algorithm. In view of the overall situation, the structures obtained by the I_BIC_K2 algorithm and the traditional K2 algorithm both become more accurate with increases in the size of the data. However, the I_BIC_K2 algorithm always obtains a more accurate structure than the traditional K2 algorithm.

After obtaining the structure of the threat assessment model, the parameters of the model must be learned for its application. In this paper, the monotonic constraint estimation (MCE) algorithm in [28] and the maximum likelihood estimation (MLE) are used to learn the parameters. The partial parameters are given in Tables 6-11. The 3000-MLE case represents the results of MLE with 3000 samples. The 100-MCE case represents the results of MCE with 100 samples. The 100-MLE case represents the results of MLE with 100 samples. As mentioned in [28], the MCE algorithm is superior to the MLE under the condition of small data size. Therefore, the MCE algorithm is chosen as the algorithm for parameter learning in this paper.

After the structure and the parameters are obtained, the threat assessment model is complete and is applied to assess the threat probability. When evidence is given in terms of IFF=1, Tn=1, Sp=2, Er=1, Si=2, Dis=3, Al=1, A=1, V=1, H=1, Dc=1, and As=1, the junction tree algorithm is used to compute the posterior probability of the threat, as shown in Figure 25. In Figure 25, the standard value is the reasoning result based on the threat assessment model constructed using the traditional method with 3000 samples. The other two results are both obtained with 500 samples. The reasoning result of the proposed method is closer to the standard value than that of the traditional method. Therefore, the proposed method can construct a more accurate model with small data size.

To verify the feasibility of the proposed method, it is compared with the multiattribute decision-making algorithm for threat assessment, which is one of the most widely used methods for threat assessment. In the simulation experiment, the target attributes used by the multiattribute decision-making algorithm are target type, velocity, distance, angle, and target value. In the course of combat, the target parameters at a certain point in time are given in Table 12. At that moment, the multiattribute decision-making algorithm is used to compute the threat value of the targets. The results are shown in Figure 26. Meanwhile, the proposed method is used to obtain the threat value. The results are shown in Figure 27. As can be seen in Figures 26 and 27, the results of the two methods are basically consistent. For example, when the threat value is greater than 0.5 in Figure 26, the corresponding target has a high threat probability. When the threat value is less than 0.5 in Figure 26, the corresponding target has a low threat probability.

5. Conclusions

Under the condition of small data size, a method for constructing a threat assessment model based on the Bayesian network is proposed. The threat assessment model is built from both expert knowledge and data. Compared with the traditional method, the proposed method can obtain a more accurate threat assessment model with the help of the expert knowledge. Moreover, the method for learning the Bayesian network in this paper is adaptable to erroneous expert knowledge. The research in this paper offers a feasible method for constructing the threat assessment model with small data size. Additionally, this research provides a new approach for modeling problems in other areas.

Some of the future research directions are (i) to combine the proposed score with other search algorithms, (ii) to analyze the relationship between the adjustable parameter and the data size and to give a way to select this parameter, and (iii) to obtain the optimal node ordering using suitable algorithms.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors are grateful to Professor Gao and Dr. Guo for discussions. This study was supported by the National Natural Science Foundation of China (Grants nos. 61305133 and 61573285).