Abstract
The digital twin is one of the most promising emerging technologies in the field of unmanned combat and has the potential to reshape future combat styles. Online battlefield learning is one of the key technologies for supporting the successful application of the digital twin in unmanned combat. Since effective algorithms for learning the battlefield states online and in real time are urgently needed, a new random finite set- (RFS-) based algorithm is proposed that accounts for detection uncertainty, including clutter, missed detections, and measurement noise. The unmanned ground vehicle (UGV) is employed as the experimental subject for systematically describing the proposed algorithm. The system architecture for implementing digital twin-enabled online battlefield learning is first given, and its operational mode is described in detail. The RFS-based digital twin models, including the battlefield state model, the UGV motion model, and the sensor model, are then designed. Bayesian inference is adopted, and the probability hypothesis density (PHD) filter is modified to implement the online learning process. Finally, a group of experiments is conducted to verify the performance and effectiveness of the proposed algorithm. The research work in this paper provides a demonstration of the application of the digital twin in unmanned combat.
1. Introduction
The adoption of unmanned vehicles brings both great autonomy and new technical challenges to modern warfare. Unmanned vehicles such as unmanned ground vehicles (UGVs) hold great promise for future combat operations and have already been used in recent military conflicts in Syria and Afghanistan [1–3]. UGVs are vehicles that operate in contact with the ground without a human presence on board. How to feed the information collected from the real battlefield back to the simulation space, and how to enable the benefits of future paradigms such as Cyber-Physical Systems (CPSs) and the digital twin, are major challenges for unmanned combat [4–7]. In this paper, we employ the UGV as the experimental subject to specify our contributions in implementing digital twins in unmanned combat.
Due to the data separation between the real battlefield and its models, it is difficult to achieve the automatic flow of information in a closed loop. Digital twin provides a new and effective way to solve this problem. It can enable the real-time bidirectional interoperability between the real world and virtual simulation space and is also an effective way to enable efficient real-time data sharing throughout the entire operational process including intelligent monitoring, prediction, digital representation, evaluation, decision support, and battlefield learning [8–10].
The battlefield refers to the environment constituted by all the objective factors in the battlespace except the combatants and weapons. All kinds of combat operations are inseparable from the specific battlefield, and the battlefield has an important influence on the course and outcome of combat operations. Combat entities receive inputs from and provide outputs to the battlefield, and the combat intention of a combat entity is realized through its interaction with the battlefield.
Battlefield learning means sensing the entities on the battlefield rapidly, understanding the current situation comprehensively, and predicting future status accurately before decision-making [11]. Battlefield learning is important for predicting future situations and evaluating the operational effectiveness of different actions. Battlefield learning helps to improve the commander’s understanding of the situation as a whole and form a basis for decision-making. It is also very important for the commanders’ real-time monitoring and perception of the dynamic situation [12].
Based on the classical definition of battlefield learning, online battlefield learning is the process of perceiving an existing battlefield and anticipating how it may evolve in the future. It is useful for obtaining knowledge of the previously unknown battlefield while the real combat process is proceeding simultaneously [13]. Online battlefield learning is also extremely important for generating plans and online decision support for security patrol [14].
In military simulation, a computer-generated force (CGF) is a virtual combat force object that is created by a computer and can control or guide all or part of its own actions and behaviors [15]. The core task of constructing a CGF is to model the behavior of combat entities on the battlefield. Online battlefield learning is one of the key technologies of CGF and has broad application prospects. A CGF depends on online battlefield learning to fuse the data generated by the sensors on the battlefield and to generate real-time battlefield states online.
In recent years, the digital twin has become a hot topic and a representative intelligent technology in fields ranging from the military to people’s livelihood [16–19]. The digital twin emphasizes that the virtual object evolves in real time by receiving data from the physical object, thereby staying consistent with the physical object throughout its entire simulation cycle [20, 21]. In a broad sense, the digital twin is a system composed of physical objects, simulation models, and the real-time dynamic interaction between them. It requires building simulation models for real entities and simulating their behaviors [22]. It is regarded as the core link between the real and virtual spaces. With the help of various high-performance sensors and high-speed communication technologies, the digital twin can present and predict the actual situation of physical entities in near real time by integrating the data of the physical entities. It enhances the ability of analysis and simulation and controls the physical entities through virtual-real interactive interfaces and data fusion algorithms [23]. The key to enabling the digital twin in unmanned combat is understanding the evolving battlefield situation accurately and in a timely manner.
In this paper, we focus on learning the battlefield states that consist of significant environmental cues and the UGV states. In order to explore how to implement the digital twin-enabled online battlefield learning, we propose a random finite set- (RFS-) based algorithm which can support real-time interaction, as well as the deep integration and mutually beneficial symbiosis between the virtual and real battlefield. It is the necessary foundation for the successful application of the digital twin in unmanned combat. Our main contribution is designing and implementing a new online battlefield learning algorithm by using the RFS-based Bayesian theory and modifying the probability hypothesis density (PHD) filter [24]. The most important value of the proposed algorithm is to break through the data boundary between the real and virtual battlefield and enable the application of digital twin in unmanned combat. This algorithm can eliminate information islands and realize the tight integration and equal interaction of real and virtual battlefield.
The rest of the paper is structured as follows. A literature review on the recent digital twin and random finite set (RFS) is given in Section 2. We present the system architecture and operational mode of the proposed online battlefield learning algorithm in Section 3. The RFS-based battlefield model, UGV motion model, and sensor model are introduced in Section 4. The design and implementation of the learning process are given in Section 5. Experimental results are detailed in Section 6, and conclusions are given in Section 7.
2. Related Works
The digital twin has important research and application value in every stage of online battlefield learning. In the design and demonstration stage, the digital twin helps to improve the evaluation of system performance by enabling equal two-way interaction between the simulation system and the real system. Through semiphysical simulation, the digital twin enhances the ability to quickly locate design defects, optimize the system design, and test the practicability of an online battlefield learning algorithm in execution.
In order to apply digital twin-enabled online battlefield learning in the operation stage, it is important to realize the bidirectional interaction between the simulation space and the real space. Tao gives the five-dimensional structure model of the digital twin and presents six application principles [25, 26]. The digital twin is an effective way to realize the interactive integration of real space and simulation space and has attracted great attention from academia and industry. Its most important breakthrough is that it is not only a mirror image of the physical world but also accepts real-time data from the physical world and in turn acts on the physical world in real time [22, 27]. The digital twin brings new development opportunities to the combat simulation area, because it allows commanders to have a complete digital footprint of the battlefield from beginning to end [28, 29]. The real-time dynamic interaction between the virtual world and the physical world is the foundation of the digital twin, as well as the main challenge of modeling and simulation. Some researchers present a digital twin-driven manufacturing cyber-physical system for the parallel control of smart workshops; by using decentralized digital twin models, they successfully connect cyberspace and physical space.
Online battlefield learning requires autonomy in the operation stage. Decentralized multiagent systems, such as blockchain-based systems and CGFs, are also a new approach to implementing online battlefield learning. Some researchers have discussed how to use blockchain to overcome the cybersecurity barriers to achieving intelligence in Industry 4.0 and introduced eight cybersecurity issues in manufacturing systems. Others have surveyed the ability of blockchain to overcome these barriers and examined the literature from the manufacturing system perspective and the product lifecycle management perspective. Ali et al. provided a survey of all aspects of multiagent systems, from definitions, features, applications, challenges, and communications to evaluation, and also gave a classification of multiagent system applications and challenges along with references for further studies [30].
RFS provides a novel, unified probabilistic way of fusing real-time battlefield data [31]. Conventional battlefield learning algorithms usually depend on a vector-based data representation and fail to support the digital twin in real time. The vector-based representation requires the dimension and the order of elements in each vector to be fixed, and it needs additional operations outside of the Bayesian recursion to ensure the consistency of the vectors. The handling of newly observed and missed measurements is implemented through vector augmentation and truncation, which are computationally intensive and irreversible. In this paper, we propose employing random set theory to overcome these disadvantages. The proposed RFS-based algorithm overcomes the limitations of conventional algorithms because it considers a more realistic situation in which the randomly varying number of targets and measurements, detection uncertainty, false alarms, and association uncertainty are all taken into account.
3. System Architecture
With the rapid development of emerging information technologies, such as artificial intelligence (AI), cloud computing, edge computing, the digital twin, and the Internet of Things (IoT), the combat style has been undergoing profound changes. New information technologies have facilitated the birth, development, and application of unmanned combat. As shown in Figure 1, they provide more diverse data sources, more powerful computing capacity, and more efficient computing methods for the key activities of unmanned combat, including description, diagnosis, prediction, and decision.

The operational mode of digital twin-enabled online battlefield learning consists of five elements: computing services, physical entities, simulation models, connected data, and the connections between them. As shown in Figure 2, the digital twin enables bidirectional real-time mapping and interaction between the real battlefield and its simulation model. Simulation models of the real combat entities are employed to reflect and predict their behaviors in real space. In turn, through the RFS-based battlefield states generated by the online battlefield learning algorithm, the combat simulation systems can guide military commanders to respond to situation changes and choose the optimal courses of action (COA). The digital twin realizes closed-loop optimization of the entire process from observing, orienting, and deciding to acting. The simulation aspect of the digital twin means building digital models of weapons, soldiers, or the battlefield and executing all the models in an integrated way. The RFS-based simulation models are executed in parallel with the real battlefield and provide useful knowledge to the commanders.

The battlefield considered in this paper consists of all the significant environmental cues and the states of the UGVs. Since GPS and topographic maps are likely to be unavailable in actual combat, localization and mapping for unmanned vehicles can only be obtained with the help of the onboard sensors. The RFS-based online battlefield learning algorithm plays a central role in the virtual space. It provides simulated battlefield information to the decision support system to train the deep learning network system. It can also deliver real-time battlefield information to the unmanned combat simulation system and helps to evaluate the possible outcomes of the available COAs.
For combat simulation, the battlefield provides spatial-temporal constraints for all participating actors. The simulated combat objects are deployed and controlled in the virtual space. They learn the battlefield, which consists of the other combat objects and the significant environmental cues, by using the proposed algorithm. The combat simulation system in the virtual space is used as a decision-making aid that assists the commanders in evaluating all the available COAs and is in charge of choosing the optimal COA. The proposed online battlefield learning algorithm aims at analyzing and understanding operational activity in the real space at a given time. It helps to make the right decisions and predict future situations, and it is the key technology for implementing digital twin-enabled online battlefield learning in unmanned combat.
Corresponding to the operational mode, the system architecture of digital twin-enabled online battlefield learning in unmanned combat is shown in Figure 3. The runtime infrastructure (RTI) is adopted to provide the simulation services that support interconnection and interoperation between the entities in the real space and the simulation models in the virtual space. This architecture employs the digital twin and RTI to support real-time interaction between the virtual and real battlefield, thereby realizing their deep integration and mutually beneficial symbiosis. The proposed algorithm can synchronously learn the number and positions of the significant environmental cues (or landmarks) that exist in the sensor’s field of view (FoV). It also offers precise mapping, virtual-real interaction, stereo perception, and intelligent intervention.

4. RFS-Based Simulation Models
Digital twin-enabled battlefield modeling consists of three aspects: modeling the battlefield states, including the significant environmental cues (or landmarks); modeling the UGV movement; and modeling the sensors equipped on the UGV. In order to overcome the data association uncertainty problem under heavy clutter and measurement noise, the RFS-based modeling method is employed to fully integrate data association uncertainty into battlefield learning. The key idea of the proposed algorithm is to represent the battlefield states by using RFSs, and the derivation of the simulation models depends on the RFS framework (finite set statistics), which was proposed by Mahler for applying random set theory to engineering problems [32]. The RFS-based models are the twin models that are executed in parallel with the real entities and provide new knowledge about the real battlefield [8, 27].
The vector-based representation of the battlefield has been shown to have several mathematical consequences, such as the ordering of significant environmental cues, data association problems, and element management problems. In addition, for a dynamic random scene, quantifying the errors of the learned results generated by vector-based Bayesian inference is also a great challenge. These problems are usually handled by augmenting or truncating vectors outside of the Bayesian inference process, which means that Bayesian optimality can only be achieved on a subset of the battlefield that is defined in advance. In this section, we give the RFS-based models, which solve these problems systematically.
The difficulty of RFS-based Bayesian inference is its computational complexity. To solve this problem, Mahler proposed the PHD (probability hypothesis density) filter. The PHD of the posterior probability density is denoted by $v_k(s \mid Z_k)$ and is a density function defined on the single-object state $s$.
Here, $Z_k$ denotes the RFS of detections received at time $k$, and $S_k$ denotes the RFS of battlefield states at time $k$. We use the abbreviation $v_k(s) = v_k(s \mid Z_k)$. In point process theory, $v_k(s)$ is called the intensity density. It is not a probability density; rather, it represents the density of the expected number of points at $s$. Given any subspace $A$ of the single-object state space, the integral of $v_k(s)$ over $A$ is the expected number of objects in $A$.
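For reference, the standard first-moment (intensity) definition of the PHD from the FISST literature, which we assume matches the form intended here, can be written as follows:

```latex
% PHD as the first-order moment density of the posterior RFS density p_k:
v_k(s \mid Z_k) \;=\; \int p_k\big(\{s\} \cup S \mid Z_k\big)\, \delta S
% Defining property: integrating the PHD over a region A of the
% single-object state space gives the expected number of objects in A.
\mathbb{E}\big[\, |S_k \cap A| \,\big] \;=\; \int_A v_k(s \mid Z_k)\, ds
```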
4.1. RFS-Based Battlefield Representation
We adopt an RFS-based battlefield representation; here, $S$ denotes the RFS that represents the entire unknown battlefield. In addition, in order to assist operational decision-making, we also relate the battlefield to the UGV state $x_k$. The RFS $S_k$, which depends on the UGV states up to time $k$, denotes the battlefield that has been explored. $S_k$ is the RFS of battlefield states consisting of the significant environmental cues and is the intersection of the union of all FoVs and the entire battlefield.
Here, the FoV at time $t$, $\mathrm{FoV}(x_t)$, depends on the UGV state at time $t$. The learned battlefield at time $k$ can be obtained from $S_{k-1}$ by adding the cues that enter the FoV for the first time, where $\bar{S}_{k-1}$ represents the unexplored battlefield, namely, the set of significant environmental cues that are not in $S_{k-1}$, and the RFS $B_k$ denotes the part of the battlefield that appears in the FoV for the first time. Therefore, the battlefield transition process can be modeled by a transition density that combines the state transition density of the explored battlefield from time $k-1$ to time $k$ with the density $b_k(\cdot \mid x_k)$ of the new-cue RFS $B_k$.
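For reference, in the RFS-based SLAM literature that this formulation follows, the explored battlefield and its transition are usually written as below; we assume the equations intended here take this standard form, with $B_k$ denoting the new-cue RFS and $b_k$ its density:

```latex
% Explored battlefield: intersection of the full battlefield with the union of FoVs
S_k \;=\; S \,\cap\, \Big( \bigcup_{t=0}^{k} \mathrm{FoV}(x_t) \Big)
% Evolution of the explored battlefield: previously explored cues plus the
% cues entering the FoV for the first time
S_k \;=\; S_{k-1} \,\cup\, \big( \mathrm{FoV}(x_k) \cap \bar{S}_{k-1} \big)
% Transition density: convolution of the (static) prior part and the new-cue part
f_{k|k-1}\big(S_k \mid S_{k-1}, x_k\big) \;=\;
  \sum_{W \subseteq S_k} f_S\big(W \mid S_{k-1}\big)\, b_k\big(S_k \setminus W \mid x_k\big)
```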
4.2. RFS-Based UGV Motion Model
The location of the UGV can be represented by the state vector $x_k$. The UGV motion model characterizes the state transition from $x_{k-1}$ to $x_k$ after the control command $u_k$ is applied. In this paper, we adopt a two-dimensional motion model with translational and rotational displacement, written in the general form $x_k = f(x_{k-1}, u_k, w_k)$, where $w_k$ represents the uncertainty and noise, and $u_k$ is the control command that the UGV receives at time $k$.
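The paper's exact displacement equations for $f(\cdot)$ are not reproduced here, so the following C++ function is only an illustrative sketch of a common translational-plus-rotational motion model of the kind described above; the state layout, control fields, time step, and noise parameters are assumptions for illustration.

```cpp
#include <cmath>
#include <random>

// Assumed UGV state: planar position (x, y) and heading theta (rad).
struct UgvState { double x, y, theta; };

// Assumed control command: forward speed v (m/s) and turn rate w (rad/s).
struct Control { double v, w; };

// One step of a translational + rotational displacement model with
// additive Gaussian noise on the commanded speed and turn rate.
// dt is the time step in seconds.
UgvState predictUgvState(const UgvState& prev, const Control& u, double dt,
                         double sigmaV, double sigmaW, std::mt19937& rng) {
    std::normal_distribution<double> nv(0.0, sigmaV), nw(0.0, sigmaW);
    const double v = u.v + nv(rng);   // noisy translational command
    const double w = u.w + nw(rng);   // noisy rotational command

    UgvState next;
    next.theta = prev.theta + w * dt;                      // rotational displacement
    next.x     = prev.x + v * dt * std::cos(prev.theta);   // translational displacement
    next.y     = prev.y + v * dt * std::sin(prev.theta);
    // Keep the heading in (-pi, pi] for numerical consistency.
    next.theta = std::atan2(std::sin(next.theta), std::cos(next.theta));
    return next;
}
```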
4.3. RFS-Based Sensor Model
Given the current UGV state $x_k$ and the battlefield RFS $S_k$, the detection RFS $Z_k$ is the union of the detections generated by the individual environmental cues and the clutter. Here, $D_k(s, x_k)$ denotes the detection RFS related to the significant environmental cue with state $s$, and $C_k(x_k)$ denotes the clutter RFS, which is related to the UGV state $x_k$. Due to the uncertainty and randomness of the detection process, the number of elements in $Z_k$ is random and may differ from the number of states in $S_k$.
The detection RFS $D_k(s, x_k)$ generated by a battlefield state $s$ is modeled as a Bernoulli RFS, so it takes one of two forms: the empty set, with probability $1 - P_D(s, x_k)$, or a singleton $\{z_k\}$, with probability $P_D(s, x_k)$. Here, $x_k$ denotes the UGV state at time $k$, $P_D(s, x_k)$ denotes the probability of generating a detection from $s$, and $g_k(z_k \mid s, x_k)$ models the likelihood that $s$ generates the detection $z_k$. In this paper, $P_D(s, x_k) = P_D$ if the significant environmental cue lies in the sensor's FoV, and $P_D(s, x_k) = 0$ otherwise.
Depending on $D_k(\cdot)$ and $C_k(\cdot)$, the sensor's likelihood function $g_k(Z_k \mid S_k, x_k)$ for generating $Z_k$ combines the likelihood of generating the cue-originated detections from the RFS $S_k$ with the probability density of the clutter RFS $C_k(x_k)$.
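For reference, a standard way to write this measurement model in the RFS literature, which we assume matches the equations intended above (the symbols $D_k$, $C_k$, $P_D$, and $g_k$ are those introduced in the text), is:

```latex
% Detection RFS as the union of cue-originated detections and clutter:
Z_k \;=\; \Big( \bigcup_{s \in S_k} D_k(s, x_k) \Big) \cup C_k(x_k)

% Bernoulli model for a single cue s:
f_D\big(D \mid s, x_k\big) \;=\;
\begin{cases}
1 - P_D(s, x_k), & D = \emptyset \\
P_D(s, x_k)\, g_k(z \mid s, x_k), & D = \{z\}
\end{cases}

% Multiobject likelihood: convolution of cue-originated detections and clutter:
g_k\big(Z_k \mid S_k, x_k\big) \;=\;
\sum_{W \subseteq Z_k} g_D\big(W \mid S_k, x_k\big)\, c\big(Z_k \setminus W \mid x_k\big)
```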
In this paper, a range-bearing sensor is used. The detection generated by a two-dimensional environmental cue at location $[s_x, s_y]^T$ is the true range and bearing from the UGV to the cue corrupted by additive noise. Here, $z_k$ is the range-bearing detection, $[s_x, s_y]^T$ is the cue's position, and $\epsilon_k$ is the measurement noise with covariance $R_k$.
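As an illustrative sketch of this measurement model (the type and function names are ours), the following C++ function computes the noise-free range-bearing detection of a cue from a given UGV pose; zero-mean Gaussian noise with the standard deviations reported in Section 6.1 would be added on top of it.

```cpp
#include <cmath>

struct UgvState { double x, y, theta; };        // planar pose, heading in rad
struct Landmark { double x, y; };               // 2D environmental cue
struct RangeBearing { double range, bearing; }; // detection z = [r, phi]

// Noise-free range-bearing measurement h(s, x_k): distance to the cue and
// bearing relative to the UGV heading, wrapped to (-pi, pi].
RangeBearing measureLandmark(const Landmark& s, const UgvState& x) {
    const double dx = s.x - x.x;
    const double dy = s.y - x.y;
    RangeBearing z;
    z.range   = std::sqrt(dx * dx + dy * dy);
    z.bearing = std::atan2(dy, dx) - x.theta;
    z.bearing = std::atan2(std::sin(z.bearing), std::cos(z.bearing)); // wrap
    return z;
}

// A cue can generate a detection only if it lies inside the sensor FoV
// (e.g., range <= 10 m for the sensor configuration used in Section 6).
bool insideFoV(const RangeBearing& z, double maxRange) {
    return z.range <= maxRange;
}
```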
5. Learning Process and Its Implementation
In this section, we give the basic principles, design, and implementation of the proposed algorithm. The proposed algorithm relies on sequentially propagating the joint posterior probability density of the RFS-based battlefield and the UGV state as detections arrive.
5.1. RFS-Based Learning Process
With the RFS-based battlefield modeling, RFS-based Bayesian inference is used to jointly learn the environmental cues' locations and the UGV state at every time step. In this paper, we use $p_{k|k-1}(S_k \mid Z_{1:k-1})$ to denote the predicted distribution of the battlefield state and $p_k(S_k \mid Z_{1:k})$ to denote the a posteriori distribution of the battlefield state. The knowledge of the battlefield is propagated by the following prediction and update process: (i) predict the battlefield state by using the previous battlefield states and the input parameters; (ii) update the predicted battlefield state by using the received detection RFS $Z_k$. Both steps involve set integration, denoted by $\int \cdot \, \delta S$.
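For reference, the standard RFS Bayes recursion that these two steps refer to (assuming the notation introduced above, with the joint dependence on the UGV trajectory left implicit) can be written as:

```latex
% (i) Prediction (Chapman-Kolmogorov equation with a set integral):
p_{k|k-1}\big(S_k \mid Z_{1:k-1}\big)
  = \int f_{k|k-1}\big(S_k \mid S\big)\, p_{k-1}\big(S \mid Z_{1:k-1}\big)\, \delta S

% (ii) Update (multiobject Bayes rule using the likelihood g_k):
p_{k}\big(S_k \mid Z_{1:k}\big)
  = \frac{g_k\big(Z_k \mid S_k\big)\, p_{k|k-1}\big(S_k \mid Z_{1:k-1}\big)}
         {\int g_k\big(Z_k \mid S\big)\, p_{k|k-1}\big(S \mid Z_{1:k-1}\big)\, \delta S}
```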
In this paper, the PHD filter is employed to implement the RFS-based Bayesian recursion [24, 33–35]. We modify and extend the Gaussian mixture PHD filter with a particle filter: the Gaussian mixture PHD filter is applied to learn the number and locations of the environmental cues, and the particle filter is applied to learn the UGV state at the same time. The computing process of online battlefield learning with the modified PHD filter is shown in Figure 4. The Bayesian recursion encapsulates the inherent uncertainty in the number of significant environmental cues caused by detection uncertainty, clutter, UGV maneuvers, and detection noise.

The main challenge of online battlefield learning is to learn the number and locations of the environmental cues while estimating the UGV state at the same time. In this paper, we partition the battlefield state into two parts: the set of environmental cues and the UGV trajectory. Conditioned on a known UGV trajectory, the cue part can be integrated out analytically. This means that even though we only have sample sets of hypothesized UGV trajectories, we can still represent the joint posterior successfully, with each particle representing one value of the UGV trajectory. The advantage of this approach is that it reduces the dimensionality of the state space in which we are sampling and reduces the error of the learned battlefield.
Here, the Gaussian mixture PHD filter is applied to propagate each PHD conditioned on a hypothesized UGV state. The locations of the environmental cues are characterized by the means of the Gaussian components of the mixture, and the number of cues is characterized by the total mass of all the Gaussian components. In this paper, the PHD at time $k$ is characterized by a set of particles $\{(\eta_k^{(i)}, x_{0:k}^{(i)}, v_k^{(i)})\}_{i=1}^{N}$, where $x_{0:k}^{(i)}$ is the $i$th hypothesized UGV trajectory, $\eta_k^{(i)}$ denotes its weight, and $v_k^{(i)}$ is the related PHD. The posterior distribution of the UGV state is approximated by this set of weighted particles.
In this paper, $v_{k-1}^{(i)}(s)$ is the prior PHD of the battlefield states for the $i$th particle, i.e., the PHD associated with the $i$th UGV trajectory. It can be represented by a Gaussian mixture $v_{k-1}^{(i)}(s) = \sum_{j=1}^{J_{k-1}^{(i)}} w_{k-1}^{(j,i)}\, \mathcal{N}(s; m_{k-1}^{(j,i)}, P_{k-1}^{(j,i)})$, which consists of $J_{k-1}^{(i)}$ Gaussian components; for the $j$th component, $w_{k-1}^{(j,i)}$ is the weight, $m_{k-1}^{(j,i)}$ is the mean, and $P_{k-1}^{(j,i)}$ is the covariance. The PHD of the new environmental cues for the sampled state at time $k$ is represented by $b_k^{(i)}(s)$. It is also a Gaussian mixture, $b_k^{(i)}(s) = \sum_{j=1}^{J_{b,k}^{(i)}} w_{b,k}^{(j,i)}\, \mathcal{N}(s; m_{b,k}^{(j,i)}, P_{b,k}^{(j,i)})$, where $J_{b,k}^{(i)}$ is the number of Gaussian components of the new-cue PHD at time $k$, and $w_{b,k}^{(j,i)}$, $m_{b,k}^{(j,i)}$, and $P_{b,k}^{(j,i)}$ are the corresponding Gaussian parameters. Since the environmental cues are static, the predicted PHD is the sum of these two mixtures, $v_{k|k-1}^{(i)}(s) = v_{k-1}^{(i)}(s) + b_k^{(i)}(s)$, and is therefore also a Gaussian mixture.
Here, $v_{k|k-1}^{(i)}(s)$ is composed of $J_{k|k-1}^{(i)} = J_{k-1}^{(i)} + J_{b,k}^{(i)}$ Gaussian components that represent the union of the prior PHD and the PHD of the new environmental cues. Since the detection likelihood can also be represented in Gaussian form, the posterior PHD remains a Gaussian mixture. Its components consist of the undetected part, weighted by $1 - P_D$, and one updated component per pair of predicted component and received detection. The updated means, covariances, and likelihood terms can be obtained through standard Kalman filter equations; here, we adopt the unscented Kalman filter to handle the nonlinear range-bearing model [36].
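For reference, the per-particle posterior PHD described above takes the standard Gaussian mixture PHD corrector form; we assume the equations intended here follow this formulation, with the clutter PHD $c(z)$ defined in the next paragraph:

```latex
% Posterior PHD for particle i (standard GM-PHD corrector, using the
% Gaussian-mixture terms introduced in the text):
v_k^{(i)}(s) = \big(1 - P_D\big)\, v_{k|k-1}^{(i)}(s)
  + \sum_{z \in Z_k} \sum_{j=1}^{J_{k|k-1}^{(i)}}
      w_k^{(j,i)}(z)\, \mathcal{N}\!\big(s;\, m_{k|k}^{(j,i)}(z),\, P_{k|k}^{(j,i)}\big)

% Component weights:
w_k^{(j,i)}(z) =
  \frac{P_D\, w_{k|k-1}^{(j,i)}\, q^{(j,i)}(z)}
       {c(z) + P_D \sum_{\ell=1}^{J_{k|k-1}^{(i)}} w_{k|k-1}^{(\ell,i)}\, q^{(\ell,i)}(z)},
\qquad
q^{(j,i)}(z) = \mathcal{N}\!\big(z;\, \hat{z}_{k|k-1}^{(j,i)},\, S_k^{(j,i)}\big)

% \hat{z}_{k|k-1}^{(j,i)}, S_k^{(j,i)}, m_{k|k}^{(j,i)}(z), and P_{k|k}^{(j,i)}
% are the predicted measurement, innovation covariance, updated mean, and
% updated covariance produced by an (unscented) Kalman filter step.
```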
We assume that the number of clutter detections in $C_k(x_k)$ follows a Poisson distribution and that the clutter detections are uniformly distributed over the observation space. The clutter PHD can then be represented by $c(z) = \lambda_c\, u(z)$, where $\lambda_c$ denotes the average number of clutter detections and $u(\cdot)$ is a uniform density. In order to reduce the amount of computation, we use pruning and merging to reduce the number of Gaussian components of the updated mixture [37].
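A minimal C++ sketch of this pruning and merging step is given below; the thresholds, the 2D landmark state, and the merging details (the exact merge's spread term is omitted) are illustrative assumptions rather than the paper's actual implementation.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// 2D Gaussian component of the map PHD: weight, mean (mx, my), and a
// symmetric 2x2 covariance [[pxx, pxy], [pxy, pyy]].
struct GaussianComponent {
    double weight;
    double mx, my;
    double pxx, pxy, pyy;
};

// Squared Mahalanobis distance of component b's mean from component a.
static double mahalanobis2(const GaussianComponent& a, const GaussianComponent& b) {
    const double dx = b.mx - a.mx, dy = b.my - a.my;
    const double det = a.pxx * a.pyy - a.pxy * a.pxy;
    const double ixx = a.pyy / det, ixy = -a.pxy / det, iyy = a.pxx / det; // inverse of a's covariance
    return dx * (ixx * dx + ixy * dy) + dy * (ixy * dx + iyy * dy);
}

// Prune low-weight components, merge components that are close in the
// Mahalanobis sense, and cap the mixture size. Thresholds are illustrative.
std::vector<GaussianComponent> pruneAndMerge(std::vector<GaussianComponent> gm,
                                             double pruneThresh = 1e-5,
                                             double mergeThresh = 4.0,
                                             std::size_t maxComponents = 100) {
    // 1. Pruning: drop components whose weight is below the threshold.
    gm.erase(std::remove_if(gm.begin(), gm.end(),
                            [&](const GaussianComponent& c) { return c.weight < pruneThresh; }),
             gm.end());

    // 2. Merging: take the heaviest remaining component and absorb every
    //    component within mergeThresh of it (weighted-average moments).
    std::sort(gm.begin(), gm.end(),
              [](const GaussianComponent& a, const GaussianComponent& b) { return a.weight > b.weight; });
    std::vector<GaussianComponent> merged;
    std::vector<bool> used(gm.size(), false);
    for (std::size_t i = 0; i < gm.size(); ++i) {
        if (used[i]) continue;
        GaussianComponent acc{0.0, 0.0, 0.0, 0.0, 0.0, 0.0};
        for (std::size_t j = i; j < gm.size(); ++j) {
            if (used[j] || mahalanobis2(gm[i], gm[j]) > mergeThresh) continue;
            used[j] = true;
            acc.weight += gm[j].weight;
            acc.mx  += gm[j].weight * gm[j].mx;
            acc.my  += gm[j].weight * gm[j].my;
            acc.pxx += gm[j].weight * gm[j].pxx;
            acc.pxy += gm[j].weight * gm[j].pxy;
            acc.pyy += gm[j].weight * gm[j].pyy;
        }
        acc.mx /= acc.weight;  acc.my /= acc.weight;
        acc.pxx /= acc.weight; acc.pxy /= acc.weight; acc.pyy /= acc.weight;
        merged.push_back(acc);
    }

    // 3. Capping: keep only the heaviest maxComponents components.
    if (merged.size() > maxComponents) merged.resize(maxComponents);
    return merged;
}
```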
The posterior UGV state is estimated from the weighted particles: each new UGV state $x_k^{(i)}$ is sampled from a proposal density, and its weight $\eta_k^{(i)}$ is updated accordingly. The weights are normalized so that they sum to one, and with the resampling step [24] we obtain the resampled particle set. By choosing the UGV transition density as the proposal density, the weight update becomes proportional to the likelihood of the received detection RFS given the particle's trajectory.
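As an illustration, the following C++ function implements systematic (low-variance) resampling, one common choice for this step; whether the paper uses exactly this scheme is an assumption on our part.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Returns the indices of the particles selected by systematic (low-variance)
// resampling. `weights` must be normalized to sum to 1.
std::vector<std::size_t> systematicResample(const std::vector<double>& weights,
                                            std::mt19937& rng) {
    const std::size_t n = weights.size();
    std::vector<std::size_t> indices(n);
    if (n == 0) return indices;

    std::uniform_real_distribution<double> uni(0.0, 1.0 / static_cast<double>(n));
    const double u = uni(rng);        // single random offset
    double cumulative = weights[0];
    std::size_t j = 0;
    for (std::size_t i = 0; i < n; ++i) {
        const double target = u + static_cast<double>(i) / static_cast<double>(n);
        while (target > cumulative && j + 1 < n) {
            ++j;
            cumulative += weights[j];
        }
        indices[i] = j;               // copy particle j into slot i
    }
    return indices;
}
```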
By assuming that there is only one environmental cue in the battlefield (the single-cue strategy), this set likelihood can be evaluated in closed form from the predicted and updated Gaussian mixture parameters of the corresponding particle.
5.2. Implementation
According to the learning process given above, this section describes the concrete realization of the proposed algorithm. The experimental program is written in C++ and uses the Eigen (version 3.0.0), Boost (version 1.5.3), and gtest libraries. The flow diagram of the proposed algorithm is presented in Figure 5, and the concrete steps are described in Algorithm 1.

The computational complexity of the proposed algorithm is $\mathcal{O}(m_k n_k N)$ per time step; that is, it is linear in the number of landmarks in the FoV ($m_k$), in the number of detections ($n_k$), and in the number of particles for the UGV state ($N$).
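As a structural illustration of one iteration of the learning loop described in Section 5.1 (all type and function names here are our own illustrative assumptions, and the declared helpers correspond to the steps sketched earlier rather than the paper's actual code):

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Illustrative types matching the sketches given earlier in this section.
struct UgvState { double x, y, theta; };
struct Control { double v, w; };
struct RangeBearing { double range, bearing; };
struct GaussianComponent { double weight, mx, my, pxx, pxy, pyy; };

// One Rao-Blackwellized particle: a hypothesized UGV state with its own
// conditional map PHD, represented as a Gaussian mixture.
struct Particle {
    UgvState state;
    double weight;
    std::vector<GaussianComponent> phd;
};

// Helpers corresponding to the steps discussed above (declarations only;
// their bodies would follow the motion, measurement, and GM-PHD sketches).
UgvState sampleMotion(const UgvState& prev, const Control& u, std::mt19937& rng);
void predictPhd(std::vector<GaussianComponent>& phd, const UgvState& x);
double updatePhd(std::vector<GaussianComponent>& phd, const UgvState& x,
                 const std::vector<RangeBearing>& detections);  // returns set likelihood
void pruneAndMerge(std::vector<GaussianComponent>& phd);
void resample(std::vector<Particle>& particles, std::mt19937& rng);

// One time step of the online battlefield learning loop.
void step(std::vector<Particle>& particles, const Control& u,
          const std::vector<RangeBearing>& detections, std::mt19937& rng) {
    double weightSum = 0.0;
    for (Particle& p : particles) {
        // 1. Sample the UGV state from the transition density (proposal).
        p.state = sampleMotion(p.state, u, rng);
        // 2. Predict the conditional map PHD (prior + new-cue components).
        predictPhd(p.phd, p.state);
        // 3. Update the map PHD with the detections; the returned set
        //    likelihood reweights the particle.
        p.weight *= updatePhd(p.phd, p.state, detections);
        // 4. Keep the Gaussian mixture small.
        pruneAndMerge(p.phd);
        weightSum += p.weight;
    }
    // 5. Normalize the particle weights and resample.
    for (Particle& p : particles) p.weight /= weightSum;
    resample(particles, rng);
}
```

The nested structure makes the linear dependence on the number of particles explicit; inside updatePhd, each detection is paired with each Gaussian component in the FoV, which yields the product complexity stated above.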
6. Experiments
In this section, a group of experiments is conducted to quantitatively verify the effectiveness and analyze the performance of the proposed algorithm. The virtual machine used to run the experiments has 4 GB of RAM and six 3.40 GHz Intel CPU cores and runs Ubuntu 14.04. The parameters used in the experiments are given in Section 6.1, and the models used are given in Section 4.
6.1. Experimental Setup
As shown in Figure 6, the UGV patrols in a simulated two-dimensional space. The known ground truth (including the UGV states and the locations of the landmarks) is generated by the simulation models. The black dots represent the real locations of the landmarks, and the black dashed line represents the real UGV trajectory. The number of clutter detections follows a Poisson distribution, and the clutter PHD is uniform over the observation space. Table 1 lists the important parameters used by the simulation models to generate the ground truth.

The sensor used in this experiment is a range-bearing sensor that can detect landmarks at distances of 5 m to 30 m in any direction. The range measurement standard deviation (std) is 1 m, and the bearing measurement std is 2 deg. The maximum FoV of the sensor used by the UGV is 10 m and 360 deg.
6.2. Results and Analysis
The experimental results are shown in Figure 6, where the red dashed line represents the learned UGV states and the red points represent the learned locations of the landmarks. Together, they represent the learned battlefield states. The results confirm that the proposed algorithm can learn the battlefield states from the sensor detections at runtime.
In order to quantitatively evaluate the performance of the proposed algorithm, the errors of the learned battlefield states are given in Figures 7 and 8. Figures 7(a) and 7(b) show the errors in the learned number and locations of the landmarks, where the location errors are measured by the optimal subpattern assignment (OSPA) distance [38]. The results show that the performance of the proposed algorithm can satisfy the requirements of simulation and evaluation in unmanned combat.

Consider two finite sets $X = \{x_1, \ldots, x_m\}$ and $Y = \{y_1, \ldots, y_n\}$, where $m \le n$ and the vectors $x_i$ and $y_j$ take values from the battlefield state space. The OSPA metric is defined as a distance between the sets $X$ and $Y$. The OSPA distance of order $p$, with cut-off parameter $c$, is defined for $m \le n$ as
$$\bar{d}_p^{(c)}(X, Y) = \left( \frac{1}{n} \left( \min_{\pi \in \Pi_n} \sum_{i=1}^{m} d^{(c)}\big(x_i, y_{\pi(i)}\big)^p + c^p (n - m) \right) \right)^{1/p},$$
where $d^{(c)}(x, y) = \min\big(c, \lVert x - y \rVert\big)$ and $\Pi_n$ represents the set of permutations of length $m$ with elements taken from $\{1, 2, \ldots, n\}$; for $m > n$, the definition is applied with the arguments swapped.
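As a concrete illustration of how this metric can be computed for the small landmark sets used in the experiments, the following C++ function evaluates the OSPA distance by a brute-force permutation search (adequate only for small sets; the function names are ours):

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <limits>
#include <numeric>
#include <vector>

struct Point2D { double x, y; };

static double euclidean(const Point2D& a, const Point2D& b) {
    return std::hypot(a.x - b.x, a.y - b.y);
}

// Brute-force OSPA distance of order p with cut-off c between two 2D point
// sets. Suitable only for small sets because it enumerates permutations.
double ospa(std::vector<Point2D> X, std::vector<Point2D> Y, double p, double c) {
    if (X.empty() && Y.empty()) return 0.0;
    if (X.size() > Y.size()) std::swap(X, Y);   // ensure |X| = m <= |Y| = n
    const std::size_t m = X.size(), n = Y.size();

    std::vector<std::size_t> idx(n);
    std::iota(idx.begin(), idx.end(), 0);
    double best = std::numeric_limits<double>::max();
    do {  // minimize the assignment cost over all permutations of Y
        double cost = 0.0;
        for (std::size_t i = 0; i < m; ++i) {
            const double d = std::min(c, euclidean(X[i], Y[idx[i]]));
            cost += std::pow(d, p);
        }
        best = std::min(best, cost);
    } while (std::next_permutation(idx.begin(), idx.end()));

    // Cardinality penalty for the n - m unassigned elements.
    const double total = best + std::pow(c, p) * static_cast<double>(n - m);
    return std::pow(total / static_cast<double>(n), 1.0 / p);
}
```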
The errors of the learned UGV states are shown in Figures 8(a) and 8(b). The proposed algorithm generates the learned UGV states with acceptable accuracy. However, the errors increase as time advances, which is caused by the accumulation of the UGV state errors.
In order to analyze how the detection parameters affect the proposed algorithm, the averaged errors of the UGV states and landmarks were generated with detection probabilities from 0.1 to 0.99 and clutter intensities from 0.0001 to 1. For each pair of parameters, 10 simulation runs were carried out. Here, the cardinalized optimal linear assignment (COLA) metric is used to evaluate the errors of the learned landmarks. Figures 9(a) and 9(b) show that the errors of the learned UGV states increase as the detection probability decreases, whereas the errors of the learned landmarks increase only slightly. Increasing the clutter intensity increases the errors of the learned landmark locations, but its effect on the errors of the learned UGV states is quite small.

In order to apply the proposed algorithm in real unmanned combat applications, its time cost should be fully evaluated. As shown in Figure 10, we recorded 10 simulation runs for each pair of detection probability and clutter intensity, and each simulation run consists of 1000 time steps. The averaged time cost and CPU usage of the proposed algorithm are shown in Figures 10(a) and 10(b). The results show that increasing the detection probability increases the time cost, and decreasing the clutter intensity also increases the time cost. The average time cost per time step is about 500 ms, which can satisfy many unmanned combat applications.

7. Conclusions
Digital twin technology enables real-time dynamic interaction between the real battlefield and the simulation system. Our main contribution is a new RFS-based online battlefield learning algorithm that enables the application of the digital twin in unmanned combat. The digital twin has broad application prospects in unmanned combat and can greatly promote the innovation of unmanned combat modes. Since the implementation of the digital twin in unmanned combat depends on battlefield understanding, an effective battlefield learning algorithm is essential. By adopting the RFS-based representation of the battlefield, the proposed algorithm overcomes the limitations of the traditional vector-based representation, and its performance is verified through the experiments in Section 6. This paper is one of the first attempts to apply the digital twin to the unmanned combat area and has practical significance for implementing the digital twin in many other areas.
Abbreviations
CGF: Computer-generated force
COA: Courses of action
COLA: Cardinalized optimal linear assignment
EKF: Extended Kalman filter
FoV: Field of view
IoT: Internet of Things
OSPA: Optimal subpattern assignment
PHD: Probability hypothesis density
RFS: Random finite set
UGV: Unmanned ground vehicle
Data Availability
The experimental data used to support the findings of this study are included within the article.
Conflicts of Interest
The authors declare no conflicts of interest.
Authors’ Contributions
Peng Wang conceived, designed, and performed the simulations and wrote the manuscript. Jiancheng Zhu and Yong Peng provided the basic ideas and analyzed the experimental results. Mei Yang helped to perform the experiments. Ge Li and Yong Peng reviewed the manuscript.
Acknowledgments
The authors would like to acknowledge the support of the Young Elite Scientists Sponsorship Program of China Association of Science and Technology.