Abstract

The lack of sample data and the limited visual range of a single agent during light field reconstruction degrade the recognition of maneuvering targets. To address these problems, this paper introduces generative adversarial nets (GAN) into light field reconstruction and proposes a GAN-based multiagent method for light field reconstruction and target recognition. The algorithm uses GAN to generate and enhance data, which greatly improves the accuracy of light field reconstruction. Multiagent data fusion yields a consistent mean of all observations, which ensures the reliability of the sample data and the continuity of maneuvering target recognition. The experimental results show that the accuracy of light field reconstruction reaches 94.552% and the accuracy of maneuvering target recognition reaches 84.267%. In addition, the more agents are used, the shorter the recognition time.

1. Introduction

The light field, introduced by Gershun in his classic 1936 paper [1], is defined as a plenoptic function of a point in a given direction, and the function value represents the brightness per unit area. To solve the problem of massive data and high cost, Levoy and Hanrahan [2] introduced the concept of the light field into computer science. Simply by combining and resampling the available images, their method generates new views from arbitrary camera positions. A light field contains all images of the same object taken from different positions and angles. Therefore, the light field can be viewed as a feature library of the target. With the development of light field reconstruction technology, many scholars have used the sparsity of the light field to monitor and identify targets, and the technique has been applied in military, industrial, and other fields.

The visual range of the complete light field is limited, and the target may be lost because of blurring and occlusion. Cai et al. [3] proposed a target recognition method based on multiperspective reconstruction, in which the light field data from multiple perspectives are fused and the consistency of the average of all observations is obtained. Their experiments show that multiagent collaboration is robust and reliable for solving large-scale complex problems.

Because the many intractable probability calculations involved lead to poor performance of deep generative models, generative adversarial nets (GAN) [4] were proposed in 2014 to address the issue. In GAN, two models are trained simultaneously: model-G, which captures the data distribution, and model-D, which estimates the probability that a sample came from the training data rather than from model-G. This framework is equivalent to a continuous game between the two models. The lack of sample data in light field reconstruction seriously affects the accuracy of the reconstruction results. The results, however, can be significantly improved by data generation and enhancement using GAN.
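For reference, the two-player game described above is usually written as the minimax objective of the original GAN formulation [4]. The symbols $p_{\mathrm{data}}$ (distribution of the training data) and $p_{z}$ (noise prior) follow that formulation and are not notation defined in this paper.

```latex
\min_{G}\max_{D} V(D,G)
  = \mathbb{E}_{x\sim p_{\mathrm{data}}(x)}\left[\log D(x)\right]
  + \mathbb{E}_{z\sim p_{z}(z)}\left[\log\left(1 - D(G(z))\right)\right]
```

Model-D is trained to increase $V$ by scoring real samples near 1 and generated samples near 0, while model-G is trained to decrease it by producing samples that model-D scores as real.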

In practical applications of the light field, the lack of reconstructed sample data seriously affects the accuracy of the reconstructed results. Moreover, the limited visual range of a single agent leads to the loss of maneuvering targets. To solve these problems, this paper proposes a GAN-based method for multiagent light field reconstruction and maneuvering target recognition, as shown in Figure 1. Our main contributions are as follows:
(1) The collaboration mechanism of multiple agents is introduced into light field reconstruction. Through data fusion of multiple agents, the consistency of the average of all observations is obtained. This method can effectively remove redundant and high-noise sample data, ensuring the continuity of maneuvering target monitoring.
(2) We use GAN to generate and enhance data to solve the problem of poor reconstruction caused by insufficient samples. This method can increase the amount of sample data and improve the accuracy of light field reconstruction.
(3) We conduct a large number of light field reconstruction and maneuvering target recognition experiments to evaluate the performance of this method. The experimental results show that the proposed method outperforms the existing methods.

The remainder of this paper is organized as follows: Section 2 provides an overview of the related literature. Section 3 describes the multiagent representation of light field, multiagent collaborative mechanism, multiagent light field reconstruction model, and GAN-based multiagent light field reconstruction method. The simulation experiments are discussed in Section 4. Finally, Section 5 concludes the paper.

2. Related Work

As a deep learning model, the generative adversarial model has been one of the most promising methods for unsupervised learning on complex distributions in recent years. Therefore, GAN has been widely used by many scholars. Laloy et al. [5] proposed spatial GAN (SGAN), which can quickly generate 2D and 3D unconditional realizations. The effectiveness of the proposed method is analyzed by taking 2D steady-state flow and 3D transient hydraulic tomography as examples. The radiation dose to the patient associated with CT in medical practice has raised public interest and concern. Yet, a decrease in the radiation dose may lead to increased noise, thus affecting the radiologists' judgment and confidence. To address this issue, Yang et al. [6] introduced a new CT image denoising method based on the generative adversarial network (GAN) with Wasserstein distance and perceptual similarity. The method successfully reduced the image noise level without losing critical information. To cope with the same issue, You et al. [7] proposed a novel 3D noise reduction method, called structurally sensitive multiscale generative adversarial net (SMGAN), which can effectively preserve structural and textural information in reference to normal-dose CT (NDCT) images and significantly suppress noise and artifacts. Xuan et al. [8] focused on automatic pearl classification by adopting the MV-GAN method, which can not only reduce classification error but also resist brightness disturbance. Wang et al. [9] developed a visual analytics system (GANViz) to help experts understand the adversarial process of GANs in depth. Classifying hyperspectral images (HSIs) with few training samples is challenging. Zhan et al. [10] designed a novel semisupervised algorithm framework for HSI data based on a 1D GAN (HSGAN), which enables the automatic extraction of spectral features for HSI classification. Also for HSI classification, Zhu et al. [11] proposed two schemes: (1) a well-designed 1D GAN as a spectral classifier and (2) a robust 3D GAN as a spectral-spatial classifier. The results demonstrated that the proposed models are competitive with previous methods. Xu et al. [12] proposed GAN obfuscator, which can achieve differential privacy under GANs by adding specific noise to gradients during the learning procedure. Stacked generative adversarial networks (StackGANs), proposed by Zhang et al. [13], significantly outperform other state-of-the-art methods in generating photorealistic images. Creswell et al. [14] provided an overview of several methods to train and construct GANs, and they also pointed out challenges that remain to be settled.

To solve the problem of distributed collaboration among multiple agents, Jumadinova et al. [15] proposed a learning technique based on human learning theory, enabling an agent to select appropriate capabilities to learn from other agents. The results show that the overall utility of agents is significantly improved. Li et al. [16] addressed the formation control problem of multiagent systems using a distributed compensation control strategy. Wang et al. [17] investigated the consensus of continuous-time multiagent systems over an undirected network. By studying the joint effects of agent dynamics, network topology, and time delay, they obtained conditions that guarantee consensus. Chen et al. [18] proposed an adaptive neural network consensus control method for a class of nonlinear multiagent systems with state time delay. A Lyapunov-Krasovskii (L-K) functional is used to compensate for the uncertainties of unknown time delays. The simulation results show the effectiveness of the algorithm. Petrillo et al. [19] designed a novel distributed adaptive collaborative control strategy that exploits information from connected vehicles to achieve leader synchronization. The effectiveness of this strategy was proved by the L-K approach. To improve the performance of the multiagent system, Cai and Shen [20] proposed the formation error, which characterizes the minimum squared distance between two formations under arbitrary translation and rotation. They designed an integrated localization and control scheme to minimize the mean formation error (MFE). The experimental results showed that this method outperforms the existing algorithm.

Vagharshakyan et al. [21] introduced an image-based rendering technique based on light field reconstruction. The essence of this method lies in the sparse representation of epipolar-plane images (EPI) in the shearlet transform domain. The proposed method performs well on 3D scenes. Wu et al. [22] developed a novel convolutional neural network (CNN) based framework for light field reconstruction. The method successfully suppresses the ghosting effects resulting from information asymmetry. Its high performance and robustness are demonstrated on several data sets. A separable formulation of sheared reconstruction filters, introduced by Vaidyanathan et al. [23], is more efficient than previous techniques. The narrow baselines and constrained spatial resolution of current light field cameras limit their applications. Thus, Wang et al. [24] designed a hybrid imaging system containing a light field camera and a high-resolution digital single-lens reflex camera. Cai et al. [25] presented a novel active method involving ray calibration and phase mapping to achieve SLF 3D reconstruction. Zhou et al. [26] proposed an image recognition algorithm in which the light field is applied to image recognition as the feature extraction library for the first time. In addition, this algorithm can recognize images taken from different camera angles.

In the field of target recognition, Literature [27] proposed an optimized CNN-based image recognition model. This method effectively solves the problem that the traditional CNN model does not consider learning weights. Nam and Han [28] proposed a new visual tracking algorithm based on the representations of a discriminatively trained convolutional neural network. The method performs online tracking by evaluating candidate windows that are randomly sampled around the target state. The experimental results show that the method performs excellently on existing tracking benchmarks. Hu and Zhai [29] proposed a multitarget detection algorithm based on 3D DSF R-CNN in order to upgrade the color image detection method with depth information. Experimental results show that the method is robust to complex illumination. To improve the quality of the tracking model, Danelljan et al. [30] proposed spatially regularized discriminative correlation filters for tracking. The experimental results show that the method achieves excellent results on four data sets. The tracker introduced by Tao et al. [31] treats the initial state of the target in the first frame as a patch, matches it against candidate patches in a new frame, and returns the most similar patch by means of a learned matching function. The experimental results verify the effectiveness of the proposed method.

3. Proposed Methods

3.1. Multiagent Representation Model of Light Field

We take the Stanford bunny as an example. For a system containing agents, define as the interaction model. Define as the node set, and is the set of edges between ordered pairs of nodes. Edge in the edge set of a directed graph represents the information that agent j can obtain from agent i. The visible threshold of light field for each agent C is defined as , so the minimum generalization function E(C) is

where is the standard that the threshold value is in the gray gradient area, is the random parameter, and is the weighting function. Add a specific parameter to the threshold value, that is, minimize the piecewise smoothing approximation of the gradient function calculated by the following function:

The first item adds the approximate value to the input light field image and evaluates the threshold edge by weighting. is the input gradient value figure on the domain, then

The region in the above formula is represented by the labeling function . The binary functions are equivalent to the multilabel function:

Then, the marker function is retrieved from these functions successively by the following formula:

Therefore, the light field model with agents is obtained, as shown in Figure 2.
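The directed interaction model used in this subsection can be illustrated with a minimal Python sketch. The function name `build_in_neighbors`, the integer agent indices, and the ring topology are hypothetical choices made for illustration only; the convention that an edge (i, j) means agent j can obtain information from agent i follows the text.

```python
# Hypothetical sketch of a directed interaction graph among agents.
# An edge (i, j) means that agent j can obtain information from agent i.

def build_in_neighbors(num_agents, edges):
    """Return, for each agent j, the sorted list of agents i it can read from."""
    in_neighbors = {j: [] for j in range(num_agents)}
    for i, j in edges:
        if not (0 <= i < num_agents and 0 <= j < num_agents):
            raise ValueError(f"edge ({i}, {j}) refers to an unknown agent")
        # Ignore self-loops and duplicate edges.
        if i != j and i not in in_neighbors[j]:
            in_neighbors[j].append(i)
    return {j: sorted(sources) for j, sources in in_neighbors.items()}


# Example topology (an assumption): a directed ring 0 -> 1 -> 2 -> 3 -> 0.
ring_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
ring_in_neighbors = build_in_neighbors(4, ring_edges)
print(ring_in_neighbors)
```

Storing the in-neighbors of each agent, rather than its out-neighbors, makes it direct to look up whose observations an agent may fuse with its own.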

3.2. Collaborative Mechanism of Multiple Agents

It is difficult to apply a single agent in real light field reconstruction. Interference such as occlusion, blurring, and high noise leads to a lack of reconstruction samples and limits the types of target that can be recognized. In this paper, the collaborative mechanism of multiple agents is introduced into light field reconstruction, and each agent performs data fusion according to its own state, environment information, and target information. Finally, the agents complete the task through cooperation. Suppose a diagonal matrix exists between agents , and the state error between agents is . Thus, the closed-loop system of the state error between agents is

The dynamic energy of each agent in a directed graph network composed of multiple agents would be

Let the state equation of an agent be , and the state equation of an agent can be obtained according to the above formula. In order to prove the overall collaboration between agents, the following formula is given:

When holds under the directed graph topology, the consistency of collaboration is reached. Suppose target Z is distributed in union , and the union utility function is . To perform target Z, the union utility function formed by multiple agents would be .
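To make the notion of reaching consistency concrete, the following is a minimal Python sketch of a standard discrete-time average-consensus iteration. It is not the closed-loop system of this paper: it assumes an undirected, connected graph (the paper uses a directed topology), scalar observations, and a hypothetical step size of 0.25. Under these assumptions, every agent converges to the mean of all initial observations.

```python
def consensus_step(states, neighbors, step_size):
    """One synchronous update: each agent moves toward its neighbors' states."""
    return [
        x_i + step_size * sum(states[j] - x_i for j in neighbors[i])
        for i, x_i in enumerate(states)
    ]


def run_consensus(states, neighbors, step_size=0.25, tol=1e-9, max_iters=10_000):
    """Iterate until all agents agree to within tol (or max_iters is reached)."""
    states = list(states)
    for _ in range(max_iters):
        if max(states) - min(states) < tol:
            break
        states = consensus_step(states, neighbors, step_size)
    return states


# Four agents on an undirected ring, each holding one noisy observation
# (made-up values). The step size must stay below 1 / (maximum degree).
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
observations = [1.0, 2.0, 3.0, 6.0]
fused = run_consensus(observations, ring)
print(fused)
```

Because the update is symmetric, the sum of the states is preserved at every step, so the agreed value is the average of the initial observations. On a directed graph, the same iteration reaches the average only if the graph is balanced.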

If , at least two agents and cooperate to complete the task. The merged delta is

3.3. Multiagent Light Field Reconstruction Model

Based on the multiagent model established above, we adopt an algorithm that combines the wavelet transform and the sparse Fourier transform to reconstruct the multiagent light field. The original light field image is a grid of images, in which each image records the beams that reach one microlens on the imaging surface from different positions on the lens, as shown in Figure 3.

The original image is composed of a series of pixels, each of which is imaged by a microlens. Because the aperture is finite, each microlens has a certain field of view and there is a certain disparity between different microlenses. The disparity is the difference in direction produced by viewing the same target from two points a certain distance apart. The light reaching a point on the surface is the weighted integral of all the rays on the lens,

where is the light field parameter, F is the distance to the target surface, and is the attenuation factor due to the optical halo effect.
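A form of this weighted integral that is common in the light field photography literature, and that matches the quantities described above, is sketched below. The notation is assumed here and is not confirmed by this paper: $E_F(x, y)$ is the irradiance at point $(x, y)$ on the imaging surface, $L_F(x, y, u, v)$ is the light field parameterized by the lens plane $(u, v)$ and the imaging plane $(x, y)$ at separation $F$, and $\cos^{4}\theta$ is the attenuation factor.

```latex
E_F(x, y) = \frac{1}{F^{2}} \iint L_F(x, y, u, v)\,\cos^{4}\theta \,\mathrm{d}u\,\mathrm{d}v
```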

Let , and the point imaging function on any surface can be obtained.

Based on the algorithm, the frequency domain information of images can be obtained by 4D Fourier transform of the multiagent light field. In this algorithm, the images are reconstructed by central slice and wavelet inverse transform. The multiagent light field is obtained, as shown in Figure 4.
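The central slice step rests on the projection-slice relationship between a signal and its Fourier transform. The sketch below checks a 2D toy analogue of that relationship in plain Python; it does not reproduce the 4D transform, the sparse Fourier step, or the wavelet inverse transform used in this paper. The helper names `dft_1d` and `dft_2d` and the 4 × 4 test image are hypothetical.

```python
import cmath


def dft_1d(signal):
    """Naive 1D discrete Fourier transform."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def dft_2d(image):
    """2D DFT as row-wise 1D DFTs followed by column-wise 1D DFTs."""
    rows = [dft_1d(row) for row in image]
    num_rows, num_cols = len(rows), len(rows[0])
    cols = [dft_1d([rows[r][c] for r in range(num_rows)]) for c in range(num_cols)]
    # cols[c][k] is the coefficient at (row frequency k, column frequency c).
    return [[cols[c][k] for c in range(num_cols)] for k in range(num_rows)]


image = [
    [1.0, 2.0, 0.0, 1.0],
    [0.0, 3.0, 1.0, 2.0],
    [2.0, 1.0, 4.0, 0.0],
    [1.0, 0.0, 2.0, 3.0],
]

# Project along the row axis: sum each column into a 1D signal.
projection = [sum(image[r][c] for r in range(4)) for c in range(4)]

slice_from_2d = dft_2d(image)[0]          # row frequency 0 of the 2D spectrum
spectrum_of_projection = dft_1d(projection)

max_error = max(abs(a - b) for a, b in zip(slice_from_2d, spectrum_of_projection))
print(max_error)
```

The central slice of the 2D spectrum agrees with the 1D spectrum of the projection up to floating-point error. The same relationship in 4D is what allows an image to be recovered from a slice of the transformed light field.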

3.4. Multiagent Light Field Reconstruction Method Based on GAN

In the original formulation of adversarial networks, generator G learns to fit the mapping of data from noise space to data space. Denote . is a noise that follows a certain distribution and is expressed as . The probability distribution of the input image data that discriminator D needs to learn is , where is the image data of the input discriminator. In this paper, we represent the multiple agents in the GAN model as multiple discriminators, with as the global discriminant network and as the discriminant network containing P agents. Therefore, the mapping of from noise space to data space should be expressed as . The probability distribution of input image data is , where is the image data captured by multiple agents of the input discriminator and . In the probability distribution, 0 means Fake and 1 means Real.
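One way to picture the multidiscriminator arrangement is the following minimal Python sketch. It is an assumption-laden illustration, not the loss of this paper: it assumes a binary cross-entropy loss with labels 1 (Real) and 0 (Fake), averages the loss over one global discriminator and P = 3 agent discriminators, and uses the non-saturating generator loss. The function names and probability values are made up.

```python
import math


def bce(prob, label, eps=1e-12):
    """Binary cross-entropy for one probability; label is 1 (Real) or 0 (Fake)."""
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def discriminator_loss(real_probs, fake_probs):
    """Average loss over all discriminators (global plus one per agent).

    real_probs[k] and fake_probs[k] are discriminator k's outputs on a real
    image and on a generated image, respectively.
    """
    per_disc = [bce(r, 1) + bce(f, 0) for r, f in zip(real_probs, fake_probs)]
    return sum(per_disc) / len(per_disc)


def generator_loss(fake_probs):
    """The generator's loss falls as discriminators label its output Real."""
    return sum(bce(f, 1) for f in fake_probs) / len(fake_probs)


# One global discriminator followed by P = 3 agent discriminators.
real_probs = [0.9, 0.8, 0.85, 0.7]
fake_probs = [0.2, 0.3, 0.1, 0.4]
d_loss = discriminator_loss(real_probs, fake_probs)
g_loss = generator_loss(fake_probs)
print(d_loss, g_loss)
```

Averaging over discriminators is only one possible aggregation. Weighting the global discriminator differently from the agent discriminators would be an equally plausible design under the description above.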

The set of light field images to be reconstructed is and . The result of is denoted as , and then . Map to the clear image domain through the global generator, where

The real image is input to , and the corresponding adversarial loss function is

Finally, the optimal model of multiagent light field reconstruction based on GAN is obtained, as shown in Figure 5.

4. Experimental Results and Analyses

4.1. Experimental Settings

4.1.1. Controller Hardware

In this paper, we use MATLAB and PyTorch under Windows 10 for experimental simulation. The simulations run on a small server with an Intel Xeon E5-2630 v4 CPU (2.2 GHz) and 32 GB of memory.

4.1.2. Experimental Parameters

In this section, the Stanford University Light Field Library is used as the light field data set for the simulation experiments. The data set contains 17 × 17 angular samples. For the light field reconstruction sampling scheme, this paper adopts the line model. The sampling rate is determined by the number of sampled lines and the dimension of the angular domain.

4.2. Light Field Reconstruction Experimental Performance Test

The data for this experiment are from the Stanford University Square database, and the reconstruction results are compared with those of the existing algorithms in Literature [3], Literature [23], Literature [24], and Literature [26].

Figure 6 and Table 1 compare the reconstruction time and reconstruction accuracy of the proposed algorithm with those of the comparison algorithms. In the Lego Truck dataset, because of the complex structure of the reconstructed part, our reconstruction took 18 s longer than that of Literature [3]. However, our reconstruction accuracy is 92.613%, the highest among similar algorithms. In the Eucalyptus Flower dataset, our reconstruction accuracy is 92.017%, again the highest among similar algorithms. In the Bunny dataset, the texture structure of the reconstructed part is complex, and our algorithm shows superior performance: the reconstruction time is 507 s, and the reconstruction accuracy is 94.552%. In the Tarot dataset, although the reconstructed part contains text information, the algorithm of this paper shows good results, and both its reconstruction time and its reconstruction accuracy are the best among similar algorithms. Across the eight parameters of the overall experimental results, the proposed algorithm is superior to similar algorithms on 50% of them, especially in terms of reconstruction accuracy.

4.3. Maneuvering Target Recognition Performance Test

To verify the effectiveness of the proposed algorithm, this section uses the TB-50 and TB-100 video data sets as test data sets in simulation experiments. Several representative sequences were selected for the identification test. The algorithm in this paper is compared on maneuvering target identification with the existing algorithms proposed in Literature [27], Literature [28], Literature [29], Literature [30], and Literature [31], as shown in Figure 7.

The results of maneuvering target recognition are shown in Figure 7 and Table 2. In the TB-50 Jump video dataset, all algorithms effectively identify the target. Ours-4 achieves the best performance on this data set, with a recognition accuracy of 81.873%. In the TB-50 CarScale video dataset, because the target is occluded by a tree, the recognition frame of some algorithms is offset. In the TB-50 Surfer video dataset, the proposed algorithm exhibits excellent results, and the local information of the target is accurately identified by multiple agents. In the TB-100 Dog video dataset, all algorithms effectively identify the target. However, the method proposed in Literature [28] shows a frame offset in the last three frames of the video. At frame #106 of the TB-100 Singer1 video dataset, strong light interferes with recognition, and only the algorithm of this paper effectively identifies the target.

The time analysis of maneuvering target recognition is shown in Figure 8. Across the different data sets, the target recognition time of all algorithms is within 3 s. The recognition times of Ours-2, Ours-3, Literature [27], Literature [28], Literature [29], Literature [30], and Literature [31] are all at the same level. On TB-100 Singer1 and TB-100 Tiger1, the time of Ours-2 is between 2.3 s and 2.8 s, and the time of Ours-3 is between 2 s and 2.5 s. On TB-100 Dog, the recognition time of Ours-4 is shorter than that of Ours-3 and lies between 1.8 s and 2.3 s. In summary, as the number of agents increases, the recognition time decreases. The experiments therefore indicate that the method is effective in reducing the recognition time.

5. Conclusion

To address the problems of insufficient samples for light field reconstruction and the limited field of view of a single agent, this paper proposes a new method of maneuvering target recognition. First, this paper introduces GAN into the field of light field reconstruction and uses its data generation and data enhancement capabilities to greatly improve the accuracy of light field reconstruction. Second, this paper establishes a multiagent collaboration model. The method effectively removes the overlap and noise in the data and ensures the reliability of the data. Finally, the effectiveness of the proposed algorithm is verified in simulation experiments on light field reconstruction and maneuvering target recognition.

Data Availability

The actual application dataset in this paper is a self-built maneuvering target data set. The data set contains data on military equipment such as infantry vehicles, armored vehicles, and helicopters. Therefore, the data set is confidential and cannot be released.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this article.

Authors’ Contributions

Peien Luo and Lei Cai contributed equally.

Acknowledgments

This paper was supported by the National Natural Science Foundation of China (61703143), Science and Technology Project of Henan Province (192102310260), Scientific and Technological Innovation Talents in Xinxiang (CXRC17004), Young Backbone Teacher Training Project of Henan University (2017GGJS123), and Science and Technology Major Special Project of Xinxiang City (ZD18006).