Abstract

A Message-Passing Interface (MPI) parallel implementation of an integral equation solver that uses the Method of Moments (MoM) with higher-order basis functions is proposed to compute the Radar Cross-Section (RCS) of various targets. The block-partitioned scheme for the large dense MoM matrix is designed to achieve excellent load balance and high parallel efficiency. Numerical results demonstrate that the higher-order basis used in this parallel scheme is more efficient than the conventional RWG basis and that the solver can efficiently analyze the RCS of electrically large platforms.

1. Introduction

Radar Cross-Section (RCS) computation of electrically large platforms has attracted a great deal of attention in the past few decades. One traditional and widely adopted method is the method of moments (MoM) [1]. However, when the operating frequency is high, MoM based on the Rao-Wilton-Glisson basis functions (RWGs) [2, 3] produces a very large number of unknowns for electrically large structures. To reduce the number of unknowns and to accelerate the computation, the fast multipole method (FMM) is one feasible approach. Although it achieves this goal to some extent, its iterative solution may suffer from convergence problems when the model to be simulated is complex. Another choice is to use higher-order polynomials defined over larger subdomains, namely wires and bilinear quadrilateral plates, as basis functions [4, 5]. The use of higher-order basis functions significantly reduces the number of unknowns, although it should be noted that they are well suited to large, smooth structures and less beneficial for finely detailed geometry. In addition, to reduce the total wall-clock time, the large dense MoM matrix is divided into a number of small block matrices of nearly equal size, which are distributed among all the available processes in the parallel method.

In this paper, the parallel in-core MoM solver combined with higher-order polynomial basis functions (HOBs) is employed on high-performance clusters, which significantly extends the capability of the MoM. This technique is capable of solving electrically large scattering problems [3, 5, 6] of several hundred wavelengths in the maximum dimension.

Section 2 of this paper outlines the basic theory of the higher-order basis functions and the matrix partition scheme. Section 3 describes the computation platforms. Section 4 presents numerical examples that validate the accuracy, efficiency, and applicability of the method: Section 4.1 examines the convergence of the higher-order basis MoM with respect to the order of the basis; Section 4.2.1 validates the accuracy of the method against measurement results; Section 4.2.2 shows that higher-order bases significantly reduce the number of unknowns and effectively shorten the computation time; Section 4.3 examines the parallel efficiency of the partition scheme used in this paper; and Section 4.4 illustrates a real-life problem, the RCS computation of a missile whose maximum electrical dimension exceeds one hundred wavelengths. Finally, Section 5 presents the conclusions, followed by the acknowledgments.

2. Basic Theory

2.1. Higher-order Basis Functions

Flexible geometric modeling can be achieved by using truncated cones for wires and bilinear patches to characterize surfaces [4]. The surface current over a bilinear surface is decomposed into its 𝑝- and 𝑠-components, as shown in Figure 1(a). The 𝑝-current component can be treated as an 𝑠-current component defined over the same bilinear surface with the 𝑝 and 𝑠 coordinates interchanged. The approximation for the 𝑠-component of the electric (or magnetic) current over a bilinear surface is defined by
$$\mathbf{J}_s(p,s)=\sum_{i=0}^{N_p}\Bigl[c_{i1}\mathbf{E}_{i1}(p,s)+c_{i2}\mathbf{E}_{i2}(p,s)+\sum_{j=2}^{N_s}a_{ij}\mathbf{P}_{ij}(p,s)\Bigr],\tag{1}$$
where the edge coefficients $c_{i1}$, $c_{i2}$ $(i=0,1,\ldots,N_p)$ are defined as
$$c_{i1}=\sum_{j=0}^{N_s}a_{ij}(-1)^{j},\qquad c_{i2}=\sum_{j=0}^{N_s}a_{ij}.\tag{2}$$

The edge basis functions $\mathbf{E}_{i1}(p,s)$, $\mathbf{E}_{i2}(p,s)$ and the patch basis functions $\mathbf{P}_{ij}(p,s)$ $(i=0,\ldots,N_p,\ j=2,\ldots,N_s)$ are expressed by (3) and (4), respectively,
$$\mathbf{E}_{i1}(p,s)=\frac{\boldsymbol{\alpha}_s}{\bigl|\boldsymbol{\alpha}_p\times\boldsymbol{\alpha}_s\bigr|}\,p^{i}N(s),\qquad \mathbf{E}_{i2}(p,s)=\frac{\boldsymbol{\alpha}_s}{\bigl|\boldsymbol{\alpha}_p\times\boldsymbol{\alpha}_s\bigr|}\,p^{i}N(-s),\tag{3}$$
$$\mathbf{P}_{ij}(p,s)=\frac{\boldsymbol{\alpha}_s}{\bigl|\boldsymbol{\alpha}_p\times\boldsymbol{\alpha}_s\bigr|}\,p^{i}S_{j}(s),\tag{4}$$
where $N(s)$ and $S_j(s)$ are the node and segment polynomials given later in (8), and $\boldsymbol{\alpha}_p$, $\boldsymbol{\alpha}_s$ are the unitary vectors defined as
$$\boldsymbol{\alpha}_p=\frac{\partial\mathbf{r}(p,s)}{\partial p},\qquad \boldsymbol{\alpha}_s=\frac{\partial\mathbf{r}(p,s)}{\partial s}.\tag{5}$$
The parametric equation of such an isoparametric element can be written in the following form:
$$\mathbf{r}(p,s)=\mathbf{r}_{11}\frac{(1-p)(1-s)}{4}+\mathbf{r}_{12}\frac{(1-p)(1+s)}{4}+\mathbf{r}_{21}\frac{(1+p)(1-s)}{4}+\mathbf{r}_{22}\frac{(1+p)(1+s)}{4},\quad -1\le p\le 1,\ -1\le s\le 1,\tag{6}$$
where $\mathbf{r}_{11}$, $\mathbf{r}_{12}$, $\mathbf{r}_{21}$, $\mathbf{r}_{22}$ are the position vectors of its vertices and $p$ and $s$ are the local coordinates.
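As a quick numerical illustration of (5) and (6), the following Python sketch (not part of the authors' solver; the corner coordinates are arbitrary example values) evaluates the bilinear mapping, its unitary vectors, and the Jacobian magnitude that normalizes (3) and (4).

```python
import numpy as np

# Illustrative sketch only: bilinear patch of (6) and its unitary vectors of (5).
# The four corner position vectors below are arbitrary example data.
r11 = np.array([0.0, 0.0, 0.0])
r12 = np.array([0.0, 1.0, 0.1])
r21 = np.array([1.0, 0.0, 0.0])
r22 = np.array([1.2, 1.1, 0.2])

def r(p, s):
    """Bilinear mapping of (6): local (p, s) in [-1, 1]^2 -> point on the patch."""
    return (r11 * (1 - p) * (1 - s) + r12 * (1 - p) * (1 + s)
            + r21 * (1 + p) * (1 - s) + r22 * (1 + p) * (1 + s)) / 4.0

def unitary_vectors(p, s):
    """Unitary vectors of (5), obtained by differentiating (6) analytically."""
    alpha_p = (-r11 * (1 - s) - r12 * (1 + s) + r21 * (1 - s) + r22 * (1 + s)) / 4.0
    alpha_s = (-r11 * (1 - p) + r12 * (1 - p) - r21 * (1 + p) + r22 * (1 + p)) / 4.0
    return alpha_p, alpha_s

# Jacobian |alpha_p x alpha_s| that appears in the denominators of (3)-(4),
# evaluated at the patch centre (p, s) = (0, 0).
ap, asv = unitary_vectors(0.0, 0.0)
print(np.linalg.norm(np.cross(ap, asv)))
```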

A right-truncated cone is determined by the position vectors and the radii of its beginning and its end, $\mathbf{r}_1$, $a_1$ and $\mathbf{r}_2$, $a_2$, respectively, as shown in Figure 1(b). Generalized wires (i.e., wires that have a curvilinear axis and a variable radius) can be approximated by right-truncated cones.

Currents along wires are approximated by polynomials and can be written as
$$I(s)=I_{1}N(s)+I_{2}N(-s)+\sum_{i=2}^{N_s}a_{i}S_{i}(s),\quad -1\le s\le 1,\tag{7}$$
where the node basis function $N(s)$ and the segment basis functions $S_i(s)$ are expressed as
$$N(s)=\frac{1-s}{2},\qquad S_{i}(s)=\begin{cases}s^{i}-1,& i\ \text{even},\\[2pt] s^{i}-s,& i\ \text{odd},\end{cases}\tag{8}$$
respectively, $a_i$ $(i=2,\ldots,N_s)$ are the coefficients, and $I_1=I(-1)$, $I_2=I(1)$ are the values of the current at the two wire ends.
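The regrouping that turns a raw power series into the node/segment form of (7), with the edge coefficients of (2), can be verified numerically. The short Python sketch below is illustrative only; the polynomial order and coefficients are random example values, not data from the paper.

```python
import numpy as np

# Check that c1*N(s) + c2*N(-s) + sum_{j>=2} a_j*S_j(s) reproduces sum_j a_j*s^j,
# with c1, c2 given by (2) and N, S_j given by (8). Illustrative values only.

def N(s):
    return (1.0 - s) / 2.0                               # node function of (8)

def S(j, s):
    return s**j - 1.0 if j % 2 == 0 else s**j - s        # segment functions of (8)

rng = np.random.default_rng(0)
Ns = 5                                                   # example polynomial order
a = rng.standard_normal(Ns + 1)                          # example coefficients a_0..a_Ns

c1 = sum(a[j] * (-1)**j for j in range(Ns + 1))          # polynomial value at s = -1, cf. (2)
c2 = sum(a[j] for j in range(Ns + 1))                    # polynomial value at s = +1, cf. (2)

s = np.linspace(-1.0, 1.0, 201)
direct  = sum(a[j] * s**j for j in range(Ns + 1))
regroup = c1 * N(s) + c2 * N(-s) + sum(a[j] * S(j, s) for j in range(2, Ns + 1))

print(np.allclose(direct, regroup))                      # True: both forms agree
print(np.allclose(S(3, np.array([-1.0, 1.0])), 0.0))     # segment functions vanish at the ends
```

The check also makes the role of the segment functions explicit: since every $S_i(s)$ vanishes at $s=\pm 1$, continuity of the current at wire ends and patch edges is controlled entirely by the node/edge terms.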

The parametric equation of the cone surface can be written as
$$\mathbf{r}(\phi,s)=\mathbf{r}_{a}(s)+a(s)\,\mathbf{i}_{\rho}(\phi),\quad -1\le s\le 1,\ -\pi\le\phi\le\pi,\tag{9}$$
where $\mathbf{r}_a(s)$ is a point on the cone axis, $a(s)$ is the local radius, $\phi$ is the circumferential angle measured from the $x$-axis, and $\mathbf{i}_{\rho}(\phi)$ is the radial unit vector, perpendicular to the cone axis.

2.2. The Matrix Partition Scheme

Assume that $A$ is a large dense matrix; it can be divided into smaller blocks and distributed over a process grid [6]. For explanation purposes, the MoM matrix equation is rewritten in the general form
$$AX=B,\tag{10}$$
where $A$ denotes the complex dense matrix, $X$ is the unknown vector to be determined, and $B$ denotes the given source vector.
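For reference, a serial solution of (10) by LU factorization can be sketched as follows (illustrative only, with random data and an arbitrary matrix size; the parallel solver performs essentially the same factorization, but on a matrix that is block-cyclically distributed as described below).

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

# Serial reference solve of A X = B for a dense complex system (random example data).
n = 1000                                                         # example matrix size
rng = np.random.default_rng(1)
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))  # dense complex matrix
B = rng.standard_normal(n) + 1j * rng.standard_normal(n)            # excitation vector

lu, piv = lu_factor(A)        # O(n^3) LU factorization dominates the wall-clock time
X = lu_solve((lu, piv), B)    # triangular solves recover the current coefficients
print(np.allclose(A @ X, B))
```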

Assume that the matrix 𝐴 is divided into 6×6 blocks, which are distributed to 6 processes in a 2×3 process grid, as illustrated in Figure 2(a). Figure 2(b) shows to which process the blocks of 𝐴 are distributed using ScaLAPACK’s distribution methodology.

In Figure 2(a), the outermost numbers denote the row and column indices of the process coordinates. The top and bottom numbers in any block of Figure 2(b) denote the process rank and the process coordinate of a certain process, respectively, corresponding to the block of the matrix shown in Figure 2(a). By varying the dimensions of the blocks of 𝐴 and those of the process grid, different mappings can be obtained. This scheme can be referred to as a block-cyclic distribution procedure.
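A minimal Python sketch of this block-cyclic rule is given below. It is purely illustrative: the rule maps global block indices to process coordinates by taking them modulo the grid dimensions, and the row-major numbering of process ranks is an assumption for this example rather than the ordering necessarily chosen when the ScaLAPACK/BLACS grid is initialized.

```python
# Illustrative block-cyclic mapping for a Pr x Pc process grid (not ScaLAPACK code).

def owner(I, J, Pr=2, Pc=3):
    """Process coordinates (pr, pc) that own global block (I, J)."""
    return (I % Pr, J % Pc)

def rank(coords, Pc=3):
    """Row-major process rank assumed for this example."""
    pr, pc = coords
    return pr * Pc + pc

# Reproduce the flavour of Figure 2: a matrix split into 6 x 6 blocks on a 2 x 3 grid.
for I in range(6):
    print([rank(owner(I, J)) for J in range(6)])
# Each process owns blocks scattered over the whole matrix, so the filling and
# factorization work is spread almost evenly across the six processes.
```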

Load balancing is critical for the efficient operation of a parallel code. The present scheme achieves good load balance, and little interprocess communication is required during the matrix filling stage [4].

It is also worth mentioning that the polynomial order of the higher-order basis on each plate is determined by the maximum edge length of that plate, and the same load-balancing scheme is used regardless of the order of the basis functions.

3. Description of the Computation Platforms

To illustrate the versatility of the solver, two representative computer platforms have been chosen.
(1) Personal computer: quad-core Intel i5 processor (2.67 GHz) with 4 GB RAM and 500 GB of hard disk.
(2) Shanghai Supercomputer Center (SSC): 37 nodes of the Magic Cube machine with a total of 592 AMD CPU cores (1.9 GHz per CPU, 4 cores per CPU), 16 CPU cores per node and 4 GB RAM per core, for a total of approximately 2.3 TB of RAM. No hard disk storage is used for the computation, and InfiniBand is used for the network interconnection.

4. Numerical Results and Discussion

4.1. The Accuracy versus the Order of Basis Function

In this benchmark, a PEC sphere is used to test the relationship between the accuracy of the simulation and the order of the basis functions. The radius of the sphere is 1.0 m and the simulation frequency is 1 GHz. The incident direction is along the 𝑥-axis and the observation plane is the xOy plane, as illustrated in Figure 3. The parallel higher-order MoM with 512 CPUs is employed to calculate the bistatic RCS (dB). The results obtained for different basis orders are compared in Figure 4, and the corresponding simulation data are listed in Table 1.

Figure 4 shows that the results are stable when the order of the basis functions ranges from two to five; for this model, any order from two to five is therefore a reasonable choice. Table 1 lists further details of the simulations. As expected, the number of unknowns increases as the order of the basis increases, and the required simulation time grows correspondingly.

4.2. Comparison with the Measurement Results and Parallel FMM with RWG Basis

To validate the accuracy and efficiency of the proposed parallel higher-order basis MoM methodology, two benchmarks, a truncated cone and a Y-8 aircraft, are simulated and their RCS computed.

4.2.1. Truncated Cone [7]

This benchmark is an end-capped truncated cone oriented along the 𝑧-axis and centred in the plane 𝑧 = 0 (illustrated in Figure 5). The elevation angle (𝜃) is measured from the positive 𝑧-axis and the azimuth angle (𝜙) from the positive 𝑥-axis. This target is interesting for several reasons. First, it exhibits the RCS response of a target with single curvature (common in structural parts of an aircraft, such as the fuselage). It also reveals the diffraction mechanism at curved edges, and reflection from planar surfaces bounded by curved edges can be observed. Therefore, this target is especially suitable for validating the prediction of objects with flat surfaces delimited by curved edges and for evaluating curved-edge contributions.

This model has been simulated at 7 GHz. The RCS pattern, for HH polarization, corresponds to 𝜙 = 0° with 𝜃 ranging from 0° to 180° in 1° steps. The incident direction is perpendicular to the generatrix. The simulation is performed on the first computer platform described above.

Figure 6 shows the RCS pattern of the truncated cone for HH polarization at 7 GHz. Three main lobes are clearly defined. Two of them correspond to the specular reflection from the two bases of the cone. The minor one corresponds to 𝜃 = 0° and the major one to 𝜃 = 180°, with the different levels resulting from the different areas of the corresponding bases. The other main lobe corresponds to the angle at which the generatrix is perpendicular to the incident direction. Diffraction from the curved edges becomes important in the intermediate region between the main lobes. The RCS pattern is compared with measured results and good agreement is seen.

4.2.2. Y-8 Plane

This benchmark is a real aircraft, the Y-8 (illustrated in Figure 7). The elevation angle (𝜃) and the azimuth angle (𝜙) are both taken from the positive 𝑥-axis. The operating frequency is 100 MHz. The airplane model is 36.2 m long, 38 m wide, and 10.5 m high; the corresponding electrical sizes are 12.1𝜆, 12.7𝜆, and 3.5𝜆, where 𝜆 is the free-space wavelength at the operating frequency. The incident wave, with HH polarization, propagates along the negative 𝑥-axis.

In this simulation, the order of the higher-order basis is three. The FMM parameters are as follows.
(1) Number of levels: 6.
(2) Top level: 3.
(3) Number of boxes at the top level: 27.
(4) Number of boxes at the finest level: 1937.
(5) Finest box size: 0.23𝜆.

The simulation is performed on the first computer platform described above. The bistatic RCS results obtained with the proposed parallel higher-order basis MoM and with the parallel FMM are plotted together in Figure 8. As shown in this figure, the results agree with each other very well from 15° to 345°. The only noticeable discrepancy occurs in the nose region of the plane, for angles from 0° to 15° and from 345° to 360°.

The comparisons of some computation parameters are listed in Table 2.

From Table 2, one can see that the higher-order basis adopted in the proposed method yields far fewer unknowns than the RWG basis used by the FMM. The total computation time of the proposed method is only about 23.6% of that required by the parallel FMM, that is, the proposed method is about 4.2 times faster. This benefit comes not only from the smaller number of unknowns required by the higher-order basis but also from the parallel matrix partition scheme.

4.3. Parallel Efficiency of Mirage’s RCS Computation

In this benchmark, the parallel efficiency of the RCS computation for a Mirage aircraft is measured for different numbers of processes (see Table 3). The model of the Mirage aircraft is shown in Figure 9; its geometric dimensions are 11.3 m × 7 m × 2.85 m. The operating frequency is 1.25 GHz, so the corresponding electrical dimensions are 47.1𝜆 × 29.2𝜆 × 11.9𝜆, where 𝜆 is the wavelength in free space. The model is placed along the 𝑥-axis and is excited by a plane wave propagating along the negative 𝑥-axis with VV polarization.

Taking the time for 16 processes as the reference, the parallel efficiencies for this simulation are plotted in Figure 10, and the simulation times for different numbers of CPUs are listed in Table 3. For 16, 32, 64, and 128 processes, the parallel efficiencies based on the wall time are higher than 80%, which demonstrates that the proposed method reaches an excellent parallel efficiency and is capable of effectively reducing the computation time.
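For clarity, the speedup and efficiency quoted here follow the usual definitions relative to the 16-process reference run. The sketch below shows this computation with placeholder wall-clock times; the values are hypothetical and are not the measured times of Table 3.

```python
# Parallel efficiency relative to a reference run (hypothetical timings, not Table 3).
ref_procs = 16
timings = {16: 1000.0, 32: 520.0, 64: 270.0, 128: 145.0}   # seconds, placeholder values

t_ref = timings[ref_procs]
for n, t in sorted(timings.items()):
    speedup = t_ref / t                      # speed-up with respect to the 16-process run
    efficiency = speedup * ref_procs / n     # fraction of ideal linear scaling
    print(f"{n:4d} processes: speed-up {speedup:5.2f}, efficiency {efficiency:5.1%}")
```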

4.4. RCS Computation of Benchmark with Electrically Large Dimension

In the following examples, the elevation angle (𝜃) is measured from the positive 𝑥-axis toward the 𝑧-axis and the azimuth angle (𝜙) from the positive 𝑥-axis. The simulations are performed on the second computer platform described in Section 3.

(1) In the first part of this section, the parallel speedup ratio for computing a real missile's bistatic RCS is tested.

The missile model is placed along the 𝑥-axis, as illustrated in Figure 11, and Table 4 gives the corresponding parameters of the model.

This benchmark is run at 5.0 GHz, for which the number of unknowns is 69,247. The simulation times for different numbers of processes and the resulting parallel speedup ratios are given in Table 5 and Figure 12, respectively. Taking the time for 64 processes as the reference, Figure 12 shows that the parallel speedup is nearly linear. However, for 192 and 256 processes the parallel efficiencies fall below the theoretical values: increasing the number of processes degrades the scaling because the ratio of communication to computation grows with the number of processes for this problem size. Nevertheless, the parallel efficiencies in all four cases remain higher than 80%, which demonstrates the good parallel performance of the proposed method.

(2) Consider next the missile’s bistatic RCS and its surface current distribution.

Table 6 gives the corresponding parameters of this benchmark, and Table 7 summarizes the computation results. The RCS results and the surface current distribution are shown in Figures 13 and 14, respectively.

This benchmark makes it clear that the method proposed in this paper is able to handle electrically large scattering problems of hundreds of wavelengths in the maximum dimension.

5. Conclusion

In this paper, the RCS computation of electrically large platforms using a parallel MoM technique with higher-order basis functions is presented. A load-balanced parallel method is achieved through a matrix partition scheme, so the total wall-clock time for solving the large dense matrix equation is shortened. The accuracy, efficiency, and applicability of the method are validated through several numerical examples. In conclusion, the proposed method can solve electrically large problems, with high accuracy and short computation time, that cannot be handled by the conventional RWG MoM, and it contributes to ongoing research efforts on developing numerically accurate solutions for electrically large problems.

Acknowledgments

This work is partly supported by the Fundamental Research Funds for the Central Universities of China (JY10000902002, K50510020017) and the National Science Foundation of China (61072019). This work is also supported by the Shanghai Supercomputer Center (SSC) of China.