Mathematical Problems in Engineering

Volume 2015, Article ID 823426, 9 pages

http://dx.doi.org/10.1155/2015/823426

## Parallel Numerical Simulations of Three-Dimensional Electromagnetic Radiation with MPI-CUDA Paradigms

High Performance Computing Centre, Shanghai University, Shanghai 200436, China

Received 28 August 2014; Accepted 15 December 2014

Academic Editor: L. W. Zhang

Copyright © 2015 Bing He et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Parallel computation can improve the performance of numerical simulations of electromagnetic radiation and substantially reduce runtime. We simulate electromagnetic radiation on clusters with multicore CPU and GPU parallel architectures using MPI-OpenMP and MPI-CUDA hybrid parallel algorithms. This is an effective alternative to the traditional finite-difference time-domain method, which falls short in electromagnetic radiation calculations because large problems demand too much memory and computing time. In addition, we use regional segmentation, subregional data communication, consolidation, and other methods to improve the nested parallelism of the program, and we verify the correctness of the calculation results. From studying these two hybrid parallel models running on a high-performance cluster, we conclude that both are suitable for large-scale numerical calculations and that the MPI-CUDA hybrid model achieves the higher speedup.

#### 1. Introduction

The finite-difference time-domain (FDTD) method has become a common method for solving Maxwell’s equations [1]. It is a full-vector method and naturally provides both the time-domain and frequency-domain information that users need, which is a unique advantage in electromagnetic and photonic applications. The FDTD algorithm is discrete in both time and space, so the structure of the electromagnetic field must be described on a grid composed of Yee cells. Because Maxwell’s equations are also discretized in time, the time step is closely tied to the mesh size. In the limit where the mesh size tends to zero, the discrete model describes Maxwell’s equations exactly.
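The link between time step and mesh size is usually expressed through the Courant-Friedrichs-Lewy (CFL) stability condition, which for a three-dimensional Yee grid in vacuum reads Δt ≤ 1 / (c · sqrt(1/Δx² + 1/Δy² + 1/Δz²)). The following is a minimal sketch of that bound; the function name `max_time_step` and the `courant` safety factor are illustrative assumptions, not names taken from the paper.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum [m/s]


def max_time_step(dx, dy, dz, courant=1.0):
    """Largest stable time step [s] for a 3D Yee grid in vacuum.

    dx, dy, dz are the cell sizes in metres. `courant` is a hypothetical
    safety factor in (0, 1]; 1.0 gives the CFL limit itself.
    """
    if min(dx, dy, dz) <= 0.0:
        raise ValueError("cell sizes must be positive")
    if not 0.0 < courant <= 1.0:
        raise ValueError("courant factor must lie in (0, 1]")
    return courant / (C0 * math.sqrt(1.0 / dx**2 + 1.0 / dy**2 + 1.0 / dz**2))


# For a cubic 1 mm mesh the limit reduces to dx / (c * sqrt(3)).
dt = max_time_step(1e-3, 1e-3, 1e-3)
```

For a 1 mm cubic mesh this gives a limit of roughly 1.9 ps, and halving the mesh size halves the admissible time step, which is why finer grids raise both the memory and the runtime cost.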

Recently, general-purpose computing on graphics processing units (GPGPU) has received considerable attention in many scientific fields [2–4] because it offers high computational performance at low cost. In addition, the Intel Xeon Phi coprocessor, based on the Many Integrated Core (MIC) architecture, packs up to 1 TFLOPS of double-precision performance into one chip. It runs a Linux operating system, provides x86 compatibility, and supports several popular programming models used on multicore architectures, including MPI, OpenMP, and Threading Building Blocks. High-performance computer architectures are trending toward hybrid systems, which in turn call for mixed programming models in software design. The GPGPU and MIC accelerators that have appeared in recent years provide an opportunity to improve the performance of parallel FDTD algorithms. We therefore implemented a parallel three-dimensional FDTD algorithm based on the MPI-CUDA model.

The FDTD algorithm is widely applied in many areas of electromagnetic radiation, such as antenna radiation analysis, scattering calculations, electronic packaging, and radar. With the development of high-performance computing, MPI-based parallel FDTD algorithms [5, 6] have addressed the long computing time of the method. However, as the amount of computation grows, the MPI processes within a single node carry an increasing computational burden. By using two hybrid models, MPI-OpenMP and MPI-CUDA, to solve this problem, we can exploit distributed shared memory to improve parallel speedup and scalability [7, 8].
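As a rough illustration of how an FDTD grid can be shared among MPI processes, the sketch below splits the grid planes along one axis into contiguous slabs, one per rank, and lists the neighbouring ranks with which boundary planes would be exchanged. The one-dimensional slab layout and the names `slab_bounds` and `halo_neighbours` are assumptions made for this example; the decomposition actually used in the paper is not specified in this section.

```python
def slab_bounds(n_planes, n_ranks, rank):
    """Half-open range [start, stop) of grid planes owned by `rank`.

    Planes are split as evenly as possible; the first
    (n_planes % n_ranks) ranks receive one extra plane each.
    """
    if n_ranks <= 0 or not 0 <= rank < n_ranks:
        raise ValueError("rank must satisfy 0 <= rank < n_ranks")
    base, extra = divmod(n_planes, n_ranks)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def halo_neighbours(n_ranks, rank):
    """Ranks owning the slabs below and above `rank` (None at the domain edge)."""
    lower = rank - 1 if rank > 0 else None
    upper = rank + 1 if rank < n_ranks - 1 else None
    return lower, upper


# Example: 10 planes shared by 4 ranks.
slabs = [slab_bounds(10, 4, r) for r in range(4)]
```

In a hybrid model each rank would then update its own slab with OpenMP threads or a CUDA kernel and exchange only the boundary planes with the ranks returned by `halo_neighbours`.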

The rest of the paper is organized as follows. Section 2 presents the FDTD algorithm with a uniaxial perfectly matched layer (UPML). Sections 3 and 4 describe the procedure and basic steps for accelerating the method with the MPI-OpenMP and MPI-CUDA paradigms. Section 5 compares the parallel performance of the two methods and analyzes the factors that affect it. Finally, Section 6 gives the conclusions.

#### 2. FDTD Algorithm

The FDTD algorithm is a numerical method based on Maxwell’s equations. It uses a leapfrog scheme in which the electric and magnetic fields are sampled alternately, offset by half a step in space and time, on a grid of Yee cells [9].
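The leapfrog idea is easiest to see in one dimension. The sketch below is a simplified illustration, not the three-dimensional UPML code of the paper: it assumes vacuum, normalized field units (so that both update coefficients equal the Courant number cΔt/Δx), perfectly conducting end walls, and a soft Gaussian source. The function name `leapfrog_1d` and its parameters are hypothetical.

```python
import math


def leapfrog_1d(n_cells=200, n_steps=100, courant=0.5):
    """Advance a 1D Yee grid with the leapfrog scheme and return (ez, hy).

    ez lives on integer grid points and integer time steps;
    hy lives on half grid points and half time steps.
    """
    ez = [0.0] * n_cells
    hy = [0.0] * (n_cells - 1)
    src = n_cells // 2  # source position (assumed: grid centre)
    for n in range(n_steps):
        # Half step: update H from the spatial difference of E.
        for i in range(n_cells - 1):
            hy[i] += courant * (ez[i + 1] - ez[i])
        # Next half step: update E from the spatial difference of H.
        # The end points stay at zero, which models conducting walls.
        for i in range(1, n_cells - 1):
            ez[i] += courant * (hy[i] - hy[i - 1])
        # Soft source: add a Gaussian pulse to E at the centre.
        ez[src] += math.exp(-((n - 30) / 10.0) ** 2)
    return ez, hy


ez, hy = leapfrog_1d(n_cells=400, n_steps=100)
```

Each field is updated only from the other field's most recent values, so no linear system has to be solved and every cell can be updated independently within a half step. That independence is what makes the scheme suitable for thread-level and GPU parallelism.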

As Figure 1 shows, the Yee cell has the following characteristics: each magnetic field component is surrounded by four electric field components, and each electric field component is surrounded by four magnetic field components. With the field components placed at these relative positions in the Yee cell, the continuity conditions at interfaces are satisfied automatically. This sampling scheme suits the finite-difference form of Maxwell’s equations and is consistent with Faraday’s law of electromagnetic induction and Ampere’s law [10]. The method therefore advances the entire electromagnetic field recursively, step by step. The explicit update equations for the electric field $\mathbf{E}$ and the magnetic field $\mathbf{H}$ involve the following quantities: $\varepsilon_r$ is the relative permittivity, $\sigma$ is the conductivity of the tissue [S/m], $\Delta$ is the mesh size, and $\Delta t$ is the time step. Figure 2 shows the workflow of the FDTD method.
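For orientation, the standard Yee update for one component of each field in a lossy, non-magnetic medium on a uniform cubic grid can be written as follows. This is the textbook form and may differ in notation or arrangement from the authors' equations; the symbols $\varepsilon_0$ (vacuum permittivity), $\mu$ (permeability), and the coefficient names $C_A$ and $C_B$ are assumptions introduced here.

$$
E_x^{n+1}\!\left(i+\tfrac12,j,k\right) = C_A\,E_x^{n}\!\left(i+\tfrac12,j,k\right) + C_B\left[\frac{H_z^{n+1/2}\!\left(i+\tfrac12,j+\tfrac12,k\right)-H_z^{n+1/2}\!\left(i+\tfrac12,j-\tfrac12,k\right)}{\Delta} - \frac{H_y^{n+1/2}\!\left(i+\tfrac12,j,k+\tfrac12\right)-H_y^{n+1/2}\!\left(i+\tfrac12,j,k-\tfrac12\right)}{\Delta}\right]
$$

$$
H_x^{n+1/2}\!\left(i,j+\tfrac12,k+\tfrac12\right) = H_x^{n-1/2}\!\left(i,j+\tfrac12,k+\tfrac12\right) - \frac{\Delta t}{\mu}\left[\frac{E_z^{n}\!\left(i,j+1,k+\tfrac12\right)-E_z^{n}\!\left(i,j,k+\tfrac12\right)}{\Delta} - \frac{E_y^{n}\!\left(i,j+\tfrac12,k+1\right)-E_y^{n}\!\left(i,j+\tfrac12,k\right)}{\Delta}\right]
$$

$$
C_A = \frac{2\varepsilon_r\varepsilon_0 - \sigma\,\Delta t}{2\varepsilon_r\varepsilon_0 + \sigma\,\Delta t}, \qquad C_B = \frac{2\,\Delta t}{2\varepsilon_r\varepsilon_0 + \sigma\,\Delta t}
$$

The remaining four components follow by cyclic permutation of $x$, $y$, $z$. Setting $\sigma = 0$ gives $C_A = 1$ and $C_B = \Delta t/(\varepsilon_r\varepsilon_0)$, which recovers the lossless update.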