Research Article

Parallel Numerical Simulations of Three-Dimensional Electromagnetic Radiation with MPI-CUDA Paradigms

Algorithm 1

 #pragma omp parallel for schedule(static)
 for (; _tot; ++)
  for (; _tot; ++)
   for (; _tot; ++)
    ⋯
 To improve OpenMP load balance and scalability, we merge the two outer loops
into a single loop, reducing the nest from three levels to two, so that the
static schedule has a larger iteration count to divide among the threads.
 _end = _tot;
_end = (_tot) * (_tot);
_start = _end + 1;
 #pragma omp parallel for schedule(static)
 for (_start; _end; ++)
 {
  /_end;
  ;
  if (!) continue;
  for (; _tot; ++)
   ⋯
 }
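
The loop merging described in Algorithm 1 can be illustrated with a minimal, self-contained C sketch. All names here (`sweep_nested`, `sweep_merged`, `cell_value`, `nk`, `nj`, `ni`, `kj`) are hypothetical and chosen only for illustration; the identifiers, index bounds, and the skip condition of the original listing are not recoverable from the text and are not reproduced. The sketch assumes zero-based indices and a contiguous row-major array, and it only shows that the merged two-level nest visits the same cells as the three-level nest.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-cell work; stands in for the elided loop body. */
static double cell_value(int k, int j, int i)
{
    return k + 2.0 * j + 3.0 * i;
}

/* Three nested loops: only the outer k loop is divided among threads,
 * so at most nk chunks of work are available to the static schedule. */
void sweep_nested(double *out, int nk, int nj, int ni)
{
    assert(nk > 0 && nj > 0 && ni > 0);
    #pragma omp parallel for schedule(static)
    for (int k = 0; k < nk; ++k)
        for (int j = 0; j < nj; ++j)
            for (int i = 0; i < ni; ++i)
                out[((size_t)k * nj + j) * ni + i] = cell_value(k, j, i);
}

/* Merged version: the k and j loops become one loop of nk * nj
 * iterations, and the original indices are recovered by division
 * and remainder. The inner i loop is unchanged. */
void sweep_merged(double *out, int nk, int nj, int ni)
{
    assert(nk > 0 && nj > 0 && ni > 0);
    const int kj_tot = nk * nj;   /* assumes the product fits in int */
    #pragma omp parallel for schedule(static)
    for (int kj = 0; kj < kj_tot; ++kj) {
        const int k = kj / nj;    /* recovered outer index */
        const int j = kj % nj;    /* recovered middle index */
        for (int i = 0; i < ni; ++i)
            out[((size_t)k * nj + j) * ni + i] = cell_value(k, j, i);
    }
}
```

Compiled without OpenMP support, the pragmas are ignored and both functions run serially with identical results. For perfectly nested loops, the OpenMP `collapse(2)` clause (available since OpenMP 3.0) achieves a similar merging without manual index arithmetic; whether it applies to the authors' code depends on details the listing does not show.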