International Journal of Reconfigurable Computing

Volume 2016, Article ID 4592780, 10 pages

http://dx.doi.org/10.1155/2016/4592780

## An Accelerating Solution for $N$-Body MOND Simulation with FPGA-SoC

^{1}Key Laboratory of Strongly Coupled Quantum Matter Physics, Chinese Academy of Sciences, School of Physical Sciences, University of Science and Technology of China, Hefei, Anhui 230026, China

^{2}Hefei Branch Center of National ASIC Design Engineering Technology Research Center, Institute of Advanced Technology, University of Science and Technology of China, Hefei, Anhui 230600, China

^{3}Yunnan Observatories, Chinese Academy of Sciences, Kunming 650216, China

^{4}Key Laboratory for the Structure and Evolution of the Celestial Objects, Chinese Academy of Sciences, Kunming 650216, China

^{5}University of Chinese Academy of Sciences, Beijing 100049, China

Received 21 December 2015; Accepted 17 May 2016

Academic Editor: Michael Hübner

Copyright © 2016 Bo Peng et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

As a modified-gravity proposal to handle the dark matter problem on galactic scales, Modified Newtonian Dynamics (MOND) has shown great success. However, the $N$-body MOND simulation is severely challenged by its computational complexity, which calls for acceleration of the simulation calculation. In this paper, we present a highly integrated accelerating solution for $N$-body MOND simulations. By using the FPGA-SoC, which integrates both FPGA and SoC (system on chip) in one chip, our solution exhibits potential for better performance, higher integration, and lower power consumption. To handle the calculation bottleneck of potential summation, on one hand, we develop a strategy to simplify the pipeline, in which the square calculation task is conducted by the DSP48E1 of Xilinx 7 series FPGAs, so as to reduce the logic resource utilization of each pipeline; on the other hand, we take advantage of the particle-mesh scheme to overcome the bandwidth bottleneck. Our experimental results show that 2 more pipelines can be integrated in the Zynq-7020 FPGA-SoC with the simplified pipeline, and the bandwidth requirement is reduced significantly. Furthermore, our accelerating solution has a full range of advantages over different processors. Compared with GPU, our work is about 10 times better in performance per watt and 50% better in performance per cost.

#### 1. Introduction

Modified Newtonian Dynamics (MOND) is an alternative proposal to the popular dark matter (DM) theory, accounting for the missing mass problem in astrophysics. To study the outskirts of disk galaxies, $N$-body MOND simulation is an essential task, namely, to simulate the dynamic evolution of an astronomical system consisting of multiple celestial objects (denoted as $N$-body), where the interaction of each pair of objects obeys the MOND proposal.

The gravitational $N$-body problem is traditionally explored with computer simulations, which involve massive nonlinear calculations for the MOND potentials. Yet such exploration has long been limited by both the nonlinearity and the large $N$ scale. In 2010, Mordehai Milgrom proposed a new formulation of MOND, named quasi-linear MOND (QUMOND) [1]. Based on the QUMOND formulation, the calculation of the MOND potential contains 3 steps: the first and the last are potential calculations similar to classical gravity calculations, whereas the second step is to calculate the phantom dark matter (PDM) distribution. The drawback of nonlinearity is ameliorated by QUMOND, while the large $N$ scale is still a big challenge, bringing a tremendous computational complexity of $O(N^2)$. Besides the development of algorithms that reduce the computational complexity, hardware acceleration of the arithmetic calculation unit is another effective way to accelerate the $N$-body MOND simulation and is thus the subject of the present paper.

Hardware accelerators for $N$-body MOND simulation abound. In the early 1990s, based on the methodology of application specific integrated circuit (ASIC), a series of specific processors, named GRAPE (GRAvity PipE), were proposed for the calculation of particle-particle interaction in the $N$-body problem. From 1991 to 2012, GRAPE developed from version 1 to version 8 [2]. A single GRAPE-8 chip integrates 48 pipeline processors and provides a peak performance of 480 Gflops (12000 Mpairs/s) in total. Besides ASICs like GRAPE, FPGA and GPU are also popular choices to accelerate the $N$-body simulation. Similar to ASIC, the main idea of FPGA-based accelerators is to customize parallel pipelines utilizing programmable gate arrays, whereas GPU-based accelerators parallelize the calculation across thousands of cores. In 2006, Kawai and Fukushige proposed two FPGA-based add-in cards [3], which were applied to astrophysical $N$-body simulations with the hierarchical tree algorithm; they achieved a performance of 80.9 Gflops (2128 Mpairs/s) with 16 pipelines running at 133 MHz. In 2007, Portegies Zwart et al. proposed a GPU accelerator for gravitational $N$-body simulations [4]; their results indicated that the GeForce 8800GTX GPU had a 10-fold speedup compared to Intel Xeon CPUs.

During the simulation, parameters of these objects are correlated. Therefore, in most situations where ASIC, FPGA, or GPU is utilized, the accelerator is implemented as an add-in card, relying on a host computer to deal with the data dispatching. However, the potential calculation consumes most of the simulation time, which means that processors in the host computer are mostly idle during the simulation. Wang et al. provided statistical results of a CPU-GPU hybrid parallel strategy for an $N$-body cosmological simulation; they revealed that CPUs were in a waiting state for nearly 75 percent of the total time [5]. Moreover, this card-host structure usually requires extra energy and space, which leads to a big waste in energy, money, and space. In particular, in $N$-body MOND simulation, the two-step potential calculation doubles this waste at the same simulation scale. These disadvantages motivate us to turn to FPGA-SoC, which combines an embedded low-power processor and FPGA, exhibiting a high level of integration.

In this paper, by utilizing the FPGA-SoC, we propose a highly integrated accelerating solution for $N$-body MOND simulations. Besides data dispatching, the embedded processor is also responsible for the PDM distribution calculation in the 2nd step. Considering that the number of pipelines an FPGA integrates directly affects the accelerator performance, we modify the typical potential summation pipeline to make full use of the DSP48E1 in Xilinx 7 series FPGAs, so as to reduce the logic resource occupation of each pipeline. Moreover, we optimize the data flow from memory to pipelines based on the particle-mesh scheme, which reduces the bandwidth requirement and improves the performance of our solution significantly. Finally, we test our solution on the low-cost Zynq-7020 all programmable SoC.

The rest of this paper is organized as follows. The background of MOND and $N$-body MOND simulation is briefly introduced in the next section. We discuss our motivation and contribution in Section 3. Then we describe the particle-mesh scheme and illustrate the system architecture of our solution in Section 4. Experimental results are presented in Section 5. Finally, we conclude this paper in Section 6.

#### 2. $N$-Body MOND Simulation

##### 2.1. Modified Newtonian Dynamics

Modified Newtonian Dynamics (MOND) can be interpreted as a modification to the law of gravity. It is an alternative proposal to the popular dark matter (DM) theory, accounting for the missing mass problem in astrophysics. Both MOND and DM elegantly fit the rotation curves of spiral galaxies. However, there exist some challenges for the DM-based model; the biggest one is that the tight scaling relations cannot be understood [6], whereas MOND explains them well, together with details of the rotation curves [7].

Modified Newtonian Dynamics (MOND) was first proposed by Milgrom in 1983 [7]. The Milgrom proposal is as follows: the acceleration is Newtonian in the strong gravity field but begins to deviate from it around a critical acceleration and converges to the weak field limit:

$$g_M = \sqrt{g_N a_0}. \tag{1}$$

Here $a_0$ is the Milgromian characteristic acceleration constant, $g_M$ denotes the Milgromian gravitational acceleration, and $g_N$ indicates the Newtonian one.
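To make the weak field limit concrete, the following minimal Python sketch evaluates $g_M = \sqrt{g_N a_0}$ for a point mass and the circular velocity it implies. The numerical value of $a_0$, the galaxy mass, and all function names are illustrative assumptions and are not taken from this paper.

```python
import math

G = 6.674e-11   # gravitational constant in m^3 kg^-1 s^-2
A0 = 1.2e-10    # m/s^2; commonly quoted value of Milgrom's constant (assumption)
KPC = 3.086e19  # one kiloparsec in metres


def newtonian_acceleration(mass_kg, radius_m):
    """Newtonian acceleration g_N = G M / r^2 of a point mass."""
    return G * mass_kg / radius_m ** 2


def deep_mond_acceleration(g_newton, a0=A0):
    """Weak field limit of MOND: g_M = sqrt(g_N * a0)."""
    return math.sqrt(g_newton * a0)


def circular_velocity(accel, radius_m):
    """Speed of a circular orbit with centripetal acceleration `accel`."""
    return math.sqrt(accel * radius_m)


if __name__ == "__main__":
    mass = 1.0e41  # kg, an illustrative galaxy-scale baryonic mass (assumption)
    for r_kpc in (20, 40, 80):
        r = r_kpc * KPC
        g_n = newtonian_acceleration(mass, r)
        g_m = deep_mond_acceleration(g_n)
        print(r_kpc, "kpc:",
              "Newtonian v =", circular_velocity(g_n, r),
              "deep-MOND v =", circular_velocity(g_m, r))
```

In this sketch the Newtonian velocity falls with radius, while the weak-field velocity equals $(G M a_0)^{1/4}$ at every radius, which is the flat rotation curve behaviour that MOND is designed to reproduce.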

In conventional Newtonian dynamics, we have the classical Poisson equation:

$$\nabla^2 \phi_N = 4\pi G \rho_b, \tag{2}$$

where $\phi_N$ is the Newtonian potential, $G$ is the gravitational constant, and $\rho_b$ is the density distribution of baryonic matter, including stars and gas. Linear equation (2) can be solved by a typical Poisson solver. However, describing MOND in this form leads to a nonlinear generalization of the Newtonian Poisson equation [8]:

$$\nabla \cdot \left[ \mu\!\left( \frac{|\nabla \Phi|}{a_0} \right) \nabla \Phi \right] = 4\pi G \rho_b, \tag{3}$$

where $\Phi$ is the distribution of the MOND potential and the $\mu$-function is an interpolating function, representing the MOND gravity in the transitional zone from Newtonian to the weak field limit, with $\mu(x) \to 1$ for $x \gg 1$ and $\mu(x) \to x$ for $x \ll 1$; its usually adopted form is referred to as (4).

Equation (3) is hard to solve due to its nonlinearity.

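Therefore, in the weak field limit, we have the so-called quasi-linear MOND (QUMOND) [1], as written in

$$\nabla^2 \Phi = \nabla \cdot \left[ \nu\!\left( \frac{|\nabla \phi_N|}{a_0} \right) \nabla \phi_N \right]. \tag{5}$$

Here $\phi_N$ indicates the Newtonian potential, $\Phi$ is the MOND potential, and $\nu$ is the QUMOND interpolating function.

For simplification, denote

$$\rho_{ph} = \frac{1}{4\pi G} \nabla \cdot \left\{ \left[ \nu\!\left( \frac{|\nabla \phi_N|}{a_0} \right) - 1 \right] \nabla \phi_N \right\}, \tag{6}$$

and (5) can be rewritten as

$$\nabla^2 \Phi = 4\pi G \left( \rho_b + \rho_{ph} \right), \tag{7}$$

where $\rho_b$ is the baryonic density distribution.

Analogous to dark matter, the so-called phantom dark matter (PDM) is introduced, and $\rho_{ph}$ can be interpreted as its density distribution [9]. The MOND potential equation (7) has a formulation similar to (2), and it can be regarded as a modification to the Newtonian potential. Both the MOND potential (7) and the Newtonian potential (2) can be solved by existing Poisson solvers.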
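The step from (5) to (7) can be spelled out in one line. The derivation below is a sketch that assumes the standard QUMOND form of (5), with $\nu$ the interpolating function, $\phi_N$ the Newtonian potential, $\rho_b$ the baryonic density, $\rho_{ph}$ the PDM density, and $\Phi$ the MOND potential; these symbol names are assumed here for illustration.

```latex
\begin{aligned}
\nabla^2 \Phi
  &= \nabla \cdot \left[ \nu\!\left( \tfrac{|\nabla \phi_N|}{a_0} \right) \nabla \phi_N \right] \\
  &= \nabla^2 \phi_N
     + \nabla \cdot \left\{ \left[ \nu\!\left( \tfrac{|\nabla \phi_N|}{a_0} \right) - 1 \right] \nabla \phi_N \right\} \\
  &= 4\pi G \rho_b + 4\pi G \rho_{ph}
   = 4\pi G \left( \rho_b + \rho_{ph} \right).
\end{aligned}
```

The second line splits off the Newtonian part, which equals $4\pi G \rho_b$ by the Poisson equation (2); the remainder is $4\pi G$ times the PDM density defined in (6). Since $\rho_{ph}$ depends only on $\phi_N$, no nonlinear equation has to be solved.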

##### 2.2. $N$-Body MOND Simulation

An $N$-body simulation studies the interactions of each pair of objects and thereby simulates the dynamic evolution of an astronomical system consisting of multiple celestial objects (denoted as $N$-body). An $N$-body MOND simulation calculates how particle properties in a galaxy change with time under MOND; namely, the interaction of each pair of objects obeys the MOND proposal. These properties include potential, velocity, and position, to name but a few. In this paper, we focus on the potential distribution $\Phi$ at a fixed time, with the known baryonic matter distribution $\rho_b$. By using the evaluated $\Phi$, the current acceleration can be worked out according to $\mathbf{a} = -\nabla \Phi$. Particle properties at the next time step can then be calculated. For an individual particle, the MOND potential calculation comprises three steps:

(1) with the known baryonic matter distribution $\rho_b$, calculating the Newtonian potential $\phi_N$ according to (2);

(2) calculating the phantom dark matter (PDM) distribution $\rho_{ph}$ with (6);

(3) solving the modified Poisson equation (7) to get the final MOND potential $\Phi$.

The $N$-body MOND simulation requires twice the potential computation of the typical Newtonian simulation, which challenges the performance of accelerators. Since the formulation of (2) is similar to that of (7), the first and third steps can be conducted by the same accelerator.
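The three steps can be illustrated with a small one-dimensional finite-difference sketch in Python. This is not the particle-mesh potential summation used by the accelerator in this paper; the 1D grid, the zero-potential boundaries, the units with $G = 1$, the choice $\nu(y) = 1/2 + \sqrt{1/4 + 1/y}$, and all function names are assumptions made only to show how one Poisson solver is reused for steps (1) and (3).

```python
import math


def solve_poisson_1d(rho, h, G=1.0):
    """Solve phi'' = 4*pi*G*rho on a uniform grid with phi = 0 just outside
    both ends, using the Thomas algorithm on the (1, -2, 1) stencil."""
    n = len(rho)
    rhs = [4.0 * math.pi * G * r * h * h for r in rho]
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = 1.0 / -2.0
    d[0] = rhs[0] / -2.0
    for i in range(1, n):
        denom = -2.0 - c[i - 1]
        c[i] = 1.0 / denom
        d[i] = (rhs[i] - d[i - 1]) / denom
    phi = [0.0] * n
    phi[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        phi[i] = d[i] - c[i] * phi[i + 1]
    return phi


def pdm_flux(grad, a0):
    """(nu(|grad|/a0) - 1) * grad with the assumed nu(y) = 1/2 + sqrt(1/4 + 1/y)."""
    if grad == 0.0:
        return 0.0
    y = abs(grad) / a0
    nu = 0.5 + math.sqrt(0.25 + 1.0 / y)
    return (nu - 1.0) * grad


def qumond_potential_1d(rho_b, h, a0, G=1.0):
    """Return (phi_N, rho_ph, Phi) for a baryonic density sampled on a 1D grid."""
    # Step (1): Newtonian potential from the baryonic density.
    phi_n = solve_poisson_1d(rho_b, h, G)
    # Step (2): PDM density as the discrete divergence of the face-centred flux.
    padded = [0.0] + phi_n + [0.0]
    flux = [pdm_flux((padded[i + 1] - padded[i]) / h, a0)
            for i in range(len(padded) - 1)]
    rho_ph = [(flux[i + 1] - flux[i]) / (4.0 * math.pi * G * h)
              for i in range(len(rho_b))]
    # Step (3): the same Poisson solver applied to baryons plus PDM.
    total = [b + p for b, p in zip(rho_b, rho_ph)]
    phi_m = solve_poisson_1d(total, h, G)
    return phi_n, rho_ph, phi_m


if __name__ == "__main__":
    n, h = 63, 1.0 / 64.0
    rho_b = [1.0 if 28 <= i < 36 else 0.0 for i in range(n)]
    phi_n, rho_ph, phi_m = qumond_potential_1d(rho_b, h, a0=10.0)
    print("deepest Newtonian potential:", min(phi_n))
    print("deepest MOND potential:     ", min(phi_m))
```

With a very small `a0` the PDM density becomes negligible and the result approaches the Newtonian potential, whereas a large `a0` places the whole grid in the weak field regime and deepens the potential well.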

#### 3. Motivation and Contribution

Table 1 provides the practical time cost of the MOND calculation on an Intel i5 processor; the calculation is conducted through the 3 steps described in Section 2. As the simulation scale $N$ increases, the computation time of the MOND simulation grows quadratically. Thus it is both crucial and necessary to accelerate $N$-body MOND simulations. Moreover, Table 1 shows that the 2nd step takes only a minor share of the time, while the bottleneck of the MOND simulation is the potential calculation, both Newtonian and MOND. This fact underpins our key design decision: acceleration is applied only to the potential calculations of steps (1) and (3), whereas the 2nd step is conducted by the common processor rather than a specific accelerator.
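The quadratic growth can be seen from a plain direct-summation loop. The following Python sketch is only an illustration of the pairwise potential sum, not the pipeline implemented in this work; the softening length, the units with $G = 1$, and the function name are assumptions.

```python
import math
import random


def direct_potentials(positions, masses, softening=1e-3, G=1.0):
    """Sum the pairwise potential -G*m_j/r_ij for every particle.

    Returns the list of potentials and the number of pair interactions
    evaluated, which is N*(N-1)/2 for N particles.
    """
    n = len(positions)
    phi = [0.0] * n
    pairs = 0
    for i in range(n):
        xi, yi, zi = positions[i]
        for j in range(i + 1, n):
            dx = positions[j][0] - xi
            dy = positions[j][1] - yi
            dz = positions[j][2] - zi
            # Softened distance avoids a division by zero for coincident particles.
            r = math.sqrt(dx * dx + dy * dy + dz * dz + softening * softening)
            phi[i] -= G * masses[j] / r
            phi[j] -= G * masses[i] / r
            pairs += 1
    return phi, pairs


if __name__ == "__main__":
    random.seed(0)
    for n in (100, 200, 400):
        pos = [(random.random(), random.random(), random.random()) for _ in range(n)]
        mass = [1.0 / n] * n
        _, pairs = direct_potentials(pos, mass)
        print(n, "particles ->", pairs, "pair interactions")
```

Doubling the particle count roughly quadruples the number of pair interactions, which is why the potential summation dominates the run time and is the natural target for hardware pipelines.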