Abstract
Simulation speed depends on code structures. Hence, it is crucial how to build a fast algorithm. We solve the Allen–Cahn equation by an explicit finite difference method, so it requires grid calculations implemented by many forloops in the simulation code. In terms of programming, many forloops make the simulation speed slow. We propose a model architecture containing a pad and a convolution operation on the Allen–Cahn equation for fast computation while maintaining accuracy. Also, the GPU operation is used to boost up the speed more. In this way, the simulation of other differential equations can be improved. In this paper, various numerical simulations are conducted to confirm that the Allen–Cahn equation follows motion by mean curvature and phase separation in twodimensional and threedimensional spaces. Finally, we demonstrate that our algorithm is much faster than an unoptimized code and the CPU operation.
1. Introduction
The Allen–Cahn (AC) equation is a reactiondiffusion equation composed of the reaction term and the diffusion term :where is the order parameter which is defined as the difference in concentration of the two components in a mixture; is double well potential energy function with minimum values at and 1, and its form is . is the thickness of the transition layer which is a small positive constant value. The AC equation is first introduced in a research on the phase separation of binary iron alloys [1]. The AC equation was studied and applied to various fields such as image inpainting [2–4], image segmentation [5, 6], and crystal growth [7–9]. Also, there are various methods for numerically solving the reactiondiffusion type equations. In [10–12], the authors use the fourthorder exponential time differencing Runge–Kutta method to solve reactiondiffusion type equation as Burgers–Huxley and Gray–Scott model, respectively. There is a study of solving equations using an adaptive method [13–15]. The authors of [16, 17] considered the Fourier spectral method to solve system equations.
With the development of computer hardware such as GPU (graphics processing unit) and memory cards, neural networks are applied in a wide range of research areas such as computer vision, natural language processing, and numerical analysis. As GPU operations outperform CPU (central processing unit) performance in multitasks and highdimensional problems, open source machine learning libraries such as Pytorch provide a variety of neural networks using GPU and are useful to build the architecture combined with neural networks and numerical methods. Thus, many research studies use machine learning libraries. Raissi et al. [18] proposed physicsinformed neural networks combined by multilayer perceptrons and numerical methods to solve nonlinear partial differential equations. Karumuri et al. [19] introduced a solverfree approach for stochastic partial differential equations, and Yang et al. [20] proposed a Bayesian physicsinformed neural network. In this paper, we propose a structure using padding and convolution operation for the GPU calculation of the AC equation and demonstrate the validity of the proposed structure by verifying the result with the Python code that has the same mathematical meaning.
This paper is organized as follows. In Section 2, we present an explicit finite difference method to solve the AC equation which is implemented by CPU and GPU algorithms. In Section 3, the numerical simulations including a motion by mean curvature, phase separation, and temporal evolutions of various initial shapes are introduced as well as the runtime results between CPU and GPU operations are compared. Finally, conclusions are drawn in Section 4.
2. Numerical Solutions
In this section, we present an explicit finite difference method to solve AC equation (1). Also, we give an explanation of each algorithm for CPU and GPU computing. For simplicity of expression, we describe a numerical scheme for the AC equation in two dimensions (2D), and the definition in the threedimensional (3D) space can be easily extended and considered. A computational domain is defined using a uniform grid of size and for , is the set of cell centers. Here, are mesh sizes on computational domain . For the definition of the boundary condition, we define the extended computational domain as follows:for and . Let be approximations of , where is the temporal step size, is a final time, and is the total number of time steps. The boundary condition is zero Neumann boundary condition:
We define the thickness of the transition layer in equation (1) as [21]:where is the number of grids representing the thickness.
2.1. Numerical Solutions on CPU (Baseline)
The AC equation (1) is discretized using the explicit finite difference method aswhere and . The baseline algorithm is implemented in equation (5) using Numpy (CPU array).
2.2. Numerical Solutions on GPU (Pytorch)
For GPU computing, the AC equation (1) can be expressed using Pytorch, and the algorithm can be represented as Figure 1.where is a Pytorch model. The main Algorithm 1 is a set of steps in equation (6) using Pytorch.

In this Algorithm 1, nn.ReplicationPad2d (or nn.ReplicationPad3d) is applied to satisfy the boundary condition by padding the input using replication of the input boundary. Also, the convolutional operator F.conv2d (or F.conv3d) with the 2^{nd} order differencing filter is used to calculate the diffusion term . The model can choose an operation mode between GPU and CPU by to (device) in Algorithm 1. If device = cuda:0, the model is implemented on GPU, otherwise on CPU. All the codes are available from the first author’s GitHub web page (https://github.com/kimyde/gpuallencahn) and the corresponding author’s web page (https://sites.google.com/view/yhchoi/code).
3. Numerical Experiments
In this section, we perform the following numerical tests in 2D and 3D: effect, the motion by mean curvature effect with various initial shapes (circle (sphere in 3D), dumbbell, star, torus, maze), and phase separation. In the effect test, we compare the numerical solutions to analytic values to find a proper value. We simulate a phenomenon that follows motion by mean curvature in various initial shapes and phase separation with random initial condition. We perform the simulation on the following specifications: Intel (R) Core (TM) i99900K CPU @3.60 GHz, 32 GB RAM/NVIDIA GeForce RTX 2080 Super.
3.1. Convergence Tests
We perform convergence tests for time and space. We define the discrete norm aswhere and is reference solution. We use the following initial condition:with zero Neumann boundary condition on the computational domain .
First, to show the convergence rate with respect to time, the space step size (i.e., ) is fixed, and we use the reference solution with and . Then, the test proceeds to time for time steps , and . The temporal convergence rate is defined as . Table 1 shows that our proposed algorithm has firstorder accuracy for time.
Next, we show the spatial accuracy of our proposed algorithm. To show the spatial convergence rate, the time step was fixed to for each mesh size , and 0.0025. For the reference solution, the time step was set to for each mesh size. The spatial convergence rate is defined as . Table 2 shows that our proposed algorithm has between second and thirdorder for the spatial accuracy.
3.2. Energy Dissipation and Maximum Principle
AC equation (1) follows the energy dissipation law, and the equation is derived from
is decreasing with time
To check a discrete energy, we can rewrite equation (9) as . Figure 2 shows mesh plots and a discrete energy graph when time evolution is performed with random initial condition. The tested initial condition is used as equation (17a) and other parameters are used N_{x} = N_{y} = 200, space step size h = 1/N_{x}, time step size Δt = 0.1h^{2}, and thickness of transition layer ϵ_{10} on the computational domain Ω = (0, 1)^{2}.
(a)
(b)
(c)
(d)
(e)
(f)
(a)
(b)
(c)
(d)
(e)
Also, we can obtain that the AC equation does not conserve the initial mass.with zero Neumann boundary condition.
3.3. Initial Conditions
We consider the 2D and 3D initial conditions introduced in this section: to check the effect, we measure the circle’s (sphere in 3D) radius that changes with time and take the initial conditions:where is the initial radius of a circle (sphere in 3D).
In various tests to check the motion by the mean curvature, the initial conditions for the circle and sphere refer to equations (12a) and (12b), respectively. A dumbbell is constructed by following the initial conditions (o/w: otherwise)where is the initial radius of both sides of dumbbell’s circle (sphere in 3D) and for simplicity of expression, and .
The initial conditions of a star shape are defined aswhereand we use different domain sizes in 2D and 3D, so the center of the star depends on the dimensions.
And, a torus shape is given bywhere and are the radius of major (outside) and minor (inside) circles, respectively. And, for simplicity of expression, .
The initial conditions of a maze shape are complicated to describe its equation, so refer to the code (https://github.com/kimyde/gpuallencahn, https://sites.google.com/view/yhchoi/code).
And, the last initial condition is random for confirming the phase separation of the AC equation:
Here, the function rand has a random value between and 1.
3.4. Simulations in the 2Dimensional Space
Unless otherwise stated, we use the following parameters: mesh size , space step size , time step size , and computational domain . We should find a proper thickness of the transition layer as defined in equation (4). First, we find an appropriate by comparing the numerical solution with the exact solution for the radius that decreases due to the motion by mean curvature in the circle initial condition in equation (12a).
Figures 3(a)–3(d) show a circle shrinking with motion by mean curvature based on . The exact solution of the radius decreasing with time evolution can be calculated [22]:where is the initial radius of the circle, is the dimension, and is time. We simulate various until the final time . As shown in Figure 3, when is used, it is found that the exact solution and the numerical solution are most similar in the experiment. Therefore, the other tests are performed using except for the case of changing the grid in 2D.
Figure 4 shows the temporal evolution of the dumbbell shape, and the initial condition is shown in equation (13a). For resolution, we use and on the computational domain . The final time of the simulation is 0.0094 and . In Figure 4(e), the changing direction by the motion by mean curvature is indicated by arrows.
(a)
(b)
(c)
(d)
(e)
Figure 5 shows the evolution of the star shape created by equation (14a). The parameters are used as mentioned at the beginning of this section and . As shown in Figure 5, the tips of the star move inward and the gaps between the tips move outward. When it changes to the shape of a circle, the change of the radius can be predicted as shown in Figure 3.
(a)
(b)
(c)
(d)
Figure 6 shows the evolution of the torus shape of equation (16a) with , , and . Because the inner circle has a larger curvature, it shrinks faster than the outer circle, and after the inner circle disappears, the change of radius over time can be measured as shown in Figure 3.
(a)
(b)
(c)
(d)
Figure 7 shows the evolution of a maze shape. We use , , and . As shown in Figure 7, we obtain the results of shrinking while maintaining its initial shape.
(a)
(b)
(c)
(d)
The last simulation in 2D is phase separation with a random initial condition equation (17a). In Figure 8, starting with random values with 0.1 amplitude, but over time, phase separation occurs with values to 1.
(a)
(b)
(c)
(d)
According to the results in Table 3, the speed gap between CPU and GPU is significant. In 2D, it is up to 251.6 times the difference between the Python:CPU and the Pytorch:GPU codes. Also, Pytorch:GPU tensors make the model up to 4.73 times faster than Pytorch:CPU tensors in the same code.
3.5. Simulations in the 3Dimensional Space
Threedimensional simulations can be considered as an extension of twodimensional tests. Unless otherwise stated, we use the following parameters: mesh size , space step size , time step size , and computational domain . As in the 2D simulation, we find an appropriate by comparing the numerical solution with the exact solution (using in equation (18)) for the radius of the sphere initial condition in equation (12b).
Figures 9(a)–9(d) show the time evolution results for , and (e) presents the comparison of the numerical and exact solutions of radius for various epsilons. As shown in Figure 9, when is used, it is found that the exact solution and the numerical solution are most similar. Therefore, the other tests are performed using .
(a)
(b)
(c)
(d)
(e)
Figure 10 shows the temporal evolution of the dumbbell shape, and the initial condition is shown in equation (13b). For resolution, we use and on the computational domain . The final time of the simulation is and . The tendency of the 3D dumbbell motion is different from the result of 2D dumbbell. The reason is that, in the initial shape in Figure 10(a), the radius of the handle is much smaller than the radius of the spheres at both ends. Therefore, the handle part shrinks faster than the end of spheres and breaks as in Figure 10.
(a)
(b)
(c)
(d)
Figure 11 shows the evolution of the star shape shown in equation (14b) on the computational domain . The parameters are used as mentioned at the beginning of this section and . As shown in Figure 11, similar to the 2D result, the tips of the star move inward, and the gaps between the tips move outward.
(a)
(b)
(c)
(d)
Figure 12 shows the evolution of the torus shape with initial condition in equation (16b) using and and on the computational domain . Contrary to the 2D result, the inner circle becomes larger because the radius of the minor circle exists. Therefore, the radius of the major circle and the minor circle each has different curvature. Since the radius of the minor circle is smaller than major circle, the mean curvature drives the motion into the inside of the torus as shown in Figure 12.
(a)
(b)
(c)
(d)
Figure 13 shows the evolution of a maze shape. The initial condition is described in code (https://github.com/kimyde/gpuallencahn, https://sites.google.com/view/yhchoi/code). In this simulation, we use and computational domain . As shown in Figure 13, we obtain the results of shrinking while maintaining its initial shape in Figure 13(b), and then merges and shrinks in Figures 13(c) and 13(d). If we use different , it can shrink while preserving its initial shape.
(a)
(b)
(c)
(d)
The last simulation in 3D is phase separation with random initial condition in equation (17b). In Figure 14, starting with random values with 0.1 amplitude, but over time, phase separation occurs with values to 1.
(a)
(b)
(c)
(d)
To estimate the error of the CPU and GPU codes, we use the following defined :where is the total number of iterations, is the average of elements in an array , and means evolution time (i.e., , ). We show the error results obtained by equation (19) to check the difference between the numerical results of CPU:Python (baseline) and GPU:Pytorch. In Table 4, all the errors for any cases are less than .
According to the results in Table 5, the speed gap between CPU and GPU operations is significant. Of course, it will be faster by performing GPU calculations, but we proposed a structure using padding and convolution operation in performing GPU calculations on AC equations. And, the results are demonstrated by verifying the results (Table 4) with Python code which has the same mathematical meaning. In 3D, the GPU performance is much more overwhelming than the CPUs. For instance, the GPU code is 4766 times faster than the Python code in the torus problem. Also, GPU tensors make the model up to 76 times faster than CPU tensors in the same code. The values in parentheses describe how many times the difference is based on GPU:Pytorch time for each test.
4. Conclusions
In this paper, we proposed a structure using padding and convolution operation for the GPU calculation of the Allen–Cahn equation. We increased the simulation speed and demonstrated the validity of the proposed structure by verifying the result with the Python code that has the same mathematical meaning. We solved the Allen–Cahn equation by the explicit finite difference method and compared the runtime results between CPU and GPU algorithms. The errors of CPU:Python and GPU:Pytorch are less than for the given initial conditions. Also, we showed that our GPU code is up to 251.6 and 4765.81 times faster than the CPU codes in 2D and 3D, respectively. By showing these results, accuracy and efficiency have been demonstrated. Various numerical simulations were presented to confirm that the Allen–Cahn equation follows the motion by mean curvature and phase separation in the 2D and 3D space. In this way, we can build a fast algorithm for any differential equations using the finite difference method by efficient programmatic code structures.
Data Availability
All the codes are available from the first author's GitHub web page (https://github.com/kimyde/gpuallencahn) and the corresponding author's web page (https://sites.google.com/view/yhchoi/code).
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
Acknowledgments
This research was supported by the Daegu University Research Grant, 2019 (to Y. Choi).