Abstract

Solving nonlinear partial differential equations numerically in an efficient and convenient way has long been a difficult and meaningful problem. In this paper, the data-driven quasiperiodic wave, periodic wave, and soliton solutions of the KdV-mKdV equation are simulated by multilayer physics-informed neural networks (PINNs) and compared with the exact solutions obtained by the generalized Jacobi elliptic function method. Firstly, the different types of solitary wave solutions are used as initial data to train the PINNs. At the same time, different PINNs are applied to learn the same initial data by selecting different numbers of sampled initial points, sampled residual collocation points, network layers, and neurons per hidden layer, respectively. The results show that the PINNs reconstruct well the dynamical behaviors of the quasiperiodic wave, periodic wave, and soliton solutions of the KdV-mKdV equation, which provides an effective way to simulate the solutions of nonlinear partial differential equations via a deep learning method.

1. Introduction

In recent decades, nonlinear partial differential equations (PDEs) have received more and more attention in optical fiber communication, condensed matter physics, plasma physics, and fluid mechanics [1–3]. As one type of PDEs, integrable equations attract particular attention for their good properties; for example, the KdV-mKdV equation is widely used in nonlinear lattices, wave propagation of bound particles, acoustic waves, and thermal pulses [4–9]. In addition, the KdV-mKdV equation, or the Gardner equation, tends to the classical Korteweg–de Vries (KdV) equation or the mKdV equation in the limits of small or large waves, respectively. Via the complex Miura transformation, the KdV-mKdV equation can be transformed into the KdV equation, which provides a model for describing surface and internal waves in oceans and the Rossby waves and can also be used to describe ion acoustic waves in plasmas and dust acoustic solitary structures in magnetized dusty plasmas [10–15]. The KdV equation is usually used to describe the balance between nonlinear wave steepening and linear wave dispersion.

Via linear transformations of the dependent function, the KdV-mKdV equation can likewise be related to the mKdV equation, which is used to describe acoustic waves in plasmas, the propagation of an elastic quasiplane wave in a lattice, electromagnetic waves in size-quantized films, and internal ocean waves under certain stratification [16–19]. Because many PDEs arising in physical applications contain more than two nonlinear terms or have a cubic lowest-order nonlinearity [4, 5, 20, 21], the exact and numerical solutions of the KdV-mKdV equation can depict various natural phenomena more visually and help us understand their physical mechanisms.

Recently, with the explosive growth of available data and computing resources, advances in machine learning and data analytics have yielded transformative results across diverse scientific disciplines, including image recognition, cognitive science, and genomics [22–25]. Machine learning techniques demonstrate impressive results on a range of highly complex tasks, especially where an accurate mathematical representation of the problem cannot be obtained [26, 27]. Therefore, the application of such techniques to solving PDEs has become a new subfield of machine learning. Physics-informed neural networks (PINNs) were introduced in [28, 29] and have become one of the most popular deep learning methods. PINNs employ a neural network as a solution surrogate and seek the best neural network guided by data and by physical laws expressed as PDEs. A series of studies have shown the effectiveness of PINNs for fractional PDEs, stochastic differential equations, integrable systems, biomedical problems, and fluid mechanics [30–37]. Therefore, this deep learning method with underlying physical constraints is introduced in this paper to obtain the data-driven quasiperiodic wave, periodic wave, and soliton solutions of the KdV-mKdV equation.

The rest of this paper is organized as follows: Section 2 introduces the PINNs and briefly presents the problem setup. Section 3 presents the data-driven quasiperiodic wave, periodic wave, and soliton solutions of the KdV-mKdV equation obtained by the PINNs and discusses the effects of different numbers of sampled initial points, sampled residual collocation points, network layers, and neurons per hidden layer. The conclusion is given in the last section.

2. PINNs Method

In this section, the general architecture of PINNs is introduced and applied to the KdV-mKdV equation. Following notation similar to [33], the general form of the PDEs that the PINNs can approximate is

$$u_t + \mathcal{N}[u; \lambda] = 0, \quad x \in \Omega, \; t \in [0, T],$$

where $u(t, x)$ is the latent solution and $\mathcal{N}[\cdot; \lambda]$ is a nonlinear operator connecting the state variable $u$ with the system parameters $\lambda$. The term $t$ denotes time, and $x$ denotes the system input. The domain $\Omega$ can be bounded based on prior knowledge of the dynamical system, and $[0, T]$ is the time interval of system evolution. The model parameters $\lambda$ can be known constants or unknown parameters to be learned. In case $\lambda$ is unknown, approximating the solution becomes a problem of system identification, where we seek the parameters for which the equation above is satisfied. In order to describe the physical laws of the dynamical system, the physics-informed neural network $f(t, x)$ is defined as

$$f := u_t + \mathcal{N}[u; \lambda].$$

Note that if the system parameters $\lambda$ are known, the nonlinear operator simplifies to $\mathcal{N}[u]$. A neural network is used to predict $u(t, x)$ based on the inputs $t$ and $x$. To determine $f(t, x)$, automatic differentiation [38] of the components of the neural network $u(t, x)$ is used; based on this, the required derivatives of $u(t, x)$ with respect to time $t$ and system input $x$ are computed. As a result, the neural network $f(t, x)$ has the same parameters as the neural network $u(t, x)$ but different activation functions. The shared parameters of these two neural networks are optimized by minimizing the loss function

$$MSE = MSE_u + MSE_f,$$

with

$$MSE_u = \frac{1}{N_u}\sum_{i=1}^{N_u}\left|u(t_u^i, x_u^i) - u^i\right|^2, \qquad MSE_f = \frac{1}{N_f}\sum_{i=1}^{N_f}\left|f(t_f^i, x_f^i)\right|^2,$$

where $MSE_u$ denotes the mean squared error corresponding to the initial and boundary data, $N_u$ is the total number of training data, $MSE_f$ denotes the mean squared error at a finite set of collocation points, and $N_f$ is the total number of collocation points. The numbers of collocation points and training data influence the prediction accuracy and the computational time needed to optimize the loss function. The error $MSE_u$ enforces the initial and boundary conditions, and $MSE_f$ enforces the physics of the dynamical system imposed by the governing equation, i.e., it penalizes deviations from the prescribed physical law. Given a training data set and known system parameters $\lambda$, the parameters (weights and biases) of the neural networks are obtained by minimizing this loss. If the parameters $\lambda$ are unknown, the same objective is trained, but the system parameters are regarded as additional trainable variables.
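For concreteness, the following is a minimal sketch (not the authors' exact code) of the two coupled networks $u(t, x)$ and $f(t, x)$ and the loss above, assuming the TensorFlow 1.15 API used in this paper; the operator `N` is passed in as a callable, since its concrete form depends on the PDE under study.

```python
# Minimal sketch of the coupled networks u(t, x) and f(t, x) and the loss
# MSE_u + MSE_f, assuming the TensorFlow 1.15 (graph-mode) API.
import tensorflow as tf

def init_net(layers):
    # One weight matrix and bias vector per layer, e.g. layers = [2, 60, ..., 1].
    weights, biases = [], []
    for n_in, n_out in zip(layers[:-1], layers[1:]):
        weights.append(tf.Variable(tf.random.truncated_normal([n_in, n_out], stddev=0.1)))
        biases.append(tf.Variable(tf.zeros([1, n_out])))
    return weights, biases

def u_net(t, x, weights, biases):
    # Surrogate u(t, x): fully connected net with tanh hidden activations.
    H = tf.concat([t, x], 1)
    for W, b in zip(weights[:-1], biases[:-1]):
        H = tf.tanh(tf.matmul(H, W) + b)
    return tf.matmul(H, weights[-1]) + biases[-1]

def f_net(t, x, weights, biases, N):
    # Residual f(t, x) = u_t + N[u]; the derivatives come from automatic
    # differentiation, so f shares the parameters of u.
    u = u_net(t, x, weights, biases)
    u_t = tf.gradients(u, t)[0]
    return u_t + N(u, t, x)

def loss_fn(u_pred, u_true, f_pred):
    # MSE_u fits the initial-boundary data; MSE_f penalizes the PDE residual
    # at the collocation points.
    mse_u = tf.reduce_mean(tf.square(u_true - u_pred))
    mse_f = tf.reduce_mean(tf.square(f_pred))
    return mse_u + mse_f
```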

In this paper, the L-BFGS algorithm is used to optimize all loss functions. It is a full-batch gradient-based optimization algorithm based on a quasi-Newton method [39]. All codes in this article are based on Python 3.7 and TensorFlow 1.15, and all numerical examples reported here are run on a Lenovo Legion Y7000P 2020H computer with a 2.60 GHz 6-core Intel(R) Core(TM) i7-10750H CPU and 16 GB of memory. The hidden layers of the neural network use hyperbolic tangent activation functions. The codes are mainly adapted from https://github.com/maziarraissi/PINNs.
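As a sketch under the same assumptions, the full-batch L-BFGS training can be set up via tf.contrib.opt.ScipyOptimizerInterface (available in TensorFlow 1.x and used in the reference PINNs code); the loss tensor and the feed dictionary of training data are assumed to have been built as above.

```python
# Sketch of the full-batch L-BFGS training step; `loss` and `train_dict` are
# assumed to be the loss tensor and the feed dictionary of initial-boundary
# and collocation data constructed as in Section 2.
import numpy as np
import tensorflow as tf

optimizer = tf.contrib.opt.ScipyOptimizerInterface(
    loss,
    method='L-BFGS-B',                 # SciPy's quasi-Newton optimizer
    options={'maxiter': 50000,
             'maxfun': 50000,
             'maxcor': 50,
             'maxls': 50,
             'ftol': 1.0 * np.finfo(float).eps})

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # L-BFGS runs full-batch until convergence or maxiter; no minibatching.
    optimizer.minimize(sess, feed_dict=train_dict)
```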

3. Data-Driven Solutions to the KdV-mKdV Equation

In the following, we consider the (1 + 1)-dimensional KdV-mKdV equation together with initial-boundary value conditions, where a is a constant and the initial and boundary data are prescribed by given functions.

Correspondingly, the physics-informed network f(t, x) is defined by applying the left-hand side of the KdV-mKdV equation to the network surrogate u(t, x).
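Since the display formula for f was lost in this version, the following hedged sketch only illustrates how such a residual is typically coded, assuming for illustration a common combined KdV-mKdV (Gardner-type) form $u_t + a\,u u_x + b\,u^2 u_x + u_{xxx} = 0$; the coefficients of the equation actually studied should be substituted.

```python
# Hypothetical residual for an assumed combined KdV-mKdV (Gardner-type) form
#   f(t, x) = u_t + a*u*u_x + b*u^2*u_x + u_xxx,
# written against the u_net of Section 2. The coefficients a, b are
# illustrative placeholders, not the paper's exact equation.
def f_net_kdv_mkdv(t, x, weights, biases, a=1.0, b=1.0):
    u = u_net(t, x, weights, biases)
    u_t = tf.gradients(u, t)[0]
    u_x = tf.gradients(u, x)[0]
    u_xx = tf.gradients(u_x, x)[0]
    u_xxx = tf.gradients(u_xx, x)[0]
    return u_t + a * u * u_x + b * u**2 * u_x + u_xxx
```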

The quasiperiodic wave, periodic wave, and soliton solutions of the KdV-mKdV equation have been obtained by many different methods, such as the variational iteration method [40], the tanh-function method [41], the Jacobi elliptic function method [42–44], and so on. Here, the quasiperiodic wave, periodic wave, and soliton solutions are simulated by using the PINNs and compared with the known exact solutions, so as to demonstrate the effectiveness of solving for the numerical solution u(t, x) by neural networks.

3.1. Data-Driven Quasiperiodic Wave Solution

Firstly, we introduce the quasiperiodic wave solution, which has been derived by the generalized Jacobi elliptic function method in [44]. It is expressed in terms of cn(·), the Jacobi elliptic cosine function with modulus k.

In order to numerically construct the data-driven quasiperiodic wave solution of the KdV-mKdV equation, the quasiperiodic wave solution is reduced to the case k = 0.9. The corresponding initial condition is obtained by substituting the initial time into this reduced solution, with x and t taken in [0, 0.5] and [−10, 10], respectively.
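Since the explicit solution formula was lost in this version, the sketch below only illustrates how the Jacobi elliptic cosine entering the initial snapshot can be evaluated in Python; the phase and amplitude are placeholders for the expressions given in [44].

```python
# Evaluating cn(z, k) for the initial snapshot with SciPy; note that
# scipy.special.ellipj takes the parameter m = k**2, not the modulus k.
# The phase `z` and the amplitude `u0` are placeholders for the explicit
# expressions of the quasiperiodic solution in [44].
import numpy as np
from scipy.special import ellipj

k = 0.9
x = np.linspace(0.0, 0.5, 51)   # spatial grid used in Section 3.1
z = x                           # placeholder phase z(t0, x)
sn, cn, dn, ph = ellipj(z, k**2)
u0 = cn                         # placeholder amplitude scaling
```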

The traditional finite difference method on even grids in MATLAB with the initial data above is adopted to generate the reference data. Specifically, by dividing the space interval [0, 0.5] into 51 points and the time interval [−10, 10] into 401 points, the quasiperiodic wave solution u(t, x) is discretized into 401 snapshots accordingly. We subsample a small training dataset containing initial-boundary subsets by randomly extracting N_u points from the original initial-boundary data, together with N_f collocation points generated by Latin hypercube sampling (LHS) [45]. After giving the dataset of initial and boundary points, the latent quasiperiodic wave solution u(t, x) is successfully learned by using Python and TensorFlow [46] with 4 hidden layers and 60 neurons per hidden layer, tuning all learnable parameters of the neural network by minimizing the loss function. The model achieves a relative error of 7.880803e−03 in about 1441 seconds, and the number of iterations is 8080.
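A minimal sketch of this subsampling step follows, in the style of the reference PINNs code (pyDOE's lhs); here X_u_all and u_all are assumed to be the stacked initial-boundary coordinates and values from the MATLAB grid, and the sizes N_u and N_f are illustrative, since the paper's exact values were lost in this version.

```python
# Sketch: N_u random initial-boundary points from the 51x401 reference grid
# plus N_f collocation points from Latin hypercube sampling.
# X_u_all / u_all are assumed precomputed; N_u, N_f are illustrative.
import numpy as np
from pyDOE import lhs

N_u, N_f = 100, 10000
lb = np.array([-10.0, 0.0])    # lower bounds of (t, x)
ub = np.array([10.0, 0.5])     # upper bounds of (t, x)

idx = np.random.choice(X_u_all.shape[0], N_u, replace=False)
X_u_train, u_train = X_u_all[idx, :], u_all[idx, :]  # initial-boundary data
X_f_train = lb + (ub - lb) * lhs(2, N_f)             # N_f collocation points
```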

Figure 1(a) shows the density diagrams of the quasiperiodic wave solution and clearly compares the exact solution with the learned spatiotemporal solution. Comparisons at the different spatial locations x = 0.1, 0.25, and 0.4 are presented in the bottom panel of Figure 1(a). Figure 1(b) presents the error diagram of the difference between the exact and learned quasiperiodic wave solutions. From Figure 1(b), it is visible that the error between the numerical solution and the exact solution is very small. The error is mainly concentrated in the area where the solitary wave is generated; that is to say, the oscillation has a certain influence on the PINNs. The three-dimensional surface of the predicted solution and the loss curve over the iterations are shown in Figures 1(c) and 1(d). From Figure 1(d), it is obvious that there are noticeable fluctuations in the training only at the start. The loss curve is very smooth once the number of iterations exceeds 400, which demonstrates the effectiveness and stability of the PINNs.

In addition, based on the same initial and boundary values of the quasiperiodic wave solution with N_u and N_f fixed, the control variable method is used to study the influence of different numbers of network layers and neurons per hidden layer on the learned quasiperiodic wave dynamics of the KdV-mKdV equation; a sketch of this study is given after this paragraph. The relative errors for different numbers of network layers and neurons per hidden layer are given in Table 1. The relative errors with 4 network layers and 40 neurons per hidden layer for different numbers N_u of subsampled initial-boundary points and N_f of collocation points are shown in Table 2. From the data in Table 1, it can be seen that the numbers of network layers and single-layer neurons both influence the relative error, though with no obvious regularity; the influence of the number of single-layer neurons is greater. To sum up, it is evident that the network layers and the single-layer neurons jointly determine the relative error to some extent. For the same training dataset, Table 2 shows that the influence of N_u on the relative error of the network is not obvious, which also indicates that the network model with physical constraints can uncover accurate predicted solutions with less initial-boundary data and relatively many sampled collocation points.
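The sketch below illustrates the control-variable study behind Tables 1 and 2: one model is trained per (layers, neurons) configuration and the relative L2 error is recorded. Here train_pinn is a hypothetical wrapper around the Section 2 training pipeline, and the grids of values are illustrative.

```python
# Sketch of the control-variable study: train one model per configuration
# and record the relative L2 error. `train_pinn`, the data arrays, and the
# grids of values are hypothetical placeholders.
import numpy as np

def relative_l2(u_pred, u_exact):
    return np.linalg.norm(u_exact - u_pred, 2) / np.linalg.norm(u_exact, 2)

for n_layers in (2, 4, 6, 8):
    for n_neurons in (20, 40, 60):
        layers = [2] + [n_neurons] * n_layers + [1]   # inputs (t, x) -> u
        u_pred = train_pinn(layers, X_u_train, u_train, X_f_train)
        print(n_layers, n_neurons, relative_l2(u_pred, u_exact))
```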

3.2. Data-Driven Periodic Wave Solution

In this section, we study the periodic wave solution obtained when k = 0.6 in the quasiperiodic wave solution of Section 3.1. Then, x and t are taken in [0, 0.5] and [−10, 10], respectively. The corresponding initial condition is obtained by substituting the initial time into this periodic wave solution.

The periodic wave solution is simulated numerically by using the same data generation and sampling method as for the quasiperiodic wave solution. Then, N_u initial-boundary points and N_f collocation points are sampled randomly from the original data. After training by Python and TensorFlow with 8 hidden layers and 20 neurons per hidden layer, the model achieves a relative error of 3.289852e−03 in about 873 seconds, and the number of iterations is 7392.

Similar to Figure 1, Figure 2 shows the density diagrams, the error diagram of the difference between the exact and learned periodic wave solutions, the three-dimensional figure, and the loss curve, respectively. In addition, the relative errors for different numbers of network layers and neurons per hidden layer are given in Table 3. At the same time, the relative errors for different numbers N_u and N_f in the case of 4 network layers and 40 neurons per hidden layer are shown in Table 4. Comparing Table 3 with Table 1, it is found that the influence of the number of network layers and of the single-layer neurons on the relative error is exactly the opposite. Obviously, combining Figure 1(b) with Tables 1 and 2, it is evident that, due to the simpler structure of the periodic solution, its relative error is significantly smaller. At the same time, from Table 4, it is obvious that N_u and N_f have a certain influence on the relative error of the network, but the regularity is not obvious. From Figure 2(d), it is visible that there are some obvious fluctuations at the start and end of training, but they have little effect on the overall training, so the PINNs are still effective and stable.

3.3. Data-Driven Soliton Solution

Similar to Section 3.2, the soliton solution is obtained from the quasiperiodic wave solution when k = 1, in which case the Jacobi elliptic cosine cn degenerates to the hyperbolic secant sech. Then, the corresponding initial condition is obtained by taking x and t in [0, 0.5] and [−10, 10], respectively.
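As a quick numerical check of this degeneration (a standard property of Jacobi elliptic functions, not taken from the paper), cn(z, k) can be compared against sech(z) at k = 1:

```python
# Numerical check that cn(z, k) degenerates to sech(z) when k = 1, which is
# why taking k = 1 in the quasiperiodic solution yields the soliton.
import numpy as np
from scipy.special import ellipj

z = np.linspace(-5.0, 5.0, 11)
_, cn, _, _ = ellipj(z, 1.0)                  # parameter m = k**2 = 1
print(np.max(np.abs(cn - 1.0 / np.cosh(z))))  # ~1e-16: cn -> sech at k = 1
```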

The same data generation and sampling method as in Section 3.1 is used to simulate the soliton solution numerically. N_u initial-boundary points and N_f collocation points are sampled randomly from the original data, which builds up a small training dataset. After giving the dataset of initial and boundary points, the latent soliton solution u(t, x) is successfully learned by Python and TensorFlow with 4 hidden layers and 40 neurons per hidden layer, tuning all learnable parameters of the neural network by minimizing the loss function. The model achieves a relative error of 3.070442e−03 in about 82 seconds, and the number of iterations is 4119.

Compared with the abovementioned figures and tables, the relative error of the soliton solution in Table 5 is noticeably smaller than that of the quasiperiodic wave solution and comparable to that of the periodic wave solution (3.070442e−03 versus 3.289852e−03). Similarly, the error is mainly concentrated in the place where the soliton is generated. At the same time, the influence of the numbers of network layers and single-layer neurons on the relative error in Table 6 is similar to that for the periodic wave solution. From Figure 3(d), it is visible that there are obvious fluctuations only when the number of iterations is between 0 and 300, which again demonstrates the effectiveness and stability of the PINNs.

4. Conclusion

In this paper, the data-driven quasiperiodic wave, periodic wave, and soliton solutions of the KdV-mKdV equation are obtained by the PINNs. Specifically, we discussed the influence of different numbers of network layers and neurons per hidden layer on the data-driven solutions of the KdV-mKdV equation. It is visible that the network layers and the single-layer neurons jointly determine the relative error to some extent, and the error is mainly concentrated in the place where the soliton is generated. At the same time, the applicability of the PINNs differs according to the structure of the solutions; in other words, the simpler and more regular the structure of the solution, the better the PINNs perform. Remarkably, these results show that the PINNs can accurately recover different dynamical behaviors and obtain data-driven solutions quickly and efficiently for some nonlinear partial differential equations that are hard to solve by traditional methods. Moreover, due to the physical constraints, the network is trained with just a few data and has better physical interpretability.

The PINNs have produced a series of results on various problems in the interdisciplinary fields of applied mathematics and computational science, which opens a new path for using deep learning to simulate unknown solutions and, correspondingly, to discover parametric equations. In this paper, we did not discuss the influence of noise on the neural network model, more complex boundary conditions, or additional sampling methods. In future work, we will focus on these problems to make the PINNs more universal and general.

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

The authors thank M. Raissi, P. Perdikaris, and G. E. Karniadakis for sharing the code for PINNs (https://github.com/maziarraissi/PINNs). In addition, the authors are very grateful to Professor Y. Chen for his report and helpful discussions on PINNs at Shandong University of Science and Technology (http://export.arxiv.org/pdf/2011.04949). This work was supported by the National Natural Science Foundation of China (Grant nos. 11975143, 12105161, and 61602188), the Natural Science Foundation of Shandong Province (Grant no. ZR2019QD018), the CAS Key Laboratory of Science and Technology on Operational Oceanography (Grant no. OOST2021-05), and the Scientific Research Foundation of Shandong University of Science and Technology for Recruited Talents (Grant nos. 2017RCJJ068 and 2017RCJJ069).