#### Abstract

One can reformulate chemical master equations of the stochastic reaction network into a partial differential equation (PDE) of a probability generating function (PGF). In this paper, we present two improvements in such PGF-PDE approach, based on perturbation and double-truncation, respectively. The stiff system that involves fast and slow reactions together often requires high computational cost. By applying the perturbation method to PGF-PDEs, we expand the equation in terms of a small reaction rate which is often responsible for such stiffness of the system. Also by doubly truncating, we dump relatively small terms and reduce the computational load significantly at each time step. The terms corresponding to rare events are sieved out through truncations of Taylor expansion. It is shown through numerical examples of enzyme kinetics, transition model, and Brusselator model that the suggested method is accurate and efficient for approximation of the state probabilities.

#### 1. Introduction

Stochastic chemical kinetics arises mostly in the modeling of chemical reactions in relatively small biochemicalbhlt systems such as gene regulatory networks, cell systems, and protein folding models [1]. The governing equation, chemical master equation (CME), describes the probability distribution of the system under the Markov assumption aswhere , each denotes the molecular number of th species at time , is the propensity function or probability intensity for the th reaction, and denotes the th column of the stoichiometric matrix of which th entry is the change in the number of molecules of the th species by the occurrence of the th reaction [2].

Since (1) is a high dimensional system in most real applications, it is very challenging to handle both analytically and computationally. One of the key methods to help simplify the computation of the stochastic dynamics is done by using the stochastic simulation algorithm (SSA) [3]. One realization of the SSA shows the trajectory of the time-evolution of the states. Since it is basically a Monte-Carlo type algorithm, one has to perform many realizations to obtain important statistical quantities such as mean and variance. Moreover, it often requires a large number of realizations, especially in cases of the systems consisting of large numbers of molecules or very different time scale reactions. There have been many approaches to improve the SSA including Tau-leaping methods [4–6], reduction methods on the slow time scale [7–10], and moment closure methods [11–13].

In the recent papers [14, 15], the authors presented numerical schemes based on probability generating functions (PGF) for finding the probability distribution and moments. One can reformulate chemical master equations of the stochastic reaction network into a partial differential equation (PDE). Although such PDEs are mostly hard to deal with due to variable coefficients and lack of proper boundary conditions, several improvements in approximation of the moments based on Pade approximation and numerical superimposition have been made.

In this paper, we present two improvements in such PGF-PDE approach. The first method is an approximation method called “perturbed probability generating function (PPGF) method.” By applying the perturbation method to PDEs, we expand the equation in terms of a small reaction rate which is often responsible for the stiffness of the system. This enables us to handle the system more efficiently with less iteration and therefore capture same accuracy faster.

We also develop the double truncation method (DTM) that efficiently approximates the state probabilities. As suggested from the name, the method combines two truncation procedures for the series solution to PGF-PDEs.

Through consecutive truncations of the Taylor expansion of such PDEs in time, one can dump relatively rare events which correspond to excessive occurrence of reactions during a short time interval. This elimination is justified from the fact that the coefficient of each term in the PGF is the probability of a specific state. This enables us to ignore a great deal of small terms which do not affect the system significantly.

An outline of the paper is as follows. In Section 2, we review the properties of PDEs that PGF should satisfy. The perturbation of PGF is discussed in Section 3 with its numerical accuracy of through some examples. Section 4 presents the double truncation method for the PGF-PDEs. In Sections 3 and 4, we also illustrate numerical accuracy of the method in several examples.

#### 2. PGF-PDE Approach

The probability generating function is defined aswhere , , , and After differentiation of (2) with respect to and application of (1), one can derive a PDEHere , , is the highest order of the reactions in the chemical system, and denotes any th order partial derivatives of as One can see that the PDE (4) is linear and is a polynomial function of degree at most order which satisfies for .

The conditions of the PDE (4) can be found from the conditions on [14]. The initial condition is where is the initial condition of , and also it is true that

Using the solution of the PDE (4), one can find the important information for stochastic reaction network such as marginal probability, mean, and the second moment [14]. If denotes the number of molecules of th species at time , the probability that at time isand the mean and second moment are obtained by the first and second derivatives of where denotes the expectation of .

#### 3. Perturbation of Probability Generating Function Method

##### 3.1. Perturbation of PGF

For the PGF method recently proposed in [14], is represented by a power expansion with respect to aswhere , , are coefficient functions of . Note that the initial condition determines the first coefficient as .

Suppose that the reaction rates in the network do not change with time and therefore the coefficients of the PDE do not involve explicitly. This implies in (4). Plugging (10) into the both sides of (4) and comparing the power series lead to a recursive relation between and , . One can derive coefficient functions from and then construct an approximation for sufficiently large . Since the initial function and the coefficient of the PDE (2) are polynomials, all generated are polynomials as well.

Now, let us apply the perturbation method to PGF. Since many practical chemical networks that interest us involve reactions in a broad range of time scales, from seconds to microseconds, it is important to efficiently handle such stiffness in simulation of the systems. Suppose that is a small reaction rate in the system; then one can expand the PGF aswhere , , are functions to be determined. As in (10), one can set for each Now, instead of (10), one can seek in the following form:for large and . By plugging (13) into the both sides of (4) and comparing the power series, one can find recurrence relations for and , where . This decomposition enables us to handle stiffness caused by properly, by adjusting the perturbation order as will be shown in the numerical examples in the next sections.

##### 3.2. Example 1: Enzyme Kinetics

As the first application, we consider a fundamental chemical kinetic model, the enzyme-substrate reaction systemwhere , , , and denote enzyme, substrate, enzyme-substrate complex, and product, respectively, and , , and are probability constants for reactions. If we denote the molecular numbers of , , , and by , , , and , respectively, the governing equation of the stochastic enzyme-substrate system is obtained [16] as

One can see that there are two conservation quantities in the system and thus we can remove two of variables, say , . After the removal, we can derive a PDE of from the master equation [14] Here is the conserved quantity and we use the notations

In the example, we use parameters , , and and let the perturbation be expanded in . Figure 1 shows comparison of the exact solution of the master equation (15) and the approximate solution obtained from the PPGF method.

**(a)**

**(b)**

##### 3.3. Example 2: Brusselator Model

We consider the Brusselator model:Here , are molecular species and , , , and are constant species. The Brusselator is a generic model for a type of autocatalytic reaction. It is well known that the model reveals simple chemical oscillations. Refer to Figure 2(a) for a typical run of the model. If we denote the number of , by , , respectively, we obtain the PDE for the probability generating function as follows:Here and the reaction rates , , and are used. We use PPGF with respect to the smallest parameter . Figure 2 compares the corresponding numerical result to those from SSA.

**(a)**

**(b)**

**(c)**

**(d)**

#### 4. Double Truncation Method

##### 4.1. Truncations in and

The truncation method was firstly introduced in [15] to deal with PGF-PDEs. Although the method provides a fast way to evaluate the mean and the variance, most of high order terms need to be thrown away for efficient computations at each time step. As a result, the statistical information that is carried in high order terms, such as the higher order moments and the probabilities of the states, is generally not available from the method.

In this section, we develop an extension of the truncation method which enables us to carry high order terms in the computation. This is done by another truncation in at each step. Note that this is* not* just additional manipulation to simply make further elimination after the first truncation. By introducing the second truncation, we can alleviate the restriction on the original truncation in and allow higher order terms to remain, which is essential for computation of high order moments and probabilities.

In the second truncation, the terms with relatively small coefficients are dropped. It is justified from the fact that the coefficient of each term in the PGF is the probability of a specific state. We can therefore carry over high order terms, while keeping a reasonable number of terms only. This is important since evaluation of the high order moments and the probabilities is generally expensive, especially when the system involves stiff chemical reactions. Since DTM can preserve the high order terms without increasing the total number of the terms, it enables us to easily evaluate the probabilities for the states with many molecules.

*The First Truncation.* To overcome gradually increasing computational load in the series expansion (10), we restrict the series solution onto a short time interval. That is, on each interval , , , we construct the separate Taylor expansion at the center of finite order . We expect that side-by-side PGFs on consecutive intervals, and , are equal on the boundary as More generally, where is a “carry-over” functional. Our motivation of this successive superimpositions is to keep approximation order low in favor of computational efficiency while carrying over essential information like first several moments to a next function over the following time interval. Note that truncation of high order temporal terms can be justified in that, in PGF computation, represents how many reaction would happen for that interval , . Elimination of such terms has an effect of excluding excessive occurrence of reactions during such short interval.

One of the issues in the first truncation is how to determine carry-over functional . Although the best approximation in this framework would be obtained with the identity functional, it is not a practical choice since the number of terms involved in the computations keeps increasing rapidly. In [15], was suggested as a projection onto , a set of polynomials of degree three or less, which preserves the mean and the variance. However, such functional does not preserve the information related to high order derivatives of PGF and therefore approximation of quantities such as the state probabilities is not possible. Here we propose a different type of carry-over functional that preserves high order information. This functional indeed works as another truncation on what remains after the first truncation.

*The Second Truncation.* At time , the separate PGF becomeswhere means the probability for the state being at time . Now we define the carry-over functional such that it maps (22) into where is a function defined as Application of such implies that we sieve the terms out of which coefficients are less than . One can see from (2) that those terms correspond to rare events of which coefficients probability is less than . Considering that the iterative process in the PGF approach is a linear operation with respect to , manipulation on a term only affects entailed terms, each of which represents possible consequence of the corresponding state. Since the probabilities of all the entailed events and their sum as well cannot exceed the probability of the eliminated event, the effect of elimination does not propagate or grow in probability.

Note that this truncation process is clearly consistent in that the scheme reverts to the first truncation as . However, one should be careful when applying the second truncation to approximation of the moments, since the moments do not solely depend on the state probabilities. For example, suppose that a rare state eventually gives a birth to large number of a species in an open system. If the mean is what is sought, one cannot simply omit such state for the reason that the corresponding probability is small, since the elimination may substantially change the resulted mean or variance in approximation. Hence, for the moment approximation of the open systems, it is better to use the above functional together with other carry-over functional as suggested in [15].

##### 4.2. Example 1: Enzyme Kinetics

Once again, we use the enzyme kinetic model to test the performance of DTM. Figure 3 shows comparison of approximate solutions obtained from DTM and those from SSA. Note that, in the perturbation method in Section 3.2, evaluation of the probabilities was available only for the states that correspond to small number of that species, . This is because the probability computation (8) requires as many times of differentiation as the number of the species at the state. However, one can see that DTM successfully resolves this problem and produces the probabilities of the states corresponding as many as .

**(a)**

**(b)**

**(c)**

##### 4.3. Example 2: Transition Model

We consider the transition network in the eukaryotic cell cycle as follows [14, 17]:

We denote the number of molecules of , , , , , , , , , and by , , , , , , , , , and , respectively.

We obtain a PDE of the moment generating function [14]

In Figure 4, we compare the simulation results obtained from the SSA and the DTM.

**(a)**

**(b)**

**(c)**

#### 5. Conlusion

In this work, we presented the double truncation method (DTM) that improves the probability generating function (PGF) method for computation of the solutions of stochastic reaction networks. Advantage of the PGF method is that, rather than struggling with the chemical master equation of extremely high order, one can handle a partial differential equation (PDE) of low order. However, it still requires intense computation for high order approximation, especially for the systems with stiff parameters. The method proposed in this paper applies the perturbation method to the corresponding PGF for reaction networks. That is, we expand the PGF in the parameter that is responsible for stiffness of the system. Then the PGF-PDE leads to a recurrence relation of the coefficient functions with respect to both the time and the corresponding parameter, which we can easily solve symbolically. It turns out that the suggested method requires a lower approximation order than the original PGF method, while it produces as good approximation as such.

While mean and variance of species are easily traceable in single truncation suggested in [15], efficient evaluation of the state probability was still a challenging task. In Section 4, we showed that the probability evaluation though PGF-PDEs can be achieved by doubly truncating the Taylor series at each time step. Through consecutive truncation processes, we dump relatively rare events, most of which are generated from excessive occurrence of reactions during a short time interval. This also implies elimination of a great deal of terms and significant reduction in computation.

It turns out in several numerical examples that the suggested method produces good approximation for the state probabilities. We expect that the DTM will be used for a fast and accurate approximation for computation of real and complex stochastic chemical reaction networks.

#### Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

#### Acknowledgments

Pilwon Kim acknowledges that this work was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2011-0023486). Chang Hyeong Lee was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2014R1A1A2054976).