Abstract

HSimulator is a multithread simulator for mass-action biochemical reaction systems placed in a well-mixed environment. HSimulator provides optimized implementation of a set of widespread state-of-the-art stochastic, deterministic, and hybrid simulation strategies including the first publicly available implementation of the Hybrid Rejection-based Stochastic Simulation Algorithm (HRSSA). HRSSA, the fastest hybrid algorithm to date, allows for an efficient simulation of the models while ensuring the exact simulation of a subset of the reaction network modeling slow reactions. Benchmarks show that HSimulator is often considerably faster than the other considered simulators. The software, running on Java v6.0 or higher, offers a simulation GUI for modeling and visually exploring biological processes and a Javadoc-documented Java library to support the development of custom applications. HSimulator is released under the COSBI Shared Source license agreement (COSBI-SSLA).

1. Introduction

Computational systems biology is becoming a fundamental tool of life-science research, which aims at developing models representing biological phenomena and reliable computational techniques for their simulation [17].

We introduce HSimulator, a Java hybrid stochastic/deterministic simulator for mass-action biochemical reaction systems placed in a well-mixed environment, where position and speed of molecular species are randomized and therefore they do not affect reaction executions. Species are represented in terms of abundances (number of molecules) and reactions are defined aswhere the species on the left of the arrow are reactants and the ones on the right are products. The stoichiometric coefficients and indicate how many molecules of reactant are consumed and how many molecules of product are produced, respectively. The constant at the end of the reaction is the stochastic reaction constant introduced by Gillespie [8] for computing reaction firing according to the definition of mass-action propensities.

From the milestone work of Gillespie [8], where the Direct Method (DM) has been defined, exact stochastic simulation is the most accurate simulation approach. Its drawback is the high computational complexity arising from the need of separately simulating each reaction firing. Several algorithms have been introduced to decrease simulation runtime, such as the Next Reaction Method (NRM, [9]) and later the Rejection-based Stochastic Simulation Algorithm (RSSA, [10]). The latter is tailored for complex biochemical reactions with time-consuming propensity functions and it constitutes the state of the art of exact stochastic simulation.

Despite the improvements, the problem of simulation complexity remains. In fact, the investigation of complex diseases or disorders is often concerned with extended biomolecular networks involving genes, proteins, metabolites, and signal transduction cascades. This justifies the introduction of approximate simulation strategies, which sacrifice accuracy to decrease runtime. Such strategies progressively reduce the stochasticity of the dynamics [11, 12] until reaching a deterministic simulation, where the model is translated to a system of ordinary differential equations (ODEs, [7, 13]) and the dynamics provides an averaged behavior.

A drawback of an approximate simulation is that all reactions in the model are simulated according to the same approximate strategy. This is an issue when exact simulation is required at least for a small part of the reaction network, for example, when slow reactions are considered to model rare stochastic events. In such a case, a hybrid simulation strategy is able to partition reactions into subsets that are simulated by different algorithms [7, 14, 15]. Hybrid simulation therefore needs to synchronize the progress of different simulation strategies in order to guarantee the exactness of the simulation of slow reactions. This requires considering time-varying transition propensities, which in turn require the computation of integrals to calculate the exact firing time of slow reactions [7, 14, 1620] (see also Section 2.5 and (19)). This step is computationally demanding and practical implementation necessarily approximates the integral computation by a numerical method that affects the accuracy of the simulation of slow reactions.

The Hybrid Rejection-based Stochastic Simulation Algorithm (HRSSA) has been recently introduced in [21] to avoid approximations in simulating slow reactions. HRSSA exploits some computational advantages introduced in RSSA to exactly simulate slow reactions and to apply an efficient and accurate dynamic partitioning of reactions, which constitute the two most significant bottlenecks of hybrid simulation.

HSimulator herein presented fills a gap in the current literature of stochastic and hybrid simulation by providing a suite of published state-of-the-art simulation algorithms including, in the same package, the exact algorithm RSSA and the first publicly available implementation of its hybrid version HRSSA. The benchmarks in the paper show that the HSimulator implementation is often faster than the state-of-the-art simulator COPASI [22]. HSimulator is released under the COSBI Shared Source license agreement (COSBI-SSLA) and it is available for download at the COSBI website at http://www.cosbi.eu/research/prototypes/hsimulator.

2. Materials and Methods

HSimulator provides an implementation of five state-of-the-art simulation strategies covering exact stochastic simulation (DM and RSSA), deterministic simulation (Forward Euler and the Runge-Kutta-Fehlberg RK45 adaptive algorithm), and hybrid simulation (HRSSA). For deterministic and hybrid simulations, the simulator automatically translates the mass-action reaction network into a set of ordinary differential equations (ODEs, see Section 2.3). In the following some insights about the implemented simulation algorithms are provided as well as their pseudocodes, which can be used as a reference to better understand the functionalities implemented in the simulator (for additional details, the reader is invited to refer to the user guide of HSimulator available at http://www.cosbi.eu/research/prototypes/hsimulator). Readers already familiar with the topic can skip this section and going directly to Section 3.

2.1. Direct Method (DM)

The Direct Method constitutes the first practical implementation of an exact stochastic simulator [8]. The simulation of an exact algorithm is based on the concept of reaction probability density function (pdf):which provides the probability that the next reaction to apply will be in the infinitesimal time interval , given the system state at time .

The state of the system at time is represented by the column vectorwhere is the number of molecules of species in the system at time . In the following we will often write for and for to improve the readability of formulas.

The term in (2) is the propensity of in the state , while the total propensity is the sum of the propensities of all reactions:The reaction propensities are computed from the stochastic reaction rate :where is the number of distinct reactant combinations for in the state [8]. For standard mass-action kinetics, as considered here, it iswhere are the stoichiometric coefficients of according to (1).

Equation (2) shows that the next reaction fires with a discrete probability and its firing time has an exponential distribution with rate . Gillespie introduced the Direct Method (DM) for exactly sampling the pdf by applying the inverse transformation:where and are random numbers drawn from a uniform distribution . The pseudocode of the DM algorithm is in Algorithm 1.

Input: a reaction system with species and reactions as the one in (1), with a stochastic
reaction constant associated with each reaction ; an initial state defining molecule
abundances at time ; the last time instant to be simulated.
Output: a time series of states , providing the dynamics of the system.
Pseudocode:
  ;
  while    do
     For each reaction , compute reaction propensities according
     to (5);
     Compute according to (4);
     Generate two random numbers in ;
     ;
     Select such that satisfies
               ;
     Compute by applying to ;
     ;
  end while
2.2. The Rejection-Based Stochastic Simulation Algorithm (RSSA)

The Rejection-based Stochastic Simulation Algorithm (RSSA) is a novel exact stochastic simulation algorithm introduced in [10] and further improved in [7, 2426]. RSSA constitutes the state of the art of exact stochastic simulation tailored for complex biochemical reactions with time-consuming propensity functions. The pseudocode of RSSA is provided in Algorithm 2.

Input: The same as DM (Algorithm 1) together with a simulation parameter for
calculating the fluctuation interval of the system state.
Output: a time series of states , providing the dynamics of the system.
Pseudocode:
  ;
  while    do
     Define the fluctuation interval according to (8);
     For each reaction , compute propensity bounds , and
     the total propensity ;
     while    do
       Generate three random numbers in ;
       ;
       Select such that satisfies
               ;
       ;
       if    then
         ;
       else
         Compute according to (5);
         if    then  ;
       end  if
       if    then
         Compute by applying to ;
       end  if
       ;
     end  while
  end  while

RSSA computes for each reaction a lower bound and an upper bound of the propensity, which encompass all possible values of reaction propensities over the fluctuation interval. Such bounds are derived by defining a fluctuation interval for the system state (step ), whereand is a simulation parameter whose value is usually between and ( of the system state). Because reaction propensities with mass-action kinetics are monotonic functions, the propensity bounds of reaction , for , correspond to the interval (step ). These propensity bounds are recomputed only when the current system state is not anymore inside its fluctuation interval ().

The selection of reaction firing in RSSA is composed of two steps. First, it selects a candidate reaction with probability (step ), where . The candidate reaction is then validated according to a rejection-based procedure that implements the toss of a biased coin with success probability (steps ). To do this, the algorithm generates a random number in and moves in one of the following three cases:(1)If , then is accepted without computing its propensity because (quick accept);(2)If , then reaction propensity is computed and is accepted if (slow accept);(3)If , then is rejected (rejection).

After a reaction is accepted to fire, the state is updated and the algorithm moves to the next simulation iteration without updating the propensity bounds. Only for uncommon cases when state moves out of its fluctuation interval does RSSA have to define a new fluctuation interval and update the propensity bounds. Numerical experiments show that propensity updates in RSSA are infrequent; hence their impact is very low during the computation [10]. This advantage is mitigated by the penalty paid to prepare a potential advancement of the system for a candidate reaction that is finally rejected to fire, but the overall performance is still considerably better than DM and other stochastic simulation algorithms.

2.3. The Forward Euler Algorithm

The forward Euler algorithm is the simplest example of deterministic simulation algorithm. Here the mass-action model is translated to a system of ordinary differential equations (ODEs, [7, 13]) and the dynamics provides an averaged behavior of the system.

Consider a biochemical reaction system with species, reactions defined according to (1). The mass-action deterministic rate constant of can be obtained from the stochastic one aswhere indicates Avogadro’s number, is the volume where the reaction is taking place, and is the overall order of reaction , which in mass-action is defined as the sum of the stoichiometric coefficients of reaction reactants. Finally, the ODE describing the evolution in terms of molar concentrations of species , , iswhere squared brackets are used to indicate molar concentration of species (). Once the model has been translated to a system of ODEsa first-order approximation of the next system state can be computed bywhere is a user-defined discretization stepsize. The pseudocode of the Forward Euler algorithm is provided in Algorithm 3.

Input: a system of ODEs corresponding to a biochemical reaction
system, the initial state of the system with species concentrations at time , the last
time instant to be simulated and the discretization stepsize .
Output: a time series of states , providing the dynamics of the system in
terms of molar concentrations with discretization stepsize .
Pseudocode:
  initialize time and state ;
  while    do
     update ;
     update ;
  end while
2.4. Runge-Kutta-Fehlberg Algorithm (RK45)

The forward Euler algorithm introduced in the previous section is a first-order numerical method provided for didactic purposes. Several numerical methods have been introduced to increase the accuracy of deterministic simulations and to decrease simulation runtime. A popular algorithm is the Runge-Kutta-Fehlberg algorithm (RK45). This algorithm represents the standard choice of several numerical integrators when the model does not exhibit stiffness. It is also often used in hybrid simulation strategies for the simulation of the fast part of the network. The reader can find a comprehensive guide to numerical methods of ODEs in [7, 13].

The Runge-Kutta Fehlberg method of fourth-order, also known as the RK45 method, updates the system state bycoupled with a fifth-order RK methodGiven a system of ODEs as in (11) and a discretization stepsize , the values of are shared between (13) and (14) and they can be computed asThe fourth-order version of the algorithm provided in (13) is used to compute the dynamics of the system. The fifth-order scheme of (14), instead, is used to estimate the maximum local truncation error introduced in the simulated step:where provides the maximum value along the vector of truncation errors. The error estimate is then compared to the error threshold specified by the user. When , the local truncation error is assumed to be smaller than the threshold, the state is accepted, and the algorithm moves one step forward. Otherwise, the new state is not accepted and the next state is evaluated again using a different (smaller) value of . In both cases, the value of is updated asWhen the estimations and agree to more significant digits than required, then becomes greater than and is increased. Equation (18) is derived from the general formula , which defines how to update the value of of an adaptive one-step numerical method of order , by considering the error estimate and the user-defined threshold . The additional multiplicative factor of is an empirical number commonly added in RK45 implementation to reduce the variability of , because very high values of the stepsize increase the probability of repeating the next computed step.

The implementation of the algorithm is in Algorithm 4. The value of is updated at each step starting from an user-provided initial value . The next computed state of the system is accepted only when the estimate of the local truncation error remains below the user-provided threshold . Steps are additional steps added to the implementation in order to avoid very large modifications of in a single step.

Input: a system of ODEs corresponding to a biochemical reaction
system, the initial state of the system with species concentrations at time , the last
time instant to be simulated, an initial value for the discretization stepsize and a
threshold for the maximum local truncation error .
Output: a time series of states , providing the dynamics of the system in
terms of molar concentrations.
Pseudocode:
  initialize time , state and discretization stepsize ;
  while    do
     compute ;
     compute ;
     compute ;
     compute ;
     compute ;
     compute ;
     compute ;
     compute ;
     compute according to (16);
     if    then
       update ;
       update ;
     end if
     compute ;
     if    then
       update ;
     end if
     if    then
       update ;
     end if
     update ;
  end while

Even if the computation of a simulation iteration of the RK45 algorithm is more computational demanding than other nonadaptive Runge-Kutta implementation, the possibility of changing the value of often dramatically decreases the simulation runtime.

2.5. Hybrid Rejection-Based Stochastic Simulation Algorithm (HRSSA)

A drawback of approximate simulation is that all the model reactions are simulated according to the same approximate strategy. This is an issue when exact simulation is required at least for a small part of the reaction network, for example, when slow reactions are considered to model rare stochastic events. In such a case, a hybrid simulation strategy can be applied, which divides reactions into subsets that are simulated by different strategies at the same time [7, 14, 15].

An issue of this approach is the synchronization between the employed simulation strategies in order to guarantee the exactness of the simulation of slow reactions. This is a fundamental aspect of hybrid simulation relying on the definition of the probability density function (pdf) of slow reactions. In fact, even though fast reactions can be safely simulated across slow reaction events, it is not always the case that slow reactions can be simulated regardless of what fast reactions are changing in the system. Actually the reaction propensity of a slow reaction may depend on species whose quantities are changed by fast reactions. For this reason, the reaction probability density function of (2) is not suitable to simulate the reaction subnetwork made of only the slow reactions of the system and it has to be extended to consider time-varying transition propensities [7, 14], which account for the fact that reaction propensities of slow reactions may change over time by the simulation of fast reactions.

According to [7, 14, 1620], the pdf of the next firing of a slow reaction finally becomeswhere

The firing time of the next slow reaction is thus obtained by solving the equationwhere is a random number from . Solving (21) is computationally challenging because the state is changed by fast reactions during time interval . Therefore, the hybrid simulation has to evaluate the integral simultaneously with the simulation of fast reactions in order to correctly generate the next slow reaction event. Moreover, in practical implementation this step has to be necessarily approximated by a numerical method relying on an error threshold that introduces an additional approximation in the simulation of slow reactions. This step is so critical that the standard choice of several implementation available in literature, such as the hybrid algorithms implemented in COPASI [22], intentionally avoids the computation of the integral during the simulation by accepting some approximation also in the simulation of slow reactions.

The hybrid algorithm HRSSA is a novel hybrid simulation algorithm introduced in [21] to solve this problem. HRSSA is built on top of RSSA introduced in Section 2.2. In particular, the RSSA concept of fluctuation interval of the system state is widely used to provide an important computational advantage. HRSSA synchronizes the simulation of slow and fast reactions by avoiding the computation of the integral of (21) and applies an accurate dynamic partitioning of reactions without updating reaction classification at each simulation step.

HRSSA updates reaction partitioning only when the current system state does not fit anymore in its fluctuation interval (see (8)). This permits reducing the computational overhead without losing the accuracy of the classification. HRSSA considers both reaction propensities and the number of transformed molecules for reaction partitioning, by replacing real propensities with their lower bounds. The adoption of bounds instead of real propensities does not affect the accuracy of the classification. Conversely, the usage of lower bounds imposes more stringent constraints that tend to increase the number of reactions that are classified as slow (and therefore simulated without approximations). The pseudocode of the dynamic reaction partitioning of HRSSA is in Algorithm 5. The if clauses of steps and implement the two conditions on reaction propensities and number of transformed molecules, respectively.

Input: The same as RSSA (Algorithm 2) together with the time granularity used for the
(approximate) simulation of fast reactions; the minimum amount of molecules that
has to be available for fast reactions; the minimum number of times that a fast
reaction has to be applied, in average, within the time range of size .
Output: the sets and of slow and fast reactions, respectively.
Pseudocode:
  ; ;
  for each  reaction
     if     then
       Put reaction in the set ;
     else if   , modified by with stoichiometric coefficient , such that
       then
       Put reaction in the set ;
     else
       Put reaction in the set ;
     end if
  end for

After reaction partitioning in the two sets of slow and fast reactions, HRSSA computes the sum of upper propensity bounds of slow reactions as defined in (20). The firing time of a candidate slow reaction is then computed as where is a random number in .

Under the hypothesis that the system state will remain inside its fluctuation interval in , we can consider not dependent on time over and this allows us to simulate fast reactions over this interval without taking any side effect on the application of slow reactions into account. This is because (22) remains valid, regardless of the action of fast reactions, as long as the current system state respects its bounds.

After the simulation of fast reactions for the time interval , a slow reaction is chosen and validated to fire by a rejection test, according to the RSSA simulation strategy. To guarantee the exact simulation of slow reactions (the proof is in [21]), the simulation of fast reactions is required to happen in a feasible system state. Therefore, every time the system state exits from its bounds the simulation is stopped and the fluctuation interval is updated. The pseudocode of HRSSA is provided in Algorithm 6.

Input: The same as RSSA (Algorithm 2) together with the time granularity used for the
(approximate) simulation of fast reactions; the minimum amount of molecules that
has to be available for fast reactions; the minimum number of times that a fast
reaction has to be applied, in average, within the time range of size .
Output: a time series of states , providing the dynamics of the system.
Pseudocode:
  ;
  while    do
     Define the fluctuation interval according to (8);
     For each reaction , compute propensity bounds and ;
     Update reaction partitioning (sets and ) by applying the algorithm
     in Algorithm 5 according to input parameters , and ;
     Compute according to (20);
     ;
     while    do
       , where is a random number in ;
       Compute by simulating fast reactions (), at time steps of
       maximum length , according to an approximate algorithm (either
       stochastic or deterministic), where is a value such that ;
       if    then
         Select a candidate slow reaction by applying RSSA steps
         in Algorithm 2;
         if   is accepted by RSSA  then  update computed at step by
         applying ;
         if    then  ;
       else
         ;
       end if
       ;
     end while
  end while

3. Results and Discussion

HSimulator is a cross-platform simulation software written in Java, offering a suite of state-of-the-art stochastic, deterministic, and hybrid simulation strategies for mass-action well-stirred biochemical reaction systems. Multicompartmental modeling is natively supported allowing the definition of nested reaction volumes via a customary text-based representation of reactions and compartments. The multithreading implementation allows scaling very well when computing averaged model dynamics obtained by running multiple stochastic simulations in parallel. The software has been designed to run in three user-scenarios: (i) via command-line, for example, as batch jobs or as part of wider modeling terminal scripts; (ii) programmatically embedded via its API in custom applications [27] or within MATLAB/Octave/R/Mathematica projects; (iii) as a self-standing simulation environment with a graphical user interface (GUI) for biomedical system modeling. For the three usage scenarios extensive documentation is available at the software web page (http://www.cosbi.eu/research/prototypes/hsimulator.). The GUI assists the modeler while defining the reactions with syntax highlighting of the typed entities and quick graphical access to simulation parameters. The simulation environment is completed by an interactive viewport to explore and understand the modeled system dynamics (see Figure 1) which can be then exported as Excel spreadsheets.

To evaluate the performance of the HSimulator implementation, a set of benchmarks have been carried out considering the state-of-the-art simulator COPASI [22], version 4.19 (build 140). All the simulations have been run in the same conditions on a 64 bit macOS Sierra MacBook Pro, with 16 GB of RAM. Six models have been simulated according to 6 different algorithms covering all the simulation strategies (stochastic, deterministic, and hybrid). Namely, we considered the Direct Method (implemented in both COPASI and HSimulator) and RSSA (only implemented in HSimulator) for exact stochastic simulation, LSODA (implemented in COPASI) and RK45 (implemented in HSimulator) for deterministic simulation, and Hybrid Runge-Kutta (implemented in COPASI) and HRSSA (implemented in HSimulator) for hybrid simulation. Benchmarks have been computed by averaging the runtime of simulations. The simulation parameters for each algorithm are in Table 1 and they have been chosen to preserve as much as possible the reliability of comparisons.

The Hybrid Runge-Kutta algorithm provided by COPASI combines two fast simulation strategies available in literature to simulate slow and fast reactions (NRM combined with the 4th order Runge-Kutta method [7, 13]). Reaction partitioning is computed dynamically during the simulation according to a threshold which provides the maximum number of reactant molecules of a slow reaction. Moreover, the simulation of slow and fast reactions is synchronized without computing the integral of (21) by approximating reaction probabilities of slow reactions as constant during one stochastic step. This solution introduces an error in the simulation of slow reactions, but makes the computation faster. HRSSA resulted to be faster than this algorithm in all the simulation cases we considered, even if HRSSA does not apply such approximation in simulating slow reactions.

We note that the latest version of COPASI (version 4.19 build 140) considered in our benchmarks provides other two implementation cases of hybrid simulation algorithms (Hybrid LSODA and Hybrid RK45). Here we considered Hybrid Runge-Kutta because its implementation is closer to the one of HRSSA. However, we obtained very similar simulation runtimes also by running our benchmarks with Hybrid LSODA (data not shown). Hybrid RK45 has been not considered here, because this simulation strategy uses a static reaction partitioning whereas HSimulator and the compared methods use a dynamical approach allowing for a fair benchmark between strategies. We also have to acknowledge that COPASI is not the only simulation software available in the literature. In the present work we considered COPASI due to its high popularity and to allow, as much as possible, a fair comparison with HSimulator. In fact, COPASI provides a way of specifying the biochemical system that is very close to the one of HSimulator. Snoopy [28] is another promising simulation software that provides different simulation strategies for hybrid models, including a reinterpretation of HRSSA.

Simulation benchmarks are provided in Table 2. To allow the reproducibility of the presented results, all the considered models are provided with HSimulator both implemented for HSimulator and for COPASI. Simulation benchmarks comprehend two biological models (the MAPK cascade and the Gemcitabine mechanism of action) to test simulation strategies in real modeling applications plus two theoretical models (the fully connected model and the multiscale model) considered with two different parameterisations each, to evaluate the performance of simulation algorithms under specific conditions. The fully connected model has been considered to evaluate the performance of simulation algorithms when a mass-action biochemical reaction network of reactions that are all slow (MCM ) or fast (MCM ) is simulated. Conversely, the multiscale model allowed testing the intermediate condition, where the biochemical network can be divided into two subnetworks working at different time scales. This scenario has been specifically considered to test hybrid simulation strategies. The biochemical network of the multiscale model has been generated two times to test the scalability of simulation algorithms (MSM , network of species and reactions; MSM , network of species and reactions, see Section 3.4 for details). The provided benchmarks seem to indicate that HSimulator is more scalable than COPASI with respect to the growing complexity of the model. This result is interesting, especially from an implementer’s point of view, showing how a recent Java-based implementation (HSimulator), in the given conditions, may compare favourably with respect to C++ (COPASI).

3.1. The MAPK Cascade

The Mitogen-Activated Protein Kinase cascade (MAPK cascade) is one of the most important and intensively studied signaling pathways [29]. The MAPK cascade is at the heart of a molecular-signaling network that governs the growth, proliferation, differentiation, and survival of many cell types. Moreover, the MAPK pathway is deregulated in various diseases, ranging from cancer to immunological, inflammatory, and degenerative syndromes, and thus represents also an important drug target [30].

The MAPK cascade is a series of three protein kinases that are responsible for cell response to growth factors. The signal activates MAPKKK by phosphorylation, which in turn activates MAPKK. Once activated, MAPKK activates MAPK. When is added to the system, the output of activated MAPK increases rapidly. By removing the signal , the output level of activated MAPK reverts back to zero. Reverse reactions are triggered by the signal and by the MAPK phosphatases KKpase and Kpase. The reaction-based representation iswhere KKK denotes MAPKKK, KK denotes MAPKK, and K denotes MAPK and tags and pp indicate phosphorylation and double phosphorylation, respectively.

A simulation benchmark of the MAPK cascade is in Table 2 (first column). In the computed benchmark, the model has been simulated for time units with all stochastic reaction constants equal to . The initial abundances have been set to , , , , and . All the other species are initialized with molecules.

3.2. The Gemcitabine Model

Gemcitabine is a drug for non-small-cell lung cancer, pancreatic cancer, bladder cancer, and breast cancer. Here we will consider a simplified model of its mechanism of action according to [23].

Gemcitabine (dFdC, see Box 1) is transported from plasma into the cell through the cell membrane (). Gemcitabine can be deaminated by CDA in the cytoplasm and in the extracellular environment leading to dFdU. Both dFdC and dFdU can be phosphorylated by dCK. Monophosphorylated Gemcitabine can be deaminated by dCMPD, whereas dCMPD is inhibited by . Alternatively, it is further phosphorylated. The Gemcitabine triphosphates and compete with the natural nucleoside triphosphate dCTP for incorporation into nascent DNA chain and inhibit DNA synthesis, thus blocking cell proliferation in the early DNA synthesis phase.

A simulation benchmark of the Gemcitabine model is in Table 2 (second column, simulation length time units). In the computed benchmark, stochastic reaction constants are [23]: , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , and . Initially all the variables have been set to except for , , , , and .

3.3. The Fully Connected Model

The fully connected model is a theoretical model that consists of species and a set of reactions that reversibly convert each species into each other of the form The fully connected model is useful to evaluate the scalability of algorithms (variation in runtime with growing values of ). Moreover, if we fix initial values of the species and stochastic reaction constants, then we can evaluate the performance of a simulation algorithm when it simulates a system that remains in a steady state condition for the entire length of the simulation.

Here we use the fully connected model starting from two different steady state conditions to quantify simulation runtime of a reaction network composed by only slow or fast reactions. These benchmarks are provided in Table 2, third and fourth columns. In both cases, the model has been generated with , that is, with variables and reactions. All stochastic reaction constants have been set to 1. In the first benchmark (MCM ), the initial state of all model variables has been set to 20 in order to create a network of slow reactions. In the second one (MCM ), instead, the initial state of all model variables has been set to 2000 in order to create a network of fast reactions. In both conditions, simulation algorithms implemented in HSimulator were the fastest (simulation length time units).

3.4. The Multiscaled Model

The multiscaled model is a theoretical model with a reaction network that can be divided into two subnetworks working at different time scales. It has been specifically considered to test hybrid simulation strategies because it allows testing the intermediate condition (network made of both fast and slow reactions) between the two scenarios previously considered with the fully connected model (network made of only slow or fast reactions). The first subnetwork of the model is given by a fully connected model of species modified by a set of fast reactions. The model is then extended with species modified by a set of reactions of the form where and are random species such that and . The stochastic constant rates of fast reactions are some order of magnitude bigger than the rates of the slow reactions to emphasize the difference of speed between the two subnetworks.

The fully connected model has been considered to evaluate the performance of simulation algorithms when a mass-action biochemical reaction network of reactions that are all slow (MCM ) or fast (MCM ) is simulated. Conversely, the multiscale model allowed testing the intermediate condition, where the biochemical network can be divided into two subnetworks working at different time scales.

In Table 2 two benchmarks related to this model are provided. The first one (fifth column, MSM ) is related to a model with and ( species and reactions), stochastic reaction constants equal to (fast reactions) and (slow reactions), initial values of fast species set to , and initial values of slow species set to . The second benchmark (last column of the table, MSM ) uses the same parameters of the first one except for and ( species and reactions). The provided benchmarks show that the gain in simulation runtime provided by HSimulator is scalable with respect to the complexity of the model (simulation length time units).

4. Conclusions

We presented HSimulator, a state-of-the-art multithread Java simulator for mass-action well-stirred biochemical reaction systems. HSimulator provides a suite of published state-of-the-art simulation algorithms including, in the same package, the exact algorithm RSSA and the first publicly available implementation of its hybrid version HRSSA. The benchmarks in the paper show that the HSimulator implementation is often faster than the state-of-the-art simulator COPASI [22]. This could open new perspectives in computational systems biology, where often scientists have to balance the accuracy of their simulations with the need of considering large reaction networks modeling complex diseases or disorders.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors acknowledge Alessandro Pozzer for technical assistance and Dr. Giuditta Franco (Department of Computer Science, University of Verona, Italy) for her helpful suggestions and discussions. This work was partially supported by the PAT (Provincia Autonoma di Trento) Project S503/2015/664730 Zoomer.