Abstract

Biochemical systems theory (BST) is the foundation for a set of analytical and modeling tools that facilitate the analysis of dynamic biological systems. This paper depicts major developments in BST up to the current state of the art in 2012. It discusses its rationale, describes the typical strategies and methods of designing, diagnosing, analyzing, and utilizing BST models, and reviews areas of application. The paper is intended as a guide for investigators entering the fascinating field of biological systems analysis and as a resource for practitioners and experts.

1. Preamble

Biochemical systems theory (BST) is a mathematical and computational framework for analyzing and simulating systems. It was originally developed for biochemical pathways but by now has become much more widely applied to systems throughout biology and beyond. BST is called “canonical,” which means that model construction, diagnosis, and analysis follow stringent rules, which will be discussed throughout this paper. The key ingredient of BST is the power-law representation of all processes in a system.

BST has been the focus of a number of books [1–6], including some in Chinese and Japanese [7–9]. The most detailed modern text dedicated specifically to BST is [3]. Moreover, since its inception, numerous reviews have portrayed the evolving state of the art in BST. Most of these reviews summarized methodological advances for the analysis of biochemical pathway systems [10–50]. Others compared BST models with alternative modeling frameworks [32, 51–63]; some of these comparisons will be discussed later. Yet other reviews focused on specialty areas, such as customized methods for optimizing BST models (e.g., [64–67]) or estimating their parameters (e.g., [68–72]), strategies for discovering design and operating principles in natural systems (e.g., [73–78]), and even the use of BST models in statistics [79–81]. A historical account of the first twenty years of BST was presented in [82].

Supporting the methodological developments in the field, several software packages were developed for different aspects of BST analysis. The earliest was ESSYNS [83–85], which supported all standard steady-state analyses and also contained a numerical solver that, at the time, was many times faster than any off-the-shelf software [86, 87]; see also [88]. Utilizing advances in computing and an extension of the solver from ESSYNS, Ferreira created the very user-friendly package PLAS, which is openly available and still very widely used [89] (see also [3]). Yamada and colleagues developed software to translate E-cell models into BST models in PLAS format [90]. Okamoto’s group created the software BestKit, which permits the graphical design and translation of models, along with their analysis [91–93]. Vera and colleagues developed the software tool PLMaddon for analyzing BST models within SBML [94]; see also [95]. It expands the Matlab SBToolbox with specific functionalities for power-law representations, which are fundamental to BST. The program SBMLsqueezer pursued a similar goal; its scope includes BST but is more general [96]. Another software package is Cadlive, which offers comprehensive computational tools for constructing, analyzing, and simulating large-scale biological network models [97]. Vera and Torres [98] composed software specifically for the optimization of BST-based models, and Savageau’s group developed a specific toolbox for the analysis of design spaces; it can be found at [99]. Goel et al. proposed an automated method for constructing BST equations from information on the connectivity and regulation of a system [45].

The purpose of this paper is to portray the present state of the art in BST. Considering that BST was conceived over forty years ago, a lot has happened and more is percolating. This paper is quite comprehensive, but by no means complete, and it is likely that some important contributions are missing, for which I apologize. Furthermore, due to its wide scope, the paper can only superficially discuss many of the pertinent topics, and many interesting subtleties will not become apparent and instead “disappear” in clusters of references. Examples are detailed studies of the design principles of gene regulatory networks and of the inference of their topology and regulation. Nonetheless, it is hoped that the paper offers the newcomer easy access to BST and provides the more experienced researcher with a valuable resource.

2. History and Rationale

BST was proposed by M. Savageau in 1969 [100–102]. Several interesting trends motivated him to devise this new formalism. First, biochemistry was in the process of advancing from the traditional studies of isolated reactions to analyses of pathways and, in particular, to investigations of the role of feedback inhibition in their regulation and control. As Savageau [101] noted, “we are now able to attempt the description of behavior of biochemical systems in terms of the component reactions. However, as yet, no method of systems analysis has been proposed which takes into account the particular nonlinear nature of biochemical systems.”

Concurrent with this important development in biochemistry was the rise of computing, and early pioneers of biochemical systems analysis like David Garfinkel envisioned large-scale computer-simulation systems for elucidating metabolic pathways (e.g., [103–105]). In the exuberance about the power of the new computers, the sky seemed to be the limit, and it appeared to be only a matter of time before any large reaction system could be analyzed through simulations. Alas, Savageau and other contemporaries immediately realized that computing power was only one aspect among many in biological systems analysis: even if one could write algorithms for simulating large reaction networks, it was not at all clear what functions should be used. Arguably, one could resort to default functions like Michaelis-Menten or Hill rate laws and their generalizations [106–108], but then it would become overwhelmingly difficult to determine all kinetic constants and coefficients. Indeed, Schulz [109] showed convincingly that accurate representations of enzyme-catalyzed reactions become surprisingly complicated even for apparently simple bisubstrate biproduct reactions, if one attempts to account correctly for the dynamics of all intermediate complexes. Cautioning against large ad hoc models from a different angle, Heinrich and Rapoport [110, 111] remarked that complicated simulation models of reaction networks render it very difficult to discern between important and unimportant effects that enzymes and metabolites have on the system. Expressed differently, it becomes a significant challenge to infer from simulations alone which components of a system are responsible for certain systemic responses. By contrast, it was argued, if a truly effective mathematical approach could be identified, it would permit relatively straightforward computations of eigenvalues, sensitivities, gains, and other key characteristics of a model. Interestingly, these key characteristics are often governed by a vastly reduced number of essential parameters, which further enhances the appeal of representations that expose a model’s key drivers.

The emerging complications with pure simulation approaches suggested a search for alternative representations, which was to be guided by the much-envied paradigms of physics and engineering. Physics is solidly anchored in theory, and the representations of two coupled pendulums or of an electrical circuit are prescribed by this theory and therefore essentially unambiguous. Furthermore, the governing parameters, such as the resistance and conductance in the electrical circuit, are usually measurable. Not so in biology. Imagine a situation where a hormone triggers a change in gene expression. On the surface, one is faced with a simple cause-effect relationship. However, in detail, this process is exceedingly complicated. The hormone, released from a sender, such as a gland, needs to find its target cells. There it must dock to a receptor. The receptor is usually a transmembrane protein which, upon hormone docking, undergoes some very specific change in its three-dimensional structure. This change, in turn, serves as a stimulus for a signaling cascade, which in itself involves several proteins. A protein at the last layer of the cascade directly or indirectly signals the relocalization or activation of a transcription factor, which binds to a corresponding regulatory region of the gene whose expression is to be up- or downregulated. Formulating this chain of events mechanistically and in biophysical or molecular detail is presently not feasible.

The second role model for potential representations in biology was engineering. Here, the overwhelming choice is the linear model. No formalism is as well understood as linear mathematics, in particular linear algebra, and the repertoire of analytical and computational methods is without equal. Furthermore, engineers have managed to design devices in such a fashion that their characteristic responses indeed are linear, at least approximately. The success of engineering and its utilization of linear mathematics is evident everywhere. One might only think of the exploration of Mars, where rockets have followed computer predictions exactly to deliver a car-size rover that drives around in a self-guided manner and sends back pictures of formerly unimaginable quality.

Unfortunately, engineering principles can seldom be applied directly to biological systems, because biology is intrinsically different from engineering. We humans are not the ones creating the components and merging them into functioning machines with approximately linear characteristics. Instead, nature has evolved very many known and yet unknown components, and these components cooperate in ill-characterized ways, which almost always contain genuine nonlinearities that can hardly be designed away for easy systems analysis. In the original set of articles on BST, Savageau already described this fundamental difference, stating that “‘small-signal’ linearization is entirely inadequate, because the dynamic range of the variables is known to result in the nonlinear operation of biochemical systems” [102]. At the same time, he recognized that general nonlinear theory was too complicated for any streamlined analysis [100]. The only realistic, feasible strategy had to be a compromise between generality and tractability, and this compromise had to be an approximation.

Merging ideas from Bode analysis in electrical engineering [112] and Taylor’s approximation theory, Savageau thus proposed the power-law representation as a valid local description of processes in biochemistry. This representation combined generality with simplicity, turned out to be rich enough to capture typical nonlinearities, such as stable oscillations [102], limited the amount of experimental data necessary for the description of general rate laws, and generally seemed ideal for “the ultimate purpose … to provide an explanation for the behavior of large-scale biochemical systems rather than individual reactions” [100]. Indeed, it became clear later that all nonlinearities that can be formulated as ordinary differential equations can also be represented, with complete exactness, in BST [113].

It has been over forty years since the core ideas of BST were proposed, which raises the question of whether the original goal of “providing an explanation for the behavior of large-scale … systems” [100] is still unchanged. Expanding the scope beyond biochemistry and metabolism, one might address this question by looking at the nascent field of systems biology. And indeed, the often declared goals and purposes of systems biology are not fundamentally different from those of early BST. Under the common heading of “understanding” a biological system, one might place them into two categories, which at first appear to be rather different but in fact have quite a bit of overlap [44]. The first goal is the creation of large-scale models of an entire cell or organism. Such models would clearly be very useful in a vast array of applications, from metabolic engineering to drug targeting and the development of personalized disease simulators. The second type of understanding is the discovery of design and operating principles, which rationalize why a particular structure or process in nature outcompeted alternatives during evolution [34, 35, 78]. For instance, why does end product inhibition almost always target the first step in a linear chain of processes? Why are some genes controlled by inducers and others by repressors? Attaining the first goal of realistic simulators clearly requires very large models with many processes and parameters, while the second goal suggests the peeling away of any extraneous information, until the essence of a structure or process is revealed in a relatively small model. Nonetheless, at a deep organizational level, the goals are two sides of the same issue, because most large systems in biology are modular and exhibit possibly generic design features at different levels. They are organized and controlled in a hierarchical manner so that a true understanding of ever smaller functional modules greatly enhances the understanding of the system as a whole.

Thus, the core goals and aspirations of BST are still valid, with an extension in scope toward biological systems in general. The methods of analysis have of course evolved, and it is now possible, for instance, to assess S-systems with Monte-Carlo simulations of millions of runs, a feat that, a few decades ago, could only be accomplished on a few computers worldwide. Of equal importance is the rapidly expanding availability of biological technologies, along with the enormous amounts of useful quantitative data they bring forth. Combined with a much enhanced appreciation for computational approaches among experimental biologists, there is an unprecedented exuberance that we might indeed be able to formulate and parameterize very large models of biological systems in the foreseeable future and use these models for the betterment of humankind.

The true holy grail of systems biology will be a theory of biology. It is easy to see what such a theory could do, when we study the transition of physics from an experimental to a theory-based science. Instead of studying one application at a time, we could make (and prove) general statements about entire classes of biological phenomena. Some biological laws and partial theories have already been proposed, but they are scarce and isolated in certain niches. For instance, the almost universal law of correspondence between amino acids and codons has had tremendous ramifications for the interpretation of genomic information, and the theory of evolution has helped us explain the relatedness and differences among species. Demand theory [114, 115] explains different modes of gene regulation, and the theory of multiple equilibria and of concordance [116, 117] addresses certain phenomena in metabolism. Nonetheless, while some inroads have been made, general theories of larger biological domains seem out of reach at this point.

3. BST and Other Canonical Modeling Approaches

The key to understanding large-scale systems in biology through modeling is an effective compromise between applicability, accuracy of representation, mathematical tractability, and efficient computational tools. As the term “compromise” suggests, no perfect modeling strategy is known that satisfies all four criteria in all relevant situations. It is not difficult to realize though that models consisting of vastly heterogeneous mixtures of functions and submodels, while possibly quite accurate, are not likely to permit streamlined formulations and analyses or crisp interpretations. By contrast, homogeneous model structures provide a high potential level of tractability and elegance, especially if powerful tools like linear algebra can be brought to bear on these structures. In recognition of this fact, several approaches for biological systems analysis have made use of “canonical representations” [2, 6, 11, 17, 81, 118]. These consist of mathematically homogeneous structures that are the result of strict construction rules which, in turn, are rigorously derived from mathematical theory.

The best-understood canonical model is the linear representation. When setting up a linear model, it is clear from the beginning that every process in the model is to be represented with a linear function and that the ultimate result will consist entirely of such linear functions. The reward of this severe constraint to model choice and construction is that a vast repertoire of effective methods is available for analysis. It is this wealth of methods that, for instance, has propelled engineering to the enormous level of sophistication we discussed before. As soon as one component of the model is made nonlinear, the homogeneity is lost, and streamlined analyses are hampered, if not precluded.

While linear models are rightfully the first choice for modeling, they obviously do not account for nonlinearities, and because biological phenomena are predominantly nonlinear, linear models are not directly applicable to biological systems. We will see, however, that some nonlinear canonical models possess linear features that allow analyses of certain aspects of nonlinear systems. Two questions thus arise: can we identify effective nonlinear canonical models, and if so, are the results worth the trouble? Let us begin with the second question of why it is beneficial to accept the constraints of canonical models. Several complementary answers may be given.

First, canonical models directly address the fundamental question of how to get started with the design of a model. As indicated above, we usually do not know the biophysical processes and mathematical functions that are optimal for describing complex processes in biology. One response of the modeling community has been the use of default models or ad hoc formulations that for some reason appear beneficial, even though they might not have a biophysical or chemical foundation and even if the necessary assumptions are not really valid. For instance, the standard representation of enzyme-catalyzed reactions is the Michaelis-Menten function [106–108], although the biochemical assumptions underlying the use of this function are typically not satisfied for metabolic pathways in vivo [4, 12, 13]. At the same time, this function has been used to study processes that have little to do with enzymes or biochemistry, such as the uptake of nutrients from soil through roots [119]. In canonical approaches, the representation of such a process is formally prescribed and even guaranteed to be correct within some limits.

Second, the result of a canonical model design is by definition a homogeneously structured model that permits analyses that are almost independent of the size of the model. Similar to the case of linear models, for instance, canonical Lotka-Volterra models and S-system models within BST permit straightforward analyses of steady states and their features, no matter how many variables are involved; we will discuss this feature later in detail. As a very practical aspect, a homogeneous model structure facilitates the development of customized software. Such software can be tailored specifically for the canonical structure and optimized in unique ways, because it does not have to be prepared for eventualities that often emerge in ad hoc models. As an example, a numerical solver for systems of power-law differential equations does not have to anticipate the occurrence of poles or other ill-defined situations and can therefore be optimized in unique ways [85–87]. Similarly, different efficient solvers for sensitivity computations and boundary value problems made use of the homogeneous structure of BST models [88, 120–127].

Third, well-chosen canonical models come with some guarantees regarding their accuracy. Because they are essentially always the result of specific approximations, they are known to be exact at some operating point of choice and very accurate close to this point. At the same time, it is clear that their accuracy is of unknown quality if one moves away from this point. This issue, however, is a genuine feature of all models in biology and will be discussed later in more detail within the context of BST.

Fourth, canonical models possess an interesting property that has been called telescopic [128]. Because these models are always constructed according to the same principles, the specific organizational level for which the model is designed is immaterial. Thus, whether a model represents the dynamics of genes, metabolites, organisms, or populations, the model structure is always the same. If an additional, more detailed module is created at a lower level than the model, its structure is guaranteed to be the same, and the result can be seamlessly incorporated into the model at the higher level. As a consequence, canonical models are natural starting points for multiscale modeling approaches.

Finally, canonical models permit objective comparisons, because they only differ in parameter values, but not in structure. By contrast, if two ad hoc models are to be compared, it is not even clear how many parameters are to be considered. For instance, the typical Hill function $H(X) = V X^2/(K^2 + X^2)$ apparently has two parameters ($V$ and $K$). However, one could count the Hill coefficient as a third parameter, which happens to be equal to 2 in this formulation, but could be changed to a different value. More generally, it is all but trivial to compare different functional forms in an objective manner. As a specific example, consider a Hill model and a generalized logistic model, which have very similar graphs but different formats and different numbers of parameters (see Figure 1). Two canonical models of the same kind are always much more similar, because they have the same structure and types of parameters, thus allowing direct, fair comparisons of models of the same size and even of nested models that include more and more parameters [129].

All canonical models have a certain appeal from a conceptual point of view, but different choices of models have their genuine advantages and drawbacks, and, as always, they all constitute different compromises with respect to structural suitability, accuracy, mathematical and computational tractability, complexity, their numbers of parameters, and a host of other features. While this paper focuses primarily on BST, alternative canonical approaches, using ordinary differential equations (ODEs), should be mentioned.

Before we discuss alternative canonical models in detail, it is useful to introduce generic nomenclature. We distinguish two types of variables. Dependent variables are possibly affected by the action of the system and may change over time. In a system of differential equations, each dependent variable has its own equation. Independent variables are either constant or follow a dynamic that is controlled from outside the system. They typically do not have their own equations. Examples are the brightness and temperature during the day-night cycle, an externally managed feeding regimen, or an enzyme activity that is believed not to change during a numerical experiment. Parameters are placeholders for numerical values that make the model specific. Their values are constant throughout any given (computational) experiment, but may be changed from one experiment to the next. Fundamental constants such as $\pi$ and $e$ are sometimes used but not as explicit model components.

The generic format of ODE models for biochemical, and various other biological, systems is

$$\dot{X} = N \cdot R(X).$$

Here $X$ is the vector of dependent variables, $N$ is the stoichiometric matrix, and $R$ is a vector of reactions. The stoichiometric matrix describes which variables are involved in which reaction [130].

As an example, consider the model of a branched pathway system, as shown in Figure 2. The system contains three dependent variables ($X_1$, $X_2$, and $X_3$) and five reactions ($R_1, \ldots, R_5$). Thus, the stoichiometric matrix has three rows and five columns and reads

$$N = \begin{pmatrix} 1 & -1 & 0 & 0 & 0 \\ 0 & 1 & -1 & -1 & 0 \\ 0 & 0 & 1 & 0 & -1 \end{pmatrix}.$$

Positive entries correspond to influxes, while negative entries signify effluxes from a metabolite pool. Most entries are 0’s and 1’s, but other quantities may account for the stoichiometry of splitting or merging reactions. The vector of reactions directly shows the format of the canonical (or noncanonical) model. For instance, each $R_k$ could be a linear function, a Michaelis-Menten or Hill rate function, a power-law function, or some other representation.
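To make the notation concrete, the following sketch assembles this stoichiometric matrix and integrates the generic form numerically. The power-law rate laws and all parameter values are merely illustrative assumptions, not taken from the original example; any other kinetic format could be substituted in R without changing the rest of the code.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Stoichiometric matrix N (3 metabolites x 5 reactions) for the branched
# pathway: -> X1 (R1), X1 -> X2 (R2), X2 -> X3 (R3), X2 -> out (R4), X3 -> out (R5).
N = np.array([
    [ 1, -1,  0,  0,  0],   # X1: produced by R1, consumed by R2
    [ 0,  1, -1, -1,  0],   # X2: produced by R2, consumed by R3 and R4
    [ 0,  0,  1,  0, -1],   # X3: produced by R3, consumed by R5
])

def R(X):
    """Vector of reaction rates; here hypothetical power-law rate laws."""
    X1, X2, X3 = X
    return np.array([
        2.0 * X3**-0.5,      # R1: input, inhibited by X3 (negative exponent)
        1.5 * X1**0.6,       # R2
        0.8 * X2**0.5,       # R3
        0.4 * X2**0.9,       # R4
        1.2 * X3**0.7,       # R5
    ])

def dXdt(t, X):
    return N @ R(X)          # generic canonical form: dX/dt = N * R(X)

sol = solve_ivp(dXdt, (0.0, 20.0), [0.5, 0.5, 0.5])
print(sol.y[:, -1])          # state approached by t = 20
```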

3.1. Types of Nonlinear Canonical Models
3.1.1. Lotka-Volterra Models

The oldest and best-known nonlinear canonical modeling structure is the Lotka-Volterra (LV) model, which typically describes interactions among populations [131–136]. Specifically, the dynamics of a population $X_i$ is described as a sum of one linear term and some or all binary interactions between any two populations, which leads to the formulation

$$\dot{X}_i = a_i X_i + \sum_{j=1}^{n} b_{ij} X_i X_j, \qquad i = 1, \ldots, n.$$

This formulation contains a quadratic term $b_{ii} X_i^2$ for variable $X_i$ in its own equation, which is often interpreted as a crowding term that becomes important when a population grows too large. Any of the parameters $a_i$ or $b_{ij}$ may be positive, negative, or have values of 0. If all variables are strictly positive, one may divide both sides of each equation by $X_i$, and the result is a system in the format

$$\frac{d(\ln X_i)}{dt} = \frac{\dot{X}_i}{X_i} = a_i + \sum_{j=1}^{n} b_{ij} X_j,$$

where the change in the logarithm of a variable is given as a linear function of all variables. This formulation is quite intriguing, because it turns out that many nonlinear canonical forms contain three ingredients: differentiation, summation, and the logarithm of variables.
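The canonical character of the LV format is apparent in code: a complete model is specified by nothing more than a coefficient vector and an interaction matrix. The sketch below uses hypothetical values for a two-species predator-prey pair; it is an illustration of the structure, not of any model from the literature.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Canonical LV structure: dX_i/dt = X_i * (a_i + sum_j b_ij * X_j).
# Any LV model differs only in the sizes and values of a and B.
a = np.array([1.0, -0.5])          # hypothetical linear coefficients
B = np.array([[0.0,   -0.1],       # hypothetical interaction matrix
              [0.075,  0.0]])

def lv(t, X):
    return X * (a + B @ X)

sol = solve_ivp(lv, (0.0, 60.0), [10.0, 5.0], max_step=0.1)
print(sol.y[:, -1])                # oscillating predator-prey trajectory
```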

The intriguing aspect is that every LV model has this same format and that the only differences between two models are the number of variables (equations) and the numerical values of the parameters. One might obtain the impression that the format severely limits the repertoire of possible model responses. However, this impression is wrong, and LV models can represent the most complicated nonlinearities [21, 134], including deterministic chaos. For instance, the graph in Figure 3 is the output of an LV system with just four variables [71, 137, 138].

Although LV models have a long history in ecology and are very rich in their possible responses, they are not well suited to describe biochemical systems. The reason is that this specific canonical format is incompatible with basic functionalities of pathways of enzymatic reactions. As a simple, generic example, consider a pathway containing a bimolecular reaction and one feedback signal, as depicted in Figure 4.

This system is generically modeled as

$$\begin{aligned}
\dot{X}_1 &= I_1 - f_2(X_1, X_4),\\
\dot{X}_2 &= f_2(X_1, X_4) - f_3(X_2, X_3),\\
\dot{X}_3 &= I_3 - f_3(X_2, X_3),\\
\dot{X}_4 &= f_3(X_2, X_3) - f_4(X_4),
\end{aligned}$$

with constant inputs $I_1$ and $I_3$ and some functions $f_i$ that we need not specify for this illustration. The LV format has problems with several of the processes in this simple example. A constant input to the system is incongruent with the LV format; it would have to be formulated in LV as a linear function of $X_1$ or $X_3$, respectively. The reaction from $X_1$ to $X_2$ is a problem in the equation of $X_2$, because the corresponding term describing the production of $X_2$ should include $X_1$ as a substrate, as well as the inhibitor $X_4$, but not $X_2$ itself, because $X_2$ does not contribute to its own generation. The generation of $X_4$ should be modeled with a term containing $X_2$ and $X_3$, but not $X_4$, which, again, is problematic. Thus, while LV models have been extremely successful for systems of pair-wise interaction and simple growth processes, they are not suitable for systems of biochemical reactions. Their previously mentioned generality and flexibility are the result of defining auxiliary variables that artificially increase the size of the system (see the later section on Recasting for parallels to BST).

3.1.2. BST Models

BST prescribes canonical models in which every process is formulated as a product of power-law functions. BST permits several variants, as we will discuss later, but once a variant is chosen, the resulting model always has the same homogeneous structure. Similar to the cases of linear and LV models, the only differences are manifest in the number of equations and the parameter values.

BST has two roots: first, Hendrik Wade Bode (1905–1982) showed that a ratio of polynomials can be represented in small pieces by straight lines, once a logarithmic transformation has been applied to the dependent variable and the function [1, 101, 112]. Second, Brook Taylor (1685–1731) had shown much earlier that any sufficiently smooth function (with sufficiently many continuous derivatives) can be approximated with a polynomial of order $n$. For $n \to \infty$, the approximation becomes equivalent with the approximated function, but, even for small $n$, the approximation is exact at one point of choice, called the operating point, and very accurate close to this point. Further away, the two typically diverge, but general statements characterizing the rate of divergence and the accuracy of the approximation are seldom practically helpful, even though they can be formulated. Specifically, the sufficiently smooth function $f(x)$ with at least $n$ continuous derivatives is represented at the operating point $x_0$ as

$$f(x) \approx f(x_0) + f'(x_0)(x - x_0) + \frac{f''(x_0)}{2!}(x - x_0)^2 + \frac{f'''(x_0)}{3!}(x - x_0)^3 + \cdots + \frac{f^{(n)}(x_0)}{n!}(x - x_0)^n.$$

The quantities $f'$, $f''$, $f'''$, and $f^{(n)}$ denote the first, second, third, and $n$th derivative of $f$, respectively. The higher derivatives contribute less and less to the representation, because they are divided by factorials $k!$, which rapidly become large. Arguably the most useful variant is linearization, where $n = 1$, and the result is

$$f(x) \approx f(x_0) + f'(x_0)(x - x_0).$$

The slope is given by the derivative $f'(x_0)$, and the intercept is $f(x_0) - f'(x_0)\,x_0$. Figure 5 illustrates this linearization for a representative function at a chosen operating point (for details, see [6]).

Figure 5 immediately indicates that the range of accurate representation may be small. However, in many practical cases it is sufficient. For instance, consider the clearly nonlinear function in Figure 6. If the dependent variable always operates within a range such as [2, 5], the linearization may be sufficient and is certainly much simpler.

BST uses this linearization strategy, but because the goal is a nonlinear canonical form, the approximation is executed in logarithmic coordinates. Specifically, one linearizes $\ln f$ as a function of $\ln X$, and the results are always of the form $f(X) \approx \gamma X^{g}$. The kinetic order, that is, the exponent $g$, is computed first. Executing the Taylor approximation in logarithmic space leads to the result that $g$ is equal to the derivative of $f$ with respect to $X$, multiplied by $X$, divided by $f(X)$, and evaluated at an operating value $X_0$ of $X$ that we may choose:

$$g = \frac{df}{dX}\cdot\frac{X}{f(X)}\bigg|_{X = X_0}.$$

For a specific function, specific parameter values, and a chosen operating point, this formula yields a concrete number; in the example used here, $g$ at the operating point is 1.036.

At the operating point, $f$ and its power-law approximation must, by definition, have the same value. This fact allows us to compute the rate constant as

$$\gamma = \frac{f(X_0)}{X_0^{g}}.$$

In the example, the value of $f$ at the operating point $X_0$ is 0.619, and we have already determined $g$ as 1.036; therefore, $\gamma = 0.619/X_0^{1.036}$. It is clear that the choice of a different operating point leads to a different power-law representation. Two examples are illustrated in Figure 7.
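Both steps are easily mechanized. The sketch below computes the kinetic order and the rate constant for an arbitrary positive-valued function at a chosen operating point via a numerical derivative; the Michaelis-Menten test function and the operating point are hypothetical stand-ins, not the example behind Figure 7.

```python
import numpy as np

def power_law_fit(f, x0, dx=1e-6):
    """Local power-law representation f(x) ~ gamma * x**g at operating point x0:
    g = f'(x0) * x0 / f(x0); gamma is fixed by matching f at x0."""
    fprime = (f(x0 + dx) - f(x0 - dx)) / (2.0 * dx)   # central difference
    g = fprime * x0 / f(x0)
    gamma = f(x0) / x0**g
    return gamma, g

# Hypothetical example: a Michaelis-Menten rate law, operating point X0 = 2.
f = lambda x: 1.0 * x / (0.5 + x)
gamma, g = power_law_fit(f, 2.0)
print(gamma, g)                  # gamma * x**g is exact at x = 2
print(f(2.0), gamma * 2.0**g)    # identical values at the operating point
```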

Studying the steps of the Taylor approximation in more detail, one finds that a positive effect of a variable on a function results in a positive kinetic order, that a negative effect, such as inhibition, results in a negative kinetic order, and that a kinetic order of 0 corresponds to the situation where the function is not affected by the variable at all. These correspondences are of tremendous value for designing BST models, as we will discuss later.

The derivation of the power-law terms also indicates that nothing is really limited to biochemistry. As long as variables and functions are positive-valued, BST equations can be developed whether the application is biochemical, biological, or comes from some other field.

One intriguing feature of Taylor’s method is that it applies in an analogous fashion to functions of several dependent variables. The only true difference is that the derivatives in this case are partial derivatives. Specifically, if one wants to approximate the function $f(X, Y)$ as a power-law function, the results are already known to have the format $\gamma X^{g} Y^{h}$. The kinetic order $g$ is computed as

$$g = \frac{\partial f}{\partial X}\cdot\frac{X}{f(X, Y)}\bigg|_{(X_0, Y_0)},$$

and the analogous is true for $h$. As before, the rate constant $\gamma$ is computed by equating $f$ and the power-law term at an operating point of choice.

As an example, consider a generalized Monod model, which describes the growth rate $\mu$ of a microbial population in a fermenter. It depends in this example on the substrate $S$ and is also affected by the alcohol $A$ that is generated in the fermentation process (cf. [139–141]). A suitable formulation is, for instance,

$$\mu(S, A) = \mu_{\max}\,\frac{S}{K_S + S}\cdot\frac{K_I}{K_I + A},$$

where $\mu_{\max}$, $K_S$, and $K_I$ are parameters that characterize the growth and product inhibition characteristics. We already know the format of the corresponding power-law representation, which is

$$\mu(S, A) \approx \gamma\, S^{g} A^{h}.$$

The kinetic orders and the rate constant are calculated as described above, and the result is

$$g = \frac{K_S}{K_S + S_0}, \qquad h = -\frac{A_0}{K_I + A_0}, \qquad \gamma = \frac{\mu(S_0, A_0)}{S_0^{g} A_0^{h}};$$

see [4]. All three quantities are to be evaluated at some operating point $(S_0, A_0)$ of our choice. Because it represents the effect of the substrate on the process, $g$ is positive. By contrast, $h$ is negative, because it represents product inhibition. As in the one-variable case, any multivariate power-law model is a local representation that is exact at the operating point, very accurate close by, and of ill-characterized accuracy everywhere else. A detailed example of how these representations enter a complete systems model is given in a later section.
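The same computation extends to several variables through partial derivatives. The sketch below approximates a growth function of the form given above as a product of power laws; the parameter values and the operating point are illustrative assumptions.

```python
import numpy as np

def multivariate_power_law(f, ops, dx=1e-6):
    """Approximate f(X1,...,Xn) as gamma * prod(Xi**gi) at operating point ops.
    Each kinetic order gi = (df/dXi) * Xi / f, evaluated at ops."""
    ops = np.asarray(ops, dtype=float)
    f0 = f(ops)
    orders = []
    for i in range(len(ops)):
        hi, lo = ops.copy(), ops.copy()
        hi[i] += dx
        lo[i] -= dx
        dfdx = (f(hi) - f(lo)) / (2.0 * dx)   # central partial derivative
        orders.append(dfdx * ops[i] / f0)
    orders = np.array(orders)
    gamma = f0 / np.prod(ops**orders)
    return gamma, orders

# Generalized Monod growth rate with product inhibition; hypothetical
# parameters: mu_max = 1.2, K_S = 0.8, K_I = 2.5; operating point (S, A) = (1.0, 0.5).
mu = lambda X: 1.2 * X[0] / (0.8 + X[0]) * 2.5 / (2.5 + X[1])

gamma, (g, h) = multivariate_power_law(mu, [1.0, 0.5])
print(gamma, g, h)   # g > 0 (substrate effect), h < 0 (product inhibition)
```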

3.1.3. Other Canonical Models

A generalization of BST models, as well as the LV format, is the Generalized Lotka-Volterra model. In this format, any power-law term that appears in any equation is included in every equation, if necessary with a multiplier of 0 [142–148]. As a result, all equations formally contain a sum of the same terms. This structural homogeneity, although somewhat artificial, permits higher-level algebraic manipulations that can be used for elegant classification purposes. However, as first-line models for the practical analysis of pathways, this format is overly complicated.

Other canonical forms have been proposed more recently. The first is the lin-log model, which is linear in logarithmically transformed variables, which are all expressed in relation to their normal reference values [62, 149–151]. As a result, a process description in a lin-log model always has the form

$$\frac{v}{J_0} = \frac{e}{e_0}\left(1 + \sum_{i}\varepsilon_{i}^{0}\,\ln\frac{X_i}{X_{i,0}}\right),$$

where the $X_i$ are metabolites and modulators, the $e$ are enzymes, and variables with index 0 are the corresponding reference values. Furthermore, $v$ is the flux rate, and $J_0$ is the reference steady-state flux through the pathway. The parameters $\varepsilon_{i}^{0}$ are reference “elasticities,” which play the same role as kinetic orders in BST. Lin-log models are directly related to the analytical framework of Metabolic Control Analysis (MCA), which targets the effects of parameter changes on the steady state of a pathway system. Many comparisons between BST and alternative approaches, including MCA and lin-log models, have been presented throughout the history of BST [32, 42, 52–55, 58, 59, 62, 63, 152–166]. In terms of accuracy, power-law models are better suited for small substrate concentrations, and lin-log models for large concentrations, and combinations of models often turn out to be most accurate over wide ranges of variations in variables [167], although these combinations lose the key advantage of structural homogeneity of canonical models. The collective experience with the theory and applications of lin-log systems is so far much smaller than for LV systems and the variant formats within BST.
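This accuracy comparison can be made concrete with a small numerical experiment: both formats are anchored locally to the same Michaelis-Menten rate law at the same reference state and then evaluated far from it. The rate law and all values are hypothetical, chosen only to illustrate the qualitative claim.

```python
import numpy as np

# Hypothetical reference rate law: Michaelis-Menten with V = 1, K = 1.
v = lambda x: x / (1.0 + x)

x0 = 1.0       # reference (operating) state
eps = 0.5      # local elasticity K/(K + x0) = kinetic order at x0

power_law = lambda x: v(x0) * (x / x0)**eps                 # BST form
lin_log   = lambda x: v(x0) * (1.0 + eps * np.log(x / x0))  # lin-log form

for x in [0.05, 0.5, 1.0, 5.0, 20.0]:
    print(f"x={x:5.2f}  exact={v(x):.3f}  "
          f"power-law={power_law(x):.3f}  lin-log={lin_log(x):.3f}")
```

At small concentrations the lin-log form even turns negative, while the power-law form remains qualitatively reasonable; at large concentrations the situation reverses, in line with the statement above.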

Another more recent canonical modeling format is the representation of metabolic processes with sigmoidal modules of Hill-type, rather than power-law functions [42, 168, 169]. Thus, a process involving two variables is represented as

$$v(X_1, X_2) = \frac{V\, X_1^{n_1} X_2^{n_2}}{(K_1 + X_1^{n_1})(K_2 + X_2^{n_2})},$$

where $V$, $K_1$, $K_2$, and $n_1$, $n_2$ are parameters. As one might expect, this formulation permits greater flexibility than the corresponding power-law representation but requires more parameter values.

Vinga and colleagues compared BST with the concepts of dynamic energy budget theory and proposed a combined approach that used aspects of both [170]. Her group also presented a comprehensive comparison of several modeling frameworks, including BST, flux balance analysis (FBA), and Petri nets [171]. Wu and Voit embedded BST into hybrid functional Petri nets, which expanded the range of phenomena that could be modeled [172–174] (see also [175]). They also considered stochastic analogues to GMA systems [176]; see also [177].

4. Model Design, Parameter Estimation, and Diagnostics in BST

A great advantage of BST models is the simplicity with which the modeling process can be initiated [3, 16, 30]. This simplicity is directly related to the theory behind the power-law representation, which assigns a direct and unique interpretation to each parameter.

4.1. Setting Up Equations

The typical starting point of model design is a pathway diagram, as we saw in Figure 2, but which in real applications is usually much larger and more complicated. Even for large systems, a diagram of pools and processes immediately dictates how a model is to be set up. Two basic decisions have to be made. First, which of the variables should be declared as dependent and which should be independent? And second, which modeling format should be chosen? The first question requires biological insight and good judgment that balance realism and practical feasibility, because a dependent variable ultimately requires much more input information than an independent variable. In response to the second question, we focus primarily on BST, which permits a choice among a few alternatives.

The main ingredient of BST is the power-law approximation of all processes in the system, as it was discussed before. While this core concept is immutable, it leaves a residual degree of freedom, which leads to different variants within BST. This variability stems from differences in the order in which the processes in the system are formulated and approximated. Coarsely speaking, one may approximate every process in the system separately, which leads to the Generalized Mass Action (GMA) form, or one may aggregate some processes first and then approximate the result, which leads to S-systems and Half-systems. The differences are most clearly illustrated with an example, and we consider for this purpose again the branched pathway that was shown before in Figure 2.

The construction of equations is facilitated by the interpretation of kinetic orders, as discussed earlier: positive and negative effects correspond to positive and negative kinetic orders, respectively, and, importantly, if there is no effect, the kinetic order is zero. For instance, considering the process $R_2$ in Figure 2, we know immediately that it will exclusively contain $X_1$ as a variable. Process $R_1$ contains the precursor $X_0$ with a positive kinetic order and $X_3$ with a negative kinetic order, because it is an inhibitor. Applying these rules, we obtain the symbolic BST equations

$$\begin{aligned}
\dot{X}_1 &= \gamma_1 X_0^{f_{10}} X_3^{f_{13}} - \gamma_2 X_1^{f_{21}},\\
\dot{X}_2 &= \gamma_2' X_1^{f_{21}'} - \gamma_3 X_2^{f_{32}} - \gamma_4 X_2^{f_{42}},\\
\dot{X}_3 &= \gamma_3' X_2^{f_{32}'} - \gamma_5 X_3^{f_{53}}.
\end{aligned}$$

This format models every flux independently of others, but inspection of the diagram mandates some equivalences between terms, because the same processes sometimes appear twice; namely, we need to require $\gamma_2' = \gamma_2$ and $f_{21}' = f_{21}$, as well as $\gamma_3' = \gamma_3$ and $f_{32}' = f_{32}$. This format, which focuses on processes, is called the Generalized Mass Action (or GMA) format.

An alternative is the S-system format. This terminology first emerged in the mid-1980s [11, 18] and refers to the fact that this system representation, in a minimal fashion, is capable of representing “synergistic and saturable” phenomena. The S-system form focuses on the pools (dependent variables) and collectively formulates all incoming processes and all outgoing processes in one term each. If there are no branches, there is no difference to the GMA format. However, there is a notable difference where two processes diverge from $X_2$. In the GMA format, the loss of $X_2$ is represented with the two terms $\gamma_3 X_2^{f_{32}} + \gamma_4 X_2^{f_{42}}$. In the S-system format, one argues that the loss of $X_2$ only depends on $X_2$, thus leading to the single, aggregate term $\beta_2 X_2^{h_{22}}$. At the operating point, the two formulations are exactly equivalent, but for other values of $X_2$ they are generally different, unless $f_{32}$ and $f_{42}$ happen to have the same value. Thus, the S-system model with typical parameter names is

$$\begin{aligned}
\dot{X}_1 &= \alpha_1 X_0^{g_{10}} X_3^{g_{13}} - \beta_1 X_1^{h_{11}},\\
\dot{X}_2 &= \alpha_2 X_1^{g_{21}} - \beta_2 X_2^{h_{22}},\\
\dot{X}_3 &= \alpha_3 X_2^{g_{32}} - \beta_3 X_3^{h_{33}}.
\end{aligned}$$

If $f_{32}$ and $f_{42}$ have different values, $h_{22}$ is a weighted average of the two, and the weights are the flux rates toward $X_3$ and the branch. Thus, the following constraints have to be satisfied for the S-system and the GMA system to be equivalent at the operating point of choice:

$$h_{22} = \frac{f_{32}\,\gamma_3 X_2^{f_{32}} + f_{42}\,\gamma_4 X_2^{f_{42}}}{\gamma_3 X_2^{f_{32}} + \gamma_4 X_2^{f_{42}}}, \qquad \beta_2 = \frac{\gamma_3 X_2^{f_{32}} + \gamma_4 X_2^{f_{42}}}{X_2^{h_{22}}},$$

with all terms evaluated at the operating point. Although the functional representations at branch points are different in the GMA and S-system forms, the dynamics of the resulting models is often rather similar. For instance, if the branched pathway system in Figure 2 is initiated away from the steady-state operating point, the two transients are essentially the same (Figure 8).
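This near-equivalence is easy to probe numerically. The sketch below implements both versions of the branched pathway with hypothetical parameter values, derives $\beta_2$ and $h_{22}$ from the GMA terms via the constraints above, and integrates both models from a perturbed initial state, in the spirit of Figure 8.

```python
import numpy as np
from scipy.integrate import solve_ivp

# GMA version of the branched pathway (all parameter values hypothetical).
g1, f13 = 2.0, -0.4      # input flux, inhibited by X3
g2, f21 = 1.5, 0.5       # X1 -> X2
g3, f32 = 0.8, 0.5       # X2 -> X3
g4, f42 = 0.4, 0.9       # X2 -> branch
g5, f53 = 1.2, 0.6       # X3 -> out

def gma(t, X):
    X1, X2, X3 = X
    v1, v2 = g1 * X3**f13, g2 * X1**f21
    v3, v4, v5 = g3 * X2**f32, g4 * X2**f42, g5 * X3**f53
    return [v1 - v2, v2 - v3 - v4, v3 - v5]

# Aggregate the two efflux terms of X2 into a single S-system term at an
# operating point chosen near the steady state: h22 is the flux-weighted
# average of f32 and f42, and beta2 matches the total efflux there.
X2op = 2.2
V3, V4 = g3 * X2op**f32, g4 * X2op**f42
h22 = (f32 * V3 + f42 * V4) / (V3 + V4)
beta2 = (V3 + V4) / X2op**h22

def ssystem(t, X):
    X1, X2, X3 = X
    return [g1 * X3**f13 - g2 * X1**f21,
            g2 * X1**f21 - beta2 * X2**h22,
            g3 * X2**f32 - g5 * X3**f53]

X_init = [0.3, 1.0, 0.4]                     # start away from the steady state
sol_gma = solve_ivp(gma, (0, 25), X_init, max_step=0.05)
sol_ssy = solve_ivp(ssystem, (0, 25), X_init, max_step=0.05)
print(sol_gma.y[:, -1])                      # the two transients end up
print(sol_ssy.y[:, -1])                      # at nearly the same state
```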

In the Half-system format [113], all processes contributing to the dynamics of a variable are collectively represented in a single power-law term, and the corresponding representation for the branched pathway is

$$\dot{X}_1 = \alpha_1 X_0^{g_{10}} X_1^{g_{11}} X_3^{g_{13}}, \qquad \dot{X}_2 = \alpha_2 X_1^{g_{21}} X_2^{g_{22}}, \qquad \dot{X}_3 = \alpha_3 X_2^{g_{32}} X_3^{g_{33}}.$$

The three representations are mathematically different and have their genuine advantages and disadvantages. The greatest advantage of the GMA format is that it is the most intuitive: it focuses on fluxes, and every flux is mapped directly onto one power-law term. The S-system representation focuses on pools (variables). Its greatest appeal is its simpler two-term format, which leads to unusual mathematical opportunities, which we will discuss later. In particular, the steady state of an S-system can be computed with methods of linear algebra, whereas other formats, including GMA, require more complicated methods [178180]. Maybe surprisingly, the S-system format is also the more accurate representation for functions with a hyperbolic shape, that is, functions that start at a small value and monotonically grow toward saturation, such as a Michaelis-Menten rate function. The Half-system form has the simplest mathematical structure, which permits, for instance, very simple means of parameter estimation from time series data [181]. However, it can only capture steady states in which at least one variable is zero; otherwise the dynamics continues to increase or decrease. In the section on recasting, we will introduce the even simpler format of Binary Systems, which are still surprisingly rich in their repertoire of responses [113].

All these formats possess the telescopic property [128]: the overall response of a model at a lower level may be approximated as a power-law term, which can replace a dependent variable at a different level, and since any power-law term, raised to a power, is again a power-law term, the format is preserved at the new focus level.

Summarizing the main variants within BST, the different orders of flux aggregation and power-law approximation always result in the following formats. (1) If every process is modeled with its own power-law representation, the result is a GMA system. Each equation in a GMA system with $n$ dependent variables, $m$ independent variables, and $T_i$ terms has the format

$$\dot{X}_i = \sum_{k=1}^{T_i} \gamma_{ik} \prod_{j=1}^{n+m} X_j^{f_{ikj}}, \qquad i = 1, \ldots, n.$$

(2) If, for each (metabolite) pool, all incoming processes and all outgoing processes are first aggregated and then collectively modeled with one power-law representation each, the result is an S-system. Each equation in an S-system with $n$ dependent and $m$ independent variables has the format

$$\dot{X}_i = \alpha_i \prod_{j=1}^{n+m} X_j^{g_{ij}} - \beta_i \prod_{j=1}^{n+m} X_j^{h_{ij}}, \qquad i = 1, \ldots, n.$$

(3) If all processes affecting a (metabolite) pool are aggregated and then collectively modeled with a single power-law representation, the result is a Half-system. Each equation in a Half-system with $n$ dependent and $m$ independent variables has the format

$$\dot{X}_i = \alpha_i \prod_{j=1}^{n+m} X_j^{g_{ij}}, \qquad i = 1, \ldots, n.$$

No matter which representation is chosen, the procedure of setting up symbolic equations is fully prescribed and easily learned by novices [182].

4.2. Parameter Estimation/Inverse Problems

Once one has decided which variables should be dependent or independent, the construction of model equations in BST is straightforward. Indeed, it is possible to write computer programs that automatically set up correct equations [45, 91–93] (see also [30]). The equations thus established are initially symbolic, which means that numerical values have not yet been assigned to the parameters. Nonetheless, a big step forward has been made, and this step would have been much more difficult without a canonical modeling structure. The next step is usually parameter estimation, that is, the assignment of numerical values to all parameters in the symbolic equations, including the initial values of all differential equations in the system. Countless methods have been developed for this difficult step, but none of them is ideal in all situations. Interestingly, the history of modeling has iteratively experienced phases of relatively high and low data availability, which had a direct effect on parameter estimation methods [19].

Parameter estimation and structure identification are arguably the most difficult steps of a biological systems analysis. In pure parameter estimation tasks, the model structure is assumed to be fully known, whereas structure identification refers to the situation where the topology and regulation of a system are uncertain, as well as the parameter values [70]. Clearly, the latter is more difficult than the former.

Until recently, almost all parameters of biological systems were estimated in a bottom-up fashion [49]. In the case of biochemical systems, one searched for kinetic information pertaining to individual enzymes, modeled each reaction step in isolation, and then merged all models of these steps into comprehensive systems models. In most cases, the pieces did not truly fit together, with the result that the integrated model exhibited unrealistic responses. These prompted a revisiting of the model and the kinetic information and often resulted in a long drawn-out process of refinements and parameter tuning. Good examples of such extensive model construction efforts are [37, 183189].

Sorribas’ group proposed different approaches, including receiver-operating characteristic curves [190] and nonlinear regression methods [191–193]. An advantage of BST models for all these estimation tasks is the fact that every parameter has a clear and direct meaning, which often permits the definition of relatively small search ranges, especially for kinetic orders [3].

Somewhat of a transition between the local estimation of process descriptions and the use of global data was an approach that uses dynamic data, but only close to an operating point. Namely, it was proposed to use linearized equations for studying system responses to small perturbations around the steady state and to estimate the Jacobian by simple multilinear regression [194, 195]. Similarly, Sorribas proposed the numerical characterization of BST systems from transient responses of systems to perturbations [191, 196–199]. Savageau noted that models can be identified, in principle, from sensitivity profiles [200].

As early as 1982, Voit and Savageau proposed a drastically different estimation method for situations where metabolite concentrations had been measured at successive time points [16, 201]; similar ideas were presented in the same year by Varah [202]. The basic idea is to replace the differentials on the left-hand sides of a system of differential equations with slopes that are to be estimated from the time series data at sufficiently many time points. The result of this procedure is a system of algebraic equations. This system consists of $n$ sets of $k$ equations, where $n$ is the number of dependent variables and $k$ is the number of time points at which slopes are estimated. This conversion of differential into algebraic equations avoids all numerical integrations, which in typical cases consume more than 95%, if not 99%, of the overall time to estimate parameter values [203]. The slope-substitution method is not without problems though, because the variable time becomes implicit, which can lead to misinterpretations of the data [203] or a “time-warping” of the solution [6]. Nonetheless, the slope estimation strategy is statistically valid [204] and can, at the very least, be used to develop coarse solutions from which to start regular parameter optimization approaches. Reviews of this subfield include [68, 69, 205–214].
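A minimal sketch of the idea, assuming a hypothetical one-term power-law (half-system-like) process: once derivatives are replaced by finite-difference slopes, the parameters follow from simple linear regression in logarithmic coordinates, without a single numerical integration during the fit.

```python
import numpy as np
from scipy.integrate import solve_ivp

# "True" process (hypothetical): a single power-law degradation term.
beta, h = 0.8, 0.6
sol = solve_ivp(lambda t, X: -beta * X**h, (0, 5), [4.0],
                t_eval=np.linspace(0, 5, 30))
t, X = sol.t, sol.y[0]

# Slope substitution: replace dX/dt by finite-difference slopes, turning
# the ODE into one algebraic equation per time point:
#   slope_k = -beta * X_k**h   =>   ln(-slope_k) = ln(beta) + h * ln(X_k).
slopes = np.gradient(X, t)
A = np.vstack([np.ones_like(X), np.log(X)]).T
coef, *_ = np.linalg.lstsq(A, np.log(-slopes), rcond=None)
print(np.exp(coef[0]), coef[1])   # estimates of beta and h
```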

The early articles did not receive much attention, presumably because biologists only rarely generated time series data at the time. However, when the same ideas were reviewed in a textbook [3] and declared a bottleneck in model analyses, a flurry of algorithms was proposed to deal with time series data. In almost all cases, these algorithms worked fairly well for some applications, but failed for others, and the estimation community is still awaiting a truly exceptional solution.

The vast majority of estimation strategies from time series data made use of evolutionary algorithms and, in particular, genetic algorithms. Very many of these approaches were proposed by the groups of Okamoto, Kimura, and Tomita and applied to gene regulatory networks of different sizes. Each strategy presented a new variation or combination of techniques that improved performance, at least with respect to the presented applications. As the goal of many of these studies was the identification of interactions between pairs of genes, the parameter estimation became almost identical with structure identification, where corresponding kinetic orders were either zero (no interaction) or nonzero (interaction). Because numerical estimations seldom return 0 as the optimal value, different pruning strategies were implemented to replace small optimal values with 0, thereby simplifying the inferred network structure. Representative examples of this line of research in microbial systems are [195, 215–276]. Wu and collaborators applied similar approaches to the analysis of a eukaryotic cell-cycle gene network [277, 278].

Genetic algorithms were not the only proposals for the estimation task. Also using evolutionary methods, Mendoza’s group developed a method for estimating parameters in GMA systems, based on particle-swarm optimization [279–281]; see also [282, 283]. The group also tested ant colony optimization [284, 285] and simulated annealing [286, 287] (see also [288, 289]) and provided a benchmark system for comparing different approaches [290]. McKinney and Tian proposed an artificial immune system for the same estimation purposes and compared various underlying models [291]. Lee and Yang used a clustering approach [292].

Alternatives to evolutionary algorithms included least-squares regression [170, 293, 294], Kalman filter methods [295], and neural networks [203, 296, 297]. Yet other approaches to the same problem employed interval and Newton-flow analysis [298–302]. Chou and colleagues devised a method specifically for the estimation of S-systems, which alternates between the estimation of alpha- and beta terms [303, 304]; see also [305]. Tian and collaborators proposed a different type of alternating estimation, where linear and nonlinear parameters are estimated separately [306]. Lall et al. made use of precursor-product relationships in a metabolic pathway to estimate parameters [307, 308]. Chen et al. proposed a network component analysis for deducing the dynamic features of regulatory signaling systems [309].

Wang’s group suggested collocation methods that convert ODEs into algebraic systems and combined them with a hybrid differential evolution strategy that was able to find a global solution [310–314]. The group also proposed a multiobjective optimization approach to model inference [315, 316]. Others used fuzzy multiobjective optimization [317, 318]. Different groups used global optimization methods, based on branch-and-reduce techniques, to estimate parameter values in a metabolic system [319, 320].

Several authors used hybrid methods that combined evolutionary algorithms or genetic programming with statistical analysis or Monte-Carlo techniques [249, 321323]. Others proposed various decomposition schemes for different types of parameters [255, 324326].

Numerous groups utilized biological or numerical constraints to ameliorate the estimation challenge. Hecker et al. [327] and Mao et al. [328] demonstrated that the incorporation of information from different data sources, such as sequence and protein-DNA interaction data, benefits the network inference process. Hatzimanikatis’ group inferred gene regulatory networks by using an S-system method that was constrained with additional data on protein kinetics [329, 330]. Lecca and colleagues presented an estimation strategy for BST models based on a probabilistic model of the noise in experimentally measured metabolite concentrations [331]. Daisuke and Horton argued that gene expression networks are typically scale-free and that this property could be used to constrain the S-system-based network inference process [260]. Ko et al. constrained the estimation problem for GMA systems by using connectivity information of the pathway system, which is fundamental to metabolic control analysis (MCA) [332]. Indeed, since elasticity coefficients in MCA are equivalent with kinetic orders, the estimation techniques for these coefficients are applicable to BST models as well (e.g., see [333]). In contrast to these successes, Calçada and coworkers did not find benefits from restricting the inference task with topological constraints [334].

An important prerequisite of all methods that substitute derivatives with slopes and thereby decouple the system of differential equations is the adequate estimation of slopes directly from the time series data. Seatzu proposed B-splines [335, 336], and Vilela et al. composed a smoother that accounted for the noise structure in the data [337, 338]. Wang et al. compared various alternatives, including B-splines [339]. Nounou and coworkers used a multiscale filtering approach to denoise the data [340]. In addition to smoothing, it was also seen as useful to identify details of the system topology in order to prime the estimation process [341].
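A minimal sketch of this preprocessing step, assuming noisy measurements of a single variable: a smoothing spline is fitted to the data and differentiated analytically to supply the slopes required by the decoupled estimation. The trajectory, noise level, and smoothing factor are all hypothetical.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

rng = np.random.default_rng(1)
t = np.linspace(0, 10, 40)
X_true = 2.0 * np.exp(-0.3 * t) + 0.5             # hypothetical trajectory
X_noisy = X_true + rng.normal(0.0, 0.05, t.size)  # measurement noise

# Smoothing cubic B-spline; the smoothing factor s bounds the sum of
# squared residuals and should reflect the noise level in the data.
spline = UnivariateSpline(t, X_noisy, k=3, s=t.size * 0.05**2)
slopes = spline.derivative()(t)   # analytic derivative of the smooth fit
print(slopes[:5])                  # estimated dX/dt at the first time points
```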

4.3. Network Reconstruction and System Identification

While essentially all estimation methods described above had as the ultimate target “the one optimal solution,” several groups in recent years emphasized that biological systems models are not necessarily completely identifiable from the available data [342], but “sloppy” [343], and that one might be better advised to look for entire domains within the parameter space that correspond to similarly good solutions; for BST related studies see [322, 323, 344].

For canonical models, the transition between parameter estimation and structure identification or network reconstruction is fluid, because the structure of the inferred system changes if a rate constant or kinetic order parameter value becomes 0. Therefore, many of the estimation studies described before are to some degree structure identification or network reconstruction analyses. In addition, several approaches were proposed specifically for structure identification, and they are briefly discussed here.

As already mentioned, Sorribas’ group proposed the numerical characterization of BST systems from transient responses of systems to perturbations about the steady state [191, 194, 195, 197, 199].

Specifically addressing the direct relationship between parameter estimates and the structure of S-system models, mixed integer linear programs (MILP) and multiobjective optimization approaches were proposed to estimate both structure and parameter values [345, 346]. Marino and Voit devised a semiautomated model-finding approach for identifying the minimal connectivity of a system as well as its parameters [129]. Khoury and colleagues developed a symbolic method for learning dynamic models of compartmental systems, which combined genetic programming with a fuzzy representation environment [347].

Spieth and collaborators developed novel optimization methods to evolve the topology of a gene regulatory network as well as its parameter values, based on microarray data [348–353]; see also [354]. Liao’s group combined S-system concepts with the method of network component analysis (NCA) for the estimation of transcription factor activity [355]. Given that most genes are regulated by only a few transcription factors, the authors developed criteria for the identifiability of transcriptome networks that are effective even if the input information consists of sparse microarray data. Alves and colleagues used a combination of bioinformatics and structural biology, along with genomic, proteomic, and metabolic time series data, to reconstruct pathways and generate novel hypotheses [39]. Liu and colleagues devised a structure identification method that combines parameter estimation with a pruning strategy that adds a regularization term to the objective function and prunes the solution according to a user-specified threshold value [356]. Džeroski and Todorovski proposed machine-learning methods [212].

Schnell’s group reconstructed a system model for glycolysis in the bacterium L. lactis from time series data (see [357]), using a global nonlinear modeling technique that identifies the elementary reaction steps constituting the pathway [358].

Searson and colleagues developed a hybrid S-system approach for the inference of chemical reaction networks, based on time series data from fed-batch experiments, where essentially no a priori information regarding products and reactants was available [359, 360].

Kattas et al. [361] compared network identification strategies using GMA, S-system, or linear models. They tested the methods against three benchmark problems (with and without noise) that were designed to highlight specific features of biochemical pathways. Implemented as algorithms, their methods obtained better results than competing approaches.

Voit’s group proposed methods for identifying the functional shapes of fluxes in a metabolic pathway system, based on metabolic time series data [362–366].

Ndukum and colleagues addressed a different type of inference from time series data. Namely, they asked whether the difference between two dynamic substrate uptake profiles in yeast cultures were statistically significant [367]. A similar task was addressed by Kawagoe et al. [368].

Ferreira and collaborators designed kinetic assays for optimal model selection, under the assumption that two models were already parameterized and provided equivalent solutions [369].

Veflingstad and colleagues demonstrated that it is beneficial to obtain as much qualitative information about a system as possible before the system structure is identified [341].

As noted earlier, the structures of systems have also been assessed with large-scale simulations and post hoc filtering processes that distinguish acceptable from undesired structures [322, 323, 370–374].

Němcová [375, 376] developed a rigorous theory to characterize the identifiability problem for deterministic classes of polynomial and rational systems and noise-free data. She used an algebraic approach to realization theory and demonstrated that the biological systems in question, including BST systems, are smooth nonlinear Nash systems [377]. Němcová applied her results to a number of systems, including a glycolytic pathway model for the bacterium L. lactis [357]. Papachristodoulou and Recht addressed the interconnections in chemical networks by minimizing the one-norm of all possible reaction rates [378]. While they demonstrated the method with mass action kinetics, this method also applies to power-law systems.

4.4. Steady-State Diagnostics

With BST equations formulated and parameter values assigned, the initial model design is complete. However, before the model is utilized for its intended purposes, it should be diagnosed for internal and external consistency and robustness. This diagnostic phase of the model analysis typically begins with the identification of steady states. These states are characterized by the fact that none of the variables changes although material may flow among the variables. For S-systems, this step is surprisingly simple [100], while it is much harder for GMA systems.

4.4.1. Steady-State Computations

Suppose that none of the variables is zero and that every equation in the S-system has both terms. At the steady state, all differentials on the left-hand side of the equations must be zero, because none of the variables changes. Thus, we obtain
$$\alpha_i \prod_{j=1}^{n+m} X_j^{g_{ij}} - \beta_i \prod_{j=1}^{n+m} X_j^{h_{ij}} = 0, \qquad i = 1, \ldots, n.$$
Moving the $\beta$-term to the right-hand side, we can take logarithms of both sides, which yields
$$\ln \alpha_i + \sum_{j=1}^{n+m} g_{ij} \ln X_j = \ln \beta_i + \sum_{j=1}^{n+m} h_{ij} \ln X_j.$$
Recalling basic properties of logarithmic, exponential, and power-law functions, these terms can be simplified. In particular, if we rename the variables $y_j = \ln X_j$ and rearrange the equation, we obtain the result
$$\sum_{j=1}^{n+m} (g_{ij} - h_{ij})\, y_j = \ln \beta_i - \ln \alpha_i.$$
Further renaming $a_{ij} = g_{ij} - h_{ij}$ for all $i$ and $j$ and defining $b_i = \ln(\beta_i / \alpha_i)$ leads to steady-state equations of the S-system model that are linear:
$$\sum_{j=1}^{n+m} a_{ij}\, y_j = b_i, \qquad i = 1, \ldots, n; \tag{28}$$
see [100]. This system has $n$ equations and $n+m$ variables, which include $n$ dependent and $m$ independent variables. In typical modeling situations, the values of the independent variables are assumed to be known, so that the linear system can be solved. For many purposes, it is useful to formulate these equations in matrix format, which reads
$$A_D\, y_D + A_I\, y_I = b \quad \Longrightarrow \quad y_D = A_D^{-1}\, b - A_D^{-1} A_I\, y_I; \tag{29}$$
see [200]. In this formulation, the dependent variables, collected in vector $y_D$, are separated from the independent variables, which are collected in vector $y_I$. The matrices $A_D$ and $A_I$ refer to the system in (28), with the former corresponding exclusively to dependent variables and the latter to the effects of independent on the dependent variables. The matrices $A_D^{-1}$ and $-A_D^{-1} A_I$ contain sensitivity and logarithmic gain values, respectively, which in a different manner describe the effects of parameters and independent variables on the steady state of an S-system.

The matrix formulation in (29) is very interesting, because it says that linear algebra may be used to compute steady states of arbitrarily large S-systems, as long as no variables or rate constants are zero. If indeed one or more of the variables are zero, or if terms in the S-system are zero, one must consider these special cases one by one [1]. For later biological interpretations, one must remember to translate the results back from $y$'s to $X$'s with an exponential transformation. While the formulation in (29) reflects most situations, the steady-state equations have also been used “in reverse” to compute which values of the independent variables would move the system to a desired steady state [379, 380].
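To make this computation concrete, the following minimal sketch (Python) solves (28)-(29) for a hypothetical two-variable S-system with one independent variable; the model and its numbers are invented for illustration and are not taken from a published system:

```python
import numpy as np

# Hypothetical S-system with dependent variables X1, X2 and independent X3:
#   dX1/dt = 2.0 * X3^1.0 - 1.5 * X1^0.5
#   dX2/dt = 1.5 * X1^0.5 - 3.0 * X2^0.8
# Steady state in log coordinates (cf. (28)-(29)): A_D y_D + A_I y_I = b.
A_D = np.array([[-0.5,  0.0],          # entries g_ij - h_ij for dependent variables
                [ 0.5, -0.8]])
A_I = np.array([[1.0],                 # kinetic orders of the independent variable
                [0.0]])
b = np.log([1.5 / 2.0, 3.0 / 1.5])     # b_i = ln(beta_i / alpha_i)

y_I = np.log([1.0])                    # X3 = 1
y_D = np.linalg.solve(A_D, b - A_I @ y_I)
print(np.exp(y_D))                     # steady state: X1 ~ 1.78, X2 ~ 0.60
```

Because the equations are linear in the logarithms, the same few lines of linear algebra apply unchanged to S-systems with hundreds of variables.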

Although S-systems and GMA systems are very similar, the computation of steady states in GMA systems is incomparably more complicated. In fact, there is no algebraic method that generally solves this problem. Complicating matters, and in contrast to regular S-systems, GMA systems can have multiple nontrivial steady states. These are to be determined with search algorithms which, however, are not guaranteed to find all solutions [178, 179]; see also [381–383]. One may also solve the GMA system dynamically, but this strategy only leads to the one stable steady state in whose neighborhood one initiates the solution.
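For GMA systems, the corresponding computation must rely on numerical root finding. A minimal sketch (Python; the system and its parameter values are again invented for illustration) shows the typical workflow with a standard solver; when several steady states exist, the outcome depends on the initial guess, which is precisely the limitation noted above:

```python
import numpy as np
from scipy.optimize import fsolve

# Hypothetical GMA system: X1 is produced by a constant input and consumed
# by two power-law fluxes, one of which feeds X2.
def gma(X):
    X1, X2 = X
    dX1 = 2.0 - 0.8 * X1**0.5 - 1.1 * X1**0.4 * X2**(-0.2)
    dX2 = 1.1 * X1**0.4 * X2**(-0.2) - 0.9 * X2**0.7
    return [dX1, dX2]

steady = fsolve(gma, x0=[1.0, 1.0])    # a different x0 may find a different root
print(steady, gma(steady))             # residuals should be ~0
```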

4.4.2. Local Stability

The next diagnostic step is the characterization of the model at the steady state(s). The first feature to be tested is local stability. The question is: If the model is slightly perturbed away from the steady state, will it return to this steady state of its own accord? If so, the steady state is locally stable; otherwise it is not. Most, but not all, biological systems are locally stable, and one typically observes that they tend to maintain the same steady state, a tendency that is sometimes called homeostasis.

Local stability could be assessed with perturbation studies that move the system slightly away from a steady state and test whether the system returns. A potential issue with this strategy is that one would have to select directions for the perturbations, thereby making the method vulnerable to bias and possible omissions of interesting cases. A more elegant and general method is the computation of eigenvalues. These are complex numbers, each consisting of a real and an imaginary part. A system possesses one eigenvalue for each dependent variable. If and only if all real parts are negative, the system is stable. For S-systems, the determination of eigenvalues is relatively simple. For small systems, the eigenvalues can be determined with a Routh-Hurwitz analysis [1]. For larger systems, it is achieved essentially immediately with software programs like PLAS [89], PLMaddon [94], or any other software that supports eigenvalue analysis. If the imaginary parts are not zero, the system has the potential of exhibiting oscillations [1, 102]. A useful representation of the stability of a small system is the root-locus plot [30]. Lin and colleagues used Lyapunov theory to determine the stability of a cascaded S-system [384].
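In software, the eigenvalue test amounts to a few lines. The sketch below (Python) uses the hypothetical two-variable S-system introduced above, with the independent variable fixed at 1; the Jacobian is approximated by central differences:

```python
import numpy as np

def f(X):
    X1, X2 = X
    return np.array([2.0 - 1.5 * X1**0.5,
                     1.5 * X1**0.5 - 3.0 * X2**0.8])

Xss = np.array([1.778, 0.602])          # steady state computed earlier
eps = 1e-6
J = np.empty((2, 2))
for j in range(2):                       # numerical Jacobian, column by column
    d = np.zeros(2); d[j] = eps
    J[:, j] = (f(Xss + d) - f(Xss - d)) / (2 * eps)

print(np.linalg.eigvals(J))              # all real parts negative -> locally stable
```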

In relatively rare cases, biological systems may also have two (or more) stable steady states, and it is possible that external or internal controls cause these systems to “toggle” between the two states (e.g., see [36, 385, 386]). In such a case, the system may also exhibit hysteresis, which means that its approach toward either one of the states depends on its recent “history.” A BST example of a bistable system with or without hysteresis is a two-component system, with which microorganisms sense and respond to their environment [387–390]; see also [6]. Depending on the regulatory structure of this type of system, one observes hysteresis or not.

4.4.3. Sensitivity and Robustness

A second feature that is easy to compute for S-systems, due to their linear steady-state equations, is the profile of sensitivities. Each sensitivity measures how much a steady-state feature changes if one of the parameters in the system is varied by a very small amount [10, 200, 391]. Specifically, a metabolite sensitivity of +5.4 means that a one percent change in the parameter in question leads approximately to a 5.4 percent increase in the given metabolite at the steady state. A sensitivity of −3.2 corresponds to a 3.2 percent decrease. Sensitivities can also be computed with respect to fluxes. Strictly speaking, sensitivities are infinitesimal quantities, but they are excellent predictors for the effects of small, or even moderate, changes. Changes in steady-state features in response to small variations in independent variables are called logarithmic gains. Similar to sensitivities, these quantities are easily computed for S-systems [163, 200, 391]. These features can also be computed for GMA systems, but the computation is more complicated [3, 392].
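Continuing the hypothetical example from above, the logarithmic gains follow directly from the matrix form in (29): in logarithmic space they are simply the entries of $-A_D^{-1} A_I$. The sketch below computes them:

```python
import numpy as np

A_D = np.array([[-0.5,  0.0],
                [ 0.5, -0.8]])
A_I = np.array([[1.0],
                [0.0]])
L = -np.linalg.solve(A_D, A_I)   # logarithmic gains with respect to X3
print(L)                         # L[0,0] = 2.0: a 1% rise in X3 raises X1 by ~2%
```

In the same spirit, the entries of $A_D^{-1}$ quantify how the steady state responds to small changes in the rate constants.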

Local stability and sensitivity analyses are the main diagnostic tools for assessing the internal consistency of a model. As an extension, Salvador [393–396] explored the effects of simultaneous changes in two parameters on the steady state, using methods of advanced linear algebra. Guebel computed sensitivities for BST-like models that included the second-order term of the Taylor approximation [397]. Additional diagnostic tools include criteria of robustness and noise attenuation [398–405]. Shi and colleagues studied the balance between robustness and fragility in biochemical networks [406]. Several articles expanded the concepts of gains and sensitivities to questions of dynamics [124–127, 407–412]. These analyses assess how changes in initial values affect the transient responses of a system. Horner analyzed sensitivity profiles with Monte-Carlo simulations rather than analytically [413]. Drengstig and coworkers noted that sensitivity coefficients of reaction networks are closely related to transfer functions that are used widely in control engineering [414].

Due to the regular structure of BST models, it is even possible to use computer algebra to characterize steady-state features of S-systems in symbolic, nonparameterized models [415].

4.4.4. Structural Stability

Local stability analysis addresses the response of a system close to a steady-state point. A different type of diagnosis, called structural stability analysis, addresses situations where the behavior of a system changes in a qualitative manner, which means that not only the numerical value of some feature changes, but the feature changes from one class of behaviors to another. The best-studied situation consists of a slight persistent change in a parameter value that causes the system to lose (local) stability. The steady state thus becomes unstable, and the system may exhibit a stable oscillation surrounding the now unstable point. The specific value of the altered parameter where this change happens is called a (Hopf) bifurcation point, and the stable oscillation is called a limit cycle [416]. An example is a small S-system of two variables, which could represent two genes that affect each other’s expression (Figure 9). This system has a unique nontrivial steady state, whose character is easily checked by computing the eigenvalues, for instance in PLAS: they form a complex conjugate pair with negative real parts. In the time domain, both variables therefore oscillate with decreasing amplitude toward the steady state (Figure 10), which they strictly only reach for $t \to \infty$.

Now, we slightly increase the rate constant of the production term in the first equation, whose nominal value is 0.9. For instance, for 0.95, the oscillations have a similar appearance, but the amplitude decreases more slowly (results not shown). The decrease is even slower for 0.99 and 0.999, where the real parts of the eigenvalues are just barely negative. A Hopf bifurcation occurs when the parameter crosses the value of 1, where the real parts of the eigenvalues switch from negative to positive. For a rate parameter value of 1.01, the amplitude initially decreases but then stabilizes (Figure 11). The result is seen more clearly in the phase plane, where $X_2$ is plotted against $X_1$. Here, the expression “limit cycle” becomes evident, because the system is first spiraling but ultimately cycles along the same quasielliptic trajectory or orbit. To generate the blue inward spiral, the system is initiated outside the limit cycle. The same system, initiated inside the limit cycle, generates a green outward spiral, which in the illustration is so tight that it appears as a solid ring. The two spirals both approach the same limit cycle.

The characterization of Hopf bifurcations is usually very laborious [417], but Lewis showed that it becomes an amazingly straightforward task for the special structure of S-systems, especially if the system consists of only two variables [418]. In fact, Lewis derived a criterion for bifurcations in S-systems that consists of the evaluation of a simple algebraic formula. This criterion was subsequently “inverted” and thus made into a tool for constructing limit cycles of different shapes from scratch, which otherwise is a rather cumbersome challenge [419]. A famous example of a limit cycle system is the Brusselator, which is actually a GMA system [420, 421].
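The Brusselator also lends itself to a quick numerical illustration of a limit cycle. In the sketch below (Python), the standard textbook parameterization a = 1, b = 3 places the system beyond its Hopf bifurcation at b = 1 + a², and trajectories settle onto the closed orbit:

```python
import numpy as np
from scipy.integrate import solve_ivp

# The Brusselator in GMA (power-law) form: every term is a product of
# power functions. For b > 1 + a^2 the steady state (a, b/a) is unstable
# and a stable limit cycle surrounds it.
a, b = 1.0, 3.0
def brusselator(t, z):
    x, y = z
    return [a - (b + 1.0) * x + x**2 * y,
            b * x - x**2 * y]

sol = solve_ivp(brusselator, (0.0, 50.0), [0.5, 0.5],
                dense_output=True, rtol=1e-8, atol=1e-10)
t = np.linspace(30.0, 50.0, 2000)       # discard the initial transient
x, y = sol.sol(t)
print(x.min(), x.max())                 # the orbit's extent along x
```

Plotting y against x over the retained time window traces out the closed orbit in the phase plane, just as described for the S-system example above.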

Nikolov and colleagues analyzed a signaling system with a generic strategy that sequentially integrated sensitivity analysis, bifurcation analysis and predictive simulations. The strategy demonstrated how information from one step can feed into other analyses and thereby help refine the structure of a model [422].

4.5. Diagnostics of Dynamic Features

In addition to these types of algebraic analyses, which for reasons of practicality are typically executed with computational software such as PLAS [89], PLMaddon [94], BestKit [93], or in Matlab, system models are typically tested and diagnosed with simulations that help explore ranges of possible system behaviors. These simulations require the integration of the differential equations, which almost always has to be numerical. It is only possible to a very limited degree to solve small S-system differential equations analytically [423].

System simulations fall into different categories. The first may be called “what-if” simulations. Here, initial values of the system variables are changed, and the resulting time trajectories are compared with data or the investigator’s expectations. Similarly, parameters or independent variables may be altered, to model mutations or altered environmental conditions, and a numerical solution of the system reveals whether the model might be appropriate or is evidently wrong in some aspects. Essentially all of the applied BST models described in a later section have used this strategy of exploration extensively. Simple, didactic examples are given in [3].

A different type of simulation, which lately has been garnering increased interest, is Monte Carlo (MC) simulation. Unlike the previously described simulations, this type is not targeted toward a specific goal but is much more exploratory. Specifically, one accounts for the possibility that not every parameter value is known with certainty but in reality comes from some range. It could be that all values within this range are equally likely or that they follow certain probability distributions. For instance, the probability of a particular value could be highest in the center of the range and decrease to zero toward the upper and lower limits. For each run of an MC simulation, one randomly selects a value for each parameter from its range, according to the corresponding probability profile, and solves the system. Repeating this step thousands of times results in a distribution of outputs (e.g., see [4]).

Let us illustrate the results with the same S-system example of a branched pathway that we used for a comparison with the corresponding GMA systems ((18) and (19); Figure 2). For ease of demonstration we assume that all parameter values are fixed, except for the value of the input, which, instead of being fixed at 4, could be any number between 3 and 5, and the inhibition parameter, which we allow to have any value between 0 and −1, instead of the fixed −0.75. Figure 12 shows the system responses from two runs of randomly drawing 20 combinations of these two quantities. Several observations can be made. First, every Monte-Carlo simulation is different, which is easy to explain, because random numbers are sampled. Second, the iterations typically fall on both sides of the original, deterministic case (dotted lines), because the fixed parameter values lie within the sampling ranges. Third, for small sets of MC simulations, the extreme results and their variation can differ. For instance, in one panel of Figure 12 the highest value is higher than in the other panel, but the spread of steady-state solutions is smaller. Running thousands of MC simulations, one obtains a more complete picture of all possibilities and their likelihoods.
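A generic MC loop of this kind is easy to set up. The sketch below (Python) uses a hypothetical two-variable S-system rather than the exact system of Figure 2, but samples the same kinds of quantities, namely an input strength between 3 and 5 and an inhibitory kinetic order between −1 and 0:

```python
import numpy as np
from scipy.integrate import solve_ivp

rng = np.random.default_rng(42)

def make_rhs(inp, inhib):
    def rhs(t, X):
        X1, X2 = X
        return [inp * X2**inhib - 2.0 * X1**0.5,    # inhibited input flux
                2.0 * X1**0.5 - 1.5 * X2**0.7]
    return rhs

runs = []
for _ in range(20):
    inp = rng.uniform(3.0, 5.0)          # sampled input strength
    inhib = rng.uniform(-1.0, 0.0)       # sampled inhibitory kinetic order
    sol = solve_ivp(make_rhs(inp, inhib), (0.0, 10.0), [1.0, 1.0],
                    t_eval=np.linspace(0.0, 10.0, 200))
    runs.append(sol.y[0])                # keep the trajectory of X1

runs = np.array(runs)
print(runs[:, -1].min(), runs[:, -1].max())   # spread of the end points
```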

Applying an MC method, Horner analyzed how sensitive the uric acid concentration in blood serum is to initial conditions in a model of impaired purine synthesis [413]. Expanding on the traditional MC concept, Balthis developed a hierarchical MC method to analyze mercury exposure in a human population that was stratified by features such as age and fish consumption [424].

MC simulations also render it possible to distinguish ensembles of feasible models from infeasible ones. The goal of this approach is not to identify the one best model parameterization, but an entire cluster of models that all satisfy certain criteria of quality. Logistically, one launches a large-scale MC simulation in which every uncertain parameter is given a generous range of possible values. After every run, one checks whether the solution satisfies the set output criteria. If so, the solution is retained; otherwise it is discarded. As an example, Lee et al. [323] used MC sampling with filtering to obtain feasible parameterizations of a metabolic model in plants, and Yin and Voit [322] applied the method to explaining the functionality of the enzyme NADPH oxidase, which is involved in the control of reactive oxygen species.

Ni and Savageau used a somewhat similar “sample-and-retain-or-discard” strategy for screening many alternative instantiations of human red-blood-cell models for putative errors, such as instability or unreasonably high sensitivities, and for proposing candidate regulatory mechanisms that were able to alleviate these problems [370, 371]. Alves et al. applied a large-scale model screening strategy to characterizing details of the biogenesis of iron-sulfur clusters in yeast [373, 374].

4.6. Model Validation

Once the repertoire of possible behaviors has been sufficiently explored, the model should be validated against data that had not been used for model construction. These data may be quantitative or qualitative, and model evaluations may be in the form of comprehensive sensitivity and robustness analyses, simulations, or mixtures of diagnostic assessments [425]. Essentially all published models have undergone some degree of validation. Relatively comprehensive examples for BST models are [188, 426].

An entirely different type of validation uses methods of automated, algebraic model checking [427–436]. A good example in the context of BST is the Simpatica software system, which explores the time trajectories of models with an automaton-based semantic language. This language permits the asking and answering of questions about the logical properties of the temporal evolution of a system, such as “is the system able to reach a steady state?” or “what are the possible bounds for the trajectories of a particular dependent variable in the system?” The model checking language also contains qualifiers such as “eventually” and “always,” as well as their negations, which lead to the qualifiers “never” and “sometimes.” As a specific example, the software may check the truth of the expression “Eventually(Always(zero-derivatives)),” which corresponds to the situation that the system will certainly approach a steady state for $t$ going toward infinity. In this manner, the software can qualitatively reason about features of the system by using propositional temporal logic that succinctly and unambiguously addresses ordered sequences of events. The authors of this software illustrated the power of this type of automatic checking with the model of a limit cycle [437] and a quite complex S-system of purine metabolism [186, 438, 439]. Somewhat similar in concept, Gentili proposed a combination of S-systems with the stochastic π-calculus, which is essentially a programming language, to analyze gene regulation networks in an automated fashion [440]. Campagna and Piazza studied the topic of reachability in dynamic systems, including S-systems, with semialgebraic hybrid automata [441].
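Simpatica performs such checks symbolically. Purely to illustrate what a formula like “Eventually(Always(zero-derivatives))” asserts, the following crude numerical analogue (Python, applied to the hypothetical S-system used earlier and restricted to a finite time horizon) tests whether there is a time after which all derivatives stay below a tolerance:

```python
import numpy as np
from scipy.integrate import solve_ivp

def f(t, X):
    X1, X2 = X
    return [2.0 - 1.5 * X1**0.5,
            1.5 * X1**0.5 - 3.0 * X2**0.8]

t = np.linspace(0.0, 40.0, 2000)
sol = solve_ivp(f, (t[0], t[-1]), [0.2, 0.2], t_eval=t)
derivs = np.abs(np.array(f(t, sol.y)))       # |dX/dt| along the trajectory
settled = np.all(derivs < 1e-3, axis=0)      # are all derivatives small?
eventually_always = settled[::-1].cumprod()[::-1].any()
print(eventually_always)                     # True: a steady state is approached
```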

5. Application of BST Models to Specific Biological Systems

BST has been used for rather different purposes throughout biology and beyond. Within the realm of biology, and using very broad categories, one may distinguish model analyses whose purpose was to gain novel insights into a specific biological system from analyses whose prime target was the development of methods, along with a better understanding of an entire class of biological systems, such as “all linear pathways with feedback.” We begin this section with the latter and discuss specific applications subsequently.

5.1. Design and Operation and the Method of Controlled Mathematical Comparisons

From the very beginning, many BST studies addressed entire classes of systems of high biological relevance, without focusing exclusively on specific applications. A long line of such investigations targeted the different observed modes of gene regulation, especially in microbes. Spearheading this effort, Savageau early on presented detailed analyses of classical and autogenous control of inducible operons and the interplay between gene regulation and the demands imposed on organisms by their environments [442–449]. These analyses revealed that different regulatory schemes are indeed optimal, depending on environmental circumstances. Many later articles similarly addressed the design features of gene regulatory networks [36, 114, 115, 161, 385, 450–461]. The overarching goal of these studies was the discovery of general rules and mechanisms that were able to explain in an unbiased manner why different genes and operons are regulated in distinct ways.

The envisioned generality of these studies made it apparent that new methods of objective, comparative analysis were needed and led to the development of an unbiased approach for comparing alternative modes of operation or, more generally, alternative systems. The ultimate result was the method of controlled mathematical comparisons (MCMC), which aims to explain why certain system features are observed in a certain situation, rather than others [11, 73, 462–464]. A typical question within MCMC is: What is the advantage of a feedback inhibition signal in some specific, observed, or hypothesized system, and what would happen without this signal? The basic concept of MCMC consists of comparing two (or more) systems that differ in just one feature, such as the inhibition signal in the previous example. In addition to setting up these models in parallel, one defines objective biological criteria that render one structure advantageous over the other. Typical examples of such criteria of functional effectiveness are the minimal accumulation of unneeded intermediate metabolites or the amount of time the systems require to respond to external stimuli. The alternative systems are assessed against a list of such criteria, and the system scoring highest against these criteria is declared best. Moreover, the advantages of the winning system can be attributed directly to the one feature that distinguishes this system from the other(s).
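The flavor of such a controlled comparison is easy to convey numerically. In the sketch below (Python), two versions of a two-step pathway share the same steady state and differ only in the presence of end-product feedback; the criterion is the accumulation of the intermediate after the input doubles. The pathway, its numbers, and the criterion are invented for illustration and are not taken from the cited studies:

```python
import numpy as np
from scipy.integrate import solve_ivp

def pathway(g, input_rate):
    def rhs(t, X):
        X1, X2 = X
        v1 = input_rate * X2**g          # g = -0.5: feedback; g = 0: none
        v2 = 2.0 * X1**0.5
        v3 = 2.0 * X2**0.5
        return [v1 - v2, v2 - v3]
    return rhs

# Both designs rest at X1 = X2 = 1 for input_rate = 2; now the input doubles.
for g in (-0.5, 0.0):
    sol = solve_ivp(pathway(g, 4.0), (0.0, 20.0), [1.0, 1.0],
                    t_eval=np.linspace(0.0, 20.0, 400))
    print(g, sol.y[0].max())   # a smaller peak of X1 favors the feedback design
```

In an actual MCMC study, such criteria are evaluated against a whole list of objectives, analytically wherever possible, with all other system properties held equivalent.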

MCMC has been used to analyze the design principles governing various gene regulatory systems and metabolic pathway systems. With respect to gene regulation, MCMC ultimately led to the Demand Theory of Gene Regulation [36, 114, 115, 385, 453, 465]. This theory connects environmental demands, as well as the cycle time, which is defined as the average time for a gene to complete an on-off cycle, to the mode of gene regulation and the coupling of gene expression in the form of elementary circuits. Based on these concepts, Atkinson et al. demonstrated that a deep understanding of the design of such gene circuits and their regulation is a prerequisite for creating new systems, as it is the goal in synthetic biology [466]. Specifically, these authors predicted, and subsequently validated experimentally, the function of a sophisticated genetic circuitry exhibiting a toggle switch, as well as oscillations in gene expression that were sustained in a bacterial population for several generations.

MCMC was also applied to different classes of metabolic pathways [467–476], the evolutionary development of different designs of enzyme systems and moiety-transfer cycles [477–484], bacterial growth phenomena [485], two-component sensing systems with and without bistable switches and hysteresis [387–390, 475], a stress response system in yeast [486], higher plant systems [487], and even control systems for lymphocytes and their responses to antigens [462, 463]. Puigjaner and colleagues compared MCMC for BST and metabolic control analysis (MCA; [488]). Zhang and collaborators emphasized the value of understanding design principles in the context of toxicity testing [489].

A representative example for the discovery of design principles is the comparison of all possible patterns of feedback inhibition in a linear, unbranched pathway (some possibilities are shown in Figure 13), which reveals that the very common pattern of the final product inhibiting the first step of the pathway has distinct advantages over all other imaginable designs [1, 10, 467, 468, 471, 472].

While the original MCMC mostly relied on algebraic analyses of S-systems, it was later augmented with simulations and graphical and statistical assessments [464, 471–474, 490].

Recent advances have expanded the concept of design principles to design spaces [480, 481, 491–494]. A design space is a dimensional compression of the parameter space of a system that reveals and characterizes generic relationships between system parameters, environmental variables, and qualitatively different system behaviors. Specifically, the entire space of possible responses of a system is partitioned into regions of the parameter space that correspond to distinct phenotypes of qualitative system behavior [492]. Using local S-system approximations, these regions can be investigated in terms of their fitness and tolerance to external perturbations, and it is possible to compute the boundaries between these phenotypic regions. An application of this concept was an analysis of the role and regulation of the O2 sensor FNR in E. coli [495, 496]. Software has been proposed to automate the key steps of such an analysis [99].

A different extension of methods for the discovery of design principles was the corresponding analysis of operating principles, where the focus is not so much on the structural design of systems, but on their operation in response to stimuli [76, 77, 380, 484]. A recent review of design and operating principles can be found in [78].

Chen and colleagues developed adaptive design rules for biochemical systems, based on the criterion of sufficient robustness toward mutations and environmental changes during evolution [402, 497]. They found that systems satisfying these criteria exhibit increased diversity in phenotypes.

Wu et al. formulated complex genetic trait formation as a dynamic system [498]. They used a simple S-system to quantify how alterations in different system constituents can lead to global changes in traits and provide a quantitative tool for testing the interplay between genes and development. The authors found that the genetic mapping of complex traits may be improved through an increased understanding of biological design principles.

5.2. Case Studies

In contrast to the general design studies described in an earlier section, many BST analyses have focused on specific applications and often used actual experimental data in the form of kinetic properties or measurements of metabolite concentrations, fluxes, or other quantitative features. In line with the original purpose of BST, many of the earlier analyses addressed biochemical pathway systems, but over time the scope widened within and even beyond biology. Representative studies are briefly summarized in the following vignettes. They start with microorganisms and progress toward higher organisms, including plants and humans. In many cases, this sequence roughly corresponds to the time of publication. Additional studies will be discussed in the context of system optimization.

5.2.1. Microbial Studies

Early on, Savageau and colleagues analyzed intriguing mechanisms that ensure the correct translation of RNA into proteins [499–507]. Based on experimental in vitro and in vivo data from E. coli, these authors used power-law models within BST to investigate in detail the kinetic proofreading mechanisms for the aminoacylation of tRNAs, which typically occurs with a high degree of accuracy.

Arguably the first numerical data analysis with BST was yeast fermentation of glucose into ethanol [201]. Based on experimental data, alternative models were constructed and parameterized. One conclusion from comparing the models was that high ethanol concentrations in these cultures led to cell death by lysis. A much more detailed model of yeast fermentation was later introduced by Curto and coworkers [185, 508, 509]. This series of papers was interesting for a number of reasons. In particular, it showed how traditional enzyme kinetic information can be validly converted into power-law models that offer additional modes of analysis and novel insights. This more complex model was later used as a test bed for various optimization and estimation methods [3, 4, 255, 319, 409, 510–520].

Conceptually similar modeling studies addressed the production of amino acids, enzymes, proteins, and other valuable organics in E. coli [58, 521–523]. Lall and Mitchell used an S-system model to characterize the reduction of metal in the bacterium Shewanella oneidensis [524]. Sheridan and colleagues successfully engineered and validated the meta-cleavage pathway of Pseudomonas putida, based on BST simulations and logarithmic gain analysis [525]. Sun and collaborators modeled glycerol fermentation by Klebsiella pneumoniae [526].

Lactococcus lactis is a bacterium of great relevance in the dairy industry, where it is used in the production of yoghurts, cheeses, and other food products. Its appeal for modeling derives from the fact that the organism has been used for in vivo NMR studies that characterized the temporal dynamics of glycolytic metabolites in response to a glucose bolus after a period of starvation. These time series data opened new avenues of parameter estimation and model construction in general [342, 357, 358, 362, 363, 527].

Different groups analyzed cultures of the alga Chlamydomonas reinhardtii for the purpose of producing hydrogen. Horner and Wolinsky used an S-system model to investigate the sensitivity of H2 production to photosynthetic activity and the production of protons by photolysis [528]. Jorquera and colleagues [529] studied sulfur deprivation, while Zhang suggested S-system equations for the reactions in a space-dependent advective-diffusive representation of the system [530, 531]; see also [532].

Torres developed the first BST models that comprehensively captured the dynamics of citric acid generation in the fungus Aspergillus niger [183, 184]. This mold is industrially very important, as it produces the lion’s share of a worldwide annual citric acid production of almost 2 million tons [533, 534]. The initial model, as well as several more sophisticated successors, was later used for testing various optimization methods [66, 535–539]. De Jongh explained formerly puzzling experimental findings based on the models [540].

Alves developed a model for iron metabolism in yeast [373, 374, 541]. Particularly interesting in this study is the combination of dynamic modeling with the use of information regarding the molecular structure of key enzymes. Huang and Wang studied the growth dynamics of Saccharomyces diastaticus in mixed sugar culture designed to produce ethanol and glycerol [165].

Alvarez-Vasquez spearheaded the development of a series of models describing sphingolipid biosynthesis in yeast [187, 426, 542]. Sphingolipids are specialized lipids that not only contribute to rafts in lipid membranes, but also exert distinct signaling functions. Interestingly, the general structure of their biosynthesis has been conserved from yeast to humans. In a recent expansion, the pathway was merged with a second pathway, describing the de novo synthesis of ergosterol [188].

Stress responses in microbes have been studied for a long time, because they offer insights into the cellular machinery that mounts appropriate responses to adverse environmental conditions. Several BST models have addressed heat stress responses in yeast. Voit and Radivoyevitch demonstrated that the glycolytic gene expression profile following heat stress is anything but intuitive, yet appropriate for the required metabolic response [543]. Sorribas’ group analyzed the same phenomenon and showed that the cellular response is essentially optimal, given the constraints within which it has to be mounted [484, 544–546].

One hallmark of the heat stress response in yeast is a dramatic increase in the intracellular concentration of trehalose. This disaccharide is a stress response metabolite that is found in a variety of microorganisms, plants, and invertebrates. Its synthesis normally occurs at a low rate but increases enormously within minutes of severe stress. Due to the strong metabolic and genomic responses, various aspects of trehalose dynamics and other heat stress responses have been modeled in recent years. They included an analysis of the design and operating principles of the trehalose pathway itself [486], as well as detailed assessments of the multiscale processes that lead to the appropriate stress response [6, 547]. In addition to trehalose, yeast sphingolipids respond to heat stress within minutes in a coordinated fashion [548, 549]. The interactions between the trehalose and sphingolipid responses at different organizational levels constitute a very intriguing regulatory system [550].

A different stress response in yeast is starvation, to which the organism reacts by reversing the direction of glycolysis, a phenomenon called the diauxic shift. Alvarez and collaborators showed that this response is coordinated with a global small-magnitude up- or downregulation of very many metabolic steps, rather than more pronounced changes in a few steps [542] (see also [380]).

Shiraishi and Savageau [37, 551–555] presented what one might call the first complete standard BST analysis of a complex pathway system. Addressing the TCA cycle in the slime mold Dictyostelium discoideum, they discussed the advantages, disadvantages, and relationships of different model structures; set up equations based on the pathway diagram; executed steady-state and dynamic analyses; and discovered and remedied problems in earlier non-BST models.

All organisms rely on robust signal transduction systems, which possess some aspects of metabolic and proteomic systems, but also their own genuine features. BST has been employed for a rich variety of analyses of signaling systems. In particular, the group of Vera and Wolkenhauer has contributed much to our understanding of such systems. Working at the interface between experimentation and theoretical biology, they demonstrated how modeling can complement data collection and interpretation in the context of the JAK2/STAT5 pathway [556–558]. In [41], the group compared detailed and simplified power-law models [559]. Not surprisingly, the preference for one or the other model choice depends critically on the amount and quality of the data used and the specific questions asked. For example, the article [560] successfully used simple power-law models to analyze the dynamics of feedback loops in the ERK signal transduction system. In a different, yet related line of work, Nikolov and colleagues developed a multiscale model that accounted for the effect of erythropoietin-mediated JAK2-STAT5 signaling in erythropoiesis [422, 561]. The group also assessed oscillations in the NFκB pathway with a model that had BST components, as well as Michaelis-Menten-type functions [562].

Lin and colleagues studied options for controlling cascaded systems, including signal transduction systems, which were formulated as S-systems [384, 563]. Boykin and Ogle investigated the sonic hedgehog-signaling pathway [564]. Ray and Kirschner analyzed signaling processes associated with the primary role of macrophages to use chemicals like nitric oxide and the iron regulatory apparatus to kill pathogens [565].

Schwacke and Voit analyzed the generic features of MAP kinase cascades and rationalized with computational means why the parameter values in different layers of the cascade must fall into distinct domains of the parameter space [566]. They also demonstrated that cross-talk between two cascades is sufficient to create all genuine information responses, including negation and the exclusive-or function [567].

While higher organisms tend to use multilayer protein cascades for signaling purposes, bacteria often use two-component systems for sensing their environment and responding adequately. From an analytical point of view, these systems are very intriguing in that they are bistable and can exhibit hysteresis [387–390, 475]. Other articles on signal transduction included [568, 569].

5.2.2. Studies on Higher Plant and Animal Systems

Shortly after the introduction of BST, Smith used power-law models to study the functional interactions between the components of an ecosystem [570–572]. Specifically, he studied pathways of nutrient flux with tracer experiments. The rationale for using BST was that earlier studies had proposed ratios of polynomials for the description of ecological processes, and that these were adequately approximated with power-law models, as Savageau had demonstrated [1, 101]. Thus, Smith converted an earlier model into an S-system and used this model to characterize steady states, stability, and species extinction. In a similar vein, Torres developed a model of magnesium flow in a tropical forest, where the variable pools represented plants, animals, litter, and soil [573].

Voit and Sands developed a model describing nutrient allocation in trees within planted forests and studied the effects of different fertilization regimens [119, 574–576]; see also [577]. Interestingly, the allocation patterns changed with the age of the trees and their fertilization regimens. Martin used BST to condense a complex forest simulation model into an S-system model that was much easier to analyze [578]. Renton and colleagues combined power-law modeling with the structural modeling framework of L-systems to generate functional-structural plant models that permitted a spectrum of granularity in detail, ranging from accurate and descriptive to mechanistic and explanatory models [579–581].

In a more theoretical study, Voit showed that a widely observed rule of forest growth, the so-called 3/2 rule, is a direct consequence of an appropriate S-system model formulation [582, 583]. The rule posits that growing forests approach a power-law relationship between tree density and average tree size that has a slope of about 1.5. He also proposed a statistical method, using an S-distribution (see later), to characterize the changes in tree size distributions during the aging of a forest [575, 584]. Torsella showed how simplified S-systems with only one nonzero term per equation permitted linear regression methods for the estimation of tree growth in planted stands [181]. Johnson modeled the growth of pine trees [293, 585] and studied economic aspects of dynamic agricultural systems [586].

Lee was the first to develop comprehensive models of lignin synthesis in poplar trees and alfalfa [323, 487, 587]. Lignin is a biopolymer in secondary plant cell walls that hardens them and also makes them resistant to foraging by animals, as well as to attempts to extract ethanol for biofuels from inedible plant parts.

Chaudhuri and colleagues developed an S-system model to analyze bioeconomic features of fisheries, using a realistic catch-rate function, as well as costs, taxes, and regulatory controls [588, 589]. Methods of variational calculus and control theory led to an optimal harvesting policy. Brown used an S-system model to optimize the feed and slaughter age of turkeys [590, 591]. Torres’ group used BST methods to optimize a feeding regimen for economically valuable octopus populations kept in captivity [592].

Torres presented a BST model of the glycolytic and glycogenolytic pathways in rat liver, whose predicted responses correlated well with experimentally determined results [593]. This model was later made into a detailed case study for analyzing BST models [3] and used as a benchmark for the application of control theoretical methods to steering systems toward a desired goal [406, 594–597].

Although not addressing a living organism, one might add that Streichert and colleagues used an S-system representation in a reaction-diffusion approach to investigate the endogenous evolution of a head-tail pattern in an artificial embryo without the prior definition of spatial gradients [532].

5.2.3. Human Physiology and Disease

Human metabolism and its failure have received increasing attention with the availability of more comprehensive data and the corresponding development of analytical techniques. As an early example of a BST analysis, Ni and Savageau developed models of the metabolic dynamics of red blood cells [370, 371]. Reilly and colleagues studied renal hemodynamics [598], and Vera’s group studied with a multiscale model the signaling role of erythropoietin on erythropoiesis [422, 561].

Salvador et al. modeled the NADPH redox cycle in human erythrocytes and used these models to connect the molecular properties of glucose 6-phosphate dehydrogenase mutants to clinical phenotypes [478–481]. Interestingly, the design space of this system partitioned cleanly into three regions of qualitatively distinct performance. Within this design space, the distances between mutants and normalcy correlated negatively with the severity of the phenotypes.

Also related to the blood system in humans, Yin and collaborators showed with a BST model how different mechanical shear stresses affect the metabolic and signaling responses of endothelial cells in vasculature [599]. Also in vasculature, they analyzed the role of the enzyme NADPH oxidase, which controls the dynamics of reactive oxygen species, and whose malfunction can contribute to diseases like atherosclerosis [322].

Sorribas and González developed a BST model for the hypothalamus-anterior pituitary-thyroid network. They translated this complex physiological system into a relatively simple S-system model, which qualitatively captured the responses of healthy subjects to an injection of thyrotropin-releasing hormone. The authors also studied the dynamics of the system’s regulatory signals under physiological and pathological conditions [600].

Curto and colleagues developed a series of models of purine metabolism in humans and demonstrated that an S-system model was able to serve as a novel classification tool for purine-related mental diseases [3, 186, 438, 439]. Thomas adopted Curto’s model with some adjustments to clarify confusing effects of the drug mizoribine on guanine and adenosine nucleotide synthesis [601]. Horner used a variant of the model to study the effects of parameter variations in the case of one of the most devastating purine-related diseases, namely, hypoxanthine-guanine phosphoribosyltransferase (HGPRT) deficiency, which in severe cases is known as Lesch-Nyhan syndrome [413].

Qi and collaborators developed several models describing the dynamics of dopamine metabolism. Dopamine is a crucial neurotransmitter in the reward system of mammals, including humans, and changes in dopamine concentrations can lead to Parkinson’s disease, schizophrenia, sleep disorders, and attention deficit/hyperactivity disorder [44, 174, 602–609]; see also [175, 610–612]. The work of Qi’s group contains models for the presynapse, which is a nerve ending in the midbrain where dopamine is produced, as well as the postsynapse in the forebrain, where dopamine signals are interpreted. Sass and colleagues used a complementary BST model to investigate the dynamics of the protein α-synuclein, which is the second hallmark of Parkinson’s disease, besides changes in dopamine levels [613].

Liu and colleagues proposed an S-system model to characterize the dynamic regulatory characteristics of the p53 signaling pathway, which is crucial in tumor suppression and apoptosis [614]. Simulations with the model assisted the identification of key molecules in this signaling pathway. Vera and colleagues [615] showed with a model containing power-law and ratio terms that the dynamics of p53 expression and its role as a transcription factor are affected by epigenetic silencing of 14-3-3 proteins, which can bind to various signaling proteins.

Broome and Coleman developed a model of cell death in multiple sclerosis, which they used to identify possible triggers of the disease as well as potential drug therapies, particularly with respect to reactive oxygen and nitrogen species [616].

Ferreira and colleagues modeled a biochemical system that has a typical time scale of seconds, but can cause disease through the accumulation of metabolites over a human’s lifetime [617]. Namely, they studied the glyoxalase system and its role in the very slow formation of advanced glycation end products, due to the Maillard reaction, which in the long run can lead to Alzheimer’s disease and to alterations in the proteins of the lens of the eye.

Several groups used BST models to analyze infectious diseases. Garcia and her collaborators developed a detailed model of the parasite Cryptococcus neoformans, which is a common cause of fungal meningitis [618, 619]. This organism can grow in the respiratory tract or within the phagolysosome of phagocytic cells and therefore must adapt readily to strikingly different acidity milieus within the human body. Berg and collaborators demonstrated how S-system models can be used to study the antimicrobial efficacy of drugs in humans [620]. Vera and colleagues analyzed model-based strategies for identifying potential drug targets in cases of known enzymatic dysfunction [621]. Torres’ group proposed a model for the dynamics of Leishmaniasis and explored possible applications of the model to the design of effective therapies [622]. Magombedze and Mulder developed an S-system model of tuberculosis [623].

Rather than focusing on specific diseases, several authors developed and analyzed generic disease models, using BST. Some of these focused on risk groups, risk factors, and epidemics [25, 28, 424, 598, 624–627], while others studied the trajectories from health to disease in general and in a personalized setting [379, 628]. They also described generic modeling approaches for understanding inflammation [629], which were specifically applied to a simple cystic fibrosis model [182]. Faratian and colleagues suggested S-system modeling as a very promising approach for integrating data in cancer research [630, 631].

A few investigations pointed out the use of BST models for drug development [43, 620]. Torres’ group proposed a generic strategy of data integration through modeling for disease caused by malfunctioning enzymes [621]. Some additional studies are discussed as applications of optimization methods.

Boege and colleagues studied the intracellular dynamics and localization of the enzyme topoisomerase IIβ [632].

6. Optimization

The specific application models discussed before had various purposes. Some were developed to test whether the scientific community understood a pathway correctly, some were constructed to explore what-if scenarios, and others were the starting point for various kinds of manipulations, such as the remediation of a disease. A prominent role for some models was the optimization of some beneficial model feature. In particular, several microbial models were designed with the goal of engineering and optimizing microbes to make, in a cost effective manner, desirable high-volume products, such as ethanol or citric acid, or low-volume, valuable organic compounds, such as feed additives and pharmaceutical precursor compounds. Hand in hand with these goals, general and BST specific methods were developed to make such optimization tasks feasible and effective. Thus, over the years, metabolic engineering became a prime area of optimization within the realm of BST. The generic task of this endeavor is to alter the metabolic flux distribution within a microorganism such that this organism produces and excretes a desired end product in much larger quantities than the wild type.

As in other areas of mathematics, there is a huge difference between linear and nonlinear optimization tasks. While it is fairly easy to solve linear optimization problems with hundreds of variables, nonlinear problems become computationally expensive even if only a few dozen variables are involved. As a consequence, it is desirable to explore to what degree application problems can be addressed with linear optimization methods.

6.1. Stoichiometric and Flux Balance Analysis

Although metabolic systems are intrinsically nonlinear, they do have fundamental linear features, and these have been exploited in metabolic engineering (e.g., [130, 633]). The first is the distribution of fluxes at the steady state of a system. Consider as an example the diagram in Figure 14. Here, every flux is typically nonlinear, but the overall structure of the flux network is linear.

Suppose that the system receives input through a single flux, which subsequently distributes throughout the system. The change in any of the nodes (i.e., metabolite pools) can be expressed as a function of all contributing fluxes. For instance, the dynamics of a node $X_i$ that gains material from two fluxes $v_a$ and $v_b$ and loses material through a third flux $v_c$ is governed by the equation
$$\dot{X}_i = v_a + v_b - v_c,$$
which directly expresses that $v_a$ and $v_b$ contribute to the growth in $X_i$, while $v_c$ moves material away from $X_i$. The fluxes $v_a$, $v_b$, and $v_c$ are typically nonlinear functions of metabolites, enzymes, and modulators. They could be Michaelis-Menten or power-law processes or take any structural format from among an unlimited repertoire. As in many other cases, the steady state is of particular interest. In order for all nodes to remain constant, the fluxes at each node must balance. For the node $X_i$, this balance requires
$$v_a + v_b = v_c.$$
In general, the flux balance of a metabolic pathway system at a steady state is characterized by a system of linear algebraic equations. These derive directly from the generic system representation in (2), expressed for no change in the variables, and can be written in matrix format as
$$\dot{X} = N\, v = 0.$$
Here $X$ is the vector of metabolite concentrations, $N$ is the stoichiometric matrix, and $v$ is the vector of all reaction steps in the system.

As an example, let us revisit the GMA model of the branched pathway system in Figure 2 and (18). The stoichiometric matrix has three rows, one for each dependent variable, and five columns, one for each of the five reaction steps in the system; each entry indicates whether, and with what stoichiometry, a reaction produces or consumes the corresponding metabolite. Typical optimization tasks in these systems consist of optimizing a flux that represents the production of a desired metabolite. In many cases, the optimized system should operate at the steady state, and the stoichiometric equation serves as a set of linear algebraic constraints. Thus, the solution maximizes the desired flux, while ensuring that the system operates under steady-state conditions. In most practical cases, the number of fluxes is quite a bit larger than the number of metabolites, with the consequence that (infinitely) many solutions are possible. Since the equations refer to a biological system, it is reasonable to constrain metabolites and fluxes in magnitude so that they cannot be smaller or larger than some limits. These constraints can ensure that the system does not deviate too much from its normal operation in most of its components, that thermodynamic constraints are satisfied, and that the system is not metabolically overburdened with the need to produce and maintain too many enzymes at significantly elevated levels. The subspecialty of metabolic engineering pursuing optimization with these types of concepts and strategies is stoichiometric and flux balance analysis (FBA) (e.g., [130, 633]).
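The mechanics of such a linear program are easy to demonstrate. The sketch below (Python) assumes a simple branched topology for illustration, input → X1, with X1 branching to X2 and X3 and both branches draining out of the system; the topology and all bounds are assumptions and are not meant to reproduce Figure 2 exactly:

```python
import numpy as np
from scipy.optimize import linprog

# FBA sketch: maximize the production flux v5 subject to the steady-state
# constraint N v = 0 and bounds on all fluxes.
N = np.array([[ 1, -1, -1,  0,  0],    # X1: fed by v1, drained by v2 and v3
              [ 0,  1,  0, -1,  0],    # X2: fed by v2, drained by v4
              [ 0,  0,  1,  0, -1]])   # X3: fed by v3, drained by v5

c = np.zeros(5); c[4] = -1.0           # linprog minimizes, so use -v5
bounds = [(0.0, 10.0)] * 5             # generous capacity limits
bounds[0] = (0.0, 4.0)                 # the input flux is capped
bounds[3] = (1.0, 10.0)                # some flux must pass through the X2 branch

res = linprog(c, A_eq=N, b_eq=np.zeros(3), bounds=bounds)
print(res.x)                           # optimal flux distribution: [4, 1, 3, 1, 3]
```

With the input capped at 4 and one unit of flux forced through the X2 branch, the optimizer routes the remaining three units through the branch that produces the desired output.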

FBA is mathematically elegant and has been very successful in practical applications. Its enormous advantages are the linearity of the optimization task and the associated fact that very large and even genome-wide systems can be optimized with reasonable effort. However, FBA also has intrinsic disadvantages. First, by its nature, it predicts new steady-state levels of fluxes but does not allow inferences on metabolite levels. Second, it largely ignores regulatory features at the metabolic level, which in reality tend to keep systems from deviating too far from their wild type steady state. Thus, FBA runs the risk of predicting a steady-state flux distribution that the actual cell would avoid, due to regulatory controls. Attempts have been made to account for regulatory processes, but these have been rather coarse. Finally, FBA requires the specification of an objective function, which typically reduces the infinite number of solutions to a single optimal solution. In microbial systems, this objective function usually formalizes that microbes tend to maximize their growth rate. In other cases, a valid objective is not always easy to pose. For instance, cells in higher organisms have different objectives. Nonetheless, the concepts and simplicity of FBA are very appealing, and recent attempts have been made to begin with FBA and then to convert the results into nonlinear dynamical systems.

6.2. Steady-State Optimization of BST Models

As a genuine alternative to stoichiometric and flux balance analysis, attempts were made to optimize fully regulated metabolic pathway systems. However, these immediately involve nonlinearities that seem unavoidable. Intriguingly, a compromise was found in S-systems [634, 635]. As discussed before, S-system models within BST do capture the nonlinearities of metabolic systems but, at the same time, possess steady states that are characterized by systems of linear equations in logarithmic coordinates (see (28)-(29)). Furthermore, fluxes become linear in logarithmic coordinates as well, and it is mathematically equivalent for optimization purposes whether typical constraints on fluxes or metabolites are formulated in Cartesian or logarithmic coordinates. As a consequence, the entire optimization task becomes linear in a logarithmic space, and one obtains the following linear program:

(1) maximize ln(flux)

subject to

(2) steady-state equations, expressed in logarithms of variables,
(3) ln(dependent or independent variable) ≤ constant,
(4) ln(dependent or independent variable) ≥ constant,
(5) ln(dependent or independent variable) = constant,
(6) ln(dependent or independent variable) unrestricted,
(7) ln(flux) ≤ constant,
(8) ln(flux) ≥ constant,
(9) ln(flux) unrestricted,
(10) ln((flux 1)/(flux 2)) ≤ constant.

Here, (1) is the objective function, (2) ensures operation of the system at steady state, and (3)–(10) are biologically motivated constraints on various metabolites and fluxes in the system [4, 634]. Thus, arbitrarily complex S-system models can be optimized under steady-state operation with linear optimization methods.
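To see this linear program in action, the following sketch (Python) revisits the hypothetical two-variable S-system from the diagnostics section: the logarithm of the output flux $3 X_2^{0.8}$ is maximized over the logarithms of all variables, subject to the linear steady-state equations and simple bounds; all numbers are illustrative.

```python
import numpy as np
from scipy.optimize import linprog

# Maximize ln(3 * X2^0.8), i.e., 0.8 * ln X2 up to a constant, over
# y = (ln X1, ln X2, ln X3), subject to steady-state operation and bounds.
A_eq = np.array([[-0.5,  0.0, 1.0],    # steady-state rows: [A_D | A_I]
                 [ 0.5, -0.8, 0.0]])
b_eq = np.log([1.5 / 2.0, 3.0 / 1.5])

c = np.array([0.0, -0.8, 0.0])         # linprog minimizes, so use -0.8 * ln X2
bounds = [(np.log(0.2), np.log(5.0)),  # X1 within a 25-fold window
          (np.log(0.2), np.log(5.0)),  # X2 likewise
          (np.log(0.5), np.log(4.0))]  # X3, the manipulated input

res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
print(np.exp(res.x))                   # optimal X1, X2, X3 (X1 hits its bound)
```

In this toy example, the binding constraint is the upper bound on $X_1$; relaxing it would permit a further increase of the output flux, exactly the kind of trade-off the constraints (3)–(10) are designed to expose.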

Of course, the simplicity of this strategy does not come for free. An important argument against this approach is that S-system models are local approximations, so that the ranges around the operating points, where metabolites are validly represented, are limited. Thus, this strategy has to address questions regarding the accuracy of the power-law approximation (see later), which are difficult to answer in the abstract, but which have been addressed in comparative optimization studies (e.g., [4, 511, 512, 519]). Furthermore, it is possible to pursue an iteration of optimization steps, where each individual step is likely to remain within its range of validity [634]. A related question is to what degree imprecise implementations of the optimized enzyme profile would compromise the results [636].

The second argument against S-systems is their nonintuitive aggregation of fluxes at branch points, as was discussed before. This issue is seen as a particular problem if one of several branches has to be altered for optimizing the declared objective. Over the years, three compromises between the simplicity of S-system optimization and the more intuitive GMA representation were suggested. The group of Torres proposed the Indirect Optimization Method (IOM), which iteratively formulates a GMA model, converts it locally into an S-system, optimizes this S-system with strict constraints, and converts the predicted optimal solution back to the GMA form [4, 511, 637]. The results of this mixed strategy were compared with direct nonlinear optimizations and found sufficiently accurate and, at the same time, computationally much cheaper to implement and solve.

The IOM method was later extended to take static as well as dynamic features into account and to facilitate the optimization of bioprocesses with efficient methods of linear programming [638]. Xu corrected the approximation error by proposing a “modified IOM” with a Lagrangian analysis, where the objective function includes an extra term that compares the derivatives of the metabolite concentrations with respect to the enzyme activities in the original and the S-system model [639]. Subsequently, the group analyzed how uncertainties affect the results of the modified IOM approach [640]. The same investigators proposed two variations of a biobjective optimization approach based on S-system representations and a weighted-sum or iterative minimax procedure [518, 520]. Chang and Sahinidis used a complex bilevel optimization scheme to optimize S-system models of metabolic pathways, while ensuring their stability [517]. Xu’s group, as well as Sun et al., compared alternative optimization methods for biological systems, including those based on S-systems and IOM methods [209, 641].

A second strategy for avoiding accuracy problems with S-systems, which at first sounds reasonable but is more complex than one might expect, is the separation of a GMA optimization task into two sets of linear constraints: one uses the stoichiometric structure as linear steady-state constraints in Cartesian coordinates, while the other formulates every flux term as a linear equation in logarithmic coordinates [4]. So far, this simultaneous optimization in a Cartesian and a logarithmic space has not been implemented efficiently.

The third proposal for optimizing GMA systems was the use of linear or geometric programming [637, 642]. The latter optimization method is designed for tasks where both the objective function and the constraints consist of positive power-law terms, called posynomials. The only, but unfortunately crucial, issue for a direct application to GMA systems is the fact that GMA models almost always contain terms with negative signs. Approximation methods for condensing positive and negative power-law terms into strictly positive terms may be used to remedy this issue and accomplish the optimization task. This process of “condensing” is equivalent to the aggregation process that governs the conversion of GMA into S-system models, which we discussed with the example of a branched pathway.
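As a one-variable sketch of such a condensation (the coefficients, exponents, and helper function condense are hypothetical), the following code aggregates two power-law terms into a single term whose value and logarithmic slope match the sum at the chosen operating point; multivariate condensation proceeds analogously, term by term:

import numpy as np

def condense(coeffs, exps, X0):
    """Return (gamma, f) such that gamma*X**f matches sum_i c_i*X**a_i at X0."""
    terms = np.array([c * X0**a for c, a in zip(coeffs, exps)])
    total = terms.sum()
    f = np.dot(terms, exps) / total    # flux-weighted average exponent
    gamma = total / X0**f              # match the value at X0
    return gamma, f

gamma, f = condense([2.0, 0.5], [0.5, 1.2], X0=1.0)
for X in [0.5, 1.0, 2.0]:
    exact = 2.0 * X**0.5 + 0.5 * X**1.2
    print(f"X={X}: sum={exact:.3f}  condensed={gamma * X**f:.3f}")

The condensed exponent is the flux-weighted average of the individual exponents, which is exactly the first-order matching in logarithmic space that underlies the GMA-to-S-system aggregation.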

The group of Gatzke approached the GMA optimization task without capitalizing on similarities to the S-system structure. Instead, this group proposed a branch-and-reduce (BR) method, which guarantees the identification of the global optimum, and not just some local optimum [513–515]. In the BR approach, the entire solution space is subdivided into subspaces, and the clever construction of so-called underestimating functions permits the elimination of subspaces based on the fact that no solution within a given subspace can possibly be better than solutions in other subspaces. In the case of GMA systems, the construction of underestimating functions is facilitated by the fact that the only possible functions encountered are power-law functions, whose mathematical structure is well known. In a variation on this global approach, Sorribas’ group recently used a global optimization method that makes direct use of the structure of GMA systems [546, 643–645]. These authors also showed that the nonlinear global optimization method is compatible with certain types of recasting (see later), which makes the approach available for larger classes of nonlinear systems [646], including an extension of BST toward Hill-type processes, which constitute the so-called Saturable and Cooperative Formalism (SC formalism; [168, 169]).

As another alternative, Rodríguez-Acosta and collaborators proposed a stochastic optimization algorithm [510]. Similarly, Zheng and colleagues devised a stochastic optimization approach, based on information theory and clustering analysis, that assisted in the identification of local optima, while permitting the exploration of regulatory signals [647].

Shiraishi’s group proposed means for identifying bottlenecks in S-system models that prevent optimization toward higher yields [409–411, 648, 649]. Torres’ group compared different optimization approaches based on BST and lin-log models [160] and developed a software tool for optimizing metabolic system models in BST format [98].

An important extension of all strategies discussed so far is the simultaneous optimization of multiple objectives. For instance, one might want to optimize the production of a valuable organic compound, while simultaneously minimizing cost. In most cases, such multiobjective optimizations cannot satisfy all criteria simultaneously but have to arrive at a compromise, in which some objectives are addressed suboptimally in order to allow better performance in others. Within the realm of BST, multiobjective optimization has been discussed several times (e.g., [4, 318, 512, 650, 651]). As an example, Vera and colleagues simultaneously maximized ethanol production and minimized each of the internal metabolite concentrations [512]. For this task, they used an S-system representation, which they found well suited. These authors also considered investment costs and “paracosts,” which included stability, flexibility, and controllability [652]. Link et al. proposed a hybrid genetic algorithm-based method to solve constrained multiobjective optimization problems and applied it to the fermentation reaction network in yeast [651]. The simultaneous goals were to maximize ethanol production and reduce metabolic burden. Other multiobjective optimization methods were already discussed in the context of parameter estimation.

Yet another type of steady-state optimization of S-systems was proposed by Hatzimanikatis and colleagues [345, 346]. Recognizing the one-to-one mapping between parameters and system structures in BST models, the authors proposed a mixed integer linear program (MILP) to optimize not just parameter values but the entire regulatory structure of a model. In this case, the presence or absence of regulatory features was modeled by setting the corresponding kinetic orders to zero or to nonzero positive or negative values and optimizing the system within these settings.

Three applications of these optimization methods have received particular attention. The first is citric acid production in the black fungus Aspergillus niger, which was originally modeled by Torres [183, 184], as discussed earlier. Over the years, the system was optimized, the model was refined, and it was optimized again; examples are [4, 66, 535–539, 653].

The second application has been the optimization of metabolites in the fermentation pathway of yeast. Here, the original BST model had been developed by Curto et al. [185, 508, 509], and optimization studies included [3, 4, 255, 319, 409, 510–520].

The third application targeted the improved production of L-(−)-carnitine. This substance is a valuable food supplement, especially for individuals receiving HIV treatment. While it is possible to synthesize carnitine in vitro, the result is a racemic mixture that is expensive to separate. The strategy was therefore to produce carnitine in a modified E. coli strain. The original model was developed and optimized by Alvarez-Vasquez and colleagues, who used the IOM method described before [523]. Some of the predictions from the optimization analysis were subsequently tested in an experimental lab and found to be correct [654, 655]. Specifically, the investigators altered the dilution rate and the initial precursor concentration in amounts suggested by the model optimization and obtained a substantial increase in carnitine production rate. In subsequent modeling studies, the group analyzed the induction of carnitine by cAMP [656] and studied the effects of salt stress conditions on carnitine metabolism and thereby elucidated the role of carnitine as an osmoprotectant in E. coli [657].

Other model applications addressed the increased production of tryptophan in bacteria by means of genetic manipulations of the tryptophan operon, which led to the not-yet validated prediction of a strongly increased tryptophan flux [658]; see also [640, 659, 660]. Similar methods were furthermore used to detect drug targets in cases of diseases caused by enzyme failure [621], and for studies of the catalytic efficiency of the enzyme triosephosphate isomerase [661].

6.3. Optimization of Transients

All optimization tasks so far were related to optimal system operation under steady-state conditions. A conceptually and mathematically different task is the steering of a system toward a desired state or along a desired trajectory. For instance, it might be important to reach a desired operating state in as short a time period as possible. The simplest situation is the following. Suppose that a system needs to be moved to a new steady state, and the question is how (independent) control variables would have to be set to make the system reach the desired state. It is easy to imagine that the problem has many solutions. For instance, one might be able to change a few control variables a lot or many control variables a little. Also, some approaches to the new state may be faster than others.

If the model is formulated as an S-system, the task is again much facilitated by the linearity of the steady-state equations. Indeed, the task becomes a matter of inverting the system matrix in (29) [379, 380]. If the numbers of dependent and independent variables are such that the system matrix has full rank, the solution just requires a simple matrix inversion. In other cases, pseudoinversion is required. In typical cases, this pseudoinversion permits additional features to be optimized. For instance, one could demand that the control variables deviate from their normal values as little as possible.
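A minimal numerical sketch of this computation follows; the kinetic orders and the target state are hypothetical, the nominal values of all variables are normalized to 1 (so their logarithms are 0), and the steady-state condition is written here as A_D·y_D + A_I·y_I = b in logarithmic coordinates:

import numpy as np

A_D = np.array([[0.8, 0.0],
                [-0.6, 0.9]])                 # dependent-variable part
A_I = np.array([[0.5, 1.0, 0.0],
                [0.0, -0.3, 1.0]])            # independent (control) part
b = np.array([0.1, -0.2])                     # ln(beta_i/alpha_i) terms

y_D_target = np.log(np.array([2.0, 0.5]))     # desired metabolite levels
rhs = b - A_D @ y_D_target                    # A_I @ y_I must equal this
y_I = np.linalg.pinv(A_I) @ rhs               # minimum-norm control setting
print("required control fold-changes:", np.exp(y_I))

Because there are more controls than equations, the system is underdetermined, and the Moore-Penrose pseudoinverse returns the minimum-norm solution, that is, the control setting whose total deviation from the nominal state (in logarithmic coordinates) is smallest, exactly in the spirit of the additional optimization mentioned above.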

A more complicated situation arises if the desired state is not a steady state or if the time allowed to reach the desired state is finite. Ervadi-Radhakrishnan and Voit approached this problem by transforming an S-system into an affine nonlinear control system [594]. Controllability through independent variables was achieved with an exact feedback linearization method. The method was illustrated with a small glycolytic-glycogenolytic pathway model that had been proposed by Torres and which subsequently became a benchmark example for these types of control tasks [593]. The group of Meskin and Nounou proposed several variations of control schemes, based, for instance, on Kalman filters or fuzzy algebra, for moving the variables in this system to new states [595–597, 662]. Lin et al. studied the control of signaling cascades [384].

Domingues et al. considered the problem of identifying a control function that, within a finite time interval, produces the maximally possible concentration of a desired product [663]. They proposed a strategy of optimal control that combined stoichiometric information with the kinetic features of a BST model in S-system form. Using Pontryagin’s Maximum Principle, the authors showed that the control function, defined within the interval [0, 1], achieves optimality at one of the extremes, so that an optimal control regimen consists of optimized switches between 0 and 1.

In a different line of work, Long and colleagues developed a predictive control method for inferring states of a system that are impossible or difficult to measure directly [664]. Specifically, they designed a controller that takes process data at discrete time intervals and uses the measured information to formulate an optimization problem that minimizes some objective function. The optimization task is solved repeatedly and leads to an optimal control trajectory.

7. Methodological Extensions of BST

The initial rules for setting up BST models were established in Savageau’s early work [1, 10, 100–102], and they were later reviewed in a variety of papers addressing different audiences, as was discussed before. Not surprisingly, with an expansion in scope, specific methods were developed where needed, and some of these capitalized on the particular structure of BST models, while others were more generic and did not explicitly utilize the particular power-law structure.

Several BST articles addressed the topic of metabolic channeling [61, 323, 665]. This phenomenon is associated with the observation that enzymes are often located close to each other, for instance by being anchored in a membrane, which facilitates the transfer of substrates in a chain of enzymatic reactions. As a consequence, the individual reactions in the chain are not independent of each other, and the implicit assumption of a homogeneous spatial distribution of substrates and enzymes is no longer valid.

More generally, the question was asked how kinetic rate equations are affected by nonhomogeneous milieus. Inspired by precise experimental and biophysical work on intracellular kinetics [666–670], Savageau treated this situation by introducing power-law representations for fractal kinetics, which were shown to capture spatially restricted reaction systems better than, for instance, Michaelis-Menten formulations [60, 671–673]. Intriguingly, the spatial restrictions of reactions lead to much higher kinetic orders than for Michaelis-Menten reactions in a homogeneous three-dimensional space. Bajzer’s group [674–676] tested fractal power-law representations with carefully designed experiments and found them to be valid and more appropriate than models with time-dependent reaction rates. Wu et al. analyzed the same data with stochastic equivalents of GMA systems [176].

Goel and others asked to what degree ill-characterized systems may be represented with BST models. They assumed that a “concept map” of the topological interactions of an otherwise ill-defined system was known and that additional biological insights or intuition permitted a coarse description of the dynamics of the system components, following some stimuli. Based on this information, they suggested the use of a BST representation combined with inverse methods for constructing a first, broad-stroke model [45, 612]. Somewhat related is the construction of mesoscopic models of medium granularity, which may be abstracted toward the discovery of design principles or expanded toward more detailed simulation models [44].

Ubiquitous metabolites like ATP or NADH are always problematic for modeling studies, because they are involved in dozens, if not hundreds, of reactions. As a consequence, the modeler faces the conundrum of either trying to include all reactions, which is typically infeasible, or making simplifying assumptions, such as constancy of the metabolites or factors themselves or of constructs like the total ATP + ADP + AMP pool, which have their own issues. Voit and Ferreira therefore translated a common strategy from biochemistry, namely, the use of buffers, into corresponding computational buffers, which are modules within BST that either absorb excess materials or release materials that are temporarily in demand [677].

Some authors introduced slight variations on the pure BST structure. For instance, if an inhibitor is modeled with a negative kinetic order, the corresponding term grows without bound for small concentrations. A simple remedy is the representation of the inhibitor by a shifted term of the type (X + c)^g, with c > 0 and g < 0, which remains bounded as X approaches zero [564, 620].

Delays are often ignored in purely biochemical systems. However, if additional time scales must be incorporated in a model, the explicit representation of delays may be needed. Interestingly, delays can be used in BST models without destroying their mathematical structure. While the delays are approximated, the approximation error can be made arbitrarily small [172, 174, 678]. It was demonstrated that the inclusion of delays is sometimes mandatory in multi-time-scale models, lest the model results be quantitatively, and even qualitatively, compromised [422, 599].
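One generic construction of this kind, shown below purely as an illustration and not asserted to be the specific method of the references above, replaces a discrete delay tau by a chain of n first-order steps, each of which is itself a (linear) power-law process, so that the overall model remains in GMA form; the approximation improves as n grows:

import numpy as np
from scipy.integrate import solve_ivp

tau, n = 2.0, 20                     # hypothetical delay and chain length
k = n / tau                          # per-stage rate constant

def cascade(t, z):
    # A unit step input is fed through n first-order stages; the last
    # stage approximates the delayed signal u(t - tau).
    dz = np.empty(n)
    dz[0] = k * (1.0 - z[0])
    dz[1:] = k * (z[:-1] - z[1:])
    return dz

sol = solve_ivp(cascade, (0, 10), np.zeros(n),
                t_eval=np.linspace(0, 10, 200))
print("output at t = 10:", sol.y[-1, -1])    # close to 1, as expected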

Of course, it is possible to model some components of a system with power-law models and some with other structures. However, this strategy destroys some of the advantages of canonical models. If the heterogeneous components are also ODE models, the method of recasting can be used to regain a complete model in BST (see later). Yet this strategy has its own issues, as artificial variables must be introduced. Nonetheless, it is, for instance, possible to construct recast input modules, such as circadian oscillators, that govern the temporal features of a system [3, 421, 679]. If the heterogeneous components are not representable with ODEs, other strategies are needed. For instance, Wu and Voit showed how BST models can be merged with discrete and stochastic effects through their embedding in the framework of Hybrid Functional Petri Nets (HFPNs) [172, 173]. Searson and colleagues used a hybrid S-system model for the analysis of fed-batch data [360]. Vinga and colleagues discussed advantages of integrating BST models with aspects of dynamic energy budget theory, a modeling framework that represents certain constraints on the organization of metabolic networks at the organismic level in a non-species-specific manner [170].

While many studies have described the distribution of a radioisotope label in network systems at steady state, Voit and colleagues developed models to capture the exact dynamics of this distribution upon an input of radioactive material [680, 681]. The group also investigated the effect of spacing in radiotherapy, in order to develop isoeffect relationships [682], and pointed to the importance of dynamic modeling in the analysis of radiation damage [683].

Marin-Sanguino and colleagues studied the question of whether a BST model, which is expressed in terms of metabolites, could also be expressed in terms of fluxes. In other words, this “dual” system contains fluxes as dependent variables. Such a transformation to the dual form is indeed possible with methods of advanced algebra [684].

8. Mathematical Features of BST Models

8.1. Accuracy

BST models are usually developed as approximations, and their accuracy of representation is a priori unknown. Nonetheless, four aspects are clear. First, the functions to be approximated by power-law expressions are in truth almost always unknown for biological systems. They are typically complex, convoluted aggregates of functions without an evident mathematical representation. Thus, it is immediately difficult, if not impossible, to assess with generality how well BST models or other representations perform. Second, with the correct parameter values, a BST model is absolutely exact at its operating point, and it is very accurate close by, as guaranteed by Taylor’s theory. In principle, the accuracy of representation could also be estimated from Taylor’s theory, if the approximated functions were known, but such an assessment is practically infeasible in most cases [685]. Third, except for rare cases, the quality of representation eventually degrades with distance from the operating point. This simple fact has been used as a generic argument against the use of power-law models [62, 151, 686, 687]. However, the argument is not quite fair, because all dynamic models in biology suffer this problem in one form or another. Finally, it is always possible to improve the accuracy of representation, at least in principle. One option is the use of piecewise approximations, which were already mentioned in the original BST papers [101] and which are directly related to breakpoint analysis [112], one of the roots of BST. The pieces may be concatenated in an ad hoc fashion, in response to some input, or in an automated fashion that optimizes the overall data fit [36, 385, 386, 688]. A second option is the inclusion of higher-order derivatives in the Taylor approximation. Cascante et al. [689] and Guebel [397] described the details of a second-order representation. While this representation is more accurate, its format becomes so cumbersome that this approach has not been used in practical applications.
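The local nature of the approximation is easy to demonstrate with the standard power-law representation of a Michaelis-Menten rate law at an operating point S0 (parameter values below are hypothetical): the kinetic order equals the logarithmic slope Km/(Km + S0) at S0, the rate constant matches the flux there, the representation is exact at S0, and the error grows gradually with distance:

import numpy as np

Vmax, Km, S0 = 1.0, 0.5, 1.0
f = Km / (Km + S0)                 # kinetic order = d ln v / d ln S at S0
v0 = Vmax * S0 / (Km + S0)         # flux at the operating point
gamma = v0 / S0**f                 # rate constant matching v at S0

for S in [0.5 * S0, S0, 2.0 * S0, 5.0 * S0]:
    v_mm = Vmax * S / (Km + S)
    v_pl = gamma * S**f
    print(f"S={S:4.1f}  MM={v_mm:.3f}  power-law={v_pl:.3f}")

The printout also illustrates the tendency of power-law terms to overestimate hyperbolic rate laws away from the operating point, a point that resurfaces below in the comparison of S-system and GMA accuracy.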

Hernández and collaborators suggested the use of power-law models, in which the parameter values were not obtained from approximation at a single operating point, but by regression over a reasonable domain of the involved variables [192, 193]. In this case, the error between the original and the power-law function is distributed over an a priori determined range. The use of computational buffers was already mentioned [677]. By absorbing excess material and releasing material on demand, these buffers permit larger variations in system variables.

Finally, the technique of recasting permits increases in accuracy. Recasting is a method that uses equivalence transformations to convert a system of arbitrary ODEs exactly into a GMA or S-system; it will be discussed in the next section. As an illustrative example, consider the so-called Brusselator, which is a limit cycle system in the format of a GMA system [420]. Approximating two power-law terms in this system with a single power-law term, which would convert the GMA into an S-system, is locally quite accurate, but the S-system format loses the limit cycle property, whereas recasting the GMA into an S-system retains this feature perfectly [421].

In addition to these unambiguous mathematical facts, one may speculate more generically about the accuracy of BST systems. Two arguments seem reasonable. First, the fact that power-law representations permit nonlinearities seems to imply that these functions are probably better suited for biological modeling than linear functions. Second, one might argue that the accuracy of power-law approximations tends to become better for larger systems, because larger systems are usually better buffered against perturbations, and this buffering keeps the system variables in small ranges around a normal operating point. Related to this supposition is the conjecture that kinetic orders tend to become smaller in magnitude for larger systems, which in turn reduces many of the sensitivities of the system model and thus increases robustness [3].

Other assessments of accuracy fall into three categories. In the first, power-law models may be compared to functions that are assumed to be correct, at least in certain situations. The second, semiheuristic category covers cases where carefully generated experimental data suggest power-law relationships, while the third category contains cases where BST models fit experimental data very well.

8.1.1. Comparisons with Other Approximations

Before discussing details of model comparisons, one should note that many representations perform similarly well, if variables stay relatively close to their operating values, which are often taken at the normal steady state of the system. As a consequence, dynamic simulations with different model types often exhibit very similar results (e.g., [63, 186, 187]).

Shiraishi and Savageau discussed different typical alternatives for representing biochemical reactions [37]. They showed that several of these standard models are in fact special cases of power-law models and that the level of approximation plays a role in the assessment of accuracy. For instance, the Michaelis-Menten (MM) model in its explicit form is not a special case of a power-law model. However, formulating the MM processes in terms of elementary chemical reactions makes the model a mass action system and therefore a special case of a GMA system.

Several studies compared BST models against other dynamic modeling formats. Examples included comparisons with lin-log models, which perform better for very high substrate concentrations, but lead to negative rates for very small concentrations [58, 62, 63, 71, 151]. Kaddi and colleagues compared several modeling frameworks, including GMA systems, but could not identify a clear winner [690]. Similarly, Dräger et al. compared various rate laws, especially with respect to their optimization potential, and obtained mixed results. Depending on the situation, a GMA model or a combination of MM with so-called convenience kinetics produced the best results [691]. Costa et al. considered a combination of lin-log and MM models best, unless the concentrations in the system were small [167]. Hadlich and colleagues compared several modeling formats for biochemical networks and found that there is “no silver bullet of metabolic model formulation,” primarily because all models are approximations with a limited range of valid representation [692]. Similarly, Grima and Schnell compared GMA models with kinetic formulations permitting time-dependent rate coefficients, and the results were not clear cut [693].

Addressing systems with relatively few molecules, Wolkenhauer’s group compared GMA models with a compact version of a Gillespie representation for stochastic kinetic systems in the format of the Chemical Master Equation [694]. Wu characterized under what conditions such stochastic systems can be modeled as power-law functions with sufficient accuracy [176].

Perhaps surprisingly, the S-system variant is more accurate than the corresponding GMA model if MM or Hill rate laws are to be represented [695]. This finding was explained by the fact that power-law approximations tend to overestimate hyperbolic functions, such as MM, while the aggregation of terms tends to underestimate sums of processes, thereby leading to a partial compensation of errors. Similar results were demonstrated for reversible reaction systems, where different types of aggregation are possible [54–56, 454].

8.1.2. Semiheuristic Analyses of Accuracy and Heuristic Evidence

Much circumstantial support for power-law representations comes from good model fits to data characterizing systems in vitro and in vivo. Indeed, most of the applied studies described before were tested against some data and found sufficiently accurate. Of course, these fits are proof neither of correctness nor of superiority over other models, because the data often were not detailed enough to permit such assessments. Nevertheless, some types of data lend direct support to the use of power-law functions. For example, Savageau observed that many processes in vitro and in vivo naturally follow power-law relationships over surprisingly wide ranges of variation that span several orders of magnitude. Good examples include the induction characteristics of several genes [1, 10].

Kopelman’s group conducted careful experiments characterizing the kinetics of chemical reactions in homogeneous three-dimensional environments as well as in constrained, lower-dimensional spaces, such as membranes or channels [666, 667, 669, 670]. They determined that fractal kinetics, as described by Savageau [13, 60, 671], represented reality well. Similarly, Neff and Bajzer compared different model representations against experimental data in viscous media and found that fractal kinetics within BST seemed to be slightly superior to other alternatives [675, 676].

Vlad and colleagues systematically analyzed data characterizing a specific recycling process in a prothrombin assay. For the slow processes, they formulated a variant of a mass-action law, while fast reactions were modeled with delay expressions. They concluded that GMA models capture the empirical equations well [696]. In other contexts, several analyses of metabolic time series data were fitted quite well with GMA and S-system models. Examples included bacterial and yeast systems (e.g., [201, 357, 547]).

Growth may be seen as a heuristic manifestation of the long-term dynamics of organisms. To demonstrate this point, Savageau derived growth functions and the phenomenon of allometry from first principles of cellular processes [128, 697, 698]. Indeed, allometry, which describes the rate of growth of one part of an organism in relation to other parts, is directly and uniquely commensurate with power-law functions [3, 582, 697, 699–701]. More generally, many growth phenomena were represented well with power-law models within BST, sometimes directly, and sometimes using the method of recasting, which will be discussed next [25, 119, 128, 165, 423, 447, 485, 575, 576, 582, 584, 620, 697, 698, 702–705].

9. Recasting

The growth functions just mentioned present a good opportunity to discuss a very intriguing feature of BST models. Namely, any system of ODEs can be converted into an exactly equivalent BST model in GMA or S-system format, or even in the much simpler format of a binary Half-system that only has exponents of 0 or 1 [113, 706]. This generality of the BST format first became evident during the development of the fundamental theory underlying BST. Namely, Savageau noted that growth functions can be converted into equivalent S-systems through the introduction of one or two auxiliary variables [128, 698]. It was shown later that even a simple two-variable Half-system format can be used to classify most of the standard growth laws found in the literature [25, 624, 706].

The general principles of recasting are surprisingly simple. For each function that does not appear in an ODE system as a power-law term, one defines a new variable and differentiates it. Iterating this process eventually leads to a GMA system. Furthermore, the repeated definition of a product of two new variables for the sum or difference of two terms within a GMA system reduces this system to the S-system or Half-system form. The first component of this strategy was independently observed in the field of mathematical physics [707], while the second component was shown somewhat later for BST models [113, 136]. At the same time, Peschel and Mende showed that all ODE systems can be recast as Lotka-Volterra (LV) systems [134], and the equivalence of the two formats was confirmed soon after [21, 136].

The observation that essentially all nonlinearities could be equivalently represented as either LV or BST systems elevated these model representations to the status of universal formats. Hernández-Bermejo and Fairén combined the two formats into the representation of a Generalized Lotka-Volterra model, which allowed unique algebraic classification tasks, and declared this format as truly “canonical” [144, 145, 147, 708–710]. Papachristodoulou and Prajna demonstrated that recasting can be used to study nonpolynomial vector fields by recasting them into rational vector fields [711].

As an illustrative example, consider the differential equation dX1/dt = sin(X1) − X1/(K + X1), with a Michaelis-Menten constant K > 0 and initial value X1(0) = X10. Clearly, the sine function needs to be replaced, and we define X2 = sin(X1). Differentiation of X2 yields dX2/dt = cos(X1)·dX1/dt, and we define, for instance, X3 = cos(X1). Thus, we obtain the two derivatives dX2/dt = X3·dX1/dt and dX3/dt = −X2·dX1/dt. In addition, we reduce the Michaelis-Menten function by defining X4 = (K + X1)^(−1), which has the derivative dX4/dt = −X4^2·dX1/dt. The initial values are set according to the definitions of the new variables, namely, X2(0) = sin(X10), X3(0) = cos(X10), and X4(0) = (K + X10)^(−1). Taken together, we obtain

dX1/dt = X2 − X1 X4,
dX2/dt = X3 (X2 − X1 X4),
dX3/dt = −X2 (X2 − X1 X4),
dX4/dt = −X4^2 (X2 − X1 X4),

which, upon expansion of the products, consists exclusively of power-law terms and is therefore a GMA system.

A plot of the original and the recast system is shown in Figure 15. The auxiliary variables are usually not of particular interest.

In the case just discussed, the original “system” has one variable, whereas the recast system has four. The recast system is thus much “bigger.” However, correctly fixing the initial values completely defines a trajectory that is equivalent to the original equation. Thus, the original is embedded in a higher-dimensional space, where it is described exclusively by power-law functions. Several articles have discussed generic features of recasting and highlighted the fact that the process is not unique. For instance, polynomials may be recast in distinctly different ways [25, 706, 712]. Different recast versions of the same original system intersect in a high-dimensional space, and this intersection contains (or is equivalent to) the original.
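Since the reconstruction above is symbolic, a short numerical check is instructive. The following sketch (with K = 1 and X1(0) = 0.5 chosen purely for illustration) integrates the original equation and its recast power-law version and confirms that the trajectories coincide up to integration tolerance:

import numpy as np
from scipy.integrate import solve_ivp

K, X10 = 1.0, 0.5                       # assumed illustration values

def original(t, x):
    return [np.sin(x[0]) - x[0] / (K + x[0])]

def recast(t, z):
    x1, x2, x3, x4 = z   # x2 = sin(x1), x3 = cos(x1), x4 = 1/(K + x1)
    dx1 = x2 - x1 * x4
    return [dx1, x3 * dx1, -x2 * dx1, -x4**2 * dx1]

z0 = [X10, np.sin(X10), np.cos(X10), 1.0 / (K + X10)]
t = np.linspace(0, 10, 200)
a = solve_ivp(original, (0, 10), [X10], t_eval=t, rtol=1e-9, atol=1e-12)
b = solve_ivp(recast, (0, 10), z0, t_eval=t, rtol=1e-9, atol=1e-12)
print("maximal trajectory difference:", np.max(np.abs(a.y[0] - b.y[0])))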

The last step of a typical recasting process is the reduction of a GMA system to the S-system format [113]. This step raised the question of whether it is possible to transform S-systems equivalently back into GMA systems with fewer variables. As one might expect, such a reduction in dimensionality is not possible in general. However, it is possible in select cases, some of which can be characterized algebraically with methods of Lie transformations [712–714].

Recasting can be very beneficial and may, for instance, speed up numerical solutions [87, 113], but it does not eradicate complicated problems with the original system. For instance, the computation of a steady state is not much simplified, because the recast S-system tends to have a system matrix with lower than maximal rank.

Recasting can be used as a modeling tool. For example, if a system is affected by a circadian process, it may not be feasible to model this process in detail. Instead, the periodic process may be modeled by a sine function and recast as shown above [3, 421]. It is even possible to study the effects of chaos on a system by modeling the chaotic process as a recast BST system [679].

It is also to some degree possible to recast other systems of ODEs into the GMA or S-system format and then to use BST methods for further analysis. As examples, one can optimize this recast system toward the maximization of yield [646] or study features of bistability [36, 385, 386].

Outside growth functions, recasting has been widely applied to statistical distribution functions. Savageau initiated this effort by showing how all typical univariate probability functions can be embedded in a “suprasystem” of S-system models [715]. Rust, Chen, and others extended this idea to probability density functions that are otherwise difficult to evaluate, such as the noncentral t, F, beta, gamma, and chi-square distributions, as well as the distribution of the correlation coefficient [79, 716–730].

While recasting provided a means for numerically evaluating complicated probability distribution functions, the high dimensionality of Savageau’s suprasystem suggested the search for simpler alternatives. Recognizing that cumulative distribution functions always grow monotonically from 0 to 1 and very much resemble growth functions, Voit proposed a single S-system equation as a good approximation of cumulative distribution functions, calling it the S-distribution [731]. It turned out that this S-distribution has interesting features. For instance, the initial value is directly related to the median of the distribution; the rate constant, which must be the same for both terms in the S-system equation, is related to the variance; and the kinetic orders determine the shape of the distribution. Indeed, the two kinetic orders were used as a shape classification system for continuous as well as discrete distribution functions [731–733]. The same distribution was subsequently used in survival analysis and risk assessment [28, 625, 734–738], as a tool for random number generation and quantile analysis [739, 740], and for inference [741, 742]. The efficiency and flexibility of random number generation permitted the use of S-distributions for traditional and hierarchical Monte-Carlo simulations [424, 743, 744].
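In its usual form, the S-distribution is the single equation dF/dx = alpha·(F^g − F^h) with F = 0.5 at the median, where the shape condition g < h guarantees a monotone increase toward 1. The minimal sketch below (parameter values hypothetical) integrates it numerically to the right of the median:

import numpy as np
from scipy.integrate import solve_ivp

alpha, g, h, median = 1.0, 0.75, 1.5, 0.0   # hypothetical parameters

def rhs(x, F):
    return alpha * (F[0]**g - F[0]**h)

xs = np.linspace(median, 5.0, 100)
sol = solve_ivp(rhs, (median, 5.0), [0.5], t_eval=xs)
print("F at the right edge:", sol.y[0][-1])  # approaches 1 monotonically

The left tail is obtained in the same way by integrating backward from the median; quantiles and random numbers then follow directly from the fitted equation, which is one reason the distribution is convenient for Monte-Carlo work.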

Further analysis showed that the S-distribution has interesting scalability features that permit analyses analogous to z-score computations with normal distributions [81, 745].

Parameters of the distribution were estimated with least-squares and maximum likelihood methods [731, 740, 746].

Yu generalized the S-distribution to multiple variates, through the introduction of copulas [747].

Sorribas’ group used the S-distribution, as well as a generalized GS-distribution [748], in a clinical setting to analyze reference intervals of normalcy and receiver operating characteristic (ROC) curves, which offer an effective method for the evaluation of the performance of a diagnostic test [190, 749, 750]. They also used the GS-distribution to study questions in survival analysis [751]. Voit’s group used the S-distribution for health-economical questions [752].

By considering dynamic systems as modulators of the S-distribution, it became possible to characterize changes in distributional shapes over time. As examples, the sizes of girls in healthy populations and the changes in size distributions of growing tree stands were analyzed [575, 584, 702].

10. Conclusions

During the past decade, the field of systems biology has been expanding at an amazing speed, and as this paper indicates, the technological developments and applications of BST have grown at a similar rate. The question then arises: what’s next? Of course, predictions are always treacherous, but some trends are emerging on the horizon, and they appear to be very well aligned with BST.

First, paralleling the increased scope in data generation, larger and larger systems are being tackled with computational models. Typically, large systems are better buffered against perturbations than small systems. As a consequence, key variables in larger systems tend to remain within smaller ranges of variation. To the extent that this is true in specific situations, the accuracy of BST models will often be sufficient.

Second, the search for design and operating principles has become popular [78]. This search has been a central theme in BST for a long time [1]. Most of the BST analyses in this area focused on small modules that could be investigated, at least partially, with crisp mathematical methods. It seems that the next phase of this theme might be the investigation of functional and operational principles that are embedded within, or distributed throughout, larger systems [44]. The relatively new focus on design spaces is a first significant step in this direction [492, 493], but it might be that additional concepts must be conceived to grasp design and operating features spanning large systems and several organizational scales.

Third, the community of systems biologists is fully cognizant of the fact that our current tools for addressing multiscale systems are not powerful enough. This deficiency pertains to temporal scales, where several overlapping time scales preclude ignoring slow processes or treating fast ones as constant; to spatial scales, where it is not feasible to represent all processes at similarly detailed molecular resolution; and to organizational scales, where system responses extend from molecular to organismic levels and beyond. The telescopic property of power-law systems has been discussed theoretically [128], especially for different organizational scales, but it has not been used all that much in practical applications. This property may work well with ideas of mesoscopic modeling, because these models are located at a medium scale, which is extendable toward larger, more detailed systems, and also to more abstracted skeletal systems of smaller size that permit design analyses [44]. The telescopic property may also align well with methods of concept map modeling [45], where the topology of a system is complemented with coarse information, or even a biologist’s intuition, in order to convert a conceptual diagram into an initial dynamical model that can be used to sharpen intuition, gain novel insights, and suggest a new set of experimental hypotheses.

Fourth, recent developments in experimental biology have been empowering the field to generate molecular time-series data. These data mandate the creation of more efficient inverse methods for parameter estimation than are currently available. Moreover, in spite of recent and future advancements, these data can seldom be expected to be complete. Thus, systems biology must develop computational methods for filling gaps, merging heterogeneous data, and stitching series of static measurements together into dynamic trajectories. Given that the processes underlying these data are usually not fully known, the minimally biased, procedural model design methods of BST might provide excellent initial default models for this purpose. In particular, the power to get started, quasi-automatically, with the design of symbolic equations from biological diagrams, the rich repertoire of experience with parameter values, and the efficacy of the tools available for BST analyses provide superb starting conditions that may lead to the full completion of an analysis or at least show where finer or different approximations are needed. As an alternative to starting from a biological diagram, one might also start with the static results of stoichiometric or flux balance analyses and develop methods for converting these into dynamic BST models. Some rudimentary analyses of this type have been proposed [323], but much more effort is needed. As one intriguing aspect of this pursuit, computational systems analysts might become enabled to propose minimal combinations of data generation experiments that would permit valid conversions of static into dynamic models.

BST has been around for slightly over forty years, and, indeed, this may be its prime time.

Acknowledgments

This work was funded in part by the National Science Foundation (Projects MCB-0946595 (PI: EOV) and ARRA-0928196 (PI: E. Mann)), the National Institutes of Health (Projects NIH-GM063265 (PI: Y. A. Hannun) and P01-ES016731 (PI: G. W. Miller)), and the BioEnergy Science Center (BESC; PI: Paul Gilna), a U.S. Department of Energy Research Center supported by the Office of Biological and Environmental Research in the DOE Office of Science. The funding agencies are not responsible for the content of this paper.