Abstract

Monoclonal antibodies (mAbs) are at present one of the fastest growing products of pharmaceutical industry, with widespread applications in biochemistry, biology, and medicine. The operation of mAbs production processes is predominantly based on empirical knowledge, the improvements being achieved by using trial-and-error experiments and precedent practices. The nonlinearity of these processes and the absence of suitable instrumentation require an enhanced modelling effort and modern kinetic parameter estimation strategies. The present work is dedicated to nonlinear dynamic modelling and parameter estimation for a mammalian cell culture process used for mAb production. By using a dynamical model of such kind of processes, an optimization-based technique for estimation of kinetic parameters in the model of mammalian cell culture process is developed. The estimation is achieved as a result of minimizing an error function by a particle swarm optimization (PSO) algorithm. The proposed estimation approach is analyzed in this work by using a particular model of mammalian cell culture, as a case study, but is generic for this class of bioprocesses. The presented case study shows that the proposed parameter estimation technique provides a more accurate simulation of the experimentally observed process behaviour than reported in previous studies.

1. Introduction

As the market demand for monoclonal antibodies is increasing, there is significant interest in developing proper models for mammalian cell culture processes, due to the fact that these are commonly used as production platforms for mAbs, which are the fastest growing segment of the biopharmaceutical industry [16]. For mAb production, various mammalian cell lines are usually exploited, such as murine myeloma (NS0), murine hybridomas, Chinese hamster ovary (CHO), and PER.C6 human cells. The selection of expression system is determined by its capability to deliver high productivity with suitable product quality attributes [7]. Medical applications for mAbs are quite extensive: diagnostic tools, therapies for various cancers, rheumatoid arthritis, cardiovascular conditions, and so on [4, 69].

Typically, the industrial operation for mammalian cell culture mAb platforms relies on empirical knowledge [2, 3, 10] and the improvements are achieved by using trial-and-error experiments and precedent practices. Consequently, process improvements have generally been time-consuming and costly, with a high degree of specificity. To assist these laboratory experiments and, in practical terms, to achieve high productivity and better quality products, it is of obvious interest to develop model-based applications and to achieve accurate dynamical models. However, the specific characteristics of these processes, such as complexity, nonlinearity, and absence of cheap and reliable instrumentation, require an enhanced modelling effort and advanced kinetic parameter estimation strategies.

In order to surmount the above-mentioned limitations of trial-and-error process development, the so-called predictive models for mammalian cell culture processes are quite attractive [4]. Generically speaking, cell culture modelling techniques are classified based upon whether a dynamic or a pseudo-steady-state interpretation of cellular metabolism is used [2, 4, 11, 12]. Being well-known in control systems, the pseudo-steady-state approach has a biochemical interpretation in cell culture processes. It is assumed that all metabolites within the cell culture process are accumulated or depleted at a rate considerably faster than the overall cell growth rate. Consequently, the concentration of each system metabolite and the rate of each metabolic reaction are all considered time-invariant [4]. This approach is simple and the obtained models are linear systems, which can be easily computed regardless of the model size (complexity). The information gathered in such pseudo-steady-state models concerns the metabolic configuration of cell culture. However, mammalian cells have a complicated internal structure, with several interconnected biochemical processes and with phenomena on multiple time scales. Thus, the pseudo-steady-state models cannot describe in detail the changes that occur over a continuous time-horizon (intracellular concentration profiles, changes in reaction rate due to gene regulation, etc.). Therefore, the dynamic modelling is more appropriate for these complex (and dynamical) processes. In this case, a system of differential equations will describe the bioprocess model. In many cases, the difficulty that arises is related to the computational problems, especially for large and stiff systems. No matter what modelling method is chosen, the complexity together with the nonlinearity of these processes is a limiting factor in model building.

In this paper, which is an extended work of [13, 14], an essential problem in dynamic modelling of cell culture systems is analysed, the so-called parameter estimation. The model of such bioprocesses can be obtained by using dynamic classical modelling (based on mass balance) or alternative approaches such as pseudo-bond-graph method (a version of bond graph method introduced by Paynter in 1961 and further developed in [1526]). However, regardless of the modelling method, in order to obtain a dynamical model useful for process development (including the design of some control strategies), the nonmeasurable parameters of the mammalian cell culture system must to be estimated. However, any parameter in a cell culture model could [4] have physical meaning and be measurable by experiment, have clear physical meaning but be experimentally inaccessible, or have no clear physical meaning (e.g., be purely mathematical in nature). Typically, optimization-based techniques are used for the estimation of nonmeasurable parameters of such biological processes [4, 27, 28]. For example, a quadratic programming technique was used by Gao et al. [27], and a simple discretization scheme combined with a filtered interior point primal-dual line-search algorithm [29] was proposed by Baughman et al. [4]. Other nonlinear optimization-based techniques are the genetic algorithms, orthogonal collocation, and particle swarm optimization, PSO, which have been applied mostly on chemical processes (see, e.g., [30, 31]) or in gene regulatory networks modelling [32, 33]. In order to obtain accurate solutions in the case of the mAb production process, in this paper, a particle swarm-based multistep nonlinear optimization algorithm is proposed [3436].

Concerning the applications of PSO for identification of biological systems, some results were reported for the process of glycerol fermentation by Klebsiella pneumoniae in batch, fed-batch, and continuous cultures [3741]. The estimation approach used in these works is in most cases a parallel PSO technique, which requires a considerable computational effort. Another trend is related to an indirect use of PSO technique for estimation, more precisely for the training of a neural network, which models the bioprocess [42].

During the last decade, PSO algorithms have gained much attention and wide applications in different fields due to their effectiveness in performing difficult optimization issues, as well as simplicity of implementation and ability of fast converge to a reasonably good solution. PSO is a population-based heuristic global optimization technique, first introduced by Kennedy and Eberhart [43] and referred to as a swarm-intelligence technique. It is motivated from the simulation of social behaviour of animals such as bird flocking, fish schooling, and swarm. In this algorithm, the population is called a swarm and the trajectory of each particle in the search space is controlled through the medium of a term called “velocity,” according to its own flying experience and swarm experience in the search space.

This paper proposes a multistep PSO version that uses time-varying acceleration coefficients [35], which is developed to solve the nonconvex optimization problem, ensuring fast convergence and very good performance. Finally, the obtained solution is an optimal set of the kinetic parameter values.

The proposed nonlinear modelling and estimation approaches are analyzed in this work by using a particular model of mammalian cell culture, as a case study, but they are generic for this class of bioprocesses. A previously published dynamic model of mammalian cell culture by Gao et al. in [27] is used as a case study. More precisely, a process of an Immunoglobulin G-secreting murine hybridoma cultured in a growth medium supplemented with proline, L-asparagine, and L-aspartic acid is taken into consideration.

2. Methods

2.1. MAb Synthesis by Mammalian Cell Culture: Process Description and Modelling Issues

In order to model the mammalian cell culture processes, first it is necessary to analyze the reconstruction of metabolic activities. However, the reconstruction generally includes only a subset of the highly active metabolic units found in proliferating mammalian cells [4, 34]. After the choice of this key subset, the next step consists in the modelling of the reactions’ rates in the frame of reconstruction. This process is a very difficult one, and the modelling of reactions as single-enzyme processes by using in vitro kinetic parameters is possible only for simple and small reconstructions. Often, in vitro kinetic parameters do not compare well against in vivo observations [4]. The reconstruction in complex processes will frequently combine a number of discrete processes into a single lumped process and will then apply kinetic parameters to the lumped process. Consequently, some kinetic parameters that appear in the model may have small or no physical significance and usually their values are not experimentally measurable [4].

Therefore, if all of the interaction of metabolites and cell physiology are included in the modelling process, then the size of the obtained model is very large and it is not appropriate for model-based optimization and control purposes. The usual solution is to select a priori the elementary reaction schemes and to relate the major macroscopic species such as biomass, essential substrates, and products by a set of so-called macroreactions [27]. Thus, a simplified model is obtained, which is suitable for optimization and control. As was mentioned before, the next step in the modelling is related to the determination of reaction kinetics and the final model is obtained based on mass balance equations of the macroscopic species involved in the reactions.

Next, a particular model of mammalian cell culture, published by Gao et al. [27], will be described and used as a case study. Gao et al. [27] provided a detailed description of an Immunoglobulin G- (IgG-) secreting murine hybridoma (130-8F Sanofi Pasteur) cultured in a D-MEM (Dulbecco’s Modified Eagle Medium) growth medium supplemented with proline, L-asparagine, and L-aspartic acid. In this process, batch cultures of the organism were allowed to grow for a minimum of 7 days. The infrequent measured concentration data for glucose, lactate, and ammonia, as well as for 20 amino acids and the monoclonal antibody, were obtained from the collected samples via proper techniques. By using the measured data, the average rates of transmembrane fluxes were calculated for each metabolite for both the initial exponential growth phase and for the postexponential (decline) phase. Gao et al. [27] used the metabolic flux analysis (MFA) in order to calculate the unknown intracellular fluxes from measured extracellular fluxes by applying steady-state mass balance equations. The obtained metabolic network was constructed based on some preliminary studies [4447], and it represents the significant metabolic pathways in proliferating animal cells. Gao et al. [27] determined that 16 reactions (a half of the total number) in the chosen reconstruction did not function significantly, and consequently these reactions, with an activity of about 1% of the total, were eliminated. The remaining subset of 16 reactions of the reduced metabolic reconstruction was further reduced by using a technique that combines reactions that share common metabolites [48]. Finally, the reduced reaction scheme for this mAb bioprocess contains a number of only 11 extracellular compounds and it consists of nine macroreactions, presented in Table 1 [4, 27].

The dynamical model of a generic bioprocess inside a bioreactor can be obtained by using the mass balance of the component and it is given by the following set of differential equations [49]: where is the -dimensional vector of the instantaneous concentrations (the concentrations of extracellular metabolites in our particular case), is the vector of the reaction rates, and is the dimensional matrix of stoichiometric (or yield) coefficients, with , ; , where if . The notation indicates that the sum is done in accordance with the reactions that involve the components . The sign of the yield coefficients is given by the type of the component : plus (+) when the component is a reaction product and minus (−) otherwise. is the specific volumetric outflow rate (h−1), usually called dilution rate. In (1), is the vector of rates of liquid supply and is the vector of rates of removal of the components in gaseous form.

Model (1) describes in fact the behaviour of an entire class of bioprocesses and is referred to as the general dynamical state-space model of this class [49, 50]. In (1), the term is in fact the rate of consumption and/or production of the components in the reactor, that is, the reaction kinetics. The term represents the exchange with the environment. The strongly nonlinear character of this model is given by the reaction kinetics. In many practical situations the structure and the parameters of the reaction rates are partially known or even completely unknown.

Typically, in a batch process the reactor is filled with the reactant mixture: substrates and microorganisms. Then, the reactions occur inside the reactor for a time period; after that, the products are removed from the tank. Because the studied bioprocess takes place inside a batch reactor, model (1) becomes that is, the term (which represents the exchange with the environment) is zero in this particular batch mode.

For the mAb production process, the concentrations of the 11 extracellular metabolites (given in the reaction scheme from Table 1) constitute the elements of the state vector from model (1) and are denoted as follows: where GLC = glucose, GLN = glutamine, GLU = glutamate, ASN = asparagine, ASP = aspartate, LAC = lactate, ALA = alanine, PRO = proline, MAb = monoclonal antibody, BM = biomass, = ammonia are the metabolites given in Table 1 (and for simplicity, the concentrations of the corresponding elements in model (1)).

However, in order to complete the model of the mAb production process, it is necessary to add the evolution of the viable cell concentrations of the culture, because the metabolite mass balances depend on the amount of viable cells. Gao et al. [27] noticed the typical behaviour of the batch culture, with exponential growth and postexponential decline, senescence phase (which occurs after the first phase of evolution, due to the aging of the cells and the accumulation of autoinhibitory metabolites). Therefore, another two concentrations enter in the complex model of the bioprocess, the viable cell concentration and the dead cell concentration . The dynamics of these concentrations will be modelled separately, depending of the phase (growth or decay).

Remark 1. To be exact, for the mAb production process the exchange with environment is zero except the CO2 gaseous flow, but this flow is not measured and CO2 is not predicted in the final model, as it is considered in [27].

In the following, the dynamical model (2) of the mAb production process will be presented, starting with the reaction scheme given in Table 1. Afterward, the problem of kinetic rates is addressed, together with the parameter estimation problem, via PSO-based techniques.

The dynamical model of the form (2) can be particularized for the mAb production process described by the reaction scheme from Table 1 by using the mass balance of the components (via classical methods [4, 27] or bond graph approach [13]) inside the batch reactor. The following dynamical model is obtained:

Model (4) can be written in a compact form [13]:where the values of stoichiometric coefficients are given in the reaction schemes from Table 1 and are as follows [4, 27]: , , , , , , , , , , , , , , , , , , , , and .

The nonlinear dynamical model (5) is obvious from the general form (2). However, in order to complete the model of mAb production process, it is necessary to add the submodels corresponding to the dynamics of viable cell concentration and dead cell concentration, respectively. Here it should be noted that Gao et al. [27] have obtained from the experimental observations that the model describing viable cell growth changes at  h to reflect the transition from exponential growth to the decline phase. With this remark, the dynamical model of the viable and cell concentrations evolutions is as follows [27]: where is the specific growth rate of the viable cells, is a kinetic (decay) parameter, and is the time period of the exponential growth phase.

The most difficult modelling problem for the system of differential equations (5), (6) is related to the model of nonlinear reaction kinetics. Gao et al. [27] suggested that a generalized form of saturable kinetics (i.e., compound Monod kinetics) is suitable to describe the rate of each macroreaction from the reaction scheme given in Table 1. Rates for each of these macroreactions were expressed in the next compact form [4, 31]:

In the kinetic rates expression (7), is the reaction rate for reaction , is the maximum reaction rate for reaction , is the concentration of substrate within the set of substrates for reaction , and is a kinetic half-saturation constant for substrate in reaction . The specific rate expressions for each macroreaction are given in Table 2 [4, 27]. As Baughman et al. [4] noticed, the rate expressions for macroreactions 7 and 8 do not rigorously conform to the general format (7). More precisely, it was assumed that the principal rate-limiting substrate for both biomass and antibody synthesis is glutamine, and the kinetic contributions of any other substrates were thus omitted.

In conclusion, the full dynamical model of mAb production process is given by (5), where the kinetic rates are of the form presented in Table 3, together with the dynamical models (6) of viable and dead cell evolution in the batch reactor.

The state variables within the dynamical model (5)–(7) are associated with components of the macroreactions from the reaction scheme given in Table 1. While these components represent biological variables (concentrations of some substances or compounds), the kinetic parameters do not have always clear measurable physical representations.

The problem that remains to be solved now is related to the estimation of the unknown (inaccessible) kinetic parameters of the dynamical model (5), (6) of the mammalian cell culture. Therefore, it is necessary to estimate the experimentally inaccessible parameter values for the model that provide the best approximation to the measured culture concentrations data.

2.2. PSO-Based Technique Parameter Estimation
2.2.1. Problem Statement and Basic PSO Algorithms

At the beginning of parameter estimation, the input and output data are known and the real system parameters are assumed as unknown. The identification problem is formulated in terms of an optimization problem in which the error between an actual physical measured response of the system and the simulated response of a parameterized model is minimized. The estimation of the system parameters is achieved as a result of minimizing the error function by the PSO algorithm.

Consider that the nonlinear system (2) that describes the dynamical behaviour of a class of bioprocesses is written as the following -dimensional nonlinear system: where is the state vector (i.e., the vector of concentrations), is the unknown parameters vector (i.e., the vector of unknown kinetic parameters), and is a given nonlinear vector function.

To estimate the unknown parameters in (8), a parameter identification system is defined as follows: where is the estimated state vector and is the estimated parameters vector.

The objective function defined as the mean squared errors between real and estimated responses for a number of given samples is considered as fitness of estimated model parameters [14]: where is the number of measurable states and is the data length used for parameter identification, whereas and are the real and estimated values of state at time , respectively.

This objective function is a function difficult to minimize because there are many local minima and the global minimum has a very narrow domain of attraction. Our goal is to determine the system parameters, using particle swarm optimization algorithms in such a way that the value of is minimized, approaching zero as much as possible.

Mathematical description of basic PSO and some important variants is presented in the following.

Candidate solutions of a population called particles coexist and evolve simultaneously based on knowledge sharing with neighbouring particles. Each particle represents a potential solution to the optimization problem and it has a fitness value decided by optimal function. Supposing search space is -dimensional, each individual is treated as a particle in the -dimensional search space. The position and rate of position change for th particle can be represented by -dimensional vector, and , respectively. The best position previously visited by the th particle is recorded and represented by , called . The swarm best position previously visited by all the particles in the population is represented by , called . Then particles search their best position, which are guided by swarm information and their own information . Each particle modifies its velocity to find a better solution (position) by applying its own flying experience (i.e., memory of the best position found in earlier flights) and the experience of neighbouring particles (i.e., the best solution found by the population). Each particle position is evaluated by using fitness function and updates its position and velocity according to the following equations: where is iteration number, is inertia weight, and are two acceleration coefficients regulating the relative velocity toward local and global best position, and and are two random numbers from the interval . Many effects have been made over the last decade to determinate the inertia weight. Various studies have shown that under certain conditions convergence is guaranteed to a stable equilibrium point [51]. These conditions include and . The technique originally proposed was to bound velocities so that each component of is kept within the range .

Unfortunately, this simple form of PSO suffers from the premature convergence problem, which is particularly true in complex problems since the interacted information among particles in PSO is too simple to encourage a global search. Many efforts have been made to avoid the premature convergence. One solution is the use of a constriction factor to insure convergence of the PSO, introduced in [45]. Thus, the expression for velocity has been modified as where represents the constriction factor and is defined as

In this variant of the PSO algorithm, controls the magnitude of the particle velocity and can be seen as a dampening factor. It provides the algorithm with two important features [52]. First, it usually leads to faster convergence than standard PSO. Second, the swarm maintains the ability to perform wide movements in the search space, even if convergence is already advanced but a new optimum is found. Therefore, the constriction PSO has the potential to avoid being trapped in local optima while possessing a fast convergence. It was shown to have superior performance compared to a standard PSO [53].

It is shown that a larger inertia weight tends to facilitate the global exploration and a smaller inertia weight achieves the local exploration to fine-tune the current search area. The best performance could be obtained by initially setting to some relatively high value (e.g., 0.9), which corresponds to a system where particles perform extensive exploration, and gradually reducing to a much lower value (e.g., 0.4), where the system would be more dissipative and exploitative and would be better at homing into local optima. In [54], a linearly decreased inertia weight over time is proposed, where is given by the following equation: where , are starting and final values of inertia weight, respectively, is the maximum number of the iteration, and is the current iteration number. It is generally taken that starting value is and final value is .

On the other hand, in [35] PSO was introduced with time-varying acceleration coefficients. The improvement has the same motivation and similar techniques as the adaptation of inertia weight. In this case, the cognitive coefficient is decreased linearly and the social coefficient is increased linearly over time as follows: where and are the initial values of the acceleration coefficients and and and are the final values of the acceleration coefficients and , respectively. Usually, ; ; ; and [14, 35, 36].

The dynamical model of mAb production process given by the relations (5), (6), associated with the expressions of the kinetic rates presented in Table 3, contains a number of 23 kinetic parameters (maximum reaction rates and kinetic half-saturation constants). In order to estimate these unknown (inaccessible) kinetic parameters of the complex dynamical model of mammalian cell culture, the measured concentrations are used and a PSO-based algorithm is implemented. The goal is to obtain a model that approximates as well as possible the behaviour of the process (expressed by means of the experimentally obtained data). The model of mAb production process under investigation is in fact based on several macroreactions; therefore it results in the fact that the kinetic parameters do not have always clear measurable physical representations. Thus, an optimization-based estimation technique is suitable for this set of kinetic parameters.

2.2.2. Implementation of PSO-Based Technique

In the following, a multistep PSO-based version that uses time-varying acceleration coefficients is implemented and an optimal set of kinetic parameter values of the mAb production process is obtained.

In order to implement the PSO-based technique, the model of mAb production process (5), (6) obtained from the macroreactions schemes is used, translated into the generic parameter identification system represented in (9).

The experimental concentration values for all the involved extracellular metabolites are provided in the work of Gao et al. [27]. The batch cultures of the organism were allowed to grow for 147 h, with 54 h the exponential growth phase and 93 h the postexponential (decline) phase. The infrequent measured concentration data for the metabolites were obtained from the collected samples via proper techniques. The set of concentrations measurements are given in Table 3 [4, 27]. Each data point is the average of measurements taken from three independent experiments, with standard deviation [4, 27].

To facilitate the application of the proposed PSO-based parameter estimation strategy, the time derivatives of the states from model (5)–(7) must be reconstructed. Because the measured data are very few (only 7 experimental measurements for each parameter, see Table 3), an interpolation method is necessary to find intermediate values of the states, which are actually the biological parameters of the process, that is, the concentrations of metabolites. Such situations with a small number of experimental measurements are typical for many bioprocesses. Ideally speaking, the online measurements (in each sampling moment, at every 6 min., e.g.) for each concentration are necessary. However, these online measurements are achieved with expensive instrumentation, or there are no such fast sensors for some concentrations. Thus, the infrequent offline measurements are preferred. To facilitate the achievement of an accurate estimation of model parameters of mAb process, we need the interpolation of these measured data, which allows us to obtain the unavailable data between adjacent measurement points (i.e., to estimate the unavailable data needed to calculate model predictions between these measurement points).

Remark 2. From mathematical point of view, a discussion about the interpolation technique can be done. Many authors use the linear interpolation, with advantages related to rapidity and simple implementation [4]. However, the linear interpolation is not very precise. Another disadvantage is that the interpolant is not differentiable at the points where the value of the function is known. Therefore, we propose a cubic interpolation method that is the simplest method that offers true continuity between the measured data. A cubic Hermite spline or cubic Hermite interpolator is a spline where each piece is a third-degree polynomial specified in Hermite form, that is, by its values and first derivatives at the end points of the corresponding domain interval. Cubic Hermite splines are typically used for interpolation of numeric data specified at given argument values , to obtain a smooth continuous function. The Hermite formula is applied to each interval (, ) separately. The resulting spline will be continuous and will have continuous first derivative.

The time derivatives of the states are approximated using forward differences: where represents the sampling period. In this approximation, the error is proportional with the sampling interval (a smaller sampling period will give a smaller approximation error).

Because a 23-dimensional optimization problem that must be solved for simultaneously estimation of all unknown parameters requires great computational resources, a multistep approach was used. So, the problem was split in nine simpler problems that are solved sequentially until all 23 parameters are found. These problems are noted with and the corresponding resulted parameters are presented in Table 4. A flowchart of the multistep PSO algorithm is given in Figure 1.

For example, the problem corresponds to the 10th equation from system (5)-(6) (that represents time evolution of the biomass) and only two parameters must be estimated in this case: and . The PSO algorithm is used to minimize the sum of the square errors between measured and estimated data:

3. Results and Discussion

3.1. Optimal Set of Kinetic Parameters

The optimization problem formulated in the previous section is nonlinear and nonconvex with many local minima. The estimated parameters of one subproblem are then considered known in the subsequent equations. In order to be clear, the already estimated parameters are not updated between solutions of subproblems. For example, in the frame of problem two parameters are estimated: and . These parameters will be available in the next problem (), and so on. In this study, all the computations were achieved with a sampling period  min (0.1 h). As example, for problem a number of 150 particles randomly initialized were used. The algorithm stops if the square error is smaller then or after 300 iterations. The optimal set of kinetic parameter values of the mAb production process obtained via this multistep PSO-based approach is given in Table 5.

The partition of the multidimensional optimization problem proposed within our PSO algorithm not only ensures the decrease of necessary computational resources by comparison with Gao et al. [27] and Baughman et al. [4] approaches but additionally offers a solution for the reported problems concerning the stiffness of estimated parameter set. More precisely, as Baughman et al. [4] noticed, there are some concentrations such as the mAb concentration that are several orders of magnitude less than other metabolites. This fact leads to stiffness problems in the optimization procedure, which are partially solved in [4] by using an alternative error objective for the mAb concentration. In our approach, the partition in simpler PSO problems solves uniformly this issue, using the same error objective for the entire set of parameters.

Another important problem approached and solved by using the proposed PSO method is related to the expressions of reaction kinetics. To simplify the optimization problem, Gao et al. [27] considered that the values of half-saturation constants are sufficiently small and consequently the kinetic rates given in Table 2 can be simplified such that the relation (7) becomes , . This simplification eliminates the necessity of half-saturation constants estimation, and only the maximum reaction rates need to be estimated. However, as was mentioned in [4], half-saturation constants can be significantly smaller than the corresponding substrate concentration in processes controlled by a single enzyme (e.g., glucose transport), but this fact is not necessarily true for the macroreactions in mAb production processes. Therefore, Gao et al. assumption is unwarranted and it can affect the reliability of the model. This is one of the reasons because the proposed PSO method yield good fitting results compared with Gao et al. [27], as it can be seen in Figures 25.

Remark 3. There are necessary some comments concerning the overfitting problems. Overfitting arises when a statistical model describes noise instead of the underlying relationship; it usually occurs when a model is very complex, such as having many parameters relative to the number of observations. Even though the approached mAb production model is quite complex, it is not a statistical model. Also, even if the number of measured concentration samples is relatively small by comparison with the number of parameters, the PSO technique is an optimization procedure, which is much less sensitive to overfitting than the methods that are based on model training, such as neural network techniques (see Tetko et al. [55]). The potential for overfitting depends not only on the number of parameters and measured data but also on the conformability of the model structure with the data shape.

Some comparisons of the proposed PSO approach with other PSO applications to bioprocesses can be done. Most of the reported works were focused on the process of glycerol fermentation by Klebsiella pneumoniae in batch, fed-batch, and continuous cultures [3741]. Shen et al. [37] studied a mathematical model of Klebsiella pneumoniae in a continuous culture. An eight-dimensional nonlinear dynamical system was obtained, and a parallel PSO technique was implemented in order to identify 19 parameters. The identification results are compared only with experimental steady-state values. The reported mean relative errors between the computational values and the experimental data are quite large (between 8% and 13%). A similar model of bio-dissimilation of glycerol by K. pneumoniae in a continuous culture was widely analyzed by Zhai et al. in [38]. Here a parallel PSO pathways identification algorithm was constructed to find the optimal pathway and 21 parameters under various conditions. The combined estimation of pathways and process parameters leads to a vast identification model, solved on a cluster server with 16 nodes (each node with 4 Core, 64-bit, 2.5 GHz processor), in over 130 hours. Comparable PSO approaches were used in [39, 41] in the case of the same fermentation process, but in a batch culture. For example, Yuan et al. [39] used a parallel migration PSO algorithm to estimate pathways and 12 parameters of the eight-dimensional model. The identification problem was split into two subproblems (one for pathways and one for parameters) and solved on the above cluster in about 18 hours. Another work addressed the PSO identification in the case of the fermentation of glycerol by K. pneumoniae in a fed-batch culture [40]. A nonlinear hybrid system was developed (with seven differential equations plus a switching mechanism and 8 parameters). The proposed technique was an asynchronous parallel PSO, and the reported averaged computational time was about 3.26 h, on the above-mentioned cluster server with 16 nodes. The reported results in [3840] are very good, even the accuracy of the algorithms cannot be fully assessed (statistical reports were not provided). However, the computational effort is considerable, given the fact that simultaneous pathways and parameters identification was approached.

The particle swarm-based multistep nonlinear optimization algorithm proposed in the present work was used for the estimation of 23 parameters of an eleven-dimensional nonlinear system (the pathways identification was not considered). By using the multistep approach, the computational effort was quite small (about 20 min. on a computer with Intel Core i5, 64-bit, 3.3 GHz processor). The obtained results and the statistical analysis show a good accuracy of the identification results.

3.2. Simulation Results

The performance of the proposed estimation technique was analyzed by using numerical simulations. All these simulations are achieved by using the development, programming, and simulation environment MATLAB (registered trademark of The MathWorks, Inc., USA). For comparison, the simulated profiles based on the kinetic parameters obtained via PSO technique (Table 5) are represented together with the original system measurements [27] and with the profiles obtained by Gao et al. [27] and Baughman et al. [4], respectively. The concentration profiles based on the results of Gao et al. and on the results of Baughman et al. [4] approach, respectively, are simulated and plotted using the kinetic parameter values given in Table 6 [4, 27].

The simulated concentration profiles are presented in Figures 25. First, in Figure 2 the time evolution of the biomass concentration is depicted. As can be seen, the best matching with the measured data (the values of measured data plotted in all figures are taken from Table 3) is given by the PSO approach. Figure 3 presents the simulated concentration profiles of glucose, glutamine, asparagine, and aspartate. In all cases, the PSO proposed technique ensures the best estimates (in the case of asparagine, the results of Gao et al. [27] are comparable with those obtained using PSO). In Figure 4, the concentration profiles of lactate, proline, alanine, and monoclonal antibody are plotted. The best matching with the measured data is obtained with the PSO estimation technique for lactate, proline, and alanine. In the case of antibody, the best results were obtained by Baughman et al. [4]. Finally, Figure 5 shows the concentration evolutions of glutamate, ammonia, viable, and dead cells, respectively. The glutamate and ammonia concentrations are not measured; the time profiles of these variables were obtained from the dynamic model simulation. The viable cells and dead cells evolution obtained via PSO estimation match very well the measured data.

Since in industrial practice the measured data are affected by various disturbances, one explored the extent to which noisy measurements affects the estimated parameter values. For this reason, a Monte Carlo simulation approach was used. First, normal (Gaussian) distributions were constructed for every measured data set in Table 3, subject to the known mean and standard deviation of each point.

A set of 150 simulated measurement sets were generated. Finally, using each randomized data set, a new cubic interpolation of the data was generated for our standard condition and the parameter estimation problems () were solved for each case.

Outlying solutions were identified and excluded using a basic quartile classification method. The quartile values are chosen in the following manner. First, use the median to divide the ordered data set into two halves. The median is not included into the halves. Then, the lower quartile value is the median of the lower half of the data () and the upper quartile value is the median of the upper half of the data (). Namely, the lower and upper quartile () positions were found for the set of 150 objective values and the interquartile range was calculated. Solutions with objective values lying outside the interval were considered to be outlying cases. Using the remaining solutions, mean values and associated standard deviations were calculated for each estimated parameter.

The standard deviations were then converted to percentages of their associated mean value. These means and standard deviations are listed in Table 7.

Certain parameter estimates are much more susceptible to variability induced through perturbations in measured data than are others. It can be seen that certain parameters (, , ) are less sensitive to noisy measurements than are certain others ().

The proposed modelling and parameter estimation method can be applied to cellular processes described by the general form (1), and it is not yet applicable to all classes of bioprocesses. To be more specific, it is hard to be applied to processes characterized by phenomena such as propagation reactions, transport processes, latency and short intercellular phases (in epidemics), and spread (propagation) of infections, that is, processes with large heterogeneity and delays. A typical class of such nonlinear delay biosystems is represented by the dynamics models describing cell-to-cell spread mechanisms, encountered, for example, in HIV infections [56, 57].

4. Conclusions

In order to develop accurate models for mammalian cell culture processes and to overcome some of the specific problems of mAb production processes such as the nonlinearity, the absence of instrumentation, and the kinetics uncertainties, a multistep nonlinear particle swarm optimization-based technique for the estimation of experimentally unavailable kinetic parameters was designed and implemented. The proposed approach was tested by using a particular dynamical model of mammalian cell culture, as a case study, but is generic for this class of bioprocesses. We have established the capability of proposed technique to identify model parameters that provide an accurate simulation of experimentally observed mAb production process behaviour. The performed statistical analysis demonstrates that the proposed estimation method is robust against normal distributed noisy measurements. The simulations showed that the PSO parameter estimation technique provides more accurate results than those reported in previous studies.

The obtained dynamical model of the mAb production process is accurate and can contribute to the development of model-based applications, which lead to high productivity and better quality products. The performed simulations represent one of the possibilities of model validation. The results show that the proposed model offers good predictions not only of the cell culture, for instance predictions of concentrations of energy sources such as glutamine and glucose, but also of the main amino acids and products. The proposed estimation approach can be also applied to other bioprocesses belonging to the nonlinear class considered in the present study.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work was supported by UEFISCDI, Project PACBIO no. 701/2013 (French-Romanian project) and Project ADCOSBIO no. 211/2014, PN-II-PT-PCCA-2013-4-0544.