Journal of Chemistry

Volume 2017 (2017), Article ID 6560983, 12 pages

https://doi.org/10.1155/2017/6560983

## Robust Nonlinear Regression in Enzyme Kinetic Parameters Estimation

^{1}Faculty of Chemistry and Technology, University of Split, Ruđera Boškovića 35, 21000 Split, Croatia

^{2}Faculty of Electrical Engineering, Mechanical Engineering and Naval Architecture, University of Split, Ruđera Boškovića 32, 21000 Split, Croatia

Correspondence should be addressed to Tea Marasović; tmarasov@fesb.hr

Received 18 October 2016; Accepted 6 February 2017; Published 5 March 2017

Academic Editor: Murat Senturk

Copyright © 2017 Maja Marasović et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Accurate estimation of essential enzyme kinetic parameters, such as $K_m$ and $V_{\max}$, is very important in modern biology. To date, linearization of kinetic equations remains a widely established practice for determining these parameters in chemical and enzyme catalysis. Although the simplicity of linear optimization is alluring, these methods have certain pitfalls that more often than not lead to misleading estimates of enzyme parameters. In order to obtain more accurate predictions of parameter values, the use of nonlinear least-squares fitting techniques is recommended. However, when there are outliers present in the data, these techniques become unreliable. This paper proposes the use of a robust nonlinear regression estimator based on a modified Tukey’s biweight function that can provide more resilient results in the presence of outliers and/or influential observations. Real and synthetic kinetic data have been used to test our approach. Monte Carlo simulations are performed to illustrate the efficacy and the robustness of the biweight estimator in comparison with the standard linearization methods and the ordinary least-squares nonlinear regression. We then apply this method to experimental data for the tyrosinase enzyme (EC 1.14.18.1) extracted from *Solanum tuberosum*, *Agaricus bisporus*, and *Pleurotus ostreatus*. The results on both artificial and experimental data clearly show that the proposed robust estimator can be successfully employed to determine accurate values of $K_m$ and $V_{\max}$.

#### 1. Introduction

Enzymes are molecules that act as biological catalysts and are responsible for maintaining virtually all life processes. Most enzymes are proteins, although a few are catalytic RNA molecules. Like all catalysts, enzymes increase the rate of chemical reactions without themselves undergoing any permanent chemical change in the process. They achieve their effect by temporarily binding to the substrate and, in doing so, lowering the activation energy needed to convert it to a product. The study of the rate at which an enzyme works is called enzyme kinetics, and it is often regarded as one of the most fascinating research areas in biochemistry [1].

Mathematically, the relationship between substrate concentration and reaction rate under isothermal conditions for many enzyme-catalyzed reactions can be modeled by the Michaelis-Menten equation [2]:

$$v = \frac{V_{\max}[S]}{K_m + [S]}, \tag{1}$$

where $v$ denotes the reaction rate, $[S]$ is the substrate concentration, $V_{\max}$ is the maximum initial velocity, which is theoretically attained when the enzyme has been “saturated” by an infinite concentration of substrate, and $K_m$ is the Michaelis constant, representing a measure of affinity of the enzyme-substrate interaction. By definition, $K_m$ is equal to the concentration of the substrate at half maximum initial velocity. The Michaelis constant, $K_m$, is an intrinsic parameter of enzyme-catalyzed reactions and it is significant for its biological function [3].
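As a quick numerical illustration of these definitions, the following minimal sketch evaluates the Michaelis-Menten rate law at two substrate concentrations. The function name `michaelis_menten` and the parameter values are hypothetical and chosen only for illustration.

```python
def michaelis_menten(s, km, vmax):
    """Reaction rate v at substrate concentration s (same units as km)."""
    return vmax * s / (km + s)


KM, VMAX = 2.0, 10.0  # hypothetical parameter values, not from the paper

# At [S] = Km the rate is half of Vmax, which is how Km is defined.
half_rate = michaelis_menten(KM, KM, VMAX)

# At a very large [S] the rate approaches, but never reaches, Vmax.
near_saturation = michaelis_menten(1000 * KM, KM, VMAX)

print(half_rate, near_saturation)
```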

The three most common methods in the literature for determining the parameters of the Michaelis-Menten equation from a series of measurements of velocity as a function of substrate concentration are the Lineweaver-Burk plot (also known as the double reciprocal plot), the Eadie-Hofstee plot, and the Hanes-Woolf plot. All three of these methods are linearized models that transform the original Michaelis-Menten equation into a form which can be graphed as a straight line.

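The Lineweaver-Burk [4] (LB) plot, still the most popular and favored plot amongst researchers, is defined by the equation

$$\frac{1}{v} = \frac{K_m}{V_{\max}} \cdot \frac{1}{[S]} + \frac{1}{V_{\max}}. \tag{2}$$

The $y$-intercept in this plot is $1/V_{\max}$, the $x$-intercept in the second quadrant represents $-1/K_m$, and the slope of the line is $K_m/V_{\max}$.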

The Eadie-Hofstee [5] (EH) plot is a semireciprocal plot of $v$ versus $v/[S]$. The linear equation has the following form:

$$v = -K_m \frac{v}{[S]} + V_{\max}, \tag{3}$$

where the $y$-intercept is $V_{\max}$ and the slope is $-K_m$.

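In the Hanes-Woolf [6] (HW) plot, $[S]/v$ is plotted against $[S]$. The linear equation is given by

$$\frac{[S]}{v} = \frac{1}{V_{\max}} [S] + \frac{K_m}{V_{\max}}, \tag{4}$$

where the $y$-intercept is $K_m/V_{\max}$ and the slope is $1/V_{\max}$.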

In all of the above-described linear transformations, linear regression is used to estimate the slope and intercept of the straight line, and afterwards $K_m$ and $V_{\max}$ are computed from the straight line parameters. Although these methods are very useful for data visualization and are still widely employed in enzyme kinetic studies, each of them possesses certain deficiencies, which make them prone to errors. For instance, the Lineweaver-Burk plot has the disadvantage of compressing the data points at high substrate concentrations into a small region and emphasizing the points at lower substrate concentrations, which are often the least accurate [7]. The $y$-intercept in the Lineweaver-Burk plot is equivalent to the inverse of $V_{\max}$, due to which any small error in measurement gets magnified. Similarly, the Eadie-Hofstee plot has the disadvantage that $v$ appears on both axes; thus, any experimental error will also be present in both axes. In addition, experimental errors or uncertainties are propagated unevenly and become larger over the abscissa, thereby giving more weight to smaller values of $v/[S]$. The Hanes-Woolf plot is the most accurate of the three; however, its major drawback is that again neither ordinate nor abscissa represents independent values: both are dependent on substrate concentration.
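The three linearizations can be sketched in a few lines of plain Python. The example below is only an illustration under stated assumptions: the helper names (`linear_fit`, `lineweaver_burk`, `eadie_hofstee`, `hanes_woolf`) are hypothetical, and the data are synthetic, generated from assumed values $K_m = 2$ and $V_{\max} = 10$. On noise-free data all three transformations recover the parameters; perturbing a single low-concentration reading lets one compare how far each estimate moves.

```python
def linear_fit(x, y):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def lineweaver_burk(s, v):
    # 1/v = (Km/Vmax) * (1/[S]) + 1/Vmax
    slope, intercept = linear_fit([1 / si for si in s], [1 / vi for vi in v])
    vmax = 1 / intercept
    return slope * vmax, vmax  # (Km, Vmax)


def eadie_hofstee(s, v):
    # v = -Km * (v/[S]) + Vmax
    slope, intercept = linear_fit([vi / si for si, vi in zip(s, v)], v)
    return -slope, intercept  # (Km, Vmax)


def hanes_woolf(s, v):
    # [S]/v = (1/Vmax) * [S] + Km/Vmax
    slope, intercept = linear_fit(s, [si / vi for si, vi in zip(s, v)])
    vmax = 1 / slope
    return intercept * vmax, vmax  # (Km, Vmax)


# Synthetic data from assumed "true" parameters (hypothetical values).
KM_TRUE, VMAX_TRUE = 2.0, 10.0
s = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
v_exact = [VMAX_TRUE * si / (KM_TRUE + si) for si in s]

# Same data with one hypothetical 20% low reading at the lowest [S].
v_perturbed = list(v_exact)
v_perturbed[0] *= 0.8

methods = {"LB": lineweaver_burk, "EH": eadie_hofstee, "HW": hanes_woolf}
exact = {name: fit(s, v_exact) for name, fit in methods.items()}
perturbed = {name: fit(s, v_perturbed) for name, fit in methods.items()}

for name in methods:
    print(name, "exact:", exact[name], "perturbed:", perturbed[name])
```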

In order to reduce the errors due to the linearization of parameters, Wilkinson [8] proposed the use of least-squares nonlinear regression for more accurate estimation of enzyme kinetic parameters. Nonlinear regression allows direct determination of parameter values from untransformed data points. The process starts with initial estimates and then iteratively converges on parameter estimates that provide the best fit of the underlying model to the actual data points [9, 10]. The algorithms used include the Levenberg-Marquardt method, the Gauss-Newton method, the steepest-descent method, and simplex minimization. Numerous software packages, such as Excel, MATLAB, and GraphPad Prism, now include readily available routines and scripts to perform nonlinear least-squares fitting [11, 12].
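To make the iterative procedure concrete, here is a minimal sketch of one of the listed algorithms, an undamped Gauss-Newton iteration, applied directly to untransformed Michaelis-Menten data. It is an illustration rather than the routine used in any particular package: the function names, the synthetic data, and the starting values are all assumptions, and a production fit would normally add step control (as Levenberg-Marquardt does).

```python
def michaelis_menten(s, km, vmax):
    return vmax * s / (km + s)


def fit_mm_gauss_newton(s, v, km0, vmax0, max_iter=50, tol=1e-10):
    """Least-squares fit of (Km, Vmax) by plain Gauss-Newton iterations."""
    km, vmax = km0, vmax0
    for _ in range(max_iter):
        # Residuals of the current fit and the two Jacobian columns.
        r = [vi - michaelis_menten(si, km, vmax) for si, vi in zip(s, v)]
        j_km = [-vmax * si / (km + si) ** 2 for si in s]  # d model / d Km
        j_vm = [si / (km + si) for si in s]               # d model / d Vmax

        # Normal equations (J^T J) d = J^T r for the 2x2 case.
        a = sum(x * x for x in j_km)
        b = sum(x * y for x, y in zip(j_km, j_vm))
        c = sum(y * y for y in j_vm)
        g1 = sum(x * ri for x, ri in zip(j_km, r))
        g2 = sum(y * ri for y, ri in zip(j_vm, r))
        det = a * c - b * b
        d_km = (c * g1 - b * g2) / det
        d_vm = (a * g2 - b * g1) / det

        km, vmax = km + d_km, vmax + d_vm
        if abs(d_km) + abs(d_vm) < tol:
            break
    return km, vmax


# Synthetic, noise-free data from assumed parameters Km = 2, Vmax = 10.
s = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
v = [michaelis_menten(si, 2.0, 10.0) for si in s]

# Deliberately rough starting values (hypothetical).
km_hat, vmax_hat = fit_mm_gauss_newton(s, v, km0=1.0, vmax0=8.0)
print(km_hat, vmax_hat)
```

Because the sums of squares treat every residual alike, a single grossly wrong reading in `v` would pull both estimates away from the values that the remaining points support.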

Least-squares nonlinear regression has been criticized for its performance in dealing with experimental data. This is mainly due to the fact that the implicit assumptions associated with nonlinear regression are in general not met in the context of deviations that appear as a result of biological errors (e.g., variations in the enzyme preparations due to oxidation or contamination) and/or experimental errors (e.g., variations in measured volume of substrates and enzymes, imprecision of the instrumentation). With the presence of outliers or influential observations in the data, the ordinary least-squares method can result in misleading values for the parameters of the nonlinear regression and estimates may no longer be reliable [13].

In this paper, we propose the use of a robust nonlinear regression estimator based on a modified Tukey’s biweight function for determining the parameters of the Michaelis-Menten equation using experimental measurements in enzyme kinetics. The main idea is to fit a model to the data that gives resilient results in the presence of influential observations and/or outliers. To the best of our knowledge, this is the first study that examines the use of this technique for application in Michaelis-Menten enzyme analysis. We employ Monte Carlo simulations to validate the efficacy of the proposed procedure in comparison with the ordinary least-squares method and the Eadie-Hofstee, Hanes-Woolf, and Lineweaver-Burk plots. In addition, we illustrate the viability of our method by estimating the kinetic parameters of tyrosinase extracted from potato and two edible mushrooms. Tyrosinase is an important enzyme, widely distributed in microorganisms, animals, and plants, that is responsible for melanin production in mammals and enzymatic browning in plants.

The remainder of the paper is organized as follows. Section 2 provides a brief overview of the robust estimation model. Section 3 describes the experimental setup used in this research and the diagnostics that will be used to evaluate the effectiveness of the proposed procedure in determination of enzyme kinetic parameters. Results are discussed in Section 4. Finally, Section 5 summarizes the paper with a few concluding remarks.

#### 2. Robust Nonlinear Regression

Nonlinear regression, like linear regression, relies heavily on the assumption that the scatter of data around the ideal curve follows, at least approximately, a Gaussian or normal distribution. This assumption leads to the well-known regression goal: to minimize the sum of the squares of the vertical distances (a.k.a. residuals) between the points and the curve. In practice, however, this assumption does not always hold true. Analytical data often contain outliers that can play havoc with standard regression methods based on the normality assumption, causing them to produce more or less strongly biased results, depending on the magnitude of deviation and/or sensitivity of the procedure. It is not unusual for an appreciable percentage of the observations in the data sets of some processes to be outlying [14].

Outliers are most commonly thought to be extreme values that result from measurement or experimental errors. Barnett and Lewis [15] provide a more cautious definition of the term outlier, describing it as the observation (or subset of observations) that appears to be inconsistent with the remainder of the dataset. This definition also includes the observations that do not follow the majority of the data, such as values that have been measured correctly but are, for one reason or another, far away from other data values. The formulation “appears to be inconsistent” reflects the subjective judgement of the observer as to whether or not an observation is declared to be outlying.

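The ordinary least-squares (OLS) estimate of the parameter vector $\boldsymbol{\beta}$ is obtained as the solution of the problem

$$\hat{\boldsymbol{\beta}}_{\mathrm{OLS}} = \arg\min_{\boldsymbol{\beta}} \sum_{i=1}^{n} \left( y_i - f(\mathbf{x}_i, \boldsymbol{\beta}) \right)^2, \tag{5}$$

where $n$ denotes the number of observations, $\mathbf{X}$ is an $n \times p$ matrix whose rows $\mathbf{x}_i$ are $p$-dimensional vectors of predictor variables (or regressors), $\mathbf{y}$ is an $n \times 1$ vector of responses, and $f$ is the model function. Since all data points are attributed the same weights, OLS implicitly favors the observations with very large residuals and, consequently, the estimated parameters end up distorted if outliers are present.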

In order to achieve robustness in coping with the problem of outliers, Huber [16] introduced a class of so-called $M$-estimators, for which the sum of a function $\rho$ of the residuals is minimized. The resulting vector of parameters estimated by an $M$-estimator is then

$$\hat{\boldsymbol{\beta}}_{M} = \arg\min_{\boldsymbol{\beta}} \sum_{i=1}^{n} \rho\!\left( \frac{r_i}{\sigma} \right). \tag{6}$$

The residuals $r_i = y_i - f(\mathbf{x}_i, \boldsymbol{\beta})$ are standardized by a measure of dispersion $\sigma$ to guarantee scale equivariance (i.e., independence with respect to the measurement units of the dependent variable). The function $\rho$ must be even, nondecreasing for positive values, and less increasing than the square.

The minimization in (6) can always be done directly. However, often it is simpler to differentiate the function $\rho$ with respect to $\boldsymbol{\beta}$ and solve for the root of the derivative. When this differentiation is possible, the $M$-estimator is said to be of $\psi$-type. Otherwise, the $M$-estimator is said to be of $\rho$-type.

Let $\psi = \rho'$ be the derivative of $\rho$. Assuming $\sigma$ is known and defining weights $w_i = \psi(u_i)/u_i$ with $u_i = r_i/\sigma$, the estimates can be obtained by solving the system of equations

$$\sum_{i=1}^{n} w_i \, r_i \, \frac{\partial f(\mathbf{x}_i, \boldsymbol{\beta})}{\partial \beta_j} = 0, \quad j = 1, \ldots, p. \tag{7}$$

The weights are dependent upon the residuals, the residuals are dependent upon the estimated coefficients, and the estimated coefficients are dependent upon the weights. Hence, to solve for $M$-estimators, an iteratively reweighted least-squares (IRLS) algorithm is employed. Starting from some initial estimates $\hat{\boldsymbol{\beta}}^{(0)}$, at each iteration until it converges, this algorithm computes the residuals and the associated weights from the previous iteration and yields new weighted least-squares estimates.
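For readers who want the intermediate step, the estimating equations can be obtained by differentiating the $M$-estimation objective with respect to each parameter. The derivation below is a sketch in assumed notation ($r_i = y_i - f(\mathbf{x}_i, \boldsymbol{\beta})$, $u_i = r_i/\sigma$, $\psi = \rho'$, $w_i = \psi(u_i)/u_i$), with $\sigma$ treated as fixed.

```latex
\begin{align*}
\frac{\partial}{\partial \beta_j} \sum_{i=1}^{n} \rho\!\left(\frac{r_i}{\sigma}\right)
  &= -\frac{1}{\sigma} \sum_{i=1}^{n} \psi(u_i)\,
     \frac{\partial f(\mathbf{x}_i, \boldsymbol{\beta})}{\partial \beta_j} = 0
  && \text{(chain rule, } \partial r_i / \partial \beta_j = -\partial f / \partial \beta_j\text{)} \\
\sum_{i=1}^{n} w_i \, u_i \,
     \frac{\partial f(\mathbf{x}_i, \boldsymbol{\beta})}{\partial \beta_j} &= 0
  && \text{(substituting } \psi(u_i) = w_i u_i\text{)} \\
\sum_{i=1}^{n} w_i \, r_i \,
     \frac{\partial f(\mathbf{x}_i, \boldsymbol{\beta})}{\partial \beta_j} &= 0
  && \text{(multiplying by } \sigma\text{)}
\end{align*}
```

With the weights held fixed, the last line is exactly the set of normal equations for minimizing $\sum_i w_i r_i^2$, which is why each IRLS pass reduces to a weighted least-squares fit.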

##### 2.1. Objective Function

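Several $\rho$ functions can be used. Here we opted for Tukey’s biweight [17] or bisquare function defined as

$$\rho(u_i) = \begin{cases} \dfrac{c^2}{6} \left\{ 1 - \left[ 1 - \left( \dfrac{u_i}{c} \right)^2 \right]^3 \right\}, & |u_i| \le c, \\[2ex] \dfrac{c^2}{6}, & |u_i| > c, \end{cases} \tag{8}$$

where $c$ is a tuning constant and $u_i = r_i/\sigma$.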

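The corresponding $\psi$ function is

$$\psi(u_i) = \begin{cases} u_i \left[ 1 - \left( \dfrac{u_i}{c} \right)^2 \right]^2, & |u_i| \le c, \\[2ex] 0, & |u_i| > c. \end{cases} \tag{9}$$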

Tukey’s biweight estimator has a smoothly redescending $\psi$ function that prevents extreme outliers from affecting the biweight estimates by assigning them zero weight. As can be seen in Figure 1, the weights for the biweight estimator decline as soon as $u$ departs from 0 and are 0 for $|u| > c$. Smaller values of $c$ produce more resistance to outliers, but at the expense of lower efficiency when the errors are normally distributed. The tuning constant is generally picked to give reasonably high efficiency in the normal case; in particular, $c = 4.685$ produces 95% efficiency when the errors are normal, while retaining resistance to contamination by outliers.
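The standard biweight functions and the IRLS loop described in Section 2 can be sketched together for the Michaelis-Menten model. This is a minimal illustration, not the authors' implementation: the modification to the biweight mentioned in the paper is not reproduced, the scale $\sigma$ is re-estimated at each pass from the normalized median absolute residual (an assumption; the text above treats $\sigma$ as known), and all function names, data, and starting values are hypothetical.

```python
from statistics import median

C = 4.685  # commonly used tuning constant for the biweight


def rho(u, c=C):
    """Tukey biweight objective; constant (c^2 / 6) beyond |u| = c."""
    if abs(u) > c:
        return c * c / 6
    return c * c / 6 * (1 - (1 - (u / c) ** 2) ** 3)


def psi(u, c=C):
    """Derivative of rho; redescends to 0 at |u| = c."""
    return u * (1 - (u / c) ** 2) ** 2 if abs(u) <= c else 0.0


def weight(u, c=C):
    """IRLS weight psi(u) / u, written so that u = 0 needs no special case."""
    return (1 - (u / c) ** 2) ** 2 if abs(u) <= c else 0.0


def model(s, km, vmax):
    return vmax * s / (km + s)


def weighted_ls(s, v, w, km, vmax, n_iter=30):
    """Weighted least squares for fixed weights w, via Gauss-Newton steps."""
    for _ in range(n_iter):
        r = [vi - model(si, km, vmax) for si, vi in zip(s, v)]
        j1 = [-vmax * si / (km + si) ** 2 for si in s]  # d model / d Km
        j2 = [si / (km + si) for si in s]               # d model / d Vmax
        a = sum(wi * x * x for wi, x in zip(w, j1))
        b = sum(wi * x * y for wi, x, y in zip(w, j1, j2))
        d = sum(wi * y * y for wi, y in zip(w, j2))
        g1 = sum(wi * x * ri for wi, x, ri in zip(w, j1, r))
        g2 = sum(wi * y * ri for wi, y, ri in zip(w, j2, r))
        det = a * d - b * b
        km += (d * g1 - b * g2) / det
        vmax += (a * g2 - b * g1) / det
    return km, vmax


def fit_biweight(s, v, km, vmax, c=C, max_iter=100, tol=1e-10):
    """IRLS: reweight from current residuals, refit, repeat until stable."""
    for _ in range(max_iter):
        r = [vi - model(si, km, vmax) for si, vi in zip(s, v)]
        # Assumed scale estimate: normalized median absolute residual.
        sigma = 1.4826 * median([abs(ri) for ri in r])
        if sigma < 1e-12:  # most points are already fitted exactly
            break
        w = [weight(ri / sigma, c) for ri in r]
        km_new, vmax_new = weighted_ls(s, v, w, km, vmax)
        shift = abs(km_new - km) + abs(vmax_new - vmax)
        km, vmax = km_new, vmax_new
        if shift < tol:
            break
    return km, vmax


# Synthetic data from assumed parameters Km = 2, Vmax = 10 (hypothetical).
s = [0.25, 0.5, 1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
v = [model(si, 2.0, 10.0) for si in s]
v[4] = 3.0  # one hypothetical gross outlier at [S] = 4

km_ols, vmax_ols = weighted_ls(s, v, [1.0] * len(s), 1.0, 8.0)  # unit weights
km_rob, vmax_rob = fit_biweight(s, v, 1.0, 8.0)
print("OLS     :", km_ols, vmax_ols)
print("biweight:", km_rob, vmax_rob)
```

Because $\psi$ redescends, the result of such a loop depends on the starting values; a poor start can leave a good observation with zero weight, so in practice the initial estimates deserve some care.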