Special Issue: Mathematical Methods and Models in the Natural to the Life Sciences
Numerical Reduced Variable Optimization Methods via Implicit Functional Dependence with Applications
A systematic theoretical basis is developed that optimizes an arbitrary number of variables for (i) modeling data and (ii) the determination of stationary points of a function of several variables by the optimization of an auxiliary function of a single variable deemed the most significant on physical, experimental or mathematical grounds from which all the other optimized variables may be derived. Algorithms that focus on a reduced variable set avoid problems associated with multiple minima and maxima that arise because of the large numbers of parameters. For (i), both approximate and exact methods are presented, where the single controlling variable k of all the other variables passes through the local stationary point of the least squares metric. For (ii), an exact theory is developed whereby the solution of the optimized function of an independent variation of all parameters coincides with that due to single parameter optimization of an auxiliary function. The implicit function theorem has to be further qualified to arrive at this result. A nontrivial real world application of the above implicit methodology to rate constant and final concentration parameter determination is made to illustrate its utility. This work is more general than the reduction schemes for conditional linear parameters since it covers the nonconditional case as well and has potentially wide applicability.
The following theory is a systematic development of all functions covering properties of constrained and unconstrained functions that are continuous and differentiable to various specified degrees [1, 2] and the proof of the existence of implicit functions  for the form of these functions to be optimized. The implicit function theorem is applied in a manner that requires further qualification because the optimization problem is of an unconstrained kind without any redundant variables. Methods (i)a,b (described in Sections 2 and 3, resp.) refer to modeling of data [4, Chapter 15, pages 773–806] where the form of the function with independently varying variables is where and are datapoints and is a known function, and optimizations of may be termed a least squares (LS) fit over parameters which are independently optimized for datasets. Method (ii) focuses on optimizing a general function, not necessarily LS in form. There are many standard and hybrid methods to deal with such optimization [4, Chapter 10], such as golden section searches in 1D, simplex methods over multidimensions [4, pages 499–525], steepest descent and conjugate methods , and variable metric methods in multidimensions [4, pages 521–525]. Hybrid methods include multidimensional (DFP) secant methods , BFGS secant optimization , and RFO rational function optimization , which is a Newton-Raphson technique utilizing a rational function rather than a quadratic model for the function close to the solution point. Global deterministic optimization schemes combine several of the above approaches [9, Section ]. Other physical methods, perhaps less easy to justify analytically, include probabilistic “basin-hopping” algorithms [9, Section ], simulated annealing methods , and genetic algorithms [9, page 346]. 
An analytical justification on the other hand is attempted here for these deterministic methods, but in real-world applications some of the assumptions (e.g., continuity, compactness of spaces) may not always be obtained. For what follows, the distance metrics used are all Euclidean, represented by or , where det represents the determinant of the matrix . Reduction of the number of variables to be optimized is possible in the standard matrix regression model only if conditional linear parameters exist , where these variables do not appear in the final expression of the least squares function (2) to be optimized, whereas the nonconditional linear parameters do and are a subset of the variables; for the existence of each conditional linear parameter, there is a unit reduction in the number of independent parameters to be optimized. These reductions in variable number occur for any “expectation function” which is the model or law for which a fitting is required, where there are different datapoints , that must be used to determine the parameter variables [11, page 32, Chapter 2]. A conditionally linear parameter exists if and only if the derivative of the expectation function with respect to is independent of . Clearly such a condition may severely limit the number of parameters that can be neglected for the expectation function variables when the prescribed matrix regressional techniques are employed [11, Section , page 85] where the residual sum of squares is minimized: The -vectors in -dimensional space define the expectation surface. If the variables are partitioned into the conditional linear parameters and the other nonlinear parameters , then the response can be written . Golub and Pereyra  used a standard Gauss-Newton algorithm to minimise that depended only on the nonlinear parameters , where with being a defined pseudoinverse of [11, Section , page 85], where and are matrices. 
The variables must be separable as discussed above, and the reduction in the number of variables equals only the number of conditional linear parameters that exist for the problem. In applications, the preferred algorithm that exploits this valuable variable reduction is called variable projection. Many applications in time-resolved spectroscopy depend heavily on this technique, and many references to the method are given in the review by van Stokkum et al. Recently this method of variable projection has been extended in a restricted sense in the field of inverse problems, which is not related to our method of either modeling or optimization, nor is the methodology related to the implicit function properties. In short, many of the reported methods are ad hoc, meaning that they are constructed to face the specific problems at hand with no claim to overall generality. This work too is ad hoc in the sense that it suggests variable reduction for specific classes of noninverse problems, as indicated. It develops a method of reducing the variable number to unity for all variables in the expectation function space, irrespective of whether they are conditional or not, by approximating their values by a method of averages (for method (i)a). No form of linear regression is used in determining these approximations during the minimization iterations, and the standard matrix theory, which is valid for a very limited class of functions, is not necessarily used. Methods (i)b and (ii) are on the other hand exact treatments. No “elimination” of conditional linear parameters is involved in this nonlinear regression method, nor is any projection in the mathematical sense. These general methods could have useful applications in deterministic systems comprising many parameters that are all linked to one variable: the primary one (denoted k here), which is considered on physical grounds to be the most significant.
A generalization of this method would be to select a smaller set of variables than the full parameter list, instead of the single variable illustrated here. The reduced variable method could also be used in conjunction with the various search algorithms that have been actively developed [15–19]. As far as we are aware, these optimization methods for multivariable problems mainly focus on stochastic or deterministic methods that use discrete algorithms in some type of search sequence over the multivariable domain space, as detailed in these more recent publications. In other less ambitious works, the problem is narrowed to the specific nature of the system, where the object function is specified and amenable to precise treatment, for example [20, 21], with a well-defined domain space and without variable reduction. In another context, quite different from the current development, variable reduction has been applied to DEA problems. Other examples of multiparameter complex systems include multiple-step elementary reactions, each with its own rate constant, that give rise to photochemical spectra signals that must be resolved unambiguously, but these belong to the class of functions with conditional linear parameters. The work here, on the other hand, predominantly focuses on accurately determining the range of the function to be optimized by reducing the space of the domain. Hence, this method can be successfully combined with the usual domain searching techniques mentioned above to effectively locate stationary points by a two-pronged approach. All these complex and coupled processes in physical theories are related by postulated laws that feature parameters.
Other examples include quantum chemical calculations with many topological and orientation variables that need to be optimized with respect to the energy, but in relation to one or a few variables, such as the molecular trajectory parameter during a chemical reaction where this variable is of primary significance in deciding on the “reasonableness” of the analysis [9, Section , page 294]. Methods (i)a and (i)b below refer to LS data-fitting algorithms. Method (i)a is an approximate method where it is proved under certain conditions; it could be a more accurate determination of parameters compared to a standard LS fit using (1). Method (i)b develops a technique where the optimum value for with domain values coincides with that of the standard LS method where the variables are varied independently. Also discussed are the relative accuracy of both methods (i)a in Section 2.2 and (i)b (endnote at end of Section 3). Method (ii) develops a single parameter optimization where the conditions of an arbitrary function are met simultaneously; namely, We note that methods (i)a, (i)b, and (ii) are not related to the Adomian decomposition method and its variants that expand polynomial coefficients  for solutions to differential equations not connected to estimation theory; indeed here there are no boundary values that determine the solution of the differential equations.
2. Method (i)a Theory
This approximate method utilizes the average of the unique solutions for each value of defined above, where the form of the fitting function (a “law” of nature, for instance) is specified. Deterministic laws of nature are conveniently written in the form linking the variable to . The components of , () and are parameters. Verification of a law of form (4) relies on an experimental dataset . The variable could be a vector of variable components of experimentally measured values or a single parameter, as in the kinetics examples below where denotes values of time in the domain space. The vector form will be denoted by . Variables are defined as members of the “domain space” of the measurable system, and similarly is the defined range or “response” space of the physical measurement. Confirmation or verification of the law is based on (a) deriving experimentally meaningful values for the parameters and (b) showing a good enough degree of fit between the experimental set and . In real world applications, to chemical kinetics for instance, several methods [25–28] have been devised to determine the optimal parameters, but most if not all of these methods consider the aforementioned parameters as autonomous and independent (e.g., ). A similar scenario broadly holds for current state-of-the-art applications of structural elucidation via energy functions [9, Chapters 4, 6]. To preserve the viewpoint of the interrelationship between these parameters and the experimental data, we devise schemes that relate to for all via the set and optimize the fit over -space only. That is, a dependency on is induced via the experimental set . The conditions that allow for this will also be stated for the different methods.
2.1. Details of Method (i)a
Let be the number of components of the parameter, the number of dataset pairs , and the number of singularities where the use of a particular dataset leads to a singularity in the determination of as defined below and which must be excluded from being used in the determination of . Then for the unique determination of . Let be the total number of different datasets that can be chosen which does not lead to singularities. If the singularities are not choice dependent, that is, a particular dataset pair leads to singularities for all possible choices, then we have the following definition for where is the total number of combinations of the data-sets taken at a time that does not lead to singularities in . In general, is determined by the nature of the datasets and the way in which the proposed equations are to be solved. Write in the form and for a particular dataset , write . Define the vector function with components . Assume defined on an open set that contains .
Lemma 1. For any such that , the unique function defined on where , and where for every .
Proof. The above follows from the implicit function theorem (IFT) [3, Theorem 13.7, page 374] where is the independent variable for the existence of the function.
We seek the solutions for subject to the above conditions for our defined functions. Map as follows: where the term and its components are defined below and where is a varying parameter. For any of the combinations denoted by a combination variable where is a particular dataset pair, it is in principle possible to solve for the components of in terms of through the following simultaneous equations: from Lemma 1. And each choice yields a unique solution (), where . Hence any function of involving addition and multiplication is also in . For each , there will be different solutions, . We can define an arithmetic mean (there are several possible mean definitions that can be utilized) for the components of as In choosing an appropriate functional form for (8) we assumed equal weightage for each of the dataset combinations; however, the choice is open, based on appropriate physical criteria. We verify below that the choice of satisfies the constrained variation of the LS method so as to emphasize the connection between the level-surfaces of the unconstrained LS with the line function .
Each is a function of whose derivative is known either analytically or by numerical differentiation. To derive an optimized set, then for the LS method, define
Then for an optimized , we have . Defining the optimized solution of corresponds to which has been reduced to a one-dimensional problem. The standard LS variation on the other hand states that the variables in (5) are independently varied so that with solutions for in terms of whenever . Of interest is the relationship between the single variable variation in (9) and the total variation in (11). Since is a function of , then (11) is a constrained variation where subjected to (i.e., for some function of ) and where are the components of . According to the Lagrange multiplier theory [3, Theorem 13.12, page 381] the function has an optimal value at subject to the constraints over the subset where vanishes; that is, , where when either of the following equivalent equations ((13), (14)) are satisfied: where and the ’s are invariant real numbers. We refer to as any variable that is a function of constructed on physical or mathematical grounds, and not just to the special case defined in (8). Write where since and therefore . We abbreviate the functions and . Define where are the experimental subspace variables as in (7) with defined above. We next verify the relation between and .
Proof. Define the Lagrangian to the problem as . Then the equations that satisfy the stationary condition reduce to the (equivalent) simultaneous equations Substituting in (18) to (19) leads to Since , then (20) implies for the functions in (11), (12), and (16).
Of interest is the theoretical relationship of the variables of the functions described by (9), (12), (16) denoted and those of the freely varying function of (1) denoted with the variable set which can be written as which is given by the following theorem, where we abbreviate and , where we note that the functional form is unique and is of the same form for both these variables.
Theorem 2. The unconstrained LS solution to for the independent variables is also a solution for the constrained variation single variable , where , . Further, the two solutions coincide if and only if
Proof. The unconstrained solution is derived from the equations with being constants. If there is a dependency, then we have If the variable set satisfies (24) and (25) in unconstrained variation, then the values when substituted into (26) satisfy the equation since and are the same functional form. This proves the first part of the theorem. The second part follows from the converse argument, where from (26), if , then setting one factor to zero in (27) leads to the implication of (28) which is the solution set which satisfies and is satisfied by the conditions of both (27) and (28). Then (27) satisfies (24) and (28) satisfies (25).
The theorem, verification, and lemma above do not indicate topologically under what conditions a coincidence of solutions for the constrained and unconstrained models exists. Figure 1 depicts the discussion below. From Theorem 2, if set represents the solution for the unconstrained LS method and set for the constrained method, then . Define within the range . Then is in a compact space, and since , is uniformly continuous [2, Theorem 8, page 79]. Then admissible solutions to the above constraint problem with the inequality imply , where is the unconstrained minimum. The unconstrained LS function to be minimized in (11) implies Defining the constrained function , then where . Because , solutions occur when (i) corresponding to the coincidence of the local minimum of the unconstrained for the best choice for the line with coordinates as it passes through the local unconstrained minimum and (ii) , , where this solution is a special case of (iii) when the vector is to ; that is, is at a tangent to the surface for some where this situation is shown in Figure 1, where the vector is tangent at some point of the surface . Whilst the above characterizes the topology of a solution only, the existence of a solution for the line which passes through the point of the unconstrained minimum of is proven below under certain conditions where a set of equations are constructed to allow for this significant application that specifies the conditions when the standard LS constrained variation solution implies the same solution as for the unconstrained variation. Also discussed is the case when it may be possible for unconstrained solution set to satisfy the inequality , where is a function designed to accommodate all solutions of (7), as given below in (30).
2.2. Discussion of LS Fit for a Function with a Possibility of a Smaller LS Deviation than for Parameters Derived from a Free Variation of (11)
The LS function metric such as (11) implied at a stationary (minimum) point for variables . On the other hand, the sets of solutions of (7) denoted , in number provides for each set exact solutions averaged to using (8). If the , solutions are in a -neighbourhood, then we examine the possibility that the composite function metric to be optimized over all the sets of equations , in number defined here as could be such that where is the unconstrained optimized value of (22). This implies that under these conditions, the of (30) is a better measure of fit. This will be proven to be the case under certain conditions below. For what follows, the for equation set obtains for all values of the open set , , from the IFT, including which minimises (9). Another possibility that will be discussed briefly later is where in (30), all are free to vary. Here we consider the case of the values averaged to for some . We recall the intermediate-value theorem (IVT) [3, Theorem 4.38, page 87] for real continuous functions defined over a connected domain which is a subset of some . We assume that the functions immediately below obey the IVT. For each solution of the set for a specific we assume that the function is a strictly increasing function in the sense of definition (7) below, where with , in the following sense.
Definition 3. A real function is (strictly) increasing in a connected domain about the origin at if relative to this origin, if (for the boundaries of ball and implies both and .
Note. A similar definition is obtained for a (strictly) decreasing function with the inequalities. Since the boundaries are compact and is continuous, the maximum and minimum values are attained for all ball boundaries. We assume to be strictly increasing relative to for what follows below.
Lemma 4. For any region bounded by and with coordinate (radius centered about coordinate ),
Proof. Suppose in fact ; then which is a contradiction to the definition and a similar proof is obtained for the upper bound.
Note. Similar conditions apply for the nonstrict inequalities .
The function that is optimized is Define as the solution vector for the equation set . We illustrate the conditions where the solution for a free variation for the metric given in (11) can fulfill the inequality where is as defined in (35) with given as in (8). A preliminary result is required. Define , for all and .
Lemma 5. .
Lemma 6. for for some .
Proof. Any point would be located within a spherical annulus centered at , with radii chosen so that by Lemma 4, we have the following results: where in (32). Choose so that . Define as the space bounded by the boundary of the balls centered on of radius and (). Then by Lemma 5. Since in (30) is not equivalent to in (11) where we write here the free variation vector solution as , then the above results lead to the following: where (40) follows from (33). Summing (40) leads to .
Hence we have demonstrated that it may be more realistic or accurate to fit parameters based on a function that represents different coupling sets such as above rather than the standard LS method using (30) if lies sufficiently far away from . We note that if is the solution of the free variation of the above in (30), then from the arguments presented after the proof of Theorem 2, it follows that which implies that the independent variation of all parameters in LS optimization of the variation is the most accurate functional form to use assuming equal weighting of experimental measurements than the standard free variation of parameters using the function of (11).
3. Method (i)b Theory
Whilst it is advantageous in science data analysis to optimize a particular multiparameter function by focusing on a few key variables (our variable of restricted dimensionality, which we have applied to a 1-dimensional optimization in the next section), it has been shown that this method yields a solution that is always of higher value for the same function than a full, independent parameter optimization, meaning that it is less accurate. The key issue, therefore, is whether for any function, including those of the variety, it is possible to construct a parameter optimization such that the line of parameter variables passes through the minimum surface of the function. We develop a theory to construct such a function below. However, method (i)a may still be advantageous because of the greater simplicity of the equations to be solved, and the fact that functions were required, whereas here the functions must be at least continuous.
Theorem 7. For the function defined in (11), where each of the functions is on an open set and where is convex, the solution at any point of , whenever at determines uniquely the line equation that passes the minimum of the function when .
Proof. As before , so that Define , and for an independent variation of the variables at the stationary point, we have The above results for the functions () having a unique implicit function of , denoted by the IFT [3, Theorem 13.7, page 374], require that on an open set . More formally, the expansion of the preceding determinant in (45) verifies that a symmetric matrix is obtained for due to the commutation of second-order partial derivatives of Defining as a function of only by expanding yields the total derivative with respect to as , where Then by construction (43) so that () and (43) implies (for all ) and hence Substituting (48) derived from (43) and (44) into (47) together with the condition implies that , which satisfies (44) for the free variation in . Thus, for independent variation of . So fulfills the criteria of a stationary point at say , since ([2, Proposition 16, page 112]). Suppose that is convex, where is a minimum point, , a convex subdomain of . Then at , , and is also the unique global minimum over according to [1, Theorem 3.2, page 46]. Thus is unique, whether derived from a free variation of or via dependent parameters with the function.
Note. As before, and may be replaced with the summation of indexes as for in (30) to derive a physically more accurate fit.
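The chain-rule step on which the proof of Theorem 7 rests can be written out explicitly. The symbols below are assumptions chosen for illustration, since they are not fixed by the surrounding text: $Q$ stands for the LS function of (11), $P_i(k)$ ($i = 1,\dots,N_p$) for the implicit functions defined by the conditions $\partial Q/\partial P_i = 0$, and $k$ for the single controlling variable.

```latex
\frac{dQ}{dk}
  = \sum_{i=1}^{N_p} \frac{\partial Q}{\partial P_i}\,\frac{dP_i}{dk}
    + \frac{\partial Q}{\partial k},
\qquad
\left.\frac{\partial Q}{\partial P_i}\right|_{\mathbf{P}=\mathbf{P}(k)} = 0
\quad (i = 1,\dots,N_p)
\;\Longrightarrow\;
\frac{dQ}{dk} = \left.\frac{\partial Q}{\partial k}\right|_{\mathbf{P}=\mathbf{P}(k)} .
```

Under these assumptions, a zero of $dQ/dk$ at some $k'$ makes every partial derivative of $Q$ vanish at $(\mathbf{P}(k'), k')$, which is the stationarity condition of the free variation.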
4. Method (ii) Theory
Methods (i)a and (i)b which are mutual variants of each other are applications of the implicit method to modeling problems to provide a best fit to a function. Here, another variant of the implicit methodology for optimization of a target or cost function is presented. One can for instance consider to be an energy function with coordinates , where as before the components of are , , is another coordinate so that . For bounded systems, (such as the molecular coordinates), one can write Thus, is in a compact space . Define Then the equilibrium conditions become Take (50) as the defining equations for which is specified by in (50) which casts it in a form compatible with the IFT where some further qualification is required for . Assume is on , and , where . The matrix of the aforementioned determinant is symmetric, partaking of the properties due to this fact. Then by the IFT [3, Theorem 13.7, page 374], is a unique function where for some , , on with and such that for all . For an isolated point , from analysis, we find to be still open. Write , and , so that Denote as a solution to , where in the indicated range above in (49).
Theorem 8. The stationary points () where for exist for the range of coordinate if and only if for each of these , (i) and (ii) (for all , ). Each of these points space corresponds uniquely in a local sense in the open set to some equilibrium (stationary) point of the target function in space.
Proof. If where , then it also follows from the IFT that , and therefore from (53), which satisfies (50) and (51) for the equilibrium point. The conditions (i) and (ii) of the theorem are a requirement of the IFT. Conversely, if , () and (a stationary or equilibrium point), then by (53) . Hence, the coordinates for which refer to the condition, where , and uniqueness follows from the IFT reference to the local uniqueness of the function.
Note. In a bounded system, one can choose any of the components of as the coordinate (denoted ), partly based on the convenience of solving the implicit equations to determine the minima and thus determine by the uniqueness criterion the coordinates of the minima in space (spanning the independent variables and ).
For nondegenerate coordinate choice, meaning that for a particular coordinate choice, there does not exist an equilibrium structure (meaning a set of coordinate values) where for any two structures and , . For such structures, the total number of minima that exists within the bounded range in the coordinate is equal to the total number of minima of the target function within the bounded range. Hence, a method exists for the very challenging problem of locating and enumerating minima [9, Section 5.1, page 242 “How many stationary points are there?”]. From the uniqueness theorem of IFT, one could infer points in the -axis where nonuniqueness is obtained; that is, whenever . In such cases, for particles with the same intermolecular potentials, permutation of the coordinates in conjunction with symmetry considerations could be of use in selecting the appropriate coordinate system to overcome these systems with degeneracies [9, Section , page 205, “Appearance and disappearance of symmetry elements”]. Other methods that might address this situation include scanning different one dimensional (1D) choices graphs or profiles, where if degeneracies exist for choice , they may not exist for the choice in the graph for a specific point . Thus, by scanning through all or selecting a number of the () profiles for , it would be possible to make an assignment of the location of a minimum in space. One is reminded of the methods that spectroscopists use in assigning different energy bands based on selection rules to uniquely characterize, for instance, vibrational frequencies. A similar analogy is obtained for X-ray reflections, where the amplitude variation of the X-ray intensity in reciprocal space can be used to elucidate structure. 
The minima of the , coordinate scan must correspond to the minima in space of the function given that all such minima in are locally strict and global within a small open set about the minima for by continuity, for and for , which violates the condition for a maximum.
5. Specific Algorithms and Pseudocode for Solution to Optimization Problem Utilizing Method (i)a, Method (i)b, and Method (ii)
We provide suggestions in pseudocode form for the three methodologies proven above. Real world applications of these methods are very involved undertakings that are separate research topics in their own right. Nevertheless, in Section 5.3.1 we provide a detailed and extended application of method (i)a to a real world chemical kinetics problem, using experimental data from the published literature, and compare the results obtained with those of conventional techniques.
5.1. Pseudocode Algorithm for Method (i)a
Of the many variations possible, the following approach conforms to the theoretical development.
(1) For any physical law, for a total of datapoints, choose datapoints (set ) and solve for parameters , according to (7) for each of the sets , in number for a known value of . The solution set may be derived analytically (as in the example below) or by appropriate linear approximations.
(2) Determine from the above set either the geometric or statistical average (as used here) for the solutions; that is, .
(3) Determine (9) as
(4) Solve the 1D equation at when . The solution set is for the optimization problem.
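The four steps above can be sketched in Python for a one-parameter case. This is a minimal illustration under stated assumptions, not the implementation used in this work: the law is taken to be a first-order rise with zero initial signal, y = a_inf (1 - exp(-k t)), the data are synthetic and noise-free, and all function names are hypothetical. Step 4 is carried out here with a grid scan followed by a golden-section search on the reduced metric, a stand-in for solving the 1D derivative equation directly.

```python
import math

def model(t, a_inf, k):
    """Assumed law: y = a_inf * (1 - exp(-k t)), zero signal at t = 0."""
    return a_inf * (1.0 - math.exp(-k * t))

def averaged_parameter(data, k):
    """Steps 1-2: solve the law for a_inf at each datapoint, then average."""
    solutions = []
    for t, y in data:
        g = 1.0 - math.exp(-k * t)
        if g == 0.0:  # singular datapoint (e.g. t = 0) is excluded
            continue
        solutions.append(y / g)
    return sum(solutions) / len(solutions)

def q_reduced(data, k):
    """Step 3: least-squares metric expressed as a function of k only."""
    a_bar = averaged_parameter(data, k)
    return sum((y - model(t, a_bar, k)) ** 2 for t, y in data)

def minimize_1d(fn, lo, hi, n_grid=200, tol=1e-10):
    """Step 4 (stand-in): coarse grid scan, then golden-section refinement."""
    step = (hi - lo) / n_grid
    grid = [lo + i * step for i in range(n_grid + 1)]
    best = min(grid, key=fn)
    a, b = max(lo, best - step), min(hi, best + step)
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = b - inv_phi * (b - a), a + inv_phi * (b - a)
    fc, fd = fn(c), fn(d)
    while b - a > tol:
        if fc < fd:  # minimum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = fn(c)
        else:        # minimum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = fn(d)
    return 0.5 * (a + b)

def fit_method_ia(data, k_lo, k_hi):
    """Return (averaged a_inf, k) at the minimum of the reduced metric."""
    k_opt = minimize_1d(lambda k: q_reduced(data, k), k_lo, k_hi)
    return averaged_parameter(data, k_opt), k_opt

# Synthetic, noise-free data generated with a_inf = 2.0 and k = 0.3.
DATA = [(float(t), model(float(t), 2.0, 0.3)) for t in range(1, 9)]
A_HAT, K_HAT = fit_method_ia(DATA, 0.05, 1.5)
```

With noise-free data every per-datapoint solution equals the generating value at the true k, so the reduced metric vanishes there. With noisy data the averaged solution generally differs from the free LS solution, as discussed in Section 2.2.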
5.2. Pseudocode Algorithm for Method (i)b
This is an “exact” method relative to LS variation of all parameters. A suitable algorithm based on the theory could be as follows.
(1) Solve for a particular value of . Since there are equations for , a solution exists. The solution may be exact or some linear approximation, depending on the nature of the problem and convergence criteria.
(2) Form the function , (46), where .
(3) Solve for some ; that is, .
(4) The solution set to the problem is for optimizing the LS function .
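A hedged Python sketch of these steps follows, again for the assumed law y = A (1 - exp(-k t)) with one parameter A besides k. The perturbed dataset and all function names are hypothetical. Step 1 has a closed form for this law because A enters linearly. Step 3 uses bisection purely for robustness, whereas Section 5.3.2 reports a Newton-Raphson procedure.

```python
import math

def a_of_k(data, k):
    """Step 1: solve dQ/dA = 0 for A at fixed k (closed form for this law)."""
    g = [1.0 - math.exp(-k * t) for t, _ in data]
    return sum(y * gi for (_, y), gi in zip(data, g)) / sum(gi * gi for gi in g)

def grad_q(data, a, k):
    """Partial derivatives (dQ/dA, dQ/dk) of Q = sum (y - A(1 - exp(-k t)))^2."""
    d_a = d_k = 0.0
    for t, y in data:
        e = math.exp(-k * t)
        r = y - a * (1.0 - e)
        d_a += -2.0 * r * (1.0 - e)
        d_k += -2.0 * r * a * t * e
    return d_a, d_k

def f_of_k(data, k):
    """Step 2: dQ/dk evaluated on the line A = A(k)."""
    return grad_q(data, a_of_k(data, k), k)[1]

def bisect(fn, lo, hi, iters=200):
    """Step 3 (stand-in root finder): bisection on a sign-changing bracket."""
    f_lo = fn(lo)
    if f_lo * fn(hi) > 0.0:
        raise ValueError("bracket does not enclose a sign change")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = fn(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

def fit_method_ib(data, k_lo, k_hi):
    """Step 4: return (A, k) where both partial derivatives of Q vanish."""
    k_opt = bisect(lambda k: f_of_k(data, k), k_lo, k_hi)
    return a_of_k(data, k_opt), k_opt

# Hypothetical data: A = 2.0, k = 0.3, plus fixed small perturbations.
NOISE = [0.010, -0.020, 0.015, -0.005, 0.020, -0.010, 0.005, -0.015]
DATA = [(float(i + 1), 2.0 * (1.0 - math.exp(-0.3 * (i + 1))) + NOISE[i])
        for i in range(8)]
A_HAT, K_HAT = fit_method_ib(DATA, 0.05, 1.5)
DQ_DA, DQ_DK = grad_q(DATA, A_HAT, K_HAT)
```

Both `DQ_DA` and `DQ_DK` are numerically zero at the returned point, so the single-variable search lands on a stationary point of the full two-variable LS function, which is the property that Theorem 7 establishes.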
5.3. Pseudocode Algorithm for Method (ii)
Here is a general function, not necessarily of form in (42). Then for variables , define functions , , as in (51).
(1) For a particular , solve  for . exists since there are equations. Approximate linearized solutions might also be attempted in the vicinity of the stationary point of .
(2) Form the function .
(3) Solve for , such that .
(4) Solution to the optimization problem of by varying independently all the domain variables is .
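The steps above can be illustrated with a small non-LS target function. Everything in this sketch is an assumption made for illustration: the target f(p, k) = cosh(p - k) + p^2/2 + (k - 1)^2, the function names, and the choice of Newton-Raphson for the inner solve and bisection for the outer one. The second derivative of f with respect to p is at least 2 everywhere, so the nonvanishing-derivative condition required by the IFT holds for this example.

```python
import math

def grad_f(p, k):
    """Partials (df/dp, df/dk) of f(p, k) = cosh(p - k) + p**2 / 2 + (k - 1)**2."""
    s = math.sinh(p - k)
    return s + p, -s + 2.0 * (k - 1.0)

def p_of_k(k, tol=1e-14, max_iter=100):
    """Step 1: solve df/dp = 0 for p at fixed k by Newton-Raphson."""
    p = k
    for _ in range(max_iter):
        h = math.sinh(p - k) + p
        step = h / (math.cosh(p - k) + 1.0)  # d2f/dp2 >= 2, never zero
        p -= step
        if abs(step) < tol:
            break
    return p

def g_of_k(k):
    """Step 2: df/dk evaluated along the implicit curve p = p(k)."""
    return grad_f(p_of_k(k), k)[1]

def bisect(fn, lo, hi, iters=200):
    """Step 3 (stand-in root finder): bisection on a sign-changing bracket."""
    f_lo = fn(lo)
    if f_lo * fn(hi) > 0.0:
        raise ValueError("bracket does not enclose a sign change")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = fn(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

def stationary_point(k_lo, k_hi):
    """Step 4: return (p, k) where the full gradient of f vanishes."""
    k_opt = bisect(g_of_k, k_lo, k_hi)
    return p_of_k(k_opt), k_opt

P_HAT, K_HAT = stationary_point(-5.0, 5.0)
DF_DP, DF_DK = grad_f(P_HAT, K_HAT)
```

The assumed f is convex, so the point found is its unique minimum. For a general target, each root of `g_of_k` in the scanned range would correspond locally to one stationary point, in the sense of Theorem 8.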
5.3.1. Application of Method (i)a Algorithm (Section 5.1) in Chemical Kinetics
The utility of one of the above triad of methods is illustrated by determining two parameters in chemical reaction rate studies of first and second order, respectively, using data from the published literature; method (i)a yields values that agree, within experimental error, with those quoted in the literature. The method can directly derive certain parameters such as the final concentration terms (e.g., and ) if k, the rate constant, is the single optimizing variable in this approximation, which is not the case in most conventional methodologies. We assume here that the rate laws and rate constants are not slowly varying functions of the reactant or product concentrations, although simulation has recently shown that this is generally not the case. Under this standard assumption, the rate equations below are all obtained. The first-order reaction studied here is (i) the methanolysis of ionized phenyl salicylate with data derived from the literature [30, Table 7.1, page 381] and the second-order reaction analyzed is (ii) the reaction between plutonium(VI) and iron(II) according to the data in [31, Table II, page 1427] and [32, Tables 2–4, page 25].
5.3.2. First-Order Results
Reaction (i) above corresponds to where the rate law is pseudo first-order expressed as with the concentration of methanol held constant (80% v/v) and where the physical and thermodynamical conditions of the reaction appear in [30, Table 7.1, page 381]. The change in time for any material property , which in this case is the absorbance (i.e., ) is given by for a first-order reaction where refers to the measurable property value at time and is the value at which is usually treated as a parameter to yield the best least squares fit even if its optimized value is less for monotonically increasing functions (for positive at all ) than an experimentally determined at time . In Table 7.1 of , for instance, and this value of is used to derive the best estimate of the rate constant as in that work.
For this reaction, the of (5) refers to so that with and . To determine the parameter as a function of according to (9) based on the entire experimental dataset, we invert (58) and write where the summation is over all terms with the i subscript of the experimental dataset that do not lead to zeros or singularities, such as when . We define the nonoptimized, continuously deformable theoretical curve , where in (6) as With such a relationship of the parameter to , we seek the least squares minimum of , where of (9) for this first-order rate constant in the form where the summation is over all the experimental values. The solution for the rate constant corresponds to the zero value of the function, which exists for both orders. The parameters ( and ) are derived by back substitution into (59) and (65), respectively. The Newton-Raphson (NR) numerical procedure [4, page 456] was used to find the roots of . For each dataset, there exists a value for and so the error expressed as a standard deviation may be computed. The error tolerance for the NR procedure was set to . We define the function deviation as the standard deviation of the experimental results from the best fit curve where . Our results are as follows: ; ; and .
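The procedure described in this paragraph can be sketched on synthetic data. Because the equations themselves are missing from the text, several things here are assumptions rather than the author's expressions: the standard first-order law A(t) = A_inf - (A_inf - A_0) exp(-kt), the averaged inversion used for A_inf(k), the treatment of A_0 as known, the noise-free dataset, and all function names. Bisection on a bracketed sign change of dQ/dk is also swapped in for the Newton-Raphson procedure used in the text, purely to keep the sketch short.

```python
import math

# Assumed first-order law: A(t) = a_inf - (a_inf - a0) * exp(-k * t).
def model(t, k, a0, a_inf):
    return a_inf - (a_inf - a0) * math.exp(-k * t)


def a_inf_of_k(k, times, values, a0):
    """Average of the per-point A_inf estimates obtained by inverting the law.

    Points with t = 0 are skipped because the inversion is singular there.
    """
    estimates = [
        (y - a0 * math.exp(-k * t)) / (1.0 - math.exp(-k * t))
        for t, y in zip(times, values)
        if t > 0.0
    ]
    return sum(estimates) / len(estimates)


def q_of_k(k, times, values, a0):
    """Least squares sum as a function of the single variable k."""
    a_inf = a_inf_of_k(k, times, values, a0)
    return sum((y - model(t, k, a0, a_inf)) ** 2 for t, y in zip(times, values))


def dq_dk(k, times, values, a0, h=1e-6):
    """Central-difference derivative of the least squares sum in k."""
    return (q_of_k(k + h, times, values, a0)
            - q_of_k(k - h, times, values, a0)) / (2.0 * h)


def find_k(times, values, a0, lo, hi, tol=1e-10):
    """Bisection on dQ/dk (stand-in for the Newton-Raphson step in the text)."""
    g_lo = dq_dk(lo, times, values, a0)
    if g_lo * dq_dk(hi, times, values, a0) > 0.0:
        raise ValueError("bracket does not contain a sign change of dQ/dk")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = dq_dk(mid, times, values, a0)
        if g_lo * g_mid <= 0.0:
            hi = mid
        else:
            lo, g_lo = mid, g_mid
    return 0.5 * (lo + hi)


# Synthetic, noise-free data (hypothetical values, not the published dataset).
K_TRUE, A0, A_INF_TRUE = 0.3, 0.1, 0.9
times = [float(i) for i in range(11)]
values = [model(t, K_TRUE, A0, A_INF_TRUE) for t in times]

k_fit = find_k(times, values, A0, 0.1, 0.6)
a_inf_fit = a_inf_of_k(k_fit, times, values, A0)  # back substitution
print(k_fit, a_inf_fit)
```

With noise-free data the least squares sum vanishes at the generating rate constant, so the sketch recovers both k and the final value by back substitution. With real data the spread of the per-point estimates of A_inf about their mean gives the standard deviation mentioned in the text.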
The experimental estimates are ; ; and .
The experimental method involves adjusting the to minimize the function and hence no estimate of the error in could be made. Method (i)a allows direct calculation of and its error without the extraneous fittings required in the conventional methods. It is clear that our method has a lower value and is thus a better fit, and the parameter values can be considered to coincide with the experimental estimates within experimental error. Figure 2 shows the close fit between the curve due to our optimization procedure and experiment. The resulting function (10) for the first-order reaction based on the published dataset is given in Figure 3. The very slight variation between the two curves could be due to experimental uncertainties as shown in Figure 1.
5.3.3. Second-Order Results
To further test our method, we also analyze the second-order reaction whose rate is given by where is relative to the constancy of other ions in solution such as . The equations differ markedly in form from the first-order expressions and thus serve to confirm the viability of the current method.
For Espenson, the above stoichiometry is kinetically equivalent to the reaction scheme [32, equation ] which also follows from the work of Newton and Baker [31, equations , page 1429], whose data [31, Table II, page 1427] we use and analyze to verify the principles presented here. Espenson also used the same data to derive the rate constant and other parameters [32, pages 25–26], which are used to check the accuracy of our methodology. The overall absorbance in this case is given by [32, equation ] where is the ratio of initial concentrations where and , and and . A rearrangement of (64) leads to the equivalent expression [32, equation ] According to Espenson, one cannot use this equivalent form [32, page 25] “because an experimental value of was not reported” and he further asserts that if is determined autonomously, then , the rate constant, may be determined. Thus, central to all conventional methods is the autonomous and independent status of both and . We overcome this interpretation by defining as a function of the total experimental spectrum of values and by inverting (64) to define as where the summation is over all experimental values that do not lead to singularities such as at . In this case, the parameter is given by , and is the varying parameter of (5). We likewise define a function of that is also a function of , but where the parameter is interpreted as a “distortion” parameter in the following manner: In order to extract the parameters and , we minimize the square function for this second-order rate constant with respect to given as where the summation is over the experimental coordinates. The solution to the minimization problem is then attained where the corresponding function (10) is zero. The NR method was used to solve with the error tolerance of . With the same notation as in the first-order case, the second-order results are ; ; and .
The experimental estimates from the conventional methods are [32, page 25]: ; .
Again the two results are in close agreement. The graph of the experimental curve and the one that is derived from our optimization method are given in Figure 4.
The triad of associated implicit function optimization methods covers both the modeling of data and the optimization of arbitrary functions, where experimental or theoretical considerations require that a single variable be tagged to a process variable that is iteratively relaxing to an equilibrium stationary point. Applying method (i)a to chemical kinetics allows for the direct determination of parameters, which is not possible with the standard methodologies. The results presented here show that, for linked variables, it is possible to derive all the parameters associated with a curve by considering only one independent variable, which serves as the independent variable for other functions in the optimization process, as illustrated by methods (i)a,b. Apart from possibly reducing errors in the computations, this might also be a more accurate way of deriving parameters that are more influenced or conditioned (on physical grounds) by the value of one parameter (such as here) than by others; the current methods, which give equal weight to all the variables, might in some cases lead to results that would be considered “unphysical.” In complex dynamical systems with multiple processes, physical considerations suggest that it would be advantageous to conduct the optimization on just one primary coordinate variable, for example when attempting to derive the most general stable conformer of a large molecule, where thousands of local minima are present if all free coordinate variables are considered [9, Section 6.7, page 330]. For such systems, method (ii) might be applicable. The resulting generalized potential surface might be suitable for reaction trajectory calculations [9, Chapter 4, page 192 on “Features of a landscape”] that require a single path variable, where the general optimized conformer would be relevant to the study of the potential surfaces and force fields present.
List of Variables
|:||the entire domain space of a function such as where is the variable associated with a sequence of measurements, such as along the time coordinate. The vector with components is the normal parameter that must be optimized, and is the specially chosen variable on experimental grounds that is optimized whilst constructing functions such that (6)|
|refers to a function that is proposed to be a “law of nature” whose parameters are to be optimized (7)|
|:||the theoretical law of nature if is determined. That is, (6)|
|:||an experimentally determined datapoint that ideally represents the range of if there was no error; that is, for a perfect fit for all , for fixed (7)|
|an averaged value for based on some specified algorithm (8)|
|least squares (LS) function to optimize in the case of () (9). In general, all functions are LS functions specified by (e.g., (9), (11), (42), etc.)|
|:||General cost or object function to be optimized, not necessarily in LS form (50)|
|:||Lagrange multipliers associated with the optimization (14)|
|:||Lagrangian to the optimization problem (17)|
|:||chemical kinetics rate constant for reaction (e.g., (56))|
|:||absorbance measurements for first-order chemical kinetics reactions at time and infinity (e.g., (58) and (59))|
|:||absorbance measurements for second-order chemical kinetics reactions at time and infinity (e.g., (64) and (66)).|
Conflict of Interests
The author declares that there is no conflict of interests regarding the publication of this paper.
This work was supported by University of Malaya Grant UMRG(RG077/09AFR) and Malaysian Government grant FRGS(FP084/2010A). This work was initiated and completed during a Sabbatical research visit (2012-2013) to the Atomistic Simulation Centre (ASC), School of Mathematics and Physics, Queen’s University Belfast. I thank Ruth Lynden-Bell (Chemistry Department, Cambridge University) for facilitating this visit. Cordial discussions concerning real world applications with faculty at ASC are gratefully acknowledged. I thank my hosts, Jorge Kohanoff (ASC) and Christopher Hardacre (Chem. Dept., QUB) for congenial hospitality during this time.
B. D. Craven, Functions of Several Variables, Chapman & Hall, London, UK, 1981.
J. DePree and C. Swartz, Introduction to Real Analysis, John Wiley & Sons, New York, NY, USA, 1988.
T. M. Apostol, Mathematical Analysis, Narosa Publishing House, New Delhi, India, 2nd edition, 2002.
W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery, Numerical Recipes: The Art of Scientific Computing, Cambridge University Press, Cambridge, UK, 3rd edition, 2007.
J. A. Snyman, Practical Mathematical Optimization: An Introduction to Basic Optimization Theory and Classical and New Gradient-Based Algorithms, Springer, New York, NY, USA, 2005.
C. G. Broyden, “The convergence of a class of double-rank minimization algorithms,” Journal of the Institute of Mathematics and Its Applications, vol. 6, pp. 76–90, 1970.
A. Banerjee, N. Adams, J. Simons, and R. Shepard, “Search for stationary points on surfaces,” Journal of Physical Chemistry, vol. 89, no. 1, pp. 52–57, 1985.
D. A. Wales, Energy Landscapes, Cambridge Molecular Science, Cambridge University Press, Cambridge, UK, 2003, edited by R. Saykally, A. Zewail and D. King.
M. Gendreau and J.-Y. Potvin, Eds., Handbook of Metaheuristics, vol. 146 of International Series in Operations Research & Management Science, Springer, New York, NY, USA, 2nd edition, 2010.
R. Varadhan and P. D. Gilbert, “BB: an R package for solving a large system of nonlinear equations and for optimizing a high-dimensional nonlinear objective function,” Journal of Statistical Software, vol. 32, no. 4, pp. 1–26, 2009.
S. Solar, W. Solar, and N. Getoff, “A pulse radiolysis-computer simulation method for resolving of complex kinetics and spectra,” Radiation Physics and Chemistry, vol. 21, no. 1-2, pp. 129–138, 1983.
J. J. Houser, “Estimation of A∞ in reaction-rate studies,” Journal of Chemical Education, vol. 59, no. 9, pp. 776–777, 1982.
P. Moore, “Analysis of kinetic data for a first-order reaction with unknown initial and final readings by the method of non-linear least squares,” Journal of the Chemical Society, Faraday Transactions I: Physical Chemistry in Condensed Phases, vol. 68, pp. 1890–1893, 1972.
W. E. Wentworth, “Rigorous least squares adjustment: application to some non-linear equations, I,” Journal of Chemical Education, vol. 42, no. 2, pp. 96–103, 1965.
W. E. Wentworth, “Rigorous least squares adjustment: application to some non-linear equations, II,” Journal of Chemical Education, vol. 42, no. 3, pp. 162–167, 1965.
M. N. Khan, Micellar Catalysis, vol. 133 of Surfactant Science Series, Taylor & Francis, Boca Raton, Fla, USA, 2007, edited by A. T. Hubbard.
T. W. Newton and F. B. Baker, “The kinetics of the reaction between plutonium(VI) and iron(II),” Journal of Physical Chemistry, vol. 67, no. 7, pp. 1425–1432, 1963.
J. H. Espenson, Chemical Kinetics and Reaction Mechanisms, vol. 102, McGraw-Hill, Singapore, 2nd edition, 1995.