Abstract

In this work, we exploit the special structure of the separable nonlinear least squares problem and use a variable projection algorithm based on singular value decomposition to separate the linear and nonlinear parameters. We then propose finding the nonlinear parameters using the Levenberg–Marquardt (LM) algorithm and solving for the linear parameters either directly, using the least squares method, or by an iteration method that corrects characteristic values based on the L-curve, according to whether the nonlinear function coefficient matrix is ill posed. To demonstrate the feasibility of the proposed method, we compared its performance on three examples with that of the LM method without parameter separation. The results show that (1) the parameter separation method reduces the number of iterations and improves computational efficiency by reducing the parameter dimensions and (2) when the coefficient matrix of the linear parameters is well posed, solving with the least squares method provides the highest fitting accuracy, whereas when the coefficient matrix is ill posed, the method of correcting characteristic values based on the L-curve provides the most accurate solution to the fitting problem.

1. Introduction

The separable nonlinear least squares problem is a special case of nonlinear least squares in which the model can be expressed as a linear combination of nonlinear functions. More generally, one can consider models with two sets of unknown parameters, where one set depends on the other and can be explicitly eliminated. This problem was first posed by Golub and Pereyra in 1973 [1] in the context of parameter estimation for particle half-life formulas in atomic physics. Models of this type are very common in practice. For example, in the machine learning community, neural networks, their numerous variants [2, 3], and some neuro-fuzzy systems [4, 5] take the form of a linear ensemble of nonlinear basis functions. In the field of signal processing, Prony's method [6, 7], which fits a sum of complex exponentials, is frequently used to analyze the frequency components of a signal. In applications, waveform decomposition is one of the key steps in processing airborne full-waveform light detection and ranging (LiDAR) data: the full waveform can be decomposed into a linear combination of multiple Gaussian functions, from which discrete point cloud and waveform parameter information can be obtained [8]. Furthermore, this approach has many applications in areas such as mechanical systems [9], telecommunications [10], robotics [11], and environmental sciences [12]. All of these applications can be viewed numerically as nonlinear data-fitting problems, which are often quite challenging. Fortunately, by exploiting the special structure of separable nonlinear models, efficient algorithms can be obtained.

Based on the special structure of the separable nonlinear least squares problem, Golub and Pereyra [1] proposed the variable projection (VP) algorithm, which eliminates the linear parameters to obtain a reduced functional involving only the nonlinear parameters, and used the Levenberg–Marquardt (LM) algorithm for its solution. Reducing the dimensionality of the parameter space improves the chance of obtaining a globally optimal solution [13]. Numerous improvements and applications [14] of VP have since been developed on that basis. Kaufman [15] proposed a modified VP algorithm based on trapezoidal decomposition that simplified the calculation, reducing its computational complexity and improving computational efficiency. Subsequently, Ruano et al. [16] proposed an improved VP algorithm based on QR decomposition for the case of a sparse nonlinear function matrix, which also effectively improved efficiency. Further, Gan and Li [17] proposed a VP algorithm based on Gram–Schmidt matrix decomposition for the case in which the number of observations is much larger than the number of linear parameters, which reduces the amount of calculation required.

In this study, in view of the possible ill-posedness of the nonlinear function matrix, singular value decomposition (SVD) [18] is adopted to simplify the VP algorithm and improve the stability of the calculation. After the linear parameters are eliminated by the improved VP algorithm, the separable least squares problem is transformed into an optimization problem containing only nonlinear parameters [19]. Regarding estimation methods for the nonlinear parameters, Liu et al. [20] combined this type of formulation with sequential quadratic programming (SQP), developing a gradient-based optimization algorithm to determine the optimal time delays and system parameters in a novel dynamic optimization problem for nonlinear multistage systems with time delays. However, that approach was restricted to parameter identification problems.

The common iterative methods for nonlinear least squares [21] are the gradient descent method [22], the Gauss–Newton method [23], and the Levenberg–Marquardt (LM) method [24, 25]. For example, in [26], the nonlinear least squares problem of a distributionally robust parameter identification model for time-delay systems is transformed into a single-level optimization problem, and a gradient-based optimization method is developed to solve the transformed problem. The method involves only first-order moment information and is simple to calculate. To obtain estimates robust to measurement noise, Liu et al. [27] proposed a robust estimation formulation in which the cost function is the variance of the error function and an additional constraint specifies an allowable sacrifice from the optimal expectation value of the classical estimation problem. On this basis, they developed a gradient-based optimization algorithm to numerically solve the classical and robust parameter estimation problems, which is more efficient than existing methods for problems where the optimization parameters outnumber the constraints. Such gradient methods involve simple calculations, but their convergence speed is generally slower than that of the Gauss–Newton method.

Torres et al. [28] used sequentially semiseparable matrices to calculate the Jacobian matrix [29] and Hessian matrix [30], employing the Gauss–Newton method to optimize the output error of the global system; the effectiveness of the algorithm was verified by numerical examples. Bellavia et al. [31] improved the approximation function by controlling the accuracy level when the accuracy was too low for optimization and then proposed an LM method based on a dynamic accuracy relationship between the evaluation function and its gradient for solving large-scale nonlinear least squares problems, proving the global and local convergence and the complexity of the method. The Gauss–Newton method has the advantages of fast convergence and high precision, but it requires the Jacobian matrix to be of full rank throughout the iterative process, and if the problem is highly nonlinear or the residual is large, the method may fail to converge. The LM algorithm overcomes this shortcoming by adjusting the damping parameter according to a trust-region strategy, effectively controlling the direction of iterative descent. Once the nonlinear parameters are determined, the least squares (LS) method [32] is used to estimate the linear parameters.

For a general nonlinear multistage system with time delays and system parameters, Liu et al. [33] proposed a new parameter estimation formulation in which the cost function is the variance of the error function and the constraint specifies an allowable sacrifice from the optimal expectation value of the classical parameter estimation problem. This approach is capable of solving parameter estimation problems with multiple stages and multiple time delays and, compared with classical parameter estimation, it is able to withstand uncertainty in the distribution of the measurement data. Nevertheless, it has the limitation of relying on the statistical distribution of the noisy measurement output.

To enhance the estimation accuracy, Ding et al. [34, 35] presented a filtering-based gradient iterative algorithm and a filtering-based least squares iterative algorithm, which improved the convergence speed. However, when the nonlinear parameter estimation leads to ill-conditioning of the linear least squares coefficient matrix, the linear parameter estimates obtained by the LS method are unstable and sometimes differ significantly from the true values. For this problem, regularization methods are often used to solve ill-posed problems in linear parameter estimation [36], such as the Tikhonov regularization method [37], the truncated singular value method [38], and iteration by correcting characteristic values [39, 40].

In addition, Chen et al. [41] proposed a weighted generalized cross-validation method to determine Tikhonov regularization parameters for the regularization of separable nonlinear least squares ill-posed problems based on the VP algorithm and verified its effectiveness experimentally. To address the arbitrariness of parameter selection in the iteration by correcting characteristic values during the linear least squares solution, Zhai et al. [42] constructed the L-curve [43, 44] relating the norm of the parameter solution to the residual norm. They selected the point of maximum curvature as the regularization parameter and verified the correctness of the method through numerical experiments.

The existing literature presents several effective solutions for the parameter estimation problem, but only a few studies have addressed the structural transformation of separable nonlinear models. In this study, we exploit the special structure of the separable nonlinear least squares problem, separating the two types of parameters using a VP algorithm based on SVD. We then use the LM algorithm to estimate the nonlinear parameters, while the linear parameters are estimated using the LS method or the iteration by correcting characteristic values based on the L-curve. By comparing the experimental results of parameter estimation for the Gaussian function fitting model and the fractional fitting model with those of the LM method without parameter separation, we evaluate the validity of the VP algorithm based on SVD and analyze the accuracy of the different linear parameter estimation methods.

The remainder of this paper is organized as follows. In Section 2, based on an improved VP algorithm derived from SVD, the methods of nonlinear and linear parameter estimation are explained and the algorithm for solving separable nonlinear least squares problems is provided. In Section 3, the Gaussian function fitting model and fractional fitting model experiments are used to compare the proposed method with the traditional LM algorithm with unseparated parameters. Finally, we present our conclusions in Section 4.

2. Solution of Parameters for Separable Nonlinear Least Squares

2.1. Variable Projection Algorithm Based on SVD

Consider a set of observations $\{(t_{i}, y_{i})\}_{i=1}^{m}$. The problem of parameter estimation is to find the optimal parameters $\mathbf{a} \in \mathbb{R}^{n}$ and $\mathbf{b} \in \mathbb{R}^{k}$ when the following formula reaches its minimum value:

$$\min_{\mathbf{a},\,\mathbf{b}}\; \sum_{i=1}^{m}\left(y_{i} - \sum_{j=1}^{n} a_{j}\,\phi_{j}(\mathbf{b};\, t_{i})\right)^{2}, \qquad (1)$$

where $\phi_{j}$ is a nonlinear function and $n$ and $k$ are the numbers of linear and nonlinear parameters to be estimated, respectively. The above formula can be written in matrix form as

$$\min_{\mathbf{a},\,\mathbf{b}}\; \left\|\mathbf{y} - \Phi(\mathbf{b})\,\mathbf{a}\right\|^{2}, \qquad (2)$$

where the $j$th column of the matrix $\Phi(\mathbf{b}) \in \mathbb{R}^{m \times n}$ corresponds to the nonlinear function $\phi_{j}$ associated with the parameter vector $\mathbf{b}$ and $\|\cdot\|$ denotes the Euclidean norm.

Let

$$S(\mathbf{a}, \mathbf{b}) = \left\|\mathbf{y} - \Phi(\mathbf{b})\,\mathbf{a}\right\|^{2}. \qquad (3)$$

For the given nonlinear parameters $\mathbf{b}$, the linear parameters can be estimated by solving the following linear least squares problem:

$$\hat{\mathbf{a}} = \Phi(\mathbf{b})^{+}\,\mathbf{y}, \qquad (4)$$

where $\Phi(\mathbf{b})^{+}$ is the pseudoinverse of $\Phi(\mathbf{b})$. Inserting (4) into (3),

$$S(\mathbf{b}) = \left\|\left(I - \Phi(\mathbf{b})\,\Phi(\mathbf{b})^{+}\right)\mathbf{y}\right\|^{2} = \left\|P^{\perp}_{\Phi(\mathbf{b})}\,\mathbf{y}\right\|^{2}, \qquad (5)$$

where $P_{\Phi(\mathbf{b})} = \Phi(\mathbf{b})\,\Phi(\mathbf{b})^{+}$ is the orthogonal projector on the linear space spanned by the columns of $\Phi(\mathbf{b})$ and $P^{\perp}_{\Phi(\mathbf{b})} = I - P_{\Phi(\mathbf{b})}$ is the projector on the orthogonal complement of the column space.
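As a concrete illustration of (4) and (5), the projection functional can be evaluated with standard linear-algebra routines. The following is a minimal sketch in Python, assuming a user-supplied function phi(b, t) that returns the matrix $\Phi(\mathbf{b})$; it illustrates the idea and is not the authors' implementation.

import numpy as np

def vp_residual(b, t, y, phi):
    # Evaluate a_hat = Phi^+ y and the VP functional S(b) = ||(I - Phi Phi^+) y||^2.
    # phi(b, t) is assumed to return the m x n model matrix Phi(b).
    Phi = phi(b, t)
    # Solve the linear subproblem (4) stably via least squares instead of
    # forming the pseudoinverse explicitly.
    a_hat, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    r = y - Phi @ a_hat           # P_perp y, the projected residual of (5)
    return a_hat, float(r @ r)    # optimal linear parameters and S(b)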

To simplify the calculation, the matrix $\Phi(\mathbf{b})$, which is composed of nonlinear functions, is decomposed by SVD:

$$\Phi(\mathbf{b}) = U\,\Sigma\,V^{T}, \qquad (6)$$

where $U$ is an $m \times m$ orthogonal matrix, $\Sigma$ is an $m \times n$ diagonal matrix, and $V$ is an $n \times n$ orthogonal matrix. We obtain the pseudoinverse $\Phi(\mathbf{b})^{+}$ as

$$\Phi(\mathbf{b})^{+} = V\,\Sigma^{+}\,U^{T}. \qquad (7)$$

Then,

$$P_{\Phi(\mathbf{b})} = \Phi(\mathbf{b})\,\Phi(\mathbf{b})^{+} = U\,\Sigma\,V^{T}\,V\,\Sigma^{+}\,U^{T} = U\,\Sigma\,\Sigma^{+}\,U^{T}. \qquad (8)$$

Suppose the rank of the matrix $\Phi(\mathbf{b})$ is $r$; then the first $r$ elements on the diagonal of $\Sigma$ are not zero, i.e., $\sigma_{1} \geq \sigma_{2} \geq \cdots \geq \sigma_{r} > 0$. $U$ can be divided into an $m \times r$ matrix $U_{1}$ and an $m \times (m - r)$ matrix $U_{2}$, and $V$ can be divided into an $n \times r$ matrix $V_{1}$ and an $n \times (n - r)$ matrix $V_{2}$. Then,

$$P_{\Phi(\mathbf{b})} = U_{1}\,U_{1}^{T}, \qquad P^{\perp}_{\Phi(\mathbf{b})} = I - U_{1}\,U_{1}^{T} = U_{2}\,U_{2}^{T}. \qquad (9)$$

The vector of residual functions is simplified to the following equation by VP based on SVD:

$$\mathbf{r}(\mathbf{b}) = P^{\perp}_{\Phi(\mathbf{b})}\,\mathbf{y} = U_{2}\,U_{2}^{T}\,\mathbf{y}. \qquad (10)$$

Then, the objective function of the separable nonlinear least squares problem is simplified to

$$\min_{\mathbf{b}}\; S(\mathbf{b}) = \left\|U_{2}\,U_{2}^{T}\,\mathbf{y}\right\|^{2} = \left\|U_{2}^{T}\,\mathbf{y}\right\|^{2}, \qquad (11)$$

where the last equality holds because the columns of $U_{2}$ are orthonormal.
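In code, the reduced objective (11) can be evaluated directly from the SVD. The following sketch assumes Phi is the matrix $\Phi(\mathbf{b})$ already evaluated at the current nonlinear parameters; the rank tolerance is an illustrative choice, not a value taken from the paper.

import numpy as np

def vp_residual_svd(Phi, y, tol=1e-10):
    # Reduced VP objective S(b) = ||U2^T y||^2 computed from the SVD of Phi,
    # cf. (6)-(11). Robust to rank deficiency of Phi.
    U, s, Vt = np.linalg.svd(Phi, full_matrices=True)
    r = int(np.sum(s > tol * s[0]))   # numerical rank of Phi
    U2 = U[:, r:]                     # orthonormal basis of range(Phi)^perp
    z = U2.T @ y
    return float(z @ z)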

Theorem 1. We assume that in the open set $\Omega \subset \mathbb{R}^{k}$, $\Phi(\mathbf{b})$ has constant rank $r \leq \min(m, n)$.

(a) If $\hat{\mathbf{b}}$ is a critical point (or a global minimizer in $\Omega$) of $S(\mathbf{b})$ and $\hat{\mathbf{a}}$ is obtained from $\hat{\mathbf{b}}$ combined with (4), then $(\hat{\mathbf{a}}, \hat{\mathbf{b}})$ is a critical point of $S(\mathbf{a}, \mathbf{b})$ (or a global minimizer for $\mathbf{b} \in \Omega$) and $S(\hat{\mathbf{a}}, \hat{\mathbf{b}}) = S(\hat{\mathbf{b}})$.

(b) If $(\hat{\mathbf{a}}, \hat{\mathbf{b}})$ is a global minimizer of $S(\mathbf{a}, \mathbf{b})$ for $\mathbf{b} \in \Omega$, then $\hat{\mathbf{b}}$ is a global minimizer of $S(\mathbf{b})$ in $\Omega$ and $S(\hat{\mathbf{b}}) = S(\hat{\mathbf{a}}, \hat{\mathbf{b}})$. Furthermore, if there is a unique $\hat{\mathbf{a}}$ among the minimizing pairs of $S(\mathbf{a}, \mathbf{b})$, then $\hat{\mathbf{a}}$ must satisfy (4).

Proof of Theorem 1. From (3) we see that $\mathbf{y} - \Phi(\mathbf{b})\,\mathbf{a} = P^{\perp}_{\Phi(\mathbf{b})}\,\mathbf{y} + \left(P_{\Phi(\mathbf{b})}\,\mathbf{y} - \Phi(\mathbf{b})\,\mathbf{a}\right)$, where the two terms lie in $\operatorname{range}(\Phi(\mathbf{b}))^{\perp}$ and $\operatorname{range}(\Phi(\mathbf{b}))$, respectively. Therefore,

$$S(\mathbf{a}, \mathbf{b}) = \left\|P^{\perp}_{\Phi(\mathbf{b})}\,\mathbf{y}\right\|^{2} + \left\|P_{\Phi(\mathbf{b})}\,\mathbf{y} - \Phi(\mathbf{b})\,\mathbf{a}\right\|^{2}, \qquad (12)$$

because $\mathbb{R}^{m} = \operatorname{range}(\Phi(\mathbf{b})) \oplus \operatorname{range}(\Phi(\mathbf{b}))^{\perp}$, where $\oplus$ signifies a direct sum.
Assume that $\hat{\mathbf{b}}$ is a critical point of $S(\mathbf{b})$ in $\Omega$ and $\hat{\mathbf{a}}$ is defined by (4); then,

$$S(\hat{\mathbf{a}}, \hat{\mathbf{b}}) = \left\|P^{\perp}_{\Phi(\hat{\mathbf{b}})}\,\mathbf{y}\right\|^{2} = S(\hat{\mathbf{b}}). \qquad (13)$$

Because $\nabla S(\hat{\mathbf{b}}) = 0$ and the second term of (12) vanishes at $(\hat{\mathbf{a}}, \hat{\mathbf{b}})$, $(\hat{\mathbf{a}}, \hat{\mathbf{b}})$ is a critical point of $S(\mathbf{a}, \mathbf{b})$.
Assume that $\hat{\mathbf{b}}$ is a global minimizer of $S(\mathbf{b})$ in $\Omega$ and $\hat{\mathbf{a}}$ satisfies (4). Then, clearly, $S(\hat{\mathbf{a}}, \hat{\mathbf{b}}) = S(\hat{\mathbf{b}})$. If we assume that there exists a pair $(\mathbf{a}^{*}, \mathbf{b}^{*})$ with $\mathbf{b}^{*} \in \Omega$ such that $S(\mathbf{a}^{*}, \mathbf{b}^{*}) < S(\hat{\mathbf{a}}, \hat{\mathbf{b}})$, then, because for any $\mathbf{a}$ we have $S(\mathbf{a}, \mathbf{b}^{*}) \geq S(\mathbf{b}^{*})$, it follows that $S(\mathbf{b}^{*}) < S(\hat{\mathbf{b}})$, which contradicts the fact that $\hat{\mathbf{b}}$ is a global minimizer of $S(\mathbf{b})$ in $\Omega$. Therefore, $(\hat{\mathbf{a}}, \hat{\mathbf{b}})$ is a global minimizer of $S(\mathbf{a}, \mathbf{b})$ in $\Omega$, and part (a) of the theorem is proved.
Conversely, suppose that $(\hat{\mathbf{a}}, \hat{\mathbf{b}})$ is a global minimizer of $S(\mathbf{a}, \mathbf{b})$ in $\Omega$, and let $\tilde{\mathbf{a}} = \Phi(\hat{\mathbf{b}})^{+}\,\mathbf{y}$. Then, we have $S(\tilde{\mathbf{a}}, \hat{\mathbf{b}}) = S(\hat{\mathbf{b}}) \leq S(\hat{\mathbf{a}}, \hat{\mathbf{b}})$; however, because $(\hat{\mathbf{a}}, \hat{\mathbf{b}})$ is a global minimizer, we must have equality. If there is a unique $\hat{\mathbf{a}}$ among the minimizing pairs of $S(\mathbf{a}, \mathbf{b})$, then $\hat{\mathbf{a}} = \tilde{\mathbf{a}}$, i.e., $\hat{\mathbf{a}}$ satisfies (4). We still have to show that $\hat{\mathbf{b}}$ is a global minimizer of $S(\mathbf{b})$. Assume that it is not. Then there exists $\mathbf{b}^{*} \in \Omega$ such that $S(\mathbf{b}^{*}) < S(\hat{\mathbf{b}})$. Let $\mathbf{a}^{*}$ be equal to $\Phi(\mathbf{b}^{*})^{+}\,\mathbf{y}$. Then, $S(\mathbf{a}^{*}, \mathbf{b}^{*}) = S(\mathbf{b}^{*}) < S(\hat{\mathbf{b}}) = S(\hat{\mathbf{a}}, \hat{\mathbf{b}})$, which contradicts the fact that $(\hat{\mathbf{a}}, \hat{\mathbf{b}})$ is a global minimizer of $S(\mathbf{a}, \mathbf{b})$. This completes the proof.

2.2. Nonlinear Parameter Estimation Using the LM Algorithm

For the separable least squares problem containing only nonlinear parameters, the LM algorithm is adopted for the solution, and the nonlinear parameters are updated as

$$\mathbf{b}_{s+1} = \mathbf{b}_{s} + \lambda_{s}\,\mathbf{d}_{s}, \qquad (14)$$

where $\mathbf{b}_{s}$ is the current nonlinear parameter vector and $\lambda_{s}$ is a small step length that ensures the decrease of the objective function (11). The step length is calculated by an imprecise search method, such as a line search in which we let $i_{s}$ be the smallest nonnegative integer $i$ satisfying $S(\mathbf{b}_{s} + \beta^{i}\mathbf{d}_{s}) \leq S(\mathbf{b}_{s}) + \sigma\,\beta^{i}\,\nabla S(\mathbf{b}_{s})^{T}\mathbf{d}_{s}$ for fixed constants $\beta, \sigma \in (0, 1)$ in the iteration process. Then,

$$\lambda_{s} = \beta^{\,i_{s}}, \qquad (15)$$

where $\mathbf{d}_{s}$ is the search direction, which can be determined by the following equation:

$$\left(J_{s}^{T}J_{s} + \mu_{s}I\right)\mathbf{d}_{s} = -J_{s}^{T}\,\mathbf{r}_{s}, \qquad (16)$$

where $J_{s}$ is the Jacobian matrix of the reduced residual vector $U_{2}^{T}\,\mathbf{y}$ evaluated at $\mathbf{b}_{s}$, with $\mathbf{r}_{s}$ denoting that residual. Kaufman [15] formulated an explicit analytic expression for the Jacobian, which ensures the efficiency and reliability of the VP algorithms. Combined with the objective function simplified by SVD, the $j$th column of the Jacobian matrix of the residual vector is expressed as

$$J_{j} = -U_{2}^{T}\,\frac{\partial \Phi(\mathbf{b})}{\partial b_{j}}\,\Phi(\mathbf{b})^{+}\,\mathbf{y}, \quad j = 1, \ldots, k, \qquad (17)$$

and $\mu_{s}$ in (16) is the damping parameter. It is adjusted with a strategy similar to the trust-region radius [45], and a quadratic model is defined at the current iteration point as follows:

$$q_{s}(\mathbf{d}) = \left\|\mathbf{r}_{s}\right\|^{2} + 2\,\mathbf{r}_{s}^{T}J_{s}\,\mathbf{d} + \mathbf{d}^{T}J_{s}^{T}J_{s}\,\mathbf{d}. \qquad (18)$$

Then, the ratio of the actual decrease of the objective function to the decrease predicted by the quadratic model is calculated:

$$\rho_{s} = \frac{S(\mathbf{b}_{s}) - S(\mathbf{b}_{s} + \mathbf{d}_{s})}{q_{s}(\mathbf{0}) - q_{s}(\mathbf{d}_{s})}. \qquad (19)$$

When $\rho_{s}$ is close to one, the agreement between the quadratic model and the objective function is good at the point $\mathbf{b}_{s}$, and the damping parameter $\mu_{s}$ should be decreased; in this case the iteration behaves like the Gauss–Newton method, which is more effective. When $\rho_{s}$ is close to zero, the agreement between the quadratic model and the objective function is poor at $\mathbf{b}_{s}$, and $\mu_{s}$ should be increased. When $\rho_{s}$ is neither close to zero nor to one, $\mu_{s}$ is suitable and does not need to be adjusted. The critical values of $\rho_{s}$ are usually 0.25 and 0.75, and the adjustment rules of $\mu_{s}$ are as follows:

$$\mu_{s+1} = \begin{cases} c_{1}\,\mu_{s}, & \rho_{s} < 0.25, \\ \mu_{s}, & 0.25 \leq \rho_{s} \leq 0.75, \\ c_{2}\,\mu_{s}, & \rho_{s} > 0.75, \end{cases} \qquad (20)$$

where $c_{1} > 1$ and $0 < c_{2} < 1$ are fixed scaling constants (typical choices are $c_{1} = 10$ and $c_{2} = 0.1$).

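One LM step consistent with (14)-(20) might be sketched as follows. For simplicity, the unit step length $\lambda_{s} = 1$ is used instead of the line search (15), and the scaling constants are the typical choices named above; these are assumptions for illustration, not the authors' exact configuration.

import numpy as np

def lm_step(b, mu, residual, jacobian):
    # One Levenberg-Marquardt step on the reduced objective S(b) = ||rvec(b)||^2.
    # residual(b) returns the residual vector; jacobian(b) returns its Jacobian.
    rv = residual(b)
    J = jacobian(b)
    # Search direction from (16): (J^T J + mu I) d = -J^T r.
    d = np.linalg.solve(J.T @ J + mu * np.eye(b.size), -J.T @ rv)
    rv_new = residual(b + d)
    actual = rv @ rv - rv_new @ rv_new                      # actual decrease
    predicted = -(2 * rv @ (J @ d) + d @ (J.T @ (J @ d)))   # decrease predicted by (18)
    rho = actual / predicted if predicted > 0 else 0.0      # ratio (19)
    if rho < 0.25:        # poor model agreement: increase damping, reject the step
        return b, 10.0 * mu
    if rho > 0.75:        # good agreement: relax the damping
        mu *= 0.1
    return b + d, mu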

2.3. Linear Parameter Estimation by Correcting Characteristic Value Based on L-Curve

The nonlinear parameter estimates $\hat{\mathbf{b}}$ are obtained as described in Section 2.2; the linear parameters can then be calculated by (4). When $\Phi(\hat{\mathbf{b}})^{T}\Phi(\hat{\mathbf{b}})$ is a nonsingular matrix, the least squares method is used to calculate them directly. However, when $\Phi(\hat{\mathbf{b}})^{T}\Phi(\hat{\mathbf{b}})$ is singular or the condition number of $\Phi(\hat{\mathbf{b}})$ exceeds 100, the solution of the linear least squares problem is unstable or cannot be obtained. Therefore, it needs to be solved by a regularization method. In this study, an iteration method that corrects characteristic values based on the L-curve is used to solve the problem.

The least squares normal equation for the linear parameter estimation is (writing $\Phi = \Phi(\hat{\mathbf{b}})$ for brevity)

$$\Phi^{T}\Phi\,\mathbf{a} = \Phi^{T}\,\mathbf{y}. \qquad (21)$$

Adding $\alpha\,\mathbf{a}$ to both sides of (21), we obtain

$$\left(\Phi^{T}\Phi + \alpha I\right)\mathbf{a} = \Phi^{T}\,\mathbf{y} + \alpha\,\mathbf{a}. \qquad (22)$$

Let $N = \Phi^{T}\Phi + \alpha I$ and $W = \Phi^{T}\,\mathbf{y}$; the step of the iteration method that corrects the characteristic value is

$$\mathbf{a}^{(k+1)} = N^{-1}\left(W + \alpha\,\mathbf{a}^{(k)}\right), \qquad (23)$$

where $\mathbf{a}^{(k)}$ is the current estimate of the linear parameter vector and $\alpha > 0$ is a regularization parameter selected according to the L-curve method. The L-curve describes the relationship between the norm of the regularized solution $\|\mathbf{a}_{\alpha}\|$ and the residual norm $\|\Phi\,\mathbf{a}_{\alpha} - \mathbf{y}\|$, which corresponds to each set of regularization parameter values.
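A compact sketch of the iteration (23) together with an L-curve corner search is given below. The log-spaced grid of candidate values and the discrete-curvature corner detection are simplifications for illustration, not the authors' exact procedure.

import numpy as np

def ccv_iteration(Phi, y, alpha, n_iter=200):
    # Iteration by correcting characteristic values, cf. (23):
    # a_{k+1} = (Phi^T Phi + alpha I)^{-1} (Phi^T y + alpha a_k).
    N = Phi.T @ Phi + alpha * np.eye(Phi.shape[1])
    W = Phi.T @ y
    a = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        a = np.linalg.solve(N, W + alpha * a)
    return a

def lcurve_alpha(Phi, y, alphas):
    # Choose alpha at the corner of the L-curve (log residual norm vs.
    # log solution norm), approximated by maximum discrete curvature.
    pts = []
    for al in alphas:
        a = np.linalg.solve(Phi.T @ Phi + al * np.eye(Phi.shape[1]), Phi.T @ y)
        pts.append((np.log(np.linalg.norm(Phi @ a - y)),
                    np.log(np.linalg.norm(a))))
    x, z = np.array(pts).T
    dx, dz = np.gradient(x), np.gradient(z)
    ddx, ddz = np.gradient(dx), np.gradient(dz)
    kappa = np.abs(dx * ddz - dz * ddx) / (dx**2 + dz**2) ** 1.5
    return alphas[int(np.argmax(kappa[1:-1])) + 1]   # ignore the endpoints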

The convergence of the spectral correction iterative method based on the L-curve is proved as follows.

Let the initial estimated value of $\mathbf{a}$ be $\mathbf{a}^{(0)}$ and $k = 0, 1, 2, \ldots$. The iterative calculation process of spectral correction can then be written as

$$\mathbf{a}^{(k+1)} = \left(\Phi^{T}\Phi + \alpha I\right)^{-1}\left(\Phi^{T}\,\mathbf{y} + \alpha\,\mathbf{a}^{(k)}\right). \qquad (24)$$

When $k$ is finite, $k$ is the number of (possibly insufficient) iterations of the spectral correction iteration method; therefore, the relationship between the estimated value $\mathbf{a}^{(k)}$ and the iteration initial value $\mathbf{a}^{(0)}$ is

$$\mathbf{a}^{(k)} = \sum_{i=0}^{k-1}\left(\alpha\,N^{-1}\right)^{i}N^{-1}\,W + \left(\alpha\,N^{-1}\right)^{k}\mathbf{a}^{(0)}. \qquad (25)$$

When $k \to \infty$, the number of iterations of the spectral correction iterative method is sufficient and $\left(\alpha\,N^{-1}\right)^{k}\mathbf{a}^{(0)} \to 0$; therefore, the relationship between the estimated value and the full iteration result is

$$\lim_{k \to \infty}\mathbf{a}^{(k)} = \sum_{i=0}^{\infty}\left(\alpha\,N^{-1}\right)^{i}N^{-1}\,W. \qquad (26)$$

According to (23), the relation between the estimated result of spectral correction and the initial iteration value is as follows:

$$\mathbf{a}^{(k)} - \mathbf{a}^{(\infty)} = \left(\alpha\,N^{-1}\right)^{k}\left(\mathbf{a}^{(0)} - \mathbf{a}^{(\infty)}\right), \qquad (27)$$

where $\mathbf{a}^{(\infty)}$ denotes the limit in (26).

In the case of $\alpha > 0$, $N = \Phi^{T}\Phi + \alpha I$ is an $n$th-order positive definite matrix. There exists an orthogonal matrix $Q$ of order $n$ that satisfies the expression

$$Q^{T}\left(\Phi^{T}\Phi + \alpha I\right)Q = \operatorname{diag}\left(\lambda_{1} + \alpha,\, \ldots,\, \lambda_{n} + \alpha\right), \qquad (28)$$

where $\lambda_{i} \geq 0$ ($i = 1, \ldots, n$) are the eigenvalues of $\Phi^{T}\Phi$.

Let $G = \alpha\,N^{-1} = \alpha\left(\Phi^{T}\Phi + \alpha I\right)^{-1}$; then

$$\left\|G\right\|_{2} = \max_{1 \leq i \leq n}\,\frac{\alpha}{\lambda_{i} + \alpha} < 1. \qquad (29)$$

According to the convergence of matrix sequences, in the case of $\left\|G\right\|_{2} < 1$, the matrix sequence $G^{k}$ is convergent and $\lim_{k \to \infty} G^{k} = 0$. The matrix power series $\sum_{i=0}^{\infty} G^{i}$ is absolutely convergent, and its sum is $\left(I - G\right)^{-1}$.

Therefore,

$$\lim_{k \to \infty}\mathbf{a}^{(k)} = \left(I - G\right)^{-1}N^{-1}\,W = \left(\Phi^{T}\Phi\right)^{-1}\Phi^{T}\,\mathbf{y}. \qquad (30)$$

That is, in the case of $\alpha > 0$, the spectral correction iterative method based on the L-curve converges to the least squares solution for any initial value. Therefore, the relationship between the estimated result and the iteration termination value can be further rewritten as

$$\mathbf{a}^{(k)} = \left(\Phi^{T}\Phi\right)^{-1}\Phi^{T}\,\mathbf{y} + G^{k}\left(\mathbf{a}^{(0)} - \left(\Phi^{T}\Phi\right)^{-1}\Phi^{T}\,\mathbf{y}\right). \qquad (31)$$

In the case of $\alpha = 0$, the first step of the spectral correction iterative method already yields the least squares solution for any initial value of $\mathbf{a}$, since (23) then reduces to the normal equation (21). Further, $G = 0$ in this case.

Therefore, this method converges to the least squares solution and is feasible.
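This convergence behavior is easy to verify numerically on synthetic data; the following check uses random, hypothetical inputs and an arbitrary positive $\alpha$.

import numpy as np

rng = np.random.default_rng(0)
Phi = rng.standard_normal((50, 4))    # well-conditioned test matrix
y = rng.standard_normal(50)
alpha = 5.0                           # any alpha > 0 should converge

a_ls, *_ = np.linalg.lstsq(Phi, y, rcond=None)   # ordinary LS solution
a = rng.standard_normal(4)                       # arbitrary initial value
N = Phi.T @ Phi + alpha * np.eye(4)
for _ in range(500):
    a = np.linalg.solve(N, Phi.T @ y + alpha * a)

print(np.linalg.norm(a - a_ls))       # should approach machine precision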

2.4. Summarization of the Algorithm

According to the estimation methods for linear and nonlinear parameters proposed in Sections 2.1–2.3, the separable nonlinear least squares solution method in which the nonlinear parameters of the SVD-based variable-projection-separated problem are solved with the LM algorithm and the linear parameters are solved directly with the least squares method is referred to as LMVP + LS. When the linear parameters are solved using the iteration method that corrects characteristic values based on the L-curve, the method is referred to as LMVP + CCVL. The traditional LM method without separation of parameters is referred to as LMunSep. The steps for solving the separable nonlinear least squares problem are summarized as follows.

A flowchart of Algorithm 1 is presented in Figure 1.

Step 1: given initial values $\mathbf{b}_{0}$ of the nonlinear parameters, set $s = 0$ and the initial damping parameter $\mu_{0} > 0$, and choose $\varepsilon$, a very small positive number, as the convergence tolerance.
Step 2: calculate the matrix $\Phi(\mathbf{b}_{s})$ of the nonlinear function and the Jacobian matrix of the objective function after VP, using (17).
Step 3: calculate the iteration step length and iteration direction using (15) and (16), and then update the nonlinear parameters using (14).
Step 4: if $\left\|\mathbf{b}_{s+1} - \mathbf{b}_{s}\right\| < \varepsilon$, terminate the algorithm; otherwise, set $s = s + 1$ and go to Step 3.
Step 5: calculate the linear parameters. When $\Phi(\hat{\mathbf{b}})$ is a nonsingular, well-conditioned matrix, the least squares method (4) is used to calculate them directly; when $\Phi(\hat{\mathbf{b}})$ is singular or its condition number exceeds 100, the solution is obtained using (23).
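A possible end-to-end driver corresponding to these steps is sketched below. It reuses the helper functions sketched earlier in Section 2 (lm_step, ccv_iteration, lcurve_alpha), replaces the analytic Jacobian (17) with a finite-difference stand-in, and uses a simple step-based stopping rule; all of these are illustrative simplifications rather than the authors' implementation.

import numpy as np

def reduced_residual(b, t, y, phi, tol=1e-10):
    # Residual vector U2^T y of the projected problem, cf. (10)-(11).
    U, s, _ = np.linalg.svd(phi(b, t), full_matrices=True)
    r = int(np.sum(s > tol * s[0]))
    return U[:, r:].T @ y

def fd_jacobian(f, b, h=1e-6):
    # Forward-difference Jacobian; (17) provides an analytic alternative.
    f0 = f(b)
    J = np.empty((f0.size, b.size))
    for j in range(b.size):
        bp = b.copy()
        bp[j] += h
        J[:, j] = (f(bp) - f0) / h
    return J

def lmvp(b0, t, y, phi, mu=1e-2, eps=1e-10, max_iter=100):
    f = lambda bb: reduced_residual(bb, t, y, phi)
    b = np.asarray(b0, dtype=float)
    for _ in range(max_iter):                       # Steps 2-4
        b_new, mu = lm_step(b, mu, f, lambda bb: fd_jacobian(f, bb))
        step = np.linalg.norm(b_new - b)
        b = b_new
        if 0.0 < step < eps:                        # accepted step below tolerance
            break
    Phi = phi(b, t)
    if np.linalg.cond(Phi) <= 100:                  # Step 5: well-posed, plain LS
        a, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    else:                                           # Step 5: ill-posed, CCVL
        alpha = lcurve_alpha(Phi, y, np.logspace(-8, 2, 60))
        a = ccv_iteration(Phi, y, alpha)
    return a, b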

3. Numerical Examples

In this study, we used three examples, the Gaussian function fitting model, the fractional fitting model, and the decomposition of full-waveform LiDAR data, to verify the method proposed in Section 2 for solving separable nonlinear least squares problems. The results were compared with those of the LM algorithm with unseparated parameters with respect to the number of iterations, the number of function evaluations, and the fitting accuracy. The experiments were performed using MATLAB R2016b on a 2.3 GHz desktop PC running Windows 10.

3.1. Example 1: Gaussian Function Fitting Model

The Gaussian function fitting model often arises in parameter estimation for signal processing. The generalized Gaussian model is a linear combination of nonlinear functions and can therefore be solved as a separable nonlinear least squares problem based on the VP algorithm. The form of the model is as follows:

$$y_{i} = \sum_{j=1}^{n} a_{j}\exp\left(-\frac{\left(t_{i} - u_{j}\right)^{2}}{2\,c_{j}^{2}}\right) + \epsilon_{i}, \quad i = 1, \ldots, m, \qquad (32)$$

where $m$ is the number of observations, $n$ is the number of linear parameters, and $\epsilon_{i}$ is the Gaussian noise. $\mathbf{a} = \left(a_{1}, \ldots, a_{n}\right)^{T}$ is the linear parameter vector to be estimated, and the centers $u_{j}$ and widths $c_{j}$ form the nonlinear parameter vector $\mathbf{b}$ to be estimated. Equation (32) is written in the form of a matrix as

$$\mathbf{y} = \Phi(\mathbf{b})\,\mathbf{a} + \boldsymbol{\epsilon}, \qquad (33)$$

where the $j$th column of $\Phi(\mathbf{b})$ contains the values of the $j$th Gaussian function at the observation points and $\boldsymbol{\epsilon}$ is the noise vector. The true parameter values of the model in (32) are fixed in advance. The observation error follows a normal distribution with a standard deviation of 0.5. According to the above parameters, ten groups of observations are randomly generated, where each group has 400 data points. The initial values of the nonlinear and linear parameters are randomly selected from uniform distributions around the true values.
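The simulation setup can be reproduced in outline as follows. The particular true values and initial-value ranges below are placeholders chosen for illustration (the paper's actual values appear in its tables); the noise level of 0.5 and the 400 points per group follow the text.

import numpy as np

rng = np.random.default_rng(42)

def phi_gauss(b, t):
    # Model matrix for a sum of Gaussians: column j is exp(-(t-u_j)^2/(2 c_j^2)).
    # b packs the nonlinear parameters as [u_1..u_n, c_1..c_n].
    n = b.size // 2
    u, c = b[:n], b[n:]
    return np.exp(-((t[:, None] - u[None, :]) ** 2) / (2 * c[None, :] ** 2))

# Placeholder true parameters (illustrative only).
a_true = np.array([3.0, 2.0])                 # linear amplitudes
b_true = np.array([1.0, 4.0, 0.8, 1.2])       # two centers, two widths

t = np.linspace(0.0, 6.0, 400)
y = phi_gauss(b_true, t) @ a_true + rng.normal(0.0, 0.5, t.size)

# Random initial values from a uniform distribution around the truth,
# with an assumed +/-0.5 range.
b0 = b_true + rng.uniform(-0.5, 0.5, b_true.size)

# With these pieces, a_hat, b_hat = lmvp(b0, t, y, phi_gauss) runs the
# LMVP + LS / LMVP + CCVL pipeline sketched in Section 2.4.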

The error equations were listed based on the model in (32) and the 400 pairs of observed values $(t_{i}, y_{i})$, and the parameters were solved using the LMunSep, LMVP + LS, and LMVP + CCVL methods, respectively. The parameter estimates, obtained by averaging the results of ten runs, are shown in Table 1.

The fitting curve is shown in Figure 2, and the difference between the parameter estimation and true value is shown in Figure 3. Because the estimated parameters obtained by the three methods do not differ significantly, and the fitting curves are basically coincident, only one fitting curve is drawn.

We can see from Table 1 that the results of the three methods are completely consistent in their estimation of the nonlinear parameters. For the estimation of the linear parameters, the results obtained by the LMunSep and LMVP + LS methods are identical, while the results obtained by the LMVP + CCVL method deviate slightly more from the true values than those of the other methods. From Figure 2, we can see that the parameters estimated by the three methods fit the observations very well and that their fitting curves are basically the same. In Figure 3, the closer a difference value is to zero, the closer the parameter estimate is to the true value. The parameter estimation results of the LMunSep and LMVP + LS methods may therefore be taken to be identical, and their difference values are small.

To better compare the parameter estimation results obtained using the three methods, Table 2 lists the maximum, minimum, and average values of the sum of squared residuals between all the parameter estimates and the true values (all_SSR) over the ten runs. Because the latter two methods eliminate the linear parameters using the VP algorithm, the sums of squared residuals between the nonlinear parameter estimates and the true values (nonpar_SSR) are also listed in Table 2.

We compare the fitting residual sum of squares and calculation process of the three methods in Table 3.

Table 2 indicates that, in terms of numerical value, the all_SSR values obtained by the LMVP + LS and LMunSep methods are equal. The nonpar_SSR results of LMVP + CCVL are equal to those of the other two methods; however, its all_SSR results are higher, indicating that its estimation of the linear parameters is worse than that of the other two methods.

Table 3 indicates that the sum of squared residuals between the predicted and true values obtained using LMVP + CCVL is the largest, and those of the other two methods are equal. The mean square errors of the LMunSep, LMVP + LS, and LMVP + CCVL methods are 0.0107, 0.0107, and 0.0248, respectively, which are relatively small, indicating that the results obtained by all three methods are reliable. In terms of the number of iterations and the average number of function evaluations, the VP-based methods show a considerable reduction compared with the method in which the parameters are not separated.

The residual change of the objective function in the iterative process is shown in Figure 4.

According to Figure 4, in the case of Gaussian function fitting, the residuals of the three methods all decrease overall and the final parameter estimates are very close to the true values, suggesting that the three methods converge.

In addition, according to the structural characteristics of the model, when the nonlinear parameters are fixed in the LMVP + LS and LMVP + CCVL methods, the nonlinear function matrix is of full rank and its condition number is less than five in all ten calculations; thus, no ill-conditioning problem arises.

Based on the experimental results, we draw two conclusions: (1) Because the matrix of the nonlinear function is neither rank deficient nor ill conditioned, the LS method attains more accurate linear parameters directly than the regularization method. (2) The LMVP + LS method eliminates the linear parameters using the VP algorithm based on SVD, reduces the dimension of the parameters to be estimated, and improves the efficiency of the iteration process. The number of iterations required by the LMVP + LS method is smaller than that of the LMunSep method, and it produces the best results at the same precision.

3.2. Example 2: Fractional Fitting Model

The fractional fitting model is also a commonly used curve-fitting model. The model is described by expression (35), in which each linear parameter $a_{j}$ multiplies a fractional term in the nonlinear parameters; $\mathbf{a}$ is the linear parameter vector to be estimated and $\mathbf{b}$ is the nonlinear parameter vector. The model in (35) is written in the form of a matrix as

$$\mathbf{y} = \Phi(\mathbf{b})\,\mathbf{a} + \boldsymbol{\epsilon}, \qquad (36)$$

where the $j$th column of $\Phi(\mathbf{b})$ is the coefficient term corresponding to the linear parameter $a_{j}$. The true values of the parameters in the model in (35) are fixed in advance. The observation error follows a normal distribution with a standard deviation of 0.001. According to the above parameters, ten groups of observations are randomly generated, where each group has 200 data points. The initial values of the nonlinear and linear parameters are randomly selected from uniform distributions around the true values.
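Because the symbolic form of (35) is not reproduced above, the sketch below uses a hypothetical rational basis, $\phi_{j}(\mathbf{b};\, t) = 1/(t + b_{j})$, purely to illustrate why fractional models of this type tend to be ill posed: nearby poles make the columns of $\Phi(\mathbf{b})$ nearly collinear, so the condition number blows up. The basis is an assumption for illustration, not the model of (35).

import numpy as np

def phi_fractional(b, t):
    # Hypothetical fractional basis: column j is 1/(t + b_j).
    return 1.0 / (t[:, None] + b[None, :])

t = np.linspace(1.0, 3.0, 200)          # 200 points per group, as in the text
Phi = phi_fractional(np.array([0.9, 1.0, 1.1]), t)
print(np.linalg.cond(Phi))              # far above the threshold of 100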

The error equations were listed based on the model in (35) and the 200 pairs of observed values $(t_{i}, y_{i})$, and the parameters were solved using the LMunSep, LMVP + LS, and LMVP + CCVL methods. The parameter estimates obtained by averaging the ten calculations are listed in Table 4.

Because the estimated parameters obtained by the three methods do not differ significantly and the fitting curves basically coincide, only one fitting curve is drawn; it is shown in Figure 5.

Table 4 and Figure 5 indicate that the three methods produce estimates of the nonlinear parameters that are close to the true values. The calculation results of the LMVP + CCVL and LMVP + LS methods coincide for the nonlinear parameters because the separated objective function is the same and the nonlinear parameters are estimated by the same LM algorithm in both. When the LMVP + LS method is used, however, significant deviation is observed between the estimated and true values of the linear parameters. Combined with Figure 5, where the fitting curves of the three methods are similar, this shows that the LMVP + LS method cannot obtain the optimal parameters even though it satisfies the least squares principle in solving for the linear parameters. This is because the condition number of the nonlinear function matrix is considerably greater than 100 in the process of solving the separated problem after VP. The condition numbers in all ten calculations were far above this threshold, so these were seriously ill-posed problems.

Because the differences in the linear parameters solved by the LMVP + LS method are much larger than those of the nonlinear parameters, the original differences were scaled down by a factor of $10^{4}$ before plotting. The differences between the parameter estimates of the three methods and the true values are shown in Figure 6.

Figure 6 indicates that the difference between the parameter estimates and the true values obtained by the LMVP + CCVL method is the smallest among the three methods; that is, its result is closest to the true values. The difference obtained by the LMunSep method is the next smallest. The difference obtained by the LMVP + LS method is the largest, even after being scaled down by a factor of $10^{4}$.

To better compare the parameter estimation results of the three methods, Table 5 lists the maximum, minimum, and average values of the sum of squared residuals between all the parameter estimates and the true values (all_SSR) obtained from the ten calculations. Because the latter two methods eliminate the linear parameters using the VP algorithm, the sums of squared residuals between the nonlinear parameter estimates and the true values (nonpar_SSR) are also listed in Table 5.

Table 5 indicates that the sum of squared residuals of all the parameters using the LMVP + LS method is much larger than those of the other two methods. The main reason is that the matrix of the nonlinear function is seriously ill posed, so the results of the direct least squares solution are not the optimal parameters. The LMVP + CCVL method, using the iteration method that corrects characteristic values based on the L-curve, obtained optimal estimates close to the true values: its sum of squared residuals between the parameter estimates and true values is the smallest. Its overall result is also superior to that of the LMunSep method, in which the parameters are not separated. For the sum of squared residuals of the nonlinear parameters, the LMVP + CCVL and LMVP + LS methods exhibit the same results, which are lower than the values obtained by the LMunSep method.

A comparison of the three methods in terms of model fitting and calculation processes is presented in Table 6.

Table 6 indicates that, in terms of the sum of squared residuals between the predicted and true values, whether for the maximum, minimum, or average value, the lowest values are obtained with the LMVP + CCVL method, followed by the LMunSep method; the highest values are obtained with the LMVP + LS method. In terms of the mean square error, the LMVP + CCVL method provides the lowest values. The number of iterations and the average number of function evaluations decrease considerably with the VP algorithm: the average number of iterations was reduced from 161.4 to 20.8, and the average number of function evaluations was reduced from 2516.2 to 207.7.

The residual change in the objective function of the iterative process is shown in Figure 7.

According to Figure 7, in the case of fractional function fitting, the residuals of the three methods all decrease overall and the final parameter estimates are very close to the true values, so it can be seen that the three methods converge.

The experimental analysis indicates that, when the separable nonlinear least squares problem is seriously ill posed, the results obtained by the LMVP + CCVL method afford the highest fitting precision. Its computational efficiency is also much better than that of the LM method without separation.

3.3. Example 3: Decomposition of Full-Waveform LiDAR Data

The experimental data in this section originated from full-waveform LiDAR data of a region collected in 2016. The full-waveform LiDAR system records the backscattered energy at different elevation points within a certain elevation range in the form of waveforms. The survey area mainly contains buildings, trees, and roads. The point cloud and waveform information are stored in a LAS file in the LAS 1.3 standard format. The waveform data sampling interval is 1 ns, and the number of samples is 287. The echo waveform is regarded as the superposition of several Gaussian functions. The model form is as follows:

$$y_{i} = \sum_{j=1}^{n} a_{j}\exp\left(-\frac{\left(t_{i} - u_{j}\right)^{2}}{2\,c_{j}^{2}}\right) + \epsilon_{i}, \qquad (38)$$

where $\epsilon_{i}$ is the Gaussian noise, $\mathbf{a}$ is the linear parameter vector (the echo amplitudes) to be estimated, and the nonlinear parameter vector $\mathbf{b}$, which collects the centers $u_{j}$ and widths $c_{j}$, is also to be estimated.

The parameters were solved by the LMunSep, LMVP + LS, and LMVP + CCVL methods. The fitting curves are shown in Figures 8(a)–8(c).

It can be seen from Figure 8 that, with the LMunSep method, only the first decomposed waveform fits the observations, and the latter three decomposed waveforms are not well fitted. LMVP + LS and LMVP + CCVL provide accurate fitting results.

As there are no true values for the actual measurements, the estimated results are not compared with true values. To better compare the parameter estimation results of the three methods, the mean square error between the fitting results and the observed values, the maximum fit difference (Diff-max), the minimum fit difference (Diff-min), the number of iterations, and the number of function evaluations of the three methods are presented in Table 7.

It can be seen from Table 7 that, as in the simulation experiments, the reduced dimension of the parameter estimation owing to parameter separation improves the chance of convergence. LMVP + LS is the best of the three methods, with the smallest number of iterations and function evaluations; LMVP + CCVL is second. The LMunSep method, in which the parameters are not separated, does not produce the correct result.

The residual changes of the objective function in the iterative process are shown in Figure 9.

From Figure 9, we can see that all three methods tend to converge, but LMunSep, in which the parameters are not separated, does not converge to the optimal result, and its residual sum of squares remains large. The residual sums of the other two methods tend to zero. The number of iterations is reduced from 20 to 10, and the number of function evaluations decreases from 290 to 106, greatly simplifying the calculation.

In short, these experiments indicate that the parameter separation estimation method shows a great improvement in terms of the number of iterations, the number of function evaluations, and the quality of the calculated results.

4. Conclusion

In this study, linear and nonlinear parameters were separated by the VP algorithm based on SVD, and the separable least squares problem was transformed into a least squares problem with only nonlinear parameters, which reduced the dimension of the parameters, the number of iterations, and the number of function evaluations and improved operational efficiency. In addition, to solve the ill-posed problem of the coefficient matrix composed of nonlinear functions when solving for the linear parameters, an iteration method that corrects characteristic values based on the L-curve was adopted. This also helped to ensure the convergence of the model parameter estimation and improved the prediction accuracy. The parameter estimation method used in our study is suitable for separable nonlinear least squares problems with a large number of linear parameters. One limitation is that rank deficiency often occurred in the process of separable nonlinear least squares parameter estimation; this will be addressed in future research.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was supported by the National Nature Science Foundation of China (Grant no. 41774002), the Special Project Fund of Taishan Scholars of Shandong Province (Grant no. TSXZ201509), and the Graduates Innovation Fund of Shandong University of Science and Technology (Grant no. SDKDYC190207).