Abstract

We propose a method for solving the ill-posed semiparametric regression model based on singular value modification restriction, aimed at the ill-conditioned normal matrix that may arise when solving the semiparametric regression model. First, the singular value decomposition of the coefficient matrix is computed, and the smaller singular values are selected from the singular value matrix according to a selection criterion. Second, only these relatively small singular values are modified by the biased parameter. This suppresses the amplification of the estimated variance, limits the deviation that the modification introduces, and yields more reliable parameter estimates. The numerical experiments show that the singular value modification restriction method not only overcomes the effect of the ill-posed normal matrix on the parameter estimates but also correctly separates the systematic errors and improves the accuracy of the semiparametric regression results.

1. Introduction

The compensated least squares regularization criterion [1–6] has been widely used as a good estimator since the semiparametric regression model was established. In practice, however, it has gradually become clear that when the coefficient matrix contains linearly correlated column vectors, the properties of the compensated least squares estimator deteriorate significantly: the solution for the undetermined parameters becomes extremely unstable, and the separation of systematic errors degrades. To deal with this problem, researchers introduced ridge estimation into the semiparametric model and derived the semiparametric general penalized least squares (GPLS) estimation method [7, 8] and the generalized penalized least squares method with two added smoothing parameters (the GPLSBASP estimation method) [9]. These methods first use the L-curve method or the generalized cross-validation method to determine the regularization parameter, then use the ridge trace method or the generalized cross-validation method to determine the ridge regularization parameter k, and set the associated regularization matrix to the unit matrix. Because the regularization matrix in these estimators is a unit matrix, the analysis of variance and deviation shows that the singular values of the coefficient matrix are all modified by the ridge regularization parameter k. Although the ill-posed problem of the normal matrix is alleviated, the reliability of the parameter estimation is reduced.

In dealing with ill-posed problems, increasing attention has been paid to direct solution algorithms for the ill-posed observation equation based on singular value decomposition (SVD) [10, 11]. After SVD, the relatively large singular values and their corresponding singular vectors represent the more reliable part of the model parameters, while the small singular values and their corresponding singular vectors represent the unreliable part. The singular value correction method suppresses the unreliable components. However, it corrects not only the smaller singular values but also the relatively large ones, which distorts the well-determined components of the model and increases the deviation. To address this problem, many effective numerical methods have been proposed and developed. Wu [12] explains theoretically the difference between the Tikhonov and TSVD regularization methods, based on the spectral decomposition of the coefficient matrix of the linear model. In the two-step solution proposed by Wang [13], the selection of a suitable regularization matrix can likewise be understood as a better modification of the singular values. Other work distinguishes between the relatively large and relatively small singular values of the coefficient matrix in the linear model [14–18] and, drawing on the characteristics of the truncated singular value method, proposes an improved singular value correction method.

This paper combines the characteristics of the truncated singular value method and the singular value correction method, adopts the idea of dividing the singular values into two parts and treating them separately, and proposes a solution method for the ill-posed semiparametric regression model based on singular value modification restriction.

2. Regularized Solutions of Ill-Posed Problems

2.1. Compensated Least Squares Estimation of the Semiparametric Model

The semiparametric model is

where the terms are the coefficient matrix, the parametric and nonparametric components to be estimated, the observation vector (the observations are taken to be of equal precision), and the error vector.

From the semiparametric regression model (1), the following error equation can be obtained:

Under the compensated least squares criterion, the extremum function is constructed by the Lagrange method for conditional extrema as follows:

where the function involves a Lagrange constant, the regularization parameter, and the regularization matrix.
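In notation that is common for compensated least squares, the model, error equation, and extremum function of this subsection can be sketched as follows, together with the stationarity conditions and the block normal equations they lead to. All symbols are assumptions made for this sketch rather than notation confirmed by the surrounding text: $B$ is the coefficient matrix, $X$ the parameters, $s$ the nonparametric component, $L$ the observations, $\Delta$ the errors, $V$ the residuals, $P$ a symmetric weight matrix, $K$ the Lagrange multipliers, $\alpha$ the regularization parameter, and $R$ a symmetric regularization matrix.

```latex
\begin{aligned}
L &= BX + s + \Delta, \\
V &= B\hat{X} + \hat{s} - L, \\
\Phi &= V^{T} P V + \alpha\,\hat{s}^{T} R\,\hat{s}
        + 2K^{T}\bigl(B\hat{X} + \hat{s} - L - V\bigr), \\
\frac{\partial\Phi}{\partial V} &= 2PV - 2K = 0, \qquad
\frac{\partial\Phi}{\partial \hat{X}} = 2B^{T}K = 0, \qquad
\frac{\partial\Phi}{\partial \hat{s}} = 2\alpha R\hat{s} + 2K = 0, \\
\begin{bmatrix} B^{T}PB & B^{T}P \\ PB & P + \alpha R \end{bmatrix}
\begin{bmatrix} \hat{X} \\ \hat{s} \end{bmatrix}
&= \begin{bmatrix} B^{T}PL \\ PL \end{bmatrix}.
\end{aligned}
```

The last line follows by substituting $K = PV$ into the other two conditions, which gives $B^{T}PV = 0$ and $\alpha R\hat{s} + PV = 0$, and then inserting the error equation for $V$.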

Setting the partial derivatives of (3) with respect to the unknowns to zero gives the following:

This gives the following normal equation:

When the normal matrix is invertible, the regularization matrix can be determined by the natural spline smoothing function method, the time series method, the prior variance, or the prior variance-covariance method, and the regularization parameter can be determined by the L-curve method or the generalized cross-validation method. The following parametric and nonparametric estimates are then obtained:

where and .

The variance and deviation are
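As an illustration of the compensated least squares solution described in this subsection, the following minimal sketch solves the block normal equations directly with NumPy. It assumes the criterion has the form V'PV + alpha * s'Rs with V = BX + s - L; the function name `pls_semiparametric`, the symbols, and the synthetic data are hypothetical and are not the data of the experiments.

```python
import numpy as np

def pls_semiparametric(B, L, R, alpha, P=None):
    """Compensated (penalized) least squares for L = B X + s + error.

    Assumed criterion: V'PV + alpha * s'Rs -> min, with V = B X + s - L.
    Returns the parametric estimate X_hat and the nonparametric estimate s_hat.
    """
    n, t = B.shape
    P = np.eye(n) if P is None else P
    # Block normal equations in the unknowns (X, s).
    A = np.block([[B.T @ P @ B, B.T @ P],
                  [P @ B, P + alpha * R]])
    rhs = np.concatenate([B.T @ P @ L, P @ L])
    sol = np.linalg.solve(A, rhs)
    return sol[:t], sol[t:]

# Synthetic demo; every value here is illustrative.
rng = np.random.default_rng(0)
n = 100
B = rng.normal(size=(n, 2))
X_true = np.array([2.0, 3.0])
s_true = 5.0 * np.sin(2 * np.pi * np.arange(n) / n)   # smooth systematic error
L = B @ X_true + s_true + 0.01 * rng.normal(size=n)

G = np.diff(np.eye(n), axis=0)   # first-difference operator, shape (n-1, n)
R = G.T @ G                      # penalizes squared differences of adjacent s values
X_hat, s_hat = pls_semiparametric(B, L, R, alpha=1.0)
```

Solving the full block system keeps the sketch short. For large n, eliminating the nonparametric block first would be cheaper, but the result is the same whenever the block matrix is nonsingular.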

2.2. Generalized Compensated Least Squares Estimation in the Semiparametric Model

When the normal matrix is ill-posed or not invertible, the criterion for the semiparametric generalized compensated least squares estimation is defined as

where the regularization parameter and the ridge regularization parameter k balance the terms of the criterion during minimization. (The parameter k is called the ridge regularization parameter in regularization methods for ill-posed problems and the biased parameter in singular value decomposition methods for ill-posed problems.) The regularization parameter is determined first, by the L-curve method or the generalized cross-validation method; the parameter k is then determined by the ridge trace method or the generalized cross-validation method. One regularization matrix is given in advance, usually as a unit matrix; the other can be determined by the natural spline smoothing function method, the time series method, the prior variance, or the prior variance-covariance method. The criterion function is constructed by the Lagrangian extremum method as follows:

Setting the partial derivatives of (10) with respect to the unknowns to zero gives the following normal equation:

This gives the following:

where and .

The variance and deviation are

When the normal matrix is ill-posed or singular, the semiparametric compensated least squares method fails. The generalized compensated least squares estimation method overcomes the ill-posed problem of the normal matrix and still yields a parameter estimation solution.
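The effect of the ridge regularization parameter can be sketched as follows. The example assumes that the generalized criterion adds a ridge term k * X'X with a unit regularization matrix to the compensated least squares criterion, so that k is added to the diagonal of the parametric block of the normal matrix. That form, the function name `gpls_semiparametric`, and the data are assumptions for illustration only.

```python
import numpy as np

def gpls_semiparametric(B, L, R, alpha, k, P=None):
    """Generalized compensated least squares with an assumed ridge term k * X'X.

    Assumed criterion: V'PV + alpha * s'Rs + k * X'X -> min, V = B X + s - L.
    """
    n, t = B.shape
    P = np.eye(n) if P is None else P
    A = np.block([[B.T @ P @ B + k * np.eye(t), B.T @ P],
                  [P @ B, P + alpha * R]])
    rhs = np.concatenate([B.T @ P @ L, P @ L])
    sol = np.linalg.solve(A, rhs)
    return sol[:t], sol[t:]

# Two identical columns make B'B exactly singular (illustrative data).
rng = np.random.default_rng(0)
n = 60
col = rng.normal(size=n)
B = np.column_stack([col, col])
L = 4.0 * col + 0.5 * np.sin(np.arange(n) / 10.0)

G = np.diff(np.eye(n), axis=0)
R = G.T @ G
X_ridge, s_ridge = gpls_semiparametric(B, L, R, alpha=1.0, k=1e-2)
```

With k > 0 the block matrix is positive definite, so a unique solution exists even though the two columns cannot be separated. The ridge term resolves the ambiguity by splitting the effect equally between the two identical columns, which is also how it introduces bias.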

3. Singular Value Decomposition Solutions of Ill-Posed Problems

For the semiparametric model, the parameter estimates and their variance and deviation are given by (6) to (8). To simplify the derivation, the weight matrix, which can always be normalized, is set to the unit matrix, and the singular value decomposition of the coefficient matrix is written as

where the two orthogonal factors hold the left and right singular vectors of the coefficient matrix and the middle factor is the singular value matrix. The singular value decomposition of the Moore-Penrose generalized inverse of the coefficient matrix is then obtained as follows:

The singular value decomposition expressions for the parametric and nonparametric estimates in the semiparametric models are

The variance and deviation of the parameter estimation are

In (18), the regularization parameter is determined by the L-curve method or the generalized cross-validation method, and the regularization matrix can be determined by the natural spline smoothing function method or the time series method according to the actual situation. The diagonal entries of the singular value matrix are the singular values of the design matrix [19, 20]. If the normal equation is ill-posed, some of these singular values are small and close to zero. It can be seen from (18) and (19) that the smaller singular values have a serious impact on the parametric estimation solution: they greatly amplify the estimation variance, which makes the compensated least squares estimate unreliable and the estimated parameters inaccurate.
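The amplification of the variance by small singular values can be illustrated for the plain linear part of the model. The following minimal sketch, with unit weights and a unit variance factor, builds the Moore-Penrose inverse from the SVD and compares the contribution of each singular value to the trace of the inverse normal matrix. The matrix and the function names are hypothetical, and the sketch does not reproduce the semiparametric expressions (18) and (19).

```python
import numpy as np

def svd_pinv(B):
    """Moore-Penrose inverse of a full-column-rank B via its thin SVD."""
    U, sv, Vt = np.linalg.svd(B, full_matrices=False)
    return Vt.T @ np.diag(1.0 / sv) @ U.T, sv

def variance_amplification(sv):
    """trace((B'B)^-1) = sum of 1 / sigma_i^2 for unit weights."""
    return float(np.sum(1.0 / sv**2))

# Two nearly collinear columns give one very small singular value.
rng = np.random.default_rng(1)
a = rng.normal(size=50)
B = np.column_stack([a, a + 1e-3 * rng.normal(size=50)])

B_pinv, sv = svd_pinv(B)
contrib = 1.0 / sv**2                     # per-singular-value variance contribution
share_of_smallest = contrib[-1] / contrib.sum()
```

In this example almost the entire trace comes from the smallest singular value, which is why modifying only the small singular values is enough to control the variance.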

The real cause of the ill-conditioning of the normal matrix is that some singular values of the coefficient matrix are close to zero. To overcome the unwanted influence of the ill-posed design matrix on parameter estimation, we should start by reducing the extent to which the nonzero singular values approach zero. That is, modifying the singular values of the coefficient matrix reduces its ill-conditioning and improves the stability and accuracy of the parametric estimation. For this reason, an improved form of (16) is proposed, from which we can obtain

The singular value decomposition expressions of the parametric and nonparametric estimation solutions of the semiparametric model based on singular value modification are as follows:

The variance and deviation of the parametric estimation are

In (22), the regularization parameter is determined by the L-curve method or the generalized cross-validation method, the biased parameter k is obtained from the parameter estimation solution, and the regularization matrix can be determined by the natural spline smoothing function method or the time series method according to the actual situation. Comparing (23) with (19) and (24) with (20) shows that the solution of the ill-conditioned semiparametric regression model based on singular value modification reduces the variance of the estimator, limits the deviation introduced, and improves the stability of the solution by modifying the singular values of the coefficient matrix. Since the regularization matrix is a unit matrix, the biased parameter modifies every singular value, and the degree of correction is set by the biased parameter k.

4. Solution of Ill-Posed Semiparametric Regression Model Based on Singular Value Modification Restriction

Equation (21) modifies the smaller singular values to suppress their amplification effect on the parameter estimation solution. At the same time, however, it also modifies the larger singular values, which distorts the stable components of the model. An improved singular value decomposition method is designed to avoid this. The main idea is as follows. First, the singular value decomposition of the coefficient matrix is computed, and the smaller singular values are selected from the singular value matrix according to a selection criterion. Then, only the smaller singular values are modified by the biased parameter, which restrains their amplification effect on the parameter estimation solution while leaving the larger singular values unmodified.

For the singular value modification restriction method, a truncation parameter is set and selected according to a decision condition on the singular values. With k as the biased parameter, the corrected singular values are

Then the singular value decomposition expressions of parametric and the nonparametric estimation solutions of the semiparametric models under the limitations of singular value modification can be written as follows:

Its variance and deviation are

In (28) and (29), the regularization parameter is determined by the L-curve method or the generalized cross-validation method, and the regularization matrix can be determined by the natural spline smoothing function method or the time series method according to the actual situation. Equations (19), (23), and (28) show that the influence of ill-conditioning on the estimated variance lies mainly in the amplification of the variance by the smaller singular values; the larger singular values have no adverse effect on the variance. Because the observations carry insufficient information, the ill-conditioning is reflected mainly in the smaller singular values. The solution method for the ill-posed semiparametric regression model based on singular value modification restriction therefore corrects the smaller singular values selectively, according to the selection criterion, and retains the larger ones. Comparing (28) with (23) and (29) with (24) shows that the method proposed in this paper not only reduces the variance efficiently but also limits the damage to information and the deviation introduced.
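The difference between modifying all singular values and modifying only the small ones can be sketched for a plain linear system. The example below rests on two assumptions that are not taken from the text: the modification is of ridge type, replacing 1/sigma by sigma / (sigma^2 + k), and a singular value counts as small when it falls below `tau` times the largest one. The names `filtered_inverse_singular_values`, `svd_solve`, and `tau` are hypothetical, and the selection rule stands in for the paper's own decision condition.

```python
import numpy as np

def filtered_inverse_singular_values(s, k, tau=None):
    """Inverse singular values after an assumed ridge-type modification.

    tau=None modifies every singular value; otherwise only those below
    tau * max(s) are modified and the rest keep the plain inverse 1/s.
    Requires all singular values to be strictly positive.
    """
    s = np.asarray(s, dtype=float)
    inv = 1.0 / s
    small = np.ones_like(s, dtype=bool) if tau is None else s < tau * s.max()
    inv[small] = s[small] / (s[small] ** 2 + k)
    return inv, small

def svd_solve(B, y, k, tau=None):
    """Solve B x ~ y through the SVD with (restricted) modification."""
    U, s, Vt = np.linalg.svd(B, full_matrices=False)
    inv, small = filtered_inverse_singular_values(s, k, tau)
    return Vt.T @ (inv * (U.T @ y)), small

s_demo = [10.0, 1.0, 1e-3]
inv_all, _ = filtered_inverse_singular_values(s_demo, k=1e-4)
inv_restricted, small_mask = filtered_inverse_singular_values(s_demo, k=1e-4, tau=0.01)
```

In the restricted variant the two large singular values keep their exact inverses, so the well-determined components are not biased, while the inverse of the small singular value is damped in both variants.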

5. Comparisons and Analysis of Simulation Examples and Results

Numerical experiments. The data for Experiment 1 come from [9] and are used to analyze the validity of the method when the normal matrix is moderately ill-posed. The data for Experiment 2 are adapted from those of Experiment 1 and are used to analyze the validity of the method when the normal matrix is seriously ill-posed. The data for Experiment 3 are taken from an example in Generalized Measurement Adjustment: the epoch interval is 2 seconds, and five observed satellites and four epochs are used to solve the ambiguity.

Experiment 1. Suppose there is a linear model , and , where is a matrix of dimension 100×2 and , , . Suppose the system error is , where , . The random error is a column vector composed of 100 random numbers that obey . The observed value is .

In the semiparametric regression model, the regularization matrix is constructed from the sum of squares of the differences between adjacent observations. In this calculation example, and must satisfy . Under the semiparametric compensated least squares solution, a unique solution exists, and the condition number of the normal matrix N is . To illustrate the influence of small singular values on parametric estimation, the following five computational schemes are designed.

Scheme 1. PLS estimation, solved by Formula (6); the regularization parameter is determined by the L-curve method.

Scheme 2. GPLS estimation, solved by Formula (12); the regularization parameter is determined by the L-curve method, and the ridge regularization parameter k is determined by the generalized cross-validation method.

Scheme 3. Direct solution of the ill-posed semiparametric regression model based on singular value decomposition, solved by Formula (18); the regularization parameter is determined by the L-curve method.

Scheme 4. Solution of the ill-posed semiparametric regression model based on singular value modification, solved by Formula (22). First, regularization parameter 1 is determined by the L-curve method, and the biased parameter k is obtained from the parameter estimation solution. Second, the biased parameter k is used to correct all singular values. Finally, regularization parameter 2 is determined by the L-curve method.

Scheme 5. The estimation method proposed in this paper, solved by Formula (27). First, regularization parameter 1 is determined by the L-curve method, and the biased parameter k is obtained from the parameter estimation solution. Second, the biased parameter k is used to correct only the smaller singular values. Finally, regularization parameter 2 is determined by the L-curve method.
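The regularization matrix used in this experiment, built from the sum of squared differences between adjacent observations, can be sketched as follows, along with a helper for the condition number of a normal matrix. The function names are hypothetical, and taking the normal matrix to be B'PB is an assumption made for this sketch.

```python
import numpy as np

def adjacent_difference_penalty(n):
    """R = G'G, where G maps s to its adjacent differences s[i+1] - s[i].

    Then s'Rs equals the sum of squared differences between neighbours.
    """
    G = np.diff(np.eye(n), axis=0)   # shape (n-1, n)
    return G.T @ G

def normal_matrix_condition(B, P=None):
    """2-norm condition number of the assumed normal matrix B'PB."""
    P = np.eye(B.shape[0]) if P is None else P
    return float(np.linalg.cond(B.T @ P @ B))

R = adjacent_difference_penalty(100)
```

Because R annihilates constant vectors, it is singular on its own. It penalizes only the roughness of the nonparametric component, not its level.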

The differences of the parameter estimation , norms of X and , norms of and , and unit weight errors calculated by the five schemes are shown in Table 1.

Experiment 2. The estimated solutions of the undetermined parameters in Table 1 show no obvious differences. This is because the ill-posed problem of the normal matrix in that numerical experiment is only moderate, so the method based on singular value modification restriction shows only a weak advantage. For this reason, the example is modified: new true parameter values are adopted, a coefficient matrix is constructed following the Hilbert matrix, and the remaining assumptions are unchanged. The resulting normal matrix has a very large condition number, so the ill-conditioning is severe. The same five schemes are applied to this example, and the relevant calculation results are listed in Table 2.
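The severity of the ill-conditioning that a Hilbert-type coefficient matrix produces can be checked with a few lines of NumPy. The sketch below uses the standard Hilbert entries 1 / (i + j - 1) for one-based indices; the function name and the matrix shapes are hypothetical and are not the dimensions used in the experiment.

```python
import numpy as np

def hilbert_like(rows, cols):
    """Matrix with entries 1 / (i + j - 1) for one-based indices i and j."""
    i = np.arange(rows)[:, None]
    j = np.arange(cols)[None, :]
    return 1.0 / (i + j + 1)   # zero-based indices, hence the +1

H3 = hilbert_like(3, 3)
H5 = hilbert_like(5, 5)
cond_H3 = np.linalg.cond(H3)
cond_H5 = np.linalg.cond(H5)
# The normal matrix H'H squares the condition number of H.
cond_normal_H5 = np.linalg.cond(H5.T @ H5)
```

The condition number grows rapidly with the number of columns, and forming the normal matrix makes it worse still, which is what makes this construction a demanding test case.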

Experiment 3. The coefficient matrix of the error equation is

The true value of the parameter is as follows:

where is the ambiguity parameter and is the coordinate parameter. The observed values are as follows:

The condition number of the normal matrix is $3.2127\times10^{10}$, so the problem is seriously ill-posed. Although some systematic errors in the observations have been corrected by the baseline calculation software, residual systematic errors remain in the baseline calculation and may lead to excessive residuals. Therefore, the above five schemes are also used for calculation. The differences of the parameter estimation , norms of X and , and unit weight errors calculated by the five schemes are shown in Table 3.

The following conclusions can be drawn from Tables 2-3.

(1) When the normal matrix is seriously ill-posed, the solutions of scheme 1 and scheme 3 deviate strongly from the true values, and the separation of the systematic errors degrades. These estimation methods are therefore unsuitable, and the deviation grows with the degree of ill-conditioning: the more seriously ill-posed the normal matrix is, the greater the deviation.

(2) Both scheme 2 and scheme 4 effectively improve on the estimation accuracy of scheme 1 when the normal matrix is ill-posed. However, when the normal matrix is seriously ill-posed, the deviation between the estimated values and the true values remains large, so the solution is still not ideal.

(3) When the normal matrix is ill-posed and scheme 5 is adopted, the calculated result is close to the simulated true value. The less ill-posed the normal matrix is, the smaller the deviation and the closer the estimate is to the true value. The parameter estimates obtained by this method are more accurate than those obtained by the other two methods, and the method is more effective in reducing the variance and deviation.

6. Conclusion

This paper proposes a method for solving the ill-posed semiparametric regression model based on singular value modification restriction. The method combines the characteristics of the truncated singular value method and the singular value correction method; it overcomes the damage caused by the ill-posed matrix and weakens the deviation introduced by biased estimation.

The simulation examples show that the algorithm significantly improves the quality of the solution in terms of mean squared error. It is easy to use in solving practical problems and has good applicability.

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This research was funded by the Shandong Province Higher Educational Science and Technology Program, grant number J18KA195. It was also partially funded by the Natural Science Foundation of Shandong Province, grant number ZR2019PD016, and by the Scientific Research Foundation of Shandong University of Science and Technology for Recruited Talents.