Inverse Problems: Theory and Application to Science and EngineeringView this Special Issue
A New Method for TSVD Regularization Truncated Parameter Selection
The truncated singular value decomposition (TSVD) regularization applied in ill-posed problem is studied. Through mathematical analysis, a new method for truncated parameter selection which is applied in TSVD regularization is proposed. In the new method, all the local optimal truncated parameters are selected first by taking into account the interval estimation of the observation noises; then the optimal truncated parameter is selected from the local optimal ones. While comparing the new method with the traditional generalized cross-validation (GCV) and curve methods, a random ill-posed matrices simulation approach is developed in order to make the comparison as statistically meaningful as possible. Simulation experiments have shown that the solutions applied with the new method have the smallest mean square errors, and the computational cost of the new algorithm is the least.
Ill-posed problem is widespread in the field of geophysical survey, such as GNSS rapid positioning, precise orbit solution of spacecraft, and downward continuation of airborne gravity [1–7]. The least square estimation is not stable in the ill-posed problem and other approaches have to be utilized. Tikhonov  proposed the Tikhonov regularization theory, Hanke  utilized iteration while Walker and Birch  and Hoerl and Kennard  applied ridge regression method. Hańćkowiak  and Lin et al.  also studied this problem. Among those approaches, methods based on singular value decomposition (SVD) have drawn extensive attention, thanks to the numerical stability of their solutions . On that basis, Wiggins  and Xu  intensively studied the truncated singular value decomposition (TSVD) regularization method.
The standard form of the observation equation is (nonstandard form can be transformed to a standard one) as follows: whereis the coefficient matrix, , is the parameter vector, is the observation vector, is theresidual vector,, and .
The condition number of normal matrix () is generally used as the measurement of the ill-posed severity as follows: where and are the maximum and minimum singular values of , respectively.
According to application experiences, if , the system is considered as ill-posed .
In ill-posed problem, some singular values of the coefficient matrix approximate to 0, the least square estimation will enormously amplify the observation noises and degrade the precision. In TSVD regularization, items containing these small singular values are discarded to maintain the stability of the solution. Setting a small positive number as the threshold value, items containing singular values smaller than are discarded; thus, the amplification to the observation noises is restrained. If there are singular values bigger than , the TSVD solution is given as
Normally, the mean square error (MSE) is used to evaluate the quality of ill-posed problem solution. The key point of the TSVD is how to select a proper truncated parameter to get the smallest mean square error of the solution. Many scholars have extensively studied this problem. Golub et al.  proposed GCV method, Hansen [18, 19] and Hansen and O’Leary  made use of the property of the curve to select truncated parameter, and these methods will be introduced in Section 2. A new truncated parameter selecting method will be proposed in Section 3. In Section 4, a random ill-posed matrices simulation approach for simulation experiment will be developed, and the performances of GCV, curve, and the new methods will be compared Section 2.
2. Traditional Truncated Parameter Selecting Method
2.1. GCV Method
Based on statistical point of view, Golub et al.  proposed GCV method for truncated parameter selection. In linear system , assuming that is the matrix mapping observation vector to regularized solution , . GCV method is equal to selecting a truncated parameter to minimize the GCV function, which is defined as where “” denotes the trace of a matrix.
The disadvantage of GCV method is that the GCV function converges very slowly in some cases. It may lead to the GCV function minimization while is equal to , in which situation, the truncated parameter cannot be selected, and the GCV method fails.
2.2. Curve Method
Hansen [18, 19] extensively studied the application of curve method in regularization. With the variation of , the discreet points and can be fitted with a cubic spline curve . He defined the maximum curvature point of this curve by (5)  as follows: where and are the independent and dependent variables of the cubic spline function, respectively.
The original discreet point which lies closest to the maximum curvature point from the left side is selected. Its truncated serial number is the truncated parameter asked for.
Further descriptions of the curve method can be seen in Hansen [18, 19]. Hansen  indicated that the truncated parameter selected by the curve method is not the optimum parameter but the asymptotically optimum one.
3. A New Truncated Parameter Selecting Method
In this section, we develop a new approach to selecting the truncated parameter. We focus on the MSE of the solution, making some proper estimation, and finally derive the conditions that the optimal truncated parameter should meet. In this section and the subsequent sections, is called the optimal value if is the smallest in all the ; is called the local optimal value if is the smallest just in , , and . Before the optimal value selection, we first select all the local optimal values as the candidates.
3.1. Local Optimal Truncated Parameter Selection
The MSE of TSVD solution is
Taking the singular values decomposition into consideration, the least square solution of the observation equation can be written as
As mentioned above, if is the local optimal value, the following inequalities are satisfied:
Denoting , inequalities (10) can be rewritten as
As and are unknown, we should estimate them first. Premultiplying matrix to both sides of (7), we obtain a new vector denoted as as follows: where is the th element of .
Then, we estimate ; an unbiased estimator of can be obtained from least-square estimation as
We have to treat cautiously, because it is just an approximate value of . Replacing with may cause some misjudgments. For example, in the situation shown in Figure 1, is the local optimal value since (denoting ). But, if we replace with , the opposite judgment will be made since .
In order to overcome this shortcoming, the interval estimation of is used instead of in this algorithm. Assuming that is the confidence interval of under the confidence level , we replace with and in the first and second inequality in inequalities (15), respectively. By this means, a fault-tolerance interval between the judgment thresholds is given. As seen in Figure 1, by properly selecting and , there are , and so, is judged as the local optimal value.
To get a proper interval estimation of , analysis goes as follows.
The estimation of the observation noises is where (note that among the elements, onlyof them are independent).
Thus, according to the definition of distribution, . Since is the confidence interval ofunder the confidence level , we yield
Considering the distribution, (18) can be transformed as follows:
Equation (20) is the estimator of and , the final local optimal truncated parameter selecting inequalities reads as follows:
There is an undetermined parameter in inequalities (21): the confidence level is . This parameter adjusts the size of the fault-tolerance interval. Enlarging will decrease the fault-tolerance interval and the risk of leaking local optimal value increases. While decreasing will increase the fault-tolerance interval, it will cause the judgment efficiency to decline. In order to balance the fault-tolerance and the judgment efficiency of inequalities (21), we first set in this algorithm. If no truncated parameter matches inequalities (21), the value of will be decreased until one or more truncated parameters can match it.
3.2. Optimal Truncated Parameter Selection
If there is only one local optimal value, we regard it as the optimal value. If not, assuming that there are local optimal values , It is essential to select the optimal value from them. Analysis goes as follows.
Assuming that and are both local optimal values, , minus is denoted as , we derive
According to inequalities (23), to determine the sign of , we take proper estimations of and simultaneously.
If we regard as an unknown random vector, the elements of which are independent and have the same statistics weight. That is, . Then, it yields
As and are both local optimal values, the regularization solutions and are much more stable than the least square solution. We estimate by and by its unbiased estimator , which yields
According to (26), if , we regard as the better truncated parameter and vice versa. After comparing all the local optimal values utilizing (26), we can get the minimized MSE of all the solutions, and the optimal truncated parameter is determined.
3.3. Algorithm Flow
In Sections 3.1 and 3.2, we have discussed the conditions that the optimal truncated parameter should meet. And then, the equations to select the local optimal and the optimal truncated parameter are deduced in detail. Although the derivation process is somehow complicated and trivial, the execution of this algorithm is quite simple. In this subsection, we summarize the steps of the algorithm flow to make it clear and more readable. Begin Step 1. Set the confidence level of interval estimation . Step 2. Calculate the two thresholds and using (20). Step 3. For each truncated parameters from 1 to , calculate inequalities (21). If the inequalities are satisfied, select the corresponding truncated parameter as a local optimal truncated parameter. Step 4. If no truncated parameter matches inequalities (21), set , return to Step 2; else, skip this step. Step 5. Estimate the variance of observation noise by (16). Step 6. If there is only one local optimal parameter, it is the optimal truncated parameter simultaneously; else, make pairwise comparisons of the local optimal truncated parameters according to (26), until getting the one minimizing the MSE as the optimal truncated parameter. End
4. Random Simulation Experiment
4.1. The Approach of Random Simulation
Although we can use a particular coefficient matrix to compare different truncated parameter selecting methods, such comparisons are of limited value . Because the doubt to the conclusions drawn from them will be threefold: (i) whether the conclusions are still valid if the dimensions of the coefficient matrix have changed; (ii) whether the conclusions are still valid if the condition number and singular values distribution of the coefficient matrix have changed; (iii) whether the conclusions are valid if they are applied to another coefficient matrix which has the same dimensions and condition number with .
In order to make the comparison of the three methods as persuasive as possible, we have to simulate the coefficient matrices by taking into account the three issues mentioned above. According to (2), ; thus, the simulation problem has turned into how to design the matrices , , and and their dimensions.
Theoretically, the dimensions and can be set any arbitrary positive integer which satisfy: . Considering the computing burden, we set and , each of them is randomly generated in their interval. As it is mentioned in Section 1, a coefficient matrix is regarded as ill-posed if . But the condition number cannot be infinite; otherwise, the ill-posed matrix will be turned into rank defect matrix. In the simulation, the mathematical expectation of is uniformly distributed over. Furthermore, about the singular values, we assume that the mathematical expectation of the sum of the logarithms of all the singular values is 0. Finally, the equations set can be written as follows: where the random variables and conform to uniform distribution, ; is also a random variable, .
Utilizing the Gram-Schmidt method, unitary matrix (or ) can be derived from an -dimension (or -dimension) random full rank matrix. For instance, assume that is an -dimension random full rank matrix and are column vectors of ; thus, the column vectors of are as follows:
To make the simulation meaningful, we sample 100 random generated coefficient matrices for our simulation experiment. The dimensions and condition numbers of these matrices are shown in Figures 2(a) and 2(b), respectively.
4.2. Comparisons of Truncated Parameter Selecting Methods
In this section, we will compare the performances of TSVD solutions applied with the GCV, curve, and the new methods proposed in this paper. The criterion is the mean square errors of the solutions. In the simulation, we set as -dimension vector, all elements of which are 1. The elements of residual vector are Gaussian white noises, generated by randomizer, . To enrich our comparison, two observation noise levels are used in our simulation, the variances of which are and . In consideration of the randomness of , objective comparisons cannot be obtained if we only do the experiment for once. In our simulation, 1000 times independent repeated experiments have been done for each coefficient matrix and each observation level. The mean square errors which are used in our comparisons are the mean value of those in the 1000 experiments.
(i) Results Comparison between GCV and the New Methods. The comparisons between GCV and the new methods under the observation noise levels and are shown in Figures 3(a) and 3(b), respectively. In order to gain further details of the comparisons, we draw the -value curves of their mean square errors. The -value curves of Figures 3(a) and 3(b) are shown in Figures 3(c) and 3(d), respectively; the negative value means that the new method has smaller mean square errors. Since the GCV method may fail in some cases, the solutions shown in Figures 3(a) to 3(d) do not contain those failure cases.
Seen in Figures 3(a) and 3(b), the solid lines are always beneath the dotted lines. In Figures 3(c) and 3(d), most values are negative. That means that the solutions of TSVD regularization applied with the new method have smaller mean square errors.
(ii) Results Comparison between Curve and the New Methods. Similar to previous comparisons, the comparisons between curve and the new methods under two observation noise levels are shown in Figures 4(a) and 4(b), respectively. The -value curves of Figures 4(a) and 4(b) are shown in Figures 4(c) and 4(d), respectively; the negative value means that the new method has smaller mean square errors.
As seen in Figures 4(a) and 4(b), the solid lines are also always beneath the dotted lines; and in Figures 4(c) and 4(d), most values are negative. So we can verdict that the new method has a better performance in truncated parameter selection than curve method.
(iii) Time Consuming Comparisons. Besides the quality of results, computational cost is another important aspect to evaluate the effectiveness of an algorithm. In the simulations, we record the time consuming of the algorithms for each coefficient matrix. The average time consuming of these three algorithms are shown in Figure 5(a). (Note that these time consuming values are the average values of 1000 times independent repeated experiments.) Figure 5(b) has shown the time consuming of the new algorithm under two different noise levels. The observation noise levels and are, respectively, labeled “Noise level 1” and “Noise level 2” in Figure 5(b).
From Figures 5(a) and 5(b), we can see that the computational costs of these algorithms are all small. Nevertheless, the cost of the new method is obviously smaller than that of the other two by dozens of times. With the increasing of the matrices’ dimension or the count of times, this advantage will be more and more obvious. Another conclusion we can get from Figure 5(b) is that the observation noises only have a little influences with the computational costs.
A new truncated parameter selecting method is proposed in this paper. Focusing on how to decrease the MSE of the solution, we divide the truncated parameter selection into two parts: local optimal and optimal truncated parameter selection. Detailed analyses are made and finally the equations to select the local optimal and the optimal truncated parameter are obtained. Although the derivation of the new algorithm is somehow trivial, its execution is quite simple and efficient.
The key point of local optimal truncated parameter selection is the estimation of the observation noises. To overcome the inaccuracy of point estimation, we apply the interval estimation with the confidence level . The parameter balances the fault-tolerance and the judgment efficiency of local truncated parameter selection. In this algorithm, is firstly set as 0.5, it can adjust automatically according to the situation of local truncated parameter selection.
The selection of optimal truncated parameter is based on the local optimal truncated parameters. We regard the parameter vector as a random variant vector, making use of the stability of its regularization solutions, and finally derive the equations of optimal truncated parameter selection.
A random ill-posed matrices simulation approach is developed and a great deal of experiments is made in Section 4. The results have shown that the solutions applied with the new method have the smallest mean square errors. Meanwhile, the computational cost of the new method is the least of the three.
This work was financially supported by the National Key Basic Research and Development Program (2012CB719902) and the National Natural Science Foundation of China (nos. 41274013, 41374082, and 61203193).
H. Ahmadian, J. E. Mottershead, and M. I. Friswell, “Regularisation methods for finite element model updating,” Mechanical Systems and Signal Processing, vol. 12, no. 1, pp. 47–64, 1998.View at: Google Scholar
S. L. Han and T. Kinoshita, “Reconstruction of initial wave field of a nonsteady-state wave propagation from limited measurements at a specific spatial point based on stochastic inversion,” Mathematical Problems in Engineering, vol. 2013, Article ID 286247, 10 pages, 2013.View at: Publisher Site | Google Scholar | MathSciNet
Z. Martinec, “Stability investigations of a discrete downward continuation problem for geoid determination in the Canadian Rocky Mountains,” Journal of Geodesy, vol. 70, no. 11, pp. 805–828, 1996.View at: Google Scholar
P. X. Xu and R. Rummel, “A simulation study of smoothness methods in recovery of regional gravity fields,” Geophysical Journal International, vol. 117, no. 2, pp. 472–486, 1994.View at: Google Scholar
A. N. Tikhonov, “Solution for incorrectly for mulated problems and the regularization method,” Soviet Mathematics. Doklady, vol. 4, pp. 1035–1038, 1963.View at: Google Scholar
E. Walker and J. B. Birch, “Influence measures in ridge regression,” Technometrics, vol. 30, pp. 377–398, 1988.View at: Google Scholar
A. E. Hoerl and R. W. Kennard, “Ridge regression: biased estimation for non-orthogonal problems,” Technometrics, vol. 12, pp. 55–67, 1970.View at: Google Scholar
J. Hańćkowiak, “Some insights into the regularization of III-posed problems,” Mathematical Problems in Engineering, vol. 5, no. 2, pp. 161–171, 1999.View at: Google Scholar
R. A. Wiggins, “The general inverse problem: implication of surface waves and free oscillation for earth structure,” Reviews of Geophysics and Space Physics, vol. 10, pp. 251–285, 1972.View at: Google Scholar
P. Xu, “Truncated SVD methods for discrete linear ill-posed problems,” Geophysical Journal International, vol. 135, no. 2, pp. 505–514, 1998.View at: Google Scholar
S. G. Wang, Linear Model Theory and Application, Anhui Education Publishing House, Hefei, China, 1987.View at: MathSciNet