Continuous Regularized Least Squares Polynomial Approximation on the Sphere
In this paper, we consider the problem of polynomial reconstruction of smooth functions on the sphere from their noisy values at discrete nodes on the two-sphere $\mathbb{S}^2$. The method considered in this paper is a weighted least squares form with a continuous regularization term. Preliminary error bounds in terms of the regularization parameter, noise scale, and smoothness are proposed under two assumptions: a bound on the mesh norm of the data point set and a perturbation bound on the weights. Condition numbers of the linear systems derived from the problem are discussed. We also show that spherical $t_\varepsilon$-designs, which can be seen as a generalization of spherical $t$-designs, are well suited to this model. Numerical results show that the method performs well in terms of both computation time and approximation quality.
In recent decades, the problem of finding a polynomial approximation with degree up to an integer $L$ of a continuous function on the sphere has received much attention [1–6]. Given noisy data $f^{\delta}(x_j) = f(x_j) + \delta_j$ at points $x_j \in \mathbb{S}^2$, $j = 1, \dots, N$, regularized least squares schemes are usually considered, whereas the regularization varies for different tasks. In this paper, we consider approximating the function $f$ with a continuous regularized least squares scheme (CRLS) as

$$\min_{p \in \mathbb{P}_L} \sum_{j=1}^{N} w_j \big(p(x_j) - f^{\delta}(x_j)\big)^2 + \lambda \|p\|_{H^s(\mathbb{S}^2)}^2, \qquad (1)$$

where $w_j$, $j = 1, \dots, N$, are positive scalars as weights of the data fitting term. For normalization, we can always assume that $\sum_{j=1}^{N} w_j = 4\pi$. Moreover, $\lambda > 0$ is the regularization parameter that balances the data fitting term and the regularization term, and $\mathbb{P}_L$ denotes the polynomial space

$$\mathbb{P}_L := \operatorname{span}\{Y_{l,m} : m = -l, \dots, l,\ l = 0, \dots, L\}, \qquad (2)$$

where $Y_{l,m}$ is a fixed $L_2$-orthonormal real spherical harmonic of degree $l$ and order $m$, which means

$$\int_{\mathbb{S}^2} Y_{l,m}(x)\, Y_{l',m'}(x)\, d\omega(x) = \delta_{ll'}\,\delta_{mm'}, \qquad (3)$$

where $\omega$ denotes the surface measure on $\mathbb{S}^2$ and $\delta_{ll'}$ is the Kronecker delta. It is well known that the dimension of $\mathbb{P}_L$ is

$$d_L := \dim \mathbb{P}_L = (L+1)^2. \qquad (4)$$
The Sobolev space $H^s(\mathbb{S}^2)$ is defined for $s \ge 0$ as the set of all functions $f \in L_2(\mathbb{S}^2)$ whose Laplace–Fourier coefficients

$$\hat{f}_{l,m} := \int_{\mathbb{S}^2} f(x)\, Y_{l,m}(x)\, d\omega(x) \qquad (5)$$

satisfy

$$\sum_{l=0}^{\infty}\sum_{m=-l}^{l} \big(a_l^{(s)}\big)^{-1} \big|\hat{f}_{l,m}\big|^2 < \infty, \qquad (6)$$

where $a_l^{(s)} > 0$, and the norm of $H^s(\mathbb{S}^2)$ can be defined as

$$\|f\|_{H^s(\mathbb{S}^2)} := \Big( \sum_{l=0}^{\infty}\sum_{m=-l}^{l} \big(a_l^{(s)}\big)^{-1} \big|\hat{f}_{l,m}\big|^2 \Big)^{1/2}, \qquad (7)$$

where the positive parameters $a_l^{(s)}$ are decreasing in $l$ and satisfy

$$a_l^{(s)} \sim (l+1)^{-2s}. \qquad (8)$$
Here, we say that $a_l \sim b_l$ if there exist positive constants $c_1$ and $c_2$ independent of $l$ such that $c_1 b_l \le a_l \le c_2 b_l$ for all $l$. Obviously, by letting $s = 0$ and $a_l^{(0)} \equiv 1$, we can obtain $H^0(\mathbb{S}^2) = L_2(\mathbb{S}^2)$. CRLS obtains an approximation of $f$ by fitting the data values $f^{\delta}(x_j)$, $j = 1, \dots, N$, while at the same time keeping the approximation smooth. The motivation often comes from geophysical problems such as the satellite gravity gradiometry problem (, pp. 120, 262), in which a polynomial representation of the gravitational potential that is smoother than the Earth's surface is to be found. On the other hand, CRLS can be seen as a generalization or approximation of some smoothness-oriented regularized least squares models (see the appendix for discussions on some polynomial approximation models related to this paper).
Polynomial approximation problems arise in a variety of engineering areas, for which different models are used. To recover smooth functions on $\mathbb{S}^2$, a discrete regularized least squares method (DRLS) is proposed in [1, 5] as

$$\min_{p \in \mathbb{P}_L} \sum_{j=1}^{N} w_j \big(p(x_j) - f^{\delta}(x_j)\big)^2 + \lambda \big\|\mathbf{R}\,\mathbf{p}_X\big\|_2^2, \qquad (9)$$

where $\mathbf{p}_X := (p(x_1), \dots, p(x_N))^T$ and $\mathbf{R}$ is a linear "penalization" operator, which can be chosen in different ways. The points used form a positive-weight cubature formula that is exact for all spherical polynomials of degree up to $2L$, which guarantees the uniqueness of the solution and furthermore yields its explicit form. A reconstruction error bound is obtained for the case that the nodes and weights establish a cubature formula that is exact for all spherical polynomials of degree up to $2L$. To the best of our knowledge, the number of points needed to construct a $2L$-degree polynomial-exact cubature on $\mathbb{S}^2$ is around $2L^2$, which is bigger than the dimension $(L+1)^2$ of the polynomial space in most practical cases. However, in real applications, it is usually costly to acquire as many data points as expected, which motivates us to study the case where the number of data points is not large. In this paper, we do not assume that $N \ge (L+1)^2$; that is, the number of data values need not reach or exceed the dimension of the polynomial space $\mathbb{P}_L$. The regularization term still guarantees the uniqueness of the minimizer of CRLS when $N < (L+1)^2$ and avoids overfitting.
The rest of this paper is organized as follows. In Section 2, we discuss the solution and approximation error of CRLS and the condition number of the linear system it derives. Based on the results, we further discuss spherical $t_\varepsilon$-designs applied to this model. Numerical experiments are presented in Section 3, in which the model is tested for its efficiency and accuracy.
2. Solution, Approximation Error, and Condition Number
2.1. Solution of CRLS
For simplicity, denote $f_j^{\delta} := f^{\delta}(x_j)$ for $j = 1, \dots, N$. A matrix form of problem (1) could be written as

$$\min_{\boldsymbol{\alpha} \in \mathbb{R}^{d_L}} \big(\mathbf{Y}^T \boldsymbol{\alpha} - \mathbf{f}^{\delta}\big)^T \mathbf{W} \big(\mathbf{Y}^T \boldsymbol{\alpha} - \mathbf{f}^{\delta}\big) + \lambda\, \boldsymbol{\alpha}^T \mathbf{D}\, \boldsymbol{\alpha}, \qquad (10)$$

where $\mathbf{Y} \in \mathbb{R}^{d_L \times N}$ is defined as $\mathbf{Y}_{(l,m),j} := Y_{l,m}(x_j)$, $\mathbf{W} := \operatorname{diag}(w_1, \dots, w_N)$, $\mathbf{f}^{\delta} := (f_1^{\delta}, \dots, f_N^{\delta})^T$, and

$$\mathbf{D} := \operatorname{diag}\Big(\big(a_0^{(s)}\big)^{-1}, \big(a_1^{(s)}\big)^{-1}, \big(a_1^{(s)}\big)^{-1}, \big(a_1^{(s)}\big)^{-1}, \dots, \big(a_L^{(s)}\big)^{-1}\Big), \qquad (11)$$

in which $\big(a_l^{(s)}\big)^{-1}$ is repeated $2l+1$ times. The first-order optimality condition of (10) yields the linear system

$$\big(\mathbf{Y}\mathbf{W}\mathbf{Y}^T + \lambda \mathbf{D}\big)\boldsymbol{\alpha} = \mathbf{Y}\mathbf{W}\mathbf{f}^{\delta}. \qquad (12)$$

Note that for any $\mathbf{v} \in \mathbb{R}^{d_L}$,

$$\mathbf{v}^T \mathbf{Y}\mathbf{W}\mathbf{Y}^T \mathbf{v} = \big\|\mathbf{W}^{1/2}\mathbf{Y}^T \mathbf{v}\big\|_2^2 \ge 0, \qquad (13)$$

so $\mathbf{Y}\mathbf{W}\mathbf{Y}^T$ is positive semidefinite.
Remark 1. For $\lambda > 0$, equation (12) will always have a unique solution by the fact that $\mathbf{Y}\mathbf{W}\mathbf{Y}^T$ is positive semidefinite and $\mathbf{D}$ is positive definite.
By the uniqueness of the solution of problem (1), we denote the minimizer of it by

$$p_{L,\lambda}^{\delta} := \sum_{l=0}^{L}\sum_{m=-l}^{l} \alpha_{l,m} Y_{l,m}, \qquad (14)$$

where $\boldsymbol{\alpha} = (\alpha_{0,0}, \dots, \alpha_{L,L})^T$ solves (12). Furthermore, when $\lambda = 0$, problem (1) may have multiple solutions. In this case, $p_{L,0}^{\delta}$ represents the uniquely determined minimizer in $\mathbb{P}_L$ of (1) with minimal $H^s(\mathbb{S}^2)$ norm.
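As an illustration of how the linear system can be assembled and solved in practice, the following sketch builds a real spherical harmonic matrix and solves the Gram-plus-diagonal system numerically. It assumes the concrete choice $a_l^{(s)} = (l+1)^{-2s}$, and the helper names (`real_sph_harm_matrix`, `crls_solve`) are illustrative, not from the paper.

```python
import math
import numpy as np
from scipy.special import lpmv  # associated Legendre functions


def real_sph_harm_matrix(L, theta, phi):
    """Y[(l,m) row, j col]: real orthonormal spherical harmonics at the points
    (theta_j, phi_j); theta is the polar angle, phi the azimuth; rows are
    ordered (0,0), (1,-1), (1,0), (1,1), ..., (L,L)."""
    ct, rows = np.cos(theta), []
    for l in range(L + 1):
        for m in range(-l, l + 1):
            a = abs(m)
            c = math.sqrt((2 * l + 1) / (4 * math.pi)
                          * math.factorial(l - a) / math.factorial(l + a))
            P = c * lpmv(a, l, ct)
            if m == 0:
                rows.append(P)
            elif m > 0:
                rows.append(math.sqrt(2) * P * np.cos(m * phi))
            else:
                rows.append(math.sqrt(2) * P * np.sin(a * phi))
    return np.array(rows)


def crls_solve(L, s, lam, theta, phi, w, f_delta):
    """Solve (Y W Y^T + lam * D) alpha = Y W f_delta, with the assumed
    diagonal D = diag((l+1)^{2s}), each entry repeated 2l+1 times."""
    Y = real_sph_harm_matrix(L, theta, phi)
    d = np.array([(l + 1) ** (2 * s) for l in range(L + 1)
                  for _ in range(2 * l + 1)], dtype=float)
    YW = Y * w                       # scale the columns by the weights
    H = YW @ Y.T + lam * np.diag(d)  # positive definite for lam > 0
    alpha = np.linalg.solve(H, YW @ f_delta)
    return alpha, Y
```

For data sampled from the constant function $f \equiv 1 = \sqrt{4\pi}\, Y_{0,0}$ and a small $\lambda$, the recovered coefficient vector concentrates on $\alpha_{0,0} \approx \sqrt{4\pi}$.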
2.2. Approximation Error
For the case that $N \ge (L+1)^2$ and $X_N := \{x_1, \dots, x_N\}$ is a spherical $t$-design with $t \ge 2L$ or a cubature rule with $2L$-degree polynomial exactness, the error bounds are discussed in [1, 5]. Model (1) can be seen as a variant of the models considered in [1, 5], the difference being that we choose a different regularization term. Furthermore, to establish the error bounds, we do not need any assumption on the polynomial exactness of the data point set. The only assumption we need is on the mesh norm of the data point set $X_N$, which can be defined as follows.
Definition 1. (mesh norm). The mesh norm of a point set $X_N \subset \mathbb{S}^2$ is

$$h_{X_N} := \max_{x \in \mathbb{S}^2} \min_{1 \le j \le N} d(x, x_j), \qquad (15)$$

where $d(x, y) := \arccos(x \cdot y)$ is the geodesic distance on $\mathbb{S}^2$. The mesh norm could be regarded as the largest geodesic distance of all points on $\mathbb{S}^2$ to their nearest point in $X_N$, or as the radius of the largest spherical cap that does not contain any $x_j$. The results in this section are based on a slight generalization of Theorem 5.1 in , which presents an $L_2$ sampling inequality on the sphere $\mathbb{S}^2$.
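The mesh norm can be estimated numerically by maximizing the distance-to-nearest-node function over a dense set of candidate points; the following sketch (the function name and candidate count are illustrative) produces a lower bound that approaches $h_{X_N}$ as the candidate set grows.

```python
import numpy as np


def mesh_norm_estimate(X, n_cand=20000, seed=1):
    """Estimate h_X = max_{x in S^2} min_j d(x, x_j) for unit vectors X
    (shape (N, 3)) by sampling many uniform candidate points on the sphere."""
    rng = np.random.default_rng(seed)
    C = rng.normal(size=(n_cand, 3))
    C /= np.linalg.norm(C, axis=1, keepdims=True)  # uniform points on S^2
    cos_d = np.clip(C @ X.T, -1.0, 1.0)
    dist = np.arccos(cos_d)          # geodesic distances, shape (n_cand, N)
    return dist.min(axis=1).max()    # nearest node per candidate, then worst case
```

For example, for the two-point set consisting of the north and south poles, the mesh norm is $\pi/2$ (attained on the equator), and the estimate reproduces this closely from below.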
Theorem 1. Let $s > 1$. There exist positive constants $C_1$ and $h_0$ (dependent on $s$) such that for any finite point set $X_N$ of distinct points on $\mathbb{S}^2$ with $h_{X_N} \le h_0$ and for any function $f \in H^s(\mathbb{S}^2)$,

$$\|f\|_{L_2(\mathbb{S}^2)} \le C_1 \Big( h_{X_N}^{s}\, \|f\|_{H^s(\mathbb{S}^2)} + h_{X_N} \Big(\sum_{j=1}^{N} |f(x_j)|^2\Big)^{1/2} \Big). \qquad (16)$$
For the setting of weighted least squares, we slightly revise the above theorem as follows.
Theorem 2. Let $s > 1$, $c > 0$, and $\mathbf{w} \in \mathbb{R}^N$ with $w_j \ge c\, h_{X_N}^2$ for $j = 1, \dots, N$. There exist positive constants $C_2$ and $h_0$ (dependent on $s$ and $c$) such that for any finite point set $X_N$ of distinct points on $\mathbb{S}^2$ with $h_{X_N} \le h_0$ and for any function $f \in H^s(\mathbb{S}^2)$,

$$\|f\|_{L_2(\mathbb{S}^2)} \le C_2 \Big( h_{X_N}^{s}\, \|f\|_{H^s(\mathbb{S}^2)} + \Big(\sum_{j=1}^{N} w_j |f(x_j)|^2\Big)^{1/2} \Big). \qquad (17)$$
Theorem 3. Let $s > 1$ and $c > 0$. For a point set $X_N$ of distinct points, a weight vector $\mathbf{w}$ with $\sum_{j=1}^{N} w_j = 4\pi$ and $w_j \ge c\, h_{X_N}^2$, and a function $f \in \mathbb{P}_L$ with values $f(x_j)$ on $X_N$, and given a vector of deterministic noise $\boldsymbol{\delta} = (\delta_1, \dots, \delta_N)^T$ with $\|\boldsymbol{\delta}\|_{\infty} \le \delta$ and given $\lambda > 0$, there exist positive constants $C$ and $h_0$ (dependent on $s$ and $c$) such that if the point set satisfies $h_{X_N} \le h_0$, the following holds:

$$\big\|p_{L,\lambda}^{\delta} - f\big\|_{L_2(\mathbb{S}^2)} \le C \Big( \big(h_{X_N}^{s} + \lambda^{1/2}\big)\, \|f\|_{H^s(\mathbb{S}^2)} + \big(1 + h_{X_N}^{s}\, \lambda^{-1/2}\big)\, \delta \Big). \qquad (18)$$
Proof. For simplicity, we define the norm $\|\cdot\|_{\mathbf{W}}$ as $\|\mathbf{v}\|_{\mathbf{W}} := (\mathbf{v}^T \mathbf{W} \mathbf{v})^{1/2}$ for a vector $\mathbf{v} \in \mathbb{R}^N$. Note that the norm is well defined since $\mathbf{W}$ is positive definite. Define the quadratic functional $Q$ as

$$Q(p) := \big\|\mathbf{p}_X - \mathbf{f}_X^{\delta}\big\|_{\mathbf{W}}^2 + \lambda \|p\|_{H^s(\mathbb{S}^2)}^2, \qquad p \in \mathbb{P}_L. \qquad (19)$$

Let $\mathbf{f}_X$, $\mathbf{p}_X$ be the vectors of function values of $f$, $p_{L,\lambda}^{\delta}$ at the point set $X_N$, respectively, and $\mathbf{f}_X^{\delta} := \mathbf{f}_X + \boldsymbol{\delta}$. By sampling inequality (17) in Theorem 2, we can obtain that there exists a positive constant $C_2$ such that

$$\big\|p_{L,\lambda}^{\delta} - f\big\|_{L_2(\mathbb{S}^2)} \le C_2 \Big( h_{X_N}^{s} \big\|p_{L,\lambda}^{\delta} - f\big\|_{H^s(\mathbb{S}^2)} + \|\mathbf{p}_X - \mathbf{f}_X\|_{\mathbf{W}} \Big). \qquad (20)$$

Then, by the fact that $p_{L,\lambda}^{\delta}$ is the minimizer of $Q$ and $f \in \mathbb{P}_L$, the following holds:

$$Q\big(p_{L,\lambda}^{\delta}\big) \le Q(f) = \big\|\mathbf{f}_X - \mathbf{f}_X^{\delta}\big\|_{\mathbf{W}}^2 + \lambda \|f\|_{H^s(\mathbb{S}^2)}^2. \qquad (21)$$

Moreover, we have

$$\big\|\mathbf{f}_X - \mathbf{f}_X^{\delta}\big\|_{\mathbf{W}} = \|\boldsymbol{\delta}\|_{\mathbf{W}} \le \|\boldsymbol{\delta}\|_{\infty} \Big(\sum_{j=1}^{N} w_j\Big)^{1/2} \le \sqrt{4\pi}\, \delta. \qquad (22)$$

Thus, the term $\|\mathbf{p}_X - \mathbf{f}_X\|_{\mathbf{W}}$ can be bounded as

$$\|\mathbf{p}_X - \mathbf{f}_X\|_{\mathbf{W}} \le \big\|\mathbf{p}_X - \mathbf{f}_X^{\delta}\big\|_{\mathbf{W}} + \big\|\mathbf{f}_X^{\delta} - \mathbf{f}_X\big\|_{\mathbf{W}} \le Q\big(p_{L,\lambda}^{\delta}\big)^{1/2} + \sqrt{4\pi}\, \delta \le 2\sqrt{4\pi}\, \delta + \lambda^{1/2} \|f\|_{H^s(\mathbb{S}^2)}. \qquad (23)$$

Substitute inequalities (21) and (23) into (20) with

$$\big\|p_{L,\lambda}^{\delta} - f\big\|_{H^s(\mathbb{S}^2)} \le \big\|p_{L,\lambda}^{\delta}\big\|_{H^s(\mathbb{S}^2)} + \|f\|_{H^s(\mathbb{S}^2)} \le \Big(\frac{Q\big(p_{L,\lambda}^{\delta}\big)}{\lambda}\Big)^{1/2} + \|f\|_{H^s(\mathbb{S}^2)} \le \sqrt{4\pi}\, \lambda^{-1/2} \delta + 2\|f\|_{H^s(\mathbb{S}^2)},$$

and we have

$$\big\|p_{L,\lambda}^{\delta} - f\big\|_{L_2(\mathbb{S}^2)} \le C_2 \Big( h_{X_N}^{s} \big( 2\|f\|_{H^s(\mathbb{S}^2)} + \sqrt{4\pi}\, \lambda^{-1/2} \delta \big) + \lambda^{1/2} \|f\|_{H^s(\mathbb{S}^2)} + 2\sqrt{4\pi}\, \delta \Big) \le C \Big( \big(h_{X_N}^{s} + \lambda^{1/2}\big)\|f\|_{H^s(\mathbb{S}^2)} + \big(1 + h_{X_N}^{s}\lambda^{-1/2}\big)\delta \Big),$$

where the last inequality is derived by the fact that all the constant factors are bounded by $2\sqrt{4\pi}$, so that one can take $C := 2\sqrt{4\pi}\, C_2$.
From the theorem, it is easy to see that the data point set and the weight need to be chosen properly to improve the approximation quality of the model. The data set should have its mesh norm as small as possible, and each weight $w_j$ should be near its mean value $4\pi/N$. Furthermore, one can assume that $X_N$ and $\mathbf{w}$ establish a cubature rule with polynomial exactness, which gives a fast way to find the solution of the problem. From this point of view, spherical $t_\varepsilon$-designs, which are proposed and studied in [6, 10–14], are a suitable choice in this paper.
Definition 2. (spherical $t_\varepsilon$-design). A spherical $t_\varepsilon$-design with $\varepsilon \in [0, 1)$ on $\mathbb{S}^2$ is a set of $N$ points $X_N = \{x_1, \dots, x_N\} \subset \mathbb{S}^2$ such that the quadrature rule

$$\sum_{j=1}^{N} w_j\, p(x_j) = \int_{\mathbb{S}^2} p(x)\, d\omega(x) \qquad (25)$$

is exact for all spherical polynomials $p$ of degree at most $t$, with the weight vector $\mathbf{w}$ satisfying

$$\Big| w_j - \frac{4\pi}{N} \Big| \le \varepsilon\, \frac{4\pi}{N}, \qquad j = 1, \dots, N. \qquad (26)$$

In particular, for $\varepsilon = 0$ the weights are equal and the definition reduces to that of a spherical $t$-design. Naturally, we can choose a spherical $t_\varepsilon$-design as the data set $X_N$ and its corresponding weight as the weight $\mathbf{w}$ in problem (1). Moreover, solving problem (1) becomes trivial when $t$ is large enough.
Theorem 4. Let $X_N$ be a spherical $t_\varepsilon$-design with corresponding weight $\mathbf{w}$ and $t \ge 2L$, and let $\lambda > 0$. Then problem (1) has the closed-form solution (28) below.

Proof. Note that when $X_N$ is a spherical $t_\varepsilon$-design with $t \ge 2L$,

$$\big(\mathbf{Y}\mathbf{W}\mathbf{Y}^T\big)_{(l,m),(l',m')} = \sum_{j=1}^{N} w_j\, Y_{l,m}(x_j)\, Y_{l',m'}(x_j) = \int_{\mathbb{S}^2} Y_{l,m}(x)\, Y_{l',m'}(x)\, d\omega(x) = \delta_{ll'}\,\delta_{mm'}, \qquad (27)$$

where the second equality is established by the exactness of the quadrature rule and the fact that $Y_{l,m} Y_{l',m'}$ is a spherical polynomial of degree at most $2L \le t$, and the third equality by the orthonormality of spherical harmonics; that is, $\mathbf{Y}\mathbf{W}\mathbf{Y}^T = \mathbf{I}$. By Remark 1, problem (1) has a unique optimal solution satisfying

$$\big(\mathbf{I} + \lambda \mathbf{D}\big)\boldsymbol{\alpha} = \mathbf{Y}\mathbf{W}\mathbf{f}^{\delta}.$$

Note that $\mathbf{I} + \lambda \mathbf{D}$ is diagonal, and thus we can directly obtain the solution of the above linear system as

$$\alpha_{l,m} = \frac{a_l^{(s)}}{a_l^{(s)} + \lambda} \sum_{j=1}^{N} w_j\, f^{\delta}(x_j)\, Y_{l,m}(x_j), \qquad m = -l, \dots, l,\ l = 0, \dots, L. \qquad (28)$$

The proof is completed.
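The identity $\mathbf{Y}\mathbf{W}\mathbf{Y}^T = \mathbf{I}$ that makes the system diagonal only needs a positive-weight rule exact for degree $2L$; a Gauss–Legendre × equal-angle product rule, for instance, has this exactness. The sketch below (helper names are illustrative, the real harmonics are built from `scipy.special.lpmv`, and $a_l^{(s)} = (l+1)^{-2s}$ is an assumed choice) checks the identity and the resulting diagonal solve.

```python
import math
import numpy as np
from numpy.polynomial.legendre import leggauss
from scipy.special import lpmv


def real_sph_harm_matrix(L, theta, phi):
    """Real orthonormal spherical harmonics; rows (0,0), (1,-1), ..., (L,L)."""
    ct, rows = np.cos(theta), []
    for l in range(L + 1):
        for m in range(-l, l + 1):
            a = abs(m)
            c = math.sqrt((2 * l + 1) / (4 * math.pi)
                          * math.factorial(l - a) / math.factorial(l + a))
            P = c * lpmv(a, l, ct)
            rows.append(P if m == 0 else
                        math.sqrt(2) * P * (np.cos(m * phi) if m > 0
                                            else np.sin(a * phi)))
    return np.array(rows)


def product_rule(t):
    """Positive-weight cubature exact for spherical polynomials of degree <= t:
    Gauss-Legendre nodes in cos(theta) times equally spaced azimuths."""
    n_th = t // 2 + 1            # n-point Gauss rule is exact up to degree 2n-1
    n_ph = t + 1                 # integrates e^{ik phi} exactly for |k| <= t
    x, wx = leggauss(n_th)
    theta = np.repeat(np.arccos(x), n_ph)
    phi = np.tile(2.0 * np.pi * np.arange(n_ph) / n_ph, n_th)
    w = np.repeat(wx, n_ph) * (2.0 * np.pi / n_ph)
    return theta, phi, w


L, s, lam = 3, 1, 0.1
theta, phi, w = product_rule(2 * L)          # exactness degree 2L
Y = real_sph_harm_matrix(L, theta, phi)
G = (Y * w) @ Y.T                            # Gram matrix: the identity here
d = np.array([(l + 1) ** (2 * s) for l in range(L + 1)
              for _ in range(2 * l + 1)], dtype=float)
f_delta = np.cos(theta) + 0.1 * np.sin(phi)  # arbitrary sample data
b = (Y * w) @ f_delta
alpha_diag = b / (1.0 + lam * d)             # closed-form diagonal solve
alpha_full = np.linalg.solve(G + lam * np.diag(d), b)
```

Since the Gram matrix equals the identity up to rounding error, the diagonal formula and the full linear solve agree.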
2.3. Condition Number
For a symmetric positive definite matrix $\mathbf{A}$, let $\lambda_{\max}(\mathbf{A})$ and $\lambda_{\min}(\mathbf{A})$ denote the largest and smallest eigenvalues of $\mathbf{A}$, and let $\kappa_2(\mathbf{A}) := \lambda_{\max}(\mathbf{A})/\lambda_{\min}(\mathbf{A})$ denote the condition number of $\mathbf{A}$ in the 2-norm. The following theorem discusses the condition number and a perturbation property of problem (1).
Theorem 5. Let $\lambda > 0$, $\mathbf{H} := \mathbf{Y}\mathbf{W}\mathbf{Y}^T + \lambda \mathbf{D}$, and let $\tilde{\mathbf{f}}^{\delta} := \mathbf{f}^{\delta} + \Delta\mathbf{f}$ denote a perturbation of $\mathbf{f}^{\delta}$. Then, the following holds:

$$\kappa_2(\mathbf{H}) \le \frac{\lambda_{\max}\big(\mathbf{Y}\mathbf{W}\mathbf{Y}^T\big) + \lambda \big(a_L^{(s)}\big)^{-1}}{\lambda_{\min}\big(\mathbf{Y}\mathbf{W}\mathbf{Y}^T\big) + \lambda \big(a_0^{(s)}\big)^{-1}}, \qquad (31)$$

and the solutions $\boldsymbol{\alpha}$ and $\tilde{\boldsymbol{\alpha}}$ of system (12) with data $\mathbf{f}^{\delta}$ and $\tilde{\mathbf{f}}^{\delta}$, respectively, satisfy

$$\big\|\tilde{\boldsymbol{\alpha}} - \boldsymbol{\alpha}\big\|_2 \le \frac{\big\|\mathbf{Y}\mathbf{W}\Delta\mathbf{f}\big\|_2}{\lambda_{\min}\big(\mathbf{Y}\mathbf{W}\mathbf{Y}^T\big) + \lambda \big(a_0^{(s)}\big)^{-1}}. \qquad (32)$$
Proof. The proof of the theorem is similar to the proof of Theorem 4.1 in , and we provide a quick illustration to keep the paper self-contained. The condition number bound (31) could be obtained by Weyl's eigenvalue inequalities (, p. 181), which lead to

$$\lambda_{\min}\big(\mathbf{Y}\mathbf{W}\mathbf{Y}^T\big) + \lambda \big(a_0^{(s)}\big)^{-1} \le \lambda_{\min}(\mathbf{H}) \le \lambda_{\max}(\mathbf{H}) \le \lambda_{\max}\big(\mathbf{Y}\mathbf{W}\mathbf{Y}^T\big) + \lambda \big(a_L^{(s)}\big)^{-1}.$$

Note that problem (1) can be written as a least squares problem:

$$\min_{\boldsymbol{\alpha} \in \mathbb{R}^{d_L}} \|\mathbf{A}\boldsymbol{\alpha} - \mathbf{b}\|_2^2,$$

with

$$\mathbf{A} := \begin{pmatrix} \mathbf{W}^{1/2}\mathbf{Y}^T \\ \lambda^{1/2}\mathbf{D}^{1/2} \end{pmatrix}, \qquad \mathbf{b} := \begin{pmatrix} \mathbf{W}^{1/2}\mathbf{f}^{\delta} \\ \mathbf{0} \end{pmatrix}.$$

The perturbation bound could then be achieved by applying the standard least squares perturbation bound (, Theorem 1.4.6), noting that $\mathbf{H} = \mathbf{A}^T\mathbf{A}$ and $\tilde{\boldsymbol{\alpha}} - \boldsymbol{\alpha} = \mathbf{H}^{-1}\mathbf{Y}\mathbf{W}\big(\tilde{\mathbf{f}}^{\delta} - \mathbf{f}^{\delta}\big)$. When $X_N$ is a spherical $t_\varepsilon$-design with corresponding weight $\mathbf{w}$ and $t \ge 2L$, by (27), we have that $\mathbf{Y}\mathbf{W}\mathbf{Y}^T$ is the identity matrix, and from (11), we find

$$\kappa_2(\mathbf{H}) = \frac{1 + \lambda \big(a_L^{(s)}\big)^{-1}}{1 + \lambda \big(a_0^{(s)}\big)^{-1}}.$$

Therefore, estimate (31) of the condition number is sharp. It can be directly obtained from the above theorem that

$$\kappa_2(\mathbf{H}) \le \frac{\lambda_{\max}\big(\mathbf{Y}\mathbf{W}\mathbf{Y}^T\big)}{\lambda \big(a_0^{(s)}\big)^{-1}} + \frac{a_0^{(s)}}{a_L^{(s)}},$$

which is decreasing in $\lambda$; this indicates that $\lambda$ should be chosen properly as a balance between smoothing and solvability of the problem. However, the choice of $\lambda$, which may depend on the nature of the target function and the scale of the noise, is an interesting question to be solved in the future.
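The Weyl-type reasoning above can be checked numerically even when the number of points is below the dimension of $\mathbb{P}_L$, so that the Gram matrix is singular and the conditioning is controlled entirely by the $\lambda\mathbf{D}$ term. The sketch assumes $a_l^{(s)} = (l+1)^{-2s}$ and illustrative helper names.

```python
import math
import numpy as np
from scipy.special import lpmv


def real_sph_harm_matrix(L, theta, phi):
    """Real orthonormal spherical harmonics; rows (0,0), (1,-1), ..., (L,L)."""
    ct, rows = np.cos(theta), []
    for l in range(L + 1):
        for m in range(-l, l + 1):
            a = abs(m)
            c = math.sqrt((2 * l + 1) / (4 * math.pi)
                          * math.factorial(l - a) / math.factorial(l + a))
            P = c * lpmv(a, l, ct)
            rows.append(P if m == 0 else
                        math.sqrt(2) * P * (np.cos(m * phi) if m > 0
                                            else np.sin(a * phi)))
    return np.array(rows)


rng = np.random.default_rng(2)
L, s, N = 5, 1, 20                   # N = 20 < (L+1)^2 = 36: Gram matrix singular
theta = np.arccos(rng.uniform(-1.0, 1.0, N))
phi = rng.uniform(0.0, 2.0 * np.pi, N)
w = np.full(N, 4.0 * np.pi / N)
Y = real_sph_harm_matrix(L, theta, phi)
G = (Y * w) @ Y.T
d = np.array([(l + 1) ** (2 * s) for l in range(L + 1)
              for _ in range(2 * l + 1)], dtype=float)
gmax = np.linalg.eigvalsh(G)[-1]
kappas, bounds = [], []
for lam in (1e-4, 1e-2, 1.0):
    eig = np.linalg.eigvalsh(G + lam * np.diag(d))
    kappas.append(eig[-1] / eig[0])
    # Weyl: lam_min(G + lam*D) >= lam*d_min and lam_max <= gmax + lam*d_max
    bounds.append((gmax + lam * d.max()) / (lam * d.min()))
```

The computed condition numbers stay below the Weyl bounds and shrink as $\lambda$ grows, matching the smoothing/solvability trade-off discussed above.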
3. Numerical Experiments
In this section, we report the numerical results for the CRLS model proposed in this paper. To illustrate the effectiveness of the model, we first compare the approximation errors of our model and the DRLS model (9) proposed in [1, 5]. The computation is implemented in Matlab 2016a and performed on an HP notebook computer equipped with an Intel Core i7-6500U 2.5 GHz CPU and 16 GB RAM, running Windows 10. The solver for the linear systems generated by the two models is the left matrix divide command built into Matlab, which indeed performs fast in the experiment.
For the choice of the regularization parameter and the penalization parameters for the DRLS model, one can refer to , in which they are chosen based on the theoretical error bounds and real applications and $\lambda$ is chosen by the balancing principle. The parameter choice of CRLS is also of great importance for the approximation quality of our model, and this will be a crucial problem we will tackle in future work. The task of the numerical experiment here is to compare the numerical performance of CRLS and DRLS. What should also be noticed in the numerical experiments is that for an approximation $p$ of $f$, both the uniform error $\|p - f\|_{\infty}$ and the $L_2$ error $\|p - f\|_{L_2(\mathbb{S}^2)}$ are difficult to obtain exactly. In such a case, the errors will be approximately computed by

$$\mathrm{err}_{\infty} := \max_{y \in Y_M} |p(y) - f(y)|, \qquad \mathrm{err}_2 := \Big( \frac{4\pi}{M} \sum_{y \in Y_M} \big(p(y) - f(y)\big)^2 \Big)^{1/2},$$

where $Y_M$ is a large-scale and well-distributed point set of $M$ points that we use to estimate the errors. In this paper, $Y_M$ is selected as a maximal determinant (MD) (see [6, 17]) point set.
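The error estimates reduce to a max and a weighted sum over the test values; a minimal sketch (with the equal-weight $4\pi/M$ quadrature assumption for the $L_2$ estimate, and an illustrative function name) is:

```python
import numpy as np


def estimated_errors(p_vals, f_vals):
    """Approximate uniform and L2 errors from values of the approximation p
    and the target f on a large well-distributed test point set of size M,
    using equal weights 4*pi/M for the L2 estimate."""
    diff = np.asarray(p_vals) - np.asarray(f_vals)
    err_inf = np.max(np.abs(diff))
    err_l2 = np.sqrt(4.0 * np.pi / diff.size * np.sum(diff ** 2))
    return err_inf, err_l2
```

For example, comparing $p \equiv 0$ against $f \equiv 1$ gives exactly $1$ and $\sqrt{4\pi}$, the $L_2(\mathbb{S}^2)$ norm of the constant one function.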
Example 1. In the first example, we test the two models, CRLS and DRLS, and record their approximation errors for comparison. The data point set is also selected as an MD node set, of size 900. The penalization parameters are selected so that both models achieve unique minimizers. The target function to approximate is the Franke function adapted to the sphere:

$$f(x, y, z) = 0.75 \exp\Big(-\frac{(9x-2)^2 + (9y-2)^2 + (9z-2)^2}{4}\Big) + 0.75 \exp\Big(-\frac{(9x+1)^2}{49} - \frac{9y+1}{10} - \frac{9z+1}{10}\Big) + 0.5 \exp\Big(-\frac{(9x-7)^2 + (9y-3)^2 + (9z-5)^2}{4}\Big) - 0.2 \exp\big(-(9x-4)^2 - (9y-7)^2 - (9z-5)^2\big),$$

which is not a spherical polynomial but is continuously differentiable on the whole sphere. The noises in the data are i.i.d. samples from a uniform distribution. Images of $f$ and its noisy version are shown in Figures 1(a) and 1(b).
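A commonly used spherical adaptation of Franke's function (the exact variant and noise level used in the paper may differ; both are assumptions here) can be sketched as:

```python
import numpy as np


def franke_sphere(x, y, z):
    """A spherical adaptation of Franke's test function, evaluated at
    Cartesian coordinates (x, y, z) of points on the unit sphere."""
    return (0.75 * np.exp(-((9*x - 2)**2 + (9*y - 2)**2 + (9*z - 2)**2) / 4)
            + 0.75 * np.exp(-(9*x + 1)**2 / 49 - (9*y + 1) / 10 - (9*z + 1) / 10)
            + 0.5 * np.exp(-((9*x - 7)**2 + (9*y - 3)**2 + (9*z - 5)**2) / 4)
            - 0.2 * np.exp(-(9*x - 4)**2 - (9*y - 7)**2 - (9*z - 5)**2))


def noisy_samples(X, scale=0.05, seed=4):
    """Function values at unit vectors X (shape (N, 3)) plus i.i.d. uniform
    noise on [-scale, scale]; the noise scale here is illustrative."""
    rng = np.random.default_rng(seed)
    f = franke_sphere(X[:, 0], X[:, 1], X[:, 2])
    return f + rng.uniform(-scale, scale, size=f.shape)
```

The function is smooth but not a polynomial on the sphere, which makes the interplay of smoothing and data fitting visible in the experiment.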
Figure 2 shows the uniform and $L_2$ approximation errors of both models over ranges of the parameters of CRLS and DRLS. It should be noted that both models are only effective in recovering the target function when the parameters are chosen properly. Therefore, thresholds marking a successful approximation are introduced into the figure for the $L_2$ and uniform errors. As can be seen from the figure, CRLS shows its effectiveness and robustness in this example, achieving success for more choices of the parameters than DRLS. Also, the deep blue area in the first row is larger than that in the second row, which indicates that CRLS has a better chance than DRLS of achieving small errors.
Example 2. In this example, we focus on the solving speed of the linear systems induced by the models. The condition numbers and CPU times of the linear systems derived by CRLS and DRLS will be reported. Two types of data point sets with different scales are chosen in this example. One is the well-distributed type used in Example 1, and the other type is ill-distributed, called the Gauss–Legendre point set (GL) , with Gauss–Legendre points in the polar angle and equally spaced points in the azimuthal angle.
In Example 1, the two models cannot achieve their respective smallest errors with the same parameter setting, even when all other settings are the same. To deal with this point, for a fixed data set, we search for the optimal parameter setting of each model separately, using the scheme shown in Example 1, so that each achieves its least uniform error, and report the errors, condition numbers, and CPU times in Table 1. From Table 1, it is clear that for both the MD and GL cases, CRLS and DRLS achieve similar uniform errors in their best settings. However, in most cases, CRLS is better conditioned than DRLS and costs less CPU time.
It is an interesting question to explain why CRLS performs better than DRLS in the simulation. First, we can compare the theoretical results on the condition numbers of the linear systems induced by CRLS and DRLS, which explains why CRLS has smaller condition numbers and solving times. However, the theoretical reason that CRLS achieves smaller errors is so far unknown to us. We can try to explain it in a heuristic way: the difference between CRLS and DRLS is the choice of the regularization term. In CRLS, we use the continuous norm of the polynomial approximation itself, while in DRLS, we use the discrete norm of the values of the approximation at the data points under some operator transform. Thus, the regularization term in DRLS depends on the choice of the data point set, but in CRLS, it does not. This may possibly be the reason for CRLS obtaining smaller errors.
4. Conclusion and Future Work
In this paper, a continuous regularized least squares polynomial approximation model on the two-sphere is discussed. The condition number and an error bound of the model are proposed in the situation that the data are noisy and the number of data points is not necessarily larger than the dimension of the polynomial space. To guarantee the error bound, we need two assumptions: the mesh norm of the data point set should be small enough, and the weights should be close to their mean. Numerical results show that the novel model is effective in approximating smooth functions on the sphere.
We conclude this paper by pointing out a few problems for future work. The first is that it is of great importance to study the strategy of choosing the parameters $\lambda$ and $s$, which is crucial to the approximation quality of the model. Secondly, it is an interesting topic to explain why CRLS can achieve smaller errors than DRLS in the numerical experiments, since the only difference between them is the choice of the regularization term.
Two Related Polynomial Approximation Models
We will introduce two related regularized polynomial approximation models in this part. The first one can be seen as a special case of the CRLS model discussed in this paper.
Example A.1. (Euclidean norm regularization). Let $s = 0$ and $a_l^{(0)} = 1$ for all $l \ge 0$. Let

$$\boldsymbol{\alpha} := (\alpha_{0,0}, \alpha_{1,-1}, \alpha_{1,0}, \alpha_{1,1}, \dots, \alpha_{L,L})^T \qquad (A.1)$$

denote the coefficient vector of $p = \sum_{l=0}^{L}\sum_{m=-l}^{l} \alpha_{l,m} Y_{l,m} \in \mathbb{P}_L$. Then, by (7), problem (1) can be rewritten as

$$\min_{p \in \mathbb{P}_L} \sum_{j=1}^{N} w_j \big(p(x_j) - f^{\delta}(x_j)\big)^2 + \lambda \|\boldsymbol{\alpha}\|_2^2, \qquad (A.2)$$

in which the regularized term is the Euclidean norm of $\boldsymbol{\alpha}$.
The second example replaces the regularized term by the $L_2$ norm of the Laplace–Beltrami transform of $p$, which is slightly different from another special case of the CRLS model.
Example A.2. (Laplace–Beltrami operator regularization). Let $s = 2$ and $a_l^{(2)} = (l(l+1))^{-2}$ for $l \ge 1$. By omitting the term $l = 0$, problem (1) can be rewritten as

$$\min_{p \in \mathbb{P}_L} \sum_{j=1}^{N} w_j \big(p(x_j) - f^{\delta}(x_j)\big)^2 + \lambda \big\|\Delta^{*} p\big\|_{L_2(\mathbb{S}^2)}^2, \qquad (A.3)$$

in which $\Delta^{*}$ denotes the Laplace–Beltrami operator (, pp. 38–39), which is the angular part of the Laplace operator in three dimensions:

$$\Delta^{*} = \frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\Big(\sin\theta \frac{\partial}{\partial\theta}\Big) + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\varphi^2}. \qquad (A.4)$$

This can be directly derived by the characterization of the Laplace–Beltrami operator that the spherical harmonics are its eigenfunctions, that is,

$$\Delta^{*} Y_{l,m} = -l(l+1)\, Y_{l,m}, \qquad m = -l, \dots, l,\ l = 0, 1, \dots. \qquad (A.5)$$

In fact, when $\lambda > 0$ and the penalty weights on all degrees $l \ge 1$ are positive, such as in problem (A.3), the solution will also be unique, as is shown in the following proposition.
Proposition A.1. The solution of problem (A.3) is unique when $\lambda > 0$.
Proof. Problem (A.3) could be reformulated as form (10) with

$$\widetilde{\mathbf{D}} := \operatorname{diag}\big(0, 4, 4, 4, 36, \dots, (L(L+1))^2\big), \qquad (A.6)$$

in which $(l(l+1))^2$ is repeated $2l+1$ times. The proof will be completed if $\mathbf{B} := \mathbf{Y}\mathbf{W}\mathbf{Y}^T + \lambda\widetilde{\mathbf{D}}$ is nonsingular. Let $\mu_1 \le \mu_2 \le \dots \le \mu_{(L+1)^2}$ be the eigenvalues of $\mathbf{B}$. Note that by (13), matrix $\mathbf{Y}\mathbf{W}\mathbf{Y}^T$ is positive semidefinite, and thus we only need to prove that $\mu_1 > 0$. Note that

$$\mathbf{v}^T \mathbf{B} \mathbf{v} = \mathbf{v}^T \mathbf{Y}\mathbf{W}\mathbf{Y}^T \mathbf{v} + \lambda\, \mathbf{v}^T \widetilde{\mathbf{D}} \mathbf{v} \ge 0 \quad \text{for all } \mathbf{v} \in \mathbb{R}^{(L+1)^2}. \qquad (A.7)$$

By this condition, we can prove the conclusion by showing that

$$\big\{\mathbf{v} : \mathbf{Y}^T\mathbf{v} = \mathbf{0}\big\} \cap \big\{\mathbf{v} : \widetilde{\mathbf{D}}\mathbf{v} = \mathbf{0}\big\} = \{\mathbf{0}\}, \qquad (A.8)$$

since under (A.8), for any $\mathbf{v} \ne \mathbf{0}$, the two equalities $\mathbf{v}^T \mathbf{Y}\mathbf{W}\mathbf{Y}^T \mathbf{v} = 0$ and $\mathbf{v}^T \widetilde{\mathbf{D}} \mathbf{v} = 0$ cannot hold at the same time, and thus $\mathbf{B}$ is nonsingular.
In the following part, we will prove (A.8) by contradiction. Assume there exists $\mathbf{v} \ne \mathbf{0}$, i.e., $\mathbf{v} = (v_{0,0}, v_{1,-1}, \dots, v_{L,L})^T$ with at least one nonzero component, which satisfies

$$\widetilde{\mathbf{D}}\mathbf{v} = \mathbf{0}, \qquad (A.9)$$

$$\mathbf{Y}^T\mathbf{v} = \mathbf{0}. \qquad (A.10)$$

Equality (A.9) means that $v_{l,m} = 0$ for all $l \ge 1$ and thus $v_{0,0} \ne 0$. By this fact, we can rewrite (A.10) as

$$\mathbf{Y}^T\mathbf{v} = v_{0,0}\big(Y_{0,0}(x_1), \dots, Y_{0,0}(x_N)\big)^T = \frac{v_{0,0}}{\sqrt{4\pi}}(1, 1, \dots, 1)^T. \qquad (A.11)$$

Note that the first component $v_{0,0}/\sqrt{4\pi}$ does not equal 0, which contradicts equality (A.10). The proof is completed.
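Proposition A.1 can also be checked numerically: even with very few data points, adding the Laplace–Beltrami penalty (which is zero only on the constant mode) to the singular Gram matrix yields a nonsingular system. The sketch reuses the same illustrative real-harmonic construction as in Section 2.

```python
import math
import numpy as np
from scipy.special import lpmv


def real_sph_harm_matrix(L, theta, phi):
    """Real orthonormal spherical harmonics; rows (0,0), (1,-1), ..., (L,L)."""
    ct, rows = np.cos(theta), []
    for l in range(L + 1):
        for m in range(-l, l + 1):
            a = abs(m)
            c = math.sqrt((2 * l + 1) / (4 * math.pi)
                          * math.factorial(l - a) / math.factorial(l + a))
            P = c * lpmv(a, l, ct)
            rows.append(P if m == 0 else
                        math.sqrt(2) * P * (np.cos(m * phi) if m > 0
                                            else np.sin(a * phi)))
    return np.array(rows)


rng = np.random.default_rng(3)
L, N, lam = 4, 10, 0.5               # N = 10 << (L+1)^2 = 25
theta = np.arccos(rng.uniform(-1.0, 1.0, N))
phi = rng.uniform(0.0, 2.0 * np.pi, N)
w = np.full(N, 4.0 * np.pi / N)
Y = real_sph_harm_matrix(L, theta, phi)
G = (Y * w) @ Y.T                    # rank <= N, hence singular
dt = np.array([(l * (l + 1)) ** 2 for l in range(L + 1)
               for _ in range(2 * l + 1)], dtype=float)  # first entry is 0
B = G + lam * np.diag(dt)
mu = np.linalg.eigvalsh(B)
```

Although both the Gram matrix and the penalty matrix are singular on their own, their sum has a strictly positive smallest eigenvalue, as the proof predicts.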
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments

This research was supported by the National Natural Science Foundation of China (no. 11626147) and Shandong Provincial Natural Science Foundation, China (no. ZR2019PA004).
R. A. Horn and C. R. Johnson, Matrix Analysis, Cambridge University Press, Cambridge, UK, 2012.
A. Björck, Numerical Methods for Least Squares Problems, SIAM, Philadelphia, PA, USA, 1996.
V. I. Krylov and A. H. Stroud, Approximate Calculation of Integrals, Courier Corporation, Chelmsford, MA, USA, 2006.
C. Müller, Spherical Harmonics, Springer, Berlin, Germany, 2006.