Research Article | Open Access
Least Squares Problems with Absolute Quadratic Constraints
This paper analyzes linear least squares problems with absolute quadratic constraints. We develop a generalized theory following Bookstein's conic-fitting and Fitzgibbon's direct ellipse-specific fitting. Under simple preconditions, it can be shown that a minimum always exists and can be determined by a generalized eigenvalue problem. This problem is numerically reduced to an eigenvalue problem by multiplications of Givens' rotations. Finally, four applications of this approach are presented.
The least squares methods cover a wide range of applications in signal processing and system identification [1–5]. Many technical applications need robust and fast algorithms for fitting ellipses to given points in the plane. In the past, effective methods were Bookstein's conic-fitting or Fitzgibbon's direct ellipse-specific fitting, where an algebaic distance with a quadratic constraint is minimized [6, 7]. In this paper, we develop an extended theory of minimization of least squares with a quadratic constraint based on the ideas of Bookstein and Fitzgibbon. Thereby, we show the existence of a minimal solution and present the uniqueness regarding to the smallest positive generalized eigenvalue. So, arbitrary conic fitting problems with quadratic constraints are possible.
Let be matrix with , be symmetric matrix, and a real value. We consider the problem of finding a vector which minimizes the function defined by The side condition introduces an absolute quadratic constraint. The problem (1.1) is not a special case of Gander's optimization as presented in , because in our case is a real symmetric matrix in contrast to the approach of Gander, where the side condition considers real symmetric matrices which are positive definite. For our considerations, we require the following three assumptions
Assumption 1.1. By replacing by , we consider . For , the trivial solution fulfills (1.1). Therefore, we demand .
Assumption 1.2. The set is not empty, that is, the matrix has not been less than one positive eigenvalue. If we assume that has only nonpositive eigenvalues, it would be negative semidefinite and holds. With it follows that the set would be empty in this case.
Assumption 1.3. In the following, we set and assume that is regular. is sometimes called scatter matrix.
In the following two sections, we introduce the theoretical basics of this optimization. The main result is the solution of a generalized eigenvalue problem. Afterwards, we reduce this system numerically to an eigenvalue problem. In Section 5, we present four typical applications for conic fitting problems with quadratic constraints. These approximations contain the ellipse fitting of Fitzgibbon, the hyperbola fitting of O'Leary, the conic fitting of Bookstein, and an optical application of shrinked aspheres [6, 7, 9, 10].
2. Existence of a Minimal Solution
Theorem 2.1. If is regular, then there exists a global minimum to the problem (1.1).
Proof. The real regular matrix is symmetric and positive definite. Therefore, a Cholesky decomposition exists with a regular upper triangular matrix . In (1.1), we are looking for a solution minimizing With regular we substitute by for . Thus, we obtain an equivalent problem to (2.1), where we want to find a vector , minimizing Now, we define with and look for a solution on the zero-set of with minimal distance to the point of origin. Let and be the closed sphere of in 0 with radius . Because of and being continuous, the set is nonempty, closed and bounded. Therefore, for the continuous function exists a minimal value on with For all from , it is . So, is a minimal value of in . By the equivalence of (2.1), and (2.2) the assumption follows.
3. Generalized Eigenvalue Problem
Theorem 3.1. If is an extremum of subject to , then a positive exists with that is, is an eigenvector to the generalized eigenvalue and holds.
Proof. Let be a defined as . For and follows . Further, is continuously differentiable with for all of the zero-set of . So, if is a local extremum of subject to , then it is . Since is also a continuously differentiable function in with , it follows by using a Lagrange multiplier : if is a local extremum of subject to , then a exists, such that the Lagrange function given as has a critical point in . Therefore, fulfills necessarily the equations: The first equation describes a generalized eigenvalue problem with With , and fulfills (3.6), must be a generalized eigenvalue, and is a corresponding eigenvector to of (3.6), so that is a singular matrix. If is an eigenvalue and a corresponding eigenvector to of (3.6), then every vector is also a solution of (3.6) for . Now, we are looking for , such that satisfies (3.5). For and (3.4), it follows Because the left side and the numerator are positive, the denominator must also be chosen positive, that is, only positive eigenvalues solve (3.4) and (3.5). By the multiplication with , follows and fulfills the constraint .
Lemma 3.3. If is regular and is symmetric, then all eigenvalues of (3.1) are real-valued and different to zero.
Proof. With , in (3.1). The Cholesky decomposition with a regular upper triangular matrix yields (3.1) With invertible and the substitution ,, we obtain an eigenvalue problem to the matrix : Furthermore, we have Therefore, the matrix is symmetric and all eigenvalues are real. With for follows the proposition.
Remark 3.4. Because of regular and in we can consider the equivalent problem with instead of (3.11): This system is called inverse eigenvalue problem to (3.1). Here, the eigenspaces to the generalized eigenvalue in (3.1) and to the eigenvalue in (3.14) are identical. Therefore, the generalized eigenvectors in (3.1) are perpendicular.
Definition 3.5. The set of all eigenvalues of a matrix is called spectrum . We call the set of all eigenvalues to the generalized eigenvalue problem in (3.1) also spectrum and denote . is defined as the set of all positive values in .
Remark 3.6. In case of , the inverse problem in (3.14) has eigenvalue 0 with multiplicity . Otherwise, for and follows . Analogously, for with ,.
The following lemma is a modified result of Fitzgibbon .
Lemma 3.7. The signs of the generalized eigenvalues of (3.1) are the same as those of .
Proof. With being nonsingular, every generalized eigenvalue of (3.1) is not zero. Therefore, it follows for the equivalent problem (3.11) that is also a positive eigenvalue to , where is an upper triangular matrix to the Cholesky decomposition of . With Sylvester's Inertia Law, we know that the signs of eigenvalues of the matrices are the same as those of .
For the following proofs, we need the lemma of Lagrange (see, e.g., ).
Lemma 3.8 (Lemma of Lagrange). For , , , and , let , so that is a minimal value of the function with Then is a minimal solution of in .
Definition 3.9. Let be the smallest positive value of and a corresponding generalized eigenvector to to the constraint .
Lemma 3.10. Let be a positive semidefinite matrix for . Then a generalized eigenvector corresponding to is a local minimum of (1.1).
Proof. We consider with With holds and since the Hessian matrix of is positive semidefinite, the vector is a minimal solution of . Then, minimizes also subject to by the Lemma 3.8.
Lemma 3.12. The matrix is positive semidefinite.
Proof. Let be an arbitrary eigenvalue of , where is the upper triangular matrix of the Cholesky decomposition from . With it follows that is an eigenvalue of . This value is corresponding in (3.11) with the inverse eigenvalue of problem (3.1). Furthermore, it yields So follows, that is, is positive semidefinite, and for we obtain By setting and with regular , we get With , for , that is, is positive semidefinite.
Theorem 3.13. For the smallest value in exists a corresponding generalized eigenvector , which minimizes subject to .
Proof. The matrix is positive semidefinite by Lemma 3.12, and with Lemma 3.10 it follows that is a local minimum of problem (1.1). Furthermore, we know by Theorem 3.1 that if is a local extremum of subject to , then a positive value exists with and it is . Because of the existence of a minimum in Theorem 2.1 a value exists in problem (3.21) regarding to . Otherwise, for an arbitrary local minimum , So, follows and is a minimum of subject to .
Example 3.14. We minimize with subject to . So, we have , the identity matrix , and a diagonal matrix with values −1 and 1. Then, we get the following generalized eigenvalue problem: with eigenvalues 1 and −1. Because of Theorem 3.1, we consider a generalized eigenvector with only for . Then, and are solutions subject to . This result is conform to the geometric interpretation, since we are looking for on the hyperbola with minimal distance to the origin.
4. Reduction to an Eigenvalue Problem of Dimension
In numerical applications, a generalized eigenvalue problem is mostly reduced to an eigenvalue problem, for example, by multiplication with . Thus, we obtain the inverse problem (3.14) from (3.1) (see, e.g., ). But, may be ill-conditioned, so that a solution of (3.14) may be numerical instable. Therefore, we present another reduction of (3.1).
Many times, is a sparse matrix with . This symmetric matrix is diagonalizable in with orthogonal and diagonal. Further, we assume that the first diagonal entrees in are different to 0. For the characteristic polynomial in (3.1), it follows The order of is . We decompose these matrices in with , , and . Now, we eliminate in by multiplications with Givens rotations , so that it follows: with , , and . In (4.1), we achieve with orthogonal Because of and , the submatrices are regular and the generalized eigenvalues of are different to zero. So, with can be transformed to an equivalent eigenvalue problem with This system can be solved by finding the matrix with using the Gaussian elimination and determining the eigenvalues of computing the algorithm . Because all steps are equivalent, we have , that is, the eigenvalues of (3.1) and (4.6) are the same.
With Theorem 3.13, we are looking for the smallest value and a corresponding generalized eigenvector to minimize the problem (3.1). So, yields. By substitution of for , we obtain We decompose into the subvectors and with . Then, is a generalized eigenvector for of the problems (4.5) and (4.6).
and a generalized eigenvector for in (3.1) is given as:
5. Applications in Conic Fitting
5.1. Fitzgibbon's Ellipse Fitting
First, we would like to find an ellipse for a given set of points in . Generally, a conic in is implicitly defined as the zero set of to a constant parameter : The equation can also be written with as The eigenvalues , of characterize a conic uniquely . Thus, we need for ellipses in . Furthermore, every scaled vector with describes the same zero-set of . So, we can impose the constraint for ellipses with . For given points , we want to find a parameter , which minimizes with This ellipse fitting problem is established and solved by Fitzgibbon . With the following matrices , and , we achieve the equivalent problem: For , we have a special case of (1.1). Assuming is a regular matrix and since the eigenvalues of are −2, −1, 0, and 2, by lemma 3.3 we know that the generalized eigenvalue problem has exactly one positive solution . Because of Theorem 3.13 a corresponding generalized eigenvector to minimizes the problem (5.5) and consists of the coefficients of an implicit given ellipse.
A numerically stable noniterative algorithm to solve this optimization problem is presented by Halir and Flusser . In comparison with Section 4, their method uses a special block decomposition of the matrices and .
5.2. Hyperbola Fitting
Instead of ellipses, O'Leary and Zsombor-Murray want to find a hyperbola to a set of scattered data . A hyperbola is a conic, which can uniquely be characterized by . So, we consider the constraint and obtain the optimization problem: with and being chosen in 5.1. The matrix has two positive eigenvalues. In this case, a solution is given by a generalized eigenvector to the smallest value in . But O'Leary and Zsombor-Murray determine the best hyperbolic fit by evaluation of , where the eigenvector is associated to a positive value of .
5.3. Bookstein's Conic Fitting
In Bookstein's method, the conic constraint is restricted to where , are the eigenvalues of in . There, it is and at least one of them different to 0. But the constraint (5.8) is not a restriction to a class of conics. Here, we determine an arbitrary conic, which minimizes The resulting data matrix is the same as for Fitzgibbon's problem. The constraint matrix has a diagonal shape with the entrees , that is, all eigenvalues of are nonnegative. In the case of a regular matrix , the problem (5.9) is solved for a generalized eigenvector to the smallest value in .
5.4. Approximation of Shrinked Aspheres
After the molding process in optical applications, the shrinkage of rotation-symmetric aspheres is implicitly defined for in where and are aspheric-specific constants . For with , the scattered data of a shrinked asphere are given in this approximation problem. Here, we are looking for the conic parameter for a fixed value , which minimizes Analogously to Fitzgibbon, we have the matrices and with and with we get the following optimization problem: This is also an application of (1.1). The matrix has the eigenvalues , and 2. So, the generalized eigenvalue problem in (3.1) with regular has two positive values in . With Theorem 3.13, a generalized eigenvector to the smaller of both values solves (5.13).
In this paper, we present a minimization problem of least squares subject to absolute quadratic constraints. We develop a closed theory with the main result that a minimum is a solution of a generalized eigenvalue problem corresponding to the smallest positive eigenvalue. Further, we show a reduction to an eigensystem for numerical calculations. Finally, we study four applications about conic approximations. We analyze Fitzgibbon's method for direct ellipse-specific fitting, O'Leary's direct hyperbola approximation, Bookstein's conic fitting, and an optical application of shrinked aspheres. All these systems are attribute to the general optimization problem.
- X. Liu and J. Lu, “Least squares based iterative identification for a class of multirate systems,” Automatica, vol. 46, no. 3, pp. 549–554, 2010.
- F. Ding, P. X. Liu, and G. Liu, “Multiinnovation least-squares identification for system modeling,” IEEE Transactions on Systems, Man, and Cybernetics, Part B, vol. 40, no. 3, pp. 767–778, 2010.
- Y. Xiao, Y. Zhang, J. Ding, and J. Dai, “The residual based interactive least squares algorithms and simulation studies,” Computers and Mathematics with Applications, vol. 58, no. 6, pp. 1190–1197, 2009.
- B. Bao, Y. Xu, J. Sheng, and R. Ding, “Least squares based iterative parameter estimation algorithm for multivariable controlled ARMA system modelling with finite measurement data,” Mathematical and Computer Modelling, vol. 53, no. 9-10, pp. 1664–1669, 2011.
- F. Ding, P. X. Liu, and G. Liu, “Gradient based and least-squares based iterative identification methods for OE and OEMA systems,” Digital Signal Processing, vol. 20, no. 3, pp. 664–677, 2010.
- F. L. Bookstein, “Fitting conic sections to scattered data,” Computer Graphics and Image Processing, vol. 9, no. 1, pp. 56–71, 1979.
- A. W. J. Fitzgibbon, Stable segmentation of 2D curves, Ph.D. thesis, University of Edinburgh, 1997.
- W. Gander, “Least squares with a quadratic constraint,” Numerische Mathematik, vol. 36, no. 3, pp. 291–307, 1981.
- P. O'Leary and P. Zsombor-Murray, “Direct and specific least-square fitting of hyperbolæ and ellipses,” Journal of Electronic Imaging, vol. 13, no. 3, pp. 492–503, 2004.
- R. Schoene, D. Hintermann, and T. Hanning, “Approximation of shrinked aspheres,” in International Optical Design Conference, G. G. Gregory, J. M. Howard, and R. J. Koshel, Eds., vol. 6342 of Proceedings of SPIE, Vancouver, Canada, 2006.
- J. Jost, Postmodern Analysis, Universitext, Springer, Berlin, Germany, 1998.
- P. Kosmol, Optimierung und Approximation, de Gruyter Lehrbuch, Walter de Gruyter, Berlin, Germany, 1991.
- G. H. Golub and C. F. Van Loan, Matrix Computations, Johns Hopkins Studies in the Mathematical Sciences, Johns Hopkins University Press, Baltimore, Md, USA, 3rd edition, 1996.
- W. Gander, G. H. Golub, and R. Strebel, “Least-squares fitting of circles and ellipses,” BIT Numerical Mathematics, vol. 34, no. 4, pp. 558–578, 1994.
- R. Halir and J. Flusser, “Numerically stable direct least squares fitting of ellipses,” in Proceedings of the International Conference in Central Europe on Computer Graphics, Visualization and Interactive Digital Media, V. Skala, Ed., pp. 125–132, 1998.
Copyright © 2012 R. Schöne and T. Hanning. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.