Abstract

The perspective 3-point (P3P) problem, also known as pose estimation, has its origins in camera calibration and is of importance in many fields: for example, computer animation, automation, image analysis, and robotics. One possibility is to formulate it mathematically in terms of finding the solution to a quartic equation. However, there is yet no quantitative knowledge as to how control-point spacing affects the solution structure—in particular, the multisolution phenomenon. Here, we consider this problem through an algebraic analysis of the quartic’s coefficients and its discriminant and find that there are significant variations in the likelihood of two or four solutions, depending on how the spacing is chosen. The analysis indicates that although it is never possible to remove the occurrence of the four-solution case completely, it could be possible to choose spacings that would maximize the occurrence of two real solutions. Moreover, control-point spacing is found to impact significantly on the reality conditions for the solution of the quartic equation.

1. Introduction

The perspective 3-point (P3P) problem, also known as pose estimation, has its roots in camera calibration and is of importance in many fields: for example, computer animation, automation, image analysis, and robotics. It can be stated as follows: given the perspective projection of three points constituting the vertices of a known triangle in 3D space, it is possible to determine the position of each of the vertices. One way it can be formulated mathematically is in terms of finding the solution to a quartic equation. However, in general, the equation does not have a unique solution, and in some situations there are no solutions at all. A review of work on the problem up until 1994 is given by Haralick et al. [1]; by that stage, there were six different algorithms for the P3P problem, given by Grunert [2], Finsterwalder and Scheufele [3], Merritt [4], Smith [5], Fischler and Bolles [6], Linnainmaa et al. [7], and Grafarend et al. [8].

Thus, an important area of research on the P3P problem is its multisolution phenomenon. Fischler and Bolles [6] showed that up to four solutions would be present for the problem of matching three model points to three image points, and they gave a procedure for identifying each of these solutions. A necessary condition for the existence of the solution was given for the first time by Yuan [9]. Wolfe et al. [10] gave a geometric explanation to this multisolution phenomenon in the image plane under the assumption of “canonical view”. Su et al. [11] applied the Wu-Ritt zero decomposition method to find the main solution branch and some nondegenerate branches for the P3P problem, although a complete decomposition was not given; in [12], the same authors used the Sturm sequence to give some conditions to determine the number of solutions. Yang [13] gave partial solution classifications of the P3P problem under some nondegenerate conditions. Gao et al. [14] have given a complete classification, although the procedure is lengthy, mainly as they try to solve two quadratic equations simultaneously, rather than the single quartic equation. Even more recent are the papers by Zhang and Hu [1517], Hu et al. [18], and Rieck [1922]; in particular, Rieck [20] focused on the multisolution phenomenon. An alternative approach is that of Vynnycky and Kanev [23], who analyzed in detail the behaviour of coefficients of the quartic polynomial derived by Linnainmaa et al. [7]; this enabled them to draw conclusions on the comparative likelihoods of obtaining different numbers of solutions, albeit only for the case of equidistant spacing between the vertices, constituting an isosceles triangle. Moreover, it should be noted that there exist other ways to formulate the problem: for example, Finsterwalder and Scheufele [3] formulate it in terms of finding a root of a cubic polynomial and the roots of two quadratic polynomials, rather than finding all the roots of a fourth-order polynomial.

The purpose of this contribution is to extend the methodology of Vynnycky and Kanev [23] with a view to obtaining a better quantitative understanding of the multisolution phenomenon of the P3P problem; in contrast to other activities that seek to determine solutions to the problem numerically or to provide a geometrical interpretation of the problem [24], here we focus on analyzing the actual numbers of solutions that the problem has as a function of the properties of the known triangle in 3D space. Moreover, although such solutions can be computed numerically these days in microseconds, this ought not to preclude seeking a better analytical understanding of the underlying equations.

Although lengthy, the analysis in [23] was only for one of the four different formulations that result in a quartic polynomial [2, 4, 6, 7]; moreover, it was mostly only for the case when the known triangle in 3D space was isosceles. It should be noted, however, that different algebraic manipulations of the equations lead to different quartic polynomials, and a preliminary issue concerns how this will affect the solution structure; although one might expect it to be invariant, we demonstrate that there exist situations where the structure is different. The analysis here will be based on the polynomial obtained by Grunert [2], although we will also obtain new insight on the formulation of Linnainmaa et al. [7]. Moreover, we will go as far as considering arbitrary triangles in 3D space, corresponding to nonequidistant vertices, to determine whether it would be possible to remove the four-solution case completely through a prudent choice of triangle; if so, and the answer at the outset is far from obvious, this could help to reduce the complexity of numerical algorithms that seek to solve the P3P problem in real-time applications.

The layout of the paper is as follows. In Section 2, the P3P problem is stated and the formulation due to Grunert [2] is presented. In Section 3, some preliminary analysis is presented regarding the properties of the polynomial, while Section 4 considers in detail the solutions for the case of equidistant control points. The main results, for nonequidistant control points, are given in Section 5, and conclusions are drawn in Section 6.

2. Statement of Problem

The P3P problem amounts to determining the distances from a camera’s focal point, or centre of perspective, to three known control points. These control points are actual points that are attached to an object that is free to move in space and whose images in the camera’s image plane can be identified. This is achieved by performing computations based on image measurements, intrinsic camera properties, and the known physical distances between the control points. In what follows, the camera is assumed to be calibrated.

Let be the centre of perspective, and the control points, and the image points, as shown in Figure 1. Let and then with From triangles , and , we obtain the P3P equation system:Since we are interested only in physical solutions, we assume the following reality conditions:(1);(2);(3);(4), the last three of which can also be expressed asrespectively, and, as demonstrated by [23], can be reduced totermed as the tetrahedron of reality.

Without loss of generality, we take . Setting (3)-(5) becomewhere Hence, in the most general case, there will be 5 distinct parameters——with

Now, it is clear thatconstitute four planes that make up a tetrahedron with vertices in -space at in -space, this implies an inflated tetrahedron, given by (9), with vertices at respectively.

Here, we follow the formulation given by Grunert [2]. Setting (11)-(13) becomewhence we obtainrespectively. Equating (23) and (25) and making the subject of the resulting equation giveusing (24) and (25) instead givesSetting these two expressions for equal to each other and making the subject of the resulting equation, we obtainNote that (26) appears to have a singularity if ; details of the alternative manipulation that is required in this case are given in Appendix A. Substituting (28) into (26) then leads towhere

3. Preliminary Analysis

Prior to finding solutions to (29), it is worth analyzing it first. Consider the case when If , we set with to obtainwhere Since and as , it is evident that, in general, (35) can only have 0,2, or 4 positive solutions. So, if , i.e., respectively, we have respectively, or i.e., Alternatively, if , we have

On the other hand, if , we set with to obtainwhere Since and as , it is evident that, in general, (42) can only have 1 or 3 positive solutions. This occurs if corresponding to and , or corresponding to and .

It is worth comparing the coefficients obtained via this approach and those obtained by Linnainmaa et al. [7]. Expressions (30)-(34) are certainly more compact than those of the other approach, which will be recapped later in Section 5.2, but have a significant disadvantage: and change sign depending on the values of and , which can then lead to 0,1,2,3, or 4 positive solutions. On the other hand, and in the other approach are always positive, and hence there can only be 0,2, or 4 positive solutions. Nevertheless, a strong reason for pursuing the Grunert approach is that it provides an independent way to check the correctness of the other approach, and vice versa.

4. The Case of Equidistant Control Points:

We now consider the case when , corresponding to the case where the control points are equidistant; this allows to make a direct comparison with the corresponding result in [23]. Equations (30)-(34) giveA particular case of this, which allows comparison with the results in [14, 23], is when ; this is an instructive case to consider, since the resulting algebra is straightforward and also illustrates that two different quartic polynomials derived from the same set of quadratic equations, (11)-(13), can lead to a different solution structure.

4.1.

Using (46)-(50), polynomial (29) can be factorized as giving the solutionsandAlthough (52) and (53) give all of the solutions to (29), the only ones of interest are those which are real and positive and lie within the tetrahedron of reality, which is given for this plane by the positivity requirement is evident because from (10), whence from (19). We analyze (52) and (53) in turn.

For (52), we see that there will be no real solutions and two otherwise; however, will be negative if i.e., , or i.e., In addition, we see that both solutions will be negative if and .

For (53), there will be two real solutions if or The latter is not possible, and we focus on whether are positive in the case of the former. If , if So(i)if , then ifi.e., , and hence there are two positive roots;(ii)if , then and ifi.e., ; hence, , and there are no positive roots.

On the other hand, if , if So(i)if , , whereas ifi.e., ; hence, , and so there is only one positive root;(ii)if , and only if ; hence, , and so there is only one positive root.

The overall solution structure based on the above analysis is shown in Figure 2; note that it is consistent with the preliminary analysis in Section 3 since, with and , we should expect 0,2, or 4 positive solutions if and or and , and 1 or 3 positive solutions if and or and For comparison, we also show in Figure 3 the distribution of solutions that was calculated in [23] using the polynomial of Linnainmaa et al. [7]. It is clear that while there are some similarities, the results are far from identical. For example, for the polynomial used in [23], there were mostly one or three positive solutions in the plane , whereas, for Grunert’s polynomial, 0,1,2,3, and 4 are possible. It is therefore worth exploring why the differences arise, and how the two sets of results can be reconciled.

The common quantity in both approaches is the solution . With the Grunert polynomial, we find from (21) thatwhere is given by (52) and (53). With the Linnainmaa et al. polynomial,whereornote that (66) gives a double root. It is now far from obvious that inserting (52) and (53) into (64) will lead to the same solutions as inserting (66) and (67) into (65), particularly as the approaches seem to lead to different numbers of solutions. Although this can be shown rigorously, the necessary algebra is rather lengthy, and we instead adopt an alternative way to see what is happening. As shown in Figure 4, we superpose Figures 2 and 3, which indicates that there are sixteen different regions to consider when comparing solutions obtained via the two methods. Then, we pick a point at random from each region and calculate all of the possible solutions for , real as well as complex; these are documented in Table 1.

Several points are worthy of note. Even though Figures 2 and 3 are rather different to each other, the solutions for in Table 1 are identical. Even in regions where the Grunert polynomial gives no real positive solutions, i.e., regions 1 and 2, insertion of the complex solutions into (64) nevertheless leads to positive solutions for . Moreover, although Figure 2 resembles Figure 2 from [14] more than Figure 3, it turns out that Figure 3 is in fact more reflective of the number of positive solutions as and are varied.

4.2.

In [23], much of the analysis of the resulting polynomial centred on the signs of the coefficients . In particular, and were found to be positive within the inflated tetrahedron given by ; moreover, the fact that the roots that were of interest had to be strictly positive gave further scope for limiting the number of possibilities. Here, however, the coefficients are much simpler and and both change sign within the tetrahedron, as do , and ; also, both positive and negative roots of the polynomial are permissible. Hence, there is no scope to perform the detailed analysis of the coefficients, and we turn instead to considering the discriminant of the polynomial.

Recapping from [23], the discriminant, , of (29) is given byThe number of real roots that the polynomial will have will be determined by the sign of and the signs of three other polynomials:(1)such that is the second-degree coefficient of the depressed quartic associated with (29);(2)which is zero if the quartic has a triple root;(3)which is zero if the quartic has two double roots.

The possible cases for the nature of the roots are then as follows [25]:(i)If , then the equation has two real roots and two complex conjugate roots.(ii)If , then the equation’s four roots are either all real or all complex.(a)If and , then all four roots are real and distinct.(b)If or , then there are two pairs of complex conjugate roots [26].(iii)If , then either the polynomial has a multiple root, or it is the square of a quadratic polynomial. Here are the different cases that can occur:(a)If , , and , there are a real double root and two real simple roots.(b)If or , there are a real double root and two complex conjugate roots.(c)If and , there are a triple root and a simple root, all real.(d)If , then one has the following:(1)If , there are two real double roots.(2)If , there are two complex conjugate double roots.(3)If , all four roots are equal to

Using a symbolic manipulator, Maple™, to process (46)-(50) and (68)-(71), we obtain , and as given in (B.1)-(B.4) in Appendix B. Comparing with (61), (62), (63), and (65) in [23], we see that whereas (B.3) and (B.4) bear no resemblance to (63) and (65), respectively, expression (B.1) in this paper and expression (61) in [23] share a common factor, Furthermore, since the remainder of both expressions for is negative, it is clear that it is the factor which determines the sign of in both cases and hence determines the number of roots that the respective polynomials will have.

It is now evident that whenever Grunert’s polynomial has two real roots, so will the polynomial of Linnainmaa et al. [7]. However, the disparity in the two sets of expressions for and means that it is not entirely obvious that when one polynomial has four real roots, the other one will also. To verify this, there seems to be no choice other than to perform a numerical sweep over all ()-combinations lying within the inflated tetrahedron, in order to determine the signs of and ; it turns out that , implying that all four roots are real and distinct in both cases. Although sweeping is perhaps not entirely satisfactory from the point of view of rigour, there are a number of mitigating factors. First of all, we may vary the number of ()-combinations that we sweep over; if the results obtained are independent of this number, then we may be more certain that the result is correct. Moreover, and as we show below, the fact that sweeping carried out for two different polynomials leads to the same result is further evidence that the results are correct. More details on the sweeping are given in Section 5.

A consequence of the above is that, for both polynomials, there are four relevant solutions to the problem at 25% of all available spatial locations for the control-point combinations, and two positive solutions at the remaining 75%. This result extends the conclusion in [23] to Grunert’s polynomial, but it is nevertheless only for the case when . It is now of interest to see how this result changes for both polynomials if .

5.

In this section, we consider first, in Section 5.1, all of the roots of the quartic polynomial; then, in Section 5.2, having established beyond doubt that the Grunert and Linnainmaa et al. formulations give the same results, we consider the positive roots only, which is the case of direct relevance to the P3P problem.

5.1. All Solutions

For the general case when , we recall first that we need only consider In what follows, we seek to determine what fraction of the inflated tetrahedron consists of ()-combinations that lead to two or four solutions, as a function of and . To do this requires us to use the Maple-generated expressions for , and to perform the requisite volume integrals over the inflated tetrahedron, restricted to where each of these is of one sign. The volume integrals were computed via a rather unusual use of the finite-element software Comsol Multiphysics. Although this software is more often used for solving partial differential equations, the fact that it is able to discretize a given geometry into elements can be exploited to evaluate the required integrals; in particular, we discretized the ()-space, , using first-order tetrahedral elements. Note also that we use the software to calculate the quadratures, i.e., volume integrals, of known functions—, and —of three variables (); thus, no initial conditions are required.

Prior to the full computations, we tested that the results obtained were mesh-independent. Within the software, it is straightforward to refine the mesh, and benchmark results were generated for and as a function of for meshes having 17090, 60643, and 188936 elements. Figure 5 shows the percentage, , of ()-combinations for which as a function of . For both values of , the results for each mesh are literally on top of each other, and hence the results are indeed independent of the mesh used; subsequent calculations were then carried out using the mesh with 60643 elements.

Figures 6 and 7 show as a function of for using the formulations of Grunert [2] and Linnainmaa et al. [7], respectively. Also, denoting by and the discriminant of the polynomials arising from the Grunert and Linnainmaa et al. formulations, respectively, we find thatwhereaswhere with being a lengthy expression, which we omit here, that contains , and and satisfies hence, in Figures 6 and 7, Moreover, when , reduces to , which is given in (B.2). It is clear that although , it is their common factor which explains the similarity between Figures 6 and 7.

However, Figures 6 and 7 only consider the sign of , without considering the signs of and ; in the analysis in [23] for , it turned out that whenever , then , which meant that the sign alone was enough to indicate four real solutions. In this context, consider Figure 8 which shows , and , where(i) is the percentage of ()-combinations for which and ;(ii) is the percentage for which and ;(iii) is the percentage for which and .

This figure is for the case when and is computed using Grunert’s polynomial, but qualitatively similar results were obtained for other values of . Similarly, Figure 9 shows the corresponding result when using the polynomial of [7]; hence, although we were not able analytically to reconcile the two forms for and obtained here and in [23], the behaviour for and appears to be nevertheless the same.

5.2. Solutions to the P3P Problem

From the previous sections, it has become apparent that the formulation of Linnainmaa et al. provides the solution structure more transparently than the formulation of Grunert. For example, it provides zero, two, or four positive solutions for all values of , whereas the Grunert polynomial gives 0,2, or 4 for some values, and 1 and 3 for others. Thus, in order to consider the positive solutions, rather than just the real ones, we revert to the formulation of Linnainmaa et al.; the relevant polynomial coefficients are now, as in [7, 23],

A further advantage of this formulation is that and for all values of Consequently, we note that if , then there cannot be any negative solutions. Indeed, in [23], this was found to be the case when , which suggests exploring this possibility when . While it is not possible to verify analytically whether these six inequalities are always satisfied simultaneously, it is possible to carry out a numerical sweep over () for different ()-combinations. First of all, Figure 10 shows for , whereas Figure 11   shows for , where denotes the percentage of ()-combinations for which there are four real solutions and . It is clear that there is no difference between the two plots for the values of chosen, which strongly indicates that when there are four real solutions, they are all positive for all values of .

In fact, Figure 11 indicates that, for each value of , there exists a range in for which the possibility of four positive solutions is practically zero; roughly speaking, . However, as approaches the possibility of four complex solutions rises rapidly; thus, a plausible compromise for achieving the situation of two real solutions is to employ a value of that is not too close to . It is also evident that choosing a value of that is small will lower this possibility still further.

6. Conclusions

In this paper, we have analyzed the effect of control-point spacing on solutions to the quartic equation that arises in the P3P problem; as a diagnostic to verify the analysis, the formulations due to Grunert [2] and Linnainmaa et al. [7] were considered. Although both polynomials come from the original set of three quadratic equations, (3)-(5), their coefficients are completely different. Through analysis of solutions for the particular case when and , it is evident that the polynomials will not necessarily have the same number of solutions at all values of (); however, we were subsequently able to show how the two solution structures could be reconciled. Moreover, through symbolic manipulation, we are able to determine that, for a given combination (), the polynomials have the same percentage of all possible ()-combinations which have four positive solutions; when , this is 25%.

Moreover, a sweep over all admissible combinations of () enables us to explore whether it might be possible to choose and so that the case of four positive solutions can be avoided completely, since it may be computationally advantageous to only have to choose from two positive solutions, rather than four. It turns out that, for all admissible ()-combinations, there will always be some ()-combinations which give four positive solutions. Nevertheless, prudent choice of and can reduce the percentage of ()-combinations giving four positive solutions to practically zero. To achieve this, it is necessary to take either or close to 1, and then to take the other one greater than around 0.15. In addition, it ensures that all ()-combinations within the so-called inflated tetrahedron of reality are used, something which is not the case if is too close to one; an interpretation of this situation is that the solution becomes less sensitive to the displacement of the control points.

Appendix

A. Derivation of (29) when

In this case, we haveUsing (A.1) and (A.3) giveswhereas (A.2) and (A.3) giveSubstituting (A.4) into (A.5) and manipulating give (29), with These expressions are the same as if we had simply substituted into (30)-(34) first.

B. when

We find that can be factorized aswhereAlso,and

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

M. Vynnycky acknowledges the award of a visiting researcher grant within the Government of Brazil’s “Science without Borders” programme. G. M. M. Reddy acknowledges funding from the São Paulo Research Foundation (FAPESP) via Grant no. 2016/19648-9.