Mathematical Problems in Engineering

Volume 2016 (2016), Article ID 8523604, 14 pages

http://dx.doi.org/10.1155/2016/8523604

## Projective Invariants from Multiple Images: A Direct and Linear Method

School of Computer Science and Engineering, Northeastern University, Shenyang 110819, China

Received 3 October 2015; Accepted 3 March 2016

Academic Editor: Antonino Laudani

Copyright © 2016 Yuanbin Wang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

The projective reconstruction of 3D structures from 2D images is a central problem in computer vision. Existing methods for this problem are usually nonlinear or indirect. Previous direct methods usually require solving a system of nonlinear equations, which makes them complicated and hard to implement. Previous linear indirect methods are usually imprecise. This paper presents a linear and direct method to derive projective structures of 3D points from their 2D images. Algorithms to compute projective invariants from two images, three images, and four images are given. The method is clear, simple, and easy to implement. For the first time in the literature, we present explicit linear formulas to solve this problem. *Mathematica* code is provided to demonstrate the correctness of the formulas.

#### 1. Introduction

The recovery of the geometric structure of 3D points from their 2D projection images is fundamental in computer vision. It is known that the structure of a 3D point set cannot, in general, be recovered from a single image [1]. When two or more images are available, the 3D point structure of a scene can be recovered up to an unknown projective transformation. The projective reconstruction of camera parameters and 3D scene structure from multiple uncalibrated views is also called projective structure from motion [1–6]. The problem has been well studied over decades of research. However, a definitive solution is still pending.

There are mainly three types of methods for solving the structure from motion problem. The first type computes the projective invariants of the 3D points. Several groups of researchers have studied, in different ways, the problem of computing 3D projective invariants of a point set from its 2D images [7–10]. Previous methods to compute 3D projective invariants of six 3D points from three uncalibrated images can be found in [7, 9]. A previous method to compute 3D projective invariants of seven 3D points from two images can be found in [10]. However, these methods are very complicated: they require solving polynomial equations of degree up to eight. Wang et al. proposed an explicit method to derive projective invariants of six 3D points from three uncalibrated images [11]. All these methods produce three possible solutions for the reconstruction problem, so further information is needed to select the unique solution. In the literature, it is not quite clear how to determine the unique solution.

The second type of methods recovers 3D shapes indirectly. Tensors of multiple images of the 3D scene are estimated first. A second-order tensor, usually called the fundamental matrix, captures the geometry between two views of a 3D scene. A third-order tensor, usually called the trifocal tensor, captures the geometry among three views of a 3D scene. When these tensors of multiple views of a scene are known, there are many algorithms to recover the 3D geometric structure of the scene from them [12–17]. There are two kinds of methods to estimate these tensors: nonlinear methods and linear methods. The problem with the nonlinear methods is that they produce multiple solutions. For example, the seven-point nonlinear method to estimate the fundamental matrix produces up to three solutions. The problem with the linear methods is that they do not produce the precise solution. For example, the eight-point linear algorithm to derive the fundamental matrix generally produces a matrix that does not satisfy the rank constraint.

The third type of methods estimates the structure and motion through the projective factorization technique [4, 14–16]. This technique organizes a set of constraints into a single measurement matrix. When all the projective depths in this matrix are known, it is possible to factor it into motion and shape matrices using the rank constraint. The general way to perform the factorization is through singular value decomposition. Since there is no mathematical proof that the motion and shape matrices derived by this technique are the real motion and shape matrices, we do not discuss this type of methods further in this paper.

In the literature, there is a well-known result demonstrated by Carlsson and Weinshall that the projective reconstruction with $n$ points and $m$ images is equivalent to that with $m+4$ points and $n-4$ images. We would like to emphasize that, for this theorem to hold, the number of points should be no less than six, and the number of images should be no less than two. So the minimal number of point correspondences for projective reconstruction from two images is seven. The minimal number of point correspondences for projective reconstruction from three images is six. The minimal number of point correspondences for projective reconstruction from four images is also six. It is generally impossible to projectively recover a 3D structure from fewer than six point correspondences, no matter how many images are available.
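The counting in the duality statement can be sketched in a few lines of Python. This is only an illustration, assuming the duality maps $n$ points in $m$ images to $m+4$ points in $n-4$ images; the function name `dual_configuration` is hypothetical and does not come from the paper.

```python
def dual_configuration(num_points, num_views):
    """Return the (points, views) pair dual to the given configuration.

    Assumption: n points in m views correspond to m + 4 points in
    n - 4 views, for n >= 6 and m >= 2.
    """
    if num_points < 6 or num_views < 2:
        raise ValueError("duality needs at least six points and two views")
    return num_views + 4, num_points - 4


# Configurations mentioned in the text, paired with their duals.
for config in [(7, 2), (6, 3), (6, 4), (7, 3)]:
    print(config, "<->", dual_configuration(*config))
```

Under this assumption, seven points in two images pair with six points in three images, six points in four images pair with eight points in two images, and seven points in three images are self-dual, which matches the three cases treated in Sections 4 to 6.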

For the fundamental projective reconstruction problem, it is generally accepted that we need to consider only the cases of two views, three views, and four views. Although the quadrifocal tensor is the most complicated and controversial tensor in multiple view geometry, we will demonstrate that the configuration of six points and four views is the most natural configuration for deriving 3D projective invariants.

This paper presents linear methods for computing projective invariants of 3D points from their 2D images directly. A 3D point structure can be configured by first choosing four reference points as a basis and then representing the other points under this basis. The cross ratios of the coordinates of the other points under this basis are projective invariants. Systems of bilinear equations are then derived. Traditional methods to solve nonlinear multivariable equations are very complicated. The main contribution of this paper is to show that these systems of equations can be easily transformed into systems of linear equations. We present the solutions of the systems of linear equations in explicit form.

The rest of the paper is organized as follows. In Section 2, we review some of the previous works. In Section 3, we define the form of the 3D projective invariants and derive the basic relations of projective invariants among multiple views. In Section 4, we present a linear method to compute 3D projective invariants of six points from four images. In Section 5, we present a linear method to compute 3D projective invariants of seven points from three images. In Section 6, we present a linear method to compute 3D projective invariants of eight points from two images. The final section is the conclusion. We present *Mathematica* code to demonstrate the correctness of the method in the Appendix.

#### 2. Previous Works

We review a few related works in this section.

A camera is a device that transforms properties of a 3D scene onto an image plane. A pinhole camera model is used to represent the linear projection from 3D space onto each image plane. In this paper, 3D world points are represented by homogeneous 4-vectors $X_j$. The projection of the $j$th 3D point in the $i$th image is represented by a homogeneous 3-vector $x_{ij}$. The relationships among the 3D points and their 2D projections are

$$\lambda_{ij}\, x_{ij} = P_i X_j, \tag{1}$$

where $P_i$ is the projection matrix (which is $3 \times 4$ and is also called the camera matrix) of the $i$th camera, $\lambda_{ij}$ is a nonzero scale factor called the projective depth, and $x_{ij}$ is the $i$th projection of the $j$th 3D point. Suppose that $m$ perspective images of a set of $n$ 3D points are given. The structure and motion problem is to recover the 3D point locations and camera parameters from the image measurements. When the cameras are uncalibrated and no additional geometric information of the point set is available, the reconstruction is determined only up to an unknown projective transformation. For any nonsingular $4 \times 4$ projective transformation matrix $H$, $P_i H^{-1}$ and $H X_j$ produce an equally valid reconstruction.
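The projective ambiguity described above can be checked with exact integer arithmetic. The following is a minimal sketch, not the paper's code: the matrices `P_alt` and `H` and the point `X` are made-up values, and the names are hypothetical. To avoid computing an inverse, the camera `P` is built as `P_alt * H`, so that `P_alt` plays the role of $P H^{-1}$.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def matvec(A, v):
    """Product of a matrix (list of rows) and a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


# Hypothetical 3x4 camera matrix, standing in for P * H^{-1}.
P_alt = [[2, 0, 1, 3],
         [0, 2, 1, -1],
         [0, 0, 1, 4]]

# Hypothetical 4x4 projective transformation (determinant 10, so invertible).
H = [[1, 2, 0, 1],
     [0, 1, 3, 0],
     [1, 0, 1, 2],
     [0, 1, 0, 1]]

P = matmul(P_alt, H)      # original camera, so that P_alt = P * H^{-1}
X = [1, 2, 3, 1]          # homogeneous 3D point
X_alt = matvec(H, X)      # transformed point H * X

# Both reconstructions project to the same homogeneous image point.
print(matvec(P, X), matvec(P_alt, X_alt))
```

The two printed vectors coincide even though `X_alt` differs from `X`, which is why image measurements alone cannot distinguish the two reconstructions.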

The projective geometry between two views of a 3D scene is completely captured by the epipolar geometry. Let $x$ and $x'$ be images of a 3D point $X$ observed by two cameras with optical centers $C$ and $C'$. The epipolar constraint says that if $x$ and $x'$ are images of the same 3D point $X$, then $x'$ must lie on the epipolar line associated with $x$. That is,

$$x'^{T} F x = 0, \tag{2}$$

where $F$ is a $3 \times 3$ matrix called the fundamental matrix.

The fundamental matrix is of rank two and is defined up to a scalar factor. It encodes all the geometric information between two views when no additional information is available. Numerous algorithms have been designed to estimate this matrix. The most famous are the linear eight-point algorithm and the nonlinear seven-point algorithm [1, 3, 13, 17, 18]. The input to those methods is a set of point correspondences between the two images. The eight-point algorithm is simple, fast, and easy to implement. However, the fundamental matrix estimated by the eight-point algorithm is usually of full rank.

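Hartley proposed a method to recover the 3D scene from the fundamental matrix [12]. Two camera matrices $P$ and $P'$ with different projection centers uniquely determine the fundamental matrix $F$. On the other hand, the camera matrices $P$ and $P'$ are not uniquely determined by the fundamental matrix $F$. If the fundamental matrix is factored into

$$F = [t]_\times M, \tag{3}$$

then a realization of the fundamental matrix is

$$P = [\, I \mid 0 \,], \qquad P' = [\, M \mid t \,], \tag{4}$$

where $I$ is the $3 \times 3$ identity matrix, $M$ is a $3 \times 3$ nonsingular matrix, and

$$[t]_\times = \begin{pmatrix} 0 & -t_3 & t_2 \\ t_3 & 0 & -t_1 \\ -t_2 & t_1 & 0 \end{pmatrix}. \tag{5}$$

The 3D scene point $X$ is then determined by the two camera matrices, $P$ and $P'$, and the two projections of $X$, $x$ and $x'$.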
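The relation between a factored fundamental matrix and a pair of camera matrices can be checked numerically. The sketch below assumes the standard realization $P = [I \mid 0]$, $P' = [M \mid t]$ with $F = [t]_\times M$; the values of `M` and `t` and all function names are hypothetical. It verifies that the epipolar constraint holds exactly and that $F$ is singular.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def matvec(A, v):
    """Product of a matrix (list of rows) and a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def skew(t):
    """Cross-product matrix [t]_x, so that skew(t) * v equals t x v."""
    return [[0, -t[2], t[1]],
            [t[2], 0, -t[0]],
            [-t[1], t[0], 0]]


def det3(A):
    """Determinant of a 3x3 matrix."""
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))


M = [[1, 2, 0], [0, 1, 1], [1, 0, 3]]   # hypothetical nonsingular matrix (det 5)
t = [1, -2, 1]                          # hypothetical fourth column of P'

F = matmul(skew(t), M)                  # assumed factorization F = [t]_x M
P1 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]   # P  = [I | 0]
P2 = [M[r] + [t[r]] for r in range(3)]            # P' = [M | t]


def epipolar_residual(X):
    """Value of x'^T F x for the two projections of the 3D point X."""
    x1, x2 = matvec(P1, X), matvec(P2, X)
    return sum(a * b for a, b in zip(x2, matvec(F, x1)))


for X in [[1, 2, 3, 1], [-2, 0, 5, 1], [4, -1, 2, 3]]:
    print(X, epipolar_residual(X))
print("det F =", det3(F))
```

Because the arithmetic is on integers, the residuals and the determinant are exact zeros rather than small floating-point numbers.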

Quan proposed an algorithm to compute projective invariants of six 3D points from three projection images [9]. Given any six 3D points, the author selected five points as the standard projective basis. The six unknown points in 3D space are projectively equivalent to the following normalized points:

$$(1,0,0,0)^T,\ (0,1,0,0)^T,\ (0,0,1,0)^T,\ (0,0,0,1)^T,\ (1,1,1,1)^T,\ (X,Y,Z,T)^T. \tag{6}$$

We then normalize the known point locations in the three images accordingly. They correspond to

From these relations, a homogeneous nonlinear equation of the form can be derived for the th image, where

It is also noticed that

Since six 3D points have 18 degrees of freedom and a 3D projective transformation has 15 degrees of freedom, six points in 3D space have 3 independent projective invariants. There are many forms of projective invariants. It is noticed that the ratios of , , , and in (6) are projective invariants. Three such independent invariants can be

So the goal is to compute these unknown 3D projective invariants from three of the 2D images.

Quan tried to solve the system of bilinear equations (8) using the classical resultant technique. After eliminating the variable , he obtained two homogeneous polynomial equations of the third degree in three variables:

Eliminating again will result in a homogeneous polynomial equation in and of degree eight. After that, a third degree polynomial equation can be derived numerically through polynomial factorization of the following form:

Heyden presented a similar method to compute projective invariants of six 3D points from three views [7].

As we can see from the procedure described above, the method proposed by Quan is hard for ordinary users to implement and inconvenient for real applications. In [11], Wang et al. proposed a method to eliminate variables and in a single step. A third degree polynomial equation in a single variable was given explicitly.

There are also methods to compute projective invariants of 3D points from two views [10].

In the literature, it is generally noted that the minimal number of point correspondences needed for projective reconstruction from two images is seven. The minimal number of points needed for projective reconstruction from three images is six. This does not mean that we can obtain a definite reconstruction from the minimal number of points only. More points are needed to get a unique solution.

#### 3. Relations of Projective Invariants among Multiple Views

In this section, we will first define the form of the 3D projective invariants. We then derive the basic relations of projective invariants among multiple views.

Suppose that a set of 3D points labeled , is given. Its geometric structure is unknown. The point set is projected into view images by unknown camera matrices . The relationships between them are

where and . The only information available is the point locations in the images and the point correspondences between the projections

where and . It is often supposed that no four points in space are coplanar and no three points in the images are collinear. Otherwise, the problem is much simpler. We can select , and as a basis of the vector space. Other points can be represented as linear combinations of , and :

where . Since points , and are linearly independent, this representation is unique. Since no four points are coplanar, all the coefficients in (16) are nonzero.

Six 3D points in general position have 18 degrees of freedom. Seven 3D points in general position have 21 degrees of freedom. Eight 3D points in general position have 24 degrees of freedom. The 3D projective transformation has 15 degrees of freedom. So six 3D points have three independent projective invariants, seven 3D points have six independent projective invariants, and eight 3D points have nine independent projective invariants. There are many forms of projective invariants. It is known that the cross ratios of coefficients in (16) are projective invariants. A set of independent invariants of this form is
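The degree-of-freedom count above amounts to the formula $3n - 15$ for $n$ points in general position. A small sketch of that arithmetic follows; the function and constant names are hypothetical.

```python
POINT_DOF = 3         # a 3D point has three degrees of freedom
PROJECTIVE_DOF = 15   # a 4x4 matrix defined up to scale: 16 - 1


def num_independent_invariants(num_points):
    """Number of independent projective invariants of n points in general position."""
    return POINT_DOF * num_points - PROJECTIVE_DOF


for n in (6, 7, 8):
    print(n, "points:", num_independent_invariants(n), "invariants")
```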

In the rest of this paper, the symbols , will always denote these invariants. Since all the coefficients in (16) are nonzero, none of the invariants in (17) can be zero.
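The invariance of cross ratios of basis coefficients can be illustrated for six points. The sketch below is an assumption-laden example rather than the paper's definition: it writes the fifth and sixth points in the basis of the first four, with coefficients $a_k$ and $b_k$, and uses the ratios $a_1 b_k / (a_k b_1)$ for $k = 2, 3, 4$, which may differ from the exact form of (17). The point coordinates, the matrix `H`, the scale factors, and all function names are hypothetical.

```python
from fractions import Fraction


def matvec(A, v):
    """Product of a matrix (list of rows) and a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def solve(A, b):
    """Solve the square system A x = b exactly by Gauss-Jordan elimination."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        pivot_value = M[col][col]
        M[col] = [v / pivot_value for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [row[n] for row in M]


def coefficients(basis, point):
    """Coefficients of `point` as a combination of four homogeneous basis points."""
    rows = [list(r) for r in zip(*basis)]   # basis points become matrix columns
    return solve(rows, point)


def invariants(points):
    """Assumed invariants a1*bk / (ak*b1), k = 2, 3, 4, of six points."""
    basis, X5, X6 = points[:4], points[4], points[5]
    a = coefficients(basis, X5)
    b = coefficients(basis, X6)
    return [(a[0] * b[k]) / (a[k] * b[0]) for k in (1, 2, 3)]


points = [[1, 0, 0, 1], [0, 1, 0, 1], [0, 0, 1, 1], [0, 0, 0, 1],
          [2, 3, 5, 1], [1, 4, 2, 1]]

# Hypothetical projective transformation and per-point homogeneous scales.
H = [[1, 2, 0, 1], [0, 1, 3, 0], [1, 0, 1, 2], [0, 1, 0, 1]]
scales = [2, -1, 3, 5, 7, -2]
transformed = [[s * c for c in matvec(H, X)] for s, X in zip(scales, points)]

print(invariants(points))
print(invariants(transformed))
```

Both calls print the same three fractions: the transformation leaves the coefficients unchanged, and the homogeneous scale of each point cancels in the ratios. Exact rational arithmetic is used so that the comparison does not depend on a floating-point tolerance.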

The set of projective invariants has the property that when an invariant equals one, four of the 3D points are coplanar. This can be proved easily. For example, if , then . From (16) we have

Subtracting (18) from (19), we get

Since and are not zero, we have a nontrivial linear combination of points , and . So they are coplanar. Conversely, if points , and are coplanar, there are such that

Substituting and using (16) into (21), we obtain

Since points , and are not coplanar, we have

From (23) we obtain

Next, we will derive the basic relations of projective invariants among multiple views. Multiplying each side of (16) by the projection matrices , we have

where and . That is,

where and . Applying variable eliminations to (26), we get

where

Dividing each side of (27) by , we have

Rewriting (29) in another form, we obtain

where

Since the system of equations in (30) has a nontrivial solution , the determinant of every four rows of the coefficient matrix in (30) must be zero. We will use these constraints to derive the 3D projective invariants.
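The coplanarity property stated at the start of this derivation (an invariant equal to one forces four of the points to be coplanar) can be checked on a small example. As a hedged sketch, it assumes the fifth and sixth points have coefficients $a_k$ and $b_k$ in the basis of the first four points and that the invariant in question is $a_1 b_2 / (a_2 b_1)$; this form, the numbers, and the function names are illustrative assumptions. Four homogeneous points are coplanar exactly when the $4 \times 4$ matrix of their coordinates has zero determinant.

```python
def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def combine(coeffs, basis):
    """Linear combination of four homogeneous basis points."""
    return [sum(c * X[r] for c, X in zip(coeffs, basis)) for r in range(4)]


basis = [[1, 0, 0, 1], [0, 1, 0, 1], [0, 0, 1, 1], [0, 0, 0, 1]]

a = [2, 3, 5, -9]            # coefficients of the fifth point
b_unit = [4, 6, 1, 7]        # a1*b2 == a2*b1, so the assumed invariant is 1
b_generic = [1, 4, 2, -6]    # a1*b2 != a2*b1

X5 = combine(a, basis)
X6_unit = combine(b_unit, basis)
X6_generic = combine(b_generic, basis)

# The third, fourth, fifth, and sixth points are coplanar only in the first case.
print(det([basis[2], basis[3], X5, X6_unit]))
print(det([basis[2], basis[3], X5, X6_generic]))
```

The first determinant vanishes because $b_1 X_5 - a_1 X_6$ then involves only the third and fourth basis points; the second does not.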

#### 4. Projective Invariants from Four Views

In [9], the author notes that it is possible to compute projective invariants of six 3D points from five images linearly. In this section, we will derive formulas to compute the 3D projective invariants of six 3D points from four images linearly. The result was first presented in [19].

In the case of four images and six points, from (30) we have the constraints

where . Expanding the determinant in (32), we obtain a system of bilinear equations in variables , and of the following form:

where .

Let denote the coefficient matrix of the system of equations in (33). It is a matrix. The first column of corresponds to the coefficients of , the second column of corresponds to the coefficients of , and so forth. Let denote the th column of the matrix . It can be verified that

Although or or is a possible solution of the system of bilinear equations (33), we discard these solutions since the invariants cannot be zero by definition.

Next, we will derive the solutions of (33) such that is not zero, where . Rewriting (33) in another form, we have

From (35), we can obtain

This is a second-degree polynomial equation in variable .

Applying constraints (34) to (36), we obtain

The solutions of (37) are and

The solution corresponds to the condition that four of the 3D points are coplanar. So it is discarded if we assume that no four of the 3D points are coplanar. Similarly, we can obtain the solutions of and linearly. The solution of is

The solution of is

As we can see from the previous derivation, four images of six 3D points is the simplest configuration for computing 3D projective invariants. By contrast, it is very hard to estimate the quadrifocal tensor of four images, which requires the solution of a system of 81 multilinear equations.
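The step in the derivation above that discards the root equal to one explains why the remaining solution is linear in the coefficients. A quadratic $a x^2 + b x + c = 0$ has the root $x = 1$ exactly when $a + b + c = 0$, and it then factors as $(x - 1)(a x - c)$, so the other root is $c / a$. The sketch below illustrates only this general mechanism; it does not reproduce the paper's coefficients in (36) to (38), and the function name is hypothetical.

```python
from fractions import Fraction


def other_root(a, b, c):
    """Other root of a*x**2 + b*x + c = 0 when x = 1 is known to be a root."""
    if a == 0:
        raise ValueError("not a quadratic equation")
    if a + b + c != 0:
        raise ValueError("x = 1 is not a root of this equation")
    # (x - 1) * (a*x - c) expands to a*x**2 - (a + c)*x + c, and b = -(a + c).
    return Fraction(c) / Fraction(a)


root = other_root(2, -5, 3)   # 2x^2 - 5x + 3 = (x - 1)(2x - 3)
print(root)
```

No square root is needed, which is the sense in which a known trivial root turns a quadratic condition into an explicit rational formula.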

#### 5. Projective Invariants from Three Views

In this section, we will derive formulas to compute the 3D projective invariants of seven 3D points from three images linearly. To our knowledge, no similar method has been reported. There are nonlinear methods to compute the 3D projective invariants of six 3D points from three images [7, 9].

In the case of three images and seven points, from (30) we have

where .

Since the system of equations in (41) has a nontrivial solution , the determinant of the coefficient matrix of every four equations in (41) must be zero. From these constraints, we obtain the following system of equations:

where .

The total number of equations in (42) is 12. We choose the first ten equations as the system of equations to compute the 3D projective invariants. Let denote the coefficient matrix of this system of equations. It is a matrix. The first column of corresponds to the coefficients of , the second column of corresponds to the coefficients of , and so forth. Let denote the th column of the matrix . Let denote the submatrix of the matrix with its th, th, th, and th columns deleted. It can be verified that

Let us denote

Rewriting the system of (42) in the concise form, we have

Since the system of equations in (45) has nontrivial solutions, the determinant of the coefficient matrix must be zero. That is,

Applying the constraint (43) to (46), we have

The solution of (47) corresponds to the condition that four of the 3D points are coplanar, so we simply discard it. The unique solution of is

Similarly, we can have the solutions of , , , , and . The solution of is

The solution of is

The solution of is

The solution of is

The solution of is

#### 6. Projective Invariants from Two Views

In this section, we will derive explicit formulas to compute the 3D projective invariants of eight 3D points from their two images linearly.

Since the system of equations in (30) has a nontrivial solution , the determinant of the coefficients matrices of every four equations in (30) must be zero. From these constraints, we obtain the following system of equations: