Abstract

We propose a vision measurement scheme for estimating the distance or size of an object in a static scene, which requires a single camera with a 3-axis accelerometer rotating around a fixed axis. First, we formulate the rotation matrix and translation vector from one camera coordinate system to another in terms of the rotation angle, which can be computed from the sensor readouts. Second, using the camera calibration data and coordinate system transformations, we propose a method for calculating the orientation and position of the rotation axis relative to the camera coordinate system. Finally, given the rotation angle and two images of the object in the static scene, one taken before and the other after the camera rotation, the 3D coordinates of a point on the object can be determined. Experimental results show the validity of our method.

1. Introduction

Nowadays, digital cameras and mobile phones with cameras are very popular, and it would be appealing and convenient to use them to estimate the distance or size of an object. For this purpose, stereo images with disparity should be taken [1, 2]. One obvious way to acquire stereo images is to use two cameras with different view angles. Given two images of an object from two cameras and the relative orientation and position of the two viewpoints, the 3D coordinates of points on the object can be determined through the correspondence between image points in the two views [3, 4]. In general, however, a mobile phone or professional camera has only one camera and cannot acquire two images from different views simultaneously. Fortunately, many methods have been proposed for stereo vision with a single camera. They may be broadly divided into three categories. First, to obtain virtual images from different viewpoints, additional optical devices are introduced, such as two planar mirrors [5], a biprism [6], convex mirrors [1, 7], or double-lobed hyperbolic mirrors [8]. But these optical devices are expensive and bulky. Second, 3D information about an object is inferred directly from a still image using known geometrical scene constraints, such as planarity of points and parallelism of lines and planes [9-11], or prior knowledge about the scene obtained from supervised learning [12]. Nevertheless, these methods require constrained scenes or extra computation for training the depth models. Third, 3D information is extracted from sequential images taken as the camera moves, an approach often adopted in robotics. Owing to the uncertainties in the sequential camera positions, however, it is difficult to obtain accurate 3D information with that approach [1].

In this paper, we propose a novel vision measurement method for estimating the distance or size of an object in a static scene, which requires a single camera with a 3-axis accelerometer rotating around a fixed axis. From the 3-axis accelerometer, the slope angle of the camera relative to the direction of gravity can be obtained [13]. This angle uniquely determines the position of the camera if the camera is rotated around a fixed axis that is not parallel to gravity. Moreover, the relative position and orientation of the camera between two positions, one before and the other after the camera rotation, are determined by the rotation angle. Therefore, if the camera is calibrated at the two given positions, the classical binocular view methods [4] can be adopted to extract the 3D coordinates of points on objects. Unfortunately, it is very difficult for the user to rotate the camera to exactly the same slope angle as a calibrated one.

To deal with this problem, we first formulate the rotation matrix and translation vector from one camera coordinate system to another in terms of the rotation angle. Then, with camera calibration data at various positions, we calculate the orientation and position of the rotation axis relative to the camera coordinate system. Accordingly, for any two given positions, the rotation matrix and translation vector from one camera coordinate system to the other can be calculated from the rotation angle, and with the two images collected at the two positions, the 3D coordinates of points on the object can be determined.

The paper is organized as follows. Section 2 describes a stereo vision system built from a single camera through rotation, and Section 3 presents the method for calculating the rotation matrix and translation vector from one camera coordinate system to another from the rotation angle. The method for calculating the position and orientation of the rotation axis relative to the camera coordinate system is proposed in Section 4, and the complete calibration procedure and the 3D coordinate calculation of a point on an object are presented in Section 5. Experimental results are given in Section 6, and conclusions and discussion are given in Section 7.

2. Stereo Vision through Camera Rotation

In what follows, we adopt the following hypotheses.
(H1) The camera is equipped with a 3-axis accelerometer whose three readouts are the components of gravity along the x, y, and z axes of the sensor coordinate system, respectively.
(H2) During calibration and measurement, the camera rotates around a fixed rotation axis that is parallel to one axis of the sensor coordinate system (in the experiments of Section 6, the sensor axis parallel to the optical axis of the camera) and is not parallel to the direction of gravity.

Thus, in the course of the rotation, the readout along the sensor axis parallel to the rotation axis stays constant. As a result, the component of gravity in the plane of the other two sensor axes also stays constant in magnitude (see Figure 1). Therefore, the slope angle with respect to this in-plane gravity component (position, for short) can be determined from the two remaining readouts by (1).

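From position $\theta_i$ to position $\theta_j$, where $\theta_i$ and $\theta_j$ are the slope angles before and after the rotation, the rotation angle of the device is

$$\beta = \theta_j - \theta_i. \tag{2}$$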
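As a minimal sketch of how the slope angle and the rotation angle might be computed from raw accelerometer readouts: it assumes the rotation axis is parallel to the sensor z axis, so that the slope angle is taken in the x-y plane. The function names and the choice of `atan2(gx, gy)` as the angle convention are illustrative assumptions, not necessarily the exact form of (1).

```python
import math

def slope_angle(gx, gy):
    """Angle (radians) of the in-plane gravity component.

    Assumption: the rotation axis is parallel to the sensor z axis, so only
    the x and y readouts change during rotation. atan2 keeps the full
    (-pi, pi] range, which a plain arctan of the ratio would lose.
    """
    return math.atan2(gx, gy)

def rotation_angle(readout_i, readout_j):
    """Rotation angle (radians) from position i to position j.

    Each readout is a (gx, gy, gz) tuple; gz is ignored because it is
    assumed constant while the camera rotates about the z axis.
    """
    theta_i = slope_angle(readout_i[0], readout_i[1])
    theta_j = slope_angle(readout_j[0], readout_j[1])
    beta = theta_j - theta_i
    # Wrap into (-pi, pi] so that crossing the +/-pi boundary is handled.
    return math.atan2(math.sin(beta), math.cos(beta))

# Hypothetical readouts in units of g: gravity along +y, then tilted by 30 degrees.
before = (0.0, 1.0, 0.0)
after = (math.sin(math.radians(30)), math.cos(math.radians(30)), 0.0)
beta_deg = math.degrees(rotation_angle(before, after))
print(round(beta_deg, 6))
```

In practice the readouts are noisy, so averaging several samples at each position before calling `slope_angle` would likely be advisable.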

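Two images, $I_i$ and $I_j$, of an object are collected at the positions $\theta_i$ and $\theta_j$, respectively. $O_i$-$x_iy_iz_i$ and $O_j$-$x_jy_jz_j$ denote the camera coordinate systems at $\theta_i$ and $\theta_j$, respectively. $(x_i, y_i, z_i)$ and $(x_j, y_j, z_j)$ denote the coordinates of a point on the object relative to $O_i$-$x_iy_iz_i$ and $O_j$-$x_jy_jz_j$, respectively, and $(u_i, v_i)$ and $(u_j, v_j)$ denote the image coordinates of that point on $I_i$ and $I_j$ (image coordinates, for short), respectively.

The projection of the object coordinates relative to $O_i$-$x_iy_iz_i$ and $O_j$-$x_jy_jz_j$ into image coordinates is summarized by the following forms, where $f_x$, $f_y$, $c_x$, and $c_y$ are the intrinsic parameters of the camera:

$$z_i \begin{bmatrix} u_i \\ v_i \\ 1 \end{bmatrix} = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x_i \\ y_i \\ z_i \end{bmatrix}, \qquad z_j \begin{bmatrix} u_j \\ v_j \\ 1 \end{bmatrix} = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x_j \\ y_j \\ z_j \end{bmatrix}. \tag{3}$$

Let

$$\begin{bmatrix} x_j \\ y_j \\ z_j \end{bmatrix} = R \begin{bmatrix} x_i \\ y_i \\ z_i \end{bmatrix} + t, \tag{4}$$

where $R$ and $t$ denote the rotation matrix and translation vector between $O_i$-$x_iy_iz_i$ and $O_j$-$x_jy_jz_j$.

Substituting (3) into (4), we get

$$z_j \begin{bmatrix} (u_j - c_x)/f_x \\ (v_j - c_y)/f_y \\ 1 \end{bmatrix} = z_i R \begin{bmatrix} (u_i - c_x)/f_x \\ (v_i - c_y)/f_y \\ 1 \end{bmatrix} + t. \tag{5}$$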

Thus, the 3D coordinate of the point on the object relative to the camera coordinate system at the first position (object coordinate) can be determined by (6), where the auxiliary terms are given by (7).

From (6), we can see that the object coordinate can be found provided that the four intrinsic parameters of the camera, the rotation matrix, and the translation vector between the two camera coordinate systems are available.
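The dependence on the intrinsic parameters, the rotation matrix, and the translation vector can be illustrated with a small sketch. It back-projects both image points to rays and solves for the two depths in the least-squares sense; this is one common way to triangulate and is not necessarily the closed form of (6). The names `fx`, `fy`, `cx`, `cy` (focal lengths and principal point in pixels), `triangulate`, and all numeric values are assumptions made for illustration.

```python
import math

def mat_vec(m, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def triangulate(intrinsics, R, t, pix_i, pix_j):
    """Return the point in the first camera frame, given P_j = R P_i + t."""
    fx, fy, cx, cy = intrinsics
    # Back-project pixels to rays with unit depth: P = z * ray.
    ray_i = [(pix_i[0] - cx) / fx, (pix_i[1] - cy) / fy, 1.0]
    ray_j = [(pix_j[0] - cx) / fx, (pix_j[1] - cy) / fy, 1.0]
    # z_i * (R ray_i) - z_j * ray_j = -t : three equations, two unknowns.
    c1 = mat_vec(R, ray_i)
    c2 = [-x for x in ray_j]
    rhs = [-x for x in t]
    # Normal equations of the 3x2 least-squares problem, solved by Cramer's rule.
    a11, a12, a22 = dot(c1, c1), dot(c1, c2), dot(c2, c2)
    b1, b2 = dot(c1, rhs), dot(c2, rhs)
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("rays are parallel; the point cannot be triangulated")
    z_i = (b1 * a22 - a12 * b2) / det
    return [z_i * r for r in ray_i]

def project(intrinsics, p):
    """Pinhole projection of a 3D point in the camera frame to pixels."""
    fx, fy, cx, cy = intrinsics
    return (fx * p[0] / p[2] + cx, fy * p[1] / p[2] + cy)

# Hypothetical setup: 10 degree roll about the optical axis plus a small offset (mm).
K = (800.0, 800.0, 320.0, 240.0)
ang = math.radians(10.0)
R = [[math.cos(ang), -math.sin(ang), 0.0],
     [math.sin(ang), math.cos(ang), 0.0],
     [0.0, 0.0, 1.0]]
t = [30.0, -20.0, 5.0]
p_i = [100.0, 50.0, 1000.0]
p_j = [a + b for a, b in zip(mat_vec(R, p_i), t)]
estimate = triangulate(K, R, t, project(K, p_i), project(K, p_j))
print([round(v, 3) for v in estimate])
```

With noise-free synthetic pixels the estimate should reproduce `p_i`; with real pixels the least-squares form spreads the error over all three equations instead of discarding one.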

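With the camera calibration method proposed by Zhang [14], the camera intrinsic parameters and the extrinsic parameters describing the camera motion around a static scene can be computed. Let $R_i$ and $t_i$ denote the extrinsic parameters at position $\theta_i$, and $R_j$ and $t_j$ the extrinsic parameters at position $\theta_j$. For simplicity, $P_w = (x_w, y_w, z_w)^T$, $P_i = (x_i, y_i, z_i)^T$, and $P_j = (x_j, y_j, z_j)^T$ stand for the coordinates of the point on the object relative to the world coordinate system and the camera coordinate systems at $\theta_i$ and $\theta_j$, respectively. Thus,

$$P_i = R_i P_w + t_i, \qquad P_j = R_j P_w + t_j. \tag{8}$$

From (8), we get

$$P_w = R_i^{-1} (P_i - t_i). \tag{9}$$

Substituting (9) into (8), we get

$$P_j = R_j R_i^{-1} P_i + t_j - R_j R_i^{-1} t_i. \tag{10}$$

Equation (4) can be rewritten as

$$P_j = R P_i + t, \tag{11}$$

where $R$ and $t$ are the rotation matrix and translation vector between the two camera coordinate systems. Thus,

$$R = R_j R_i^{-1}, \qquad t = t_j - R_j R_i^{-1} t_i. \tag{12}$$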
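The step from two sets of extrinsic parameters to the relative rotation and translation can be sketched as follows. The sketch assumes each set of extrinsics maps world coordinates to camera coordinates (camera point = rotation times world point plus translation), as in Zhang-style calibration; the function name `relative_pose` and the numeric values are hypothetical.

```python
import math

def transpose(m):
    return [[m[c][r] for c in range(3)] for r in range(3)]

def mat_mul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]

def mat_vec(m, v):
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]

def relative_pose(R_i, t_i, R_j, t_j):
    """Relative pose (R, t) such that P_j = R P_i + t."""
    # The inverse of a rotation matrix is its transpose.
    R = mat_mul(R_j, transpose(R_i))
    R_t_i = mat_vec(R, t_i)
    t = [a - b for a, b in zip(t_j, R_t_i)]
    return R, t

def rot_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

# Hypothetical extrinsics at two positions (translations in mm).
R_i, t_i = rot_z(math.radians(5.0)), [10.0, 0.0, 1000.0]
R_j, t_j = rot_z(math.radians(25.0)), [-40.0, 15.0, 1000.0]
R, t = relative_pose(R_i, t_i, R_j, t_j)

# Consistency check: a world point mapped through either route must agree.
p_w = [120.0, -30.0, 0.0]
p_i = [a + b for a, b in zip(mat_vec(R_i, p_w), t_i)]
p_j = [a + b for a, b in zip(mat_vec(R_j, p_w), t_j)]
p_j_from_i = [a + b for a, b in zip(mat_vec(R, p_i), t)]
print(max(abs(a - b) for a, b in zip(p_j, p_j_from_i)))
```

Using the transpose rather than a general matrix inverse avoids numerical inversion, though it relies on the calibrated rotation matrices being orthonormal.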

Based on the previous discussion, the object coordinate can be determined by the following process.
(1) The camera is calibrated, and the four intrinsic parameters are obtained.
(2) The camera is calibrated at the two positions, and the corresponding extrinsic parameters (a rotation matrix and a translation vector for each position) are obtained. Then, the rotation matrix and translation vector between the two camera coordinate systems can be acquired from (12).
(3) In the course of measurement, the camera is rotated carefully to the same two positions, and an image of the object is collected at each. Once the image coordinates of a point on the object are known in both images, its object coordinate can be computed by (6).

It should be noted that it is rather difficult for the user to rotate the camera accurately to the two positions at which the camera was calibrated.

3. Calculation Method of Rotation Matrix and Translation Vector

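Intuitively, the orientation and position of the camera coordinate system at the second position relative to that at the first position, that is, the rotation matrix and translation vector of (4), depend on the camera rotation angle. Thus, both may be computed from the rotation angle.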

Consider a coordinate system associated with the rotation axis, in which the rotation axis is the unit vector along one coordinate axis. The orientation and position of this coordinate system relative to the camera coordinate system (the orientation and position of the rotation axis) form the homogeneous matrix given in (13).

The rotation matrix for the camera rotation around that axis from the first position to the second (device rotation matrix) is modeled by (14).

Its homogeneous matrix is given by (15).

The coordinate system transformation can be represented as a graph (Figure 2). A directed edge represents a relationship between two coordinate systems and is associated with a homogeneous transformation. From Figure 2, we can get where

Substituting (13), (15), and (17) into (16), we get

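From (20) and (21), we can see that the rotation matrix and translation vector between the two camera coordinate systems can be calculated from the rotation angle, provided that the orientation and position of the rotation axis are available.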
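A minimal sketch of this dependence, under stated assumptions: the rotation axis is taken as the z axis of the axis coordinate system, `R_x` and `t_x` (names introduced here) give the orientation and position of that system in the camera frame, and the relative motion follows the convention P_j = R P_i + t. The sign of the angle depends on the convention chosen for the device rotation matrix, so this may differ from (20) and (21) by the direction of rotation.

```python
import math

def rot_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def transpose(m):
    return [[m[c][r] for c in range(3)] for r in range(3)]

def mat_mul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]

def mat_vec(m, v):
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]

def pose_from_angle(R_x, t_x, beta):
    """Relative camera pose for a rotation by beta about the fixed axis."""
    # Conjugate the rotation about the axis-frame z axis into the camera frame.
    R = mat_mul(mat_mul(R_x, rot_z(beta)), transpose(R_x))
    # Points on the axis do not move, so t_x = R t_x + t, i.e. t = (I - R) t_x.
    R_t_x = mat_vec(R, t_x)
    t = [a - b for a, b in zip(t_x, R_t_x)]
    return R, t

# Hypothetical axis pose: tilted 3 degrees from the optical axis, offset in mm.
R_x = rot_x(math.radians(3.0))
t_x = [60.0, -25.0, 0.0]
R, t = pose_from_angle(R_x, t_x, math.radians(20.0))

# A point on the axis must keep its camera-frame coordinates after rotation.
axis_dir = [R_x[r][2] for r in range(3)]
p_axis = [a + 150.0 * d for a, d in zip(t_x, axis_dir)]
p_moved = [a + b for a, b in zip(mat_vec(R, p_axis), t)]
print(max(abs(a - b) for a, b in zip(p_axis, p_moved)))
```

The fixed-point check at the end is a useful sanity test for any chosen sign convention: whatever the direction of rotation, points on the axis must map to themselves.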

4. Calculation Method of Position and Orientation of Rotation Axis

4.1. Orientation of Rotation Axis

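Suppose that the camera is calibrated at several different positions, yielding calibration data at each position. For each position, consider the rotation angle relative to the first position and the corresponding device rotation matrix, together with the rotation matrix and translation vector between the camera coordinate systems at that position and the first position. Equation (20) can then be written for each position as (22).

When the rotation axis is fixed, its orientation is constant. Thus, given the rotation matrices obtained from calibration and the device rotation matrices obtained from the rotation angles, we can solve (22) for the orientation of the rotation axis.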

Using a normalized quaternion to define the rotation between two coordinate systems provides a simple and elegant way to formulate successive rotations [15-17]. Given rotation matrix

it can be transformed to the quaternion with the following equation [18]:

Similarly, the quaternion can be transformed to the rotation matrix with the following equation [18]:
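For illustration, the two conversions can be sketched with the widely used branch-based formulas. This is a swapped-in textbook conversion, not the method of [18], and the quaternion ordering `(w, x, y, z)` and function names are assumptions of this sketch.

```python
import math

def quat_to_matrix(q):
    """Rotation matrix of a unit quaternion q = (w, x, y, z)."""
    w, x, y, z = q
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]

def matrix_to_quat(m):
    """Unit quaternion (w, x, y, z) of a rotation matrix.

    The branch with the largest diagonal term is used so that the divisor
    s stays well away from zero.
    """
    trace = m[0][0] + m[1][1] + m[2][2]
    if trace > 0:
        s = 2.0 * math.sqrt(trace + 1.0)
        return (0.25 * s, (m[2][1] - m[1][2]) / s,
                (m[0][2] - m[2][0]) / s, (m[1][0] - m[0][1]) / s)
    if m[0][0] > m[1][1] and m[0][0] > m[2][2]:
        s = 2.0 * math.sqrt(1.0 + m[0][0] - m[1][1] - m[2][2])
        return ((m[2][1] - m[1][2]) / s, 0.25 * s,
                (m[0][1] + m[1][0]) / s, (m[0][2] + m[2][0]) / s)
    if m[1][1] > m[2][2]:
        s = 2.0 * math.sqrt(1.0 + m[1][1] - m[0][0] - m[2][2])
        return ((m[0][2] - m[2][0]) / s, (m[0][1] + m[1][0]) / s,
                0.25 * s, (m[1][2] + m[2][1]) / s)
    s = 2.0 * math.sqrt(1.0 + m[2][2] - m[0][0] - m[1][1])
    return ((m[1][0] - m[0][1]) / s, (m[0][2] + m[2][0]) / s,
            (m[1][2] + m[2][1]) / s, 0.25 * s)

# Round trip for a 20 degree rotation about the z axis.
half = math.radians(10.0)
q = (math.cos(half), 0.0, 0.0, math.sin(half))
q_back = matrix_to_quat(quat_to_matrix(q))
print(max(abs(a - b) for a, b in zip(q, q_back)))
```

Note that a quaternion and its negative describe the same rotation, so a round trip is only guaranteed to recover the input up to sign.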

Let each of the three rotation matrices in (22) be represented by its quaternion. With quaternions, a sequence of rotations can be formulated as an equation without involving rotation matrices [15]. As a result, the problem of solving (22) for the orientation of the rotation axis can be transformed into an equivalent problem involving the corresponding quaternions, as follows [15]:

Since the quaternion multiplication can be written in matrix form and with notations introduced in [19], we have the following [16]: where, letting ,

Moreover, these two matrices are orthogonal [16], that is,

Thus, where

Thus, the total error function used to compute the quaternion of the axis orientation is obtained by summing over all positions of the camera.

The unknown is a unit quaternion. Therefore, it can be obtained by solving the constrained minimization problem (34).
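A short worked step may help here. It assumes that the total error is a quadratic form $q^{T} M q$ in the unknown unit quaternion $q$, with $M$ a symmetric $4 \times 4$ matrix accumulated over all camera positions; the symbols $q$, $M$, and $\lambda$ are introduced for this illustration and may not match the notation of (34).

```latex
% Constrained problem
\min_{q} \; q^{T} M q \quad \text{subject to} \quad q^{T} q = 1

% Lagrangian and stationarity condition
L(q, \lambda) = q^{T} M q - \lambda \, (q^{T} q - 1)
\frac{\partial L}{\partial q} = 2 M q - 2 \lambda q = 0
\;\Longrightarrow\; M q = \lambda q

% Value of the cost at a stationary point
q^{T} M q = \lambda \, q^{T} q = \lambda
```

Under this assumption every stationary point is a unit eigenvector of $M$, and the cost there equals the eigenvalue, so the minimizer is the unit eigenvector associated with the smallest eigenvalue. Its sign is arbitrary, since $q$ and $-q$ represent the same rotation.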

4.2. Position of Rotation Axis

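At each calibrated position, (21) can be written as (35).

When the rotation axis is fixed, its position is constant. Thus, given the rotation matrix and translation vector at each calibrated position, we can solve (35) for the position of the rotation axis.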

Let , ], , and thus

Let denote the number of positions of the camera. The total error function is

Since the camera rotation axis is approximately vertical to , the value of approaches zero. Thus, can be obtained by solving the following problem:
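One way to see why an extra condition is needed can be sketched as follows. The sketch assumes the convention $P_k = R_k P_1 + t_k$ between the camera coordinate systems at the first and the $k$th positions, and writes $t_x$ for the position of a point on the rotation axis in the camera frame and $n$ for the unit direction of the axis; these symbols and the particular constraint are assumptions of this illustration and may differ from those of (38).

```latex
% A point on the axis keeps its camera-frame coordinates during rotation:
t_x = R_k t_x + t_k
\;\Longrightarrow\; (I - R_k) \, t_x = t_k, \qquad k = 1, \dots, m

% The axis direction is invariant under every R_k, hence
R_k n = n \;\Longrightarrow\; (I - R_k) \, n = 0

% Least-squares estimate with the unobservable component fixed to zero
\min_{t_x} \sum_{k=1}^{m} \left\| (I - R_k) \, t_x - t_k \right\|^{2}
\quad \text{subject to} \quad n^{T} t_x = 0
```

Because $(I - R_k)$ annihilates $n$, moving $t_x$ along the axis does not change the residual, so the component of $t_x$ along the axis cannot be recovered from the data and must be fixed by a constraint of this kind.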

5. Calculation of Position and Orientation of Rotation Axis and 3D Coordinate

5.1. Calculation of Position and Orientation of Rotation Axis

Based on the previous discussions, the complete process of calculation of position and orientation of rotation axis is outlined below:

(1) A chessboard (see Figure 3) is printed and pasted onto a flat surface. By rotating and moving the camera appropriately, a set of chessboard images is collected. Then, the values of the four intrinsic parameters of the camera are obtained by the method proposed in [14].
(2) The camera is fixed on a camera tripod so that the rotation axis of the camera, which is parallel to one axis of the sensor coordinate system (hypothesis H2), lies in an approximately horizontal plane.
(3) By rotating the camera around the fixed axis to different positions, another set of chessboard images is collected.
(4) The extrinsic parameters of the camera at each of these positions are obtained by calling the function cvFindExtrinsicCameraParams2 in OpenCV [20].
(5) The rotation matrix and translation vector between the camera coordinate systems at each position and the first position are computed by (12). The rotation angle and its corresponding device rotation matrix are also calculated, by (2) and (14).
(6) The rotation matrices from step (5) and the device rotation matrices are converted into quaternions by using the method proposed by Bar-Itzhack [18].
(7) The quaternion of the axis orientation is found by solving problem (34). As a result, the orientation of the rotation axis is obtained.
(8) The position of the rotation axis is obtained by solving problem (38).

5.2. 3D Coordinate Calculation

Based on the previous discussion, the complete process of 3D coordinate calculation is as follows:
(1) The camera is fixed on a camera tripod whose rotation axis lies in an approximately horizontal plane.
(2) At a certain position, an image of the object is collected.
(3) By rotating the camera about the fixed axis to another position, a second image of the object is collected.
(4) The rotation angle and its corresponding device rotation matrix are computed by (2) and (14).
(5) With the orientation and position of the rotation axis, the rotation matrix and translation vector between the two camera coordinate systems are calculated by (20) and (21).
(6) The image coordinate of the point on the object in the first image is selected manually.
(7) The corresponding image coordinate in the second image is determined by a stereo correspondence method, for example, the function FindStereoCorrespondenceBM in OpenCV [20], or manually.
(8) The 3D coordinate of the point on the object relative to the camera coordinate system at the first position is computed by (6).

6. Experiments

Since a digital camera with a 3-axis accelerometer was not available to us, an iPhone 4, which has such a sensor, was used. To simulate a digital camera, the phone was placed in a box fixed on a camera tripod. We ensured that the relevant axis of the sensor (experimental results show that this axis is parallel to the optical axis of the camera) was parallel to the rotation axis of the camera tripod, so that the readout along that axis stayed constant during the rotation. During calibration, the distance between the camera and the chessboard was about 1000 mm.

Figure 4 plots the quaternion of the rotation matrix between the camera coordinate systems at each position and the 1st position against the rotation angle. One quaternion was calculated by the proposed method from the rotation angle, while the other was converted directly from the rotation matrix obtained from calibration data. From the graph, one can see that the proposed method can calculate the rotation matrix from the rotation angle.

Figure 5 plots the translation vector between the camera coordinate systems at each position and the 1st position against the rotation angle. One vector was calculated by the proposed method from the rotation angle, while the other was taken directly from calibration data. From the graphs, one can see that the proposed method can effectively calculate the translation vector from the rotation angle.

To estimate the accuracy of the 3D coordinates calculated by the proposed method, a chessboard with larger blocks than the one used for calibration was printed. The width of each block is 46.6 mm, and the distance between the chessboard and the camera is about 1200 mm. The distance between two neighboring corners, rather than the distance between the camera and the chessboard, is calculated, because the former is easier to measure by hand. For simplicity, the corners of the blocks in the images are detected automatically by calling the OpenCV function “cvFindChessboardCorners” [19]. Figure 6 depicts the measurement error ratio of the distance between two corners with respect to the rotation angle.
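The evaluation can be sketched as follows, assuming the measurement error ratio is the absolute difference between the computed and the true corner spacing divided by the true spacing. That definition, the function names, and the two corner coordinates are illustrative assumptions; only the 46.6 mm block width comes from the text.

```python
import math

BLOCK_WIDTH_MM = 46.6  # true spacing between neighboring corners

def distance(p, q):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def error_ratio(measured, actual):
    """Relative error of a measured length with respect to the true length."""
    return abs(measured - actual) / actual

# Hypothetical reconstructed coordinates (mm) of two neighboring corners.
corner_a = (10.0, 20.0, 1200.0)
corner_b = (57.0, 20.0, 1200.0)
measured = distance(corner_a, corner_b)
ratio = error_ratio(measured, BLOCK_WIDTH_MM)
print(f"measured {measured:.1f} mm, error ratio {100 * ratio:.2f}%")
```

Averaging this ratio over all neighboring corner pairs at a given rotation angle would give one point on a curve such as the one in Figure 6.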

7. Conclusions

This paper proposed a stereo vision system with a single camera, which requires a digital camera with a 3-axis accelerometer rotating around a fixed axis that is parallel to one axis of the sensor. Under these conditions, the slope angle relative to gravity, which can be computed from the sensor readouts, determines the camera position, and the rotation angle between two positions determines the rotation matrix and translation vector from one camera coordinate system to the other. Accordingly, given the rotation angle and two images of the object, one taken before and the other after the camera rotation, the 3D coordinates of points on the object can be determined. Theoretical analysis and experimental results show the validity of our method.

It should be noted that few digital cameras are equipped with a 3-axis accelerometer. However, we believe that embedding this inexpensive sensor in a digital camera is worthwhile for obtaining stereo vision. Moreover, owing to the higher image quality and larger focus range of such cameras, higher accuracy and a larger measurement range may be obtained. Furthermore, smartphones equipped with the sensor are popular. If a miniature fixed rotation axis were built into a corner of the phone such that it does not move during rotation, then, with the proposed method, the phone could estimate the size of the object in focus and the distance between the phone and the object.

Acknowledgment

This work is supported by the Science and Technology Research Project of Chongqing’s Education Committee (KJ110806).