Abstract

This paper presents a novel relative attitude estimation approach for a uniform motion and slowly rotating noncooperative spacecraft. It is assumed that the uniform motion and slowly rotating noncooperative chief spacecraft is in failure or out of control and there is no a priori rotation rate information. We utilize a very fast binary descriptor based on binary robust independent elementary features (BRIEF) to obtain the features of the target, which are rotational invariance and resistance to noise. And then, we propose a novel combination of single candidate random sample consensus (RANSAC) with extended Kalman filter (EKF) that makes use of the available prior probabilistic information from the EKF in the RANSAC model hypothesis stage. The advantage of this combination obviously reduces the sample size to only one, which results in large computational savings without the loss of accuracy. Experimental results from real image sequence of a real model target show that the relative angular error is about 3.5% and the mean angular velocity error is about 0.1 deg/s.

1. Introduction

With the development of the space technology, the number of spacecraft on orbit is increasing rapidly year by year. At the same time, the space debris including thousands of inactive satellites, fragments of broken-up spacecraft, and equipment lost by astronauts is mounting [1]. And there has been a growing interest in autonomously servicing spacecraft on orbit to perform tasks such as spacecraft rescue and repairing, refueling, and removal. Several space programs have been completed or are being developed as technology demonstration of these tasks, such as JAXA’s ETS-7 [2], NASA’s DART program [3], DARPA’s Orbital Express program [4], and SUMO program [5]. On one hand, the malfunctioning spacecraft take up the valuable and limited orbit, which increases the potential security risks. On the other hand, they are also a kind of available resources on orbit. Therefore, the capture of a slowly rotating spacecraft has received detailed attentions in recent years [69].

Nowadays, vision-based navigation systems, which have the advantage of being entirely passive and of low power consumption, are usually used to determine the relative attitude and position [10, 11]. The vision measurement has been separated into two aspects of model based and model-independent methods [12]. The first class usually needs the CAD model of the target object or some other kinds of a priori knowledge. Cropp et al. [13] extracted the line features in an image by a monocular camera and calculated the matched relationship with a priori model. They got the relative pose of two stable spacecraft by the nonlinear least square optimization theory. The constraint function was the orthogonality between the normal vector of line in the image frame and the plane composed of the same line and the optic center in the camera frame. Kelsey et al. [12] obtained the pose of a slowly rotating spacecraft model by three steps which include the initialization by triangulation, the refinement by Iterative Reweighted Least Squares, and the tracking by EKF. In their experiments in laboratory, the attitude error was up to about 20 degrees and the range error was more than 10% of range.

The model-independent method is more challenging because no prior information will be used. In other words, the target is absolutely a noncooperative object. Priggemeyer et al. [14] acquired the target picture sequence by a stereo camera system with a baseline of 0.2 m. Afterwards, they extracted the point features with SURF. And then the DLT algorithm [15] solving a linear equation system was applied to find the projection matrices. Finally, a Kalman filter was applied as a recursive least squares filter to reduce noise. Oumer and Panin [16] and Segal et al. [17] presented a system for feature-based 3D tracking of a noncooperative satellite. Sequential 3D motion estimation from a stereo camera has also been tackled using filtering schemes.

Up to now, most of the algorithms and systems are focused on a scene that the camera optical axis is parallel to the spin direction of the target. This hypothesis greatly reduces the degree of difficulty, because the feature face is in the field of view all the way. In our paper, we propose the optical axis and the spin direction, mutually perpendicular. So there is no fixed feature face to our camera. This proposes high demand of the robustness and universality of the algorithm. The contribution of this paper is the proposal of a new single candidate RANSAC algorithm that exploits the probabilistic prediction obtained from the EKF in order to increase the efficiency of the spurious match rejection step. The calculation speed is further improved by using a very fast feature extractor. Meanwhile, we introduce the image pyramid and the binary descriptor for the sake of ensuring the scale invariance and rotational invariance.

2. Feature Initialization

2.1. Reference Frames

As in Figure 1, there are some frames defined as follows.

(1) Pixel Frame . The frame is centered at the top left point of the resolution, the -axis is directed to the right level, and -axis is orthogonal with -axis in the same plane.

(2) Image Frame . The frame is centered at the place where the optic axis intersects with the image plane; the -axis and -axis are parallel to the -axis and -axis, respectively.

(3) Camera Frame . The frame is centered at the optic center, -axis is parallel to the -axis, -axis is parallel to the -axis, and -axis is defined as the cross-product of the other two axes.

(4) World Frame . The frame has its origin and axes anywhere in the real world space.

2.2. Feature Detection and Description

Our proposed feature builds on the well-known fast keypoint detector [18] and the recently developed BRIEF descriptor [19]. Both these techniques are attractive because of their good performance and low cost. FAST takes one parameter, the intensity threshold between the center pixel and those in a circular ring around the center. We use circular radius of 9, which has good performance. FAST does not produce multiscale features. We employ a scale pyramid of the image and produce fast features at each level in the pyramid.

Our approach uses a simple but effective measure of corner orientation, the intensity centroid [20]. Rosin defines the moments of a patch aswhere and belong to a circular region of radius in the pixel frame. With these moments, we may find the centroid:

We can construct a vector from the corner’s center, , to the centroid, . The orientation of the patch then simply is

The BRIEF descriptor is a bit string description of an image patch constructed from a set of binary intensity tests. Consider a smoothed image patch, . A binary test is defined bywhere is the intensity of at a point . The feature is defined as a vector of binary tests:

The vector length could be chosen as 128, 256, 512, and so on. BRIEF has no orientation, so if we want to allow BRIEF to be invariant to in-plane rotation, we should add one to it. Just as described in [21], we use the steered BRIEF operator to solve the problem.

3. EKF-Based Estimation

3.1. State Vector Definition

The whole state vector at step is composed of a set of camera parameters and feature set . All of these are referred to by the world frame .where is the covariance matrix. The estimated feature set is composed of point features . Point features are parametrized in inverse depth coordinate [22] which can be defined by the dimension 6-state vector:

The inverse depth parametrization stores in its six parameters the 3D camera position when the feature was initialized , the azimuth-elevation pair encoding the unit ray pointing to the feature, and its inverse depth along the ray [23]. They could be converted to Euclidean coordinates if and when the projection equation becomes linear enough.where .

3.2. Dynamic Model

The camera state is composed of pose terms: camera optical center position and modified Rodrigues parameters (MRPs) defining orientation; linear and angular velocity and relative to world frame and camera frame .

The state update equation for the camera iswhere and are zero-mean Gaussianly distributed velocity noise coming from an impulse of acceleration.

3.3. Measurement Model

The measurement model used in the experiments of the paper is a pinhole camera model plus a two-parameter radial distortion. The camera is assumed to be calibrated in advance. Inverse depth and Euclidean points in the state vector are first transformed to the camera reference frame:where represents a rotation matrix computed from the state MRPs and    is the function converting azimuth-elevation angles to a unit vector. Points in the camera frame are then projected using the standard pinhole model:

Here stands for the focal length of the camera, and represent the pixel size, and are the image center coordinates. The imaged point is finally transformed using the two parameters , , modeled below, resulting in the distorted measurement

4. Single Candidate RANSAC for EKF Estimation

In the case of the new single candidate RANSAC presented here, extra information for the camera motion comes from the probability distribution function that the EKF naturally propagates over time. Computational savings with respect to standard RANSAC [24] can be easily derived. The number of RANSAC random hypotheses necessary to guarantee that at least one of them is mismatch-free with probability can be computed using the following formula:where is the assumed inlier ratio and the number of measurements that instantiate the model. As a simple but illustrative example, if the inlier ratio is 0.4 and the probability equals 0.99, the number of random hypotheses would be reduced from 447 for to only 9 for . The efficiency is highly improved. The update is performed in two steps using the standard extended Kalman filter equations.

Firstly, we will find the matching relationship between the point features estimated from those in the last image and those extracted in the current one. The inliers in the current image are represented by .

Next, we could update the state vector by picking up one point from the set of randomly.

Consequently we get a new set, and then we finally update the EKF state and covariance.

The initial measurement for a newly observed feature inserted into the state vector iswhere represent the current camera pose estimate, is the image observation, and is the depth prior.

In (17),where and are normalized retina image coordinates. Despite being a non-unit directional vector, the angles by which we parametrize its direction can be calculated as

The covariance of , , , , and is derived from the image measurement error covariance and the state covariance estimate . We set and .

The state covariance after feature initialization is

After each estimation step, the linearity indexis computed for every feature coded in inverse depth:where is computed using (9) and is the submatrix covariance matrix or the corresponding considered feature. The full state covariance matrix is transformed with the corresponding Jacobian:

In Algorithm 1, a pseudocode is shown to state the core steps of single candidate RANSAC for EKF estimation.

    input: , ,
   output: ,
    
    
    for to
    
    
    
    
  
  
  
  
  end if
  
  end for
  

5. Experiment Results

The performance of our algorithm has been tested on real image sequences acquired with a monocular camera under space lighting conditions, based on a terrestrial simulation mockup. The experiment was carried out with a real size model and a camera with a 40° field of view and resolution, capturing monochrome image sequences at 10 fps. In the end of a dark room, the model covered with a reflective golden foil was mounted on a swivel table which operated at the speed of 2 deg/s around the vertical direction. And on the other side, the camera was fixed on a 3-degree-of-freedom rail. A direct high-power floodlight, simulating the spectrum of the sun, illuminates the surface of the model at different incident angles.

Figures 2(a)2(d) show a spaceflight mockup from different view with the red circles on behalf of the tracked features and the blue ones representing the lost ones. Figures 3 and 4 display the equivalent trajectory of the camera in 2 dimensions and 3 dimensions, respectively. Because the camera was fixed in front of the mockup, the trajectory is almost round. Because the satellite is rotating continuously, the measured angle changes periodically. Figure 5 correctly exhibits the situation. There are two lines in the left column in Figure 6: the blue dotted one stands for the actual value, while the red one represents the estimated value. Meanwhile, the absolute error and relative error are shown in Figure 7. We present 3 cycles. Because we use Euler angle to express the rotation, the maximum degree is 180 deg. And the jump points just show the phase switches. If we use the angular accumulation to express the rotation, there will be no jump points. We present here a fitting function using the angular dataset and the image sequence, based on linear polynomial regression algorithm.where is the number of the frame in image sequence and represents the corresponding angle. Through analysis, it is pointed out that the mean error between actual spinning rate and the estimated is below 0.1 deg/s.

6. Conclusions

In this paper, we presented a system for feature-based 3D tracking of a noncooperative spaceflight. The rotational angle and velocity of the target are estimated in camera frame, using image-plane motion fields. Preliminary experimental results have been obtained from a simulation mockup, demonstrating robustness and accuracy properties of the proposed method. The rigid motion is estimated by a bank of single candidate RANSAC EKF which greatly promotes the efficiency of computation. With the combination of scale and rotational invariance feature extractor, the system becomes more robust.

Future system improvements will be the integration of a laser and a camera or two cameras to obtain the depth information, possibly including different features (points, vanish points, lines, planes, etc.), and introducing loop-closure mechanisms and bag of words to improve the accuracy and stability.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The research was sponsored by Shanghai Rising-Star Program (no. 16QB1401000), Key Project of Shanghai Science and Technology Committee (no. 16DZ1120400), and the National Natural Science Foundation of China (Project no. 51705187).