Abstract

In SLAM applications, omnidirectional vision extracts wide-scale information and more features from the environment. Traditional algorithms bring enormous computational complexity to omnidirectional vision SLAM. An improved extended information filter SLAM algorithm based on omnidirectional vision is presented in this paper. Based on an analysis of the structural characteristics of the information matrix, this algorithm improves computational efficiency. Considering the characteristics of omnidirectional images, an improved sparsification rule is also proposed. The sparse observation information is utilized while the strongest global correlations are maintained, so the accuracy of the estimated result is ensured by proper sparsification of the information matrix. Error analysis then shows that the error caused by sparsification can be eliminated by a relocation method. Experimental results show that this method makes full use of the repeated observations of landmarks that omnidirectional vision provides and maintains high efficiency and reliability in mapping and localization.

1. Introduction

The first problem a mobile robot faces is that it must deal with a complicated environment while fulfilling its assignments. Simultaneous localization and mapping (SLAM) is one of the key enabling technologies for a mobile robot's autonomy. SLAM addresses the problem of building a map of an unknown environment while at the same time keeping track of the robot's current location. Many popular SLAM implementations use a distance sensor such as a laser range finder or sonar to explore the environment [13]. We propose to extend this approach to an omnidirectional vision-based system for economic reasons.

Many vision sensor applications have been developed in recent years. Vision sensors can provide continuous image data. To date, research on vSLAM has mainly used stereo or monocular vision [4, 5]. Paper [6] presents a vSLAM method based on an encoder-integrated system and applies a distributed particle filter approach. A vSLAM method based on a tracking approach for "2.5D space" is proposed in paper [7]. A semantic structure of environments is built based on object recognition and laser scans [8]. A calibrated monocular camera is used in a vehicle SLAM system [9], where a 3D reconstruction of the environment is built by matching interest points.

However, ordinary vision sensors have a limited viewing angle. In contrast, omnidirectional vision has a 360-degree field of view and has been widely applied in robot navigation, video conferencing, and surveillance [10]. Omnidirectional vision-based robot localization has been an active research topic for years. Paper [11] develops a topological localization method in which omnidirectional images are matched by a color-histogram algorithm. Omnidirectional vision obtains relatively complete environmental information and strengthens target-tracking capability, which makes up for deficiencies in existing vSLAM studies.

Omnidirectional vision not only provides opportunities for studying vSLAM problems, but it also raises several challenges. First, the large amount of information increases the computational complexity of the SLAM algorithm; second, the distortion of the omnidirectional image is relatively large, which makes it difficult to extract and match features directly. As a result, vSLAM methods based on omnidirectional vision have seldom been applied to localization and mapping in unknown environments. Therefore, omnidirectional vision-based SLAM not only advances vSLAM technology and provides a new idea, but can also broaden the application field of omnidirectional vision (e.g., environment exploration and rescue).

The extended information filter SLAM (EIF-SLAM) algorithm [3, 12] describes the uncertainty of SLAM by an information matrix and an information vector. However, the efficiency of traditional EIF-SLAM is very low because of the enormous computation required to recover the mean vector from the information matrix and vector. Papers [12, 13] improve the accuracy of EIF-SLAM through sparsification of the information matrix by constructing a reasonable Bayes network topology. Paper [14] eliminates elements by using a new sparsification approach in EIF-SLAM and keeps SLAM consistent. Because omnidirectional images contain many more features, these methods cannot reduce the computational complexity owing to the cost of classifying the large number of landmarks. In this paper, based on an analysis of the structural characteristics of the information matrix, an improved extended information filter SLAM algorithm (IEIF-vSLAM) and a corresponding sparsification rule are presented. The analysis of the sparsification error and the experimental verification show that this algorithm not only improves computational efficiency but also maintains the accuracy of the estimated result by proper sparsification of the information matrix. It can be used on a mobile robot to fulfill map building tasks through omnidirectional vision in indoor environments.

2. IEIF-vSLAM Algorithm

2.1. Algorithm Structure

The structural diagram of the proposed IEIF-vSLAM algorithm is shown in Figure 1. It includes 3 parts: feature detection, IEIF, and robot control.

In the feature detection module, a Harris-SIFT feature extraction method based on SIFT [10] is used to match and track features [15]. The KLT operator [16] is used to track the Harris-SIFT feature points. The RANSAC method [17] is used to fit an affine transform and remove wrong match points caused by occlusion. If a new landmark is found, it is recorded in the feature database.
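For illustration, a minimal sketch of such a front end is given below (in Python, using OpenCV). The function names, thresholds, and parameter values are illustrative assumptions and are not taken from the implementation used in this paper.

import cv2
import numpy as np

def detect_harris_sift(gray):
    # Harris corner locations give well-localized, trackable points.
    pts = cv2.goodFeaturesToTrack(gray, maxCorners=300, qualityLevel=0.01,
                                  minDistance=8, useHarrisDetector=True, k=0.04)
    kps = [cv2.KeyPoint(float(x), float(y), 8) for [[x, y]] in pts]
    # SIFT descriptors computed at the Harris locations ("Harris-SIFT").
    kps, desc = cv2.SIFT_create().compute(gray, kps)
    return np.float32([kp.pt for kp in kps]).reshape(-1, 1, 2), desc

def track_klt(prev_gray, cur_gray, prev_pts):
    # KLT (pyramidal Lucas-Kanade) tracking of the Harris-SIFT points.
    cur_pts, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, cur_gray, prev_pts,
                                                  None, winSize=(21, 21), maxLevel=3)
    ok = status.ravel() == 1
    prev_ok, cur_ok = prev_pts[ok], cur_pts[ok]
    # RANSAC affine fit rejects wrong matches (e.g., those caused by occlusion).
    _, inliers = cv2.estimateAffine2D(prev_ok, cur_ok, method=cv2.RANSAC,
                                      ransacReprojThreshold=3.0)
    keep = inliers.ravel() == 1
    return prev_ok[keep], cur_ok[keep]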

In the IEIF module, the information variables are updated from the observation data in the observation update block. According to the movement of the robot, the state variables are estimated in the motion update block. In fact, the computational burden of IEIF-vSLAM is concentrated in the observation update and motion update processes. Owing to the wide scale of information and the huge number of omnidirectional image features, the dimension of the information variables increases rapidly, and this situation worsens as the environment becomes larger [18]. Therefore, it is reasonable for an improved IEIF-vSLAM algorithm, based on an analysis of the structural characteristics of the information matrix, to improve computational efficiency while maintaining the accuracy of the state estimate.

2.2. Information Matrix of SLAM

At time , the state estimates of the robot pose and landmark positions satisfy a Gaussian distribution:

: State of robot, including the pose of the robot.

: Set of landmarks.

: The observation of landmarks.

: The input vector of robot control.

As shown in (1), if and are given, the joint conditional probability density of and satisfies a normal distribution in the state (moment) domain, whose mean value is and whose covariance matrix is . In the information domain, it satisfies an inverse normal distribution, whose mean value is and whose covariance matrix is .

The IEIF-vSLAM algorithm estimates the robot pose and landmark positions according to the updates of the information vector and the information matrix . However, and have no obvious physical meanings, so they should be converted to and . The conversion formula is
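In standard information-filter notation (assumed here, since the paper's own symbols are not reproduced in this text), the conversion in (2) recovers the mean and covariance from the information vector and the information matrix. A minimal numerical sketch:

import numpy as np

def information_to_moments(Omega, xi):
    # Covariance is the inverse of the information matrix;
    # the mean is recovered from the information vector.
    Sigma = np.linalg.inv(Omega)
    mu = Sigma @ xi
    return mu, Sigma

In practice the mean is obtained by solving the linear system Omega * mu = xi rather than by an explicit inverse; this solve is exactly the computation whose cost Section 3 seeks to reduce.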

As shown in Figure 1, the observation is updated at every motion cycle. The motion update, the observation update, and the addition of features are expressed as follows [13]:

Motion Update where

expresses the robot poses’ own correlation.

is the Jacobian matrix of the mobile robot’s motion model.

Add Features expresses the correlation between robot pose and landmark.

is the Jacobian matrix of omnidirectional vision’s observation model.

Observation Update

In (6), is the sparsification matrix. Given the main processes of EIF-SLAM, reducing the computational complexity of the motion update, the observation update, and the addition of features to the database is important for EIF-SLAM's practical application. The efficiency gained by sparsification must also be obtained in a reasonable way, without sacrificing accuracy. Before discussing sparsification, the observation model of omnidirectional vision is analyzed.

2.3. Observation Model of Omnidirectional Vision

In an omnidirectional vision system, is the observation of a feature (landmark), whose position in image pixel coordinates is . is its Gaussian noise and is its covariance matrix. Observation measurements similar to those of a laser sensor are used to replace : is the distance between the robot and the feature, and is the heading angle of the feature in the robot's local coordinates. The derivation of this transformation is described below.

Figure 2 is a schematic plot of the omnidirectional vision system, which is mounted on top of the robot. The height of the lens above the floor is . The omnidirectional vision's hyperboloid mirror is designed as . Its focal point is (one of the foci of the mirror) and the focal distance is . The imaging pixel of a feature is [19].

Then,

As shown in Figure 2, a 2D coordinate frame is built at the other focus of the mirror. Then, the equation of the mirror becomes

The reflected light ray passes through the center of the lens, , and its slope is

Then the intersection point between the incident light ray and the mirror will be calculated as

According to the geometrical relation [19]:

So, the new observations are obtained through (7) and (11). In omnidirectional vision, the observations of landmarks at different distances follow the same Gaussian distribution . However, the uncertainty of a landmark observation differs with distance and is determined by calibration [20]. For the hyperboloid mirror used here, , , and  mm. The uncertainty along one radial axis of the omnidirectional image is obtained by calibration at intervals of 5 pixels. Owing to the central symmetry of the omnidirectional image, the uncertainty in the other directions of the whole image can also be determined and is recorded in a lookup table.
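As a hedged illustration of how a pixel observation is turned into a laser-like (distance, angle) measurement, the following sketch uses the image center and calibrated lookup tables for the radius-to-range mapping and its uncertainty; the image center, table contents, and function names are assumed values for illustration only.

import numpy as np

cx, cy = 648.0, 482.0                        # assumed image center (pixels)
radius_table = np.arange(0.0, 500.0, 5.0)    # calibrated pixel radii (5 px steps)
range_table = 0.02 * radius_table ** 1.3     # placeholder radius-to-range map (m)
sigma_table = 0.05 + 0.002 * radius_table    # placeholder per-radius range std (m)

def pixel_to_observation(u, v):
    # Assumes the image axes are aligned with the robot's local frame.
    du, dv = u - cx, v - cy
    bearing = np.arctan2(dv, du)                         # heading angle of the feature
    r_pix = np.hypot(du, dv)                             # radial pixel distance
    rng = np.interp(r_pix, radius_table, range_table)    # calibrated distance
    sigma = np.interp(r_pix, radius_table, sigma_table)  # calibrated uncertainty
    return rng, bearing, sigma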

According to (6), we get

Therefore, in (6) is described in a sparsification matrix expression [13].
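For orientation, a generic (textbook) extended information filter observation update is sketched below; the paper's expressions in (5) and (6) instantiate this structure with the omnidirectional observation Jacobian and the sparsification matrix. The symbols Omega, xi, and F are assumed standard notation, not necessarily the paper's.

import numpy as np

def eif_observation_update(Omega, xi, mu, z, z_hat, H_low, F, R):
    # Omega, xi: information matrix and vector; mu: current mean estimate.
    # z, z_hat: actual and predicted measurement; R: measurement noise covariance.
    # H_low: low-dimensional measurement Jacobian in (pose, landmark) coordinates.
    # F: sparse projection matrix mapping (pose, landmark) into the full state.
    H = H_low @ F                           # full-dimensional but sparse Jacobian
    Rinv = np.linalg.inv(R)
    Omega_new = Omega + H.T @ Rinv @ H      # only pose/landmark blocks are modified
    xi_new = xi + H.T @ Rinv @ (z - z_hat + H @ mu)
    return Omega_new, xi_new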

3. Sparsification of Information Matrix

3.1. Structure Characteristics Analysis

According to the process descriptions of motion update, observation update, and add features, the structural characteristics [14] of the information matrix based on the EIF formulas can be summarized as follows:
(1) the information matrix is a symmetric positive definite Hermitian matrix [12, 13];
(2) according to (3), the elements of decrease during movement;
(3) according to (6), is the sparsification matrix; then, in , only the elements related to the robot pose and the positions of unupdated landmarks are nonzero, and the observation update strengthens the expression of the unupdated landmarks.

Then, the structural characteristics of the information matrix are obtained as follows:
(1) two elements on the main diagonal denote the link strength of two nodes;
(2) the longer the shortest link between two nodes is, the farther the relevant elements lie from the main diagonal of the information matrix;
(3) the endpoints of the anti-diagonal denote the link strength between the current state and landmark , which is relevant to ; so, if the robot observes the landmark again, the link strength becomes larger.

From the above analysis, most of the elements in the information matrix of omnidirectional vision SLAM are nearly zero. It is therefore reasonable to sparsify the information matrix.

Because of the required linearization of the omnidirectional vision observation model and the data association, the information matrix must be computed in these processes (motion update, observation update, and add features). Among them, the main computation is concentrated on solving (2). For convenience of analysis, it is transformed into a standard system of linear equations: where both and are unit matrices of the same dimension.

As the number of landmarks in the map increases, the cost of solving the linear equations grows greatly. As mentioned above, the information matrix is an almost sparse matrix. If the sparse structure of the information matrix can be exploited properly, the efficiency of the extended information filter calculations will be enhanced.

Compared with a dense matrix, solving the linear equations with a sparse matrix is significantly more efficient. The sparse matrix operation method [21] is used in this paper. Owing to the special operation rules of sparse matrices, the computational cost can be greatly reduced. If the degree of sparsity is unchanged, the gap between the inversion costs of the sparse and the dense matrix grows exponentially as the matrix dimension increases.
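The following small experiment (with a synthetic, nearly sparse, symmetric positive definite matrix) illustrates the point: the same linear system is solved once with a sparse solver and once with a dense solver; for large dimensions the sparse solve is far cheaper.

import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve

n = 3000
A = sparse.random(n, n, density=0.002, format="csr")
A = A + A.T + sparse.eye(n) * n              # symmetric, diagonally dominant
b = np.ones(n)

x_sparse = spsolve(A.tocsc(), b)             # exploits the sparse structure
x_dense = np.linalg.solve(A.toarray(), b)    # ignores sparsity, much slower for large n
assert np.allclose(x_sparse, x_dense)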

3.2. Improved Sparsification Rule

The improved sparsification rule is proposed in this section to reduce the computational cost of EIF-SLAM. According to the analysis of the omnidirectional vision observation model and the structural characteristics of the information matrix, the rule is described as shown in Algorithm 1.

while
    
    
end

Here, denotes the conservative coefficient, is the initial step of sparsification, is the information matrix at time , denotes the dimension of , and denotes the number of rows and columns of the information matrix. The larger is, the higher the precision of EIF-SLAM and the poorer its real-time performance. The larger λ is, the more conservative the initial sparsification is.
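Since the body of Algorithm 1 is not reproduced in this text, the following is only a plausible reconstruction of a thresholding-style sparsification loop under stated assumptions: off-diagonal elements whose magnitude falls below a growing threshold are zeroed until the matrix is sparse enough, the diagonal is never touched, and lam, step, and target_density are assumed names and values.

import numpy as np

def sparsify(Omega, lam=0.9, step=1e-4, target_density=0.05):
    Omega = Omega.copy()
    n = Omega.shape[0]
    threshold = step
    # Raise the threshold until the matrix is sparse enough, never touching
    # the diagonal (each node's own link strength).
    while np.count_nonzero(Omega) > target_density * n * n:
        mask = np.abs(Omega) < threshold
        np.fill_diagonal(mask, False)
        Omega[mask] = 0.0
        threshold /= lam          # lam close to 1 grows the threshold conservatively
    return Omega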

Only the motion update makes the information matrix dense [22]. Therefore, sparsification should be carried out once in every time cycle. Based on the above analysis, the proposed sparsification algorithm leads to higher computational efficiency. However, it also introduces an error into the information matrix and thus decreases the accuracy of the state estimate.

In SLAM applications, the error caused by sparsification can be eliminated by a loop-closure method. This requires designing a closed-loop environment for the robot to navigate, which is often unrealistic in practice. In the information matrix, correlations are expressed as matrix elements, and the differences between correlations are large because of the distortion and the wide field of view of omnidirectional vision. Therefore, the sparsification rule is improved in this section: the sparse observation information is utilized while the strongest global correlations are maintained.

Based on the improved sparsification rule, a relocation method is used to eliminate the error. Relocation means that the robot finds landmarks that have already been recorded in its database (i.e., the area has been explored before).

In the process of sparsification, the environmental feature map is divided into three independent parts [22]:

Here, are the features converted from active (correlated with the robot pose) to passive (uncorrelated with the robot pose), are the features that remain active, and are the features that remain passive.

In SLAM, the posterior probability of the state vector is . According to Bayes' theorem, it can be decomposed as follows:

In (15), if and are given, the robot pose and the passive features are conditionally independent. So, a passive feature can be set to an arbitrary value, such as its estimated value .

In the traditional sparsification rule, is mandatorily converted to passive. Like , is also independent of the robot pose. So, the posterior probability of the state vector in (15) is expressed as follows:

has been divided into two parts:

Here, denotes the features that are mandatorily kept passive during prediction, and denotes the features that have been observed again and are kept active. In order to keep the number of active features below a bound, the features with weaker correlations are selected from ; this selection has the same size as . Then, is divided into two parts:

Here, denotes the features that remain active during prediction, and denotes the features that are mandatorily converted to passive. In the improved sparsification rule, denotes the features converted from active to passive, and denotes the features that remain active. Then, we have

As shown in Figure 3, at time , there are 3 landmarks in the omnidirectional image of the robot. The active features are denoted . Based on the above analysis, the feature with the weakest correlation will be removed during the prediction of . According to the traditional rule, the correlation between the robot and whichever landmark is farthest from is removed, so . However, will in fact be observed at time , and the correlations between the robot and , and between the robot and , will become stronger. According to the improved rule, is removed instead, so . Because the correlation of is maintained in the sparsification matrix, the repeated observation of is defined as relocation; it decreases the uncertainty and the localization error.
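A hedged sketch of this selection rule is given below: features that are being re-observed (relocation) stay active, and among the remaining active features the ones with the weakest robot-feature link strength are converted to passive. The names (active_ids, observed_now, link_strength, max_active) are illustrative assumptions.

def select_passive(active_ids, observed_now, link_strength, max_active):
    # active_ids: ids of currently active features.
    # observed_now: set of ids observed in the current omnidirectional image.
    # link_strength: dict id -> |Omega[robot, feature]| (global correlation).
    # Returns the ids of features to convert to passive.
    if len(active_ids) <= max_active:
        return []
    # Re-observed features keep their correlation with the robot pose.
    candidates = [i for i in active_ids if i not in observed_now]
    # Demote the weakest correlations first, keeping the strongest globally.
    candidates.sort(key=lambda i: link_strength[i])
    n_remove = len(active_ids) - max_active
    return candidates[:n_remove]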

According to the extended information filtering algorithm [12, 13], the positions of landmarks that have been observed repeatedly will be modified. Positions of landmarks without repeated observations but relevant to the relocation area will also be revised, so the overall error of the estimated states is greatly reduced. In practice, omnidirectional vision has a wide field of view, as shown in Figure 4; if the robot operates in an environment with obvious features, there is a large amount of repeated landmark information in two consecutive images. Compared with the previous sparsification approach [13], the proposed sparsification algorithm does not need to classify landmarks, and the relocation method can effectively restrain the estimation errors caused by sparsifying the information matrix.

4. Experimental Results and Discussions

4.1. Experiment Platform

The intelligent wheelchair JiaoLong is used to evaluate the proposed IEIF-vSLAM algorithm. It is developed from a commercial powered wheelchair and is equipped with two encoders (one per driven wheel), a smart motion controller, an on-board PC, a laser range finder, and an omnidirectional vision system [23]. In the sensor layer, the laser range finder and the omnidirectional vision system gather environmental information. The sensor data are processed and the IEIF-vSLAM algorithm runs on the on-board PC in the controller layer, which is equipped with a 1.8 GHz i5 CPU. In the powered wheelchair layer, the wheelchair base is driven by two differential wheels and is powered by a group of 12 V batteries. The user can operate the wheelchair through the joystick. Figure 5 provides an overview of JiaoLong's system architecture. During verification, the Harris-SIFT and IEIF-vSLAM algorithms also run on the on-board PC.

The omnidirectional vision system which has a Point Grey Chameleon camera with an image resolution of 1296(H) × 964(V) pixels is used in this experiment. The camera’s frame rate is 10 Hz.

4.2. Platform Model Analysis

As a differentially driven mobile robot, JiaoLong has the following motion model: where and are the robot poses at times and , and are the linear and angular velocities, is the control period, and is white Gaussian noise with zero mean and variance . So, the Jacobian matrix of the mobile robot's motion model in (3) is expressed as
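A hedged sketch of a standard differential-drive motion model and its pose Jacobian is given below; the exact discretization and noise handling in the paper may differ.

import numpy as np

def motion_model(pose, v, w, T):
    # pose = (x, y, theta); v, w: linear and angular velocity; T: control period.
    x, y, theta = pose
    return np.array([x + v * T * np.cos(theta),
                     y + v * T * np.sin(theta),
                     theta + w * T])

def motion_jacobian(pose, v, w, T):
    # Jacobian of the motion model with respect to the robot pose.
    _, _, theta = pose
    return np.array([[1.0, 0.0, -v * T * np.sin(theta)],
                     [0.0, 1.0,  v * T * np.cos(theta)],
                     [0.0, 0.0,  1.0]])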

The observation model of omnidirectional vision is expressed as where are the global coordinates of a feature, is the current pose of the robot, is the omnidirectional vision measurement (distance and angle), and is white Gaussian noise with zero mean and variance . So, the Jacobian matrix of the observation model in (5) is expressed as

In the proposed IEIF-vSLAM algorithm, the symmetric form of observation model is also used:

Then, in (6) will be expressed as
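For reference, a hedged sketch of the usual range-bearing observation model and its Jacobian with respect to the robot pose and the feature position is given below; the paper's symmetric form and the expression substituted into (6) follow the same structure.

import numpy as np

def observation_model(pose, feature):
    x, y, theta = pose
    xf, yf = feature
    dx, dy = xf - x, yf - y
    rng = np.hypot(dx, dy)                    # distance to the feature
    bearing = np.arctan2(dy, dx) - theta      # heading angle in the robot frame
    return np.array([rng, bearing])

def observation_jacobian(pose, feature):
    x, y, theta = pose
    xf, yf = feature
    dx, dy = xf - x, yf - y
    q = dx * dx + dy * dy
    r = np.sqrt(q)
    # Columns: d/dx, d/dy, d/dtheta, d/dxf, d/dyf.
    return np.array([[-dx / r, -dy / r,  0.0,  dx / r,  dy / r],
                     [ dy / q, -dx / q, -1.0, -dy / q,  dx / q]])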

4.3. Map Building Experiments

The wheelchair is placed in an environment with a room, a door, and a long corridor. A map built by a laser range finder [23] is superimposed for comparison with the map built by the proposed IEIF-vSLAM. The trajectory followed by JiaoLong in the experiment is shown in Figure 6. JiaoLong is driven by a user through the joystick along this trajectory, starting inside room number 200, travelling through the door, passing through two atriums and corridors, and returning to room number 200. This process is repeated for 3 cycles. During driving, the data are transmitted to and recorded on the on-board PC. The data contain the time , odometer data, laser range finder data, and omnidirectional images. To avoid significant data redundancy, a lower threshold is set: if the wheelchair moves by less than 4 cm and turns by less than 1.5°, the image is discarded. The map building and localization experiments are carried out offline to verify the proposed omnidirectional vision SLAM algorithm. The robot's "real" pose (used as the ground truth for the pose obtained by the IEIF-vSLAM algorithm; its accuracy is within 4 cm) is calculated by a scan-match method [24] based on the odometer data and the laser range finder data. Using the time recorded in each frame of data during acquisition, a one-to-one correspondence between the "real" pose and the pose obtained by IEIF-vSLAM is established. The data acquisition experiment took 10 minutes and the total distance of the 3 cycles is 360 m.

As shown in Figure 6, the blue dotted "■" indicates the planned trajectory of the wheelchair, which is controlled through the joystick. This trajectory is obtained by the scan-match method in the 1st circle. The sparsification indices are set as , .

In Figure 7, some omnidirectional vision images and the feature map are shown. Room number 200 is an example of a complex indoor environment: it contains plentiful furniture, such as tables, chairs, and various objects. The atrium has some exhibition boards, doors, and windows along the wall. The corridor, by contrast, has white walls with very few features available. In Figure 8, the black rectangles "■" indicate the localization results and the green solid line indicates the odometer trajectory for the 1st circle from the starting point to the destination. During the initial period, the feature points are extracted and the map is built based on omnidirectional vision observations. Although the odometer data have small accumulated errors, there is no obvious difference between the odometer trajectory and the planned trajectory (Figure 6). However, when the wheelchair navigates out of room number 200, the odometer errors accumulate and the trajectory drifts away from the planned one. Near the destination, the odometer errors have accumulated on a large scale and no longer reflect the wheelchair's navigation position accurately. As shown in Figure 8, owing to the corrections of the IEIF-vSLAM algorithm, the deviation of the IEIF-vSLAM localization remains relatively small from start to end over the three circles.

4.4. Localization Analysis

In order to analyze the localization accuracy of IEIF-vSLAM, as shown in Figure 8, the robot trajectory is divided into 3 sections:
(i) Section I: from the starting point to the door exit;
(ii) Section II: from the door exit to the entrance of the last corridor;
(iii) Section III: from the entrance of the last corridor to the destination.

In Figure 8, the blue stars "" indicate the positions of landmark features. There are rich feature points on both sides of the path in Section I and their distribution is relatively dispersed; there are fewer feature points on both sides of the path in Section II and they are relatively concentrated; the feature points of Section III are rich and more concentrated. Table 1 shows the localization errors (compared with the scan-match results; Max. and Avg. are the maximum and average of all absolute errors) over the three sections.

Section I is a complex indoor environment containing plentiful furniture, such as tables, chairs, and various objects. There are more features with a more dispersed distribution; therefore, more accurate results are obtained.

In Section II, much of the environment consists of white walls with similar-looking windows, so there are fewer features. This leads to inaccurate localization caused by a lack of relocation.

In Section III, there are more feature points and their distribution is relatively concentrated, which easily leads to mismatches. Therefore, the positioning errors are somewhat bigger than those of Section II.

As shown in Table 1, the pose of the robot becomes more accurate in the 3rd circle for all 3 sections. In the 3rd circle, the average error is 0.246 m in the -direction, 0.198 m in the -direction, and 1.6° in heading angle. There are more repeated observations of landmarks, which lead to more relocations, and the estimation errors of the robot pose are effectively restrained.

4.5. Comparison Experiments

In order to verify the validity of the proposed IEIF-vSLAM algorithm, the EIF algorithm [13] is selected for comparison. The EIF algorithm, combined with the same Harris-SIFT front end, also runs on the platform described in Section 4.1. The comparison experiment is likewise repeated for 3 cycles in the same scenario. Only the Max. and Avg. values in the 3rd circle are calculated and recorded. The average time consumption per algorithm cycle (including the EIF or IEIF update, excluding image processing, feature detection, matching, and tracking) is also recorded.

The results in Table 2 show that, compared with the EIF algorithm, the proposed IEIF-vSLAM algorithm is more accurate and its time consumption is only half that of the EIF algorithm. The advantage of the IEIF-vSLAM algorithm in computational efficiency is obvious.

5. Conclusion

An IEIF-vSLAM method based on omnidirectional vision has been proposed in this paper. Both the structural characteristics of the information matrix and omnidirectional vision's repeated observations of landmarks have been analyzed. Based on these analyses, the sparsification rule has been improved: the sparse observation information is utilized while the strongest global correlations are maintained. Both the computational efficiency and the accuracy of the estimated results are improved by proper sparsification of the information matrix. Error analysis, carried out before the real-platform experiments, shows that the error caused by sparsification can be eliminated by the proposed relocation method. The experimental results show that this method, which exploits omnidirectional vision's repeated observations of landmarks, can be used for mobile robot map building and localization.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is partly supported by the National High Technology Research and Development Program of China under Grant 2012AA041403 and the Natural Science Foundation of China under Grant 61175088.