Mathematical Problems in Engineering

Volume 2014 (2014), Article ID 452803, 14 pages

http://dx.doi.org/10.1155/2014/452803

## Feature Based Stereo Matching Using Two-Step Expansion

^{1}School of Instrumentation Science & Opto-Electronics Engineering, Beihang University, Beijing 100191, China

^{2}National Institute of Metrology, Beijing 100029, China

Received 2 January 2014; Revised 19 June 2014; Accepted 21 July 2014; Published 18 December 2014

Academic Editor: Yi Chen

Copyright © 2014 Liqiang Wang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

This paper proposes a novel feature-based stereo matching method that produces a dense disparity map through two different expansion phases. It finds denser point correspondences than existing seed-growing algorithms and performs well in both short-baseline and wide-baseline situations. The method assumes that, within each image segment corresponding to a 3D surface, the pixel coordinates satisfy 1D projective geometry along the horizontal axis. Firstly, a state-of-the-art feature matching method is used to obtain sparse support points, and an image segmentation-based prior is employed to assist the first expansion. Secondly, the first-step expansion finds more feature correspondences in the uniform region from the initial support points, based on the invariance of the cross ratio under 1D projective transformation. To find enough point correspondences, a regular seed-growing algorithm is then used as the second-step expansion to produce a quasi-dense disparity map. Finally, two different methods are used to obtain a dense disparity map from the quasi-dense pixel correspondences. Experimental results show the effectiveness of the method.

#### 1. Introduction

Stereo matching is an active research topic in computer vision [1]. It produces a disparity map from stereo images captured by cameras at different viewpoints. This technology is important in 3D reconstruction, virtual view rendering, and automatic navigation. A key problem is how to compute a precise disparity map in a complex environment. Much research has addressed this problem. However, inherent challenges remain, such as unavoidable lighting variations, textureless regions, occluded areas, and nonplanar surfaces, which make disparity estimation difficult [2–4].

To address these inherent problems, numerous methods have been proposed in the past two decades. They can be divided into local and global methods [5, 6]. Local methods generally compute the correlation between each point and its candidates over a suitable window and then use a winner-takes-all (WTA) algorithm to select the best candidate for the point [7, 8]. They compute disparity quickly and can flexibly model parametric surfaces within the neighborhood but have difficulty handling poorly textured and ambiguous surfaces. Global methods, in contrast, commonly integrate prior constraints into the optimization of the point correspondences to handle poorly textured areas and lessen matching ambiguities. They produce the disparity map with an energy minimization algorithm and perform better in poorly textured and textureless regions but are limited to modeling piecewise planar scenes [9]. Global methods perform reasonably well when the viewpoints are close [10] but degrade when the distance between viewpoints becomes large [11, 12].
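As an illustration of the local strategy described above, the following is a minimal sketch of window-based matching with WTA selection. It assumes a rectified image pair stored as lists of rows, uses the sum of absolute differences (SAD) as the correlation measure, and searches only nonnegative disparities; the function names `sad_cost` and `wta_disparity` and the parameters `max_disp` and `half` are hypothetical and are not taken from any cited method.

```python
def sad_cost(left, right, y, x, d, half):
    """Sum of absolute differences between the window centred at (y, x) in the
    left image and the window centred at (y, x - d) in the right image."""
    cost = 0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            cost += abs(left[y + dy][x + dx] - right[y + dy][x + dx - d])
    return cost


def wta_disparity(left, right, max_disp, half=1):
    """Winner-takes-all disparity for a rectified pair (assumed setup).

    Pixels whose window does not fit inside the image are left as None.
    """
    h, w = len(left), len(left[0])
    disp = [[None] * w for _ in range(h)]
    for y in range(half, h - half):
        for x in range(half, w - half):
            best_d, best_cost = None, None
            # Only test disparities that keep the shifted window in the image.
            for d in range(0, min(max_disp, x - half) + 1):
                c = sad_cost(left, right, y, x, d, half)
                if best_cost is None or c < best_cost:
                    best_d, best_cost = d, c
            disp[y][x] = best_d  # the candidate with the lowest cost wins
    return disp
```

Because every pixel is decided independently from its own window, a textureless window yields nearly identical costs for many candidates, which is the ambiguity noted above for poorly textured surfaces.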

Large-scale stereo images contain more ambiguous areas than their short-baseline counterparts. Whether the viewpoints are close or far apart, some salient features, such as points of interest, remain invariant. An alternative approach uses reliable feature correspondences as seeds and expands them with a growing-like process to obtain more point correspondences [13–18]. These methods, known as seed-growing or region-growing, can yield much better results than traditional ones under large perspective distortions and increased occlusion. Seed-growing methods have low computational complexity because they do not use global optimization, but they are sensitive to mismatches. To lessen the influence of wrong seeds, Cech and Sara [19] employed an optimal solution and introduced an improved growing method that can handle many difficult cases, such as repetitive or complex textures. The method does not require every seed to have an accurate disparity. However, seed-growing algorithms generate only a semidense disparity map because the feature points are sparse.

To overcome the drawbacks of traditional matching methods and seed-growing algorithms, matched features have been integrated into state-of-the-art stereo methods as soft constraints [3, 20]. In these methods, a primary task is to find accurate point correspondences to serve as ground control points (GCPs) [21]. GCP-based approaches improve stereo matching accuracy and correctness. However, they need considerable time to obtain an accurate disparity map.

In this paper, a robust dense matching algorithm based on two-step expansion is proposed, building on previous works [19, 22–24]. Sparse support points are obtained by state-of-the-art feature matching methods [22, 23]. Before the two-step expansion, the segmentation-based prior [24] is used to encode the assumption that a region of uniform color corresponds to a single 3D surface. The first step is a feature expansion based on the invariance of the cross ratio under projective transformation. The basic idea is to match more features from the initial support points in the uniform region via the cross ratio constraint. However, this step alone cannot find enough matched pixels to obtain a dense disparity map. To obtain more point correspondences, in the second step the matched features from the first step are used as seeds to grow a quasi-dense disparity map, which is denser than the feature correspondences of the first step but still not fully dense. For the stage from quasi-dense to dense disparity, the paper introduces two methods: (i) a fitting process, in which planar surface fitting is used to remove mismatches and fill blank occluded areas in the uniform region, and (ii) a synthesized method, in which an optimal solution incorporates the quasi-dense pixels into global energy methods to reduce matching ambiguities.

This work mainly focuses on the first step, which uses a feature expansion algorithm for stereo matching. In the first step, we suppose that a set of sparse points lies on the same 3D surface and that the coordinates of the corresponding image pixels satisfy 1D projective geometry along the horizontal axis. Our motivation comes from the fact that points on a line are related by a 1D projective transformation, under which the cross ratio is invariant. Using the invariance of the cross ratio, the inhomogeneous coordinates of each corresponding pixel can be approximated. The accurate coordinates of the corresponding pixel are then found by a search model that computes a correlation statistic over neighboring pixels. In addition, to handle poorly textured regions, we employ a propagation algorithm to expand pixels with weak features. Occluded areas can be filled by the fitting process or the synthesized method, and the fitting process does not use cross-checking (checking and optimizing the disparity by computing the differences between the left-to-right and right-to-left disparities). Experimental results demonstrate that the two-step expansion method performs well compared with existing ones. It produces denser disparity than existing seed-growing algorithms and gives good results in both short-baseline and wide-baseline stereo matching.
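The role of the cross ratio can be illustrated with a small numerical sketch. It assumes one common convention for the cross ratio of four collinear points with 1D coordinates, namely ((x1 - x2)(x3 - x4)) / ((x1 - x3)(x2 - x4)); the paper's own definition may order the points differently. The helper names `cross_ratio`, `predict_fourth`, and `projective_1d`, as well as the example transformation, are hypothetical. Given three known correspondences and a fourth point in the left image, the invariance yields an estimate of the fourth coordinate in the right image, which in the method described above would only be an approximation to be refined by the correlation search.

```python
from fractions import Fraction


def cross_ratio(x1, x2, x3, x4):
    """Cross ratio of four collinear points given by 1D coordinates
    (assumed convention; other orderings of the points are also in use)."""
    return ((x1 - x2) * (x3 - x4)) / ((x1 - x3) * (x2 - x4))


def predict_fourth(left_pts, right_pts3):
    """Estimate the right-image coordinate of the fourth point.

    left_pts: four coordinates in the left image.
    right_pts3: right-image coordinates of the first three points.
    Solves cross_ratio(u1, u2, u3, u4) == cross_ratio(x1, x2, x3, x4) for u4.
    The denominator vanishes when the fourth point maps to infinity.
    """
    x1, x2, x3, x4 = left_pts
    u1, u2, u3 = right_pts3
    cr = cross_ratio(x1, x2, x3, x4)
    a = u1 - u2
    b = u1 - u3
    return (a * u3 - cr * b * u2) / (a - cr * b)


def projective_1d(x, a, b, c, d):
    """A 1D projective transformation x -> (a*x + b) / (c*x + d)."""
    return (a * x + b) / (c * x + d)


# Illustrative data: exact rational arithmetic avoids rounding effects.
left = [Fraction(0), Fraction(1), Fraction(3), Fraction(6)]
right = [projective_1d(x, 2, 1, 1, 3) for x in left]
print(cross_ratio(*left), cross_ratio(*right))   # the two values agree
print(predict_fourth(left, right[:3]), right[3])  # prediction equals the true value
```

With exact correspondences the prediction is exact; with noisy support points it only narrows the search range, which is why a local correlation search around the predicted position is still needed.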

The paper is structured as follows. Related work is discussed in Section 2. In Section 3, we introduce a support-point based expansion algorithm with the cross ratio constraint. Section 4 describes the two-step expansion method, focusing on the first step, which applies feature expansion. In Section 5, we describe two different methods to produce a dense disparity map. Section 6 gives the experimental validation supporting the feasibility of the method. Section 7 concludes the paper and outlines future work.

#### 2. Related Work

A large body of literature is related to this work. Scharstein and Szeliski [1] summarized dense stereo methods and established an early test bed for stereo matching algorithms. Geiger et al. later provided a new outdoor benchmark [25] for the quantitative evaluation of large-scale stereo matching. Seitz et al. [26] presented a comprehensive study and comparison of stereo techniques. It included two main strategies for obtaining stereo correspondence: local approaches based on feature correspondences and global methods based on energy minimization. In our method, the two-step expansion algorithm and the subsequent fitting process belong to the first strategy, and the synthesized method falls into the second.

Dense global methods based on energy minimization have performed well over the past decade. Local stereo algorithms based on feature correspondences estimate disparity quickly [1, 27] but cannot effectively handle blurry borders and mismatches [7]. Hence, most leading stereo matching algorithms first use local approaches to find pixel correspondences and then incorporate them into global constraints by dynamic programming (DP) [28–31], level sets [32], space carving [33], PDE [12, 34], EM [35], and voxel coloring [36]. More recently, two global methods based on Markov random fields (MRFs) have served as the basic algorithms to be improved: Graph Cuts [37] and Belief Propagation [38]. Many works building on them have achieved good results [4, 39, 40]. Both methods are often used as baselines for the top contenders in dense stereo matching and are powerful tools for producing disparity maps, but they become intractable in wide-baseline stereo. In contrast, our method can lessen the matching ambiguities and is efficient for large-scale stereo matching.

Approaches based on sparse local features are robust for large-scale images. Image features play an important role in computer vision and have already been used in wide-baseline stereo matching [41–43]. In a wide-baseline setup, the inherent problems are perspective distortion and occlusion. Feature-based matching methods are particularly effective because features are robust, distinctive, and invariant to various image and scene transformations [22, 23, 44–47]. However, traditional methods based on feature matching produce only sparse pixel correspondences. To find more matched points than features alone provide, a propagation algorithm from the matched points to their neighbors is introduced.

The rule of growing a region from primary seeds was used for image segmentation [48]. The seed-growing principle was originally introduced into stereo matching by Otto and Chau [49], O’Neill and Denos [50], and Kim and Muller [51] and was used in the photogrammetric community. Lhuillier and Quan [15, 52] then employed the epipolar constraint and the uniqueness constraint to greedily grow adjacent components into unmatched disparity regions from corresponding seeds. This growth algorithm does not perform well in areas with repetitive patterns. Zeng et al. [17, 18] used a best-first strategy as an optimal solution to replace the pixel-wise growth increments. However, the optimization cannot remove earlier match errors, especially in complex scenes. Kannala and Brandt [53] and Megyesi et al. [54] introduced a propagation algorithm based on affine deformation of image similarity patches, but wrong initial seeds led to inaccurate affine parameters and poor propagation. Cech and Sara [19] introduced an optimal solution and presented a seed-growing method that can recover from errors in the initial seeds. However, the method produces only a semidense disparity map. In contrast, our method not only handles difficult cases (e.g., repetitive texture, complex scenes, and wrong initial seeds) but also produces denser point correspondences than the existing methods.
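The general seed-growing principle discussed above can be sketched as a best-first propagation on a rectified pair. This is a generic illustration and does not reproduce any of the cited algorithms: the pixelwise similarity stands in for a window-based correlation, the limit of one pixel of disparity change between neighbors is an assumed smoothness rule, and the names `similarity`, `grow_from_seeds`, and `min_sim` are hypothetical. The epipolar constraint appears as the restriction of candidates to the same row, and the uniqueness constraint as the rule that each pixel in either image is matched at most once.

```python
import heapq


def similarity(left, right, y, xl, xr):
    """Toy similarity: negated absolute intensity difference (higher is better)."""
    return -abs(left[y][xl] - right[y][xr])


def grow_from_seeds(left, right, seeds, min_sim=-5):
    """Best-first growth from seed matches (y, xl, xr) on a rectified pair.

    Returns a dict mapping each matched left pixel (y, xl) to its column xr
    in the right image.
    """
    h, w = len(left), len(left[0])
    # heapq is a min-heap, so similarities are negated to pop the best first.
    heap = [(-similarity(left, right, y, xl, xr), y, xl, xr)
            for y, xl, xr in seeds]
    heapq.heapify(heap)
    matched_left, matched_right = {}, set()
    while heap:
        _, y, xl, xr = heapq.heappop(heap)
        # Uniqueness: skip candidates that reuse an already matched pixel.
        if (y, xl) in matched_left or (y, xr) in matched_right:
            continue
        matched_left[(y, xl)] = xr
        matched_right.add((y, xr))
        # Propose 4-neighbours whose disparity changes by at most one pixel.
        for dy, dx in ((0, 1), (0, -1), (1, 0), (-1, 0)):
            ny, nxl = y + dy, xl + dx
            if not (0 <= ny < h and 0 <= nxl < w):
                continue
            for nxr in (xr + dx - 1, xr + dx, xr + dx + 1):
                if not 0 <= nxr < w:
                    continue
                if (ny, nxl) in matched_left or (ny, nxr) in matched_right:
                    continue
                s = similarity(left, right, ny, nxl, nxr)
                if s >= min_sim:
                    heapq.heappush(heap, (-s, ny, nxl, nxr))
    return matched_left
```

Growth stops wherever no neighbor passes the similarity threshold, which is why such schemes typically leave unmatched gaps and yield a semidense rather than a dense map.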

To compute an accurate dense disparity map, we incorporate quasi-dense pixel correspondences as GCPs into a state-of-the-art global matching framework. In the stereo matching literature, GCP-based methods can achieve precise results. Bobick and Intille [2] used GCPs to optimize the DP solution and reduce large occlusions. Kim [3] and Wang et al. [20] used GCPs in a preprocessing stage to guide the matching process and reduce false matches. In [21], a GCP-based regularization was incorporated into a global method using the Bayes optimization rule. In contrast, our method does not require specially provided GCPs; it supplies quasi-dense pixel correspondences as GCPs.

Geiger et al. proposed ELAS, a generative probabilistic model [7] for wide-baseline stereo matching, and offered the challenging KITTI dataset [25]. On the KITTI dataset, the methods [55–57] that were used to compute optical flow had better results. In contrast, our method is purely a stereo matching strategy, and its results are compared with those of ELAS.

#### 3. Efficient Expansion with Cross Ratio Constraint

##### 3.1. Cross Ratio Constraint Model

In the epipolar geometry of two views, the corresponding point is restricted to the epipolar line. To find its precise position, traditional algorithms perform an exhaustive search along the corresponding line and compute a correlation statistic for all candidates. To speed up the position estimation of the corresponding point on the line, we introduce a new constraint based on 1D projective geometry.

We assume a stereo vision system as shown in Figure 1. There are three sets of four collinear points in the epipolar plane. Each set is related to the others by a line-to-line projective transformation. Since the cross ratio is invariant under a 1D projective transformation, it has the same value as