Abstract
In traditional adaptive-weight stereo matching, the rectangle-shaped support region incurs excessive memory consumption and computation time. We propose a novel line-based stereo matching algorithm that obtains a more accurate disparity map with low computational complexity. The algorithm can be divided into two steps: disparity map initialization and disparity map refinement. In the initialization step, a new adaptive-weight model based on a linear support region is put forward for cost aggregation. In this model, a neural network is used to evaluate the spatial proximity, and the mean-shift segmentation method is used to improve the accuracy of the color similarity; the Birchfield pixel dissimilarity function and the census transform are adopted to establish the dissimilarity measurement function. The initial disparity map is then obtained by loopy belief propagation. In the refinement step, the disparity map is optimized by an iterative left-right consistency checking method and a segmentation voting method. The parameter values involved in the algorithm are determined through extensive simulation experiments to further improve the matching effect. Simulation results indicate that the new matching method performs well on standard stereo benchmarks and that its running time is remarkably lower than that of the algorithm with a rectangle-shaped support region.
1. Introduction
Stereo vision is a fundamental technique for extracting 3D information about a scene from two or more 2D images. It is widely applied in robot navigation, remote sensing, and industrial automation. One of the key technologies of stereo vision is stereo matching, which produces a disparity map. Stereo matching algorithms can be classified into two broad categories: global-based and local-based algorithms.
Global-based matching algorithms follow the energy minimization principle. First, an energy function is established, consisting of a data term and a smoothness term. Next, this function is minimized with a global optimization method: dynamic programming [1], loopy belief propagation (LBP) [2, 3], and graph cuts [4, 5] are usually employed to find the minimum of the energy function. Because they exploit comprehensive global constraint information, global-based algorithms can produce more accurate disparity maps.
A local-based matching algorithm is a simple, effective, and commonly used method for stereo matching. An important underlying assumption of local-based matching is that pixels in a support region have approximately equal disparities. To satisfy this assumption, determining the support region size is critical: the region must contain enough pixels to capture intensity variation, yet include only pixels with the same disparity. Thus, the traditional local-based matching method is prone to false matches for pixels in depth-discontinuity regions, since those pixels come from different depths. To make local-based matching perform well in practical applications, various approaches have been proposed. For example, adaptive windows have been used to improve matching results: these methods search for an appropriate support region for each pixel and greatly outperform standard local-based methods [6–10]. However, searching for a support region with an arbitrary shape and size is difficult, and most of these methods have high computational complexity. Other researchers keep the size and shape of the support region constant and instead assign different support-weights to the pixels within it [11–13].
In recent years, several methods for acquiring satisfactory stereo matching results have been adopted. Yang et al. [17] presented a stereo matching algorithm that integrates color weighting, belief propagation, left-right checking, color segmentation, plane fitting, and depth enhancement. Mei et al. [18] integrated the AD-census cost measurement function, a cost aggregation method based on cross-based regions, a scan-line optimization method, a multi-step refinement method, and CUDA-based acceleration into their algorithm.
The algorithm presented in this paper is inspired by the adaptive-weight matching algorithm. Our aim is to propose a stereo matching algorithm with low computational complexity and high accuracy, so the rectangle-shaped support region is replaced with a line-shaped support region. The main weakness of the line-shaped support region is its lack of pixel information, which easily causes matching errors. Adaptive weighting can make full use of the limited pixel information: by analyzing how the adaptive-weight model proposed in [13] affects disparity accuracy, we use a neural network (NN) to determine the spatial proximity and a mean-shift-based segmentation method to describe the color similarity effectively.
In addition, several approaches are applied to complete the algorithm. We develop a new pixel dissimilarity measurement function that combines the Birchfield pixel dissimilarity measurement function and the census transform to compute the matching cost. The loopy belief propagation method proposed in [2], which is accelerated with min-convolution and an image pyramid, is employed to estimate the initial disparity map. The initial disparity map still leaves measurable room for improvement; to further improve its accuracy, iterative left-right consistency (LRC) checking and segmentation voting are used to refine it, based on an analysis of the features of the initial disparity map.
2. Algorithm Description
The algorithm presented in this paper can be divided into two steps: a disparity map initialization step and a refinement step. The framework of the algorithm is shown in Figure 1. A detailed description of this algorithm is given in the following sections.
2.1. Adaptive-Weight-Based Cost Aggregation Method
Consider a center pixel whose disparity is to be computed, together with a neighboring pixel in its support region. According to [12], the support-weight of the neighboring pixel is assigned as the product of two terms: one representing the spatial proximity and one representing the color similarity. Our algorithm is designed on the basis of this framework. The list of variables used in this paper is given at the end of the paper.
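For concreteness, the classic adaptive support-weight of [12] combines the two terms multiplicatively; the symbols below are illustrative, since the original equation is not reproduced here:

```latex
% Adaptive support-weight in the style of [12]; symbol names are
% illustrative: \Delta g_{pq} is the spatial distance between center
% pixel p and neighbor q, \Delta c_{pq} their color difference, and
% \gamma_s, \gamma_c controlling constants.
w(p,q) = f_s(\Delta g_{pq}) \cdot f_c(\Delta c_{pq})
       = \exp\!\left(-\frac{\Delta g_{pq}}{\gamma_s}\right)
         \exp\!\left(-\frac{\Delta c_{pq}}{\gamma_c}\right)
```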
2.1.1. The Model of Line-Based Adaptive Weight
Wang et al. [13] noted that when the support region is large enough, color similarity plays the major role in computing the center pixel's disparity within a certain range. As shown in Figure 2, the red segment represents the support region, whose pixels are used to compute the disparity of the center pixel. According to [13], the effects of spatial proximity can be neglected in the pale blue region; we call this region the transition area. To satisfy this principle, a neural network can be applied to design the spatial proximity model.
Figure 3 shows the spatial proximity model established by the neural network. The position of a pixel is the input, the spatial proximity is the output, and the connection weights are shown in the figure. In fact, the distance to the center pixel is the input of the neural network; to simplify the notation, suppose that the center pixel is at the origin, so that the distance reduces to the position of a pixel. The concrete form of the spatial proximity is expressed through the sigmoid function. Figure 4 demonstrates how the spatial proximity varies with the position of a pixel.
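As a minimal sketch of this idea (with illustrative constants, since the paper's trained network weights are not listed here), a single sigmoid unit per side reproduces the shape in Figure 4: a near-constant weight inside the transition area that decays smoothly toward the ends of the line:

```python
import math

def spatial_proximity(x, L=17, T=9):
    """Sigmoid-shaped spatial proximity along a line-shaped support
    region.  x is the signed offset of a neighboring pixel from the
    center pixel, L the half-length of the support region, and T the
    half-length of the transition area.  The exact weights are learned
    by the paper's neural network; these constants are illustrative."""
    if abs(x) > L:
        return 0.0  # outside the support region
    # Sigmoid of the distance past the transition boundary: close to 0
    # for |x| < T (weight stays near 1), rising toward 1 near the ends.
    s = 1.0 / (1.0 + math.exp(-(abs(x) - T)))
    return 1.0 - 0.9 * s
```

Inside the transition area the weight is nearly flat, so color similarity dominates there, matching the observation of [13].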
In Figure 4, the space between the two blue lines is the transition area, and the support region is represented by the whole axis. It can be seen from Figure 4 that the spatial proximity of pixels in the transition area is significantly greater than that of pixels in other areas, which accords with the spatial proximity model of traditional adaptive-weight theory. Moreover, there is not much difference in spatial proximity among the pixels within the transition area, which means that the influence of spatial proximity can be neglected there.
According to the segmentation-based stereo matching principle [14, 19], a new model of color similarity was established in [20] as follows:
Equation (3) shows that color similarity contributes enormously to measuring the dissimilarity between the center pixel and a neighboring pixel when the two pixels belong to the same segment.
A color similarity model based on image segmentation can achieve good performance, as introduced in [13]. Mean-shift is a nonparametric iterative estimation technique whose application domains include computer vision, clustering, and image processing [21]. In this work, we use mean-shift as the segmentation method.
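The paper applies the full mean-shift image segmentation of [21]; as a toy illustration of the underlying mode-seeking iteration only (1-D data, flat kernel, all names hypothetical):

```python
def mean_shift_1d(points, x0, bandwidth=2.0, iters=50, tol=1e-6):
    """One mean-shift mode seek on 1-D data with a flat kernel.
    Illustrative only: the paper segments color images with the full
    mean-shift algorithm of [21], which also uses a spatial radius
    and a color radius (see the List of Variables)."""
    x = float(x0)
    for _ in range(iters):
        # Pixels inside the current window (flat kernel of the given bandwidth).
        window = [p for p in points if abs(p - x) <= bandwidth]
        if not window:
            break
        new_x = sum(window) / len(window)  # shift to the window mean
        if abs(new_x - x) < tol:
            return new_x                   # converged to a mode
        x = new_x
    return x
```

Starting points inside the same basin of attraction converge to the same mode, which is what groups pixels into one segment.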
2.1.2. Cost Aggregation
The matching cost of a pixel at a given disparity is represented by the weighted aggregation in (4), where the corresponding pixels in the right image are obtained by shifting the center pixel and its neighbors by the disparity of the center pixel.
The pixel dissimilarity measurement function in (4) is very important for cost aggregation. The absolute difference and the Birchfield function [22] are widely used in cost aggregation. To improve the matching accuracy in textureless and repetitive regions, our pixel dissimilarity measurement function combines the Birchfield function and the census transform:
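A scanline sketch of such a combined measure is given below. The helper names, the 1-D census window, and the exponential mixing constants are all illustrative assumptions; the paper's actual function (and a full, symmetric Birchfield-Tomasi measure with a 2-D census window) may differ:

```python
import math

def census_5x1(row, x):
    """Census bit-string of a pixel over a 1-D 5-pixel neighborhood
    (a full implementation would use a 2-D window)."""
    c = row[x]
    return tuple(int(row[x + k] < c) for k in (-2, -1, 1, 2))

def hamming(a, b):
    """Hamming distance between two census bit-strings."""
    return sum(u != v for u, v in zip(a, b))

def bt_dissimilarity(left, right, x, d):
    """One direction of the Birchfield-Tomasi sampling-insensitive
    dissimilarity: compare left[x] against the linearly interpolated
    right-image intensity around x - d.  The full measure of [22] is
    symmetric (it also interpolates the left image)."""
    xr = x - d
    rm = 0.5 * (right[xr] + right[xr - 1])  # half-pixel to the left
    rp = 0.5 * (right[xr] + right[xr + 1])  # half-pixel to the right
    lo, hi = min(rm, rp, right[xr]), max(rm, rp, right[xr])
    return max(0, left[x] - hi, lo - left[x])

def combined_cost(left, right, x, d, lam=10.0, mu=30.0):
    """Hypothetical combination of the two measures via the usual
    robust exponential mixing; lam and mu are illustrative constants."""
    c_census = hamming(census_5x1(left, x), census_5x1(right, x - d))
    c_bt = bt_dissimilarity(left, right, x, d)
    return (1 - math.exp(-c_census / lam)) + (1 - math.exp(-c_bt / mu))
```

The census term is robust to radiometric differences, while the Birchfield-Tomasi term preserves sub-pixel intensity information; mixing them keeps both strengths.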
To validate the effect of this cost aggregation method, simulation results on Teddy and Cones with the Birchfield method and with our method are shown in Figure 5.
2.2. Initial Disparity Determination
The winner-take-all (WTA) searching strategy is a common method for determining the disparity: each pixel is assigned the disparity with the minimum aggregated cost, which can be expressed by
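A minimal sketch of WTA selection over an aggregated cost volume (the list-of-lists layout is an assumption for illustration):

```python
def wta_disparity_map(cost_volume):
    """Winner-take-all: for each pixel, pick the disparity whose
    aggregated cost is minimal.  cost_volume[x][d] is the aggregated
    matching cost of pixel x at disparity d."""
    return [min(range(len(row)), key=row.__getitem__) for row in cost_volume]
```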
WTA tends to produce a low-accuracy disparity map. Therefore, we adopt the efficient LBP algorithm proposed in [2]. In this LBP algorithm, min-convolution (computed via the distance transform) and an image pyramid are integrated into LBP, which effectively decreases its complexity while preserving the matching quality. The flowchart of the initialization procedure is shown in Algorithm 1.

2.3. Disparity Refinement
It is inevitable that initial disparity maps will contain many mismatched pixels. To refine the disparity map, a two-step post-processing method is put forward in this section.
2.3.1. Left-Right Consistency Check
The disparity map for the left image is computed by the previous steps, and the disparity map for the right image is computed in a similar manner. In [15, 17], pixels are classified into several types according to the left and right disparity maps in order to remove outliers; different strategies are then designed to determine the disparity of each type of pixel.
In this work, we analyzed the properties of the initial disparity map. Figure 6 shows the distribution of bad pixels after the disparity initialization step of our algorithm has been executed for Tsukuba. It can be seen from this figure that most of the bad pixels are concentrated in the occluded region. Accordingly, most pixels are matched correctly, and the initial disparities in the nonocc region should be trusted. We therefore propose an iterative left-right consistency check to handle the remaining errors.
Pixels can be divided into two types: dependable pixels and undependable pixels. A pixel is classified as dependable when it satisfies the following condition:
A pixel is considered undependable if it satisfies the following condition: The new disparity of an undependable pixel is computed as in Algorithm 2.
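The dependability test amounts to checking that the left and right disparity maps agree at corresponding positions. A one-scanline sketch (tolerance and names are illustrative; the paper additionally iterates the check and re-fills undependable pixels from dependable neighbors):

```python
def lrc_classify(disp_left, disp_right, tol=1):
    """Left-right consistency check on one scanline.  A left-image
    pixel x with disparity d is dependable when the right image's
    disparity at the matching position x - d agrees with d within tol;
    otherwise it is undependable (typically occluded or mismatched)."""
    dependable = []
    for x, d in enumerate(disp_left):
        xr = x - d  # matching position in the right image
        ok = 0 <= xr < len(disp_right) and abs(d - disp_right[xr]) <= tol
        dependable.append(ok)
    return dependable
```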
Figure 7 shows the result after the execution of the iterative left-right consistency check, and Table 1 gives detailed data on its effect.
2.3.2. Segmentation Voting
The pixel-wise region shown in Figure 8 is established according to color consistency. Pixels in this region satisfy the following condition:
Let a histogram represent the frequency distribution of the disparities in this region. The new disparity is then updated by Algorithm 3.
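A sketch of the voting rule: the dominance threshold below plays the role of the paper's confidence level for segmentation voting, and the names are illustrative:

```python
from collections import Counter

def segmentation_vote(disparities, min_ratio=0.4):
    """Replace a pixel's disparity with the most frequent disparity
    among same-segment neighbors, but only if that disparity is
    sufficiently dominant in the histogram.  Returns None (keep the
    old disparity) when no disparity reaches the min_ratio share."""
    hist = Counter(disparities)              # disparity -> vote count
    d_best, votes = hist.most_common(1)[0]   # winning disparity
    if votes / len(disparities) >= min_ratio:
        return d_best
    return None
```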

Finally, a median filter is applied to the left disparity map. Figure 9 shows the effect of segmentation voting on the accuracy of the disparity map.
3. Results and Discussion
3.1. Parameter Determination
The parameters involved in our algorithm greatly affect the performance of the algorithm. In this section, we present the parameter settings.
We considered eight main parameters, which are kept constant for all benchmarks.
Figure 10 shows the influence of one parameter on the accuracy of the disparity map obtained by our algorithm. When this parameter varies from 35 to 65, our algorithm is insensitive to it.
Figure 11 demonstrates how the performance of our algorithm varies with two parameters. Generally speaking, when the first parameter is larger than 15, its influence on algorithm performance is small for Tsukuba, Venus, and Teddy, and our algorithm shows good performance for Tsukuba and Teddy. The error percentage tends to decrease as the second parameter decreases when the first is smaller than 15. For Cones, the performance improves as the first parameter increases, particularly when it lies between 15 and 30.
Figure 12 shows the influence of another parameter on performance. The trend of the error percentage is U-shaped for Tsukuba when the parameter is smaller than 20, bottoming out between 7 and 13. The error percentage for Venus follows a downward trend. For Teddy and Cones, the error percentage is inversely proportional to the parameter; when it is larger than 12, the error percentage becomes insensitive to it.
Figure 13 shows the performance of our algorithm with respect to two further parameters. From these figures, our algorithm is robust to different values of them; even at the extreme settings tested, the error percentage of the disparity map obtained by our algorithm remains fairly low.
For the disparity map refinement step, two main parameters, introduced previously, must be set. Figure 14 shows their influence on performance; at the chosen settings, all data sets show good performance.
3.2. Experimental Results
We evaluate our algorithm on the Middlebury benchmarks [23] with an error threshold of 1. The test platform consists of a T9600 CPU and 5 GB of memory; the software environment is MATLAB 2014a and VS2012. Parameter values are shown in Table 2.
Simulation results on the Middlebury data sets are presented in Figure 15. The quantitative performance of our algorithm is shown in Tables 3 and 4. Our algorithm ranked 5th on the Middlebury benchmark (July 1, 2014).
These results demonstrate that our algorithm performs well. However, it is difficult to decrease the error percentages in all three regions (nonocc, all, and disc) at the same time, because many pixels with correct disparities are classified as undependable according to (8). Tsukuba can be taken as an example: Figure 16 shows the bad pixels detected by (7) and (8).
Figure 6 shows the bad pixels obtained by comparing the initial disparity map with the ground truth; it indicates that the true bad pixels are distributed in the occlusion region. However, it can be inferred from Figure 16 that many pixels in the nonocc and disc regions are mistakenly classified as undependable. When standard LRC checking is used, misclassified pixels in the disc region may be assigned a wrong disparity, as shown in Figure 17. Table 5 shows quantitative comparison results for standard LRC checking and iterative LRC checking.
Thus, it can be seen that the error percentage in the disc region greatly increases after applying standard LRC checking. Because the disc region is part of the nonocc region, the error percentage in the nonocc region also increases. Our iterative LRC checking method can effectively improve the performance of our stereo matching algorithm.
3.3. Running Time of Our Algorithm
In this section, we investigate the running time of our algorithm, which directly reflects its computational complexity. Without loss of generality, the algorithm was run 50 times and the average running time was calculated. Results are shown in Table 6.
A rectangle-based adaptive-weight method can be obtained by extending the new weight model to a rectangle-shaped support region. Table 7 displays the running time of the cost aggregation of our algorithm with the rectangle-shaped support region under different support region sizes. The rectangle-based algorithm was also run 50 times for each support window size.
Our algorithm with the rectangle-shaped support region performed best at the optimal support region size; the error percentages of the initial disparity map in the nonocc, all, and disc regions are 1.45%, 3.42%, and 7.08%, respectively. Tables 5–7 show that our algorithm produces notable results in decreasing computational complexity and improving performance.
4. Conclusions
In this work, we proposed a new line-based adaptive-weight stereo matching algorithm that integrates several methods. The main conclusions that can be drawn from our results are as follows.
(1) Cost aggregation is the most time-consuming part of a stereo matching algorithm. Using a line-shaped support region can dramatically reduce the elapsed time of cost aggregation.
(2) The adaptive-weight model proposed in this paper can produce a rather satisfactory initial disparity map despite the limited pixel information.
(3) Experimental results show that the proposed algorithm attains a better matching effect with less running time.
Although our algorithm performs well on the Middlebury data sets, there is still much room for improvement.
(1) There are too many parameters in our algorithm to accommodate different image pairs. In further research, we will analyze the intrinsic relationships among the parameters and reduce their number.
(2) Figure 13 indicates that image characteristics have a significant impact on the optimum parameter values. In future studies, we will explore how these optimum values vary with the texture of the image.
List of Variables
:  The half-length of the support region 
:  Shape-controlling parameter 
:  The half-length of the transition region 
:  Spatial radius of mean-shift 
:  Color radius of mean-shift 
:  The half-length of the segment for segmentation voting 
:  Confidence level of color similarity for segmentation voting.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
The authors express their appreciation for the financial support of the Shandong Natural Science Foundation, Grant no. ZR2013FL033. They also extend their sincere gratitude to reviewers for their constructive suggestions.