Stereo Image Matching for Vehicle-Borne Mobile Mapping System Based on Digital Parallax Model
Considering automatic and effective stereo image matching for vehicle-borne mobile mapping system (VMMS), a new stereo image matching algorithm based on digital parallax model (DPM) is proposed in this paper. The new matching propagation strategy is designed in this algorithm, which includes two processes as DPM construction and parallax prediction. With some known matched points, the DPM of stereo image pairs is firstly constructed, and parameters for confirming conjugate epipolar line is also worked out. Then searching range during dense matching can be confirmed under constraints of DPM and epipolar line, which can improve matching speed and accuracy. Furthermore, to improve matching robustness, the computation model of similarity measurement combined with local structure feature and global color feature is designed. The new algorithm is applied to actual stereo images taken by VMMS to verify its validity. Experimental results show that the proposed approach has higher reliability and accuracy.
In recent years, vehicle-borne mobile mapping systems (VMMS) integrated with multisensors as GPS, INS, laser scanner, and CCD cameras have grown rapidly for quick acquisition of geospatial data [1–5]. The most important subsystem of VMMS is digital photogrammetric system, which can extract geometric information of ground objects efficiently and rapidly, and has been widely used in many fields as surveying and mapping [6, 7], 3D modeling [8–10] industrial surveying , traffic engineering , and so forth. In digital photogrammetric system, stereo image matching is the key step for automatic search of homologous image points and automatic calculation of geometric information. However, digital close-range images have some inconvenient factors for image matching, such as high resolution, complex distortion, small change of objects’ partial texture, more phenomena of parallax discontinuity, and so forth. Therefore, how to improve reliability, accuracy and speed of image matching is a critical problem in the application of digital stereo photogrammetry.
Two basic problems should be resolved in stereo image matching. One is the computation of similarity measurement and the other is the confirmation of searching range of homologous image point. Similarity measurement is the basis for judging whether two image points are homologous, which will impact matching robustness. Size of searching range determines the amount of candidate points, which will impact matching efficiency and reliability. After long-term research, many scholars proposed numerous image matching algorithms focused on the above two problems. In terms of computation methods of similarity measurement, there are mainly two types of similarity measurement as measurement based on grey level [13–16] and measurement based on features [17–20]. Regarding confirmation methods of searching range of homologous point, there are mainly following matching strategies like hierarchy matching strategy based on wavelet pyramid or Gauss pyramid [21, 22], matching strategy based on epipolar-line constraint [23–25] matching strategy based on parallax continuity constraint [26–28], and so forth. These methods not only improve the automatic level and processing speed of digital photogrammetry to a large extent, but also have a great reliability on the open areas with smooth terrain. However, in face of various kinds of close-range images, some problems still exist in these methods. For example, similarity measurement based on grey level cannot work well in textureless regions such as water bodies, and the reliability of similarity measurement based on features depends on the effectiveness of used features. Although epipolar geometric constraint helps to transform searching range from two dimensions to one dimension, the matching efficiency is still unsatisfactory when searching along the whole epipolar line. Parallax continuity constraint can further narrow the searching range of candidate point, but this constraint will lead to many matching errors in regions with discontinuous parallax. Furthermore, under this constraint, matching accuracy of the latter point depends on its previous adjacent point, which will result in catenation response of matching errors. The same problem also exists in hierarchy matching strategy based on image pyramid, and the process of generating image pyramid is also time consuming.
Therefore, considering that the main aim of VMMS is to obtain 3D coordinates of feature points on ground objects, and two cameras in VMMS share characteristics of fixed baseline and correlation, this paper proposes a new stereo image matching algorithm based on digital parallax model (DPM). In the new algorithm, by utilizing known calibration data of stereo photogrammetric system, DPM of stereo image is constructed first, meanwhile, parameters for confirming conjugate epipolar line is also worked out. Then according to the parallax prediction based on DPM, searching ranges of homologous points on right image are confirmed along conjugate epipolar line. Next, to synthetically use color and structure information of matching window, the similarity measurement combined with (scale invariable feature transformation) SIFT and color feature is designed to help the proposed algorithm to be possessed with a higher matching accuracy rate and reliability on close-range stereo image.
2. Description of the Algorithm
2.1. Construction of Digital Parallax Model
Generally, before acquiring 3D coordinate information of objects with VMMS, calibration need to be performed for its stereo photogrammetric system in three-dimensional calibration field (Shown in Figure 1(a)), which can work out demanded parameters for confirming coordinate computation model. Moreover, a lot of known control points must be selected exactly on stereopair during system calibration (Shown in Figure 1(b)). In this way, a large number of homologous image points can be obtained accurately.
(a) 3D calibration field
(b) Known control points on calibration stereopair
Therefore, by using the meanings and expression method of DEM  for reference, the paper defines the following digital parallax model: DPM is the finite sequence of 3D vectors for expressing horizontal parallax. Its function format is described as follows: where is column and row number of pixel i in left image. is horizontal parallax of pixel at the location . is column number of homologous point corresponding to pixel i in right image. N is the total number of homologous points on calibration stereopair.
As a lot of homologous image points and () can be obtained during system calibration, the paper directly uses discrete 3D vector sequence defined by formula (1) to express horizontal parallax relation between point on left image and its homologous point on right image, namely, digital parallax model of stereopair. This expression way of digital parallax model brings convenience to consequent work of parallax prediction based on DPM. To intuitively express proposed DPM, Figure 2(a) shows data structure of DPM, which is a two-dimensional relation table. Figure 2(b) shows the sketch of DPM. In this figure, the parallelogram represents the left image. Each cuboid represents one pixel at the position in DPM, and the cuboid height represents the horizontal parallax value of this pixel.
(a) Data structure of DPM
(b) The sketch of DPM
The above method of DPM construction is also adaptable to arbitrary stereopairs acquired by uncalibrated stereo photogrammetric system. For a stereopair, if sparse homologous image points and () are obtained by manual or automatic methods in advance, its DPM can also be constructed by formula (1). On the basis of DPM, matching propagation can be carried out with the following method described in Section 2.3.
2.2. Computation of Conjugate Epipolar Line Parameters in Stereopair
In stereopair acquired by close-range photography, the plane composed of the photographic baseline and one ground point is called the epipolar plane of this point. The intersection between epipolar plane and image plane is called the epipolar line. The homologous image point is bound to be located on the corresponding epipolar line. If the corresponding epipolar line can be found out during image matching, searching range for confirming homologous point can be reduced from two dimensions to one dimension, and matching speed can also be improved correspondingly. As some homologous image points have been obtained during DPM construction, these points can also be used to compute parameters for confirming conjugate epipolar line in stereopair. The paper uses relative orientation linear transformation (RLT) algorithm , which is based on the coplanar condition equation, to work out epipolar parameters and confirm conjugate epipolar lines momentarily during image matching on right image.
Using n pairs of homologous image points and () obtained during DPM construction, the formula of computing parameters for representing conjugate epipolar line can be described as follows: where , ,
After epipolar parameters is obtained by RLT algorithm, the homologous point () on right image corresponding to the matching point on left image can be calculated by the following equation, where searching range of is from 1 to N (the size of image is M rows × N columns)
2.3. Confirmation Searching Range of Homologous Point Based on DPM and Epipolar Line
Although epipolar line transforms the searching range of homologous point from two dimensions to one dimension during image matching, the matching efficiency is still unsatisfactory when searching along the whole epipolar line. If partial range on the epipolar line can be obtained in advance, the matching speed will be improved a lot. Considering the characteristic that the baseline and correlation of two cameras in stereo photogrammetric system are fixed, the DPM constructed with sparse homologous points can be used to interpolate the initial horizontal parallax value of matching point. Then searching range of homologous point along the epipolar line can be confirmed by this initial parallax on right image.
By using the elevation interpolation method based points in DEM  for reference, the paper uses DPM to interpolate the initial horizontal parallax value of matching point on the right image. The detailed parallax interpolation method based on DPM is described as follows. Firstly, select n points in DPM, which fall into the square centered by the matching point, with the edge length L. Secondly, use the bivariate high-order polynomial defined by the following formula (5) to fit parallax surface, and use selected n points to work out coefficients of polynomial by least squares method. Lastly, input row and column coordinates of matching point to the formula (5), and work out initial horizontal parallax value of this point on right image
This paper uses bivariate quadratic polynomial to interpolate the initial horizontal parallax value of matching point. Coefficients of this polynomial can be worked out by the formula (6) with n points (xi, yi) obtained in DPM where , ,
The sketch of parallax interpolation for each pixel based on DPM is shown in Figure 3. In this figure, dots represent pixel points in the DPM, the triangle represents the matching point on left image, and the square with the edge length L is parallax interpolation area. If the order of bivariate polynomial for fitting parallax surface is m, the total number n of pixel points within square area should be (m + 1) × (m + 1) at least. This paper makes n more than during parallax interpolation. If the number of pixel points within square area is insufficient, the edge length L of this square should be increased till the number of points meets the demand for parallax interpolation (In the paper, edge length L increases dually each time).
After interpolating the initial parallax value of one pixel on left image with DPM, the column searching range of its homologous point is on right image. The row number of each candidate pixel corresponding to in this searching range can be worked out by the epipolar-line formula (4). Figure 4 shows the sketch for confirming searching range of homologous point based on DPM and epipolar line. In this figure, dots on left image represent pixel points in DPM. Triangle points and represent two matching points on the left image, with their homologous points being, respectively, and on right image. Square area with broken line on the left image is parallax interpolation region, and two short lines which are approximately perpendicular to epipolar line on right image represent the column searching range of homologous point. From Figure 4, we can see that searching ranges of matching points and along the epipolar lines on right image are worked out independently by parallax interpolation method based on DPM. In this way, matching process of the latter point will not be influenced by matching result of its former point, and matching process of all points can be performed with parallel model. However, if parallax continuity constraint is used to perform matching propagation, column searching range of is , and this range could only be confirmed after completing the matching process of point . Thus, matching results of the latter points are easy to be influenced by their former neighboring points, and matching errors will be transferred and accumulated point by point.
2.4. Calculation of Similarity Measurement Based on SIFT Feature and Color Feature
How to define and calculate similarity measurement is the primary task of image matching. So far, various kinds of similarity measurement are proposed. To synthetically use structure and color feature of image, the paper designs the similarity measurement based on SIFT and color feature. SIFT operator is an image feature description method based on scale space proposed by Lowe in 2004 . The local feature description operator of image is invariant to image rotation, zoom, and scale transformation. Mikolajczyk and Schmid  pointed out that SIFT feature has maximal tolerance to generic image distortion after detailed comparison and summary of different local feature description operator of image. At present, SIFT feature description operator has been widely used in such fields as robot localization and navigation , object recognition , image registration  and image retrieval , and so forth. SIFT feature description method mainly involves the following four steps: first, detect extremum in scale space. Second, confirm accurate position of feature points. Third, compute description information of feature point. Lastly, generate description vector of feature point. As this paper only uses SIFT feature description method to generate feature vector for computing similarity measurement, the paper does not do with such steps of SIFT feature description method as building image multiscale space, detecting extremum and confirming feature point position. Steps of computing SIFT feature vector of one pixel which locates at position in image are described as follows.
Choose the subwindow with size of , which is centered on the pixel from image f. Compute gradient magnitude and orientation of each pixel in the subwindow with formula (8)where is grey level of the pixel location at position in R, or G or B color channel.
Divide this subwindow into 16 subregions (The size of each subregion is pixels). Compute weight sums along 8 directions (0°, 45°, 90°, 135°, 180°, 225°, 275°, 315°) with the formula (9) in each subregion, and obtain an eight-dimensional feature vector for describing structure feature of each subregion. The function of weight is to increase the contribution of these pixels nearby the center point to final feature vector and decrease the contribution of these pixels far from the center point. Computation window of SIFT feature is shown in Figure 5(a). In this figure, length of arrow represents gradient magnitude, and direction of arrow represents gradient orientation. The circle represents the scope of Gauss weight where H(i, β) is the weight sum along the direction βin subregion i is column number and row number of pixel j of subregion i in image f. σ is half size of computation window.
(a) The window for computing SIFT feature
(b) The sketch of SIFT feature vector
Combine eight-dimensional feature vectors of 16 subregions together by the location of each subregion accordingly, and generate a 128-dimensional feature vector, which is SIFT feature vector for describing the structure feature of computation window. In SIFT feature vector, the first element corresponds to the weight sum along the 0° direction in the first subregion, the second element corresponds to the weight sum along the 45° direction in the first subregion, …, the ninth element corresponds to the weight sum along the 0° direction in the second subregion, and so on. Furthermore, to make SIFT feature vector robust to illumination change, each element of SIFT feature vector must be normalized by its length. The sketch of SIFT feature vector is shown in Figure 5(b), and the length of each arrow represents magnitude of vector element along certain direction in this figure.
For the target window image f of matching point, and the searching window image of searching point, the paper firstly computes SIFT feature vector S of matching point and SIFT feature vector S′ of searching point, and then computes the similarity measurement based on SIFT feature vector between matching point and searching point with
The similarity measurement based on SIFT feature is cosine of angle between two vectors in fact. The more the two vectors are similar, the larger the magnitude of will be. As SIFT feature contains the structure information of feature point’s local neighbourhood, the similarity measurement based on SIFT feature can obtain satisfactory matching result when the difference in local neighbourhood information is obvious. However, when there are many areas with similar texture, this similarity measurement will obtain a lot of wrong matching results. Therefore, the paper adds global color information of image window to improve the capability of similarity measurement based on SIFT feature. Namely, after computing similarity measurement , the paper also computes three grey-level correlation coefficients ρrg, ρgg, and ρbg between matching point and searching point in R, G, and B channel, respectively. And color similarity measurement is the average value of ρrg, ρgg, and ρbg. Then take the average value of and as the final similarity measurement ρ. The improved similarity measurement can synthetically utilize local structure information and global color information of image window to acquire higher matching accuracy. The grey-level correlation coefficient (example with ρrg in R channel) and similarity measurement can be calculated by following formulas:
2.5. Realization Process of Matching Algorithm in this Paper
To sum up, realization process of stereo image matching algorithm in this paper is given as follows:(1)For a stereopair, according to some known homologous image points acquired by manual or automatic method in advance, construct its digital parallax model, and work out parameters for confirming conjugate epipolar line.(2)Extract required matching points manually or automatically on left image. For each matching point, use digital parallax model to interpolate the initial horizontal parallax value of its homologous point on right image firstly. Then according to the epipolar-line constraint, confirm searching range of its homologous point along the conjugate epipolar line.(3)By the method of computing similarity measurement based on SIFT and color feature, calculate the similarity measurement between matching point and each searching point in searching range on right image, and choose the point with the maximal similarity measure as the final homologous point of matching point.(4)Stop the pixel level matching process after all matching points are matched.
3. Experimental Results
To verify the validity of proposed algorithm in this paper, it is realized by Visual C#.NET program language in the Windows XP environment and tested by real stereo images of natural scenes. All images, with each size of 1392 × 1040 pixels, are acquired in Nanjing by vehicle-borne mobile mapping system of Nanjing Normal University. The appearance of vehicle-borne mobile mapping system is shown in Figure 6(a), and one stereopair captured by VMMS is shown in Figure 6(b).
(a) The appearance of vehicle-borne mobile mapping system
(b) One stereopair captured by VMMS
To fully verify matching accuracy, precision, and efficiency of our matching method, the method is applied to actual stereo pair taken by VMMS in outdoor 3D calibration field of Nanjing Normal University. In the field, there are lots of control points with known 3D coordinates and high repetition degree of surrounding texture. Advantages of selecting stereo images of calibration field for matching experiments are described as follows. Firstly, matching points on left image can be exactly selected because control points have special shape feature. Secondly, 3D coordinates of matching points can be worked out by coordinate analytical model of VMMS, and these computed coordinates can be compared with their real coordinates. Thus, beyond evaluating matching index of accuracy, the precision of matching method may also be evaluated by coordinate comparison. Figure 6(b) shows one stereopair taken in outdoor 3D calibration field. Cross-hairs on left image are 106 control points with known coordinates, and they are selected as matching points manually.
For stereopair shown in Figure 6(b), the following comparison experiment is carried out. When using our similarity measurement as well, compare matching performance of different matching strategies such as our matching strategy based on DPM and epipolar-line constraint, matching strategy based on parallax continuity constraint and matching strategy based on image pyramid, which is used to evaluate matching reliability of different matching strategies. The image pyramid includes four layers images, with each layer image being generated by moving average method with nine pixels. As our matching method is mainly used in close-range photogrammetry which demands position error less than one meter, so the point with position error higher than 1 meter is considered as mismatched result in experiment. After removing mismatched points of each matching method, the position error of all right matched points can be computed by following formula, and this position error is used as matching precision of each matching method. Matching results of comparison experiment is shown in Table 1 where Δx, Δy, and Δz are, respectively, difference of computed coordinates and real coordinates on the direction of X, Y, and Z. is position error of ith point. N is the total number of right matched points.
Figure 7 shows matching results of different matching methods in comparison experiment. In this figure, Figure 7(a) shows partial image of left image with matching points. Matching results of parallax continuity constraint strategy, image pyramid strategy and our matching strategy on right image are, respectively, shown in Figures 7(b), 7(c), and 7(d).
(a) Partial left image with matching points
(b) Matching result of parallax continuity constraint strategy
(c) Matching result of image pyramid strategy
(d) Matching result of our matching strategy
Furthermore, to show matching results of different matching methods in comparison experiment more intuitively, the paper presents position error distribution maps of all matching points by different methods in Figure 8. From position error distribution maps, mismatched or right matched points, can be easily found out. In Figure 8, position error distribution maps of matching methods based on parallax continuity constraint strategy, image pyramid strategy and our matching strategy are, respectively, shown in Figures 8(a), 8(b), and 8(c).
(a) Position error distribution map of matching method based on parallax continuity constraint strategy
(b) Position error distribution map of matching method based on image pyramid strategy
(c) Position error distribution map of matching method based on our matching strategy
Moreover, if matching propagation is done along the whole epipolar line rather than constrained by DPM, the matching accuracy is about 91.5%, but total matching time is about 26 seconds. Therefore, the following conclusions can be obtained from results of comparison experiment. The proposed DPM can well predict initial horizontal parallax value of matching point on right image, which can further decrease searching range of homologous point on the epipolar line. Thus, the matching speed of method based on DPM is doubled than that of the method based on whole epipolar-line searching. Although the matching method based on parallax continuity constraint has the fastest running speed, it will transfer and accumulate matching errors when some points are mismatched, which leads to the low matching accuracy. Under the condition of pixel level matching, our matching method based on DPM not only has the higher matching speed, but also has the highest matching accuracy and precision. This shows that our similarity measurement can reflect image feature more efficiently by synthetical utilization of structure and color feature of matching window.
Proposed matching method is also applied in our spatial information acquisition system based on VMMS. This system has been used to acquire spatial data (like 3D coordinate, length, and area) of municipal components in Taicang city, Jiangsu Province, China. Figure 9 shows some application examples. Figures 9(a), 9(b), and 9(c), respectively, shows survey results of coordinate, length, and area of ground objects (lines on left and right image are epipolar lines). From these survey results, we can intuitively see that matching results in stereopairs are right, which can also verify the validity of proposed method.
(a) Survey result of coordinate
(b) Survey result of length
(c) Survey result of area
Aiming at the characteristic of stereo photogrammetric system with fixed baseline, the paper proposes the conception of digital parallax model (DPM) by using the meanings and expression method of DEM for reference and constructs the DPM of stereopair through sparse matched points. Furthermore, the parallax prediction method based on DPM is put forward. Combined with the epipolar-line constraint, this method can well decrease the searching range of homologous point on right image, which can greatly improve matching efficiency and accuracy. Our matching method has been applied in the vehicle-borne 3D data acquisition and processing system developed by Nanjing Normal University, and the system has also been well applied in data collection projects such as municipal components surveying. Main features of proposed matching method can be summarized as follows: for each matching point, its searching range of homologous point can be “worked” out by DPM-based method independently. Matching process of all points can be completed synchronously, and the problem of the influence of former points on latter points does not emerge anymore. Therefore, the proposed method makes it easy to realize parallel process of image matching and overcomes phenomenon of error transfer and accumulation in matching method based on parallax continuity constraint. The similarity measurement based on SIFT and color feature can better utilize structure and color information of local matching window and significantly improve matching reliability. In our algorithm, epipolar image needs not to be generated, and epipolar parameters are only used to confirm location of searching point in real time during matching process. In this way, complex computation of image resampling can be avoided. Process of our algorithm is parallel in essence and is convenient for real-time processing and realizing on simple hardware.
However, matching speed and accuracy of our method still needs to be improved. In future work, the issue of utilizing other information (such as texture, topological relationship, etc.) to improve matching reliability of similarity measurement, more effective utilization of DPM to predict initial searching range of homologous point should be further studied.
This research is supported by National Natural Science Foundation of China under the grant number 40901200, Momentous Science Foundation of Jiangsu province under the grant number 07KJA42005, Surveying and Mapping Research Foundation of Jiangsu province under the grant number of JSCHKY201011, and the Priority Academic Program Development of Jiangsu Higher Education Institutions.
B. Grinstead, A. Koschan, D. Page, A. Gribok, and M. A. Abidi, “Vehicle-borne scanning for detailed 3D terrain model generation,” in Proceedings of the SAE Commercial Vehicle Engineering Congress and Exhibition, pp. 210–218, Chicago, Ill, USA, November 2005.View at: Google Scholar
K. Ishikawa, J. Takiguchi, Y. Amano, and T. Hashizume, “A mobile mapping system for road data capture based on 3D road model,” in Proceedings of the IEEE International Conference on Control Applications, pp. 638–643, Munich, Germany, 2006.View at: Google Scholar
X. X. Geng and S. D. Zhong, “A mobile system using lidar and photogrammetry for urban spatial objects extraction,” in Proceedings of the International Conference on Information Engineering and Computer Science (ICIECS '09), pp. 1–4, Wuhan, China, December 2009.View at: Publisher Site | Google Scholar
P. Zhao and N.-H. Wang, “Precise perimeter measurement for 3D object with a binocular stereo vision measurement system,” Optik—International Journal for Light and Electron Optics, vol. 121, no. 10, pp. 953–957, 2010.View at: Google Scholar
Y. Dufournaud, C. Schmid, and R. Horaud, “Image Matching with Scale Adjustment[J],” Computer Vision and Image Understanding, vol. 93, no. 2, pp. 175–194, 2004.View at: Google Scholar
L. Maxime and Q. Long, “Roubst dense matching using local and global geometric constraints,” in Proceedings of the 16th International Conference on Pattern Recognition, vol. 1, pp. 968–972, Barcelona, Spain, 2000.View at: Google Scholar
K. Zhang, Y. H. Sheng, Y. Q. Li, B. Han, C. Liang, and W. Sha, “Image matching for digital close-range stereo photogrammetry based on constraints of Delaunay triangulated network and epipolar-line,” in Geoinformatics 2006: Remotely Sensed Data and Information, vol. 6419 of Proceedings of SPIE, pp. 64191W_1–64191W_15, Wuhan, China, October 2006.View at: Publisher Site | Google Scholar
Q. Zhu, J. Zhao, H. Lin, and J. Gong, “Triangulation of well-defined points as a constraint for reliable image matching,” Photogrammetric Engineering & Remote Sensing, vol. 71, no. 9, pp. 1063–1069, 2005.View at: Google Scholar
Z. L. Li and Q. Zhu, Digital Elevation Model, Wuhan University Press, Wuhan, China, 2003.
Z . X. Zhang and J. Q. Zhang, Digital Photogrammetry, Wuhan University Press, Wuhan, China, 1997.
S. Helmer and D. G. Lowe, “Object class recognition with many local features,” in Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition, pp. 187–194, Washington, DC, USA, 2004.View at: Google Scholar
C. L. Lin and Y. M. Zhao, “Trademark retrieval algorithm based on SIFT feature,” Computer Engineering, vol. 34, no. 23, pp. 275–277, 2008.View at: Google Scholar