Abstract

Enlarged images can be obtained by various methods, and stitching is one of the most efficient. It produces panoramic images by joining adjacent images that contain overlapping regions, even when they are obtained through separate image sensors. Images that contain multiple different planes are hard to stitch together because each plane requires a different homography matrix for perspective warping. To address this, the dual homography approach was proposed; however, its performance varies depending on the feature detector used to find matching feature points between images. In this paper, we propose three feature coverage indexes which evaluate the stitching performance of feature detectors and predict the outcome of the stitching. We evaluate four well-known feature detectors by applying them to the image stitching process and show that the predictions made from the index values coincide with the stitching results.

1. Introduction

Enlarged images can be obtained by various methods. Stitching is one of the most efficient and has long drawn the attention of researchers in the graphics and computer vision fields. Its primary goal is to integrate multiple images into a single panorama [1].

Stitching depends on a perspective transformation which warps pixels from one coordinate frame to another. Stitching algorithms have traditionally parameterized this warping with a transformation matrix, such as an affine or homography matrix.
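To make the parameterization concrete, the sketch below applies a 3x3 homography to a single pixel in homogeneous coordinates (the matrix entries are arbitrary placeholders, not values from any experiment in this paper):

```python
import numpy as np

# A 3x3 homography with arbitrary placeholder entries, used only for illustration.
H = np.array([[1.02, 0.01, 15.0],
              [0.00, 0.98, -7.5],
              [1e-5, 2e-5,  1.0]])

def warp_point(H, x, y):
    """Warp pixel (x, y) into the target coordinate frame with homography H."""
    p = H @ np.array([x, y, 1.0])      # homogeneous coordinates
    return p[0] / p[2], p[1] / p[2]    # perspective division

print(warp_point(H, 100, 200))
```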

This matrix-based parameterization of the warping provides robustness at the cost of flexibility and is accurate only as long as a set of restrictive conditions is met [2]. For example, the homographic transformation is applicable only to planar scenes or to parallax-free camera motion between adjacent images. The photographer is therefore not allowed to change location and may only rotate the camera.

As the transformation must keep visually accurate alignment of large image regions, it must tolerate significant viewpoint shifts. Moreover, since outdoor environments are beyond control, the transformation must also be robust to illumination changes and to the motion of objects.

The distribution of detected features across an image is known to affect the accuracy of the homography calculated from them [3]. It is desirable that the features be evenly distributed across the image, because many vision algorithms are robust only when this condition is met.

When images contain more than one plane, it is hard to stitch them into a single panoramic image. For example, an image containing both a distant plane and a ground plane stretching out from the camera's viewpoint is one such difficult case. Since the two planes have different homography matrices, it is hard to build a single universal matrix that applies to the whole image.

For dual-plane image stitching, existing approaches estimate a single planar perspective transform to align two adjacent images. However, a single homography cannot warp such images correctly, so postprocessing is required to remove misalignments.

The work in [4] proposes a method to handle dual-plane panoramic scenes by estimating two perspective transforms per image pair, which improves the alignment before postprocessing. It estimates two homographies from the matched points and applies different weights to each homography depending on the distances to the corresponding pixels.

The dual homography approach divides the matched feature points into two groups, and a perspective matrix is estimated per group. Each pixel is then warped to its new position by a weighted sum of the homography matrices:

$$H(x, y) = w(x, y)\,H_g + \bigl(1 - w(x, y)\bigr)\,H_d,$$

where $H_g$ and $H_d$ are the ground-plane and distant-plane homographies, respectively, and $w(x, y)$ is the weight of the pixel at $(x, y)$, representing which plane is closer to the pixel.
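As a rough illustration of this per-pixel blending, the sketch below warps a pixel with a weighted sum of two homographies; the inverse-distance weighting toward the cluster centroids is an assumption made for illustration and may differ from the exact weight function used in [4]:

```python
import numpy as np

def dual_homography_warp(x, y, H_g, H_d, c_g, c_d):
    """Warp pixel (x, y) with a per-pixel blend of the ground-plane (H_g) and
    distant-plane (H_d) homographies. The weight is a simple inverse-distance
    scheme toward the cluster centroids c_g and c_d (an illustrative choice)."""
    d_g = np.hypot(x - c_g[0], y - c_g[1]) + 1e-9
    d_d = np.hypot(x - c_d[0], y - c_d[1]) + 1e-9
    w = (1.0 / d_g) / (1.0 / d_g + 1.0 / d_d)  # w -> 1 near the ground-plane cluster
    H = w * H_g + (1.0 - w) * H_d              # blended per-pixel homography
    p = H @ np.array([x, y, 1.0])
    return p[0] / p[2], p[1] / p[2]
```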

The work in [5] addresses a problem of the dual homography, the curving effect: straight lines in the original images become bent after stitching. It lessens this side effect by adding a third homography to the weighted sum, weighted by the ratio $x/W$, where $W$ is the width of the image and $x$ is the horizontal coordinate of the pixel, resulting in a triple homography.

Both the dual homography and its enhanced version need two sets of feature points to estimate the homographies. They are therefore dependent on the feature detection algorithm.

In this paper, we propose three feature coverage indexes which evaluate the stitching performance of feature detectors and predict the outcome of the stitching. There have been earlier attempts to provide indexes that evaluate the quality of image processing results. In [6], the convex hull was employed to indicate the spatial coverage of feature points. In [7], the spatial relationship between feature points was measured by a dense sampling scheme. We compare four feature detection algorithms for the dual homography, using SIFT [8], SURF [9], ORB [10], and BRISK [11] to detect the features from which the homographies are estimated.
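For reference, all four detectors are available in OpenCV; a minimal sketch of instantiating them is shown below (SIFT is exposed as cv2.SIFT_create() in recent OpenCV releases, SURF requires an opencv-contrib build with the nonfree modules enabled, and the image file name is a placeholder):

```python
import cv2

detectors = {
    "SIFT":  cv2.SIFT_create(),
    "ORB":   cv2.ORB_create(),
    "BRISK": cv2.BRISK_create(),
}
try:
    detectors["SURF"] = cv2.xfeatures2d.SURF_create()
except AttributeError:
    print("SURF needs an opencv-contrib build with nonfree modules enabled")

img = cv2.imread("left.jpg", cv2.IMREAD_GRAYSCALE)  # placeholder file name
for name, det in detectors.items():
    keypoints, descriptors = det.detectAndCompute(img, None)
    print(name, len(keypoints), "keypoints")
```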

The rest of the paper is organized as follows. Section 2 describes the details of the dual homography procedure and introduces the feature coverage indexes. In Section 3, we experiment with three sets of images for stitching and evaluate the detectors by the proposed indexes. Section 4 concludes the paper.

2. Dual Homography

Figure 1 shows the flowchart of stitching two images with the dual homography. First, feature points are extracted from the images. The features are then matched against one another, resulting in a set of matched pairs. The pairs are clustered according to the plane they belong to: the ground plane or the distant plane. Two homographies are then estimated, one from each group of feature pairs. The stitching itself is a backward projection: the weighted sum of the inverse homographies is used to locate the source pixel that fills each position in the resulting stitched image.
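The sketch below follows this flowchart under simplifying assumptions: the matches are split into two groups by k-means on their coordinates (standing in for the ground/distant clustering), brute-force matching with an L2 norm is used (binary descriptors such as ORB or BRISK would use the Hamming norm), and one homography is estimated per group with RANSAC:

```python
import cv2
import numpy as np

def estimate_dual_homographies(img1, img2, detector):
    """Detect, match, cluster the matches into two spatial groups, and
    estimate one homography per group (a sketch of the pipeline above)."""
    kps1, des1 = detector.detectAndCompute(img1, None)
    kps2, des2 = detector.detectAndCompute(img2, None)

    # L2 norm suits float descriptors (SIFT/SURF); use NORM_HAMMING for ORB/BRISK.
    matcher = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True)
    matches = matcher.match(des1, des2)

    pts1 = np.float32([kps1[m.queryIdx].pt for m in matches])
    pts2 = np.float32([kps2[m.trainIdx].pt for m in matches])

    # Split the matched points into two clusters by their location in img1,
    # standing in for the ground-plane / distant-plane separation.
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 50, 1.0)
    _, labels, _ = cv2.kmeans(pts1, 2, None, criteria, 10, cv2.KMEANS_PP_CENTERS)
    labels = labels.ravel()

    homographies = []
    for k in (0, 1):
        # Each cluster needs at least four matches for findHomography to succeed.
        H, _ = cv2.findHomography(pts1[labels == k], pts2[labels == k],
                                  cv2.RANSAC, 5.0)
        homographies.append(H)
    return homographies
```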

Feature detectors influence the success of the image stitching because the following all depend on the detector: the number of detected feature points, the number of matched pairs, the number of feature points per cluster, the locations of the cluster centroids, and the homography estimated for each cluster.

We propose three indexes to measure and evaluate the efficiency of feature detectors for homography estimation. They can be used as follows: if a feature detector scores highly on all three indexes, we can expect the dual homography estimated from its features to produce a seamless stitched image with high probability.

The first index measures how many detected features remain after the matching step:

$$I_1 = \frac{N_m}{N_d},$$

where $N_d$ is the number of detected feature points and $N_m$ is the number of matched feature points. It indirectly represents the efficiency of a feature detector: during feature detection, the detector consumes computing resources such as CPU time and memory to describe and store feature points, so the more of those points end up matched, the fewer resources are wasted.

The second index is the ratio between the numbers of feature points belonging to the two clusters:

$$I_2 = \frac{N_D}{N_G},$$

where $N_D$ is the number of matched feature points belonging to the distant plane and $N_G$ is the number of matched feature points belonging to the ground plane. If the two clusters contain the same or similar numbers of feature points, the probability of successfully estimating both homography matrices is higher. Otherwise, the cluster with fewer feature points is more likely to yield a failed homography estimate, resulting in the failure of the dual homography warping.

The third index is the variance of the distances from the feature points of a cluster to its centroid:

$$I_3 = \frac{1}{n}\sum_{i=1}^{n}\bigl(d_i - \bar{d}\bigr)^2,$$

where $\bar{d}$ is the average distance from the cluster points to the centroid, $d_i$ is the distance from the $i$th cluster point to the centroid, and $n$ is the number of feature points in the cluster. If the feature points are distributed evenly over their plane, the estimated homography becomes more robust.
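A minimal sketch of computing the three indexes from the quantities defined above is given below; since a single $I_3$ value is reported per detector without stating how the two clusters are combined, the sketch simply returns one variance per cluster (an assumption):

```python
import numpy as np

def coverage_indexes(n_detected, n_matched, distant_pts, ground_pts):
    """Compute I1, I2, and I3 from the detected/matched counts and the
    matched feature coordinates of the distant and ground plane clusters."""
    i1 = n_matched / n_detected                  # I1 = N_m / N_d

    i2 = len(distant_pts) / len(ground_pts)      # I2 = N_D / N_G

    def distance_variance(points):
        points = np.asarray(points, dtype=float)
        centroid = points.mean(axis=0)
        d = np.linalg.norm(points - centroid, axis=1)  # distances to centroid
        return np.mean((d - d.mean()) ** 2)            # variance of the distances

    # I3 per cluster (how the two clusters are combined is not specified here).
    i3 = (distance_variance(distant_pts), distance_variance(ground_pts))
    return i1, i2, i3
```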

In summary, these three indexes indirectly capture the three conditions required to estimate the homographies successfully: a sufficiently large number of detected points, distributed extensively and evenly over the different planes.

3. Performance Evaluation

We evaluate feature detectors by comparing three proposed indexes and the quality of generated panorama images. We consider four detectors: SIFT, SURF, ORB, and BRISK.

For the evaluation, we use three sets of images. For the first set, we divide a single image into three pieces, each of which shares overlapping regions with the adjacent subimages. Each subimage contains two planes, ground and distant, as shown in Figures 2(a)-2(c).

With this set, the stitched results can be easily evaluated by comparing them with the undivided original image.
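A possible way to build this set is sketched below: the original image is cut into three vertical pieces whose boundaries are extended so that neighbors overlap (the overlap width and file names are arbitrary illustrative choices):

```python
import cv2

img = cv2.imread("original.jpg")          # placeholder file name
h, w = img.shape[:2]
piece_w, overlap = w // 3, w // 12        # illustrative overlap width

left   = img[:, : piece_w + overlap]
center = img[:, piece_w - overlap : 2 * piece_w + overlap]
right  = img[:, 2 * piece_w - overlap :]

for name, piece in (("left", left), ("center", center), ("right", right)):
    cv2.imwrite(f"set1_{name}.jpg", piece)
```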

For the second set, each image of the first set is transformed by a random perspective transformation matrix, as shown in Figures 2(d)-2(e). Note that the center and right images are warped while the left one is left unchanged. This set evaluates whether the dual homography performs the perspective transformation correctly.
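One way to generate such a transformation is to jitter the four image corners randomly and warp the image with the resulting perspective matrix; the small jitter magnitude in the sketch below is an arbitrary choice rather than the one used for Figure 2:

```python
import cv2
import numpy as np

def random_perspective(img, jitter=0.05, seed=None):
    """Warp an image with a random perspective transform obtained by
    jittering its four corners (jitter is a fraction of the image size)."""
    rng = np.random.default_rng(seed)
    h, w = img.shape[:2]
    src = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
    offsets = rng.uniform(-jitter, jitter, size=(4, 2)) * [w, h]
    dst = (src + offsets).astype(np.float32)
    M = cv2.getPerspectiveTransform(src, dst)   # 3x3 perspective matrix
    return cv2.warpPerspective(img, M, (w, h))
```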

The third set contains three separately taken images, each of which shares overlapping regions with its neighbors, as shown in Figures 5(a)-5(c). Each image contains two planes, ground and distant. Stitching the images of this set results in a panoramic image.

Figure 3 shows the stitching results of the first image set for the four feature detectors. All of them stitched the subimages successfully back into the original image. Since the subimages are not modified by any perspective transformation, the stitching amounts to a simple translation.

Figure 4 shows the stitching results of the second image set. Three of the feature detectors were able to stitch the images, while BRISK failed to do so. The results of SURF and ORB are not satisfactory; only SIFT managed to produce a result similar to the original image.

Figure 6 shows the stitching results of the third image set. All four feature detectors were able to stitch the images into larger panoramas. However, there are differences in seamlessness among the results. SURF and SIFT produce results of higher quality than ORB and BRISK. In particular, the contour lines of the buildings in the BRISK result are indistinguishable because of the warping.

We now evaluate the four feature detectors by comparing the proposed index values. Figure 7 shows the values of the three indexes $I_1$, $I_2$, and $I_3$ when the first set of images is processed by the four detectors. For $I_1$, the higher the value, the more efficient the detector. Since the first image set is produced by vertically dividing a larger image, finding matching feature points among the pieces is not difficult, and the four detectors show similar performance. For $I_2$, the closer the value is to 1, the better the chance of obtaining the dual homography. The $I_2$ values of all four detectors are around 1, implying that the stitching by the dual homography proceeds smoothly. For $I_3$, the higher the value, the more accurate the estimated homography, because a high value means the feature points are evenly distributed over the planes. All of the $I_3$ values are over 200, implying that all four detectors are able to stitch the images successfully. Since all four detectors have similar index values, we can predict that they produce similar stitching results, which is confirmed by the panoramas of Figure 3.

Figure 8 shows that the index values for the second image set vary depending on the detector. This is because the images of the second set are distorted after dividing the larger image, so finding matching feature points is challenging for some of the detectors. Note that only SIFT has values that fall into the proper ranges; in particular, its $I_2$ is very close to 1 and its $I_3$ is as high as 600. BRISK has no results because it failed to stitch the images. SURF and ORB have $I_2$ values away from 1 and $I_3$ values around 200, implying that their feature points are not suitable for estimating a correct homography. From these observations, it is predicted that only SIFT can stitch the images correctly while the others cannot, which agrees with the results of Figure 4.

Figure 9 shows the index values for the third image set. All of the feature detectors except BRISK have similar values for all the indexes. Note that the $I_3$ value of BRISK is less than 200, implying that its estimated homography has errors. From these observations, it is predicted that the stitching results of SURF, SIFT, and ORB are similar while BRISK produces an incorrect result, which is confirmed by the results of Figure 6.

In summary, it is possible to predict the stitching results merely by observing the three indexes, which capture the capability of the feature detectors to contribute to the estimation of the dual homography. In particular, we observe that $I_2$ and $I_3$ are closely related to the correct estimation of the homography.

4. Conclusions

We proposed three feature coverage indexes which evaluate the stitching performance of feature detectors and predict the outcome of the stitching. In particular, these indexes are designed to evaluate the stitching process based on the dual homography, which targets images containing multiple different planes. We evaluated four well-known feature detectors with the proposed indexes by applying them to the image stitching process and showed that the predictions made from the index values coincide with the stitching results.

We note that the proposed indexes need improvement in the following areas in future work. Firstly, the indexes are applicable only when the stitched images contain two distinct planes, particularly because $I_2$ takes the number of feature points in each of the two plane clusters as its parameters. Secondly, the indexes are not sufficient to evaluate the completeness of panoramic results, because they are mostly related to registration while the blending step is not covered. Future work will extend the indexes to evaluate the quality of the stitched boundary.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgment

This work was supported by the University of Incheon International Cooperative Research Grant in 2012.