An Improved Building Reconstruction Algorithm Based on Manhattan World Assumption and Line-Restricted Hypothetical Plane Fitting
An improved patch-based multiview stereo (PMVS) algorithm based on Manhattan world assumption and the line-restricted hypothetical plane fitting method according to buildings’ spatial characteristics is proposed. Different from the original PMVS algorithm, our approach generates seed points purely from 3D line segments instead of using those feature points. First, 3D line segments are extracted using the existing Line3D++ algorithm, and the 3D line segment clustering criterion of buildings is established based on Manhattan world assumption. Next, by using the normal direction obtained using the result of 3D line segment clustering, we propose a multihypothetical plane fitting algorithm based on the mean shift method. Then, through subdividing on the triangle mesh constructed based on the building hypothetical plane model, semidense point cloud can be quickly obtained, and it is used as seed points of the PMVS pipeline instead of the sparse and noisy seed points generated by PMVS itself. After that, dense point cloud can be obtained through the existing PMVS expansion pipeline. Finally, unit and integration experiments are designed; the test results show that the proposed algorithm is 15%∼23% faster than the original PMWS in running time, and at the same time, the reconstruction quality of buildings is improved as well by successfully removing many noise points in the buildings.
Accurate spatiotemporal information is the fundamental guarantee of the implementation of smart cities. The digital building model is one of the core elements of spatiotemporal information, and its automatic 3D reconstruction technology has received significant attentions in the field of geomatics and computer vision [1, 2]. Although several kinds of data sources could be used to reconstruct buildings such as images and LiDAR, image-based building reconstruction has become a research hotspot for it is able to provide very fine building models with relatively low cost [2–6].
Among those existing image-based building reconstruction algorithms, structure from motion (SfM) and multiview stereo (MVS) are able to reconstruct 3D point clouds from multiple building images. As the most frequently used MVS algorithm, patch-based multiview stereo (PMVS) proposed by Furukawa is able to reconstruct the dense point cloud building model with very good performance [3, 7, 8].
The advantages of PMVS algorithms include the following: (1) no need of bounding volume prior such as convex hull and bounding box; (2) being able to reconstruct both single object and crowded scenes; and (3) having high accuracy and efficiency . However, as many researchers mentioned [4, 5, 9, 10], the main disadvantages of PMVS are the following: (1) the obtained normal information of the reconstructed points is not well consistent with its local geometry, and the problem becomes more obvious when the image capturing configuration is downward shooting or upward shooting; (2) the algorithm itself is with high time and memory complexities, which makes its time and memory consumption unacceptable when processing large-scale high-resolution images; and (3) it sometimes fails to reconstruct the complete point cloud of objects with poor-textured regions and non-Lambertian planes.
Recently, some studies were conducted to improve the building reconstruction technology. Shi et al.  proposed an improved strategy based on a patch adjusting trick through using the scene geometric information and a multiresolution expanding approach, which improves the normal vector precision and surface smoothness. Ying et al.  defined matching strategy constraints in the PMVS feature matching process; as a result, the precision and integrity of the reconstructed point cloud are improved. Wu et al.  used the tensor to improve the PMVS algorithm, through fully using the surface geometry information, and the normal vector of PMVS reconstruction point cloud is effectively improved. Fei et al.  presented a novel object-based MVS algorithm using structural similarity (SSIM) index matching cost in a coarse to fine workflow, and the efficiency can be greatly improved when the overlap between images is large. Aiming at large UAV image processing in narrow baseline cases, Xiao et al.  proposed an image grouping and self-adaptive patch-based multiview stereo matching algorithm aided by DEM information, making the processing efficiency significantly better than that of the PMVS algorithm. For generating accurate large-scale 3D models of buildings employing point clouds derived from UAV-based photogrammetry, a new two-level segmentation approach based on efficient RANSAC shape detection was developed to segment potential facades and roofs of the buildings and extract their footprints . Li et al.  used the Canny boundary detection operator to improve the PMVS algorithm and obtained a better reconstruction result. Considering the difficulties in the reconstruction of textureless surface under Manhattan world assumption, Furukawa et al.  proposed a method using the point cloud reconstructed by PMVS to extract the hypothetical plane and using the MRF model to reconstruct the depth map of each photo, and this method provides a better performance in poor-texture condition. Zhu and Leow  presented an alternative approach for reconstructing textured mesh surfaces from point cloud recovered by using the patch-based MVS method, and their test results showed that the reconstructed 3D models are sufficiently accurate and realistic for 3D visualization in various applications. Li et al.  defined a texture probabilistic grammar and developed an algorithm to obtain the 3D information in a 2D scene by training the texture probabilistic grammar from the prebuilt model library, and the algorithm was reported to be superior to the traditional reconstruction method based on point clouds.
The above approaches are mostly improved through enhancing the precision and the normal vectors of point clouds, while few studies have been performed on the improvement of efficiency, and the result in the research by Xiao et al.  could accelerate the process time, however, it depends on DEM data, which could not be directly used in building reconstruction. To adapt to the massive reconstruction data and rapid data updating of smart cities, the reconstruction efficiency of the PMVS algorithm remains to be improved .
Basically, the PMVS algorithm uses weak constraints to perform image feature matching and reconstructs the sparse point cloud , which causes local optimal results and the reconstructed sparse point cloud to be contaminated by a lot of noise points. When such noise-rich sparse points are used as the seed points for the subsequent patch expansion and optimization process, meaningless expansions could be executed, and the algorithm efficiency will be downgraded since the patch expansion and optimization process is the main time-consuming process in the PMVS algorithm. Therefore, theoretically, the PMVS algorithm could be improved through enhancing the accuracy and density of the seed points.
In this study, an improved PMVS algorithm effectively using the spatial characteristics of buildings is proposed, which generates seed points for the PMVS pipeline directly using 3D line segment information. In Section 3, a 3D line segment clustering algorithm based on the Manhattan world assumption is presented, which is able to determine the directions of three coordinate axes through clustering line segments into two groups, axis directional and nonaxis directional, quickly and accurately. Then, the hypothetical plane fitting algorithm based on 3D line sets is presented to obtain the building hypothetical plane model. After that, in Section 4, an initial triangle mesh is constructed using the point cloud sampled from 3D line segments for each hypothetical plane, and semidense seed points can be obtained through triangle mesh subdivision with a penalty mechanism; consequently, a dense point cloud model of the building is obtained through existing PMVS expansion and filtering pipeline. Finally, in Section 5, tests are designed to verify the performance of our algorithm, and the experiments show that the proposed algorithm is able to accelerate the building reconstruction speed of the PMVS, and our algorithm successfully removed many noise points that do not belong to buildings.
2. The Overall Working Flow of the Improved PMVS Algorithm
The overall working flow of the improved PMVS algorithm proposed in this study is described in Figure 1.
In this method, the 3D line segments of the building are first identified and extracted using Line3D++ algorithms ; after that, the line segments are clustered using its direction information, and then, the hypothetical plane model of the building is fitted. After obtaining the hypothetical planes, the triangle meshes of the building are generated, and an accurate semidense point cloud can be quickly obtained by subdividing the triangle meshes with a penalty mechanism. Finally, the semidense point cloud is processed as the seed points of PMVS, and the dense point cloud model of the building could be constructed through the existing PMVS expansion and filtering pipeline.
3. Hypothetical Plane Fitting Algorithm Based on the 3D Line Segment Model of Buildings
The hypothetical planes are the potential planar faces in building. Regarding the data source, the existing building hypothetical plane fitting algorithm can be divided into two categories: (1) based on dense points , and the disadvantage is that the acquisition of dense points is a time-consuming process and (2) based on sparse points and sparse 3D line segments , and the corresponding issue of such algorithms is that the sparse point cloud reconstructed from SfM usually contains much noise, which may reduce the precision of the fitted hypothetical plane. In addition, SfM depends on feature point detection, and if there is textureless region or region with non-Lambertian reflectance in the building surfaces, it will be very difficult to obtain reliable sparse or dense points.
Real-world buildings generally consist of several mutually perpendicular or parallel faces being rectangles or rectangle-like styles. Considering the above spatial geometric features, this paper proposes a hypothetical plane fitting algorithm based on the 3D line segment model of buildings. First, the Line3D++ algorithm by Hofer et al. is applied to reconstruct the 3D line segment model ; then, a 3D line segment clustering criterion under the Manhattan world assumption is proposed, and after that, the spatial multihypothesis planes are fitted using the mean shift method. The algorithm has higher reliability and efficiency in the complex reconstruction environment.
3.1. 3D Line Segment Clustering Algorithm Based on Manhattan World Assumption
Manhattan world assumption holds that the normal vectors of the planes of a building are usually parallel to one coordinate axis of the same Cartesian coordinate system , and it is widely used in building reconstruction and its postprocess [19–22]. A Manhattan world coordinate system is shown in Figure 2.
Different from those previous research studies, we are using Manhattan world assumption to obtained seed points for the follow-up PMVS expansion and filtering pipeline. We present a 3D line segment clustering algorithm to quickly and accurately extract those axis-aligned and nonaxis-aligned line segment clusters, together with the three axis directions of the Cartesian coordinate system as well. As shown in Figure 3, the main steps of the algorithm are as follows: Step 1. Perform preliminary clustering of the 3D line segments of the building using line direction information (refer to Algorithm 1 in the Supplementary Section), and then, the initial cluster set of the 3D line segments are obtained. Step 2. According to the perpendicular relationship among different line segment clusters, all cluster trios, which are perpendicular with each other, are extracted from the initial 3D line segment clusters, and a cost function is defined to find the optimal axis-aligned cluster trio, which has the greatest possibility of being parallel with the Cartesian coordinate axes (refer to Algorithm 2 in the Supplementary Section). Step 3. Extract the nonaxis-aligned clusters using the information of the optimal axis-aligned cluster trio. Step 4. Reclaim those missed line segments during the above clustering process by performing the clustering algorithm using the included angle information between the missed line segment and the cluster direction. Step 5. Filter all clustered line segments and line segment clusters according to the length of line segments and the size of line segment clusters. As a result, those line segments with their length being smaller than certain threshold will be treated as noise and be removed.
The detailed description of Step 1 is as follows. In the process of preliminary clustering of 3D line segments, we define as the unclassified line segment set and as the classified line segment cluster set. First, put all the line segments into in descending order according to their length. For each element in , check whether it belongs to any existing cluster according to its included angle and length, and if it does not belong to any existing cluster, generate a new cluster if its length is greater than the threshold value, or otherwise, put it back to waiting for the next loop operation. The threshold value is the average length of all the line segments in the 3D line segment model, and constrained by that threshold value, the length of the first line segment of the newly added line segment clustering can be avoided to be too short, which ensures the accuracy of the newly added clustering, and is conducive to improving the robustness of the clustering algorithm. Repeat the above operations until U is empty or the number of cycles reaches the upper limit to obtain the initial 3D line segment cluster set .
The detailed description of Step 2 is as follows. In the extraction process of the three axis-aligned line segment clusters, first, the main direction of each initial 3D line segment cluster is calculated; then, the cluster pair whose main directions are perpendicular to each other are identified, and for each perpendicular cluster pair, find the cluster in the remaining cluster set that forms a Cartesian coordinate system; as a result, a candidate axis cluster trio is formed. For each group of candidate axis cluster trio, the cost function shown in (1) is established to evaluate its credibility, and finally, the optimal axis-aligned cluster trio , which is with the greatest value, is obtained from all the candidate axis cluster trios:where represents a line segment in set , represents the direction of , represents the length of , represents the total number of line segments in set , and , , and , respectively, represent the main directions of the three axis-aligned line segment clusters in the kth candidate axis cluster trio. The cost function actually tells the number of line segments being parallel to the three axes of the Cartesian coordinate system of the current 3D line segment cluster trio.
The extraction process of nonaxis-aligned line segment cluster follows the determination process of the optimal axis cluster trio . For each line segment cluster, calculate the included angles between its main direction and the directions of three axis-aligned clustering in ; if all three angles are greater than a threshold value (the angle is negatively correlated with the direction consistency of line segments in the axis-aligned line segment cluster, and the recommended threshold value is 20°, which is obtained by experiment), the line segment cluster is determined to be a nonaxis-aligned one and is added to the nonaxis-aligned line segment cluster set . After second clustering and filtering procedure, the final line segment cluster set B can be obtained.
3.2. A Multihypothetical Plane Fitting Algorithm Based on the Mean Shift Method
As aforementioned, real-world buildings generally consist of several mutually perpendicular or parallel planes; if we successfully get all those planes’ model, the building could be represented using a B-rep model. To fit planes on a building, we use a spatial multihypothetical plane fitting algorithm based on the mean shift algorithm . First, we generate the sparse point cloud through equidistance point sampling from the 3D line segments, then determine the normal vector of the hypothetical plane according to the main direction of the line segment cluster, and after that, we use the mean shift algorithm to cluster the point cloud and get the spatial multiplane point cloud clusters along the direction of each normal vector. The spatial point cloud cluster of each plane represents the hypothetical plane that needs to be fitted.
3.2.1. Fitting of the Axis-Aligned Plane
The axis-aligned direction is defined to be parallel to one of the three coordinate axes in the Manhattan world Cartesian coordinate system. Based on the Manhattan world assumption, the normal vectors of most building planes are parallel to one of the three coordinate axes; so, the most hypothetical planes of buildings can be extracted by performing spatial point cloud clustering along the three axis-aligned directions. The specific algorithm flow is shown in Figure 4, and the steps are as follows: Step 1. Generate point cloud: generate sparse point cloud through equidistance sampling on each line segment in axis-aligned clusters; equidistance sampling means dividing each line segment in axis-aligned clusters into several parts at equal distance d to get all end points of the subline segment, and d is related to the expected size of point cloud for plane fitting. Step 2. Transform coordinates: after obtaining the sparse point cloud , for the 3D coordinate of each 3D point, use the formula to convert the coordinates of the point from 3D coordinate to 1D, where indicates the axis direction of the plane to be fitted to. Step 3. Fit planes based on the mean shift method: the mean shift clustering method is performed on the 1D point cloud obtained by coordinate transformation of , and multiple initial point cloud clusters are gotten. Each initial point cloud cluster represents an initial hypothetical plane. Step 4. Filter hypothetical planes: a cost function is defined in (2) to evaluate the credibility of the initial hypothetical plane , and those low-credibility ones are filtered out: where is the number of 3D line segments contained in , represents the length of the ath 3D line segment in , and represents the length of the shortest line segment in . The cost function considers both the number and the length of the line segment in the initial hypothetical plane , and more 3D line segments and longer segments usually mean the higher probability that the initial hypothetical plane represents the real plane. Generally, would be adopted if the following condition is satisfied: where is the value of , is the maximum value of of all initial hypothetical planes, is related to the quantity and precision specified for extracting hypothetical plane, and the recommended value for is 50%. Step 5. Repeat steps 2 to 4 on all axis directions, and the final axis-aligned hypothetical plane fitting result is obtained.
3.2.2. Nonaxis-Aligned Plane Fitting
Unlike plane fitting in axis-aligned direction, it is difficult to determine the normal vector of those nonaxis-aligned planes. As shown in Figure 5, for the same nonaxis-aligned line segment cluster, there are multiple possibilities of the plane clustering direction (or the normal vector direction of the hypothetical plane). From the information provided by the existing line segment clusters, we can determine the direction of the plane clustering and find the most possible hypothetical plane distribution. Suppose H is the set of nonaxis-aligned line segment, and F is the final set of hypothetical planes. The mean shift algorithm is selected in the paper to conduct plane fitting due to its three advantages: (1) it does not need any prior information about the characteristics of the data model and can be applied to clustering of any feature space; (2) it only need one parameter (bandwidth of the kernel function) as input; and (3) it adopts statistical features and has strong robustness to noise. Figure 6 is the process sequence diagram, and the detailed description (refer to Algorithm 3 in the Supplementary Section) is as follows: Step 1. Initialize the set H of line segments to be clustered: take out each member from set N, the nonaxis-aligned line segment cluster, and add all its line segments to H. Step 2. Generate the sparse point cloud : generate sparse point cloud from a 3D line segment in H by means of equidistance point sampling. Step 3. Determine the direction of plane fitting: randomly pick two line segments and , compute the normal vector of the corresponding plane S, and decide whether is the valid plane clustering direction by counting the number of line segments embedded in plane S in H. If the number is greater than given threshold (the recommended value is 3), will be determined to be the effective plane clustering direction, and the subsequent plane fitting operation will be performed; otherwise, two new line segments will be randomly picked from H again, and repeat aforementioned operation. Step 4. Fit planes based on the mean shift method: if is the valid clustering direction, perform mean shift clustering on point cloud to obtain the point cloud cluster set , which will be used to fit the corresponding hypothetical planes. Step 5. Filter hypothetical planes: for each hypothetical plane in , if its line segment number is times greater than total line segment number in H, it will be treated as valid and be added to F, and all corresponding line segments will be removed from H, where the recommended value for is 50%. In addition, those left invalid hypothetical planes in will be kept first and processed later. Step 6. Repeat steps 2 to 5 until H is empty or the cycle number reaches the upper limit. Step 7. Find the optimal hypothetical plane cluster: suppose after r cycles of operations from steps 2 to 5, there are r hypothetical plane clusters in total, and represents the jth hypothetical plane. We define a cost function shown in (4) to evaluate the credibility of each hypothetical plane cluster: where is the value of the ith hypothetical plane in , and m is the total hypothetical plane number in . From the r hypothetical plane clusters, select the one with the largest as the optimal hypothetical plane cluster and name it as . Step 8. Filter hypothetical planes: filter using cost function W2 defined in (2), and add to the set F. Step 9. Traverse each member in the nonaxis-aligned line segment cluster set N, repeat steps 1 to 8 to get the final nonaxis-aligned hypothetical plane fitting result.
Additionally, the criterion is that the optimal hypothetical plane cluster is the one with the least number of hypothetical planes, and the maximum sum of hypothetical planes credibility is used to solve the problem of direction ambiguity of hypothetical plane fitting in the nonaxis-aligned direction. Now, the building hypothetical plane model is obtained and can provide reliable building’s plane information for the subsequent triangle mesh subdivision algorithm.
4. An Improved Expanding Algorithm Based on Triangle Mesh Subdivision
Considering the building’s spatial and geometry characteristics, we improve the expanding mechanism of the PMVS algorithm based on triangle mesh subdivision. By using the information of the hypothetical plane model of the building, semidense point cloud is obtained from spatial subdivision of the triangle meshes of the building and is used to replace sparse point cloud generated by the original PMVS algorithm working as the expanding seed points in the PMVS pipeline, so that the efficiency of the PMVS algorithm can be improved.
4.1. Working Flow
The point cloud cluster in the hypothetical plane obtained from the above hypothetical plane fitting algorithm is used as the input data, and the working flow of the improved algorithm based on triangle mesh subdivision is shown in Figure 7. The detailed steps are as follows: Step 1. Construct the initial Delaunay triangle meshes Step 2. Subdivide and refine the initial model Step 3. Use the Nelder–Mead nonlinear optimization algorithm to optimize and filter the semidense point cloud through patch optimization and filtering Step 4. Reconstruct the triangle meshes of the point cloud and update the normal information Step 5. Repeat steps 2 to 4 within times, and a recommended value for is 3 Step 6. Set the optimized semidense point cloud as the seed points of the PMVS algorithm, repeat the expansion and filtering process defined in the original PMVS pipeline for times, and the final dense point cloud is obtained, where a recommended value for is 2
4.2. Building the Initial Triangle Meshes
To implement subdivision, the initial triangle mesh should be obtained first. For each hypothetical plane, the triangle mesh is reconstructed using the GPU-based 2D constrained Delaunay triangulation algorithm  with line segments derived from the 3D line segment model as constraint. And mesh information needs to be initialized after the reconstruction, including the normal vector, visibility information, and subdividing penalty value. The normal vector is the average unit normal vector of all triangles planes, which contain this vertex. The visibility information is a set of images in which the vertex is visible, which is introduced by PMVS . The subdividing penalty values are initialized to 0, which will be described in detail in Section 4.5.2.
4.3. Subdivision of Triangle Meshes
In the subdivision process, the area of the triangle patch is used to determine whether it needs subdivision. If its area is bigger than a certain threshold, the center point of the triangle patch is added, and the original triangle patch is subdivided into three triangle patches. The process could be described in Algorithm 4 in the Supplementary Section. And the detailed steps are as follows: Step 1. Given the th of the triangle in the initial triangle mesh , the area threshold can be computed as where represents the average triangle area of the triangle mesh, loop represents the cycle number of subdivision, and represents the subdivision penalty value of , which can be computed as where , , and , respectively, represent subdivision penalty values of three vertices in . Two sets and are used to represent the set of triangles to be subdivided and the set of temporary triangles generated during subdivision process, respectively, and compute the area of ; if >, add to , or otherwise, repeat step 1 on the next triangle of . Step 2. For the th of the triangle in the set UT, compute the center point of , and add the center point to the point cloud as a new expanded vertex. Update the normal vector of and the visibility information as follows: set the normal vector of to the normal vector of plane ; and set the visibility information of as , where , , and represent the visibility information of the three vertices of , respectively. Step 3. After inserting point to the point cloud as the new expanded vertex, and the original three vertices of the triangle t(m2) will form three new triangles, and then add the three new triangles to the set . Step 4. Repeat steps 2 to 4 for each triangle in the set , and exchange the set and after each cycle. Step 5. Repeat the step 4 until set is empty. Step 6. Repeat steps 1 to 5 for the next triangle until all triangles in are traversed.
4.4. Patch Optimization and Filtering
By using the aforementioned triangle mesh subdivision algorithm, the initial triangle meshes are greatly refined. However, such subdivision is executed indiscriminately, and consequently noise vertices could be introduced. Therefore, additional optimization and filtering operations are needed to remove those noise points.
In the original PMVS pipeline, three optimization parameters are used in the subdivision optimization process : patch position and two Euler angles related to the normal vector of patch . This paper fully uses the geometric information provided by the triangle mesh of the hypothetical planes and reduces the parameters in the original nonlinear optimization of PMVS to just one parameter: the patch position . And the normal vector of the patch remains unchanged during the optimization process. The patch position of the original PMVS algorithm is limited to the line which connects the center of the patch to the camera’s optical center corresponding to the reference photo. In our algorithm, the patch is restricted to move along the normal direction during the optimization process.
Since the normal vector of the patch is regarded as a known parameter in this process, the number of optimization parameters is reduced, and the optimization speed can be therefore improved.
4.5. Triangle Mesh Updating Based on Subdivision Penalty Mechanism
4.5.1. Update of Vertex Normal Vector
The point cloud of a given hypothetical plane has changed a lot after patch optimization and filtering. It is necessary to reconstruct the triangle mesh and update the normal vector information of all vertexes in the new triangle mesh using the average of all normal vectors of those triangles containing the vertex.
4.5.2. Subdivision Penalty Mechanism
If we subdivide every triangle in the mesh using the subdivision strategy mentioned in Section 4.3 but without a penalty mechanism, each region of the hypothetical plane will almost have the same point cloud density. When there is a hole in the hypothetical plane, such an algorithm will not only downgrade the processing efficiency but also bring additional noise points.
To overcome the problem, a penalty mechanism is introduced to determine whether to perform subdivision on certain triangle patch guided by the results of patch optimization and filtering to improve both the processing efficiency and the accuracy of the generated point cloud.
Suppose is used to represent the new triangle mesh corresponding to the initial triangle mesh after triangle mesh subdivision, patch optimization, and filtering. The process of the subdivision penalty mechanism could be described in Algorithm 5 in the Supplementary Section. And the detailed steps are as follows: Step 1. Compute the subdivision success rate r of all triangles in the initial triangle mesh . The subdivision success rate is used to decide whether a triangle should be punished. For the ith of the triangle in , the subdivision success rate is computed using where is the total number of vertices obtained after the subdivision of the triangle , and indicates the number of vertices that the triangle contains after the improved patch optimization and filtering process. The subdivision success rate reflects the probability that is in a hole region: generally, the smaller is, the more probable is in a hole region. As a result, all vertices contained in should be punished, which makes them have low probability to be subdivided. Step 2. For the jth vertex in the new triangle mesh , use to represent its penalty value, and use to represent its penalty in mesh if it does not come from subdivision. If comes from subdivision, find the parent triangle of , and then go to Step 3; otherwise, , the penalty value of will not update, and then go to Step 4. Step 3. If the subdivision success rate of (the parent triangle of ) satisfies the following conditions: then the needs to be penalized, and we need modify the penalty value of using (9); otherwise, just directly update it using (10): where , , and , respectively, represent the subdivision penalty values of the three vertices of the parent triangle , is a preset penalty value, is a preset subdivision success rate threshold, and the recommended value for is 0.5. Step 4. Repeat steps 2 to 3 for each mesh vertex in , and finally complete the update of the mesh vertex subdivision penalty value of .
5. Experiment and Analysis
5.1. Test Datasets and Test Environment
To verify the performance of the algorithm presented in this paper, four sets of building sequence photos are used, which are named house1, house2, hall, and coffee shack, respectively. The detailed information such as the photo numbers and resolutions are shown in Table 1, and the previews of the four datasets are shown in Figure 8.
The hardware configuration is as follows: CPU: Intel Core i5-6500 @ 3.2 GHz quad core; GPU: NVIDIA GeForce GTX 750, 2 GB RAM; and memory: 16G.
The experiment carried out below can be divided into three parts. The first part is designed to verify the effectiveness of the hypothetical plane fitting algorithm, and the second part is designed to verify the effectiveness of triangle mesh reconstruction and subdivision. The third part runs the whole algorithm flow and compares the time consumption and the quality of point cloud reconstruction with the original PMVS algorithm.
5.2. Hypothetical Plane Fitting
5.2.1. 3D Line Segment Clustering Algorithm Based on Manhattan World Assumption
Before the experiment, the COLMAP’s SfM algorithm  is used to perform camera calibration and sparse point cloud reconstruction for each experimental dataset; thus, the camera intrinsic and extrinsic parameters, initial feature point matching information, and sparse point cloud information are obtained. After that, Line3D++ , a 3D line segment model reconstruction algorithm, is used to obtain all 3D line segment information for each dataset.
These 3D line segment models are used as the input data, and clustering operations are performed using a 3D line segment clustering algorithm based on Manhattan world assumption. The 3D line segment clustering results of the four datasets are shown in Figure 9 (different types of line segments are represented using different colours in the figure), and the figure shows that the algorithm successfully extracts the main directions of the building line segment model, which includes the main axis-aligned and nonaxis-aligned directions. It could be found that(1)It performs effectively on the nonaxis-aligned direction and works robustly on clustering of short line segments. As shown in Figure 10, the nonaxis line segments in the datasets house1 and hall are still clustered correctly, although they are very short (about 1 m in Figure 10(a) and 0.2 m in Figure 10(b)), and the roof on the left side of the dataset house2 is not extracted just because the detected segments are too sparse.(2)It works robustly when there is strong noise caused by natural landscapes. Most of the natural landscapes do not follow the Manhattan world hypothesis, and its reconstructed line segments can easily be filtered out because it is difficult to find their category in line segment clustering. Therefore, besides line segment clustering functionality, it also has the filtering function of line segment noise brought by natural landscape or nonartificial objects. In the dataset coffee shack, the trees introduce a large number of noise line segments to the Line3D++ reconstruction result, as shown in Figure 11(a), while after using the line segment clustering algorithm proposed in this paper, most of these noise segments are filtered out, as shown in Figure 11(b).
5.2.2. Multihypothesis Plane Fitting Based on the Mean Shift Method
After obtaining the 3D line segment clusters of each dataset, the multihypothesis plane fitting algorithm based on the mean shift method is used to fit the point cloud sampled from the line segment model. The results of plane fitting are shown in Figures 12 and 13. In the two figures, the hypothetical plane models of each dataset are presented in a point cloud clustering manner. Each point cloud cluster represents a hypothetical plane using different colours.
The fitting results show that the main planes of each building in the four datasets (including the main facades and the planes where the roof is located) are correctly extracted.
The extraction completeness information of hypothetical planes is shown in Table 2, where the completeness rate refers to the ratio of the number of final extracted hypothetical planes to the number of actual hypothetical planes. The actual hypothetical plane refers to the plane determined manually from the 3D line segment model of the dataset (tiny planes are not counted). Table 2 shows that our algorithm maintains a high completeness rate of hypothetical plane extraction for each dataset. Specially, the root causes of the five cases less than 1.00 are as follows: (1) there are too few reconstructed 3D line segments to successfully recover the plane information in the missed hypothetical plane; and (2) the quantitative area of the missed plane is too small or the line segments is too short.
Although the algorithm may miss some hypothetical planes with small areas, the main hypothetical planes are correctly extracted, which will show little effect on the subsequent triangle mesh subdivision.
The consumed time information of the test is shown in Figure 14. The relationship between the consumed time and the total number of 3D line segments shows a linear-like relationship, and the time on line segment clustering is much less than that of plane fitting, which means the time is mainly consumed on the mean shift clustering method.
5.3. Triangle Mesh Construction and Subdivision
5.3.1. Triangle Mesh Construction
In this test case, triangle mesh models are reconstructed using the algorithm described in Section 4.2 on the hypothetical planes model of the four datasets. Since the constrained Delaunay triangulation algorithm used in the paper is limited by the line segment in the 3D line segment model, the reconstructed triangle mesh preserves the geometric features of the building well. As shown in Figure 15, the window, doors, and facades can be clearly identified. In addition, a “fake mesh region” phenomenon caused by noise line segments from natural landscape appeared in the results of the dataset coffee shack, which will not affect the final reconstruction results because it will be filtered by the subsequent subdivision penalty strategy.
5.3.2. Triangle Mesh Subdivision Algorithm
In this case, the algorithm described in Section 4.3 is used to perform multiround subdivision of the triangle meshes on the dataset house1 with different loops. The test result of a wall is shown in Figure 16. The figure shows that the spatial subdivision algorithm can uniformly and densely subdivide the triangle meshes with the normal vectors of points transiting smoothly. There is no erroneous normal vector being contrary to that of the hypothetical plane, which proves the validity of updating the normal vector in the spatial subdivision algorithm.
Figure 17 shows the results of the subdivision test with a penalty mechanism. The subdivision algorithm with a penalty mechanism is able to treat those triangles discriminately, which effectively avoids the invalid subdivision of the meshes in the real cavity region, and improves the efficiency and accuracy. The subdivided meshes are able to faithfully present the original real plane: sparse in the cavity region and dense in the noncavity region.
5.3.3. Improved PMVS Patch Optimization Algorithm
In this test case, the improved PMVS patch optimization algorithm in Section 4.4 and the original PMVS patch optimization algorithm are used to optimize the same input point cloud produced by triangle mesh subdivision with different loop numbers separately. The comparison of the consumed time and the success rate are shown in Figures 18 and 19.
Figure 18 shows that the time consumption of the improved patch optimization algorithm is significantly lower, and the running efficiency is increased by 65% on an average comparing with the original algorithm. Figure 20 shows the visual comparison between test results of the original and the improved patch optimization algorithms on point cloud from different viewports, and the processed point cloud is obtained after two rounds of triangle mesh subdivision. As shown in Figure 20, the improved algorithm has the same optimization ability as the original algorithm, and there is no clear visual difference on the two groups of optimized point clouds.
From all above speaking, the comparison shows that the optimization effect of the improved optimization algorithm and the original PMVS optimization algorithm are visually similar. Although in some small area, the improved optimization algorithm could lead to partially missing points because of the failure of optimizing, and the consumed time of the improved optimization algorithm is only 35% of that of the original algorithm.
5.4. Integrated Experiment and Results Analysis
In the integrated experiment, algorithms including the hypothetical plane fitting algorithm based on the 3D line segment model, the mesh subdivision algorithm with the penalty mechanism, and the original PMVS expansion algorithm are used to reconstruct dense point clouds on the four datasets, respectively, and the setting parameters in the reconstructing process such as iteration number are shown in Table 3.
To compare our algorithm with the original PMVS algorithm, the original PMVS algorithm (actually PMVS2) is also run to reconstruct dense point clouds on the four datasets. The original PMVS algorithm has 3 iterations, and its parameters are all default settings.
To visually compare the final results, the seed point cloud reconstructed by the original PMVS algorithm and the improved algorithm are shown in Figures 21 and 22, respectively. From the two figures, we can find that the seed point cloud of the original PMVS algorithm is sparse and with a lot of noise, but the seed point cloud reconstructed by the improved algorithm is very clean and highly accurate.
The reconstructed point clouds of the original PMVS algorithm and our improved algorithm are compared in Figure 23. As shown in Figure 23, the reconstruction results of the improved algorithm are similar to that of the original PMVS algorithm or even no difference in the main part of the building, while they have some points missing in the nonmain part and nonbuilding area, which have been marked with a red circle in the figure.
Considering the algorithm efficiency performance, the comparing test results of running time and point cloud density between the original PMVS algorithm and the improved algorithm are shown in Figures 24 and 25, and it can be found that the improved algorithm is 15%∼23% faster than the original PMVS algorithm, while the point cloud losing rate is kept within −10.10%∼3.73%, which demonstrates that the improved algorithm efficiency is effectively improved. As known, the losing rate of point cloud refers to the reduction rate of the number of reconstruction points of the optimization algorithm comparing with the original algorithm. In this test, the improved algorithm shows the best performances on the dataset house1 with 10.10% more points and 23.32% less consumed time than the original PMVS algorithm, as shown in Figures 24 and 25, which proves that the improved algorithm is very suitable for buildings with simple geometric features and structures.
In addition, as shown in Figure 23, the point cloud reconstructed by the original PMVS algorithm contains many noise points (shown in the red oval box in Figure 23(a)), which proves that the improved algorithm not only improves the efficiency but also improves the accuracy of the reconstructed model by removing a large amount of noise points.
It could be found that the final reconstructed point clouds lose partial points, and the root cause is that these three buildings have some special structures which are not included in the hypothetical plane model of buildings. And these structures are more independent of the building’s main body. When there are few seed points appearing on these structures, it is difficult to realize the reconstruction of these structures in the expansion iteration process of PMVS. For example, the two chimneys and column towers on the building’s roof in the dataset house2 and the white wall on the left side of the main cabin in the dataset coffee shack were not reconstructed by the algorithm. However, these parts are not the building’s main body, and the original PWMS algorithm also failed to reconstruct them and failed to build a consolidated building model.
Overall speaking, from the aforementioned experimental results, we can find that the improved PMVS algorithm achieves similar point cloud reconstruction accuracy with the original PMVS algorithm, but our algorithm is 15% to ∼23% faster than the original PMVS algorithm in running time, and our algorithm successfully removed many noise points in the buildings.
For accelerating building reconstruction, this paper presents a novel method, which generates seed points for the PMVS expansion pipeline using the 3D line segment information. Two algorithms are proposed in the study: a hypothetical plane fitting algorithm using the 3D line segment information of the building and a triangle mesh subdivision and patch optimization algorithm with a penalty mechanism, and the first algorithm works as the input of the second one:(1)Based on Manhattan world assumption, the hypothetical planes fitting algorithm clusters the 3D line segments by using greedy thought to extract the coordinate directions and which later is used to guide the subsequent procedures including the clustering of point cloud based on the mean shift method and the fitting of hypothetical planes(2)The triangle mesh subdivision and patch optimization algorithm generates the semidense point cloud using hypothetical plane information with a penalty mechanism, and then, the semidense point cloud is sent to the original PMVS expansion pipeline as the seed points(3)Experiments show that the improved approach not only accelerates the speed of building reconstruction but also provides visually better reconstruction results and more accurate performance through effectively removing noise points from the building model
The future work will focus on how to effectively recognize those small hypothetical planes in buildings and recover those small features without being affected by noises, and technologies such as deep learning would be used to recognize the building structures and guide the 3D reconstruction process.
All data used for this study can be accessed through the following URLs: (1) house1 is from https://github.com/zxg519/3D_reconstruction_dataset/tree/master/house1, (2) house2 is from https://demuc.de/colmap/datasets/, (3) hall is from http://mirror.openstreetmap.nl/sources/pmvs-2/data/hall/, and (4) coffee shack is originally from http://idav.ucdavis.edu/hkim/mvs/dataset/, and it can also be accessed on http://https://github.com/zxg519/3D_reconstruction_dataset/tree/master/coffeeshack.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
The authors would like to thank the support from Jiangsu Overseas Visiting Scholar Program for University Prominent Young & Middle-aged Teachers and Presidents, and thanks are also due to Prof. Ligong Wang from Soochow University for his great suggestion during our research. This work was supported by the National Key Technologies R&D Program of China (No. 2016YFB0502103).
This section includes supplementary information for pseudocodes of algorithms proposed in the paper. (Supplementary Materials)
M. Hess, S. Robson, and A. Hosseininaveh Ahmadabadian, “A contest of sensors in close range 3D imaging: performance evaluation with a new metric test object,” ISPRS—International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, vol. XL-5, pp. 277–284, 2014.View at: Publisher Site | Google Scholar
S. M. Seitz, B. Curless, J. Diebel, D. Scharstein, and R. Szeliski, “A comparison and evaluation of multi-view stereo reconstruction algorithms,” in Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’06), pp. 519–528, New York, NY, USA, June 2006.View at: Publisher Site | Google Scholar
L.-M. Shi, F.-S. Guo, and Z.-Y. Hu, “An improved PMVS through scene geometric information,” Acta Automatica Sinica, vol. 37, pp. 560–568, 2011.View at: Google Scholar
S. Ying, W. Wenjian, and B. Xuefei, “3D dense reconstruction method based on multiple features,” Journal of Computer Science and Technology, vol. 9, no. 5, pp. 594–603, 2015.View at: Google Scholar
T.-P. Wu, S.-K. Yeung, J. Jia, and C.-K. Tang, “Quasi-dense 3D reconstruction using tensor-based multiview stereo,” in Proceedings of the 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 1482–1489, San Francisco, CA, USA, June 2010.View at: Publisher Site | Google Scholar
B. Li, Y. Venkatesh, A. Kassim et al., “Improving PMVS algorithm for 3D scene reconstruction from sparse stereo pairs,” in Proceedings of the Pacific-Rim Conference on Multimedia, pp. 221–232, Nanjing, China, December 2013.View at: Google Scholar
M. Li, P. Wonka, and L. Nan, “Manhattan-world urban reconstruction from point clouds,” in Proceedings of the European Conference on Computer Vision, pp. 54–69, Amsterdam, Netherlands, October 2016.View at: Google Scholar
T. Holzmann, M. Maurer, F. Fraundorfer et al., “Semantically aware urban 3D reconstruction with plane-based regularization,” in Proceedings of the European Conference on Computer Vision (ECCV), pp. 468–483, Munich, Germany, September 2018.View at: Google Scholar