#### Abstract

On the basis of Alpha Shapes boundary extraction algorithm for discrete point set, a grid partition variable step Alpha Shapes algorithm is proposed to deal with the shortcomings of the original Alpha Shapes algorithm in the processing of nonuniform distributed point set and multiconcave point set. Firstly, the grid partition and row-column index table are established for the point set, and the point set of boundary grid partition is quickly extracted. Then, the average distance of the -nearest neighbors of the point is calculated as the value of . For the point set of boundary grid partition extracted in the previous step, Alpha Shapes algorithm is used to quickly construct the point set boundary. The proposed algorithm is verified by experiments of simulated point set and measured point set, and it has high execution efficiency. Compared with similar algorithms, the larger the number of point sets is, the more obvious the execution efficiency is.

#### 1. Introduction

From the traditional electronic total station, handheld satellite positioning collector, to mobile (vehicle-mounted/airborne) three-dimensional laser radar, modern spatial information acquisition technology has entered the era of massive data at the GB and TB level. How to store, process, and express massive data efficiently has become a new challenge to the related fields such as computer information technology (IT), computer-aided design/manufacturing (CAD/CAM), geographic information (GIS) and remote sensing (RS), and even building information modeling (BIM) [1–4]. The boundary information of the discrete point set is composed of discrete points representing the original contour features of the measured object. It is a basic and important technical work in spatial data processing to quickly and efficiently construct the boundary information from the discrete point set. Two-dimensional point set boundary information is the basic data of land area statistics, road earthwork calculation, and other engineering applications [5, 6]. Three-dimensional point set boundary information plays an important role in the process of 3D model reconstruction [7–10].

To construct the boundary information of point set, it is necessary to study the shape of point set composed of two-dimensional or three-dimensional discrete points. From the literature reading analysis, it is known that, since the 1970s, scholars and experts at home and abroad have successively carried out research work in this field and got good research results. Graham [11] proposed a scanning algorithm for determining the convex hull of a planar set, but the algorithm is only effective for convex hull boundary and cannot deal with the case of concave boundary; Jarvis [12] proposed a numerical method based on two-dimensional convex hull point set to express the contour geometry of two-dimensional point set, but again it can only deal with convex hull boundary; Sampath and Shan [13] improved R. A. Jarvis’s algorithm, which can be used to deal with two-dimensional discrete point concave boundary, but the algorithm is not as efficient as the new algorithm proposed by subsequent researchers. In the 1980s, Edelsbrunner et al. [14] gave a rigorous mathematical definition on the “shape of a set of points” based on two-dimensional plane point set and proposed a point set boundary construction algorithm called Alpha Shapes (AS). Based on rigorous mathematical definition, this algorithm can deal with complex discrete point set boundaries including convex hull, concave points, and holes. Ten years later, H Edelsbrunner extended the AS algorithm [15] to be applied to three-dimensional point set surface reconstruction, thus greatly expanding the scope of the application of the algorithm. Jochem et al. [16] used the AS algorithm to automatically extract the building roof from airborne LiDAR point cloud; Shen et al. [17] and Li et al. [18] used the improved AS algorithm to extract building contour; Wang et al. [19] used the improved algorithm to extract edges from massive point cloud data in mountainous areas; Li and Li [20] used the improved algorithm to reconstruct the 3D surface model from the point cloud data of handicrafts; Sun et al. [5] applied the improved algorithm to extract the plot boundary from the trajectory data points collected by the vehicle-mounted satellite navigation receiver of agricultural machinery and then finely measured the farmland area; Li et al. [21] applied the algorithm to construct the tree crown three-dimensional model; Fu et al. [22] applied the algorithm to construct a three-dimensional model from jujube tree point cloud.

#### 2. Alpha Shapes Algorithm

##### 2.1. Algorithm Content

In literature [14], Edelsbrunner gave a rigorous mathematical definition of the geometric shape of a two-dimensional plane point set, namely, -Shape. Let be a two-dimensional planar point set, and give any parameter ; the polygon extracted from by the AS algorithm rule is -Shape, which can be used to express the boundary contour of the point set, and its precision is determined by the value of the parameter .

Simplified AS algorithm steps are as follows: Step 1: Input two-dimensional point set , and calculate the point set average point spacing as the value of . Step 2: Traverse , take as the center of the circle, and take the length not greater than as the radius to construct the search circle, and search the point set . Step 3: Traverse and construct -shape criterion to determine whether line segment is a boundary edge. If so, add it to -shape set and return to Step 2. If not, proceed to the next point until the traversal is complete and return to Step 2. Step 4: After traversing is complete, output -Shape collection.

The simplified flow chart of Alpha Shapes algorithm is shown in Figure 1.

Among these steps, the third step, “construct -shape criterion,” is the core step of the algorithm, and the rules are as follows.

Draw a circle with radius (as and in Figure 2). If no point in point set falls into the circle, the line segment is the boundary line; otherwise, it is the nonboundary line. The judgment method of whether a point in the point set falls into the circle is shown in Figure 2. According to formula (1), the center point is calculated, the point set is traversed, and the distance between each point (as in Figure 2) and the center point is calculated. If , it indicates that a point falls into the circle.

The formula for calculating the coordinates of the center point can be obtained via the “distance intersection algorithm” in Geomatics [23]:

In the formula, .

The process of constructing point set boundary by AS algorithm can be understood as a circle with radius rolling outside the edge of the point set . When the value of is appropriate, the trajectory that the circle rolls through is the boundary of point set , as shown in Figure 3. At the range of ， when , all points in are boundary points; when , the convex hull of is the boundary; when the value of is reasonable and the distribution of points in is uniform, the AS algorithm can construct the boundary of point set in an ideal way. If the value of is too large, the turning angle of the lines connecting the boundary points would be replaced by a larger blunt angle, resulting in a blunted effect at the corner, as shown in the shaded area in Figure 3.

##### 2.2. Algorithm Shortcomings

Compared with the previous similar algorithms, AS algorithm has the advantages of rigorous mathematical definition, the ability to deal with complex two-dimensional point set boundary and three-dimensional point set surface, and a wide range of applications, but there are also shortcomings. The only parameter in the algorithm determines the fineness of the point set shape, which needs to be manually input and adjusted according to different scenarios. At the same time, the unicity of parameter determines that the algorithm is very suitable for processing convex hull point sets with uniform distribution density but cannot ideally deal with the two following scenarios:(1)For point sets with nonuniform density distribution per unit area, such as the discrete points at the farmland boundary collected by handheld or vehicle-mounted satellite navigation receiver, and the three-dimensional laser point cloud of bare tree branches, the processing effect is not very ideal.(2)For the point set composed of many concaves, such as complex buildings, roads, water flow, and other linear features, the processing effect in the concave corner area is not perfect. Therefore, it is necessary to improve and perfect the algorithm.

When applying AS algorithm to extracting building contour, it is found in [17] that if the value of is too small, the building contour point set will be very fragmented; if it is too large, the concave corner area of the building will be distorted due to being excessively blunted (Figure 3). According to different application scenarios, literatures [5, 18] put forward an improved algorithm called “dual threshold Alpha Shapes,” aiming at the specific problems caused by the excessively single value of . Based on the AS algorithm, literature [19] used the grid detection method to quickly filter nonboundary points and proposed an algorithm for fast extracting edges from massive point clouds. However, the value of the algorithm is fixed, which still cannot overcome the specific problems caused by the single value of . Literature [20] proposed a surface reconstruction algorithm using “self-adaptive step Alpha Shapes algorithm,” which can automatically calculate different values according to the point density of different regions of the point set and better deal with the problem caused by nonuniform point distribution. However, this algorithm requires that every point in the point set has the same status to participate in the search calculation, and there is still room for the improvement of the execution efficiency. Based on the algorithms proposed in literatures [14, 15], this paper puts forward a grid partition variable step Alpha Shapes (GPVAS) algorithm, which has higher computational efficiency while solving the problems caused by nonuniform distribution of point sets.

#### 3. Grid Partition Variable Step Alpha Shapes Algorithm

##### 3.1. Algorithm Overview

The main improvements of the GPVAS algorithm are as follows:(1)The point set is partitioned, the nonboundary grid area is removed by fast filtering, and the boundary grid point set is extracted.(2)For points in point set , within the range of point set , variable step Alpha Shapes (VAS) algorithm [20] is applied to construct the boundary shape of point set . Figure 4 shows the simplified flow chart of GPVAS algorithm.

##### 3.2. Extracting Boundary Grid Partition Point Set

This step consists of two steps: “grid partition” and “extracting point set of boundary grid partition.” The steps are as follows.

###### 3.2.1. Grid Partitions

The envelop rectangle of point set is constructed, and its inflexion points are respectively.

The envelop rectangle is divided into grid partition set by the square grid with side length of (usually 2-3 times the average point spacing of point set ), wherewhere is the number of rows, is the number of columns, and the symbol [] is the integer of real numbers.

###### 3.2.2. Extracting Point Set of Boundary Grid Partition

A row-column index table is established for the point set , and the formula for calculating the row-column index value of point is as follows:where is the row number in the grid partition where point is located and is the column number in the grid partition where point is located. Thus, the mapping relationship between point and grid partition is established, as shown in Figure 5.

Traverse the grid, and quickly determine whether the grid contains a point from the row-column index table of point set . If it contains a point, it is 1; otherwise, it is 0.

Traverse the grid, extract the point set of boundary grid partition (the diagonal filling grid shown in Figure 5), and then quickly obtain the boundary grid point set through the row-column index table of the point set . The judgment rule of the boundary grid is that the boundary grid contains points, and at least one of its eight adjacent grids does not contain any point.

##### 3.3. Extracting Boundary Grid Partition Point Set

Once the value of parameter in AS algorithm (i.e., the search step in constructing -shape) is set, it is constant throughout the boundary construction process, which is the reason for the unsatisfactory effect of the algorithm in dealing with the nonuniform distribution point set or point set containing concave points. Literature [20] proposed VAS algorithm and used kd-tree to calculate the average distance of -nearest neighbors of the point as value to participate in the construction of -shape. The average distance of -nearest neighbors of a point is the average distance between the nearest number of points and the point in a point set. At this point, the value of is variable; when the point density is large, the value is small, which ensures the continuity of the boundary and the high -shape construction efficiency into account; when the point density is small, the value becomes larger, which can prevent boundary fragmentation caused by the excessively small value. However, in the process of calculating value and applying value to construct -shape, the VAS algorithm needs to search the entire point set under the worst condition. The time complexity is , and the efficiency of the algorithm is not ideal.

The GPVAS algorithm proposed in this paper still adopts the calculation steps of VAS algorithm, first calculating the value and then applying the value to construct the -shape. The biggest improvement of GPVAS algorithm lies in the grid partition and point set row-column index table constructed based on the above steps. No matter whether it is calculating value or constructing -shape, it is convenient to take the grid where the target point or center point is located as the center and search layer by layer from small to large distance (as shown in Figure 6), which is equivalent to arranging the point set in ascending sort order according to the distance value from the target point or center point. The experimental results show that the efficiency of the algorithm is significantly improved.

#### 4. Comparative Experiments

To intuitively verify the time efficiency of the algorithm, two kinds of data are designed for experimental verification based on AS algorithm, VAS algorithm and GPVAS algorithm: one is the data point set of computer numerical simulation (hereinafter referred to as the simulated point set), and the other is the data point set of engineering measurement (hereinafter referred to as the measured point set).

##### 4.1. Comparison of Simulated Point Sets

The data of random point set generated randomly by circular analytic formula (4) according to the density of the upper semicircle point is 3 times that of the lower semicircle point, and the inner point of the upper and lower semicircle is generated randomly according to the uniform distribution, as shown in Figure 7.

The effective range of parameter in the calculation of the average distance of -nearest neighbors in VAS algorithm and GPVAS algorithm is [9, 24]. According to the experimental statistics in literature [20], is considered to be more ideal. In this paper, is also used to calculate the nearest neighbor average distance as , and 2 is taken as the grid edge length. The experimental laptop is configured as Intel(R) Core(TM) i7-10750H [email protected] GHz, 16 G memory, and 64-bit Windows 10 operating system; experimental algorithm program is based on Microsoft Visual Studio 2010 IDE, C# language; the simulation point set is divided into four control groups (1000, 5000, 10000, and 20000) according to the number of points. The experimental results are shown in Table 1.

According to the comparison of experimental results in Table 1, when the number of point sets is small, the efficiency of VAS algorithm and GPVAS algorithm is lower than that of AS algorithm. The order of efficiency is as follows: AS > VAS > GPVAS. This is because the VAS algorithm and GPVAS algorithm need to spend some time to dynamically calculate the average distance of -nearest neighbors as value before each execution of the -shape criterion to filter the boundary points. In addition, GPVAS also needs to establish a grid partition and row-column index table for the point set. As the number of points increases, the execution efficiency of VAS algorithm and GPVAS algorithm begins to improve. Compared with AS algorithm, the execution efficiency of VAS algorithm can be improved by 20%∼30%. When the number of points increases, the efficiency of GPVAS algorithm increases rapidly. When the number of points is 20,000, the execution efficiency increases by more than 10 times. The order of the execution efficiency of the algorithm is GPVAS > VAS > AS. This is because the time-consuming distance budget time complexity of the AS algorithm is , and the GPVAS algorithm is able to construct the grid partition and row-column index table of the point set with negligible extra storage space before the -shape criterion filtering the boundary point set with the time complexity of , so as to quickly extract the boundary point set. Compared with the increase in the total number of point sets, the increase in the number of boundary point sets constructed by grid partitioning is very limited.

##### 4.2. Comparative Study of Strip Terrain Points on Mountain Highway

In order to verify the effectiveness of the algorithm applied to the measured point set, this paper selects the highway strip terrain data (Figure 8), which is composed of more than 8000 GPS measuring points with a total length of about 17 km in the mountainous area of southern Anhui, and the average distance between the measuring points is 28.513 m. There are many curves in the strip terrain data in mountainous areas, dense collection points in undulating sections, and sparse collection points in flat sections. The density distribution of point set is not uniform, and there are many concave areas, so it is an ideal experimental data to verify the algorithm. The experimental results are shown in Table 2; Figures 8 and 9(a) to Figure 9(c) (Figures 9(a)–9(c)) are the enlarged images of the boundaries extracted by different algorithms at highway curve in Figure 8.

**(a)**

**(b)**

**(c)**

The experimental results show that all the three algorithms can be used to construct the boundary of experimental data, and the construction efficiency is GPVAS > VAS > AS. Regarding the construction results, GPVAS algorithm and VAS algorithm are almost the same with good effect (Figure 9(c)), while the AS algorithm is greatly affected by the set value of . When the value of is small (10 m in the experiment), the boundary is relatively fine, and it is easy to form the boundary of holes in the area of sparse points (Figure 9(a), where the diagonal filling area is determined as the hole). When the value of is large (30 m in the experiment), the boundary is relatively crude (Figure 9(b)), and the algorithm execution efficiency is lower than that when the value of is small.

#### 5. Conclusions

Based on the Alpha Shapes algorithm for extracting the boundary of discrete point sets, this paper analyzes and summarizes the previous research work. In view of the shortcomings of Alpha Shapes algorithm in processing nonuniform distributed point sets and multiconcave point sets, this paper proposes the grid partition variable step Alpha Shapes algorithm, which is used to quickly construct the boundary of point sets. This algorithm has two main advantages:(1)Establish grid partition and row-column index table for point set, quickly filter nonboundary point partition, and extract boundary grid partition point set involved in subsequent -shape construction; compared with the increase in the total number of point sets, the increase in the number of point sets of boundary grid partition constructed by grid partition is very limited, which is the main reason why GPVAS algorithm can effectively deal with a large number of point sets.(2)The average distance of -nearest neighbors of the point calculated by kd-tree is used as the value. In the region with dense point distribution, the value is small, and, in the region with sparse point distribution, the value is large, so that the algorithm can well deal with the regional boundary with nonuniform point distribution.

The algorithm is verified by simulated point set and measured point set, and the execution efficiency of the algorithm is very high. Compared with similar algorithms, the larger the number of point sets is, the more obvious the efficiency improvement is. As an alternative algorithm, this algorithm has been effectively verified in engineering scenarios such as land area statistics and road earthwork calculation. In the field of 3D point cloud surface reconstruction with broader application scenarios, this algorithm has not been verified, which is also the follow-up research direction of this paper.

#### Data Availability

The data used to support the findings of this research were generated from experiments.

#### Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

#### Acknowledgments

This research was funded by the National Natural Science Foundation of China (41906168), Natural Science Research Project of Anhui Education Department (KJ2018JD04), and AHJZU-Anhui Huali Construction Co., Ltd., Joint Research Project (HYB20190152).