Abstract

For complex CAD models, model segmentation is an important support technology for model retrieval and reuse. In this article, we propose a novel CAD model segmentation method that fuses the program/project evaluation and review technique (PERT) with Laplacian spectrum theory. By means of PERT, spectral theory, and the geometrical and topological information of the CAD model, we transform the faces of the b-rep model into two-dimensional coordinate points corresponding to the nodes of the attributed adjacency graph (AAG). The k-means approach with the Silhouette coefficient is then employed to conduct unsupervised learning on the coordinate points, dividing the point set into several groups to achieve the model segmentation. The experimental results demonstrate that (1) the proposed approach can effectively transform the b-rep model into a two-dimensional coordinate point set; (2) the k-means algorithm can efficiently cluster the points to achieve segmentation; and (3) in view of human cognition, the segmentation results are more reasonable.

1. Introduction

In CAD model retrieval, model decomposition is an important support technology and has long been an issue of concern in computational geometry. For complex CAD models, model segmentation is an important support for model retrieval and reuse. In partial retrieval especially, it aims at dividing a complex CAD model into several simple components; that is to say, a CAD model is converted into a set whose elements are its components. The efficiency of partial retrieval is then determined by the granularity of these model elements. When the models are complex, neither the triangles in mesh models nor the faces in b-rep models are sufficiently large [1]. Therefore, it is imperative to introduce model segmentation, in which models are divided into several components with basic shape features that replace triangles or faces as the elements.

Traditional segmentation methods are primarily used for mesh models rather than general mechanical CAD models [1]. Their core idea is to divide the model surface into a group of connected subgrids with simple shapes, e.g., by calculating the discrete curvature [2, 3]. This kind of segmentation can be interpreted only in a purely geometric sense, without any semantic meaning, so it is difficult to use the segmented regions for model retrieval. Furthermore, these methods incur a high computational cost, which limits their use for mechanical CAD models.

To solve the issues mentioned above, we introduce a novel segmentation approach that fuses the program (or project) evaluation and review technique (PERT) with spectral theory.

The PERT is used in project management to analyze and schedule the tasks involved in completing a given project. It is a statistical tool based on a network diagram that calculates the time parameters and allocates resources for each task [4]. Castro et al. presented how to use the PERT to control a project [5]. Ballesteros-Pérez updated the PERT and proposed a refined M-PERT to model real-life projects and deal with activities that have uncertain durations [6].

Spectral theory is a branch of mathematics that relates the eigenvalue spectra of the adjacency matrix or Laplacian matrix of a graph to other geometric invariants of the graph [7, 8]. A graph spectrum is an ordered sequence of the eigenvalues of the adjacency matrix; it describes the topological structure of the graph in vector form and can be computed in polynomial time. The Laplacian matrix can be used to find many useful properties of a graph [9]. Chung proposed a refined version of the graph spectrum, which is based on the Laplacian matrix of the graph and correlates with the graph invariants better than the spectrum of the original adjacency matrix [10].

In this article, we propose an innovative method for b-rep models: the PERT and spectral theory are employed to preprocess the CAD model and transform each face of the b-rep model into a two-dimensional coordinate point, and the k-means approach with the Silhouette coefficient is used to conduct unsupervised learning on the coordinate points. The clustering results are evaluated by the Silhouette coefficient, which helps ensure accuracy and uniqueness. In this study, we mitigate the problems above by means of this new partitioning method, and the results show that it yields more reasonable boundary regions.

3D CAD model segmentation has been an important issue in the computer-aided design area, and various segmentation algorithms have been proposed.

Most 3D mesh model segmentation algorithms are based on 2D image segmentation, i.e., the techniques are extended from 2D space to 3D space. The approaches can be broadly classified into three categories: (1) Region-based: based on features of the model (e.g., its surface), the model is divided into different, nonoverlapping regions using methods such as the region growing method [11, 12] and the watershed algorithm [13, 14]. (2) Graph-based: the model is generally represented as an undirected graph (a dual graph [3] or an attributed adjacency graph [15]) and segmented by graph cuts [16], normalized cuts [17], or spectral theory [18]. (3) Cluster-based: the extracted features of the model are clustered, and the points or surfaces of the model that have similar properties are obtained according to the clustering result [3, 19, 20].

Clustering-based methods are typical and widely used, especially for mesh models and point cloud models. Clustering is an unsupervised learning method that does not require training samples, and many clustering methods are important for model segmentation, including hierarchical clustering [21, 22], fuzzy clustering with graph cuts [23], and spectral clustering [24]. Shlafman et al. systematically introduced the steps of applying the k-means clustering algorithm to the segmentation of 3D polyhedral models [25]. Garland proposed a hierarchical clustering methodology in which the clusters can be well approximated with planar elements, which can be used for CAD model segmentation [26]. Katz and Tal used a hierarchical fuzzy k-means algorithm to segment objects [21]. As in Shlafman et al. [25], their algorithm clusters triangles with a distance function that is a weighted sum of the geodesic distance between the barycenters and the angle between the triangles. In the binary partitioning case, their algorithm partitions the mesh into two segmentation areas and a fuzzy area.

Previous literature on problems related to mesh model segmentation abounds. However, segmentation methods for mechanical 3D CAD solid models are rare.

Hitherto, few segmentation methods for CAD models have appeared. Wang et al. proposed a feature-based partitioning algorithm that extracts features from the attributed adjacency graph, using ant colony clustering to conduct the multiobjective optimization; the segmentation is achieved by optimizing the partitioning faces and the partitioning order [27]. From the assembly point of view, Wang et al. likewise developed a graph partitioning method based on ant clustering. The complicated CAD model was first represented and simplified by the attributed adjacency graph; subsequently, the connection mode and part attributes of the model were analyzed to construct the correlation degree matrix, and the local range and density function of the ant clustering algorithm were reconstructed according to the topological connection relations of the model [15]. Ma employed a surface region segmentation method for CAD models: according to the boundary concavity, the model is divided into a region set with the least number of areas, each composed of a few interconnected surfaces [28].

For the sake of clarity, Table 1 presents a comparison of the segmentation methods analyzed above.

These segmentation methods may encounter both a result uncertainty problem and an efficiency problem. The region-based algorithms exhibit satisfactory efficiency, but the results may not be unique, since the outcome depends on the seed faces with which the algorithm starts [1]. The cluster-based algorithms can yield a definite result, but their costs are high [1]; meanwhile, they are less effective at finding meaningful components. Additionally, the shape of a b-rep model is prominent and the boundaries between its surfaces are obvious, so partitioning simply by curvature does not give ideal results [32].

3. Model Description

In this section, we will present several basic concepts that are related to the model decomposition; subsequently, the concrete steps of the segmentation method are given.

3.1. Attributed Adjacency Graph for CAD Models

The b-rep model is typically used for the description and analysis of solid models, since it presents a visualized boundary of the model and represents the 3D model unambiguously [33, 34].

CAD systems often use some form of b-rep as their internal representation. Most of them generate and express the AAG of the model faces from a STEP file, as presented by El-Mehalawi and Miller [33]. A labeled typical mechanical model with its AAG is shown in Figure 1. An AAG represents the b-rep structure of an entity. In this context, the AAG is a four-tuple and can be formulated as

$$G = (N, E, P, d),$$

where $N$ is a set of nodes in the AAG that represent the model faces; $E$ is a set of edges in the AAG that represent the model edges; $P$ is a set of properties containing the face properties and edge properties; and $d$ is the node degree. A one-to-one correspondence exists between the faces of a model and its AAG nodes and between its edges and its AAG links.
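The four-tuple above maps naturally onto a small data structure. The following is a minimal sketch, not the authors' implementation: the class name `AAG`, its method names, and the attribute keys `face_type` and `convexity` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class AAG:
    """Attributed adjacency graph: nodes are faces, links are shared edges."""
    face_props: Dict[int, dict] = field(default_factory=dict)              # node -> face attributes
    edge_props: Dict[Tuple[int, int], dict] = field(default_factory=dict)  # link -> edge attributes

    def add_face(self, face_id: int, **props) -> None:
        self.face_props[face_id] = props

    def add_edge(self, f1: int, f2: int, **props) -> None:
        if f1 not in self.face_props or f2 not in self.face_props:
            raise KeyError("both faces must be added before linking them")
        # Store each undirected link once, with the smaller face id first.
        self.edge_props[(min(f1, f2), max(f1, f2))] = props

    def degree(self, face_id: int) -> int:
        """Number of links incident to the given face."""
        return sum(face_id in link for link in self.edge_props)


# Hypothetical three-face fragment; attribute names and values are illustrative only.
aag = AAG()
for face_id in (1, 2, 3):
    aag.add_face(face_id, face_type="plane")
aag.add_edge(1, 2, convexity=1)    # convex edge
aag.add_edge(3, 2, convexity=-1)   # concave edge
degrees = {f: aag.degree(f) for f in aag.face_props}
```

Storing each link under a sorted key keeps the graph undirected without duplicating entries, which mirrors the one-to-one correspondence between model edges and AAG links.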

3.2. Basic Conceptions

The geometric information of the b-rep model primarily comprises the face and edge information. Herein, the descriptions of the faces and edges are based on this geometric information.

3.2.1. Faces Code Representation

The attributes of the AAG nodes contain the geometrical and topological information of a face in a CAD model. We present a code that describes two properties of a face, namely, its type and its convexity. These two properties are represented by a pair of integers $(t, c)$, where $t$ encodes the face type and $c$ encodes the face convexity. In particular, $t$ takes a distinct integer value depending on whether the face is a plane, cylinder, cone, sphere, or other surface, and $c$ takes a distinct integer value depending on whether the face is planar, convex, concave, or other.

3.2.2. Edges Code Representation

The edges of the AAG represent the adjacency relationship between two surfaces in a 3D CAD model. As intersection boundaries, they correspond to one or more edges connecting the two faces. The external edge angle is defined as the angle between the outside faces of two adjacent surfaces, while the angle between the inside faces of two adjacent surfaces is called the internal edge angle. Both angles are illustrated in Figure 2. Here, the value of edge convexity is determined by the external angle (shown in Figure 2):

(1) Convex Edge. If the external edge angle is greater than $180^\circ$ (or the internal edge angle is less than $180^\circ$), the edge at which the two adjacent surfaces intersect is called a convex edge. The two adjacent surfaces can be any convex or concave surfaces. Figure 3(a) shows a convex edge between two adjacent surfaces [35].

(2) Concave Edge. If the external edge angle of the two adjacent surfaces is less than $180^\circ$ (or the internal edge angle is greater than $180^\circ$), the edge is called a concave edge. This case is shown in Figure 3(b) [35].

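(3) Adjacency Matrix. The adjacency matrix AM is a square symmetric matrix whose elements indicate whether pairs of nodes are adjacent in the attributed adjacency graph $G$. In this special case, using the edge convexity as the weight, the AM is defined as follows:

$$a_{ij} = \begin{cases} 1, & \text{if faces } i \text{ and } j \text{ share a convex edge}, \\ -1, & \text{if faces } i \text{ and } j \text{ share a concave edge}, \\ 0, & \text{otherwise}. \end{cases}$$

For an AM, $a_{ij} = a_{ji}$ and $a_{ii} = 0$ for all $i$ and $j$.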
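The construction of the signed adjacency matrix can be sketched as follows. This is a minimal illustration under stated assumptions: each AAG link is supplied as a hypothetical tuple `(i, j, external_angle_deg)`, the convex/concave threshold is taken to be 180 degrees, and the function names are invented for this example. An angle of exactly 180 degrees (a smooth transition) is not covered by the definitions above and is rejected here.

```python
from typing import List, Sequence, Tuple

CONVEX, CONCAVE = 1, -1


def edge_weight(external_angle_deg: float) -> int:
    """Return 1 for a convex edge and -1 for a concave edge."""
    if external_angle_deg == 180.0:
        raise ValueError("a 180-degree edge is neither convex nor concave")
    return CONVEX if external_angle_deg > 180.0 else CONCAVE


def build_adjacency_matrix(
    n_faces: int, edges: Sequence[Tuple[int, int, float]]
) -> List[List[int]]:
    """Build the symmetric, zero-diagonal AM from (i, j, external angle) links."""
    am = [[0] * n_faces for _ in range(n_faces)]
    for i, j, angle in edges:
        if i == j:
            raise ValueError("a face cannot be adjacent to itself")
        weight = edge_weight(angle)
        am[i][j] = weight
        am[j][i] = weight  # undirected graph, so the matrix stays symmetric
    return am


# Hypothetical fragment: faces 0 and 1 meet at a convex edge (external angle
# 270 degrees), faces 1 and 2 meet at a concave edge (external angle 90 degrees).
am = build_adjacency_matrix(3, [(0, 1, 270.0), (1, 2, 90.0)])
```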

3.3. Model Description

By means of the AAG, a b-rep model in 3D Euclidean space can be mapped into a 2D description space S. In the AAG, nodes and edges are represented by shape parameters that can be extracted from the STEP file and used for further calculation. Each surface of the 3D CAD model is transformed into a 2D coordinate point by the PERT and graph theory.

In a single-code network, the earliest start time (ES) of an activity is the maximum of the earliest finish times (EF) of its immediately preceding activities, and the earliest finish time of an activity is its earliest start time plus its duration.

For example, the ES of activity D depends on the EF of B and C; according to the maximum value principle, the ES of D is 9 (see Figure 4).
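The maximum value principle can be illustrated with a short forward pass. The sketch below is hedged: the durations and the precedence relations are hypothetical values chosen so that the ES of D equals 9, since the actual network of Figure 4 is not reproduced here, and the function name `forward_pass` is an assumption.

```python
from typing import Dict, List, Tuple


def forward_pass(
    durations: Dict[str, int], predecessors: Dict[str, List[str]]
) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Compute ES and EF for each activity.

    Activities must appear in `durations` in topological order, i.e., every
    activity is listed after all of its predecessors.
    """
    es: Dict[str, int] = {}
    ef: Dict[str, int] = {}
    for activity, duration in durations.items():
        preds = predecessors.get(activity, [])
        # ES is the largest EF among the immediate predecessors (0 if none).
        es[activity] = max((ef[p] for p in preds), default=0)
        ef[activity] = es[activity] + duration
    return es, ef


# Hypothetical network: A precedes B and C, and both B and C precede D.
durations = {"A": 3, "B": 4, "C": 6, "D": 2}
predecessors = {"B": ["A"], "C": ["A"], "D": ["B", "C"]}
es, ef = forward_pass(durations, predecessors)
```

With these assumed durations, B finishes at 7 and C at 9, so D cannot start before 9.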

Likewise, the AAG of a CAD model can be seen as a single-code network (see Figure 5). The weight is equivalent to the duration: the duration is 0 if the weight is 1 and D if the weight is −1. The horizontal coordinate of node 3, for example, depends on the maximum of the values propagated from its preceding nodes.

3.3.1. Horizontal Coordinates Computation

According to the PERT and the geometric and topological information of each surface of the model, the horizontal coordinate is determined in two steps.

Step 1: the coordinate matrix corresponding to the AM is computed row by row. Its i-th entry is the horizontal coordinate of the point corresponding to the i-th node, and it is defined from the following quantities: two weights; the face type and the face convexity of the j-th face; the number of −1 entries from the first to the i-th row in the j-th column of the AM; and a constant D. In this article, the value of D is 10.

Step 2: the temporary coordinates are calculated node by node. For the i-th node, the column indices j of the relevant entries in the i-th row of the AM are recorded in a temporary array ip. Subsequently, the horizontal coordinate of the i-th node is computed from the maximum value in rows 1 to i of the i-th column and, for each n-th value stored in ip, the maximum value in rows 1 to i of the corresponding column.

3.3.2. Vertical Coordinates Computation

The graph theory is introduced to describe the topological information of the graphs in vector form. A graph spectrum is an ordered sequence of eigenvalues of the AM and can describe the topological structure information of the graphs in vector form; additionally, the computation time is polynomial.

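The Laplacian matrix L of the model is calculated based on the AM as follows:

$$L(n_i, n_j) = \begin{cases} 1, & \text{if } i = j \text{ and } d_i \neq 0, \\ -\dfrac{1}{\sqrt{d_i d_j}}, & \text{if } n_i \text{ and } n_j \text{ are adjacent}, \\ 0, & \text{otherwise}, \end{cases}$$

where $n_i$ and $n_j$ are the nodes of the attributed adjacency graph and $d_i$ and $d_j$ represent the degrees of node $n_i$ and node $n_j$, respectively.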

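(1) Spectral Vector. The spectral vector $\Lambda = (\lambda_1, \lambda_2, \ldots, \lambda_n)$ consists of the eigenvalues of the Laplacian matrix in descending order, i.e., $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n$. In fact, $\lambda_i$ is the vertical coordinate of the corresponding point.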

Thus, a face f can be described by the two-dimensional point coordinates $(x, y)$, where $x$ is the horizontal coordinate and $y$ is the vertical coordinate.
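As a small worked illustration of the vertical-coordinate computation, consider a hypothetical chain of three faces in which face 1 is adjacent to face 2 and face 2 is adjacent to face 3. The derivation below assumes Chung's normalized Laplacian on the unsigned adjacency, so the edge convexity is ignored; it is a sketch under that assumption, not a result taken from the experiments.

```latex
% Degrees of the three nodes of the chain 1 - 2 - 3
d_1 = 1, \qquad d_2 = 2, \qquad d_3 = 1

% Normalized Laplacian: 1 on the diagonal, -1/\sqrt{d_i d_j} for adjacent nodes
L = \begin{pmatrix}
      1                   & -\tfrac{1}{\sqrt{2}} & 0 \\
      -\tfrac{1}{\sqrt{2}} & 1                   & -\tfrac{1}{\sqrt{2}} \\
      0                   & -\tfrac{1}{\sqrt{2}} & 1
    \end{pmatrix}

% Characteristic polynomial, expanded along the first row
\det(L - \lambda I) = (1 - \lambda)^3 - (1 - \lambda)
                    = (1 - \lambda)\,\lambda\,(\lambda - 2)

% Eigenvalues in descending order form the spectral vector
\Lambda = (2,\; 1,\; 0)
```

Under this assumption, the three faces would receive the vertical coordinates 2, 1, and 0.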

4. Segmentation Method

Based on the analysis of 3D CAD models, we propose a clustering-based segmentation approach. The main flow chart is shown in Figure 6. It contains five phases: (1) taking the b-rep model as input, (2) building the corresponding face attributed adjacency graphs of the 3D CAD models as introduced in the literature [33], (3) transforming the 3D CAD models into 2D coordinate point sets by fusing the PERT and spectral theory, (4) clustering the 2D coordinate points by using the k-means algorithm based on the Silhouette coefficient, and (5) segmenting the 3D CAD model according to the clustering result.

4.1. The k-Means Clustering Algorithm

Clustering is an iterative process of separating a set of samples into a number of groups with no supervised learning [36]. In general, k-means is widely used owing to its simplicity, versatility, and relatively higher efficiency over other clustering methods. Therefore, we make use of the k-means method to cluster two-dimensional coordinate points.

The k-means clustering algorithm gathers data points into k clusters, each of which is associated with a cluster centroid [37]. We denote the set of data points as $X = \{p_1, p_2, \ldots, p_n\}$, and let $d(p_i, p_j)$ be the distortion between any two points $p_i$ and $p_j$. Herein, $d(p_i, p_j)$ denotes the Euclidean distance between $p_i$ and $p_j$.

The vital part of the proposed algorithm is to construct and utilize the Laplacian matrix to reduce the dimensionality of the dataset. Specifically, the 3D model is transformed into a point set in a lower-dimensional eigenspace by utilizing the eigenvectors of a Laplacian matrix derived from the AAG. Subsequently, the k-means algorithm is performed on this point set to obtain the final clustering result.

The clustering problem can be regarded as the partitioning of an undirected graph, which is an unsupervised learning process. Suppose that an undirected graph is an AAG, i.e., $G = (N, E)$. Here, $E$ is represented by an adjacency matrix [38]. The evaluation criterion is to make the correlation between two nodes larger within the same subgraph and smaller between different subgraphs [24].

In general, the isolated point affects the effect of the algorithm significantly. Fortunately, in a CAD model, an isolated face does not exist, and each face is adjacent to one or more faces.

4.2. Construction of the Silhouette Coefficient

The Silhouette coefficient is a method for the interpretation and validation of consistency within clusters of data. That is to say, it is an evaluation method for the performance of clustering algorithms that provides a succinct graphical representation of how well each sample lies within its cluster [39].

The Silhouette coefficient measures how similar a sample is to its own cluster (cohesion) compared with other clusters (separation). It ranges from −1 to +1, where a high value indicates that the sample is well matched to its own cluster and poorly matched to neighboring clusters [39]. Within the specified number of iterations, the clustering configuration for which the Silhouette coefficient reaches its maximum value is considered appropriate. Conversely, a low Silhouette value indicates that the clustering configuration may have too many or too few clusters.

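The Silhouette coefficient of point $p_i$ is calculated as follows:

$$s(i) = \frac{b(i) - a(i)}{\max\{a(i), b(i)\}},$$

where $a(i)$, the average distance from $p_i$ to the other points in its own cluster, is a measure of how well $p_i$ is matched to its cluster (the smaller the value, the better the matching) and $b(i)$ is the lowest average distance of $p_i$ to all points in any other cluster of which $p_i$ is not a member [40].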

The global Silhouette coefficient is the average of the Silhouette coefficients of all points, which can be calculated as follows:

$$S = \frac{1}{n} \sum_{i=1}^{n} s(i),$$

where $n$ is the number of sample points.
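A short worked example may help to make the two definitions concrete. The four points and their cluster assignment below are hypothetical and chosen only so that the arithmetic is easy to follow.

```latex
% Hypothetical point set with two clusters
C_1 = \{p_1 = (0, 0),\; p_2 = (0, 1)\}, \qquad
C_2 = \{p_3 = (10, 0),\; p_4 = (10, 1)\}

% Cohesion of p_1: average distance to the other members of C_1
a(1) = d(p_1, p_2) = 1

% Separation of p_1: average distance to the members of the nearest other cluster
b(1) = \frac{d(p_1, p_3) + d(p_1, p_4)}{2}
     = \frac{10 + \sqrt{101}}{2} \approx 10.025

% Silhouette coefficient of p_1
s(1) = \frac{b(1) - a(1)}{\max\{a(1), b(1)\}}
     \approx \frac{9.025}{10.025} \approx 0.900

% By symmetry, all four points have the same value, so
S = \frac{1}{4} \sum_{i=1}^{4} s(i) \approx 0.900
```

A value this close to +1 indicates compact, well-separated clusters, which is the behavior expected from a good segmentation.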

4.3. The Primary Process of k-Means Based on the Silhouette Coefficient

The k-means algorithm is used to cluster the point set corresponding to the 3D CAD model to achieve model segmentation. The experimental results show that the optimal number of clusters is less than 10, so k takes values in [2, 10]. The specific steps are as follows:

Step 1: given the number of clusters k, initially set k = 2.

Step 2: initialize the cluster centroids randomly for the k clusters.

Step 3: calculate the Euclidean distance between each point $p_i = (x_i, y_i)$ and each cluster centroid $c_j = (x_{c_j}, y_{c_j})$ to classify the point; the distance is defined as

$$d(p_i, c_j) = \sqrt{(x_i - x_{c_j})^2 + (y_i - y_{c_j})^2}.$$

The nearest cluster centroid is determined by the minimum value of the Euclidean distance.

Step 4: according to the previous results, recalculate the mean value of all points in each cluster and take it as the new cluster centroid:

$$c_j^{(m+1)} = \frac{1}{|C_j^{(m)}|} \sum_{p_i \in C_j^{(m)}} p_i,$$

where m represents the number of iterations, $C_j^{(m)}$ is the set of points assigned to the j-th cluster after m iterations, and $c_j^{(m)}$ is the cluster centroid after m iterations.

Step 5: repeat steps 3 and 4 until the cluster centroid positions no longer change significantly, i.e., until the change falls below a threshold $\varepsilon$.

Step 6: calculate the Silhouette coefficient S of the k clusters based on equation (8).

Step 7: set k = k + 1; if k ≤ 10, proceed to steps 2–7; if k > 10, then stop.

Step 8: take the value of k corresponding to the maximum of the Silhouette coefficient S as the optimal cluster number, and repeat steps 2–5 with this value.

Step 9: output the clustering result.
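The steps above can be sketched in a few dozen lines. The following is a minimal illustration, not the authors' implementation: the function names, the `restarts` parameter, the fixed random seed, and the default threshold `eps` are all assumptions. In particular, several random restarts per value of k are added here to reduce the sensitivity to the random initialization, and the best labeling found is kept directly instead of rerunning steps 2–5 for the optimal k.

```python
import math
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def kmeans(points: Sequence[Point], k: int, rng: random.Random,
           eps: float = 1e-6, max_iter: int = 100) -> List[int]:
    """Steps 2-5: Lloyd iterations from randomly chosen initial centroids."""
    centroids = rng.sample(list(points), k)
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Step 3: assign every point to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Step 4: move each centroid to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append((sum(x for x, _ in members) / len(members),
                                sum(y for _, y in members) / len(members)))
            else:
                updated.append(centroids[c])  # an empty cluster keeps its centroid
        # Step 5: stop once no centroid has moved by more than eps.
        shift = max(math.dist(a, b) for a, b in zip(centroids, updated))
        centroids = updated
        if shift < eps:
            break
    return labels


def silhouette(points: Sequence[Point], labels: Sequence[int]) -> float:
    """Step 6: global Silhouette coefficient (mean of s(i) over all points)."""
    ids = sorted(set(labels))
    if len(ids) < 2:
        return -1.0  # undefined for a single cluster; treated as worst case
    total = 0.0
    for i, p in enumerate(points):
        dists = {c: [] for c in ids}
        for j, q in enumerate(points):
            if i != j:
                dists[labels[j]].append(math.dist(p, q))
        own = dists[labels[i]]
        if not own:
            continue  # singleton cluster: s(i) = 0 by convention
        a = sum(own) / len(own)
        b = min(sum(d) / len(d) for c, d in dists.items() if c != labels[i])
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def best_clustering(points: Sequence[Point], k_max: int = 10,
                    restarts: int = 10, seed: int = 0
                    ) -> Tuple[float, int, List[int]]:
    """Steps 1, 7, and 8: scan k = 2..k_max and keep the best-scoring labeling."""
    rng = random.Random(seed)
    best: Tuple[float, int, List[int]] = (-2.0, 0, [])
    for k in range(2, min(k_max, len(points) - 1) + 1):
        for _ in range(restarts):
            labels = kmeans(points, k, rng)
            score = silhouette(points, labels)
            if score > best[0]:
                best = (score, k, labels)
    return best


if __name__ == "__main__":
    # Hypothetical 2D points standing in for the transformed faces of a model.
    demo = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0), (5.0, 9.0), (5.0, 10.0)]
    score, k, labels = best_clustering(demo)
    print(f"k = {k}, S = {score:.3f}, labels = {labels}")
```

The upper bound on k is capped at one less than the number of points so that a Silhouette value can always be computed; for models with few faces, this keeps the scan from requesting more clusters than there are points.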

5. Implementation and Results

To verify the proposed algorithm in this study, we performed some experiments for the 3D model segmentation. Here, an example is described to illustrate the implementation of the proposed approach.

5.1. Implementation

Following the discussion above, the proposed algorithm can be formally described as follows:
(1) Input a set of models $M$
(2) Transform all the models in $M$ into a set of AAGs $G$
(3) Transform all the AAGs in $G$ into discrete point sets based on equations (5)–(7)
(4) Let k = 2, and iterate the following steps until k reaches its upper bound (the value of k is not fixed; instead, it depends on the number of model faces)
(5) Initialize the cluster centroids randomly for the k clusters
(6) Compute the centroid of each cluster to obtain a new set of clusters
(7) Calculate the Silhouette coefficient S of the k clusters, and obtain the value of k corresponding to the maximum of the Silhouette coefficient S
(8) Output the final clustering results in a cell array

Example 1. Figure 1 shows a solid model a. The Laplacian matrix $L$ of model a is given in the Appendix; all of its eigenvalues are real and positive, and each eigenvalue of $L$ is the vertical coordinate of the corresponding point. We now illustrate the determination of the horizontal coordinate based on the given AM1 and AAG1. The AAG is an undirected graph. The edge $(n_i, n_j)$ connects two adjacent nodes $n_i$ and $n_j$ and has no specific direction, i.e., $(n_i, n_j)$ and $(n_j, n_i)$ represent the same edge. To avoid duplicate computations, we only use the upper triangular part of AM1. Using node 2 and node 4 as an example, node 2 connects to node 4 and node 7 with concave edges; therefore, $a_{24}$ and $a_{27}$ are equal to −1 (in red). Node 4 connects to node 6 with a concave edge; therefore, $a_{46} = -1$ (in red).
According to equations (5)–(7), we transform model a into a point set, as shown in Figure 7(a), and subsequently apply the k-means algorithm introduced in Section 4.
Finally, the Silhouette coefficient automatically determines the number of clusters as k = 4 (shown in Figure 8). The horizontal axis is the number of clusters, and the vertical axis is the value of the Silhouette coefficient, which reaches its maximum at k = 4.
The final segmentation result is shown in Figure 7(b). With this method, model a is decomposed into several significant regions (see Figure 9), where each color represents one part of model a.

5.2. Experiments and Results

We performed experiments using library models to validate the approach proposed herein. These tests were run on a computer with an Intel 3.20 GHz CPU and 4.0 GB of RAM. The ACIS 3D Geometric Modeler and the Visual C++ 7.1 development environment were used.

The efficiency and results of the proposed algorithm are discussed below. The method is not only efficient but can also be used for the segmentation of general CAD models, overcoming the limitations on model types. With the proposed approach, the 3D CAD model is partitioned in a desirable way, and the segmentation results are more in line with human perception.

5.2.1. Run Time for Segmentation

Time estimation of our approach is visualized in Figure 10. From this curve, we can see that the proposed algorithm is efficient. The implementation on average takes less than a second.

The x-axis represents the number of model faces, and the y-axis represents the computational time. The time cost consists of two parts: the calculation of the point coordinates and the k-means clustering. The former takes a very short time and is negligible, while the latter plays a decisive role. That is to say, the complexity of the proposed algorithm strongly depends on the complexity of the k-means algorithm. Therefore, the complexity of the proposed algorithm is $O(nkm)$, where n is the number of model faces, k is the number of clusters, and m is the number of iterations. As a fast clustering method, the k-means algorithm performs quite well; for example, when the AAG has 587 nodes, the computation time is only 1.374 s.

5.2.2. Effectiveness for Segmentation

Table 2 compares our approach with the method in Ref. [31]. For each model, faces of the same color belong to the same region, and n represents the number of segments. The method in Ref. [31] tends to oversegment models. Model 1, for example, contains only 29 faces, yet it is divided into 8 regions, none of which has sufficient engineering meaning. Furthermore, for the four models, the segmentation of Ref. [31] cannot merge the through-hole features into any other region, which also increases the computational complexity of subsequent model comparisons and makes the segmentation difficult to apply to model retrieval. In contrast, all of these models are segmented correctly into meaningful components by our method. Compared with the method in Ref. [31], the segmentation result in this paper is more consistent with human perception, thus improving the efficiency of model retrieval.

To verify the segmentation approach, a set of various models was chosen. The segmentation results for typical mechanical parts obtained with the proposed method are shown in Figure 11. Rounds and fillets are ignored, since they do not affect the result. Furthermore, the results show that the proposed method is general and suitable for the segmentation of most conventional CAD models.

Another area that can benefit from the proposed segmentation method is model reuse. Meaningful segmented components can be obtained and saved into the model library for product design, and each segmented component can be conveniently reused by retrieving, copying, and modifying it.

6. Conclusions

In this paper, we develop a segmentation method that differs from the methods for 3D mesh model segmentation. Its innovation is to employ the PERT and spectral theory to transform each face of the b-rep model into a two-dimensional coordinate point. We consider both geometrical and topological information and treat model segmentation as a point-clustering problem, using the k-means approach with the Silhouette coefficient to conduct unsupervised learning on the coordinate points. Additionally, we mainly focus on the concavity of the boundary, since the division mainly occurs at concave edges, that is, where the weight between two nodes is −1. Therefore, this method is suitable for many CAD models.

The experimental results clearly show the following:
(1) According to the interval parameters and the transformation rule, the proposed approach can effectively and succinctly transform the b-rep model into a two-dimensional discrete coordinate point set.
(2) The k-means algorithm can efficiently cluster the points to achieve segmentation. The Silhouette coefficient helps ensure accuracy and uniqueness and determines the number of clusters automatically.
(3) In view of human cognition, the partitioning yields more reasonable boundary regions with meaningful engineering semantics.

Appendix

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The support from the National Natural Science Foundation of China (51775445) and the Natural Science Basic Research Plan in the Shaanxi Province of China (Program No. 2016JM5040) for this research is gratefully acknowledged.