Abstract
Fitting spline curves to data points is a very important issue in many applied fields. It is also challenging, because these curves typically depend on many continuous variables in a highly interrelated nonlinear way. In general, it is not possible to compute these parameters analytically, so the problem is formulated as a continuous nonlinear optimization problem, for which traditional optimization techniques usually fail. This paper presents a new bioinspired method to tackle this issue. In this method, optimization is performed through a combination of two techniques. Firstly, we apply the indirect approach to the knots, in which they are not initially the subject of optimization but precomputed with a coarse approximation scheme. Secondly, a powerful bioinspired metaheuristic technique, the firefly algorithm, is applied to optimization of data parameterization; then, the knot vector is refined by using De Boor’s method, thus yielding a better approximation to the optimal knot vector. This scheme converts the original nonlinear continuous optimization problem into a convex optimization problem, solved by singular value decomposition. Our method is applied to some illustrative realworld examples from the CAD/CAM field. Our experimental results show that the proposed scheme can solve the original continuous nonlinear optimization problem very efficiently.
1. Introduction
Fitting spline curves to data points is a problem that appears very frequently in many scientific and engineering fields. Typical examples span from regression analysis in statistics [1, 2] to contour reconstruction in medical imaging [3]. They also encompass the computation of outlines in image processing [4], shape manipulation in geometric modeling and processing [5–7], and data approximation methods in numerical analysis [8, 9], to mention just a few examples. In this paper, our main motivation comes from the fields of computeraided design and manufacturing (CAD/CAM) where spline curves are intensively used in many problems [10–14]. One of them is to fit data points obtained from metrology in CAD/CAM, a field consisting of the application of measurement technology to the quality control assessment of designed or manufactured products in many manufacturing industries (automotive, aerospace, ship building, shoes, etc.).
In spite of its wide range of applications, the use of spline curves is still challenging because they typically depend on many different continuous variables (data parameters, knots, and spline coefficients) in a highly nonlinear way [15–20]. These sets of variables are also interrelated, meaning that changes in the values of a particular set of parameters affect the behavior of the others, and hence they cannot be manipulated independently [21–24]. For instance, the choice of knots depends on the curve parameterization, which in turn depends on the underlying structure of data points. Similarly, the computation of the spline coefficients depends on both the parameterization and the knots, and so on. From a mathematical standpoint, this implies that the fitting problem cannot be partitioned into independent subproblems for the different sets of variables. As a consequence, it is not possible in general to compute all these parameters analytically [19, 25]. Instead, the typical formulation in the field is to treat this problem as a continuous nonlinear optimization problem [26, 27]. The bottom point is that traditional optimization techniques have also failed to provide satisfactory answer to this optimization problem. Among the alternatives suggested to solve this limitation, those based on artificial intelligence techniques captured the interest of the scientific community some years ago. The main line of research focused on the neural networks [28–30] and their extension, the functional networks [31–34]. However, the solutions reported were partial and applicable only to some particular problems. Consequently, there is a need for more efficient approaches to tackle this issue.
During the last few years, scientists and engineers have turned their attention to bioinspired computation, a field where the interplay between nature and computers has allowed us to model the living phenomena by using mathematics and computer science [35–37]. Simultaneously, the study of life has led to improved schemes to solve many problems in mathematics and computer science, including optimization problems [38–40]. Due to their good behavior for complex optimization problems involving ambiguous and noisy data, there has recently been an increasing interest in applying bioinspired optimization techniques to the spline fitting problem. However, there are still few works reported in the literature. Recent schemes in this area are described for particle swarm optimization [41–43], genetic algorithms [27, 44, 45], artificial immune systems [46, 47], estimation of distribution algorithms [48], and hybrid approaches [49–51].
Specially remarkable is the fact that some bioinspired methods have proved to be able to solve difficult optimization problems unsolvable with traditional optimization techniques. Being this our case, we turned our attention to a powerful bioinspired metaheuristic called firefly algorithm, recently introduced by Professor XinShe Yang to solve difficult continuous optimization problems. The firelfy algorithm is inspired in the flashing behavior of the fireflies and their social interaction in the natural environment (see Section 3 for details).
In this paper, we present a new bioinspired scheme for computing all parameters of a spline curve approximating a given set of data points. Our proposal is based on two fundamental techniques: the indirect approach and the firefly algorithm, which are combined in our method to perform the optimization of the knots and the data parameters, respectively. The indirect approach tries to overcome the fact that computing the knots requires a previous parameterization which, at its turn, requires a previous knot vector, leading in practice to a neverending vicious circle. In the indirect approach, the knots are not initially the subject of optimization but precomputed with a coarse approximation scheme, which will be further improved at a later stage. This precomputed knot vector plays the role of an initial seed for the data parameterization step. An obvious risk of this indirect approach is that the whole method relies on this optimization stage. In this way, data parameterization becomes the most critical step, since it carries out the most significant part of the optimization effort. Consequently, we need a powerful, reliable optimization method for this task. As it will be shown later on, the firefly algorithm is a good choice for this step. It is applied in the second step to perform optimization on data parameterization; then, the knot vector is refined by using De Boor’s method, thus yielding a better approximation of the optimal knot vector. These two combined methods convert the original nonlinear continuous optimization problem into a convex optimization problem, which is solved by applying singular value decomposition. This scheme is applied to some illustrative realworld examples from the CAD/CAM field, including the side profile curve of a car body, the outline curves of a paint spray gun, and a 3D CAD/CAM workpiece from the automotive industry. Our experimental results show that the proposed scheme can solve the original continuous nonlinear optimization problem very efficiently.
The structure of this paper is as follows. Firstly, some basic concepts about parametric spline curves are given in Section 2. Then, Section 3 describes the firefly algorithm, the bioinspired metaheuristic used in this paper. The core of the paper is in Section 4, where our proposed method for spline curve fitting is reported in detail. Section 5 describes the experimental results of the application of our method to three illustrative realworld problems from the CAD/CAM field. The paper closes with the main conclusions and our plans for future work.
2. Parametric Spline Curves
In this section we describe the basic concepts needed in this paper about the parametric spline functions. The interested reader is referred to [6, 7, 52, 53] for a more detailed discussion about this subject. Note that in this paper vectors are denoted in bold.
Let be a parametric function defined on a finite interval . Consider now a strictly increasing sequence of real numbers called knots. The function is a parametric polynomial spline of degree with knots if the following two conditions are fulfilled for and : (1) is a polynomial spline of degree up to on each interval , (2) and its derivatives up to order are continuous on .
Different basis functions can be used for polynomial splines. In this paper, we consider the Bspline basis functions of degree defined on according to the CoxdeBoor recursive formula [52]: where , , and is the unit function with support on the interval . The dimension of the vector space of functions satisfying conditions (1) and (2) is . The given knot vector yields linearly independent basis functions of degree . The remaining basis functions are obtained by introducing the boundary knots and . With this choice of boundary knots all basis functions vanish outside the interval domain . Every parametric spline curve is represented by where are the spline coefficients of the curve and are the basis functions defined above. The th derivative of is a spline of degree given by with for and .
3. The Firefly Algorithm
The firefly algorithm (FFA) is a bioinspired metaheuristic algorithm introduced in 2008 by Yang to solve optimization problems [54–58]. The algorithm is based on the flashing behavior of the fireflies and their social interaction in the natural environment. The key ingredients of the method are the variation of light intensity and formulation of attractiveness. In general, the attractiveness of an individual is assumed to be proportional to their brightness, which in turn is associated with the encoded objective function. The reader is kindly referred to [40] for a comprehensive review of the firefly algorithm and other natureinspired metaheuristic approaches. See also [59] for a gentle introduction to metaheuristic applications in engineering optimization.
In the firefly algorithm, there are three particular idealized rules, which are based on some of the major flashing characteristics of real fireflies [54] as follows(1)All fireflies are unisex, so that one firefly will be attracted to other fireflies regardless of their sex. (2)The degree of attractiveness of a firefly is proportional to its brightness, which decreases as the distance from the other firefly increases due to the fact that the air absorbs light. For any two flashing fireflies, the less bright one will move towards the brighter one. If there is not a brighter or more attractive firefly than a particular one, it will then move randomly. (3)The brightness or light intensity of a firefly is determined by the value of the objective function of a given problem. For instance, for maximization problems, the light intensity can simply be proportional to the value of the objective function.
The distance between any two fireflies and , at positions and , respectively, can be defined as a Cartesian or Euclidean distance as follows: where is the th component of the spatial coordinate of the th firefly and is the number of dimensions. In the firefly algorithm, as attractiveness function of a firefly one should select any monotonically decreasing function of the distance to the chosen firefly, for example, the exponential function: where is the distance defined as in (4), is the initial attractiveness at , and is an absorption coefficient at the source which controls the decrease of the light intensity.
The movement of a firefly which is attracted by a more attractive (i.e., brighter) firefly is governed by the following evolution equation: where the superscripts between brackets in this expression are used to denote the corresponding generations. The first term on the righthand side is the current position of the firefly at generation , the second term is used for considering the attractiveness of the firefly to light intensity seen by adjacent fireflies, and the third term is used for the random movement of a firefly in case there are not any brighter ones. The coefficient is a randomization parameter determined by the problem of interest, while is a random number generator uniformly distributed in the space .
The method described in previous paragraphs corresponds to the original version of the firefly algorithm, as originally developed by its inventor. Since then, many different modifications and improvements on the original version have been developed, including the discrete FFA, multiobjective FFA, chaotic FFA, parallel FFA, elitist FFA, Lagrangian FFA, and many others, including its hybridization with other techniques. The interested reader is referred to the nice paper in [60] for a comprehensive, updated review and taxonomic classification of the firefly algorithms and all its variants and applications.
4. The Method
In this section our FFAbased method is fully explained. The section begins with the description of the optimization problem to be solved. Then, a general overview of the method and its flowchart are given. Then, each step of the method is discussed in detail. Finally, some details regarding the implementation issues are also given.
4.1. The Optimization Problem
Let us suppose that we are provided with a set of measured data points obtained by laser scanning, layout machine, or other digitizing methods, as it typically happens in many scientific and engineering problems. The goal consists of obtaining a parametric spline curve of degree defined as above approximating the . Due to the conditions on the boundary knots, we can take and and perform approximation on the remaining parameters; that is, Equation (7) can be written in matrix notation as where , , is the matrix of sampled spline basis functions, and represents the transpose of a vector or matrix. The dimension of the search space in (8) is given by , which could be of several thousands of variables for nontrivial shapes. Since the system (8) is overdetermined, the matrix of basis functions is not invertible and no direct solution can be obtained. Therefore, we consider the leastsquares approximation of (7), defined as the minimization problem given by where are scalar weights and represents the Euclidean norm (although any other norm might be used instead). Note that the parameters and knots are related by nonlinear basis functions, thus leading to a highdimensional continuous nonlinear optimization problem. Assuming that a suitable data parameterization can be obtained, we have to solve a nonlinear continuous optimization problem involving both the spline coefficients and the knots as free variables of the problem. Unfortunately, this approach makes the optimization problem nonconvex, because is a nonconvex function of the knots [19, 26, 52]. To overcome this problem, we follow the socalled indirect approach, in which the knots are precomputed before the optimization process is executed and then refined for better fitting. With this strategy, the resulting problem is convex, so a global optimum can eventually be found. In order to apply the previous strategy, we need to obtain a suitable parameterization of data points, which thus becomes the most critical step of this approach. We solve this parameterization problem by applying the firefly algorithm, as it will be explained in next paragraphs.
4.2. Overview of the Method
The main steps of our method are summarized in Figure 1, showing the flowchart of our approach. Our initial input is given by the set of data points and two parameters that are freely chosen by the user: the length of knot vector (determined by variable ), and the curve degree (typical values for are between and , although any natural value can be used in our method). The first step of our approach consists of computing an initial knot vector. Then, we apply the firefly algorithm to perform data parameterization (Step 2). A new (refined) knot vector is computed based on the parameterization obtained in the previous step. Then, the convex optimization problem is solved by using singular value decomposition (Step 4). After this step, the best fitting spline curve to the data points for the given degree is finally obtained.
4.3. Main Steps of the Method
The proposed method consists of four main steps, analyzed in the next paragraphs.
Step 1 (knot vector initialization). The first step of the method computes an initial knot vector, which is required in order to evaluate the fitness functions during the optimization step for data parameterization. To this aim, we consider a clamped knot vector (this condition is a result of our choice of boundary knots) whereas the internal knots are uniformly distributed on the interval ; that is, , for . This choice of knots is not optimal because it does not reflect the distribution of data points. This limitation will be overcome in Step 3, where this initial knot vector will be further refined.
Step 2 (data parameterization). In this step, the data points parameterization is carried out. As mentioned above, this is the most critical step of our method; the set of data parameters along with the knot vector, computed in the previous step and refined in the next one, allows us to convert the original nonlinear nonconvex optimization problem (9) into a convex optimization problem. This task is accomplished by applying the firefly algorithm described in Section 3. To this purpose, a collection of particles (fireflies) is considered. Each firefly corresponds to a vector of real numbers on the interval . The components of each firefly vector are always sorted in an increasing order to reflect the ordered structure of the data points. The fireflies are initialized by using a random function of uniform distribution on the interval domain . We have used a collection of fireflies for the examples reported in this paper. We also checked our results with larger populations by changing this parameter from to fireflies with stepsize and do not notice significant variations in our results. However, a larger value could be required for very massive sets of data points (≥ data points) exhibiting very complicated shapes.
Once the initial population of fireflies is generated, some parameters of the firefly algorithm have to be determined in order to apply them to our problem. Our choice of the parameters for the FFA is mostly empirical: we initially rely on standard values reported in the existing literature and then perform computer experiments to validate our parameter tuning. In this paper, we consider the following set of parameter values for the FFA method: , , and . This set of parameter values has already been used in a previous paper by the authors for a Bézier surface parameterization problem with good results [61]. Our computer experiments also confirmed their good performance for the examples discussed in this paper and some others not reported here to keep the paper at manageable size. On the contrary, the number of iterations and the value of parameter required some improvement. We initially used a fixed number of iterations in our experiments as the stopping criterion, but this choice was found to be very inefficient for this problem. The main reasons are that the method can potentially be applied to sets of very different number of data points, meaning that the dimension of the search space can vary dramatically from one example to another, and that the initial knot vector used in our method is not optimal yet, making it difficult to determine in advance how many iterations are needed to achieve convergence. We then turned to a different termination condition, where the number of iterations is determined manually for each specific example, based on the observation of the convergence diagrams (as those shown in Figures 3 and 5). Although it is a tedious and timeconsuming task, we found it to be more reliable in order to ensure convergence is properly achieved. Regarding the value for , our initial choice soon revealed to be too drastic, as the fireflies went out of range in just a few iterations (sometimes, even a single one was enough). We decreased its value gradually and carried out a lot of computer simulations. As a result, the best value was found at .
The last required component of the FFA is the fitness function. It corresponds to the evaluation of the leastsquares function given by the operator to be minimized in (9); that is,
where the weights are scalar numbers to express the degree of confidence of data points (larger weights are assigned to more reliable data). In this paper, we assume a constant confidence value for all data, so we take , . After the selection of those parameters and the fitness function, the firefly algorithm is performed iteratively until the termination criterion is reached. To remove the stochastic effects of single executions, 20 independent executions have been carried out for each simulation trial. Then, the firefly with the best (i.e., minimum) fitness value is selected as the best solution to the problem. As a result, the best parameterization vector of data points is obtained.
Step 3 (knot vector refinement). Based on the parameterization obtained in the previous step, the knot vector is subsequently refined for better performance. To this purpose, the placement of knots should reflect the distribution of data parameters . In this paper, the internal knots are computed by following a procedure firstly proposed by De Boor in [52]. The corresponding algorithm is described in Algorithm 1. It has been proved that this algorithm guarantees that every knot span contains at least one . At its turn, this condition ensures that the matrix is positive definite and well conditioned. Furthermore, has a semibandwidth less than (see [52] for details). All these properties imply that the system of equations related to the convex problem can be solved by using efficient and reliable numerical methods [62], as explained in the next paragraphs.

Step 4 (spline coefficient determination). As we mentioned above, in this paper, we follow an indirect approach to compute the fitting spline curve to the data points. In this scheme, data parameters and knots are computed at earlier stages of the optimization process so that the approximation problem (9) becomes a convex problem. In that case, premultiplying by at both sides of (8), we get where . Note that is a symmetric square matrix and positive semidefinite, so system (11) always has a solution. It can be solved numerically by Gaussian elimination, LU decomposition, or the singular value decomposition (SVD) (see [62] for details). In this paper, SVD has been used since it provides the best numerical answer in the sense of least squares for those cases in which the exact solution is not possible. To this aim, is decomposed as the matrix product , where is a column orthogonal matrix, is a diagonal matrix with positive or zero elements called the singular values, and is a square orthogonal matrix. Furthermore, its inverse can readily be obtained as: . In addition, the knot vector obtained by the procedure described in Step 3 guarantees that is positive, definite, and, therefore, nonsingular, so the problem can be solved by using this inverse matrix .
4.4. Implementation Issues
Regarding the implementation, all computations in this paper have been performed on a 2.6 GHz Intel Core i7 processor with 8 GB of RAM. The source code has been implemented by the authors in the native programming language of the popular scientific program Matlab, version 2012a. In our opinion, Matlab is a very suitable tool for this task: it is fast and provides reliable, welltested routines for efficient matrix manipulations. It also contains a bulk of resources regarding the solving of systems of equations. For instance, Matlab provides us with the command mldivide to solve the equation for both squared and nonsquared systems (by using Gaussian elimination with partial pivoting and leastsquares techniques, resp.). Depending on the general structure of matrix , this command applies specialized LAPACK and BLAS routines to get the best possible solution to this system. Also, Matlab provides us with a specialized command svd for computing the SVD of a matrix. This command carries out the matrix SVD decomposition automatically so it can be used in a rather black boxlike way. Besides, Matlab provides excellent graphical options and optimized code for input/output interaction and high performance computations.
5. Experimental Results
Our method has been applied to several realworld examples of CAD/CAM shapes for the automotive industry. In this section we discuss three of them. The reported examples reflect the variety of cases our method can be applied to: they include both open and closed curves, as well as 2D and 3D shapes comprised of a single curve and multiple curves. We think we have provided enough examples to convince the reader of the broad applicability of our approach.
5.1. Side Profile Curve of a Car Body
The first example consists of the data fitting of a set of data points from the In (axis) side profile section of a model of a notchback threebox sedan car body. The data points were obtained by a layout machine from the Spanish car maker provider Candemat years ago. This example has been primarily chosen because it includes areas of varying slope, ranging from the strong slope at both ends (upwards on the left, downwards on the right) to the soft slope in middle part, with a horizontal area between pillars A and C and in the cargo box area and a soft upward area in the car hood. In addition, it is a good example of a truly parametric curve that cannot be faithfully represented by simpler functions such as explicit functions and the like. Consequently, it is a very good candidate to check the performance of our approach.
The problem is also challenging because we are trying to represent the whole shape with a single curve. It is worthwhile mentioning here that the use of a single curve for a whole object is not common at all in industrial environments, where the shapes of final products such as a car body are usually represented by a very large set (several hundreds of thousands, even millions) of simpler curves. Therefore, even though we are using data from a realworld shape, this example must be understood as a purely academic example rather than a genuine realworld problem in the automotive industry. Yet, this example is very useful in this paper to analyze the performance of our approach.
Figure 2 shows our simulation results for a spline curve of degree . Top figure shows the original data points, represented by red cross symbols along with the reconstructed data points, displayed as empty circles in blue. The picture is intended to show the correspondence between the original and the reconstructed data points in a graphical way. As the reader can see, our method yields a very good matching between both sets of points. This observation is confirmed by Figure 2 (bottom), representing the same original data points along with the approximating spline curve, displayed as a solid line in blue. Figure 3 shows the evolution of the fitting error of the operator for iterations. Since the diagram shows a similar general behavior for the 20 independent executions and larger number of iterations no longer improves the fitting error, we conclude that this number of iterations is enough for convergence. The corresponding fitting error in this example is . This value gives only partial information because it does not consider the number of sampled points. This means that increasing the number of data points leads automatically to larger fitting errors even though each data point might be better fitted. To overcome this drawback, we also compute the rootmean square error (RMSE), which gives a better measure of the quality of the approximation. In this example we obtain an RMSE fitting error of . This value confirms the good performance of the proposed method for this problem. A close inspection of Figure 2 reveals, however, that some parts can still be further improved: this fact is visually noticeable in Figure 2, specially at the corners of the front and rear fenders and their linkings to the lower car body line, as well as at the lower parts of front and rear bumpers. This effect is attributable to our indirect approach, in which we do not compute the optimal knot vector but an approximation. We are currently working towards an improved approach to solve this limitation.
5.2. Outline Curves of a Paint Spray Gun
Figure 4 shows the results of our method for spline curves of degree when applied to a paint spray gun model. Left and right pictures of the figure have the same meaning as the top and bottom pictures of the previous figure, respectively. The spray gun model consists of two different curves for the outer and inner boundary lines with and data points, respectively. We applied our method to each curve independently. The fitting errors of the operator for the outer and inner curves are and , respectively. The RMSE fitting errors are and , respectively. These values are reached for iterations, well within the convergence area as shown in Figure 5, which displays the evolution of the fitting error evolution of the operator for iterations. Fitting errors for the outer and the inner curve in that figure are displayed in blue and red, respectively. Once again, the numerical errors and the visual appearance confirm the good performance of the method in these two cases as well. This example also shows that our approach has a great flexibility, being able to deal with both open and closed curves such as the outer and inner curves of this model, respectively. Note also that even though the data points of the outer curve exhibit many changes of concavity, the method can approximate them very accurately with a single spline curve.
5.3. 3D CAD/CAM Workpiece
Figure 6 illustrates the application of our method to a complex CAD/CAM workpiece of a car body. This example aims at showing the ability of our method to perform well in a real industrial problem involving several geometric shapes. The figure shows two different views of a 3D automotive part comprised of curves stored in an industrial IGES file obtained from Candemat. As the reader can see, the shape consists of curves with very different topologies, ranging from simple regular shapes such as straight lines and conics to complicated irregular shapes. We therefore applied two different strategies for this example: simple regular shapes are reconstructed by using basic primitives (lines, circles, etc.) for which only some parameters have to be computed (e.g., using two data points for straight lines and three nonaligned data points for the center and radius of a circle), while the complicated shapes are reconstructed with our approach. A total of 243 curves have been reconstructed with our method in Figure 6 by using quadratic and cubic spline curves. The maximum, minimum, and average values of the RMSE fitting error are , , and , respectively. These error values confirm the good performance of our approach for a 3D realworld automotive part comprised of several (both open and closed) spline curves of different degrees.
6. Conclusions and Future Work
In this paper we introduce a new bioinspired method for computing a spline curve that approximates a given set of data points. This task involves many different variables which are interrelated with each other in a nonlinear way, leading to a continuous nonlinear optimization problem. Our approach solves this problem by combining two different procedures for the knots and the data parameters. For the former, an indirect approach that precomputes the knots instead of optimizing them is applied. This strategy leaves the most significant part of the optimization effort to the data parameterization. In our scheme, this task is addressed by a powerful bioinspired metaheuristic technique well suited for difficult continuous optimization problems, the firefly algorithm. Then, the knot vector is refined by using De Boor’s method, thus yielding a better approximation to the optimal knot vector. The combination of the indirect approach and the firefly algorithm converts the original nonlinear continuous optimization problem into a convex optimization problem, solved by SVD. The proposed method has been applied to some illustrative realworld examples from the CAD/CAM field involving 2D and 3D open and closed curves. Our experimental results show that the proposed scheme can solve the original continuous nonlinear optimization problem very efficiently.
Regarding the future work, this method can be improved in several ways. As mentioned in Section 5, the indirect approach used in this paper implies that the knot vector is generally very good but not optimal, opening the door for further improvement. We think that a direct approach could lead to better results for the knot vector and, hence, for the overall method. Another interesting field of research is the use of some powerful modifications of the firefly algorithm which have been reported to return better results [63] or extend the capabilities of the standard version for continuous multiobjective optimization [64]. Also, the extension of this approach to other recently described bioinspired methods, such as the cuckoo search [65, 66] or the bat algorithm [67–69], might lead to further improvement of our results. They are all part of our plans for future work in the field.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper. Any commercial identity mentioned in this paper is cited solely for scientific purposes.
Acknowledgments
This research has been kindly supported by the Computer Science National Program of the Spanish Ministry of Economy and Competitiveness, Project Reference no. TIN201230768, Toho University (Funabashi, Japan), and the University of Cantabria (Santander, Spain). The authors are particularly grateful to the Department of Information Science of Toho University for all the facilities given to carry out this work.