Abstract

In this paper, a simple but effective algorithm is proposed to solve the problem of curve identification in the presence of curve intersections. The problem is motivated by the fact that, in several engineering problems, one has to deal with samples of possibly intersecting curves, and a successful identification can significantly improve their visualization through a subsequent function interpolation step. Numerical results are presented, together with possible areas of application of the algorithm.

1. Introduction

In several problems in engineering, one has to deal with samples of possibly intersecting curves. As an example, when studying dispersive wave-propagation properties of phononic [1] and photonic [2] periodic structures, one simultaneously obtains samples of various curves by evaluating the eigenvalues of a suitable Hermitian matrix (or, more often, the generalized eigenvalues [3] of a pair of suitable Hermitian matrices) depending continuously on a scalar parameter, for various discrete choices of such a parameter. In this case, curve intersections can arise in the presence of eigenvalue degeneracy, i.e., for those values of the parameter for which at least one eigenvalue has multiplicity larger than one as a root of the characteristic polynomial. As a second example, curves may represent the positions of point objects (e.g., of markers detected by cameras in a motion-capture system [4]) as functions of time. In this situation, even assuming that no two objects can occupy the same position at the same time, intersections may still arise either when the measurements are noisy or when only two components of the three-dimensional position vectors (e.g., those lying in either a horizontal or a vertical plane) are measured.

Attributing the curve samples to the correct curves (where the “correct” choice may be defined, e.g., as the one making the resulting reconstructed curves as smooth as possible) is not only of interest per se, but also represents an important preliminary step for a subsequent curve interpolation procedure, which can significantly improve their visualization when the initial number of samples is small. The latter situation may be due, e.g., to a possibly high cost of generating each sample, or simply to a technical limit in the acquisition process (e.g., a low frame rate). For instance, in the wave-propagation example above, it often happens that the Hermitian matrix has a large dimension, which makes the computational time required to solve the eigenvalue problem large for each choice of the parameter, especially if high precision in the eigenvalue computations is needed. This is the typical case of eigenvalue problems of interest in engineering applications, as they are often ill-conditioned [5].

In this framework, in this paper we propose and test a simple but effective algorithm for the problem of curve identification in the possible presence of curve intersections. The algorithm requires only, as a very reasonable assumption, that the curves do not intersect for at least one value of the independent variable (for simplicity and without loss of generality, the first one, in the description of the algorithm provided in the next section). Related works are [6, 7] (an extended version of which is [8]). However, [7] presents a multitarget tracking algorithm which requires, among other assumptions, the availability of a model of the “law of motion” of each curve, whereas [6] presents an approach more similar to the one adopted in the present work (it uses the current possible curve orientations to estimate the next curve points, emulating human perception of intersecting curves), but it does not process all the curves simultaneously. Instead, it relies on a preliminary rough curve detection based on an algorithm proposed in [9], followed by a correction of the estimated curve at successive junctions, based on the current possible curve orientations. In more detail:
(i) the correction of the preliminarily estimated curve is applied only when such a curve exhibits large slope changes (i.e., large numerical approximations of second derivatives) on a discretization of the horizontal axis;
(ii) the corrected estimated curve is constrained to have small slope changes on the same discretization.

These two features may lead to an incorrect curve identification, respectively, when:
(i) two intersecting curves are erroneously identified by the algorithm proposed in [9]; in this case, when their second derivatives are below a given threshold, no correction is applied;
(ii) the curves to be identified have large second derivatives in some parts of the domain.

In contrast, the algorithm proposed in this work:
(i) always uses (apart from its first two initialization steps; see Section 2 for their description) the current possible curve orientations to estimate the next curve points, without needing a preliminary rough curve identification or thresholds on the second derivatives of preliminarily estimated curves;
(ii) does not require any bound on the slope changes of each curve, which in principle allows identifying even curves with large slope changes.

The paper is organized as follows. In Section 2, the proposed algorithm is presented. Section 3 provides numerical results related to the application of the algorithm to four different tests. Section 4 applies the algorithm to the problem of identifying smooth eigenvalue curves. Finally, Section 5 concludes the paper and discusses possible extensions.

2. Description of the Algorithm

To describe the proposed algorithm, the following notation is introduced. Let N, M, and d be positive integers and, for i = 1, ..., N, let f_i : [a, b] → R^d be smooth functions (i.e., smooth curves parameterized by the independent variable x). The functions are uniformly sampled, for a total of M samples each, located at the abscissas x_1 < x_2 < ... < x_M. Moreover, it is assumed that the samples f_1(x_1), ..., f_N(x_1) at the first abscissa are distinct. For j = 1, ..., M, the set of samples evaluated at x_j is denoted by S_j. The idea behind the algorithm is the following. For each sampled value x_j of the independent variable, the algorithm attributes to the i-th reconstructed curve a sample p_{i,j} in S_j and (for j >= 2) a feasible increment Δ_{i,j}. The two are combined (apart from the cases j = 1 and j = M) to generate, still for the i-th reconstructed curve, a prediction of the point obtained for the next sampled value x_{j+1} of the independent variable. The N predictions are compared with the set S_{j+1} of actual samples, so as to attribute each of them, in a one-to-one way, to one of the reconstructed curves, generating the points p_{i,j+1} and, except when j + 1 = M, also the next feasible increments Δ_{i,j+1}. To conclude, the i-th reconstructed (sampled) curve is the collection {p_{i,1}, ..., p_{i,M}}. In more detail, the algorithm works as follows.

Step 1. Each element of S_1 is attributed to one of the N reconstructed curves. For i = 1, ..., N, let p_{i,1} be the sample attributed to the i-th reconstructed curve.

Step 2. The elements of S_2 are attributed (matched) to the elements of S_1 (hence, due to Step 1, also to the corresponding reconstructed curves) in the following way (possible ties in the next description are resolved by coin flipping). A copy S_1' of S_1 and a copy S_2' of S_2 are constructed. Then, one denotes by q the element of S_2' nearest (in terms of the Euclidean distance) to the set S_1'. Such an element is attributed to the element s of S_1' that minimizes its own distance from q. Then, s and q are removed from S_1' and S_2', respectively, and the procedure is repeated until both sets become empty. Finally, for i = 1, ..., N, let p_{i,2} be the sample attributed to the i-th reconstructed curve. Figure 1 shows an example of the sets S_1 and S_2 and of their matching obtained using Step 2 of the proposed algorithm.
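A minimal MATLAB sketch of this greedy matching step is reported below. The function name, the data layout (the elements of each set are stored as the rows of a matrix), and the deterministic tie-breaking are illustrative choices, not part of the algorithm's specification.

```matlab
function perm = greedy_match(P, Q)
% GREEDY_MATCH  Greedy one-to-one matching between the rows of P and Q.
% P: N-by-d matrix of reference points (for Step 2, the points attributed at
% the previous abscissa; for Step 3, the estimates of the next points).
% Q: N-by-d matrix whose rows are the new samples to be attributed.
% perm(i) is the row index of Q attributed to the i-th row of P.
% Illustrative sketch; ties are broken deterministically (first index found)
% rather than by coin flipping.
    N = size(P, 1);
    perm = zeros(N, 1);
    freeP = true(N, 1);                      % rows of P not yet matched
    freeQ = true(N, 1);                      % rows of Q not yet matched
    D = zeros(N, N);                         % D(q, p) = distance between Q(q,:) and P(p,:)
    for q = 1:N
        for p = 1:N
            D(q, p) = norm(Q(q, :) - P(p, :));
        end
    end
    for it = 1:N
        Dcur = D;
        Dcur(~freeQ, :) = Inf;               % exclude already matched samples
        Dcur(:, ~freeP) = Inf;               % exclude already matched references
        [minPerQ, bestP] = min(Dcur, [], 2); % nearest free reference of each sample
        [~, q] = min(minPerQ);               % sample of Q closest to the set of free references
        p = bestP(q);                        % reference realizing that minimum distance
        perm(p) = q;
        freeP(p) = false;
        freeQ(q) = false;
    end
end
```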

Step 3. This step is repeated for j = 2, ..., M - 1. For each such j, the elements of S_{j+1} are attributed (matched) to the elements of S_j (hence, due to either Step 2 or the previous iteration of Step 3, also to the corresponding reconstructed curves) in the following way (again, possible ties in the next description are resolved by coin flipping). For each i = 1, ..., N, the feasible increment Δ_{i,j} = p_{i,j} - p_{i,j-1} is constructed, together with the estimate e_{i,j+1} = p_{i,j} + Δ_{i,j} of the next point on the i-th reconstructed curve. The set E_{j+1} of such estimates is defined. Copies S_j', S_{j+1}', and E_{j+1}' of S_j, S_{j+1}, and E_{j+1}, respectively, are constructed. Then, one denotes by q the element of S_{j+1}' nearest (in terms of the Euclidean distance) to the set E_{j+1}'. Such an element is attributed to the element s of S_j' whose associated estimate e minimizes its own distance from q. Then, s, q, and e are removed from S_j', S_{j+1}', and E_{j+1}', respectively, and the procedure is repeated until all three sets become empty. Finally, for i = 1, ..., N, let p_{i,j+1} be the sample attributed to the i-th reconstructed curve. Figure 2 shows an example of the sets S_j and S_{j+1} and of their matching obtained using Step 3 of the proposed algorithm, for a given value of j.
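The three steps can be combined as in the following sketch, which reuses the greedy_match helper above (all names, as well as the cell-array data layout, are illustrative assumptions).

```matlab
function curves = identify_curves(samples)
% IDENTIFY_CURVES  Illustrative sketch of Steps 1-3 of the proposed algorithm.
% samples: M-by-1 cell array; samples{j} is an N-by-d matrix whose rows are
% the elements of S_j. curves: N-by-M-by-d array; curves(i, j, :) is the
% sample p_{i,j} attributed to the i-th reconstructed curve.
    M = numel(samples);
    [N, d] = size(samples{1});
    curves = zeros(N, M, d);

    % Step 1: the samples in S_1 fix the labelling of the reconstructed curves.
    prev = samples{1};
    curves(:, 1, :) = reshape(prev, [N, 1, d]);

    % Step 2: match S_2 to the points attributed at the first abscissa.
    perm = greedy_match(prev, samples{2});
    curr = samples{2}(perm, :);
    curves(:, 2, :) = reshape(curr, [N, 1, d]);

    % Step 3: for j = 2, ..., M - 1, predict the next point of each curve by
    % adding its current feasible increment, then match the predictions to S_{j+1}.
    for j = 2:M-1
        incr = curr - prev;                        % feasible increments
        pred = curr + incr;                        % estimates of the next points
        perm = greedy_match(pred, samples{j + 1});
        prev = curr;
        curr = samples{j + 1}(perm, :);
        curves(:, j + 1, :) = reshape(curr, [N, 1, d]);
    end
end
```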

The reason why the algorithm is able to generate smooth reconstructed curves is that it penalizes large slope changes when attributing points to the reconstructed curves (no slopes are used in Step 2, since they are not yet available at that stage). As an example, in the case of two curves, Figure 3 compares, for a possible situation, (a) the portions of the two reconstructed curves provided by the proposed algorithm up to a given stage and (b) the ones obtained by swapping the two labels associated with that stage. The figure clearly shows that a much larger smoothness is obtained in the first case.

Remark 1. If the curves do not intersect at x_1 and the sampling is sufficiently fine, then Steps 1 and 2 provide a correct identification of the curves at x_1 and x_2, and also of their initial slopes. In this way, at its first iteration (i.e., for j = 2), Step 3 receives correct estimates of the feasible increments Δ_{i,2}, for i = 1, ..., N.

Remark 2. Step 2 of the algorithm could be replaced by the optimal solution to an instance of the minimum cost Euclidean bipartite matching problem [10], since such a solution assigns, in a one-to-one way, each point in S_1 to a point in S_2 so as to minimize the sum of the Euclidean distances of the pairs of matched points. However, no difference between such an attribution and the one produced by Step 2 is obtained if, for each point in S_1, there is a point in S_2 (a different one for each point of S_1) whose distance from that point is significantly smaller than the distance from it of any other point in S_2 (this situation occurs, e.g., in the case reported in Figure 1). Similarly, also Step 3 of the algorithm could be replaced, for j = 2, ..., M - 1, by the optimal solution to an instance of the minimum cost Euclidean bipartite matching problem, substituting S_1 and S_2 with E_{j+1} and S_{j+1}, respectively. However, also in this case, no difference in the attribution is obtained under a condition analogous to the one reported above for Step 2 (again, this situation occurs, e.g., in the case reported in Figure 2). The advantage of Steps 2 and 3 is that they are less computationally demanding than solving the corresponding instances of the minimum cost Euclidean bipartite matching problem with the algorithm proposed in [10]. It is worth recalling that the minimum cost Euclidean bipartite matching problem can be formulated as an integer linear programming problem, but also as the associated relaxed continuous linear programming problem, whose feasible domain has the form {x : Ax = b, x >= 0}, where A is a suitable constraint matrix and b is a suitable vector; indeed, since A is totally unimodular and b has integer components, the vertices of such a domain have integer coordinates [11].
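For small N, the greedy attribution of Steps 2 and 3 can be checked against the exact minimum cost matching by brute-force enumeration of all permutations, as in the following sketch (function names are illustrative; this is meant only as a verification tool, not as a practical alternative).

```matlab
function perm = exact_match(P, Q)
% EXACT_MATCH  Minimum cost Euclidean bipartite matching by brute force.
% Enumerates all N! one-to-one assignments of the rows of Q to the rows of P
% and returns the one minimizing the sum of the Euclidean distances.
% Feasible only for small N; intended as a check of greedy_match.
    N = size(P, 1);
    allPerms = perms(1:N);                 % N!-by-N matrix of all permutations
    bestCost = Inf;
    perm = allPerms(1, :).';
    for k = 1:size(allPerms, 1)
        cand = allPerms(k, :);
        cost = 0;
        for i = 1:N
            cost = cost + norm(P(i, :) - Q(cand(i), :));
        end
        if cost < bestCost
            bestCost = cost;
            perm = cand.';
        end
    end
end
```

Under the conditions described above, isequal(greedy_match(P, Q), exact_match(P, Q)) returns true, apart from possible ties.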

3. Numerical Tests and Results

To evaluate the effectiveness of the algorithm, we consider the following four numerical tests, which have been implemented in MATLAB R2017a [12].

In the first test, we generate N scalar-valued functions (i.e., d = 1), whose coefficients are chosen as realizations of independent random variables uniformly distributed on a given interval (the specific functional form and the values of N and M used in this test can be found in the MATLAB code provided in the Supplementary Materials). The obtained samples are plotted in Figure 4(a). In the figure, they are plotted using the same color, because the identity of each curve is not provided a priori to the algorithm, but has to be found by the algorithm itself. In Figure 4(b), the curves reconstructed by the algorithm are reported, using a different color for each curve. It is clear from the figure that the reconstructed curves are smooth. Incidentally, in this case, each reconstructed curve coincides with one of the original curves (so, no separate figure is presented to show the ground truth).
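A possible way to generate data of this kind is sketched below; the functional form (low-degree polynomials), the number of curves, the number of samples, and the coefficient interval are hypothetical choices made only for illustration, not the ones used in the paper.

```matlab
% Hypothetical generation of N scalar-valued test curves (d = 1).
N = 5;                          % number of curves (hypothetical value)
M = 100;                        % number of samples per curve (hypothetical value)
x = linspace(0, 1, M);          % uniform sampling of the independent variable
coeff = 2 * rand(N, 3) - 1;     % random coefficients, uniform on [-1, 1] (hypothetical)
F = zeros(N, M);
for i = 1:N
    F(i, :) = coeff(i, 1) + coeff(i, 2) * x + coeff(i, 3) * x.^2;
end
% The unlabelled sets S_j are the columns of F.
samples = cell(M, 1);
for j = 1:M
    samples{j} = F(:, j);       % N-by-1 matrix (d = 1)
end
curves = identify_curves(samples);   % attribution of the samples to the curves
```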

In the second test, we consider a setting similar to the previous one. In this case, however, the number of samples per curve is much smaller. The samples provided as input to the algorithm are represented as circles in Figure 5(a). As shown by Figure 5(b), also in this case the proposed algorithm is able to produce smooth curves. Moreover, its preliminary identification of smooth curves makes the subsequent linear interpolation step (also presented in Figure 5(b)) particularly effective for an improved visualization of the curves reconstructed by the algorithm.
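Assuming the data layout of the previous sketch, the interpolation step can be realized, e.g., with MATLAB's interp1, as follows (again an illustrative sketch, not the paper's code).

```matlab
% Piecewise linear interpolation of the reconstructed curves on a finer grid,
% for improved visualization; x, N, M, samples, and curves are as in the
% previous sketch.
xFine = linspace(x(1), x(end), 10 * M);   % finer grid of abscissas
figure; hold on;
for i = 1:N
    yi = squeeze(curves(i, :, 1));        % samples attributed to the i-th curve
    plot(x, yi, 'o', xFine, interp1(x, yi, xFine, 'linear'));
end
```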

The third test refers to the case d = 2, i.e., to curves embedded in a three-dimensional space. For this test, the two components of the vector-valued functions f_i have been generated in a way similar to the scalar functions of the first numerical test (the numbers of curves and of samples per curve are again reported in the Supplementary Materials). For the third test, Figure 6 shows (a) the samples used and (b) the curves reconstructed by the algorithm (a different color is used for each reconstructed curve). Again, also in this case, each reconstructed curve coincides with one of the original curves.

In the last test, we illustrate a particular situation in which the algorithm adopted in this work performs better than the one proposed in [6], considering a simple scalar case (d = 1) with only two intersecting curves. We also assume their wrong preliminary identification by the algorithm proposed in [9] (consistently with an analogous case of erroneous identification reported in [8, Figure 2.3]). More precisely, we assume that, for each sampled value x_j of the independent variable, that algorithm associates the first curve with the smallest element of S_j and the second curve with the largest element of S_j. Since the two intersecting curves have very small slope changes, no correction step is performed by the algorithm proposed in [6] (see also Section 1). So, in this particular case, the two algorithms from [6, 9] produce the same output (more extensive numerical comparisons in which the correction step of [6] is actually performed are among the possible subjects of further research, together with real-world applications of the proposed algorithm). Figure 7 shows that, in contrast, the proposed algorithm is able to correctly identify the two intersecting curves.
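For completeness, the value-ordering attribution assumed here for the algorithm of [9] amounts, in this scalar case, to sorting each set S_j, as in the following sketch (hypothetical code, shown only to make the assumed baseline behaviour explicit).

```matlab
% Value-ordering attribution assumed for the baseline: at each abscissa, the
% smaller sample goes to the first curve and the larger one to the second
% curve, so the labels of the two curves are swapped after their intersection.
M = numel(samples);                 % samples{j}: 2-by-1 vector (two scalar curves)
ordered = zeros(2, M);
for j = 1:M
    ordered(:, j) = sort(samples{j});
end
```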

4. Application to the Identification of Smooth Eigenvalue Curves

In the following, we illustrate how the algorithm proposed in this paper can be applied to the identification of smooth eigenvalue curves, by considering a last numerical test, described as follows. At first, we generate three real square matrices (whose elements are obtained, for simplicity, as realizations of independent random variables, uniformly distributed on a given interval). Then, for each sampled value x_j of the parameter, we construct a real matrix A(x_j), depending analytically on x_j through the three matrices above, and a symmetric real matrix B(x_j), obtained from A(x_j) and its transpose. For i = 1, ..., N, f_i(x_j) is defined to be one of the (real) eigenvalues of the matrix B(x_j). The functions f_i are unknown to the algorithm, which has at its disposal only the sets S_j (i.e., the collections of eigenvalues, for each x_j, j = 1, ..., M). Nevertheless, the algorithm attributes the eigenvalues to the functions f_i not sequentially according to their increasing values, but in such a way as to make the resulting functions smooth. It is worth noting that, since by construction the matrix B(x) is symmetric and depends analytically on the parameter x, its eigenvalues are analytic functions of x, due to [13, Theorem 6.1 in Chapter II]. Interestingly, this holds for both simple and multiple eigenvalues. Finally, the applicability of the algorithm (more precisely, of its Steps 1 and 2) derives from the fact that, with probability 1 with respect to the random construction of the matrices, the matrix B(x_1) has distinct eigenvalues (which is assumed in the following).
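The construction above can be sketched in MATLAB as follows; the specific parametric dependence A(x) = A0 + x A1 + x^2 A2, the symmetrization B(x) = (A(x) + A(x)^T)/2, and the dimensions are assumptions made only for illustration (one analytic dependence and one symmetrization among the possible ones), not necessarily the exact construction used in the test.

```matlab
% Illustrative construction of eigenvalue curves from a parametric symmetric
% matrix; the parametric form of A(x) and the symmetrization defining B(x)
% are assumptions, as are the dimensions below.
n = 10;                                    % matrix dimension (hypothetical)
M = 100;                                   % number of sampled abscissas (hypothetical)
A0 = rand(n); A1 = rand(n); A2 = rand(n);  % three random real matrices
x = linspace(0, 1, M);
samples = cell(M, 1);
for j = 1:M
    A = A0 + x(j) * A1 + x(j)^2 * A2;      % real matrix depending analytically on x(j)
    B = (A + A.') / 2;                     % symmetric real matrix (assumed construction)
    samples{j} = sort(eig(B));             % unlabelled set S_j of real eigenvalues
end
curves = identify_curves(samples);         % attribution of the eigenvalues to smooth curves
```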

Figure 8 shows (a) the samples obtained according to the construction above and (b) the curves reconstructed by the proposed algorithm (a different color is used for each reconstructed curve). The simulation has actually been performed with 10 curves; however, since one of the curves is typically well separated from the other ones when the simulation is repeated, Figure 8 refers only to the other 9 curves, in order to improve the visualization. The figure shows that, indeed, the algorithm is able to identify quite smooth curves.

5. Conclusions

An algorithm has been proposed for the identification of possibly intersecting curves. Despite its simplicity, the algorithm has been demonstrated to be very effective and applicable to problems such as the automatic identification of eigenvalue curves. Although artificially generated eigenvalue curves have been considered in the numerical example presented in the paper, the algorithm could be applied also to dispersion curves arising in wave-propagation problems, since these are also eigenvalue (or, more generally, generalized eigenvalue) curves [1]. As remarked in the Introduction, the algorithm is also potentially applicable to the analysis of motion-capture data [4]. Other possible applications are in medical image analysis [14] and in handwritten character recognition [15].

The numerical results presented in the paper refer to curves embedded in the plane and in three-dimensional space. However, similar results can be obtained for curves embedded in higher-dimensional spaces. As a possible future research direction, the algorithm could be extended to the case of possibly intersecting surfaces, which would make it potentially applicable to the identification of dispersion surfaces (instead of curves), still in wave-propagation problems [16]. Another research direction concerns the formulation of a suitable optimization problem modeling curve identification, for which the proposed algorithm (or a suitable variation of it) would provide a good approximation of the optimal solution value. Finally, in the paper, the proposed algorithm has been applied to noiseless data. This is justified by the fact that, in several applications, the data are not corrupted by noise, but only a few such data are available (because their generation can be computationally expensive, as in the case of data coming from dispersion curves associated with complex physical-mathematical models). However, the algorithm could also be applied (possibly with some variations, such as the use of regularization techniques typical of machine learning [17–21]) to noisy data and compared, on such problems, with existing multitarget tracking algorithms [7, 22]. In this case, the final interpolation step, implemented in the second test presented in Section 3, could be replaced by a more appropriate approximation step.

Data Availability

The MATLAB code and the data used for the numerical tests presented in the paper are included within the Supplementary Materials.

Disclosure

The author is a member of GNAMPA-INdAM (Gruppo Nazionale per l’Analisi Matematica, la Probabilità e le loro Applicazioni - Istituto Nazionale di Alta Matematica).

Conflicts of Interest

The author declares that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

The author was partially supported by a FFABR (Fondo per il Finanziamento delle Attività Base di Ricerca) research fund from the Italian Ministry of Education, University and Research (MIUR).

Supplementary Materials

test1.m: MATLAB code for the generation of Figures 4(a) and 4(b); it uses workspace1.mat. test2.m: MATLAB code for the generation of Figures 5(a) and 5(b); it uses workspace2.mat. test3.m: MATLAB code for the generation of Figures 6(a) and 6(b); it uses workspace3.mat. test4.m: MATLAB code for the generation of Figures 7(a), 7(b), and 7(c); it uses workspace4.mat. test_eigenvaues.m: MATLAB code for the generation of Figures 8(a) and 8(b); it uses workspace_eigenvalues.mat.