An Improved Exact Algorithm for Least-Squares Unidimensional Scaling

Palubeckis, Gintaras

doi:https://doi.org/10.1155/2013/890589

Journal of Applied Mathematics

On this page

Abstract Introduction Conclusions Appendix References Copyright Related Articles

Research Article | Open Access

Volume 2013 | Article ID 890589 | https://doi.org/10.1155/2013/890589

An Improved Exact Algorithm for Least-Squares Unidimensional Scaling

Gintaras Palubeckis¹

Academic Editor: Frank Werner

Received16 Oct 2012

Accepted31 Mar 2013

Published07 May 2013

Abstract

Given n objects and an symmetric dissimilarity matrix D with zero main diagonal and nonnegative off-diagonal entries, the least-squares unidimensional scaling problem asks to find an arrangement of objects along a straight line such that the pairwise distances between them reflect dissimilarities represented by the matrix D. In this paper, we propose an improved branch-and-bound algorithm for solving this problem. The main ingredients of the algorithm include an innovative upper bounding technique relying on the linear assignment model and a dominance test which allows considerably reducing the redundancy in the enumeration process. An initial lower bound for the algorithm is provided by an iterated tabu search heuristic. To enhance the performance of this heuristic we develop an efficient method for exploring the pairwise interchange neighborhood of a solution in the search space. The basic principle and formulas of the method are also used in the implementation of the dominance test. We report computational results for both randomly generated and real-life based problem instances. In particular, we were able to solve to guaranteed optimality the problem defined by a Morse code dissimilarity matrix.

1. Introduction

Least-squares (or ) unidimensional scaling is an important optimization problem in the field of combinatorial data analysis. It can be stated as follows. Suppose that there are objects and, in addition, there are pairwise dissimilarity data available for them. These data form an symmetric dissimilarity matrix, , with zero main diagonal and nonnegative off-diagonal entries. The problem is to arrange the objects along a straight line such that the pairwise distances between them reflect dissimilarities given by the matrix . Mathematically, the problem can be expressed as where denotes the vector and , , is the coordinate for the object on the line. We note that two (or even more) objects may have the same coordinate. In the literature, the double sum given by (1) is called a stress function [1].

Unidimensional scaling and its generalization, multidimensional scaling [1, 2], are dimensionality reduction techniques for exploratory data analysis and data mining [3]. The applicability of these techniques has been reported in a diversity of fields. In recent years, unidimensional scaling has found new applications in several domains including genetics, psychology, and medicine. Caraux and Pinloche [4] developed an environment to graphically explore gene expression data. This environment offers unidimensional scaling methods, along with some other popular techniques. Ng et al. [5] considered the problem of constructing linkage disequilibrium maps for human genome. They proposed a method for this problem which is centered on the use of a specific type of the unidimensional scaling model. Armstrong et al. [6] presented a study in which they evaluated interest profile differences across gender, racial-ethnic group, and employment status. The authors used two measures of profile similarity, the so-called correlation and Euclidean distances. In the former case, unidimensional scaling appeared to be a good model for interpreting the obtained results. Hubert and Steinley [7] applied unidimensional scaling to analyze proximity data on the United States Supreme Court Justices. They have found that agreement among the justices was better represented by this approach than using a hierarchical classification method. Dahabiah et al. [8, 9] proposed a data mining technique based on a unidimensional scaling algorithm. They applied this technique on a digestive endoscope database consisting of a large number of pathologies. Ge et al. [10] developed a method for unsupervised classification of remote sensing images. Unidimensional scaling plays a pivotal role in this classifier. Bihn et al. [11] examined the diversity of ant assemblages in the Atlantic Forest of Southern Brazil. To visualize nestedness of ant genera distribution, a unidimensional scaling approach was adopted.

From (1), we see that in order to solve the unidimensional scaling problem (UDSP for short) we need to find not only a permutation of objects, but also their coordinates preserving the order of the objects as given by the permutation. Thus, at first glance, it might seem that the UDSP is a much more difficult problem than other linear arrangement problems. Fortunately, this is not the case. As shown by Defays [12], the UDSP can be reduced to the following combinatorial optimization problem: where is the set of all permutations of , that is, the set of all vectors such that , , and , , . Letting , , denote the sum , we can write Thus, the UDSP is essentially a problem over all permutations of objects. With an optimal solution to (2) at hand, we can compute the optimal coordinates for the objects by setting and using the following relationship [13] for :

There is an important body of the literature on algorithms for solving the UDSP. Since the feasible solutions to the problem are linear arrangements of objects, a natural approach to the UDSP is to apply a general dynamic programming paradigm. An algorithm based on this paradigm was proposed by Hubert and Arabie [14]. However, as pointed out in the papers by Hubert et al. [15] and Brusco and Stahl [13], the dynamic programming algorithm requires very large memory for storing the partial solutions and, in fact, runs out of memory long before running out of time. As reported in these papers, the largest UDSP instances solved using dynamic programming were of sizes 25 and 26. An alternative approach for solving the UDSP is to use the branch-and-bound method. The first algorithm following this approach is that of Defays [12]. In his algorithm, during branching, the objects are assigned in an alternating fashion to either the leftmost free position or the rightmost free position in the -permutation under construction. For example, at level 4 of the search tree, the four selected objects are , , , and , and there are no object assigned to positions in the permutation . To fathom partial solutions, the algorithm uses a symmetry test, an adjacency test, and a bound test. Some more comments on the Defays approach were provided in the paper by Brusco and Stahl [13]. The authors of that paper also proposed two improved branch-and-bound algorithms for unidimensional scaling. The first of them adopts the above-mentioned branching strategy borrowed from the Defays method. Meanwhile, in the second algorithm of Brusco and Stahl, an object chosen for branching is always assigned to the leftmost free position. Brusco and Stahl [13, 16] presented improved bounding procedures and an interchange test expanding the adjacency test used by Defays [12]. These components of their branch-and-bound algorithms are briefly sketched in Section 2 and Section 4, respectively. Brusco and Stahl [13] have reported computational experience for both randomly generated and empirical dissimilarity matrices. The results suggest that their branch-and-bound implementations are superior to the Defays algorithm. Moreover, the results demonstrate that these implementations are able to solve UDSP instances that are beyond the capabilities of the dynamic programming method.

There are also a number of heuristic algorithms for unidimensional scaling. These include the smoothing technique [17], the iterative quadratic assignment heuristic [15], and simulated annealing adaptations [18, 19]. A heuristic implementation of dynamic programming for escaping local optima has been presented by Brusco et al. [20]. Computational results for problem instances with up to 200 objects were reported in [18–20].

The focus of this paper is on developing a branch-and-bound algorithm for the UDSP. The primary intention is to solve exactly some larger problems than those solved by previous methods. The significant features of our algorithm are the use of an innovative upper bounding procedure relying on the linear assignment model and the enhancement of the interchange test from [13]. In the current paper, this enhancement is called a dominance test. A good initial solution and thus lower bound are obtained by running an iterated tabu search algorithm at the root node of the search tree. To make this algorithm more efficient, we provide a novel mechanism and underlying formulas for searching the pairwise interchange neighborhood of a solution in time, which is an improvement over a standard -time approach from the literature. The same formulas are also used in the dominance test. We present the results of empirical evaluation of our branch-and-bound algorithm and compare it against the better of the two branch-and-bound algorithms of Brusco and Stahl [13]. The largest UDSP instance that we were able to solve with our algorithm was the problem defined by a Morse code dissimilarity matrix.

The remainder of this paper is organized as follows. In Section 2, we present an upper bound on solutions of the UDSP. In Section 3, we outline an effective tabu search procedure for unidimensional scaling and propose a method for exploring the pairwise interchange neighborhood. Section 4 gives a description of the dominance test for pruning nodes in the search. This test is used in our branch-and-bound algorithm presented in Section 5. The results of computational experiments are reported in Section 6. Finally, Section 7 concludes the paper.

2. Upper Bounds

In this section, we first briefly recall the bounds used by Brusco and Stahl [13] in their second branch-and-bound algorithm. Then we present a new upper bounding mechanism for the UDSP. We discuss the technical details of the procedure for computing our bound. We choose a variation of this procedure which yields slightly weakened bounds but is quite fast.

Like in the second algorithm of Brusco and Stahl, in our implementation of the branch-and-bound method partial solutions are constructed by placing each new object in the leftmost free position. For a partial solution , , we denote by the set of the objects in . Thus , and is the set of unselected objects. At the root node of the search tree, is empty, and is equal to . The first upper bound used by Brusco and Stahl, when adopted to , takes the following form: where denotes the sum of the terms in (3) corresponding to assigned objects. Thus, .

To describe the second bound, we define to be an matrix obtained from by sorting the nondiagonal entries of each row of in nondecreasing order. Let and for . For other values of , is obtained from the equality , . The second upper bound proposed in [13] is given by the following expression:

The previous outlined bounds are computationally attractive but not strong enough when used to prune the search space. In order to get a tighter upper bound, we resort to the linear assignment model where , are entries of an profit matrix with rows and columns indexed by the objects in the set . Next, we will construct the matrix . For a permutation , we denote by (resp., ) the sum of dissimilarities between object and all objects preceding (resp., following) in . Thus, , . From (3), we have and for each . Suppose now that is a partial solution and is the set of all -permutations that can be obtained by extending . Assume, for the sake of simplicity, that . In order to construct a suitable matrix , we have to bound and in (8) from above. To this end, for any , let denote the list obtained by sorting , , in nonincreasing order. To get the required bounds, we make use of the sums , , , and , , . Suppose that is any permutation in and for a position . Then it is easy to see that and . Indeed, in the case of , for example, we have , and the last term in this expression is less than or equal to . We define to be the matrix with entries , . Let denote the optimal value of the linear assignment problem (LAP) (7) with the profit matrix . From the definition of , we immediately obtain an upper bound on the optimal value of the objective function .

Proposition 1. For a partial solution , .

From the definition of and values, it can be observed that, with the increase of , increases, while decreases. Therefore, when constructing the th row of the matrix , it is expedient to first find the largest for which . We denote such by and assume that if . Then for , and for .

The most time-consuming step in the computation of the upper bound is solving the linear assignment problem. In order to provide a computationally less expensive bounding method we replace with the value of a feasible solution to the dual of the LAP (7). To get the bound, we take , , , . Clearly, for each pair . Thus the vector satisfies the constraints of the dual of the LAP instance of interest. Let . Then, from the duality theory of linear programming, we can establish the following upper bound for .

Proposition 2. For a partial solution , .

Proof. The inequality follows from Proposition 1 and the fact that .

One point that needs to be emphasized is that strengthens the first bound proposed by Brusco and Stahl.

Proposition 3. .

Proof. Indeed, . The equality in the previous chain of expressions comes from the choice of and our assumption on , whereas the inequality is due to the fact that for each .

As an example, we consider the following dissimilarity matrix borrowed from Defays [12]: The same matrix was also used by Brusco and Stahl [13] to illustrate their bounding procedures. From Table 2 in their paper, we find that the solution is optimal for this instance of the UDSP. The optimal value is equal to 388. We compute the proposed upper bound for the whole Defays problem. In this case, and . The matrix is constructed by noting that for each . It takes the following form: Using the expressions for and , we find that , , , and . Thus, . Hence, our bound for the Defays instance is tight. For comparison, the bound (5) is equal to . To obtain the second bound of Brusco and Stahl, we first compute the values. Since and , the bound (6) (taking the form ) is also equal to 500.

3. Tabu Search

As it is well known that the effectiveness of the bound test in branch-and-bound algorithms depends on the value of the best solution found throughout the search process. A good strategy is to create a high-quality solution before the search is started. For this task, various heuristics can be applied. Our choice was to use the tabu search (TS) technique. It should be noted that it is not the purpose of this paper to investigate heuristic algorithms for solving the UDSP. Therefore, we restrict ourselves merely to presenting an outline of our implementation of tabu search.

For , let denote the permutation obtained from by interchanging the objects in positions and . The set is the 2-interchange neighborhood of . At each iteration, the tabu search algorithm performs an exploration of the neighborhood . A subset of solutions in are evaluated by computing . In order to get better quality solutions we apply TS iteratively. This technique, called iterated tabu search (ITS), consists of the following steps: (1)initialize the search with some permutation of the objects; (2)apply a tabu search procedure. Check if the termination condition is satisfied. If so, then stop. Otherwise proceed to ; (3)apply a solution perturbation procedure. Return to .

In our implementation, the search is started from a randomly generated permutation. Also, some data are prepared to be used in the computation of the differences . We will defer this issue until later in this section. In step of ITS, at each iteration of the TS procedure, the neighborhood of the current solution is explored. The procedure selects a permutation , which either improves the best available objective function value so far or, if this does not happen, has the largest value of among all permutations for which the objects in positions and are not in the tabu list. The permutation is then accepted as a new current solution. The number of iterations in the tabu search phase is bounded above by a parameter of the algorithm. The goal of the solution perturbation phase is to generate a permutation which is employed as the starting point for the next invocation of the tabu search procedure. This is achieved by taking the current permutation and performing a sequence of pairwise interchanges of objects. Throughout this process, each object is allowed to change its position in the permutation at most once. At each step, first a set of feasible pairs of permutation positions with the largest values is formed. Then a pair, say , is chosen randomly from this set, and objects in positions and are switched. We omit some of the details of the algorithm. Similar implementations of the ITS technique for other optimization problems are thoroughly described, for example, in [21, 22]. We should note that also other enhancements of the generic tabu search strategy could be used for our purposes. These include, for example, multipath adaptive tabu search [23] and bacterial foraging tabu search [24] algorithms.

In the rest of this section, we concentrate on how to efficiently compute the differences . Certainly, this issue is central to the performance of the tabu search procedure. But more importantly, evaluation of pairwise interchanges of objects stands at the heart of the dominance test, which is one of the crucial components of our algorithm. As we will see in the next section, the computation of the values is an operation the dominance test is based on.

Brusco [19] derived the following formula to calculate the difference between and : Using (11), for all permutations in can be computed in time. However, this approach is actually too slow to be efficient. In order to describe a more elaborate approach we introduce three auxiliary matrices , , and defined for a permutation . Let us consider an object . The entries of the th row of the matrices and are defined as follows: if , then , ; if , then , ; if , then . Turning now to the matrix , we suppose that . Then the entries of are given by the following formula: Unlike and , the matrix is symmetric. We use the matrices , , and to cast (11) in a form more suitable for both the TS heuristic and the dominance test.

Proposition 4. For , , and ,

Proof. First, notice that the sums and in (11) are equal, respectively, to and . Next, it can be seen that and . Finally, the third term in (11), ignoring the factor of 4, can be rewritten as It is easy to see that the first sum in (14) is equal to , each of the next two sums is equal to , and the last sum is equal to . After simple manipulations, we arrive at (13).

Having the matrices and using (13), the 2-interchange neighborhood of can be explored in time. The question that remains to be addressed is how efficiently the entries of these matrices can be updated after the replacement of by a permutation chosen from the neighborhood . Suppose that is obtained from by interchanging the objects in positions and . Then, we can write the following formulas that are involved in updating the matrices and : From now on, we will assume that the positions and of the objects and are defined with respect to the permutation . More formally, we have , . The procedure for updating , and is summarized in the following: (1)for each object do the following: (1.1) if , then compute , , by (15) and , , by (17); (1.2) if , then compute , , by (16) and , , by (18); (1.3) if , then compute by (15) for and by (16) for ; (1.4) if or , then first set and afterwards compute by (15) for and by (16) for ; (1.5) if , then first set and afterwards compute by (17) for and by (18) for ; (2)for each pair of objects such that do the following: (2.1)if , then set ; (2.2)if , then set ; (2.3)if , then compute anew using (12); (2.4)if has been updated in , then set .

It can be noted that for some pairs of objects and the entry remains unchanged. There are five such cases: ; ; ; ; .

We now evaluate the time complexity of the described procedure. Each of Steps to requires time. Thus, the complexity of Step is . The number of operations in Steps (, (, and is , and that in Step is . However, the latter is executed only times. Therefore, the overall complexity of Step , and hence of the entire procedure, is .

Formula (13) and the previous described procedure for updating matrices , , and lie at the core of our implementation of the TS method. The matrices , , and can be initialized by applying the formulas (15)–(18) and (12) with respect to the starting permutation. The time complexity of initialization is for and , and for . The auxiliary matrices , , and and the expression (13) for are also used in the dominance test presented in the next section. This test plays a major role in pruning the search tree during the branching and bounding process.

4. Dominance Test

As is well known, the use of dominance tests in enumerative methods allows reducing the search space of the optimization problem of interest. A comprehensive review of dominance tests (or rules) in combinatorial optimization can be found in the paper by Jouglet and Carlier [25]. For the problem we are dealing with in this study, Brusco and Stahl [13] proposed a dominance test for discarding partial solutions, which relies on performing pairwise interchanges of objects. To formulate it, let us assume that , , is a partial solution to the UDSP. The test checks whether for at least one of the positions . If this condition is satisfied, then we say that the test fails. In this case, is thrown away. Otherwise, needs to be extended by placing an unassigned object in position . The first condition in our dominance test is the same as in the test suggested by Brusco and Stahl. To compute , we use subsets of entries of the auxiliary matrices , , and defined in the previous section. Since only the sign of matters, the factor of 4 can be ignored in (13), and thus instead of can be considered. From (13) we obtain

We also take advantage of the case where . We use the lexicographic rule for breaking ties. The test outputs a “Fail” verdict if the equality is found to hold true for some such that . Indeed, can be discarded because both and partial solution are equally good, but is lexicographically larger than .

We note that the entries of the matrices , , and used in (19) can be easily obtained from a subset of entries of , , and computed earlier in the search process. Consider, for example, in (19). It can be assumed that because if , then . For simplicity, let and . To avoid ambiguity, let us denote by the matrix that is obtained during processing of the partial solution . The entry of is calculated by (temporarily) placing the object in position . Now, referring to the definition of , it is easy to see that . Similarly, using formulas of types (15) and (17), we can obtain the required entries of the matrices and . Thus (19) can be computed in constant time. The complexity of computing for all is . The dominance test used in [13] relies on (11) and has complexity . So our implementation of the dominance test is significantly more time-efficient than that proposed by Brusco and Stahl.

In order to make the test even more efficient we, in addition, decided to evaluate those partial solutions that can be obtained from by relocating the object from position to position (here position is excluded since in this case relocating would become a pairwise interchange of objects). Thus, the test compares against partial solutions , . Now suppose that and are extended to -permutations and, respectively, such that, for each , the object in position of coincides with that in position of . Let us define to be the difference between and . This difference can be computed as follows: The previous expression is similar to the formula in [19]. For completeness, we give its derivation in the Appendix. We can rewrite (20) using the matrices and .

Proposition 5. For , , and ,

The partial solution is dominated by if . Upon establishing this fact for some , the test immediately stops with the rejection of . Of course, like in the case of (13), a factor of 4 in (21) can be dropped. The computational overhead of including the condition in the test is minimal because all the quantities appearing on the right-hand side of (21), except , are also used in the calculation of (see (19)). Since the loop over needs to be executed the time complexity of this part of the test is . Certainly, evaluation of relocating the last added object in gives more power to the test. Thus our dominance test is an improvement over the test described in the Brusco and Stahl [13] paper.

5. A Branch-and-Bound Algorithm

Our approach to the UDSP is to start with a good initial solution provided by the iterated tabu search heuristic invoked in the initialization phase of the algorithm and then proceed with the main phase where, using the branch-and-bound technique, a search tree is built. For the purpose of pruning branches of the search tree, the algorithm involves three tests: the upper bound test (based on ); the dominance test; the symmetry test. The first two of them have already been described in the prior sections. The last one is the most simple and cheap to apply. It hinges on the fact that the values of the objective function at a permutation and its reverse defined by , , are the same. Therefore, it is very natural to demand that some fixed object in , say , being among those objects that are assigned to the first half of the positions in . More formally, the condition is where is such that . If this condition is violated for , then the symmetry test fails, and is discarded. Because of the requirement to fulfill the previous condition, some care should be taken in applying the dominance test. In the case of and (see the paragraph following (19)), this test in our implementation of the algorithm gives a “Fail” verdict only if either or .

Through the preliminary computational experiments, we were convinced that the upper bound test was quite weak when applied to smaller partial solutions. In other words, the time overhead for computing linear assignment-based bounds did not pay off for partial permutations of length that is small or modest compared to . Therefore, it is reasonable to compute the upper bound (described in Section 2) only when , where is a threshold value depending on the UDSP instance matrix and thus also on . The problem we face is how to determine an appropriate value for . We combat this problem with a simple procedure consisting of two steps. In the first step, it randomly generates a relatively small number of partial permutations , each of length . For each permutation in this sample, the procedure computes the upper bound . Let be the values obtained. We are interested in the normalized differences between these values and , where is the solution delivered by the ITS algorithm. So the procedure calculates the sum and the average of such differences. In the second step, using , it determines the threshold value. This is done in the following way: if , then ; if , then ; if , then . The described procedure is executed at the initialization step of the algorithm. Its parameters are , , and the sample size .

Armed with tools for pruning the search space, we can construct a branch-and-bound algorithm for solving the UDSP. In order to make the description of the algorithm more readable we divide it into two parts, namely, the main procedure B&B (branch-and-bound) and an auxiliary procedure SelectObject. The former builds a search tree with nodes representing partial solutions. The task of SelectObject is either to choose an object, which can be used to extend the current permutation, or, if the current node becomes fathomed, to perform a backtracking step. In the description of the algorithm given in the following, the set of objects that can be used to create new branches is denoted by . The main procedure goes as follows.

B&B (1) Apply the ITS algorithm to get an initial solution to a given instance of the UDSP. Set . (2) Calculate . (3)Initialize the search tree by creating the root node with the empty set attached to it. Set ( denotes the tree level the currently considered node belongs to). (4) Set . (5) Call SelectObject(). It returns and, if , additionally some unassigned object (let it be denoted by ). If , then go to . Otherwise, check whether . If so, then return to (perform a backtracking step). If not (in which case ), then proceed to . (6) Create a new node with empty . Set . Increment by 1, and go to . (7)Calculate the coordinates for the objects by applying formula (4) with respect to the permutation and stop.

Steps and of B&B comprise the initialization phase of the algorithm. The initial solution to the problem is produced using ITS. This solution is saved as the best solution found so far, denoted by . If later a better solution is found, is updated. This is done within the SelectObject procedure. In the final step, B&B uses to calculate the optimal coordinates for the objects. The branch-and-bound phase starts at Step of B&B. Each node of the search tree is assigned a set . The partial solution corresponding to the current node is denoted by . The objects of are used to extend . Upon creation of a node (Step for root and Step for other nodes), the set is empty. In SelectObject, is filled with objects that are promising candidates to perform the branching operation on the corresponding node. SelectObject also decides which action has to be taken next. If , then the action is “branch,” whereas if , the action is “backtrack.” Step helps to discriminate between these two cases. If , then a new node in the next level of the search tree is generated. At that point, the partial permutation is extended by assigning the selected object to the position in (see Step ). If, however, and is positive, then backtracking is performed by repeating Steps and . We thus see that the search process is controlled by the procedure SelectObject. This procedure can be stated as follows.

SelectObject() (1)Check whether the set attached to the current node is empty. If so, then go to . If not, then consider its subset . If is nonempty, then mark an arbitrary object , and return with it as well as with . Otherwise go to . (2)If , then go to . Otherwise, set , , , and go to . (3)If , then go to . Otherwise, for each , perform the following operations. Temporarily set . Apply the symmetry test, the dominance test, and, if , also the bound test to the resulting partial solution . If it passes all the tests, then do the following. Include into . If , assign to the value of the upper bound computed for the subproblem defined by the partial solution . If, however, , then set . If after processing all objects in , the set remains empty, then go to . Otherwise proceed to . (4)Mark an arbitrary object , and return with it as well as with . (5)Let . Calculate the value of the solution . If , then set and . (6)If , then drop the th element of . Return with .

In the previous description, denotes the set of unassigned objects, as before. The objects included in are considered as candidates to be assigned to the leftmost available position in . At the root node, . In all other cases, except when , an object becomes a member of the set only if it passes two or, for longer , three tests. The third of them, the upper bound test, amounts to check if . This condition is checked again on the second and subsequent calls to SelectObject for the same node of the search tree. This is done in Step for each unmarked object . Such a strategy is reasonable because during the branch-and-bound process the value of may increase, and the previously satisfied constraint may become violated for some . To be able to identify such objects, the algorithm stores the computed upper bound as at Step of SelectObject.

A closer look at the procedure SelectObject shows that two scenarios are possible: when this procedure is invoked following the creation of a new node of the tree (which is done either in Step or in Step of B&B); when a call to the procedure is made after a backtracking step. The set initially is empty, and therefore Steps to of SelectObject become active only in the first of these two scenarios. Depending on the current level of the tree, , one of the three cases is considered. If the current node is the root, that is, , then Step is executed. Since in this case the dominance test is not applicable, the algorithm uses the set comprising all the objects under consideration. An arbitrary object selected in Step is then used to create the first branch emanating from the root of the tree. If , then the algorithm proceeds to Step . The task of this step is to evaluate all partial solutions that can be obtained from the current partial permutation by assigning an object from to the th position in . In order not to lose at least one optimal solution it is sufficient to keep only those extensions of for which all the tests were successful. Such partial solutions are memorized by the set . One of them is chosen in Step to be used (in Step of B&B) to generate the first son of the current node. If, however, no promising partial solution has been found and thus the set is empty, then the current node can be discarded, and the search continued from its parent node. Finally, if , then Step is executed in which the permutation is completed by placing the remaining unassigned object in the last position of . Its value is compared with that of the best solution found so far. The node corresponding to is a leaf and, therefore, is fathomed. In the second of the above-mentioned two scenarios (Step of the procedure), the already existing nonempty set is scanned. The previously selected objects in have been marked. They are ignored. For each remaining object , it is checked whether . If so, then is included in the set . If the resulting set is empty, then the current node of the tree becomes fathomed, and the procedure goes to Step . Otherwise, the procedure chooses a branching object and passes it to its caller B&B. Our implementation of the algorithm adopts a very simple enumeration strategy: branch upon an arbitrary object in (or when Step is performed).

It is useful to note that the tests in Step are applied in increasing order of time complexity. The symmetry test is the cheapest of them, whereas the upper bound test is the most expensive to perform but still effective.

6. Computational Results

The main goal of our numerical experiments was to provide some measure of the performance of the developed B&B algorithm as well as to demonstrate the benefits of using the new upper bound and efficient dominance test throughout the solution process. For comparison purposes, we also implemented the better of the two branch-and-bound algorithms of Brusco and Stahl [13]. In their study, this algorithm is denoted as BB3. When referring to it in this paper, we will use the same name. In order to make the comparison more fair, we decided to start BB3 by invoking the ITS heuristic. Thus, the initial solution from which the branch-and-bound phase of each of the algorithms starts is expected to be of the same quality.

6.1. Experimental Setup

Both our algorithm and BB3 have been coded in the C programming language, and all the tests have been carried out on a PC with an Intel Core 2 Duo CPU running at 3.0 GHz. As a testbed for the algorithms, three empirical dissimilarity matrices from the literature in addition to a number of randomly generated ones were considered. The random matrices vary in size from 20 to 30 objects. All off-diagonal entries are drawn uniformly at random from the interval . The matrices, of course, are symmetric.

We also used three dissimilarity matrices constructed from empirical data found in the literature. The first matrix is based on Morse code confusion data collected by Rothkopf [26]. In his experiment, the subjects were presented pairs of Morse code signals and asked to respond whether they thought the signals were the same or different. These answers were translated to the entries of the similarity matrix with rows and columns corresponding to the 36 alphanumeric characters (26 letters in the Latin alphabet and 10 digits in the decimal number system). We should mention that we found small discrepancies between Morse code similarity matrices given in various publications and electronic media. We have decided to take the matrix from [27], where it is presented with a reference to Shepard [28]. Let this matrix be denoted by . Its entries are integers from the interval . Like in the study of Brusco and Stahl [13], the corresponding symmetric dissimilarity matrix was obtained by using the formula for all off-diagonal entries and setting the diagonal of to zero. We considered two UDSP instances constructed using Morse code confusion data—one defined by the submatrix of with rows and columns labeled by the letters and another defined by the full matrix .

The second empirical matrix was taken from Groenen and Heiser [29]. We denote it by . Its entry gives the number of citations in journal to journal . Thus the rows of correspond to citing journals, and the columns correspond to cited journals. The dissimilarity matrix was derived from in two steps, like in [13]. First, a symmetric similarity matrix, , was calculated by using the formulas , , , and , . Then was constructed with entries , , , and , .

The third empirical matrix used in our experiments was taken from Hubert et al. [30]. Its entries represent the dissimilarities between pairs of food items. As mentioned by Hubert et al. [30], this food-item matrix, , was constructed based on data provided by Ross and Murphy [31]. For our computational tests, we considered submatrices of defined by the first food items. All the dissimilarity matrices we just described as well as the source code of our algorithm are publicly available at http://www.proin.ktu.lt/~gintaras/uds.html.

Based on the results of preliminary experiments with the B&B algorithm, we have fixed the values of its parameters: , , and . In the symmetry test, was fixed at 1. The cutoff time for ITS was 1 second.

6.2. Results

The first experiment was conducted on a series of randomly generated full density dissimilarity matrices. The results of B&B and BB3 for them are summarized in Table 1. The dimension of the matrix is encoded in the instance name displayed in the first column. The second column of the table shows, for each instance, the optimal value of the objective function and, in parentheses, two related metrics. The first of them is the raw value of the stress function . The second number in parentheses is the normalized value of the stress function that is calculated as . The next two columns list the results of our algorithm: the number of nodes in the search tree and the CPU time reported in the form hours : minutes : seconds or minutes : seconds or seconds. The last two columns give these measures for BB3.

As it can be seen from the table, B&B is superior to BB3. We observe, for example, that the decrease in CPU time over BB3 is around for the largest three instances. Also, our algorithm builds smaller search trees than BB3.

In our second experiment, we opt for three empirical dissimilarity matrices mentioned in Section 6.1. Table 2 summarizes the results of B&B and BB3 on the UDSP instances defined by these matrices. Its structure is the same as that of Table 1. In the instance names, the letter “M” stands for the Morse code matrix, “J” stands for the journal citations matrix, and “F” stands for the food item data. As before, the number following the letter indicates the dimension of the problem matrix. Comparing results in Tables 1 and 2, we find that for empirical data the superiority of our algorithm over BB3 is more pronounced. Basically, B&B performs significantly better than BB3 for each of the three data sets. For example, for the food-item dissimilarity matrix of size , B&B builds almost 10 times smaller search tree and is about 5 times faster than BB3.

Table 3 presents the results of the performance of B&B on the full Morse code dissimilarity matrix as well as on larger submatrices of the food-item dissimilarity matrix. Running BB3 on these UDSP instances is too expensive, so this algorithm is not included in the table. By analyzing the results in Tables 2 and 3, we find that, for , the CPU time taken by B&B to solve the problem with the food-item submatrix is between 196 (for ) and 282 (for ) percent of that needed to solve the problem with the food-item submatrix. For the submatrix of size , the CPU time taken by the algorithm was more than 14 days. In fact, a solution of value 120526370 was found by the ITS heuristic in less than 1 s. The rest of the time was spent on proving its optimality. We also see from Table 3 that the UDSP with the full Morse code dissimilarity matrix required nearly 21 days to solve to proven optimality using our approach.

In Table 4, we display optimal solutions for the two largest UDSP instances we have tried. The second and fourth columns of the table contain the optimal permutations for these instances. The coordinates for objects, calculated using (4), are given in the third and fifth columns. The number in parenthesis in the fourth column shows the order in which the food items appear within Table 5.1 in the book by Hubert et al. [30]. From Table 4, we observe that only a half of the signals representing digits in the Morse code are placed at the end of the scale. Other such signals are positioned between signals encoding letters. We also see that shorter signals tend to occupy positions at the beginning of the permutation .

In Table 5, we evaluate the effectiveness of tests used to fathom partial solutions. Computational results are reported for a selection of UDSP instances. The second, third, and fourth columns of the table present , , and , where (respectively, and ) is the number of partial permutations that are fathomed due to symmetry test (respectively, due to dominance test and due to upper bound test) in Step of SelectObject, and . The last three columns give the same set of statistics for the BB3 algorithm. Inspection of Table 5 reveals that is much larger than both and . It can also be seen that values are consistently higher with BB3 than when using our approach. Comparing the other two tests, we observe that is larger than in all cases except when applying B&B to the food-item dissimilarity submatrices of size and . Generally, we notice that the upper bound test is most successful for the UDSP instances defined by the food-item matrix.

We end this section with a discussion of two issues regarding the use of heuristic algorithms for the UDSP. From a practical point of view, heuristics, of course, are preferable to exact methods as they are fast and can be applied to large problem instances. If carefully designed, they are able to produce solutions of very high quality. The development of exact methods, on the other hand, is a theoretical avenue for coping with different problems. In the field of optimization, exact algorithms, for a problem of interest, provide not only a solution but also a certificate of its optimality. Therefore, one of possible applications of exact methods is to use them in the process of assessment of heuristic techniques. Specifically, optimal values can serve as a reference point with which the results of heuristics can be compared. However, obtaining an optimality certificate in many cases, including the UDSP, is very time consuming. In order to see the time differences between finding an optimal solution for UDSP with the help of heuristics and proving its optimality (as it is done by the branch-and-bound method described in this paper), we run the two heuristic algorithms, iterated tabu search and simulated annealing (SA) of Brusco [19], on all the problem instances used in the main experiment. For two random instances, namely, p21 and p25, SA was trapped in a local maximum. Therefore, we have enhanced our implementation of SA by allowing it to run in a multi start mode. This variation of SA located an optimal solution for the above-mentioned instances in the third and, respectively, second restarts. Both tested heuristics have proven to be very fast. In particular, ITS took in total only 2.95 seconds to find the global optima for all 30 instances we experimented with. Performing the same task by SA required 5.44 seconds. Such times are in huge contrast to what is seen in Tables 1–3. Basically, the long computation times recorded in these tables are the price we have to pay for obtaining not only a solution but also a guarantee of its optimality. We did not perform additional computational experiments with ITS and SA because investigation of heuristics for the UDSP is outside the scope of this paper.

Another issue concerns the use of heuristics for providing initial solutions for the branch-and-bound technique. We have employed the iterated tabu search procedure for this purpose. Our choice was influenced by the fact that we could adopt in ITS the same formulas as those used in the dominance test, which is a key component of our approach. Obviously, there are other heuristics one can apply in place of ITS. As it follows from the discussion in the preceding paragraph, a good candidate for this is the simulated annealing algorithm of Brusco [19]. Suppose that B&B is run twice on the same instance from different initial solutions, both of which are optimal. Then it should be clear that in both cases the same number of tree nodes is explored. This essentially means that any fast heuristic capable of providing optimal solutions is perfect to be used in the initialization step of B&B. An alternative strategy is to use in this step randomly generated permutations. The effects of this simple strategy are mixed. For example, for problem instances F25 and F27, the computation time increased from 148.3 and, respectively, 1169.8 seconds (see Table 2) to 413.3 and, respectively, 2062.5 seconds. However, for example, for p25 and M26, a somewhat different picture was observed. In the case of random initial solution, B&B took 407.5 and 468.7 seconds for these instances, respectively. The computation times in Table 1 for p25 and Table 2 for M26 are only marginally shorter. One possible explanation of this lies in the fact that, as Table 5 indicates, the percentage of partial solutions fathomed due to bound test for both p25 and M26 is quite low. However, in general, it is advantageous to use a heuristic for obtaining an initial solution for the branch-and-bound algorithm, especially because this solution can be produced at a negligible cost with respect to the rest of the computation.

7. Conclusions

In this paper we have presented a branch-and-bound algorithm for the least-squares unidimensional scaling problem. The algorithm incorporates a LAP-based upper bound test as well as a dominance test which allows reducing the redundancy in the search process drastically. The results of computational experiments indicate that the algorithm can be used to obtain provably optimal solutions for randomly generated dissimilarity matrices of size up to and for empirical dissimilarity matrices of size up to . In particular, for the first time, the UDSP instance with the Morse code dissimilarity matrix has been solved to guaranteed optimality. Another important instance is defined by the food-item dissimilarity matrix . It is a great challenge to the optimization community to develop an exact algorithm that will solve this instance. Our branch-and-bound algorithm is limited to submatrices of of size around . We also remark that optimal solutions for all UDSP instances used in our experiments can be found by applying heuristic algorithms such as iterated tabu search and simulated annealing procedures. The total time taken by each of these procedures to reach the global optima is only a few seconds. An advantage of the branch-and-bound algorithm is its ability to certify the optimality of the solution obtained in a rigorous manner.

Before closing, there are two more points worth of mention. First, we proposed a simple, yet effective, mechanism for making a decision about when it is useful to apply a bound test to the current partial solution and when such a test most probably will fail to discard this solution. For that purpose, a sample of partial solutions is evaluated in the preparation step of B&B. We believe that similar mechanisms may be operative in branch-and-bound algorithms for other difficult combinatorial optimization problems. Second, we developed an efficient procedure for exploring the pairwise interchange neighborhood of a solution in the search space. This procedure can be useful on its own in the design of heuristic algorithms for unidimensional scaling, especially those based on the local search principle.

Appendix

Derivation of (20)

We will deduce an expression for , , assuming that is a solution to the whole problem, that is, . Relocating the object from position to position gives the permutation . By using the definition of we can write Let us denote the first two terms in (A.1) by and the remaining two terms by . It is easy to see that stems from moving the object to position and stems from shifting objects , , by one position to the right. To get a simpler expression for , we perform the following manipulations:

Continuing, we get

Finally,

Similarly, for we have By summing up (A.4) and (A.5) for we obtain (20).

References

I. Borg and P. J. F. Groenen, Modern Multidimensional Scaling: Theory and Appplications, Springer, New York, NY, USA, 2nd edition, 2005.
View at: Zentralblatt MATH | MathSciNet
L. Hubert, P. Arabie, and J. Meulman, The Structural Representation of Proximity Matrices with MATLAB, vol. 19, SIAM, Philadelphia, Pa, USA, 2006.
View at: Publisher Site | Zentralblatt MATH | MathSciNet
S. Meisel and D. Mattfeld, “Synergies of Operations Research and Data Mining,” European Journal of Operational Research, vol. 206, no. 1, pp. 1–10, 2010.
View at: Publisher Site | Google Scholar
G. Caraux and S. Pinloche, “PermutMatrix: a graphical environment to arrange gene expression profiles in optimal linear order,” Bioinformatics, vol. 21, no. 7, pp. 1280–1281, 2005.
View at: Publisher Site | Google Scholar
M. K. Ng, E. S. Fung, and H.-Y. Liao, “Linkage disequilibrium map by unidimensional nonnegative scaling,” in Proceedings of the 1st International Symposium on Optimization and Systems Biology (OSB '07), pp. 302–308, Beijing, China, 2007.
View at: Google Scholar
P. I. Armstrong, N. A. Fouad, J. Rounds, and L. Hubert, “Quantifying and interpreting group differences in interest profiles,” Journal of Career Assessment, vol. 18, no. 2, pp. 115–132, 2010.
View at: Publisher Site | Google Scholar
L. Hubert and D. Steinley, “Agreement among supreme court justices: categorical versus continuous representation,” SIAM News, vol. 38, no. 7, 2005.
View at: Google Scholar
A. Dahabiah, J. Puentes, and B. Solaiman, “Digestive casebase mining based on possibility theory and linear unidimensional scaling,” in Proceedings of the 8th WSEAS International Conference on Artificial Intelligence, Knowledge Engineering and Data Bases (AIKED '09), pp. 218–223, World Scientific and Engineering Academy and Society (WSEAS), Stevens Point, Wis, USA, 2009.
View at: Google Scholar
A. Dahabiah, J. Puentes, and B. Solaiman, “Possibilistic pattern recognition in a digestive database for mining imperfect data,” WSEAS Transactions on Systems, vol. 8, no. 2, pp. 229–240, 2009.
View at: Google Scholar
Y. Ge, Y. Leung, and J. Ma, “Unidimensional scaling classifier and its application to remotely sensed data,” in Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS '05), pp. 3841–3844, July 2005.
View at: Publisher Site | Google Scholar
J. H. Bihn, M. Verhaagh, M. Brändle, and R. Brandl, “Do secondary forests act as refuges for old growth forest animals? Recovery of ant diversity in the Atlantic forest of Brazil,” Biological Conservation, vol. 141, no. 3, pp. 733–743, 2008.
View at: Publisher Site | Google Scholar
D. Defays, “A short note on a method of seriation,” British Journal of Mathematical and Statistical Psychology, vol. 31, no. 1, pp. 49–53, 1978.
View at: Google Scholar
M. J. Brusco and S. Stahl, “Optimal least-squares unidimensional scaling: improved branch-and-bound procedures and comparison to dynamic programming,” Psychometrika, vol. 70, no. 2, pp. 253–270, 2005.
View at: Publisher Site | Google Scholar
L. J. Hubert and P. Arabie, “Unidimensional scaling and combinatorial optimization,” in Multidimensional Data Analysis, J. de Leeuw, W. J. Heiser, J. Meulman, and F. Critchley, Eds., pp. 181–196, DSWO Press, Leiden, The Netherlands, 1986.
View at: Google Scholar
L. J. Hubert, P. Arabie, and J. J. Meulman, “Linear unidimensional scaling in the $L_{2}$ -norm: basic optimization methods using MATLAB,” Journal of Classification, vol. 19, no. 2, pp. 303–328, 2002.
View at: Publisher Site | Google Scholar
M. J. Brusco and S. Stahl, Branch-and-Bound Applications in Combinatorial Data Analysis, Springer Science + Business Media, New York, NY, USA, 2005.
V. Pliner, “Metric unidimensional scaling and global optimization,” Journal of Classification, vol. 13, no. 1, pp. 3–18, 1996.
View at: Google Scholar
A. Murillo, J. F. Vera, and W. J. Heiser, “A permutation-translation simulated annealing algorithm for $L_{1}$ and $L_{2}$ unidimensional scaling,” Journal of Classification, vol. 22, no. 1, pp. 119–138, 2005.
View at: Publisher Site | Google Scholar
M. J. Brusco, “On the performance of simulated annealing for large-scale $L_{2}$ unidimensional scaling,” Journal of Classification, vol. 23, no. 2, pp. 255–268, 2006.
View at: Publisher Site | Google Scholar
M. J. Brusco, H. F. Köhn, and S. Stahl, “Heuristic implementation of dynamic programming for matrix permutation problems in combinatorial data analysis,” Psychometrika, vol. 73, no. 3, pp. 503–522, 2008.
View at: Publisher Site | Google Scholar
G. Palubeckis, “Iterated tabu search for the maximum diversity problem,” Applied Mathematics and Computation, vol. 189, no. 1, pp. 371–383, 2007.
View at: Publisher Site | Google Scholar
G. Palubeckis, “A new bounding procedure and an improved exact algorithm for the Max-2-SAT problem,” Applied Mathematics and Computation, vol. 215, no. 3, pp. 1106–1117, 2009.
View at: Publisher Site | Google Scholar
J. Kluabwang, D. Puangdownreong, and S. Sujitjorn, “Multipath adaptive tabu search for a vehicle control problem,” Journal of Applied Mathematics, vol. 2012, Article ID 731623, 20 pages, 2012.
View at: Publisher Site | Google Scholar
N. Sarasiri, K. Suthamno, and S. Sujitjorn, “Bacterial foraging-tabu search metaheuristics for identification of nonlinear friction model,” Journal of Applied Mathematics, vol. 2012, Article ID 238563, 23 pages, 2012.
View at: Publisher Site | Google Scholar
A. Jouglet and J. Carlier, “Dominance rules in combinatorial optimization problems,” European Journal of Operational Research, vol. 212, no. 3, pp. 433–444, 2011.
View at: Publisher Site | Google Scholar
E. Z. Rothkopf, “A measure of stimulus similarity and errors in some paired-associate learning tasks,” Journal of Experimental Psychology, vol. 53, no. 2, pp. 94–101, 1957.
View at: Publisher Site | Google Scholar
I. Düntsch and G. Gediga, “Modal-Style Operators in Qualitative Data Analysis,” Tech. Rep. CS-02-15, Department of Computer Science, Brock University, St. Catharines, Ontario, Canada, 2002.
View at: Google Scholar
R. N. Shepard, “Analysis of proximities as a technique for the study of information processing in man,” Human factors, vol. 5, pp. 33–48, 1963.
View at: Google Scholar
P. J. F. Groenen and W. J. Heiser, “The tunneling method for global optimization in multidimensional scaling,” Psychometrika, vol. 61, no. 3, pp. 529–550, 1996.
View at: Google Scholar
L. Hubert, P. Arabie, and J. Meulman, Combinatorial Data Analysis: Optimization by Dynamic Programming, SIAM, Philadelphia, Pa, USA, 2001.
B. H. Ross and G. L. Murphy, “Food for thought: cross-classification and category organization in a complex real-world domain,” Cognitive Psychology, vol. 38, no. 4, pp. 495–553, 1999.
View at: Google Scholar

Copyright

Copyright © 2013 Gintaras Palubeckis. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

PDF Download Citation

Download other formats

Order printed copies

Views

1037

Downloads

845

Citations