Mathematical Problems in Engineering

Volume 2015 (2015), Article ID 241379, 10 pages

http://dx.doi.org/10.1155/2015/241379

## A Two-Sided Matching Decision Model Based on Uncertain Preference Sequences

School of Management, Huazhong University of Science and Technology, Wuhan 430074, China

Received 28 February 2015; Revised 23 May 2015; Accepted 28 May 2015

Academic Editor: Kyandoghere Kyamakya

Copyright © 2015 Xiao Liu and Huimin Ma. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Two-sided matching is an active topic in operations research and decision analysis. This paper reviews the typical two-sided matching models and their limitations in some specific contexts, and then puts forward a new decision model based on uncertain preference sequences. In this model, we first design a data processing method to obtain preference ordinal values from uncertain preference sequences and then compute the preference distance of each matching pair from these ordinal values. The objectives are to maximize the number of matching pairs and to minimize the total preference distance of all matching pairs, subject to a minimum threshold on matching effect, and the model is solved with a branch-and-bound algorithm. We take two numerical cases as examples and analyze the matching solutions obtained with the one-norm, two-norm, and positive-infinity-norm distances, respectively. We also compare our decision model with two other approaches and summarize their characteristics for two-sided matching.

#### 1. Introduction

Two-sided matching is an important research branch of operations research and decision analysis and has been applied to many areas of engineering and economics, such as commerce trading [1, 2], work assignment [3, 4], and resource allocation [5, 6]. The two-sided matching decision problem derives from Gale and Shapley’s 1962 research on stable marriage matching and the college admission problem [7]. Building on their pioneering work, Roth first gave a precise definition of two-sided matching: “two-sided” refers to the fact that agents in such markets belong to one of two disjoint sets, for example, firms or workers, that are specified in advance, and “matching” refers to the bilateral nature of exchange in these markets, for example, if I am employed by the University of Pittsburgh, then the University of Pittsburgh employs me [8]. In many actual two-sided matching cases, because information is difficult to acquire and fuzzy to identify, it is much easier for a decision maker to obtain the preference sequence of each element than other kinds of information, such as the weight of the connection between two elements on the two disjoint sides. The preference sequence is therefore often the essential, and sometimes the only, basis for decision making. However, when the scale of data grows rapidly, it becomes nearly impossible to collect complete preference sequences; moreover, two or more elements in a preference sequence sometimes cannot be ranked against each other, because the preference subject has the same degree of preference for them.

We now define some concepts concerning preference sequences. If the preference sequence of an element in one set over the other disjoint set includes all the elements of the latter set, we call it a complete preference sequence; otherwise we call it an incomplete preference sequence. If no two elements in the preference sequence of an element in one set over the other disjoint set have the same preference degree, we call it a strong preference sequence; otherwise we call it a weak preference sequence. We define the preference sequence of an element in one set as a certain preference sequence only when all the elements in the other set have a direct and definite ordinal value under this sequence; otherwise it is an uncertain preference sequence. Equivalently, a preference sequence is certain only when it is both complete and strong. The decision model constructed in this paper addresses exactly this setting: the only information given for two-sided matching is the uncertain preference sequences.

#### 2. Research on Two-Sided Matching

In general, we can categorize two-sided matching problems into three typical kinds of models in terms of their decision objectives: stable matching, maximum cardinality matching, and maximum weight matching. In the first model, the objective is to seek a stable matching solution, and we count a solution as a stable matching only when there does not exist any alternative pairing $(A_i, B_j)$, with $A_i$ from one set and $B_j$ from the other, in which $A_i$ and $B_j$ are both individually better off than they would be with the elements to which they are currently matched. Gale and Shapley put forward an approach, now named the Gale-Shapley algorithm, that obtains a stable matching solution from the perspective of mathematics and game theory; it marks the beginning of two-sided matching research and has drawn subsequent scholars to this topic. In the second model, the objective is to seek a solution in which the number of matching pairs is maximized. This kind of problem has been widely studied in graph theory, and one common solving approach is the Hungarian algorithm, which was put forward by Edmonds in 1965 on the basis of the Hall Theorem [9]. The key point of the Hungarian algorithm is to seek an augmenting path, and this kind of two-sided matching problem is equivalent to maximum matching in a bipartite graph. Maximum weight matching is sometimes also called optimal weight matching, because minimum weight matching is easily transformed into maximum weight matching and is solved by the same approach. In this model, each matching pair consisting of two elements from the two disjoint sets has a corresponding weight value, and the objective is to maximize or otherwise optimize the total weight of all the matching pairs. This matching model can be classified as an assignment problem in operations research, and one common solving approach is the Hungarian method, which was first put forward by Kuhn in 1955 on the basis of a theorem due to the Hungarian mathematician König [10].
It is worth mentioning that the approaches of Edmonds and Kuhn have similar names because of the word “Hungarian,” but they are two entirely different methods. Meanwhile, the assignment problem can also be regarded as a maximum weight matching problem in a bipartite graph, for which one common solving approach is the Kuhn-Munkres algorithm put forward by Munkres in 1957 [11].
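As an illustration of the stable matching model described above, the following is a minimal sketch of the Gale-Shapley deferred-acceptance procedure for the simplest setting: both sides have the same size and every preference list is complete and strict. The function names, the dictionary layout, and the sample data `A_PREFS` and `B_PREFS` are assumptions made for this example and do not come from the paper.

```python
def gale_shapley(proposer_prefs, receiver_prefs):
    """Return a stable matching as {proposer: receiver}.

    Both arguments map each agent to a complete, strict preference list
    (most preferred first) over the agents on the other side.
    """
    # rank[r][p] = position of proposer p in receiver r's list (lower is better)
    rank = {r: {p: k for k, p in enumerate(prefs)}
            for r, prefs in receiver_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}  # next receiver index to try
    engaged_to = {}                               # receiver -> proposer
    free = list(proposer_prefs)

    while free:
        p = free.pop()
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = engaged_to.get(r)
        if current is None:
            engaged_to[r] = p                     # r was free and accepts
        elif rank[r][p] < rank[r][current]:
            engaged_to[r] = p                     # r trades up, current is freed
            free.append(current)
        else:
            free.append(p)                        # r rejects p
    return {p: r for r, p in engaged_to.items()}


def is_stable(matching, proposer_prefs, receiver_prefs):
    """Check that no proposer-receiver pair prefers each other to their partners."""
    partner_of = {r: p for p, r in matching.items()}
    for p, prefs in proposer_prefs.items():
        for r in prefs:
            if r == matching[p]:
                break  # every later receiver is less preferred by p
            r_list = receiver_prefs[r]
            if r_list.index(p) < r_list.index(partner_of[r]):
                return False  # (p, r) is a blocking pair
    return True


# Hypothetical 3-by-3 instance
A_PREFS = {"a1": ["b1", "b2", "b3"],
           "a2": ["b1", "b3", "b2"],
           "a3": ["b2", "b1", "b3"]}
B_PREFS = {"b1": ["a2", "a1", "a3"],
           "b2": ["a1", "a3", "a2"],
           "b3": ["a3", "a2", "a1"]}

stable = gale_shapley(A_PREFS, B_PREFS)
```

The result is stable but favors the proposing side, which is one reason stability alone may not reflect the interests of both sides.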

Building on previous scholars’ work, current research on two-sided matching is usually conducted in one of two ways. The first is to seek more effective methods or to analyze particular algorithms for the typical models, especially the stable matching model. For example, Roth puts forward the hospital-resident algorithm for the many-to-one matching case [12], Knoblauch studies the characteristics of the Gale-Shapley algorithm under randomly distributed preference ordinal values [13], McVitie and Wilson put forward a new algorithm based on the “Breakmarriage Operation” for the situation in which the two sides differ in size [14], and Teo et al. study strategic issues in the Gale-Shapley stable marriage model [15]. The second is application to specific decision contexts. Each decision context has its own characteristics, and decision makers have to take its distinctive constraints into consideration, so the typical two-sided matching model must be extended or revised. For example, van Raalte and Webers study a two-sided market in which one type of agent needs the service of a middleman or matchmaker in order to be matched with the other type [16], and Sarne and Kraus address the problem of agents in a distributed costly two-sided search for pairwise partnerships in a multiagent system [17]. The research in this paper follows the second way.

In many actual decision situations, most two-sided matching cases cannot easily be classified into the typical matching models mentioned above. For example, it is hard to set an appropriate weight value for each matching pair directly. Although some quantitative methods, such as AHP as shown in [18], help to determine the weight values, they have little effect when the scale of data in the decision background is very large or the useful information given is very scarce. Many researchers set stable matching as their most important objective; however, stable matching under uncertain preference sequences has some limitations. When the preference sequence information is incomplete but strong, the Gale-Shapley algorithm still works: the solution is still stable, although the number of matching pairs may be reduced. In fact, when the preference sequence information is incomplete but strong, or weak but complete, the problem can be solved in polynomial time [19]; when the preference sequence information is both incomplete and weak, however, the problem is NP-hard, and the common approach is to relax some constraints or adopt approximation algorithms, as shown in [20, 21]. What is more, the decision objective of two-sided matching is sometimes not to obtain a stable matching, because a stable matching neither takes the benefit of both sides into consideration at the same time nor maximizes the total utility from an economic or other perspective. One common decision objective of two-sided matching from a rational economic perspective is to maximize the total utility, so a decision maker generally transforms the preference sequence information into utility values with some data processing method and then solves a typical maximum weight matching model. Li et al. replace utility with satisfaction and try to maximize its total sum.
Based on the hypothesis that the satisfaction degree decreases as the preference ordinal value grows and that its decline becomes slower and slower, Li et al. construct a transformation mechanism between preference ordinal value and satisfaction value [22]. However, in their research the preference sequence is complete, and the matching solution depends heavily on the transformation function between preference ordinal value and satisfaction value: it may change if any parameter of the function, or the function itself, changes. In view of this, we study two-sided matching under uncertain preference sequence information and put forward a new decision model that integrates the typical maximum cardinality model and the optimal weight matching model. The objectives of the integrated model are to maximize the matching number and to minimize the distance from the ideal status.

#### 3. Uncertain Preference Sequences and Ordinal Value

The two-sided matching model constructed in this paper computes preference distance of any two elements in two disjoint sets, respectively, on the basis of their preference ordinal value, so we first design a data processing method, to get the preference ordinal value in uncertain preference sequence. We denote two disjoint sets by and , and the number of elements in them is and , respectively, , , so and are the elements in and , , , , . We denote the preference sequence of to by (or ). Specifically, only when the preference sequence of to is incomplete and would not like to accept the elements not in its preference sequence do we label the preference sequence of to as ; otherwise we label the preference sequence of to as . (or ) consists of the elements in , and these elements are sorted in descending order according to the preference degree of . If has the same preference degree as some elements in (or ), we label them with parentheses. For example, and both have 5 elements, and the preference sequence of , one of the elements in , to is labeled as (or ). One of its instances is , which refers to the fact that, for , is its first preference item in , and are tied for the second item, and is the fourth item. As the preference sequence of is an incomplete preference sequence and is not in it, in order to deal with this situation, we classify incomplete preference sequence into two types: one is that after considering the elements in the preference sequence preferentially: it would like to accept the elements which are not in its preference sequence; the other one is that it would not like to accept the elements not in its preference sequence at any time. 
To the latter situation, we label it with ; for example, if we denote the preference sequence of by directly, it refers to the fact that would like to accept as its preference item, after considering , , , and preferentially; if we denote it by , it refers to the fact that would not like to accept as its preference item at any time.

We denote the real ordinal value of in (or ) by , so when is one of the elements in (or ), is ; and when is not in (or ), is . Take mentioned above as an example; the real ordinal value of , , , and is , , , and , and the real ordinal value of is . We also denote the preference ordinal value in (or ) by ; according to the data processing method designed in this paper, the transition between and is defined as follows: (a) when is , if the preference sequence of is labeled as , is , where is the number of elements in , and else if the preference sequence of is labeled as , is ; (b) when is not , if there does not exist any other element tied with in (or ), equals , and else if there exists any other element tied with in (or ), which we denote including by set , equals arithmetic average value of the real ordinal value of all the elements in , labeled as . The definition of and transition between and is presented as follows:

Similarly, we denote the preference sequence of to by (or ). Specifically, we label the preference sequence of to as only when the preference sequence of to is incomplete, and would not like to accept the elements not in its preference sequence at any time; otherwise we label it as directly. We denote the real ordinal value of in (or ) by and the preference ordinal value of in (or ) by , and the transition between and also is similar to and : (a) when is , if the preference sequence of to is labeled as , is , where is the number of elements in , and else if the preference sequence of to is labeled as , is ; (b) when is not , if there does not exist any other element tied with in (or ), equals , and else if there exists any other element tied with in (or ), which we denote including by set , equals the average value of the real ordinal value of all the elements in , labeled as . The definition of and the transition between and is presented as follows:
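The tie-handling rule in step (b) can be sketched as follows, assuming that real ordinal values are simply the positions 1, 2, 3, ... in the sequence, so that tied elements share the arithmetic average of the positions they occupy. The function name, the use of tuples to mark ties, the element names in the example, and the `unlisted_value` argument are illustrative assumptions. In particular, the value for elements outside the sequence is supplied by the caller here rather than computed by the rule in step (a).

```python
def preference_ordinals(sequence, all_items, unlisted_value):
    """Map every item on the other side to a preference ordinal value.

    sequence: entries are a single item or a tuple of tied items,
              most preferred first, e.g. ["b3", ("b1", "b4"), "b2"].
    all_items: every item on the other side.
    unlisted_value: ordinal given to items missing from `sequence`
                    (a caller-supplied placeholder, not the paper's rule).
    """
    ordinals = {}
    position = 1
    for entry in sequence:
        group = entry if isinstance(entry, tuple) else (entry,)
        # Tied items occupy consecutive positions and share their average.
        positions = range(position, position + len(group))
        average = sum(positions) / len(group)
        for item in group:
            ordinals[item] = average
        position += len(group)
    for item in all_items:
        ordinals.setdefault(item, unlisted_value)
    return ordinals


# Hypothetical sequence: b3 first, b1 and b4 tied next, b2 fourth, b5 unlisted
example = preference_ordinals(["b3", ("b1", "b4"), "b2"],
                              ["b1", "b2", "b3", "b4", "b5"],
                              unlisted_value=6)
```

In this example the two tied elements occupy positions 2 and 3 and therefore both receive 2.5.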

#### 4. Preference Distance

Because the relationship between preference ordinal value and preference utility cannot be measured exactly, no transition function between them is universally accepted. Whether linear or nonlinear, any such function depends on the specific decision background. However, the preference ordinal value is a good unit for measuring the distance from the ideal status, which we call the preference distance. The larger the ordinal value, the greater the distance and the worse the matching effect. Our model therefore first computes the preference distances from the ideal matching situation on the basis of preference ordinal values and then minimizes the total sum of these distances.

Take the elements $A_i$ and $B_j$ in $A$ and $B$, respectively, as an example. In the preference sequence of $A_i$, the preference ordinal value of $B_j$ is $a_{ij}$; in the preference sequence of $B_j$, the preference ordinal value of $A_i$ is $b_{ij}$. We use $(a_{ij}, b_{ij})$ to represent the matching status between $A_i$ and $B_j$. In the ideal matching status, $A_i$ and $B_j$ are each other's first preference items, and we label this ideal matching status as (1, 1). The preference distance of $A_i$ and $B_j$ is the distance between the real matching status and the ideal matching status of this matching pair, denoted by $d_{ij}$. According to the definition of distance given by Minkowski, the computation of $d_{ij}$ is defined as follows:

$$d_{ij} = \left( \left| a_{ij} - 1 \right|^{p} + \left| b_{ij} - 1 \right|^{p} \right)^{1/p}. \tag{3}$$

If we take the different importance of the two sides into consideration, the computation above can be modified into a weighted form (4) by introducing an importance factor that balances the two sides. In this paper, we ignore the difference and assume that the two sides have the same importance, so the importance factor is always set to 1 and the computations of preference distance in (3) and (4) are equivalent. The kind of distance obtained depends on the parameter $p$. Theoretically, $p$ can be any real number from 1 to $\infty$, but the most common values are 1, 2, and $\infty$.

When $p = 1$, the preference distance $d_{ij}$ is a one-norm distance, also called Manhattan distance: the distance between two points is the sum of the absolute differences of their Cartesian coordinates. In this case, the distance between the point $(a_{ij}, b_{ij})$ of preference ordinal values and the point (1, 1) is $a_{ij} + b_{ij} - 2$, a linear expression of the coordinates. A common criterion for evaluating a matching solution is the total sum of the preference ordinal values of all the matched elements. The total sum of preference ordinal values and the total sum of one-norm distances are linearly equivalent; namely, for a given number of matching pairs, minimizing the total sum of one-norm distances and minimizing the total sum of preference ordinal values yield exactly the same solutions. Both have a negative correlation with matching effect. One-norm distance applies to decision situations where the ranges of the preference ordinal values of the two sides do not differ significantly and the two sides use roughly the same metric, so that a simple additive relationship can represent the overall matching effect.
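The linear equivalence can be checked in one line. In the sketch below, $a_{ij}$ and $b_{ij}$ stand for the preference ordinal values of a pair and $M$ for the set of matched pairs; these symbols are chosen for this illustration. Because every ordinal value is at least 1, the absolute values can be dropped:

```latex
\sum_{(i,j) \in M} \bigl( |a_{ij} - 1| + |b_{ij} - 1| \bigr)
  = \sum_{(i,j) \in M} \bigl( a_{ij} + b_{ij} \bigr) - 2\,|M| .
```

For a fixed number of matched pairs $|M|$, the two totals differ only by the constant $2|M|$, so they are minimized by the same matchings.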

When $p = 2$, the preference distance $d_{ij}$ is a two-norm distance, also called Euclidean distance: the distance between two points is the length of the line segment connecting them, computed by the Pythagorean formula. In this case, the distance between the point $(a_{ij}, b_{ij})$ of preference ordinal values and the point (1, 1) is $\sqrt{(a_{ij} - 1)^2 + (b_{ij} - 1)^2}$, which is exactly their Euclidean distance. Since the coordinate values are generally integers, computing the distance in this case involves floating-point arithmetic, so from the perspective of solving efficiency the solving time increases rapidly as the scale of data grows. Two-norm distance therefore applies to situations where the dimensions represented by the two sides are independent and the data scale is within an acceptable range.

When $p = \infty$, the preference distance $d_{ij}$ is a positive-infinity-norm distance, also called Chebyshev distance: the distance between two points is the greatest of their differences along any coordinate dimension. In this case, the distance between the point $(a_{ij}, b_{ij})$ of preference ordinal values and the point (1, 1) equals the larger of $a_{ij} - 1$ and $b_{ij} - 1$. Geometrically, if one point is taken as the origin of coordinates, the points at Chebyshev distance $r$ from the origin make up a square. The origin is the center of this square, the length of each side is $2r$, and each side is parallel to a coordinate axis. Positive-infinity-norm distance applies to situations where balanced performance of the two sides is important and the difference between the two sides should not be too large.
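The three cases can be computed with one small function. This is a minimal sketch, assuming the unweighted Minkowski form measured from the ideal status (1, 1); the function name and the sample ordinal values 4 and 5 are chosen for illustration.

```python
import math


def preference_distance(a, b, p):
    """Minkowski distance between the matching status (a, b) and the ideal (1, 1).

    a, b: preference ordinal values of the two elements for each other (>= 1).
    p: norm parameter, a real number >= 1 or math.inf.
    """
    if p < 1:
        raise ValueError("p must be at least 1")
    da, db = abs(a - 1), abs(b - 1)
    if math.isinf(p):
        return max(da, db)          # Chebyshev distance
    return (da ** p + db ** p) ** (1 / p)


one_norm = preference_distance(4, 5, 1)         # 3 + 4
two_norm = preference_distance(4, 5, 2)         # sqrt(3**2 + 4**2)
inf_norm = preference_distance(4, 5, math.inf)  # max(3, 4)
```

For the same pair the distance shrinks as `p` grows, from 7 under the one-norm to 5 under the two-norm and 4 under the positive-infinity norm, so the choice of norm changes how strongly an unbalanced pair is penalized relative to a balanced one.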

#### 5. Modeling Construction and Solving

In the typical optimal weight matching model, the main constraint is that any element in one set can match at most one element in the other set, and the objective is to optimize the total sum of weight values. For example, if we take the preference distance of each matching pair as its weight value, the objective is to minimize the total sum of the preference distances of all the matching pairs. The matching number is also a constraint, although its value is obvious in the typical model: if the two sides have the same number of elements, the matching number equals the number of elements on each side; otherwise, it equals the smaller of the two. In the model put forward in this paper, to prevent some matching pairs from performing too badly, we set a threshold on matching performance; namely, the preference distance of every matching pair must not exceed a value set in advance. Because of this new constraint on matching performance, the matching number is no longer a constant; it depends on how restrictive the threshold is. In this context, we set two objectives in the decision model: one is to maximize the number of matching pairs, and the other is to minimize the total sum of the preference distances of all matching pairs.

We use $x_{ij}$ to denote the matching relationship between $A_i$ and $B_j$, $x_{ij} \in \{0, 1\}$. If $x_{ij}$ is 1, $A_i$ and $B_j$ match each other and make up a matching pair; if $x_{ij}$ is 0, $A_i$ and $B_j$ do not match each other and do not make up a matching pair. The constraint that any element in one set can match at most one element in the other set is presented as follows, where $m$ and $n$ are the numbers of elements in the two sets:

$$\sum_{j=1}^{n} x_{ij} \le 1, \quad i = 1, \ldots, m; \qquad \sum_{i=1}^{m} x_{ij} \le 1, \quad j = 1, \ldots, n.$$

We denote the maximum value of the preference distance $d_{ij}$ by $d_{\max}$ and the minimum value by $d_{\min}$; their specific values depend on the parameter $p$ and the preference sequences. We also denote by $\theta$ the threshold factor of matching performance, $\theta \in [0, 1]$. When $\theta = 1$, there is no constraint on the threshold; as $\theta$ decreases, the constraint becomes stricter; and when $\theta = 0$, the constraint is the strictest. The constraint is presented as follows:

$$d_{ij}\, x_{ij} \le d_{\min} + \theta \left( d_{\max} - d_{\min} \right), \quad i = 1, \ldots, m, \; j = 1, \ldots, n.$$

Since an element should not be matched with an element that is not in its preference sequence and that it is unwilling to accept, we add a further constraint: if the preference ordinal value of $B_j$ for $A_i$ equals the value assigned to an unacceptable element, $x_{ij}$ must be zero; likewise, if the preference ordinal value of $A_i$ for $B_j$ equals that value on the other side, $x_{ij}$ must also be zero. 
As the value assigned to an unacceptable element is the maximum that a preference ordinal value can take on each side, and the second-largest possible value on each side is also known, this constraint can be expressed as a linear inequality in the 0-1 matching variables $x_{ij}$. Maximizing the matching number and minimizing the total sum of the preference distances $d_{ij}$ are presented as follows:

$$\max \sum_{i=1}^{m} \sum_{j=1}^{n} x_{ij}, \qquad \min \sum_{i=1}^{m} \sum_{j=1}^{n} d_{ij}\, x_{ij}.$$

As this is a multiobjective optimization problem, there are generally three ways to deal with it. The first is to transform the multiobjective problem into a single-objective problem, for example with the simple linear additive weight method, the maximal-minimal method, or the TOPSIS method; the critical point of this approach is to ensure that the optimal solution of the new single-objective problem is also a noninferior solution of the original multiobjective problem. The second is to transform it into several single-objective problems in a particular order, for example with the hierarchical method or the interactive programming method; these single-objective problems are solved one by one, and the optimal solution of the last one is the optimal solution of the original problem. The third comprises some nonuniform methods, such as the multiplication division method and the efficiency coefficient method. In this paper, we hold that maximizing the number of matching pairs is the main objective and minimizing the total sum of preference distances is the secondary objective, so we use the second way. Namely, we first obtain the optimal solution of the main objective, then add it as a new constraint, and finally obtain the optimal solution of the secondary objective. These two objectives are integrated into a single linear expression as follows:

$$\max \; Z = M_1 \sum_{i=1}^{m} \sum_{j=1}^{n} x_{ij} - M_2 \sum_{i=1}^{m} \sum_{j=1}^{n} d_{ij}\, x_{ij}, \qquad M_1 \gg M_2,$$

where $M_1$ and $M_2$ are both positive real numbers and “$M_1 \gg M_2$” means that $M_1$ is far greater than $M_2$. We could also give specific values to $M_1$ and $M_2$ if the importance of the two objectives could be measured exactly, but that is beyond the scope of this paper. 
In conclusion, the decision model constructed in this paper consists of the integrated objective above, subject to the one-to-one matching constraint, the threshold constraint on matching performance, and the constraint excluding unacceptable elements, with every matching variable restricted to 0 or 1.

In the typical maximum weight matching model, because the maximum matching number is constant and obvious, the model is easily converted into a standard assignment model. In the standard assignment model, the two sides have the same number of elements and every element in one set is matched with one element in the other set; zero elements are added where necessary to balance the sizes of the two sides. When the scale is not too large, the Hungarian method is one common approach to solving this kind of model, and it rests on the following two theorems: (a) if a constant is added to or subtracted from every element of one row or column of the efficiency matrix, the optimal solutions of the original matrix and the new matrix are the same; (b) the maximum number of independent zero elements in the efficiency matrix equals the minimum number of lines that cover all the zero elements. Many studies have examined and improved this method and provide program code in different languages and platforms, such as C, Java, and MATLAB, so we do not discuss it further here.

For nonlinear optimization, the common solving approach is a heuristic algorithm, such as a genetic algorithm, simulated annealing, or tabu search. However, if we first compute the preference distances from the preference ordinal values and then input the resulting distance matrix together with the remaining parameter values, all the constraints and objectives are linear expressions in the matching variables. The model we construct is therefore a standard 0-1 integer linear programming model, which can be solved with the branch-and-bound method, even when the scale of data is very large. The branch-and-bound method is an algorithm design paradigm for discrete optimization problems as well as general real-valued problems. It was first proposed by Land and Doig in 1960 for discrete programming [23] and has become the most commonly used tool for solving integer programming and NP-hard problems. It has two procedures, branching and bounding: branching divides the original problem into subproblems whose solution sets together cover all the feasible solutions of the original problem, and bounding computes an upper bound and a lower bound for the optimal objective value. The main idea of the branch-and-bound algorithm is to raise the lower bound and lower the upper bound iteratively until the optimal value is obtained. The method can be further classified into specific types according to the branching search strategy. Modern linear programming software, such as CPLEX, solves integer programming models with a branch-and-bound package. In this paper, we also use this algorithm to solve our model and analyze the solutions in the following section.
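To make the branching and bounding steps concrete, the following is a small depth-first sketch for the two-level objective: first maximize the number of pairs, then minimize the total preference distance, with pairs above a distance threshold excluded. It is an illustrative pure-Python search for tiny instances, not the LP-based branch-and-bound used by solvers such as CPLEX. The function name, the `threshold` argument, and the sample matrix `DIST` are assumptions made for this example. Distances are assumed to be nonnegative, which the bounding step relies on.

```python
import math


def solve_matching(dist, threshold=math.inf):
    """Return (pair_count, total_distance, pairs) for a distance matrix.

    dist[i][j] is the nonnegative preference distance of row element i and
    column element j; pairs with dist[i][j] > threshold are not allowed.
    """
    m, n = len(dist), len(dist[0])
    best = {"count": -1, "total": math.inf, "pairs": []}

    def branch(i, used_cols, pairs, total):
        count = len(pairs)
        if i == m:  # every row decided: compare lexicographically
            if (count, -total) > (best["count"], -best["total"]):
                best.update(count=count, total=total, pairs=list(pairs))
            return
        # Bounding: optimistic number of pairs reachable from this node.
        max_count = count + min(m - i, n - len(used_cols))
        if max_count < best["count"]:
            return
        if max_count == best["count"] and total >= best["total"]:
            return
        # Branching: pair row i with each admissible free column, cheapest first.
        options = sorted((dist[i][j], j) for j in range(n)
                         if j not in used_cols and dist[i][j] <= threshold)
        for d, j in options:
            used_cols.add(j)
            pairs.append((i, j))
            branch(i + 1, used_cols, pairs, total + d)
            pairs.pop()
            used_cols.remove(j)
        # Remaining branch: leave row i unmatched.
        branch(i + 1, used_cols, pairs, total)

    branch(0, set(), [], 0.0)
    return best["count"], best["total"], best["pairs"]


# Hypothetical 3-by-3 distance matrix
DIST = [[1, 4, 7],
        [2, 5, 7],
        [6, 3, 8]]

unconstrained = solve_matching(DIST)
thresholded = solve_matching(DIST, threshold=5)
```

With no threshold all three elements are matched. With the threshold set to 5, the third column has no admissible partner, so at most two pairs can be formed and the search returns the cheapest such pair of matches.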

#### 6. Numerical Cases and Analysis

We first give a numerical case, named Case 1, and each side of it has 10 elements. Set , , and their preference sequences and are shown in Table 1. The real ordinal value of in , the preference sequence of , makes up a matrix, labeled as , and the preference ordinal value also makes up a matrix, labeled with . Similarly, the real ordinal value of in , the preference sequence of , makes up a matrix, labeled as , and the preference ordinal value also makes up a matrix, labeled with .