An erratum for this article has been published.

Advances in Operations Research

Volume 2011, Article ID 476939, 20 pages

http://dx.doi.org/10.1155/2011/476939

## Inapproximability and Polynomial-Time Approximation Algorithm for UET Tasks on Structured Processor Networks

^{1}Laboratoire G-SCOP, 46 avenue Félix Viallet, 38031 Grenoble Cedex 1, France

^{2}LIRMM, 161 rue Ada, UMR 5056, 34392 Montpellier Cedex 5, France

Received 26 October 2010; Revised 22 March 2011; Accepted 4 April 2011

Academic Editor: Ching-Jong Liao

Copyright © 2011 M. Bouznif and R. Giroudeau. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

We investigate complexity and approximation results on processor networks where the communication delay depends on the distance between the processors performing the tasks. We prove that there is no heuristic with a performance guarantee smaller than 4/3 for makespan minimization under precedence constraints on a large class of processor networks, such as the hypercube, grid, and torus, with a fixed diameter. We extend the complexity results to the case where the precedence graph is bipartite. We also design an efficient polynomial-time approximation algorithm for makespan minimization on processor networks, whose ratio depends on the diameter of the network.

#### 1. Introduction

##### 1.1. Problem Statement

In this paper, we consider the processor network model, which generalizes the homogeneous scheduling delay model. In the homogeneous model, task allocation on the processors has no influence on the schedule length: since the graph of processors (whose vertices are the processors and whose edges are the links between them) is fully connected, the starting time of a task depends only on the potential communication delay, given by the precedence graph, between the task and its own predecessors.

In the processor network model, this assumption is relaxed in order to take into account the fact that the processor graph may not be fully connected. Thus, task allocation on the processors can be expressed by its essential and fundamental characteristics. We consider a model in which a distance function between two processors of the processor graph (defined hereafter) affects the communication delay between two tasks subject to a precedence constraint, and consequently the starting time of the successor task. When the predecessor is allotted to one processor and the successor is executed on another, the communication time used to compute the starting time of the successor is assumed to be the distance between the two processors multiplied by the communication delay given by the precedence graph.

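Formally, the processor network model may be defined as $\forall (i,j) \in E,\ t_j \ge t_i + p_i + \ell(\pi^i, \pi^j) \times c_{ij}$, where $E$ is the set of arcs of the precedence graph, $\pi^i$ (resp. $\pi^j$) represents the processor on which task $i$ (resp. task $j$) is scheduled, $t_i$ represents the starting time of task $i$, $p_i$ represents the processing time of task $i$, $\ell(\pi^i, \pi^j)$ represents the length of the shortest path in the graph of processors between $\pi^i$ and $\pi^j$, and $c_{ij}$ represents the communication delay if the two tasks are executed on two neighboring processors (this value is given by the precedence graph).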
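The model described above requires that a task start no earlier than the completion time of each predecessor plus the processor distance multiplied by the communication delay. The following is a minimal sketch of a feasibility check under that reading, assuming a connected undirected processor graph given as an adjacency dictionary. The function names `shortest_path_lengths` and `is_feasible`, and the three-processor chain used as an example, are illustrative assumptions and do not come from the article.

```python
from collections import deque

def shortest_path_lengths(adj, source):
    """BFS distances (number of edges) from `source` in the processor graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def is_feasible(adj, arcs, start, proc, p=None, c=None):
    """Check start[j] >= start[i] + p[i] + dist(proc[i], proc[j]) * c[(i, j)]
    for every arc (i, j), and that no processor runs two tasks at once."""
    p = p or {task: 1 for task in start}   # unit execution times by default
    c = c or {arc: 1 for arc in arcs}      # unit communication times by default
    dist = {u: shortest_path_lengths(adj, u) for u in adj}
    for (i, j) in arcs:
        delay = dist[proc[i]][proc[j]] * c[(i, j)]
        if start[j] < start[i] + p[i] + delay:
            return False
    by_proc = {}
    for task, q in proc.items():
        by_proc.setdefault(q, []).append((start[task], start[task] + p[task]))
    for intervals in by_proc.values():
        intervals.sort()
        for (_, end1), (start2, _) in zip(intervals, intervals[1:]):
            if start2 < end1:
                return False
    return True

# Hypothetical instance: a chain of three processors 0 - 1 - 2,
# and tasks a -> c and b -> c.
chain = {0: [1], 1: [0, 2], 2: [1]}
arcs = [("a", "c"), ("b", "c")]
ok = is_feasible(chain, arcs, {"a": 0, "b": 0, "c": 2}, {"a": 0, "b": 2, "c": 1})
too_early = is_feasible(chain, arcs, {"a": 0, "b": 0, "c": 1}, {"a": 0, "b": 2, "c": 0})
```

In this example, placing task `c` on the middle processor lets it start at time 2, whereas placing it next to `a` forces it to wait for the message from `b`, which has to travel a distance of two.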

We consider the classic UET-UCT scheduling problem (Unit Execution Time-Unit Communication Time, i.e., every task has a unit processing time and every precedence arc carries a unit communication delay) on a bounded number of processors such that the processor network is a structured graph with a given diameter. In these topologies, the processors are numbered, and a processor may communicate with another processor at a cost equal to the length of the shortest path between them in the processor graph. The communication delay is therefore the distance function proposed above.

In scheduling theory, a problem type is categorized by its machine environment, job characteristics, and objective function. Thus, using the *three-field* notation scheme $\alpha|\beta|\gamma$ (where $\alpha$ designates the processor environment, $\beta$ the characteristics of the jobs, and $\gamma$ the criterion) proposed by Graham et al. [1], we consider the problem of makespan minimization (the makespan is denoted in what follows by $C_{\max}$) with unitary tasks and unitary communication delays (UET-UCT) in the presence of a precedence graph on a processor network in which the communication delay depends on the shortest path in the processor graph. This problem is denoted by .

*Example 1.1. *Figure 1 shows the difference between the two problems and . (The relationship between processors is as follows: and are connected to and .) The processing time of the tasks and the communication delay between the tasks are unitary (UET-UCT problem). Gantt diagram represents an optimal solution for the problem. We can notice that task can be executed on any processor at . Moreover, Gantt diagram represents an optimal solution for the problem . In order to obtain an optimal solution, the task must be delayed by one unit of time and must be processed on the same processor as task at . Thus, task may be executed at only on the processor .

##### 1.2. Organization of the Paper

This paper is organized as follows: the next section is devoted to related work. In Section 3, after defining the class of graphs, we propose a general nonapproximability result for an unspecified precedence graph. We also extend this result to the case where the precedence graph is bipartite and to the case where duplication is allowed. In the last section, we design a polynomial-time approximation algorithm whose performance ratio depends on the diameter of the processor graph.

#### 2. Related Works

##### 2.1. Complexity Results

To the best of our knowledge, the first complexity result was given by Picouleau [2]. The considered problem was to schedule unit execution time tasks with a precedence graph on an unbounded number of processors and on a chain or star (a star is a tree of depth one) topology. Picouleau proved that this problem is NP-complete if the precedence graph is a tree or an outtree. Recently in [3], the authors proved that there is no heuristic with a performance guarantee smaller than 6/5 for minimizing the makespan on a processor network represented by a star. This model is close to the master-slave architecture. In [4], the authors proved that there is no hope of finding a polynomial-time approximation algorithm with a ratio for the problem of scheduling a set of tasks on a ring or a chain as processor network (see Table 1).

###### 2.1.1. Approximation Results

In ring topology, Lahlou developed, in [5], using the list scheduling proposed by Rayward-Smith [6], a -approximation algorithm with where is the number of processors.

Moreover, Hwang et al. [7] studied approximation list algorithms for scheduling problems where the communication times depend on contention and on a distance function between the processors that execute the tasks involved. The authors examined a simple strategy called *extended list scheduling, ELS*, which is a straightforward extension of list scheduling. They proved that the ELS strategy is unsatisfactory and proposed an improved strategy called *earliest task first*.

Recently, in [3] the authors proposed a sophisticated three-step polynomial-time approximation algorithm with a ratio equal to four for the makespan minimization problem on a processor network forming a star. In [4] the authors developed two polynomial-time approximation algorithms for processor networks with limited or unlimited resources.

##### 2.2. Our Contributions

In this paper, we answer the following question: *is there a large class of graphs for which there exists a polynomial-time reduction from 3-PARTITION showing NP-completeness? If so, it is sufficient to show that the processor graph belongs to this class in order to prove the nonexistence of a polynomial-time approximation algorithm with a performance guarantee smaller than 4/3.* In order to complete the study of processor networks, we design a polynomial-time approximation algorithm whose ratio depends on the diameter of the processor graph.

#### 3. Computational Complexity for a Large Class of Graphs

##### 3.1. The Class of Graphs

We propose a large class of graphs for which the scheduling decision problem is NP-complete.

We now present a graph class for which the same polynomial-time transformation from the 3-PARTITION problem applies, showing that our scheduling problem is NP-complete whenever the processor network belongs to this class. Hereafter, we give the definition of the prism graph.

*Definition 3.1. *A prism of size and length () is a connected undirected graph for that (i)there are two sets of vertices and such as , , and . The vertices are denoted (resp. ); (ii)it exists an order on and vertices such that () there is a path of length denoted between and ; (iii).

Moreover, the size of a prism is polynomial in . An illustration is given in Figure 2.

*Definition 3.2. *Let be a collection of graphs. possess the prism property if and only if , such that contains *a unique* subgraph of induced by vertices with a prism of size and length .

Lemma 3.3. *The class graph is not empty.*

*Proof. *In particular, we will see in Section 3.2 that classic structured graphs such as the torus, grid, and complete binary tree belong to this class of graphs.

Theorem 3.4. *The problem of deciding whether an instance of ; has a schedule of length at most two is polynomial with and .*

*Proof. *No communication is allowed between two pairs of tasks.

The remainder of this section is devoted to proving Theorem 3.5.

Theorem 3.5. *The problem of deciding whether an instance of has a schedule of length at most three is NP-complete with .*

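*Proof. *The proof is established by a reduction from the 3-PARTITION problem [8].

*Instance.* A finite set $A$ of $3M$ elements $\{a_1, \ldots, a_{3M}\}$, a bound $B \in \mathbb{Z}^{+}$, and a size $s(a) \in \mathbb{Z}^{+}$ for each $a \in A$ such that each $s(a)$ satisfies $B/4 < s(a) < B/2$ and such that $\sum_{a \in A} s(a) = MB$.

*Question 1.* Can $A$ be partitioned into $M$ disjoint sets $A_1, \ldots, A_M$ such that, for all $1 \le i \le M$, $\sum_{a \in A_i} s(a) = B$?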

3-PARTITION is known to be NP-complete in the strong sense [8]. (Even if the bound is polynomially bounded by the instance size, the problem is still NP-complete.)

It is easy to see that the scheduling problem belongs to NP.
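To make the source problem of the reduction concrete, here is a small brute-force sketch of 3-PARTITION as stated in Garey and Johnson: it searches for a grouping of the sizes into triples that each sum to the bound. The function name `three_partition` and the sample sizes are illustrative assumptions. The search is exponential in the worst case, as expected for a strongly NP-complete problem, and the sketch forces triples directly instead of relying on the condition that every size lies strictly between a quarter and a half of the bound.

```python
from itertools import combinations

def three_partition(sizes, bound):
    """Return a list of triples that each sum to `bound`, or None if none exists."""
    if len(sizes) % 3 != 0 or sum(sizes) != bound * (len(sizes) // 3):
        return None

    def solve(remaining):
        if not remaining:
            return []
        # The first (largest) element must belong to some triple: try all partners.
        first, rest = remaining[0], remaining[1:]
        for a, b in combinations(range(len(rest)), 2):
            if first + rest[a] + rest[b] == bound:
                left = [x for k, x in enumerate(rest) if k not in (a, b)]
                tail = solve(left)
                if tail is not None:
                    return [(first, rest[a], rest[b])] + tail
        return None

    return solve(sorted(sizes, reverse=True))

# Hypothetical instance with twelve elements and bound 90.
sample = [20, 23, 25, 30, 49, 45, 27, 30, 30, 40, 22, 19]
triples = three_partition(sample, 90)
```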

Given an instance of the 3-PARTITION problem, we construct an instance of the scheduling problem with , in the following way.

The precedence graph , which will be scheduled on the processors network , is decomposed into two disjointed graphs, denoted as follows by and (the graph is a collection of graphs , i.e., ). Hereafter, graphs and are characterized.

*Graph *

Let be an integer such that . Graph consists of vertices denoted by , , where . The precedence constraints between these tasks are defined as follows: (i)arcs for any , , (ii)arcs for any , , (iii)arcs for any , .

*Remark 3.6. *Valid scheduling of length three for the case where the precedence graph is in a path of processors is as follows, for any , , (i)tasks and are executed on , (ii)tasks are executed at time , for any , if is even, (iii)tasks are otherwise executed at time , for any .

See Figure 3 for graph and Figure 4 for the valid scheduling described in Remark 3.6.

*Graph **Remark 3.7. *A path of length admits vertices.

The graph will be defined as follows. Let be a graph such that , with . By Definition 3.2, we know that it exists a unique subgraph of size and length with desired properties. In the following we set and and the size of is polynomial in . Note that .

The -graph is defined by polynomial-time transformations from the -graph. The graph given in Figure 5 will be used to illustrate the following construction. (i) The paths of length three are created and precedence constraints are added (see Figure 6). The two sets of tasks and are created. (ii) The tasks are partitioned into three subsets , , and (see Figure 7). (iii) The -tasks are now partitioned into two subsets and . We consider the subgraph induced by the -tasks (see Figure 8) as the graph.

The purpose of removing these tasks is to allow the tasks of -graph when the tasks of -graph, deprived of these tasks, will be executed on the graph of processors.

The set of vertices is partitioned into two sets : (i) the vertices of , and defined the vertices of the unique paths of length respecting the characteristics given by Definition 3.1, (ii), the set of an other vertices. Note that these vertices do not belong to graph.

The definition of the graph is given below. (i), we create a path of length three , and , with edges . The set of tasks will be denoted . The cardinality of is (see Figure 6). (ii), we create a path of length three . This set of tasks will be denoted . The number of tasks is with . (iii), we add the edges and (see Figure 6).

Now, tasks are removed from -graph. (In order to clarify the polynomial-time transformation, we give priority to create tasks and remove some ones instead of enumerating all precedence constraints.) Therefore, we consider the following index sets: (i), (ii), (iii), (iv).

We remove from the -set the following tasks , with , (resp. , with ). denotes the set of removed tasks (see Figure 7). Finally, we put with (see Figure 8).

Figures 5, 6, 7, and 8 describe the construction of -graph from .

is the set of arcs as described above.

Lastly, the number of processors is , and they are numbered as with .

In summary the precedence graph is composed by with tasks and the precedence constraints given before and the graph with tasks.

The transformation is computed in polynomial time.

(i) Let us assume that can be partitioned into disjoint subsets with each summing up to . We will then prove that there is a schedule of length at most three.

Let us construct this schedule.

First, the task is executed on the processors to with (if this task exists).

Consider the processors on which the set of -tasks are scheduled. By the previous allocation, these processors are numbered as .

Let be a partition of . Consider with a fixed . The tasks of , are executed between processors and . Moreover, the tasks , , (resp., , ) are scheduled on processors in succession in order to respect a schedule of length three.

Thus without loss of generality, we suppose that the tasks of are scheduled between processors and . In similar way, the tasks (resp., ) are executed between processors and (resp. and ).

(ii) Let us assume now that there is a schedule of length at most three. We will prove that can be partitioned into disjoint subsets with each summing up to .

Lemma 3.8. *In any valid schedule of length three there is no idle time.*

*Proof. *The number of processors is and the number of tasks is ( for -graph and for graph).

Lemma 3.9. *In any valid schedule of length three, the subgraph induced by tasks must be executed on processors in succession.*

*Proof. *Consider the subgraph induced by the tasks. This precedence graph admits paths of length two and these paths must be executed on the same processor (no communication delay is allowed).

Consider the tasks of path of length one. Let be a task without predecessor. By construction admits one successor denoted by .

Suppose that these two tasks are allotted on the same processor . Since admits another predecessor denoted by , is allotted at .

The task cannot be executed at on since this task admits another successor as . Therefore, there exists an idle slot at on the processor . By construction there is no independent task, and since the graph admits only paths of length one, no task can be allotted on this idle slot. This is impossible.

In conclusion, the subgraph induced by tasks must be executed on processors in succession.

Lemma 3.10. *In any valid schedule of length three, two subgraphs induced by the tasks from two disjoint paths of length cannot be allotted on the same processors.*

*Proof. *Consider the tasks which are elements of two disjoint paths of length . A task without predecessor of one path cannot be allotted on the same processor as a task without successor of the other path since there is no isolated task to schedule.

Lemma 3.11. *In any valid schedule of length three the tasks must be executed on the same processors as the tasks.*

*Proof. *Let tasks allotted on be the set of processors on which the tasks are executed.

Suppose that the -tasks are executed on processors . By Lemma 3.8, there is no idle slot, then the tasks on the path of length three are necessarily allotted on processor . This is impossible by Lemma 3.9.

With the previous lemmas, we know that tasks (the tasks and the -tasks) are executed on the disjoint paths of length . By Definition 3.2, we know that the graph admits a unique set of disjoint paths of length with desired properties. Moreover with the precedence constraints, these tasks are allotted on a processor path of length . Without loss of generality, we suppose that a task is executed on the processor with .

Building the partition with desired property from schedule of length three, we know that two tasks of the same subgraph (see Lemma 3.11) cannot be executed on two different paths. The edge distance between these two processors is at least two.

We define such that if and only if the tasks of the graph are executed between the processors numbered as to with a fixed .

Now, we will compute .

Using previous remarks, without loss of generality, we suppose that with and (if it exists) are executed on with . Consider the -tasks which are scheduled between processors and for a fixed except the index such that paths of length three constituted by tasks from , are allotted on .

Using Lemma 3.9, we know that the number of tasks executed on processors and for a fixed is .

In conclusion we have which forms a with desired properties.

The construction suggested previously can be easily adapted to obtain a bipartite graph of depth one. Moreover, from the proof of Theorem 3.5, we can derive the following theorem.

Theorem 3.12. *The problem of deciding whether an instance of has a schedule of length at most three is NP-complete with .*

*Proof. *The proof is similar to the proof of Theorem 3.5 by considering the graph instead of widget . Nevertheless each path of length two induced by the tasks is transformed into two paths of length one.

We use the same construction as it is proposed for the proof of Theorem 3.5. Nevertheless, all paths of length three are transformed into two paths in the following way: and . These three must be executed on the same processors. Indeed, if admits several predecessors, it is obvious. Otherwise, suppose that is allotted on a processor . So must be executed at on . The task is scheduled at on a neighborhood processor. Therefore no task from the graphs and can be executed on processor at . Now using the same arguments as previously there is a schedule of length three if and only if the set can be partitioned into disjoint subsets each summing up to .

The proof of Theorem 3.5 therefore implies that the problem where the tasks can be duplicated is also NP-complete.

Corollary 3.13. *The problem of deciding whether an instance of with has a schedule of length at most three is NP-complete with .*

*Proof. *The proof comes directly from Theorems 3.5 and 3.12. In fact, Lemma 3.8 implies that no task can be duplicated (the number of the tasks is equal to the number of processors times 3).

Moreover, nonapproximability results can be deduced.

Corollary 3.14. *No polynomial-time algorithm exists with a performance bound less than 4/3, unless P = NP, for the problems ; ; with .*

*Proof. *The proof of Corollary 3.14 is an immediate consequence of the impossibility theorem; see [9, page 4].

##### 3.2. Discussion

In the previous section, we proposed a class of graphs for which the problem of deciding whether an instance of ; has a schedule of length at most three is NP-complete with and .

Hereafter, we exhibit the parameters for some classic structured graphs in order to prove that the class of graphs is not empty. (i) For a grid (, where the couple designates the position in the line; ) (or torus) topology, we need lines and columns. The set of vertices for the graph a subgraph of with the desired properties given by Definition 3.2 is and . (ii) For the complete binary tree, it is sufficient to consider a tree with height of . (iii) For the hypercube topology (or cube connected cycles), it is sufficient to have . (iv) ….

#### 4. An Approximation Algorithm for Processor Networks with a Fixed Diameter

##### 4.1. Description and Correctness of an Algorithm

In order to design an efficient polynomial-time approximation algorithm, the classic strategy consists of taking an instance of the combinatorial optimization problem and applying some transformations and/or using polynomial-time algorithms as subroutines (shortest path, spanning tree, maximum matching, etc.). Afterwards, it is sufficient to evaluate the best lower bound for any optimal solution, and this lower bound may be compared to the feasible solution for the combinatorial optimization problem in order to determine the ratio of an approximation algorithm.

Here, instead of considering an instance and trying to directly develop a feasible solution for the problem, we consider a partial instance of our scheduling problem. (An instance is constituted by a precedence graph with unit execution times and unit communication times, together with the processors in graph form and the distance function; the partial instance is constituted only by the precedence graph with unitary tasks and unitary communication times.) For any such partial instance, we use the classic approximation algorithm proposed by Munier and König [10] for the corresponding problem on an unbounded number of processors. We obtain a feasible schedule for that problem (we omit consideration of the processor graph for the moment). Nevertheless, this solution is not feasible for our scheduling problem.

We proceed with a polynomial-time chain of transformations, from this first schedule to a final schedule, in order to get a feasible schedule. It is only in the last step, for the final schedule, that we guarantee feasibility for our problem.

This chain is defined as follows: (The schedule is a feasible solution for the problem.), where is the Munier-König algorithm [10], the dilatation algorithm (see [11] for details or Appendix A) and the folding algorithm (see [12] for details or Appendix B).

Subsequently, we will consider the three following scheduling problems:(i);; , (ii); , (iii)and finally .

The principal steps of the algorithm are described below.

The approximation algorithm uses three steps. In each step we apply an algorithm for a specified scheduling problem [10–12]. In the first two steps, a schedule is produced that is not feasible for our problem. (i) In the first step, a schedule on an unbounded number of processors is produced for the UET-UCT scheduling problem. For this problem, Munier and König [10] presented a 4/3-approximation algorithm that is based on an integer linear programming formulation. They use the following procedure: an integrality constraint is relaxed, and a feasible schedule is produced by rounding. (ii) The second step produces a second schedule, also on an unbounded number of processors, from the first one by applying the dilatation principle proposed in [11] (this algorithm produces a feasible schedule for the large communication delay problem from a schedule with unitary communication delays). (iii) The third step produces a schedule on the topology, feasible for our problem, from the second schedule using the folding principle [12]. The folding procedure constructs a feasible schedule on a restricted number of processors from a feasible schedule on an unbounded number of processors.

Note that the length of schedule is less than , which is less than . The three steps are summarized in Figure 9. The notation description is given in the proof of Theorem 4.2.

Theorem 4.1. *The previous algorithm yields a feasible schedule for the problem .*

*Proof. *Proof is clear from the previous discussion concerning the description of an algorithm. Indeed, the communication delay is preserved and the precedence constraint is respected. Moreover, at most tasks are executed at any time.

##### 4.2. Relative Performance Analysis

Theorem 4.2. *The problem may be approximated within a factor of $(\delta + 1)^2/3 + 1$, where $\delta$ is the diameter of the processor graph, using the previous algorithm.*

*Proof. *We denote using with , --, and the length of the schedule. Moreover (resp., ) designates the performance ratio on a processor network model with a bounded (resp., unbounded) number of processors.

Now let us examine the relative performance of this algorithm.

(i) The first step deals with the UET-UCT problem on an unbounded number of processors. The *Schedule* (UET-UCT,∞) is not optimal: the algorithm from [10] gives a 4/3 relative performance. And so, by [10], we know that the length of this schedule is at most 4/3 times the optimal makespan for the UET-UCT problem on an unbounded number of processors.

(ii) In the second step, a feasible solution for a large communication delay $\delta$ (recall that $\delta$ stands for the diameter of the processor network) is created. This solution comes from using the dilatation algorithm. Then, the expansion coefficient is $(\delta + 1)/2$ ([11]). And so, the length of the dilated schedule is at most $(\delta + 1)/2$ times the length of the schedule produced in the first step.

Thus, we have a schedule on a UET-LCT task system with a communication delay equal to $\delta$ and an infinite number of processors.

By definition it is obvious that the optimal makespan with unit communication delays is at most the optimal makespan with communication delays equal to $\delta$.

It is necessary to evaluate the gap between the optimal length for the schedule on a fully connected processor graph and a processor graph with a diameter of length $\delta$. For this, we consider unitary tasks subject to precedence constraints and an unbounded number of processors.

Lemma 4.3. *The gap between a schedule on a fully connected graph of processors with a large communication delay $\delta$, for all pairs of tasks, and a schedule on a graph of processors with a diameter of length $\delta$, is at most $(\delta + 1)/2$.*

*Proof. *We need to compare first the relative performance of this schedule on our model with network processor. The relative performance for the UET-LCT task system is not valid for our model. We need to compute a new bound for this schedule on our model.

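Let $x_1 \to x_2 \to \cdots \to x_k$ be a critical path of the schedule (i.e., a path that gives the length of the schedule). Suppose that there is a communication delay between each pair of consecutive tasks $x_i$ and $x_{i+1}$ with $1 \le i \le k - 1$. In the UET-LCT task system (with a communication delay equal to $\delta$ for all pairs of tasks) the length of the schedule would be $k + (k-1)\delta$ units of time. In the graph of processors with a diameter of length $\delta$, the same path allows a length of $k + \sum_{i=1}^{k-1} d_i$ units of time, where $d_i \in \{1, \ldots, \delta\}$ is the distance between the processors executing $x_i$ and $x_{i+1}$. The worst case of the length for this path is $k + (k-1)\delta$ and the best case is $k + (k-1) = 2k - 1$. So, the ratio is $(k + (k-1)\delta)/(2k - 1)$. For large $k$, this ratio tends to $(\delta + 1)/2$ from below, and we obtain the desired result.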

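By applying Lemma 4.3, which is valid for all schedules, and in particular for the optimum, we obtain that the optimal makespan with communication delays equal to $\delta$ (the diameter) on a fully connected graph is at most $(\delta + 1)/2$ times the optimal makespan on the processor graph, and so the performance ratio on an unbounded number of processors is at most $\frac{4}{3} \times \frac{\delta + 1}{2} \times \frac{\delta + 1}{2} = \frac{(\delta + 1)^2}{3}$. Now we have to transform this schedule using an infinite number of processors into a schedule with a bounded number of processors. This can be done easily using the method from [12]. The new worst-case relative performance is just increased by one. Thus we have a performance ratio of at most $\frac{(\delta + 1)^2}{3} + 1$.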
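The constants in this analysis can be checked numerically. The sketch below is an illustration under assumptions: it takes the 4/3 guarantee of [10], an expansion coefficient of (diameter + 1)/2, a critical path of `k` unit tasks with one communication between consecutive tasks, and an additive loss of one for the folding step. The names `path_ratio`, `unbounded_ratio`, and `bounded_ratio` are ours and do not appear in the article.

```python
from fractions import Fraction

def path_ratio(k, delta):
    """Ratio between the worst and best lengths of a critical path of k unit tasks:
    every hop costs `delta` in the worst case and 1 in the best case."""
    worst = k + (k - 1) * delta
    best = k + (k - 1)
    return Fraction(worst, best)

def unbounded_ratio(delta):
    """4/3 (first step) * (delta + 1)/2 (dilatation) * (delta + 1)/2 (topology gap)."""
    half = Fraction(delta + 1, 2)
    return Fraction(4, 3) * half * half

def bounded_ratio(delta):
    """Folding onto a bounded number of processors adds one to the ratio."""
    return unbounded_ratio(delta) + 1

# The path ratio stays below (delta + 1) / 2 and approaches it as k grows.
gaps = [Fraction(3 + 1, 2) - path_ratio(k, 3) for k in (2, 10, 1000)]
```

For a diameter of 2, for instance, `bounded_ratio(2)` evaluates to 4, and with a diameter of 1 (a fully connected network) `unbounded_ratio(1)` gives back the 4/3 guarantee of the first step.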

*Remark 4.4. *Note that the order of the operations may be modified. Nevertheless, the ratio becomes . Indeed, the folding principle may be used just after the solution given by an algorithm proposed by Munier and König [10]. We then obtain a schedule on processors. Afterwards, we apply the dilation principle. This order yields a polynomial-time approximation algorithm with a ratio bounded by .

*Remark 4.5. *We may recall two classic results in scheduling problems for which the performance ratio increases by one between the unbounded and bounded versions.

(1) When the number of processors is unlimited, the problem of scheduling a set of tasks under precedence constraints without communication delays is polynomial. It is sufficient to use the classical algorithm given by Bellman [13] as well as the two techniques widely used in project management: CPM (Critical Path Method) and PERT (Project/Program Evaluation and Review Technique). In contrast, when the number of processors is limited, the problem becomes NP-complete and a $(2 - 1/m)$-approximation is developed by Graham, see [14], where $m$ designates the number of processors, based on list scheduling in which no order on tasks is specified.

(2) The second illustration is given by the transition from the unrestricted version of UET-UCT to the restricted variant. From [10], we know the existence of a 4/3-approximation algorithm. Using the previous result, Munier and Hanen in [15] design a 7/3-approximation for the restricted version.
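As an illustration of the first of these classic results, the following is a minimal sketch of list scheduling for unit-time tasks without communication delays: at each time step, up to `m` ready tasks are started in the order of an arbitrary list. It assumes an acyclic precedence graph given as a dictionary of predecessors; the function name `list_schedule` and the small instance are hypothetical.

```python
def list_schedule(tasks, preds, m):
    """Greedy list scheduling of unit tasks on m processors, no communication delay.
    Returns a dictionary mapping each task to its start time."""
    finish = {}
    remaining = list(tasks)          # the list order is arbitrary
    time = 0
    while remaining:
        # A task is ready when all its predecessors have finished by `time`.
        ready = [t for t in remaining
                 if all(p in finish and finish[p] <= time for p in preds.get(t, ()))]
        for t in ready[:m]:          # start at most m ready tasks
            finish[t] = time + 1
            remaining.remove(t)
        time += 1
    return {t: end - 1 for t, end in finish.items()}

# Hypothetical instance: a and b precede c, and c precedes d.
preds = {"c": ["a", "b"], "d": ["c"]}
two_procs = list_schedule(["a", "b", "c", "d"], preds, m=2)
one_proc = list_schedule(["a", "b", "c", "d"], preds, m=1)
```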

#### 5. Conclusion

We have sharpened the demarcation line between the polynomially solvable and NP-hard cases of the central scheduling problem (UET-UCT) on a structured processor network by showing that its decision version is polynomially solvable for schedules of length at most two while it is NP-complete for schedules of length at most three. This result is given for a large class of graphs with a nonconstant diameter. It implies that there is no $\rho$-approximation algorithm with $\rho < 4/3$ unless P = NP. These results are extended to the case where the precedence graph is bipartite.

Lastly, we complete our complexity results by developing a polynomial-time approximation algorithm with a worst-case relative performance of $(\delta + 1)^2/3 + 1$, where $\delta$ designates the diameter of the graph. An interesting question for further research is to find a polynomial-time approximation algorithm with performance guarantee with .

#### Appendices

#### A.

This section describes the dilatation principle. This principle has been studied in [11], and used for designing a new polynomial-time approximation algorithm with a nontrivial performance guarantee for the problem . For the latter problem, the authors propose a $2(c+1)/3$-approximation algorithm, where $c$ is the communication delay (the best ratio as far as we know).

##### A.1. Introduction, Notation, and Description of the Method

*Notation 1. *We use to denote the UET-UCT schedule, and by the UET-LCT schedule. Moreover, we use (resp., ) to denote the starting time of the task in schedule (resp., in schedule ).

*Principle*

The tasks in allow the same assignment as the feasible schedule on an unbounded number of processors. We proceed to an expansion of the makespan, while preserving the communication delay () for two tasks and , with , processing on two different processors. For this, the starting time is translated by a factor .

In the following section, we will justify and determine the coefficient .

More formally, let be a precedence graph. We determine a feasible schedule , for the model UET-UCT, using the (4/3)-approximation algorithm proposed by Munier and König [10]. The result of this algorithm gives a couple of values , on the schedule with being the starting time of the task for the schedule and the processor on which the task will be processed at .

From this solution, we will derive a solution for the problem with large communication delays. For this, we will propose a new couple of values derived from couple . The computation of this set of new couples is obtained in the following ways: the start time and, . In other words, all tasks in the schedule are allotted on the same processor as the schedule , and the starting time of a task undergoes a translation with a factor . The justification of the expansion coefficient is given below. An illustration of the expansion is given in Figure 10.
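The expansion described above can be sketched in a few lines. This is an illustration under assumptions rather than the implementation of [11]: it takes the expansion coefficient to be `(c + 1) / 2`, where `c` is the large communication delay, keeps every task on its original processor, and uses exact fractions because the coefficient is not an integer when `c` is even. The names `dilate` and `is_uet_lct_feasible` and the three-task instance are hypothetical.

```python
from fractions import Fraction

def dilate(start, c):
    """Multiply every UET-UCT start time by d = (c + 1) / 2 (assumed coefficient)."""
    d = Fraction(c + 1, 2)
    return {task: d * t for task, t in start.items()}

def is_uet_lct_feasible(arcs, start, proc, c):
    """Unit tasks on an unbounded, fully connected set of processors,
    with a delay of c between tasks placed on different processors."""
    for i, j in arcs:
        delay = 0 if proc[i] == proc[j] else c
        if start[j] < start[i] + 1 + delay:
            return False
    # Two unit tasks on the same processor must start at least one unit apart.
    times = {}
    for task, q in proc.items():
        times.setdefault(q, []).append(start[task])
    return all(b - a >= 1
               for ts in times.values()
               for a, b in zip(sorted(ts), sorted(ts)[1:]))

# Hypothetical UET-UCT schedule: a then b on processor 0, c on processor 1.
arcs = [("a", "b"), ("a", "c")]
proc = {"a": 0, "b": 0, "c": 1}
uct_start = {"a": 0, "b": 1, "c": 2}
lct_start = dilate(uct_start, c=3)
```

With `c = 3` the coefficient is 2, so task `c` moves from time 2 to time 4, which is exactly the earliest time allowed by a delay of three units after task `a` completes.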

##### A.2. Feasibility, Analysis of the Method, and Computation of the Ratio

Afterwards, we will justify the existence of the coefficient . Moreover, we prove the correctness of the feasible schedule for ; problem. Lastly, we propose a worst-case analysis for the algorithm.

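Lemma A.1. *The coefficient of an expansion is $d = (c + 1)/2$.*

*Proof. *Let $i$ and $j$ be two tasks such that $i \to j$, which are processed on two different processors in the feasible UET-UCT schedule, with starting times $t_i$ and $t_j$. Since this schedule is feasible, $t_j \ge t_i + 2$. We are interested in obtaining a coefficient $d$ such that the new starting times are $d \times t_i$ and $d \times t_j$. After expansion, in order to respect the precedence constraints and the communication delay $c$, we must have $d \times t_j \ge d \times t_i + 1 + c$. Since $d \times t_j \ge d \times t_i + 2d$, it is sufficient to choose $d = (c + 1)/2$.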

Lemma A.2. *An expansion algorithm gives a feasible schedule for the problem in .*

*Proof. *It is sufficient to check that the solution given by an expansion algorithm produces a feasible schedule for the UET-LCT model. Let and be two tasks such that . We use (resp., ) to denote the processor on which task (resp., the task ) is executed in schedule . Moreover, we use (resp., ) to denote the processor on which task (resp., the task ) is executed in schedule . Thus, (i)if then . Since the solution given by Munier and König [10] gives a feasible schedule on the model UET-UCT, we have , ; ;(ii)if then . We have ; .

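Theorem A.3. *An expansion algorithm gives a $\frac{2(c+1)}{3}$-approximation algorithm for the problem $\bar{P} \mid \mathrm{prec};\, c_{ij} = c \geq 2;\, p_i = 1 \mid C_{\max}$.*

*Proof.* We use $C_{\max}^{h,\mathrm{UCT}}$ (resp., $C_{\max}^{\mathrm{opt},\mathrm{UCT}}$) to denote the makespan of the schedule computed by Munier and König (resp., the optimal value of a schedule $\sigma^{\infty}$). In the same way, we use $C_{\max}^{h^{*},c}$ (resp., $C_{\max}^{\mathrm{opt},c}$) to denote the makespan of the schedule computed by the expansion algorithm (resp., the optimal value of a schedule $\sigma^{\infty}_{c}$).

We know that $C_{\max}^{h,\mathrm{UCT}} \leq \frac{4}{3}\, C_{\max}^{\mathrm{opt},\mathrm{UCT}}$. Moreover, $C_{\max}^{\mathrm{opt},\mathrm{UCT}} \leq C_{\max}^{\mathrm{opt},c}$, since any schedule that is feasible for a communication delay $c \geq 2$ is also feasible for unit communication delays. Thus, we obtain

$$\frac{C_{\max}^{h^{*},c}}{C_{\max}^{\mathrm{opt},c}} \leq \frac{\frac{c+1}{2}\, C_{\max}^{h,\mathrm{UCT}}}{C_{\max}^{\mathrm{opt},c}} \leq \frac{\frac{c+1}{2}\, C_{\max}^{h,\mathrm{UCT}}}{C_{\max}^{\mathrm{opt},\mathrm{UCT}}} \leq \frac{c+1}{2} \cdot \frac{4}{3} = \frac{2(c+1)}{3}.$$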
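As a quick numerical illustration of this bound, the table below evaluates the expansion coefficient and the resulting ratio for a few delays. The notation is an assumption made here for readability: $c$ denotes the communication delay and $d = (c+1)/2$ the expansion coefficient; the chosen values of $c$ are arbitrary.

```latex
% Assumed notation: c = communication delay, d = (c+1)/2 = expansion coefficient.
\[
\begin{array}{c|c|c}
c & d = \tfrac{c+1}{2} & \tfrac{4}{3}\,d = \tfrac{2(c+1)}{3} \\ \hline
2 & 3/2 & 2 \\
3 & 2   & 8/3 \\
5 & 3   & 4
\end{array}
\]
```

The guarantee therefore grows linearly with the delay, which is why the method is of interest mainly for moderate values of $c$.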

#### B.

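In this section, we present a simple algorithm which gives a schedule $\sigma^{m}$ on $m$ machines from a schedule $\sigma^{\infty}$ on an unbounded number of processors for the problem $\bar{P} \mid \mathrm{prec};\, c_{ij} = c \geq 2;\, p_i = 1 \mid C_{\max}$. Let $X_i$ be the set of tasks executed at time $t_i$ in $\sigma^{\infty}$ using a heuristic $h^{*}$. The tasks of $X_i$ are executed in $\lceil |X_i| / m \rceil$ units of time in the schedule $\sigma^{m}$. We apply this procedure for all $t_i$. The validity of this algorithm is based on the fact that there is at most a matching between the tasks executed at $t_i$ and the tasks processed at $t_{i+1}$ (this is known as Brent's lemma; see [12]).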
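The folding step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `fold_onto_m` and the input format are hypothetical, start times are assumed to be nonnegative integers, and an empty time step is assumed to still consume one unit so that the gaps of the original schedule are preserved. Only the new start times are computed; the processor assignment, which relies on the matching argument, is deliberately left out.

```python
from collections import defaultdict
from math import ceil


def fold_onto_m(start_times, m):
    """Fold an unbounded-processor schedule of unit tasks onto m machines.

    start_times: dict task -> integer start time in the unbounded schedule
                 (hypothetical format).
    Returns (new_start_times, makespan). Processor assignment is omitted.
    """
    steps = defaultdict(list)
    for task, t in start_times.items():
        steps[t].append(task)

    horizon = max(start_times.values()) + 1 if start_times else 0
    new_start, clock = {}, 0
    for t in range(horizon):
        tasks = sorted(steps[t])  # deterministic order; task names must be sortable
        for k, task in enumerate(tasks):
            new_start[task] = clock + k // m  # at most m tasks per time unit
        # Assumption: an empty step still takes one unit, so idle gaps survive.
        clock += max(1, ceil(len(tasks) / m))
    return new_start, clock


# Hypothetical schedule: five tasks at time 0, one at time 1, two at time 3.
sched = {"a": 0, "b": 0, "c": 0, "d": 0, "e": 0, "f": 1, "g": 3, "h": 3}
folded, makespan = fold_onto_m(sched, m=2)
```

With $n = 8$ tasks, $m = 2$, and an original length of 4, the folded length stays below $n/m + 4 = 8$, in line with the bound used in the proof of Theorem B.1.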

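Theorem B.1. *From any polynomial-time algorithm $h^{*}$ with performance guarantee $\rho$ (i.e., $C_{\max}^{h^{*},\infty} \leq \rho\, C_{\max}^{\mathrm{opt},\infty}$) for the problem $\bar{P} \mid \mathrm{prec};\, c_{ij} = c \geq 2;\, p_i = 1 \mid C_{\max}$, we may obtain a polynomial-time algorithm with performance guarantee $(1 + \rho)$ for the problem $P \mid \mathrm{prec};\, c_{ij} = c \geq 2;\, p_i = 1 \mid C_{\max}$.*

*Proof.* Let $C_{\max}^{h^{*},\infty}$ (resp., $C_{\max}^{h,m}$) be the length of the schedule given by $h^{*}$ (resp., by $h$). In the same way, let $C_{\max}^{\mathrm{opt},\infty}$ (resp., $C_{\max}^{\mathrm{opt},m}$) be the optimal length of the schedule on an unbounded number of processors (resp., on a restricted number $m$ of processors). We denote by $n$ the number of tasks in the schedule, and by $X_i$ the set of tasks executed at time $t_i$ in the schedule given by $h^{*}$. Clearly, this gives us $C_{\max}^{\mathrm{opt},\infty} \leq C_{\max}^{\mathrm{opt},m}$ and $C_{\max}^{\mathrm{opt},m} \geq n/m$. So,

$$C_{\max}^{h,m} \leq \sum_{i} \left\lceil \frac{|X_i|}{m} \right\rceil \leq \sum_{i} \left( \left\lfloor \frac{|X_i|}{m} \right\rfloor + 1 \right) \leq \frac{n}{m} + C_{\max}^{h^{*},\infty} \leq C_{\max}^{\mathrm{opt},m} + \rho\, C_{\max}^{\mathrm{opt},\infty} \leq (1 + \rho)\, C_{\max}^{\mathrm{opt},m}.$$

This concludes the proof of Theorem B.1.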

#### References

- R. L. Graham, E. L. Lawler, J. K. Lenstra, and A. H. G. Rinnooy Kan, “Optimization and approximation in deterministic sequencing and scheduling: a survey,” *Annals of Discrete Mathematics*, vol. 5, pp. 287–326, 1979.
- C. Picouleau, “UET-UCT schedules on arbitrary networks,” Tech. Rep., LITP, Blaise Pascal, Université Paris VI, 1994.
- R. Giroudeau, J. C. König, and B. Valéry, “Scheduling uet-tasks on a star network: complexity and approximation,” *4OR A Quarterly Journal of Operations Research*, vol. 9, no. 1, pp. 29–48, 2011.
- V. Boudet, Y. Cohen, R. Giroudeau, and J. C. Konig, “Complexity results for scheduling problem with non trivial topology of processors,” Tech. Rep. 06050, LIRMM, 2006, submitted to Rairo-RO.
- C. Lahlou, “Scheduling with unit processing and communication times on a ring network: approximation results,” in *Proceedings of Europar*, pp. 539–542, Springer, New York, NY, USA, 1996.
- V. J. Rayward-Smith, “UET scheduling with unit interprocessor communication delays,” *Discrete Applied Mathematics*, vol. 18, no. 1, pp. 55–71, 1987.
- J. J. Hwang, Y.-C. Chow, F. D. Anger, and C.-Y. Lee, “Scheduling precedence graphs in systems with interprocessor communication times,” *SIAM Journal on Computing*, vol. 18, no. 2, pp. 244–257, 1989.
- M. R. Garey and D. S. Johnson, *Computers and Intractability: A Guide to the Theory of 𝒩𝒫-Completeness*, A Series of Books in the Mathematical Science, W. H. Freeman, San Francisco, Calif, USA, 1979.
- P. Chrétienne and C. Picouleau, *Scheduling Theory and Its Applications*, Scheduling with Communication Delays: A Survey, chapter 4, John Wiley & Sons, Chichester, UK, 1995.
- A. Munier and J. C. König, “A heuristic for a scheduling problem with communication delays,” *Operations Research*, vol. 45, no. 1, pp. 145–148, 1997.
- R. Giroudeau, J.-C. Konig, F. K. Moulai, and J. Palaysi, “Complexity and approximation for precedence constrained scheduling problems with large communication delays,” *Theoretical Computer Science*, vol. 401, no. 1–3, pp. 107–119, 2008.
- R. P. Brent, “The parallel evaluation of general arithmetic expressions,” *Journal of the Association for Computing Machinery*, vol. 21, pp. 201–206, 1974.
- R. Bellman, “On a routing problem,” *Quarterly of Applied Mathematics*, vol. 16, pp. 87–90, 1958.
- R. Graham, “Bounds for certain multiprocessing anomalies,” *Bell System Technical Journal*, vol. 45, pp. 1563–1581, 1966.
- A. Munier and C. Hanen, “An approximation algorithm for scheduling unitary tasks on *m* processors with communication delays,” private communication, 1996.