Mathematical Problems in Engineering

Volume 2015 (2015), Article ID 484671, 10 pages

http://dx.doi.org/10.1155/2015/484671

## Theoretical Expectation versus Practical Performance of Jackson’s Heuristic

Facultad de Ciencias, UAEM, Avenida Universidad 1001, 62210 Cuernavaca, MOR, Mexico

Received 19 January 2015; Revised 14 May 2015; Accepted 18 May 2015

Academic Editor: Ben T. Nohara

Copyright © 2015 Nodari Vakhania et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

A basic 2-approximation heuristic for scheduling jobs with release times and due dates to minimize the maximum job lateness was suggested by Jackson in the early 1950s. The theoretical worst-case bound of 2 is of little help in practice, when the solution quality is important. The quality of the solution delivered by Jackson's heuristic is closely related to the maximum job processing time $p_{\max}$ that occurs in a given problem instance and to the resultant interference with other jobs that such a long job may cause. We use the relationship of $p_{\max}$ with the optimal objective value to obtain a more accurate approximation ratio, which may drastically outperform the earlier known worst-case ratio of 2. This is confirmed, in practice, by our computational experiments.

#### 1. Introduction

One of the oldest and commonly used (online) heuristics in scheduling theory is that of Jackson [1] (see also Schrage [2]). It was first suggested for scheduling jobs with release times and due dates on a single machine to minimize the maximum job lateness. In general, variations of Jackson’s heuristic are widely used to construct feasible solutions for scheduling jobs with release and delivery times on a single machine or on a group of parallel machines. Besides, Jackson’s heuristic is efficiently used in the solution of more complicated shop scheduling problems including job-shop scheduling, in which the original problem occurs as an auxiliary one and is applied to obtain lower estimations in implicit enumeration algorithms (although even this latter problem is strongly NP-hard, see Garey and Johnson [3]). At the same time, it is known that, in the worst-case, Jackson’s heuristic will deliver a solution which is twice worse than an optimal one.

The latter worst-case bound might be too rough in practice, when the solution quality is important; that is, solutions with the objective value better than twice the optimal objective value are required. In addition, the solutions may need to be created online (any possible offline modification of the heuristic that may lead to a better performance would be of not much use). In this situation, the online performance measure is essentially important.

As we will see, the quality of the solution delivered by the heuristic is essentially related to the maximum job processing time $p_{\max}$ that may occur in a given problem instance. In particular, the interference of a “long” nonurgent job with the urgent jobs scheduled after it affects the solution quality.

We express $p_{\max}$ as a fraction of the optimal objective value and derive a much more accurate approximation ratio than 2. In some applications, this kind of relationship can be predicted with good accuracy. For instance, consider a large-scale production process where the processing requirement of any individual job is small relative to an estimated (shortest possible) overall production time of all the products (due to a large number of products and the number of operations required to produce each product). If this kind of prediction is not possible, by a single application of Jackson’s heuristic we obtain a strong lower bound on the optimal objective value and represent $p_{\max}$ as a fraction of this bound (instead of representing it as a fraction of the unknown optimal objective value). Then we give an explicit expression of the heuristic’s approximation ratio in terms of that fraction: if $p_{\max}$ is no more than a $1/\kappa$ fraction of the lower bound, Jackson’s heuristic will always deliver a solution within a factor of $1 + 1/\kappa$ of the optimum. We further refine $p_{\max}$ to a smaller, more accurate magnitude and use it instead of $p_{\max}$ in our second, stronger estimation.
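The a posteriori estimate just described can be sketched in code. The following is a minimal sketch, not the paper's implementation: jobs are assumed to be given as `(release, processing, delivery)` triples, and the estimate relies on the classical property of Jackson's schedules that the heuristic makespan $M$ satisfies $M \le OPT + p_{\max}$, together with the trivial lower bound $OPT \ge \max_j (r_j + p_j + q_j)$.

```python
import heapq

def jackson(jobs):
    """Jackson's (LDT) rule for jobs given as (release, processing, delivery)
    triples; returns the heuristic makespan max_j (c_j + q_j)."""
    by_release = sorted(jobs)              # order jobs by release time
    ready, t, makespan, i, n = [], 0, 0, 0, len(jobs)
    while i < n or ready:
        if not ready:                      # machine idle: jump to next release
            t = max(t, by_release[i][0])
        while i < n and by_release[i][0] <= t:
            r, p, q = by_release[i]
            heapq.heappush(ready, (-q, -p))  # largest delivery time first
            i += 1
        neg_q, neg_p = heapq.heappop(ready)
        t += -neg_p                        # process the selected job
        makespan = max(makespan, t - neg_q)
    return makespan

def ratio_bound(jobs):
    """A posteriori ratio estimate from a single run of the heuristic:
    M <= OPT + p_max gives OPT >= M - p_max, and OPT >= max (r + p + q)."""
    M = jackson(jobs)
    p_max = max(p for _, p, _ in jobs)
    lower = max(M - p_max, max(r + p + q for r, p, q in jobs))
    return M / lower
```

For an instance where the heuristic happens to be optimal, `ratio_bound` still reports a value slightly above 1, since it only certifies a lower bound; this is exactly the kind of instance-specific guarantee the paper refines.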

Our estimations may drastically outperform the earlier known worst-case ratio of 2. This is confirmed, in practice, by the computational experiments. Of 200 randomly generated problem instances, more than half were solved optimally by Jackson’s heuristic, as no interference with a long job (as described above) occurred. For the rest of the instances, the interference was insignificant, so that most of them were solved within a factor of 1.009 of the optimal objective value, whereas the worst approximation ratio was less than 1.03. According to the experimental results, our lower bounds turn out to be quite strong in practice.

The paper is organized as follows. In the next subsection we give a general overview of combinatorial optimization and scheduling problems mentioning the importance of good approximation algorithms and efficient heuristics for these problems. Then we give an overview of some related work. In the following section we describe Jackson’s heuristic, give some basic concepts and facts, and derive our estimations on the worst-case performance of Jackson’s heuristic. In the final section we present our computational experiments.

##### 1.1. Heuristics and Combinatorial Optimization Problems

*Combinatorial optimization* (CO) problems constitute a significant class of practical problems of a discrete nature. They emerged in the late 1940s, when the rapid growth of industry gave rise to new demands for the optimal solution of resource management and distribution problems. For the development of effective solution methods, these problems were formalized and addressed mathematically.

A CO problem is characterized by a finite set of so-called *feasible solutions*, defined by a given set of restrictions, and an objective function over these feasible solutions, which needs to be optimized, that is, minimized or maximized: the problem is to find an *optimal solution*, that is, one optimizing the objective function. Since the number of feasible solutions is finite, finding an optimal solution is theoretically trivial: just enumerate all the feasible solutions, calculate the value of the objective function for each of them, and select one with the optimal objective value. However, this brute-force enumeration of all feasible solutions may be impossible in practice. Even for problems of moderate size (say, 30 cities for a classical traveling salesman problem or 10 jobs on 10 machines in a job-shop scheduling problem), such a complete enumeration may take hundreds of centuries on modern computers. Moreover, this situation will not change even if much faster computers are developed in the future.
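The scale of the enumeration mentioned above is easy to check. The counts below are back-of-the-envelope figures for the two examples in the text (tours of a 30-city traveling salesman instance with a fixed start city, and independent machine sequences in a 10×10 job-shop); the evaluation speed of $10^{12}$ solutions per second is an assumed, generous figure.

```python
import math

# Tours through 30 cities with a fixed start city: 29! orderings.
tsp_30 = math.factorial(29)

# 10 jobs on 10 machines: (10!)^10 combinations of per-machine sequences.
jobshop_10x10 = math.factorial(10) ** 10

print(f"30-city TSP:    {tsp_30:.3e} tours")
print(f"10x10 job-shop: {jobshop_10x10:.3e} sequences")

# Even at an assumed 10^12 evaluations per second, the TSP count alone
# takes an astronomical amount of time.
years = tsp_30 / 1e12 / (3600 * 24 * 365)
print(f"~{years:.1e} years at 10^12 tours/second")
```

The point stands even if computers become orders of magnitude faster: shaving a factor of a thousand off a count around $10^{30}$ leaves the enumeration hopeless.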

CO problems are partitioned into two basic types: class P, the polynomially solvable problems, and the NP-hard problems. Intuitively, there exist efficient (polynomial in the *size* of the problem) solution methods or algorithms for the problems from the first class, whereas no such algorithms are known for the problems of the second class (informally, the size of the problem is the number of bits necessary to represent the problem data/parameters in computer memory). Furthermore, all NP-hard problems, the ones from the second class, have a similar *computational complexity*, in the sense that a polynomial-time algorithm for any of them would yield a polynomial-time algorithm for every other problem from this class. At the same time, it is believed to be very unlikely that an NP-hard problem can be solved in polynomial time.

*Greedy (heuristic)* algorithms are efficient polynomial-time algorithms that create a feasible schedule. Such algorithms typically work in $n$ (external) iterations, where $n$ is the number of objects in the given problem. The *size* of the problem (i.e., the number of bits necessary to represent the problem data/parameters in computer memory) is polynomial in $n$. Hence, the number of iterations in a greedy algorithm is also polynomial in the size of the problem. Such an algorithm creates a complete feasible solution iteratively, extending the current partial solution by some yet unconsidered object at each iteration. In this way, the search space is reduced to a single possible extension at each iteration, out of all the theoretically possible potential extensions. This type of “rough” reduction of the whole solution space may lead to the loss of an optimal or near-optimal solution.

A greedy/heuristic algorithm may create an optimal solution for a problem in class P (though not every problem in class P can be solved optimally by a heuristic method). However, this is not the case for an NP-hard problem; that is, no greedy or heuristic algorithm can solve an NP-hard problem optimally (unless P = NP, which is very unlikely). Since the majority of CO problems are NP-hard, a compromise accepting a solution worse than an optimal one is hence unavoidable. It is then natural and also practical to think about the design and analysis of polynomial-time *approximation algorithms*, that is, ones which deliver a solution with a guaranteed deviation from an optimal one in polynomial time. The *performance ratio* of an approximation algorithm $A$ is the ratio of the objective value delivered by $A$ to the optimal value. A *$\rho$-approximation algorithm* is one with worst-case performance ratio $\rho$. Since the simplest polynomial-time algorithms are greedy, a greedy algorithm is a simplest approximation algorithm.

###### 1.1.1. Scheduling Problems

*Scheduling problems* deal with a finite set of requests, called *jobs*, to be performed (or scheduled) on a finite (and limited) set of resources, called *machines* (or *processors*). The aim is to choose the order of processing the jobs on the machines so as to meet given objective criteria.

A basic scheduling problem that we consider in this paper is as follows: $n$ jobs have to be scheduled on a single machine. Each job $j$ becomes available at its *release time* $r_j$. A released job can be assigned to the machine, which then has to process job $j$ for $p_j$ time units. The machine can handle at most one job at a time. Once the machine completes job $j$, the job still needs a (constant) *delivery time* $q_j$ for its *full completion* (the jobs are delivered by an independent unit, and this takes no machine time). All the above parameters are integers. Our objective is to find a job sequence on the machine that minimizes the maximum job full completion time.

According to the conventional three-field notation introduced by Graham et al. [4], the above problem is abbreviated as $1|r_j, q_j|C_{\max}$: the first field indicates the single-machine environment, the second field specifies the job parameters, and the third field gives the objective criterion. The problem has an equivalent formulation, abbreviated $1|r_j|L_{\max}$, in which the delivery times are replaced by *due dates* and the maximum job *lateness* $L_{\max}$, that is, the maximum over jobs of the difference between the completion time of job $j$ and its due date $d_j$, is minimized (the due date $d_j$ of job $j$ is the desirable time for the completion of job $j$; a penalty occurs whenever job $j$ is completed after time moment $d_j$).

Given an instance of $1|r_j, q_j|C_{\max}$, one can obtain an equivalent instance of $1|r_j|L_{\max}$ as follows. Take a suitably large constant $K$ (no less than the maximum job delivery time) and define the due date of every job $j$ as $d_j = K - q_j$. Vice versa, given an instance of $1|r_j|L_{\max}$, an equivalent instance of $1|r_j, q_j|C_{\max}$ can be obtained by defining the job delivery times as $q_j = D - d_j$, where $D$ is a suitably large constant (no less than the maximum job due date). It is straightforward to see that a pair of instances defined in this way are equivalent; that is, whenever the makespan in the $1|r_j, q_j|C_{\max}$ version is minimized, the maximum job lateness in $1|r_j|L_{\max}$ is minimized, and vice versa (see Bratley et al. [5] for more details).
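The transformation above can be sketched as follows. This is a minimal illustration under our own representation (jobs as tuples, which the paper does not use): for any fixed processing order, each job's full completion time $c_j + q_j$ equals $(c_j - d_j) + D$, so the two objectives differ by the constant $D$ and the same sequences are optimal for both.

```python
def due_to_delivery(jobs):
    """Map (r, p, d) triples to (r, p, q) with q = D - d, D = max d."""
    D = max(d for _, _, d in jobs)
    return [(r, p, D - d) for r, p, d in jobs], D

def completions(jobs_in_order):
    """Completion times when jobs are processed in the given order."""
    t, out = 0, []
    for r, p, _ in jobs_in_order:
        t = max(t, r) + p          # wait for release, then process
        out.append(t)
    return out

# Check the equivalence on a tiny (r, p, d) instance: for a fixed job
# order, the makespan C_max and maximum lateness L_max differ by D.
due = [(0, 3, 10), (1, 2, 7), (2, 1, 5)]
dlv, D = due_to_delivery(due)
c = completions(due)
L_max = max(ci - d for ci, (_, _, d) in zip(c, due))
C_max = max(ci + q for ci, (_, _, q) in zip(c, dlv))
assert C_max == L_max + D
```

Since the shift by $D$ is the same for every sequence, minimizing one objective minimizes the other, which is the equivalence the text uses.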

Because of the above equivalence, we will use both above formulations interchangeably. (As noted briefly earlier, the version with delivery times naturally occurs in implicit enumeration algorithms for job-shop scheduling problem and is used for the calculation of lower bounds.)

##### 1.2. Overview

At each scheduling time (given by a job release or completion time), Jackson’s heuristic schedules, among the jobs released by that time, one with the largest delivery time (or, alternatively, the smallest due date). For the sake of conciseness, Jackson’s heuristic has been commonly abbreviated as the EDD-heuristic (earliest due date) or, alternatively, the LDT-heuristic (largest delivery time). Since the number of scheduling times is $O(n)$ and at each scheduling time a search for a minimal/maximal element in an ordered list is accomplished, the time complexity of the heuristic is $O(n \log n)$.

A number of efficient algorithms are variations of Jackson’s heuristic. For instance, Potts [6] proposed a modification of this heuristic for the problem $1|r_j, q_j|C_{\max}$. His algorithm repeatedly applies the heuristic $O(n)$ times and obtains an improved approximation ratio of 3/2 in time $O(n^2 \log n)$. Hall and Shmoys [7] elaborated polynomial approximation schemes for the same problem and also a 4/3-approximation algorithm for its version with precedence relations, with the same time complexity as the above algorithm from [6]. Jackson’s heuristic can also be used efficiently in the solution of shop scheduling problems. Using Jackson’s heuristic as a schedule generator, McMahon and Florian [8] and Carlier [9] proposed efficient enumerative algorithms for $1|r_j|L_{\max}$. Grabowski et al. [10] use the heuristic to obtain an initial solution in another enumerative algorithm for the same problem. Garey et al. [11] applied the same heuristic in an algorithm for the feasibility version of this problem with equal-length jobs (in the feasibility version, job due dates are replaced by deadlines, and a schedule in which all jobs complete by their deadlines is sought). Again using Jackson’s heuristic as a schedule generator, other polynomial-time direct combinatorial algorithms have been described: in [12], an algorithm for the minimization version of the latter problem with two possible job processing times and, in [13], an algorithm that minimizes the number of late jobs with release times on a single machine when job preemptions are allowed. Without preemptions, two polynomial-time algorithms for equal-length jobs on a single machine and on a group of identical machines were proposed in [14] and [15], respectively.

Jackson’s heuristic has been used in multiprocessor scheduling problems as well. For example, for the feasibility version with identical machines and equal-length jobs, polynomial-time algorithms were proposed by Simons [16] and Simons and Warmuth [17]. Using the same heuristic as a schedule generator, an algorithm for the minimization version of the latter problem was proposed in [18].

The heuristic has been successfully used for obtaining lower bounds in job-shop scheduling problems. In the classical job-shop scheduling problem, the preemptive version of Jackson’s heuristic applied to a specially derived single-machine problem immediately gives a lower bound; see, for example, Carlier [9], Carlier and Pinson [19], and Brinkkotter and Brucker [20] and the more recent works of Gharbi and Labidi [21] and Croce and T’kindt [22]. Carlier and Pinson [23] used the extended Jackson’s heuristic for the solution of the multiprocessor job-shop problem with identical machines, and it can also be adapted to the case when the parallel machines are unrelated (see [24]). Jackson’s heuristic can be useful for parallelizing the computations in job-shop scheduling (Perregaard and Clausen [25]) and also for parallel batch scheduling problems with release times (Condotta et al. [26]).

#### 2. Theoretical Estimations of Heuristics Performance

We start this section with a detailed description of Jackson’s heuristic. It distinguishes scheduling times, the time moments at which a job is assigned to the machine. Initially, the earliest scheduling time is set to the minimum job release time. Among all jobs released by that time, a job with the minimum due date (the maximum delivery time, alternatively) is assigned to the machine (ties are broken by selecting a longest job). Iteratively, the next scheduling time is either the completion time of the job assigned to the machine latest so far or the minimum release time of a yet unassigned job, whichever is larger (since no job can be started before the machine becomes idle, nor can it be started before its release time). And again, among all jobs released by this scheduling time, a job with the minimum due date (the maximum delivery time, alternatively) is assigned to the machine. Note that the heuristic creates no avoidable gap: it always schedules an already released job once the machine becomes idle, and among the yet unscheduled jobs released by each scheduling time it gives priority to a most urgent one (i.e., one with the smallest due date or, alternatively, the largest delivery time).
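The iterative description above can be sketched as follows. This is our own minimal implementation, not the paper's: jobs are assumed to be `(release, processing, delivery)` triples, a priority heap replaces the "ordered list" for the $O(n \log n)$ bound, and ties on delivery time are broken by the longest processing time, as in the text.

```python
import heapq

def jackson_schedule(jobs):
    """Build a J-schedule as a list of (job_index, start_time) pairs.

    At each scheduling time, among the released jobs, assign one with the
    largest delivery time; ties are broken by the longest processing time."""
    order = sorted(range(len(jobs)), key=lambda j: jobs[j][0])
    ready, out = [], []
    t, i, n = 0, 0, len(jobs)
    while i < n or ready:
        if not ready:                    # machine idle: the next scheduling
            t = max(t, jobs[order[i]][0])  # time is the next release time
        while i < n and jobs[order[i]][0] <= t:
            j = order[i]                 # release job j into the ready pool
            heapq.heappush(ready, (-jobs[j][2], -jobs[j][1], j))
            i += 1
        _, _, j = heapq.heappop(ready)   # most urgent released job
        out.append((j, t))
        t += jobs[j][1]
    return out
```

On a small instance such as `[(0, 3, 5), (1, 2, 7), (2, 1, 9)]`, the heuristic schedules job 0 first (the only released job), and at its completion time both remaining jobs are released, so the one with the larger delivery time goes next.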

Let $\sigma$ be the schedule obtained by applying Jackson’s heuristic (*J-heuristic*, for short) to the originally given problem instance. Schedule $\sigma$ and, in general, any Jackson’s schedule (*J-schedule*, for short), that is, one constructed by the J-heuristic, may contain a *gap*, which is a maximal consecutive time interval in which the machine is idle. We assume that there occurs a 0-length gap whenever job $j$ starts at its earliest possible starting time, that is, its release time, immediately after the completion of job $i$; here $s_j$ ($c_j$, resp.) denotes the starting (completion, resp.) time of job $j$.

A *block* in a J-schedule is a consecutive part of it consisting of successively scheduled jobs without any gap in between, preceded and succeeded by a (possibly 0-length) gap.

J-schedules have useful structural properties. The following basic definitions, taken from [18], will help us to expose these properties.

Given a J-schedule $S$, let $o$ be a job that realizes the maximum job lateness in $S$; that is, $L_o = L_{\max}(S)$. Let, further, $B$ be the block in $S$ that contains job $o$. Among all the jobs in $B$ with this property, the latest scheduled one is called an *overflow job* in $S$ (we just note that this job does not necessarily end block $B$).

A *kernel* in $S$ is a maximal (consecutive) job sequence ending with an overflow job $o$ such that no job from this sequence has a due date greater than $d_o$. For a kernel $K$, we let $r(K) = \min_{j \in K} r_j$.

It follows that every kernel is contained in some block in $S$, and the number of kernels in $S$ equals the number of overflow jobs in it. Furthermore, since any kernel belongs to a single block, it may contain no gap.
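The overflow-job and kernel definitions can be sketched as follows, given a J-schedule. This is a minimal sketch under our own representation (jobs as `(release, processing, delivery)` triples, a schedule as `(job, start)` pairs); in delivery-time terms, "due date no greater than $d_o$" reads as "delivery time no smaller than $q_o$".

```python
def find_kernel(jobs, sched):
    """Locate the overflow job and its kernel in a J-schedule.

    jobs: (r, p, q) triples; sched: list of (job, start) in scheduled order.
    Returns (overflow_position, kernel) where kernel is the slice of sched
    ending with the overflow job."""
    # Full completion time c_j + q_j of each scheduled job.
    full = [s + jobs[j][1] + jobs[j][2] for j, s in sched]
    top = max(full)
    # The overflow job: the latest scheduled job realizing the maximum.
    o = max(i for i, f in enumerate(full) if f == top)
    q_o = jobs[sched[o][0]][2]
    # Extend the kernel leftward while jobs stay no less urgent than o
    # and the sequence remains tight (a gap would end the block).
    k = o
    while k > 0:
        j_prev, s_prev = sched[k - 1]
        if sched[k][1] != s_prev + jobs[j_prev][1]:  # gap: block boundary
            break
        if jobs[j_prev][2] < q_o:                    # less urgent job: stop
            break
        k -= 1
    return o, sched[k:o + 1]
```

On the three-job instance used earlier, the long early job with the small delivery time falls outside the kernel, while the two urgent jobs that follow it form the kernel.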

The following lemmas are used in the proof of Theorem 6. A statement similar to Lemma 1 can be found in [27], and one similar to Lemma 3 in [18]. Lemma 5 is obtained as a consequence of these two lemmas, though the related result has been known earlier. For the sake of completeness of our presentation, we give all our claims with proofs.

Lemma 1. *The maximum job lateness (the makespan) of a kernel $K$ cannot be reduced if the earliest scheduled job in $K$ starts at time $r(K)$. Hence, if a J-schedule $S$ contains a kernel with this property, then it is optimal.*

*Proof. *Recall that all jobs in $K$ are no less urgent than the overflow job $o$ and that the jobs in $K$ form a tight sequence (i.e., one without any gap). Then, since the earliest job in $K$ starts at time $r(K)$, no reordering of the jobs in $K$ can reduce the current maximum lateness, which is $L_o = L_{\max}(S)$. Hence, there is no feasible schedule $S'$ with $L_{\max}(S') < L_{\max}(S)$; that is, $S$ is optimal.

Thus $S$ is already optimal if the condition in Lemma 1 holds. Otherwise, there must exist a job less urgent than $o$, scheduled before all jobs of kernel $K$, that delays the jobs in $K$ (and the overflow job $o$). By rescheduling such a job to a later time moment, the jobs in kernel $K$ can be restarted earlier. We need some extra definitions to define this operation formally.

Suppose job $i$ precedes job $j$ in a J-schedule $S$. We will say that $i$ *pushes* $j$ in $S$ if the J-heuristic will reschedule job $j$ earlier whenever $i$ is forced to be scheduled behind $j$.

Since the earliest scheduled job of kernel $K$ does not start at its release time (Lemma 1), it is immediately preceded and pushed by a job $l$ with $d_l > d_o$. In general, we may have more than one such job scheduled before kernel $K$ in block $B$ (the block containing $K$). We call such a job an *emerging job* for $K$, and we call the latest scheduled one (job $l$ above) the *live* emerging job.
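Finding the live emerging job can be sketched as follows, continuing the tuple representation assumed in the earlier sketches (ours, not the paper's): scan backward from the kernel's first position, stay inside the block (a gap ends the search), and return the first job that is less urgent than the overflow job, i.e., in delivery-time terms, one with delivery time smaller than $q_o$.

```python
def live_emerging_job(jobs, sched, kernel_start, q_o):
    """Position in sched of the live emerging job, or None.

    jobs: (r, p, q) triples; sched: list of (job, start) in scheduled order;
    kernel_start: position of the kernel's earliest job; q_o: delivery time
    of the overflow job. Returning None means the kernel has no live
    emerging job, so the schedule is optimal by Corollary 2."""
    pos = kernel_start
    while pos > 0:
        j_prev, s_prev = sched[pos - 1]
        if sched[pos][1] != s_prev + jobs[j_prev][1]:  # gap: left the block
            return None
        pos -= 1
        if jobs[j_prev][2] < q_o:   # less urgent than the overflow job
            return pos
    return None
```

In the three-job example from before, the long job scheduled first is the (only, hence live) emerging job for the kernel formed by the two urgent jobs behind it.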

From the above definition and Lemma 1 we immediately obtain the following corollary.

Corollary 2. *If $S$ contains a kernel which has no live emerging job, then it is optimal.*

We illustrate the above introduced definitions on a problem instance of $1|r_j, q_j|C_{\max}$. The corresponding J-schedule is depicted in Figure 1(a). In that instance, we have 11 jobs; job 1 is a long nonurgent job whose release time, processing time, and delivery time are given in the figure. All the rest of the jobs are released at time moment 10 and have equal processing time 1 and delivery time 100. These data completely define our problem instance.