Abstract

A basic 2-approximation heuristic was suggested by Jackson in the early 1950s for scheduling jobs with release times and due dates to minimize the maximum job lateness. The theoretical worst-case bound of 2 helps little in practice, when the solution quality is important. The quality of the solution delivered by Jackson’s heuristic is closely related to the maximum job processing time $p_{\max}$ that occurs in a given problem instance and to the resultant interference with other jobs that such a long job may cause. We use the relationship of $p_{\max}$ with the optimal objective value to obtain a more accurate approximation ratio, which may drastically outperform the earlier known worst-case ratio of 2. This is confirmed, in practice, by our computational experiments.

1. Introduction

One of the oldest and most commonly used (online) heuristics in scheduling theory is that of Jackson [1] (see also Schrage [2]). It was first suggested for scheduling jobs with release times and due dates on a single machine to minimize the maximum job lateness. In general, variations of Jackson’s heuristic are widely used to construct feasible solutions for scheduling jobs with release and delivery times on a single machine or on a group of parallel machines. Besides, Jackson’s heuristic is efficiently used in the solution of more complicated shop scheduling problems, including job-shop scheduling, in which the original single-machine problem occurs as an auxiliary one and is applied to obtain lower bounds in implicit enumeration algorithms (although even this auxiliary problem is strongly NP-hard; see Garey and Johnson [3]). At the same time, it is known that, in the worst case, Jackson’s heuristic may deliver a solution whose objective value is twice the optimal one.

The latter worst-case bound might be too rough in practice, when the solution quality is important, that is, when solutions with an objective value better than twice the optimal objective value are required. In addition, the solutions may need to be created online (so any possible offline modification of the heuristic that might lead to a better performance would be of little use). In this situation, the online performance measure is essentially important.

As we will see, the quality of the solution delivered by the heuristic is essentially related to the maximum job processing time $p_{\max}$ that may occur in a given problem instance. In particular, the interference of a “long” nonurgent job with the urgent jobs scheduled after it affects the solution quality.

We express $p_{\max}$ as a fraction of the optimal objective value and derive a much more accurate approximation ratio than 2. In some applications, this kind of relationship can be predicted with a good accuracy. For instance, consider a large-scale production process where the processing requirement of any individual job is small relative to an estimated (shortest possible) overall production time of all the products (due to a large number of products and the number of operations required to produce each product). If this kind of prediction is not possible, then by a single application of Jackson’s heuristic we obtain a strong lower bound on the optimal objective value and represent $p_{\max}$ as a fraction of that bound (instead of representing it as a fraction of the unknown optimal objective value). Then we give an explicit expression of the heuristic’s approximation ratio in terms of that fraction. In particular, if $p_{\max}$ is no more than a $1/\kappa$ fraction of the optimal objective value, Jackson’s heuristic will always deliver a solution within a factor of $1 + 1/\kappa$ of the optimum. We further refine $p_{\max}$ to a smaller, more accurate magnitude $\delta$ and use it instead of $p_{\max}$ in our second, stronger estimation.

Our estimations may drastically outperform the earlier known worst-case ratio of 2. This is confirmed, in practice, by the computational experiments. Of 200 randomly generated problem instances, more than half were solved optimally by Jackson’s heuristic, as no interference with a long job of the above kind occurred. For the rest of the instances, the interference was insignificant, so that most of them were solved within a factor of 1.009 of the optimum objective value, whereas the worst approximation ratio was less than 1.03. According to the experimental results, our lower bounds turn out to be quite strong in practice.

The paper is organized as follows. In the next subsection we give a general overview of combinatorial optimization and scheduling problems, emphasizing the importance of good approximation algorithms and efficient heuristics for these problems. Then we give an overview of some related work. In the following section we describe Jackson’s heuristic, give some basic concepts and facts, and derive our estimations on the worst-case performance of Jackson’s heuristic. In the final section we present our computational experiments.

1.1. Heuristics and Combinatorial Optimization Problems

Combinatorial optimization (CO) problems constitute a significant class of practical problems with a discrete nature. They emerged in the late 1940s, when, with the rapid growth of industry, new demands arose for the optimal solution of the newly emerging resource management and distribution problems. For the development of effective solution methods, these problems were formalized and addressed mathematically.

A CO problem is characterized by a finite set of the so-called feasible solutions, defined by a given set of restrictions, and an objective function over these feasible solutions, which needs to be optimized, that is, minimized or maximized: the problem is to find an optimal solution, that is, one with the optimal objective value. Since the number of feasible solutions is finite, finding an optimal solution is theoretically trivial: just enumerate all the feasible solutions, calculate for each of them the value of the objective function, and select any one with the optimal objective value. However, this brute-force enumeration of all feasible solutions may be impossible in practice. Even for problems of a moderate size (say, 30 cities in the classical traveling salesman problem or 10 jobs on 10 machines in the job-shop scheduling problem), such a complete enumeration may take hundreds of centuries on modern computers. Moreover, this situation will not change even if much faster computers are developed in the future.

The CO problems are partitioned into two basic types: problems in class P, which are the polynomially solvable ones, and NP-hard problems. Intuitively, there exist efficient (polynomial in the size of the problem) solution methods or algorithms for the problems from the first class, whereas no such algorithms are known for the problems of the second class (informally, the size of the problem is the number of bits necessary to represent the problem data/parameters in computer memory). Furthermore, all NP-hard problems, the ones from the second class, have a similar computational complexity, in the sense that a polynomial-time algorithm for any of them would yield a polynomial-time algorithm for any other problem from this class. At the same time, it is believed to be very unlikely that an NP-hard problem can be solved in polynomial time.

Greedy (heuristic) algorithms are efficient polynomial-time algorithms that create a feasible schedule. Such algorithms typically work in $n$ (external) iterations, where $n$ is the number of objects in the given problem. The size of the problem (i.e., the number of bits necessary to represent the problem data/parameters in computer memory) is polynomial in $n$. Hence, the number of iterations in a greedy algorithm is also polynomial in the size of the problem. Such an algorithm creates a complete feasible solution iteratively, extending the current partial solution by some yet unconsidered object at each iteration. In this way, the search space is reduced to a single possible extension at each iteration, out of all the theoretically possible potential extensions. This type of “rough” reduction of the whole solution space may lead to the loss of an optimal or near-optimal solution.

A greedy/heuristic algorithm may create an optimal solution for a problem in class P (though not every problem in class P can be solved optimally by a heuristic method). However, this is not the case for an NP-hard problem; that is, no greedy or heuristic algorithm can solve an NP-hard problem optimally (unless P = NP, which is very unlikely). Since the majority of CO problems are NP-hard, a compromise accepting a solution worse than an optimal one is hence unavoidable. It is then natural and also practical to think about the design and analysis of polynomial-time approximation algorithms, that is, ones which deliver a solution with a guaranteed deviation from an optimal one in polynomial time. The performance ratio of an approximation algorithm $A$ is the ratio of the objective value delivered by $A$ to the optimal objective value. A $\rho$-approximation algorithm is one with the worst-case performance ratio $\rho$. Since the simplest polynomial-time algorithms are greedy, a greedy algorithm is a simplest approximation algorithm.

1.1.1. Scheduling Problems

The scheduling problems deal with a finite set of requests called jobs to be performed (or scheduled) on a finite (and limited) set of resources called machines (or processors). The aim is to choose the order of processing the jobs on machines so as to meet given objective criteria.

A basic scheduling problem that we consider in this paper is as follows: $n$ jobs have to be scheduled on a single machine. Each job $j$ becomes available at its release time $r_j$. A released job can be assigned to the machine, which then has to process job $j$ for $p_j$ time units. The machine can handle at most one job at a time. Once the machine completes job $j$, the job still needs a (constant) delivery time $q_j$ for its full completion (the jobs are delivered by an independent unit and this takes no machine time). All the above parameters are integers. Our objective is to find a job sequence on the machine that minimizes the maximum job full completion time, $\max_j (c_j + q_j)$, where $c_j$ is the completion time of job $j$ on the machine.

According to the conventional three-field notation introduced by Graham et al. [4], the above problem is abbreviated as $1|r_j, q_j|C_{\max}$: in the first field the single-machine environment is indicated, the second field specifies the job parameters, and in the third field the objective criterion is given. The problem has an equivalent formulation $1|r_j|L_{\max}$, in which the delivery times are interchanged with due dates and the maximum job lateness $L_{\max} = \max_j L_j$, $L_j = c_j - d_j$, that is, the maximum difference between a job completion time and its due date, is minimized (the due date $d_j$ of job $j$ is the desirable time for the completion of job $j$; there occurs a penalty whenever $j$ is completed after time moment $d_j$).

Given an instance of $1|r_j, q_j|C_{\max}$, one can obtain an equivalent instance of $1|r_j|L_{\max}$ as follows. Take a suitably large constant $K$ (no less than the maximum job delivery time) and define the due date of every job $j$ as $d_j = K - q_j$. Vice versa, given an instance of $1|r_j|L_{\max}$, an equivalent instance of $1|r_j, q_j|C_{\max}$ can be obtained by defining the job delivery times as $q_j = D - d_j$, where $D$ is a suitably large constant (no less than the maximum job due date). It is straightforward to see that a pair of instances defined in this way are equivalent; that is, whenever the makespan in the version $1|r_j, q_j|C_{\max}$ is minimized, the maximum job lateness in $1|r_j|L_{\max}$ is minimized and vice versa (see Bratley et al. [5] for more details).
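To make the transformation concrete, the following minimal Java sketch (with illustrative names; not from the original paper) converts an instance with delivery times into an equivalent due-date instance and back, using the constants $K$ and $D$ described above.

```java
// Minimal sketch of the equivalence transformations described above.
// Jobs are given by parallel arrays; all names are illustrative only.
final class InstanceConversion {

    // Given delivery times q[j], pick K >= max q[j] and set d[j] = K - q[j].
    static int[] deliveryToDueDates(int[] q) {
        int k = 0;
        for (int qj : q) k = Math.max(k, qj);      // K = max delivery time suffices
        int[] d = new int[q.length];
        for (int j = 0; j < q.length; j++) d[j] = k - q[j];
        return d;
    }

    // Given due dates d[j], pick D >= max d[j] and set q[j] = D - d[j].
    static int[] dueDatesToDelivery(int[] d) {
        int big = 0;
        for (int dj : d) big = Math.max(big, dj);  // D = max due date suffices
        int[] q = new int[d.length];
        for (int j = 0; j < d.length; j++) q[j] = big - d[j];
        return q;
    }
}
```

With these definitions $c_j + q_j = K + (c_j - d_j)$ for every job $j$, so the two objectives differ by the constant $K$ and are minimized by the same schedules.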

Because of the above equivalence, we will use both above formulations interchangeably. (As noted briefly earlier, the version with delivery times naturally occurs in implicit enumeration algorithms for job-shop scheduling problem and is used for the calculation of lower bounds.)

1.2. Overview

Jackson’s heuristic works iteratively: at each scheduling time $t$ (given by a job release or completion time), among the jobs released by time $t$ it schedules one with the largest delivery time (or the smallest due date). For the sake of conciseness, Jackson’s heuristic has been commonly abbreviated as EDD-heuristic (earliest due date) or, alternatively, LDT-heuristic (largest delivery time). Since the number of scheduling times is $O(n)$ and at each scheduling time a search for a minimal/maximal element in an ordered list is accomplished, the time complexity of the heuristic is $O(n \log n)$.

A number of efficient algorithms are variations of Jackson’s heuristic. For instance, Potts [6] has proposed a modification of this heuristic with time complexity $O(n^2 \log n)$ for the problem $1|r_j, q_j|C_{\max}$. His algorithm repeatedly applies the heuristic $O(n)$ times and obtains an improved approximation ratio of $3/2$. Hall and Shmoys [7] have elaborated polynomial approximation schemes for the same problem and also a $4/3$-approximation algorithm for its version with precedence relations, with the same time complexity of $O(n^2 \log n)$ as the above algorithm from [6]. Jackson’s heuristic can be efficiently used for the solution of shop scheduling problems. Using Jackson’s heuristic as a schedule generator, McMahon and Florian [8] and Carlier [9] have proposed efficient enumerative algorithms for $1|r_j, q_j|C_{\max}$. Grabowski et al. [10] use the heuristic to obtain an initial solution in another enumerative algorithm for the same problem. Garey et al. [11] have applied the same heuristic in an $O(n \log n)$ algorithm for the feasibility version of this problem with equal-length jobs (in the feasibility version the job due dates are replaced by deadlines, and a schedule in which all jobs complete by their deadlines is looked for). Again using Jackson’s heuristic as a schedule generator, other polynomial-time direct combinatorial algorithms were described: in [12] an algorithm was proposed for the minimization version of the latter problem with two possible job processing times, and in [13] an algorithm that minimizes the number of late jobs with release times on a single machine when job preemptions are allowed. Without preemptions, two polynomial-time algorithms for equal-length jobs on a single machine and on a group of identical machines were proposed in [14] and [15], respectively.

Jackson’s heuristic has been used in multiprocessor scheduling problems as well. For example, for the feasibility version with identical machines and equal-length jobs, polynomial-time algorithms were proposed in Simons [16] and Simons and Warmuth [17], respectively. Using the same heuristic as a schedule generator, in [18] an algorithm was proposed for the minimization version of the latter problem, with a time complexity that depends on $q_{\max}$, the maximal job delivery time, and on an additional instance parameter.

The heuristic has been successfully used for obtaining lower bounds in job-shop scheduling problems. In the classical job-shop scheduling problem, the preemptive version of Jackson’s heuristic applied to a specially derived single-machine problem immediately gives a lower bound; see, for example, Carlier [9], Carlier and Pinson [19], and Brinkkotter and Brucker [20] and the more recent works of Gharbi and Labidi [21] and Croce and T’kindt [22]. Carlier and Pinson [23] have used the extended Jackson’s heuristic for the solution of the multiprocessor job-shop problem with identical machines, and it can also be adapted for the case when the parallel machines are unrelated (see [24]). Jackson’s heuristic can be useful for parallelizing the computations in job-shop scheduling (Perregaard and Clausen [25]) and also for parallel batch scheduling problems with release times (Condotta et al. [26]).

2. Theoretical Estimations of the Heuristic’s Performance

We start this section with a detailed description of Jackson’s heuristic. It distinguishes $n$ scheduling times, the time moments at which a job is assigned to the machine. Initially, the earliest scheduling time is set to the minimum job release time. Among all jobs released by that time, a job with the minimum due date (the maximum delivery time, alternatively) is assigned to the machine (ties being broken by selecting a longest job). Iteratively, the next scheduling time is either the completion time of the latest job so far assigned to the machine or the minimum release time of a yet unassigned job, whichever is more (since no job can be started before the machine gets idle, neither can it be started before its release time). And again, among all jobs released by this scheduling time, a job with the minimum due date (the maximum delivery time, alternatively) is assigned to the machine. Note that the heuristic creates no gap that can be avoided: it always schedules an already released job once the machine becomes idle, whereas among the yet unscheduled jobs released by each scheduling time it gives the priority to a most urgent one (i.e., one with the smallest due date or, alternatively, the largest delivery time).
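To make the above description concrete, the following self-contained Java sketch implements the heuristic in its LDT form (a minimal sketch with our own naming, not the authors’ experimental code). A priority queue keeps the released jobs ordered by nonincreasing delivery time, with ties broken in favor of a longest job, which yields the $O(n \log n)$ running time mentioned in Section 1.2.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.PriorityQueue;

// A minimal sketch of Jackson's (LDT) heuristic for 1|r_j,q_j|C_max.
final class Jackson {
    record Job(int id, int r, int p, int q) {}   // release, processing, delivery

    // Returns the makespan max_j (c_j + q_j) of the constructed J-schedule.
    static int makespan(Job[] jobs) {
        Job[] byRelease = jobs.clone();
        Arrays.sort(byRelease, Comparator.comparingInt(Job::r));
        // Released jobs, most urgent (largest q) first; ties: longest job first.
        PriorityQueue<Job> released = new PriorityQueue<>(
                Comparator.comparingInt((Job j) -> -j.q())
                          .thenComparingInt(j -> -j.p()));
        int next = 0, t = 0, cmax = 0;
        while (next < byRelease.length || !released.isEmpty()) {
            if (released.isEmpty())               // machine idle: jump over the gap
                t = Math.max(t, byRelease[next].r());
            while (next < byRelease.length && byRelease[next].r() <= t)
                released.add(byRelease[next++]);
            Job j = released.poll();              // most urgent released job
            t += j.p();                           // completion time c_j
            cmax = Math.max(cmax, t + j.q());     // full completion c_j + q_j
        }
        return cmax;
    }
}
```

On the 11-job instance of Figure 1(a) discussed below, this sketch returns 210, the makespan of the J-schedule analyzed in the text.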

Let $\sigma$ be the schedule obtained by the application of Jackson’s heuristic (J-heuristic, for short) to the originally given problem instance. Schedule $\sigma$ and, in general, any Jackson’s schedule (J-schedule, for short), that is, one constructed by the J-heuristic, may contain a gap, which is a maximal consecutive time interval in which the machine is idle. We assume that there occurs a 0-length gap whenever job $j$ starts at its earliest possible starting time, that is, its release time, immediately after the completion of job $i$; here $t_j$ ($c_j$, resp.) denotes the starting (completion, resp.) time of job $j$.

A block in a J-schedule is a consecutive part of it consisting of successively scheduled jobs without any gap in between, preceded and succeeded by a (possibly 0-length) gap.

J-schedules have useful structural properties. The following basic definitions, taken from [18], will help us to expose these properties.

Given a J-schedule $\sigma$, let $o$ be a job that realizes the maximum job lateness in $\sigma$; that is, $L_o = \max_j L_j$. Let, further, $B$ be the block in $\sigma$ that contains job $o$. Among all the jobs in $B$ with this property, the latest scheduled one is called an overflow job in $\sigma$ (we just note that this job does not necessarily end block $B$).

A kernel in $\sigma$ is a maximal (consecutive) job sequence ending with an overflow job $o$ such that no job from this sequence has a due date more than $d_o$. For a kernel $K$, we let $r(K) = \min_{j \in K} r_j$.

It follows that every kernel is contained in some block in $\sigma$, and the number of kernels in $\sigma$ equals the number of the overflow jobs in it. Furthermore, since any kernel belongs to a single block, it may contain no gap.
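In code, the overflow job and the kernel of a given J-schedule can be located by a single backward scan, as in the following Java sketch (our own data layout: jobs are listed in scheduling order, with parallel arrays of completion, processing, and delivery times; all names are illustrative).

```java
// Sketch: locating the overflow job and the kernel in a J-schedule.
// Jobs appear in scheduling order; c[k], p[k], q[k] are the completion,
// processing, and delivery times of the k-th scheduled job.
final class KernelFinder {

    // Overflow job: the latest scheduled job realizing the maximum
    // full completion time max_k (c[k] + q[k]).
    static int overflowIndex(int[] c, int[] q) {
        int o = 0;
        for (int k = 1; k < c.length; k++)
            if (c[k] + q[k] >= c[o] + q[o]) o = k;   // ">=" keeps the latest one
        return o;
    }

    // Kernel: maximal consecutive sequence ending with the overflow job o
    // in which every job is at least as urgent as o (delivery time >= q[o])
    // and which contains no gap (each job starts where the previous one ends).
    static int kernelStart(int[] c, int[] p, int[] q, int o) {
        int s = o;
        while (s > 0
                && q[s - 1] >= q[o]                  // no less urgent than o
                && c[s - 1] == c[s] - p[s])          // tight: no gap in between
            s--;
        return s;   // the kernel is the subsequence s..o
    }
}
```

If the job scheduled immediately before position `s` is strictly less urgent than the overflow job and ends exactly where the kernel starts, it is the live emerging job introduced below.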

The following lemmas are used in the proof of Theorem 6. A statement similar to Lemma 1 can be found in [27], and one similar to Lemma 3 in [18]. Lemma 5 is obtained as a consequence of these two lemmas, though the related result has been known earlier. For the sake of completeness of our presentation, we give all our claims with proofs.

Lemma 1. The maximum job lateness (the makespan) of a kernel $K$ cannot be reduced if the earliest scheduled job in $K$ starts at time $r(K)$. Hence, if a J-schedule contains a kernel with this property, then it is optimal.

Proof. Recall that all jobs in $K$ are no less urgent than the overflow job $o$ and that the jobs in $K$ form a tight sequence (i.e., one without any gap). Then, since the earliest job in $K$ starts at time $r(K)$, no reordering of the jobs in $K$ can reduce the current maximum lateness, which is $L_o$. Hence, there is no feasible schedule with a maximum lateness less than $L_o$; that is, $\sigma$ is optimal.

Thus $\sigma$ is already optimal if the condition in Lemma 1 holds. Otherwise, there must exist a job less urgent than the jobs of kernel $K$, scheduled before all jobs of $K$, that delays the jobs in $K$ (and the overflow job $o$). By rescheduling such a job to a later time moment, the jobs in kernel $K$ can be restarted earlier. We need some extra definitions to define this operation formally.

Suppose job $i$ precedes job $j$ in J-schedule $\sigma$. We will say that $i$ pushes $j$ in $\sigma$ if the J-heuristic will reschedule job $j$ earlier whenever $i$ is forced to be scheduled behind $j$.

Since the earliest scheduled job of kernel $K$ does not start at its release time (Lemma 1), it is immediately preceded and pushed by a job $l$ with $q_l < q_o$. In general, we may have more than one such job scheduled before kernel $K$ in block $B$ (the block containing $K$). We call such a job an emerging job for $K$, and we call the latest scheduled one (job $l$ above) the live emerging job.

From the above definition and Lemma 1 we immediately obtain the following corollary.

Corollary 2. If $\sigma$ contains a kernel which has no live emerging job, then it is optimal.

We illustrate the above introduced definitions on a problem instance of $1|r_j, q_j|C_{\max}$. The corresponding J-schedule is depicted in Figure 1(a). In that instance, we have 11 jobs; job 1 has $r_1 = 0$, $p_1 = 100$, and $q_1 = 0$. All the rest of the jobs are released at time moment 10 and have the equal processing time 1 and the delivery time 100. These data completely define our problem instance.

Consider the initial J-schedule $\sigma$ of Figure 1(a), consisting of a single block. In that schedule, the jobs are included in the increasing order of their indexes. The earliest scheduled job 1 is the live emerging job, which is followed by jobs 2–11 scheduled in this order (note that, for technical reasons, the scaling on the vertical and horizontal axes is different). It is easy to see that the latter jobs form the kernel $K$ in schedule $\sigma$. Indeed, all the 11 jobs belong to the same block, job 1 pushes the following jobs, and its delivery time is less than that of these pushed jobs. Hence, job 1 is the live emerging job in schedule $\sigma$. The overflow job is job 11 since it realizes the value of the maximum full completion time (the makespan) in schedule $\sigma$, which is $110 + 100 = 210$. Therefore, jobs 2–11 form the kernel in $\sigma$.

Note that the condition in Lemma 1 is not satisfied for schedule $\sigma$. Indeed, its kernel starts at time 100, which is more than $r(K) = 10$. Furthermore, the condition of Corollary 2 is also not satisfied for schedule $\sigma$, and it is not optimal. The optimal schedule, with makespan 120, is depicted in Figure 1(b), in which the live emerging job 1 is rescheduled behind all kernel jobs.

Below we use $|\sigma|$ for the makespan (maximum full job completion time) of J-schedule $\sigma$ and $C^*$ ($L^*$, resp.) for the optimum makespan (lateness, resp.).

Lemma 3. Consider $|\sigma| < C^* + p_l$ (equivalently, $|\sigma| - C^* < p_l$), where $l$ is the live emerging job for kernel $K$.

Proof. We need to show that the delay imposed by job $l$ on the jobs of kernel $K$ in schedule $\sigma$ is less than $p_l$. Indeed, $\sigma$ is a J-schedule. Hence, no job of $K$ could have been released by the time $t_l$ when job $l$ was started in $\sigma$, as otherwise the J-heuristic would have included the former job instead of $l$. At the same time, the earliest job from $K$ is scheduled immediately after job $l$ in $\sigma$, that is, at time $c_l = t_l + p_l$. Then the difference between the starting time of the former job and the time moment $r(K) > t_l$ is less than $p_l$. Now our claim follows from Lemma 1.

For our problem instance and the corresponding schedule $\sigma$, the above bound is almost reached. Indeed, $|\sigma| = 210$, whereas $C^* + p_l = 120 + 100 = 220$.

As we have mentioned in Section 1.2, the preemptive version of Jackson’s heuristic (which gives an optimal solution to the preemptive problem $1|r_j, q_j, pmtn|C_{\max}$) gives a lower bound for $1|r_j, q_j|C_{\max}$; that is, the optimal objective value of the preemptive version is a lower bound for the nonpreemptive case. By going deeper into the structure of Jackson’s preemptive schedules, Croce and T’kindt [22] have proposed another (a more “complete”) lower bound, which, in practice, also turns out to be more efficient than the above lower bound yielded by an optimal preemptive solution. The lower bound proposed by Gharbi and Labidi [21] is based on the concept of the so-called semipreemptive schedules, derived from the observation that in an optimal nonpreemptive schedule a part of some jobs is to be scheduled within a certain time interval. This leads to the semipreemptive schedules, for which stronger lower bounds can be derived.
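The preemptive rule itself is easy to state in code: whenever a job is released or completed, run the most urgent available job, preempting the current one if necessary. The following Java sketch (our notation; illustrative names) computes the resulting optimal preemptive makespan, and hence a lower bound for the nonpreemptive problem.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.PriorityQueue;

// Sketch: preemptive LDT rule, optimal for 1|r_j,q_j,pmtn|C_max and
// therefore a lower bound on the nonpreemptive optimum C*.
final class PreemptiveJackson {
    static int makespan(int[] r, int[] p, int[] q) {
        int n = r.length;
        Integer[] order = new Integer[n];
        for (int j = 0; j < n; j++) order[j] = j;
        Arrays.sort(order, Comparator.comparingInt(j -> r[j]));
        int[] rem = p.clone();                       // remaining processing times
        PriorityQueue<Integer> avail =
                new PriorityQueue<>(Comparator.comparingInt((Integer j) -> -q[j]));
        int next = 0, t = 0, cmax = 0;
        while (next < n || !avail.isEmpty()) {
            if (avail.isEmpty()) t = Math.max(t, r[order[next]]);   // jump over a gap
            while (next < n && r[order[next]] <= t) avail.add(order[next++]);
            int j = avail.poll();                    // most urgent available job
            int run = (next < n) ? Math.min(rem[j], r[order[next]] - t) : rem[j];
            rem[j] -= run;
            t += run;
            if (rem[j] == 0) cmax = Math.max(cmax, t + q[j]);       // j completes
            else avail.add(j);                       // j is preempted at a release
        }
        return cmax;
    }
}
```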

Unlike the above lower bounds, Lemma 3 implicitly defines a lower bound $|\sigma| - p_l$ on $C^*$, derived from the solution of the nonpreemptive Jackson’s heuristic. This lower bound can be further strengthened using the following concept. Let the delay for kernel $K$ be $\delta(K) = c_l - r(K)$ ($l$ ($o$, resp.) again stands for the live emerging (overflow, resp.) job for kernel $K$).

Lemma 4. $|\sigma| - \delta(K)$ ($L_{\max}(\sigma) - \delta(K)$, resp.) is a lower bound on the optimal makespan $C^*$ (lateness $L^*$, resp.).

The proof is similar to that of Lemma 3, with the extra observation that the delay of the earliest scheduled job of kernel $K$ is expressed more accurately by $\delta(K) = c_l - r(K)$.

Observe that $\delta(K) < p_l$. In fact, $\delta(K)$ can be drastically smaller than $p_l$. For instance, if in our problem instance from Figure 1 the release time of the kernel jobs were 90 (instead of 10), then $\delta(K) = 100 - 90 = 10$. In general, observe that the smaller $\delta(K)$ is (i.e., the larger $r(K)$ is), the more essential is the difference between the lower bounds of Lemmas 4 and 3.
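Both lower bounds are immediate to compute once the kernel and its live emerging job are known. The short Java sketch below (continuing the illustrative fragments above) returns them for a J-schedule with makespan `m`; as a sanity check, on the variant just discussed (kernel release time 90) it yields $210 - 100 = 110$ by Lemma 3 and $210 - 10 = 200$ by Lemma 4, the latter matching the optimum makespan of that variant.

```java
// Sketch: the lower bounds of Lemmas 3 and 4 on the optimum makespan C*.
// m  = makespan |sigma| of the J-schedule,
// pl = processing time of the live emerging job l,
// cl = completion time of l, rK = r(K), the minimum release time in the kernel.
final class KernelBounds {
    static int lemma3(int m, int pl) {
        return m - pl;                    // |sigma| - p_l <= C*   (Lemma 3)
    }

    static int lemma4(int m, int cl, int rK) {
        int delta = cl - rK;              // delay delta(K) = c_l - r(K) < p_l
        return m - delta;                 // |sigma| - delta(K) <= C*   (Lemma 4)
    }
}
```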

Now we can easily derive the known performance ratio 2 of the J-heuristic for the version $1|r_j, q_j|C_{\max}$ (we note that estimating the approximation for the version with due dates, in which the maximum lateness is minimized, is less appropriate: for instance, the optimum lateness might be negative).

Lemma 5. The J-heuristic gives a 2-approximate solution for $1|r_j, q_j|C_{\max}$; that is, $|\sigma| < 2C^*$.

Proof. If there exists no live emerging job for the kernel, then $\sigma$ is optimal by Corollary 2. Suppose $l$ exists; clearly, $p_l < C^*$ (as job $l$ has to be scheduled in an optimal schedule and there is at least one more (kernel) job in it). Then by Lemma 3, $|\sigma| < C^* + p_l < 2C^*$.

For the purpose of the estimation of the approximation given by Jackson’s heuristic, we express $p_l$ as a fraction of an optimal objective value $C^*$ ($L^*$). Alternatively, instead of the optimal objective value we may use its lower bound from Lemma 4 (as $C^*$ may not be known). Let $\kappa$ be such that $p_l \le C^*/\kappa$; that is, $\kappa \le C^*/p_l$. Since $|\sigma| - \delta(K)$ is a lower bound on $C^*$ ($|\sigma| - \delta(K) \le C^*$), we let $\kappa = (|\sigma| - \delta(K))/p_l$, and thus we have that $p_l \kappa = |\sigma| - \delta(K) \le C^*$; that is, $\kappa$ is a valid assignment. Then note that $\kappa$ for any problem instance can be obtained in time $O(n \log n)$, by a single application of the J-heuristic.
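In code, the valid $\kappa$ of this paragraph and the resulting instance-specific guarantee of Theorem 6 below take one line each (a sketch with our own names; if no live emerging job exists, the schedule is already optimal by Corollary 2 and no ratio needs to be estimated).

```java
// Sketch: instance-specific approximation guarantee after one run of the
// J-heuristic (illustrative names). m = makespan |sigma|, pl = p_l of the
// live emerging job, delta = delta(K) = c_l - r(K).
final class RatioEstimate {
    static double theorem6Bound(int m, int pl, int delta) {
        double kappa = (m - delta) / (double) pl;   // valid since m - delta <= C*
        return 1.0 + 1.0 / kappa;                   // |sigma|/C* < 1 + 1/kappa
    }
}
```

For the instance of Figure 2 discussed at the end of this section, this gives $\kappa = 115/10 = 11.5$ and the bound $\approx 1.087$; the stronger estimation of Theorem 7 is obtained in the same way with $\delta(K)$ in place of $p_l$ in the denominator.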

In the following two theorems, $l$ is the live emerging job for kernel $K$, as before.

Theorem 6. Consider $|\sigma|/C^* < 1 + 1/\kappa$, for any $\kappa \le C^*/p_l$.

Proof. By Lemma 3, $|\sigma|/C^* < (C^* + p_l)/C^* = 1 + p_l/C^* \le 1 + 1/\kappa$.

Similarly as in Lemma 4, we can strengthen Theorem 6 by replacing $p_l$ with $\delta(K)$, redefining, respectively, $\kappa'$; that is, we now allow any $\kappa' \le C^*/\delta(K)$ (where we may let $\kappa' = (|\sigma| - \delta(K))/\delta(K)$).

Theorem 7. Consider $|\sigma|/C^* \le 1 + 1/\kappa'$, for any $\kappa' \le C^*/\delta(K)$.

Proof. The proof is similar to the proof of Theorem 6, with the only difference that the strict inequality of Lemma 3 is replaced by the nonstrict $|\sigma| \le C^* + \delta(K)$, as now, by Lemma 4, $|\sigma| - \delta(K) \le C^*$.

Note that if we let $\kappa = (|\sigma| - \delta(K))/p_l$ and $\kappa' = (|\sigma| - \delta(K))/\delta(K)$ in Theorems 6 and 7, respectively, and we replace $C^*$ with the lower bound $|\sigma| - \delta(K)$, we obtain the valid inequalities $|\sigma|/C^* < 1 + 1/\kappa$, for any $\kappa \le (|\sigma| - \delta(K))/p_l$, and $|\sigma|/C^* \le 1 + 1/\kappa'$, for any $\kappa' \le (|\sigma| - \delta(K))/\delta(K)$, respectively.

To illustrate the above obtained results, let us consider another modification of our problem instance of Figure 1. In that modified instance the emerging job remains longer than the kernel jobs, although the difference between the processing times is not as drastic as in the previous instance (such an instance better characterizes an “average” problem instance). We have a long emerging job 1 with $r_1 = 0$, $p_1 = 10$, and $q_1 = 0$, and the processing time of each of the rest of the jobs is again 1. The latter jobs are released at time 5 and also have the same delivery times as in the first instance. The J-schedule $\sigma$, with makespan 120, is depicted in Figure 2(a). The lower bound on the optimum makespan defined by Lemma 3 is hence $|\sigma| - p_1 = 110$, whereas Lemma 4 defines a stronger lower bound $|\sigma| - \delta(K) = 115$, since $\delta(K) = c_1 - r(K) = 10 - 5 = 5$. The makespan of the optimal schedule depicted in Figure 2(b) is the same as this lower bound.

The approximation provided by Jackson’s heuristic for this problem instance can be obtained from Theorems 6 and 7. Based on Theorem 6, we use the lower bound 115 on $C^*$ and obtain a valid $\kappa = 115/p_1 = 11.5$ and the resultant approximation ratio $1 + 1/11.5 \approx 1.087$. Using Theorem 7 and the fact that $\delta(K) = 5$, with the same lower bound 115 we obtain another valid $\kappa' = 115/5 = 23$ and the approximation ratio $1 + 1/23 \approx 1.043$. For this instance, we can also obtain the approximation ratio directly, $120/115 \approx 1.043$, which coincides with the estimation of Theorem 7.

Observe that for our third (“average”) problem instance, Jackson’s heuristic gave an almost optimal solution, and the resultant approximation ratio coincides with the estimation of Theorem 7. This encouraging observation is completely supported and even outperformed by our computational experiments discussed in Section 3.

3. On Heuristic’s Practical Behavior

Recall that in our first problem instance from Figure 1 we had a “huge” live emerging job essentially contributing to the makespan of $\sigma$. As a result, Jackson’s heuristic has created a schedule with an almost worst possible performance ratio (see Lemma 5). Two distinct facts were decisive in such a poor performance of the J-heuristic: first, the processing time $p_l$ of the live emerging job was too large; second, the delay $\delta(K)$ was also large, “close” to $p_l$. We have carried out computational experiments aiming to verify how often, in practice, both of these two events occur. The results were more than encouraging, showing that it is highly unlikely that both of these events take place, as we describe a bit later in this section. As an example, for the second modified problem instance from the previous section (the one with $r(K) = 90$), the second event did not occur (as $\delta(K)$ was 10, instead of 90, a value relatively small compared to the optimal objective value). As a result, Jackson’s heuristic has provided a good approximation. Our third problem instance, from Figure 2(a), reflects typical characteristics of an “average” problem instance.

Our study has shown that in real-life scenarios, where the processing requirement of an individual (live emerging) job can approximately be estimated as a fraction of the expected total work time, the quality of the solution that Jackson’s heuristic will deliver can be predicted in terms of that fraction (Theorems 6 and 7) without actually running the algorithm and can be significantly better than the known worst-case bound of 2. Alternatively, Lemmas 3 and 4 provide lower bounds on the expected total work time, from which the above fraction can directly be derived.

3.1. The Computational Experiments

We have implemented Jackson’s heuristic (the version for $1|r_j, q_j|C_{\max}$) in Java using the development environment Eclipse IDE for Java Developers (version Luna Service Release 1 (4.4.1)) under the 64-bit Windows 8.1 operating system and have used a laptop with an Intel Core i7 (2.4 GHz) and 8 GB of DDR3 RAM to run the code. The inputs to our main program are plain text files with job data that we have generated randomly. The program for the generation of our instances was constructed under the same development environment as our main program. The job parameters (the release times, the processing times, and the due dates) were generated randomly, somewhat similarly to [9, 21], as follows. For the job release times and due dates a random number was generated with the standard random number generation function in Java, within an open range proportional to the number of jobs $n$ in each instance. The processing times were generated both from a small interval (as in [9, 21]) and from a larger one.
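For reproducibility, the sketch below outlines a generator of this kind in Java; the release/delivery-time span and the processing-time interval bounds are illustrative placeholders, not necessarily the exact constants used in our experiments.

```java
import java.util.Random;

// Sketch of a random instance generator in the spirit described above.
// The release/delivery-time span and the processing-time interval are
// illustrative placeholders, not the exact constants of the experiments.
final class InstanceGenerator {
    // Returns {r, p, q} for n jobs with p[j] drawn from [pMin, pMax].
    static int[][] generate(int n, int pMin, int pMax, long seed) {
        Random rnd = new Random(seed);
        int span = 50 * n;                               // grows with n (placeholder)
        int[] r = new int[n], p = new int[n], q = new int[n];
        for (int j = 0; j < n; j++) {
            r[j] = rnd.nextInt(span);                    // release time
            p[j] = pMin + rnd.nextInt(pMax - pMin + 1);  // processing time
            q[j] = rnd.nextInt(span);                    // delivery time (due date)
        }
        return new int[][] { r, p, q };
    }
}
```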

For deeper analysis of the created solutions, we have augmented the code with a procedure detecting a kernel $K$, the corresponding live emerging and overflow jobs ($l$ and $o$, resp.), and the corresponding delay $\delta(K)$. In this way, for every created J-schedule $\sigma$, we were able to calculate the objective value $|\sigma|$ ($L_{\max}(\sigma)$) and the lower bounds $|\sigma| - p_l$ and $|\sigma| - \delta(K)$ from Lemmas 3 and 4, respectively.

We have created instances for five different numbers of jobs $n$, with the processing times drawn from the small and from the large interval, respectively. We have created 20 instances for each such combination, in total 200 instances. For more than half of these instances, no live emerging job existed in the created J-schedules. Hence these instances were solved optimally due to Corollary 2. For the rest of the instances, there existed an emerging job and a kernel $K$. However, importantly, the corresponding delay $\delta(K)$ was insignificant, so that most of these instances were solved within a factor of 1.009 of the optimum objective value, whereas the worst approximation ratio was less than 1.03.

A detailed description of the experimental data can be found in Tables 1–10. Each table represents the 20 randomly generated problem instances with a particular $n$ and processing time interval. The problem instances that were solved optimally (due to the nonexistence of the live emerging job) have no specific entries. For the rest of the problem instances, the tables, besides the approximation ratio of the obtained solution due to Theorem 6 (A.R.), specify the parameters of the live emerging job $l$ and the overflow job $o$ for the earliest encountered kernel $K$. In addition, the lateness and the completion time of the overflow job $o$, the makespan $|\sigma|$ of $\sigma$, and the corresponding delay $\delta(K)$ are specified. The lower bounds from Lemma 4, in terms of the lateness and the makespan, respectively, are given in separate columns. As can be seen from the tables, these lower bounds turned out to be quite strong for the tested problem instances, because of their closeness to the objective values in $\sigma$ (represented in the columns for the maximum lateness and the makespan, respectively).

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.