In this research, we study an extended version of the joint order batching and scheduling optimization problem for manual vegetable order picking and packing lines that accounts for workers’ fatiguing effect. Many B2C fresh produce grocers in China face this problem daily, and poor solutions can severely reduce both workflow efficiency in the distribution center and customer satisfaction. In this order batching and sequencing problem, the setup time for processing each batch is volume-dependent and similarity-dependent, since fewer motions are needed to replenish and pick similar orders. In addition, each worker’s fatiguing effect, usually caused by late shifts and repetitive operations, affects order processing times and is assumed to follow a general logistic growth form with respect to the start time of order processing. We develop a heuristic approach to solve the resulting NP-hard problem, with the objective of minimizing the total completion time. For order batching, a revised similarity index takes into account not only the number of common items in any two orders but also the proportion of these items, reflecting the features of vegetable orders. To sequence batches, a genetic algorithm is adapted and improved with several proposed initialization and precedence rules. Within each batch, a revised nondecreasing item quantity algorithm is used. The performance of the proposed heuristic is evaluated on numerical instances generated from the practical warehouse operations of our partnering B2C grocer, and the results demonstrate its efficiency.

1. Introduction

In this paper, we consider the optimization of order batching and sequencing on manual vegetable order picking and packing lines as a practical parallel machine serial-batch scheduling problem.

This problem is posed by our partner, an online B2C fresh produce grocery supplier in Beijing, China. In the vegetable B2C direct-sales mode, vegetables are harvested from the production bases according to online orders. After cleaning and precooling, SKU packing, and order picking and packing, the fresh products are delivered to Intelligent Distribution Dispensers located in different communities on the same day or early the next day. To address the perishability and time sensitivity of fresh products, the company adopts a JIT operation mode in its distribution center, aiming at zero inventory while processing customized orders. Order picking and packing for fresh products is carried out on parallel manual lines rather than in the aisle-based picking systems used for other commercial products (see Section 2). In the distribution center, order picking and packing is the least efficient and most costly operation in order processing and usually has to be scheduled on the night shift to allow early delivery, so schedules that ignore pickers’ fatiguing effect can deviate considerably from expectations.

This study is further motivated by the common need for effective scheduling of manual vegetable order picking and packing lines. Our industry partner’s previous attempts to develop fast and effective decision approaches for customer order scheduling had limited success. Unsatisfactory performance is still observed, as is the case for many other vegetable order picking and packing lines in China. A common reason is that the mathematical models adopted by these tools are oversimplified; in addition, order batching and scheduling are not optimized around the distinctive characteristics of fresh products. Many relevant aspects of manual picking lines have not been incorporated, including the similarity dependency of batch setup time, which arises from the convenience of replenishing and picking similar orders, and the work fatigue that results from late shifts and repetitive operations. These practical aspects may affect the actual production plan significantly. In this paper, we incorporate them into an order batching and sequencing optimization model, with the aim of improving order processing efficiency for fresh products and serving customers better. In such systems, each order is considered the smallest unit and most operations are labor-intensive.

Order batching and sequencing problems in warehouses have been studied for decades [1–5]. As mentioned, the order picking and packing system studied here differs from traditional order picking and scheduling, which is mostly based on picker routing in an aisle system: the picker walks and searches among the racks, and the objective is typically to minimize the total walking distance [6, 7]. However, human factors such as learning and fatiguing effects are generally overlooked. In recent years, some researchers have proposed considering such ergonomic effects [8–12]. This study therefore focuses on the special order picking system for fresh products with potential human fatiguing effects, which can be regarded as a practical exploration in this new area. Since orders are batched and processed on parallel picking and packing lines (treated as machines) without picker routing, the problem studied here can be categorized as order batching and scheduling on parallel machines, and the relevant literature is reviewed next.

Order batching is a method of grouping a set of orders into a number of subsets, each of which can be retrieved or processed together in one operation. Order batching methods have also been studied for decades, and classical solution approaches can be divided into priority rule-based algorithms, seed algorithms, saving algorithms, and data mining approaches [13–16]. The related operation is known as the batch scheduling problem, which determines the optimal grouping and scheduling of jobs to be processed on capacitated machines. Current research makes one of two main assumptions about the jobs in the same batch. The first is that the jobs within each batch are processed sequentially and all of them are completed together upon the completion of the last job of the batch (known as s-batch) [17–19]. The second is that the jobs in the same batch are processed together, with the same starting time and the same completion time (known as p-batch) [20–23]. In terms of the machines involved, current research on batch scheduling falls into two main categories: single machine batch scheduling [24–27] and batch scheduling on multiple machines. The latter includes parallel batch machines [28–32] and flow shop batching and scheduling problems [33–35].

For single machine batch scheduling problems, most of the studies above seek polynomial-time solution methods for special machine scheduling cases. Although they provide a theoretical basis for practice, they are not necessarily suitable for delivering practical solutions to many industry clients, including our partner in China. Studies of multiple machine batching and scheduling share some similarities but deal with different operating systems, setups, processing times, resource constraints, and optimization objectives. In the literature reviewed so far, job processing times are assumed to be constant. However, this assumption is not appropriate for modeling many manual processes, where a job executed under the same or almost the same conditions often has a varying processing time. On a manual vegetable order picking and packing line, the setup time depends on the similarity between adjacent batches and on the item types that need to be prepared, while the processing time depends on the worker’s current fatigue and on the item types required by the order. A worker has to assemble a large number of similar products, and the time needed to assemble one product depends on the worker’s knowledge, the number of repetitions, and other factors that change over time. To bridge these research gaps, this paper studies the joint optimization of order batching and sequencing in the manual vegetable order picking and packing system while taking the workers’ fatiguing effect into account.

The efficiency of a worker decreases after working for long hours, which increases the processing time for the same task. This phenomenon is often referred to as the fatiguing effect (also known as the aging effect or efficiency deterioration). In addition, different workers may have different working habits, and thus their working efficiencies may deteriorate following different fatigue curves. This can lead to considerable differences in the optimal schedule for labor-intensive tasks, such as different task allocation decisions. The fatiguing effect in the context of scheduling has been recognized in the literature, for instance, for different machine types: single machine scheduling [36–38], parallel machine scheduling [39, 40], and flow shop scheduling [41, 42]; and for different types of fatigue: position-related fatigue [43] and linear (or piecewise linear) time-related fatigue [44, 45]. In recent years, studies of batch scheduling that consider the fatiguing effect have appeared [25, 46–52].

As reviewed, most of the research studies machine scheduling (or job sequencing) with machine deterioration effects and has not been extended to labor-intensive operation systems with human fatiguing effects. In addition, for mathematical convenience, the relationship between job processing time and job start time or position is set to be linear or a low-degree polynomial in most of the relevant literature. Our problem is order batching and sequencing in the context of practical manual order picking and packing operations considering pickers’ fatigue. In this paper, we instead use a generalized logistic function to characterize the work fatiguing effect and, subsequently, the job processing time. The logistic function was originally developed to model growth in various natural biological processes such as population growth, weight growth, and stress accumulation [53]. It captures the fatigue trend well and has been used to describe the relationship between human stress and work time in human reliability analysis [54, 55]. Unlike machine deterioration, which can continue without limit, human working efficiency has an upper bound at the beginning and a lower bound at the end, and the fatiguing rate varies over time. Hence, a general logistic relationship is used here to map human operating efficiency into a bounded range.

In summary, as an extension of traditional order batching and scheduling problems, the studied problem arises from a real-world application with some new characteristics. Pickers’ fatigue needs to be included in the picking operation, and the setup time needs to be set based on the features of fresh products. This motivates our innovations as follows: (1) We establish an order batching and scheduling model for the vegetable order picking and packing system. In the model, the setup time is associated with the similarity degree of batches and the number of item types in the batch, and the processing time of an order is associated with pickers’ fatigue and the order size. This model reflects the characteristics of vegetable picking more concretely. (2) We propose a fatiguing effect curve for pickers based on the logistic function, which has not been considered in the previous literature and reflects changes in human fatigue more naturally. (3) We propose a heuristic solution approach aimed at solving the scheduling problem on picking lines more efficiently. In the order sequencing stage, a revised nondecreasing item quantity (NDIQ) algorithm is proposed, after a brief proof, to sequence the orders within a single batch, while different batches are sorted using an improved genetic algorithm (GA). To overcome the drawbacks that standard encoding schemes may cause, a two-echelon real-number encoding structure is adopted and two precedence rules are combined to initialize the chromosomes more efficiently. Our contributions lie in two aspects: first, this study extends the new research trend of considering human factors in order picking systems; second, our model, a special order batching and scheduling model with similarity-based setup and pickers’ fatiguing effect, enriches the batch scheduling field. In terms of practical implications, the proposed solution approach can be applied and extended to labor-intensive order processing systems.

This research is organized as follows. Section 2 introduces and describes the studied problem in detail. Section 3 presents the heuristic solution approach. Section 4 shows the performance of the proposed solution methods. Finally, Section 5 draws conclusions and outlines future research directions.

2. Problem Description

The problem initially arose from our collaborator, a fresh produce supplier in Beijing operating in B2C direct-selling mode. In the basic process flow (see Figure 1), customers first place orders online, and vegetables are then harvested from the production base according to the day’s orders. After cleaning and precooling in the distribution center, fresh products of various types are first prepared as SKUs and stored in a temporary storage area. The orders are then picked and packed according to the customer orders and dispatched to meet the requirement of “same day arrival” or “next day arrival.” Following this process flow, the layout of the distribution center is designed as shown in Figure 2, where orders are processed on parallel lines (regarded as machines).

Due to perishability and time sensitivity, grouped order picking is adopted: orders are grouped based on their similarities, and when a group of orders arrives, the picker picks the required types and amounts of SKUs for each order and packs it. This operating mode combines the low error rate of picking by order with the efficiency of grouped operation. Hence, before order picking and packing, similar orders are batched and scheduled in sequence for the subsequent assembly operations on the picking lines. Our problem can thus be summarized as a parallel machine batch scheduling problem considering workers’ fatiguing effect, as shown in Figure 3.

In this system, deterministic operation and optimization are considered. Although the order processing time is uncertain in practice, it is estimated from the order size and the working efficiency. Order size is measured by the number of SKUs in the order. Working efficiency is defined, as in lean management, by the unit picking time, that is, the time spent picking one SKU type. A worker has an initial unit picking time, which increases nonlinearly over time; this is the fatiguing effect. Unlike machine fatigue, workers’ fatiguing effect is caused by stress and strain accumulated over time, and working efficiency does not degrade without limit but is bounded. Correspondingly, the unit picking time increases until it reaches a bound. Its growth rate increases from an initial value, reaches a maximum at a point of inflexion, and then decreases towards zero as the upper asymptote is approached. Furthermore, different workers may have different working habits, and thus their working efficiencies may deteriorate following different fatigue curves. This can lead to considerable differences in the optimal schedule for labor-intensive tasks, such as different task allocation decisions. The fatigue trend is captured well by a general logistic function, which has been used to describe the relationship between human stress and work time in ergonomic studies. Hence, the unit picking time of a worker is described by a general logistic function of time whose parameters are the initial and final asymptotes of the unit picking time, the deterioration rate, and the time point of stabilization. The time point of stabilization is set to 4, which means that a worker reaches the maximum deterioration rate about 4 hours after the starting time (seconds could also be used as the time unit; the working period is usually 8 hours). Other values can be used for the time point of stabilization based on practice. The unit picking time therefore follows this fatigue curve, increasing nonlinearly over time towards an asymptotically stable value.
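
As a minimal sketch of the fatigue curve described above, the snippet below evaluates a logistic unit picking time with an inflexion point at 4 hours. The function name `unit_picking_time`, its parameter names, and all numeric values are illustrative assumptions, not the paper's notation or calibrated data.

```python
import math

def unit_picking_time(t, initial, final, rate, midpoint=4.0):
    """Logistic unit picking time at elapsed work time t (hours).

    initial / final: lower and upper asymptotes of the unit picking time
    rate:            deterioration rate (steepness of the curve)
    midpoint:        time point of stabilization (fastest deterioration)
    """
    return initial + (final - initial) / (1.0 + math.exp(-rate * (t - midpoint)))

# Hypothetical worker: about 10 s per item type when fresh, approaching 16 s when tired.
for hour in (0, 2, 4, 6, 8):
    print(hour, round(unit_picking_time(hour, 10.0, 16.0, 1.5), 2))
```

The curve stays between the two asymptotes, and its slope is largest at the midpoint, which matches the qualitative description of the growth rate in the text.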

Before a batch is processed, a certain setup time is needed to prepare all the necessary items, for example, the packing materials and SKUs. The more similar the orders in the batch are, the more consistent the required materials and SKUs are, and the less setup time is needed. The item types contained in a batch determine its complexity, and the more complex the batch is, the longer the setup time will be. We assume that the setup time of a batch is determined by the number of item types in the batch and by the overall similarity degree of the batch. Orders within each batch are processed sequentially, and our objective is to minimize the total completion time.

Summarizing the above description, we report this scheduling problem below:

For the serial-batch scheduling problem on picking lines, the setup time of a batch is volume-dependent and similarity-dependent, the processing time of an order handled by an operator equals the number of item types in the order multiplied by the operator’s current unit picking time, and the objective is to minimize the total completion time. Orders are first batched and then scheduled sequentially.
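
To make the objective concrete, the following sketch evaluates the total completion time of one picking line for a given batch sequence. It rests on several assumptions that are not taken from the paper: setup times are supplied as plain numbers (the setup-time formula is not reproduced), each order is counted as complete when its own picking ends, and the default `unit_time` curve and its parameter values are hypothetical.

```python
import math

def unit_time(t, initial=0.20, final=0.30, rate=0.02, midpoint=240.0):
    """Hypothetical logistic unit picking time (minutes per item type) at minute t."""
    return initial + (final - initial) / (1.0 + math.exp(-rate * (t - midpoint)))

def total_completion_time(batches, setups, start=0.0, unit=unit_time):
    """Sum of order completion times on a single picking line.

    batches: list of batches, each a list with the item-type count of every order
    setups:  setup time (minutes) paid before the corresponding batch starts
    unit:    function mapping a start time to the current unit picking time
    """
    clock, total = start, 0.0
    for orders, setup in zip(batches, setups):
        clock += setup  # prepare materials and SKUs for this batch
        for item_types in orders:
            # processing time = item types x unit picking time at the start time
            clock += item_types * unit(clock)
            total += clock
    return total

print(total_completion_time([[5, 8], [3, 6, 7]], setups=[2.0, 1.5]))
```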

Note that the studied problem contains the order batching problem, which has been proved to be NP-hard [56], as a special case. Thus, we resort to a heuristic solution.

3. The Heuristics Solution Approach

Since grouped order picking and scheduling is an NP-hard problem and a large number of orders arrive within each decision period, a fast solution approach is needed for the order operations of fresh product suppliers; many studies have been conducted on heuristic algorithms for order scheduling [57–64]. The focus of this section is to develop an efficient heuristic solution approach for the joint order batching and sequencing problem formulated in the previous section. Because of the problem’s complexity, seeking a global optimum by solving the joint problem simultaneously is not realistic. Moreover, in practice such planning decisions are often made sequentially at the operational level. The original problem is therefore divided into two subproblems: order batching and batch sequencing. In the first phase, the orders are batched by the proposed batching algorithm according to their similarity in common items. In the second phase, a revised nondecreasing item quantity algorithm, supported by a proof, is used to sequence the orders within every single batch, and the batches are then sorted using a hybrid genetic algorithm based on precedence rules derived from practical production experience. Order batching and batch scheduling are closely related and interact, even though they are decomposed into two subproblems for efficiency. Once certain orders are batched together, the similarity between two batches is fixed and determines the setup time between them. Conversely, the result of batch scheduling is fed back to order batching, based on which orders are rebatched and the resulting batches are reevaluated. The flowchart of the overall procedure of the proposed two-phase heuristic approach is sketched in Figure 4.

3.1. Order Batching Algorithm upon Similarity (OBAS)

To capture similarity more precisely, we consider not only the category number for common items but also the percentage of common items within each order. The similarity of two orders is calculated by the following formula to include the percentage of common items in this research:

In equation (2), each of the two orders being compared contains a given total number of item categories, some of which are common to both orders. The formula uses the number of common categories together with the quantity of items that each of the two orders contains in every common category.

In the current literature, the similarity of two orders is usually defined by the number of categories of common items [65], as in equation (3), which also involves the total number of items in the union of the two orders.

As shown in equation (3), current heuristic algorithms that rely on a similarity index commonly focus on shared item types and neglect the impact that the percentage of common items within each order has on similarity. Even if two orders have the same item types, a large difference in quantity or, more precisely, in percentage may cause many unnecessary picking motions when preparing the items. In addition, the object of study in our problem is the “order,” not the “job,” so the setup time between batches depends on the similarity value.

An example illustrates the difference between these two formulas. In Table 1, 10 items and 3 orders are considered. Equation (3) gives the same value, 2/3, for these 3 orders, whereas orders 1 and 3 are intuitively more similar because the common items are the main components of both orders.
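
The contrast between a count-based index and a proportion-aware index can be sketched as follows. The function `jaccard` follows the verbal description of equation (3) (common categories over categories in the union). The function `share_overlap` is a hypothetical proportion-aware index chosen only for illustration; it is not the paper's equation (2). The three orders are invented and are not the data of Table 1.

```python
def jaccard(a, b):
    """Common item categories divided by categories in the union of both orders."""
    union = set(a) | set(b)
    return len(set(a) & set(b)) / len(union) if union else 0.0

def share_overlap(a, b):
    """Hypothetical proportion-aware index: for every common category, take the
    smaller of the two quantity shares and sum these minima (range 0 to 1)."""
    total_a, total_b = sum(a.values()), sum(b.values())
    return sum(min(a[k] / total_a, b[k] / total_b) for k in set(a) & set(b))

# Orders map item category -> quantity (illustrative values only).
o1 = {"tomato": 5, "cucumber": 5, "leek": 1}
o2 = {"tomato": 1, "cucumber": 1, "potato": 8}
o3 = {"tomato": 4, "cucumber": 6, "potato": 1}

print(jaccard(o1, o2), jaccard(o1, o3))              # identical count-based values
print(share_overlap(o1, o2), share_overlap(o1, o3))  # the second pair scores higher
```

In this invented example the count-based index cannot separate the two pairs, while the proportion-aware index ranks the pair whose common items dominate both orders as more similar.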

Based on the above discussion, the order batching process in the first phase is presented as Algorithm 1:

Step 1: Choose the seed order randomly;
Step 2: Check the batch capacity constraint; go to Step 3 if capacity is not exceeded, go to Step 6 otherwise;
Step 3: Calculate the similarity between the seed order and each remaining order, and sort the orders by similarity in descending order;
Step 4: Select the order with the highest similarity; if multiple orders have the same similarity, pick one of them randomly;
Step 5: Append the selected order to the seed batch and check the batch capacity constraint; go to Step 6 if capacity is not exceeded, go to Step 7 otherwise;
Step 6: Combine the seed order and the selected order into a new seed order; repeat Steps 1 to 6 until all orders are batched;
Step 7: Output the order batch.

The main idea of this batching process is to batch orders with the highest similarity while satisfying the batch capacity constraint.
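
The seed-based batching idea can be sketched as below. This is a simplified reading rather than the paper's exact Algorithm 1: capacity is assumed to be a maximum number of orders per batch, similarity is assumed to be the count-based ratio of common to union item types, and the names `batch_orders` and `jaccard` are hypothetical.

```python
import random

def jaccard(items_a, items_b):
    """Common item types divided by item types in the union."""
    union = items_a | items_b
    return len(items_a & items_b) / len(union) if union else 0.0

def batch_orders(orders, capacity, seed=0):
    """Greedy seed batching.

    orders:   dict order_id -> set of item types
    capacity: assumed maximum number of orders per batch
    """
    rng = random.Random(seed)
    remaining = dict(orders)
    batches = []
    while remaining:
        seed_id = rng.choice(sorted(remaining))        # random seed order
        batch, items = [seed_id], set(remaining.pop(seed_id))
        while remaining and len(batch) < capacity:
            # most similar remaining order to the merged seed; ties broken randomly
            best = max(jaccard(items, s) for s in remaining.values())
            ties = [o for o in sorted(remaining)
                    if jaccard(items, remaining[o]) == best]
            pick = rng.choice(ties)
            items |= remaining.pop(pick)               # merge into the new seed
            batch.append(pick)
        batches.append(batch)
    return batches

demo = {1: {"a", "b"}, 2: {"a", "b", "c"}, 3: {"x", "y"}, 4: {"x", "y", "z"}}
print(batch_orders(demo, capacity=2))
```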

3.2. Sequencing between Batches: GA-Based Algorithm

The parallel machine sequencing problem in the second phase has been proved to be NP-hard in the strong sense in general. Over the past decades, genetic algorithms have received considerable attention for solving such difficult combinatorial optimization problems and have demonstrated good performance in obtaining satisfactory solutions [66–68]. To obtain better sequencing solutions between batches, the conventional genetic algorithm is enhanced with several techniques that improve solving efficiency and increase its exploitation ability. The main functional modules of the proposed enhanced genetic algorithm are described as follows.

3.2.1. Encoding and Initialization

To overcome the drawbacks (redundancy, insensitivity to crossover operators, etc.) that standard (item-oriented) encoding schemes may cause [69], a two-echelon real-number encoding structure is adopted to represent each chromosome, or solution, in view of the problem characteristics. The upper echelon is encoded as a string of distinct genes, composed of batch genes and separate picking line (machine) genes; for each batch gene, the orders it contains are expressed at the lower echelon. A fitness function evaluates the performance of each chromosome. It is defined as a monotonically decreasing function of the objective function; a negative exponential function is used here, so a larger fitness value indicates a better solution. The roulette wheel method is adopted for the subsequent selection operations.
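
A negative-exponential fitness combined with roulette wheel selection can be sketched as follows. The scaling of the objective by the population mean is an assumption made here to keep the exponent in a numerically safe range; the paper does not state its scaling, and the names `fitness` and `roulette_select` are hypothetical.

```python
import math
import random

def fitness(objective, scale):
    """Negative-exponential fitness: a smaller objective gives a larger fitness."""
    return math.exp(-objective / scale)

def roulette_select(population, objectives, rng, scale=None):
    """Pick one chromosome with probability proportional to its fitness."""
    if scale is None:
        scale = sum(objectives) / len(objectives)  # assumed scaling choice
    weights = [fitness(obj, scale) for obj in objectives]
    return rng.choices(population, weights=weights, k=1)[0]

rng = random.Random(1)
picks = [roulette_select(["short", "long"], [100.0, 300.0], rng) for _ in range(2000)]
print(picks.count("short"), picks.count("long"))
```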

For most search algorithms, especially stochastic search techniques, the starting point can have a non-negligible influence on overall efficiency, as reflected by the subsequent iterative path and the total computation time. To maintain population diversity and to start from a reasonably good solution, two precedence rules extracted from practical experience are combined with random initialization in the chromosome initialization module: (1) the picking line with the earliest completion time has priority to be scheduled; (2) a batch with a larger number of items is more likely to be assigned to a high-efficiency picking line. The details are given in Algorithm 2.

Initialization: ; unassigned batches with index ;
completion time total picking lines ;
vector , objective value
While ( total batch number ) do
if () then
follow rule 1 to choose picking line and batch
 get randomly in unassigned batches
follow rule 2 to choose picking line and batch
 choose subset from randomly
{item quantity
setup time
 processing time setup time
 finishing time

Please note that rule 1 (line 6-line 8) and rule 2 (line 9-line 11) in Algorithm 2 are exclusive and will be applied to each chromosome in the population to choose the current pairwise batch and picking line in initial stage.
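
The two precedence rules can be sketched as a greedy construction of one chromosome. This is only one possible reading of the rules: the random draws, the constant unit picking time per line, and the omission of setup times and fatigue in the finishing-time update are simplifying assumptions, and `init_assignment` is a hypothetical name.

```python
import random

def init_assignment(batch_sizes, line_unit_times, rule, rng):
    """Build one initial assignment of batches to picking lines.

    batch_sizes:     item quantity of each batch (index = batch id)
    line_unit_times: unit picking time of each line (smaller = more efficient)
    rule:            1 -> the line finishing earliest receives a random batch
                     2 -> the largest unassigned batch goes to the most
                          efficient line of a randomly drawn subset of lines
    """
    n_lines = len(line_unit_times)
    unassigned = list(range(len(batch_sizes)))
    finish = [0.0] * n_lines
    lines = [[] for _ in range(n_lines)]
    while unassigned:
        if rule == 1:
            line = min(range(n_lines), key=lambda k: finish[k])
            batch = rng.choice(unassigned)
        else:
            subset = rng.sample(range(n_lines), rng.randint(1, n_lines))
            line = min(subset, key=lambda k: line_unit_times[k])
            batch = max(unassigned, key=lambda b: batch_sizes[b])
        unassigned.remove(batch)
        lines[line].append(batch)
        finish[line] += batch_sizes[batch] * line_unit_times[line]
    return lines, finish

print(init_assignment([5, 3, 8, 2], [1.0, 2.0], rule=1, rng=random.Random(0)))
print(init_assignment([5, 3, 8, 2], [1.0, 2.0], rule=2, rng=random.Random(0)))
```

Mixing chromosomes built with either rule and purely random chromosomes is one way to keep the initial population diverse while still starting from reasonable schedules.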

3.2.2. Crossover and Mutation

Corresponding to the two-echelon encoding scheme, classical crossover and mutation operators need to be modified here to reflect genetic evolution at different levels. The crossover and mutation procedures are depicted exemplarily for 3 picking lines and 7 batches in Figures 5 and 6.

3.2.3. Crossover Operation
Step 1: Select two parent chromosomes from the population randomly and select a crossing section (shaded area) for each of them. Picking line genes (marked with a star) and batch genes may be included at the same time (see Figure 5(a));
Step 2: Interchange the selected crossing sections between the two parents. The order expression at the lower echelon moves with the batch genes. Then check each offspring chromosome for redundancy; repeated genes (batch gene 4 and picking line gene 3∗) are deleted (see Figure 5(b));
Step 3: After deletion, the completeness of each chromosome should also be checked. At the location where the deletion happened, the missing picking line and batch genes are inserted (picking line genes go first) (see Figure 5(c)).
3.2.4. Mutation Operation
(1) SWAP different batches on the same picking line (see Figure 6(a)).
(2) SWAP different batches on different picking lines (see Figure 6(b)).
(3) In addition, one or more orders may SWAP between batches or SHIFT from one batch to another batch (see Figure 6(c)).

Crossover and mutation are applied with predefined probabilities. Note that these operations might lead to infeasible solutions that violate the capacity constraints. Such situations are handled by applying the Best-Fit rule [69], which moves the last customer order from an infeasible batch to another batch with sufficient capacity. The procedure terminates if no further improvement is obtained within a user-defined number of consecutive iterations.
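
The repair step can be sketched as below. Capacity is assumed to be a total load per batch, "best fit" is read as choosing the feasible batch left with the least spare capacity, and opening a new batch when no existing batch fits is an assumption of this sketch; `repair` is a hypothetical name.

```python
def repair(batches, size, capacity):
    """Move trailing orders out of overloaded batches.

    batches:  list of lists of order ids
    size:     dict order id -> load of that order
    capacity: assumed maximum total load per batch
    """
    batches = [list(b) for b in batches]

    def load(batch):
        return sum(size[o] for o in batch)

    i = 0
    while i < len(batches):
        while load(batches[i]) > capacity and len(batches[i]) > 1:
            order = batches[i].pop()                   # last order of the batch
            fits = [b for j, b in enumerate(batches)
                    if j != i and load(b) + size[order] <= capacity]
            if fits:
                # best fit: the feasible batch with the least capacity left over
                min(fits, key=lambda b: capacity - load(b) - size[order]).append(order)
            else:
                batches.append([order])                # assumed fallback
        i += 1
    return batches

sizes = {1: 4, 2: 4, 3: 4, 4: 3}
print(repair([[1, 2, 3], [4]], sizes, capacity=8))
```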

3.3. Sequencing within Batches: Revised Nondecreasing Item Quantity Algorithm

As mentioned, the fatiguing effects of workers are considered and modeled in the picking process, which leads to a gradual change in order processing time. To sequence the orders within every single batch, a revised nondecreasing item quantity algorithm is proposed. The following results support it.

Lemma 1. For the problem considered, an optimal schedule can be obtained by arranging the orders in nondecreasing order of item quantity.
Proof. This can be proved by interchanging adjacent orders. Suppose that, under an optimal schedule, there are two adjacent orders, the first immediately followed by the second, such that the item quantity of the first is larger than that of the second. Given the starting time of the first order, the sum of the completion times of the two orders can be written down. If another schedule is obtained by performing a pairwise interchange of these two orders, the corresponding sum is obtained similarly. Under the stated condition, it can be verified that the interchanged schedule yields a smaller sum of completion times. This contradicts the optimality of the original schedule.
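
As a worked illustration of the interchange argument, assume (this is an assumption of the sketch, not necessarily the exact setting of Lemma 1) that an order with item quantity $q$ started at time $t$ takes $q\,u(t)$, with a linear unit picking time $u(t) = a + bt$ and $a, b > 0$. Let orders $i$ and $j$ be adjacent, with the pair starting at time $t$. Processing $i$ first gives $C_i = t + q_i u(t)$ and $C_j = C_i + q_j u(C_i)$, where $u(C_i) = u(t)(1 + b q_i)$. The symbols $a$, $b$, $q_i$, $q_j$, $C_i$, $C_j$ are introduced here for illustration only.

```latex
\begin{align*}
C_i + C_j   &= 2t + u(t)\,\bigl(2q_i + q_j + b\,q_i q_j\bigr) && \text{(order } i \text{ first)},\\
C'_j + C'_i &= 2t + u(t)\,\bigl(2q_j + q_i + b\,q_i q_j\bigr) && \text{(order } j \text{ first)},\\
(C_i + C_j) - (C'_j + C'_i) &= u(t)\,(q_i - q_j) > 0 && \text{if } q_i > q_j .
\end{align*}
```

In both cases the pair finishes at $t + u(t)(q_i + q_j + b\,q_i q_j)$, so later orders are unaffected, and placing the smaller order first strictly reduces the sum of completion times under this linear assumption.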

Remark 1. The logistic fatigue curve can be treated as the envelope of countless piecewise linear functions.
For the piecewise linear processing time problem, we use a heuristic algorithm presented by Sundararaghavan and Kunnathur [70]. The procedure is called Operation Exchange: given a schedule, define one job set that includes all jobs whose starting time is less than a given breakpoint, and a second job set that contains the remaining jobs. Exchange a job from the first set with a job from the second set, and then reorder the jobs within each of the two sets.
To summarize, for one batch on a picking line, a revised nondecreasing item quantity (NDIQ) heuristic is proposed to schedule the orders, as given in Algorithm 3.

Step 1: Sort the orders of one batch in nondecreasing order of item quantity;
Step 2: Choose a middle point in the sorted sequence and divide the sequence into two parts;
Step 3: Carry out an operation exchange between any order in the first part and any order in the second part if it leads to a reduction in the objective function, and continue until no exchange that reduces the objective function exists;
Step 4: Once no improving exchange remains, keep the order sequence unchanged.
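
The steps above can be sketched as follows for a single batch on one line. The logistic `unit_time` curve and its parameters are hypothetical, the objective is taken as the sum of order completion times, the middle point is taken as the midpoint of the sorted list, and `revised_ndiq` is a hypothetical name; none of these details is confirmed by the paper.

```python
import math

def unit_time(t, initial=0.20, final=0.30, rate=0.02, midpoint=240.0):
    """Hypothetical logistic unit picking time (minutes per item type) at minute t."""
    return initial + (final - initial) / (1.0 + math.exp(-rate * (t - midpoint)))

def total_completion(seq, start=0.0):
    """Sum of completion times when orders (item quantities) run in this sequence."""
    clock, total = start, 0.0
    for quantity in seq:
        clock += quantity * unit_time(clock)
        total += clock
    return total

def revised_ndiq(quantities, start=0.0):
    seq = sorted(quantities)                    # Step 1: nondecreasing item quantity
    mid = len(seq) // 2                         # Step 2: split into two parts
    best = total_completion(seq, start)
    improved = True
    while improved:                             # Step 3: improving exchanges only
        improved = False
        for i in range(mid):
            for j in range(mid, len(seq)):
                trial = seq[:]
                trial[i], trial[j] = trial[j], trial[i]
                # reorder each part after the exchange
                trial = sorted(trial[:mid]) + sorted(trial[mid:])
                cost = total_completion(trial, start)
                if cost < best - 1e-12:
                    seq, best, improved = trial, cost, True
    return seq, best                            # Step 4: keep the final sequence

print(revised_ndiq([12, 3, 7, 9, 5], start=200.0))
```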

4. Numerical Experiments

In this section, sets of experiments are conducted to evaluate the performance of the two-phase heuristic approach (HGA) proposed in Section 3 for order batching and sequencing. First, the effectiveness and scalability of the approach are demonstrated by comparing it with other algorithms reported in relevant research. Then the parameters of the fatigue curves are set for the approach, and the corresponding experimental results are shown. All experiments are implemented in [71] version on a Windows 7 operating system with dual Intel cores (2.0 and 2.5 GHz) and 8 GB of RAM.

4.1. Effectiveness of HGA

For the effectiveness tests, experimental data are simulated based on transaction order data from an online retailer in Beijing, China. The characteristics of the simulated dataset are summarized in Table 2, which lists, from left to right, the total order number, the total item types, the maximum quantity of each kind of item in each order, the maximum item types in each order, and the batch capacity. The efficiency settings assumed for the picking lines are given in Table 3. The total number of workers is assumed to be 30. Their initial unit picking time, fatiguing rate, and time point of stabilization are generated uniformly within the corresponding ranges.

The performance of the proposed HGA is illustrated on different possible order structures by comparing it with four other representative algorithms in the area. To show the direct effect of the proposed similarity formula in the order batching phase (equation (2)) and of the two initialization rules in the sequencing phase (Algorithm 2), two baseline variants are specially designed: GA-1 is HGA with the commonly adopted similarity (equation (3)), and GA-2 is HGA without the two proposed rules in the initialization module. In addition, SA [72] and GSA [25] are chosen because they address very similar research problems. In SA and GSA, the orders are ranked and batched according to their processing time and then scheduled by simulated annealing and a gravitational search algorithm, respectively. GSA considers order batching and batch scheduling together, while SA batches the orders first and then schedules the batches. Since the research problems in the cited works are different, some minor revisions have been made when applying these two algorithms here. The MBF (Modified Best Fit) rule is changed to the MITN (Minimum Item Type Number, first in our paper) rule to generate an initial solution for SA. For GSA, random encoding is used, and after orders are randomly assigned to processing lines, they are batched based on order similarity. In each batch, orders are sequenced following the revised NDIQ heuristic proposed here. The parameter settings of HGA in the experiments are listed in Table 4.

Then, for a fixed total order number (800), five different order structures (A(60-12), B(80-16), C(100-20), D(150-30), E(200-40)) are tested. For instance, A(60-12) means that in order structure A the total item types are assumed to be 60 and the maximum item types in each order are 12. The other parameters are kept the same as in Table 4 for all algorithms. The mean values of the objective value, setup time, and running time over ten runs are recorded in Table 5 for each algorithm. Imp% is the improvement percentage of HGA over the other algorithms, and “++” indicates an Imp% below −100%. Note that, in all following related results, objective value (total completion time) is at scale, setup time is at scale, running time is at scale.

As shown in Table 5, as the total number of item types increases across the order structures, the mean values of the three metrics rise overall for all the algorithms. For the mean objective value under different order structures, generally speaking, HGA has the shortest total completion time, followed by GA-1 (over which HGA improves by 1.9% to 8.0%), SA (10.2% to 12.9%), and GSA (24.4% to 30.4%); GA-2 (42.5% to 49.9%) is the worst in most cases. Taking order structure C(100-20) as an example, HGA improves on GA-1 by 6.7% in mean objective value and by 18.2% in mean setup time but has a higher mean running time. Since GA-2 uses the same similarity formula as HGA, their setup times and resulting batch numbers are very close (see Figure 7). In the histogram of the mean setup time for each algorithm, the resulting batch number is also labeled. The setup time needed is related to the total item types within the next batch and their overall similarity. The SA approach always has the same batch number, 57, because the total order number is fixed here and it groups orders based on their processing time; however, its mean setup time changes when the order structure changes.

In addition, to display the full range of variation (from min to max) of the total completion times (on a natural log scale) over the ten runs, a boxplot is given in Figure 8. As shown, the performance of HGA is very stable. GSA has the largest variability, followed by GA-2, and the minimum value of GSA sometimes matches that of GA-1.
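The quantities behind such a boxplot can be sketched as follows. The function name `log_boxplot_stats` and the choice of the inclusive quartile method are assumptions made for illustration; the plotting tool used for Figure 8 may compute quartiles differently.

```python
import math
import statistics


def log_boxplot_stats(completion_times):
    """Five-number summary of ln(total completion time) over repeated runs."""
    if len(completion_times) < 2:
        raise ValueError("at least two runs are needed")
    if any(t <= 0 for t in completion_times):
        raise ValueError("completion times must be positive")
    logs = sorted(math.log(t) for t in completion_times)
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, median, q3 = statistics.quantiles(logs, n=4, method="inclusive")
    return {"min": logs[0], "q1": q1, "median": median, "q3": q3, "max": logs[-1]}


# Hypothetical completion times from five runs of one algorithm.
runs = [math.exp(x) for x in (1.0, 2.0, 3.0, 4.0, 5.0)]
print(log_boxplot_stats(runs))
```

The distance between `min` and `max` corresponds to the full variation range discussed above, so a short box with short whiskers indicates a stable algorithm.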

From this set of experiments on different order structures, it can be safely concluded that the proposed two-phase heuristic method, with the corresponding improvements in the similarity formula and the genetic algorithm, is effective for the order batching and sequencing problem.

4.2. Scalability Test

Besides the effectiveness at the current problem scale, the scalability of the proposed two-phase heuristic is also explored by increasing the total order number with the order structure fixed as in Table 6.

Under this order structure, the total order number is increased from 600 to 1500 to examine the performance of the algorithms. As in the previous experiments, each algorithm is run ten times for each order number, and the mean values are recorded. Please note that the values are at the same scale as in Table 5. In the results of the scalability test (Table 7), as the order number increases, the mean values of total completion time, setup time, and running time multiply several times, growing much faster than in Table 5. However, the overall performance ranking of the algorithms still follows a similar pattern. From another perspective, the distribution of completion time (on a natural log scale, see Figure 9) shows the variance across multiple runs at different levels of the total order number. Compared with the grouped boxplot in Figure 8 for different order structures, all algorithms show less variability except GA-2 and GSA, and GA-2 still has the worst mean completion time. Sometimes, the best solution that GSA provides in these multiple runs almost reaches the best level of HGA (Figure 9). However, GSA is the most time-consuming algorithm until the order number reaches 1200 (Table 7).

In Tables 5 and 7, the mean performance over ten runs of the five algorithms is listed for different order structures and order numbers, and the variation range of completion time in the ten runs is captured in Figures 8 and 9. It is seen that the best scenario of the GSA algorithm in ten runs has the minimum completion time for order structure A(60-12) and A(600), and then HGA starts to outperform GSA as the order number increases. As seen from the boxplots in Figures 8 and 9, HGA and GSA have better best-case results than the other three algorithms. Hence, we choose the best-scenario results in terms of total completion time for HGA and GSA under different order structures and numbers and summarize them for comparison in Table 8.

Besides the performance tests on different order structures and numbers, the dynamics of the proposed approach are also evaluated with different settings for three parameters: the batch capacity and the crossover and mutation operators in the GA. For this group of experiments, the fixed simulation parameters are given in Table 9. Here the solving process is repeated multiple times to obtain the minimum, maximum, and average values. As the batch capacity increases from 5 to 30, the obtained total completion time and setup time are plotted in Figure 10. A larger batch capacity leads to a smaller batch number, so the total setup time is lower in Figure 10(b). However, a larger batch needs a longer setup time than a smaller batch. Thus, the completion time of each sequenced order in the larger batch increases accordingly, which leads to the rise of the cumulative total completion time in Figure 10(a).
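The opposite trends of total setup time and total completion time can be reproduced with a deliberately simplified single-line model. The sketch below assumes a setup time that is linear in the batch size (a fixed part plus a per-order part), identical processing times, and orders that complete one by one after the batch setup. It ignores order similarity and fatigue, and all parameter values are hypothetical rather than taken from Table 9.

```python
def simulate(num_orders, capacity, setup_fixed, setup_per_order, proc_time):
    """Return (total setup time, total completion time) on one picking line."""
    clock = 0.0
    total_setup = 0.0
    total_completion = 0.0
    remaining = num_orders
    while remaining > 0:
        size = min(capacity, remaining)
        # Assumed volume-dependent setup: grows with the number of orders in the batch.
        setup = setup_fixed + setup_per_order * size
        clock += setup
        total_setup += setup
        for _ in range(size):
            clock += proc_time          # orders in the batch are processed serially
            total_completion += clock   # each order completes when its own picking ends
        remaining -= size
    return total_setup, total_completion


for capacity in (5, 10, 15, 30):
    setup, completion = simulate(60, capacity, 1.0, 2.0, 3.0)
    print(f"capacity={capacity:2d}  total setup={setup:6.1f}  total completion={completion:8.1f}")
```

In this toy model, fewer batches save fixed setup effort, while every order in a large batch waits for a longer setup before its own picking starts, so the sum of completion times grows with the capacity.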

When the crossover operator is fixed at 0.6, the trends of total completion time and setup time for different values of the mutation operator are shown in Figure 11. Similarly, Figure 12 displays the corresponding trends for different values of the crossover operator when the mutation operator is set to 0.15. In Figure 11, it can be observed that, as the mutation operator gets larger, the variation of total completion time increases, while the setup time reaches and stabilizes at the best value found. The reason is that the mutation operation swaps different batches (interbatches) and also shifts orders between batches (intrabatch), which is more likely to produce varied objective values. On the other hand, the crossover operator interchanges selected crossing sections (interbatches); therefore, the best objective values are lower when the crossover operator increases with a fixed mutation operator in Figure 12, and the setup time, which is mostly determined by the number of orders within batches (intrabatch), does not change much.
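The kinds of moves described above can be sketched as follows. This is not the operator set of the proposed HGA: the encoding (a list of batch IDs for the sequence, a list of order lists for the batches), the use of order crossover (OX) as the section-interchange step, and all function names are assumptions made for illustration. Only the rates 0.6 and 0.15 come from the text.

```python
import random


def swap_batches(sequence, rng):
    """Mutation move: swap the positions of two batches in the sequence."""
    child = list(sequence)
    if len(child) < 2:
        return child
    i, j = rng.sample(range(len(child)), 2)
    child[i], child[j] = child[j], child[i]
    return child


def shift_order(batches, capacity, rng):
    """Mutation move: shift one order to another batch that has spare capacity."""
    child = [list(b) for b in batches]
    donors = [k for k, b in enumerate(child) if len(b) > 1]
    if not donors:
        return child
    src = rng.choice(donors)
    receivers = [k for k, b in enumerate(child) if k != src and len(b) < capacity]
    if not receivers:
        return child
    dst = rng.choice(receivers)
    child[dst].append(child[src].pop(rng.randrange(len(child[src]))))
    return child


def order_crossover(parent_a, parent_b, rng):
    """Keep a section of parent_a in place and fill the rest in parent_b's order."""
    lo, hi = sorted(rng.sample(range(len(parent_a) + 1), 2))
    section = parent_a[lo:hi]
    kept = set(section)
    rest = [gene for gene in parent_b if gene not in kept]
    return rest[:lo] + section + rest[lo:]


rng = random.Random(0)
crossover_rate, mutation_rate = 0.6, 0.15   # values used in Figures 11 and 12
seq_a, seq_b = [0, 1, 2, 3, 4], [4, 3, 2, 1, 0]
child = order_crossover(seq_a, seq_b, rng) if rng.random() < crossover_rate else list(seq_a)
if rng.random() < mutation_rate:
    child = swap_batches(child, rng)
print(child)
print(shift_order([[1, 2, 3], [4, 5], [6]], capacity=3, rng=rng))
```

Both sequence operators return a valid permutation of the batch IDs, and `shift_order` never empties a batch or exceeds the capacity, so no repair step is needed in this simplified encoding.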

In summary, the proposed two-phase heuristic approach, which batches orders first and then sequences the orders and batches in the second phase, outperforms the other algorithms in terms of total completion time in the experiments conducted so far, even though it may not be the fastest of the chosen algorithms; this supports its effectiveness and scalability. Next, to understand how different fatigue habits of workers may affect the task allocation decisions, further analysis of the fatigue and other parameter settings is presented in the following part.

4.3. Fatigue Analysis

As assumed and stated in Section 2, fatigue accumulation may follow the generalized logistic function. However, different workers will have different working habits. Some workers probably have a higher initial efficiency but higher deterioration rate, while others may be slower in the beginning and have smaller degradation. Thus, different fatigue situations are considered here and depicted in Figure 13.

In the experiment here, the workers are divided evenly into two groups with different fatigue curves. One group is assumed to have the reference curve (Figures 13(a) and 13(b)), with 50 s and 200 s as the initial and final unit picking times over about 8 hours, and the other group has one curve among the four options. The order number and batch capacity are set to 1500 and 15 in the proposed HGA. At the end of one day, the orders assigned to the two groups are compared in Table 10. As can be seen, different groups are preferred when the final efficiency or the initial efficiency is fixed. For instance, the reference group is given fewer orders (723 out of 1500 orders) compared with curve A1(70-180) even though it has higher picking efficiency at first. This indicates that a 40 s difference in initial working efficiency may be compensated by a slightly higher base level (20 s gap). The comparison here could provide some managerial insights to decision makers in assigning orders to pickers according to the current personnel fatigue effect and also some managerial suggestions for a distribution center in hiring and managing: evaluate the candidate’s fatigue curve and offer more incentives or a reasonable rest-interval policy before the time point at which the fatigue rate is highest.
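To make the trade-off between a fast start and a lower final level concrete, the sketch below evaluates two logistic fatigue curves over an 8-hour shift. The functional form is a standard logistic curve and is only assumed to resemble the generalized logistic function of Section 2; the parameters `rate` and `midpoint_h` are hypothetical. The first curve uses the reference values of 50 s and 200 s, and the second uses 70 s and 180 s, which is one possible reading of the label A1(70-180) and is likewise an assumption.

```python
import math


def unit_picking_time(t_hours, initial_s, final_s, rate=1.5, midpoint_h=4.0):
    """Unit picking time in seconds for an order started at t_hours.

    Assumed logistic form; `rate` and `midpoint_h` are hypothetical values.
    """
    share = 1.0 / (1.0 + math.exp(-rate * (t_hours - midpoint_h)))
    return initial_s + (final_s - initial_s) * share


for t in (0, 2, 4, 6, 8):
    reference = unit_picking_time(t, 50.0, 200.0)    # reference curve from the text
    alternative = unit_picking_time(t, 70.0, 180.0)  # assumed reading of A1(70-180)
    print(f"t={t}h  reference={reference:6.1f}s  alternative={alternative:6.1f}s")
```

With these assumed parameters the two curves cross at the midpoint of the shift: the reference worker is faster before it and slower afterwards, which illustrates why a scheduler that looks at the whole shift may assign more orders to the worker with the slower start.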

To compare the order assignment with and without considering the fatigue effect of workers, we split the 15 pickers into three groups in Table 11. Picking lines 1-5 have a small initial unit picking time in the range (1, 10). Picking lines 6-10 have a medium initial unit picking time in the range (11, 20). Picking lines 11-15 have a large initial unit picking time in the range (21, 30). Within each group, the other parameters take values at different scales. Order information is based on Table 9. The order assignment to each picking line is shown in Figures 14 and 15. It is seen that, without considering fatigue, a large number of orders are shifted to picking lines 1-5 because they have a relatively small initial unit picking time. In contrast, the assignment becomes more balanced if the fatiguing effect is considered, because a worker’s efficiency may drop very quickly even though the worker had some advantage at the initial stage.
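The shift in assignment can be illustrated with a small greedy rule that gives each order to the line that would finish it earliest. This rule is only a stand-in for the proposed HGA, the logistic fatigue form is assumed, and the two lines and their parameters are hypothetical and unrelated to Table 11. As in the problem setting, the unit picking time depends on the start time of the order.

```python
import math


def unit_time(t, initial, final, rate, midpoint):
    """Assumed logistic unit picking time for an order started at time t."""
    return initial + (final - initial) / (1.0 + math.exp(-rate * (t - midpoint)))


def greedy_assign(order_sizes, lines, with_fatigue):
    """Assign each order to the line that would finish it earliest.

    `lines` is a list of (initial, final, rate, midpoint) tuples.
    Returns the number of orders assigned to each line.
    """
    clocks = [0.0] * len(lines)
    counts = [0] * len(lines)
    for size in order_sizes:
        finish = []
        for clock, (initial, final, rate, midpoint) in zip(clocks, lines):
            # Without fatigue, the picker is assumed to keep the initial speed all shift.
            per_item = unit_time(clock, initial, final, rate, midpoint) if with_fatigue else initial
            finish.append(clock + per_item * size)
        best = min(range(len(lines)), key=finish.__getitem__)
        clocks[best] = finish[best]
        counts[best] += 1
    return counts


# Line 0 starts fast but deteriorates strongly; line 1 starts slower but stays steady.
lines = [(5.0, 60.0, 0.2, 50.0), (15.0, 20.0, 0.2, 50.0)]
orders = [1] * 40   # forty single-item orders
print("without fatigue:", greedy_assign(orders, lines, with_fatigue=False))
print("with fatigue:   ", greedy_assign(orders, lines, with_fatigue=True))
```

When fatigue is ignored, the fast-start line receives most of the orders; once its deterioration is modeled, part of that load moves to the steadier line.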

5. Conclusions

In this research, the order batching and scheduling problem has been modeled for picking lines in the context of e-commerce warehouses with consideration of operators’ nonlinear fatiguing effect. Because this problem is NP-hard, an efficient two-phase heuristic method has been developed based on the proposed similarity-based batching rules and an improved genetic algorithm for order and batch scheduling after batching. Numerical instances have been generated from practical warehouse operations of an online vegetable retailer to test and demonstrate the validity and scalability of the proposed solution approach. The experimental results have shown that the proposed algorithm outperforms the other comparable algorithms with less variance for different order structures and different levels of the total order number. In addition, possible preferences of task assignment have been analyzed for different fatigue curves, which could shed some light on training and management (e.g., shift policy, incentive policy).

For future study, we will extend the decision model of batching and scheduling using the framework of stochastic programming or online optimization to include the uncertainty of real-time ordering. In addition, to develop an efficient solution approach for an intelligent decision support system, data-driven order batching and scheduling algorithms based on statistical learning techniques are worth exploring.

Data Availability

The data for different order structures and different order numbers are available in the GitHub repository https://github.com/Xchunf/order-instances.

Conflicts of Interest

The authors declare that they have no conflicts of interest.


Acknowledgments

This work was supported by the Annual Social Science Foundation Project of Shaanxi Province under Grant 2020R002, the Natural Science Foundation Research Project of Shaanxi Province under Grant 2020JQ-281, the Scientific Research Startup Foundation of Northwest A&F University under Grant 2452019167, and the Basic Operating Expense under the Humanities & Social Sciences Program of Northwest A&F University under Grant 2452020068.