Abstract

Dynamic electric vehicle routing problem (DEVRP) is an extension of the electric vehicle routing problem (EVRP) into dynamic logistical transportation system such that the demand of customer may change over time. The routing decision of DEVRP must concern with the driving range limitation of electric vehicle (EV) in a dynamic environment since both load degree and battery capacity are variable according to the time-varying demands. This paper proposes an adaptive memetic algorithm, where a special encoding strategy, an adaptive local search operator, and an economical random immigrant scheme are employed in the framework of evolutionary algorithm, to solve DEVRP efficiently. Numeric experiments are carried out upon a series of test instances that are constructed from a stationary VRP benchmark. The computational results show that the proposed algorithm is more effective in finding high-quality solution than several peer algorithms as well as significant in improving the capacity of the routing plan of EVs in dynamic transportation environment.

1. Introduction

In recent years, electric vehicles (EVs) have begun to widely apply into the logistical transportation systems due to their advantages of energy consumption and low pollution [1]. For example, there are about 2000 EVs, accounting for 10% in total, which are operated by the French electricity distribution company ENEDIS in 2016. However, EVs usually have a short driving range of 100–150 miles due to the technology bottleneck of battery [2]. Furthermore, the energy consumption of EVs is varied with the load degree, the speed, and even the road slope that may cause their driving range to be significantly lower [3]. In reality, the logistical transportation systems are often dynamic. For example, the request of a new customer arrives dynamically or the demand of an old customer can change over time. When deciding the route plan of an EV fleet in these dynamic environments, the influence of dynamic load degree upon the driving range must be considered due to the time-varying delivery requests. Figure 1 illustrates the route execution of a single EV in a dynamic logistical transportation system. Before an EV leaves the depot time , an initial route plans to visit the currently known requests . While EV executes its route, a new request arrives at time and the initial route is adjusted to fulfill it. However, the increased load by new arrival request X limits the driving range of EV that causes its inevitable return to depot once for recharging. Finally, the real executed route of EV is . Therefore, the routing plan of EVs in dynamic logistical transportation system becomes distinctively different from the traditional vehicle routing problem (VRP) in the literature as we will demonstrate throughout this paper.

In this paper, a special dynamic electric VRP (DEVRP) is investigated for the optimal routing plan of a fleet of EVs so that they can serve a set of customers with time-varying demands while minimizing the total driving distance of EVs. Obviously, DEVRP can be regarded as an extension of electric VRP (EVRP) into dynamic transport environment. Recently, a lot of researchers have begun to focus on EVRPs due to the wide application of EVs in the logistical transportation field [49]. Schneider et al. [10] studied an EVRP with time windows and changing stations in order to minimize the total travel distance by a homogenous EV fleet. Felipe et al. [11] proposed several heuristics that were constructed within the framework of simulated annealing algorithm for an EVRP with multiple charging technologies and partial recharges. Goeke and Schneider [12] considered a full charging policy with a linear charging function approximation in the routing problem of a mixed electric and conventional vehicle fleet. Lin et al. [13] discussed the effect of vehicle load on battery consumption in EVRP. Keskin and Çatay [14] investigated the application of an adaptive large neighborhood search algorithm for an EVRP that allowed partial recharging. Montoya et al. [15] designed a hybrid metaheuristic for EVRP, in which the battery-charge level was defined as a nonlinear function of the charging time. Zhang et al. [16] utilized an ant colony algorithm for solving an EVRP with the objective of minimizing the energy consumption of EVs. Jie et al. [17] presented a two-echelon capacitated EVRP with battery swapping stations and utilized a hybrid algorithm that combined column generation and adaptive large neighborhood search to determine the delivery strategy under battery consumption limitations. It is noticeable that the current relevant literature on EVRP were concerned with the recharging operations of EVs along their route or the utilization efficiency of battery during the transport course. However, most researches only considered EVRPs as static optimization problems; that is, all customer demands were given as constant. As shown by the example in Figure 1, rerouting an EV must consider the influence of its driving range with load degree once a new customer demand arrives, which is not involved if a traditional vehicle is used.

According to the method proposed by Pillac et al. [18], there are four categories of VRPs. The first is the static and deterministic VRP, where all input is known beforehand and vehicle routes do not change once they are in execution [19]. The second is the static and stochastic VRP, which is characterized by input partially known as random variables (Gendreau et al.) [20]. The third is the dynamic and deterministic VRP, where part or all of the input is unknown and revealed dynamically during the design or execution of the routes [21]. The final is the dynamic and stochastic VRP that has part or all of the input unknown and revealed the exploitable stochastic knowledge dynamically during the execution of the routes [22].

The investigated problem in this paper belongs to the third category; however, the routing algorithm of DEVRP is significantly complicated due to the influence of EVs’ driving range varying with their load degree in dynamic environment. For DEVRP, an effective optimization algorithm should track these environmental changes and adapt the best routing scheme of an EV fleet to the changes accordingly. Therefore, we will investigate the applications of an adaptive memetic algorithm (MA) for DEVRP in this paper. The contributions of this study can be summarized as follows:(i)We extend EVRP into dynamic delivery transport environment, which is more general and practical, and present a special DEVRP, where the driving range of EV is limited by the variable load degree according to the time-varying demand requests.(ii)We propose an adaptive MA method to solve the investigated DEVRP. The proposed method introduces a special individual representation strategy, an adaptive local search (LS) operator, and an economical random immigrant scheme into the framework of evolutionary algorithm (EA).(iii)We validate the performance of the proposed method using a series of test instances constructed from a stationary VRP benchmark.

The rest of this paper is outlined as follows. Section 2 formally provides the investigated DEVRP. Section 3 describes the proposed MA method in detail. Section 4 constructs DEVRP test suites from a stationary VRP benchmark and evaluates empirically the performance of the proposed MA for DEVRPs. The final section concludes this paper with discussions on future works.

2. Problem Definition

In this section, we provide a formal description of the DEVRP, in which a homogenous fleet of EVs start from a single depot and deliver the goods to a set of customers. Each EV has a fixed load capacity and limited driving range. While EV is traveling, the battery charge level decreases proportionally with the distance traversed and the current load degree. All EVs must return to the depot for the battery recharging. The demands of customers may change during the design stage. As a consequence, the delivery routes may have to be revised to accommodate the corresponding changes of demands.

Let be the set of nodes representing the customers and a node representing the depot. Each customer has a time-varying demand over the design horizon with . The DEVRP is defined on a directed and complete graph , where and represents the set of arcs connecting vertices of . Each arc has two associated nonnegative values: a travel distance and a battery consumption . All EVs have a battery of capacity and a maximal load capacity . Note that can be calculated by the following formulation:where represents the engine efficiency of EV, and , , and represent the load degrees with the current, full, and empty statuses of EV on arc , respectively.

The additional assumptions in the investigated DEVRP are given as follows:(1)Each EV starts with the full-battery capacity from the depot(2)Each EV route ends at the depot(3)Each customer node is visited once by only an EV route(4)Driving speed on each arc is constant(5)The battery is recharged to full each time after returning to the depot(6)No time window constraint is considered for the delivery service and the battery recharging

The purpose of DEVRP investigated in this paper is to decide an optimal delivery route plan so as to minimize the total travel distance of EVs as well as adapting to the time-varying demands of customers during the design stage. Considering that the traditional exact or approximation methods for large-scale VRPs are not very exciting, an adaptive memetic algorithm will be proposed for solving this problem efficiently and the detailed algorithmic designs will be given in the next section.

3. Proposed MA Method for DEVRP

3.1. General Framework of MA

In the EA community, MA can be regarded as a hybrid metaheuristic method inspired by Darwinian principles of natural evolution and Dawkins’ notion of a meme, defined as a unit of cultural evolution that is capable of local refinements. Within the framework of a MA, EA operators are responsible for global rough search and LS operators (an LS is also called a meme) are used for local refinement. Due to the advantage of maintaining an efficient balance between exploration and exploitation, MAs have been successfully used to solve a lot of complex optimization problems [2326]; however, they are rarely considered for dynamic optimization problems [27, 28].

The proposed MA in this paper is a class of EA-based hybrid metaheuristic, which can be expressed by the pseudocode in Figure 2, where , , and are the population size, crossover probability, and mutation probability, respectively. Within this MA, a population of chromosomes are generated randomly and then evaluated at the initialization step. Then, an elite chromosome, i.e., the chromosome with the best fitness, is improved by an LS operator. At each subsequent generation, the chromosomes undergo a binary tournament selection, where two chromosomes are selected randomly from the current population and the best one is taken for one parent, and the selected chromosome undergoes an order crossover (OX) operation with a probability . After crossover is executed, a 2-OPT mutation operator is performed for each newly generated offspring chromosome with a probability . Then, the best chromosomes among all the parents and offspring are selected to proceed into the next generation, and an elite chromosome in the newly generated population is refined in the LS operation. If the immigrant scheme is used, a certain number of immigrants are generated randomly and replace the worst chromosomes in the current population in order to improve the exploration capacity of MA in dynamic environment.

3.2. Individual Representation

How to express a chromosome using a proper encoding method is a basic work when designing an MA for VRP. A good representation scheme can help the algorithm obtain high-quality solutions easily, while an improper representation may make it hard for an algorithm to achieve even feasible solutions.

Prins [29] proposed a simple and effective encoding scheme of GA for VRP, where each chromosome is defined as a sequence of customer nodes, just like in most GAs for traveling salesman problem (TSP). This encoding scheme can be viewed as a giant tour performed by one vehicle of infinite capacity if only the same vehicle performs all trips one by one. Then, an optimal split procedure can partition a given chromosome into the best VRP solution. Based on this encoding mechanism, we develop a permutation encoding scheme in MA for the investigated DEVRP. Note that the evaluation of a chromosome is time-costing since the splitting procedure, as shown in Figure 3, is known as an exact algorithm in . Thus, the evaluation number will be considered as the time index in the proposed MA for DEVRP in this paper.

3.3. Local Search

In many heuristic methods for the routing problems, the inverse operation, where two nodes are selected from a segment which is reversed, is often employed as the local move technique, as shown in Figure 4. The quality of a neighbor generated by executing one inversion can be estimated by a parameter , which is calculated by the formula . Obviously, the smaller the value of is, the higher the quality of the new generated solution is.

In the original inversion scheme, the sequence between two different nodes chosen randomly in a single tour is reversed. Since there are existing multiple tours in the VRP solution, the inversion can be divided into two ways: single-tour-based inversion (SI) and multiple-tour-based inversion (MI). The former starts from choosing a single tour and reverses the sequence between two different nodes in the chosen tour. In the latter, the tours are not considered in isolation, and paths and nodes are allowable to exchange between different tours.

An example of SI and MI is presented in Figure 5, where node 3 and node 6 are selected from single (SI) and multiple (MI) tours and the sequence between them is reversed. It is easy to see that SI can always help a solution make a single-tour local improvement, which means that one local move in SI always occurs in a smaller area around , and MI can perform one move in a wider range upon . Obviously, each local move makes a biased search and hence may be efficient for some classes of problems but not for others. Therefore, how to develop an efficient LS operator and avoid utilizing inappropriate local move becomes a very important issue.

Here, we design an adaptive LS (ALS) method for the investigated DEVRP according to an effective learning mechanism [30] in order to utilize SI and MI operators efficiently. In our proposed ALS method, SI and MI are both allowed to work together and selected by probability to generate a neighbor of an individual at every step of an LS operation on every iteration of the algorithm. In order to obtain the advantages of both of them during different periods when they are effective, an adaptive learning strategy is used to adjust their execution probabilities according to the improvement each inversion operator has achieved on every LS step.

Let and denote the probabilities of applying SI and MI to generate a neighbor of an individual that is used for local search, respectively; . At the start of this strategy, and are both set to 0.5, which means giving a fair competition chance to each inversion operator. As each inversion move always makes a biased search, the inversion operator which produces more improvements should be given a greater selection probability. Let denote the improvement degree of the selected individual when one inversion operator is used to refine it and can be calculated bywhere is the final fitness of the selected individual for the local refinement after executing one LS operation and is its initial fitness before the local refinement. At each generation, the degree of improvement of each inversion operator is calculated when a predefined number of iterations is achieved, and then and are recalculated to proceed with the local improvement in the next generation.

Suppose and , respectively, denote the total improvement of SI and MI at generation . Their selection probabilities and at generation can be calculated orderly by the following formulae:where signifies the relative influence of the degree of the improvement on the selection probability. The proposed inverse-based LS operator can be expressed by the pseudocode in Figure 6.

From Figure 6, it can be seen that SI and MI are used to generate a certain number of ’s neighbors, which compose a neighbor set S, by probability. For each neighbor in , the corresponding value of is calculated to estimate the effect of this local move. It is noticeable that the quality of is only estimated by the value of rather than its fitness , which differs from the general LS method as shown in Figure 1. This is because that the evaluation of fitness is too time-consuming (see the relevant description in Section 2.1), while the total evaluation number in each generation is always limited. Finally, the best neighbor with the smallest value of would be evaluated and replace s if its fitness is lower than .

3.4. Increasing Population Diversity

The major problem when EAs are applied for dynamic optimization problems is that the converging population cannot adapt to the changing environments. The random immigrant scheme, where the population is partially replaced by the new-generated chromosomes at every generation, is a simple and efficient scheme for EAs in dynamic environments since it can introduce a constant diversity into the population [31, 32]. Obviously, it is more helpful to migrate random chromosomes into a converging population than a spread-out one. Thus, it is not necessary to always inject a constant number of random chromosomes into the population at every generation.

Here, we propose an economical random immigrant scheme for the proposed MA, where the immigrant ratio of every generation can be calculated based on the value of a population diversity index . The index can be calculated as follows:where and are the best and average fitness among the fitness values of the population, respectively. The index is a fitness-based measurement of population diversity, and it can be seen as a measurement of the convergence degree of an algorithm. If , there is a high population diversity and therefore the algorithm is far from convergence; if , there is a low population diversity, which means that the convergence is approaching.

With the definition of , the immigrant ratio can be calculated by the following formula:where and are, respectively, the predefined maximum and minimum value of and are preset constant to control the decreasing or increasing speed of . From this formula, it is easy to understand that less random immigrants could be introduced into the population in the presence of higher diversity (i.e., when ) as a result of . However, when the population is converging (i.e., when ), , which increases the level of random immigrants with a great degree.

4. Experimental Study

4.1. Dynamic Test Environment

In this section, the behaviors of investigated algorithms are examined on a series of DEVRPs which are constructed based on an instance from VRPLIB (http://comopt.ifi.uni-heidelberg.de/software/TSPLIB95/). In the basic VRP instance eil51, one depot node and customer nodes are defined by the points in the plane and the cost for each edge is the Euclidean distance between node and node . For each EV, the maximal load capacity is set to 160 and the capacity of battery is set to 20 kWh.

According to the generated method of dynamic optimization problems by Wang et al. [33], the dynamics of DEVRPs is defined as follows: every generations, the demands of customers change according to a random Gaussian variable. More formally, a change of a single customer can be described as follows:where denotes the demand of customer at time , and denote the allowable maximum and minimum demand of a customer, respectively, and denotes a normal distribution random number in . In fact, is equal to the capacity of an EV, while is set to 1 in order to ensure that the demand of each customer is nonnegative. Moreover, the mean demand and variance of all customers is 15 and 8, respectively, in the original data set. According to this constructing method, the parameter controls the speed of changes while controls the severity of changes. A bigger means more severe changes while a smaller means more frequent changes.

In this study, the change speed parameter is set to 50, 100, and 200, respectively, which means that the environment changes very fast, in the moderate speed, and slowly, respectively. The change severity parameter is set to 0.1, 0.5, and 1.0, respectively, in order to examine the performance of algorithms in dynamic environments with different severities: from slight change to moderate variation to intense change . In order to study the behavior of algorithms in randomly changing environment, we also set to be a random number uniformly distributed in , i.e., . In total, a series of 12 different dynamic problems are constructed from each stationary test problem. The dynamics parameter settings are summarized in Table 1.

4.2. Experimental Design

Experiments are carried out in order to study the major features of our proposed MAs and to compare their performance with several other peer algorithms. The following abbreviations represent the algorithms considered in this paper.(1)SGA: A simple GA(2)SGAr: A simple GA with restart from scratch whenever the environment changes(3)RIGA: SGA with the traditional random immigrants’ scheme(4)MA-Inverse: MA with the simple reverse-based LS operator(5)MA-Swap: MA with the simple swap-based LS operator(6)MA-Insert: MA with the simple insert-based LS operator(7)MA-ALS: MA with our proposed ALS operator(8)MAr-ALS: MA-ALS with a simple restart scheme(9)RIMA-ALS: MA with the ALS operator and our proposed economical random immigrant scheme

The following parameters are used in all algorithms: the population size is set to 100 for all MAs, but is set to 120 for SGA, SGAr, and RIGA because the LS operation in MAs may be executed by steps per generation. It is easy to see that the total number of evaluations per generation is always 120 for all algorithms. The order crossover probability equals 0.8 and the inversion mutation probability is set to 0.2 for all GAs and MAs. The specific parameters in our proposed algorithms are set as follows: , , and for the random immigrant scheme, while , , and in the ALS operator.

For each experiment of an algorithm on a test problem, 30 independent runs were executed with the same set of random seeds. For each run of an algorithm on a DOP, 10 environmental changes were allowed and the best-of-generation fitness was recorded per generation. The overall offline performance of an algorithm is defined as the best-of-generation fitness averaged across the number of total runs and then averaged over the data gathering period, as formulated below:where is the number of generations i.e., , is the total number of runs, and is the best-of-generation fitness of generation of run .

4.3. Experimental Study on the Effect of LS Operators

In the experimental study on LS operators, we investigate the performance of MAs with different LS operators, with the aim of examining the effect of our proposed operator in Section 3. In particular, five different algorithms, including MA-ALS, MA-Inverse, MA-Swap, MA-Insert, and SGA, are tested on the stationary instance of DEVRP. For each run of an algorithm on each problem, the maximum allowable number of generations was set to 200. The experimental results are shown in Figure 7, where the data were also averaged over 30 runs.

From Figure 7, it can be seen that the performance of four MAs is always better than that of SGA on the stationary EVRP. This is because the LS operators can improve the performance of an algorithm significantly. The similar results have been obtained by many researchers. It is noticeable that the behavior of different MAs is different during the running course. MA-Insert, MA-Inverse, and MA-Swap significantly outperform SGA at the early iterative stage of running the algorithms, while their behavior becomes a little disappointing at the late iterative stage. However, MA-ALS almost performs much better than other algorithms on the whole course of running. This is because three different LS operators are employed and selected to execute local improvements for an individual with an adaptive mechanism in MA-ALS. Obviously, our proposed LS operator, which combines the features of two inverse operation in an adaptive learning strategy, can improve the individual more efficiently than each of the three single LS operators. The result that MA-ALS outperforms significantly other algorithms shows the efficiency of the proposed ALS operator.

4.4. Experimental Study on the Performance of MAs on DEVRPs

In the experiments on DEVRPs constructed in Table 1, we attempt to compare the performance of our proposed algorithm RIMA-ALS with some other peer algorithms, including SGAr, RIGA, MA-ALS, and MAr-ALS described in Section 4.2.

The experimental results with respect to the overall offline performance are presented in Table 2 and plotted in Figure 8. The corresponding statistical results of comparing algorithms by the one-tailed -test with 58 degrees of freedom at a 0.05 level of significance are given in Table 3. In Table 3, the -test results regarding Alg. 1 − Alg. 2 are shown as “,”, “,” “,” or “” when Alg. 1 is significantly better than, significantly worse than, insignificantly better than, or insignificantly worse than Alg. 2, respectively. From Table 2, Table 3, and Figure 8, several results can be observed and are analyzed below.

First, RIMA-ALS significantly outperforms other peer algorithms on most DEVRPs, as indicated in the relevant t-test results in Table 3. This result validates our expectation of the proposed MA for solving DEVRPs. In RIMA-ALS, the ALS operator can make a sufficient exploitation for the best-fitness chromosome. Many researches in the literature often utilize the LS technique to refine each chromosome in the current population. This is too costly or impossible for an MA in dynamic environments considering that the total cost per iteration in terms of evaluations is always limited. The ALS operator is only applied for the best-fitness chromosome in our proposed algorithm and can always help the algorithm keep tracking a changing high-quality solution, which is obviously feasible, especially on real-world applications. Moreover, the population is also introduced a certain degree of diversity by the economical random immigrant scheme, which can help the algorithm adapt well to environmental changes, especially when the environment is subject to significant changes. More relevant analysis on the effect of the operators in RIMA-ALS will be explained in the late experimental analysis.

Second, MAr-ALS performs much worse than MA-ALS and RIGA, but it is better than SGAr on all dynamic problems. Given the perfect restart scheme, MAr-ALS does not reuse any information from the past population, but only restarts to search the equivalent problem from a random initial state when an environmental change occurs. It is easy to understand that MAs or GAs with the restart scheme always take too much cost to reachieve the high-fitness solutions. The result that MAr-ALS significantly underperforms MA-ALS and RIGA on DEVRPs indicates the importance of reusing the useful information in the past population for MA in dynamic environments. The reason why MAr-ALS outperforms SGAr lies in that the ALS operator can help the algorithm obtain the high-fitness solution more quickly, which has been obtained in the experiment in the last section.

Third, MA-ALS always performs well on most DEVRPs, even outperforms RIMA-ALS on DEVRP when and . This happens because that a new environment is close to the previous one when the value of is very small. It is much advantageous to execute sufficient exploitation for the elite chromosome in the current population when a slight environmental change occurs in terms of severity. When increases to 0.5 or 1.0, MA-ALS is significantly beaten by RIMA-ALS, which shows the necessity of maintaining the sufficient population diversity for MAs in dynamic environments. Since it is usually not possible to judge the strength and weakness of an environmental change in advance, the proposed random immigrant scheme is a good choice when MAs are applied for DOPs; notice that RIMA-ALS always performs better than MA-ALS on all DEVRPs when .

Finally, the environmental parameters affect the performance of algorithms. The performance of all algorithms increases when the value of increases from 50 to 100 to 200. When becomes larger, algorithms have more time to find better solutions before the next change. However, the performance of algorithms hardly changes with the increasing or decreasing of the value of .

5. Conclusion

In this paper, an adaptive MA is proposed for addressing a special DEVRP that is characterized by the time-varying customer demands during the route design stage, under the consideration of the influence of the variable load degree upon the driving range of EVs. In the proposed MA method, there are three major algorithmic schemes. The first is to develop a splitting procedure to translate an individual with simple permutation-based representation to a solution of DEVRP. The second is to propose an ALS operator that utilizes two different inverse-based LS operations, that is, SI and MI, in an adaptive cooperation way. The last is to employ an economical random immigrant scheme to maintain a sufficient population diversity for the proposed algorithm to adapt well to the changing environment.

From the experimental results based on a series of DEVRPs, which are constructed from a stationary instance of VRP, the following conclusions can be drawn on the dynamic test problems. Firstly, MAs with the ALS operator enhanced by the suitable diversity technique can exhibit a better performance in solving DEVRPs. For most test problems, RIMA-ALS, which hybridizes GA with the ALS operator and the economical random immigrant scheme, outperforms other peer algorithms. Secondly, the ALS operator can help the algorithm execute a robust local refinement since it employs multiple LS operators via an adaptive cooperation mechanism. In our experiments, MA-ALS performs much better than other MAs with a single LS operator for the stationary instance of VRP. Thirdly, maintaining a sufficient population diversity and reusing the past information efficiently are both important for MAs in dynamic environments. For most DEVRPs, RIMA-ALS always performs better than MA-ALS and MAr-ALS, while MA-ALS outperforms MAr-ALS. Finally, the environmental parameter can affect the performance of algorithms. In our experiments, algorithms perform better with the increase of the frequency of changes, while the influence of the severity of change seems very limited. Generally speaking, the experimental results indicate that the proposed algorithm, where an EA is hybridized with the ALS operator and an economical random immigrant scheme, seems a good optimizer for DEVRPs.

For the future work, it is straightforward to examine the performance of our proposed algorithm for more DEVRP instances. For example, a mixed fleet of EVs or multiple charging stations can be considered in DEVRPs. Another interesting research topic is to investigate the effect of hybridizing the other LS operations, such as those especially used for EVRPs, within this adaptive learning way. In addition, it is also valuable to carry out the sensitivity analysis on the effect of parameters, e.g., , , etc., on the performance of the proposed algorithm in the future.

Data Availability

The data used to support the findings of this study are available within the article.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under Grant nos. 61703290 and 71671032, the Natural Science Fund Guidance Program of Liaoning Province under Grant no. 2019-ZD-0478, and the Fundamental Research Funds for the Central Universities under Grant no. N180408019.