Abstract
A hybrid optimization algorithm combining the finite state method (FSM) and the genetic algorithm (GA) is proposed to solve the crude oil scheduling problem. The two methods are combined so that each compensates for the deficiencies of the other. In the proposed algorithm, the FSM makes up for the weak local search ability of the GA, and the heuristic returned by the FSM guides the GA towards good solutions. The underlying idea is that promising substructures or partial solutions can be generated by the FSM. Furthermore, the FSM can guarantee that the entire solution space is uniformly covered. Therefore, the combination of the two algorithms has better global performance than the existing GA or FSM operated individually. Finally, a real-life crude oil scheduling problem from the literature is used for simulation. The experimental results show that the proposed method outperforms the state-of-the-art GA method.
1. Introduction
In recent years, refineries have had to explore all potential cost-saving strategies because of intense competition arising from fluctuating product demands and ever-changing crude prices. Scheduling of crude oil operations is a critical task in overall refinery operations [1–3]. Basically, the optimization of crude oil scheduling operations consists of three parts [4]. The first part involves crude oil unloading, mixing, transferring, and multilevel crude oil inventory control. The second part deals with fractionation, reaction scheduling, and the control of a variety of intermediate product tanks. The third part involves finished product blending and distribution. In this paper, we focus on the first part, as it is a critical component of refinery scheduling operations. The crude oil scheduling problem is often formulated as a mixed integer nonlinear programming (MINLP) model [2, 5, 6]. The solution approaches for MINLP can be roughly divided into two categories [7]: deterministic approaches and stochastic approaches. Some deterministic methods have been available for many years [8]. These methods require a prior step that identifies and eliminates nonconvexities; they decompose the MINLP model into related nonlinear programming (NLP) and mixed integer linear programming (MILP) subproblems, which are then solved iteratively. The most common algorithms are branch and bound [9], outer approximation [10], and generalized Benders decomposition [11]. Some commercial MINLP solvers have also been developed to solve such problems to optimality [12]. However, these commercial solvers can only handle MINLPs with special properties. The other stream of global optimization consists of stochastic algorithms, for example, simulated annealing (SA), GA, and their variants [7].
GAs, proposed by Holland [13], have developed rapidly because of their simple concept, easy implementation, and global search capability that is independent of gradient information. Considerable attention has been given to the development of GAs for MINLP. For instance, Yokota et al. developed a penalty function that is suitable for solving MINLP problems [14]. Costa and Oliveira also implemented another type of penalty function to solve various MINLP problems, including industrial-scale problems [15]. They also noted that the evolutionary approach is efficient in terms of the number of function evaluations and is well suited to handling the difficulties of nonconvexity. Going one step further, some mixed coding methods were proposed, which include the mixed-coding genetic algorithm [15] and the information-guided genetic algorithm (IGA). Ponce-Ortega et al. [16] proposed a two-level approach based on GA to optimize heat exchanger networks (HENs). The outer level performs the structural optimization, for which a binary GA is used. Björk and Nordman [17] showed that the GA is well suited to solving large-scale heat exchanger network problems.
The two approaches discussed above each have their own advantages and disadvantages. On the one hand, a deterministic approach usually involves considerable algebra and rigorous analysis of the problem itself, whereas the evolutionary approach does not. On the other hand, some deterministic approaches, such as mathematical programming, usually cannot provide practical solutions in reasonable time, whereas the evolutionary approach can generate satisfactory solutions. In this work, a novel genetic algorithm that combines the finite state method and GA is proposed to solve the crude oil scheduling problem. An MINLP model is formulated based on the single-operation sequencing (SOS) time representation. A deterministic finite automaton (DFA) model that captures the valid possible schedule sequences is constructed based on the sequencing rules. The initialization and mutation operations of the GA are based on this model, which builds legal schedules complying with the sequencing rules and operating conditions. Thus, the search space of the algorithm is substantially reduced, as only legal sequences are explored. The rest of the paper is organized as follows: the MINLP model is specified in Section 2. Section 3 reviews the background of finite state theory. In Section 4, a novel genetic algorithm that combines the finite state method and GA is proposed to solve the MINLP model. A test problem is studied to verify our approach in Section 5. In the last section, concluding remarks are given.
2. Mathematical Model
In this section, the MINLP model of the refinery crude oil scheduling problem is described [18]. This problem has been widely studied from the optimization viewpoint since the work of Lee et al. [19]. It consists of crude oil unloading from marine vessels to storage tanks, transfer and blending between tanks, and distillation of crude mixtures. The goal is to maximize profit and meet distillation demands for each type of crude blend (e.g., low sulfur or high sulfur blends), while satisfying unloading and transfer logistics constraints, inventory capacity limitations, and property specifications for each blend. The logistics constraints involve nonoverlapping constraints between crude oil transfer operations.
2.1. Sets
The following sets will be used in the model.
(i) the set of priority slots;
(ii) the set of all operations, that is, the union of the unloading, tank-to-tank transfer, and distillation operations;
(iii) the set of unloading operations;
(iv) the set of tank-to-tank transfer operations;
(v) the set of distillation operations;
(vi) the set of all resources, that is, the union of the vessels, storage tanks, charging tanks, and distillation units;
(vii) the set of vessels;
(viii) the set of storage tanks;
(ix) the set of charging tanks;
(x) the set of distillation units;
(xi) the set of inlet transfer operations on a given resource;
(xii) the set of outlet transfer operations on a given resource;
(xiii) the set of products (i.e., crudes);
(xiv) the set of product properties (e.g., crude sulfur concentration).
2.2. Parameters
Parameters used in the paper are defined below:
(i) the scheduling horizon;
(ii) the bounds on the total volume transferred during a transfer operation; in all instances, unloading operations are the exception, for which the bounds equal the volume of crude in the marine vessel;
(iii) the bounds on the number of distillations;
(iv) the flow rate limitations for a transfer operation;
(v) the arrival time of a vessel;
(vi) the limits on a property of the blended products transferred during an operation;
(vii) the value of a property of a crude;
(viii) the capacity limits of a tank;
(ix) the bounds on the demand for products to be transferred out of a charging tank during the scheduling horizon;
(x) the gross margin of a crude.
2.3. Variables
2.3.1. Assignment Variables
The binary assignment variable equals 1 if the operation is assigned to the priority slot, and 0 otherwise.
2.3.2. Time Variables
The start time variable is the start time of the operation if it is assigned to the priority slot, and 0 otherwise.
The duration variable is the duration of the operation if it is assigned to the priority slot, and 0 otherwise.
2.3.3. Operation Variables
The total volume variable is the total volume of crude transferred during the operation if it is assigned to the priority slot, and 0 otherwise.
The crude volume variable is the volume of a given crude transferred during the operation if it is assigned to the priority slot, and 0 otherwise.
2.3.4. Resource Variables
The total level variable is the total accumulated level of crude in a tank before the operation assigned to the priority slot.
The crude level variable is the accumulated level of a given crude in a tank before the operation assigned to the priority slot.
2.4. Objective Function
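The objective is to maximize the gross margins of the distilled crude blends, that is, the individual gross margin of each crude multiplied by the volume of that crude fed to distillation, summed over all crudes, distillation operations, and priority slots,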
2.5. General Constraints
It should be noted that the crude composition of blends in tanks is tracked instead of their properties. The distillation specifications are later enforced by calculating a posteriori the properties of the blend in terms of its composition. For instance, in the problem, a blend composed of 50% of crude A and 50% of crude B has a sulfur concentration of 0.035 which does not meet the specification for crude mix X nor for crude mix Y.
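The a posteriori property check described above can be illustrated with a minimal sketch of the linear mixing rule. The sulfur values of crudes A and B and the specification ranges for mixes X and Y are assumptions chosen so that the 50/50 blend reproduces the value 0.035 quoted in the text; the function names are hypothetical.

```python
# Linear mixing rule: the property of a blend is the volume-weighted
# average of the properties of its component crudes.

def blend_property(volumes, crude_property):
    """Return the property value of a blend given per-crude volumes."""
    total = sum(volumes.values())
    if total <= 0:
        raise ValueError("blend must have positive total volume")
    return sum(vol * crude_property[c] for c, vol in volumes.items()) / total


def meets_spec(value, lower, upper):
    """True when the blend property lies within [lower, upper]."""
    return lower <= value <= upper


# Assumed sulfur concentrations (hypothetical, chosen to reproduce 0.035).
SULFUR = {"A": 0.01, "B": 0.06}
# Assumed specification ranges for the two crude mixes (hypothetical).
SPEC = {"X": (0.015, 0.025), "Y": (0.045, 0.055)}

sulfur_50_50 = blend_property({"A": 50.0, "B": 50.0}, SULFUR)
ok_for_x = meets_spec(sulfur_50_50, *SPEC["X"])
ok_for_y = meets_spec(sulfur_50_50, *SPEC["Y"])
```

Under these assumed values the 50/50 blend falls between the two specification ranges, so it is rejected for both mixes, which is the situation the paragraph describes.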
2.5.1. Assignment Constraints
In the SOS model, exactly one operation has to be assigned to each priority slot,
2.5.2. Variable Constraints
Variable constraints are given by their definitions. Start time, duration, and global volume variables are defined with big-M constraints,
Crude volume variables are positive variables whose sum equals the corresponding total volume variable,
Total and crude level variables are defined by adding to the initial level in the tank all inlet and outlet transfer volumes of operations of higher priority than the considered priority slot,
2.5.3. Sequencing Constraints
Sequencing constraints restrict the set of possible sequences of operations. Cardinality and unloading sequence constraints are specific cases of sequencing constraints. More complex sequencing constraints will also be discussed later.
2.5.4. Cardinality Constraint
Each crude oil marine vessel has to unload its content exactly once. The total number of distillation operations is bounded below and above in order to reduce the cost of CDU switches,
2.5.5. Unloading Sequence Constraint
Marine vessels have to unload in order of arrival at the refinery. For any two vessels, the vessel that arrives first unloads before the other,
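The cardinality and unloading sequence rules can be checked on a candidate sequence with a few lines of code. This is a sketch under stated assumptions: operations are integer identifiers, `unloadings` lists the unloading operations in vessel arrival order, and the function name and example identifiers are hypothetical.

```python
def check_sequencing(sequence, unloadings, distillations, n_dist_min, n_dist_max):
    """Check cardinality and unloading-order rules on an operation sequence.

    sequence      -- list of operation ids, one per priority slot
    unloadings    -- unloading operation ids, in vessel arrival order
    distillations -- set of distillation operation ids
    """
    # Cardinality: every vessel unloads exactly once.
    if any(sequence.count(u) != 1 for u in unloadings):
        return False
    # Unloading order: slots of the unloadings must follow arrival order.
    positions = [sequence.index(u) for u in unloadings]
    if positions != sorted(positions):
        return False
    # Cardinality: number of distillations within the given bounds.
    n_dist = sum(1 for op in sequence if op in distillations)
    return n_dist_min <= n_dist <= n_dist_max


UNLOADINGS = [1, 2, 3]      # hypothetical ids, arrival order
DISTILLATIONS = {7, 8}      # hypothetical ids

valid = check_sequencing([7, 1, 8, 2, 3, 7], UNLOADINGS, DISTILLATIONS, 2, 4)
wrong_order = check_sequencing([7, 2, 1, 3, 8], UNLOADINGS, DISTILLATIONS, 2, 4)
```

In the MINLP model these rules are expressed through the assignment variables; the check above only mirrors their effect on an already fixed sequence.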
2.5.6. Scheduling Constraints
Scheduling constraints restrict the values taken by time variables according to logistics rules.
2.5.7. Nonoverlapping Constraint
A nonoverlapping constraint between two sets of operations states that no pair of operations, one taken from each set, may be executed simultaneously.
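As an illustration, the nonoverlapping condition can be tested on fixed start times and durations. The sketch below assumes each operation is given as a `(start, duration)` pair; it checks a finished schedule and is not the big-M formulation used in the model.

```python
def overlaps(start_a, dur_a, start_b, dur_b):
    """True when two operations share a time interval of positive length."""
    end_a = start_a + dur_a
    end_b = start_b + dur_b
    return start_a < end_b and start_b < end_a


def non_overlapping(ops_1, ops_2):
    """Check every pair (a, b) with a in ops_1 and b in ops_2.

    Each operation is a (start, duration) tuple.
    """
    return not any(
        overlaps(sa, da, sb, db) for sa, da in ops_1 for sb, db in ops_2
    )


back_to_back = non_overlapping([(0.0, 2.0)], [(2.0, 3.0)])
clash = non_overlapping([(0.0, 3.0)], [(2.0, 1.0)])
```

Two operations that touch at a single time point (one ends exactly when the other starts) are treated as nonoverlapping here.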
Unloading operations must not overlap,
Inlet and outlet transfer operations on a tank must not overlap,
Although we do not consider crude settling in storage tanks after vessel unloading, it could be included in the model with a modified version of constraint (14) that takes transition times into account. We define a transition time after each unloading operation and TR as the maximum transition time,
Constraint (15) is valid in the four possible cases:
A tank may charge only one CDU at a time,
A CDU may be charged by only one tank at a time,
To avoid schedules in which a transfer is being performed twice at a time, thus possibly violating the flow rate limitations, constraint (19) is included in the model,
2.5.8. Continuous Distillation Constraint
It is required that CDUs operate without interruption. As CDUs perform only one operation at a time, the continuous operation constraint is defined by equating the sum of the duration of distillations to the time horizon,
2.5.9. Resource Availability Constraint
Unloading of a crude oil vessel may start only after the vessel arrives at the refinery, that is, no earlier than its arrival time,
2.5.10. Operation Constraints
Operation constraints restrict the values taken by operation and time variables according to operational rules.
2.5.11. Flow Rate Constraint
The flow rate of each transfer operation is bounded by its lower and upper flow rate limits,
2.5.12. Property Constraint
The property of the blended products transferred during an operation is bounded by lower and upper limits. The property of the blend is calculated from the properties of its component crudes, assuming that the mixing rule is linear,
2.5.13. Composition Constraint
It has been shown that processes including both mixing and splitting of streams cannot be expressed as a linear model. Mixing occurs when two streams are used to fill a tank and is expressed linearly in constraint (10). Splitting occurs when partially discharging a tank, resulting in two parts: the remaining content of the tank and the transferred products. This constraint is nonlinear. The composition of the products transferred during a transfer operation must be identical to the composition of the origin tank,
Constraint (24) is reformulated as an equation involving bilinear terms,
Note that constraint (25) is correct even when the operation is not assigned to the priority slot, as then the transferred volumes are zero and both sides of the equation vanish.
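The bilinear form of the composition constraint can be sketched numerically. The argument names below are assumptions, not the paper's notation: the check states that the crude fraction in the transferred volume equals the crude fraction in the origin tank, written without divisions.

```python
import math


def composition_ok(vol_crude, vol_total, level_crude, level_total, tol=1e-9):
    """Bilinear composition check.

    The ratio form  vol_crude / vol_total == level_crude / level_total
    is rewritten as vol_crude * level_total == level_crude * vol_total,
    which stays well defined when the volumes are zero.
    """
    lhs = vol_crude * level_total
    rhs = level_crude * vol_total
    return math.isclose(lhs, rhs, rel_tol=tol, abs_tol=tol)


# Tank holds 100 units, 30 of which are crude A (hypothetical numbers).
consistent = composition_ok(12.0, 40.0, 30.0, 100.0)    # 30% of A in the transfer
inconsistent = composition_ok(20.0, 40.0, 30.0, 100.0)  # 50% of A in the transfer
unassigned = composition_ok(0.0, 0.0, 30.0, 100.0)      # no transfer in this slot
```

The last call shows why the product form is convenient: an unassigned operation has zero volumes, so the equation holds trivially without any special case.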
2.5.14. Resource Constraints
Resource constraints restrict the use of resources throughout the scheduling horizon.
2.5.15. Tank Capacity Constraint
The level of materials in each tank must remain between its minimum and maximum capacity limits. The initial total level and the initial level of each crude in the tank are given. As simultaneous charging and discharging of tanks is forbidden, the following constraints are sufficient:
2.5.16. Demand Constraint
Demand constraints define lower and upper limits on the total volume of products transferred out of each charging tank during the scheduling horizon,
3. Finite State Theory
This section presents, in a somewhat informal way, the basic notions and definitions from formal language and finite state theory that are relevant to the sections that follow. Related definitions are taken from the literature [20, 21]. Readers who are unfamiliar with formal language theory are advised to consult these sources whenever necessary.
3.1. Finite State Automata
A DFA is a 5-tuple $(Q, \Sigma, q_0, F, \delta)$, where $Q$ is a set of states, $\Sigma$ is an alphabet, $q_0 \in Q$ is the initial state, $F \subseteq Q$ is a set of final states, and $\delta$ is a transition function mapping $Q \times \Sigma$ to $Q$. That is, for each state $q$ and symbol $a$, there is at most one state that can be reached from $q$ by “following” $a$ (Figure 2).
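A DFA of this kind maps directly onto a small data structure. The sketch below stores the transition function as a dictionary, so a missing entry models the "at most one successor" case; the class name and the toy language (one or more repetitions of "ab") are illustrative assumptions.

```python
class DFA:
    """Deterministic finite automaton with a partial transition function."""

    def __init__(self, states, alphabet, start, finals, delta):
        self.states = set(states)
        self.alphabet = set(alphabet)
        self.start = start
        self.finals = set(finals)
        self.delta = dict(delta)  # (state, symbol) -> next state

    def accepts(self, word):
        """Follow the transitions symbol by symbol; reject on a missing one."""
        state = self.start
        for symbol in word:
            key = (state, symbol)
            if key not in self.delta:
                return False
            state = self.delta[key]
        return state in self.finals


# Toy automaton accepting "ab", "abab", "ababab", ...
toy = DFA(
    states={0, 1, 2},
    alphabet={"a", "b"},
    start=0,
    finals={2},
    delta={(0, "a"): 1, (1, "b"): 2, (2, "a"): 1},
)
```

For example, `toy.accepts("abab")` follows 0, 1, 2, 1, 2 and ends in a final state, while `toy.accepts("aba")` ends in state 1 and is rejected.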
3.2. Finite State Transducers
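A finite state transducer (FST) is a 6-tuple $(Q, \Sigma_1, \Sigma_2, q_0, F, \delta)$, where the state set $Q$, the initial state $q_0$, and the set of final states $F$ are the same as for a DFA, $\Sigma_1$ is the input alphabet, $\Sigma_2$ is the output alphabet, and $\delta$ is a function mapping $Q \times \Sigma_1^{*}$ to a subset of the power set of $Q \times \Sigma_2^{*}$ (Figure 3). Intuitively, an FST is much like a nondeterministic finite automaton (NFA) except that transitions are made on strings instead of symbols and, in addition, they have outputs.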
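A simplified transducer can be sketched as follows. To keep the example short, it assumes that each transition reads a single input symbol (rather than a string) and emits an output string, and it allows several alternatives per transition to reflect the nondeterminism; the function name and the toy rules are hypothetical.

```python
def transduce(transitions, start, finals, word):
    """Return the set of output strings an FST can produce for `word`.

    transitions -- dict mapping (state, input_symbol) to a set of
                   (next_state, output_string) pairs
    """
    configs = {(start, "")}  # reachable (state, output so far) pairs
    for symbol in word:
        next_configs = set()
        for state, out in configs:
            for new_state, emitted in transitions.get((state, symbol), ()):
                next_configs.add((new_state, out + emitted))
        configs = next_configs
    return {out for state, out in configs if state in finals}


# Toy transducer: "a" becomes "x"; "b" becomes "y" or is deleted.
TOY = {
    (0, "a"): {(0, "x")},
    (0, "b"): {(0, "y"), (0, "")},
}

outputs = transduce(TOY, start=0, finals={0}, word="ab")
```

Because "b" has two alternatives, the input "ab" yields two outputs, which is the behaviour that distinguishes a transducer relation from a function.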
3.3. Finite State Calculus
As argued in Karttunen [22–25], many of the rules used can be analyzed as special cases of regular expressions. They extend the basic regular expression syntax with new operators. These extensions make finite state automata and finite state transducers more suitable for particular applications. The system described below was implemented using FSA Utilities [26], a package for implementing and manipulating finite state automata, which provides possibilities for defining new regular expression operators. The part of the built-in regular expression syntax of FSA that is relevant to this paper is listed in Table 4.
One particularly useful extension of the basic syntax of regular expressions is the replace operator. Karttunen [22–25] argues that many phonological and morphological rules can be interpreted as rules which replace a certain portion of the input string. Although several implementations of the replace operator have been proposed, the most relevant case for our purposes is the so-called “leftmost longest-match” replacement. In case of overlapping rule targets in the input, this operator replaces the leftmost target, and in cases where a rule target contains a prefix which is also a potential target, the longer sequence is replaced. Gerdemann and van Noord [27] implement leftmost longest-match replacement in FSA as an operator whose arguments are Target, a transducer defining the actual replacement, and LeftContext and RightContext, regular expressions defining the left and right context of the rule, respectively. The segmentation task discussed in the mutation procedure makes crucial use of longest-match replacement.
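The leftmost longest-match policy can be illustrated on plain strings. The sketch below is not the FSA Utilities operator: it ignores left and right contexts and takes a dictionary of literal targets, both of which are simplifying assumptions.

```python
def replace_leftmost_longest(text, rules):
    """Rewrite `text` left to right, always taking the longest target.

    rules -- dict mapping literal target strings to their replacements
    """
    # Try longer targets first so a target wins over its own prefixes.
    targets = sorted((t for t in rules if t), key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for target in targets:
            if text.startswith(target, i):
                out.append(rules[target])
                i += len(target)
                break
        else:
            # No target starts here: copy one symbol and move on.
            out.append(text[i])
            i += 1
    return "".join(out)


RULES = {"ab": "X", "abc": "Y", "bc": "Z"}  # hypothetical rules
result = replace_leftmost_longest("abcbc", RULES)
```

On "abcbc" both "ab" and "abc" match at the first position; the longer target "abc" is chosen, after which "bc" matches, giving "YZ".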
4. The Hybrid Algorithm
From the standpoint of optimization efficiency and robustness, a novel two-level optimization framework based on the finite state method and GA is proposed for the MINLP model in this section.
4.1. Two-Level Optimization Structure
As the foundation of the framework, a two-level optimization structure is introduced. Once all binary variables are fixed, the original problem becomes a simpler model with only continuous variables. Following this idea, we rewrite (5) as a two-level problem in terms of the continuous and binary variables. Equation (30) shows that when the binary variables are fixed, the submodel can be solved to optimality by continuous optimization solvers in the inner level; the binary variables are then updated towards the best binary solution in the outer level.
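The two-level structure can be sketched on a toy problem. Everything below is an assumption made for illustration: the outer level enumerates binary vectors (the paper uses a GA there), and the inner level is a small linear allocation problem solved greedily (the paper uses a continuous optimization solver on the scheduling submodel).

```python
from itertools import product

# Hypothetical toy data: margin per unit, capacity per unit, shared budget.
MARGIN = (3.0, 2.0, 5.0)
CAP = (4.0, 6.0, 3.0)
BUDGET = 8.0
MAX_ACTIVE = 2  # at most two binaries may be switched on


def solve_inner(y):
    """Continuous subproblem for fixed binaries y.

    Maximize sum(MARGIN[i] * x[i]) with 0 <= x[i] <= CAP[i] * y[i] and
    sum(x) <= BUDGET; filling by decreasing margin is optimal here.
    """
    remaining = BUDGET
    x = [0.0] * len(y)
    for i in sorted(range(len(y)), key=lambda k: MARGIN[k], reverse=True):
        if y[i]:
            x[i] = min(CAP[i], remaining)
            remaining -= x[i]
    value = sum(m * xi for m, xi in zip(MARGIN, x))
    return value, x


def solve_outer():
    """Outer level: search over binaries, keep the best inner optimum."""
    best = None
    for y in product((0, 1), repeat=len(MARGIN)):
        if sum(y) > MAX_ACTIVE:
            continue
        value, x = solve_inner(y)
        if best is None or value > best[0]:
            best = (value, y, x)
    return best


best_value, best_y, best_x = solve_outer()
```

The point of the decomposition is visible even at this scale: the outer search only ever sees binary vectors, and each candidate is scored by the optimum of its continuous subproblem.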
We use the example in Figure 4 to show how a binary solution can be mapped to a scheduling sequence. In the schedule, an entry 7 in position 1 means that operation 7 is assigned to priority slot 1, which corresponds to setting the associated binary decision to 1.
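The mapping between a scheduling sequence and the binary assignment decisions can be written out explicitly. The sketch assumes operations are numbered from 1 and that the binary matrix has one row per priority slot and one column per operation; the function names are hypothetical, and the example reuses the sequence "7681325712" quoted later in the text.

```python
def sequence_to_binary(sequence, n_ops):
    """Build z with z[i][v - 1] == 1 when operation v fills slot i + 1."""
    z = [[0] * n_ops for _ in sequence]
    for slot, op in enumerate(sequence):
        z[slot][op - 1] = 1
    return z


def binary_to_sequence(z):
    """Recover the sequence; each slot must hold exactly one operation."""
    sequence = []
    for row in z:
        if sum(row) != 1:
            raise ValueError("each priority slot needs exactly one operation")
        sequence.append(row.index(1) + 1)
    return sequence


seq = [7, 6, 8, 1, 3, 2, 5, 7, 1, 2]
z = sequence_to_binary(seq, n_ops=8)
round_trip = binary_to_sequence(z)
```

The check inside `binary_to_sequence` corresponds to the assignment constraint of the SOS model, which requires exactly one operation per priority slot.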
4.2. Initial Population
Based on the sequencing rules [18] and the extensions to the regular expression calculus [22–25], a DFA model that builds legal schedules complying with the sequencing rules and operating conditions is constructed. The whole set of possible schedules is too large to be processed at once. The DFA model of the schedule constitutes a reasonable framework, capturing all possible schedules and removing many redundant sequences of operations. Initial values of the decision variables must satisfy the equality constraints and operating conditions and therefore represent a feasible operating point.
Here, we still use the instance with 8 operations from Mouret et al. [18] to describe an efficient sequencing rule by using a regular expression. A feasible sequence can be described by the following:
However, this automaton suffers from a serious problem of overgeneration. For example, a sequence that is too short may lead to infeasibility, while a sequence that is too long may result in an unsolvable model. It is an interesting challenge for finite state syntactic description to specify a sublanguage that contains all and only the sequences of valid length.
Our solution is to construct a suitable constraint for the sequences of valid length. The constraint expressions denote a language that admits sequences of valid length but excludes all others. We obtain the desired effect by intersecting the constraint language with the original language of sequence expressions. The intersection of the two languages contains all and only the valid sequences:
The ValidLength constraint is a language that includes all sequences of length :
We have now completed the task of describing the language of valid sequences from the set of possible sequence expressions. It is also possible to create an automaton on the basis of the regular expression and ValidSequence and then generate all possible sequences accepted by the automaton. These steps are implemented using FSA Utilities [26], a package for implementing and manipulating DFAs and finite state transducers. Once all sequences accepted by the automaton have been generated, the corresponding population of all possible binary decisions is generated. In the initialization stage of the GA, the population size is the number of individuals. Given this number, a population of candidate solutions is generated by randomly selecting from the population of all possible binary decisions.
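The generate-then-sample procedure can be sketched without FSA Utilities. The toy automaton below is a hypothetical stand-in for the paper's sequencing rules (symbols U, T, and D denote an unloading, a transfer, and a distillation); restricting the enumeration to one length plays the role of intersecting with the valid-length language.

```python
import random


def enumerate_accepted(delta, start, finals, length):
    """List all symbol sequences of exactly `length` accepted by a DFA.

    delta -- dict mapping (state, symbol) to the next state
    """
    results = []

    def walk(state, prefix):
        if len(prefix) == length:
            if state in finals:
                results.append(tuple(prefix))
            return
        for (src, symbol), dst in sorted(delta.items()):
            if src == state:
                walk(dst, prefix + [symbol])

    walk(start, [])
    return results


def initial_population(candidates, size, seed=None):
    """Draw `size` distinct individuals at random from the legal sequences."""
    rng = random.Random(seed)
    return rng.sample(candidates, min(size, len(candidates)))


# Hypothetical rule set: every block is U, then any number of T, then D.
DELTA = {(0, "U"): 1, (1, "T"): 1, (1, "D"): 0}

legal = enumerate_accepted(DELTA, start=0, finals={0}, length=4)
population = initial_population(legal, size=1, seed=0)
```

Every individual drawn this way is legal by construction, which is the property the initialization stage relies on.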
4.3. Rule-Based Mutation Approach
In the mutation stage, we use a finite state transducer for the rule-based mutation process. The rule-based mutation strategy must obey the sequencing rule and the nonoverlapping constraint so that all solutions involved in the GA are feasible.
The proposed mutation approach is a two-step procedure.
Step 1. Segmentation of the input sequence into a set of subsequences (i.e., the subsequence which belongs to the regular language L7 or L8).
Step 2. Mutation of the subsequences into others.
Formally, the rule-based mutation procedure is implemented as the composition of three transducers (see Algorithm 1).

An example of mutation including the intermediate steps is given for the sequence “7681325712” as shown in Figure 5.
4.3.1. Segmentation Transducer
The segmentation transducer splits an input sequence into subsequences. The goal of segmentation is to provide a convenient representation level for the next mutation step.
Segmentation is defined as shown in Algorithm 2.

The macro “SSequence” defines the set of subsequences. The subsequences which belong to the regular languages L7 and L8 are displayed in Tables 1 and 2. Segmentation attaches the marker “–” to each subsequence. The targets are identified using leftmost longest-match, and thus at each point in the input, only the longest valid segment is marked.
4.3.2. The Mutation Rules
In the GA process, the mutation rules are made by carefully considering the nonoverlapping constraint between operations. A concrete instance that partially illustrates the mutation rules is given in Algorithm 3. Note that the final element of the left context must be a marker and the target itself ends in “–.” This ensures that mutation rules cannot apply to the same subsequence.
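The segmentation and mutation steps can be sketched on the example sequence "7681325712". The subsequence inventory and the rewrite rules below are hypothetical, since the contents of Tables 1 and 2 and of the algorithms are not reproduced here; reading the three composed transducers as segmentation, mutation, and marker removal is also an assumption.

```python
MARKER = "-"  # stands in for the segment marker described in the text

# Hypothetical subsequence inventory and rewrite rules (not the paper's tables).
SUBSEQUENCES = ("768", "76", "132", "13", "57", "12")
MUTATION_RULES = {"768": "786", "132": "123"}


def segment(sequence, subsequences=SUBSEQUENCES):
    """Split `sequence` into chunks using leftmost longest-match.

    Returns (chunk, is_segment) pairs; symbols that start no known
    subsequence are kept as single chunks with is_segment False.
    """
    targets = sorted(subsequences, key=len, reverse=True)
    chunks, i = [], 0
    while i < len(sequence):
        match = next((t for t in targets if sequence.startswith(t, i)), None)
        if match is None:
            chunks.append((sequence[i], False))
            i += 1
        else:
            chunks.append((match, True))
            i += len(match)
    return chunks


def render(chunks):
    """Show the segmentation with a marker after each recognized segment."""
    return "".join(c + MARKER if is_seg else c for c, is_seg in chunks)


def mutate(chunks, rules=MUTATION_RULES):
    """Rewrite each recognized segment that has a rule; keep the rest."""
    return [(rules.get(c, c), True) if is_seg else (c, False)
            for c, is_seg in chunks]


def strip_markers(chunks):
    """Drop segment boundaries and return the plain operation sequence."""
    return "".join(c for c, _ in chunks)


chunks = segment("7681325712")
segmented = render(chunks)
mutated = strip_markers(mutate(chunks))
```

Because each rule rewrites a whole marked segment, a rule can never straddle two segments, which mirrors the role of the marker in the left context described above.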

5. Experimental Study
In this section, the same problem from the literature [18] is used for computational experiments. The proposed methodology is compared with an existing promising algorithm, the mixed-coding GA [15, 28]. Figure 1 depicts the refinery configuration for the problem. The data involved in the problem are given in Table 3. The performance is compared for different computing times, such as 350 s, 500 s, ..., 2400 s. The objective value is used to statistically analyze the optimization results.
The performance comparison between the two methodologies is illustrated in Figure 6, which shows that the hybrid optimization algorithm combining the finite state method and GA statistically outperforms its mixed-coding counterpart. The hybrid algorithm finds feasible solutions very quickly and is able to find better solutions in reasonable time.
In Figure 7, we compare the objective variance at each iteration of the evolution processes of the two methodologies. By tracking the evolution process, we find that the mixed-coding GA easily becomes stuck in a locally optimal sequence solution. This situation can only be improved by increasing the mutation scaling factor. However, this may make convergence difficult unless sufficient iterations are performed. In the hybrid optimization algorithm, the optimization of the binary variables and that of the continuous variables are separated. The performance of the whole methodology mainly depends on the FSM, which captures the most promising schedules and removes many redundant sequences of operations, so that the user can use a small population of the corresponding discrete variables to obtain suboptimal solutions. From Figure 7, we see that the proposed method has converged at 350 iterations, as opposed to 2400 iterations for the mixed-coding GA.
The success of the proposed algorithm lies in its comprehensive analysis of the search space and its capacity to focus the search on the regions containing promising partial solutions. One merit of the hybrid algorithm is that each solution involved in the GA is guaranteed to be feasible through the mutation rules generated by the finite state method, whereas in existing GAs the procedure for generating feasible solutions under complex process constraints is very time-consuming. A deterministic finite automaton (DFA) can easily represent this kind of structure. Furthermore, the complex process constraints can be very difficult to express with mixed integer programming. Consequently, it is impractical to solve the industrial problem with an MIP solver.
6. Conclusion
In this paper, a novel hybrid optimization algorithm that combines the finite state method and GA is proposed. The proposed algorithm constitutes a reasonable framework, capturing both the operating conditions and the sequencing rules of the schedule. The solution captures all possible schedules and removes many redundant sequences of operations. The algorithm is equivalent to introducing new structural information into the optimization process, which helps reduce the risk of becoming trapped in a locally optimal sequence solution. The hybrid optimization algorithm is an effective and robust tool for solving the crude oil scheduling problem in terms of efficiency and reliability. Only algorithms with these two properties are suitable for practical engineering applications.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This research is partially supported by the China National Natural Science Foundation under Grant 61203178, Grant 61304214, and Grant 61290323. The authors thank the financial funds from Shanghai Science and Technology Committee under Grant 12511501002 and Grant 13511501302.