Abstract

Pruning techniques and heuristics are two keys to heuristic search-based planning. The helpful actions pruning (HAP) strategy and relaxed-plan-based heuristics are two representative such methods and are still popular in state-of-the-art planners. Here, we present new analyses of the properties of HAP. Specifically, we identify new reasons why HAP can make a search procedure incomplete. We prove that, in general, HAP is incomplete for planning with conditional effects if factored expansions of actions are used. To preserve completeness, we propose a pruning strategy based on relevance analysis and confrontation, and we show that both relevance analysis and confrontation are necessary. We call it the confrontation and goal relevant actions pruning (CGRAP) strategy. However, CGRAP is computationally hard to compute exactly. Therefore, we suggest practical approximations from the literature.

1. Introduction

Research in AI planning has advanced to a new level. Pioneers from the AI planning community have developed various practical methods that can solve much larger problems than the toy problems of the early days. Two such well-known methods are SAT-based planning [1] and heuristic search-based planning [2]. SAT-based planning translates planning problems into propositional satisfiability problems or, more generally, constraint satisfaction problems (CSPs) [3]. An obvious advantage of the method is that it can exploit the power of fast algorithms from the CSP literature [4-6]. On the other hand, the “planning as search” community has been pursuing more informative heuristics to make the search fast [7, 8]. While recent studies show that planning-specific heuristics can make SAT-based planning methods more competitive [9, 10], the planning as search method has shown its potential in many kinds of planning problems, including classical planning [11, 12], conformant planning [13], contingent planning [14], and probabilistic planning [15]. Recent International Planning Competitions (IPCs, http://ipc.icaps-conference.org/) have witnessed the success of heuristic search-based planners, since the winners, including Fast-Forward (FF) [11], LPG [16], SGPlan [17], Fast Downward [12], and its successor LAMA [18], all employ heuristic search. Two enabling techniques underlying heuristic search-based planning are heuristic functions and pruning techniques. A heuristic function estimates the distance from a state to the goal, while pruning techniques eliminate branches that are safe to ignore. Here, we focus on pruning techniques.

HAP is a pruning strategy developed in FF with the idea of making the search process goal directed [19]. Though it was initially a byproduct of the relaxed-plan heuristic, the notion has become popular and important in the design of top-performing planners, such as Fast Downward [12]. However, the HAP strategy does not guarantee completeness; that is, it may cut branches that can reach the goal. Some such cases were explained by Hoffmann and Nebel [11]. Nevertheless, here, we uncover a new case in which HAP can cause incompleteness. We study the conditions under which the new case occurs, how to remedy the HAP strategy, and the cost of doing so. Based on our work, one can gain more insight into why Fast Downward, which employs the helpful transitions strategy, is powerful.

The rest of the paper is organized as follows. In the next section, we introduce some background. Then, we show the incompleteness of HAP and extend it to a more general strategy called goal relevant actions pruning (GRAP), which is complete only for STRIPS planning. In Section 4, we propose our confrontation and goal relevant actions pruning (CGRAP) strategy, which is complete for both STRIPS planning and planning with conditional effects. In Section 5, we discuss some pruning techniques in the literature that can be seen as approximations of CGRAP. Finally, we conclude the paper and discuss some future work.

2. Background

We first introduce notation for state-space search-based planning, then methods for handling conditional effects, and finally the heuristic function and the HAP strategy of FF.

A planning task is a quadruple $\Pi = \langle P, A, I, G \rangle$, where $P$ is the set of atoms, $A$ is the set of actions, $I \subseteq P$ is a set of atoms called the initial state, and $G \subseteq P$ is the goal condition that each goal state must fulfill. States are denoted by sets of atoms. We adopt the “closed world” assumption; that is, an atom that is not in a state is false in the state. So, if $p \in P$ and $p \notin s$ for a state $s$, then $s$ is logically equal to $s \cup \{\neg p\}$. An action $a \in A$ is a pair $\langle pre(a), E(a) \rangle$, where $pre(a)$ is a set of atoms denoting the preconditions of $a$ and $E(a)$ is the set of conditional effects of $a$. Each conditional effect $e \in E(a)$ has the form $\langle con(e), add(e), del(e) \rangle$, where $con(e)$, $add(e)$, and $del(e)$ are the conditions, add effects, and delete effects of $e$, respectively. For an action $a$ and a state $s$, if $pre(a) \subseteq s$, then we say $a$ is applicable in $s$. We use $App(s)$ to denote all the actions that are applicable in $s$. The execution of $a$ on $s$, denoted by $apply(s, a)$, results in the state $s' = (s \setminus \bigcup_{e \in E(a),\, con(e) \subseteq s} del(e)) \cup \bigcup_{e \in E(a),\, con(e) \subseteq s} add(e)$ if $a$ is applicable in $s$, and $s' = s$ otherwise. A state $s$ is called a goal state if $G \subseteq s$. A plan $\pi = \langle a_1, \ldots, a_n \rangle$ for a planning task is an action sequence that transforms the initial state into a goal state. We use $|\pi|$ to denote its length, which is the number of actions in it. Here, we assume that a plan does not have redundant actions; that is, when some action is removed from $\pi$, it will no longer be a plan.
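
To make the semantics above concrete, the following is a minimal Python sketch of states, conditional effects, and action execution. The encoding (frozen sets of atom strings and NamedTuple types) is ours for illustration and is not from the paper.

```python
# Illustrative encoding of the definitions above (names are our own).
from typing import NamedTuple, FrozenSet, Tuple

Atom = str
State = FrozenSet[Atom]

class Effect(NamedTuple):
    cond: FrozenSet[Atom]  # con(e): condition under which the effect fires
    add: FrozenSet[Atom]   # add(e): atoms made true
    dele: FrozenSet[Atom]  # del(e): atoms made false

class Action(NamedTuple):
    name: str
    pre: FrozenSet[Atom]         # pre(a): preconditions
    effects: Tuple[Effect, ...]  # E(a): conditional effects

def applicable(s: State, a: Action) -> bool:
    return a.pre <= s

def apply(s: State, a: Action) -> State:
    """Execute a on s: all conditional effects whose conditions hold fire together."""
    if not applicable(s, a):
        return s  # by the convention above, the state is unchanged
    fired = [e for e in a.effects if e.cond <= s]
    dels = frozenset(x for e in fired for x in e.dele)
    adds = frozenset(x for e in fired for x in e.add)
    return (s - dels) | adds
```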

Actions with conditional effects were introduced in the planning problem description language “Action Description Language” (ADL) [20]. Conditional effects of actions are expressed with the keyword “when”. Figure 1 (left) shows an action with two conditional effects: the first effect happens unconditionally, and the second effect happens only if its condition holds in the state where the action is executed. There are mainly three ways of handling conditional effects: full expansion [21], IPP's method [22], and factored expansion [23]. Here, we focus on IPP's method, which is used in FF. The method translates the action in Figure 1 (left) into the form shown in Figure 1 (right). From now on, we will call planning with actions with conditional effects ADL planning.

FF employs a forward state-space search framework. Three key techniques of FF are the relaxed-plan-based heuristic (RP), HAP, and the enforced hill-climbing (EHC) algorithm. Here, we focus on RP and HAP. A relaxed plan is extracted from a relaxed version of a planning task in which the delete effects of actions are ignored. Specifically, the relaxed version of a planning task $\Pi = \langle P, A, I, G \rangle$ is $\Pi^+ = \langle P, A^+, I, G \rangle$, where $A^+ = \{a^+ \mid a \in A\}$ and $a^+$ is obtained from $a$ by setting $del(e) = \emptyset$ for every $e \in E(a)$. An action sequence is called a relaxed plan for $\Pi$ if it is a plan for $\Pi^+$. For a state $s$, its heuristic value $h(s)$ is the length of a relaxed plan for the planning task $\langle P, A, s, G \rangle$. Note that the relaxed plan for $s$ is not unique, and FF finds one using the Graphplan [24] algorithm. During the process of computing a relaxed plan, FF keeps track of the subgoals generated at the second propositional level of the planning graph, which are saved in a set $G_1$. Helpful actions are the set $H(s) = \{a \in App(s) \mid \exists e \in E(a):\ add(e) \cap G_1 \neq \emptyset\}$. For a state $s$, the EHC algorithm only considers actions in $H(s)$ and ignores the others. This search strategy is called helpful actions pruning. We say that a pruning strategy is complete if using it does not make a complete algorithm eliminate search branches that lead to goal states. As shown in [11], HAP is incomplete for STRIPS planning.
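
The following Python sketch illustrates how FF-style helpful actions could be collected: a forward pass assigns each fact its first level in a relaxed planning graph, a backward pass extracts one relaxed plan while recording subgoals per level, and $H(s)$ is read off the level-1 subgoals. We assume STRIPS-like relaxed actions (name, pre, add) for brevity; this is a sketch of the idea, not FF's implementation.

```python
# Sketch of relaxed-plan extraction and helpful actions (illustrative).
from typing import Dict, FrozenSet, List, Set, Tuple

Action = Tuple[str, FrozenSet[str], FrozenSet[str]]  # (name, pre, add); deletes dropped

def relaxed_plan_and_helpful(state: FrozenSet[str], goal: FrozenSet[str],
                             actions: List[Action]):
    # Forward phase: first level at which each fact becomes reachable.
    level: Dict[str, int] = {p: 0 for p in state}
    t = 0
    while not goal <= level.keys():
        t += 1
        new = {q for name, pre, add in actions
               if all(level.get(p, t) < t for p in pre) for q in add}
        if new <= level.keys():
            return None, []  # goal unreachable even in the relaxation: dead end
        for q in new:
            level.setdefault(q, t)
    # Backward phase: extract one relaxed plan, tracking subgoals per level.
    m = max(level[g] for g in goal)
    sub: List[Set[str]] = [set() for _ in range(m + 1)]
    for g in goal:
        sub[level[g]].add(g)
    plan: List[str] = []
    for i in range(m, 0, -1):
        for g in sub[i]:
            # choose any achiever of g whose preconditions appear strictly earlier
            name, pre, _ = next(a for a in actions if g in a[2]
                                and all(level.get(p, i) < i for p in a[1]))
            plan.append(name)
            for p in pre:
                sub[level[p]].add(p)
    G1 = sub[1] if m >= 1 else set()  # subgoals at the first level
    helpful = [a for a in actions if a[1] <= state and a[2] & G1]
    return list(reversed(plan)), helpful
```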

3. Complete Pruning Strategy for STRIPS

In this section, we will extend HAP to a complete strategy for STRIPS that we call goal relevant actions pruning (GRAP). We will then prove the completeness of GRAP for STRIPS and show that GRAP is incomplete for ADL planning. As a result, we will extend GRAP to a complete strategy for ADL planning in the next section.

3.1. Goal Relevant Actions

Helpful actions for a state $s$ are actions in $App(s)$ that are relevant for adding the subgoals in $G_1$. To obtain completeness, our goal relevant actions for $s$ are actions in $App(s)$ that are relevant for adding every (sub)goal generated by the GraphPlan algorithm.

Definition 1 (Dependency among Facts). For two facts $p, q \in P$ and a set of actions $A$, $p$ is dependent on $q$ with respect to $A$ (denoted as $q \rightarrow_A p$) if (1) $p = q$; (2) there exist an action $a \in A$ and an effect $e \in E(a)$ such that $p \in add(e)$ and $q \in pre(a) \cup con(e)$; or (3) there exists a fact $r \in P$ such that $p$ is dependent on $r$ and $r$ is dependent on $q$, both with respect to $A$.

Definition 2 (Dependency between Facts and Actions). For an atom $p \in P$ and an action $a$, $p$ is dependent on $a$ with respect to $A$ (denoted as $a \rightarrow_A p$) if (1) there exists an effect $e \in E(a)$ such that $p \in add(e)$, or (2) there exist an effect $e \in E(a)$ and an atom $q \in add(e)$ such that $p$ is dependent on $q$ with respect to $A$.

We note that Definitions 1 and 2 capture the facts and actions that are relevant to a goal. Specifically, if we are going to reach a goal $g$, then the actions on which $g$ is dependent are relevant, and further, actions that add facts on which $g$ is dependent are also relevant. Note that in the previous definitions we use “dependent,” instead of “relevant,” to indicate a directional relation.

Now, we are ready to introduce the notion of goal relevant actions. These are the actions a search algorithm needs to explore for reaching some goal state; actions that are not relevant can be safely ignored.

Definition 3 (Goal Relevant Actions, GRA). Given a planning task $\Pi = \langle P, A, I, G \rangle$, the actions that are relevant to an atom $g$ are $R(g) = \{a \in A \mid a \rightarrow_A g\}$, and the actions that are relevant to $G$ are $R(G) = \bigcup_{g \in G} R(g)$. Given a state $s$, the “goal relevant actions” for $s$ are $GRA(s) = App(s) \cap R(G)$.
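
Definitions 1-3 suggest a simple backward fixpoint computation: start with the goal atoms, repeatedly mark as relevant any action with an effect that adds a relevant fact, and add that effect's conditions and the action's preconditions to the relevant facts. The following Python sketch (our own illustrative encoding, reusing the shapes from the background section) computes $R(G)$ and then $GRA(s)$.

```python
# Sketch of computing GRA(s) via a relevance fixpoint (illustrative).
from typing import FrozenSet, List, Set, Tuple

Effect = Tuple[FrozenSet[str], FrozenSet[str], FrozenSet[str]]  # (cond, add, dele)
Action = Tuple[str, FrozenSet[str], Tuple[Effect, ...]]         # (name, pre, effects)

def goal_relevant_actions(actions: List[Action], goal: FrozenSet[str],
                          state: FrozenSet[str]) -> List[Action]:
    rel_facts: Set[str] = set(goal)  # facts on which some goal is dependent
    rel_acts: Set[str] = set()       # R(G), by action name
    changed = True
    while changed:
        changed = False
        for name, pre, effects in actions:
            for cond, add, _ in effects:
                if add & rel_facts:  # Definition 2: the effect adds a relevant fact
                    if name not in rel_acts:
                        rel_acts.add(name)
                        changed = True
                    new = (pre | cond) - rel_facts  # Definition 1: regress conditions
                    if new:
                        rel_facts |= new
                        changed = True
    # Definition 3: GRA(s) = App(s) intersected with R(G)
    return [a for a in actions if a[0] in rel_acts and a[1] <= state]
```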

We propose the following pruning strategy based on GRA. For any search algorithm and any state $s$, we only consider actions in $GRA(s)$ and ignore those in $App(s) \setminus GRA(s)$. We call this strategy GRA pruning (GRAP). We will prove that GRAP is a generalization of HAP and is complete for STRIPS planning.

Proposition 4. For a planning task $\Pi$ and any state $s$ of $\Pi$, $H(s) \subseteq GRA(s)$.

As the directional dependency relation is transitive, the correctness of Proposition 4 is straightforward: every helpful action adds a subgoal in $G_1$, and every subgoal in $G_1$ is a fact on which some goal in $G$ is dependent, so every helpful action is in $R(G)$; hence $H(s) \subseteq GRA(s)$.

Next, we will prove that GRAP is complete for STRIPS planning.

Proposition 5. GRAP is a complete pruning strategy for STRIPS planning.

Proof. Let $\Pi = \langle P, A, I, G \rangle$ be a STRIPS planning task, and let $\pi = \langle a_1, \ldots, a_n \rangle$ be one of the plans for $\Pi$. Note that $\pi$ is not redundant. As we restrict ourselves to STRIPS planning, each action $a_i$ has only one conditional effect, which is denoted by $\langle \emptyset, add(a_i), del(a_i) \rangle$. We will prove that $a_1 \in GRA(I)$, reasoning backward from $a_n$ to $a_1$. For the action $a_n$, if its effect does not add a fact in $G$, then $\pi$ with $a_n$ removed will still be a plan. This contradicts the assumption that $\pi$ is not redundant. Following Definition 1, every fact in $pre(a_n)$ is a fact on which some goal in $G$ is dependent. Similarly, $a_{n-1}$ must add a fact in $G \cup pre(a_n)$; otherwise, $\pi$ with $a_{n-1}$ removed is still a plan, which contradicts the assumption that $\pi$ is not redundant. Therefore, the facts in $pre(a_{n-1})$ are also facts on which some goal is dependent. Iterating this argument down to $a_1$, as $\pi$ is not redundant, $a_1$ must add a fact on which some goal in $G$ is dependent. Following Definition 2, $a_1$ is in $R(G)$. And according to Definition 3, $a_1 \in GRA(I)$ holds. As $\pi$ is an arbitrary plan, we finish the proof.

Note that the above proof cannot be adapted to prove the completeness of HAP, as helpful actions are defined with respect to a specific relaxed plan. In other words, the arbitrariness of plans is not guaranteed.

3.2. GRAP is Incomplete for ADL Planning

Hoffmann and Nebel [11] pointed out that HAP is incomplete because the GraphPlan algorithm is greedy in computing shorter relaxed plans. Therefore, this source of incompleteness could be eliminated by computing relaxed plans with algorithms other than GraphPlan. The method proposed by Hoffmann and Nebel [11] works in the following way: for a state $s$ and a relaxed planning task $\langle P, A^+, s, G \rangle$, they expand the planning graph to the level $m$ at which the goals are reached and collect subgoals backward from level $m$ to level 1. Specifically, let $G_i$ be the subgoals at level $i$, with $G_m = G$; then
$$G_{i-1} = \{g \in G_i \mid level(g) < i\} \cup \bigcup \{pre(a) \mid a \in A^+,\ level(a) = i-1,\ add(a) \cap G_i \neq \emptyset\},$$
where $level(\cdot)$ denotes the level at which a fact or an action first appears in the planning graph.

Following this method, $G_1$ is the union of the level-1 subgoals of every relaxed plan for $\langle P, A^+, s, G \rangle$. We call the actions in $\{a \in App(s) \mid \exists e \in E(a):\ add(e) \cap G_1 \neq \emptyset\}$ “full helpful actions” (FHAs). Intuitively, the set of FHAs is equivalent to our $GRA(s)$. However, the former definition is more procedural, and ours is more formal. We will use our definition to develop a new pruning strategy for ADL planning. Before that, we show, by example, that both the FHA pruning (FHAP) strategy and GRAP are generally incomplete for ADL planning.
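
To contrast FHAs with the single-plan extraction used for HAP, here is a hedged Python sketch of the backward collection: every achiever of every subgoal is regressed, so $G_1$ accumulates the level-1 subgoals of all relaxed plans. The fact-level map is assumed to come from a forward pass like the one sketched in Section 2; the encoding is ours.

```python
# Sketch of full-helpful-actions collection (illustrative).
from typing import Dict, FrozenSet, List, Set, Tuple

Action = Tuple[str, FrozenSet[str], FrozenSet[str]]  # (name, pre, add), relaxed

def full_helpful_actions(state: FrozenSet[str], goal: FrozenSet[str],
                         actions: List[Action],
                         level: Dict[str, int]) -> List[Action]:
    """level: first planning-graph level of each reachable fact (forward pass)."""
    m = max(level[g] for g in goal)
    sub: List[Set[str]] = [set() for _ in range(m + 1)]
    for g in goal:
        sub[level[g]].add(g)
    for i in range(m, 0, -1):
        for g in sub[i]:
            # regress through EVERY achiever of g, not just one chosen achiever
            for name, pre, add in actions:
                if g in add and all(level.get(p, i) < i for p in pre):
                    for p in pre:
                        sub[level[p]].add(p)
    G1 = sub[1] if m >= 1 else set()
    return [a for a in actions if a[1] <= state and a[2] & G1]
```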

Example 6. Given a planning task $\Pi = \langle P, A, I, G \rangle$, where $P = \{p, q, r\}$, $I = \{p, r\}$, $G = \{p, q\}$, and $A = \{a_1, a_2\}$, where $a_1$ is $\langle \emptyset, \{e_1, e_2\} \rangle$ and $a_2$ is $\langle \emptyset, \{\langle \emptyset, \emptyset, \{r\} \rangle\} \rangle$. The meaning of $a_1$ is as follows: its preconditions are empty, and it has two conditional effects: the first is $\langle \emptyset, \{q\}, \emptyset \rangle$ (denoted as $e_1$) and the second is $\langle \{r\}, \emptyset, \{p\} \rangle$ (denoted as $e_2$). The action $a_2$ has one conditional effect, which falsifies $r$.

Next, let us consider searching for a plan for Example 6. The difference between $I$ and $G$ is that the atom $q$ is not in $I$. To make $q$ true, we would use action $a_1$. However, executing $a_1$ on $I$ results in a state where atom $p$ does not hold. After that, there is no action that can transform the resulting state into a goal state. One can notice that this dead end is due to the fact that $e_1$ and $e_2$ both happened and $e_2$ destroyed $p$. If we could prevent $e_2$ from happening, then we would succeed in finding a plan. This is the idea proposed by Weld [25], which is called “confrontation”: make the condition of an unwanted conditional effect false before executing the action. It is easy to see that with “confrontation” as a choice, we can find the plan $\langle a_2, a_1 \rangle$.
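
The dead end and the confrontation can be replayed in a few lines of Python, using the reconstructed atoms $p$, $q$, $r$ from Example 6 above (the encoding is illustrative):

```python
# Replaying Example 6 (atom and action names follow the example above).
def apply(state, effects):
    """effects: (cond, add, dele) triples; every effect whose condition holds fires."""
    fired = [(a, d) for c, a, d in effects if c <= state]
    adds = {x for a, _ in fired for x in a}
    dels = {x for _, d in fired for x in d}
    return (state - dels) | adds

a1 = [(set(), {"q"}, set()),   # e1: unconditionally adds q
      ({"r"}, set(), {"p"})]   # e2: deletes p if r holds
a2 = [(set(), set(), {"r"})]   # confrontation action: falsifies r, adds nothing

I, G = {"p", "r"}, {"p", "q"}
print(G <= apply(I, a1))              # False: e2 fired too, p is lost (dead end)
print(G <= apply(apply(I, a2), a1))   # True: plan <a2, a1> confronts e2 first
```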

In Example 6, $a_2$ is relevant for reaching a goal state. However, the pruning strategies HAP, GRAP, and FHAP all ignore it. With a generalization, we have the following result.

Proposition 7. Let $\Pi$ be an ADL planning task and let $\mathcal{P}$ be the set of plans for $\Pi$. If every plan $\pi \in \mathcal{P}$ contains an action $a$ with $add(e) = \emptyset$ for every $e \in E(a)$, then HAP, FHAP, and GRAP are incomplete for $\Pi$.

The correctness of Proposition 7 follows from the fact that if an action does not have any add effects, then it will not be considered “helpful” or “relevant” by any of these strategies. As a result, such actions will be mistakenly ignored.

From Example 6, we can see that actions that make “confrontations” are also “helpful” and “relevant.” Following this direction, we extend GRAP to a new pruning strategy that is complete for both STRIPS and ADL planning.

4. Complete Pruning Strategy for ADL Planning

We first introduce the notion of “confrontation and goal relevant actions” and then prove that its corresponding pruning strategy CGRAP is complete for ADL planning.

Definition 8 (Confrontational Dependency among Facts). For two atoms $p, q \in P$ and a set of actions $A$, $p$ is confrontationally dependent on $q$ with respect to $A$ (denoted as $q \Rightarrow_A p$) if (1) $p = q$; (2) there exist an action $a \in A$ and an effect $e \in E(a)$ such that $p \in add(e) \cup del(e)$ and $q \in pre(a) \cup con(e)$; or (3) there exists a fact $r \in P$ such that $p$ is confrontationally dependent on $r$ and $r$ is confrontationally dependent on $q$, both with respect to $A$.

Definition 9 (Confrontational Dependency between Facts and Actions). For an atom $p \in P$ and an action $a$, $p$ is confrontationally dependent on $a$ with respect to $A$ (denoted as $a \Rightarrow_A p$) if (1) there exists an effect $e \in E(a)$ such that $p \in add(e) \cup del(e)$, or (2) there exist an effect $e \in E(a)$ and an atom $q \in add(e) \cup del(e)$ such that $p$ is confrontationally dependent on $q$ with respect to $A$.

According to the previous two definitions, one can see that actions that add or delete an atom $p$ are considered relevant to $p$.

Definition 10 (Confrontation and Goal Relevant Actions). Given a planning task $\Pi = \langle P, A, I, G \rangle$, the actions that are confrontationally relevant to an atom $g$ are $R^c(g) = \{a \in A \mid a \Rightarrow_A g\}$, and the actions that are confrontationally relevant to $G$ are $R^c(G) = \bigcup_{g \in G} R^c(g)$. Given a state $s$, the “confrontation and goal relevant actions” for $s$ are $CGRA(s) = App(s) \cap R^c(G)$.
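
The fixpoint sketch for GRA carries over almost unchanged; the only difference, per Definitions 8 and 9, is that an effect counts as a producer of a relevant fact if it adds or deletes it, so pure deleters such as $a_2$ of Example 6 are retained. A hedged Python sketch (our encoding):

```python
# Sketch of computing CGRA(s) (illustrative encoding).
from typing import FrozenSet, List, Set, Tuple

Effect = Tuple[FrozenSet[str], FrozenSet[str], FrozenSet[str]]  # (cond, add, dele)
Action = Tuple[str, FrozenSet[str], Tuple[Effect, ...]]

def cgra(actions: List[Action], goal: FrozenSet[str],
         state: FrozenSet[str]) -> List[Action]:
    rel_facts: Set[str] = set(goal)
    rel_acts: Set[str] = set()
    changed = True
    while changed:
        changed = False
        for name, pre, effects in actions:
            for cond, add, dele in effects:
                if (add | dele) & rel_facts:  # adds OR deletes a relevant fact
                    if name not in rel_acts:
                        rel_acts.add(name)
                        changed = True
                    new = (pre | cond) - rel_facts
                    if new:
                        rel_facts |= new
                        changed = True
    return [a for a in actions if a[0] in rel_acts and a[1] <= state]
```

On Example 6, $r$ becomes relevant through $e_2$ of $a_1$, and $a_2$ is then kept because it deletes $r$.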

The pruning strategy that considers actions in $CGRA(s)$ only, that is, ignores those in $App(s) \setminus CGRA(s)$, is called confrontation and goal relevant actions pruning (CGRAP). In the following proposition, we prove that CGRAP is complete for ADL planning.

Proposition 11. The CGRAP strategy is complete for ADL planning.

Proof (by Contradiction). Given a planning task $\Pi$ and any plan $\pi = \langle a_1, \ldots, a_n \rangle$ for it, we use $s_i$ to denote the result of executing $\langle a_1, \ldots, a_i \rangle$ on $I$, with $s_0 = I$. Suppose that some $a_i$ is eliminated by CGRAP; that is, $a_i \notin CGRA(s_{i-1})$. When $i = n$, as $\pi$ is not a redundant plan, $a_n$ must add one atom of $G$; that is, there is an effect $e \in E(a_n)$ with $add(e) \cap G \neq \emptyset$. Therefore, $a_n \in CGRA(s_{n-1})$, a contradiction. For $i < n$, as $\pi$ is not redundant, either (1) $a_i$ adds an atom of $G$, (2) $a_i$ adds an atom which is a (pre)condition of a later action $a_j$ or of one of its conditional effects, or (3) $a_i$ deletes one condition of a conditional effect of a later action $a_j$ in order to prevent that effect from happening. In case (1), $a_i$ adds an atom on which some goal is confrontationally dependent directly; in cases (2) and (3), it adds or deletes such an atom through the chain of later actions. In all cases, according to Definitions 8 and 9, $a_i \in R^c(G)$ holds. Note that $a_i \in App(s_{i-1})$. Therefore, $a_i \in CGRA(s_{i-1})$, which contradicts the supposition and finishes the proof.

The completeness of CGRAP for ADL planning comes at a cost. One reason is that the pruning power of CGRAP is weak; in other words, CGRAP may cut only a rather limited number of branches of a search space. In addition, computing CGRAP exactly is PSPACE-hard, as deciding whether an action is irrelevant for a planning task is PSPACE-hard [26]. Therefore, it is practical to collect confrontation and goal relevant actions in an approximate way. In the next section, we discuss some methods from the literature that fall into this scope.

5. Discussion

We will first review the helpful transition notion developed in Fast Downward [12] and then the delayed partly reasoning procedure [27].

Fast Downward is a representative planner that uses the SAS+ planning formalism [28], which supports multivalued variables. It translates a propositional planning problem into an SAS+ planning problem by utilizing an invariants analysis procedure [12]. After that, Fast Downward builds a causal graph that involves all the variables and a domain transition graph for each variable. Dependencies among variables are reasoned about through the causal graph, and dependencies among the values of a variable are reasoned about through the corresponding domain transition graph. For details of the two kinds of graphs, please refer to [12]. The goal distance of a state is the sum of the goal distances of the variables. For each variable, its goal distance is computed by solving a shortest path problem formulated on the corresponding domain transition graph, in which the source node is the value the variable currently takes and the target node is the value the goal conditions require. When such a path is obtained, the transition associated with its first edge is labeled a “helpful transition.” Note that transitions are conditional effects of actions, so we can collect “helpful actions” based on “helpful transitions.” Here we note that “helpful transitions” take into account both goal relevant actions and actions that are used for confrontations. The ability to collect actions that are helpful for confrontations originates from the multivalued variable representation: the change from one value to another models both the add and the delete effects of an action on a variable. As a result, actions have only one kind of effect, which Fast Downward considers when collecting “helpful transitions.” In contrast, propositional planners, such as FF, consider only the add effects of actions when collecting “helpful actions.” Therefore, the “helpful transitions” strategy can be considered an approximation of CGRAP.
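
The core of the idea can be illustrated by a breadth-first search over one domain transition graph: the transition on the first edge of a shortest path from the variable's current value to its goal value is marked helpful. The graph encoding below is our simplification, not Fast Downward's internals (which also account for the conditions attached to transitions).

```python
# Simplified helpful-transition lookup on one domain transition graph (DTG).
from collections import deque
from typing import Dict, Hashable, List, Optional, Tuple

def helpful_transition(dtg: Dict[Hashable, List[Hashable]],
                       current: Hashable, goal: Hashable) -> Optional[Tuple]:
    """dtg maps each value to the values reachable by a single transition."""
    if current == goal:
        return None
    parent: Dict[Hashable, Optional[Hashable]] = {current: None}
    queue = deque([current])
    while queue:
        v = queue.popleft()
        for w in dtg.get(v, []):
            if w not in parent:
                parent[w] = v
                if w == goal:  # walk back to the first edge of the path
                    while parent[v] is not None:
                        w, v = v, parent[v]
                    return (v, w)  # the transition labeled "helpful"
                queue.append(w)
    return None  # goal value unreachable in this DTG

# e.g. helpful_transition({0: [1], 1: [2]}, 0, 2) returns (0, 1)
```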

The “delayed partly reasoning procedure” was proposed by Cai et al. [7]. This procedure is implemented on top of the propositional planning formalism. In the first action level of a planning graph, the procedure tracks harmful inducements with respect to an order of conditional effects and collects actions that confront the inducements. An inducement arises when one conditional effect $e_1$ induces another conditional effect $e_2$, that is, makes the condition of $e_2$ true; the inducement is harmful if $e_2$ deletes some previously added atoms of other conditional effects. As the procedure only operates on the first action level and works with a predefined order, it is an approximation of CGRAP, and its computational cost is not high.
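
In the spirit of that description, the checks below sketch how one might detect an inducement between two conditional effects and decide whether it is harmful; the actual procedure of Cai et al. involves more machinery (effect orders and the first action level of the planning graph), so this is only our hedged reading of the two conditions.

```python
# Hedged sketch: does e1 induce e2, and is the inducement harmful?
from typing import FrozenSet, Tuple

Effect = Tuple[FrozenSet[str], FrozenSet[str], FrozenSet[str]]  # (cond, add, dele)

def induces(e1: Effect, e2: Effect, state: FrozenSet[str]) -> bool:
    """e1 induces e2 if e2's condition holds only after e1's effects apply."""
    c1, a1, d1 = e1
    c2, _, _ = e2
    after_e1 = (state - d1) | a1
    return not (c2 <= state) and c2 <= after_e1

def harmful(e2: Effect, added_so_far: FrozenSet[str]) -> bool:
    """The inducement is harmful if e2 deletes atoms other effects have added."""
    return bool(e2[2] & added_so_far)
```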

6. Conclusions and Future Work

In this work, we analyzed some well-known pruning techniques, which are currently utilized by state-of-the-art planners. In particular, we showed that the helpful actions pruning strategy is incomplete for ADL planning and extended it to a complete strategy called confrontation and goal relevant actions pruning. Though our proposed strategy is computationally hard, we discussed methods from the AI planning literature that can be seen as approximations of it. In addition, we believe that this work will help us gain more insights into why the planner Fast Downward is powerful.

This work addressed pruning techniques in search-based planning. Future directions may consider pruning techniques in SAT-based ADL planning and conformant planning. As IPP's method for handling conditional effects does not lead to a large increase in problem size, it is suitable for SAT-based planning. Therefore, developing adaptations or approximations of our proposed strategy CGRAP in those settings could be interesting.

Acknowledgments

This work is supported by Natural Science Foundation of China (Grant no. 61103136), Educational Commission of Hubei Province of China (Grant no. D20111507), Hubei Province Key Laboratory of Intelligent Robot Open Foundation (Grant no. HBIR200909), and Youths Science Foundation of Wuhan Institute of Technology (Grant no. 12106022).