Special Issue

Artificial Intelligence for Computer Games

Research Article | Open Access

Volume 2009 |Article ID 162450 | https://doi.org/10.1155/2009/162450

Julio B. Clempner, "A Shortest-Path Lyapunov Approach for Forward Decision Processes", International Journal of Computer Games Technology, vol. 2009, Article ID 162450, 12 pages, 2009. https://doi.org/10.1155/2009/162450

A Shortest-Path Lyapunov Approach for Forward Decision Processes

Accepted 07 Sep 2008
Published 15 Jan 2009

Abstract

In previous work, attention was restricted to tracking the net using a backward method that knows the target point beforehand (Bellman's equation). This work tracks the state space in a forward direction, and a natural form of termination is ensured by an equilibrium point. We consider dynamical systems governed by ordinary difference equations described by Petri nets. The trajectory over the net is calculated forward using a discrete Lyapunov-like function, considered as a distance function. Because a Lyapunov-like function is a solution to a difference equation, it is constructed to respect the constraints imposed by the system (a Euclidean metric does not consider these factors). As a result, we prove natural generalizations of the standard outcomes for the deterministic shortest-path problem and shortest-path game theory.

1. Introduction

The shortest-path problem (see [1–3]) plays a fundamental role in Petri net theory, since it can be used to model processes. The analysis of these models can reveal useful information about the process. For example, deadlocks, equilibrium points, and so forth can be identified by computational analysis.

While it is possible to analyze such processes using the existing classical theory through Bellman's equation with the cost criterion ([4–15]), this theory has a few disadvantages. Bellman's equation is expressed as a sum over the states of a trajectory and needs to be solved backward in time from the equilibrium point (target point). It results in an optimal function when it is governed by Bellman's principle, producing the shortest path needed to reach a known equilibrium point. Notice that the need to know the equilibrium point beforehand when applying the equation is a significant constraint, given that, in many practical situations, the state space of a Petri net is too large for an easy identification of the equilibrium point.

Moreover, algorithms using Bellman's equation usually solve the problem in two phases [16]: preprocessing and search. In the preprocessing phase, the distance is usually calculated between each state and the equilibrium points (final states) of the problem, in a backward direction. Then, in the search phase, these results are employed to calculate the distance between each state and the equilibrium points, leading the search process to a forward search.

Tracking the state space in a forward direction allows the decision maker to avoid invalid states that occur in the space generated by a backward search. In most cases, the forward search appears to be more useful than the backward search. The explanation is that, in the backward direction, incomplete final states give rise to invalid states, which cause problems.

Shortest-path problems [17, 18] can be classified into two key categories [19]: (a) the single-source shortest-path problem, where the goal is to find the shortest path from a given node to a target node (e.g., the algorithms of Dijkstra and Bellman-Ford); and (b) the all-pairs shortest-path problem, a similar problem in which the objective is to determine the shortest path between every pair of nodes in the net (e.g., the algorithms of Floyd-Warshall and Johnson).
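As a point of reference for category (a), the following is a minimal sketch of Dijkstra's algorithm. It illustrates the classical single-source setting only, not the Lyapunov-based method developed in this paper; the four-node net and its arc costs are hypothetical.

```python
import heapq


def dijkstra(graph, source):
    """Return the shortest distance from `source` to every reachable node.

    `graph` maps each node to a dict {neighbor: nonnegative arc cost}.
    """
    dist = {source: 0}
    frontier = [(0, source)]  # min-heap of (distance, node)
    while frontier:
        d, node = heapq.heappop(frontier)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry: a shorter path was already found
        for nxt, cost in graph.get(node, {}).items():
            candidate = d + cost
            if candidate < dist.get(nxt, float("inf")):
                dist[nxt] = candidate
                heapq.heappush(frontier, (candidate, nxt))
    return dist


# Hypothetical four-node net, used only for illustration.
example_graph = {
    "p0": {"p1": 2, "p2": 5},
    "p1": {"p2": 1, "p3": 6},
    "p2": {"p3": 2},
    "p3": {},
}
distances = dijkstra(example_graph, "p0")
```

Note that the target node must be named in advance to read off a single path, which is the kind of prior knowledge the forward approach of this paper aims to avoid.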

We are concerned with the first case. However, we consider dynamical systems governed by difference equations described by Petri nets. The trajectory over the net is calculated using a discrete Lyapunov-like function. A Lyapunov-like function is considered as a distance function denoting the length from the source place to the equilibrium point. This work is concerned with the analysis of the decision process where a natural form of termination is ensured by an equilibrium point.

Lyapunov-like functions can be used as forward trajectory-tracking functions. Each applied optimal action produces a monotonic progress towards an equilibrium point. Because it is a solution to the difference equation, naturally it will lead the system from the source place to the equilibrium point.

It is important to note that there exist areas of research using Petri nets as a modeling tool where the use of a Lyapunov-like function is inherent. For instance, the “Entropy” function is a specific Lyapunov-like function used in Information Theory as a measure of the information disorder. The “free Gibbs energy function” is a Lyapunov-like function used in molecular biology for calculating the energy change in a metabolic network.

This paper introduces a modeling paradigm for shortest-path decision process representation in Petri net theory. The main strength of this paradigm is its ability to represent both the characteristics related only to the global system behavior and those related to the trajectory-tracking behavior.

Within the global system behavior properties, we show notions of stability. In this sense, we call an equilibrium point a place in a Petri net whose marking is bounded and which is the last place in the net (a sink).

In the trajectory-tracking behavior properties framework, we define the trajectory function as a Lyapunov-like function. By an appropriate selection of the Lyapunov-like function, it is possible to optimize the trajectory. By optimizing the trajectory, we mean attaining the minimum trajectory-tracking value (in a certain sense). In addition, we use the notions of stability in the sense of Lyapunov to characterize the stability properties of the Petri net. The core idea of our approach uses a nonnegative trajectory function that converges in decreasing form to a (set of) final decision states. It is important to point out that the value of the trajectory function associated with the Petri net implicitly determines a set of policies, not just a single policy (in case several decision states could be reached). We call “optimum point” the best choice selected from a number of possible final decision places that may be reached (to select the optimum point, the decision process chooses the strategy that optimizes the trajectory-tracking value).

As a result, we show that the global system behavior properties and the trajectory-tracking behavior properties of equilibrium, stability, and optimum-point conditions meet under certain restrictions: if the Petri net is finite, then we have that a final decision place is an equilibrium point.

The paper is structured in the following manner. The next section discusses the motivation of the work. Section 3 presents the formulation of the decision model, and all the structural assumptions are introduced there, giving a detailed analysis of the equilibrium, stability, and optimum-point conditions for the global system behavior properties and the trajectory tracking behavior parts of the Petri net. Section 4 presents the properties of the model. Finally, in Section 5 some concluding remarks are outlined.

2. Motivation

In this paper, we consider dynamical systems in which the time variable changes discretely, and the system is governed by ordinary difference equations. Let us consider systems of first-order difference equations given by where with are the state variable of the system, is the initial state, and are the action of the system, . The system is specified by the state transition function , which is always assumed as a one-to-one function for any fixed and , continuous in all its arguments.

Lyapunov defined a scalar function $L$, called a Lyapunov-like function, inspired by a classical energy function, which has four important properties that are sufficient for establishing the domain of attraction of a stable equilibrium point $s^*$: (a) $\exists s^*$ such that $L(s^*) = 0$; (b) $L(s) > 0$ for all $s \neq s^*$; (c) $L(s) \to \infty$ when $s \to \infty$; and (d) $\Delta L = L(s_{i+1}) - L(s_i) < 0$ for all $s_i, s_{i+1} \neq s^*$. Condition (a) requires the equilibrium point to have zero potential by means of a translation to the origin, (b) means that the Lyapunov-like function is positive away from the equilibrium point, (c) means that the function grows without bound far from the equilibrium point, and (d) means that the Lyapunov-like function has a minimum at the equilibrium point.

The main idea of Lyapunov is attained in the following interpretation: given an isolated physical system, if the change of the energy for every possible state is negative, with the exception of the equilibrium point , then the energy will decrease until it finally reaches the minimum at . Intuitively, this concept of stability means that a system perturbed from its equilibrium point will always return to it.
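The energy interpretation above can be checked numerically on a single trajectory. The sketch below is an illustration under stated assumptions: it tests only the conditions that can be verified on finitely many samples (zero at the equilibrium, positive elsewhere, strict decrease at every step), and the scalar system and the choice of `abs` as candidate function are hypothetical.

```python
def check_lyapunov_like(L, trajectory, equilibrium):
    """Check, on one finite trajectory, the conditions that can be tested numerically.

    L           -- candidate function mapping a state to a number
    trajectory  -- list of visited states, in order
    equilibrium -- the state assumed to be the equilibrium point
    """
    zero_at_equilibrium = L(equilibrium) == 0
    positive_elsewhere = all(L(s) > 0 for s in trajectory if s != equilibrium)
    strictly_decreasing = all(
        L(nxt) < L(cur) for cur, nxt in zip(trajectory, trajectory[1:])
    )
    return zero_at_equilibrium and positive_elsewhere and strictly_decreasing


# Hypothetical scalar system s_{n+1} = s_n // 2 with equilibrium 0 and L(s) = |s|.
ok = check_lyapunov_like(abs, [8, 4, 2, 1, 0], equilibrium=0)

# A sequence whose "energy" rises at one step fails the decrease condition.
bad = check_lyapunov_like(abs, [8, 4, 6, 1, 0], equilibrium=0)
```

A check of this kind can only refute a candidate function; passing it on one trajectory does not prove stability.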

A system is stable [20, 21] if for a given set of initial states the state of the system ensures (i) to reach a given set of states and stay there perpetually or, (ii) to go to a given set of states infinitely often. The conventional notions of stability in the sense of Lyapunov and asymptotic stability can be used to characterize the stability properties of discrete event systems. An important advantage of the Lyapunov approach is that it does not require high-computational complexity but the difficulty lies in specifying the Lyapunov-like function for a given problem.

At this point, it is important to note that the Lyapunov-like function is not unique, whereas the energy function of a system is unique. A system whose energy decreases on average, but not necessarily at each instant, is stable, but its energy is not a Lyapunov-like function.

Lyapunov-like functions [22] can be used as trajectory-tracking functions and optimal cost-to-target functions. As a result of calculating a Lyapunov-like function, a discrete vector field can be built for tracking the actions over the net. Each applied optimal action produces a monotonic progress (of the optimal cost-to-target value) toward an equilibrium point. In this sense, if the function decreases with each action taken, then it approaches an infimum/minimum (that converges asymptotically or reaches a constant).

From what we have stated before, we can deduce the following geometric interpretation of distance [22]: (a) $L$ is a measure of the distance from the starting state to any state in the state space (this follows from the fact that $\exists s^*$ such that $L(s^*) = 0$ and $L(s) > 0$ for all $s \neq s^*$); and (b) the distance from the starting state to any state in the state space decreases when $s \to s^*$, because $\Delta L = L(s_{i+1}) - L(s_i) < 0$ for all $s_i, s_{i+1} \neq s^*$.

A Lyapunov-like function can be considered as a distance function denoting the length from the initial state to the equilibrium point. It is important to note that the Lyapunov-like function is constructed to respect the constraints imposed by the difference equation of the system. In contrast, a Euclidean metric does not take into account these factors. For that reason, the Lyapunov-like function offers a better understanding of the concept of the distance required to converge to an equilibrium point in a discrete dynamical system.

By applying the computed actions, a kind of discrete vector field can be imagined over the search graph. Each applied optimal action yields a reduction in the optimal cost-to-target value, until the equilibrium point is reached. Then, the cost-to-target values can be considered as a discrete Lyapunov function.

In our case, an optimal discrete problem, the cost-to-target values are calculated using a discrete Lyapunov-like function. At every step, a discrete vector field of possible actions is calculated over the decision process. Each applied optimal action (selected via some “criteria”) decreases the optimal value, ensuring that the optimal course of action is followed and establishing a preference relation. In this sense, the criteria change the asymptotic behavior of the Lyapunov-like function by an optimal trajectory-tracking value.

Usually, the criterion in optimization problems is related to the choice of whether to minimize or maximize the optimal action. If the problem is related to energy transformations, as is classically the case in control theory, then the criterion of minimization is applied. However, if the dilemma involves a reward, as is typical in game theory, then maximization is considered. In this work, we will arbitrarily consider the criterion of minimization.

The Lyapunov-like function can be employed as a trajectory-tracking function through the use of an operator, which represents the criterion that selects the optimal action, forcing the function to decrease and approach an infimum/minimum. It forces the function to make monotonic progress toward the equilibrium point. The Lyapunov-like function can be defined, for example, through the min operator, which means that the optimal action is chosen to reach the infimum/minimum. The function works as a guide leading the system optimally from its initial state to the equilibrium point.
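The guiding role of the function can be sketched as a forward search that always moves to the successor with the smallest value and stops when no successor improves on the current state. This is a minimal illustration under assumptions: the state names, the successor map, and the values stored in `L` are hypothetical, and the min operator is taken as the selection criterion.

```python
def track_forward(successors, L, start):
    """Follow, from `start`, the successor with the smallest L-value.

    The search stops at a state with no successors, or when no successor has a
    strictly smaller value than the current state. That last state is treated
    as the equilibrium point; it is never supplied in advance.
    """
    path = [start]
    current = start
    while True:
        options = successors.get(current, [])
        if not options:
            break
        best = min(options, key=lambda s: L[s])
        if L[best] >= L[current]:
            break
        path.append(best)
        current = best
    return path


# Hypothetical five-state system with hand-picked Lyapunov-like values.
successors = {
    "s0": ["s1", "s2"],
    "s1": ["s3"],
    "s2": ["s3", "s4"],
    "s3": ["s4"],
    "s4": [],
}
L = {"s0": 4, "s1": 3, "s2": 2, "s3": 1, "s4": 0}
path = track_forward(successors, L, "s0")
```

Because the value strictly decreases at every accepted step and the state set is finite, the loop always terminates.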

Example 1. To illustrate the shortest-path problem, let us consider a grid world (see Figure 1). At each time step, an agent is able to select an action among a finite set of actions. A transition model specifies how the world changes when an action is executed. An “equilibrium point” is a natural final state of the system. Therefore, the shortest-path problem is a search through the state space for an optimal path to the equilibrium point, using a deterministic transition model. The value of a state is a number that, intuitively speaking, expresses the desirability of that state. For instance, let us consider the state-value function being equal to the min function [23] as a specific Lyapunov-like function able to lead an agent to an equilibrium point in a grid world.

Example 2. The relative entropy or Kullback-Leibler distance [24, 25] between two probability distributions $p(x)$ and $q(x)$ is defined as
$$D(p \,\|\, q) = \sum_{x} p(x) \log \frac{p(x)}{q(x)}.$$
In the above definition, we use the convention (based on continuity arguments) that $0 \log \frac{0}{q} = 0$ and $p \log \frac{p}{0} = \infty$. The relative entropy is always nonnegative and is zero if and only if $p = q$. $D(p \,\|\, q)$ is only a distance-like function between distributions, not a true metric, since it is not symmetric and does not satisfy the triangle inequality.
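The relative entropy of Example 2 is easy to compute for discrete distributions. The sketch below uses the natural logarithm and the two conventions stated in the example; the distributions `p` and `q` are hypothetical and serve only to show that the function is not symmetric.

```python
import math


def relative_entropy(p, q):
    """Kullback-Leibler distance D(p || q) for two discrete distributions.

    `p` and `q` are equal-length sequences of probabilities over the same
    outcomes. The natural logarithm is used.
    """
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue  # convention: 0 * log(0 / q) = 0
        if qi == 0:
            return math.inf  # convention: p * log(p / 0) = infinity
        total += pi * math.log(pi / qi)
    return total


# Hypothetical distributions over two outcomes.
p = [0.5, 0.5]
q = [0.9, 0.1]
d_pq = relative_entropy(p, q)
d_qp = relative_entropy(q, p)
```

Comparing `d_pq` with `d_qp` shows the asymmetry, while `relative_entropy(p, p)` returns zero.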

Example 3. The glycolysis pathway (see Figure 2) is well known and described [11, 26, 27]. It is a ten-step catabolic pathway that makes use of eleven different enzymes. The outcome is the conversion of glucose into two molecules of pyruvate with a concurrent net production of 2 ATPs. The glycolysis process can be divided into two stages: (1) the conversion of glucose to glyceraldehyde 3-phosphate with a required input of 2 ATPs, and (2) the conversion of glyceraldehyde 3-phosphate to pyruvate with a net output of 4 ATPs.
Glycolysis can be informally explained from an energetic perspective as follows. The initial amount of glucose may be represented as a ball at the top of an irregular hill. Every time the ball bounces, the hill represents a reaction state in the breakdown of the sugar process. Each bounce of the ball corresponds to a change in free energy level. This energy change is modeled by the Gibbs energy function, which is a Lyapunov-like function. It is important to note that bounces are irregular (reaching lower and higher energy levels) and determined by the environment conditions. The final state (pyruvate) is represented by the bottom of the hill, where the ball reaches a steady state (no more bounces).
Let us explain the Petri net dynamics of the system model as follows. Continuing with the ball and hill explanation, let us suppose that the ball, representing the product pyruvate, is at the bottom of the hill, and that there is no net force able to move the ball either up or down the hill. This means that the reactions (forward and backward) are evenly balanced. Therefore, the substances and products are in equilibrium, and no net dynamics will take place. That is, “the metabolic network system is in equilibrium.”

3. Formulation

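We introduce the concept of decision process Petri nets (DPPNs) by locally randomizing the possible choices for each individual place of the Petri net [23, 28].

Definition 1. A decision process Petri net is a 7-tuple $DPPN = \{P, Q, F, W, M_0, \pi, U\}$, where
(i) $P = \{p_0, p_1, \ldots, p_m\}$ is a finite set of places,
(ii) $Q = \{q_1, q_2, \ldots, q_n\}$ is a finite set of transitions,
(iii) $F \subseteq I \cup O$ is a flow relation, where $I \subseteq (P \times Q)$ and $O \subseteq (Q \times P)$ such that $P \cap Q = \emptyset$ and $P \cup Q \neq \emptyset$,
(iv) $W: F \to \mathbb{N}_1^+$ is a weight function,
(v) $M_0: P \to \mathbb{N}$ is the initial marking,
(vi) $\pi: I \to \mathbb{R}_+$ is a routing policy representing the probability of choosing a particular transition, such that for each $p \in P$, $\sum_{q_j : (p, q_j) \in I} \pi((p, q_j)) = 1$,
(vii) $U: P \to \mathbb{R}_+$ is a trajectory-tracking function.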

We adopt the standard rules about representing nets as directed graphs, namely, places are represented as circles, transitions as rectangles, the flow relation by arcs, and markings are shown by placing tokens within circles [29]. As usual, we will denote and for all A source place is a place such that (there are no incoming arcs into place ). A sink place is a place such that (there are no outgoing arcs from ). A net system is a pair comprising a finite net and an initial marking . A transition is enabled at a marking , denoted by , if for every , we have that . Such a transition can be executed, leading to a marking defined by . We denote this by or . The set of reachable markings of is the smallest (with respect to set inclusion) set containing and such that if and , then .
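The enabling and firing rule described above can be sketched as follows. This is a minimal illustration of standard place/transition semantics, assuming that a transition is enabled when every input place holds at least as many tokens as the weight of its input arc; the dictionaries `pre` and `post` and the three-place net are hypothetical.

```python
def is_enabled(marking, pre, transition):
    """True if every input place holds at least the weight of its input arc."""
    return all(
        marking.get(place, 0) >= weight
        for place, weight in pre[transition].items()
    )


def fire(marking, pre, post, transition):
    """Return the marking reached by firing `transition`; the input is not modified."""
    if not is_enabled(marking, pre, transition):
        raise ValueError(f"transition {transition!r} is not enabled")
    new_marking = dict(marking)
    for place, weight in pre[transition].items():
        new_marking[place] = new_marking.get(place, 0) - weight  # consume tokens
    for place, weight in post[transition].items():
        new_marking[place] = new_marking.get(place, 0) + weight  # produce tokens
    return new_marking


# Hypothetical net p0 -> q1 -> p1 -> q2 -> p2 with unit arc weights:
# p0 is a source place (no incoming arcs) and p2 is a sink place (no outgoing arcs).
pre = {"q1": {"p0": 1}, "q2": {"p1": 1}}
post = {"q1": {"p1": 1}, "q2": {"p2": 1}}
m0 = {"p0": 1, "p1": 0, "p2": 0}
m1 = fire(m0, pre, post, "q1")
m2 = fire(m1, pre, post, "q2")
```

In this example the reachable markings are `m0`, `m1`, and `m2`, and no transition is enabled once the token sits in the sink place.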

The behavior of the DPPN is described as follows. When a token reaches a place, it is reserved for the firing of a given transition according to the routing policy determined by . A transition must fire as soon as all the places contain enough tokens reserved for transition . Once the transition fires, it consumes the corresponding tokens and immediately produces an amount of tokens in each subsequent place . When for , this means that there are no outgoing arcs in the place-transitions Petri net (i.e., is a sink).

In Figure 2, we have represented partial routing policies that generate a transition from state to state , where as follows.

Case 1. The probability that generates a transition from state to is 1/3. But, because transition to state has two arcs, the probability to generate a transition from state to is increased to 2/3.

Case 2. We set by convention for the probability that generates a transition from state to is 1/3 (1/6 plus 1/6). However, because transition to state has only one arc, the probability to generate a transition from state to is decreased to 1/6.

Case 3. Finally, we have the trivial case when there exists only one arc from to and from to .

It is important to note that, by definition, the trajectory-tracking function is employed only for establishing a trajectory tracking, working at a different execution level from that of the place-transitions Petri net. The trajectory-tracking function in no way changes the evolution or performance of the place-transitions Petri net.

denotes the trajectory-tracking value at place at time and let denote the trajectory-tracking state of at time . is the number of arcs from place to transition (the number of arcs from transition to place ).

Consider an arbitrary and for each fixed transition that forms an output arc , we look at all the previous places of the place denoted by the list (set) , where , that materializes all the input arcs and forms the sum where and the index sequence is the set & running over the set .

Remark 1. denote the previous places to for a fixed transition .

Continuing with all the 's, we form the vector indexed by the sequence identified by as follows:Intuitively, vector (5) represents all the possible trajectories through the transitions s to a place for a fixed , where is represented by the sequence and

Then, formally we define the trajectory-tracking function as follows.

Definition 2. The trajectory-tracking function with respect to a decision process Petri net is represented by the following equation

where

the function is a Lyapunov-like function which optimizes the trajectory-tracking value through all possible transitions (i.e., through all the possible trajectories defined by the different is the decision set formed by the 's; , of all those possible transitions , is the index sequence of the list of previous places to through transition is a specific previous place of through transition .

Example 4. OR-Path (see Figure 3). Define the Lyapunov-like function in terms of the Entropy as :
(i),(ii),(iii).

Example 5. AND-Path (see Figure 4). Define the Lyapunov-like function in terms of the Entropy as :
(i),(ii),(iii),(iv).

From the previous definition, we have the following remark. Remark 2. (i) Note that the Lyapunov-like function guarantees that the optimal course of action is followed (taking into account all the possible paths defined). In addition, the function establishes a preference relation because, by definition, is asymptotic; this condition gives to the decision maker the opportunity to select a path that optimizes the trajectory-tracking value.
(ii) The iteration over for is as follows: (1)for and the trajectory-tracking value is at place and for the rest of the places the trajectory-tracking value is 0;(2)for and the trajectory-tracking value is at each place , and is computed by taking into account the trajectory-tracking value of the previous places for and (when needed).
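The iteration in Remark 2 can be sketched as a single forward pass over an acyclic net. The following is a simplified illustration, not the paper's exact definition: it assumes unit arc weights, takes min as the Lyapunov-like function, and weights the value of each previous place by the routing probability of the arc leaving it (the exact coefficient is not legible in the text above). The net, the routing probabilities, and all function and variable names are hypothetical.

```python
from fractions import Fraction


def trajectory_values(transitions, routing, source, u_source, order):
    """Propagate trajectory-tracking values forward from the source place.

    transitions -- {transition: (input places, output places)}
    routing     -- {(place, transition): routing probability}
    order       -- places listed so that every place follows its previous places
    """
    U = {source: u_source}
    for place in order:
        if place == source:
            continue
        candidates = []
        for q, (inputs, outputs) in transitions.items():
            if place in outputs:
                # One entry of the vector: the sum over the previous places
                # of `place` reached through transition q.
                candidates.append(sum(routing[(p, q)] * U[p] for p in inputs))
        U[place] = min(candidates)  # min plays the role of the Lyapunov-like function
    return U


# Hypothetical OR-path: p0 branches to p1 (probability 1/3) or p2 (2/3); both lead to p3.
transitions = {
    "q1": (["p0"], ["p1"]),
    "q2": (["p0"], ["p2"]),
    "q3": (["p1"], ["p3"]),
    "q4": (["p2"], ["p3"]),
}
routing = {
    ("p0", "q1"): Fraction(1, 3),
    ("p0", "q2"): Fraction(2, 3),
    ("p1", "q3"): Fraction(1),
    ("p2", "q4"): Fraction(1),
}
U = trajectory_values(transitions, routing, "p0", Fraction(1), ["p0", "p1", "p2", "p3"])
```

At the place `p3`, which can be reached through two transitions, the min operator keeps the smaller of the two candidate values, which is how a preferred path is singled out.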

Property 1. The continuous function satisfies the following properties:
(1) such that
(a) if there exists an infinite sequence with such that , then is the infimum, that is, ;
(b) if there exists a finite sequence with such that , then is the minimum, that is, , where , ;
(2) or , where , for all such that ;
(3) for all such that then .

From the previous property, we have the following remark.

Remark 3. In property 1 point 3, we state that for determining the asymptotic condition of the Lyapunov-like function. However, it is easy to show that such a property is convenient for deterministic systems. In Markov decision process systems, it is necessary to include probabilistic decreasing asymptotic conditions to guarantee the asymptotic condition of the Lyapunov-like function.

Property 2. The trajectory-tracking function is a Lyapunov-like function.

Proof. The proof follows directly from the previous definitions.

Remark 4. From Properties 1 and 2, we have the following:
(i) or means that a final state is reached. Without loss of generality, we can say that by means of a translation to the origin.
(ii) In Property 1, we determine that the Lyapunov-like function approaches an infimum/minimum when is large thanks to property (d) of the definition of the Lyapunov-like function (see Section 2).
(iii) Property 1, point 3 is equivalent to the following statement: such that , for all such that .

Explanation.

Intuitively, a Lyapunov-like function can be considered as a trajectory-tracking function and an optimal cost function. In our case, an optimal discrete problem, the cost-to-target values are calculated using a discrete Lyapunov-like function. At every step, a discrete vector field of possible transitions is calculated over the decision process. Each applied optimal transition (selected via some “criterion,” e.g., ) decreases the optimal value, ensuring that the optimal course of action is followed and establishing a preference relation. In this sense, the criterion changes the asymptotic behavior of the Lyapunov-like function by an optimal trajectory-tracking value. It is important to note that the process finishes when the equilibrium point is reached. This point marks a significant difference from Bellman's equation.

Example 6 (Conc-Path (see Figure 2)). Biochemical pathway of the free energy profile of the glycolysis and pentose-phosphate. The following was adapted from Biochemistry Lehninger et al. [26] and Campbell and Farrel [30]. The free energy changes were calculated using the steady-state metabolite concentrations in red blood cells (RBCs) and the equation . was set arbitrarily at the end of the pathway after the pyruvate kinase step. The overall reaction for the pathway is shown in Figure 1. Because , we will use the function to select the proper element of the vector :
(i) kcal/mol;
(ii) kcal/mol.
A decision is taken and is selected instead of based on the environment condition modeled via the routing policy (1/3, 2/3).
(i) kcal/mol.
(ii) kcal/mol.
(iii) kcal/mol.
The Conc-Path is calculated at .
(i)
(ii) kcal/mol.
(iii) kcal/mol.
(iv) kcal/mol.
(v) kcal/mol.
(vi) kcal/mol.
(vii) was set arbitrarily at the end of the pathway, that is, after the pyruvate kinase step.

Remark 5. We are using to denote the OR-Path, to denote the AND-Path, and to denote the Conc-Path.

4. Properties of the Model

We will identify the global system properties of the DPPN as those properties related with the PN.

Theorem 1. The decision process Petri net is bounded by a place of the system.

Proof. Let us suppose that the DPPN is not finite. Then is never reached. Therefore, it is possible to evolve in time and to reduce the trajectory function value over . However, the Lyapunov-like trajectory function converges to zero when (or reached a minimum), that is, or .

Theorem 2. Let be a decision process Petri net bounded by a place . Then, a Lyapunov-like trajectory function can be constructed if and only if is reachable from .

Proof. If is a Lyapunov-like function then by the previous theorem is reachable.
By induction, let us construct the optimal inverse path from to . At each discrete time in descending order ( is the maximum place index) the place of a system is observed and a transition leading to is chosen. We choose the trajectory function as the best choice set of states. We continue this process until is reached. Then, the trajectory function is a Lyapunov-like function.

Notation.

Let , , , and .

Let us consider systems of first-order ordinary difference equations given by where and is continuous in .

Definition 3. The -vector valued function is a solution of (8) if and for all .

Definition 4. The system (8) is said to be practically stable (see [20, 21]) if, given with , it holds that

Definition 5. The system (8) is said to be uniformly practically stable (see [20, 21]) if it is practically stable for every .

Definition 6. A continuous function belongs to class if it is strictly increasing and .

Let us consider [21] the vector function , and let us define the variation of relative to (8) by

Then, we have the following results [20, 21, 31, 32].

Theorem 3. Let be a continuous function in , such that for , it holds that and holds for , , where is a continuous function in the second argument. Suppose that is nondecreasing in , are given and finally that is satisfied. Then, the stability properties of

imply the corresponding stability properties of the system (8).

Proof. The stability properties are preserved for the following.
(1) Practically stable. Let us suppose that is practically stable for then, we have that for , where is the solution of (11). Let , we claim that for . If not, there would exist and a solution such that and for . Choose , then for all . (If not and which is a contradiction). Hence we get that (where the last inequality is because the condition ), which cannot hold therefore, system (8) is practically stable.
(2) Stable. Suppose that system (11) is stable, that is, for all such that if for Now, since is a continuous function in there exists a such that if then setting equal to by the comparison principle (which was implicitly proved in point 1) implies that for all . Taking equal to the one given from the continuity of , for . If not, there would exist such that and for but then

which cannot hold. Therefore, we must have that for as desired.
(3) Asymptotically stable. We know that system (8) is stable, the fact that it is asymptotically stable follows thanks to
(4) Uniformly stable. Assume that the comparison system is uniformly stable, meaning that (independent of ) such that for and let independent of such that for Since is a decreasing function there exists such that Then, choosing works (if ) and choosing we arrive at the inequality

But is independent of . Therefore, the system (8) is uniformly stable.

We will extend the last theorem to the case of several Lyapunov functions. Let us consider a vector Lyapunov function , and let us define the variation of relative to (8). Then, we have the following theorem.

Theorem 4. Let be a continuous function in , define the function such that it satisfies the estimates:

for , , where is a continuous function in the second argument. Assume that is nondecreasing in , are given and is satisfied. Then, the practical stability properties of

imply the corresponding practical stability properties of system (8).

Proof. (1) Let us suppose that is practically stable for Then we have that for , where is the vector solution of (16). Let , we claim that for . If not, there would exist and a solution such that and for . Choose , then for all . Therefore we have that which cannot hold. As a result, system (8) is practically stable.
(2) From the continuity of with respect to the second argument, it is always possible to make . We want to prove that for If it is not true, there exists an and a solution such that and for Then, we have that which proves our claim.
Remark 6. If in the point 1 of the proof it is not true that and , then we have that which is a contradiction.

Then, we have the following result [21].

Corollary 1. From Theorem 4, the following hold.
(1) If , the uniform practical stability of (8), which implies structural stability [21, 33], is obtained.
(2) If , for , the uniform practical asymptotic stability of (8) [21] is obtained.
Example 7. The diamond is the stable form of carbon at extremely high pressures while the graphite is the stable form at normal atmospheric pressures. Regardless of that, diamonds appear stable at normal temperatures and pressures, but, in fact, are very slowly converting to graphite. Heat increases the rate of this transformation, but at normal temperatures the diamond is uniformly practically stable.

For Petri nets, we have the following results of stability [31].

Proposition 1. Let be a Petri net. Then, is uniformly practically stable if there exists a strictly positive vector such that

Moreover, is uniformly practically asymptotically stable if the following equation holds:

Proof. Let us choose as our candidate Lyapunov function with and vector to be chosen. It is simple to verify that satisfies all the conditions of Theorem 3. Therefore, the uniform practical asymptotic stability is obtained if there exists a strictly positive vector such that equation (17) holds.

Proposition 2. Let be a Petri net. Then, is uniformly practically stable if there exists a strictly positive vector such that

Proof. Since holds, for every we have that . This follows from the fact that is positive.
Remark 7. The if-and-only-if relationship of (19) exists from the fact that is positive.

Example 8. The biochemical pathway of the glycolysis (Figure 1). The incidence matrix is as follows: Choosing , , we obtain that concluding stability.

Definition 7. An equilibrium point with respect to a decision process Petri net is a place such that , for all , and is a sink.

Theorem 5. The decision process Petri net is uniformly practically stable iff there exists a strictly positive vector such that .

Proof. It follows directly from Propositions 1 and 2.
Let us suppose by contradiction that with fixed. From we have that . Then, it is possible to construct an increasing sequence which grows without bound. Therefore, the DPPN is not uniformly practically stable.

Remark 8. It is important to underline that the only places where the DPPN will be allowed to get blocked are those which correspond to equilibrium points.
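The vector condition used in Propositions 1 and 2 and in Theorem 5 can be checked mechanically once an incidence matrix is available. The sketch below assumes that the condition takes the form $A\Phi \leq 0$ for a strictly positive vector $\Phi$, where row $j$ of $A$ gives the net token change produced by transition $j$ in each place; this form is an assumption, since the formulas are not legible in the text above. The matrices shown are hypothetical and are not the glycolysis net of Example 8.

```python
def satisfies_stability_condition(incidence, phi):
    """Return True if phi is strictly positive and every entry of A * phi is <= 0.

    incidence -- list of rows, one per transition, one column per place
    phi       -- candidate weight vector, one entry per place
    """
    if any(weight <= 0 for weight in phi):
        return False
    return all(
        sum(a * w for a, w in zip(row, phi)) <= 0
        for row in incidence
    )


# Hypothetical chain p0 -> q1 -> p1 -> q2 -> p2: each transition moves one token forward.
chain = [
    [-1, 1, 0],
    [0, -1, 1],
]
accepted = satisfies_stability_condition(chain, [1, 1, 1])

# Hypothetical net with one place and one transition that only produces tokens:
# the single entry is positive, so no strictly positive vector can satisfy the condition.
generator = [[1]]
rejected = satisfies_stability_condition(generator, [1])
```

The function only tests a given candidate vector; finding a suitable vector in general amounts to solving a system of linear inequalities.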

We will identify the trajectory-tracking properties of the DPPN as those properties related with the trajectory-tracking value at each place of the PN. In this sense, we will relate an optimum point to the best possible performance choice. Formally, we introduce the following definitions [23].

Definition 8. A final decision point with respect to a decision process Petri net is a place where the infimum is asymptotically approached (or the minimum is attained), that is, or .

Definition 9. An optimum point with respect to a decision process Petri net is a final decision point where the best choice is selected “according to some criteria.”

Property 3. Every decision process Petri net has a final decision point.

Remark 9. In case that , such that , then are optimum points.

Remark 10. The monotonicity of guarantees that it is possible to make the search starting from the decision points.

Then, we can conclude the following theorem.

Theorem 6. Let DPPN be a finite decision process Petri net and let be a realized trajectory which converges to such that . Let ; then the optimum decision point is reached in a time step bounded by .

Proof. Let us suppose that is never reached. Then is not a sink (the last place) in the decision process Petri net, so it is possible to find some output transition to . Therefore, it is possible to reduce the trajectory function value over by at least . As a result, it is possible to obtain a lower value than , which is a contradiction.

Theorem 7. Let DPPN be a decision process Petri net. Then, converges to an optimum (final) decision point .

Proof. We have to show that converges to an optimum (final) decision point . By the previous theorem, the optimum decision point is reached in a time step bounded by ; therefore, converges to .

Proposition 3. Let DPPN be a decision process Petri net and let be an optimum point. Then , for all such that .

Proof. We have that is equal to the minimum or the infimum. Therefore, for all such that .

Theorem 8. The decision process Petri net is uniformly practically stable iff .

Proof. Let us choose ; then . Then, by the autonomous version of Theorem 4 and Corollary 1, the DPPN is stable.

We want to show that the DPPN is practically stable, that is, given , we must show that . We know that and, since is nondecreasing, we have that .

Theorem 9. Let DPPN be a decision process Petri net. If is an equilibrium point, then it is a final decision point.

Proof. Let us suppose that is an equilibrium point. We want to show that its trajectory-tracking value has asymptotically approached an infimum (or reached a minimum). Since is an equilibrium point, by definition it is bounded and it is a sink, that is, its marking cannot be modified. This implies that the routing policy attached to any transition that follows is 0 (in the worst case, where such a transition exists). Therefore, its trajectory-tracking value cannot be modified and, since the value is a decreasing function of , an infimum or a minimum is attained. Then, is a final decision point.

Theorem 10. Let DPPN be a finite and nonblocking decision process Petri net (unless is an equilibrium point). If is a final decision point, then it is an equilibrium point.

Proof. If is a final decision point, since the DPPN is finite, there exists a such that . Let us suppose that is not an equilibrium point.

Case 1. Then, it is not bounded. So, it is possible to increment the marks of in the net. Therefore, it is possible to modify its trajectory-tracking value. As a result, it is possible to obtain a lower value than .

Case 2. Then, it is not bounded and it is not a sink. So, it is possible to fire some output transition to in such a way that its marking is modified. Therefore, it is possible to modify the trajectory-tracking value over . As a result, it is possible to obtain a lower trajectory-tracking value than .

In both cases a value lower than the one attained at the final decision point is obtained, which is a contradiction.

Corollary 2. Let DPPN be a finite and nonblocking decision process Petri net (unless is an equilibrium point). Then, an optimum point is an equilibrium point.

Proof. From the previous theorem, we know that a final decision point is an equilibrium point and, since in particular is a final decision point, it is an equilibrium point.
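The convergence argument of Theorems 6, 7, 9 and 10 can be pictured as a forward walk that always moves to a place with a strictly lower trajectory-tracking value and stops at a sink. The following is a minimal sketch of that idea, not the construction used in this paper: the net is reduced to a hypothetical successor map between places, the values in `VALUE` are invented for illustration, and the greedy rule (pick the successor with the lowest value) is an assumption.

```python
from typing import Dict, List


def track_forward(
    successors: Dict[str, List[str]],
    value: Dict[str, float],
    start: str,
) -> List[str]:
    """Follow the successor with the lowest value until a sink is reached.

    Raises ValueError if a step would not strictly decrease the value,
    since the descent argument no longer applies in that case.
    """
    path = [start]
    current = start
    while successors.get(current):  # a sink has no successors
        best = min(successors[current], key=lambda place: value[place])
        if value[best] >= value[current]:
            raise ValueError(f"value does not decrease when leaving {current}")
        path.append(best)
        current = best
    return path


# Hypothetical net: p0 branches to p1 or p2, both lead to the sink p3.
SUCCESSORS = {"p0": ["p1", "p2"], "p1": ["p3"], "p2": ["p3"], "p3": []}
# Hypothetical trajectory-tracking values, decreasing toward the sink.
VALUE = {"p0": 1.0, "p1": 0.6, "p2": 0.4, "p3": 0.1}

print(track_forward(SUCCESSORS, VALUE, "p0"))  # ['p0', 'p2', 'p3']
```

Because the value strictly decreases at every step, no place can be visited twice, so on a finite net the walk must stop, and the only place where it can stop is a sink. This mirrors the role that finiteness and the nonblocking condition play in Theorem 10.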

5. Completeness

Theorem 11. Let DPPN be a decision process Petri net and let be a realized trajectory which converges to such that . Let ; then an optimum point is reached in a time step bounded by .

Proof. Let us suppose that is never reached. Then is not the last place in the decision process Petri net, so it is possible to find some output transition to . Therefore, it is possible to reduce the trajectory function value over by at least . As a result, it is possible to obtain a lower value than , which is a contradiction.

Remark 11. The time complexity differs from that of Dijkstra's algorithm.

Remark 12. Each path in corresponds to a trajectory of a given system. The trajectory-tracking function value of at the source place () divided by equals the length of the shortest path. Then, the infimum is equivalent to the infimum length over all paths in .

Theorem 12. Let DPPN be a decision process Petri net. Then, converges to a point .

Proof. We have to show that converges to a point . By the previous theorem, the optimum point is reached in a time step bounded by ; therefore, converges to .

Proposition 4. The finite and nonblocking (unless is an equilibrium point) condition over the DPPN cannot be relaxed.

Proof. (1) Let us suppose that the DPPN is not finite, that is, is in a cycle. Then the Lyapunov-like function converges to zero when , that is, , but the DPPN has no final place; therefore, it is not an equilibrium point.

(2) Let us suppose that the DPPN blocks at some place (not an equilibrium point) . Then, the Lyapunov-like function has a minimum at place , say , but is not an equilibrium point, because it is not necessary to have a sink in the net.
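Case (1) of Proposition 4 can be made concrete with a toy cycle. The sketch below is only an illustration under stated assumptions: the three-place cycle is hypothetical, and the Lyapunov-like value is modelled as being scaled by a constant factor in (0, 1) at each firing, which is a simplification chosen for this example rather than the function defined in the paper.

```python
from typing import Dict, List


def values_on_cycle(initial: float, factor: float, steps: int) -> List[float]:
    """Values obtained by scaling `initial` by `factor` once per firing."""
    if not 0 < factor < 1:
        raise ValueError("factor must lie strictly between 0 and 1")
    values = [initial]
    for _ in range(steps):
        values.append(values[-1] * factor)
    return values


# Hypothetical cycle p0 -> p1 -> p2 -> p0: every place has a successor.
CYCLE: Dict[str, List[str]] = {"p0": ["p1"], "p1": ["p2"], "p2": ["p0"]}
HAS_SINK = any(not outs for outs in CYCLE.values())

vals = values_on_cycle(1.0, 0.5, 10)
print(HAS_SINK)                                      # False
print(all(b < a for a, b in zip(vals, vals[1:])))    # True
```

The values keep decreasing toward zero, yet the walk never stops because no place in the cycle is a sink, so the infimum is approached without any equilibrium point being reached.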

6. Conclusions

In this work, a formal framework for representing the shortest-path decision process problem has been presented. Whereas in previous work attention was restricted to tracking the net using a utility function (Bellman's equation), this work uses a Lyapunov-like function. In this sense, we replace the traditional cost function with a trajectory-tracking function, which is also an optimal cost-to-target function for tracking the net. This makes a significant difference in the conceptualization of the problem domain. The Lyapunov method introduces a new concept of equilibrium and stability in decision processes.

References

1. D. P. Bertsekas and J. N. Tsitsiklis, "An analysis of stochastic shortest path problems," Mathematics of Operations Research, vol. 16, no. 3, pp. 580–595, 1991.
2. D. Blackwell, "Positive dynamic programming," in Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, pp. 415–418, University of California Press, Berkeley, Calif, USA, June-July 1965.
3. C. Derman, Finite State Markovian Decision Processes, Academic Press, New York, NY, USA, 1970.
4. F. Baccelli, G. Cohen, and B. Gaujal, "Recursive equations and basic properties of timed Petri nets," Journal of Discrete Event Dynamic Systems, vol. 1, no. 4, pp. 415–439, 1992.
5. F. Baccelli, S. Foss, and B. Gaujal, "Structural, temporal and stochastic properties of unbounded free-choice Petri nets," Rapport de Recherche, INRIA, Sophia Antipolis, France, 1994.
6. R. I. Bahar, E. A. Frohm, C. M. Gaona et al., "Algebraic decision diagrams and their applications," Formal Methods in System Design, vol. 10, no. 2-3, pp. 171–206, 1997.
7. G. Ciardo and R. Siminiceanu, "Using edge-valued decision diagrams for symbolic generation of shortest paths," in Proceedings of the 4th International Conference on Formal Methods in Computer-Aided Design (FMCAD '02), M. D. Aagaard and J. W. O'Leary, Eds., vol. 2517 of Lecture Notes in Computer Science, pp. 256–273, Portland, Ore, USA, November 2002.
8. G. Cohen, S. Gaubert, and J. Quadrat, "Algebraic system analysis of timed Petri nets," in Idempotency, J. Gunawardena, Ed., Collection of the Isaac Newton Institute, Cambridge University Press, Cambridge, UK, 1998.
9. J. H. Eaton and L. A. Zadeh, "Optimal pursuit strategies in discrete-state probabilistic systems," Journal of Basic Engineering, vol. 84, pp. 23–29, 1962.
10. B. Gaujal and A. Giua, "Optimal routing of continuous timed Petri nets," Automatica, vol. 40, no. 9, pp. 1505–1516, 2004.
11. K. Hinderer and K.-H. Waldmann, "The critical discount factor for finite Markovian decision processes with an absorbing set," Mathematical Methods of Operations Research, vol. 57, no. 1, pp. 1–19, 2003.
12. K. Hinderer and K.-H. Waldmann, "Algorithms for countable state Markov decision models with an absorbing set," SIAM Journal on Control and Optimization, vol. 43, no. 6, pp. 2109–2131, 2005.
13. V. Khomenko and M. Koutny, "Verification of bounded Petri nets using integer programming," Formal Methods in System Design, vol. 30, no. 2, pp. 143–176, 2007.
14. A. F. Veinott, "Discrete dynamic programming with sensitive discount optimality criteria," Annals of Mathematical Statistics, vol. 40, no. 5, pp. 1635–1660, 1969.
15. H.-C. Yen, "A valuation-based analysis of conflict-free Petri nets," Systems and Control Letters, vol. 45, no. 5, pp. 387–395, 2002.
16. A. S. Poznyak, K. Najim, and E. Gomez-Ramirez, Self-Learning Control of Finite Markov Chains, Marcel Dekker, New York, NY, USA, 2000.
17. A. Auslender and M. Teboulle, "Interior gradient and proximal methods for convex and conic optimization," SIAM Journal on Optimization, vol. 16, no. 3, pp. 697–725, 2006.
18. S. Pan and J.-S. Chen, "Entropy-like proximal algorithms based on a second-order homogeneous distance function for quasi-convex programming," Journal of Global Optimization, vol. 39, no. 4, pp. 555–575, 2007.
19. T. Cormen, C. Leiserson, R. Rivest, and C. Stein, Introduction to Algorithms, MIT Press and McGraw-Hill, Cambridge, Mass, USA, 2nd edition, 2001.
20. V. Lakshmikantham, S. Leela, and A. A. Martynyuk, Practical Stability of Nonlinear Systems, World Scientific, Singapore, 1990.
21. V. Lakshmikantham, V. M. Matrosov, and S. Sivasundaram, Vector Lyapunov Functions and Stability Analysis of Nonlinear Systems, Kluwer Academic Publishers, Dordrecht, The Netherlands, 1991.
22. R. E. Kalman and J. E. Bertram, "Control system analysis and design via the 'second method' of Lyapunov," Journal of Basic Engineering, vol. 82(D), pp. 371–393, 1960.
23. J. B. Clempner, "Colored decision process Petri nets: modeling, analysis and stability," International Journal of Applied Mathematics and Computer Science, vol. 15, no. 3, pp. 405–420, 2005.
24. I. Csiszár, "Information-type measures of difference of probability distribution and indirect observations," Studia Scientiarum Mathematicarum Hungarica, vol. 2, pp. 299–318, 1967.
25. S. Kullback and R. A. Leibler, "On information and sufficiency," Annals of Mathematical Statistics, vol. 22, no. 1, pp. 79–86, 1951.
26. A. L. Lehninger, D. L. Nelson, and M. M. Cox, Principles of Biochemistry, Worth, New York, NY, USA, 4th edition, 2004.
27. H. Matsuno, S. Fujita, A. Doi, M. Nagasaki, and S. Miyano, "Towards biopathway modeling and simulation," in Proceedings of the 24th International Conference on Applications and Theory of Petri Nets (ICATPN '03), vol. 2679 of Lecture Notes in Computer Science, pp. 3–22, Eindhoven, The Netherlands, June 2003.
28. J. B. Clempner, "Towards modeling the shortest-path problem and games with Petri nets," in Proceedings of the Doctoral Consortium at the ICATPN, pp. 1–12, Turku, Finland, June 2006.
29. J. Desel and J. Esparza, "Free choice Petri nets," in Cambridge Tracts in Theoretical Computer Science, vol. 40, Cambridge University Press, Cambridge, UK, 1995.
30. M. K. Campbell and S. O. Farrell, Biochemistry, Brooks Cole, Florence, Ky, USA, 4th edition, 2002.
31. K. M. Passino, K. L. Burgess, and A. N. Michel, "Lagrange stability and boundedness of discrete event systems," Discrete Event Dynamic Systems: Theory and Applications, vol. 5, no. 4, pp. 383–403, 1995.
32. Z. Retchkiman, "Place-Transitions Petri Nets: Class Notes," CIC-I.P.N., 1998.
33. T. Murata, "Petri nets: properties, analysis and applications," Proceedings of the IEEE, vol. 77, no. 4, pp. 541–580, 1989.

Copyright © 2009 Julio B. Clempner. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.