Abstract

This article concerns fault detection and isolation for timed stochastic discrete event systems modeled with partially observed timed Petri nets. Events occur according to arbitrary probability density functions. The models include the sensors used to measure events and markings as well as the temporal constraints to be satisfied by the system operations. These temporal constraints are defined according to tolerance intervals specified for each transition. A fault is an operation that ends too early or too late. The set of trajectories consistent with a given timed measured trajectory is first computed. Then, the probability that the temporal specifications are not satisfied is estimated for any sequence of measurements, and the probability that a temporal fault has occurred is obtained as a consequence.

1. Introduction

The prevention of faults is a critical issue in numerous systems to preserve the safety of both equipment and human operators. These issues have been addressed in numerous studies with fault detection and diagnosis (FDD) methods. The aim of fault detection is to raise an alarm each time a fault occurs, and the aim of diagnosis is to isolate the fault within a group of candidates [1]. In the domain of discrete event systems (DESs), FDD has often been formulated with automata or Petri nets (PNs) [2], in particular labeled PNs (LPNs) [3] or partially observed Petri nets (POPNs) [4]. The main reason for developing FDD tests with PN extensions is that such models include graphical representations that can be disseminated widely in numerous application domains. They also offer mathematical foundations that are consistent with standard tools. The proposed methods are useful for a large variety of technological systems, ranging from computer or chemical engineering to manufacturing and intelligent transportation systems.

In numerous contributions, the faults that are considered are unexpected events that may occur in event sequences and that cannot be directly measured. Various approaches have been proposed with PN extensions to detect and isolate such unexpected events. These approaches are based either on the analysis of the PN reachability graph [5–9], on the direct properties of the PNs [10, 11], or on PN unfolding [12, 13]. A few results also concern the introduction of temporal information in the diagnosis process. At first, dates of events were introduced in usual extensions of untimed PNs. Such dates lead to a more accurate estimation of the past and future fault occurrence probabilities [14] and are also useful to evaluate the unknown fault dates [15]. The design and identification of models that include temporal faults have also been considered [16, 17]. Then, fuzzy Petri nets have been used to model and check temporal constraints between event occurrences [18]. Partial orders with unfolding and (max, +)-linear inequalities have been used with timed PN models [19, 20]. Monotonic monitoring and stratification have been introduced for the case where the monitoring is fragmented because of uncertain temporal observation [21]. Finally, indirect monitoring has been used to detect faults by comparing the actual cycle periods with the expected ones [22].

This paper addresses the setting in which both transitions and places are assumed to be partially observed, and it considers only temporal faults. Temporal constraints are defined by tolerance intervals that are associated with the transitions and that represent the normal durations of the system operations. The aim of the diagnosis system is to generate alarms when the temporal constraints are no longer satisfied. For that purpose, timed POPNs (POTPNs) are introduced. POTPNs take into consideration measurable events that correspond to dated and labeled transition firings and also dated partial measurements of the marking vector. This formalism, fully described in [23], is useful to represent the incomplete history of dated measurements collected by SCADA systems. In the present work, this model is extended by adding temporal constraints that give upper and lower bounds for each transition duration. The paper is organized as follows. In Section 2, temporal constraints and POTPNs are introduced. In Section 3, the main results are detailed. Examples are detailed throughout the paper. Section 4 concludes the paper.

2. Context and Notations

2.1. PNs with Temporal Specifications

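A PN structure is defined as $G = \langle P, T, W_{PR}, W_{PO} \rangle$, where $P = \{P_1, \ldots, P_n\}$ is a set of $n$ places and $T = \{T_1, \ldots, T_q\}$ is a set of $q$ transitions, $W_{PO} \in \mathbb{N}^{n \times q}$ and $W_{PR} \in \mathbb{N}^{n \times q}$ are the post- and preincidence matrices ($\mathbb{N}$ is the set of nonnegative integer numbers), and $W = W_{PO} - W_{PR}$ is the incidence matrix. A PN is choice-free if $|P_i^{\circ}| \le 1$ for every place $P_i \in P$ (the postset $P_i^{\circ}$ of $P_i$ contains at most a single transition). $\langle G, M_I \rangle$ is a PN system with initial marking $M_I$, and $M \in \mathbb{N}^n$ represents the PN marking vector. A PN system is 1-bounded if and only if (iff) $M \le 1_n$ for every reachable marking $M$, where $1_n = (1, \ldots, 1)^T$ (the inequality is considered componentwise). A transition $T_j$ is enabled at marking $M$ iff $M \ge W_{PR}(:, j)$, where $W_{PR}(:, j)$ is column $j$ of the preincidence matrix; this is denoted as $M[T_j\rangle$. When $T_j$ is enabled, it may fire, and when $T_j$ fires once, the marking varies according to $M' = M + W(:, j)$. This is denoted as $M[T_j\rangle M'$. A sequence $\sigma$ of size $h$ fired at marking $M$ is a sequence of transitions $\sigma = T(1) T(2) \ldots T(h)$, with $T(k) \in T$ for $k = 1, \ldots, h$, that consecutively fire from marking $M$ to marking $M'$. This is denoted as $M[\sigma\rangle M'$. The integer $x(T_j, \sigma)$ is the number of occurrences of transition $T_j$ in $\sigma$, and $X(\sigma) = (x(T_j, \sigma)) \in \mathbb{N}^q$ is the firing count vector for $\sigma$. A sequence $\sigma$ fired at $M$ leads to an untimed trajectory detailed in (1) with $M(0) = M$:

$$(\sigma, M) = M(0)\,[T(1)\rangle\, M(1) \cdots M(h-1)\,[T(h)\rangle\, M(h). \tag{1}$$

A marking $M$ is said to be reachable from initial marking $M_I$ if there exists a firing sequence $\sigma$ such that $M_I[\sigma\rangle M$. The set of all reachable markings from initial marking $M_I$ is $R(G, M_I)$.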
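The enabling and firing rules above can be sketched in a few lines of Python. This is only an illustration: the three-place, two-transition net, the matrix names `W_PR` and `W_PO`, and the helper names are assumptions made for the example and are not taken from the article.

```python
# Hypothetical net (not from the article): P1 -> T1 -> P2 -> T2 -> P3.
# Rows are places, columns are transitions.
W_PR = [[1, 0], [0, 1], [0, 0]]  # preincidence matrix
W_PO = [[0, 0], [1, 0], [0, 1]]  # postincidence matrix


def column(matrix, j):
    """Return column j of a matrix stored as a list of rows."""
    return [row[j] for row in matrix]


def is_enabled(marking, j):
    """Transition j is enabled iff the marking covers column j of W_PR."""
    return all(m >= pre for m, pre in zip(marking, column(W_PR, j)))


def fire(marking, j):
    """Fire transition j once: new marking = marking + W(:, j)."""
    if not is_enabled(marking, j):
        raise ValueError(f"transition index {j} is not enabled at {marking}")
    return [m - pre + post
            for m, pre, post in zip(marking, column(W_PR, j), column(W_PO, j))]


def run(marking, sequence):
    """Return the untimed trajectory (list of markings) of a firing sequence."""
    trajectory = [list(marking)]
    for j in sequence:
        trajectory.append(fire(trajectory[-1], j))
    return trajectory


def firing_count(sequence, n_transitions):
    """Number of occurrences of each transition in the sequence."""
    counts = [0] * n_transitions
    for j in sequence:
        counts[j] += 1
    return counts


M_I = [1, 0, 0]
trajectory = run(M_I, [0, 1])      # fire T1 then T2
counts = firing_count([0, 1], 2)
```

With list-based markings the sketch stays dependency-free; a matrix library would make the state equation (final marking equals initial marking plus the incidence matrix times the firing count vector) a one-liner.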

Timed Petri nets are PNs whose behavior is driven by time. Time is measured in time units (TU). Time can be associated with the firing of the transitions or with the sojourn of the tokens in the places. In this paper, time is associated with the transitions, and each transition fires after a firing duration that may be zero. In that case the firing is immediate; otherwise it is delayed. A delayed duration can be deterministic ( is a constant) or stochastic ( is a random variable (RV)) with a probability density function (pdf) . This article focuses on stochastic durations, but the results also apply to deterministic durations. No particular assumption is made on the pdfs of the firing durations, but the pdf of each transition is assumed to be known. The set of pdfs for all transitions is referred to as . Two classes of pdfs are of particular interest for this work: bounded uniform (Figure 1(a)) and symmetrical triangular pdfs (Figure 1(b)) defined, respectively, with equations (2) and (3).

Bounded uniform pdf is as follows: 

Symmetrical triangular pdf is as follows:

The motivations for considering these pdfs are that uniform random processes may be obtained as a limit case of (2) for and and that (3) is useful to describe the dispersion of the duration around an average value. Note, however, that other pdfs can be considered. Note also that deterministic durations are obtained as the limit behavior of both pdfs when the width of their support tends to zero.
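As an illustration of the two pdf classes, the following sketch writes a bounded uniform density and a symmetrical triangular density over a support `[low, high]` and checks numerically that each integrates to one. The parameter names `low` and `high` and the midpoint-rule helper are assumptions made for this example; the article's own parametrization in (2) and (3) may differ.

```python
def uniform_pdf(x, low, high):
    """Bounded uniform density on [low, high]."""
    return 1.0 / (high - low) if low <= x <= high else 0.0


def triangular_pdf(x, low, high):
    """Symmetrical triangular density on [low, high], peak at the midpoint."""
    if x < low or x > high:
        return 0.0
    mid = 0.5 * (low + high)
    half_width = 0.5 * (high - low)
    # The peak height 1 / half_width makes the triangle area equal to 1.
    return (1.0 - abs(x - mid) / half_width) / half_width


def integrate(f, a, b, steps=10000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))


area_uniform = integrate(lambda x: uniform_pdf(x, 1.0, 3.0), 1.0, 3.0)
area_triangular = integrate(lambda x: triangular_pdf(x, 1.0, 3.0), 1.0, 3.0)
```

Shrinking `high - low` toward zero concentrates both densities around a single value, which corresponds to the deterministic limit mentioned in the text.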

The considered timed PNs have time semantics [24, 25] defined by three policies: infinite server as server policy (Assumption A1), where each transition is considered as a server for firings and, in a given marking, each transition may fire simultaneously several times depending on its enabling degree; race as choice policy (Assumption A2), where the transition whose firing time elapses first is the one that fires next; and enabling memory as memory policy (Assumption A3), where, on entering a marking, the remaining durations associated with still-enabled transitions are kept and decremented and the remaining durations associated with disabled transitions are forgotten. Note that the enabling memory and infinite server policies may interact: on entering a marking, the enabling degree of some transitions may decrease to a nonzero value. In the following, the considered nets are assumed to be 1-bounded and choice-free (Assumption A4); thus this situation never occurs and the reachability set of the net is of finite cardinality . Finally, firings are assumed not to be immediate and (Assumption A5).

The nominal behavior of the timed Petri nets is also assumed to be constrained by temporal constraints. A time interval TC = is associated with each transition . The timed PN system satisfies the temporal constraints as long as the firing of the transitions has a duration not less than or larger than . The duration is measured from the transition enabling date to the date when it effectively fires. The set of all temporal constraints is referred to as . Such temporal constraints are useful to characterize the validity and the performance of the activities that are represented (e.g., operation of machines in manufacturing systems, transfer in automated transportation systems, and server load in communication systems). An activity with an exact duration without any tolerance is represented by a transition constrained by the time interval and an activity with no temporal specification is represented by a transition constrained by the time interval .
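To make the role of a tolerance interval concrete, the sketch below computes the a priori probability that a single firing duration falls inside its tolerance interval, for the two pdf classes considered in this paper. It deliberately ignores measurements, which Section 3.3 takes into account through conditioning. The function names, the `(low, high)` support, and the `(t_min, t_max)` tolerance tuple are assumptions made for the example.

```python
def uniform_cdf(x, low, high):
    """Cumulative distribution of a bounded uniform duration on [low, high]."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)


def triangular_cdf(x, low, high):
    """Cumulative distribution of a symmetrical triangular duration."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    mid = 0.5 * (low + high)
    width = high - low
    if x <= mid:
        return 2.0 * (x - low) ** 2 / width ** 2
    return 1.0 - 2.0 * (high - x) ** 2 / width ** 2


def prob_within_tolerance(cdf, support, tolerance):
    """Probability that the duration lies in the tolerance interval."""
    low, high = support
    t_min, t_max = tolerance
    return cdf(t_max, low, high) - cdf(t_min, low, high)


# Hypothetical values: support [1, 3] TU, tolerance interval [1.5, 2.5] TU.
p_uniform = prob_within_tolerance(uniform_cdf, (1.0, 3.0), (1.5, 2.5))
p_triangular = prob_within_tolerance(triangular_cdf, (1.0, 3.0), (1.5, 2.5))
# A transition with no temporal specification always satisfies its constraint.
p_unconstrained = prob_within_tolerance(uniform_cdf, (1.0, 3.0), (0.0, float("inf")))
```

For the same support and tolerance, the triangular duration satisfies the constraint more often than the uniform one because its mass is concentrated around the midpoint.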

A timed firing sequence of length fired at marking in time interval is defined as where are the labels of the transitions and represent the dates of the firings that satisfy . This leads to the timed trajectory detailed in the following with : Note that we refer to timed and untimed firing sequences with the same notation as long as the notation is not confusing; otherwise we use to refer to untimed firing sequence and to refer to timed ones.

2.2. Partially Observed Timed Petri Nets

Partially observed Petri nets are considered to represent the system sensor. is a labeling function that assigns a label to each transition where is the set of labels that are assigned to observable transitions and is the null label that is assigned to the silent ones. The concatenation of labels obviously satisfies the following: and . For simplicity, each label is represented by the elementary vector of dimension such that with for and . The null label is represented by the zero vector of dimension . The labeling function is linear and defined by the matrix such that if ; otherwise (“·” stands for the product operator). The marking sensor matrix ( is the set of real numbers) defines the projection of the marking vector over subsets of places. The observable part of the marking is denoted as .

Thus, partially observed timed PNs with temporal constraints (POTPN) are defined as where is the set of pdfs, is the set of temporal constraints, is a Petri net structure, is an event sensor matrix, and is a marking sensor matrix. The matrices and define the sensor configuration.

Measurements are collected over the time interval . When the POTPN marking varies with the firing of a single transition at date , the measurement function is defined by Roughly speaking, the measurement function collects a new label each time a transition fires that is not silent or that changes the measurement of the marking. The measurement function is then extended to timed trajectories of the form (4) measured over the time interval [, ]: The measurement function collects successive dated marking and event measurements of a timed trajectory of length over time interval and organizes these measurements in a timed measured trajectory that is written as follows where , is the length of the sequence that satisfies , and refers to the set of measurement dates. Note that does not necessarily correspond to the measurement of initial marking . A timed trajectory is said to be consistent with a given timed measured trajectory in time interval [] if it satisfies . In the following, it is assumed that the time interval starts at time 0 (i.e., ) and that it ends at the last measurement date (i.e., ) (Assumption B).
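The informal description of the measurement function can be sketched as follows. The encoding is an assumption made for illustration: the event sensor matrix is called `L` (one row per label, one column per transition), the marking sensor matrix is called `H`, and a measurement is returned as a `(date, label_vector, observed_marking)` tuple, or `None` when the firing produces no measurement. The article's formal definition may organize these quantities differently.

```python
def observe_marking(H, marking):
    """Project the marking through the marking sensor matrix H."""
    return [sum(h * m for h, m in zip(row, marking)) for row in H]


def measure_firing(L, H, marking_before, marking_after, j, date):
    """Return a dated measurement for the firing of transition j, or None.

    A measurement is collected when the transition is not silent (column j
    of L is nonzero) or when the observable part of the marking changes.
    """
    label = [row[j] for row in L]
    before = observe_marking(H, marking_before)
    after = observe_marking(H, marking_after)
    if any(label) or after != before:
        return (date, label, after)
    return None


# Hypothetical sensor configuration: T1 is silent, T2 carries the only label,
# and only the third place is measured.
L = [[0, 1]]
H = [[0, 0, 1]]

silent = measure_firing(L, H, [1, 0, 0], [0, 1, 0], 0, date=1.2)
observed = measure_firing(L, H, [0, 1, 0], [0, 0, 1], 1, date=3.5)
```

Applying `measure_firing` to each step of a timed trajectory and dropping the `None` results yields a timed measured trajectory in the sense described above.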

The objective of the present work is to estimate the probability that any given timed measured trajectory satisfies the temporal constraints. An immediate application of the proposed estimation is an algorithm that generates alarms when this probability falls below a specified threshold . To the best of our knowledge, this is the first time that this problem has been considered with PNs. Note that POTPNs cannot be encoded as a Hidden Markov Model (HMM) [26] because in an HMM each state successively reached by the system delivers an observation that is not certain and depends on the emission probabilities. By contrast, in a POTPN model, the states and the transitions deliver certain but partial observations, and in some cases the states do not deliver any observation at all.

3. Temporal Specifications Checking

The proposed diagnosis system operates in three stages.
(i) For any timed measured trajectory with measurements, the set of timed trajectories that are consistent with is first computed with an integer linear programming approach developed in our previous work [23, 27].
(ii) For each possible trajectory, the probability that this trajectory is consistent with the temporal constraints is estimated.
(iii) The probability that a timed measured trajectory is consistent with the temporal constraints is obtained as a consequence by computing the probability of each consistent trajectory [15].

3.1. Untimed Trajectories Consistent with

In this section, the set of all untimed trajectories that are consistent with a given measured trajectory is computed. Note that this problem cannot be solved using standard dynamic programming algorithms (such as the Viterbi algorithm) [28] because such algorithms aim to find only the trajectory of maximum probability from the measured trajectory, not all trajectories. For diagnosis purposes, however, all trajectories must be considered. It is assumed that the set of the reachable states and also the reachability graph of the net are known. Let us define as the matrix of the reachability graph of the unobservable part of the PN where all transitions that are observable or whose firing changes the measured part of the marking have been removed and assume that this graph is acyclic (Assumption C). In that case, the maximal number of consecutive silent events is upper bounded by [23] with Let us consider a timed measured trajectory of the form (7) with measurements in time interval . An untimed trajectory with is consistent with iff the following conditions are satisfied [27]:
(1) ;
(2) There exists , such that , and the untimed firing sequence is rewritten as and satisfies the following:
where all inequalities are taken componentwise. Roughly speaking, inequality (9) means that the firing count vector of each transition in is positive, unitary, and feasible (i.e., leading to a positive marking). Equality (10) means that for , first transitions are silent and only the last one may provide a label . Similarly, it ensures that first marking measurement does not provide any information and only the last one may provide marking changes . The combined use of (9) and (10) leads to the exhaustive set of untimed trajectories that are consistent with [23, 27].
Note that does not include the silent closure of the trajectories (i.e., the continuations of the trajectories that provide neither event nor marking measurement) because the time interval ends with the last measurement (Assumption B) and no immediate firing occurs (Assumption A5). If required, the silent closure can easily be added to by considering the following equation in addition to (9) and (10): where stands for the silent closure.
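For small nets, the set of consistent untimed sequences can also be illustrated with a plain depth-first search. The sketch below is not the integer linear programming formulation of (9) and (10): it is a brute-force substitute used only to show what the consistency conditions select. It assumes event measurements only (no marking sensor), a bound `max_silent` on consecutive silent firings, and a hypothetical net made of two concurrent branches.

```python
def enabled(pre, marking, j):
    """Transition j is enabled iff the marking covers column j of pre."""
    return all(m >= row[j] for m, row in zip(marking, pre))


def fire(pre, post, marking, j):
    """Marking reached after firing transition j once."""
    return [m - rp[j] + rq[j] for m, rp, rq in zip(marking, pre, post)]


def consistent_sequences(pre, post, labels, marking, measured, max_silent):
    """Enumerate firing sequences whose observable labels equal `measured`.

    labels[j] is None for a silent transition. Sequences end with the last
    measured event, so no silent closure is appended.
    """
    results = []

    def explore(current, k, silent_run, prefix):
        if k == len(measured):
            results.append(prefix)
            return
        for j, label in enumerate(labels):
            if not enabled(pre, current, j):
                continue
            nxt = fire(pre, post, current, j)
            if label is None:
                if silent_run < max_silent:
                    explore(nxt, k, silent_run + 1, prefix + [j])
            elif label == measured[k]:
                explore(nxt, k + 1, 0, prefix + [j])

    explore(list(marking), 0, 0, [])
    return results


# Hypothetical choice-free net: P1 -> T1 -> P2 (T1 silent) runs concurrently
# with P3 -> T2 -> P4 (T2 labeled "a").
pre = [[1, 0], [0, 0], [0, 1], [0, 0]]
post = [[0, 0], [1, 0], [0, 0], [0, 1]]
found = consistent_sequences(pre, post, [None, "a"], [1, 0, 1, 0], ["a"], 1)
```

In this example a single measured label is explained either by T2 alone or by the silent T1 followed by T2, so two untimed sequences are returned. The search is exponential in general, which is why a linear programming formulation with bounded silent runs is preferable for larger nets.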

Let us consider the marked POTPN1 of Figure 2 with , a single observable transition , and no observable place (unobservable places and transitions are highlighted in grey). The set of labels is and the matrices and define the sensor configuration. Measurements are collected over the time interval . Assume that the measured trajectory is measured according to and . Note at first that the given example satisfies Assumptions A to C and that (8) leads to . Thus, untimed trajectories are searched with and with . Two untimed trajectories and are consistent with in this particular example. If we assume that the first measured marking is , then the single trajectory remains consistent with .

3.2. Probability of a Timed Trajectory with a Set of Given Firing Dates

A timed trajectory with is consistent with iff the corresponding untimed trajectory obtained by abstracting from time satisfies the previous conditions (conditions 1 and 2 in Section 3.1) and if the date satisfies the following conditions [23, 27]:
(3) (i.e., the chronological order of the events results from ).
(4) There exist dates such that .
(5) The probability that each transition fires within a small interval of width is nonzero.
For each transition that fires at date , let us consider the firing duration of transition and introduce and as the date and marking from which transition remains enabled. Thus is enabled from date and fires at date .
The problem solved in this section is to evaluate the probability that the transitions of in (12) fire in specific intervals knowing that the measurement dates are and that the untimed trajectory is the true one. This probability is noted . In the case where are independent RVs, the following holds:
The difficulty is that variables are not necessarily independent RVs. Table 1 details the different situations (type 1, 2, or 3) that may occur depending on whether the dates and are measured or not.

Situations of type 1, 2, or 3 will be used in the next section to evaluate the probability that a timed trajectory satisfies the temporal constraints.

3.3. Probability That a Timed Trajectory Satisfies the Temporal Constraints

The probability that any timed trajectory , obtained from untimed trajectory and consistent with , satisfies the set of temporal constraints results from the extension and integration of (13) with respect to the 3 types of situations described in Table 1. This leads to the following: with where dates and and functions are defined in Table 2 for . Note that if is of type 1 (i.e., is a deterministic variable), then and if ; otherwise . Thus situations of type 1 are no longer considered and (15) does not necessarily include all variables . To simplify the notation let us also divide the transitions of type 2 into several classes referred to as formally defined by the following: In other words, is the set of transitions of type 2 (their firing date is measured) that are enabled at the same date .

Roughly speaking, is a multi-integral that evaluates the sum of the duration variables over their possible range of variation constrained by the measurement time interval , the dates of measurements , and the temporal specifications . Similarly, evaluates the sum of the duration variables over their possible range of variation constrained only by the measurement time interval and the dates of measurements (the temporal specifications are not considered). The ratio of both evaluations leads to the probability that any timed trajectory , resulting from untimed trajectory and consistent with , satisfies the set of temporal constraints . From a numerical point of view, the calculation of and is obtained with a recursive algorithm.
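The ratio of two integrals can be illustrated on the simplest nontrivial case: a silent transition enabled at date 0 followed by a transition whose firing date `tau` is measured. The first duration `d1` is then the only free variable and the second duration equals `tau - d1`. The sketch below is an assumption-laden illustration of this idea with a single midpoint-rule integral; it is not the recursive algorithm of the article, and all names and numerical values are hypothetical.

```python
def uniform_pdf(x, low, high):
    """Bounded uniform density on [low, high]."""
    return 1.0 / (high - low) if low <= x <= high else 0.0


def prob_constraints_satisfied(tau, pdf1, pdf2, tc1, tc2, steps=20000):
    """P(both durations lie in their tolerance intervals | second firing at tau).

    Numerator: integral of pdf1(d1) * pdf2(tau - d1) restricted to durations
    inside the tolerance intervals tc1 and tc2. Denominator: same integral
    without the tolerance restriction. Returns None if the measurement is
    inconsistent with the pdfs (zero denominator).
    """
    h = tau / steps
    numerator = denominator = 0.0
    for k in range(steps):
        d1 = (k + 0.5) * h
        d2 = tau - d1
        weight = pdf1(d1) * pdf2(d2)
        denominator += weight
        if tc1[0] <= d1 <= tc1[1] and tc2[0] <= d2 <= tc2[1]:
            numerator += weight
    if denominator == 0.0:
        return None
    return numerator / denominator


def pdf(x):
    """Hypothetical duration pdf shared by both transitions: uniform on [1, 3] TU."""
    return uniform_pdf(x, 1.0, 3.0)


tolerance = (1.5, 2.5)  # hypothetical tolerance interval for both transitions

p_mid = prob_constraints_satisfied(4.0, pdf, pdf, tolerance, tolerance)
p_early = prob_constraints_satisfied(2.5, pdf, pdf, tolerance, tolerance)
p_impossible = prob_constraints_satisfied(1.5, pdf, pdf, tolerance, tolerance)
```

With these values the probability is 0.5 when the event is measured at 4 TU and 0 when it is measured at 2.5 TU, since at least one lower bound is then necessarily violated. Longer sequences add one integration variable per unmeasured date, which is where a recursive evaluation becomes necessary.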

3.4. Detection of the Temporal Faults

In the general case, several untimed trajectories may be consistent with a given timed measured trajectory in time interval (i.e., contains more than one trajectory). This situation has two possible causes: (i) several markings may be consistent with the first marking measurement ; (ii) from a given marking , several untimed sequences may be consistent with the measured trajectory . In order to deal with (i), let us consider as the set of markings consistent with and as the probability that is the current marking at date 0 such that . The set and the probability for each are assumed to be known (Assumption D). Note that Assumption D can be relaxed for PNs without absorbing subsets of markings by considering the steady state probability of as . In order to deal with (ii), the probability of each sequence issued from the same marking and consistent with is evaluated with the following: Finally the probability of each trajectory is obtained with and satisfies satisfies and . This last equation leads to

Then, the diagnosis of temporal faults results from the iterative evaluation of (19). For that purpose, a sampling period is considered and for each date , satisfies is updated depending on any new measurements that are collected during time interval . Formally, if refers to the measured trajectory collected during time interval , the probability satisfies is compared to a given threshold and an alarm is generated each time satisfies .
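The alarm logic can be sketched as follows, under the assumption that (19) combines the candidate trajectories as a total-probability sum: each consistent trajectory contributes its own probability multiplied by the probability that it satisfies the temporal constraints. The data layout, the function names, and the numerical values are hypothetical.

```python
def prob_measurements_satisfy(candidates):
    """Weighted sum over the consistent trajectories.

    `candidates` is a list of (trajectory_probability, prob_satisfies_tc)
    pairs; the trajectory probabilities are expected to sum to one.
    """
    total = sum(p for p, _ in candidates)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("trajectory probabilities must sum to one")
    return sum(p * s for p, s in candidates)


def alarm(candidates, threshold):
    """Raise an alarm when the probability falls below the threshold."""
    return prob_measurements_satisfy(candidates) < threshold


# Hypothetical case with two consistent trajectories.
candidates = [(0.25, 1.0), (0.75, 0.5)]
probability = prob_measurements_satisfy(candidates)
```

In an online setting the list of candidates would be recomputed at each sampling date from the measurements collected so far, and `alarm` would be evaluated again with the same threshold.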

Let us consider again POTPN1 of Figure 2 introduced in Section 3.1. If the first measured marking is assumed to coincide with then = with and . The timed trajectories consistent with satisfy also with , , , and . Two successive cases are considered to illustrate the computation of satisfies with (19).

In case A, the pdfs of the transition durations are assumed to be bounded uniform with the same support for and . The temporal constraints are arbitrarily defined by for and for . satisfies computed with (19) is reported in Figure 3(a) (full line) as a function of the date . For the considered example, this equation can be rewritten as follows:

Note that this probability is zero for  TU (at least one lower bound of the temporal constraints is not satisfied) and also for  TU (at least one upper bound of the temporal constraints is not satisfied). This computation is confirmed with a series of 1000 Monte Carlo simulations that coincide with (dashed line). Depending on the choice of the threshold , an alarm may be generated. For example, for , an alarm is generated if .
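A Monte Carlo check of this kind can be sketched as follows for a two-transition sequence whose final firing date is measured. Since an exact measured date has probability zero under continuous pdfs, the sketch keeps only the simulated runs whose final date falls within a small window around the measured date; this windowing, the seed, the supports, and the tolerance intervals are assumptions of the example and do not reproduce the settings of Figure 3.

```python
import random


def monte_carlo_estimate(tau, window, support1, support2, tc1, tc2, runs, seed=0):
    """Estimate P(constraints satisfied | second firing date close to tau).

    Durations are drawn from bounded uniform pdfs; runs whose final firing
    date is farther than `window` from tau are rejected.
    """
    rng = random.Random(seed)
    accepted = satisfied = 0
    for _ in range(runs):
        d1 = rng.uniform(*support1)
        d2 = rng.uniform(*support2)
        if abs(d1 + d2 - tau) > window:
            continue
        accepted += 1
        if tc1[0] <= d1 <= tc1[1] and tc2[0] <= d2 <= tc2[1]:
            satisfied += 1
    return satisfied / accepted if accepted else None


estimate = monte_carlo_estimate(
    tau=4.0, window=0.05,
    support1=(1.0, 3.0), support2=(1.0, 3.0),
    tc1=(1.5, 2.5), tc2=(1.5, 2.5),
    runs=200000,
)
```

For these hypothetical values the estimate stays close to the value 0.5 obtained by direct integration for a measurement at 4 TU. Rejection sampling discards most runs, which illustrates why an analytical evaluation is faster than simulation.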

In case B, the pdfs of the transition durations are assumed to be symmetrical triangular with supports for and for . The temporal constraints are, respectively, for and for . satisfies is computed with (19) in Figure 3(b) (full line) as a function of the date . This computation is also validated with a series of 1000 Monte Carlo simulations that coincide with (dashed line).

3.5. Numerical Complexity

The numerical complexity of the whole diagnosis scheme is due (a) to the computation of and (b) to the numerical evaluation of (19).

(a) The complexity of computing is related to the resolution of (9) and (10), which include inequalities and equalities with unknown integer variables. Basically, the complexity is exponential with respect to the number of reachable markings in and to the length of the measured trajectories ( is a constant parameter). Branch and bound algorithms can be used to solve (9)-(10) as an integer linear programming problem (LPP) [24]. These algorithms have a general nonpolynomial complexity but limit the computational effort in many practical situations. An algorithm that limits the length of the timed measured trajectories under test has also been developed in our previous work: it considers measured trajectories within a sliding window of constant size instead of increasing size [25], which leads to linear complexity with respect to . Note also that the complexity with respect to is no longer exponential if the set is known (Assumption D).

(b) The numerical evaluation of (19) is obtained according to a recursive scheme with a depth equal to the number of transitions of type 2 or 3 in the considered sequence. Consequently, the computational effort increases rapidly in time and in space with respect to the sequence length. To limit the computational complexity, the trajectory is divided into subtrajectories that are considered successively and independently: with . Each subtrajectory is of minimal length such that (i) ends with a measurement ; (ii) all transitions in are enabled from a marking that belongs to the same subtrajectory and not to a previous one. For this reason each subtrajectory can include several measurements. For example, the trajectory in (21) is divided into two subtrajectories and .

Trajectory decomposition is as follows:

3.6. Example

Let us consider the marked POTPN2 of Figure 4 (unobservable places and transitions are highlighted in grey) that represents a cycle of tasks. The state of the system is not measured and only two events are observed. The set of labels is . The matrices and define the sensor configuration. Measurements are collected over the time interval and the measured trajectory is considered. For this example and a single untimed trajectory , with is consistent with . The timed trajectories σ, consistent with satisfy with  TU,  TU, and . The pdfs of the transition durations are assumed to be bounded uniform with the same support for 1 : 5. The temporal constraints are also assumed to be identical for = 1 : 5.

satisfies is obtained by (19) with

This computation is validated with a series of simulations that leads to a probability of 0.47. The evaluation of satisfies with (19) saves time compared to the numerical evaluation based on simulation.

4. Conclusion

This article has proposed a diagnosis system that checks whether the heterogeneous measurements obtained from a stochastic timed discrete event system with an incomplete sensor configuration are consistent with a set of temporal constraints that specify tolerance intervals for the system operations. For this purpose, the set of trajectories consistent with a given timed measured trajectory is first characterized. Then the consistency of each trajectory with the temporal constraints is estimated as a probability. Finally, the probability of each trajectory is also evaluated, and the global probability that the temporal constraints are satisfied results from the previous steps. The diagnosis system returns an alarm each time this probability goes below a given threshold. The contribution is validated with simulation results.

In the future, we will consider the isolation of the temporal constraints that are unsatisfied. We will relax some assumptions, in particular Assumption B, in order to consider the silent closure of the trajectories. We will also study the problem from a structural point of view by providing some results to decide whether a set of sensors is suitable or not to check that a set of temporal constraints is satisfied. Finally we aim to apply the proposed approach to larger systems.

Conflicts of Interest

The author declares that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This study was supported by the Region Normandie and the European Union (Project MRT MADNESS 2016–2019).