Abstract

This paper presents a method for the identification of stochastic timed discrete event systems, based on the analysis of the behavior of the input and output signals, arranged in a timeline. To achieve this goal stochastic timed interpreted Petri nets are defined. These nets link timed discrete event systems modelling with stochastic time modelling. The procedure starts with the observation of the input/output signals; these signals are converted into events, so that the sequence of events is the observed language. This language arrives to an identifier that builds a stochastic timed interpreted Petri net which generates the same language. The identified model is a deterministic generator of the observed language. The identification method also includes an algorithm that determines when the identification process is over.

1. Introduction

Industrial systems usually have a sequential evolution, so they behave as discrete event systems (DES). A DES is a discrete-state, event-driven system; that is, its state evolution depends entirely on the occurrence of asynchronously discrete events over time [1]. A DES describes the system behavior by means of the occurrence of events from an initial state. In an industrial system two kinds of events can be observed. Input (I) events are related to control commands or external interactions with the system and output (O) events are related to sensor measures. In this proposal, the external signals not related with control commands define the operating modes, which affects the controller and allows modelling the system under different strategies; in [2] the authors use a similar concept to differentiate production patterns in batch processes, called multimode; they propose to separate the original space of a mode into two different parts (the common and the specific) and a monitoring process is carried out in each block; the multiblock monitoring method proposed is used for fault diagnosis purposes in multimode multivariate continuous processes; some of its proposals could be applied in stochastic DES regarding times identification, so it will probably reduce the sizes of time matrices and the complexity of the identification method.

Regarding DES, one problem which is being studied recently is system identification. This problem can be defined as follows: given a set of observed ordered timed I/O signals, determine a model such that, given a set of ordered I signals, the model approximates a set of the observed O signals. When a model of the system is not available, identification can be effectively used to obtain a model that can be then used to formally prove if the system meets the requirements and to improve its dependability [3].

Therefore, the first step in an identification process is to define the characteristics and the format of the model to fit. DES modelling starting form I/O signals has been addressed by many authors using different approaches.

Petri nets (PNs) have been recognized as a suitable model to describe DES [4], particularly when a system is asynchronous [58]. Hiraishi [9] presented an algorithm for the construction of a free labeled Petri net model from the knowledge of a finite set of its firing sequences. In [10], the authors define a class of continuous-time discrete event dynamic systems (DEDS) with two types of discrete-valued I/O signals: conditions signals and event signals called condition/event systems (C/E systems) and they define models of C/E systems too that are based on an extension of PNs (C/E PNs). C/E systems provide an intuitive modelling framework amenable to block diagram representation. The way of thinking and modelling a system is like a set of modules with a particular dynamic behavior and their interconnection via signals. This way of modeling is intuitive, and the modules can be pretailored and used over and over again. Each module is equipped with I/O signals which are of two types: condition I/O carrying state information and event I/O carrying state transition information. This way of system extension with I/O signals clearly reflects the duality of PNs, namely, the clear distinction between states and states transitions with their own graphical representation, semantics, and formal properties.

Rausch and Hanisch [11] proposed formalism for composition of I/O PNs to new systems which they called net condition/event systems (NCEs). I/O PNs are coupled by means of condition signals and event signals. The composition is performed in the same way as known from the composition of continuous systems in state space representation. This model can simulate the systems behavior by firing maximal steps and can also compute the complete state graph for the system, similar to the state graph for timed place/transition nets under the maximum firing strategy.

The general idea of NCEs is modelling a system as a set of modules with a particular dynamic behavior and their interconnection via signals [12]. Condition and event inputs can be connected with some transitions inside the module by condition and event arcs. Module places can be connected to the condition outputs by condition arcs, and transitions can be connected to the event outputs by event arcs. This concept provides a basis for a compositional approach to build larger models from smaller components [12].

Once the model has been defined, the following step is defined as the DES identification method. In the literature there exist a lot proposals related to DES identification methods, these can be compared by their complexity in terms of the system characteristics, identification process, and the model or algorithm applied [13], as well as by the identification model objective. From the theory of DES some identification methods have been developed using different techniques such as finite state machines (FSM), Automata, and PN; currently, the most popular models are PNs and Automata [7, 14].

The simplest methods are based on data from I/O, where concurrent states are not identified and its execution is off-line [13]. If the identification process is incremental, then the algorithm must be executed on-line, but its complexity is higher, because it can be polynomial or exponential, depending on the approach used for modeling. Furthermore, if the identification goal is the analysis of the model structure, the model must be developed to verify PN properties like being alive, conservative, achievable, among others. One of the most well-studied PNs problems is estimating the state of a given PN, based on its event sequence observation; in [15] the problem of finding the least-cost transition firing sequence(s) for a given interpreted PN (IPN) based on the observation of labels sequence is addressed. The PN possesses both observable transitions (which are associated with a possibly nonunique label) and unobservable transitions (whose firings do not generate any label observations). This paper assumes that each transition in the given IPN is associated with a nonnegative cost that captures its likelihood (e.g., in terms of the workload or the power amount needed to execute a certain transition). Given the observation of a labels sequence, the task is to find the transition firing sequence which (i) is consistent with both the observed label sequence and the PN structure and (ii) has the least total cost.

Recent literature presents results and trends to solve the identification problem. These trends can be grouped according to [13], in three groups: IPN identification by I/O signals (I/O IPN identification), nondeterministic finite automaton (NDFA), and application of integer linear programming (IPL) for IPN identification (ILP + IPN).

I/O IPN identification is related to the generation of DES, based on the observation of output signals [16]; an on-line algorithm computes an IPN model , describing the behavior of an unknown system . Every time a cyclic behavior is detected, the previous computed model is updated. This approach considers that each input to the system is reflected in the output. It means that, even if the system is not completely instrumented, the information provided by the sensors is enough to detect any change of state [17]. The algorithm receives a sequence of output signals obtained from observations during the system operation. These output signals must be binary vectors representing the current state of every one of the sensors measuring the output behavior of the system. In [17] the authors present an evolution of the method which is able to deal with concurrent systems by means of detecting concurrent transitions. The identified models are useful for structural diagnosis.

Regarding NDFA method, its approach is to build a model that approximates the original system language. The observed language of the closed loop DES is to be translated into a finite state machine by an appropriate identification algorithm. In [18, 19] a nondeterministic autonomous automaton with output (NDAAO) is chosen as an appropriate model to reproduce the observed language of closed loop DES. This approach identifies a closed loop DES based on the interaction of the controller with deterministic behavior and the physical process with nondeterministic behavior, combining internal and external events. It presents a proposal for division of the system into smaller subsystems. Also [20] addresses the problem of identification and diagnosis in DES based on timed Automata; the algorithm identifies and detects failures in the actuators in low complexity systems. An evolution of the method of [20] is proposed by Jarvis in [21]. This paper deals with the identification of timed event systems with timed Automata models. Some disadvantages of this trend are that it does not model the concurrence of states. Moreover, in complex systems there is a combinatorial explosion of states.

ILP + IPN uses an ILP approach and it computes an IPN. The model is built by using observed events and the available output vector ([2226]). The identification problem is to determine a set of places of cardinality , a set of transitions of cardinality , and the function of labeling and the PN defined as an ILP problem. Silent transitions are those that identify faults. A contribution of these methods is given in [3] that deals with the identification problem for deterministic systems as of timed PN, starting from the timed observed sequences. This method generates a net language, whereas the timer information is used to identify language strings that are not accepted by the net; therefore, the algorithm does not need to know the language of the full net. The solution is based on ILP and knowledge of the timed net structure. It allows, together with the observation of events, determining whether a transition time has expired. The algorithm is carried out through two subroutines, one to observe the language on-line and another for the off-line ILP. Furthermore, [27] contributes with a technique that reduces the number of sets of inequalities and the number of inequalities in each set to identify the IPN as an ILP problem. The other contribution in this area is given in [28], presenting a method that uses ILP with IPN, to prevent exhaustive generation of reachable states; it is considered nondeterminism in the model, since different observable transitions share the same label; it has a timing structure of events leading to the imposition of new restrictions and provides the most accurate diagnosis. It is a very efficient method in small models but to find an optimal solution in complex models has very high computational requirements.

These trends can be compared regarding to criteria such as kind of system to identify, resulting model and computational cost, we can say that: The I/O IPN identification cannot be used with timed systems; it can be applied to complex systems; the obtained models are less complex than those obtained using the Automata based methods and computational cost is normal. The NDFA trend applies to timed systems, and it works best on simple systems; it generates nondeterministic models and the computational cost is normal; furthermore, there is a combinatorial explosion of states. And regarding IPL + IPN, some proposals are for timed systems and some present nondeterminism so these methods cannot be applied today to identify real complex DES because of the high computational complexity of the algorithm.

Timed event systems identification with output symbols has been also treated in [21] using a technique from computational learning theory in the context of timed Automata and in [3] with free labeled nets; in this paper it is assumed that a delay is associated with each transition and the timer variables are used to determine if a potentially enabled transition; on the basis of the observed firings, a subset of the net language is build, while the timing information is exploited to determine a subset of counterexamples, that is, a set of strings that do not belong to the net language.

Considering the advantages and disadvantages of the existing identification methods, this paper proposes an identification method of timed discrete event complex systems, which identifies a deterministic language generator. The language generator is defined as a stochastic timed IPN (st-IPN). The proposal to model a system as st-IPN is inspired in some concepts of the modelling formalism by NCES, specifically on conditions and I/O events with the purpose of labeling the transitions and places. We have added properties to generate a DES modelling approach with the purpose of performing diagnostics on complex systems.

This paper is organized as follows. Section 2 describes the background on PN; Section 3 defines a st-IPN; Section 4 presents the identification method; Section 5 shows an application case; finally, the concluding remarks and discussion are showed in Section 6.

2. Background on PN

Petri nets (PNs) are widely used for modeling DES [29]. PN provides compact models and captures main DES characteristics as concurrence, asynchronism, causal relationships, mutual exclusions, and so forth. This paper will model DES with PN, so some basics of PN are presented.

Definition 1 (Petri net (PN)). A Petri net structure is a bipartite digraph represented by the five-tuple , where is a set of places with cardinality and is a set of transitions with cardinality , and , are the pre- and postincidence matrices, respectively, which specify the arcs connecting places and transitions. Matrix is the incidence matrix of the net. The marking function represents the number of tokens residing inside each place, and is the initial marking [26, 30].

For pre- and post-sets, the dot notation is used. [26].

Definition 2 (Reachability set of a PN). Let be a PN. Transition is enabled at marking if . The enabled transition can be fired reaching a new marking that can be computed by , where is a vector of zeroes, except in entry , that is equal to 1. An enabled sequence is denoted as . The reachability set of a PN is the set of all possible reachable markings from , firing only enabled transitions; this set is denoted by [31].

The system inputs will be the control commands and the outputs will be the sensor readings. At each time instant each input will have a particular value and so the outputs. In order to organize the signals, this paper encodes the different I/O vectors using its binary representation; therefore, it is defined an input (output) symbol as the set of control commands (sensor readings) values at a time instant in binary representation.

Definition 3 (I/O symbol). Assuming binary values in input and output signals, is a binary representation of , stands for the input symbol, and , with ; ; , the same applies to , is a binary representation of , stands for the output symbol, and , with ; ; .

For example, given a set of control commands ; if and then the input symbol is represented by .

Note that is possible that the controller will not generate all the input symbols, since some control commands cannot be enabled simultaneously. In the same way, the system will not generate all the output symbols if there is no failure.

Definition 4 (interpreted Petri net). An IPN is a tuple (see [32]), where is a PN. is the input alphabet, is an input symbol, and is the number of inputs; is the output alphabet, is an output symbol, and is the number of outputs; is a labeling transition function that assigns an input symbol to each transition. is an output function that assigns an output symbol to each reachable marking.

From the definition given in [33], an event is a pair . A new event is generated when there is a change in , in , or in both.

The system alphabet relates I/O symbols; that is, ; therefore if and , where .

If is the system alphabet and () is the input (output) alphabet, then (resp., ), where , if . , .

Definition 5 (timed event). A timed event is an I/O symbol at time , where is an I/O symbol at time . The time elapsed from to is (see [34]) and .

Definition 6 (firing language (see [35])). A firing sequence in an IPN is a transition sequence such that . Given a firing sequence , a marking is reached from if there exists a such that . The set of all firing sequences is called the firing language . Each transition is fired at time   ; therefore is an arranged sequence in a timeline, where each transition happens at time .

generates a timed event sequence (based on Definition 5), called system language whose events are arranged at the same timeline as .

We define the projection (resp., ), where if and if . ( if and if ). Therefore consider the following.(i)The input language is the generated language by the input symbols, is: Note that there could be several transitions with the same label if is not isomorphic.(ii)The The output language is the generated language by the output symbols, is: Note that there could be several places with the same label if is not isomorphic. For identification purposes, it is convenient to use isomorphic functions. Isomorphic mappings will lead to equivalences between output symbols and net markings. Therefore, it will be immediate to infer the net state from a particular combination of output symbols.

Note that the evolution on an IPN could be nondeterministic if is nonisomorphic. For example, a nondeterministic evolution will occur if there exists at least one place with at least two transitions with the same label that lead to different places. Nevertheless, the language generated by many systems (see [18]) is represented by a nondeterministic NDA when the system state is described by means of its outputs. For some applications, such as diagnosis, nondeterminism is an undesirable characteristic. Next section will present an extension of IPN that will avoid the nondeterminism in this case.

3. Definition of Stochastic Timed Interpreted Petri Net (st-IPN)

This section presents the definition of st-IPN. st-IPN definition includes the concept of “differential of output signals” , which represents the historical behavior of the output signals. This concept will improve the understanding of the process states. For example, in a fluid tank it is possible to differentiate between the filling or emptying tasks. This knowledge takes a great importance for identification purposes.

Definition 7 (differential of output symbol, ). Given an output symbol and ; is defined as where , is an output symbol at time , is an output symbol at time , and ; for an output symbol ; , so possible values are ; ; ; .

st-IPN combines an IPN and a timed PN. IPN is defined in Definition 4 and the timing concepts are based on [3] where the authors define a timed PN with a delay associated with each transition, represented by . This delay represents the time that must elapse from the enabling of the transition until it fires. Timer variables are used to determine if a potentially enabled transition tr is timed-out, that is, if its firing time is elapsed and tr has not yet fired. If , then tr is said to be timed, while if , then tr is said to be immediate.

Definition 8 (timer variables (based on [3])). Given a transition sequence , a timer variable is associated with each timed transition , where is enabled if ; is disabled; that is, , before the firing of .

Moreover, it is important to introduce the concept of operation mode which is related to external events signals that affects the controller, such as SCADA commands, operator requirements, among others. The operation modes can be grouped in a set .

A stochastic system is affected by a number of aspects such as the variability in the material flow, the fluctuation of power supply, and the deterioration of machinery and devices. These variations change the enabling and the firing times. It is assumed that each transition time follows a stochastic distribution that can be different for each operation mode.

Considering that this variability is not always normally distributed, st-IPN includes a density function of transition firing time for each transition and for each operation mode.

Definition 9 (stochastic timed interpreted Petri net (st-IPN)). A st-IPN is a structure represented by where is an IPN where have the same meaning as in Definition 4, but input function is defined as (a labeling function that assigns an input symbol and a timer density function to each transition) and is defined as ; is isomorphic over .
is the system alphabet.
is a density function of transition firing time for each operation mode .

A transition with is enabled at operation mode if and only if and if , where and is the confidence level. Therefore, a transition is enabled if each input place meets the marking requirements and the time elapsed has reached a certain confidence interval.

When is fired, a new marking is reached, such that ; thereby, each place of the st-IPN represents not only the current system state, but also includes information about directionality.(i)Generated language: Stochastic timed event sequence generated from the evolution of a st-IPN, so , where at times . A symbol of the language generated is a concatenation of an input symbol and an output symbol () at time .

3.1. Properties

Given an I/O event sequence , where , with , if is an isomorphic output function between the set of markings and the set of output symbols, then the net that generates the sequence is nondeterministic because input symbols can be followed by two different output symbols, and . One of the consequences of a net nondeterministic is that it generates event sequences that are not part of the system language. The generation of event strings not included in the system language must be avoided if the generator is going to be used as a diagnosis model. The Proposition 10 proof that a st-IPN from is deterministic.

Proposition 10. Let stQ be a st-IPN defined as in Definition 9. If is isomorphic, then stQ is deterministic.

Proof. Let be the sequence described in the previous paragraph. Let us split the symbol into two and ; ; the differential of output symbols in is and in is ; as then ; therefore, the output function is different in the two states, thus eliminating the nondeterminism and increasing the number of states. So, is isomorphic between the set of markings and the set of output symbols .

The inclusion of in the model allows the computing of a maximum number of states.

Definition 11 (maximum number of states, ). The system state is determined by the output symbols and their historical behavior; therefore, the maximum number of identifiable states is related to these output signals; thus where is the number of output symbols.

For example in a system with only one binary sensor, ; in a system with two binary sensors, .

Proposition 12. Let stQ be a st-IPN and the language generated by stQ and let at times , () be an event sequence. If and then .

Proof. Because a st-IPN is deterministic (see Proposition 10), if then so can be generated infinitely.

Proposition 13. Let , be st-IPNs, with incidence matrices and and initial markings and , and let , be the language generated by and , respectively, with and . if .

Proof. The system evolves the following PN state equation:
Let us suppose that but .
Let us consider an event sequence at times that is generated by a firing sequence .
If then ;  , for each st-IPN, respectively, and but, . As is isomorphic then .
As then .
If and by (2) and (3) respectively.
Then .
Therefore . Whereof , which contradicts the initial condition.

It is concluded that the st-IPN that generates a language is unique.

Proposition 14. Let be an output function and let , be a sequence that models a cycle ; , for each, where is the number of sensors.

Proof. The system evolves the following st-IPN state. Equation (7) is as follows
observed at times ; then by (3), that is, .
In compliance with the condition that in a cycle , , the behavior of each output signal can be as follows. (i) It does not change; then, the value of is always zero during the cycle; therefore ; (ii) it changes; therefore the value of can be 1 or 0, if it starts in 1; the value along the cycle will be ; therefore, . If it starts in 0, the value along the cycle will be ; therefore, . That is to say, after steps (: number of outputs whose value has changed), the output values must return at their initial values when reaching .

According to their definition st-IPNs are an extension of PNs which include input and output signals in places and transitions (IPNs) and stochastic firing times for each transition (st-IPNs). The link with I/O signals and the capacity of including stochastic time variations make st-IPNs a powerful tool to model real systems.

4. DES Identification Method

4.1. System to Identify

The generation of internal and external events in a timeline defines the behavior of the system (see Figure 1). A set of external events defines an operation mode that affects the controller and an internal event is the concatenation of control commands and sensor readings in at time instant.

The system consists of the closed loop connection of a dynamic system (plant) and a control law (controller). Once the operation mode is defined (by the external events), the controller executes a control strategy by generating a sequence of control commands. This sequence input gets into the plant and activates the actuators. The system reacts and generates a sensor readings sequence (Figure 1). Therefore the behavior of the system can be described in terms of event sequences.

During a system evolution like a production cycle, it is possible to get information of the system state (by analyzing these sequences), with the language theory ([1]). Each internal event is a letter of the alphabet and a letter sequence is a word that models the system performance. Therefore, the system can be described as a regular language generator.

4.1.1. Definition of a Subsystem

To decrement the complexity of the model, the system is split into subsystems. A subsystem is a part of a system which has a particular behavior. The division into subsystems is carried out according to physical or functional criteria, because the system model to identify is unknown and is not possible to define the subsystems in other ways.

For each subsystem the inputs will be the control commands and the outputs will be the sensor readings. Control commands can be classified as global or local. A control command is global when it is applied to more than one subsystem and it is local when it is applied only to one subsystem. Therefore the control commands set is , where and is the number of global control commands; ; , is the index of the subsystem and is the number of local control commands in subsystem .

Each sensor reading is assigned to a subsystem, so the set of sensor readings will be , where is the number of sensors in subsystem and .

4.1.2. Restrictions over the System

The system has the following restrictions.(i)The system is closed loop controlled and its working is cyclic. Moreover, each subsystem can only be in one state at a time.(ii)The exchanged signals between the system and the controller are discrete with only two possible values (binary coding).(iii)Its behavior is defined by the generation of internal and external events in a timeline.(iv)An external event affects the controller; an internal event is the concatenation of control commands and sensor readings at time instant. An external event sequence is a timed sequence which generates a system operation mode.(v)An internal event can only be generated by a subsystem at a time.

4.1.3. System Operation

Given an operation mode, the controller executes a control strategy then a timed internal event sequence is generated; that is, at each time the controller generates a set of control commands . This set is the input set to system which changes its state and generates an output set containing sensor readings. and ; .

(a) Event Generator. It is a procedure that concatenates the I/O symbols at time when any signal changes (control command or sensor reading) from to and it generates an event as .

Applying this concept to system defined in definition of Section 4.1, an event in a subsystem will be noted with the subscript , so an event in a subsystem is noted as . Consider ; ; and ; (see Figure 1).

Note that when there are changes in control commands or sensor readings shared by over one subsystem, the event is generated in all the subsystems involved.

4.2. Identification Process

This section presents the identification method of the DES presented in Section 4.1.

Problem 15. Given an event sequence of size , , generated from generator defined in Section 4.1.3(a) in a system defined as in Section 4.1, find the st-IPN that generates the same sequence.

The first step in the identification process will be the division of the system into subsystems.

Each subsystem will be identified as a st-IPN. For this purpose the identification process requires monitoring the signals exchanged between the system and the controller (I/O signals). These signals enter the event generator, which concatenates the signals and generates an internal event.

The event arrives to the identifier, which continuously builds the st-IPN which is able to generate the same event, in the same sequence and at the same time.

4.3. Identification Algorithm

The identification algorithm is shown in Figure 2. First, the system to identify has to be split into subsystems. Then, the initial conditions must be set, which consist of the initial event and the initial state. Let us assume that the system is split into c subsystems, so each subsystem will be noted with the index   .

The starting event for each subsystem is , where , stand for the starting values of control commands and sensor readings, respectively, at time . Therefore each subsystem starts with a st-IPN with one place. The inputs and output functions will be , and and is the initial value of the timer; that is, the subsystem starts without transitions.

Following the establishment of an operation mode   (), the event detection procedure generates events when any change happens in the I/O signals. It reads the input symbol and waits for the sensor setting. The next step is checking if there has been a change in one of the symbols of the I/O pair, in order to generate an event in that case.

The st-IPN identification algorithm starts with a new event (see Figure 3); then it compares the event with the previous one in order to classify it as a change in an input, in an output, or in both.

The identification algorithm (Algorithm 1) follows the flow diagram presented in Figure 3. It identifies a change in a subsystem state when there is a difference between and the current output symbol in subsystem . In this case, the system executes Algorithm 2, (Case 1). This algorithm creates new transitions and places when necessary and it also updates the current st-IPN marking.

Inputs:   ; ;  
; ; ; .
Output ; ; ; ; .
Initial Conditions:   ;
; ; ;   ; ;
read ,
(1)  If and
  execute Algorithm 2 CASE
(2)  else if and
  execute Algorithm 2 CASE
(3)  else if and
  execute Algorithm 2 CASE
  end if

Inputs:   ; ; ; ; ; ;
Outputs:   ; ; ; ;
Case  1.
read
;
;
;
;
(1)
(2)
(3)
(4) Update ; ; ; ; .
Case  2.
read
;
;
;
(1)
(2)
(3) Update ; ; ; ;
The updating of ; ; ; ; can be seen in Figure 4.

Otherwise, a change only in the input symbol will involve a transition, so Algorithm 2, (case 2) includes this new transition (if it does not exist) in the st-IPN and updates the current state of the st-IPN.

4.4. Including Time Information

This section explains how to exploit the time to solve the identification problem.

Time that must elapse from the enabling of the transition until it fires is accumulated in time data arrays , where , , stand for the indexes of the transition and subsystem, respectively, and is the number of samples.

The goal to monitoring the process data is finding a statistical model that describes statistical behavior of each transition; therefore, the system must be observed for long time to have enough data for the statistical analysis, until the process converges. The convergence criterion is based on finding a reliable estimate.

The estimation error of population mean is the estimator we use to study the convergence of the method. It has denoted the maximal accepted difference between the population mean and the sample mean as .

Sample size determination is the act of choosing the number of times that the process to identify is observed. The sample size is increased until the process has been identified totally and has converged.

The number of samples needed for fitting a distribution () can be computed as [3638] . Moreover, the significance level () can be defined as , being the difference between the population mean and the sample mean and is the sample quasi-variance. is the risk of having an error larger than . Generally is set in 0.05 [39] and is the factor determining the length of a confidence interval of population mean.

From to equation of , is

Algorithm 3 is executed each time a time array gets a new value, so its size has increased. It computes and compares it with the defined upper limit (). That is, where, if , is computed as a Student's distribution and, if , is computed as a normal distribution. Once is reached, Algorithm 3 is no longer applied to this transition and operation mode, because its distribution can be identified with the available data. Moreover, no more data is included in the time vector.

Input:   with , , ,
Output:  
for to
for to
(1) ;
(2) compute , ;
(3) if compute
  (i)    else compute
 end if
(4) ;
  (i)     If
       execute Algorithm 4
  (ii)   end if
end for
end for

Algorithm 3 first determines if there is enough data to fit a distribution, once is reached, that is, the size sample is sufficient, then it is necessary to find the probability density function (PDF) that fits the data. The methods of estimating the PDF fall into two main categories. One is parametric method that assumes the form of the PDF of the population which is known a priori and the collected process sample is used to determine the parameters of the assumed PDF and another category is nonparametric methods. In this procedure we analyze a set CD of specific continuous distributions known; . Among the most commonly used continuous distributions it includes the normal distribution , the exponential distribution , and the uniform distribution .

Algorithm 4 computes the distribution type and its parameters. Algorithm 4 uses the goodness-of-fit test (see [40]) to define the distribution type with a Kolmogorov-Smirnov test. First it computes the distribution parameters using the maximum likelihood method; second, it computes the -value statistic as where is the th observed value in the sample (whose values have previously been ordered from the lowest to the highest); is an estimator of the probability of observing values less than or equal to and is the probability of observing values less than or equal to from a specified distribution [41]; that is, .

Input:   , , for contrast
test, CD
Output:  
search - in table of Kolmogorov-Smirnov.
for
 (i)    Design goodness-of-fit test
       ;
       .
 (ii)   Estimate parameters of .
 (iii)  Apply Kolmogorov-Smirnov test
      (a) Compute -value .
 (iv)   If - - ,
 then ; break
  else .
 (v)     end if
end for

4.4.1. Modelling Error

The proposed identification algorithm is a tool for identifying stochastic DES which estimates the probability distributions of transitions times. Because full times population cannot be observed, there will be a modelling error. As the sample mean is the estimator of a population mean, then the modelling error is related to the difference between value estimated and the true value of the mean. The standard error of the mean is the way to measure the modelling error. The standard error of the mean is the standard deviation of sample means over all possible samples (of a given size) drawn from the population. To measure it, the root-mean-square error (RMSE) is used, where .

4.5. Global Language Reconstruction from to Language of the Subsystems

Our objective is to identify the language of global system. It is found from the language of its subsystem.

A global system event is obtained by Timed Synchronized Splice so.

Definition 16 (operation of timed synchronized splice). Le Let be a system composed by subsystems. Let with be events at time , where . A global system event, at time , can be obtained as: where and where is a binary representation of , stands for all input symbols and where is a binary representation of , stands for all output symbols.
Then, the global system language can be built from the sequence of synchronized events of its subsystem. Therefore, the global generated language at times is .

Proposition 17. Given a global system composed by c subsystems, the global system language built by the temporal synchronization of the languages of its subsystems is equal to system language, .

Proof. Let be a complex system with subsystems. Let be an event sequence generated from generator defined in Section 4.1.3(a) and let be the global system language built from a sequence of synchronized events of its subsystems.
Let be an event sequence such that and let be the sequence of synchronized events of the with .
Assuming .
, and , as for . ; . Based on Definition 16 then , which contradicts the initial condition.

4.5.1. Synchronous Product of st-IPNs

To find the global system st-IPN we have proposed the operation of synchronous product of st-IPNs defined as follows.

Definition 18 (synchronous product of st-IPNs). Given a set of st-IPNs, defined as (5), synchronous product of is the st-IPN defined as: where . is an ordinary PN, with set of places, set of system transitions, is the length set of places, number of subsystems, , , and is the initial marking. is the set of input to the system. is the set of output to the system. (labeling function that assigns an input symbol and a timer density function to each transition) and is defined as ; is isomorphic over .
is the system alphabet.
is a density function of transition firing time for each operation modes .

In order to verify that the structure of a st-IPN solves Problem 15, we define the conditions for a PN which is considered a deterministic generator and the conditions for a system to be considered deterministically identifiable.

Definition 19 (). Let be an alphabet and let be a language generated by alphabet; a PN is an for if given a set of inputs , PN generates a set of outputs such that the strings with a probability greeter than .

4.5.2. Assumptions of a System so that It Is a Deterministically Identifiable (DI) System

Definition 19 enables us to set the assumptions of a system for it can be a DI system.(i)Let and be a set of I/O observed signals of system; if given a set of inputs associated with a time , is always the same.(ii)Moreover, given two identical inputs , but with different length of time and two different projections of outputs, the system can be considered DI, since length of time defines the mark to reach.

Proposition 20. Let be a system observed language, at times ; if st-IPN is a PN obtained with proposed identification algorithm then st-IPN is a of .

Proof. A st-IPN is a of if and if , .
Let be a st-IPN obtained by proposed identification algorithm for .
Assuming that is a string such that but , where , then the postlanguage of is given by . Then given the event with at time , applying the Algorithm 1 a new event generates a transition with and the marking reached by : will be and as are signals vectors observed at time , (based on description of event generator, Section 4.1.3(a)), then then assumption is rejected.
Now, we assume that is a string such that but , where . Then the postlanguage of is given by . Then given the event with at time , the st-IPN evolves a state from to a transition with and the marking reached by : will be , where has not been observed and reaches states that have not been observed, but according to Algorithm 1 the construction of st-IPN is made based on reached state by the observed language, then and then assumption is rejected and therefore Proposition 20 is proved.

Moreover, reaches the mark if and if for some , then is a cycle (Proposition 12).

Definition 21. A , where is DI language if , where , such as then if they started from the same state .

In other words, a is DI language, if from a state two equal inputs are given, the output symbol is always the same.

Theorem 22. A language is DI language if and only if   an for .

Proof. Necessary condition: if is a DI language, then an .
A deterministic for is a st-IPN. is a DI language, if is a st-IPN (see Proposition 20).
Sufficient condition: if an then is a DI language.
Assume that the for is a st-IPN. Let be the generated language by . Let be a string such that . Then, given the events and such that with with if , in the event then , based on Definition 21   is a DI language and according to Proposition 13  ; therefore is a DI language.

If a st-IPN is a deterministic for , then st-IPN generates the same observed language; therefore Problem 15 is solved.

5. Application Example

Let us illustrate the approach with an example. The proposed example is the centralized air heating system (AHS) in Figure 5. The system includes three heating subsystems. Each heating subsystem has a fan creating an air flow that is heated with hot water. The water flow is controlled by pump-valve systems. Moreover, there is a central heater providing hot water to each heating subsystem and two valves ( and ) controlling the water flow through the whole system. The system can be split into five subsystems (1, 2, 3, 4, and 5). Subsystems 3, 4, and 5 are the local heaters, subsystem 2 is the distribution subsystem ( and ), and subsystem 1 is the main heating subsystem (heater, main pump (), and reflux valve ()). The heater works in three modes (0, 1, 2), each state is defined by the number of resistances it has activated, so, modes 0, 1, and 2 will represent no activation of resistance, activation of , and activation of both and .

The system is equipped with a set of sensors detailed in Table 1. Each subsystem includes a flow sensor () that measures the presence or absence of flow in the subsystem. Nevertheless, flow level is affected when other subsystems are activated. So, a software sensor () is designed in order to measure the deviation over a normal operation flow taking into account the activation of other subsystems.

The system also includes binary temperature sensors, pressure sensors, and a position sensor for valve .

is an I/O symbol based on Definition 3, where is a binary representation of , stands for the input symbol, is a binary representation of , stands for the output symbol, and stands the subsystem index. For example, is an I/O symbol in subsystem 1, whose values are , that is, .

5.1. System Operation

The system globally starts with the external event “Son”; the heating subsystems are locally started with events “," “,” and “”. These events are external events that change the controller strategy (See in Figure 6). Each combination of external events generates a system operation mode. When event “Son” is generated, the controller opens the reflux valve () and starts Pm (), the heater starts heating (temperature set point is and the water flows through the path Heater-Pm-VR-Heater. When any heating subsystem is activated (“,” “,” or “”), subsystem 1 closes reflux valve () and it changes the temperature set point to . If “” and (or) “” are started, then, the corresponding valves 1 and 2 of subsystem 2 are opened (). Moreover local valves and pumps start their work when necessary ( and ). Each heating subsystem is stopped with events “,” “,” and “” (). Then, the controller performs the closing actions; it closes the valves () and stops the pumps (). When every subsystem is completely stopped, the system can be globally stopped with event “Soff.” Then, the controller closes the reflux valve (), and stops the heater and Pm ().

In order to demonstrate the advantages of the proposed identification method, the system will be identified over two scenarios.

The first test presented is the stand alone operation of subsystem 3 and the second test presented simulates all the possible operation modes sequentially. The identification method is additive, so after the identification of a language associated with a particular mode, the identified nets learn new languages by adding them to the previous knowledge.

Because the system cannot work if Son is not activated, there are 9 possible operations modes: ; where , , and  ; so on. Moreover, each test includes the starting and stopping operations. Therefore testing of operation mode includes the sequence .

5.2. Identification of Scenario 1

The number of subsystems is 5 and the initial conditions are as follows.(i)Definition of initial state at :  :;;;;.(ii)Initial Output Function:;;;;. The input vector arranges the control commands based on system operation as described in Section 5.1. The output vector arranges the sensor readings as in Table 1.

Once the system starts with “Son”; the controller generates the control commands when the external inputs evolve. Then the system reacts and the event generator starts generating events. (Note that the system starts with operation mode , then it changes to , and, when “” is externally activated, it changes to .)

With each new event, Algorithm 1 creates new places and transitions. After a complete run of the system ( is generated), Algorithm 1 creates the st-IPNs for subsystems 1 and 3 showed in Figure 7(a).

At this identification stage, places and transitions are defined and the identified st-IPNs model the expected behavior of the system.

Initially, the system works in operation mode 0 () as Son is not active. Subsystem 1 starts at place . Activation of “” changes the operation mode to , the controller opens the valves (reflux and main), turns on the main pump, and sets the heater in mode 1 (). A state change () is generated, because water flows through the path Heater-Pm-VR-Heater () (as it is showed in Figure 5). After a time () temperature reaches (18°C); therefore, the system evolves to other state (). The controller starts the heating actions; it closes the reflux valve and sets heater in mode 2 (), the state change cannot be measured by absence of sensors that detect the operation of the reflux valve (). At this time “” is activated, so the system operation mode changes to ; then the controller turns on pump 3 and opens valve 3 () and subsystem 3 changes its state to because water is flowing through path Heater-Pm-P3-V3-Heater. After a time (), flow reaches its normal level () and subsystem 3 changes the state to . Subsystem 1 continues with the reflux valve closed (). It waits until reaching (28°C) and, therefore, it evolves to . Subsystem 3 remains in until “” is externally deactivated and is reached, then normal flow falls () and it changes to (). Now, the controller turns off pump 3 and closes valve 3 (). As a consequence, flow in the subsystem decreases (change to ) until flow level is low . This transition makes the subsystem return to its initial state (). Now, the controller opens the reflux valve and sets the heater in mode 1 (). Subsystem 1 holds the state () until temperature is below , (), after the controller turns off the main pump, it closes the main valve and sets the heater in mode 0 (). Subsystem 1 evolves to and after time () to . Slightly, temperature decreases until it reaches the initial state (). At the end, Son is deactivated () and the system operation mode changes to .

Nevertheless, the time array for each transition has only one value, so no probability distribution can be fit.

As stated in Section 4, at least 5 samples are needed to identify the distributions, so scenario 1 will be repeated until each distribution for each transition will be identified.

After 5 observations of a transition, Algorithm 3 is executed. This paper shows, as an example, the case with the greatest dispersion () as it is the worst case.Inputs: ; ; ; ; , .for ;. As , there is not enough data to fit the distribution.

After 20 observations, Algorithm 3 gives a positive result: , ; . As , Algorithm 4 is executed and it will compute the distribution type and its parameters.

. in table of Kolmogorov-Smirnov, with .Hypothesis test:;.Parameters estimation and based on test Kolmogorov-Smirnov.Accept since . Therefore the data is normally distributed.To do a full identification of the process it is necessary to get 20 samples of each transition time.

Figure 8 shows the mean of the time array collected for versus the number of acquired samples in operation mode 2. It can be seen that for 20 samples onwards the mean remains stable.

Thereby, the sequence of operation modes is fully identified. This identification includes the st-IPNs structures and all the density functions for each transition in the net.

5.3. Complete Identification of the System

The last test performed is the sequential generation of all the possible combinations of the external events, so the system must be able to generate a really complex language. Subsystem st-IPNs are showed in Figure 9.

Results show the advantages of identifying subsystems instead of the global system. Figure 7(b) shows the identification of the global system (without splitting in subsystems) in scenario 1. The complexity (number of places and transitions) of this st-IPN is similar to the complexity of the st-IPNs in the case of the subsystem identification. Nevertheless, as the complexity of the system increases, subsystem identification reduces complexity. Table 2 shows a comparison of the number of sates and transitions when identifying with or without splitting the system into subsystems.

It can be seen that a global system identification increases its complexity with the number of operation modes considered, as a subsystem identification does not.

This behavior is related to the number of possible events handled by the st-IPNs. The maximum number of global system states is , ( is the total number of output symbols) (6), whereas in identification by subsystem the maximum number of system states is ; that is, , where , it can be seen that .

5.4. Language Generation

This section shows how the observed language is generated in both subsystem and global identification approach for scenario 1.

The observed language for each subsystem is equal to the one generated by the st-IPNs:

at times .

at times .

Figure 7(b) shows the result of the identification process in case that the system is not split into subsystems. The language observed also is equal to the generated.

at times .

A global input symbol is

And a global output symbol is

Note that, for example, at time , the input symbol in the global language is and at time is (see Figure 7(b)). At these times, input symbols in local subsystems are and . In this time the output symbol in the global language is and at time is (see Figure 7(b)). Moreover, at these times output symbols in local subsystems are and . Table 3 shows these symbols.

According to Definition 16, a global I/O symbol, , is obtained from splice of for . Therefore the global and the subsystem approach represent the same language. Then, time synchronization splice of and gives .

Therefore, the advantage of using subsystems is that the method identifies only active languages, which allows determining which subsystem is operating.

5.5. Timed Transitions Identification in Different Operation Modes

Transition time depends on the operation mode of the system although the device maintains its operation sequence.

Table 4 shows the parameters of distribution for each operation mode. So, although subsystem 3 uses the same net for each operation mode (Figure 9) it uses a different distribution in each transition for each operation mode, so it can model complex behaviors.

Note that time changes can be notable with changes from 0.37 seconds to 2.75 seconds.

6. Conclusions

The proposed method identifies stochastic DES without previous model as a set of st-IPNs modelling the subsystems. The identification algorithm uses on-line I/O data to identify each subsystem, and it defines st-IPNs that generate the same language as the observed one. It has been proved that these st-IPNs are unique and deterministic, a desired feature for applications such as diagnosis. In addition the proposed method identifies systems with stochastic time. This is of great importance in industrial systems because processes are affected by a number of aspects such as the variability in the material flow, the fluctuation of power supply, the deterioration of machinery and devices, among others.

The identification methodology is incremental and is directly related to the objective of modeling, that is, modeling the devices that compose the system in a constructivist way. This approach allows identifying all observable behaviors of the devices.

Some of the main advantages of the methodology (with respect to the existing ones) are that it handles a lot of inputs and outputs with low computational cost (as they are split into subsystems) and it does not need a previous knowledge of the number of places neither the length of the language. Moreover, it generates reduced size models and it includes timed information. Besides, this proposal includes a process to determine whenever the system is fully identified which has not been proposed in other research.

Future research will deal with a forgetting algorithm in order to forget behaviors that will be not more in use.

Conflict of Interests

The authors declare that there is no conflict of interests regarding this paper.

Acknowledgment

This work was supported by a Grant from the Universidad del Cauca, reference 2.3-31.2/05 2011.