Abstract

The fly-by-wire system plays an important role in modern civil aircraft. As a typical safety-critical system, its reliability will affect the safety of aircraft significantly. In the paper, stochastic Petri nets are applied in the reliability modeling and analysis for the fly-by-wire system to represent its dynamic (time-dependent) failure behaviors. Stochastic Petri net-based reliability models are established for all kinds of architectures including series, parallel, -out-of-, warm standby, cold standby, and load-sharing architectures, which are commonly used in the fly-by-wire system. A Monte Carlo simulation method is proposed for the stochastic Petri net-based reliability models to generate system lifetime samples, and the system reliability parameters can be calculated in terms of the lifetime samples. Finally, a fly-by-wire system is used as a case study to illustrate the application and effectiveness of our proposed approaches. The results show that the error of the reliability value in a flight duration obtained by our Monte Carlo simulation method is less than compared with the analytical equation.

1. Introduction

The flight-control system is a typical safety-critical system whose reliability will affect the safety of aircraft significantly. The failure or malfunction of the flight-control system will lead to an unsafe flight path or structural failure preventing continued safe flight and landing, which are considered as catastrophic top level failure conditions of the aircraft. In the modern transport category airplanes, fly-by-wire systems have been widely used to replace hydromechanical ones. By utilizing the fly-by-wire system, pilots’ commands are converted to electronic signals transmitted by wires to flight-control computers, and control commands are calculated by flight-control computers based on control laws to determine the movements of the actuators at each control surface. Therefore, the mechanical circuits consisting of rods, cables, and pulleys are not required anymore, and the weight of the airplane can be reduced.

In order to improve the reliability of the fly-by-wire system, redundancy architectures including parallel, majority, standby, and load-sharing have been widely used in the design of the fly-by-wire system. As there are dynamic or state-dependent behaviors in the standby or load-sharing systems, the failure of the systems depends not only on the combinations of its component failures but also on the occurrence order of the component failures [1, 2]. At present, fault tree analysis (FTA), dependence diagram analysis (DDA), and Markov analysis (MA) are the most widely used tools for reliability modeling and safety analysis of airborne systems. FTA is a deductive failure analysis that focuses on one particular undesired event and provides a method for determining causes of this event. DDA, which is equivalent to reliability block diagram (RBD) in reliability engineering, provides an alternate pictorial representation of combinations of failures for the purpose of probability analysis. In MA, Markov chains are used to represent various system states and relationships among them. The states can be either operational or nonoperational. The transition rate from one state to another is a function of the failure rate or repair rate. The state probabilities are derived by solving a set of differential equations that are derived from the Markov chain [3, 4]. Among all the three methods, FTA and DDA are both static tools; they cannot capture the state-dependent behavior of system failure mechanisms [5]. Although MA can cope with state-dependent behaviors, it will be faced with the infamous state space explosion problem when the system is large and complex. What is more, the solution of the differential equations for the Markov chains is a cumbersome work and MA can only deal with the system whose components are following exponential distributions [6].

As a tool for discrete event system simulation, Petri nets have been widely applied in reliability engineering since three decades ago. One of the important applications is focusing on system reliability modeling by using Petri nets instead of traditional reliability tools [711]. Hura and Atwood [7] presented a method to represent fault trees with Petri nets, which they thought can provide more insight into failure behaviors. Malhotra and Trivedi [8] studied to construct the reliability models by using stochastic Petri nets and stochastic reward nets, and different kinds of repair scenarios are also considered in these models. Liu and Chiou [9] used Petri nets to denote different kinds of logic operations, and a trapezoidal graph method is applied to account for failure scenarios. Schneeweiss [10] developed the Petri net models for many reliability scenarios, and maintenance cost and benefit are considered in their research. Volovoi [11] applied aging tokens in the Petri net-based reliability model, and the advantages of their method have been illustrated by comparing with classical reliability tools. Katsigiannis et al. [12] presented a new methodology for reliability modeling-based fluid stochastic Petri nets for small isolated power systems. Robidoux et al. [13] presented an algorithm that automatically converts the RBD model into a colored Petri net, and a case study is used to illustrate the effectiveness of the method. Wu et al. [14] established the reliability model for a solar array mechanical system by using fault tree and fuzzy reasoning Petri nets, and their method can be applied to find the fault mechanisms. Chu et al. [15] built a reliability model for the jet pipe servo valve by using generalized stochastic Petri nets, and the effectiveness of their method is illustrated by comparing with the Markov model. In recent years, the application of Petri nets has been extended to many other fields of reliability and safety engineering, which include the reliability analysis of integrated modular avionics [16], the reliability modeling of multimission phased mission system [17], the formal model-based safety analysis [18], and the dependability analysis of safety-critical real-time systems [19].

The Petri nets have displayed a powerful ability in reliability and safety modeling; thus, a stochastic Petri net-based reliability model will be proposed for the fly-by-wire systems in this study. The rest of this paper is structured as follows. In Section 2, a brief description of the stochastic Petri net is presented. In Section 3, the Petri net-based reliability models are constructed for both static and dynamic architectures including series, parallel, -out-of-, warm standby, cold standby, and load-sharing architectures, and the substitution transitions are applied in the hierarchical reliability model to simplify complex net structures. In Section 4, a Monte Carlo simulation method is proposed for the stochastic Petri net-based reliability models to generate system lifetime samples, and the system reliability parameters can be calculated in terms of the lifetime samples. In Section 5, a fly-by-wire system is used as a case study to illustrate the application and effectiveness of our proposed reliability approaches. In Section 6, concluding remarks are presented.

2.1. Definitions of the Stochastic Petri Net-Based Reliability Model

According to the definition of the stochastic Petri net [20, 21], the stochastic Petri net-based reliability model can be defined as a 7-tuple such that (i) is the set of places which are used to denote the states of components or systems(ii) is the set of transitions including timed transitions and immediate transitions; the timed transitions are used to denote the failure process of each component and the immediate transitions are used to denote the relationship among different states(iii) is the set of arcs which build the links between places and transitions(iv) is the capacity function which indicates the number of states for a specific component or system(v) is the weight function of arcs; the weights contain the imformation about how a transition can be fired and how the state can be changed after the transition has been fired(vi) is the marking of the net, , and is the initial marking; markings reflect the states of all components and the system simultaneously(vii) is the set of firing rates. The firing rate of the timed transition is the failure rate of the corresponding component, while the firing rate of the immediate transition is an infinite value that means the transition will be fired instantaneously

2.2. Enabled and Fired of Transitions

, is called the preset of and is called the postset of such that

A transition is enabled if and only if

Equation (2) illustrates the prerequisite of the component or system state changing.

When a transition is enabled, it does not imply that it will be immediately fired. Among all the enabled transitions, only the transition that has the minimum firing time will be fired. If several transitions have identical minimum firing time, one of them will be selected randomly to be fired. After the transition is fired, the markings of the net will be changed according to the following rule:

Equation (3) illustrates how the states will change after the transitions have been fired.

3. Petri Net-Based Reliability Model of Typical Architectures for the Fly-By-Wire Systems

A fly-by-wire system usually consists of three subsystems, which are the sensor subsystem, the flight-control-computer subsystem, and the servo-control subsystem. The sensor subsystem usually has a majority architecture, such as -out-of- architecture, which means the subsystem will be failed at least of the total sensors or transducers are failed. The flight-control-computer subsystem usually applies standby architecture, especially the warm standby or hot standby (parallel) architectures. Hence, the uninterruptible command calculation and monitoring can still be provided when part of the redundant channels in the computer is failed. The servo-controls usually have a load-sharing architecture. When one of the redundant actuators is failed, the failure rate of other actuators will increase. To the overall flight-control system, it can be treated as a series system composed of sensing units, flight-control computers, and servo-controls. In this section, the reliability models of all the abovementioned architectures will be presented based on the stochastic Petri net.

3.1. Petri Net-Based Reliability Model of Static Architectures

Failures of the static system are completely decided by the combinations of its component failures. The reliability model of a static system can be expressed by either a dependence diagram (RBD) with series, parallel, and -out-of- architectures, or a fault tree with AND, OR, and Voting gates.

In this study, a unified Petri net-based reliability model is proposed to express all static architectures including the series, parallel (hot standby), and -out-of-architectures. To a system composed of components, the unified Petri net-based reliability model is shown in Figure 1.

In Figure 1, the place denotes the operating state of the th component, and their capacity functions are all equal to 1, namely,

When there is a token in , the component is in the operating state; otherwise, the component is in the failed state. The place denotes the failed state of the system, and its capacity function is decided by the system architecture. To the series architecture, the capacity function of is equal to 1, and the system will be failed when there is one token in . To the parallel architecture, the capacity function of is equal to , and the system will be failed when there are tokens in . To the -out-of- architecture, the capacity function of is equal to , and the system will be failed when there are tokens in . The capacity function of the three types of static architectures can be expressed as follows:

The timed transition denotes the failure of the th component, and its firing time equals the time to failure of the th component.

The weight of the arc originating from is equal to , and the weights of all other arcs are equal to 1.

3.2. Petri Net-Based Reliability Model of Standby Architectures

A standby system consists of an active component and one or more standby ones. A sensing and switching mechanism is used to detect failures of the active component and activate the standby ones immediately when a failure of the active one occurs. There are three types of standby architectures including hot standby, warm standby, and cold standby. The hot standby is just the parallel architecture, and the standby components have the same failure rate as if it was operating in the system. The standby component cannot fail in a cold standby system which is usually assumed for spare or shelf items, whereas the standby component has a lower hazard rate than the operating component in a warm standby system and this is usually a realistic assumption [22]. Additionally, the flight-control system does not have individual sensing and switching mechanisms, whose function will be fulfilled by the monitor module in each channel of the flight-control computers.

The stochastic Petri net-based reliability model of a warm standby system composed of components is expressed by Figure 2.

In Figure 2, , , and denote the active state, the failed state, and the standby state of the th component, respectively. Their capacity function is equal to 1. Hence,

When there is a token in , , or , it means the component is in the active, failed, or standby state, respectively. The place denotes the failed state of the system, and its capacity function is equal to , namely,

The system will be failed when there are tokens in .

The timed transition denotes the failure of the th component in its active state, and its firing time is also equal to the time to failure of the th component. The time transition denotes the failure of the th component in the standby state, and its firing time is equal to the time to failure of the th component in the standby state. Transitions to are immediate transitions. Each connects all by a dual-arrow arc, which means the failures of component 1 to component are the prerequisite to enable transition , and place will still hold a token after have been fired.

The weight of the arc originating from is equal to , and the weights of all other arcs are equal to 1. The arcs connecting places or with transitions are bidirectional. It means the number of tokens in these places will not change after the pertinent transitions have been fired. As the state of a failed component or system will not change in the flight duration.

The stochastic Petri net-based reliability model of a cold standby system composed of components is given in Figure 3, which is simplified based on Figure 2. In the cold standby system, the place is just the place .

3.3. Petri Net-Based Reliability of Load-Sharing Architectures

In a load-sharing architecture, there is a dependency between the components. If one component fails, the failure rate of the other components increases as the result of the additional load placed on it. The servo-control with a load-sharing architecture usually has two or three redundant components. The Petri net-based reliability model of a load-sharing architecture with three components can be expressed by Figure 4.

In Figure 4, denotes the operating state of the th component when all the three components are operating. denotes the operating state of the th component when one of the other two components is failed. denotes the operating state of the th component when the other two components are failed. And the capacity functions of , , and are equal to 1; we have

denotes the failed state of the load-sharing system. The load-sharing system will be failed if and only if all the three components are failed; therefore, there will be three tokens in when the system is failed. The capacity function of is equal to 3. Hence, we have

denotes the failure of the th component when all the three components are operating. denotes the failure of the th component when one of the other two components is failed. denotes the failure of the th component when the other two components are failed. The firing times of , , and are equal to the time to failure of the th component in the case that all the three components are operating, one of the other two components is failed, and the other two components are failed, respectively. to are all immediate transitions.

The weight of the arc originating from is equal to 3; the weights of the arcs between and , and , and and are all equal to 2; and the weights of all other arcs are equal to 1.

To a load-sharing architecture with two components, the Petri net-based reliability model can be simplified. Figure 5 illustrates the model of a load-sharing architecture with two components.

3.4. Petri Net-Based Hierarchical Reliability Model

Creating the Petri net-based reliability model of a large system can be a cumbersome task. A Petri net-based hierarchical reliability model is proposed to simplify the reliability modeling of large systems in this section, and the state space explosion problem can be avoided to some extent.

In our model, a transition is used to represent an entire piece of net architecture or a branch of the Petri net. Such a transition is a substitution transition. Thus, a large Petri net can be simplified by replacing its small pieces of net or branches by substitution transitions. In this study, we use a square to denote the substitution transition. When the substitution transition is used to represent the pieces of net or branch architectures, two immediate transitions should be added. One is connected to the place, and the other is connected to the places that have tokens in the initial markings. Take the static architecture given in Figure 1 as an example; the corresponding piece of the net architecture represented by the substitution transitions is shown in Figure 6.

4. Stochastic Petri Net-Based Monte Carlo Simulation for Reliability Evaluation

A Monte Carlo simulation method is proposed for the stochastic Petri net-based reliability model, and the lifetime samples can be obtained via the Monte Carlo simulation. In this way, the reliability parameters can be calculated in terms of the lifetime samples.

4.1. Procedure of Stochastic Petri Net-Based Monte Carlo Simulation

The input of our Monte Carlo simulation procedure includes as follows: (1)The input incidence matrix (2)The output incidence matrix (3)The initial marking (4)The capacity function of each place (5)The set of firing rate or the firing time distribution of each transition(6)The maximum number of Monte Carlo simulations

The flowchart of the Monte Carlo simulation procedure is given in Figure 7. We can get one sample of the system lifetime from one simulation.

The detailed procedure of the Monte Carlo simulation is as follows.

Step 1. Let and start the th simulation.

Step 2. Variable initialization. Let , , and for all . is the current marking of the model, is the current time, and is the firing time of the transition .

Step 3. Decide whether the th simulation can terminate. The th simulation will terminate when , namely, the number of tokens in reaches the value of its capacity function. If , the simulation will go to Step 10; otherwise, the simulation will go to Step 4.

Step 4. Determine the enabled transitions. A Boolean variable is used to denote whether the transition is enabled or not. When equals 1, is enabled; otherwise, is not enabled. We let for all . We can determine which transitions are enabled by Equation (2). If is enabled, let .

Step 5. Update the firing time for all transitions. For each nonenabled transition (), let its firing time equal 0 (). For each enabled transition (), if its original firing time is 0 (), a random variable will be generated as its new firing time according to its firing rate ; otherwise, its firing time will not be changed. Additionally, the random variable of the firing time that does not follow exponential distribution (the firing rate is not a constant value) will be generated in terms of their firing time distributions. And the random variable of the substitution transitions’ firing time can be obtained via the Monte Carlo simulation of the corresponding piece of net architecture.

Step 6. Determine the fired transition. Among all the enabled transitions, the transition that has the minimum firing time () will be fired. If several transitions have identical minimum firing time, we can select one of them randomly to be fired. A Boolean variable is used to denote whether the transition is enabled or not. If equals 1, is enabled; otherwise, is not enabled.

Step 7. Update the current time. The current time will be updated by , i.e., let .

Step 8. Update the firing time for all enabled transitions. The firing time of the enabled transition () will be updated by , i.e., let .

Step 9. Update the markings. The markings will be updated by the state function of the Petri net, i.e., let . The th element of is the Boolean variable . And go back to Step 2.

Step 10. Obtain a sample of system lifetime. The current time will be a sample of the system lifetime.

Step 11. Decide whether all simulations have been finished. If , it means all simulations have been finished, and the procedure will end; otherwise, we let , and the next simulation will start (go back to Step 2).

4.2. Reliability Parameter Calculation Based on Lifetime Samples

Let denote the ordered lifetime samples, i.e., ; the estimate of reliability can be expressed as follows:

In this way, we can get a corresponding reliability value for each lifetime sample . By treating the pairs of and as training samples, we can get the regression curve of reliability function by using the backpropagation (BP) neural network. The method we used here is a kind of nonparametric regression; we do not make the assumption that the lifetime follows a specific distribution such as exponential distribution, lognormal distribution, or Weibull distribution. This is reasonable that the lifetime of a complex system may not follow a specific distribution.

The BP neural network we used has three layers, which are the input layer, the hidden layer, and the output layer. Both the input layer and the output layer only have one unit (neuron); the input unit denotes the lifetime sample and the output unit denotes the corresponding estimate of reliability function . The neural network we used is shown in Figure 8.

According to the structure of the BP neural network, the reliability function can be expressed as follows: where and are weights, and we can get the optimal values of weights by backpropagation algorithm.

5. Case Study

In the case study, a rudder control system of a commercial aircraft is used to illustrate the application and effectiveness of our proposed approach.

5.1. System Description

The rudder control system also consists of the sensor subsystem, the flight-control-computer subsystem, and the servo-control subsystem. Figure 9 shows the architecture of the rudder control system.

The sensor subsystem consists of the pedal position transducers, the inertial measurement units (IMUs), and the rudder position transducers. The pedal position transducer subsystem has a 2-out-of-4 architecture, and both the IMU subsystem and the rudder position transducer subsystem have a 2-out-of-3 architecture, namely, the triple modular redundancy (TMR).

The flight-control-computer subsystem has three independent channels. Each channel has an input signal monitor, a set of control laws, and a cross-channel monitor. The three channels can become any one of these three channels: the command channel, the standby channel, and the monitor channel. The command channel transmits its command output signals to the servo-control subsystem. It also has a channel monitoring function to find and isolate failures in the standby and monitor channels. The standby channel transmits test data only, and it becomes the command channel and transmits its command output signals to the servo-control subsystem when a failure causes the command channel to shut down. It also performs monitoring functions to find and isolate failures in the command and monitor channels. The monitor channel also transmits test data only. Its command output signals are used internally to find failures in the command and standby channels. The monitor channel becomes the standby channel when the command channel shuts down. The reliability model of the flight-control-computer subsystem has the characteristics of both the standby and the 2-out-of-3 architectures. The command channel is the active state; both the standby and monitor channels are in the warm standby state. The subsystem will be failed if and only if at least two of the three channels are failed.

The hydraulic actuation is achieved by the servo-control subsystem composed of two electrohydraulic servo-controls for the rudder. Each servo-control has two kinds of operation modes, which are the active mode (state) and the damping mode. Both the servo-controls are in the active mode initially, and they have identical failure rate in this situation. When one of the servo-controls is failed, the failed servo-control will be in the damping mode and the other one will be in the active mode. And the failure rate of the servo-control in the active mode will increase.

5.2. Petri Net-Based System Reliability Model

The Petri net-based reliability model of the pedal position transducer subsystem can be expressed as Figure 10(a), and both the IMU subsystem and the rudder position transducer subsystem can be expressed as Figure 10(b).

The Petri net-based reliability model of the flight-control-computer subsystem is shown in Figure 11.

The Petri net-based reliability model of the servo-control subsystem is shown in Figure 12.

The Petri net-based reliability model of the rudder control system is given in Figure 13.

In Figure 13, the place to denote the operating state of the pedal position transducer subsystem, the IMU subsystem, the rudder position transducer subsystem, the flight-control-computer subsystem, and the servo-control subsystem, respectively. to are all substitution transitions which are used to denote the failure of the abovementioned subsystems.

5.3. Results and Discussion

The component failure rates of the rudder control system are given in Table 1.

In Table 1, , , , , and are the failure rates in the active state of pedal position transducers, IMUs, rudder position transducers, flight-control-computer channels, and servo-controls, respectively. is the failure rate of the flight-control-computer channels operating as the standby or monitor channels. is the increased failure rate the sever-control when the other one is failed. We let , and 1000 lifetime samples of the system can be obtained. The reliability curve calculated by Equation (10) is shown in Figure 14. The analytical reliability function and the regression curve of the reliability function obtained by the BP neural network are also shown in Figure 13. The BP neural network used in this case study has ten neurons in the hidden layer.

The analytical expression of the system reliability function can be expressed as follows: where , , , , and are the reliability function of the pedal position transducer subsystem, the IMU subsystem, the rudder position transducer subsystem, the flight-control-computer subsystem, and the servo-control subsystem, respectively.

, , and can be calculated by RBD, and and can be calculated by the Markov process as all failure rates are constant values. They can be expressed as follows: where is the combination formula which can be expressed as follows:

Figure 15 shows the error between the analytical reliability function and the regression reliability function. And the maximum value of the error is 0.0174 when the lifetime is 723154 hours.

To the system of a civil aircraft, we usually care about the reliability in a flight duration which varies from a few hours to a dozen hours. We set the flight duration as 15 hours arbitrarily; the results obtained by both the analytical method and our Monte Carlo simulation-based regression method show that the reliability of the fly-by-wire system in a flight duration is greater than 99.99%.

Figure 16 shows the error between the analytical reliability function and the regression reliability function in the flight duration. The maximum error is and the minimum error is which illustrate that the error in a flight duration is negligible, although the error is greater than 0.01 in the whole life as shown in Figure 14.

6. Conclusion

In this study, stochastic Petri net-based reliability models are established for series, parallel, -out-of-, warm standby, cold standby, and load-sharing architectures, which are commonly used in the fly-by-wire system. And a reliability evaluation approach is proposed by using Monte Carlo simulation in terms of the stochastic Petri net-based reliability models.

Compared with the traditional reliability modeling tools such as RBD (DDA), FTA, and MA, our proposed approaches have the following advantages: (1)The time-dependent failure characteristics can be expressed by stochastic Petri net-based reliability models, and the model can be simplified by using the substitution transitions to construct a hierarchical reliability model. In this way, the state space explosion of the Markov model can be reduced to some extent(2)Our stochastic Petri net-based Monte Carlo simulation is suitable for all kinds of lifetime distributions; the limits of the Markov model that all component lifetimes should follow exponential distributions can be overcome(3)By using our proposed Monte Carlo simulation method, the cumbersome work of solving the differential equations for the Markov chains can be avoided. Meanwhile, the error of the reliability value in a flight duration obtained by our simulation method is less than that is negligible

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The authors wish to appreciate the support from the National Natural Science Foundation of China (U1733124) and the Aeronautical Science Foundation of China (20180252002).