System Reliability Assessment Based on Failure Propagation Processes

Lin, Shuai; Wang, Yanhui; Jia, Limin

doi:https://doi.org/10.1155/2018/9502953

Complexity


Research Article | Open Access

Volume 2018 | Article ID 9502953 | https://doi.org/10.1155/2018/9502953


Shuai Lin,^1,2 Yanhui Wang,^1,3 and Limin Jia^1,3

Academic Editor: Michele Scarpiniti

Received 27 Sept 2017

Accepted 28 Feb 2018

Published 26 Jun 2018

Abstract

The failure of one or several components may cause related components to malfunction and ultimately reduce system reliability. Based on this, we focus on assessing the system reliability of complex electromechanical systems (CEMSs) from a fault-propagation view. First, a failure propagation model that takes failure data into consideration is proposed for system reliability evaluation, based on network theory and improved polychromatic sets. From the node point of view, a system effectiveness index is constructed to investigate the variation in efficiency of the holistic network. Subsequently, from the system’s perspective, a system reliability measurement is provided and estimated by combining the system effectiveness index with the failure propagation model. Finally, the proposed method is applied to the bogie system of a high-speed train to assess its system reliability, which also illustrates the effectiveness of the method.

1. Introduction

A complex electromechanical system (CEMS) is defined as a set of interconnected components that work together to complete a predetermined mission (Wang et al., 2017). Typical CEMSs include high-speed trains, aircraft, and nuclear equipment. A CEMS generally has a higher reliability demand than a simple system in order to ensure safety, owing to its high complexity and maintenance costs. However, with traditional methods of reliability analysis it is usually difficult to assess the reliability of the holistic system in practical operation, for reasons such as the nonlinear coupling among components, the complexity of the fault propagation mechanism, and the diversity of influencing factors. Hence, a novel approach to system reliability assessment is needed to ensure the safe operation of CEMSs.

1.1. Literature Review

Research on the complexity of CEMSs [1] mainly addresses complex structure [2] and complex multifunction [3]. Correspondingly, system reliability is considered from two aspects: function and topology.

In terms of function, reliability is defined as the ability or capability of a product to perform a specified function in a designated environment for a minimum number of events or a minimum length of time [4], and it has long been a vital topic in systems engineering. Based on this definition, the last few decades have seen a steady move towards the systematic use of reliability theory and historical failure data to evaluate and further improve system reliability. These methods include, but are not limited to, fault tree analysis (FTA), reliability block diagrams (RBD), binary decision diagrams (BDD), dynamic fault trees (DFT), Markov models, Petri nets, and Bayesian methods (e.g., [5–17]). However, inherent limitations of these approaches hinder their application to CEMSs. For example, some methods for modeling system reliability rely on the assumptions that a component has only two states (functioning and malfunctioning) and that failures are independent. Industrial experience has shown that these assumptions are unrealistic and may lead to unacceptable analysis errors [18]. Furthermore, these methods do not take into account the specific physical structure of the entire system or the impact of the failure propagation mechanism among components.

In the meantime, the development of network theory, mostly over the last decade, has provided an increasingly challenging reliability framework for characterizing CEMSs. A network can be regarded as an abstract representation of system structure, in which the components are described as nodes and the interactions among components are represented as edges. System reliability evaluation is then equivalent to the assessment of network reliability. Network reliability is concerned with the ability of a network to carry out a desired operation such as “communication.” Based on this definition, network reliability measures can be categorized as follows:

(i) Terminal reliability [19]. It is defined as the probability of achieving connectivity from the input nodes to the output nodes and usually includes two-terminal reliability [20], K-terminal reliability [21], and all-terminal reliability [22]. Unfortunately, combinatorial explosion is the main problem when this measure is applied to a CEMS.

(ii) Percolation reliability [23]. It addresses questions of practical interest from a system view, such as “how many failed nodes will break down the whole network.” Percolation reliability is constructed according to a percolation process, and the critical percolation threshold is used as the network failure criterion. It attempts to overcome the combinatorial explosion problem. However, it disregards the coupling relationships among nodes and the failure propagation mechanism, even though node breakdowns are not independent.

(iii) Efficiency reliability. It reveals how fault tolerant the system is, that is, how efficient the communication among nodes remains when some of the nodes are faulty [24, 25]. The global efficiency [26], reliability efficiency [27], and improved reliability efficiency [28] are the more common efficiency reliability indicators.

The biggest advantage of efficiency reliability is that the connectivity of the network is taken into account as a whole. However, the influences of failure propagation among nodes and of node and edge properties on system reliability are still not considered.
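To make the efficiency-reliability idea concrete, the following is a minimal sketch of the standard global efficiency (the average of inverse shortest-path hop counts over ordered node pairs) and of how it drops when a node is removed. The graph, the node names, and the function names are illustrative assumptions and are not taken from the paper; unreachable pairs are assumed to contribute zero, and every node is assumed to appear as a key of the adjacency dictionary.

```python
from collections import deque


def shortest_hops(adj, source):
    """Breadth-first search: hop counts from `source` in a directed graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, []):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def global_efficiency(adj, failed=frozenset()):
    """Mean of 1/d(i, j) over ordered pairs of the ORIGINAL node set.

    Failed nodes are removed before distances are computed, so pairs that
    involve them (or that become unreachable) contribute 0.
    """
    n = len(adj)
    alive = {
        u: [v for v in succ if v not in failed]
        for u, succ in adj.items()
        if u not in failed
    }
    total = 0.0
    for u in alive:
        for v, d in shortest_hops(alive, u).items():
            if v != u:
                total += 1.0 / d
    return total / (n * (n - 1))


# Hypothetical 4-node directed ring: a -> b -> c -> d -> a
ring = {"a": ["b"], "b": ["c"], "c": ["d"], "d": ["a"]}
intact = global_efficiency(ring)
degraded = global_efficiency(ring, failed={"c"})
print(intact, degraded)
```

Comparing the two values shows the fault-tolerance reading of the measure. It also shows the limitation noted above: the removed node is simply deleted, and no failure spreads from it to its neighbours.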

As mentioned above, each type of measure has its own strengths and weaknesses (see Table 1), which need to be carefully considered when the measure is applied to actual systems, especially the network of a CEMS. Specifically, there are three reasons.

First, the properties of nodes and edges, such as failure rate, reliability, and degree centrality (DC), are ignored. Unlike in traditional network systems, both the nodes and the edges in the network of a CEMS represent components and have their own attributes. Moreover, these attributes have a critical impact on system reliability: system reliability is determined by the properties of components and their emergent behaviors. The properties of nodes and edges are therefore necessary for system reliability estimation.

Secondly, failure propagation caused by the coupling relationships among nodes is not considered. These relationships may cause a failure to propagate from one failed node to others, which decreases system reliability. In fact, the failure of a single node or of very few nodes can trigger failure propagation that disables almost the whole network. Most studies, however, focus on one or several nodes that fail independently, and failure propagation is usually ignored when system reliability is evaluated.

Thirdly, the edges serve as the medium that makes failure propagation possible, and their attributes have a great effect on the strength and depth of failure spread. The approaches detailed above explore the connectivity reliability of networks but miss the influence of failure spread.

The above analysis shows that failure propagation is an indispensable part of system reliability estimation. The problem of failure propagation in networks is not a new one. Numerous methodologies and models have been developed to describe, predict, and prevent failures or faults. They include classical probability models (Luo et al., 2009), Markovian models (Weber and Jouffe, 2006), Poisson models (Ren and Dobson, 2008), Bayesian models (Marquez et al., 2010), and Monte Carlo models (Lehmann and Bernasconi, 2010). However, these models and methods have very limited application to actual systems, especially CEMSs. As structure and integration have progressed, systems have become more and more complex, and the assumption of independent failures has proved unrealistic and has led to unacceptable analysis errors (Liu and An, 2014).

Subsequently, with the development of network theory, several failure propagation models were proposed based on the small-world property. These models commonly focus on the so-called most probable propagation path. However, in an actual system a failure may spread from one failed node along multiple paths simultaneously. Multiple nodes may also fail at the same time and trigger several paths. Moreover, if a node fails, the failure (1) gradually spreads to various other nodes because of the complexity of the propagation mechanism but (2) does not spread to all other nodes because of redundant structure. The propagation distance of each path also differs. In addition, the previously proposed approaches focus mainly on the propagation path in the topological sense and omit the effects of functional attributes. This is clearly not entirely satisfactory for the network of a CEMS. Therefore, it is vital for the analysis of system reliability to find all probable failure paths and their occurrence probabilities.

The remainder of this paper is organized as follows. Section 2 introduces brief definitions and notations of network construction and polychromatic sets, together with their improved versions. In Section 3, the failure propagation model is proposed. Based on this, Section 4 defines the function-path length and then provides the system reliability model. Section 5 presents our computational results for the bogie system based on the proposed method. Conclusions and future research are discussed in Section 7.

1.2. Contribution

In this paper, we propose a new method to evaluate system reliability from the fault propagation perspective. Compared with existing methods, the proposed method makes the following central contributions:

(i) The influence of failure propagation is considered in system reliability estimation. The description of failure propagation in the proposed method complies well with the process of system failure, and system failure reflects the changes in reliability.

(ii) Both the topology and the function of the system are comprehensively analyzed. By contrast, traditional reliability analysis ignores the influence of topology, and terminal reliability misses the effect of function.

(iii) System reliability is estimated from a system view. The proposed method explores system reliability according to failure propagation paths and system effectiveness, both of which are global variables.

2. Preliminary

2.1. Improved Network Representation

Network theory underpins research on system reliability because it offers a tool that reflects real information about system topology and structure. It also provides a natural framework for the mathematical representation of system topology. In most research, a CEMS is reduced to a set of nodes connected through directed edges (Wang et al., 2017). Previous studies define a CEMS as a directed network that consists of a set of nodes (vertices) and a set of edges (links) that connect some of the nodes. Figure 1 shows the network of the suspension system of a bogie. Each component is a single node, whereas an inherent coupling relationship between two components (i.e., at least one physical connection routed directly from one component to the other) is represented by a directed link. Through a cooperation project with China XXX Railway Vehicles Co. Ltd. (under the National High Technology Research and Development Program, 863 Program, No. 2012AA112001), the physical connections are divided into three classes: mechanical, electrical, and information connections. The direction of the edges of each type is fixed (Wang et al., 2017). Table 2 shows the direction of the different edges.

Unfortunately, the properties of edges and nodes are not embodied in the existing network model. These properties are indispensable for completely reflecting the structure and function of the whole system. For the CEMS, the properties of nodes and edges are selected in view of the 863 Program and the professional experience of field experts (see Figure 2).

Therefore, the improved network model in (1) is proposed. It comprises the set of nodes; the set of edges; the node-node adjacency matrix of components and connections, whose elements represent directed edges with Boolean magnitude; the number of nodes in the network; and the set of node properties, whose measures are given mathematically in Table 3.

The last element of (1) is the set of edge attributes; the specific formulations of its measures are listed in Table 4.
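As an illustration of how such an attributed network might be held in code, the sketch below bundles the node set, the typed directed edges, the Boolean adjacency matrix, and per-node and per-edge property dictionaries. The class name, field names, component names, and the numeric property values are all hypothetical; the paper's actual measures are those of Tables 3 and 4, which are not restated here.

```python
from dataclasses import dataclass, field

EDGE_KINDS = {"mechanical", "electrical", "information"}


@dataclass
class ComponentNetwork:
    """Directed component network with attributes on nodes and edges."""

    nodes: list                      # component names, one node each
    edges: list                      # (source, target, kind) triples
    node_props: dict = field(default_factory=dict)   # name -> {measure: value}
    edge_props: dict = field(default_factory=dict)   # (source, target) -> {measure: value}

    def __post_init__(self):
        known = set(self.nodes)
        for src, dst, kind in self.edges:
            if src not in known or dst not in known:
                raise ValueError(f"edge ({src}, {dst}) uses an unknown node")
            if kind not in EDGE_KINDS:
                raise ValueError(f"unknown connection type: {kind}")

    def adjacency(self):
        """Node-node adjacency matrix with Boolean (0/1) entries."""
        index = {name: i for i, name in enumerate(self.nodes)}
        size = len(self.nodes)
        matrix = [[0] * size for _ in range(size)]
        for src, dst, _kind in self.edges:
            matrix[index[src]][index[dst]] = 1
        return matrix


# Hypothetical three-component fragment; the values are placeholders, not data.
net = ComponentNetwork(
    nodes=["frame", "spring", "damper"],
    edges=[
        ("frame", "spring", "mechanical"),
        ("frame", "damper", "mechanical"),
        ("spring", "damper", "mechanical"),
    ],
    node_props={"frame": {"failure_rate": 0.001}},
    edge_props={("frame", "spring"): {"propagation_probability": 0.2}},
)
print(net.adjacency())
```

Keeping the properties next to the topology is the point of the improved model: later steps can read both the adjacency matrix and the functional measures from one object.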

2.2. Improved Polychromatic Sets

Polychromatic set theory is a recently established system theory (Chaudhry et al., 2000; Li and Da, 2003). Its key idea is to use a standardized mathematical model to simulate different objects. The theory has a significant advantage in set operations and has been considered a contribution to the theoretical development of systems theory. In a conventional set, the elements carry only their names, even though the elements themselves may differ, and names cannot represent all the other characteristics of each element. In a polychromatic set, by contrast, both the elements and the set as a whole can be pigmented with different colors to represent the research object as well as the properties of its elements. Li et al. (2003, 2006) provide a more detailed description; only the important definitions are presented here for the sake of completeness.

Assume that the composition of a polychromatic set is . The color set of every element is where corresponds to every element , and denotes the th individual color of element .

The color set of the whole set is defined as where corresponds to the entirety of , and represents the th unified color of the entirety of .

The relationship between each element and unified color can be represented using the following Boolean matrix, in which , if and

Let each element be a node, and let the colors of an element represent the attributes of that node. We can then use a polychromatic set to describe the properties of components and their relationships. Note, however, that each entry of the Boolean matrix is 0 or 1 in polychromatic set theory, whereas the values of attributes in the CEMS, such as DC, CC, BC, and the probability of failure, are not binary. Hence, we extend the definition of the matrix entries and improve (4) so that each entry expresses the relationship between an element color and a unified color and carries the value of the individual color and its probability value.
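The difference between the Boolean matrix and its extended form can be sketched as follows. This is only an illustration under stated assumptions: the attribute names, the node names, and the numeric values are invented, and an absent attribute is assumed to map to 0 in both matrices.

```python
def boolean_matrix(element_colors, unified_colors):
    """Classic polychromatic-set matrix: 1 when an element carries a color."""
    return [
        [1 if color in colors else 0 for color in unified_colors]
        for colors in element_colors.values()
    ]


def valued_matrix(element_values, unified_colors):
    """Extended matrix: each entry stores the attribute value (0.0 if absent)."""
    return [
        [values.get(color, 0.0) for color in unified_colors]
        for values in element_values.values()
    ]


# Hypothetical attributes ("colors") and values for two nodes.
unified = ["DC", "BC", "failure_probability"]
element_values = {
    "node_a": {"DC": 0.60, "failure_probability": 0.02},
    "node_b": {"DC": 0.10, "BC": 0.30},
}
element_colors = {name: set(vals) for name, vals in element_values.items()}

print(boolean_matrix(element_colors, unified))  # membership only
print(valued_matrix(element_values, unified))   # membership plus magnitude
```

The Boolean form records only whether a node has an attribute; the valued form keeps the magnitude that the later propagation and path-length calculations need.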

2.3. Basic Assumptions of the Models

Reliability evaluation of the CEMS under various operating conditions is a quite complicated issue. To deal with this complexity, the models proposed in this paper are built on the following assumptions:

(i) System failure is caused by node malfunction.

(ii) Edges can help a failure spread but cannot cause a failure.

(iii) Failed nodes cannot fail again before maintenance.

(iv) The different failure modes of the same component are independent.

3. Failure Propagation Model

In this section, the failure propagation model is proposed to obtain all possible propagation paths and their occurrence probability. All these are an extremely important foundation of system reliability assessment.

3.1. Correlation Matrix of Failure Modes

The failure modes of components, to some extent, reveal the degree of component failure. Serious failure mode of the component will increase the fault pervasion intensity (Shu et al., 2016). Indeed, there is a correlation between different failure modes of different components. Through communicating with experts and consulting the relevant literature, the correlations of failure modes for different components are listed in Table 5.

We can derive the correlation matrix of failure modes among different nodes as follows, where is the correlation matrix of failure mode between two nodes and . where is the possibility of the th failure mode of node , which is caused by the th failure mode of node . And the value of is shown in Table 5. denotes the th failure mode of node .

3.2. Failure Propagation Model

In the previous study, the fault pervasion intensity [29] is defined and described as the process of failure propagation for a single node in the traditional network according to the grade-diffusing process. where is the fault pervasion intensity from node to in the th step. and are the weight of the propagation probability and DC, respectively. The propagation probability from node to , which is directly caused by the th failure mode of node , is . If there is no connection between nodes, is 0. represents the set of nodes, which fail in theth step of failure propagation. is the DC of the th node. is the cluster coefficient.

However, (9) cannot be applied directly to the CEMS. Unlike in traditional networks, the fault pervasion intensity relates not only to the fault propagation probability of edges and the failure probability of nodes but also to the comprehensive importance and failure modes of nodes. This follows from two facts. (1) The failure of critical components has a great effect on the inherent topology of the system and on the normal functioning of the whole system, and it can, to some extent, increase the risk of failure propagation. (2) Exploratory analysis of the failure data shows that the different failure modes of a component represent different degrees of performance degradation, and a severe failure mode increases the degree or intensity of failure propagation. Therefore, we improve the calculation formula for fault pervasion intensity in (9) to obtain (10). The improved formula additionally involves the failure probability of each node at each propagation step, the comprehensive importance (CI) measure (Wang et al., 2017), and the probability of the most likely failure mode of the node at that step, combined through weights.

However, (10) still describes the failure propagation process of a single node. For the CEMS, propagation paths are diverse and complex because of randomness and uncertainty. In other words, multiple nodes may fail simultaneously and cause multiple propagation paths. Therefore, a failure propagation model at the system level is proposed.

First, we define two kinds of operators:

(1) Corresponding multiplication operator .

If and is -dimensional column vector, then .

(2) Compact multiplication operator .

If and is -dimensional row vector, then .

According to (6) and (10), the failure propagation model, after the -steps fault pervasion, is where where denotes the set of failure paths after the th step of failure propagation. is the state of nodes in the th paths after the th step of failure propagation. represents the state of failure nodes in the th paths in the th step of failure propagation. is the set of failure nodes in the th paths in the th step of failure propagation. is the comprehensive importance measure of failure nodes in the th paths in the th step of failure propagation. is the most likely failure modes in the th paths in the th step of failure propagation. denotes failure node number in theth paths after theth step of failure propagation. is the th failure mode of node in the th step of failure propagation.

From the energy point of view, energy accumulates constantly within a component, and the energy density increases continuously before the component fails. A fault occurs if the accumulated energy exceeds the maximum capacity of the component. Hence, the following constraints have to be satisfied for (11):

(1) The fault pervasion intensity between components decreases by orders of magnitude as the propagation path length increases. If the fault pervasion intensity is lower than 10⁻⁸, the node is in a secure state; in other words, the failure does not spread any further.

(2) If , then the fault propagation stops.

From (11), , which is the set of nodes in the th path, and , which is the occurrence probability of the th propagation path, play an important role for system reliability assessment. In fact, is the th failure propagation path.
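The idea of collecting every propagation path, rather than only the most probable one, and of cutting a path off once its intensity falls below 10⁻⁸ can be sketched as below. This is a simplified stand-in and not the model in (11): it assumes that a per-edge intensity between 0 and 1 has already been computed, that the intensity of a path is the product of its edge intensities, and that a failed node is never revisited (in line with assumption (iii) of Section 2.3). The function name, node labels, and numbers are hypothetical.

```python
THRESHOLD = 1e-8  # below this intensity a node is treated as secure


def propagation_paths(intensity, source, threshold=THRESHOLD):
    """Enumerate maximal failure paths from `source` with their strength.

    `intensity` maps node -> {successor: edge intensity in (0, 1]}.
    Path strength is ASSUMED to be the product of edge intensities.
    """
    results = []

    def extend(path, strength):
        extended = False
        for nxt, weight in intensity.get(path[-1], {}).items():
            candidate = strength * weight
            # Skip nodes that already failed and spreads that are too weak.
            if nxt in path or candidate < threshold:
                continue
            extended = True
            extend(path + [nxt], candidate)
        if not extended and len(path) > 1:
            results.append((path, strength))

    extend([source], 1.0)
    return results


# Hypothetical edge intensities for a four-node fragment.
example = {
    "A": {"B": 0.5, "C": 1e-3},
    "B": {"C": 0.2},
    "C": {"D": 1e-6},
}
paths = propagation_paths(example, "A")
for path, strength in paths:
    print(" -> ".join(path), strength)
```

In this toy case the spread from C to D survives on the strong branch through B but is cut off on the weak direct branch, which mirrors the statement that longer or weaker paths stop propagating once the intensity becomes negligible.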

4. System Reliability Evaluation

In this section, we illustrate how to calculate theoretically the system reliability from failure propagation mechanism point of view. First, system effectiveness measure is proposed to analyze reliability for a node failure based on the function-path length. Then, system reliability is provided in view of the system effectiveness measure and network theory.

4.1. The Function-Path Length

From the viewpoint of network topology, the topology-path length is the number of edges on a path between two vertices (the so-called path length in the previous literature). In essence, it indicates the physical distance between two generic nodes. However, the network of a CEMS differs from general complex networks such as small-world, random, and scale-free networks. Its nodes and edges correspond to components of the actual system and therefore have multiple properties, both topological and functional. Moreover, the path length should be able to characterize the distance of failure propagation paths. The traditional definition of path length is therefore ill-suited to reliability analysis of the CEMS network, and the function-path length is proposed by combining data-based functional properties with network-based topological attributes.

The function-path length is defined as the distance of failure propagation between two nodes. It relates to the topology-path length and to the properties of the nodes and edges (see Figure 2) on the path. Figure 3 shows the basic idea of its calculation. The whole process consists of three stages: (1) measures of the same type for the nodes or edges on the path are fused using a fuzzy integral; (2) the fused measures that belong to the same property are then integrated; (3) all properties are aggregated to obtain the function-path length.

Mathematically, the function-path length between nodes and is defined as where is the topology-path length. is the integrated value of all topological properties of nodes in this path, where represents the th measure of the th node in this path, is the weight of all measures, which belong to topological properties of nodes, and. is the integrated value of all functional properties of nodes in this path, where represents the th measure of the th node in this path, is the weight of all measures belong to functional properties of nodes and. is the integrated value of all functional properties of edges in this path, where is the th measure of the edges in this path, and is the weight of all measures belong to functional properties of edges.

Correspondingly, the shortest function-path length is where is the number of the function-path between node and .

4.2. System Reliability Measurement

Most previous studies compute the efficiency measure from the topology-path length, which is not applicable to the CEMS. For this reason, we improve the global efficiency and construct a system effectiveness (SE) measure based on the shortest function-path length.

Because failure propagation is complex and uncertain, multiple propagation paths may exist. The SE measure alone is therefore not suitable for a CEMS with a complicated propagation mechanism, since it ignores the possibility of, and the relationships among, multiple propagation paths. Hence, a novel system reliability measurement is defined. It combines the SE value obtained when a given node fails and triggers a given failure path with the occurrence probability of that path, obtained from (11), over the set of nodes that fail in the initial state, with a weight assigned to each failure path.
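A minimal sketch of an efficiency-style measure built on function-path lengths is given below. It assumes that SE keeps the form of the standard global efficiency, with the hop count replaced by the shortest function-path length, and that the shortest length is the minimum over all candidate paths between two nodes. That form is an assumption made for illustration, and the function name, node labels, and lengths are invented.

```python
def system_effectiveness(candidate_lengths, n_nodes):
    """Efficiency-style measure using the shortest function-path length.

    `candidate_lengths` maps an ordered node pair (i, j) to the list of
    function-path lengths of every known path from i to j. Pairs that are
    missing or have no path contribute 0.
    """
    total = 0.0
    for (i, j), lengths in candidate_lengths.items():
        if i != j and lengths:
            total += 1.0 / min(lengths)  # shortest function-path length
    return total / (n_nodes * (n_nodes - 1))


# Hypothetical function-path lengths for a three-node network.
lengths = {
    ("a", "b"): [2.0, 4.0],
    ("b", "c"): [1.0],
    ("a", "c"): [3.0, 5.0],
}
print(system_effectiveness(lengths, n_nodes=3))
```

Because the lengths carry functional properties as well as topology, two networks with identical wiring can yield different values under this reading, which is the motivation the section gives for moving away from hop counts.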

5. Case Study

Throughout the world, high-speed railway offers a fast and comfortable transportation mode with a high carrying capacity [30]. The high-speed train (HST) system, as an essential component of high-speed railway, is the main carrier for transporting passengers from one place to another. To illustrate the method described in Sections 3 and 4, we present a case study of the bogie system. The bogie system, a critical component of the HST system, plays a fundamental role in both improving passenger comfort and maintaining system safety. Figure 4 shows the bogie system of China Railway High-speed X (CRHX), which is a type of HST system. It has been under investigation for many years with the aim of increasing the reliability and safety of the HST system. In particular, understanding its reliability is an important basis for improving design and for finding cost-effective ways to protect system safety.

5.1. Data Analysis

The bogie system consists of interacting elements that give rise to organization without any external organizing principle being applied. These components, including the bogie frame, brake caliper, brake lining, and gearbox (see Table 6), usually interact through the mechanical, electrical, and information connections between them.

In terms of components and their connections, the bogie system is modeled as a directed network that consists of 33 nodes and a series of edges connecting some of the components, as shown in Figure 5. The mathematical expression of the network for the bogie system is as below:

The nodes in Figure 5 are in one-to-one correspondence with the components in Table 6. In addition, the directions of the edges for mechanical, electrical, and information connections (Wang et al., 2017) are fixed, as listed in Table 2.

Based on (17), the topological properties of nodes, such as DC, BC, and CC, could be easily observed. Figures 6(a)–6(c) plot the DC, BC, and CC, respectively. The results show that node , on average, is the most critical component in topology. It should not be surprising due to its “core status.” Indeed, about 60.6 percent of components are directly installed on bogie frame (node ) in order to support the train. Perhaps the importance of node is self-evident from the topological point of view. However, an interesting observation against the failure data is that the critical nodes, such as bogie frame (node ), in topology achieve high reliability. These components are not more prone to failure, but once they fail, the consequences are disastrous.

Furthermore, Figure 6(d) shows the comprehensive importance (CI) of all nodes for the purpose of comparison. One striking result is that the most influential component under CI differs from the one identified by the topological measures. The reason is that the CI measure considers the effects on node importance comprehensively, whereas the topological properties of nodes concern node importance in topology only. The CI measure is therefore more applicable to the HST system, since the influence of human factors and uncertainty can be effectively reduced, and we select it for use in the system reliability evaluation.

The properties of nodes and edges include topological and functional attributes. The topological properties (see Figure 6) can be derived from the network model in (17), and the functional attributes can be collected from historical failure data. Functional properties are the data basis for the analysis of system reliability. Through a project (863 Program, number 2012AA112001), the historical failure databases of the CRHX bogie system for 2011–2015 were provided; they are essential for investigating system reliability. Each failure data record contains the failure ID number, the vehicle ID number, the section of failure, the failure mode, the date of failure, the environment of failure, and so on. We preprocess the data by removing irrelevant items. The preprocessed failure data for the components in Table 6 are presented in Table 7.

To gain further insight, Table 8 gives the functional properties of components within 120 million kilometers, obtained from the preprocessed failure data in Table 7 and the equations in Tables 3 and 4.

It is also worth noting that edges correspond to components in the network of the bogie system. Hence, the functional properties of edges can be calculated from historical failure data, and they too have a great influence on system reliability. Table 9 lists the functional properties of edges within 120 million kilometers based on the equations in Table 4.

5.2. System Reliability of Bogie System

5.2.1. Failure Propagation Model

As revealed from (11), both and are the weights of the influence factors of failure propagation. To make the model and the corresponding analysis simple, we here assume . And the critical nodes (i.e., , , and ) and noncritical nodes (such as , , and ) are selected as a fault source for the expression of failure propagation process, respectively.

Table 10 lists all possible failure propagation paths and their probabilities for each selected fault source. An interesting observation is that one topologically critical node does not cause failure propagation. As noted earlier, the bogie frame is a critical skeleton component; once it breaks down, the consequences for the whole bogie system may be serious. It is therefore usually given higher reliability in the design and manufacturing phase and hardly ever malfunctions. Another interesting fact in Table 10 is that the paths triggered by critical nodes are shorter than those triggered by noncritical nodes. Moreover, the longer the path, the smaller the probability of the failure path. These results are consistent with the observations from historical failure data. The reasons include the inherent redundancy and warning devices for critical nodes, as well as improved design, which prevent further failure propagation.

As a graphical illustration, Figure 7 presents the failure propagation paths of the nodes in Table 10. The red nodes represent the fault sources, and the blue nodes are the nodes that fail as a result of failure propagation from other nodes. Edges of different colors describe different propagation paths. We can see from Figure 8 that the topology-path length of failure propagation is short, usually below 3. Figure 8 also demonstrates that a single failed node does not cause the failure of all other nodes in the network. In other words, failure propagation has limits.

5.2.2. System Reliability

The function-path length is an important quantity for observing system reliability. To illustrate, take a concrete example of the path (i.e., ). According to (13), we first need to determine the type of integral. In general, fuzzy integrals include the Choquet integral (Marichal, 2000), the Sugeno integral (Klement et al., 2010), and the Weber integral (Tomaschitz, 2014). The choice matters because the integral determines how the weights of the various properties or measures, and the relationships among them, are described. The Choquet integral is selected to integrate the multiple properties or measures for the following reasons: (1) the Sugeno integral considers only the most critical factors and ignores all others; (2) the Weber integral gives the infimum of information fusion; (3) the Choquet integral takes all factors into consideration and also gives a definite value.
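For readers unfamiliar with fuzzy integrals, the sketch below implements the standard discrete Choquet and Sugeno integrals with respect to a fuzzy measure defined on coalitions of criteria. It is a generic illustration, not the paper's calculation: the criterion names, scores, and measure values are invented, and the measure is assumed to be supplied for every coalition the sorted order produces.

```python
def _tail_coalitions(values):
    """Yield (score, coalition) with scores ascending; the coalition holds
    every criterion whose score is at least the current one."""
    order = sorted(values, key=values.get)
    for k, name in enumerate(order):
        yield values[name], frozenset(order[k:])


def choquet_integral(values, measure):
    """Discrete Choquet integral: every score increment is weighted."""
    total, previous = 0.0, 0.0
    for score, coalition in _tail_coalitions(values):
        total += (score - previous) * measure[coalition]
        previous = score
    return total


def sugeno_integral(values, measure):
    """Discrete Sugeno integral: a max-min, so one term decides the result."""
    return max(min(score, measure[coalition])
               for score, coalition in _tail_coalitions(values))


# Hypothetical normalized scores of three measures on one path.
scores = {"DC": 0.2, "BC": 0.5, "CC": 0.9}
# Hypothetical fuzzy measure on the coalitions that occur for these scores.
mu = {
    frozenset({"DC", "BC", "CC"}): 1.0,
    frozenset({"BC", "CC"}): 0.7,
    frozenset({"CC"}): 0.4,
}
print(choquet_integral(scores, mu))  # 0.2*1.0 + 0.3*0.7 + 0.4*0.4
print(sugeno_integral(scores, mu))   # decided by a single (score, measure) pair
```

With an additive measure the Choquet integral reduces to an ordinary weighted mean; a non-additive measure lets interacting criteria count for more or less than the sum of their individual weights. The Sugeno value, by contrast, is fixed by one dominant term, which matches the reason given above for preferring the Choquet integral.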

Based on (13), the weights, such as , , , and , can be obtained following Labreuche and Grabisch (2013). The function-path length is therefore as given below, and Figure 9 explains the basic idea of its calculation. where

Similarly, and are also calculated as follows:

Finally, according to (14), the shortest function-path length reduces to a compact expression.

According to (16), Table 11 reports the system reliability when node or malfunctions. As expected, system reliability can be obtained whether a single node or several nodes fail. It can also be seen that system reliability is lower when more than one node fails.

6. Discussion

6.1. Analysis of Parameters

6.1.1. The Parameters in Failure Propagation Model

To verify the effectiveness of the proposed failure propagation model, we discuss the effect of the weight on fault pervasion intensity. Figure 9 shows the relationship between the number of steps of failure propagation and the parameter . An important observation from Figure 9 is that the higher the weight, the fewer the steps of failure propagation. In addition, the weights do not influence the failure propagation of critical nodes more strongly than that of noncritical nodes. These results further indicate that the impact of critical nodes on system reliability cannot be ignored.

To further illustrate the effectiveness of this model, Table 12 compares the proposed failure propagation model with previous methods, namely, the signed directed graph-fault graph (SDG-FG) method (Hu et al., 2015) and the improved fuzzy fault Petri net-based (IFFPN) method (Wang et al., 2013). In the SDG-FG method, the failure propagation path with the highest risk is found with the ant colony algorithm. As Table 12 shows, the proposed method obtains all possible failure propagation paths and their probabilities. In contrast, the IFFPN-based method derives only one path for each failed node, and the SDG-FG model obtains only the highest-risk path for the whole network. Unlike a general network, the bogie system, as a complex electromechanical system, has complex topology and function and is also affected by complex operating environments. Hence, the analysis of multiple paths helps maintenance personnel to locate the faulty component quickly and to reduce economic losses according to actual conditions. Furthermore, the results of the proposed model coincide well with the paths derived from failure data. This again supports the effectiveness and feasibility of the proposed method.

6.1.1.1. The Parameters in System Reliability Model. Figure 10(a) summarizes the shortest function-path length obtained with different fuzzy integrals. To make the results more tangible, Figure 10(b) compares the shortest path lengths for six paths, including , , , , , and . We can see that the shortest topology-path length between a pair of nodes is the same in every case, but the function-path length differs. For example, the shortest topology-path length of is 1, whereas the shortest function-path length is 1.621 with the Choquet integral and 1.992 with the Sugeno integral. This is because the topology-path length tends to ignore the diversity of nodes and edges, such as their functional properties, whereas the function-path length is constructed by taking the multiple properties of nodes and edges into account. Another striking result is that the shortest function-path length obtained with the Choquet integral is lower than that obtained with the Sugeno integral. The reason is that the Sugeno integral discards unimportant factors, whereas the Choquet integral considers the effects of all factors. This indicates that the Choquet integral gives the higher accuracy.
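The distinction between topology-path length and function-path length can be sketched on a toy graph. Here the topology-path length counts edges, while the function-path length sums per-edge weights. In the paper those weights come from the fuzzy integration of node and edge properties; the weights below are hypothetical placeholders, and the function shortest_length is an illustrative Dijkstra search rather than the authors' procedure.

```python
import heapq
from typing import Dict

# Hypothetical directed graph: node -> {successor: function-length of the edge}.
WeightedGraph = Dict[str, Dict[str, float]]


def shortest_length(
    graph: WeightedGraph, source: str, target: str, weighted: bool = True
) -> float:
    """Dijkstra shortest-path length from `source` to `target`.

    With weighted=False every edge counts as 1 (topology-path length);
    with weighted=True the edge weights are summed (function-path length).
    Returns infinity if `target` is unreachable.
    """
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale heap entry
        for nxt, weight in graph.get(node, {}).items():
            candidate = dist + (weight if weighted else 1.0)
            if candidate < best.get(nxt, float("inf")):
                best[nxt] = candidate
                heapq.heappush(heap, (candidate, nxt))
    return float("inf")


# Made-up weights: the direct edge a -> b is "long" in functional terms.
toy_graph = {"a": {"b": 1.6, "c": 0.5}, "c": {"b": 0.7}, "b": {}}

print(shortest_length(toy_graph, "a", "b", weighted=False))
print(round(shortest_length(toy_graph, "a", "b", weighted=True), 3))
```

In this toy case the two nodes are one hop apart, yet the shortest weighted route passes through an intermediate node, which shows how a weighted length can carry information that the hop count hides.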

Figure 11 compares global efficiency, reliability efficiency, and system effectiveness measure. The global efficiency is and the reliability efficiency is where is the shortest topology-path length. The minimization is done with respect to all paths linking nodes and , and the product extends to all the edges of each of these paths. is the reliability of the connection between pairs of nodes and .
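For reference, the two baseline measures are usually written as follows in the cited literature (Latora and Marchiori for global efficiency, Zio for reliability efficiency). The symbols used here are assumed notation and may differ from the paper's own: N is the number of nodes, d_ij is the shortest topology-path length between nodes i and j, p_mn is the reliability of the connection between nodes m and n, and gamma_ij is a path linking i and j.

```latex
E_{\mathrm{glob}}(G) = \frac{1}{N(N-1)} \sum_{i \neq j} \frac{1}{d_{ij}},
\qquad
E_{R}(G) = \frac{1}{N(N-1)} \sum_{i \neq j} \frac{1}{d^{R}_{ij}},
\qquad
d^{R}_{ij} = \min_{\gamma_{ij}} \left( \frac{1}{\prod_{mn \in \gamma_{ij}} p_{mn}} \right)
```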

It can be seen from Figure 11 that the value of global efficiency is the lowest and the value of reliability efficiency is the highest. Global efficiency is defined only from the topological perspective. Because system topology determines system function and reliability, a node failure viewed purely structurally may have a greater influence on the whole system, which leads to the lower global efficiency when a node malfunctions. Reliability efficiency is constructed only from the functional properties of edges and misses the influence of nodes and system topology. In contrast, the proposed system effectiveness measure takes into account both the topological and the functional properties of edges and nodes. Hence, the system effectiveness measure is more effective than the other two.
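The sensitivity of global efficiency to a node failure can be illustrated with a minimal sketch. It uses the standard definition of global efficiency (the average of inverse shortest hop distances over all ordered node pairs) on a hypothetical four-node star graph. The graph, the node names, and the choice to model a failure by removing all edges of the failed node are assumptions made for illustration only.

```python
from collections import deque
from typing import Dict, Set

Adjacency = Dict[str, Set[str]]  # undirected graph as adjacency sets


def hop_distances(adj: Adjacency, source: str) -> Dict[str, int]:
    """Breadth-first search: hop distance from `source` to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def global_efficiency(adj: Adjacency) -> float:
    """Average of 1/d over ordered pairs; unreachable pairs contribute 0."""
    n = len(adj)
    if n < 2:
        return 0.0
    total = 0.0
    for node in adj:
        dist = hop_distances(adj, node)
        total += sum(1.0 / d for other, d in dist.items() if other != node)
    return total / (n * (n - 1))


def isolate_node(adj: Adjacency, failed: str) -> Adjacency:
    """Model a failure (assumption): keep the node but drop all its edges."""
    return {
        node: set() if node == failed else {v for v in nbrs if v != failed}
        for node, nbrs in adj.items()
    }


# Hypothetical star graph: one hub connected to three leaves.
star = {"hub": {"a", "b", "c"}, "a": {"hub"}, "b": {"hub"}, "c": {"hub"}}

print(global_efficiency(star))
print(round(global_efficiency(isolate_node(star, "a")), 4))
print(global_efficiency(isolate_node(star, "hub")))
```

Losing a leaf lowers the efficiency moderately, while losing the hub disconnects every pair, which illustrates why a purely topological measure reacts strongly to the failure of a structurally central node.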

6.1.2. Comparison of Results

Figure 12 shows the reliability obtained with different measures, namely, system reliability, global efficiency reliability, and improved efficiency reliability. Global efficiency reliability is and the improved efficiency reliability is defined as where is the network after several nodes fail.

It can be seen from Figure 12 that the reliability of the whole system differs across the three measures, since each measure has a different focus. System reliability is generally smaller than the other two measures. For example, if node fails, system reliability is 0.357, global efficiency reliability is 0.392, and improved efficiency reliability is 0.411. Global efficiency reliability concentrates on the influence of topology, and improved efficiency reliability focuses on the effects of the reliability of the edges. The proposed system reliability, in contrast, is a comprehensive assessment that focuses on the impact of failure propagation on reliability.

In addition, Li et al. [23] also proposed a network reliability analysis method based on percolation theory. Reliability is defined as where is the reliability of the generic node, assumed the same for all nodes. is the number of nodes in the network. is the binomial coefficient.

In Table 13, we can see that the value of is higher than . From a mathematical point of view, in (26) can compute only the number of faulty nodes that fail because of the failure source; the specific nodes and their relationships are not known. In other words, the failure propagation mechanism is ignored. Hence, this method [23] is a conservative approach. In contrast, the proposed system reliability takes the failure propagation model into account.

Furthermore, the failure data used in the preceding analysis were collected within 120 million kilometers. Figure 13 plots system reliability for different running mileages. The result shows that system reliability decreases as running time increases. The evaluation result also demonstrates the effectiveness of the proposed method with time-varying failure data.

7. Conclusions and Perspectives

In this study, we present a general system reliability assessment method from the failure propagation perspective. As pointed out in previous research, the reliability assessment of a CEMS has focused mainly on local behavior rather than on holistic system behavior. This study explicitly addresses the problem of how to assess system reliability at the system level using the network model, historical failure data, and the failure propagation mechanism. The main contributions of this paper to the literature are as follows:

One contribution of our study is that it provides a failure propagation model for the CEMS. As stated previously, this model aims to solve the problem of how to determine multiple propagation paths simultaneously when one or several nodes fail and then to calculate their occurrence probabilities in a network. Other variables, such as the possibility of rate of nodes, the fault propagation probability of edges, and the DC of nodes, are also included in the model, which effectively reduces the uncertainty and randomness due to failure data and human factors. The advantage of this modeling framework is that, based on improved polychromatic sets, it can derive all possible failure propagation paths between nodes rather than only the single most probable path. The results suggest that the derived failure propagation paths are consistent with the observed failure data.

Another contribution of our study is that it presents system reliability as a new measure for the reliability assessment of the CEMS. The introduction of the failure propagation model into the definition of system reliability is perhaps the most important methodological contribution of this paper. System reliability is defined as the probability that the network connectivity can accommodate a certain fault condition. This measure should be considered an important and meaningful performance index. The reason is simple: the decrease in the reliability of the whole system is not determined by one independent node. The connectivity between nodes is a necessary condition for the successful operation of a CEMS. However, once a node fails, the failure also spreads through these edges and affects system reliability. To assess reliability, the function-path length is defined so that it integrates the multiple properties of nodes and edges. Numerical experiments have been performed to demonstrate the feasibility of the reliability evaluation procedure. They also show that the model proposed in this study can correctly estimate the value of system reliability.

As expected, the proposed system reliability assessment method is a time-varying model. The accuracy of the estimated system reliability improves as more failure data become available. These results may be of significance to researchers and repair personnel who are concerned with the reliability and safety of high-speed railways. In addition, the proposed method can be extended and applied to other complex electromechanical systems without loss of generality.

Though we have presented a comprehensive framework for the system reliability evaluation of the CEMS network, the current study of system reliability for the CEMS is still at a preliminary stage. There are many theoretical and methodological aspects that need to be explored. We do, however, believe them to be essential for the simple results obtained in this paper. Our study opens up several directions for future research, and we outline two of them here. (1) Throughout the investigation, we have relied on several assumptions, which is perhaps the most important limitation of the models. The validity of these assumptions needs to be assessed empirically, and the assumptions need to be relaxed to develop a more plausible model; this is our future task. (2) System safety is also important for high-speed railways, and it is closely related to system reliability. Further research on system safety based on reliability is needed, and we believe this is an interesting line of future investigation.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

First and foremost, the authors would like to express their deepest gratitude to Hengrun Zhang, who provided valuable guidance during the language revision of this paper. The work described in this paper was supported by grants from the State Key Laboratory of Rail Traffic Control and Safety (no. RCS2017ZZ002).

References

[1] M. P. Aghababa, “Fractional modeling and control of a complex nonlinear energy supply-demand system,” Complexity, vol. 20, no. 6, 86 pages, 2015.
[2] L. J. Wei, F. B. Zhou, J. W. Cheng, X. R. Luo, and X. L. Li, “Classification of structural complexity for mine ventilation networks,” Complexity, vol. 21, no. 1, 34 pages, 2015.
[3] J. Mi, Y. F. Li, Y. J. Yang, W. Peng, and H. Z. Huang, “Reliability assessment of complex electromechanical systems under epistemic uncertainty,” Reliability Engineering & System Safety, vol. 152, pp. 1–15, 2016.
[4] W. H. K. Lam, H. K. Lo, and S. C. Wong, “Advances in equilibrium models for analyzing transportation network reliability,” Transportation Research Part B: Methodological, vol. 66, pp. 1–3, 2014.
[5] S. Bolus, “Power indices of simple games and vector-weighted majority games by means of binary decision diagrams,” European Journal of Operational Research, vol. 210, no. 2, pp. 258–272, 2011.
[6] F. Chiacchio, M. Cacioppo, D. D'Urso, G. Manno, N. Trapani, and L. Compagno, “A Weibull-based compositional approach for hierarchical dynamic fault trees,” Reliability Engineering & System Safety, vol. 109, pp. 45–52, 2013.
[7] J. L. Evans, L. Elefteriadou, and N. Gautam, “Probability of breakdown at freeway merges using Markov chains,” Transportation Research Part B: Methodological, vol. 35, no. 3, pp. 237–254, 2001.
[8] M. C. Kim, “Reliability block diagram with general gates and its application to system reliability analysis,” Annals of Nuclear Energy, vol. 38, no. 11, pp. 2456–2461, 2011.
[9] K. Kobayashi, K. Kaito, and N. Lethanh, “A statistical deterioration forecasting method using hidden Markov model for infrastructure management,” Transportation Research Part B: Methodological, vol. 46, no. 4, pp. 544–561, 2012.
[10] G. Merle, J. M. Roussel, and J. J. Lesage, “Quantitative analysis of dynamic fault trees based on the structure function,” Quality and Reliability Engineering International, vol. 30, no. 1, pp. 143–156, 2014.
[11] K. Parry and M. L. Hazelton, “Bayesian inference for day-to-day dynamic traffic models,” Transportation Research Part B: Methodological, vol. 50, pp. 104–115, 2013.
[12] F. Abdul Rahman, A. Varuttamaseni, M. Kintner-Meyer, and J. C. Lee, “Application of fault tree analysis for customer reliability assessment of a distribution power system,” Reliability Engineering & System Safety, vol. 111, pp. 76–85, 2013.
[13] J. P. Signoret, Y. Dutuit, P. J. Cacheux, C. Folleau, S. Collas, and P. Thomas, “Make your Petri nets understandable: reliability block diagrams driven Petri nets,” Reliability Engineering & System Safety, vol. 113, pp. 61–75, 2013.
[14] D. Straub and A. Der Kiureghian, “Bayesian network enhanced with structural reliability methods: methodology,” Journal of Engineering Mechanics, vol. 136, no. 10, pp. 1248–1258, 2010.
[15] J. Wu, S. Yan, and L. Xie, “Reliability analysis method of a solar array by using fault tree analysis and fuzzy reasoning Petri net,” Acta Astronautica, vol. 69, no. 11-12, pp. 960–968, 2011.
[16] L. Xing, O. Tannous, and J. B. Dugan, “Reliability analysis of nonrepairable cold-standby systems using sequential binary decision diagrams,” IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans, vol. 42, no. 3, pp. 715–726, 2012.
[17] C. Zhang, J. E. Ramirez-Marquez, and C. M. R. Sanseverino, “A holistic method for reliability performance assessment and critical components detection in complex networks,” IIE Transactions, vol. 43, no. 9, pp. 661–675, 2011.
[18] Y. Sun, L. Ma, J. Mathew, and S. Zhang, “An analytical model for interactive failures,” Reliability Engineering & System Safety, vol. 91, no. 5, pp. 495–504, 2006.
[19] I. Gunawan, “Reliability analysis of shuffle-exchange network systems,” Reliability Engineering & System Safety, vol. 93, no. 2, pp. 271–276, 2008.
[20] C. C. Jane and Y. W. Laih, “A dynamic bounding algorithm for approximating multi-state two-terminal reliability,” European Journal of Operational Research, vol. 205, no. 3, pp. 625–637, 2010.
[21] G. Hardy, C. Lucet, and N. Limnios, “K-terminal network reliability measures with binary decision diagrams,” IEEE Transactions on Reliability, vol. 56, no. 3, pp. 506–515, 2007.
[22] A. Rodionov, D. Migov, and O. Rodionova, “Improvements in the efficiency of cumulative updating of all-terminal network reliability,” IEEE Transactions on Reliability, vol. 61, no. 2, pp. 460–465, 2012.
[23] D. Li, Q. Zhang, E. Zio, S. Havlin, and R. Kang, “Network reliability analysis based on percolation theory,” Reliability Engineering & System Safety, vol. 142, pp. 556–562, 2015.
[24] V. Latora and M. Marchiori, “A measure of centrality based on network efficiency,” New Journal of Physics, vol. 9, no. 6, p. 188, 2007.
[25] V. Latora and M. Marchiori, “Efficient behavior of small-world networks,” Physical Review Letters, vol. 87, no. 19, article 198701, 2001.
[26] M. Rubinov and O. Sporns, “Complex network measures of brain connectivity: uses and interpretations,” NeuroImage, vol. 52, no. 3, pp. 1059–1069, 2010.
[27] E. Zio, “From complexity science to reliability efficiency: a new way of looking at complex network systems and critical infrastructures,” International Journal of Critical Infrastructures, vol. 3, no. 3/4, p. 488, 2007.
[28] E. Zio, G. Sansavini, R. Maja, and G. Marchionni, “An analytical approach to the safety of road networks,” International Journal of Reliability, Quality and Safety Engineering, vol. 15, no. 01, pp. 67–76, 2008.
[29] L. Guo, G. Jianmin, G. Zhiyong, and J. Hongquan, “Failure propagation model of complex system based on small world net,” Journal-Xian Jiaotong University, vol. 41, no. 3, p. 334, 2007.
[30] L. Zhou, L. (C.) Tong, J. Chen, J. Tang, and X. Zhou, “Joint optimization of high-speed train timetables and speed profiles: a unified modeling approach using space-time-speed grid networks,” Transportation Research Part B: Methodological, vol. 97, pp. 157–181, 2017.

Copyright

Copyright © 2018 Shuai Lin et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
