Abstract

This paper presents a stochastic analysis of a parallel system of two non-identical units subject to common-cause failure, critical human error, non-critical human error, preventive maintenance, and two types of repair. The system undergoes preventive maintenance at random epochs. We assume that the failure, repair, replacement, and maintenance times are independent random variables and that their rates are constant for each unit. The system is analyzed using the graphical evaluation and review technique (GERT) to obtain various related measures. We study the effect of preventive maintenance on the system performance. Certain important results are derived as special cases. The mean time to system failure and the steady-state availability of the system are plotted for different parameter values.

1. Introduction

Goel et al. [1, 2] studied a two-unit cold standby redundant system, with similar or dissimilar units, with preventive maintenance, inspection, and two types of repair. Mokaddis et al. [3] studied a two-unit cold standby system with four different models (normal, human errors, partial hardware failure, and total hardware failures). Kadyan et al. [4] and Chander [5] analyzed reliability models of non-identical units with priority and with one unit kept in cold standby. The stochastic analysis of a non-identical two-unit parallel system with common-cause failure by the graphical evaluation and review technique (GERT) was investigated in [6]. A stochastic analysis of a dissimilar two-unit parallel system with preventive maintenance and common-cause failure was investigated in [7]. Galikowsky et al. [8] examined a series system with cold standby components, where the time-to-failure and the time-to-repair of the primary and standby units are exponentially distributed. Wang et al. [9] compared four different system configurations with warm standby components and standby switching failures on the basis of their reliability and availability, assuming that the time-to-repair and the time-to-failure of each primary and warm standby component follow the negative exponential distribution. Further, it sometimes becomes necessary to give one unit priority in repair over another. The reliability analyses of two mathematical models (outdoor electric power systems) were studied by Mokaddis et al. [10]. Mahmoud and Moshref [11] studied the stochastic analysis of a two-unit cold standby system considering hardware failure, human error failure, and preventive maintenance (PM).

Human error is defined as a failure to perform a prescribed task (or the performance of a prohibited action), which could result in damage to equipment and property or disruption of scheduled operations. It can be classified as either critical or non-critical. A critical human error is one that causes the failure of the entire system; for example, a fire caused by a human error in a room where a redundant system is located will cause total system failure. A non-critical human error, on the other hand, does not lead to a catastrophic result. Human errors can arise from incorrect actions, maintenance errors, misinterpretation of instruments, and so on.

A common-cause failure (CCF) is defined as the failure of a single unit or multiple units due to a single common cause. A CCF may occur for reasons such as the following: (i) abnormal environmental conditions, for example, temperature and pressure; (ii) defective design; (iii) natural catastrophes such as fire, and so forth.

GERT analysis is applied to stochastic networks having the following features. (a) All networks contain logical nodes and branches. (b) Each branch carries a probability that the activity associated with it will be performed. (c) The activities represented by the branches are described by other parameters.

Time is the main parameter in this paper. The time associated with a branch is described by a moment generating function (m.g.f.) of the form $M_{ij}(s) = E\left[e^{s t_{ij}}\right]$. The $W$-function has the form $W_{ij}(s) = p_{ij} M_{ij}(s)$, where $p_{ij}$ is the probability that the branch is realized and $M_{ij}(s)$ is its m.g.f. This function is used to obtain the relationships that exist between the nodes. Moreover, the equivalent probability of realization of the network, the equivalent m.g.f., and the mean time of realization of the network are, respectively, given as

$$p_E = W_E(0), \qquad M_E(s) = \frac{W_E(s)}{W_E(0)}, \qquad \mu_E = \left.\frac{\partial M_E(s)}{\partial s}\right|_{s=0}.$$

The present paper deals with a two-unit (non-identical) parallel system with additional preventive maintenance applied in a regenerative state at random epochs. The mean time to system failure, steady-state availability, busy-time, and idle-time for the service facility are obtained using the graphical evaluation and review technique (GERT). The effect of preventive maintenance on the system performance is shown graphically.
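The GERT quantities described above can be illustrated with a minimal Python sketch. It assumes the standard GERT conventions (a branch $W$-function is the branch probability times the m.g.f. of its duration, series branches multiply, and mutually exclusive alternative branches add) and exponentially distributed branch times. The helper names and the small example network are hypothetical and are not taken from the paper; the derivative of the equivalent m.g.f. is approximated numerically.

```python
import math

def w_exponential(p, rate):
    """W-function of a branch realized with probability p whose duration is
    exponential with the given rate: W(s) = p * rate / (rate - s), s < rate."""
    def w(s):
        return p * rate / (rate - s)
    return w

def series(*branches):
    """Branches in series: the equivalent W-function is the product."""
    def w(s):
        return math.prod(b(s) for b in branches)
    return w

def alternatives(*branches):
    """Mutually exclusive alternative branches: the W-functions add."""
    def w(s):
        return sum(b(s) for b in branches)
    return w

def equivalent_measures(w_e, h=1e-5):
    """Return (p_E, mean time) for an equivalent W-function w_e.

    p_E = W_E(0); M_E(s) = W_E(s) / W_E(0); the mean time is dM_E/ds at
    s = 0, approximated here by a central difference with step h.
    """
    p_e = w_e(0.0)
    mean = (w_e(h) - w_e(-h)) / (2.0 * h * p_e)
    return p_e, mean

# Hypothetical network: one certain branch (rate 2.0) followed by a choice
# between a branch with probability 0.3 (rate 1.0) and one with 0.7 (rate 4.0).
network = series(
    w_exponential(1.0, 2.0),
    alternatives(w_exponential(0.3, 1.0), w_exponential(0.7, 4.0)),
)
p_e, mean_time = equivalent_measures(network)
```

For this illustrative network the mean time is the mean of the first branch (0.5) plus the probability-weighted mean of the two alternatives (0.3 × 1 + 0.7 × 0.25), which gives a quick sanity check on the numerical derivative.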

The results obtained in [12] can be derived from the present paper as a special case.

2. Assumptions

(1) The system consists of two non-identical components connected in parallel.
(2) The system remains operative even if only a single component operates.
(3) All failure, repair, replacement, and preventive maintenance rates are constant.
(4) The common-cause failure and other failures are statistically independent.
(5) The system is in the down state when both components have failed.
(6) The online unit suffers four types of failures, namely, hardware error, non-critical human error, critical human error, and common-cause failure.
(7) The system is subject to common-cause failure, hardware error, non-critical human error, and critical human error when it is operating.
(8) There are two types of repair, low-cost repair and costly repair, each selected with a given probability.
(9) Both components (when failed) can be replaced simultaneously.
(10) Preventive maintenance (e.g., overhaul, inspection, minor repairs, etc.) is provided to the system at random epochs when the system is in the state defined below.
(11) After repair or replacement, the component is as good as new.

3. Notation

: Constant hardware failure rate of component
: Constant hardware failure rate of component
: Constant non-critical human error failure rate of component
: Constant non-critical human error failure rate of component
: Constant common-cause failure rate of the system when both the units are operating
: Constant common-cause failure rate of component when has been already under repair of type 2 (costlier repair)
: Constant common-cause failure rate of component when has been already under repair of type 2 (costlier repair)
: Constant common-cause failure rate of component when has been already under repair of type 1 (cheaper repair)
: Constant common-cause failure rate of component when has been already under repair of type 1 (cheaper repair)
: Constant critical human error failure rate of the system when both the units are operating
: Constant critical human error failure rate of component when has been already under repair of type 2 (costlier repair)
: Constant critical human error failure rate of component when has been already under repair of type 2 (costlier repair)
: Constant critical human error failure rate of component when has been already under repair of type 1 (cheaper repair)
: Constant critical human error failure rate of component when has been already under repair of type 1 (cheaper repair)
: Cheaper repair rate of a component or component , respectively
: Costlier repair rate of a component or component , respectively
: Constant rate of PM time
: Constant rate of time for taking a unit into PM
: Rate of simultaneous replacement of components and when the system fails
: Mean sojourn time in state , for all
: The transition probability from state to state , for all , , .

Symbols for the States of the System

: Component in normal and operative mode
: Component in normal and operative mode
: Component in failure (hardware failure or non-critical human error) mode and under repair of type 1
: Component in failure (hardware failure or non-critical human error) mode and under repair of type 1
: Component in failure (hardware failure or non-critical human error) mode and under repair of type 2
: Component in failure (hardware failure or non-critical human error) mode and under repair of type 2
: Component in normal mode and under preventive maintenance
: Component in normal mode and under preventive maintenance
: Component in failure (common-cause) mode and needs replacement
: Component in failure (common-cause) mode and needs replacement
: Component in failure (critical human error) mode and needs replacement
: Component in failure (critical human error) mode and needs replacement

Considering these symbols, the system can be in any of the following states:,,, , , , , , , , , , , , ,, , , , .

Up states: , , , , and . Down states: , , , , , , , , , , , , , , and .

4. Transition Probabilities and Mean Sojourn Times

The probability of transition from state to state is given by The mean sojourn time in state is defined as the time spent in the state before a transition to any other state, given by Hence, from the above states, the $W$-function from state to state is given by where is the probability of transition from state to state and is the mean sojourn time.
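Because all rates are assumed constant, the transition probabilities and mean sojourn times of a state follow directly from the rates leaving that state. The Python sketch below is an illustration under that assumption: each transition probability is the corresponding rate divided by the total outgoing rate, the mean sojourn time is the reciprocal of the total outgoing rate, and the branch $W$-function is taken as the transition probability times the m.g.f. of an exponential sojourn time. The function names and state labels are hypothetical, and the exact form of the $W$-function used in the paper is not recoverable from the text.

```python
def branch_parameters(rates_out):
    """Given {target_state: constant rate} for one state, return
    ({target_state: transition probability}, mean sojourn time)."""
    total = sum(rates_out.values())
    if total <= 0:
        raise ValueError("a state needs at least one positive outgoing rate")
    probabilities = {target: rate / total for target, rate in rates_out.items()}
    return probabilities, 1.0 / total

def w_function(p_ij, mu_i):
    """Assumed branch W-function for an exponential sojourn time:
    W_ij(s) = p_ij / (1 - mu_i * s), valid for s < 1 / mu_i."""
    def w(s):
        return p_ij / (1.0 - mu_i * s)
    return w

# Hypothetical state with three outgoing transitions (labels are illustrative).
probabilities, mean_sojourn = branch_parameters(
    {"repair_type_1": 0.2, "repair_type_2": 0.3, "system_down": 0.5}
)
w_down = w_function(probabilities["system_down"], mean_sojourn)
```

Evaluating `w_down(0.0)` returns the transition probability itself, which matches the GERT convention that a $W$-function at $s = 0$ gives the probability of realizing the branch.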

5. Mean Time to System Failure (MTSF)

The mean time to system failure is defined as the time until the system is completely inoperative. This is accomplished by finding the $W$-function from the initial state to the terminal state . Making use of Mason’s rule we get where where The transition probabilities are .

The mean sojourn time can be written as follows: Hence, the mean time to system failure is given by , where The mean time to system failure (MTSF) when the system starts from is where
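As a numerical cross-check on an MTSF expression obtained through Mason's rule, the same quantity can be computed by a different technique: solving the first-step (absorbing Markov chain) equations, in which the mean time to failure from each up state equals its mean sojourn time plus the probability-weighted mean times from the states it can move to. The sketch below is not the paper's GERT derivation. It uses a deliberately simplified, hypothetical model (two units, one failure rate per unit, one repair rate, no human error, common-cause failure, or PM), and all names and numbers are assumptions.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]

def mean_time_to_failure(rates, up_states, start):
    """rates[i][j] is the constant rate from state i to state j.
    Every state not listed in up_states is treated as absorbing (failed).
    Solves T_i = mu_i + sum_j p_ij * T_j over the up states."""
    index = {state: k for k, state in enumerate(up_states)}
    n = len(up_states)
    a = [[0.0] * n for _ in range(n)]
    b = [0.0] * n
    for state in up_states:
        total = sum(rates[state].values())
        i = index[state]
        a[i][i] += 1.0
        b[i] = 1.0 / total  # mean sojourn time in this state
        for target, rate in rates[state].items():
            if target in index:
                a[i][index[target]] -= rate / total
    return solve_linear(a, b)[index[start]]

# Hypothetical parameters: failure rate 0.1 per unit, repair rate 0.5.
rates = {
    "both_up": {"a_down": 0.1, "b_down": 0.1},
    "a_down": {"both_up": 0.5, "failed": 0.1},
    "b_down": {"both_up": 0.5, "failed": 0.1},
}
mtsf = mean_time_to_failure(rates, ["both_up", "a_down", "b_down"], "both_up")
```

For two identical units with failure rate $\lambda$ and repair rate $\mu$, the closed form for this simplified model is $(3\lambda + \mu)/(2\lambda^2)$, which equals 40 for the values used here.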

6. Mean Time to Repair and Replacement (MTTR)

The repair facility of the system is available in states . Hence, the mean time to repair is given by where The mean time to repair (MTTR) is given by where

6.1. Availability Analysis

The system availability, $A(t)$, is defined as Pr(the system is in an operating state at time $t$). The steady-state availability is defined as $A(\infty) = \lim_{t \to \infty} A(t)$.
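One common way to evaluate the steady-state availability from the two measures of the preceding sections is the ratio of mean up time to mean cycle time, MTSF / (MTSF + MTTR). Whether the paper uses exactly this form cannot be confirmed from the extracted text, so the Python sketch below should be read as an assumption; the function name and the numbers are illustrative only.

```python
def steady_state_availability(mtsf, mttr):
    """Assumed ratio form: long-run fraction of time the system is up,
    taking MTSF as the mean up time and MTTR as the mean down time."""
    if mtsf < 0 or mttr < 0 or mtsf + mttr == 0:
        raise ValueError("mtsf and mttr must be non-negative and not both zero")
    return mtsf / (mtsf + mttr)

# Illustrative values only (not results from the paper).
availability = steady_state_availability(40.0, 10.0)
```

With these illustrative inputs the system would be up four fifths of the time in the long run; a longer MTSF or a shorter MTTR both push the ratio toward 1.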

7. Busy-Time and Idle-Time for the Service Facility

When the process is in states , the repair facility is idle. This path is represented by Thus, The repair facility is busy in states with repair, replacement, and preventive maintenance rates , , , , , and , respectively. Hence, the equivalent $W$-function is obtained as Therefore, the mean busy-time of the service facility is given as Therefore, the busy fraction of the service facility may be defined as

8. Special Cases

8.1. Study the System without Preventive Maintenance

When and so on, .

The mean time to system failure (MTSF1) is given as

The mean time to repair/replacement (MTTR1) is given as where The steady-state availability is given by Busy-time and idle-time for the service facility:

When the process is in state , the repair facility is idle. This path is represented as Thus, Therefore, the mean busy-time of the service facility is given as

8.2. Study the System without Common-Cause Failure

When and so on, . Note that The mean time to system failure (MTSF2) is given by The mean time to repair (MTTR2) is given by where The steady-state availability is given by Busy-time and idle-time for the service facility:

When the process is in state , the repair facility is idle. This path is represented by Thus, Therefore, the mean busy-time of the service facility is given by

8.3. Study the System without Critical Human Error

When and so on, . Note that The mean time to system failure (MTSF3) is given by The mean time to repair/replacement (MTTR3) is given by where The steady-state availability is given by

Busy-time and idle-time for the service facility:

When the process is in state , the repair facility is idle. This path is represented as Thus, Therefore, the mean busy-time of the service facility is given by

8.4. Study the System without Preventive Maintenance, Common-Cause Failure, Critical Human Error, and Non-Critical Human Error, and with One Type of Repair

All results of [12] are deduced.

8.5. Study the System without Common-Cause Failure, Critical Human Error, and Non-Critical Human Error, and with One Type of Repair

We have the results of [12].

9. Comparison between MTSF and Steady-State Availability under the Effect of Preventive Maintenance

The purpose of this section is to study the effect of PM on the system. The following numerical results are obtained by considering the following system parameters.

We fix the values of all other parameters and vary the rate of PM time and the rate of taking a unit into PM from 0.1 to 0.8. We found that the values of MTSF and the steady-state availability are better when the rate of taking a unit into PM is greater than the rate of PM time than in the other cases (see Tables 1, 2, 3, and 4).

These curves are shown in Figures 3, 4, 7, and 8. From these figures we conclude that the MTSF and the steady-state availability for the system with PM are always higher than those for the system without PM, which implies that preventive maintenance (PM) leads to an improvement in overall system performance.

Figures 1, 2, 5, and 6 show that the MTSF and the steady-state availability increase when the rate of taking a unit into PM is greater than the rate of PM time.

10. Contributions and Concluding Remarks

This paper applies GERT to derive several reliability measures of a non-identical two-unit parallel system. Some previous results are deduced as special cases of our model. The effect of PM on the system measures is studied graphically. Finally, to achieve high system reliability, we recommend adding PM to the model because of its positive effect on system performance.