Research Article  Open Access
Naijun Sha, Ronghua Wang, Ping Hu, Xiaoling Xu, "Statistical Inference in Dependent Component Hybrid Systems with Masked Data", Advances in Statistics, vol. 2015, Article ID 525136, 11 pages, 2015. https://doi.org/10.1155/2015/525136
Statistical Inference in Dependent Component Hybrid Systems with Masked Data
Abstract
Complex systems are usually composed of simple hybrid systems. In this paper, we consider statistical inference for two fundamental hybrid systems: series-parallel and parallel-series systems based on masked data. Assuming dependent lifetimes of components modelled by Marshall and Olkin’s bivariate exponential distribution in the system, we present maximum likelihood and interval estimation of parameters of interest. Intensive simulation studies are performed to demonstrate the efficiency of the methods.
1. Introduction
In a system consisting of several components, reliability analysis is usually carried out by analyzing lifetime data. The system data include two parts: (i) the system’s lifetime and (ii) the failure reason, that is, which component causes the system failure. In real situations, however, the failure reason may not be revealed because of a shortage of funds, time limits, recording errors, a lack of diagnostic tools, or the destructive consequences of some component failures. For example, in the reliability problems of computers and integrated circuits, the system failure is often attributed to a module containing several components, but one cannot determine exactly which component causes the failure. Therefore, the observable data from the test include the failure time and a failure reason identified only up to a subset of components. In these cases, the reason for the failure of the system is masked and the lifetime data are called masked data.
The statistical analysis of masked data has a long history. Usher and Hodgson [1] initially proposed parameter estimation under masked data. Since then, a significant amount of literature has emerged on various models. For the series system with constant, linear, and polynomial failure components in the presence of masked data, maximum likelihood (ML) and other estimation methods have been studied by many researchers (e.g., [2–6]). Sarhan and El-Bassiouny [7] considered a parallel system using masked data. Bayes methods with various priors were also used for the estimation of parameters in series and parallel systems (see, e.g., Sarhan [8–10], Jiang and Zhang [11]). El-Gohary [12] discussed a series system with two dependent components in a Bayesian approach. So far, most research on masked data has focused on systems that are either purely series or purely parallel and has assumed independent and identically distributed component lifetimes. In many real situations, however, a “hybrid” system is often seen, in which the working components are connected through a combination of series and parallel links. For example, current air supply systems are generally of modular design, where the power system consists of a number of semiconductor units combined in a series or hybrid arrangement [13, 14].
Complex systems are usually composed of simple subsystems such as the three-component series-parallel and parallel-series systems illustrated in Figure 1. In this paper, we mainly focus on statistical inference for these two fundamental hybrid systems, in which the component lifetimes are non-independent and non-identically distributed. For the two systems, first we note that the system failure occurrence is attributed to one of the four failures consisting of components 1, 2, 3, and 12, where 12 denotes the simultaneous failure of components 1 and 2. Let be the set of all nine events causing the system failure; that is, If consists of more than one element, then the reason of the system failure is not exact and the life data is masked. Notice that here we differentiate two occurrences and by assuming different independent processes damaging component 1 only, component 2 only, and both components in the next section. We make statistical inference of parameters using likelihood-based methods in the presence of masked data. Section 2 presents the life distribution and reliability for the hybrid systems. Section 3 concentrates on the parameter estimation for the series-parallel and parallel-series systems, respectively. In Section 4, we assess the performance of the methods in simulation studies. Lastly, we conclude the paper with a brief discussion in Section 5.
(a) Series-parallel system
(b) Parallel-series system
2. Model Specification
For the three-component hybrid system in Figure 1, there is a subsystem consisting of components 1 and 2. From a practical viewpoint, the lifetimes of the components in the subsystem are usually dependent on each other and independent of component 3 outside the subsystem. The unit lifetime model is addressed in the following.
2.1. Life Distribution
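A bivariate model was developed by Marshall and Olkin [15] to describe the correlated lifetimes of two units and is widely used for two-component systems. The model assumes that a two-component system is affected by “fatal shocks” governed by three independent Poisson processes with parameters $\lambda_1$, $\lambda_2$, and $\lambda_{12}$, according to whether the shocks damage component 1 only, component 2 only, or both components, respectively. In particular, in the hybrid system of Figure 1, the lifetimes $X_1$ and $X_2$ of units 1 and 2 are constructed through $X_1 = \min(U_1, U_{12})$, $X_2 = \min(U_2, U_{12})$, where $U_1$, $U_2$, and $U_{12}$ are mutually independent exponential random variables with rates $\lambda_1$, $\lambda_2$, and $\lambda_{12}$. Then $(X_1, X_2)$ follows a bivariate exponential distribution $\mathrm{BVE}(\lambda_1, \lambda_2, \lambda_{12})$, whose joint reliability function is
$$\bar{F}(x_1, x_2) = P(X_1 > x_1, X_2 > x_2) = \exp\{-\lambda_1 x_1 - \lambda_2 x_2 - \lambda_{12} \max(x_1, x_2)\}, \quad x_1, x_2 > 0,$$
and the joint density function is
$$f(x_1, x_2) = \begin{cases} \lambda_2 (\lambda_1 + \lambda_{12}) \exp\{-(\lambda_1 + \lambda_{12}) x_1 - \lambda_2 x_2\}, & x_1 > x_2 > 0, \\ \lambda_1 (\lambda_2 + \lambda_{12}) \exp\{-\lambda_1 x_1 - (\lambda_2 + \lambda_{12}) x_2\}, & x_2 > x_1 > 0. \end{cases}$$
The probability of both components failing at time $x$ corresponds to the probability mass of the singular part, $f(x, x) = \lambda_{12} \exp\{-(\lambda_1 + \lambda_2 + \lambda_{12}) x\}$, $x > 0$. Component 3 is shocked by another independent Poisson process with parameter $\lambda_3$, and so its lifetime $X_3$ is exponentially distributed with the density $f_3(x) = \lambda_3 \exp(-\lambda_3 x)$, $x > 0$, $\lambda_3 > 0$.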
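The fatal-shock construction can be illustrated with a short simulation. The following is only a minimal sketch: the function names and the rate arguments `lam1`, `lam2`, `lam12` are illustrative choices, not notation taken from the paper. It draws the two dependent lifetimes as minima of independent exponential shock times and estimates how often both components fail simultaneously, which for the Marshall and Olkin model should be close to the common-shock rate divided by the sum of the three rates.

```python
import random


def sample_marshall_olkin(lam1, lam2, lam12, rng):
    """Draw one pair of dependent lifetimes via the fatal-shock construction."""
    u1 = rng.expovariate(lam1)    # shock that damages component 1 only
    u2 = rng.expovariate(lam2)    # shock that damages component 2 only
    u12 = rng.expovariate(lam12)  # shock that damages both components
    return min(u1, u12), min(u2, u12)


def simultaneous_failure_fraction(lam1, lam2, lam12, n, seed=0):
    """Estimate the probability that both components fail at the same time."""
    rng = random.Random(seed)
    ties = 0
    for _ in range(n):
        x1, x2 = sample_marshall_olkin(lam1, lam2, lam12, rng)
        # Equal lifetimes occur exactly when the common shock arrives first.
        ties += (x1 == x2)
    return ties / n


if __name__ == "__main__":
    frac = simultaneous_failure_fraction(1.0, 1.0, 1.0, 20000)
    print("empirical:", frac, "theoretical:", 1.0 / 3.0)
```

With equal rates, roughly one third of the simulated pairs share a failure time, which is the singular part of the distribution.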
2.2. Reliability and Density of Hybrid System
First we briefly introduce the concept of masked probability. Assume that there is a masked event with the exact failure case in the hybrid system; then the probability of failure due to the masked occurrence at time is where is called masked probability and is the probability of system failure caused by component(s) at the time , . In statistical analysis of masked data, it is usually assumed that the masked occurrence is independent of the cause and failure time; that is, The lifetime of the series-parallel system in Figure 1(a) is . With the assumption that and an independent , the reliability at time is and the probability densities of failure at time due to each event are Likewise, for the parallel-series system with three components as shown in Figure 1(b), the system life becomes . Therefore, the reliability is and the probability densities of failure at time due to each case are Finally, the density function for the system at due to the masked occurrence can be expressed as . The likelihood-based parameter inference for the two hybrid systems is presented in the following.
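To make the system lifetimes and failure causes concrete, here is a hedged sketch. The wiring is an assumption, since Figure 1 is not reproduced in this text: "series-parallel" is read as components 1 and 2 in parallel, placed in series with component 3, and "parallel-series" as components 1 and 2 in series, placed in parallel with component 3. The function names and the cause labels `"1"`, `"2"`, `"3"`, `"12"` are illustrative. Under the assumed series-parallel wiring, the closed-form reliability follows from the reliability of the parallel pair multiplied by that of the independent component 3.

```python
import math


def system_failure(x1, x2, x3, kind):
    """Return (system lifetime, cause) given the three component lifetimes.

    The wiring for each `kind` is an assumption, not taken from Figure 1.
    """
    if kind == "series-parallel":    # assumed: (1 parallel 2) in series with 3
        sub = max(x1, x2)            # the parallel pair fails at its last failure
        t = min(sub, x3)
        if x3 < sub:
            return t, "3"
    elif kind == "parallel-series":  # assumed: (1 series 2) in parallel with 3
        sub = min(x1, x2)            # the series pair fails at its first failure
        t = max(sub, x3)
        if x3 > sub:
            return t, "3"
    else:
        raise ValueError("unknown system kind: " + kind)
    if x1 == x2:
        return t, "12"               # a common shock failed both components
    return t, ("1" if x1 == sub else "2")


def series_parallel_reliability(t, lam1, lam2, lam3, lam12):
    """Reliability at time t under the assumed series-parallel wiring."""
    # P(max(X1, X2) > t) = S1(t) + S2(t) - S(t, t) for the bivariate model.
    pair = (math.exp(-(lam1 + lam12) * t)
            + math.exp(-(lam2 + lam12) * t)
            - math.exp(-(lam1 + lam2 + lam12) * t))
    return pair * math.exp(-lam3 * t)


if __name__ == "__main__":
    print(system_failure(1.0, 2.0, 3.0, "series-parallel"))
    print(series_parallel_reliability(0.5, 1.0, 1.0, 1.0, 1.0))
```

A masked observation would then record the lifetime together with a set of candidate causes that contains the true one, rather than the single label returned here.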
3. Parameter Estimation
In our statistical inference, two common censoring schemes are considered: type-I and type-II. For tested systems, through reordering the failure times, we assume that there are systems failures due to the th mechanism in with the failure times , where , , . Obviously there are in total observed failure times and censored observations. For type-I censoring, the test continues until a prespecified time is reached and we observe systems failed; whereas for type-II censoring, the test is carried out until the prespecified systems failures for the th mechanism, and so the test termination time . For both cases, we express the observed life data . The corresponding masked failure event is for the system and masked probability , . The probability density of system at for each case in (7) and (9) is expressed as , , , indicating the failure due to the components 1, 2, and 3 and both components 1 and 2, respectively. Finally, the density function for system at becomes . Therefore, the applicable unified likelihood function for both hybrid systems and censoring schemes is where the constant does not contain the parameters of interest in .
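The two censoring schemes can be sketched in a few lines. This is a simplified illustration with hypothetical function names: it applies the textbook versions of type-I censoring (stop at a fixed time) and type-II censoring (stop at a fixed number of failures) to a list of system lifetimes, and it ignores the per-mechanism bookkeeping described above.

```python
def type1_censor(lifetimes, tau):
    """Type-I censoring: the test stops at the prespecified time tau."""
    observed = sorted(t for t in lifetimes if t <= tau)
    n_censored = len(lifetimes) - len(observed)
    return observed, n_censored, tau


def type2_censor(lifetimes, r):
    """Type-II censoring: the test stops at the r-th observed failure."""
    if not 1 <= r <= len(lifetimes):
        raise ValueError("r must be between 1 and the number of systems")
    ordered = sorted(lifetimes)
    observed = ordered[:r]
    n_censored = len(lifetimes) - r
    stop_time = observed[-1]  # termination time equals the r-th failure time
    return observed, n_censored, stop_time


if __name__ == "__main__":
    sample = [3.0, 1.0, 2.0, 5.0]
    print(type1_censor(sample, 2.5))
    print(type2_censor(sample, 3))
```

In both cases the censored systems contribute the reliability evaluated at the stopping time to the likelihood, while the observed failures contribute density terms.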
For simplicity, we consider only two special cases of failure rates: (1) the components are shocked by independent Poisson processes with the same parameter; that is, $\lambda_1 = \lambda_2 = \lambda_3 = \lambda_{12} = \lambda$; (2) the Poisson processes affecting the three components individually share the same parameter, which differs from that of the Poisson process acting on components 1 and 2 simultaneously; that is, $\lambda_1 = \lambda_2 = \lambda_3 = \lambda \neq \lambda_{12}$. The maximum likelihood estimation (MLE) approach is used for the inference. To simplify notation, we denote the log-likelihood function as $\ell(\theta)$, where $\theta$ is the parameter of failure rates included in the life densities. We also apply the approximate chi-squared likelihood ratio statistic [16] to obtain the confidence intervals of parameters numerically. In particular, for our case, the likelihood ratio statistic for the parameter $\theta$ approximately follows $-2[\ell(\theta) - \ell(\hat{\theta})] \sim \chi^2_p$, where $\theta = \lambda$ or $\theta = (\lambda, \lambda_{12})$ with MLE $\hat{\theta}$, and $p$ is the dimension of $\theta$. In general, this method works well even for small sample sizes; that is, the coverage probability of the constructed interval is very close to the nominal confidence level.
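A likelihood ratio interval of this kind can be computed by root finding on the log-likelihood. The sketch below is not the paper's system likelihood: as a stand-in it uses the log-likelihood of a single exponential rate with `r` observed failures and a total time on test `total_time`, and all function names are illustrative. The interval collects the rate values whose log-likelihood lies within half the chi-squared critical value of the maximum.

```python
import math

# 0.95 quantile of the chi-squared distribution with 1 degree of freedom.
CHI2_1_95 = 3.841458820694124


def loglik(lam, r, total_time):
    """Toy log-likelihood: r failures, total_time units of exposure."""
    return r * math.log(lam) - lam * total_time


def bisect(f, lo, hi, tol=1e-10):
    """Find a root of f on [lo, hi], assuming f changes sign on the interval."""
    f_lo = f(lo)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def lr_interval(r, total_time, crit=CHI2_1_95):
    """Return (mle, lower, upper) for the likelihood ratio interval."""
    mle = r / total_time
    top = loglik(mle, r, total_time)

    def excess(lam):
        # Positive outside the interval, negative inside it.
        return 2.0 * (top - loglik(lam, r, total_time)) - crit

    lower = bisect(excess, mle * 1e-6, mle)
    upper = bisect(excess, mle, mle * 50.0)
    return mle, lower, upper


if __name__ == "__main__":
    print(lr_interval(10, 20.0))
```

For the hybrid systems, the same search would be applied to the numerically maximized system log-likelihood in place of `loglik`.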
3.1. Series-Parallel System
(1) . Based on the reliability in (6) and the densities in (7), the likelihood function (10) becomes So, the log-likelihood can be simplified as and its derivative with respect to is Since no analytical form of MLE can be obtained from the equation , a numerical method has to be implemented for specific data observations. The uniqueness of MLE can be justified in the following way: the terms involving exponent in can be expressed as a unified functional form with positive constants , and . Since , we have Hence, the log-likelihood function is strictly concave and therefore implies a unique MLE . Additionally, , , and so the MLE is a positive value.
(2) . Under this case, the likelihood function (10) reduces to and so the log-likelihood function is The MLEs , can be obtained numerically in the equations . The existence of MLE is provided in the Appendix.
3.2. Parallel-Series System
(1) . Based on the reliability in (8) and the densities in (9), the likelihood function (10) becomes and then the log-likelihood function is Taking derivative with respect to , we obtain Since and , has a positive root .
(2) . Under this special case, the likelihood function (10) then reduces to and so the log-likelihood function is The MLEs , will be obtained numerically in the equations . The proof of the existence of MLE is given in the Appendix.
4. Simulation Study
In this section, we conduct a simulation study to investigate the performance of our methodology. We choose two parameter values of failure rates for each case in the two hybrid systems; that is, in the case of same failure rates, and for the case of different failure rates. Under each setting of parameter values, we carry out simulation study to generate the lifetimes , , following the construction described in Section 2.1 under two sample sizes , for each of which two complete samples () with two settings of failure numbers and two censored samples with two failure numbers ( for and for ) are considered to determine the sample size and variation effects for the estimation precision. We conduct 10,000 Monte Carlo simulations for each setting of parameter value, sample size, and failure number. The averaged MLE, mean squared error (MSE), length of 95% confidence interval, and coverage probability are displayed in Tables 1 and 2 for the series-parallel system and Tables 3 and 4 for the parallel-series system.
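The structure of such a Monte Carlo study can be sketched as follows. This is a toy illustration rather than the study reported in the tables: it replaces the hybrid-system likelihood with the closed-form MLE of a single exponential rate from a complete sample, and the function names, sample size, and replication count are illustrative. It reports the averaged MLE and the MSE over the replications.

```python
import random


def exponential_mle(sample):
    """Closed-form MLE of an exponential rate from a complete sample."""
    return len(sample) / sum(sample)


def monte_carlo_summary(true_rate, n, replications, seed=0):
    """Average MLE and mean squared error over repeated simulated samples."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(replications):
        sample = [rng.expovariate(true_rate) for _ in range(n)]
        estimates.append(exponential_mle(sample))
    mean_est = sum(estimates) / replications
    mse = sum((e - true_rate) ** 2 for e in estimates) / replications
    return mean_est, mse


if __name__ == "__main__":
    mean_est, mse = monte_carlo_summary(true_rate=1.0, n=50, replications=2000)
    print("averaged MLE:", mean_est, "MSE:", mse)
```

Interval length and coverage probability would be accumulated in the same loop by computing a confidence interval for each replication and checking whether it contains the true rate.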




In each table, the estimation results in the upper panel correspond to the complete sample and those in the lower panel to the censored sample. The estimates appear reasonably good under these relatively small sample sizes, and all the coverage probabilities of confidence intervals exceed the nominal confidence level, indicating that interval estimation by the chi-squared likelihood ratio statistic is conservative. As expected, under the same sample size , the MSEs and interval lengths are smaller in complete samples than those in censored samples. Due to the scale of the true parameter values, we note that given the same sample size and failure numbers ’s, the MSE and interval length of estimates under larger true parameter values are consistently larger than those under smaller true values. In Table 1, for example, given , , the MSE = 0.0267 and 95% confidence interval length = 0.7164 when , whereas MSE = 0.0177 and the length = 0.5664 when . However, for a fair comparison of the variability of estimates with different units or different parameter values, one should use a relative variability measure such as the coefficient of variation instead of a measure of dispersion like MSE or interval length. In our case, we propose a “normalized” measure of dispersion to remove the scale effect for the comparison. As a result, the estimation results mentioned above give us and , respectively, which are very close to each other. Similar outcomes are obtained for other estimation results across the tables, indicating a consistent precision for the estimation procedure.
Additionally, other findings can be seen from the estimation results. (i) For the complete samples, the upper panels in the tables interestingly show that given the same size , the MSEs and interval lengths are consistently smaller in the setting of larger variation of ’s than those in the setting of less variation of ’s. In other words, the estimations are more efficient under “unbalanced” failure numbers (’s vary largely) than “balanced” failure numbers (’s are close to each other). A possible reason is that the likelihood function with “unbalanced” failure numbers is less dispersed, so that it carries more information about the parameters. (ii) For the censored samples, the MSE and interval length become smaller as the sample size and failure number become larger. For example, for the true parameter values in the lower panel of Table 2, when , , the MSE and the interval lengths for : 0.7216 and 0.5545, respectively, while the corresponding MSE and interval lengths for : 0.8288 and 0.5792 under , . Furthermore, given the sample size , the MSE and interval length under are smaller than those under , where the MSE and the interval lengths of : 0.8344 and 0.5909. In summary, the results indicate that the estimates are more accurate when more failures are observed.
5. Conclusions and Discussions
In this paper, we have studied statistical inference for three-component hybrid systems based on masked data, for which the lifetimes of units are non-independent and non-identically distributed. Two common censoring schemes, type-I and type-II, were considered in the analysis. We have presented the maximum likelihood estimates of parameters when the failure rates of three components in the hybrid system were assumed to be the same and different, respectively. In addition, we obtained the approximate interval estimation of parameters by using the likelihood ratio statistic. We have assessed the performance of the estimation methods by simulation studies. The results have demonstrated that the procedure can achieve good estimation performance under small and moderate sample sizes, and the estimates are more accurate if more failures are observed, indicating the efficiency of the estimation method. While the method can be extended to more complex systems in the presence of masked data, the representation and evaluation of the likelihood function would become cumbersome for large systems. There is an alternative method based on the signature, which exploits component topology. The system signature is the probability vector whose element is the probability of each component failure resulting in the system failure, and it provides an elegantly simple representation of a system [17]. Some advances and various applications of the signature are discussed in [18–20]. Recently, using the system signature, a Bayesian inference for systems with masked lifetime data was proposed by Aslett [21]. The generic likelihood function for complex systems can be easily expressed by the data augmentation method; the parameter inference relies on the samples from an iterative Markov chain Monte Carlo simulation of all the component failure times and parameters. This computationally intensive method provides an alternative to the traditional likelihood-based approach for dealing with general systems.
Appendices
Proof of existence of MLEs for the likelihood function under the case in both hybrid systems.
A. Series-Parallel System
In the log-likelihood function in (16), taking partial derivatives with respect to and , respectively, First, we notice that (i) given , let ; it is easily seen that and , so there is a positive root for , (ii) given , let ; we have so is decreasing for . Additionally, and . Thus, has a unique positive root . Hence, the MLEs , exist and can be obtained numerically from the equations .
B. Parallel-Series System
For the log-likelihood function in (21), the partial derivatives with respect to and are It is worth noting that (i) given , let , and and