#### Abstract

We propose a Bayesian reliability framework for a satellite-equipped harmonic gear drive (HGD) that addresses two issues left unsettled in existing research: efficient information aggregation for a multilevel system with imbalanced information, and multisource data fusion in a hierarchical structure. The proposed approach improves the reliability estimate for the whole system by utilizing all available information throughout the system. In particular, the heterogeneous data sets are classified, according to their mathematical characteristics, into three types: statistical data, time-correlated data, and performance test data. Novel features include a comprehensive information aggregation approach that takes multilevel information into account and an information-based fusion strategy. A real HGD device serves as the case study for illustration and validation.

#### 1. Introduction and Motivation

Possessing the advantages of scalability, tractability, and modularity [1], the multilevel hierarchical structure is widely adopted in the design of complex engineering systems. Such systems are deployed to meet demanding functional and performance requirements in mission-critical industries, for example, infrastructure, manufacturing, aviation, and high-level national security, where failure in use can have devastating consequences. A multilevel hierarchical engineering system may consist of multiple subsystems that are themselves composed of lower-level subsystems or components. In this article, a component, a subsystem, or the whole system is referred to as an “element.” These elements are interconnected and interact with each other, jointly contributing to the functionality of the whole system.

Recent years have witnessed the development of effective approaches (e.g., the stochastic process approach and the universal generating function approach) to multilevel system reliability modeling, as summarized in [2]. For a multilevel hierarchical system, one critical issue is to accurately model and analyze the reliability of the whole system under an information imbalance scenario. To illustrate, consider the information available for a hierarchical system. The bottom-level elements are mostly standard components produced and deployed in high volume, so their reliability information is either readily accessible or easily obtainable. However, since experiments on a system as a whole are often costly and time-consuming, accumulated knowledge at the system level is limited, and reliability information is often scarce or even absent for top-level elements. This disparity between abundant bottom-level information and scarce system-level information is the so-called information imbalance scenario [3]. Under such circumstances, it is desirable to aggregate the lower-level information to compensate for the inadequacy of higher-level information and thereby obtain a better reliability estimate at the system level. Frequentist approaches are mathematically cumbersome for information aggregation [3]. Alternatively, Bayesian approaches provide satisfactory and flexible ways to aggregate the available multilevel reliability information [4]. For example, Johnson et al. [5] present a “full-Bayesian approach” for integrating all the reliability information related to a system. This approach resolves the upward and downward propagation problem by simultaneously modeling the complete set of system parameters. Hamada et al. [6] generalize this procedure to fault tree quantification. Li et al. [7] propose a semiparametric modeling approach for hierarchical systems with multilevel information aggregation. 
Benefiting from the successful application of Markov chain Monte Carlo (MCMC) methods in Bayesian inference, the choice of prior distribution is no longer limited to conjugate distributions, and the modeling of multitype data has been significantly extended. Recent works, including those of Li et al. [8], Pan and Yontay [9], and Guo et al. [10], are well summarized in [11–18].

Another critical issue in reliability modeling and assessment of multilevel complex systems is the difficulty of dealing with heterogeneous data sets. Because reliability tests are costly and time-consuming, the available data sets for many higher-level elements of a multilevel system are often restricted to attribute data (e.g., pass/fail data). By contrast, the data collected for lower-level components are abundant and take more informative forms (e.g., failure time data or degradation data). Methods for system reliability analysis with lifetime or degradation data have been put forward by Huang et al. [19], Wang et al. [20], and Ye et al. [21]. However, most of the aforementioned studies focus on one type of data, and little attention has been paid to the scenario of mixed multisource data; the few exceptions are the works in [22–25]. The reliability assessment of a multilevel system using both reliability test data and performance test data has not yet been sufficiently addressed. In fact, for a product in the design or development stage, few experiments are conducted purely for reliability purposes, while more experiments are conducted to test certain performance characteristics or to verify whether the design scheme satisfies the engineering demand. Performance test data therefore make up the major part of all available test data, compared with reliability test data. A similar situation arises in field measurement, where data are collected from products during their service life. In such cases, a comprehensive reliability approach that takes into account both the multilevel structure and multisource data is of significant importance [26]. Figure 1 provides a graphical description of the two crucial issues in a multilevel hierarchical system. 
To fill the research gap and establish a more generic reliability information aggregation framework, this article proposes a Bayesian-based approach to model and analyze the reliability of a system with *multisource data and imbalanced information* (MSDII).

This article is organized as follows. In Section 2, the two critical issues of the MSDII system mentioned above are illustrated through a real harmonic gear drive (HGD) device, and the rationale of the proposed information aggregation approach is introduced. The Bayesian-based information aggregation process is then discussed in detail in Section 3 from both the single-element and interelement perspectives. In Section 4, a case study is presented for validation, and the benefits and effectiveness of the proposal are illustrated through a series of computations and comparisons. Finally, concluding remarks are drawn in Section 5.

#### 2. Problem Description of the MSDII System

The HGD device is the core part of the double-axis driving mechanism (DADM), which is used for erecting the antenna. In this paper, the HGD device is used as an example to illustrate the two critical problems in the MSDII system. Its physical model is shown in Figure 2, and the system structure is represented by a hierarchical model in Figure 3. The whole device can be divided into two main subsystems: a spline system and a harmonic wave generator. The spline system consists of a flexspline and a circular spline. The harmonic wave generator can be further decomposed into three components, that is, a wave generator producing basic torsion, a convex gear transmitting force and moment, and a direct current motor providing power.

Figure 4 presents the scenario of multisource data and imbalanced information in the development and manufacturing phases of the HGD. For basic components at the bottom level, we have abundant test data and sufficient prior knowledge. This prior information (e.g., expert judgment, historical records of prototypes, and past experience with similar products) produces an informative and accurate prior probability distribution in the Bayesian inference. However, since experiments on the system as a whole are both costly and time-consuming, test data are limited for higher-level elements. In addition, the prior knowledge is usually insufficient to construct an accurate and informative prior distribution; thus, a noninformative or diffuse prior is assigned on account of the scarce or even absent prior knowledge. Table 1 summarizes the characteristics of the three types of data (i.e., the statistical data, the time-correlated data, and the performance test data).

To balance the inadequacy of the available data, the Bayesian approach can be applied to explicitly incorporate prior knowledge into statistical modeling [4, 5]. The rationale of this framework is that based on the functional relationships of system elements, the lower-level reliability information can be explicitly aggregated to the system level and elicited as an *attached probability index* for the system reliability combination to compensate its information inadequacy. The detailed information aggregation procedure will be illustrated in the next section.

#### 3. Bayesian-Based Multilevel and Multisource Data Fusion

This section presents the details of the proposed Bayesian-based information aggregation method, which is developed to resolve the reliability of a multilevel system with multisource data in an information imbalance scenario.

##### 3.1. Single-Element Information Aggregation

###### 3.1.1. BIC-Based Model Selection

In this section, a Bayesian information criterion- (BIC-) based model selection strategy is developed for single-element information aggregation. The HGD system shown in Figure 3 is considered here without loss of generality, in which the Bayesian network (BN) is constructed and shown in Figure 5.

(i) For the statistical data that only have the survival or failure records (e.g., pass/fail data and count data), the binomial distribution can be used to model the pass/fail data and the Poisson distribution can be adopted to model the count data set, as stated in [5, 23].

(ii) For the time-correlated data and performance test data, there is usually a number of candidate models (e.g., exponential, Weibull, and lognormal).

In practical engineering, the model selection is a case-based procedure and the mathematical convenience should be taken into consideration. In this paper, the BIC is employed to help select an appropriate probability model among several candidates. The BIC (also named SBC and SBIC) is a criterion for model selection among finite sets of models, which is proposed by Schwarz [27] and further developed by Diciccio et al. [28].

The BIC has the following basic form:

$$\mathrm{BIC} = -2\ln L(\hat{\theta} \mid x) + k\ln n, \tag{1}$$

where $x$ represents the observed data, $n$ denotes the sample size, $k$ refers to the number of free parameters, $L$ is the likelihood function, and $\hat{\theta}$ denotes the parameter estimates that maximize the likelihood function. For a given set of test data, the smaller the BIC value is, the more acceptable the model is.
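As a minimal sketch of this selection step, the snippet below scores two of the candidate lifetime models by BIC using their closed-form maximum likelihood estimates. The failure times and all function names are hypothetical, and the Weibull candidate is omitted only because its MLE needs a numerical solver.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """BIC = -2 ln L(theta_hat | x) + k ln(n); a smaller value is preferred."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)

def exponential_loglik(times):
    # The MLE of the exponential rate is 1 / (sample mean).
    n = len(times)
    rate = n / sum(times)
    return n * math.log(rate) - rate * sum(times)

def lognormal_loglik(times):
    # The MLEs are the mean and the (biased) variance of the log-times.
    n = len(times)
    logs = [math.log(t) for t in times]
    mu = sum(logs) / n
    var = sum((v - mu) ** 2 for v in logs) / n
    return sum(
        -v - 0.5 * math.log(2.0 * math.pi * var) - (v - mu) ** 2 / (2.0 * var)
        for v in logs
    )

# Hypothetical failure times (hours), not taken from the case study.
failure_times = [105.0, 98.0, 110.0, 102.0, 95.0, 108.0, 101.0, 99.0]
n_obs = len(failure_times)
scores = {
    "exponential": bic(exponential_loglik(failure_times), 1, n_obs),
    "lognormal": bic(lognormal_loglik(failure_times), 2, n_obs),
}
best_model = min(scores, key=scores.get)
```

The penalty term `k ln(n)` is what keeps the two-parameter model from winning merely because it has more freedom to fit.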

###### 3.1.2. Likelihood Derivation for Heterogeneous Data Set

To build a comprehensive model for a multilevel system, heterogeneous data sets need to be incorporated. In this section, the likelihood contributions of the three types of data to the joint likelihood are derived.

*(1) The Statistical Data Set*. The statistical data set is defined as a data collection that only contains the dimensionless “number” information of the test unit or the element of interest. That is, the data set only records the number of units that survived or failed in an experiment or during a period of actual service (e.g., pass/fail data). Its mathematical form can be represented as a specification of . From an information perspective, this type of data provides only count information and is considered less informative than the other types of data. Suppose that units are subject to independent sampling tests, where . The number of passing units in each test is . Let be the random variable following a binomial distribution. The probability that a unit passes the test is determined by its parameter . The model for is described as

The likelihood for a single observation can be described as

And the likelihood for the whole statistical data set is where all parameters in the statistical data set are involved in the parameter vector .
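The binomial likelihood described above can be sketched as follows. The notation is an assumption made for illustration: each record is a pair of units tested and units passed, and a single pass probability `p` is shared by all tests. The records themselves are hypothetical.

```python
import math

def binomial_loglik(p, records):
    """Log-likelihood of a pass probability p for (units_tested, units_passed) records."""
    total = 0.0
    for n_units, n_passed in records:
        total += (
            math.log(math.comb(n_units, n_passed))
            + n_passed * math.log(p)
            + (n_units - n_passed) * math.log(1.0 - p)
        )
    return total

# Hypothetical pass/fail records from three independent tests.
records = [(10, 9), (10, 10), (10, 8)]

# Under a shared p, the MLE is the pooled pass fraction.
p_mle = sum(x for _, x in records) / sum(n for n, _ in records)
loglik_at_mle = binomial_loglik(p_mle, records)
```

Working on the log scale turns the product over tests into a sum, which is also the form an MCMC sampler would evaluate.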

*(2) The Time-Correlated Data Set*. The time-correlated data set is defined as a type of data collection that contains both quantity and time information of a specific unit. For example, failure time data give exact records of failure times, while censored time data provide only partial information about failure times because of limited test conditions. Its mathematical form can be represented as the specification of a pair , and the time-correlated data are considered more informative than the simple statistical data: the information obtained from a collection of time-correlated data covers not only the number of failures in a batch of tested units but also the time point at which each unit fails.

Various probability distributions can be used to model the time-correlated data set: exponential, Weibull and lognormal distribution have been employed in different research works [22, 23]. If the likelihood contribution of the exact failure time is denoted as , the likelihood contribution of the left-censored time can be expressed as the cumulative distribution function (CDF) while the likelihood contribution of the right-censored time is derived similarly as . The likelihood function of the whole failure time data set could be obtained by multiplying the three individual contributions as where , , and denote the exact time, left-censored time, and right-censored time of the failure, respectively, and , , and denote the number of data points, respectively. All related parameters are involved in the parameter vector .
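A minimal sketch of the three likelihood contributions, assuming an exponential lifetime model purely for illustration (the paper selects the model by BIC). Exact failure times contribute the density, left-censored times the CDF, and right-censored times the survival function. All data and names are hypothetical.

```python
import math

def censored_exponential_loglik(rate, exact, left, right):
    """Log-likelihood of an exponential rate with exact and censored times."""
    # Exact failure times: log f(t) = log(rate) - rate * t
    ll = sum(math.log(rate) - rate * t for t in exact)
    # Left-censored times (failed before t): log F(t) = log(1 - exp(-rate * t))
    ll += sum(math.log(1.0 - math.exp(-rate * t)) for t in left)
    # Right-censored times (still working at t): log(1 - F(t)) = -rate * t
    ll += sum(-rate * t for t in right)
    return ll

# Hypothetical test records (hours).
exact_times = [100.0, 150.0, 200.0]
left_censored = [50.0]
right_censored = [250.0, 250.0]

loglik = censored_exponential_loglik(0.005, exact_times, left_censored, right_censored)
```

With no left-censored records, the maximizing rate is the number of exact failures divided by the total time on test, which gives a quick sanity check.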

*(3) The Performance Test Data Set*. The performance test data set is defined as a type of data collection that records a physical quantity reflecting the performance of a product, for example, the starting acceleration of a motion mechanism or the output voltage of an electronic product. Its mathematical form can be represented as the specification of a triple . Generally, reliability refers to the capacity of a component or a system to perform its required functions under stated operating conditions during a specified period of time [29]. Failure occurs when the performance exceeds a predefined threshold as

For a practical engineering problem, the reliability of one product is always related to multiple performance parameters and the reliability of this product is usually described as

Equation (7) can be intuitively explained as follows: the reliability of one product is determined by the “worst” aspect in its performance.

From an information point of view, the performance test data are much more informative than the previous two types of data, because they provide not only the “outcome” information at the failure point but also the detailed “procedure” information. The main challenge in modeling this type of data is handling multisource data with different units (e.g., velocity and acceleration collected from a sensor and real-time monitored voltage). To model all the performance test data in a coherent way, we transform all performance test data with a nondimensional operation as where is the initial performance value, which serves as the nondimensional operator. Based on (8), the performance function starts at 100 percent. Given the data measurement error , the likelihood function for an individual data point at the time point is where is the probability density function (PDF) of the standard normal distribution and denotes the value of a case-based performance function at the time point . Taking the multiple performance parameters at a series of time points into consideration, the likelihood of the whole performance test data set is derived as where all parameters of the performance test data set are included in the parameter vector .
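The two steps in this paragraph, scaling by the initial value and scoring each measurement with a Gaussian error model, can be sketched as below. The linear performance function `1 - slope * t` is a stand-in chosen for illustration, since the performance function is case-based; the voltage readings and all names are hypothetical.

```python
import math

def to_nondimensional(raw_values):
    """Divide every reading by the initial one, so the series starts at 1.0 (100 percent)."""
    initial = raw_values[0]
    return [value / initial for value in raw_values]

def gaussian_loglik(slope, sigma, times, scaled_values):
    """Sum of log N(y | g(t), sigma^2) with the assumed g(t) = 1 - slope * t."""
    total = 0.0
    for t, y in zip(times, scaled_values):
        predicted = 1.0 - slope * t  # hypothetical performance function
        z = (y - predicted) / sigma
        total += -0.5 * z * z - math.log(sigma * math.sqrt(2.0 * math.pi))
    return total

# Hypothetical output-voltage readings (volts) at five time points.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
voltage = [12.0, 11.9, 11.75, 11.65, 11.5]

scaled = to_nondimensional(voltage)
loglik_good = gaussian_loglik(0.01, 0.005, times, scaled)
loglik_poor = gaussian_loglik(0.05, 0.005, times, scaled)
```

Because every series is scaled to start at 1.0, readings with different physical units can share one likelihood without unit conversion.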

###### 3.1.3. The Bayesian Model and Derivation of the Basic Probability Index

According to Bayes' theorem, the posterior distribution of a parameter is proportional to the product of the likelihood function and the prior distribution. The joint likelihood function can be obtained by multiplying together the respective likelihood contributions of all available data sets. Let , , and denote the sets of elements with statistical data, time-correlated data, and performance test data, respectively; then, the joint likelihood function is derived as where all parameters are involved in the parameter vector . Given the prior distribution of the model parameter vector , the Bayesian model is obtained as where is the joint prior distribution of the system model parameters and is the joint posterior distribution of the model parameter vector .

There is usually no analytic solution to (12); however, Markov chain Monte Carlo (MCMC) sampling techniques (e.g., the Metropolis algorithm or the Gibbs sampler) can be applied to obtain a numerical solution. The Metropolis random walk algorithm (MRWA) [30] is employed here to draw samples. Once the posterior of the parameter vector has been obtained, other reliability indexes can be evaluated from a sufficient number of posterior samples of the parameters based on (13) and (14).
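For illustration only, a one-parameter random walk Metropolis sampler is sketched below for a binomial pass probability with a uniform prior. The data (27 passes in 30 trials), the step size, and all function names are assumptions; the sampler in the paper acts on the full parameter vector.

```python
import math
import random

def log_posterior(p, n_units, n_passed):
    """Unnormalized log-posterior: binomial likelihood times a uniform prior on (0, 1)."""
    if not 0.0 < p < 1.0:
        return -math.inf
    return n_passed * math.log(p) + (n_units - n_passed) * math.log(1.0 - p)

def metropolis_random_walk(log_target, start, step, n_samples, burn_in, rng):
    """Draw samples with a symmetric Gaussian random-walk proposal."""
    current, current_lp = start, log_target(start)
    samples = []
    for i in range(n_samples + burn_in):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_target(proposal)
        # Accept with probability min(1, target(proposal) / target(current)).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
        if i >= burn_in:
            samples.append(current)
    return samples

rng = random.Random(42)
samples = metropolis_random_walk(
    lambda p: log_posterior(p, 30, 27), start=0.5, step=0.1,
    n_samples=20000, burn_in=1000, rng=rng,
)
posterior_mean = sum(samples) / len(samples)
```

In this toy case the exact posterior is Beta(28, 4), with mean 0.875, so the sample mean can be compared against a known answer before the sampler is trusted on a model with no closed form.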

It should be noted that the information used to obtain the parameter posterior distribution from (12) and the reliability indexes from (13) and (14) is single-element based; namely, no additional interelement information (e.g., other model outputs and the system structural formulation) is incorporated. Since the elements of a multilevel system are interconnected, it is desirable to aggregate all potentially valuable information for reliability analysis and evaluation. To this end, we first estimate the parameters from the two different information bases (i.e., the native information and the induced information) and then integrate them into a comprehensive one. The probability indexes derived from (13) and (14) are defined as *basic probability indexes*, and the *attached probability indexes* based on the interelement information are discussed in the next section.

##### 3.2. Interelement Information Aggregation

In this section, we discuss the derivation of attached probability indexes and the combination method in information aggregation.

###### 3.2.1. The Structural Model of the MSDII System and Derivation of Attached Reliability Indexes

To illustrate the concept of attached probability indexes, let denote the element of interest. For a coherent system (commonly encountered examples include series systems, parallel systems, and $k$-out-of-$n$ systems), a probability index of the element can be calculated based on the induced information (such as other model outputs and the system structural formulation) as (15), and its PDF is described as (16), where is the structure function determined by the direct predecessors of the element , is a set of indexes of the element direct predecessors, and and denote the attached probability index and the basic probability index (reliability), respectively. To specify the structure function , all relevant conditional probability tables (CPTs) in the BN will be used. For demonstration purposes, the simple series structure and parallel structure are employed. Thus, (15) has a specific form for the series structure as

whereas that for parallel structure is

It is emphasized that no native information is used in the computation of the attached reliability indexes. Thus, the accuracy and the precision of results fully depend on the validity and adequacy of the induced information. Theoretically, if all basic probability indexes are accurately derived and the system structure is well investigated, the attached probability indexes derived are in correspondence with the basic probability indexes. However, due to the limited knowledge about a complex system, the given prior distribution may be inaccurate and the system function structure may be biased. Thus, the attached probability index is only an ideal inference value that can be used to revise the basic probability index.
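A sample-based sketch of the attached index for the two demonstration structures: a series element works only if all of its predecessors work (product of reliabilities), and a parallel element fails only if all of them fail. The Beta-distributed samples stand in for posterior samples of the predecessors' basic indexes, and treating the two spline components as a series pair is an assumption made for this example.

```python
import random

def series_reliability(child_reliabilities):
    """Series structure: the product of the predecessors' reliabilities."""
    product = 1.0
    for r in child_reliabilities:
        product *= r
    return product

def parallel_reliability(child_reliabilities):
    """Parallel structure: one minus the product of the predecessors' unreliabilities."""
    all_fail = 1.0
    for r in child_reliabilities:
        all_fail *= 1.0 - r
    return 1.0 - all_fail

def attached_samples(child_sample_sets, structure):
    """Push joint draws of the predecessors through the structure function."""
    return [structure(draws) for draws in zip(*child_sample_sets)]

# Hypothetical posterior samples of two predecessors' basic indexes.
rng = random.Random(0)
flexspline = [rng.betavariate(95, 5) for _ in range(1000)]
circular_spline = [rng.betavariate(98, 2) for _ in range(1000)]

spline_attached = attached_samples([flexspline, circular_spline], series_reliability)
```

Propagating samples rather than point estimates carries the predecessors' uncertainty into the attached index, so it is obtained as a distribution.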

To better estimate the reliability of the whole system, all potentially valuable information (both native and induced) should be taken into consideration. A comprehensive combination method is therefore required in the information aggregation as well as in the reliability analysis.

###### 3.2.2. Probability Index Combination

Combining the attached probability index with the basic probability index yields a new combined probability index, which provides a comprehensive way for reliability estimation taking into account all related information. For the proposed MSDII system, the linear opinion pool method is used to combine these two types of probability indexes. It is one of the most widely used formal approaches for combining probability distributions in the field of expert judgment [31, 32]. The combined probability index is formulated as
where , , and correspond to the combined probability index, the induced probability index, and the native probability index of the element in level , respectively. A weighting coefficient *w* is assigned to balance the contribution of the basic probability and attached probability . To deal with the information imbalance scenario in the MSDII system, the parameter *w* is determined by the ratio of the information content of available data sets between adjacent elements as
where , , and denote the information content of the statistical data set, time-correlated data set, and performance test data set of element , respectively. is a set of indexes of the element direct predecessors.

Based on the information theory founded by Shannon [33] and extended in [34, 35], the amount of self-information contained in a probabilistic event depends on the probability of occurrence of that event. It is given as

$$I(x) = -\log P(x), \tag{21}$$

where $P(x)$ is the probability of occurrence of the event $x$.

The calculation of the self-information for the three types of data is discussed as follows.
(1)Self-information of the statistical data set: Let denote the statistical data set, which contains independent data records, and the vector can be specified as , where and denote the number of total units tested and the number of survived units, respectively. For one data point , its self-information amount could be calculated as
where the estimated value of parameter is obtained as (23) based on the maximum likelihood estimation (MLE).
Thus, the self-information of the statistical data set is derived as
(2)Self-information of the time-correlated data set: Let denote the time-correlated data set, which consists of lots of components. For an individual component , its self-information could be expressed as
where the estimated value of parameter is obtained as (26) based on the MLE.
Thus, the self-information of the time-correlated data set is derived as
(3)Self-information of the performance test data set: Given the performance test data set , there are *L* measurements observed at each observation time point and lots of observation time points in total. Since the measurements can be modeled by a Gaussian distribution as (9), the probability of one component observed is obtained as
where is the CDF of the standard normal distribution, is estimated as , and is estimated as based on the MLE. For an individual component , its self-information amount could be calculated as
The self-information of the whole performance test data set is derived as
The idea behind this combination is that, by setting an information aggregation weighting coefficient , the contributions of the basic probability index and the attached probability index can be balanced quantitatively according to the calculated information content. Sufficient test data produce a higher value of , which assigns a larger weight in the combination procedure. As a consequence, the combined probability index inherits the major belief from the probability index with higher . The probability index combination provides a comprehensive method to incorporate the native information and the induced information in the reliability analysis of a multilevel system.
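As a rough sketch of the weighting idea, the snippet below computes the self-information of pass/fail records with the pass probability replaced by its pooled MLE, and then turns two information contents into a weight. The specific ratio used in `native_weight` (native information over native plus induced information) is an assumption for illustration, as are the base-2 logarithm, the records, and all names.

```python
import math

def binomial_self_information(records):
    """Sum over records of -log2 P(X = passed), with p set to the pooled MLE."""
    p_hat = sum(x for _, x in records) / sum(n for n, _ in records)
    total = 0.0
    for n_units, n_passed in records:
        prob = (
            math.comb(n_units, n_passed)
            * p_hat ** n_passed
            * (1.0 - p_hat) ** (n_units - n_passed)
        )
        total += -math.log2(prob)
    return total

def native_weight(native_info, induced_info):
    """Assumed form: the share of the total information content that is native."""
    return native_info / (native_info + induced_info)

# Hypothetical records: one small system-level test, two larger lower-level tests.
system_info = binomial_self_information([(10, 9)])
lower_level_info = binomial_self_information([(50, 47), (50, 48)])

w_native = native_weight(system_info, lower_level_info)
```

With these numbers the lower-level records carry more information, so the induced side receives the larger share of the weight.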

###### 3.2.3. Reliability Analysis and Parameter Estimation

The combined probability obtained from (19) contains the prior knowledge and test data of both native information and aggregated structural information. It resolves the information inadequacy problem for elements at level and improves the reliability estimation for the element of interest. To meet the engineering demands, it is beneficial to analytically derive some probability indexes. In this section, the probability indexes for both reliability analysis and physical parameters are studied from a Bayesian perspective.

*(1) Reliability Probability Indexes*. Based on the combined probability index given in (19), the reliability of the system as a function of mission time can be obtained as
where denotes the available test data in the MSDII system and is the joint posterior distribution of the parameters derived in (12). The failure rate of the system at the present time can be computed as

Additionally, the mean time to failure (MTTF) can be also calculated as
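The three indexes can be approximated by averaging over posterior samples. The sketch below assumes an exponential lifetime with rate `lam`, so that the conditional reliability is `exp(-lam * t)`; this model, the Gamma-distributed stand-in for posterior samples, and the function names are all illustrative assumptions.

```python
import math
import random

def reliability(t, rates):
    """Posterior-averaged reliability: the mean of exp(-lam * t) over the samples."""
    return sum(math.exp(-lam * t) for lam in rates) / len(rates)

def failure_rate(t, rates):
    """Posterior-averaged density divided by posterior-averaged reliability."""
    density = sum(lam * math.exp(-lam * t) for lam in rates) / len(rates)
    return density / reliability(t, rates)

def mttf(rates):
    """Integral of R(t) over t >= 0, which equals the mean of 1 / lam here."""
    return sum(1.0 / lam for lam in rates) / len(rates)

# Hypothetical posterior samples of the failure rate (per hour), mean about 1e-3.
rng = random.Random(1)
rate_samples = [rng.gammavariate(20.0, 1.0 / 20000.0) for _ in range(5000)]

r_500 = reliability(500.0, rate_samples)
h_500 = failure_rate(500.0, rate_samples)
mean_life = mttf(rate_samples)
```

Averaging over samples is the simulation-based counterpart of integrating the conditional index against the joint posterior.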

*(2) Physical Parameter Probability Distribution*. In other scenarios, the probability distribution of a certain physical parameter is of more interest than the reliability indexes, especially for a product in the design and development phase. We denote the parameter that is explicitly expressed in the reliability calculation (i.e., the parameter has been set as a threshold value for reliability calculation) as and the implicitly expressed parameters as . Based on (13), (15), and (31), we have

The deterministic physical model can be expressed as

Equation (35) can be rewritten in a vector form as , where and . By employing the *transformations of random variables* ([36], p.59), the joint posterior distribution of other remaining parameters is given as
where denotes the probability distribution of the physical model response corresponding to the relationship function of the remaining parameter, is the physical model response, and is the associated Jacobian of the transformation. Similar to the joint posterior distribution in (12), most of these probability indexes cannot be analytically specified. The calculations are based on the posterior samples of model parameters using simulation-based integration.

##### 3.3. Summary of the MSDII System Information Aggregation Framework

Several key steps of the proposed approach for the MSDII system are summarized as follows:

*Step 1. *Select the appropriate model based on the BIC calculation result.

*Step 2. *Derive the likelihood for elements with multisource data.

*Step 3. *Compute the basic probability indexes based on the native information.

*Step 4. *Calculate the attached probability indexes based on the induced information.

*Step 5. *Construct the combined probability indexes based on the information content of all available data sets.

*Step 6. *Estimate the probability indexes for reliability analysis or physical performance.

The flow diagram of the proposed approach is presented in Figure 6.

#### 4. Numerical Case Study

##### 4.1. Problem Description and Predefined Settings

A numerical case study is conducted to illustrate the proposed information aggregation procedure and demonstrate its effectiveness in reliability estimation. The hierarchical HGD system is again investigated here. Figure 7 shows the HGD system structure considered in this section.

The traditional information aggregation methodology encounters an obstacle in reliability modeling and the analysis of the HGD system due to the existence of imbalanced information and multisource data. To illustrate, the available information of bottom-level components is well accumulated, which produces accurate prior knowledge. Nevertheless, the sub-system-level information is not always accessible due to the unaffordable cost of the entire system experiment; thus, the available information is strictly limited.

Case study settings are based on a real HGD system but have been adjusted for confidentiality reasons. Table 2 gives the predetermined probability indexes of bottom elements, and the detailed settings of all relevant CPTs are given in Table 3. Based on these, probability indexes of the remaining elements can be calculated, as shown in Table 4, where denotes the normal state and corresponds to the failure state. They serve as the ground truth in the case study. To imitate an information imbalance scenario, the performance test data set with a sample size of 100, the time-correlated data set with a sample size of 50, and the statistical data set with a sample size of 10 (see details in Table 5) are randomly generated for components, subsystems, and the system-level elements, respectively, based on the ground truth. Additionally, accurate native priors are assigned to the components since the accumulated knowledge about them is abundant. For the sub-system-level elements, diffuse native priors are assigned considering the fact that only bounds of certain parameters can be determined. Because the prior knowledge is almost absent at the top system level, noninformative native priors are adopted.

This case study is demonstrated to show how the proposed method improves the reliability evaluation result by using the imperfect information (i.e., inaccurate prior distribution and limited available data). The reliability framework for the case study is shown in Figure 8.

##### 4.2. Bayesian-Based Information Aggregation Method and Reliability Estimation

###### 4.2.1. Single-Element Information Aggregation

The multilevel information aggregation procedure starts with the model selection step. Since the available test data set for the system-level element is nondimensional statistical data with a binary state, we use the binomial distribution to model the data set. For other elements, appropriate models are selected based on the computation of the BIC value among candidate distributions, and the calculated results are shown in Table 6.

Based on the likelihood constructed above, the basic probabilities (i.e., the reliability calculated using native information) are obtained from (13). These results are presented in Table 7. It is observed that the estimation results for component-level elements (~) are highly precise, whereas those for sub-system-level elements ( and ) are less accurate. For the system-level element (), the result is significantly biased. In fact, calculating the basic probability using only native information is essentially no different from the traditional Bayesian approach. The traditional approach yields a satisfactory reliability analysis outcome under ideal circumstances; however, the result is biased if the prior is inaccurate or some data are misleading. Under the information imbalance scenario, it is desirable to use the components’ information to compensate for the system information inadequacy and improve the reliability estimate. Thus, interelement information aggregation is needed.

###### 4.2.2. Interelement Information Aggregation

Since the system structural function is given by the CPT in Table 3, the attached probabilities for the higher-level elements are calculated based on (15). Using the self-information derived for the heterogeneous data sets in (22) through (30), the information content of the elements with multisource data is obtained, from which the information aggregation factor is calculated. The combined probability is then obtained from these quantities.

It should be noted that the three quantities in (22) are random variables with probability distributions. A direct way to calculate (40) is to combine parameters after first estimating an explicit form of the probability distribution, but the parameter estimation step introduces additional error. In this paper, we instead adopt a sample-based method, mixing samples generated from the basic probability and the attached probability to obtain enough random samples of the combined probability. In this way, no additional deviation is introduced and an accurate combined probability is derived.
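The sample-based idea can be sketched as below. This is one possible reading of "mixing samples", written as a finite mixture in which each output sample is drawn from the attached-probability samples with probability `w` and from the basic-probability samples otherwise. The Beta distributions, the weight `w`, and the function name `mix_samples` are hypothetical; the exact form of (40) is not reproduced here.

```python
import random

def mix_samples(basic, attached, weight, n_out, rng):
    """Draw n_out samples from a mixture of two sample sets.

    Each output sample is taken from `attached` with probability `weight`
    and from `basic` otherwise, so no parametric form is ever fitted.
    """
    out = []
    for _ in range(n_out):
        source = attached if rng.random() < weight else basic
        out.append(rng.choice(source))
    return out

rng = random.Random(0)

# Hypothetical posterior samples of a reliability value.
basic = [rng.betavariate(12, 3) for _ in range(5000)]     # native information, mean 0.80
attached = [rng.betavariate(95, 5) for _ in range(5000)]  # propagated from lower levels, mean 0.95

w = 0.7  # hypothetical information aggregation factor
combined = mix_samples(basic, attached, w, 10000, rng)
mean_combined = sum(combined) / len(combined)
print(round(mean_combined, 3))
```

Because the output is itself a set of samples, summaries such as the posterior mean or credible intervals can be read directly from `combined` without estimating distribution parameters.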

###### 4.2.3. Result Analysis and Discussion

Figure 9 compares the three posterior probability distributions (i.e., the basic probability, the attached probability, and the combined probability). For the higher-level elements, neither the basic probability nor the attached probability alone fits the ground truth well. By contrast, the combined probability fits it satisfactorily, indicating that the proposed approach is effective for the reliability analysis of a multilevel system. Further, to validate the effectiveness of our approach in uncertainty reduction, three subcases with different model specifications are presented: (i) only system-level information is utilized (the least information, with only statistical data); (ii) both system-level and sub-system-level information are utilized (more information, with statistical data and time-correlated data); and (iii) all available test data are employed (all information, with all three types of data). The comparison of the three subcases in Figure 10 suggests that as more information is aggregated, the estimate improves and the uncertainty is significantly reduced. Table 8 shows the 95% credible intervals and the posterior means and medians of the reliabilities for the higher-level elements. The 95% credible intervals fully cover the predetermined true values. The prior and posterior distributions of the failure rates are calculated based on (31) and (32) and listed in Table 9. The uncertainty reduction is calculated from *l*_{G} and *l*_{P}, the 95% credible interval lengths of the parameter distributions derived by the general Bayesian method and the proposed approach, respectively.
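The interval-length comparison can be sketched from posterior samples as follows. The relative form (l_G - l_P) / l_G used here is an assumption about how the reduction is expressed, since the original expression is not reproduced in the text; the equal-tailed interval and the two Beta posteriors are likewise illustrative choices.

```python
import random

def credible_interval(samples, level=0.95):
    """Equal-tailed credible interval read from sorted posterior samples."""
    s = sorted(samples)
    n = len(s)
    lo_idx = int((1.0 - level) / 2.0 * (n - 1))
    hi_idx = int((1.0 + level) / 2.0 * (n - 1))
    return s[lo_idx], s[hi_idx]

def uncertainty_reduction(samples_general, samples_proposed, level=0.95):
    """Relative shrinkage of the credible interval length (assumed form)."""
    lo_g, hi_g = credible_interval(samples_general, level)
    lo_p, hi_p = credible_interval(samples_proposed, level)
    l_g = hi_g - lo_g
    l_p = hi_p - lo_p
    return (l_g - l_p) / l_g

rng = random.Random(1)
# Hypothetical posteriors: a wide one (little data) and a narrow one (aggregated data).
general = [rng.betavariate(6, 1) for _ in range(5000)]
proposed = [rng.betavariate(60, 4) for _ in range(5000)]

reduction = uncertainty_reduction(general, proposed)
print(f"uncertainty reduction: {reduction:.1%}")
```

A positive value means the second posterior has a shorter 95% credible interval than the first.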

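[Figure 9, panels (a), (b), and (c): comparison of the basic probability, attached probability, and combined probability distributions.]

[Figure 10, panels (a), (b), and (c): comparison of the three subcases with increasing amounts of aggregated information.]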

Note that several assumptions are made in this case study. They are acceptable in this numerical case for validation purposes, but practical engineering applications may require additional considerations. First, we do not consider model uncertainty; that is, we assume that the established model (system configuration) fully agrees with the ground truth. For a practical engineering problem, the simulation model should be validated against the physical model before the approach is implemented. Second, we assume that the data are accurately obtained, so measurement error is omitted. In practice, the collected data should be carefully calibrated before they are aggregated in the Bayesian model, since measurement error can significantly affect the estimation result. For relevant research works, please refer to [14–16].

#### 5. Conclusion

This article proposes a multilevel information aggregation approach for accurately evaluating the reliability of an MSDII system, addressing two crucial issues in engineering reliability evaluation: imbalanced information and multisource data fusion. Based on the system structural function, the reliability information of component-level elements is aggregated to compensate for the information inadequacy at the system level, which improves the reliability analysis of the whole system. A case study on a real HGD system confirms the validity and effectiveness of the proposal. The improved system reliability modeling will benefit system prognosis, warranty policy making, and maintenance service planning by providing a more reliable assessment of the system’s health status and a more accurate prediction of the remaining useful life. Furthermore, better predictive maintenance policies can be established with less uncertainty.

It should be noted that our approach is developed for coherent systems; that is, it is feasible for series systems, parallel systems, *k*-out-of-*n* systems, and so on, but some complex systems fall outside its scope of application. When the structure function of a complex system is not in analytical form or cannot be explicitly expressed, (15) and (16) are not applicable. In addition, many inherent uncertain factors in practical engineering, such as model uncertainty, measurement error, and cascading failure dependency, have not been taken into consideration. A full Bayesian approach that deals with both aleatory and epistemic uncertainties under cascading failure dependency is left for future work. It will also be interesting to compare the frequency-based and full Bayesian approaches in terms of computational complexity, modeling accuracy and precision, and data availability and quality.

#### Nomenclature

| Symbol | Description |
| --- | --- |
| MSDII | Multisource data and imbalanced information |
| MCMC | Markov chain Monte Carlo |
| PDF | Probability density function |
| CDF | Cumulative distribution function |
| BN | Bayesian network |
| CPT | Conditional probability table |
| BIC | Bayesian information criterion |
| HGD | Harmonic gear drive |
| DADM | Double-axis driving mechanism |
|  | Statistical data set |
|  | Time-correlated data set |
|  | Performance test data set |
|  | Element with statistical data |
|  | Element with time-correlated data |
|  | Element with performance test data |
|  | Set of elements with statistical data |
|  | Set of elements with time-correlated data |
|  | Set of elements with performance test data |
|  | The element at a given level in the Bayesian network |
|  | The set of indices of the element’s direct predecessors |
|  | Structural function |
|  | Prior distribution |
|  | Posterior distribution |
|  | Information aggregation weighting coefficient |
|  | Information content of the statistical data set |
|  | Information content of the time-correlated data set |
|  | Information content of the performance test data set |
|  | Basic probability |
|  | Attached probability |
|  | Combined probability |

#### Conflicts of Interest

The authors declare that they have no conflicts of interest.

#### Acknowledgments

The work of Lechang Yang was supported by the Fundamental Research Funds for the Central Universities of China under Grant FRF-TP-17-056A1 and the China Postdoctoral Science Foundation under Grant 2018M630073. The work of Yanhua Du was supported by the National Natural Science Foundation of China under Grant 61473035.