Abstract

Power modules connected in parallel may age at nonuniform rates, so their electrothermal performance can diverge over time. This divergence leads to unequal current sharing, junction temperatures, and power losses, which directly influence the overall characteristics of the parallel system. It is therefore essential to monitor the condition and evaluate the degradation grade in order to improve the reliability of large-scale power modules. In this paper, the impact of thermal resistance differences on the current sharing, junction temperature, and power loss of parallel-connected power modules is discussed and analyzed. Additionally, a nonintrusive methodology is proposed for condition monitoring and evaluation of the power modules, which recognizes the increase in external power loss caused by internal degradation from aging. In this method, the power modules are treated as a whole system characterized only by external quantities: all important electrical and thermal parameters are taken as the inputs, and power loss is taken as the output. First, power dissipation is predicted by models built with a NARX (nonlinear autoregressive with exogenous input) neural network. Then, a monitoring method based on the prediction model is presented, and a reasonable criterion is established for the error between the normal power loss and the predicted real-time power loss. Finally, the real-time condition and the degradation grade can be evaluated so that the operator can take suitable operating measures. Experimental results validate the effectiveness of the proposed methodology.

1. Introduction

A large-scale power module usually consists of several parallel-connected devices in order to carry higher currents [1]. Similarly, each power device consists of several units connected in parallel internally, which share common terminals and together provide the rated current of the device. Although these units are usually manufactured by the same process, variations can arise during production. The electrical and thermal parameters of these devices are therefore not identical, and the power modules may degrade at different rates, resulting in different thermal resistances. Differences in thermal resistance lead to different junction temperatures and power losses, which directly affect the overall performance of the parallel system. Thus, the operator of a power electronic system needs to supervise the condition of the power modules in order to assess the degradation grade and enhance the real-time reliability of parallel-connected power modules in large-scale converters.

Many studies have investigated how to enhance the reliability of power modules in converters. The paper [2] presents a method to monitor solder fatigue in a voltage source inverter IGBT power module by detecting the change in an inverter output harmonic. It shows that low-order harmonics, caused by nonideal switching, are affected by the device junction temperature, which in turn depends on the module solder condition. However, the low-order harmonics are difficult to detect, so detection accuracy is not guaranteed. The paper [3] describes a method to estimate power losses from simulation using ideal switches combined with measured power loss data in order to obtain the converter efficiency. However, the simulation process in this method takes a long time, and if the training data for the simulation are not determined empirically from an experiment, the estimated results do not agree well with the measured power losses. Reference [4] proposes a method for online condition monitoring of the bond wires in an insulated gate bipolar transistor (IGBT) package by evaluating at the inflection point irrespective of the junction temperature. Reference [5] presents a method for detecting the condition of bond wires in the IGBT module by identifying the short-circuit current of the IGBT module. Reference [6] describes a monitoring unit in an intelligent power module that detects changes in internal thermal resistance. A 20% increase in the internal junction-to-case thermal resistance is often accepted as the threshold. This model needs the internal temperature of the module, but in practice it is very difficult to access the inside of a power module during operation, and the accuracy of the measurements is affected by noise. Furthermore, in order to place a sensor that measures the internal junction temperature, the package must be opened, which is intrusive and can easily destroy the module. Thus, this method is unfeasible for practical application. Additionally, all of the aforementioned condition monitoring methods are intended for an individual power module. An alternative approach is therefore required to monitor power modules connected in parallel in large-scale inverters [7–9].

This paper provides a novel, nonintrusive methodology to supervise parallel-connected power modules for real-time condition monitoring. First, the power modules are considered as a whole system, and all parameters of the working conditions are regarded as the inputs to this parallel-connected power module system. These parameters include the voltage of the power modules, the current of each parallel-connected power module, the ambient temperature, the case temperature, the inductor current, the output current, and the output voltage; in turn, the power loss of the power modules is regarded as the output of this parallel system. Then, several models of the parallel-connected power modules are built using neural networks. These models can be used to predict the power loss at a given operating point; this power loss is directly associated with the working conditions and the internal condition of the modules. Finally, a reasonable threshold is set for the error between the normal power loss and the predicted power loss. Once the predicted real-time power loss is much higher than the normal power loss, the system operator can conclude that the parallel-connected power module system is abnormal. Further, the degradation grade of the abnormal condition can be evaluated by comparing the errors between the real-time power loss and the power losses output by the several models. At a given working condition, as long as the working condition does not change, the predicted real-time power loss should have almost the same value as the normal power loss [10–14]. Thus, a significant difference between the real-time power loss and the normal power loss indicates a problem. Since all of the input and output parameters of the parallel-connected power module system can be measured or detected accurately, the electrical operating point can be tracked from the large-scale power module terminal measurements by measuring an individual semiconductor [15–18]. This method is feasible in practice without any internal detection or intrusion; experimental tests have verified the viability of this approach.

This paper uses experimental measurements and simulations of parallel-connected power modules to understand the impact of aging on module reliability. Simulations are used to analyze the impact of thermal resistance differences between the modules on junction temperature, current balance, and overall power loss. This paper is organized as follows: Section 2 explains the mechanism of thermal aging in the power modules; Section 3 explains the consequences of differences in thermal aging between the modules; and Section 4 presents the experimental setup, which is used to investigate how thermal resistance differences in parallel-connected modules affect overall reliability and to collect the electrical parameters that are critical to monitoring the parallel-connected power modules. Section 5 introduces the monitoring method for the parallel-connected modules based on the neural network and puts forward the working principle of the ANN-based prediction model, which is verified in the laboratory experiment. Section 6 discusses the application issues of condition monitoring using the neural network; Section 7 concludes the study.

2. Aging Mechanism of the Power Device

In general, the operational degradation of parallel devices does not occur at the same speed for each device. Parallel-connected power modules can start their mission with almost the same thermal and electrical parameters [19–21], but these parameters may diverge over the operational life because of uneven rates of degradation from aging. Differences in the electrical and thermal parameters of individual devices can cause other failure mechanisms [22, 23]. If parallel-connected power devices carry different load currents, they undergo different thermal cycles. As shown in Figure 1, because the thermal expansion coefficients of the solder layer and the substrate do not match, these thermal cycles produce stress cycles that cause different degrees of thermomechanical fatigue. This means that the thermal resistance changes at different rates, so the devices operate at different junction temperatures, which in turn affects the thermal resistance of the parallel-connected power devices to different degrees and at different rates. For example, solder fatigue caused by thermomechanical stress generally leads to an increase in the thermal resistance of power devices [24, 25].

3. The Simulation of Impact of Aging on Parallel-Connected Power Module

Two power modules are connected in parallel in a simulation model built with Simulink in Matlab. One power module, labeled D1, has its thermal resistance increased to 1.5, 2.0, and 2.5 times the original value to simulate gradual aging. The other power module, labeled D2, operates under normal conditions, and its thermal resistance remains unchanged throughout the simulation. A sinusoidal voltage source is used as the power source, and its output is given by (1); the operating point of the parallel-connected power modules is located in the negative temperature coefficient (NTC) region of the power modules. where , , , .

The results in Figures 2–7 are from simulation; stands for the thermal resistance that is in the normal condition in all following figures and tables.

3.1. The Impact of Thermal Resistance Difference on Current Sharing of Parallel-Connected Power Modules

At the early stage of aging, there is no difference in current between the parallel-connected power modules, as shown in Figure 2. Figure 3 shows, however, that the current is no longer shared equally as aging accumulates and develops. The current of D1 rises gradually as its thermal resistance increases over the operating life, while the current of D2 decreases, so the average current of D1 is greater than that of D2. The current imbalance between the parallel-connected power modules becomes very serious once severe aging occurs.

3.2. The Impact of Thermal Resistance Difference on Power Loss of Parallel-Connected Power Modules

Under slight aging or degradation, the power loss is almost equal between the power modules, as shown in Figure 4. Figure 5 shows that the power loss of D1 rises significantly as its thermal resistance increases, which results from gradual degradation over the operational life of the module. The power loss imbalance between the parallel-connected power modules becomes very serious if the modules undergo severe aging.

3.3. The Impact of Thermal Resistance Difference on Junction Temperature of Parallel-Connected Power Modules

The impact of thermal resistance difference on junction temperature differs from its impact on current and power loss. Figures 6 and 7 show that the junction temperature of the parallel-connected power modules increases noticeably even under slight aging, because aging degrades the heat-dissipation capability. The junction temperature imbalance between the parallel-connected power modules becomes very serious once severe aging occurs.

4. Experiment of Aging Test

4.1. The Schematic of the Experimental Rig

The experimental rig is a buck circuit, as shown in Figure 8.

4.2. Setup of the Test Rig

Figure 9 shows the test rig built for the buck circuit containing the parallel-connected power modules. It comprises a DC power supply; one switching IGBT; two diodes connected in parallel; one inductive load (10 mH), one capacitive load (320 μF), and one resistive load (0.5 Ω); and measurement instruments including voltage probes, current probes (Tektronix TCP303, PR1030), and temperature sensors (Pico TC-08). A Semikron IPM SKM50GB12T4 is used for this buck circuit. The output side is loaded by the resistive load with capacitive smoothing. The buck circuit is controlled from a real-time system programmed in LabVIEW. The case temperature is measured using a 2-channel Pico temperature sensor, which provides high accuracy (±0.01°C) with fast and synchronized data logging. The probes for the case and ambient temperatures are placed underneath the case surface of the module and next to the heat sink, respectively.

4.3. Experimental Simulation of Aging

One of the two diodes connected in parallel is marked as D1, the other one is marked as D2, and special thermal pads are put underneath the case surface of the D1 module to experimentally simulate the aging process, so D1 is simulated to be gradually degraded from aging.

Then, the following aging experiments are carried out by changing the number of thermal pads underneath the case surface of the D1 module step by step, as shown in Table 1; the normal thermal resistance of D2 is 0.9442 K/W.

In the experiment, a square wave of the current source whose amplitude is 35 A is used as the power source. The operating point of parallel-connected power modules is also located in the negative temperature coefficient (NTC) region of power modules.

After that, important electrical parameters are measured and collected automatically by the LabVIEW software, including the voltage of the DC power supply, the diode voltage, the currents of the two diodes, the inductor current, the output current, and the output voltage.

4.4. Experimental Results

Here, the current-sharing, power-loss-sharing, and junction-temperature-sharing coefficients are defined by (2), (3), and (4), respectively.

The experimental results in Figures 10 and 11 show that the current-sharing coefficient becomes progressively lower, which means that the current is not shared equally, while the current of D1 and the current difference between D1 and D2 rise gradually as the thermal resistance increases. It can be concluded that as the aging of the parallel-connected power modules becomes more severe, the current imbalance between them becomes more serious.

The experimental results in Figures 12 and 13 show that the power loss is not shared equally and that the power-loss-sharing coefficient becomes progressively lower, while the power loss of D1 rises gradually as the thermal resistance increases. Therefore, the more severe the aging of the parallel-connected power modules, the more serious the power loss imbalance between them.

The experimental results in Figures 14 and 15 show that the junction-temperature-sharing coefficient becomes progressively lower and that the junction temperature is also not shared equally. The junction temperature of the parallel-connected power modules rises significantly as the thermal resistance increases; as the aging of the modules becomes more severe, the junction temperature imbalance between them becomes more serious.

The simulation and experimental results show that the current, power loss, and junction temperature of D1, which is degrading from aging, all increase. The reason for the rise is that the operating point of the parallel-connected power modules is located in the negative temperature coefficient (NTC) region of the power modules. As module D1 begins to degrade from aging, its current share increases, leading to an increase in power loss, which in turn increases the junction temperature. The rise in junction temperature also directly affects the current distribution among the parallel-connected power modules, because modules with higher junction temperatures conduct more current than those with lower junction temperatures. Thus, the system has only a weak self-adjusting capability for current sharing. Because this is a positive feedback loop, as shown in Figure 16, the NTC region is not a suitable location for the operating point of the modules.
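The positive feedback described above can be illustrated with a minimal sketch. It is not the authors' Simulink model: it assumes a linearized diode characteristic V = v0 - k(Tj - Ta) + r·I, where v0, k, and r are hypothetical values chosen only for illustration, and it treats the 0.9442 K/W figure from Section 4.3 as a lumped thermal resistance from junction to ambient. The 35 A total current is also taken from Section 4.3.

```python
def share_current(rth1, rth2, i_total=35.0, t_amb=25.0,
                  v0=1.0, k=2e-3, r=0.02, iters=200):
    """Fixed-point iteration of a two-diode electrothermal loop.

    v0 (V), k (V/K) and r (ohm) are hypothetical diode parameters.
    k > 0 means the forward voltage drops as the junction heats up (NTC).
    """
    t1 = t2 = t_amb
    for _ in range(iters):
        # Equal voltage across both diodes fixes the current split.
        i1 = i_total / 2 + k * (t1 - t2) / (2 * r)
        i2 = i_total - i1
        v = v0 - k * (t1 - t_amb) + r * i1   # common forward voltage
        p1, p2 = v * i1, v * i2              # conduction losses
        # The hotter junction lowers its own voltage drop and attracts current.
        t1 = t_amb + rth1 * p1
        t2 = t_amb + rth2 * p2
    return i1, i2, p1, p2, t1, t2


RTH_NORMAL = 0.9442  # K/W, normal thermal resistance of D2 (Section 4.3)
for factor in (1.0, 1.5, 2.0, 2.5):
    i1, i2, p1, p2, t1, t2 = share_current(factor * RTH_NORMAL, RTH_NORMAL)
    print(f"x{factor}: I1={i1:.2f} A, I2={i2:.2f} A, "
          f"P1={p1:.1f} W, P2={p2:.1f} W, Tj1={t1:.1f} C, Tj2={t2:.1f} C")
```

With these assumed parameters the loop gain stays below one, so the iteration settles to an unbalanced but stable operating point rather than running away.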

4.5. Models for Predicting the Power Loss

In this section, we give an example of the power loss prediction by means of neural networks with real measured data obtained from the experiment described above.

The power loss prediction system in this paper is a highly nonlinear dynamic system whose performance varies significantly over time. A NARX (nonlinear autoregressive with exogenous input) network has a feedback structure: it feeds the previous outputs back to the input layer. NARX is also a form of dynamic filtering, in which past values of one or more time series are used to predict future values, and it has demonstrated good performance in identifying nonlinear systems. Thus, the NARX network is selected in this paper.
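To make the feedback structure concrete, the sketch below builds the lagged regressors that a NARX model consumes. It is an illustration only: the function name and the single scalar exogenous series are assumptions, whereas the paper's exogenous input contains several electrical and thermal quantities.

```python
def narx_regressors(x, y, delay):
    """Build (regressor, target) pairs for a NARX model.

    Each regressor row holds the last `delay` values of the output
    series y followed by the last `delay` values of the exogenous
    series x; the target is the current value y[t].
    """
    rows, targets = [], []
    for t in range(delay, len(y)):
        past_y = [y[t - lag] for lag in range(1, delay + 1)]
        past_x = [x[t - lag] for lag in range(1, delay + 1)]
        rows.append(past_y + past_x)
        targets.append(y[t])
    return rows, targets


rows, targets = narx_regressors(x=[10, 20, 30, 40], y=[1, 2, 3, 4], delay=2)
print(rows, targets)
```

During training the past outputs come from measurements; in free-running prediction they would be replaced by the model's own earlier outputs, which is what closes the feedback loop.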

The NARX neural network is used to predict the power dissipation series from its own past values and the past values of a second, exogenous input series [26, 27]; the problem definition is (5), and the exogenous series consists of the thermal-electrical parameters of the power modules, such as the diode voltage, the currents of the two diodes, the inductor current, the output current, and the output voltage of the experimental circuit.
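For reference, the usual textbook form of a NARX problem definition is given below. The symbols are assumptions introduced here, since the notation of (5) is not reproduced in this text: y(t) denotes the power dissipation, x(t) the vector of exogenous thermal-electrical inputs, d_y and d_x the numbers of output and input delays, and f the nonlinear mapping learned by the network.

```latex
y(t) = f\bigl(y(t-1), \ldots, y(t-d_y),\; x(t-1), \ldots, x(t-d_x)\bigr)
```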

The model of the NARX neural network is designed by programming in Matlab. The first step of modeling is training; 70% of all data from the experiment are used for training, and they are presented to the network during training. The network is adjusted according to its error across this training. The second step of modeling is validation. 15% of the data are used for validation; these data are used to measure network generalization and to halt training when generalization stops improving. The third step of modeling is testing; the remaining 15% of data are used for testing, and they have no effect on training and so provide an independent measure of network performance during and after training.
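The 70/15/15 division described above can be sketched as follows. This is a minimal illustration, not the authors' Matlab code: the random assignment of samples and the fixed seed are assumptions, since the text gives only the proportions.

```python
import random


def split_indices(n_samples, train_pct=70, val_pct=15, seed=0):
    """Randomly divide sample indices into training, validation and test sets.

    The percentages follow the text; the test set receives whatever
    remains after the training and validation sets are filled.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = n_samples * train_pct // 100
    n_val = n_samples * val_pct // 100
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


train, val, test = split_indices(1000)
print(len(train), len(val), len(test))
```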

Figure 17 shows the result of the power dissipation prediction with the NARX neural network; the unit of the vertical axis is W. Comparing the model output with the target power loss of the power modules connected in parallel shows that the NARX model performs well. Table 2 and Figure 18 show the root mean square error (RMSE) of the NARX model. As shown in Table 2, the NARX model has a very low RMSE, which indicates high prediction precision.
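The RMSE used as the accuracy measure can be computed as in the short sketch below; the function name and the sample values are illustrative and are not taken from Table 2.

```python
import math


def rmse(predicted, target):
    """Root mean square error between two equally long sequences."""
    if len(predicted) != len(target):
        raise ValueError("sequences must have the same length")
    squared = [(p - t) ** 2 for p, t in zip(predicted, target)]
    return math.sqrt(sum(squared) / len(squared))


# Illustrative power loss values in W (not measured data).
print(rmse([24.1, 25.0, 23.8], [24.0, 25.2, 24.0]))
```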

5. Method of Monitoring Parallel-Connected Power Modules

Simulation and experimental results have shown that the power loss is affected by an increase in thermal resistance, which reflects performance degradation from aging. Based on this, a new method is proposed to monitor power modules connected in parallel by comparing the predicted real-time power loss with the normal power loss. The methodology consists of three steps: the “collecting important electrical parameters” step, the “model building and prediction” step, and the postprocess step. A real large-scale inverter is designed so that all important electrical parameters are recorded online during its operation. Thus, the initial power loss characteristic, which represents the normal condition of the power module before degradation, can be obtained from these data and provides a reference for condition monitoring. During operation, the inverter variables are collected periodically, e.g., every 1 or 2 weeks, to update the prediction model. If degradation occurs, the condition can then be monitored by comparing the change in power loss during operation.

The purpose of the “collecting important electrical parameters” step is to measure the raw real-time data, which are used to build, train, and validate the prediction model. All data should be preprocessed by normalization and data cleaning. The data include the currents and voltages as well as the case and ambient temperatures.
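As one possible form of the normalization mentioned above, the sketch below rescales a measured series linearly to a fixed range. The choice of min-max scaling and the target range [-1, 1] are assumptions; the text does not specify which normalization was used.

```python
def minmax_scale(values, lo=-1.0, hi=1.0):
    """Linearly rescale a sequence so its minimum maps to lo and its maximum to hi."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        # A constant series carries no range information; map it to the midpoint.
        return [(lo + hi) / 2 for _ in values]
    scale = (hi - lo) / (v_max - v_min)
    return [lo + (v - v_min) * scale for v in values]


# Illustrative case temperatures in degrees Celsius (not measured data).
print(minmax_scale([30.0, 45.0, 60.0]))
```

The minimum and maximum found on the training data would normally be stored and reused for real-time data, so that both are scaled identically.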

The “model building and prediction” step produces a look-up table of power loss for various operating conditions. The table contains several models; each model is built from the historic data of one of experiments 0–4, as Table 3 shows.

The postprocess step computes the power loss of the power modules using the NARX models built in the second step. First, real-time data of the electrical working conditions are fed into prediction model-0 to output the real-time power loss during operation. Then, the real-time power loss is compared with the normal power loss. After that, the error in power loss is compared with an accepted threshold (usually 20%), which is preset according to the actual condition, to determine whether the error lies within the threshold. Finally, once the predicted real-time power loss is much higher than the normal power loss at a certain working point, it can be concluded that the condition of the power modules is abnormal, because at a fixed working point the predicted real-time power loss should have nearly the same value as the normal power loss. Therefore, the condition of aging can be determined in the fault severity analysis step by evaluating the change in power loss.
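The threshold comparison in this step can be sketched as follows. The relative-error definition and the function name are assumptions made for illustration; the 20% default follows the threshold quoted above.

```python
def is_abnormal(p_realtime, p_normal, threshold=0.20):
    """Flag an abnormal condition when the power loss rises too far above normal.

    Returns the flag together with the relative error, defined here
    (as an assumption) as the increase divided by the normal power loss.
    """
    rel_error = (p_realtime - p_normal) / p_normal
    return rel_error > threshold, rel_error


# Illustrative power losses in W at one fixed working point.
flag, err = is_abnormal(p_realtime=30.0, p_normal=24.0)
print(flag, err)
```

The check is one-sided because the text treats only an increase in power loss as a sign of degradation.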

When the real-time condition is found to be abnormal, the real-time data are also fed into models 1–4 in order to further evaluate the degradation level of the power modules connected in parallel. The power loss predicted by each model is compared with the measured power loss. The degradation grade of the abnormal condition is then determined by identifying which model has the smallest error between its predicted power loss and the measured power loss, according to the evaluation criteria shown in Table 4. Based on the resulting degradation grade, the operator can take suitable operating strategies or maintenance measures, as shown in Table 5.
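The selection of the best-matching model can be sketched as below. The model names, the use of mean absolute error as the error measure, and the sample values are assumptions for illustration; the mapping from the selected model to a degradation grade is defined by Table 4 and is not reproduced here.

```python
def closest_model(model_predictions, measured):
    """Return the model whose predicted power loss is closest to the measurement.

    model_predictions maps a model name to its predicted power loss series;
    closeness is measured by the mean absolute error over the series.
    """
    errors = {}
    for name, predicted in model_predictions.items():
        diffs = [abs(p - m) for p, m in zip(predicted, measured)]
        errors[name] = sum(diffs) / len(diffs)
    best = min(errors, key=errors.get)
    return best, errors


# Illustrative predictions in W from models 1-4 (not measured data).
predictions = {
    "model-1": [24.0, 24.2],
    "model-2": [27.0, 27.1],
    "model-3": [30.1, 30.0],
    "model-4": [33.0, 33.2],
}
best, errors = closest_model(predictions, measured=[26.8, 27.3])
print(best)
```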

6. Discussion of This Method

In Section 5, the concept and algorithm of the condition monitoring and evaluation method were put forward and supported by experimental results. This section analyzes the application issues of this approach.

With more input parameters, the NARX model can usually predict power dissipation better, and the more sample data are used for building and training the NARX model, the more accurate the model becomes. Thus, further study should be conducted to make this model more precise and complete.

A prediction model based on one type of power module cannot be used directly to predict the aging condition of another type without changing any parameters of the model. This is because each manufacturer produces many types of power modules using different processes, so these modules have different performance characteristics. For power modules of the same type from the same manufacturer, operators can use this method directly to monitor and evaluate the modules. For other types of modules from other manufacturers, this method can also be used to monitor parallel power modules: a model can be adopted after updating a few parameters of the prediction model and adjusting the accepted threshold periodically.

7. Conclusion

Simulation and experimental results verified that a nonuniform rate of degradation from aging results in thermal resistance differences between power modules connected in parallel. These thermal resistance differences lead to an imbalance of current sharing between the parallel modules. Furthermore, they lead to different junction temperatures and power losses between modules, which directly affects the overall performance of the parallel system. This paper proposes an approach to monitor the condition and evaluate the degradation grade of large-scale power modules connected in parallel by estimating the power loss from external measurements. The external characteristics of the power modules reflect their internal condition; the inputs to the model consist of all important electrical parameters at certain working points, and the output of the model is the power loss, which is used as the indicator of internal degradation. The case study shows that power loss can be successfully estimated using NARX neural networks. The advantages and challenges of the method are analyzed in the paper, and it is hoped that the study is a step toward a cost-effective technique for evaluating the degradation grade from aging in order to enhance the real-time reliability of power modules connected in parallel.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there is no conflict of interest regarding the publication of this paper.

Authors’ Contributions

All authors typed, read, and approved the final manuscript.

Acknowledgments

This research was supported by the NSFC project of China (51477019), the Venture & Innovation Support Program for Chongqing Overseas Returnees (CX2018045), and the National Key Research and Development Program of China (2018FYB09058).