Abstract

A wide literature review of recent advances in monitoring, diagnosis, and power forecasting for photovoltaic systems is presented in this paper. Research contributions are classified into the following five macroareas: (i) electrical methods, covering monitoring/diagnosis techniques based on the direct measurement of electrical parameters, carried out, respectively, at array level, single string level, and single panel level, with special consideration to data transmission methods; (ii) data analysis based on artificial intelligence; (iii) power forecasting, intended as the ability to evaluate the producible power of solar systems, with emphasis on the temporal horizons of specific applications; (iv) thermal analysis, mostly with reference to thermal images captured by means of unmanned aerial vehicles; (v) power converter reliability, with a focus on residual lifetime estimation. The literature survey has been limited, with some exceptions, to papers published during the last five years in order to focus mainly on recent developments.

1. Introduction

Over the last years the photovoltaic (PV) market has witnessed remarkable growth as a result of various stimulating factors: the significant cost reduction of PV modules on the market and the changes in support policies. These factors have made the return on investment of a photovoltaic system more attractive. However, like other industrial processes, a photovoltaic system may be subject during its fabrication to defects and anomalies leading to a reduction of the overall system performance or even to total unavailability. These negative consequences reduce the productivity of a PV system and therefore its profit. Thus, proper early fault detection and real-time diagnostics are crucial not only for lowering maintenance cost and time, but also for avoiding energy loss, damage to equipment, and safety hazards.

Several research papers and reports from institutional bodies, such as the IEA, have underlined the low yields of PV systems due to faulty components, especially in the DC section (i.e., PV cells/modules and MPPT) [1–9]. Broadly speaking, faults of PV arrays are categorized as cracks in the cells, delamination, hot spots, dirt accumulation, module mismatches, short circuit of modules, junction box faults (caused by damaged connections or corrosion of the connections), open circuit, short circuit, and MPPT faults [10]. This is not an exhaustive list, and many other faults can be found in the literature [11, 12].

The basic approach for the detection of unexpected power losses of PV systems uses analytical redundancy, which is a comparison between the monitored electrical quantities (output power, voltage, and current) and their counterparts obtained from a reference model. An alarm is triggered when predetermined differences are reached [13–16]. The reference model is often based on the one-diode model, whose parameters are determined either from the manufacturer’s datasheet or by automatic extraction methods [17]. However, these methods are only effective for the detection and diagnosis of a grouped set of faults but not for the individual location of each defect [18]. Moreover, irradiation and temperature measurements are essential requirements for this approach [15].
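The residual check at the core of analytical redundancy can be sketched in a few lines. This is only a minimal illustration: the function name, the relative-shortfall formulation, and the 10% threshold are assumptions made here, not values taken from the cited works, and the expected power is assumed to come from whatever reference model is in use.

```python
def residual_alarm(measured_w, expected_w, rel_threshold=0.10):
    """Return True when the measured power falls short of the model output
    by more than rel_threshold (hypothetical default: 10%)."""
    if expected_w <= 0:
        # No meaningful reference (for example at night): never raise an alarm.
        return False
    shortfall = (expected_w - measured_w) / expected_w
    return shortfall > rel_threshold


# Example: the reference model predicts 1000 W.
print(residual_alarm(800.0, 1000.0))  # 20% below the model
print(residual_alarm(950.0, 1000.0))  # 5% below the model
```

In practice the threshold would have to account for model and sensor uncertainty, which is why the text stresses the need for irradiance and temperature measurements.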

Another fault detection and diagnostic method is based on hardware redundancy, in which several similar subsystems undertake the same task. By collecting and analyzing each subsystem’s data, abnormalities can be detected. Since monitoring of electrical parameters usually produces a large amount of data, artificial intelligence and data mining are adopted as well. Similar models are adopted for power forecasting, which is important for both monitoring purposes and the management of the utility grid.

Recently, thanks to the widespread diffusion of unmanned aerial vehicles (UAV), thermal analysis is becoming a cost-effective alternative to electrical monitoring. The reliability of power converters also plays a key role in the correct operation of PV systems.

The paper is organized as follows: Section 2 describes the monitoring techniques based on the measurement of electrical parameters; Section 3 focuses on the analysis of measured data with artificial intelligence and data mining algorithms; Section 4 is devoted to power forecasting; Section 5 discusses aerial thermal analysis; and Section 6 deals with the reliability of power converters. Conclusions are drawn in Section 7.

2. Electrical Methods

This section presents the recent trends for monitoring and diagnosis (M&D), based on electrical parameters directly acquired from the solar field. In principle, the performance analysis based on such parameters is straightforward, because it is based on the comparison between measurements and predictions. Unfortunately, the large number of unpredictable conditions, which affect the performance of solar panels, poses a serious challenge to the definition of a reliable target for the expected outputs.

The first step towards a suitable monitoring system is the definition of what should be measured, how it can be measured, and how measurements can be handled. The question about what should be measured introduces the first trade-off among possible monitoring/diagnosis approaches, depending on how PV subsystems are grouped. Indeed, the overall performance of a PV system depends on the performance of each subsystem, where the individual subsystem is the single solar cell forming the solar panel. A more pervasive measurement system increases the accuracy of M&D at the expense of an increased cost. Therefore, a rough classification of M&D electrical techniques can be based on the “level of granularity” (LoG). The lowest LoG corresponds to the monitoring of the solar field as a whole. In this case, only the instantaneous output power generated by the PV field, at either the DC side or the AC side, is measured and then converted into the energy yield of the plant. The widely adopted figure of merit is the Performance Ratio (PR), defined according to IEC 61724 [26] as the ratio between the measured instantaneous power, $P_{meas}$ (or the measured cumulated energy), and the nominal power of the solar field, $P_{nom}$ (or the cumulated energy produced at the nominal power rate), corrected by taking into account the actual instantaneous irradiance $G$ with respect to the irradiance at STC ($G_{STC}$ = 1 kW/m²):

$$PR = \frac{P_{meas}/P_{nom}}{G/G_{STC}} \qquad (1)$$

The main drawback of adopting (1) as figure of merit is that, as pointed out in [27], the power yield largely depends on the working temperature. In order to take thermal effects into account, an improved version has been proposed:

$$PR_{T} = \frac{P_{meas}}{P_{nom}\,(G/G_{STC})\,(1+\beta\,\Delta T)} \qquad (2)$$

where β is the temperature coefficient of the generated power (it is always a negative number) and ΔT is the temperature increment with respect to 25°C.
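As a worked illustration, the two figures of merit can be computed as follows. This is a sketch under stated assumptions: the function and argument names are invented here, and the temperature-corrected form divides the nominal power by the irradiance ratio and by the factor (1 + βΔT), which is the common way of applying a power temperature coefficient and is assumed to match the improved version referred to in the text.

```python
G_STC = 1000.0  # irradiance at standard test conditions, W/m^2


def performance_ratio(p_meas, p_nom, g):
    """Plain Performance Ratio: normalised power divided by normalised irradiance."""
    return (p_meas / p_nom) / (g / G_STC)


def performance_ratio_temp(p_meas, p_nom, g, beta, delta_t):
    """Temperature-corrected PR (assumed form).

    beta    -- power temperature coefficient in 1/degC (negative number)
    delta_t -- module temperature minus 25 degC
    """
    return p_meas / (p_nom * (g / G_STC) * (1.0 + beta * delta_t))


# Hypothetical operating point: 1 kW array delivering 400 W at 500 W/m^2,
# module at 50 degC (delta_t = 25), beta = -0.4 %/degC.
print(performance_ratio(400.0, 1000.0, 500.0))
print(performance_ratio_temp(400.0, 1000.0, 500.0, -0.004, 25.0))
```

With these illustrative numbers the plain PR is about 0.8, while the temperature-corrected value is higher (about 0.89), showing how part of the apparent underperformance is explained by module heating rather than by a fault.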

Usually, if PR is lower than 1, the solar system is underperforming. However, the adoption of the nominal power in both (1) and (2) does not take into account numerous factors leading to a deviation of the actual performance of the solar field from the nominal target, even when all its components operate correctly. In order to overcome this issue, [27] proposes an improvement of both (1) and (2) by replacing the nominal power with a reference power provided by a detailed model of the solar field. The model works with the same environmental conditions as the real system but with ideal solar panels, thus defining a “relative error” as figure of merit.

The limit of complex models, like those presented in [27], is that their effectiveness depends on the reliability of the model and on the capability of extracting suitable parameters from it.

An opposite view is presented in [28], as the model used to predict the PV system electricity production has a low complexity.

For all the cases, the expected power, adopted in place of the nominal power in (1), is evaluated through an empirical expression (3), where a1, a2, a3, and a4 are fitting parameters.

The method proposed in [29] is also very simple. The sophisticated verification (SV) method expresses the PR through (4), which contains a loss term (explicated in [29]) that includes various effects (temperature, shading mismatches, etc.). The measured energy is then plotted as a function of the irradiance and, according to (4), any deviation from the straight line is attributed to some form of malfunctioning.

A similar approach is also proposed in [30] with reference to a solar field with thin film panels. In that paper array losses are defined through (5) in terms of the measured operating voltage and current of the solar field.

Once a model for evaluating losses has been selected, another issue common to many papers is data transmission, with the main options being WiFi, GSM, and Power Line Communication (PLC). In [31] it is observed that a drawback of GSM [32] is its high operating cost, as the user needs to pay for the data transmission service; thus, the ZigBee protocol is proposed. In [33] wireless communication is exploited to monitor the effect of dust in a solar field installed in the desert.

Methods based on monitoring at the array level have the drawback of not being suitable for locating faulted components. This is a very important issue, because maintenance costs strongly depend on the ability to undertake focused interventions. Therefore, the number of monitored parameters must be increased, moving towards a higher LoG. A commonly adopted solution is to use the same figures of merit but refer to a single string rather than the entire system. The availability of string electrical performance can also be exploited to avoid the need for weather information: the comparison among the strings allows the direct identification of the faulty ones. This approach is well illustrated in [34], where currents between strings are compared. In that paper, it was observed that fault detection is difficult when failures occur in multiple strings; the disambiguation is carried out by combining the figures of merit with the evaluation of the standard Performance Ratio.

The definition of a more suitable figure of merit can be found in [35], where an inferential tool, returning information about the operation of the PV field, is presented. After initial training, the software defines one or more reference strings that are used in place of the nominal power for the definition of the expected Performance Ratio. A simple method for defining a reference string was presented in [36, 37]. In those papers the instantaneous power generated by the best performing string in a large solar field is assumed as the target for all other strings with the same orientation. This approach has the advantage of being independent of weather conditions, irradiance, and temperature and does not require any training of the software. Moreover, it allows a fast localization of faulty strings and a reliable estimation of the energy losses attributable to each string.
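The best-string reference idea lends itself to a compact sketch. The code below is an illustration only: the function names, the dictionary layout, and the 5% flagging threshold are assumptions made here and are not taken from [36, 37]; all strings are assumed to share the same orientation and to be sampled at the same instant.

```python
def string_losses(powers_w):
    """Map each string name to its fractional power deficit
    relative to the best performing string."""
    best = max(powers_w.values())
    if best <= 0:
        # Nothing is producing (for example at night): report no losses.
        return {name: 0.0 for name in powers_w}
    return {name: (best - p) / best for name, p in powers_w.items()}


def flag_strings(powers_w, threshold=0.05):
    """Return the names of strings whose deficit exceeds the (hypothetical) threshold."""
    losses = string_losses(powers_w)
    return sorted(name for name, loss in losses.items() if loss > threshold)


# Hypothetical instantaneous string powers in watts.
snapshot = {"S1": 1000.0, "S2": 980.0, "S3": 800.0}
print(string_losses(snapshot))
print(flag_strings(snapshot))
```

Because the reference is another string exposed to the same conditions, no irradiance or temperature sensor enters the calculation, which is the advantage highlighted in the text.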

A different approach for analyzing string data is proposed in [38], where a given dataset of observed string currents and voltages and their respective low-pass-filtered time derivatives is analyzed a posteriori to determine the probabilities of a restricted set of possible faults that could have caused that dataset.

All the techniques listed so far are based on the comparison of the instantaneous power with a set yield target. An alternative method, which can be found in [39], proposes plotting the whole I–V curve of a single string using information from the inverter. Indeed, the inverter control needs to measure instantaneous voltage and current in order to track the maximum power point. Moreover, as pointed out in [40], some commercial inverters carry out a periodic scan of the entire curve in order to distinguish the global maximum from local ones, in case of mismatch among the modules.

The measurement of the whole I–V curves of single strings, compared with a tailored model, is also proposed in [41] to recognize six categories of faults, including shadow effects, bypass diode fault, cell fault, module fault, and so on.

An issue that is often overlooked when dealing with string level monitoring techniques is the possible occurrence of reverse currents in parallel-connected strings. Paper [42] shows the ineffectiveness of the usual rule (3σ) used to recognize underperforming strings, since the current of a faulty string always lies between the upper and lower bounds of the 3σ rule. In order to overcome this problem, a machine-learning local outlier factor (LOF) is defined. This factor provides a quantitative approach to identify the faulty strings (called outliers) by defining a density-based outlier detection rule. This rule is based on the fact that the density around an outlier is significantly different from the density around its neighbors.
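The contrast between the 3σ rule and a density-based factor can be illustrated with a small, dependency-free sketch. The code implements the textbook local outlier factor on one-dimensional data (string currents), simplified to use exactly k nearest neighbours; it is not the implementation of [42], and the currents, the value of k, and the function names are made up for illustration.

```python
import statistics


def three_sigma_outliers(values):
    """Indices of values further than 3 standard deviations from the mean."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)
    return [i for i, v in enumerate(values) if abs(v - mu) > 3 * sigma]


def local_outlier_factors(values, k=3):
    """Textbook LOF for 1-D data, using exactly k nearest neighbours per point."""
    n = len(values)
    if n <= k:
        raise ValueError("need more than k samples")
    neighbours, k_dist = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: abs(values[i] - values[j]))
        nearest = order[:k]
        neighbours.append(nearest)
        # k-distance: distance to the k-th nearest neighbour
        k_dist.append(abs(values[i] - values[nearest[-1]]))
    lrd = []
    for i in range(n):
        # reachability distance of point i with respect to each neighbour j
        reach = [max(k_dist[j], abs(values[i] - values[j])) for j in neighbours[i]]
        # local reachability density (guarded against duplicate samples)
        lrd.append(1.0 / max(sum(reach) / k, 1e-12))
    # LOF: average neighbour density divided by the point's own density
    return [sum(lrd[j] for j in neighbours[i]) / (k * lrd[i]) for i in range(n)]


# Hypothetical string currents in amperes; the last string is degraded.
currents = [8.0, 8.1, 7.9, 8.05, 7.95, 8.02, 6.5]
print(three_sigma_outliers(currents))
print([round(f, 2) for f in local_outlier_factors(currents)])
```

With these illustrative values the degraded string inflates the standard deviation enough to stay inside the 3σ band, whereas its LOF is far above 1 while the healthy strings stay close to 1.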

An improved fault location capability can be attained by pushing the monitoring to the individual solar panel level. In this case a pervasive sensor network is needed, so that the cost of the system can be justified only by higher revenues coming from more effective maintenance strategies. The main issues to be faced when a single panel monitoring system is adopted are the power supply of sensors, data communication, and data management [43], as their effectiveness depends on the monitored parameters. The best performing option [44] consists in measuring the entire I–V curve of each solar panel. This solution is relatively simple to implement in distributed conversion systems [45–49], where each solar panel has its own dc/dc converter that can be properly controlled to plot or estimate the I–V characteristic [50]; otherwise, the measurement is much more complicated, because it would require the temporary disconnection of individual solar panels from the string.

A more widely adopted solution consists in the measurement of the operating voltage and the operating current of the solar panel to calculate the instantaneous power generated. This approach requires reduced hardware for sensing electrical parameters, while it could be demanding for the power supply, depending on both the adopted communication system and the sampling rate of measurements. For example, [51] proposes GSM, but no details are given about the power supply. A possible alternative to GSM is PLC, which exploits the existing dc wiring; in this case it is essential to prevent signals from travelling through the solar panels (which are series connected with the dc power cable). To this end, a low-impedance bypass path must be provided. Reference [52] proposes connecting a capacitor in parallel with each solar panel, while [53, 54] propose a more effective LC filter. PLC is also suggested in [55] to build a fire protection system. A heartbeat is sent through the power line to a microcontroller mounted on each solar panel. If the heartbeat is lost, either because the power line is opened on purpose or broken by fire, the microcontroller trips a series switch to open the circuit and a parallel switch to short circuit the solar panel.

The main drawback of a single panel monitoring system is that the operating current is the same for all the series connected solar panels and depends on the string operating point, which is fixed by the centralized converter. The consequence is that the operating power cannot be considered as a diagnostic measurement for each individual panel. From this point of view, the most advanced single panel monitoring system is described in [56–59]. In addition to the measurement of operating voltage and current, this system uses both the open circuit voltage, $V_{oc}$, and the short-circuit current, $I_{sc}$, which are unique to each solar panel. Moreover, $V_{oc}$ is an indirect measurement of the temperature while $I_{sc}$ is directly related to the irradiance, so that no specific sensors are needed for these parameters. In order to carry out the aforementioned measurements, the system physically disconnects the solar panel from the string for about 20 ms, thanks to a series solid state switch. The power of the electronic board is drawn from the solar panel. Since the output voltage is zero during the measurement of $I_{sc}$, a supercapacitor is used as a backup storage to allow the continuous controllability of the circuit. Lastly, a cheap WiFi communication system is adopted for data transmission.
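How the two additional measurements can stand in for dedicated sensors may be sketched as follows. The formulas are simplifying assumptions made here, not the method of the cited works: the short-circuit current is taken as proportional to irradiance, and the open circuit voltage is taken as linear in temperature, neglecting its logarithmic dependence on irradiance. The datasheet values and the coefficient in the example are hypothetical.

```python
G_STC = 1000.0  # W/m^2
T_STC = 25.0    # degC


def irradiance_from_isc(i_sc, i_sc_stc):
    """Estimate irradiance assuming I_sc scales linearly with it."""
    return G_STC * i_sc / i_sc_stc


def temperature_from_voc(v_oc, v_oc_stc, beta_voc):
    """Estimate module temperature from V_oc with a linear model.

    beta_voc -- relative temperature coefficient of V_oc in 1/degC (negative).
    The irradiance dependence of V_oc is neglected (simplifying assumption).
    """
    return T_STC + (v_oc / v_oc_stc - 1.0) / beta_voc


# Hypothetical panel: I_sc,STC = 9.0 A, V_oc,STC = 40.0 V, beta_voc = -0.3 %/degC.
print(irradiance_from_isc(4.5, 9.0))
print(round(temperature_from_voc(36.0, 40.0, -0.003), 1))
```

A more faithful estimate would correct the open circuit voltage for the irradiance obtained from the short-circuit current before inverting the temperature relation.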

A common problem encountered by monitoring systems is the large amount of data to be analyzed. In the next section approaches based on artificial intelligence and data mining are described.

3. Artificial Intelligence and Data Mining

As PV array characteristics are highly nonlinear, the presence of underperformance and faults within the system can lead to uncorrelated effects. Hence, more sophisticated and refined algorithms and methods for fault detection and diagnosis are required. One active research area is the use of artificial intelligence and data mining, which are primarily based on the concept of a knowledge database. These methods can be split into three categories [60–65]: signal processing methods, classification methods, and inference methods. The main idea of signal processing methods is to extract some features of the measured signals, which can be attributed to a particular state of health of the PV system. The most commonly used methods are wavelet transform techniques [66] and the Fast Fourier Transform (FFT) [67]. The classification methods are instead based on artificial intelligence, where knowledge is built from an available dataset. As the amount of labeled data is quite large, supervised learning algorithms can learn the characteristics of the system and make predictions after training. A number of supervised learning models addressing fault detection and diagnosis in PV systems have been proposed in the literature. For instance, artificial neural networks (ANN) have been proposed for PV systems working under partial shading conditions [68]; for the monitoring and supervision of the health status of a PV system in [69]; and for short-circuit fault detection of PV arrays in [70]. In other works, Bayesian Neural Network (BNN) and polynomial regression models have been proposed to predict the soiling effects on large-scale PV arrays [71]. Data mining methods for fault detection and isolation in PV systems are also proposed in the literature, such as the decision-tree method, k-nearest neighbor, and support vector machine (SVM).
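To make the classification idea concrete, the following sketch applies a k-nearest neighbor vote to labeled operating points. Everything in it is illustrative: the two features (current and voltage normalized to their expected values), the fault labels, and the training samples are invented for the example and do not come from any cited dataset.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest labeled samples.

    train -- list of (feature_tuple, label) pairs
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up training set: (normalized current, normalized voltage) -> condition.
training = [
    ((1.00, 1.00), "normal"), ((0.98, 1.01), "normal"), ((0.97, 0.99), "normal"),
    ((0.65, 0.98), "shading"), ((0.60, 1.00), "shading"), ((0.62, 0.97), "shading"),
    ((0.99, 0.66), "short circuit"), ((1.00, 0.70), "short circuit"),
    ((0.98, 0.68), "short circuit"),
]

print(knn_predict(training, (0.63, 0.99)))
print(knn_predict(training, (0.99, 0.69)))
```

The same labeled data could feed a decision tree or an SVM; k-nearest neighbor is used here only because it fits in a few lines without external libraries.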

4. Forecasting of Power Production in Photovoltaic Plants

The forecasting of the power generated by a PV plant is a key activity for supporting the monitoring of PV fields [72, 73]. Usually, it makes use of either solar radiation measurements made on the module plane or solar radiation data taken from meteorological service providers [74], with the aim of calculating reference values for the energy yield.

Depending on the task required, forecasting techniques can have different time scales: the very short (up to one hour) and short (up to 6 hours) time scales belong to intraday forecasts, while longer forecasts have time scales of one or more days. With reference to the spatial extension, forecasting can be related to a single plant or, for regional models [75, 76], a cluster of plants.

As PV systems are greatly influenced by weather conditions, such as solar irradiance and air temperature, accurate models are required for a reliable prediction of their power generation. Many approaches have been developed over the last decades and a good review of them is presented in [73, 77]. Models for forecasting the power generated by a PV plant can be broadly classified into three categories. A first type of technique is based on Numerical Weather Prediction (NWP) for the forecast of meteorological parameters such as the solar irradiance and the air temperature. These parameters are used as the input of a model of the PV system to forecast the power generated. Another approach is based on statistical modelling of the historical record of the power generated. This approach includes regressive models (e.g., autoregressive, AR; autoregressive moving average, ARMA; autoregressive integrated moving average, ARIMA, etc.) and Artificial Intelligence (AI) models (e.g., artificial neural networks, ANN; support vector machine, SVM; adaptive neurofuzzy inference system, ANFIS, etc.). Finally, a third way to forecast the power generated by a PV system combines physical and statistical modelling and is called the hybrid technique [78–80]. Hybrid techniques are usually applied when some of the data required by physical or statistical methods are missing and can also be used for improving the accuracy of the forecasting activity [73]. The three approaches have different temporal capabilities: most of the techniques produce short-term predictions, while NWP-based methods are better suited for long-term predictions of up to 15 days [81].

Among the different approaches presented in the literature, where the main challenge is the design of cost-effective models working for different PV technologies, locations, and working conditions, physical, regressive, and ANN-based models are the most applied techniques accounting for 50% of the reviewed literature [73].

Physical approaches, which represent 11% of the used techniques [73], use models of the PV system to generate the forecasts, while the major research efforts are spent on solar irradiance forecasting [75]. NWP modelling is based on the physical state and on the dynamic motion of the atmosphere [75, 77, 82–84]. One of the latest works in this field can be found in [85], where the solar irradiance is forecasted on a day-ahead and intraday basis by means of a model provided by the European Centre for Medium-Range Weather Forecasts (ECMWF model). NWP modelling is also used to calculate the temperature that, together with the solar irradiance, is usually the main input of any physical model for the calculation of the power generated [86].

Statistical modelling does not require any information regarding the system to be modelled and uses data to predict the future behavior of the plant [73]. These approaches include a number of different types of time series regression models [87], accounting for 14% of the total forecasting techniques [73]. The most used techniques are ARIMA-based models because of their generality [88]. As reported in [89], ARMAX models, which use exogenous inputs, give the best results for this type of modelling.
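The regression idea can be illustrated with the simplest member of the family, a first-order autoregressive model fitted by ordinary least squares. This is a deliberately reduced sketch, not one of the ARIMA or ARMAX models used in the cited works; the function names and the synthetic series are assumptions made for the example.

```python
def fit_ar1(series):
    """Least-squares fit of y[t] = c + phi * y[t-1]; returns (c, phi)."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("series is constant; phi is not identifiable")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    phi = sxy / sxx
    return mean_y - phi * mean_x, phi


def forecast_ar1(series, steps=1):
    """Iterate the fitted recursion to forecast `steps` samples ahead."""
    c, phi = fit_ar1(series)
    last, out = series[-1], []
    for _ in range(steps):
        last = c + phi * last
        out.append(last)
    return out


# Synthetic power record (kW) generated by y[t] = 10 + 0.5 * y[t-1].
history = [100.0, 60.0, 40.0, 30.0, 25.0, 22.5]
print(fit_ar1(history))
print(forecast_ar1(history, steps=2))
```

Only the historical power record is used, which is what distinguishes this class of models from physical approaches; an ARMAX-type model would add exogenous inputs such as forecast irradiance to the regression.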

In a similar way, techniques based on artificial intelligence do not require any information regarding the system and include a number of different approaches: artificial neural networks, fuzzy logic, evolutionary algorithms, expert systems, and others [90]. The most used AI-based techniques use ANN, representing 24% of the total, and they can be classified as follows.
(i) A first type of ANN-based model estimates the power generated by the PV plant starting from the instantaneous working conditions of solar irradiance and temperature [91–93]. The working conditions can come either from sensors mounted on the field or from NWP-based models.
(ii) Other ANN-based models take as an input the current and the past values of the output power [81, 94–96]. These models directly forecast the power output without any additional meteorological parameters.
(iii) A third type of ANN-based model is a combination of the first two types [97–100].

5. Aerial Thermal Analysis

As is widely known, the degradation of long-term performance and overall reliability of PV plants can drastically reduce expected revenues. It should be considered that medium- and large-size plants are composed of thousands of modules, with each one potentially affected by the following main types of faults:
(i) optical degradation or fault: bubbles, delamination, discoloration of the encapsulant, and front cover (i.e., glass) fracture;
(ii) electrical mismatches: cell cracks/fractures, breakage of interconnection ribbons, poor soldering, snail tracks, shunts, shading;
(iii) nonclassified: potential induced degradation (PID), defective/short-circuited bypass diodes, short-circuited modules or strings, and junction box failure.

Standard monitoring approaches, that is, electrical string monitoring, only ensure the detection of power losses in a portion of the PV field, while the accurate localization of faulty modules requires string disassembly, visual inspection, and/or electrical and thermographic analysis. Unfortunately, the above-mentioned techniques are time demanding, cause undesired stops of the energy generation, and often require laboratory instrumentation, thus being cost-effective only in case of catastrophic faults. Moreover, it should be noted that PV plants are often located in places that are difficult to access, for example, rooftops, thus making any intervention dangerous. As a consequence, the safety of operation strongly impacts the maintenance costs.

Cost-effective operation and maintenance (O&M) of the majority of PV fields therefore requires the introduction of diagnostic techniques that, on the one hand, provide rapid detection and effective classification of a large number of faults and, on the other hand, limit monitoring and diagnostic costs.

In the recent literature [19–25, 37, 101–106], a new nondestructive diagnostic approach uses unmanned aerial systems (UAS) equipped with thermal and/or visual cameras to inspect PV fields and automatic tools for image processing and fault detection and classification. The main challenges of this approach are
(a) positioning;
(b) individual module identification;
(c) defect detection;
(d) defect classification.

The critical aspect of automatic PV module identification in infrared images (point (b)) has been studied in [107]. Unfortunately, the small 5% error obtained by means of a manual camera increases to 30% when a drone carries out the measurement from a flying altitude of 20 m.

An automatic defects detection and classification procedure is proposed in [108] for a cell-level analysis. First, the variance and the mean value of the temperature of each pixel of the photograph are calculated for each PV cell. Nonuniform cells, that is, cells exhibiting a large variance in their temperature distribution, are discarded and separately analyzed. Uniform cells are classified into light, medium, and strong hot spot according to their mean temperature. Subsequently, hot cells are classified as a function of the most common defects.
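The per-cell screening step can be sketched as follows. The structure (variance check first, then bands on the mean temperature) follows the description above, but every number is a placeholder: the variance limit, the three temperature bands, the use of a reference temperature from healthy cells, and the extra "normal" class are assumptions made here and not values reported in [108].

```python
import statistics


def classify_cell(pixel_temps, reference_temp, var_limit=4.0,
                  bands=(5.0, 10.0, 20.0)):
    """Classify one PV cell from the temperatures (degC) of its pixels.

    reference_temp -- temperature of healthy cells (assumed to be known)
    var_limit      -- variance above which the cell is treated as nonuniform
    bands          -- excess-temperature limits for light/medium/strong hot spots
    """
    if statistics.pvariance(pixel_temps) > var_limit:
        return "nonuniform"  # set aside for separate analysis
    excess = statistics.mean(pixel_temps) - reference_temp
    light, medium, strong = bands
    if excess >= strong:
        return "strong hot spot"
    if excess >= medium:
        return "medium hot spot"
    if excess >= light:
        return "light hot spot"
    return "normal"


print(classify_cell([40, 40, 41, 41], reference_temp=40))
print(classify_cell([52, 52, 53, 53], reference_temp=40))
print(classify_cell([40, 40, 60, 60], reference_temp=40))
```

The mapping from hot-spot class to the most common defects, which is the last step described in the text, is not attempted here.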

A simple and effective method to identify the frames of PV modules from thermal images is proposed in [109]. This approach relies on the assumption that the solar cells have a higher temperature than the metallic frame and that there is a sharp transition between the two regions. Moreover, the proposed procedure is capable of classifying the defects by means of a thermal gradient analysis carried out at cell level. The obtained results are promising, but the applicability of aerial inspections is still limited by strict requirements in terms of thermal camera resolution and/or low flight height.

Even though defects in solar panels often cause an increase of the surface temperature, sometimes the poor resolution of the thermal cameras hinders an accurate defect classification. In [20], a double stage procedure is proposed: in the first step a UAV (in the following also referred to as drone) equipped with a thermal camera provides a preliminary thermal analysis to detect and classify large-size defects, while a subsequent visual inspection with a second drone, equipped with an HD photo camera, provides the classification of small-size defects. In particular, the first stage detects defective points in the PV field and classifies them depending on their shape and location. The following faults can be identified: (a) interconnection issues, that is, entire module warming; (b) defective bypass diodes, internal short circuits, cell mismatch, and snail trails, that is, isolated hot spots or “patchwork pattern”; (c) partial shadowing and cracks, that is, hot spots and/or polygonal patches. Additional information is obtained by reducing the distance between PV arrays and the UAV, thus allowing a more accurate visual analysis. Indeed, visual images can validate the detection and the classification of defects and failures like browning, bubbles, cracked cells, burning, corrosion, cell or module breakages, white spots, snail trails, discoloration, broken interconnections, solder bond failures, and dirty points.

Nevertheless, some critical aspects have to be addressed for an accurate aerial visual inspection, as reported in [21]:
(i) the vertical photography (axis of the camera perpendicular to the ground) should produce a sort of map of the PV field, where the objects are only slightly affected by perspective issues;
(ii) overlap between two consecutive pictures has to be ensured;
(iii) a given flight height must be respected depending on the specific faults to be detected;
(iv) the stability of the drone has to be ensured by flying without wind (wind speed < 3 m/s) and with a sunny, cloudless, and clear sky;
(v) a suitable flight trajectory has to be found to minimize the reflection of the sunlight and of objects located near the modules.

In [22, 104] images acquired by a light UAV produced an IRT map, that is, a thermal orthophotoplan, of the investigated PV installation by means of aerotriangulation methods. In particular, both photogrammetry techniques and global positioning system (GPS) receivers are employed to ensure correct positioning, while an image postprocessing procedure based on the Canny edge approach allows highlighting the hot spots of photovoltaic modules. Unfortunately no automatic classification tool is suggested, but auxiliary diagnostic measurements (e.g., IRT, I–V characterization, and EL) validate fault detection and qualitatively classify the analyzed results into a specific fault type, corresponding to a specific thermal pattern, I–V characteristic, and EL pattern. Automated diagnostic tools based on aerial thermal analysis are proposed in [23–25, 37].

The system described in [23, 24] uses a three-step procedure: (i) undertaking a raw preliminary defect detection; (ii) selecting faulty modules from the thermal image according to a health index; (iii) carrying out accurate defect detection and classification. In the first step the thermal image is converted into grayscale and digital filters (namely, rectangular average, rectangular ideal, and Gaussian filters) are applied in the frequency domain. Subsequently, a high-pass filter highlights the hot area in the panel. According to the assumption that a hot area suggests a fault, all the panels showing a high percentage of hot area with respect to the global one are selected for further analyses and their frames are accurately extracted by means of a Laplace filter. In the third step, the Decision Support Center evaluates the defect and failure type and proposes the best solution for the specific plant by comparing actual performance and its monitored history. Nevertheless, there are still no studies providing an accurate description of the algorithms executed by the Decision Support Center.
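The panel selection of the second step can be illustrated with a much simplified sketch. It skips the frequency-domain filtering described above and simply thresholds grayscale pixel values; the function names, the gray-level threshold of 200, and the 10% minimum hot-area fraction are hypothetical choices made for the example, not parameters from [23, 24].

```python
def hot_area_fraction(gray_image, threshold):
    """Fraction of pixels whose gray level is at or above the threshold.

    gray_image -- list of rows, each a list of gray levels (0-255)
    """
    pixels = [p for row in gray_image for p in row]
    return sum(1 for p in pixels if p >= threshold) / len(pixels)


def select_suspect_panels(panels, threshold=200, min_fraction=0.10):
    """Return the names of panels whose hot area exceeds min_fraction."""
    return [name for name, image in panels.items()
            if hot_area_fraction(image, threshold) >= min_fraction]


# Two tiny made-up grayscale crops, one per panel.
panels = {
    "A": [[100, 100], [100, 100]],
    "B": [[100, 250], [100, 100]],
}
print(select_suspect_panels(panels))
```

Only the panels returned by the selection would be passed on to the more expensive frame extraction and classification stage.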

An effective statistical data-driven approach is adopted in [25], where the identification of individual modules consists of the following steps: (1) normalization, (2) thresholding, (3) orientation estimation of the photovoltaic modules, and finally (4) correction and refinement. Moreover, in the proposed pipeline, all data corresponding to the detected photovoltaic modules within an infrared image are processed to obtain four sets of features. A suitable statistical test highlights outliers, thus suggesting temperature abnormalities caused by module defects. Then, major temperature abnormalities are classified into three main groups: overheated modules, hot spots, and overheated substrings. The method reaches a high accuracy level in detection, but the classification is still coarse and generic. Table 1 reports information regarding drones and cameras adopted to implement the techniques discussed above.

Results obtained in [23–25, 37] suggest that drone-assisted diagnostics is going to achieve an important role in the O&M of PV plants thanks to its effectiveness in terms of detection and localization. Moreover, even though these techniques still require high resolution cameras that are often costly and heavy, as indicated in Table 1, today the market offers a growing number of light electric drones with high payload, sophisticated navigation systems mainly based on GPS receivers, and extended flight time. Nevertheless, an accurate classification of defects is still a challenge.

6. Converter Reliability

Among possible faults in photovoltaic systems, those associated with dc/ac power converters are the most severe, since they completely stop the energy generation. Although a malfunction of the dc/ac power converter is easy to detect (even though reference data from inverter manufacturers might be needed), it is not so easy to fix, as it often requires locating and replacing damaged devices inside the case. Therefore, the challenge is to prevent failures by estimating the residual lifetime (RLT) of the power modules. RLT algorithms differ according to the type of application and the module characteristics. In the following, different methods to estimate the expected RLT of IGBT modules based on accelerated life tests are summarized. First, temperature cycling tests are introduced to correlate RLT to temperature stresses. Then, methods for the estimation of the junction temperature (JT) and the temperature humidity bias (THB) tests are presented.

Power modules consist of materials with different thermal expansion coefficients and are subjected to temperature swings due to the variability of the load [110–112]. These stresses lead to a degradation of the module integrity and to the development of faults, such as heel cracking [113], bond wire lift off (BWLO) [114], and solder fatigue [115]. Corrosion has recently received more attention, as IGBT modules are packaged in plastic cases that do not normally offer sufficient moisture resistance.

Due to temperature swings and moisture penetration, the actual RLT of IGBT modules can be significantly different from the manufacturer’s predictions. Thus, these factors should be taken into account to correct online the expected RLT. Condition monitoring systems (CMS) can give an important contribution to this problem, minimizing the risk of failure of IGBTs. CMS gather real-time data on temperature swings and moisture penetration during the operation of converters and use dedicated algorithms to correlate the measurements and update the prediction. These algorithms are based on accelerated life tests, where overstress conditions (high temperature, high temperature cycling, high power cycling, humidity, etc.) are applied to the module to understand in a short time the effects of these stresses on the external characteristics and parameters of IGBTs. The CMS then monitors the stresses in normal conditions and calculates the associated residual lifetime in the correct time scale using the acceleration factor adopted for the test.

Temperature cycling tests (TCT) refer to the power cycling of IGBT modules between high and low temperatures, reached when the modules are in the on-state and off-state, respectively. During these tests, both the swing and the mean value of the junction temperature (JT) need to be recorded. The RLT model is obtained from the Palmgren-Miner rule [116]. In this model, the lifetime consumption (LC) is calculated as the ratio of the number of cycles experienced to the number of cycles to failure. The former is obtained from a counting algorithm, such as rainflow counting [117]; the latter is calculated from the Coffin-Manson law [118], which physically models each fault. The RLT is finally estimated by dividing the time history (the time duration until the fault occurs) by LC. An LC equal to one indicates that the module is close to failure, whereas an LC greater than one indicates that failure has occurred. The most challenging issue is that TCT rely on JT. Measuring JT directly is difficult, because the junction of the module is hard to access [119]. Researchers have therefore investigated alternative methods to estimate JT, as explained hereinafter.
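The lifetime consumption calculation can be sketched as follows. The sketch assumes that the temperature swings have already been counted (for instance by a rainflow counter) and are supplied as pairs of swing amplitude and cycle count, and it uses the plain Coffin-Manson form N_f = a · ΔT^(−n), ignoring the mean junction temperature. The constants `a` and `n`, the function names, and the final subtraction of the elapsed time are illustrative assumptions and are not taken from the cited works.

```python
def cycles_to_failure(delta_t, a, n):
    """Plain Coffin-Manson law: N_f = a * delta_t ** (-n).

    a and n are device-specific fitting constants obtained from
    accelerated tests; any values used here are placeholders.
    """
    return a * delta_t ** (-n)

def lifetime_consumption(counted_cycles, a, n):
    """Palmgren-Miner sum of n_i / N_f,i.

    counted_cycles: iterable of (delta_t, count) pairs, assumed to come
    from a cycle-counting algorithm applied to the JT history.
    """
    return sum(count / cycles_to_failure(dt, a, n) for dt, count in counted_cycles)

def residual_lifetime(elapsed_hours, lc):
    """Project total life as elapsed / LC and subtract the elapsed time.

    Assumes the recorded stress mix stays representative in the future.
    """
    if lc <= 0.0:
        return float("inf")
    return max(elapsed_hours / lc - elapsed_hours, 0.0)

# Placeholder constants and a toy cycle histogram (swing in K, count).
A, N = 1.0e6, 2.0
histogram = [(10.0, 1000), (20.0, 500)]
lc = lifetime_consumption(histogram, A, N)
print(f"LC = {lc:.3f}, residual life = {residual_lifetime(3000.0, lc):.0f} h")
```

With these toy numbers the two swing classes consume 10% and 20% of the life, respectively, which shows how larger swings dominate the damage even when they are less frequent.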

The JT can be directly measured by temperature sensors [120] and IR cameras [121]. However, due to the slow response of temperature sensors and their dependency on the point of installation, the direct method is not recommended. Moreover, IR cameras allow only average measurements over relatively large areas of the module and are not easy to calibrate for the entire range of temperature variations.

The estimation of the JT is preferred to the direct measurement, as it does not require the modification of the internal structure of the IGBT [7]. The current estimation methods of JT are based on the measurement and analysis of temperature-sensitive electrical parameters (TSEPs). These can be obtained from the static (e.g., on-state collector-emitter voltage and collector current) and dynamic (e.g., gate parameters and turn-on and turn-off delay times) characteristics of the IGBT under examination. The improvement of models and algorithms for the analysis of indirect measurements of JT is now bringing the indirect methods to the same accuracy levels as their direct counterparts, even when the detection of JT at a fast sampling rate is required.
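A TSEP-based estimate typically relies on a calibration curve recorded at known temperatures. The sketch below assumes a static TSEP, namely the on-state collector-emitter voltage measured at a fixed small sense current, and assumes that its dependence on temperature is linear over the calibrated range. The calibration values are invented for illustration and the function names are hypothetical.

```python
import numpy as np

def fit_tsep_calibration(temps_c, vce_on):
    """Fit V_CE = slope * T + intercept from calibration measurements.

    temps_c: known junction temperatures (deg C) set in a thermal chamber.
    vce_on:  on-state V_CE (V) measured at one fixed sense current.
    """
    slope, intercept = np.polyfit(temps_c, vce_on, 1)
    return slope, intercept

def estimate_junction_temperature(vce_measured, slope, intercept):
    """Invert the linear calibration to obtain the JT estimate (deg C)."""
    return (vce_measured - intercept) / slope

# Invented calibration data with a slope of -2 mV/K.
cal_temps = [25.0, 50.0, 75.0, 100.0, 125.0]
cal_vce = [0.60, 0.55, 0.50, 0.45, 0.40]
slope, intercept = fit_tsep_calibration(cal_temps, cal_vce)
print(f"estimated JT: {estimate_junction_temperature(0.47, slope, intercept):.1f} C")
```

Because the calibration is tied to a healthy device, any degradation that shifts the measured voltage, such as bond wire lift off, would bias the estimate and calls for recalibration.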

Kuhn and Mertens [122] and Brown et al. [123] have demonstrated that turn-on and turn-off delay times are positively correlated to JT variations. However, the measurement of these times is challenging because of the high sampling rate required to achieve a good accuracy. A second drawback of this method is that an increase of the off-state to on-state time and of the on-state to off-state time can also be caused by a degradation of the gate-oxide characteristics, which increases the gate-emitter voltage.

Denk and Bakran [124] and Baker et al. [125] showed that the internal gate resistance of IGBTs has a negative correlation with JT, so that the gate current changes and, hence, the voltage across the external gate resistance increases linearly with JT. The variation of this voltage has been considered as a TSEP, useful to estimate the JT. The value of the proposed TSEP is accurately measured when the gate voltage is negative, that is, when the IGBT is off. However, this method requires an external high frequency sinusoidal voltage signal that has to be superimposed on the negative gate voltage during the off-state time, using an auxiliary MOSFET.

Barlini et al. [118] showed that the rates of change of the collector-emitter voltage and of the collector current are linearly related to the JT. However, this relation has been verified only for MOSFETs and is difficult to implement in practice, because it requires the measurement of time derivatives with high sampling rates.

The measurement of the static characteristic of IGBTs (on-state collector-emitter voltage and collector current) has been investigated as another method to estimate the JT [126]. However, this method is not recommended, as the estimated JT is affected by BWLO and the method becomes unreliable if this fault occurs.

As mentioned above, the RLT of IGBT modules can be adversely affected by moisture, since the plastic package is not hermetically sealed. Moisture can lead to corrosion of the aluminum inside the module and, hence, to faults. Zorn and Kaminski [127] showed that an increase of moisture causes a decrease of the avalanche voltage and, above a certain level of humidity, a surge in the leakage current. This leads to temperature stress, especially at the junction of the IGBT. Therefore, moisture can significantly decrease the RLT of IGBT modules.

The correlation between moisture levels and RLT can be obtained from THB tests. Generally, a THB test is conducted at a relative humidity of 85% and a temperature of 85°C to assess the moisture resistance of the IGBT package. The test voltage applied to the IGBT is used to regulate the leakage current. This voltage should be carefully selected, because if the leakage current heats the module, the moisture evaporates and the degradation effect of humidity becomes less evident [128]. As shown by Zorn and Kaminski [129], the problem is particularly delicate when the test voltage is above the bias voltage (typically 90% of the nominal voltage), because the leakage current noticeably increases and makes the degradation effect more observable, although the higher temperature accelerates the evaporation of moisture.
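To relate the duration of a THB test to field conditions, an acceleration factor is needed. The works cited above do not prescribe a specific model here, so the sketch below swaps in Peck's humidity-temperature model, which is widely used for this purpose, as an assumption. The exponent `n`, the activation energy `ea_ev`, and the field conditions are illustrative placeholders that would have to be fitted to the specific package.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K

def peck_acceleration_factor(rh_test, temp_test_c, rh_use, temp_use_c,
                             n=2.7, ea_ev=0.79):
    """Peck model: AF = (RH_test / RH_use)**n * exp(Ea/k * (1/T_use - 1/T_test)).

    Relative humidities in percent, temperatures in deg C (converted to K).
    n and ea_ev are placeholder values, not data for any specific module.
    """
    t_test = temp_test_c + 273.15
    t_use = temp_use_c + 273.15
    humidity_term = (rh_test / rh_use) ** n
    thermal_term = math.exp(ea_ev / BOLTZMANN_EV_PER_K * (1.0 / t_use - 1.0 / t_test))
    return humidity_term * thermal_term

# 85% RH / 85 C test compared with a hypothetical 50% RH / 40 C field site.
af = peck_acceleration_factor(85.0, 85.0, 50.0, 40.0)
test_hours = 1000.0
print(f"AF = {af:.0f}, equivalent field time = {test_hours * af:.0f} h")
```

The model does not capture the self-heating effect discussed above: if the leakage current dries out the package during the test, the effective humidity at the chip is lower than the chamber value and the computed factor overstates the stress.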

7. Conclusions

This paper has presented a literature survey on reliability issues of photovoltaic fields. The main aspects of the subject have been covered by reviewing papers dealing with data acquisition, data management, and modelling. Tradeoffs among high sensitivity, pervasiveness, hardware requirements, effectiveness, and costs have been pointed out. The abundance of high quality works is an indicator of the relevance of the problem for the scientific community; nevertheless, it also shows that many issues are still debated and need a more in-depth investigation.

Competing Interests

The authors declare that they have no competing interests.

Acknowledgments

Dr. A. Mellit would like to thank the ICTP, Trieste (Italy), for providing the materials and the computer facilities.