Abstract

We considered five generalizations of the standard Weibull distribution to describe the lifetime of two important components of sugarcane harvesting machines. The harvesters considered in the analysis harvest an average of 20 tons of sugarcane per hour, and their malfunction may lead to major losses; therefore, an effective maintenance approach is of primary interest for cost savings. The mathematical background of each distribution is presented, and maximum likelihood is used for parameter estimation. Further, different discrimination procedures were used to obtain the best fit for each component. Finally, we propose a maintenance schedule for the components of the harvesters using predictive analysis.

1. Introduction

The arrival of sugarcane cultivation in Brazil has had a significant impact on the national economy and led the country to become the largest producer in the world [1]. Its by-products are used in the food and chemical industries, as well as in electricity generation and fuel production. Mechanized harvesting is one of the most important stages for sugar and ethanol mills, since it supplies the raw material with the quality, timing, and competitive cost required for later processing. Among the machines used in the mechanized harvest, the harvesters stand out for their large number of corrective stops, given that they operate under extreme environmental conditions. In addition, they operate 24 hours a day on workdays, which contributes to the fatigue and wear of their parts. During operation, a harvester processes an average of 20 tons of sugarcane per hour, and its malfunction may lead to major losses; therefore, an effective maintenance approach is of keen interest [2].

Reliability-centered maintenance consists of determining the most effective maintenance approach [3, 4]. This process was first developed in the aviation industry to decide what maintenance work is needed to keep aircraft airborne, driven by the need to improve reliability while reducing the cost of maintenance [5]. Reliability analysis can be used to estimate parameters related to the time until the next machine stop [6], providing information to manage and control the preventive maintenance of harvesters, which could increase production and has potential for cost savings.

In reliability, common procedures are usually based on the assumption that the data follow a Weibull distribution. Introduced by Weibull [7], this distribution has convenient mathematical properties and its physiological failure process arises in many areas (see Manton and Yashin [8]). Additionally, McCool [9] provided an extensive discussion about its use in reliability. However, this distribution cannot be used to describe data with a nonmonotone hazard function (bathtub or upside-down bathtub shaped, to list a few). To overcome this problem, many generalizations of the standard Weibull distribution have been proposed. Murthy et al. [10] presented the application of some generalized Weibull distributions for modeling complex failure data sets. Pham and Lai [11] discussed recent generalizations of Weibull-related lifetime models. Further, Lai [12] reviewed more than 25 generalizations of the Weibull distribution, and Tahir and Cordeiro [13] cited more than 30 compounded Weibull models.

In this paper, we consider five important generalized Weibull distributions with three parameters to describe the lifetime of two important components of the sugarcane harvesting machines. Our main goal here is to correctly predict the next failure of the components, not to present an extensive review of the generalizations of Weibull distribution. In the cited papers, the authors only described mathematical properties of the distributions and conducted the fit for different data. However, reliability is more about correctly predicting the future than describing the past; in this sense, no predictive analysis was presented considering such generalizations.

The distributions considered are the gamma-Weibull distribution [14], generalized Weibull (GW) distribution [15], exponentiated Weibull (EW) distribution [16], Marshall-Olkin Weibull (MOW) distribution [17], and the extended Poisson-Weibull (EPW) distribution [18]. While the first three distributions are the most common three-parameter generalizations of the Weibull, the MOW and the EPW arise in competing and complementary risks scenarios (see Louzada [19] for a detailed discussion). In these cases, the latent variables follow, respectively, a geometric and a zero-truncated Poisson distribution, and each of the components at risk comes from a Weibull baseline distribution.

For each distribution, the mathematical background is reviewed and the maximum likelihood estimators of the parameters are presented. Further, different discrimination procedures are used to obtain the best fit for each component. Finally, we propose a maintenance schedule for the components of the harvesters using predictive analysis.

The remainder of this paper is organized as follows. Section 2 presents the literature review related to the survival models adopted. Section 3 exposes the data collection and empirical analysis, as well as carrying out the predictive analysis based on the parametric models. Finally, in Section 4, we present some final remarks related to the contribution of this study.

2. Theoretical Background

In this section, we present the statistical background on the adopted distributions and their parameter estimation procedures. The following distributions are considered: gamma-Weibull, generalized Weibull, exponentiated Weibull, Marshall-Olkin-Weibull, and extended Poisson-Weibull. They were chosen for their flexibility to accommodate lifetime datasets with hazard functions of different shapes, for instance, constant, increasing, decreasing, bathtub, and upside-down bathtub.

2.1. The Gamma-Weibull Distribution

Introduced by Stacy [14], the gamma-Weibull distribution with three parameters is a flexible model for reliability data due to its ability to accommodate various forms of the hazard function. This distribution is also known as generalized gamma (GG) distribution as it generalizes the two-parameter gamma distribution; hereafter, we will refer to this model as GG distribution to avoid confusion with the GW distribution. A random variable has GG distribution if its probability density function (PDF) is given bywhere , , and . The mean and variance of GG are given bySome relevant distributions are special cases such as the Weibull distribution (when ), the distribution gamma (), log-normal (case limit when ), and the generalized normal distribution (). For example, the generalized normal distribution is also a distribution that includes several distributions known as half-normal (), Rayleigh (), Maxwell-Boltzmann () and chi (). The cumulative distribution function (CDF) is given bywhere is the lower incomplete gamma function. The survival function iswhere is the upper incomplete gamma function.

The hazard function of the GG distribution iswhere the hazard function has constant, increasing, decreasing, bathtub, and upside-down bathtub hazard rate.

For parameter estimation, let be a random sample of size , where . Then, the likelihood function related to the PDF (1) is given by

The log-likelihood is given by

Setting the partial derivatives , , and equal to , we obtain the following maximum likelihood estimators:where . The solution provides the maximum likelihood estimates (MLEs). See, for instance, Ramos et al. [20, 21] and Achcar et al. [22] for a detailed discussion.

Under mild conditions, the estimators become unbiased for large samples and asymptotically efficient. Moreover, such estimators have asymptotically normal joint distribution given bywhere is the Fisher information matrix: that is,and is the trigamma function.

2.2. The Generalized Weibull Distribution

Introduced by Mudholkar et al. [15], the generalized Weibull distribution has PDF given bywhere , , and . The CDF and the survival function are, respectively, given by

The hazard function of the GW distribution is

This model is very flexible to describe lifetime data, since it has the hazard function with constant, increasing, decreasing, bathtub, and upside-down bathtub hazard rate. The quantile function of the GW distribution has closed form and is given by
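As an illustration of why a closed-form quantile function is convenient, the following minimal sketch inverts the GW CDF directly. It assumes the parameterization of Mudholkar et al., with CDF F(t) = 1 - [1 - lam (t/sigma)^alpha]^(1/lam); the function and argument names (`gw_cdf`, `gw_quantile`, `alpha`, `sigma`, `lam`) are hypothetical and may differ from the notation used in this paper.

```python
import math

def gw_cdf(t, alpha, sigma, lam):
    """CDF of the generalized Weibull (assumed Mudholkar-type parameterization)."""
    z = (t / sigma) ** alpha
    if lam == 0.0:
        # The limit lam -> 0 recovers the standard Weibull distribution.
        return 1.0 - math.exp(-z)
    inner = 1.0 - lam * z
    if inner <= 0.0:
        # For lam > 0 the support is bounded above; beyond it the CDF equals 1.
        return 1.0
    return 1.0 - inner ** (1.0 / lam)

def gw_quantile(u, alpha, sigma, lam):
    """Closed-form inverse of gw_cdf for 0 < u < 1."""
    if lam == 0.0:
        return sigma * (-math.log(1.0 - u)) ** (1.0 / alpha)
    return sigma * ((1.0 - (1.0 - u) ** lam) / lam) ** (1.0 / alpha)

# Round trip: the quantile of F(t) should return t (illustrative values only).
t = 1.0
for lam in (-0.5, 0.0, 0.5):
    u = gw_cdf(t, alpha=1.5, sigma=2.0, lam=lam)
    print(lam, gw_quantile(u, alpha=1.5, sigma=2.0, lam=lam))
```

Because the inverse is explicit, a predictive value at any probability level can be computed without numerical root finding.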

For parameter estimation, let be a random sample of size , where . Then, the likelihood function related to the PDF (11) is given by

The log-likelihood is given by

Setting the partial derivatives equal to , we obtain the maximum likelihood estimators. Here, we follow Mudholkar et al. [15] which considers the direct maximization of (16). Under mild conditions, the obtained estimators are consistent and efficient with an asymptotically normal joint distribution given bywhere is the Fisher information matrix associated with the vector of parameters and is the Fisher information elements in and given by

Since the Fisher information matrix does not have closed-form expression for some terms, an alternative is to consider the observed information matrix, where the terms are given by

Hereafter, we considered the same approach to obtain the confidence intervals for the parameters from other distributions.

2.3. The Exponentiated Weibull Distribution

Introduced by Mudholkar et al. [16], the exponentiated Weibull distribution with PDF is given bywhere , and .

The exponentiated Weibull distribution includes the Weibull distribution () and the exponentiated exponential distribution (). The survival function is given by

The hazard function of the EW distribution is

The shapes of the hazard function are analogous to the GG and GW distribution. Additionally, the quantile function of the EW distribution has closed form and is given by
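The sketch below illustrates the EW quantities discussed in this section under an assumed parameterization, F(t) = [1 - exp(-(t/sigma)^alpha)]^theta. The names `ew_cdf`, `ew_hazard`, `ew_quantile`, and the symbols `alpha`, `sigma`, `theta` are hypothetical and are not necessarily the notation of the paper.

```python
import math

def ew_cdf(t, alpha, sigma, theta):
    """CDF of the exponentiated Weibull (assumed parameterization)."""
    return (1.0 - math.exp(-((t / sigma) ** alpha))) ** theta

def ew_hazard(t, alpha, sigma, theta):
    """Hazard h(t) = f(t) / S(t), with f obtained by differentiating ew_cdf."""
    z = (t / sigma) ** alpha
    base = 1.0 - math.exp(-z)
    pdf = (theta * (alpha / sigma) * (t / sigma) ** (alpha - 1.0)
           * math.exp(-z) * base ** (theta - 1.0))
    return pdf / (1.0 - base ** theta)

def ew_quantile(u, alpha, sigma, theta):
    """Closed-form inverse of ew_cdf for 0 < u < 1."""
    return sigma * (-math.log(1.0 - u ** (1.0 / theta))) ** (1.0 / alpha)

# With alpha = theta = 1 the model reduces to the exponential distribution,
# so the hazard is constant and equal to 1 / sigma.
print(ew_hazard(2.0, alpha=1.0, sigma=4.0, theta=1.0))
```

A predictive value at probability level `p` would then be `ew_quantile(p, ...)` evaluated at the fitted parameters; the level `p` is a choice of the analyst.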

The th moment of the EW distribution is given bywhere , . The proof of this equality is presented by Choudhury [23].

For parameter estimation, let be a random sample of size , where . Then, the likelihood function related to the PDF (20) is given by

The log-likelihood is given by

Setting the partial derivatives , , and equal to , we obtain the following maximum likelihood estimators:

2.4. The Marshall-Olkin-Weibull Distribution

Marshall and Olkin [17] presented a new procedure for introducing an additional parameter into a family of distribution. In this case, the authors applied such procedure in the Weibull distribution. The obtained PDF of the MOW distribution is given bywhere , , and . The MOW distribution arises naturally in competing risks scenarios. Let , where is a random variable with geometrical distribution and are assumed to be independent and identically distributed according to a Weibull distribution; then the has a PDF given by (28). Cordeiro and Lemonte [24] derived many properties and the parameter estimators for the MOW distribution; the following results were obtained from the cited work. The survival function is given by

The hazard function of the MOW distribution iswhere its behavior is constant, increasing, decreasing, bathtub, and unimodal. Moreover, the quantile function of the MOW distribution has closed form and is given by
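A minimal sketch of the MOW survival and quantile functions follows. It assumes the Marshall-Olkin construction applied to a Weibull baseline with survival exp(-(lam t)^gamma), which gives S(t) = a w / (1 - (1 - a) w) with w the baseline survival; the names `mow_survival`, `mow_quantile`, `a`, `lam`, and `gamma` are hypothetical.

```python
import math

def mow_survival(t, a, lam, gamma):
    """Survival function of the Marshall-Olkin Weibull (assumed parameterization)."""
    w = math.exp(-((lam * t) ** gamma))  # Weibull baseline survival
    return a * w / (1.0 - (1.0 - a) * w)

def mow_quantile(u, a, lam, gamma):
    """Closed-form solution of 1 - mow_survival(t) = u for 0 < u < 1."""
    s = 1.0 - u
    # Solving a*w / (1 - (1 - a)*w) = s for w gives w = s / (a + (1 - a)*s).
    return math.log((a + (1.0 - a) * s) / s) ** (1.0 / gamma) / lam

# Round trip with illustrative parameter values.
t = 1.7
u = 1.0 - mow_survival(t, a=0.4, lam=0.5, gamma=1.3)
print(mow_quantile(u, a=0.4, lam=0.5, gamma=1.3))
```

Setting `a = 1` recovers the Weibull baseline, which is a quick sanity check on any implementation.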

For parameter estimation, let be a random sample of size , where . Then, the likelihood function related to the PDF (28) is given by

The log-likelihood is given by

Setting the partial derivatives , , and equal to , we obtain the following maximum likelihood estimators:for more details, see Cordeiro and Lemonte [24].

2.5. The Extended Poisson-Weibull Distribution

Ramos et al. [18] introduced the extended Poisson-Weibull (EPW) distribution as a generalization of the Weibull-Poisson distribution (see Hemmati et al. [25]), where its PDF is given bywhere , , and . Analogously to the MOW distribution, the EPW model arises naturally in competing risks scenarios. Let , where is a random variable with a zero-truncated Poisson distribution and are assumed to be independent and identically distributed according to a Weibull distribution, then the has a PDF given by (35). The survival function is given by

The hazard function of the EPW distribution is

For the EPW distribution, the hazard function has different shapes such as constant, increasing, decreasing, bathtub, and upside-down bathtub. Furthermore, the quantile function of the EPW distribution has closed form and is given by
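The following sketch shows how a closed-form quantile follows from the competing risks construction. It assumes the lifetime is the minimum of M Weibull lifetimes with M zero-truncated Poisson, so that S(t) = (exp(lam S0(t)) - 1) / (exp(lam) - 1) with Weibull baseline S0(t) = exp(-beta t^alpha); a negative `lam` then corresponds to the complementary risks case. The baseline parameterization and all names (`epw_survival`, `epw_quantile`, `lam`, `beta`, `alpha`) are assumptions made for illustration.

```python
import math

def epw_survival(t, lam, beta, alpha):
    """Survival function under the assumed EPW construction (lam != 0)."""
    s0 = math.exp(-beta * t ** alpha)  # Weibull baseline survival
    return math.expm1(lam * s0) / math.expm1(lam)

def epw_quantile(u, lam, beta, alpha):
    """Closed-form solution of 1 - epw_survival(t) = u for 0 < u < 1."""
    s = 1.0 - u
    # Invert S = (exp(lam*s0) - 1) / (exp(lam) - 1) for the baseline survival s0.
    s0 = math.log1p(s * math.expm1(lam)) / lam
    return (-math.log(s0) / beta) ** (1.0 / alpha)

# Round trip for a competing (lam > 0) and a complementary (lam < 0) case.
for lam in (2.0, -1.5):
    u = 1.0 - epw_survival(3.0, lam, beta=0.1, alpha=1.2)
    print(lam, epw_quantile(u, lam, beta=0.1, alpha=1.2))
```

`math.expm1` and `math.log1p` are used to keep the computation accurate when `lam` is close to zero, where the model approaches the Weibull baseline.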

For parameter estimation, let $t_1, \ldots, t_n$ be a random sample of size $n$ from the EPW distribution. Then, the likelihood function related to the PDF (35) is given by

The log-likelihood is given by

Setting the partial derivatives , , and equal to , we obtain the following maximum likelihood estimators:

2.6. Goodness of Fit

First, in order to verify the behavior of the empirical data, the Total Time on Test plot (TTT-plot) was considered (Barlow and Campo [26]). The TTT-plot is obtained by plotting $[r/n, G(r/n)]$, where $$G(r/n) = \frac{\sum_{i=1}^{r} t_{(i)} + (n-r)\,t_{(r)}}{\sum_{i=1}^{n} t_{(i)}}, \quad r = 1, \ldots, n,$$ and $t_{(i)}$, $i = 1, \ldots, n$, is the ordered data. For data with a concave (convex) curve, the hazard function has an increasing (decreasing) shape. If the curve starts convex and then becomes concave (concave and then convex), the hazard function has a bathtub (inverse bathtub) shape.
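As a minimal sketch, the scaled TTT transform can be computed as follows, using the usual definition G(r/n) = (sum of the r smallest observations + (n - r) times the r-th smallest) / (sum of all observations). The function name `ttt_points` is hypothetical.

```python
def ttt_points(data):
    """Return the TTT-plot coordinates (r/n, G(r/n)) for r = 1, ..., n."""
    t = sorted(data)
    n = len(t)
    total = sum(t)
    points = []
    running = 0.0  # sum of the r smallest observations
    for r, value in enumerate(t, start=1):
        running += value
        g = (running + (n - r) * value) / total
        points.append((r / n, g))
    return points

# Toy example with three failure times (illustrative values only).
for x, g in ttt_points([3.0, 1.0, 2.0]):
    print(round(x, 3), round(g, 3))
```

Plotting these points against the diagonal shows whether the curve is concave, convex, or changes curvature, which is how the hazard shape is read off.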

The goodness of fit is checked considering the Kolmogorov-Smirnov (KS) test. This procedure is based on the KS statistic $D_n = \sup_t |F_n(t) - F(t;\theta)|$, where $\sup$ is the supremum of the set of distances, $F_n(t)$ is the empirical distribution function, and $F(t;\theta)$ is the CDF. A hypothesis test is conducted at the 5% level of significance to test whether or not the data come from $F(t;\theta)$. In this case, the null hypothesis is rejected if the returned p-value is smaller than 0.05.
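The KS statistic can be computed directly from the ordered sample, since the supremum of |F_n(t) - F(t)| is attained just before or at an observed point. The sketch below is a generic illustration; `ks_statistic` is a hypothetical name and `cdf` stands for any fitted CDF passed as a callable. Computing the p-value is not shown.

```python
def ks_statistic(data, cdf):
    """Kolmogorov-Smirnov distance between the empirical CDF and `cdf`."""
    t = sorted(data)
    n = len(t)
    d = 0.0
    for i, value in enumerate(t, start=1):
        f = cdf(value)
        # The empirical CDF jumps from (i - 1)/n to i/n at the i-th ordered point,
        # so both one-sided gaps must be checked.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

# Toy check against the uniform CDF on [0, 1].
print(ks_statistic([0.25, 0.5, 0.75], lambda x: x))
```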

The following discrimination criteria were adopted: the Akaike information criterion (AIC) and the corrected AIC (AICc), computed, respectively, by $\mathrm{AIC} = -2\ell(\hat{\theta}; t) + 2k$ and $\mathrm{AICc} = \mathrm{AIC} + 2k(k+1)/(n-k-1)$, where $k$ is the number of parameters to be fitted and $\hat{\theta}$ is the MLE of $\theta$. For a set of candidate models for $t$, the best one provides the minimum values.
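For reference, both criteria are simple functions of the maximized log-likelihood. The sketch below uses the standard definitions AIC = -2 loglik + 2k and AICc = AIC + 2k(k + 1)/(n - k - 1); the function names and the numeric inputs are hypothetical and are not taken from the tables of this paper.

```python
def aic(loglik, k):
    """Akaike information criterion for a model with k fitted parameters."""
    return -2.0 * loglik + 2.0 * k

def aicc(loglik, k, n):
    """Small-sample corrected AIC; requires n > k + 1."""
    return aic(loglik, k) + 2.0 * k * (k + 1) / (n - k - 1)

# Illustrative comparison of two three-parameter candidates fitted to n = 50 points.
candidates = {"model_1": -100.0, "model_2": -98.5}
best = min(candidates, key=lambda name: aicc(candidates[name], k=3, n=50))
print(best)
```

Since all five distributions considered here have three parameters, ranking by AIC or AICc on the same dataset is equivalent to ranking by the maximized log-likelihood.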

3. Data Collection and Empirical Analysis

The dataset came from two sources: a manual stop system, which brings the history of revisions and corrective stops of two sugarcane harvesters; and data from the onboard computers of the harvesters, which provide information on the operation of the machine. The data were collected from January 2015 to August 2017, a period corresponding to 2.5 harvests (crops), that is, a period of thirty months of activity.

3.1. Empirical Analysis

First, we examined the records of all stops and their causes in order to assess the performance of the current maintenance policy. In total, 1347 stops were observed, of which 186 were preventive and 1161 were corrective. Unplanned stops therefore account for about 86% of the total, which calls into question the effectiveness of the preventive maintenance. Table 1 shows the failures per harvest, considering both machines.

The Pricker and the transmission of each machine were selected because of the complexity of their maintenance. Figure 1 shows the number of failures per year, divided by harvest and taking their temporal sparsity into account; the items analyzed in this study correspond to 18% of the stops.

The machines behave differently: both are affected by transmission and Pricker problems, but machine B is more affected by problems with the Pricker. Reliability models were then fitted to each component individually and compared, as described in the next section.

3.2. Preventive Maintenance

In this section, we discuss a parametric approach in order to perform a predictive analysis for the lifetime of the components.

3.2.1. Pricker from Machine A

Table 2 shows a high failure rate shortly after repair, which increases the cost of production. As noted above, the experiment covered a total period of 30 months. The equipment had three off-seasons during this period; these were not included in the dataset, so the equipment was only observed during its active operation.

Figure 2 presents the TTT-plot and the survival function fitted by different generalizations of the Weibull distribution.

From the TTT-plot, we observe that the data have a unimodal hazard rate, which implies that all the proposed models may be used to describe this dataset. Additionally, the survival functions fitted by the different distributions show that the proposed models provide a good fit to the data. In order to select the best fit, we considered the results of AIC and AICc (see Table 3).

Among the proposed models, the exponentiated Weibull distribution has the best fit, since it returned the smallest AIC and AICc values. Therefore, using the exponentiated Weibull distribution, we computed the maximum likelihood estimates and the predictive value (see Table 4). Hereafter, since the predictive value is obtained from the quantile function, the confidence intervals (CI) for this estimate were obtained using the bootstrap technique [27].
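The text does not state which bootstrap variant was used, so the following is only a sketch of one common choice, the nonparametric percentile bootstrap. The `estimator` argument is a hypothetical callable; in the setting of this paper it would refit the chosen distribution to each resample and return the predictive value from the fitted quantile function. Here the sample median stands in for it so that the example stays self-contained, and the failure times are made up.

```python
import random
import statistics

def bootstrap_ci(data, estimator, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for estimator(data)."""
    rng = random.Random(seed)
    n = len(data)
    # Resample with replacement, re-estimate, and sort the replicates.
    stats = sorted(
        estimator([rng.choice(data) for _ in range(n)]) for _ in range(n_boot)
    )
    tail = (1.0 - level) / 2.0
    lo_idx = max(0, int(tail * n_boot))
    hi_idx = min(n_boot - 1, int((1.0 - tail) * n_boot) - 1)
    return stats[lo_idx], stats[hi_idx]

# Stand-in example: CI for the median of hypothetical times between failures (days).
times = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 6.0, 9.0]
print(bootstrap_ci(times, statistics.median))
```

Fixing the seed makes the interval reproducible, which is useful when the same analysis is rerun for several components.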

From Table 4, we observe that the predictive maintenance should be done in approximately 3 days after the last failure with confidence interval between 2 and 4 days.

3.2.2. Pricker from Machine B

A similar behavior is observed for the Pricker of machine B: Table 5 also shows a high failure rate. As before, only the time of active operation was considered.

Figure 3 presents the TTT-plot and the survival function fitted by different generalizations of the Weibull distribution, similar to the previous machine.

The TTT-plot shows that the data have a unimodal hazard rate, which implies that all the proposed models may be used to describe the dataset. As in the previous case, the survival functions fitted by the different distributions show that the proposed models provide a good fit to the data. Therefore, to select the best fit, we considered the results of AIC and AICc (see Table 6).

From the obtained results, we observe that the EW distribution also provided the best fit among the proposed models. Furthermore, the maximum likelihood estimates for the EW distribution were computed, as well as the predictive value. Table 7 presents the MLEs, standard deviations, and confidence intervals for the three parameters and the predictive value of the EW distribution.

The results in Table 7 suggest that predictive maintenance should be done after approximately 3 days as a point estimate, or between approximately 2 and 4 days according to the 95% confidence interval. Thus, the predicted lifetime of the Pricker showed no difference between the two machines.

3.2.3. Transmission from Machine A

Table 8 shows that, for the transmission of machine A, more than 50% of the failures occur within 8 days after repair.

Figure 4 presents the TTT-plot and the survival function fitted by different generalizations of the Weibull distribution.

As can be seen in the TTT-plot, the data also satisfy the assumption on the hazard rate shape. However, the survival function indicates that the generalized Weibull distribution is not a good candidate to describe the data. Table 9 presents the results of AIC and AICc in order to select the best fit.

From Table 9 we can see that the GW distribution has a KS test p-value smaller than 0.05; therefore, it is not a suitable candidate to fit the data. Overall, the GG distribution has the best fit, since it has the smallest AIC and AICc. Therefore, we computed the maximum likelihood estimates and the predictive value using the GG distribution. Table 10 presents the MLEs, standard deviations, and confidence intervals for the three parameters and the predictive value of the GG distribution.

The results in Table 10 suggest that predictive maintenance should be done after approximately 3 days as a point estimate, or between approximately 2 and 4 days according to the 95% confidence interval.

3.2.4. Transmission from Machine B

Compared with the other components, the transmission of machine B presented a smaller number of failures. Table 11 shows the sparsity of the dataset for the transmission of sugarcane harvester B.

Figure 5 presents the TTT-plot and the survival function fitted by different generalizations of the Weibull distribution, considering the transmission from machine B.

From the TTT-plot, we observe that the data have a bathtub-shaped hazard rate. Moreover, the fitted survival functions show that all models are candidates to describe the lifetime of the transmission of machine B.

Table 12 presents the results of AIC and AICc in order to discriminate the best fit.

As shown in Table 12, the EPW distribution has the minimum AIC and AICc. Therefore, we computed its maximum likelihood estimates and predictive value. Table 13 presents the MLEs, standard deviations, and confidence intervals for the three parameters and the predictive value of the EPW distribution.

The results in Table 13 suggest that predictive maintenance should be done after approximately 7 days as a point estimate, or between approximately 4 and 10 days according to the 95% confidence interval.

4. Final Remarks

In this study, we considered different distributions to describe the lifetime of sugarcane harvesting machine components. The harvesters stand out for having a large number of corrective stops, given that they operate under extreme environmental conditions. However, these harvesters do not have an effective preventive maintenance policy, which affects their working schedule. To overcome this problem, we presented a predictive analysis based on the percentiles of probability models, aiming to incorporate intelligence into maintenance planning.

The Weibull distribution is a popular model that can be used to describe a wide range of problems; however, it cannot be used to describe data with nonmonotone hazard rate. Thus, many generalizations of the Weibull distribution have been proposed to overcome this problem. Since the proposed datasets have nonmonotone hazard rate, we considered some flexible generalizations such as the Gamma-Weibull, the generalized Weibull, the exponentiated Weibull, Marshall-Olkin Weibull, and the extended Poisson-Weibull distribution. For the proposed distributions, some mathematical functions were discussed as well as the parameter estimators under the maximum likelihood approach.

The proposed distributions were fitted to the datasets using maximum likelihood estimators. The exponentiated Weibull presented a superior fit for the Pricker component of both machines; in these cases, we concluded that predictive maintenance should be done after approximately 3 days. On the other hand, for the transmission component, the distributions with the best fit were the Gamma-Weibull distribution for machine A and the extended Poisson-Weibull for machine B, where predictive maintenance should be done, respectively, 3 and 7 days after the last failure.

Further work could extend the fitted models by including other generalizations of the Weibull distribution. A recurrent-event data structure could also be incorporated and its forecast accuracy analyzed. Finally, this approach could be implemented as a software application to help the maintenance department with individualized scheduling.

Abbreviations

Acronyms
GW:Generalized Weibull
EW:Exponentiated Weibull
MOW:Marshall-Olkin Weibull
EPW:Extended Poisson-Weibull
GG:Generalized gamma
MLE:Maximum likelihood estimator
PDF:Probability density function
CDF:Cumulative distribution function
TTT:Total Time on Test
KS:Kolmogorov-Smirnov
AIC:Akaike Information Criterion
AICc:Corrected Akaike Information Criterion
SD:Standard deviation
CI:Confidence interval.
Notations
:Probability density function
:Mean function
:Variance function
:Lower incomplete gamma function
:Upper incomplete gamma function
:Cumulative distribution function
:Survival function
:Hazard rate function
:Likelihood function
:Log-likelihood function
:Digamma function
:Trigamma function
:Expected Fisher information matrix
:Positive parameter
:Positive parameter
:Positive parameter
:Real parameter
:Positive parameter
:Quantile function
:Vector of parameters
:Observed Fisher information matrix
:th moment
:TTT-plot
:Kolmogorov-Smirnov statistic
:Predictive value.

Conflicts of Interest

No potential conflicts of interest were reported by the authors.

Acknowledgments

The research was partially supported by CNPq, FAPESP, and CAPES of Brazil.