Abstract

Entropy is a useful indicator of information content and has been used in a number of applications. The log-logistic (LL) distribution is a probability distribution that is often employed in survival analysis. This paper addresses the problem of estimating multiple entropy measures for an LL distribution under progressive type II censoring. We derive expressions for six different entropy measures. The maximum likelihood approach is applied to obtain estimators of the proposed entropy measures, and approximate confidence intervals are calculated for them. A numerical evaluation is performed using various censoring schemes and sample sizes to characterize the behavior of the estimators in terms of relative biases, mean squared errors, average interval lengths, and coverage probabilities. The numerical analysis reveals that the accuracy measures improve with sample size and that the suggested entropy estimates approach the true values as the censoring level decreases. Finally, a real dataset is analyzed for demonstration purposes.

1. Introduction

Over the last few decades, the log-logistic (or Fisk) distribution (LLD) has been frequently employed, notably in the areas of survival and reliability analysis. The LLD is a popular alternative to the log-normal distribution because it has a failure rate function that grows with time, peaks after a certain period, and then gradually decreases [1]. Unlike the log-normal, the cumulative distribution function (CDF) of the LLD has a closed form. For some parameter values, this distribution can also have a monotonically decreasing failure rate function. In economics, the LLD is used to model wealth and income [2], while in hydrology it is used to describe stream flow data [3]. For additional details on the significance and applications of the LLD, see Bennett [4], Ahmad et al. [5], and Robson and Reed [6].

A random variable (RVr) $X$ is said to have a LLD with scale parameter $\alpha > 0$ and shape parameter $\beta > 0$ if its CDF is given by

$$F(x; \alpha, \beta) = \frac{(x/\alpha)^{\beta}}{1 + (x/\alpha)^{\beta}}, \quad x > 0. \qquad (1)$$

The probability density function (PDF) corresponding to (1) is then given by

$$f(x; \alpha, \beta) = \frac{(\beta/\alpha)\,(x/\alpha)^{\beta - 1}}{\left[1 + (x/\alpha)^{\beta}\right]^{2}}, \quad x > 0. \qquad (2)$$

The hazard function of the LLD is either decreasing or inverted bathtub shaped, and the PDF is either reversed-J shaped or unimodal; see Johnson et al. [7].

In reliability studies, researchers want to observe how long it takes for units to fail. However, due to time and cost restrictions, as well as a variety of other factors, experimenters are often unable to track the lifetimes of all units, so only censored data are available. The most frequent censoring types are type I and type II, and these popular censoring techniques are well explored in the statistical literature. However, in medical and engineering survival analysis, units may be removed at intermediate stages for a variety of reasons beyond the experimenter's control. In this case, a progressive censoring (PC) scheme is an appropriate strategy, since it permits surviving items to be removed before the test ends. PC has the benefit of terminating the test quickly while still including at least some extreme lifetimes in the sample data. Under progressive type I censoring (PT1C), removals occur at prespecified time points, whereas under progressive type II censoring (PT2C), removals occur at the observed failure times.

A PT2C sample is obtained as follows. A life testing experiment starts with $n$ units and a prefixed PC scheme $(R_1, R_2, \ldots, R_m)$. At the moment of the first failure $x_{1:m:n}$, $R_1$ units are randomly removed from the $n - 1$ remaining surviving units. Similarly, $R_2$ units are randomly removed from the $n - R_1 - 2$ remaining units after the second failure $x_{2:m:n}$. The test continues until the $m$-th failure $x_{m:m:n}$ occurs, at which point all remaining $R_m = n - m - R_1 - \cdots - R_{m-1}$ units are removed. The number of failures $m$ as well as the progressive censoring design $(R_1, \ldots, R_m)$ are preset and fixed. Let $x_{1:m:n} < x_{2:m:n} < \cdots < x_{m:m:n}$ denote such a PT2C sample with $(R_1, \ldots, R_m)$ being the PC scheme. Balakrishnan and Aggrawala [8] provided some historical remarks and a good summary of progressive censoring. It should be observed that this scheme reduces to classical type II censoring (T2C) when $R_1 = \cdots = R_{m-1} = 0$ and $R_m = n - m$; further, it reduces to a complete sample having no censoring for $m = n$ and $R_1 = \cdots = R_m = 0$.
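The removal mechanism above can be simulated directly. The following sketch (in Python, though the paper's computations use R; the scheme, parameter values, and function names are illustrative assumptions) generates a PT2C sample from the LLD via the uniform-transformation algorithm summarized by Balakrishnan and Aggrawala [8]:

```python
import numpy as np

def pt2c_sample(n, R, quantile, seed=None):
    """Progressive type II censored sample of size m = len(R),
    using the uniform-transformation algorithm."""
    rng = np.random.default_rng(seed)
    R = np.asarray(R)
    m = len(R)
    assert n == m + R.sum(), "scheme must satisfy n = m + R_1 + ... + R_m"
    W = rng.uniform(size=m)
    # gamma_i = i + R_m + R_{m-1} + ... + R_{m-i+1}
    gamma = np.arange(1, m + 1) + np.cumsum(R[::-1])
    V = W ** (1.0 / gamma)
    U = 1.0 - np.cumprod(V[::-1])      # ordered uniform PT2C sample
    return quantile(U)                 # transform by the inverse CDF

# Log-logistic quantile function with assumed scale alpha and shape beta
alpha, beta = 2.0, 1.5
ll_quantile = lambda u: alpha * (u / (1.0 - u)) ** (1.0 / beta)

x = pt2c_sample(n=20, R=[2, 0, 1, 0, 3, 0, 0, 2, 0, 2],
                quantile=ll_quantile, seed=1)
print(x.round(3))   # m = 10 ordered failure times
```

Any lifetime distribution with a tractable quantile function can be plugged in the same way.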

In information theory, entropy is a measure of the uncertainty in a RVr that gauges the expected value of the information embodied in that RVr. Entropy measures are motivated by how receiving new information decreases uncertainty. Shannon's entropy is one of the earliest and most commonly used measures of entropy, and it has proven effective in the study of communication systems. Let $X$ be a non-negative RVr with a continuous CDF and PDF $f$; the formal measure of Shannon's entropy is characterized by

$$H(X) = -\int_{0}^{\infty} f(x) \log f(x)\, dx.$$

One of the most significant disadvantages of Shannon's measure is that it may be negative for particular probability distributions, making it useless as a measure of uncertainty. Rényi [9] developed a new generalized entropy by studying the concepts of uncertainty and randomness. The Rényi entropy ($R_\delta$) is calculated as follows:

$$R_\delta(X) = \frac{1}{1-\delta} \log \int_{0}^{\infty} f^{\delta}(x)\, dx, \qquad (3)$$

where the constant $\delta > 0$, $\delta \neq 1$, is conditional, leading to a positive entropy. Different generalizations of entropy were proposed by Havrda and Charvat [10], Arimoto [11], Awad et al. [12], and Tsallis [13].

Havrda and Charvat [10] suggested an extension of (3). This extension, called the HC entropy of degree $\delta$, is characterized by

$$HC_\delta(X) = \frac{1}{2^{1-\delta} - 1} \left[\int_{0}^{\infty} f^{\delta}(x)\, dx - 1\right], \quad \delta > 0,\ \delta \neq 1. \qquad (4)$$

Arimoto's (Ar) entropy measure (see [11]) is characterized by

$$A_\delta(X) = \frac{\delta}{1-\delta} \left[\left(\int_{0}^{\infty} f^{\delta}(x)\, dx\right)^{1/\delta} - 1\right], \quad \delta > 0,\ \delta \neq 1. \qquad (5)$$

Awad et al. [12] suggested two types of entropy: an extension of Rényi entropy and an extension of Havrda and Charvat entropy. The first and second extensions, referred to here as the A-entropies, are characterized by (6) and (7), respectively.

Tsallis [13] generalized Shannon's entropy and defined the measure as

$$T_\delta(X) = \frac{1}{\delta-1} \left(1 - \int_{0}^{\infty} f^{\delta}(x)\, dx\right), \quad \delta > 0,\ \delta \neq 1. \qquad (9)$$
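The Rényi, HC, Ar, and Tsallis measures are all functions of the single integral $\int f^{\delta}(x)\,dx$. A brief sketch (Python rather than the paper's R; the order $\delta$ and the LLD parameters below are assumed illustrative values) evaluates the four measures numerically for the LLD, which scipy exposes as `fisk`:

```python
import numpy as np
from scipy import integrate, stats

delta = 1.5                           # entropy order (delta > 0, delta != 1)
dist = stats.fisk(c=1.5, scale=2.0)   # scipy's fisk is the log-logistic

# All four measures below depend on I = integral of f(x)^delta
I, _ = integrate.quad(lambda x: dist.pdf(x) ** delta, 0, np.inf)

renyi   = np.log(I) / (1.0 - delta)                           # Renyi (3)
hc      = (I - 1.0) / (2.0 ** (1.0 - delta) - 1.0)            # Havrda-Charvat (4)
arimoto = delta / (1.0 - delta) * (I ** (1.0 / delta) - 1.0)  # Arimoto (5)
tsallis = (1.0 - I) / (delta - 1.0)                           # Tsallis (9)
```

Because they share the integral $I$, the measures are simple monotone transformations of one another for a fixed $\delta$.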

Many researchers have worked on entropy estimation for various life distributions. Cramer and Bagh [14] used progressive censoring to investigate entropy in the Weibull distribution. Kang et al. [15] used doubly type II censored data to construct entropy estimators for a double exponential distribution. Using record values from the generalized half-logistic distribution, Seo et al. [16] calculated an entropy estimate. Cho et al. [17] addressed entropy estimates for the Rayleigh distribution using doubly generalized type II hybrid censoring. Cho et al. [18] used generalized type II hybrid censored samples to derive estimators for the entropy function of a Weibull distribution. Dey et al. [19] studied the loss of entropy for a truncated Rayleigh distribution using different entropy measures. Chacko and Asha [20] investigated entropy estimation in the generalized exponential distribution. Bantan et al. [21] considered entropy estimators for the inverse Lomax distribution via a multiple censoring scheme. The dynamic cumulative residual Rényi entropy for the Lomax distribution was estimated using Bayesian and maximum likelihood (ML) techniques by Al-Babtain et al. [22]. Entropy estimators for the Lindley distribution were developed by Almarashi et al. [23]. Helmy et al. [24] investigated Shannon entropy estimation for the Lomax distribution using unified hybrid censored data. Bayesian and non-Bayesian estimation of the Nadarajah–Haghighi distribution using a progressive type I censoring scheme was studied by Elbatal et al. [25].

In this paper, we are motivated to investigate six possible entropy estimators for the log-logistic distribution in the presence of PT2C data. We construct analytical formulas for the proposed entropy measures. The ML estimators and two-sided approximate confidence intervals of the several entropy estimators are calculated. Numerical comparisons for various sample sizes are presented to identify which entropy estimator outperforms the others.

The paper is broken down into five sections. Section 2 presents expressions for the recommended entropy measures based on LLD. PT2C is used in Section 3 to give several entropy estimators as well as their estimated confidence intervals. In Section 4, numerical comparisons of different entropy estimators and data analysis are examined. Finally, in Section 5, there are some conclusions and a summary of the study.

2. Expressions of Entropy Measures

Statistical entropy measures the amount of uncertainty or variability in a RVr: the higher the value of the entropy, the more variability in the data. This section focuses on obtaining expressions for the various entropy measures of the LLD.

2.1. Rényi Entropy

The Ré entropy of the LLD is obtained by substituting (2) into (3) as follows:

$$R_\delta = \frac{1}{1-\delta} \log \int_{0}^{\infty} \left[\frac{(\beta/\alpha)(x/\alpha)^{\beta-1}}{\left(1 + (x/\alpha)^{\beta}\right)^{2}}\right]^{\delta} dx. \qquad (10)$$

Write $t = (x/\alpha)^{\beta}$ and assume that $\delta - (1-\delta)/\beta > 0$; the integral can then be written as

$$\int_{0}^{\infty} f^{\delta}(x)\, dx = \left(\frac{\beta}{\alpha}\right)^{\delta-1} B\!\left(\delta + \frac{1-\delta}{\beta},\ \delta - \frac{1-\delta}{\beta}\right), \qquad (11)$$

where $B(\cdot,\cdot)$ is the beta function. As a result of inserting (11) into (10), the Ré entropy of the LLD takes the form

$$R_\delta = \log\frac{\alpha}{\beta} + \frac{1}{1-\delta} \log B\!\left(\delta + \frac{1-\delta}{\beta},\ \delta - \frac{1-\delta}{\beta}\right). \qquad (12)$$

Hence, (12) is the required formulation of the LLD Ré entropy.
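The beta-function expression for the Ré entropy can be cross-checked against direct numerical integration. A sketch with assumed illustrative parameter values (Python here, though the paper's computations use R):

```python
import numpy as np
from scipy import integrate, special, stats

alpha, beta, delta = 2.0, 1.5, 1.5   # illustrative parameter values

# Closed form via the beta function (substitution t = (x/alpha)^beta)
a = delta + (1.0 - delta) / beta
b = delta - (1.0 - delta) / beta
renyi_closed = np.log(alpha / beta) + np.log(special.beta(a, b)) / (1.0 - delta)

# Direct numerical evaluation of (1/(1-delta)) log integral of f^delta
dist = stats.fisk(c=beta, scale=alpha)   # scipy's log-logistic
I, _ = integrate.quad(lambda x: dist.pdf(x) ** delta, 0, np.inf)
renyi_numeric = np.log(I) / (1.0 - delta)
```

The two values agree to quadrature accuracy, which is a quick sanity check on the substitution.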

2.2. Havrda and Charvat Entropy

The HC entropy of the LLD is obtained by substituting (2) into (4) as follows:

$$HC_\delta = \frac{1}{2^{1-\delta} - 1} \left[\int_{0}^{\infty} \left[\frac{(\beta/\alpha)(x/\alpha)^{\beta-1}}{\left(1 + (x/\alpha)^{\beta}\right)^{2}}\right]^{\delta} dx - 1\right]. \qquad (13)$$

As a result, we get the HC entropy of the LLD by inserting (11) into (4) as follows:

$$HC_\delta = \frac{1}{2^{1-\delta} - 1} \left[\left(\frac{\beta}{\alpha}\right)^{\delta-1} B\!\left(\delta + \frac{1-\delta}{\beta},\ \delta - \frac{1-\delta}{\beta}\right) - 1\right]. \qquad (14)$$

Hence, (14) is the required formulation of the LLD HC entropy.

2.3. Arimoto Entropy

The Ar entropy of the LLD is obtained by substituting (2) into (5) as follows:

$$A_\delta = \frac{\delta}{1-\delta} \left[\left(\int_{0}^{\infty} \left[\frac{(\beta/\alpha)(x/\alpha)^{\beta-1}}{\left(1 + (x/\alpha)^{\beta}\right)^{2}}\right]^{\delta} dx\right)^{1/\delta} - 1\right]. \qquad (15)$$

We obtain the Ar entropy of the LLD by putting (11) into (15) as follows:

$$A_\delta = \frac{\delta}{1-\delta} \left[\left(\frac{\beta}{\alpha}\right)^{(\delta-1)/\delta} B^{1/\delta}\!\left(\delta + \frac{1-\delta}{\beta},\ \delta - \frac{1-\delta}{\beta}\right) - 1\right]. \qquad (16)$$

Thus, the formula for the Ar entropy is given in Equation (16).

2.4. A-entropies

To obtain (6) and (7), we must first find $\eta$, the maximum value of the PDF (2), as below:

After simplification, (17) can be written as follows, which leads to the following:

Using (17), we obtain $\eta$ as

Using (11) and (19), the A-entropies can then be expressed as

Thus, (20) and (21) provide the required expressions of the A-entropies.

2.5. Tsallis Entropy

The Tsallis entropy of the LLD is obtained by substituting (2) into (9) as follows:

$$T_\delta = \frac{1}{\delta-1} \left(1 - \int_{0}^{\infty} \left[\frac{(\beta/\alpha)(x/\alpha)^{\beta-1}}{\left(1 + (x/\alpha)^{\beta}\right)^{2}}\right]^{\delta} dx\right). \qquad (23)$$

Thus, using (11) in (23), the Tsallis entropy of the LLD is obtained as follows:

$$T_\delta = \frac{1}{\delta-1} \left[1 - \left(\frac{\beta}{\alpha}\right)^{\delta-1} B\!\left(\delta + \frac{1-\delta}{\beta},\ \delta - \frac{1-\delta}{\beta}\right)\right]. \qquad (24)$$

Thus, the formula for the Tsallis entropy of the LLD is provided in Equation (24).

3. Estimation of Different Entropies

Using the ML method and PT2C data, we obtain estimators for the various entropy metrics provided in the preceding section. To construct the entropy estimators, we start with the ML estimators of the population parameters. The invariance property of ML estimators may then be used to determine the ML estimators of the recommended entropy measures. Furthermore, we obtain the approximate confidence intervals of the suggested entropy measures.

Assume that $x_{1:m:n} < x_{2:m:n} < \cdots < x_{m:m:n}$ is a PT2C sample of size $m$ from a sample of size $n$ drawn from CDF (1) and PDF (2) with censoring scheme $(R_1, \ldots, R_m)$. The likelihood ($L$) function based on the PT2C sample is given by

$$L(\alpha, \beta) = C \prod_{i=1}^{m} f(x_{i:m:n}) \left[1 - F(x_{i:m:n})\right]^{R_i}, \qquad (25)$$

where $C = n(n - R_1 - 1)(n - R_1 - R_2 - 2) \cdots (n - R_1 - \cdots - R_{m-1} - m + 1)$. Thus, the constant $C$ is the number of ways in which the $m$ PT2C order statistics may occur if the observed failure times are $x_{1:m:n}, \ldots, x_{m:m:n}$. The log-likelihood function of (25), say $\ell$, is then provided via

$$\ell = \log C + m \log\frac{\beta}{\alpha} + (\beta - 1) \sum_{i=1}^{m} \log\frac{x_i}{\alpha} - \sum_{i=1}^{m} (2 + R_i) \log(1 + z_i), \qquad (26)$$

where we write $x_i = x_{i:m:n}$ and $z_i = (x_i/\alpha)^{\beta}$ for simplified forms. From (26), we derive the likelihood equations for $\alpha$ and $\beta$ as

$$\frac{\partial \ell}{\partial \alpha} = -\frac{m\beta}{\alpha} + \frac{\beta}{\alpha} \sum_{i=1}^{m} (2 + R_i) \frac{z_i}{1 + z_i}, \qquad (27)$$

$$\frac{\partial \ell}{\partial \beta} = \frac{m}{\beta} + \sum_{i=1}^{m} \log\frac{x_i}{\alpha} - \sum_{i=1}^{m} (2 + R_i) \frac{z_i \log(x_i/\alpha)}{1 + z_i}. \qquad (28)$$

The ML estimators of $\alpha$ and $\beta$ can be obtained numerically by solving the non-linear Equations (27) and (28) after setting them equal to zero. Once the ML estimators of $\alpha$ and $\beta$, say $\hat{\alpha}$ and $\hat{\beta}$, are computed, we can obtain the ML estimators of the entropy measures provided in (12), (14), (16), (20), (21), and (24). Consequently, the ML estimator of the Ré entropy, denoted by $\hat{R}_\delta$, is obtained by inserting $\hat{\alpha}$ and $\hat{\beta}$ into (12) as follows:

$$\hat{R}_\delta = \log\frac{\hat{\alpha}}{\hat{\beta}} + \frac{1}{1-\delta} \log B\!\left(\delta + \frac{1-\delta}{\hat{\beta}},\ \delta - \frac{1-\delta}{\hat{\beta}}\right).$$

The other entropy estimators may be obtained in a similar fashion.
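In practice, the likelihood equations are solved numerically; equivalently, the log-likelihood (up to the constant $C$) can be maximized directly. A sketch in Python (the paper's computations use R; the sample, removal scheme, and starting values below are hypothetical, for illustration only):

```python
import numpy as np
from scipy.optimize import minimize

def neg_loglik(theta, x, R):
    """Negative PT2C log-likelihood for the LLD, dropping the constant C."""
    alpha, beta = theta
    if alpha <= 0 or beta <= 0:
        return np.inf
    z = (x / alpha) ** beta
    # sum of log f(x_i) plus R_i * log survival, with log S(x) = -log(1 + z)
    logf = (np.log(beta / alpha) + (beta - 1.0) * np.log(x / alpha)
            - 2.0 * np.log1p(z))
    return -(logf.sum() - (R * np.log1p(z)).sum())

# hypothetical PT2C sample and removal scheme (n = 20, m = 10)
x = np.array([0.31, 0.75, 1.10, 1.62, 2.04, 2.55, 3.10, 4.21, 5.90, 8.35])
R = np.array([2, 0, 1, 0, 3, 0, 0, 2, 0, 2])

res = minimize(neg_loglik, x0=[np.median(x), 1.0], args=(x, R),
               method="Nelder-Mead")
alpha_hat, beta_hat = res.x
```

By the invariance property, `alpha_hat` and `beta_hat` can then be plugged into (12), (14), (16), and (24) to obtain the entropy estimates.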

The preceding theoretical results may be specialized to two cases. First, the ML estimators of the parameters and entropy measures under conventional T2C are produced when $R_1 = \cdots = R_{m-1} = 0$ and $R_m = n - m$. Second, for $m = n$ and $R_1 = \cdots = R_m = 0$, we get the ML estimators of the population parameters as well as the proposed entropy measures under complete sampling.

The asymptotic normality of the ML estimators may be used to determine the asymptotic $100(1-v)\%$ confidence intervals (CIn) for the parameters as

$$\hat{\alpha} \pm z_{v/2} \sqrt{\widehat{\operatorname{var}}(\hat{\alpha})}, \qquad \hat{\beta} \pm z_{v/2} \sqrt{\widehat{\operatorname{var}}(\hat{\beta})},$$

where the estimated variances are the diagonal elements of the inverse of the observed Fisher information matrix.

Also, the asymptotic $100(1-v)\%$ CIns for the entropy measures are given by

$$\hat{H} \pm z_{v/2} \sqrt{\widehat{\operatorname{var}}(\hat{H})},$$

where $z_{v/2}$ is the upper $v/2$ percentile of the standard normal distribution and $(1-v)$ is the confidence coefficient.
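One way to compute such intervals is to approximate the observed information matrix by central finite differences at the MLE and invert it. A sketch (Python rather than the paper's R; the data, scheme, and starting values are hypothetical):

```python
import numpy as np
from scipy import stats
from scipy.optimize import minimize

def neg_loglik(theta, x, R):
    """Negative PT2C log-likelihood for the LLD (constant C dropped)."""
    alpha, beta = theta
    if alpha <= 0 or beta <= 0:
        return np.inf
    z = (x / alpha) ** beta
    return -((np.log(beta / alpha) + (beta - 1.0) * np.log(x / alpha)).sum()
             - ((2.0 + R) * np.log1p(z)).sum())

def hessian(f, t, h=1e-4):
    """Central-difference Hessian; at the minimum of -log L this is the
    observed information matrix."""
    k = len(t)
    H = np.zeros((k, k))
    E = np.eye(k) * h
    for i in range(k):
        for j in range(k):
            H[i, j] = (f(t + E[i] + E[j]) - f(t + E[i] - E[j])
                       - f(t - E[i] + E[j]) + f(t - E[i] - E[j])) / (4.0 * h * h)
    return H

# hypothetical PT2C sample and removal scheme, for illustration only
x = np.array([0.31, 0.75, 1.10, 1.62, 2.04, 2.55, 3.10, 4.21, 5.90, 8.35])
R = np.array([2, 0, 1, 0, 3, 0, 0, 2, 0, 2])

theta_hat = minimize(neg_loglik, x0=[np.median(x), 1.0], args=(x, R),
                     method="Nelder-Mead").x
cov = np.linalg.inv(hessian(lambda t: neg_loglik(t, x, R), theta_hat))
se = np.sqrt(np.diag(cov))
z95 = stats.norm.ppf(0.975)                  # v = 0.05
ci = np.column_stack([theta_hat - z95 * se, theta_hat + z95 * se])
```

Intervals for the entropy measures themselves follow in the same way once an estimate of $\widehat{\operatorname{var}}(\hat{H})$ is obtained, e.g., by the delta method applied to the entropy expressions.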

4. Simulation and Real Data Outcomes

The aim of this section is to analyze the outcomes of the numerous entropy estimators stated before. To evaluate the behavior of the suggested entropy measures and to analyze the statistical performance of the estimators under PT2C, a Monte Carlo study is used. A real data set is also examined for demonstration purposes. All computations in this study are carried out with the statistical programming language R.

4.1. Simulation Study

The effectiveness of the recommended approaches to entropy estimation is compared via a simulated exercise, in which the six entropy estimates are calculated through a Monte Carlo process. For the ML estimates (MLEs), 1000 data sets are generated from the LLD under the following assumptions:

(1) Presume the parameter values of the LLD in the following situations.
(2) Assume two values for the constant $\delta$.
(3) The sample size is n = 60 and the numbers of observed failures are m = 10, 20, 30.
(4) The removed items $R_j$ assume the following:

Scheme I:

Scheme II:

Scheme III:

Table 1 shows the removal patterns for each suggested scheme. Note that the first scheme (Scheme I) is a particular instance of PT2C, namely the most common T2C. Another particular case, which leads to a complete sample, is also considered.

According to the generated data, the MLEs are computed under the above assumptions using PT2C. When computing the MLEs, note that the initial values are taken to be the true parameter values. These MLEs are then plugged in to calculate the desired entropy estimates.
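The steps above can be sketched in miniature as follows (Python rather than the paper's R, a few hundred replications instead of 1000, and, for brevity, the complete-sample special case with assumed true values $\alpha = 2$, $\beta = 1.5$, $\delta = 1.5$; only the Ré entropy is tracked):

```python
import numpy as np
from scipy import special
from scipy.optimize import minimize

rng = np.random.default_rng(7)
alpha, beta, delta = 2.0, 1.5, 1.5     # assumed true values
n, reps = 30, 200                      # small run for illustration

def renyi(a, b):
    """Re entropy of LLD(a, b) via the beta-function expression."""
    p, q = delta + (1 - delta) / b, delta - (1 - delta) / b
    return np.log(a / b) + np.log(special.beta(p, q)) / (1 - delta)

def neg_loglik(theta, x):
    a, b = theta
    if a <= 0 or b <= 0:
        return np.inf
    z = (x / a) ** b
    return -np.sum(np.log(b / a) + (b - 1) * np.log(x / a) - 2 * np.log1p(z))

true = renyi(alpha, beta)
est = []
for _ in range(reps):
    u = rng.uniform(size=n)
    x = alpha * (u / (1 - u)) ** (1 / beta)          # complete LLD sample
    a_hat, b_hat = minimize(neg_loglik, x0=[alpha, beta], args=(x,),
                            method="Nelder-Mead").x  # start at true values
    est.append(renyi(a_hat, b_hat))
est = np.array(est)
rbias = abs(est.mean() - true) / abs(true)           # relative bias
mse = np.mean((est - true) ** 2)                     # mean squared error
```

The same loop, with the PT2C generator in place of the complete sample and all six entropy expressions tracked, reproduces the structure of Tables 2-4.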

The average entropy estimates, relative biases (Rbias), associated mean squared errors (MSEs), corresponding average interval lengths (AILs), and coverage probabilities (CPs) for all six entropy methods are reported in Tables 2 and 3 for the two values of the constant $\delta$. Also, the results of the complete-sample case are reported in Table 4 for both values of $\delta$.

From the tabulated values, it can be noticed that:

(i) Higher values of $m$ lead to a decrease in Rbias, MSE, and AIL for all the removal schemes and all the entropy methods.
(ii) All CPs are greater than 93% for all the removal schemes and all the entropy methods.
(iii) An increase in the constant $\delta$ leads to a decrease in the estimates of all the entropy methods.

4.2. Real Data Application

A real data set is analyzed for illustrative purposes as well as to assess the statistical performances of the MLEs for different entropy estimates in the case of the LLD under different PT2C schemes.

The uncensored data set below corresponds to the remission periods (in months) of a random sample of 128 bladder cancer patients reported in Lee and Wang [26]. The following are the bladder cancer remission times:

0.08, 2.09, 3.48, 4.87, 6.94, 8.66, 13.11, 23.63, 0.20, 2.23, 3.52, 4.98, 6.97, 9.02, 13.29, 0.40, 2.26, 3.57, 5.06, 7.09, 9.22, 13.80, 25.74, 0.50, 2.46, 3.64, 5.09, 7.26, 9.47, 14.24, 25.82, 0.51, 2.54, 3.70, 5.17, 7.28, 9.74, 14.76, 26.31, 0.81, 2.62, 3.82, 5.32, 7.32, 10.06, 14.77, 32.15, 2.64, 3.88, 5.32, 7.39, 10.34, 14.83, 34.26, 0.90, 2.69, 4.18, 5.34, 7.59, 10.66, 15.96, 36.66, 1.05, 2.69, 4.23, 5.41, 7.62, 10.75, 16.62, 43.01, 1.19, 2.75, 4.26, 5.41, 7.63, 17.12, 46.12, 1.26, 2.83, 4.33, 7.66, 11.25, 17.14, 79.05, 1.35, 2.87, 5.62, 7.87, 11.64, 17.36, 1.40, 3.02, 4.34, 5.71, 7.93, 11.79, 18.10, 1.46, 4.40, 5.85, 8.26, 11.98, 19.13, 1.76, 3.25, 4.50, 6.25, 8.37, 12.02, 2.02, 3.31, 4.51, 6.54, 8.53, 12.03, 20.28, 2.02, 3.36, 6.76, 12.07, 21.73, 2.07, 3.36, 6.93, 8.65, 12.63, 22.69, 5.49.

We first check whether the LLD is suitable for analyzing this data set. We report the MLEs of the parameters and the value of the Kolmogorov–Smirnov (K–S) test statistic to judge the goodness of fit. The calculated K–S distance between the empirical and fitted LLD distributions is 0.0440 and its $p$-value is 0.9667, which indicates that this distribution can be considered an adequate model for the given data set.
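This goodness-of-fit check can be reproduced as follows (a sketch using scipy's `fisk` parameterization of the log-logistic with location fixed at zero; the paper's own computations use R, so the fitted values may differ slightly from those reported above):

```python
import numpy as np
from scipy import stats

# Remission times (months) of 128 bladder cancer patients, Lee and Wang [26]
t = np.array([
    0.08, 2.09, 3.48, 4.87, 6.94, 8.66, 13.11, 23.63,
    0.20, 2.23, 3.52, 4.98, 6.97, 9.02, 13.29, 0.40,
    2.26, 3.57, 5.06, 7.09, 9.22, 13.80, 25.74, 0.50,
    2.46, 3.64, 5.09, 7.26, 9.47, 14.24, 25.82, 0.51,
    2.54, 3.70, 5.17, 7.28, 9.74, 14.76, 26.31, 0.81,
    2.62, 3.82, 5.32, 7.32, 10.06, 14.77, 32.15, 2.64,
    3.88, 5.32, 7.39, 10.34, 14.83, 34.26, 0.90, 2.69,
    4.18, 5.34, 7.59, 10.66, 15.96, 36.66, 1.05, 2.69,
    4.23, 5.41, 7.62, 10.75, 16.62, 43.01, 1.19, 2.75,
    4.26, 5.41, 7.63, 17.12, 46.12, 1.26, 2.83, 4.33,
    7.66, 11.25, 17.14, 79.05, 1.35, 2.87, 5.62, 7.87,
    11.64, 17.36, 1.40, 3.02, 4.34, 5.71, 7.93, 11.79,
    18.10, 1.46, 4.40, 5.85, 8.26, 11.98, 19.13, 1.76,
    3.25, 4.50, 6.25, 8.37, 12.02, 2.02, 3.31, 4.51,
    6.54, 8.53, 12.03, 20.28, 2.02, 3.36, 6.76, 12.07,
    21.73, 2.07, 3.36, 6.93, 8.65, 12.63, 22.69, 5.49])

# Fit the log-logistic (scipy's "fisk") by ML with location fixed at zero,
# then run the one-sample K-S test against the fitted distribution
c_hat, loc, scale_hat = stats.fisk.fit(t, floc=0)
ks = stats.kstest(t, "fisk", args=(c_hat, loc, scale_hat))
print(len(t), round(ks.statistic, 4), round(ks.pvalue, 4))
```

Here `c_hat` plays the role of the shape parameter and `scale_hat` the scale parameter of the LLD.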

From the original data, one can generate, for example, three PT2C samples, with the number of stages and the removed items assumed as follows:

Scheme I:

Scheme II:

Scheme III:

Also, we consider the complete-sample case. Two different values of the constant $\delta$ are proposed: 0.5 and 1.5. In Table 5, the MLEs of the parameters have been calculated and then plugged into the different entropy methods under the proposed PT2C schemes for the given real data set.

5. Summary and Conclusion

In this paper, we consider the estimation problem of certain entropy measures for the log-logistic distribution under the PT2C scheme. The six entropy measures considered are the Rényi entropy, the Havrda and Charvat entropy, the Arimoto entropy, the two A-entropies, and the Tsallis entropy. Expressions for the recommended entropy measures are derived. The point estimates and two-sided approximate confidence intervals for the recommended entropy measures are obtained using the maximum likelihood procedure. To describe and compare the behavior of the estimators, a numerical evaluation is performed in terms of relative biases, associated mean squared errors, average interval lengths, and coverage probabilities under different censoring schemes as well as different sample sizes. The suggested entropy estimates approach the true values as the censoring level decreases, and the accuracy of the measures increases with sample size. Finally, a real dataset was analyzed for demonstration purposes.

Data Availability

The numerical dataset used to conduct the study reported in this publication is available from the corresponding author upon request.

Conflicts of Interest

The authors state that they have no conflicts of interest to disclose in relation to this work.

Acknowledgments

This work was supported by Researchers Supporting Project number RSP2022R464, King Saud University, Riyadh, Saudi Arabia.