#### Abstract

Developing an efficient model for analyzing right-skewed positive observations has a long history, and many authors have attempt in this direction. This is because the common analytic modeling procedures such as linear regression are often inappropriate for such data and leads to inadequate results. In this article, we proposed a new model for regression analysis of the right-skewed data by assuming the weighted inverse Gaussian, as a great flexible distribution, for response observations. In the proposed model, the complementary reciprocal of the location parameter of response variable is considered to be a linear function of the explanatory variables. We developed a fully Bayesian framework to infer about the model parameters based on a general noninformative prior structure and employed a Gibbs sampler to derive the posterior inferences by using the Markov chain Monte Carlo methods. A comparative simulation study is worked out to assess and compare the proposed model with other usual competitor models, and it is observed that efficiency is quite satisfactory. A real seismological data set is also analyzed to explain the applicability of the proposed Bayesian model and to access its performance. The results indicate to the more accuracy of proposed regression model in estimation of model parameters and prediction of future observations in comparison to its usual competitors in literature. Particularly, the relative prediction efficiency of the proposed regression model to the inverse Gaussian and log-normal regression models has been obtained to be 1.16 and 64, respectively, for the real-world example discussed in this paper.

#### 1. Introduction

Modeling right-skewed positive observations has been considered repeatedly by many authors during the last decades; hence, there is a vast literature on different probability distributions utilized in order to take into account the uncertainty behind this type of data. Many distributions have been candidated and studied for this purpose from the exponential, gamma and Weibull to log-normal (LN), inverse Gaussian (IG), etc. The IG distribution, as a member of exponential distributions family, benefits from considerable flexibility along with good theoretical and computational properties. A look at the papers was considered that the IG distribution indicates to the desirability of this distribution in statistical literature (see, for example, [1–5]). The IG distribution has been recommended for analyzing right-skewed positive data as a serious competitor for some of the most well-known distributions such as log-normal and gamma. It is employed in various applications such as reliability (e.g., [2, 6]), social sciences (e.g., [7, 8]), marketing (e.g., [9]), engineering (e.g., [10]), industrial management (e.g., [11]), cardiology [12], and civil engineering (e.g., [13]). In addition, some extensions to the IG distribution have been proposed in order to provide more flexible and interpretable distributions. In this direction, [14] proposed a three-parameter extension of the IG distribution, named weighted inverse Gaussian (WIG) distribution with a great ability for modeling positive right-skewed data specially in the field of reliability and lifetime analysis. As pointed out by [15], WIG is a versatile lifetime distribution and a serious competitor for other lifetime distributions. The most important inferential aspects of WIG distribution have been considered by Gupta and Kundu. They proposed an expectation maximization (EM) algorithm to estimate the parameters of distribution.

On the other hand, some authors have been discussed the regression analysis by assuming the IG as distribution of response observations. For example, [16] considered a simple regression model with a single explanatory variable and zero intercept. [17] suggested a log-normal approximation in exponential regression. [6, 18] proposed a model which offers physical interpretation in survival analysis. This model further has been used by [6] in analysis of variance and survey sampling, respectively. Up to our best knowledge, although the IG distribution has been considered vastly in regression analysis, there is no published research considering the WIG distribution for modeling purposes. In this paper, we developed a new model for regression analysis of positive right-skewed response observations under the WIG distribution. A fully Bayesian framework has been utilized to estimate the model parameters and predict the future observations. The proposed methodology employs a set of noninformative vague prior distributions from a well-known family of flexible distributions and proceeds to the posterior inferences by using a Gibbs sampler. The main advantages of the proposed model are as follows: (i) due to using the WIG as distribution of response variable, it has great flexibility for taking into account the variability of positively skewed response observations. This leads to more accurate estimations and predictions for the proposed model in comparison to its competitors in the literature. (ii) Despite serious difficulties related to the frequentist approach for estimating the parameters of the WIG distribution (see [15] for more details), the proposed model provides a straightforward Bayesian approach with a tractable posterior distribution and interpretable posterior inferences about the parameters. (iii) Due to the Bayesian nature of the proposed model, it is possible to incorporate personal prior beliefs about uncertainty behind the parameters in order to construct a more reliable model.

The paper organized as follows: Section 2 provides some background about the WIG distribution containing its most important distributional and inferential properties. In Section 3, a WIG regression model is proposed. The essential theory for the Bayesian analysis of the proposed model is developed in Section 4. In Sections 5.1 and 5.2, we provide empirical evidence for assessing the proposed methodology via analyzing simulated and real-world data, respectively. The paper is closed by Conclusion.

#### 2. A Look at Weighted Inverse Gaussian Distribution

A random variable, , is said to have a WIG distribution with location parameter , scale parameter , and shape parameter , , if its density function is given by

where and show the density function of IG distribution. The shape of WIG distribution is always unimodal, and the mean and variance of distribution are given, respectively, by

As it can be seen from Equation (4), despite the Gaussian distribution, the mean and variance of WIG distribution are correlated. The cumulative distribution function (CDF) of WIG distribution can be written as where denoted the CDF of standard normal distribution and and The key point in derivation of these results is that the CDF of IG distribution can be expressed as a linear combination of standard normal CDF of the form where and are solutions of Euler differential equation which are of the form , for constants and . If we consider that a pair of linearity-independent solutions and correspond to IG distribution, then a function of the form with chosen constants and is CDF of the WIG distribution. See [14] for more details.

The WIG family becomes the IG family as and the reciprocal inverse Gaussian (RIG) family as , where the RIG, like the IG, is a special case of the generalized inverse Gaussian (GIG) distribution (see, for example, [19] for more details about GIG distribution).

The problem of estimating the WIG distribution parameters has been discussed in the literature. [14] showed that the maximum likelihood (ML) estimators of the WIG distribution parameters can be obtained as responses of a nonlinear challenging three-dimensional optimization problem. [15] formulated the problem in a missing data structure and developed an EM algorithm for computing the ML estimates. They used asymptotic optimality properties of the ML estimators to construct asymptotic confidence intervals for the unknown parameters. We refer the readers to [14, 15, 20, 21] and other above-mentioned references for more details about the WIG distribution.

Before concluding this short review section, it is useful to clarify two points. First, despite the IG distribution, the WIG distribution does not belong to the exponential family of distributions. Of course in some special cases, for example, for known , it has a near relation to the exponential family of distributions.

Second, the WIG distribution is a reparameterized version of the mixed inverse Gaussian (MIG) distribution. The MIG distribution defined as a two-component finite mixture of IG and length-biased inverse Gaussian (LBIG) distribution as where is the mixing coefficient and denotes the pdf of LBIG distribution. From this point of view, the WIG density function obtained by adopting a reparameterization of the form in the MIG density function is given in Equation (6).

#### 3. The WIG Regression Model

The traditional theory of regression modeling is based on normality assumption for the response observations. While in many applications, these assumptions may not be valid due to positive right-skewed structure of data. An elementary solution, in this situation, is to use the log-transformed data that provide the possibility of using standard inferences based on the normal distribution and leads to a log-normal regression model. But applying any transformation to data increases the complexity of the model building procedure and makes meanings of the parameters less clear. In these situations, the WIG family of distributions, due to its flexibility and desired distributional properties, can be a suitable candidate for data analysis.

Consider a set of paired observations given by , where for a given , denotes the vector of explanatory variables and indicates the corresponding random response variable. Assuming as distribution of the response observations, the WIG regression model is given by where denotes the vector of regression coefficients. The likelihood function associated with the model (8) is given by

Letting and , it is easy to show that the following identities hold: where is a -dimensional vector with all elements equal to one. Therefore, the summation which appeared in the exponential function in the right-hand side of Equation (9) can be written as where denotes a quadratic form in terms of parameter . Hence, the likelihood and log-likelihood functions of the model are given by respectively. Equating the partial derivations of the log-likelihood function with respect to the model parameters to zero leads to the following system of equations: where is a quadratic form in terms of parameter introduced previously. As it can be seen, we have two nonlinear equations in terms of parameters and and a linear equation in terms of parameter . Therefore, the ML estimator of parameter is obtained in the closed form to be , and there is no closed form expression for the ML estimators of parameters and . Hence, a usual iterative optimization procedure like the Newton-Raphson could be employed to calculate the ML estimates of the model parameters. It should be pointed out that it is possible to use the other frequentist point estimation approaches such as the method of moments (MM) to estimate the model parameters. See [15] for more details about difficulties related to the estimation of parameters in WIG distribution by using the frequentist approaches. In what follows, we developed a Bayesian approach for this purpose.

Before going further, we tend to give some explanation about the link function of the proposed model. As it can be seen, here we have adopted an inverse form link function to specify the relation of the explanatory variables and the location parameter of the response observations. Although, as it is mentioned previously, there is no background on using WIG distribution in modeling context, the link function given in (8) has been previously used by some authors for modeling purposes using the IG distribution. For example, [6, 22, 23] used the inverse form link function for regression analysis and analysis of covariance (ANCOVA) under the IG distribution, respectively. Of course, there are other possible choices for link function of this regression model. For example, [24] used a linear link function as , and [25] utilized an exponential form link function, , in modeling of censored observations by IG distribution in order to guarantee the positivity of response variable mean. On the other hand since IG, despite WIG, is a member of exponential family with canonical parameter , one can use a link function of the form for modeling purposes based on the IG distribution.

Regardless of differences between the IG and WIG distributions, each of the above-mentioned link functions can be considered, probably with some modifications, for modeling process with the WIG distribution, provided that the estimated value of vector of regression coefficients should not result in a negative value for the mean of response observations. However, from the theoretical point of view, there are two reasons supporting this choice for link function. First, the canonical parameter in the inverse Gaussian family of distribution appears in the form of a function of inverse mean. The second reason is related to the mathematical computations. This choice enables us to provide posterior distribution in a well-formed mathematical expression in terms of a quadratic form of regression coefficients and gives us closed form expressions for some of posterior full-conditional distributions.

#### 4. Bayesian Analysis of the Model

In order to provide a Bayesian framework for analysis of the proposed model, it is necessary to set suitable prior distributions for the model parameters. These prior distributions should be able to explain the prior beliefs of data analyst about the model parameters.

##### 4.1. Prior Setting

In regression analysis, the main interested parameter that should be estimated is the vector of regression coefficient, . Theoretically, any multivariate continuous distribution on could be employed as a prior distribution for this parameter. For example, in a vague or a noninformative Bayesian analysis, a flat multivariate normal or a multivariate uniform distribution with large variance is a usual choice for this purpose. But, in the proposed model, considering such prior distributions for may lead to the nonpositive values for , , which is inconsistent with the positivity constrain for the location parameter of WIG distribution. Taking these considerations into account, we consider the following joint prior distribution of the model parameters as where and zero indices indicate the hyperparameters of the model that should be determined based on prior beliefs. As it can be seen, in this prior setting, all of the prior knowledge about the model parameters is expressed by using three gamma distributions. The family of gamma distributions seems to be sufficiently flexible that one can expect that a member of this family could describe the prior opinion of the data analyst about each of the interested parameters. In addition, it is always a potential choice to describe various positive populations from exponential to normal shape (see, for example, [26]). Under the above prior distributions setting, the prior distribution of parameter appears as distribution of difference of two nonindependent gamma random variables. While deriving the distribution of sum of two gamma random variables is an elementary work, this is not a trivial work for the case of difference of random variables especially when they are nonindependent. Generally, there is a vast literature on the sum of independent and nonindependent gamma random variables, and many authors [27–29] discussed this problem under various assumptions such as independent and identically or nonidentically distributed random variables. However, the difference of the nonindependent gamma random variables has been less considered in the literature. Holm and Alouini [30] showed that the general formulas for the probability density function of the difference of two not necessarily identically distributed gamma random variables can be expressed in the form of the McKay Bessel function distribution of type II [31]. The random variable follows the type II MacKay distribution with parameters , , and , denoted by McKayII (, , and ), when the density function of is given by where is the modified Bessel function of the second kind and of order . He provided the moments of this distribution in terms of the Gauss hypergeometric function. In addition, there are relatively simple expressions for the moments generating function, cumulants, and the moments of the McKay distributions. It should be noted that several famous distributions such as normal; distributions of Pearson system, e.g., type III Pearson (see [32]); and distribution of first product moment coefficient in a sample from normal population are special cases of McKay distribution of type II. The support of the McKay distribution is the line of real numbers, and its density function has a skew-symmetric shape. This implies that the prior distributions given in Equation (14) are in consistent with our knowledge about the parameters of the model. Specifically, the prior distribution of parameter holds the necessary constrain .

##### 4.2. Posterior Inference

Considering the likelihood function (9) and the prior distributions (14), the joint posterior distribution of the model parameters in terms of () is obtained to be

Due to the equality , the posterior distribution (16) can be written in terms of interested parameters as where is a quadratic form in terms of the vector of regression coefficients, , given in Equation (11). Clearly, the complexity of joint posterior distribution (17) precludes deriving analytical posterior inferences. In what follows, we employed the Markov chain Monte Carlo (MCMC) methods to sample form posterior distribution in order to deriving numerical estimates for interested posterior quantities.

##### 4.3. Gibbs Sampler

To construct a Gibbs sampler, it is necessary to determine the full conditional posterior distributions. The kernel of full conditional distribution for a typical parameter is easily obtained by vanishing all unrelated quantities to this parameter from the joint posterior distributions. According to (17), the full conditional posterior distribution of parameter in proportional form is given by which cannot be written down in a closed form density function. Also, the full conditional distribution of parameter is Therefore, the full conditional distribution for this parameter, in closed form, is obtained to be

For parameter , one can write where denotes the density function of a gamma distribution with mean . As it can be seen, the full conditional posterior distribution of parameter is proportional to a linear combination of two gamma density functions with equal shape parameters with positive coefficients. It is easy to show that the full conditional posterior distribution of is a two-component gamma mixture distribution given by where with

Knowing the full conditional distributions of the model parameters, it is possible to sample from the joint posterior distribution of the model parameter. Since there is no closed form expression for the full conditional distribution of parameter , a nested Metropolis-Hastings algorithm should be used for sampling from this parameter within the iterations of Gibbs algorithm.

Also, considering as old observations, the posterior predictive distribution for a new observation which corresponds to vector of explanatory variables is given by where and are the density function of WIG distribution and joint posterior distribution of the model parameters given in Equations (1) and (17), respectively. The MCMC estimate of the posterior predictive distribution is obtained to be where denotes a MCMC sample from the joint posterior distribution of parameters.

#### 5. Data Illustration

In this section, we preformed a simulation study and analyzed a real data set to access the performance of the proposed model and to explain its applicability. To provide a situation for an impartial judgment, we considered the IG and log-normal regression models as two famous competitors for the proposed model in computations.
(i)*Bayesian IG Regression Model.* The IG regression model has been studied repeatedly in the literature and can be written as , with the same link function as the proposed WIG regression model, i.e., . For the Bayesian analysis of this model, we used a vague structure for the prior distributions by considering and , where hyperparameters were set to be ,, and with . Under these prior settings, it can be shown that the posterior distributions of the parameters and are members of gamma and noncentral -student family of distributions, and their corresponding Bayes estimators are given by and , respectively. We refer the readers to [6, 22] for more details about the modeling by IG distribution(ii)*Bayesian Log-Normal Regression Model.* As it can be found in any Bayesian textbook (see, for example, [33]), to construct a Bayesian regression model for a log-normally distributed response variable, , it is suffice to construct a normal regression model for the logarithm of the response observations as , where with . Using a semiconjugate structure for the prior distributions of the form, , it is straightforward to contract a Gibbs sampler in order to approximate the joint posterior distribution and interested posterior quantities by employing the following full conditional distributions:

It should be noted that the hyperparameters , , and have been set, in our studies, in such manner that produce a vague prior setting.

##### 5.1. Simulation Study

We considered a WIG regression model with two explanatory variables given by with , , and . The values of explanatory variables simulated from independent standard normal distributions and the values of regression coefficients are set to be . Then, the response observations are simulated from model (27) for different sample sizes 25, 50, 75, and 100. Given simulated data, we used the proposed Gibbs sampler to create a MCMC sample in order to estimate the posterior interested parameters after 5000 burn-in period. The Geweke [34] diagnostic was used to test convergence of the algorithm. The root mean square error ( and absolute value of the relative bias of Bayes estimators of the regression coefficients for the WIG model along with corresponding values for IG and log-normal model were calculated, where is the real value of the parameter, denotes the value of the parameter in -th repetition, and is the estimated posterior mean. In all computations, the number of repetitions, , is fixed to be 4000 in order to taking into account the uncertainty of the simulated data. The results are presented in Table 1. In Table 2, a general comparison of strengths and limitations of different discussed models has been presented.

As it can be seen, the absolute value of relative biases and the root mean square errors of the proposed WIG regression model are less than the corresponding values for both IG and log-normal regression models. This implies that the Bayes estimator of regression coefficients for WIG model is more efficient than the counterpart estimators in IG and log-normal models.

##### 5.2. Application to the Seismological Data

To investigate the effect of soil shear wave velocity () in thirty meters at top of the site and the Joyner-Boore distance () as explanatory variables on the peak ground acceleration (PGA) as response variable, 25 different sites are selected and interested variables are measured. The data, presented in Table 3, are a subset of the Imperial Valley earthquake in 1979. The histogram and box plot of response observations, given in Figure 1, show the right-skewed structure of data. This indicates that the WIG and some of other right-skewed distributions such as IG and log-normal could be suitable candidate distributions for fitting to response observations.

**(a)**

**(b)**

The result of the Kolmogorov-Smirnov goodness of fit test for the WIG distribution along with corresponding values for the IG distributions is given in Table 4. We also provided the corresponding values for the log-normal distribution as a traditional candidate distribution for modeling positive right-skewed data. As it can be seen, the null hypotheses of the WIG and IG distributions for response distributions both are not rejected at 0.05 level of significance. The Bayesian estimates of the model parameters along with their corresponding HPD credible intervals for different models are presented in Table 5. It should be pointed out that the estimates also are obtained as the posterior mean of the MCMC samples after a 5000 burn-in period. The value of hyperparameters in (14) is set to be in order to provide noninformative vague prior distributions with large variances for parameters of the model. The Geweke diagnostic test [34] was used to assess the convergence of the sampler.

As it can be seen, from the HPD credible intervals, for the WIG and LN models, both of the regression coefficients and are different from zero at a significance level of , and this is the same for other parameters of these models. But, for the model IG, the coefficient is not significantly different from zero at the same level. It should be noticed that the estimated values of the regression coefficients for LN model are not comparable to corresponding values in the WIG and IG models due to the differences between the functional forms of the link functions of these models. In such situations, one can use some information-based criteria to assess the fit and complexity of models or provide a criterion for evaluating and comparing the predictive power of them. In this direction, the Akaike information criterion (AIC; ), Bayesian information criterion (BIC; ), and Hannan-Quinn information criterion (HQIC; ) are employed, where in definition of these criteria, , , , and denote the log-likelihood function of the model, vector of estimated parameters, sample size, and the number of model parameters, respectively. The lesser values of these criteria, presented in Table 6, indicate better parsimonious fits. We also used the cross-validation mean square error of prediction criterion, to calculate the predictive performance of different models. Where in Equation (28), denotes the predicated value of based on the data , i.e., all observations expect -th observation.

The values of AIC, BIC, and HQIC show that the data provide more support for the WIG regression model than other models considering the parsimonious principle. Also, the values of indicate to the higher prediction accuracy, in average, for the WIG model.

#### 6. Conclusions

Modeling right-skewed positive data has been at the core of interest in lifetime analysis and many other fields. This paper developed a regression model by considering the weighted inverse Gaussian distribution for response observations as a versatile candidate for analysis of the right-skewed positive data. A Bayesian framework employed to analyze the model. The proposed method uses a set of noninformative flexible priors from the gamma family of distributions and provides a tractable joint posterior distribution for parameters. It is computationally straightforward and proceeds by constructing a Gibbs sampler to derive the posterior inferences. The comparative experimental studies, simulation, and real-world example showed that the proposed weighted inverse Gaussian regression model has the ability to dominate inverse Gaussian and log-normal as two most famous competitor models in the literature. Particularly, considering that the relative prediction efficiency (REP) of two given predictors and in terms of cross-validation mean square error of prediction criterion is defined as , it can be seen that the relative prediction efficiency of the proposed WIG regression model to the IG and LN regression models is given by 1.16 and 64, respectively, for the real-world example discussed in this paper.

#### Data Availability

The data used to support the findings of this study have been presented in the paper.

#### Conflicts of Interest

The authors declare that there are no conflicts of interest in the publication of this article.