Computational and Mathematical Methods in Medicine

Volume 2016 (2016), Article ID 7878325, 7 pages

http://dx.doi.org/10.1155/2016/7878325

## Analysis of Blood Transfusion Data Using Bivariate Zero-Inflated Poisson Model: A Bayesian Approach

^{1}Student’s Research Committee, Shahrekord University of Medical Sciences, Shahrekord, Iran

^{2}Social Health Determinants Research Center, Shahrekord University of Medical Sciences, Shahrekord, Iran

^{3}Epidemiology and Biostatistics Department, Shahrekord University of Medical Sciences, Shahrekord, Iran

Received 14 April 2016; Accepted 16 August 2016

Academic Editor: Dong Song

Copyright © 2016 Tayeb Mohammadi et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Recognizing the factors that affect the number of blood donations and the number of blood deferrals has a major impact on blood transfusion. The two variables are positively correlated: as the number of returns for donation increases, so does the number of deferrals. In addition, because many donors never return to donate, both variables show an excess of zeros. In this study, to account for the correlation and for the excess zeros, the bivariate zero-inflated Poisson regression model was used to jointly model the number of blood donations and the number of blood deferrals. The data were analyzed using a Bayesian approach with noninformative priors, both with and without covariates. The parameters of the model, that is, the correlation, the zero-inflation parameter, and the regression coefficients, were estimated through MCMC simulation. Finally, the double-Poisson model, the bivariate Poisson model, and the bivariate zero-inflated Poisson model were fitted to the data and compared using the deviance information criterion (DIC). The results showed that the bivariate zero-inflated Poisson regression model fitted the data better than the other models.

#### 1. Introduction

Blood transfusion is so important to the health system that it plays a major part in saving lives in both normal and emergency situations. Furthermore, it has a noticeable impact on improving the quality of life, and consequently the life expectancy, of chronic patients. Nevertheless, many patients die, or at least suffer, because they lack access to safe blood transfusion. According to the World Health Organization report, about one percent of the population of every country is in need of blood donation [1]. Today, the need for blood and its products is increasing day by day [2]. Since some diseases may be transmitted by blood transfusion, screening donors and identifying potential healthy donors are of great importance [3]. For that reason, the lack of healthy donors has always been a serious obstacle for blood banks trying to supply sufficient and healthful blood [4, 5]. Therefore, one of the main goals of blood transfusion centers is to identify and retain healthy donors and to prevent unhealthful blood donation, which may cause or aggravate many diseases [2]. Nonetheless, even among those who are eligible to donate blood, only a small portion actually become blood donors [6]. Because a screening test is performed at each donation to separate healthful from unhealthful blood, the more donations there are, the higher the chances of obtaining healthful blood. That is why recognizing the factors that influence blood donation is of great importance in attracting potential donors and turning them into regular donors [3].

Of all the laboratory screening methods for preventing the transmission of infection through blood, the only truly effective method is to select healthy donors and to keep ineligible donors from donating [7]. People who are not eligible to donate blood are called “deferred donors” [8]. Most deferrals are “temporary” and result from taking certain medications before donation, high or low blood pressure, anemia, high-risk behavior, and so forth. These deferrals are controllable and can be reduced by giving donors the necessary information before donation [2, 9–11]. Naturally, as returns for donation increase, the probability of deferral from donation increases too. Therefore, a positive correlation is expected between the number of donations and the number of deferrals. Taken together, these points underline the importance of obtaining healthful blood and its products, and the relation between the number of returns for donation and the number of deferrals.

The Poisson regression model belongs to the family of generalized linear models and applies when the response variable is a count that follows a Poisson distribution. Equality of the variance and the mean of the dependent variable is one of the key assumptions of Poisson regression analysis. In most practical applications, the response observations are overdispersed (i.e., their variance is significantly larger than their mean), so fitting the Poisson regression model to the data will not yield the desired results. In the univariate case, the usual solution to the problem of overdispersion is a negative binomial regression model [12, 13]. In many studies, two count response variables are correlated; in such cases, analyzing the response variables separately without considering this correlation results in inconsistent and inefficient estimators [14]. The basic solution is to use bivariate count models [14–16]. In medical, environmental, and ecological studies, excess zeros in count data are common. If the zeros are ignored to simplify the analysis, valuable information is lost, which can bias the parameter estimates and lead to misleading findings [17, 18]. An appropriate class of distributions for such data is the zero-inflated class. In practice, zero-inflated data can be modeled as arising from a zero-inflated Poisson or a zero-inflated negative binomial distribution [18]. The zeros in the count data can then be attributed either to structural causes (known as structural zeros) or to sampling limitations (known as sampling zeros) [17]. The most common model for explaining and analyzing excess zeros in count data is the zero-inflated model [17, 18]. Generally, zero-inflated models fit this kind of data better than regular models [19].

In the multivariate, and especially the bivariate, case, in which the two count response variables are correlated [18], the marginal distributions of the bivariate Poisson model are univariate Poisson, so this model cannot accommodate paired count data with extra zeros. Instead, the bivariate zero-inflated regression model is used [20]. In most applications, the zero-inflated bivariate Poisson distribution is a logical choice.
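To make the two features discussed above concrete, the following minimal sketch simulates paired counts from one common construction of a bivariate zero-inflated Poisson distribution: with probability `p` the pair is a structural (0, 0), and otherwise it is drawn from a bivariate Poisson built by adding a shared Poisson component to two independent ones (trivariate reduction). This construction, the function names, and the parameter values are assumptions chosen for illustration; they are not taken from the paper, whose own model specification may differ.

```python
import math
import random


def poisson(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def sample_bzip(rng, p, lam0, lam1, lam2):
    """One pair (y1, y2): structural (0, 0) with probability p,
    otherwise a bivariate Poisson pair sharing the component lam0."""
    if rng.random() < p:
        return 0, 0
    shared = poisson(rng, lam0)  # the shared term induces positive correlation
    return poisson(rng, lam1) + shared, poisson(rng, lam2) + shared


def simulate_bzip(n, seed, p, lam0, lam1, lam2):
    """Simulate n pairs with a fixed seed (all values are illustrative)."""
    rng = random.Random(seed)
    return [sample_bzip(rng, p, lam0, lam1, lam2) for _ in range(n)]


def summarize(pairs):
    """Return the fraction of (0, 0) pairs and the sample covariance."""
    n = len(pairs)
    zero_frac = sum(1 for a, b in pairs if a == 0 and b == 0) / n
    m1 = sum(a for a, _ in pairs) / n
    m2 = sum(b for _, b in pairs) / n
    cov = sum((a - m1) * (b - m2) for a, b in pairs) / (n - 1)
    return zero_frac, cov


if __name__ == "__main__":
    pairs = simulate_bzip(n=20000, seed=1, p=0.4, lam0=0.5, lam1=1.0, lam2=0.3)
    zero_frac, cov = summarize(pairs)
    # Under this construction, P(0, 0) = p + (1 - p) * exp(-(lam0 + lam1 + lam2)).
    print("fraction of (0, 0) pairs:", round(zero_frac, 3))
    print("sample covariance:", round(cov, 3))
```

Under these assumptions, the share of (0, 0) pairs exceeds what the bivariate Poisson component alone would produce, and the covariance is positive because both counts include the shared component.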

Because of computational difficulties in fitting these models, researchers were long unable to use zero-inflated bivariate count models [18]. Recent advances in hierarchical Bayesian modeling, and specifically in simulation methods such as Markov chain Monte Carlo (MCMC), have made bivariate distributions such as the bivariate Poisson simple to implement [18].

This study seeks to determine the factors that affect the number of returns for donation and the number of deferrals from donation. The data are counts, so a count regression model is needed to explain them. In addition, the data contain a large number of zeros, and the number of returns for donation is positively correlated with the number of deferrals. To model these two variables obtained from blood transfusion data, the bivariate zero-inflated Poisson regression model was used. The remainder of the paper is organized as follows. Section 2 first introduces the bivariate blood transfusion data set, then presents the bivariate zero-inflated Poisson models, and finally develops a Bayesian methodology for fitting the bivariate zero-inflated Poisson model. Section 3 presents and discusses the results of fitting the proposed model to the blood transfusion data. Finally, Section 4 provides some conclusions.

#### 2. Materials and Methods

##### 2.1. Data

The data used in this research were obtained from a longitudinal study in which a random sample of donors with a successful first-time donation was followed for a maximum of five years; the number of returns for blood donation and the number of blood deferrals were recorded as the responses. A full description of the data is given in [3]. Figure 1 shows the frequency of returns for blood donation and of blood deferrals. Zero counts account for 51% of the return-for-donation observations and 85% of the deferral observations, far more than a Poisson distribution would predict. In addition, the Spearman correlation coefficient between the number of returns for donation and the number of deferrals was 0.276, which is significant at the 0.01 level. Therefore, the bivariate zero-inflated regression model was used to study the factors affecting these responses. Sex, weight, age, marital status, education, and job were taken as independent variables. Because education and job were nominal, three dummy variables for education and four for job were used to include them in the model. As a result, thirteen independent variables were entered into the model.
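The comparison with the Poisson distribution mentioned above can be sketched as follows: for a Poisson variable with mean λ, the probability of a zero is exp(−λ), so an observed zero share well above exp(−sample mean) suggests zero inflation. The counts below are hypothetical toy values, constructed only so that 51% of them are zero; they are not the study data, and the function name is an assumption for illustration.

```python
import math


def zero_excess(counts):
    """Return (observed zero share, zero share implied by a Poisson
    distribution whose mean equals the sample mean)."""
    n = len(counts)
    mean = sum(counts) / n
    observed = sum(1 for c in counts if c == 0) / n
    expected = math.exp(-mean)  # P(Y = 0) for Y ~ Poisson(mean)
    return observed, expected


# Hypothetical toy counts (NOT the study data): 100 donors, 51 with no return.
toy_returns = [0] * 51 + [1] * 20 + [2] * 12 + [3] * 8 + [5] * 5 + [8] * 4

if __name__ == "__main__":
    observed, expected = zero_excess(toy_returns)
    print("observed zero share:", observed)
    print("Poisson-implied zero share:", round(expected, 3))
```

In this toy example the observed zero share is well above the Poisson-implied share, which is the pattern that motivates a zero-inflated model.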