Blood donors are considered one of the healthiest populations. This study describes the epidemiology of cancer in a cohort of blood donors up to 20 years after blood donation. Records from donors who participated in the Retroviral Epidemiology Donor Study (REDS, 1991–2002) at Blood Centers of the Pacific (BCP), San Francisco, were linked to the California Cancer Registry (CCR, 1991–2010). Standardized incidence ratios (SIR) were estimated using the standard US 2000 population, and survival analysis was used to compare all-cause mortality among donors and a random sample of nondonors with cancer from CCR. Of 55,158 eligible allogeneic blood donors followed for 863,902 person-years, 4,236 (7.7%) primary malignant cancers were diagnosed. SIR in donors was 1.59 (95% CI = 1.54–1.64). Donors had significantly lower mortality (adjusted HR = 0.70, 95% CI = 0.66–0.74) compared with nondonor cancer patients, except for respiratory system cancers (adjusted HR = 0.93, 95% CI = 0.82–1.05). Elevated cancer incidence among blood donors may reflect higher diagnosis rates due to health-seeking behavior and cancer screening in donors. A “healthy donor effect” on mortality following cancer diagnosis was demonstrated. This population-based database and sample repository of blood donors with long-term monitoring of cancer incidence provides the opportunity for future analyses of genetic and other biomarkers of cancer.

1. Introduction

Blood donors are considered to be one of the healthiest populations due to donation eligibility requirements. Studies have suggested a lower incidence of cancer diagnosis and mortality in blood donors. Merk et al. and Edgren et al. estimated cancer incidence in Swedish donors and Swedish and Danish donors, respectively, and showed lower incidence of cancer—including hematological malignancies—in blood donors [1, 2]. These investigators also analyzed risk of cancer in longer-term blood donors relative to donation frequency and found no association between donation intensity and risk of cancer among blood donors [3].

We are unaware of any similar large-scale longitudinal study of cancer occurrence in the US blood donor population. Moreover, the Swedish and Danish studies did not retain samples from donors, in contrast to the existence of cryopreserved plasma and cellular sample repositories from a large number of US blood donors who consented to long-term outcome research studies in the past several decades. Establishment of a population-based database for long-term monitoring of cancer incidence and outcome among these US blood donors therefore provides a necessary base on which future analyses of genetic or other biomarkers of cancer can be conducted.

Here we describe characteristics of a sample of the California blood donor population who consented to collection and storage of repository samples of serum, plasma, or whole blood for future research. By linking the identities of these donors with the California Cancer Registry (CCR), we identified donors who developed cancer after blood donation and were able to estimate incidence of cancer among donors for all cancers and by anatomic site. We also compared survival in blood donors with a primary cancer to that in a demographically matched sample of the nondonor population with malignancies to investigate the so-called “healthy donor effect” on overall mortality.

2. Methods

2.1. Retroviral Epidemiology Donor Study

Beginning as early as 1974, as part of blood transfusion safety studies in the United States, the National Heart, Lung, and Blood Institute (NHLBI) and others funded the creation and maintenance of a series of large-scale blood specimen repositories [4]. These repositories of donor or linked donor and recipient specimens include three large repositories developed during the NHLBI Retroviral Epidemiology Donor Study (REDS) starting in 1991 that contain specimens from over 700,000 representative blood donors [4, 5]. Details of donor consent, blood collection, and sample processing methods have been previously described for each repository [6]. Briefly, REDS was a multicenter study awarded to 5 large, geographically dispersed community blood centers (Baltimore/Washington DC, Detroit, Los Angeles, San Francisco, and Oklahoma). Approximately 10% of these specimens were provided by consenting donors at a Northern California blood bank, Blood Centers of the Pacific (BCP), based in San Francisco. At the time of donation, blood donors would not have had overt signs of illness because such signs would have excluded them from eligibility to donate. History of hematological cancer would have resulted in permanent deferral for blood donation, whereas donors with a history of other cancer types would have been eligible to donate blood one to three years after completion of treatment if they were in complete remission and free of symptoms, consistent with the American Association of Blood Banks (AABB) and FDA policies [7–9].

2.2. Data Linkage

We obtained electronic record files from the data coordinating center for REDS (Westat Inc., Rockville, MD) that included the repository specimen number, donor and donation identification numbers, and dates of donation(s) in the repositories for BCP blood donors who had consented to participate and donated specimens to any of the three REDS repositories between 1991 and 2002. The files were provided to BCP information technology (IT) staff who then submitted an electronic file containing personal identifiers of the REDS repository donors to CCR to be matched against their records between January 1991 and December 2010.

The CCR is a population-based registry that maintains records of nearly all cancer cases diagnosed in California [10]. Statewide cancer reporting has been mandatory since January 1, 1988. For this linkage, the CCR’s January 2012 extract was used. The linked dataset is complete through 2009 for donors with a cancer record. CCR performed a formal data linkage and sent deidentified records containing the specimen number and demographic, socioeconomic, and cancer variables (cancer diagnosis, cancer biomarkers, cancer description, treatment, and outcome) to Blood Systems Research Institute (BSRI) investigators for data analysis. All donor records were returned to BSRI investigators and kept in the master database irrespective of successful linkage to CCR records.

The linked database was also matched against the Social Security Administration (SSA) Death Master File (DMF) to verify vital status of blood donors and date of death for those who are no longer alive. The SSA has been collecting information for DMF since 1936. The SSA-DMF contains over 89 million records of deaths. An encrypted file including social security number, first and last name, date of birth, and date of last visit to blood center was sent to the SSA and compared to the DMF to identify matches. We verified information received from SSA with CCR and BCP records, using multiple time points and data elements, including date of last visit to the blood center, cancer diagnosis date, hospital admission date, and last date known alive/dead in CCR and BCP records. Where there was a discrepancy among multiple sources, we assumed BCP records to be the gold standard with respect to vital status, because donors must be physically present at BCP for each donation visit; followed by CCR, which annually updates vital status of cancer patients using multiple national and local sources; and then SSA-DMF.
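The source hierarchy described above can be illustrated with a short sketch. It reflects one possible reading of the rule (a death date is accepted only if no higher-priority source shows the donor alive after it); the record layout, the `VitalReport` class, and the `resolve_death_date` function are hypothetical names introduced for illustration and are not part of the study's software.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

# Priority stated in the text: BCP first, then CCR, then SSA-DMF.
SOURCE_PRIORITY = ("BCP", "CCR", "SSA-DMF")


@dataclass
class VitalReport:
    """One source's view of a donor (hypothetical layout)."""
    source: str                              # one of SOURCE_PRIORITY
    last_known_alive: Optional[date] = None  # e.g. last BCP visit
    death_date: Optional[date] = None


def resolve_death_date(reports: List[VitalReport]) -> Optional[date]:
    """Return the accepted death date, or None if the donor is treated as alive.

    Assumption: a reported death is rejected when a higher-priority
    source places the donor alive after that date.
    """
    ranked = sorted(reports, key=lambda r: SOURCE_PRIORITY.index(r.source))
    latest_alive: Optional[date] = None  # from higher-priority sources only
    for report in ranked:
        if report.death_date is not None and (
            latest_alive is None or report.death_date >= latest_alive
        ):
            return report.death_date
        if report.last_known_alive is not None and (
            latest_alive is None or report.last_known_alive > latest_alive
        ):
            latest_alive = report.last_known_alive
    return None
```

For example, an SSA-DMF death date in 2005 would be set aside under this reading if BCP records show a donation visit in 2007.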

To compare overall survival of donor and non-donor cancer patients, we also requested from CCR a comparison group of frequency-matched non-donor cancer records in a ratio of 3 nondonors with cancer to 1 donor with cancer (Figure 2). Each record in the linked BCP/CCR cohort database ( ) was assigned to one of 26,400 categories, each defined by a unique combination of “year of diagnosis/quintile of SES/race/sex/SEER tumor category” (22 years of diagnosis × 5 SES quintiles × 6 race categories × 2 sex categories × 20 SEER tumor categories = 26,400 total categories). Each non-donor cancer record in the CCR database was assigned a uniform random number between zero and one, and the records within each of the 26,400 categories were sorted by this random number. Within each category, records were then selected in sorted order, from the first record up to three times the number of donor matches in that category, to achieve a 3 : 1 ratio of non-donor to donor records.
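The selection procedure described above can be sketched as follows. This is a minimal illustration rather than the registry's actual code: the dictionary keys (`dx_year`, `ses_quintile`, `race`, `sex`, `seer_site`) and the function names are assumptions made for the example.

```python
import random
from collections import Counter, defaultdict

# Category count implied by the matching variables in the text.
assert 22 * 5 * 6 * 2 * 20 == 26_400


def stratum(rec):
    """Matching category of a cancer record (hypothetical field names)."""
    return (rec["dx_year"], rec["ses_quintile"], rec["race"],
            rec["sex"], rec["seer_site"])


def sample_nondonors(donor_cases, nondonor_pool, ratio=3, seed=0):
    """Pick up to `ratio` non-donor records per donor record in each category."""
    rng = random.Random(seed)
    needed = Counter(stratum(r) for r in donor_cases)

    # Attach a uniform random number in [0, 1) to each non-donor record.
    by_stratum = defaultdict(list)
    for rec in nondonor_pool:
        key = stratum(rec)
        if key in needed:
            by_stratum[key].append((rng.random(), rec))

    # Sort each category by its random number and keep the first records.
    selected = []
    for key, n_donors in needed.items():
        ranked = sorted(by_stratum[key], key=lambda pair: pair[0])
        selected.extend(rec for _, rec in ranked[: ratio * n_donors])
    return selected
```

If a category holds fewer than three eligible non-donor records per donor record, this sketch simply returns all of them, so the achieved ratio can fall below 3 : 1 in sparse categories.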

IRB approvals to conduct this data linkage and analysis were obtained from the University of California, San Francisco Committee on Human Research (the IRB of record for BSRI/BCP), and from the California Public Health Service (the IRB of record for CCR).

2.3. Master Database

The resulting linked master database contains demographic and donation history data from BCP’s database for 66,984 donors and CCR records for the subset of these donors who were diagnosed with cancer between 1991 and 2009.

BCP Data. The BCP dataset consists of the demographic information required by the CCR in order to link donors to cancer registrants: first name, middle initial, last name, social security number, date of birth, sex, last known street address, city, and zip code. Donors’ current address is actively updated only when donors visit the center seeking to donate. This dataset also contains the following demographic and donation history variables: birth state and/or country, race/ethnicity, marital status at time of last BCP visit, education, first or repeat status on repository specimen donation, donation type, date of repository specimen donation, first donation date, total number of presentations to attempt to donate, total number and types of successful donations, last donation date, and ABO/Rh blood groups [11].

CCR Data. In addition to demographics, the CCR dataset includes variables in the following domains for donors and nondonors who had a cancer:
(1) socioeconomic: occupation, median household income, education index (average years of schooling in a given census tract), college degree index, poverty index, socioeconomic status (SES) index, and urban/rural status;
(2) cancer diagnosis: date of diagnosis, date of admission, primary or secondary tumor, number of tumors, tumor size, and diagnosis location;
(3) cancer description: tumor primary site (SEER ICD-O-2 and ICD-O-3 definition), tumor stage at diagnosis, histologic type and grade/differentiation, and regional node extension;
(4) cancer biomarkers (available for some records): estrogen receptor for breast cancer, acid phosphatase for prostate cancer, alpha-fetoprotein for testicular cancer, carcinoembryonic antigen (CEA) for colorectal cancer, and carbohydrate antigen (CA-125) for ovarian cancer;
(5) treatment: type of chemotherapy, hormonal therapy, and immunotherapy;
(6) outcomes: vital status, date of last contact/death, place of death, and cause of death.

2.4. Exclusion Criteria

We excluded donors who donated before the age of 18 ( ), donors who gave autologous donations ( ), donors with unknown donation type or confirmatory testing blood sample ( ), and those who were diagnosed with a cancer before their first donation to REDS repositories or had invalid date records in the linked database ( ) (Figure 1). We further excluded 664 donors who were diagnosed with a primary nonmalignant tumor (SEER “behavior” variable coded as benign, borderline, or in situ tumor) during study followup. We also excluded the one-year and two-year periods following first donation from the analysis to ensure exclusion of donors with potentially undiagnosed cancer at time of donation or cancers diagnosed shortly after donation ( and 333, resp.).

For survival analysis (donors and nondonors with cancer) we excluded records of donors with autologous/unknown donation type ( ), nonprimary tumors ( ), tumors with unknown sequence ( ), duplicate tumor IDs ( ), or a tumor diagnosed before age of 18 ( ) (Figure 2).

2.5. Statistical Analysis

The number of diagnosed cancer cases for all cancers and by anatomic site was estimated using SEER ICD-O-3 site codes [12]. Donors’ follow-up time and all-cause mortality were updated using “vital status” captured in CCR data, defined as “patient’s vital status as of the date of data extraction” (completed through 2009) [13]. For donors without a cancer record, last follow-up time was updated using matched SSA-DMF records. Donors were followed from first repository donation to cancer diagnosis, death, or the end of study followup (December 31, 2009), whichever came first.
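The follow-up rule (first repository donation to cancer diagnosis, death, or end of study, whichever came first) can be written as a small function. This is an illustrative sketch; the function name and the use of 365.25 days per year are assumptions, since the text does not state how dates were converted to person-years.

```python
from datetime import date
from typing import Optional

STUDY_END = date(2009, 12, 31)  # end of study followup stated in the text


def follow_up_years(first_donation: date,
                    cancer_dx: Optional[date] = None,
                    death: Optional[date] = None,
                    study_end: date = STUDY_END) -> float:
    """Person-years contributed by one donor.

    Follow-up ends at the earliest of cancer diagnosis, death, or study end.
    Assumption: years are counted as days / 365.25.
    """
    end = min(d for d in (cancer_dx, death, study_end) if d is not None)
    return max((end - first_donation).days, 0) / 365.25
```

Summing this quantity over all eligible donors gives the person-years denominator used for the expected case counts.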

The observed incidence rate of any cancer among BCP donors was compared with the rates expected based on SEER 9 registry data, including Atlanta, Connecticut, Detroit, Hawaii, Iowa, New Mexico, San Francisco-Oakland, Seattle-Puget Sound, and Utah. The number of expected cancer cases was calculated by multiplying person-years of followup by the sex- and age-specific SEER incidence rates for the calendar years 1992 to 2009. We also estimated the expected cancer cases using only San Francisco-Oakland SEER data. Standardized incidence ratios (SIRs) were calculated by dividing the number of observed cases by the number of expected cases for all cancers and by anatomic site. Sensitivity analysis was conducted to estimate SIRs using different latency periods between first repository donation and primary cancer diagnosis and to exclude potential cancer patients who were not diagnosed at time of first donation (Figure 1). We estimated SIR for all eligible donors with a valid record ( ); then we limited the eligible study population to donors who had more than one year of follow-up ( ) and to those with two years or longer of follow-up time ( ). Confidence intervals (CI) for the SIRs were calculated based on the Poisson distribution as described in Rothman and Boice [14].
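As an illustration of the calculation, the sketch below computes expected cases from stratum-specific person-years and reference rates, and then the SIR with a Poisson-based confidence interval. It uses Byar's approximation to the exact Poisson limits, a formula commonly attributed to Rothman and Boice; whether the authors used exactly this formula is an assumption, and the function names and the rate unit (cases per 100,000 person-years) are chosen for the example.

```python
from math import sqrt
from statistics import NormalDist


def expected_cases(person_years, rates_per_100k):
    """Sum of person-years x reference rate over sex/age strata.

    Both arguments are dicts keyed by stratum; rates are assumed to be
    expressed per 100,000 person-years.
    """
    return sum(py * rates_per_100k[key] / 100_000
               for key, py in person_years.items())


def sir_with_ci(observed, expected, level=0.95):
    """SIR = observed / expected, with Byar's approximate Poisson limits."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    sir = observed / expected
    if observed > 0:
        lower = observed * (1 - 1 / (9 * observed)
                            - z / (3 * sqrt(observed))) ** 3 / expected
    else:
        lower = 0.0  # no lower limit can be formed from zero cases
    o1 = observed + 1
    upper = o1 * (1 - 1 / (9 * o1) + z / (3 * sqrt(o1))) ** 3 / expected
    return sir, lower, upper


# Observed and expected counts reported for the full cohort.
sir, lower, upper = sir_with_ci(4236, 2670)
print(f"SIR = {sir:.2f} (95% CI = {lower:.2f}-{upper:.2f})")
```

With the reported 4,236 observed and 2,670 expected cases, this approximation gives an SIR of about 1.59 with limits close to 1.54 and 1.64, in line with the values reported in the Results.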

Hazard ratios (HRs) and 95% CI for death following cancer diagnoses were calculated using Cox proportional hazard regression models for all primary cancers combined and by cancer site, adjusted for age at diagnosis, sex, race, SES, tumor stage, and grade at diagnosis to compare all-cause mortality among donors and nondonors diagnosed with primary cancer (Figure 2).

All data analyses were performed using STATA 11.2 (STATA Corporation, TX, USA). All statistical tests were two sided with a 5% type I error.

3. Results

A total of 66,984 BCP donors who had consented to one or more of the REDS repository studies were searched for a link to 3,413,457 CCR records. The linkage resulted in 7,943 individual donors with cancer. Between 1991 and 2009, 4,236 primary malignant cancer records were identified among 55,158 blood donors eligible for this analysis (Figure 1).

Demographic characteristics and donation history of eligible BCP blood donors with and without a primary cancer record are presented in Table 1. Compared to donors without cancer, donors who developed cancer were older at time of first repository donation and included a higher proportion of white males and of donors with lower education levels. Donors with a cancer record had a shorter follow-up time compared to donors without a cancer (median 9.8 years versus 16.9 years), a longer duration of donor activity (median 3.7 years versus 3.1 years), and a greater number of lifetime blood donations (median 6 donations versus 4 donations).

3.1. Cancer Incidence among BCP Blood Donors

During 863,902 person-years of follow-up, 4,236 primary cancers were observed among BCP blood donors, while 2,670 cancers were expected (SIR = 1.59; 95% CI = 1.54–1.64). Excluding donors with less than one year of followup, the number of observed primary cancers decreased to 4,092 cases during 808,855 person-years of followup (Table 2). Restricting the analysis to exclude donors with less than one or two years of followup slightly increased the ratio of observed to expected cancers, by about 5–10% (SIR = 1.59, 95% CI = 1.54–1.64 with no latency period, versus SIR = 1.64, 95% CI = 1.59–1.70 with a one-year latency period (Table 2) and SIR = 1.69, 95% CI = 1.64–1.74 with a two-year latency period). The most common sites of cancer diagnosed among BCP blood donors with more than one year of followup were male genital and breast cancers ( and 754, resp.) followed by cancers of the digestive system ( ). Compared with the general US population, BCP blood donors had a higher number of observed cases than expected for most cancer sites (Table 3). Using “San Francisco-Oakland SEER registry” incidence rates as a reference for the same time period slightly changed the SIRs for all cancer sites (data not shown); however, the changes in SIR using different reference populations were not substantial. The largest change was observed for female malignant skin cancers, with a 27% increase in the estimated SIR when using San Francisco-Oakland SEER registry rates (SIR = 2.26, 95% CI = 1.88–2.68) as compared with SEER 9 registry rates (SIR = 1.77, 95% CI = 1.48–2.11).

3.2. Mortality following Cancer Diagnoses in Donors versus Nondonors

We compared BCP allogeneic blood donors with a primary cancer record ( ) to a demographically matched sample of the California cancer population (nondonors, ). In analyses adjusted for cancer site, we found that donors had significantly higher overall survival following diagnosis of cancer compared with nondonors regardless of the cancer site ( , data not shown). In analyses stratified by cancer site, we observed significantly higher overall survival among donors with site-specific cancers ( , Figures 3(a), 3(c), and 3(d)), except for cancers of the respiratory system ( , Figure 3(b)), brain and nervous system, and myeloma ( and 0.96, resp., data not shown).

In Cox regression models, the risk of mortality was significantly lower among donors compared with nondonors for most cancer sites, with an unadjusted HR of 0.58 (95% CI = 0.55–0.62) and an HR of 0.70 (95% CI = 0.66–0.74, Table 4) after adjusting for age at diagnosis, sex, race, SES index, tumor stage, and grade at diagnosis. The difference in risk of all-cause mortality between donors and nondonors with cancers of the respiratory system or multiple myeloma remained nonsignificant in the adjusted Cox regression models.

4. Discussion

4.1. Standardized Incidence Ratio

The present study describes the epidemiology and incidence of cancer among a large cohort of US blood donors for up to 20 years after blood donation. We observed an increased incidence of cancer diagnoses among BCP blood donors as compared with the general US population, as well as lower overall mortality among donors who were diagnosed with cancer compared with a matched non-donor cancer population.

Investigators with the Scandinavian Donations and Transfusions Database (SCANDAT) linked data on all registered blood donors and recipients in Sweden and Denmark with national population and health registries, with complete followup of up to 36 years [15]. The investigators reported lower mortality and cancer incidence among blood donors than among the general Swedish and Danish populations [1]. The current study design is similar to that of the Scandinavian study in that it links a donation database to health registries to estimate cancer incidence among blood donors. However, there are major differences between the two study designs. The most important is the existence of national health registers and nationwide transfusion registries in the Scandinavian countries, which capture almost the entire Swedish and Danish populations. The “complete followup” of the donors in SCANDAT minimizes the likelihood of surveillance bias in the Scandinavian study, while our analysis is limited to BCP blood donors and cancer cases reported within California. Another major difference is the relatively small size of our study population, with about 4,000 cancer cases compared with 38,000 cancers recorded in SCANDAT. However, the current analysis is the first attempt to report on long-term cancer outcomes of blood donors in the US.

Likely explanations for the increase in cancer incidence among BCP donors relative to the general US population are differences in demographics and access to primary care between BCP blood donors and the general population. BCP blood donors are disproportionately young and white and are more educated than the general US population. The higher incidence rates of breast and male genital system cancers, for which screening of healthy adults is available and recommended, could, for example, be due to known differences in education level, race, SES, and geographic location, or to unmeasured differences in health-seeking behavior and access to health care for blood donors compared to the general US population. Our SIR estimates do not take into account the stage and grade of cancer at time of diagnosis (and hence entry into the CCR database); a more educated donor with more frequent primary doctor visits and access to preventive care and screening is more likely than a member of the general population to be diagnosed with a low-grade cancer at an earlier stage [16]. An interesting finding of the Scandinavian study is the higher incidence of breast and prostate cancer in their donor population compared with the reference population [1].

4.2. Lower Mortality among Blood Donors versus Nondonors

Our study documented lower overall mortality following diagnosis when comparing blood donors with cancer and non-donor cancer populations. Selection bias due to the “healthy worker effect” has been indicated in relation to various diseases including cancer [17–20]. However, there are no studies of a “healthy donor effect” on overall survival in blood donors with cancer. The lack of a significant dose-effect relationship between blood donation and overall mortality among BCP donors with cancer (data not shown) suggests that the “healthy donor effect” on mortality operates through pathways not related to duration of donor activity or number of lifetime donations. Many behavioral and medical factors related to cancer incidence and survival are not included in the blood donor screening and eligibility criteria. For instance, smoking history and alcohol consumption are known risk factors for cancer incidence and mortality [21, 22] but were not collected for REDS (or other US) donors, because neither smoking nor drinking is a reason for donor deferral. This could partially explain similar survival rates among donors and nondonors with primary cancers of the respiratory system.

Better survival among BCP blood donors with cancer after adjusting for multiple confounding factors suggests the possibility of differences in access to health care resulting in earlier diagnosis and better treatment of malignancies. One might again assume health-seeking behavior of blood donors and a lag time effect as an explanation, that is, more frequent doctor’s office visits and screening during a donor’s lifetime, so that cancers were diagnosed at earlier stages with a better prognosis [23]. This is only true for cancers that could be diagnosed during presymptomatic stages by the current recommended screening practices, that is, breast, prostate, and colorectal cancers. Nonetheless, selection bias due to the overall healthier profile of donors, access to health care, and treatment options, or a “healthy registration effect” [23], are more plausible explanations, because the effect remains significant after adjusting for tumor stage and grade. Thus, the increased incidence of cancer and the better survival following cancer diagnoses are linked.

4.3. Study Limitations and Strengths

The present data are limited by differential ascertainment of vital status for donors with and without cancer. For donors with a cancer record in CCR, the “vital status” variable is updated through various registries and data sources (source of last followup information is the CCR) [13], with complete followup through December 2009. However, vital status for donors without cancer was updated using only SSA-DMF records.

The SSA does not guarantee the accuracy of its SSA-DMF records. Thus, the absence of a particular person from the file is not proof that this person is alive. Based on age-adjusted US mortality rates from 1997 to 2009 [24], we expected about 5,200 deaths among 64,000 blood donors within an average 10 years of followup. However, the SSA-DMF search reported about 4,300 matched death records for our donor population (82%). We verified vital status of blood donors using multiple time points, including last visit to BCP, to increase accuracy of our follow-up time. However, erroneous classification of a deceased donor as alive at the end of study followup is plausible, which would result in overestimation of person-years of followup, overestimation of expected cancer cases, and underestimation of SIR. On the other hand, we speculate that our observed cases are underestimated, because primary tumors diagnosed outside of California after blood donation are not captured. Based on the date of last visit to the blood center, 15% of eligible donors without a cancer record in CCR visited BCP in 2008 or 2009, and we can consider them correctly classified as cancer-free. For the other 85%, it is not feasible to verify California residency beyond their last blood donation date. We cannot ascertain whether they were diagnosed with a primary tumor outside of California; had such tumors been diagnosed in California, the number of observed cases would have been larger, resulting in a greater observed to expected ratio and a larger SIR. The person-years of followup for these donors, on the other hand, would have been shorter, because they would have contributed person-time up to cancer diagnosis, not to the end of study (or date of death by SSA-DMF). Smaller person-years of followup would have reduced the estimate for expected cancers and increased the observed to expected ratio and SIR. Thus, we consider the overall bias in SIR estimates, if any, to be towards the null.

In another set of sensitivity analyses we calculated a range of SIRs without updated vital status information. We estimated SIRs assuming two extreme ranges for person-years of followup: from first donation to cancer diagnosis or December 31, 2009 (SIR = 1.52, 95% CI = 1.47–1.56) or from first donation to cancer diagnosis or last visit to the blood center (SIR = 3.47, 95% CI = 3.37–3.58). Our conservative estimate after vital status update and excluding donors with less than 12 months of followup time showed 64% greater incidence of primary malignant tumors among BCP REDS donors compared with the general US population (95% CI = 59–70%). While our data are observational with limited follow-up information for donors without a cancer record, sensitivity analyses assuming the best and worst case scenarios for donors without a cancer (all donors were alive until the end of study followup or all donors were lost to followup after last BCP visit, resp.) indicate that cancer incidence among our blood donor cohort exceeds that of the general US population in each extreme assumption scenario.

A second limitation of the current analysis is that we had access to limited demographic and medical history information for donors. Data on variables generally known as risk or prognostic factors for cancer, such as weight and height, smoking status, alcohol drinking, family history of cancer, and diabetes, are not captured in BCP or REDS donor data.

Third, our results are limited to northern California blood donors who participated in REDS during 1991–2002. Blood donors have different demographic and behavioral risk factor profiles than the general population, and furthermore BCP blood donors may differ from the general US blood donor population in many aspects related to cancer incidence and mortality. Lastly, although members of the non-donor cancer population do not appear in BCP REDS records, some may be misclassified if they donated blood at other times or at other blood centers. If this is true, the actual difference in mortality between donors and nondonors would likely be even greater than reported here.

This is the first report of cancer incidence among US blood donors in comparison with the general population. In the next phase of the study, a larger dataset including additional REDS donor records from other participating blood centers will be matched against the National Death Index to increase the precision and accuracy of the standardized cancer incidence and outcome estimates among REDS blood donors. A noteworthy aspect of the existing linked donor-cancer database is the stored repository of whole blood and/or serum/plasma specimens, which were collected years prior to cancer diagnosis. The current study establishes the capability of health outcomes linkage and presages utility of the stored specimens in a way that had not been considered during the design and creation of the REDS repositories. The establishment of the linked database for these repository donors is the first step towards using repository samples to investigate cancer biomarkers that may have been present in serum, plasma, or whole blood samples during a time period when these individuals were healthy, cancer-free blood donors. These investigations should help us to better understand biomarker development and changes that may be present in blood years before clinical cancer is diagnosed.

Conflict of Interests

All the authors declare that there is no conflict of interests regarding the publication of this paper.