Abstract

The use of formalin-fixed, paraffin-embedded (FFPE) tissue overcomes the most prominent issues related to research on relatively rare diseases: limited sample size, availability of control tissue, and time frame. The use of FFPE pancreatic tissue in gene expression microarray (GEM) studies may be especially challenging due to its very high amounts of ribonucleases compared to other tissues/organs. In choosing pancreatic tissue, we therefore indirectly address the applicability of other FFPE tissues to GEM. GEM was performed on archived, routinely fixed, FFPE pancreatic tissue from patients with congenital hyperinsulinism (CHI), insulinoma, and deceased age-appropriate neonates, using whole-genome arrays. Although the tissue is ribonuclease-rich, we obtained biologically relevant and disease-specific significant genes: cancer-related genes and genes involved in (a) the regulation of insulin secretion and synthesis, (b) amino acid metabolism, and (c) calcium ion homeostasis. These results should encourage future research and GEM studies on FFPE tissue from the invaluable biobanks available at the departments of pathology worldwide.

1. Introduction

In the last decade, GEM studies have proven valuable in clinical and molecular genetics research. The application of microarrays and transcript profiling analysis in cancer research has been useful in providing insight into the mechanisms and targets involved in oncogenesis [1, 2]. Additionally, the GEM method has become useful in the prediction of diseases, the prediction of drug responses, and tumor classification. Thus far, the use of GEM has been limited by the need for fresh frozen (FF) materials. Until now, purification of high-quality RNA from FF sample specimens has been considered a prerequisite for microarray analysis. The existing GEM studies have also been restricted to fairly frequent diseases to ensure a sufficient amount of sample material. Collaboration between investigators to overcome the issue of sample size has been hindered mainly by logistic problems, namely, possible exposure of FF material to RNA degradation when freezing is somehow compromised during transport and the subsequent processing of the tissue.

During the last century, tissue biopsies worldwide have routinely been subjected to formalin-fixation and paraffin-embedding by pathologists, resulting in large stocks of biopsy materials. Formalin-fixed, paraffin-embedded (FFPE) samples are easy to handle and store and are suitable for diagnostic histology, immunohistochemistry, and in situ hybridization. Therefore, this sample type is almost always available for diseases in which treatment involves surgery to remove parts of or almost the entire affected organ. Furthermore, invaluable clinical information and follow-up data are often accessible in conjunction with FFPE samples [3]. Lastly, the tissue is usually available in amounts adequate for use in GEM studies.

A well-known obstacle in the use of these samples in gene expression analysis has been the extensive degradation, fragmentation, and chemical modification of RNA that occur during the fixation process [4, 5]. However, several recent findings indicate that reverse transcription PCR (RT-PCR) and quantitative PCR (qRT-PCR) on FFPE RNA can produce reliable and reproducible results [6–9].

Nevertheless, gene expression analysis on multiple gene sets is scarcer and has proved more challenging for FFPE samples than for FF samples, especially with regard to the subsequent data analysis and its interpretation [10, 11]. This issue has been addressed by comparing FF and FFPE samples originating from the same biopsy, but this approach requires that the surgeons, pathologists, or lab technicians have control of the tissue, which implies tissue handling and treatment that differ from the handling of routine samples [12–14]. Some studies overcome this obstacle by comparing the transcription profiles of unmatched FF and FFPE samples, but this approach leads to difficulties in validating the results of the data analysis [15]. However, a few studies solely investigated FFPE samples from patients for tumor classification without the inclusion of healthy control samples [16, 17].

CHI and insulinomas are characterized by hyperinsulinemic hypoglycemia, a condition of low blood glucose due to excessive and uncontrolled endogenous insulin secretion.

CHI is a monogenic, heterogeneous disease occurring in newborns with dysregulated hypersecretion of insulin, leading to severe hypoglycemia [18]. Currently, disease-causing mutations in seven different genes expressed in pancreatic beta cells are known and account for approximately 50% of CHI cases [18–25]. Depending on the underlying genetic cause, newborns can either be treated nutritionally and/or medically or require partial or subtotal pancreatectomy [26]. In children requiring subtotal pancreatectomy, postoperative hypoglycemia and in later years diabetes mellitus or impaired glucose tolerance may occur. Gaining more insight into the disease-causing mechanisms in CHI patients with unknown genetic cause may open new possibilities for medical treatment of beta-cell diseases. Consequently, CHI is one of several relatively rare diseases that would benefit from the use of FFPE tissues in GEM studies.

Insulinomas are pancreatic neuroendocrine tumors that, like the pancreas in CHI patients, produce and secrete insulin at low blood glucose levels. Five to ten percent of all insulinomas are due to mutations in the tumor suppressor gene MEN1 [27]. Insulinomas are usually discovered because they cause hypoglycemic symptoms and need to be differentiated from other conditions that cause fasting hypoglycemia, such as CHI [28].

In the absence of readily available FF pancreatic tissue samples from healthy controls and patients with CHI, we considered the use of archived FFPE tissue for GEM analyses. In doing so, we would overcome the most prominent challenges regarding relatively rare diseases: the acquisition of the necessary amounts of samples to ensure adequate sample size, the availability of tissue from healthy controls, and the considerably reduced timeline for the study.

In selecting pancreatic tissue for our study, we also indirectly address the issue of whether the results are applicable to other diseases and tissues. Pancreatic tissue presents an extraordinary challenge because it contains high amounts of ribonucleases. The function of ribonucleases in tissue is to degrade RNA; greater amounts of ribonucleases result in faster degradation of RNA. Therefore, the tissue to be investigated is an important aspect in GEM studies, and a successful assessment of pancreatic tissue would undoubtedly be of great interest. According to EPConDB, currently, only 105 GEM studies on pancreatic tissue have been described; these studies have used pancreatic cell lines, pancreas from animal models, and FF human pancreatic tissue [29–32]. As yet, no GEM studies have been conducted on FFPE pancreatic tissue.

Our aim was to assess whether archived, routine-treated FFPE material can be used to gather disease-specific information and whether this usage is also applicable to pancreatic tissue.

2. Materials and Methods

2.1. Samples

A total of 15 samples were included in this study (Table 1).

Seven of these were CHI pancreatic samples originating from near-total resections (CHI no. 1–7), of which there were five FFPE pancreatic tissue samples (CHI no. 1–5) and two frozen pancreatic tissue samples (CHI no. 6–7). These seven samples were from independent, unrelated patients. Using methods described earlier by Hussain et al. (2005), no mutations were found in the CHI samples in the most frequently mutated genes, ABCC8 and KCNJ11 [33]. The control pancreatic tissue samples consisted of five FFPE neonate samples (C no. 11–15) obtained from departmental archives. These neonate samples were obtained from deceased neonates whose deaths were not pancreas related. Further, a neonate insulinoma biopsy was included, of which one portion was originally formalin fixed and paraffin embedded (C no. 16 and C no. 17) and the other portion was fresh frozen at −80°C (C no. 18). Tissues were formalin fixed and paraffin embedded according to standard procedures in the respective departments at the time of surgery.

The samples were obtained from Great Ormond Street Hospital for Children, London, England, and the Department of Clinical Pathology and Department of Pediatrics, University Hospital of Odense, Denmark. Approvals were obtained from the ethics committees of Great Ormond Street Hospital for Children NHS Trust and the Institute of Child Health, London; the local ethics committee of Funen and Vejle County, The Danish National Committee on Biomedical Research Ethics; and the Danish Data Protection Agency, Denmark. Written informed consent was obtained from the parents or guardians.

2.2. RNA Extraction

Paraffin blocks of FFPE pancreatic samples were cut into two 10 μm sections. To extract RNA from FFPE pancreatic sections, paraffin was solubilized with xylene and centrifuged to pellet tissue from the solution. Residual xylene was removed by two ethanol rinses, and the tissue pellet was air dried. The pellet was digested with Proteinase K overnight in digestion buffer. TRIzol Reagent (Carlsbad, CA, USA) was used to isolate RNA. The final pellets were resuspended in 15 μL of nuclease-free water and stored at −80°C. RNA from frozen pancreatic samples was extracted from 50–100 mg of tissue. The tissue was homogenized and total RNA was extracted using TRIzol Reagent. Samples were stored at −80°C.

For within-array normalization, we used reference RNA from cell lines. Four different cancer cell lines were included in the reference: HeLa (cervical epithelium), SK-BR-3 (mammary gland), HT29 (colon), and A431 (skin) cells. Cells were grown according to instructions from the American Type Culture Collection (ATCC, Manassas, VA, USA). Total RNA was extracted using an RNeasy Midi Kit (Qiagen, MD, USA).

DNase treatment was then performed on all samples. RNA integrity and concentration were determined with an Agilent Technology Bioanalyzer 2100 (Agilent, Lindenhurst, NY, USA), and OD260 measurements were obtained.

2.3. Amplification and Labeling

Amplification of DNase-treated samples was performed using an Amino Allyl MessageAmp aRNA kit (Ambion, Austin, TX, USA). Reference RNA was amplified with one round of amplification, whereas the pancreatic samples (FF and FFPE) were subjected to two rounds of amplification. Aminoallyl-modified RNA (aaRNA) from pancreatic samples was labeled with Cy5, and aaRNA from reference cell lines was labeled with Cy3 (Amersham Biosciences, Buckinghamshire, England). Frozen pancreatic samples and reference samples were fragmented using RNA fragmentation reagents (Ambion, Austin, TX, USA) according to the manufacturer’s instructions. Fragmented reference aaRNA was subsequently aliquoted.

Two micrograms of labeled pancreatic aaRNA, corresponding to approximately 200 pmol of incorporated dye, and 1 μg of reference aaRNA, corresponding to approximately 100 pmol of incorporated dye, were used for hybridization to the microarray slide.

2.4. Hybridization to and Scanning of Microarray Slides

Oligonucleotide targets (29,134 total) were spotted and prepared as previously described [34]. The dried slides were stored in desiccators until use. Hybridization of samples and reference aaRNA and the subsequent microarray washes were performed using an Agilent Gene Expression Hybridization Kit according to the manufacturer’s protocol. Slides were scanned using an Agilent G2565BA microarray scanner (Agilent Technology).

2.5. Data Preprocessing

Spot intensities measured for references and samples were corrected for local background intensities. Data analysis was performed using the variance stabilization normalization (vsn) procedure implemented in the R-based Bioconductor project (http://www.bioconductor.org/) [35, 36]. A ratio of the normalized intensities was calculated for each spot. Finally, a geometric mean was calculated for the ratios of each pair of spot duplicates and was used as the gene expression measurement. Data are available from GEO (http://www.ncbi.nlm.nih.gov/geo/, Accession no. GSE32610).
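The preprocessing steps described above can be illustrated with a short Python sketch. It is not the authors' pipeline: a plain log2 transform stands in for the variance-stabilizing transform of vsn, and the function names, the positive floor applied after background subtraction, and the intensities are hypothetical.

```python
import math

def background_correct(foreground, background, floor=1.0):
    """Subtract the local background; clamp at a small positive floor
    (the floor is an assumption, used only to keep the logarithm defined)."""
    return max(foreground - background, floor)

def log_ratio(cy5_fg, cy5_bg, cy3_fg, cy3_bg):
    """log2(Cy5 / Cy3) for one spot after background correction."""
    cy5 = background_correct(cy5_fg, cy5_bg)
    cy3 = background_correct(cy3_fg, cy3_bg)
    return math.log2(cy5 / cy3)

def gene_measurement(log_ratio_a, log_ratio_b):
    """Geometric mean of two duplicate ratios, expressed on the log2 scale.

    The geometric mean of ratios r_a and r_b is sqrt(r_a * r_b); its log2 is
    the arithmetic mean of the two log2 ratios.
    """
    return (log_ratio_a + log_ratio_b) / 2.0

# Hypothetical duplicate spots for one gene (sample = Cy5, reference = Cy3).
spot_a = log_ratio(900.0, 100.0, 300.0, 100.0)  # 800 / 200 = 4
spot_b = log_ratio(500.0, 100.0, 300.0, 100.0)  # 400 / 200 = 2
gene = gene_measurement(spot_a, spot_b)
```

Working on the log scale means the geometric mean of duplicate ratios reduces to a simple average, which is why the sketch never multiplies the ratios directly.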

2.6. Data Analysis
2.6.1. Correlation Analysis

The normalized gene intensities for the Cy3 and Cy5 dyes were compared for labeling efficiencies and reproducibility across all genes by calculating the Pearson correlation coefficient:

\[
\mathrm{CC} = \frac{\sum (x - x_{\mathrm{mean}})(y - y_{\mathrm{mean}})}{\sqrt{\sum (x - x_{\mathrm{mean}})^{2} \, \sum (y - y_{\mathrm{mean}})^{2}}}, \tag{1}
\]

where 𝑥 and 𝑦 are normalized gene intensities for Cy3 and Cy5 dyes in the log scale.
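As a minimal sketch, the Pearson correlation coefficient between two vectors of log-scale intensities can be computed as follows. The function name and the example values are hypothetical and are not taken from the study data.

```python
import math

def pearson_cc(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    numerator = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    denominator = math.sqrt(
        sum((a - x_mean) ** 2 for a in x) * sum((b - y_mean) ** 2 for b in y)
    )
    if denominator == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return numerator / denominator

# Hypothetical log2 intensities for the two dyes over four genes.
cy3 = [8.1, 9.4, 10.2, 12.0]
cy5 = [8.0, 9.9, 10.1, 11.7]
cc = pearson_cc(cy3, cy5)
```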

2.6.2. Cluster Analysis

Unsupervised hierarchical cluster analysis was first applied to all of our genes without filtering to determine whether RNA from FFPE samples contained disease-specific information that could be retrieved. Cluster analysis was then applied to the genes identified as significant to reveal the disease-dependent expression patterns of these genes.
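The text does not state which distance measure or linkage rule was used for the clustering. The sketch below therefore assumes a correlation distance (1 minus the Pearson coefficient) and average linkage, purely for illustration. The sample names and expression values are hypothetical.

```python
import math
from itertools import combinations

def correlation_distance(x, y):
    """1 - Pearson correlation; assumes neither profile is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return 1.0 - num / den

def average_linkage(profiles):
    """Agglomerative clustering of samples.

    profiles maps a sample name to its expression vector. Returns the list of
    merges in order, each as (cluster_a, cluster_b, distance).
    """
    names = list(profiles)
    dist = {
        frozenset((a, b)): correlation_distance(profiles[a], profiles[b])
        for a, b in combinations(names, 2)
    }

    def cluster_distance(c1, c2):
        # Average of all pairwise sample distances between the two clusters.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda p: cluster_distance(*p))
        merges.append((c1, c2, cluster_distance(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges

# Hypothetical profiles: two disease samples and two controls over four genes.
profiles = {
    "CHI_a": [1.0, 2.0, 3.0, 4.0],
    "CHI_b": [1.1, 2.0, 3.2, 3.9],
    "ctrl_a": [4.0, 3.0, 2.0, 1.0],
    "ctrl_b": [3.8, 3.1, 2.2, 0.9],
}
merges = average_linkage(profiles)
```

In this toy example the two disease samples and the two controls each merge first, and the final merge joins the two groups. A dendrogram is a drawing of exactly this merge order and its distances.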

2.6.3. GEE Model

Because the three arrays for insulinoma (C no. 16–C no. 18) were from the same biopsy, we introduced the generalized estimating equation (GEE) model to identify genes differentially expressed between CHI patients (FFPE) and the insulinoma case. The GEE model accounts for the correlation in the three arrays with an exchangeable working correlation matrix. The false discovery rate was calculated to adjust for multiple testing.
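The procedure used to calculate the false discovery rate is not specified here. As an illustration of the multiple-testing step only (not of the GEE fit itself), the sketch below assumes the Benjamini-Hochberg adjustment. The function name and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical per-gene p-values from a differential expression test.
p_values = [0.01, 0.04, 0.03, 0.005]
fdr = benjamini_hochberg(p_values)
```

Each adjusted value estimates the false discovery rate incurred if that gene and all genes with smaller p-values are called significant.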

2.6.4. SAM Analysis

The popular SAM method [37], a penalized 𝑡-test, was used to compare the gene expression levels between CHI patients (FFPE) and the neonate controls. Different from the standard 𝑡-test, SAM reduces the effects of noisy genes and evaluates the statistical significance of each gene by a permutation test that accounts for multiple testing. We used 1000 replicates for the permutation test.
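The idea of a penalized 𝑡-statistic with a permutation test can be sketched for a single gene as follows. This is a simplified illustration and not the SAM software: SAM chooses the constant s0 from the data and pools permutation scores across all genes to estimate a false discovery rate, whereas this sketch uses a fixed, hypothetical s0 and returns a per-gene permutation p-value. The expression values are invented.

```python
import math
import random

def sam_statistic(group_a, group_b, s0):
    """Difference in means divided by (pooled standard error + s0).

    With s0 = 0 this is the ordinary two-sample t-statistic; a positive s0
    damps genes whose standard error is very small. Assumes s + s0 > 0.
    """
    na, nb = len(group_a), len(group_b)
    mean_a, mean_b = sum(group_a) / na, sum(group_b) / nb
    ss = sum((v - mean_a) ** 2 for v in group_a) + sum((v - mean_b) ** 2 for v in group_b)
    s = math.sqrt((1.0 / na + 1.0 / nb) * ss / (na + nb - 2))
    return (mean_a - mean_b) / (s + s0)

def permutation_p_value(group_a, group_b, s0, n_perm=1000, seed=0):
    """Two-sided p-value from random relabelings of the samples."""
    rng = random.Random(seed)
    observed = abs(sam_statistic(group_a, group_b, s0))
    pooled = list(group_a) + list(group_b)
    na = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs(sam_statistic(pooled[:na], pooled[na:], s0)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical log2 ratios of one gene in five disease and five control samples.
chi = [5.1, 5.3, 4.9, 5.2, 5.0]
ctrl = [1.0, 1.2, 0.9, 1.1, 1.3]
d = sam_statistic(chi, ctrl, s0=0.1)
p = permutation_p_value(chi, ctrl, s0=0.1, n_perm=1000)
```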

3. Results

We extracted total RNA from five FFPE CHI samples and five FFPE control samples (Table 1). In addition, RNA was harvested from one frozen insulinoma sample with two matched FFPE insulinoma samples obtained from the same pancreas.

Total RNA from FFPE specimens was purified from FFPE pancreas sections with acceptable yields and OD260/280 ratios (Table 2) despite extensive degradation (data not shown).

The mean purities (OD260/280) of total RNA for FFPE and FF were 1.84 and 1.77, respectively. The RNA yield ranged from 9 μg to 77 μg (mean 25 μg). The three frozen samples had intact 18S and 28S ribosomal bands, whereas the FFPE samples did not. RNA yields were not comparable between frozen and FFPE samples because the amount of frozen tissue used was not matched to that of the FFPE sections. Because of the obvious differences in RNA quality, dye incorporation rates were of interest. We could not detect differences in incorporation rates: we measured 39.7 and 43.7 dye mols/1000 nts, respectively. Ambion recommends incorporation rates of 30 to 60 dye mols/1000 nts. Thus, despite extensive degradation in FFPE samples, the dye incorporation rates were within the manufacturer’s recommendations. We also wished to estimate the reproducibility of the gene expression profiles of pancreatic tissue, particularly because of its high ribonuclease content. For this purpose, we included a pancreatic biopsy that had been routinely processed both as FF tissue and as FFPE tissue. The biopsy was from a child with insulinoma, which fulfilled our criteria. Microarray analysis of the FFPE biopsy was performed in duplicate (C no. 16, C no. 17). The resulting three samples (including the FF insulinoma sample (C no. 18)) were normalized.

In Figure 1(a), the correlation between the log2 signal intensities of Cy3 (reference RNA) and Cy5 (sample RNA) is presented. Keeping in mind that the reference and sample RNAs were not obtained from the same material, the plot does not detect differentially expressed genes between samples. The plots show sample uniformity, which is also supported by their respective correlation coefficients (CC) (CCAI: 0.85, CCAII: 0.81, and CCAIII: 0.82), indicating high comparability between samples despite differences in initial treatment and storage. In addition, to detect possible intensity-dependent patterns that are likely to be introduced in microarray experiments, the correlation between log2 mean (Cy5/Cy3) versus ratio (Cy5/Cy3) signal intensities is shown in Figure 1(b). The data were distributed horizontally around 0, indicating that the detected intensities are likely to be of biological origin and not introduced during the laboratory work. The CCreprod_FFPE for the two FFPE samples was 0.86, whereas CCreprod_FF_FFPE for the FFPE and FF tissues was 0.63 (for both C no. 16 and C no. 17 versus C no. 18), demonstrating that the two FFPE samples correlated better with each other than either did with the frozen sample.

We used a housekeeping gene, GAPDH, as another variable to evaluate reproducibility. GAPDH is arrayed 304 times in duplicate on each microarray slide and is therefore useful for assessing sample- and intraslide variation (Figure 2). Ratios of the expression values of Cy5 and Cy3 suggest a slight increase in the number of outliers in FFPE samples compared to FF samples.

Using all the 29214 genes without filtering, we first applied unsupervised hierarchical clustering analysis to the five FFPE controls and five FFPE CHI samples to see whether the samples contained information that could be used to retrieve their disease status. The dendrogram in Figure 3 showed that four of the five CHI samples formed one cluster. The controls formed two clusters, indicating the usefulness of FFPE samples in a disease study.

We also performed two different gene expression analyses on the normalized data. In the first analysis, we compared five FFPE controls with five FFPE CHI samples using the SAM approach; in the second analysis, we compared three insulinoma samples with five FFPE CHI samples using the GEE model because the three insulinoma samples were correlated.

The QQ-plot in Figure 4 shows the observed versus expected scores for comparing FFPE controls and FFPE CHI samples. The red and green spots that deviate from the straight line represent genes that were up- and downregulated, respectively. The pattern in Figure 4 shows that our FFPE samples can be used for microarray analysis to identify genes that are differentially expressed under disease conditions. Figure 5 is a heat map displaying the expression patterns for the 19 most highly significant genes. Unsupervised hierarchical clustering applied on the 19 genes clustered four of the five CHI samples to a branch distinct from the control samples (Figure 5).

The 19 most significantly up- or downregulated genes were further characterized for their functional or biological relevance to congenital hyperinsulinism (Table 3).

Hypothetical proteins and genes of unknown function are present in the list. Furthermore, five genes are cancer-related (PLCXD1, ERBB3, ARPC5L, CBLL1, and PLXDC2), and one is involved in inflammation and immunity (IL1A). Two genes are involved in metabolism: PFKL is involved in glycolysis, and GOT1 is involved in amino acid metabolism [38–45].

Figure 6 is a heat map of 54 significant differentially expressed genes (FDR < $1 \times 10^{-5}$, fold change > 4) in insulinoma and CHI samples that were identified using the GEE model. Unsupervised hierarchical clustering showed a clear separation of the insulinoma and CHI samples. A list of the 54 significantly up- or downregulated genes is presented in Table 4.
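The selection criteria can be expressed as a simple filter. The sketch assumes that the FDR cutoff is 1 × 10^−5 (the exponent is read as negative, since an FDR cannot exceed 1) and that the fold-change criterion is applied symmetrically to up- and downregulated genes on the log2 scale. Both points are assumptions, and the gene labels and values are hypothetical.

```python
import math

def select_genes(results, fdr_cutoff=1e-5, fold_cutoff=4.0):
    """Keep genes passing both the FDR and the fold-change thresholds.

    results is a list of (gene, log2_ratio, fdr) tuples. A fold change above
    fold_cutoff in either direction means |log2_ratio| > log2(fold_cutoff).
    """
    min_abs_log2 = math.log2(fold_cutoff)
    return [
        gene
        for gene, log2_ratio, fdr in results
        if fdr < fdr_cutoff and abs(log2_ratio) > min_abs_log2
    ]

# Hypothetical per-gene results: (label, log2 ratio, FDR).
results = [
    ("gene_a", 2.5, 1e-7),   # up more than 4-fold, significant
    ("gene_b", -3.0, 1e-6),  # down more than 4-fold, significant
    ("gene_c", 1.0, 1e-8),   # significant, but only 2-fold
    ("gene_d", 4.0, 1e-3),   # large change, but FDR too high
]
selected = select_genes(results)
```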

Again, hypothetical proteins, genes of unknown function, and cancer-related genes were detected, but most importantly, we detected three genes involved in glycolysis (PFKL, SUCLG2, PFKP) and two genes involved in calcium ion homeostasis (CALB2, NUCB2) [42, 46–49].

4. Discussion

We show that routine, archived FFPE tissue contains valuable biological and disease-specific information that can be assessed by GEM. The resultant data are amenable to statistical analysis and can give biological and disease-specific information. This finding applies to pancreatic tissue, whose RNA is considered more prone to degradation and fragmentation than RNA from other tissues due to high quantities of ribonucleases, which is of great interest.

This pilot study was conducted to determine whether FFPE pancreatic tissue is amenable to GEM. We included seven CHI pancreatic samples in the study with unknown genetic causes that are not likely to be of the same genetic origin. To date, clinicians have been unable to subdivide these unexplained individuals into groups according to clinical and/or biochemical features. Our group of CHI samples was therefore likely to be genetically different by chance alone. Despite this assumed genetic heterogeneity, in this study, we were able to show that ribonuclease-rich FFPE tissue contains biological information that can be retrieved. We were able to separate the samples according to disease, thereby demonstrating that the interesting disease-specific information was still present in the RNA of the samples. Also, we detected cancer-related genes, which could be related to the hypertrophy occurring in CHI pancreatic islets. The assumed genetic heterogeneity could explain the broad spectrum of cancer-related genes we detected. Additionally, we also identified significantly differentially expressed genes that are involved in glycolysis (PFKL, SUCLG2, and PFKP) and calcium-homeostasis (CALB2 and NUCB2). Glycolysis and calcium-homeostasis are two important steps in the regulation of insulin synthesis and secretion [46, 49, 50].

To our knowledge, thus far, there have not been any studies performed on FFPE tissue in which this tissue type was also chosen as the control with the overall purpose of narrowing down potential disease-relevant pathways. Similar studies have been performed on FF tissue with good results [51–53]. Studies based on FFPE tissue mostly address issues such as whether (a) matched FF and FFPE RNA in GEM produce the same GEM results, depending on the laboratory and data analysis methods used [11, 14, 15, 54–56] or (b) FFPE samples can be used for tumor classification [57]. Hoshida et al. (2008) used FFPE liver tissue to produce a signature model to predict survival in hepatocellular carcinomas [17].

Jacobson et al. (2011) presented a gene expression study that included tissue harvested from non-small-cell lung cancer. The study addressed two interesting issues: storage time and methodology; the authors concluded that storage time was not a limiting factor for RNA quality and its further applicability, whereas the methods/commercially manufactured kits that were used had varying outcomes. Another approach involved detecting the differences in glioblastoma expression profiles between FF and FFPE samples [55]. FFPE samples expressed a reduced number of significant transcripts compared to FF samples, but the overall expression pattern for the significantly differentially regulated transcripts was similar in the two groups. The use of RNA from FFPE tissue for the identification of differentially expressed genes is supported by Fedorowicz et al. (2009), whereas Kibriya et al. (2010) advise against it. In general, the use of FFPE tissue in GEM studies is regarded as useful, considering the lack of other options [11, 14].

Penland et al. (2007) determined four criteria for RNA quality and quantity to predict which samples would hybridize successfully. We addressed three of the four criteria: (a) a sample purity (OD260/280 ratio) >1.5 of the extracted total RNA, (b) a total RNA yield of >600 ng, and (c) a dye incorporation rate of >4.5 pmol/ng in labeled aRNA [57]. Our data fulfilled the first two criteria, whereas our dye incorporation rate was much lower than the recommended limit, with means of 0.11 and 0.21 pmol/ng for FF and FFPE samples, respectively. However, the incorporation rates for our samples fulfilled the recommendations of the manufacturer (Ambion). Another approach to determine the quality of the data produced is to compare correlation coefficients. Frank et al. (2007) calculated a CCreprod greater than 0.95 after normalization in FFPE samples, whereas directly comparing FF and FFPE replicates resulted in a mean CCreprod of 0.86, which is considered suboptimal by Frank et al. In contrast, our data yielded CCreprod values of 0.86 and 0.63 for the two FFPE insulinomas and the FF and FFPE insulinomas, respectively [15]. Despite this discrepancy, we achieved a reasonably good comparability between the FF and FFPE samples, with CCAI-AIII values from 0.81 to 0.85.

Considering this, our cluster analyses showed that our data were usable, even though they did not fulfill the criteria stated by Penland et al. (2007) and Frank et al. (2007) [15, 57]. Cluster analysis revealed that CHI samples were in a different subcluster separate from both normal controls and insulinoma samples. The significantly up- and downregulated genes were mostly cancer related but, most importantly, also included genes that are involved in the regulation of insulin synthesis and secretion. CHI is diagnosed histologically as diffuse beta cell abnormality, which supports the finding of cancer-related genes in our analysis. Because CHI is a congenital disorder in which excessive insulin secretion causes hypoglycemia, the finding of up- and downregulated genes involved in glycolysis (PFKL, SUCLG2, and PFKP) and calcium ion homeostasis (CALB2, NUCB2) clearly shows that disease-relevant biological information is retrievable even from pancreatic FFPE tissue [46, 49, 50]. We did not find significant differential insulin gene expression either when comparing insulinoma with CHI or when comparing controls with CHI. The reason for this is not clear but could be explained by the insulin-lowering medication prescribed to insulinoma and CHI patients before surgery. In these patients, the medical effect is not sufficient to treat hyperinsulinemia but might be sufficient to erase significant differential insulin gene expression in CHI versus controls and CHI versus insulinoma.

Despite the small sample size and the use of pancreatic tissue, we were able to show that FFPE tissue can be used as a sample source for GEMs. In light of our results and the growing knowledge regarding FFPE tissue, it seems worthwhile to study this type of tissue more closely. Future studies should address the issue of how to analyze data from FFPE samples that are not classified together with genetic certainty. Although the laboratory utility of FFPE has been examined from numerous angles, data analysis issues still remain.

5. Conclusions

This study shows that ribonuclease-rich pancreatic tissue that has been processed in formalin and paraffin still contains meaningful biological information that can be gathered through GEM studies. The results show that informative hybridizations of archived, FFPE-derived aaRNA produced expression data of sufficient quality to allow the identification of disease-specific profiles. The findings also indicate that CHI subtypes, distinguished by different disease-causing mutations, can be identified from FFPE tissue.

Conflict of Interests

The authors declare that they have no competing interests.

Acknowledgments

The authors thank Professor, M.D., Claus Hovendal, Department of Surgical Gastroenterology, for his cooperation as surgeon and Professor Claus Fenger, Department of Clinical Pathology, for providing control samples, both Odense University Hospital, Odense, Denmark. This work was financially supported by Diabetesforeningen, Denmark, Overlægerådets Legatudvalg, Odense University Hospital, Denmark, and Fonden for lægevidenskabelig forskning ved Fyns Amts Sygehusvæsen, Denmark.