Abstract

The mosquito midgut is an organ essential for nutrient acquisition as well as an interface that encounters various mosquito-borne pathogens. Metabolomic characterization reveals biochemical fingerprints that are generated by various cellular processes. Metabolite profiles of the mosquito midgut therefore provide an overview of the biochemical events in normal physiological states and of the dynamic responses to pathogen infection. In this study, the midgut metabolic profiles of Anopheles gambiae mosquitoes following feeding with sugar, human blood, mouse blood, and Plasmodium falciparum-infected human blood were examined. A mass spectrometry system coupled to liquid and gas chromatography produced a time series of metabolites in the midgut under discrete conditions (sugar feeding, and 24 h and 48 h after feeding on normal blood or P. falciparum-infected blood). Triplicates were included to ensure system validity. A total of 512 individual compounds were identified; 511 were assigned to 8 superpathways and 75 subpathways. The dataset can be used for further inquiry into the metabolic dynamics of sugar and blood digestion and of malaria parasite infection. The dataset is accessible at the repository Dryad.

1. Introduction

Malaria is caused by infection with mosquito-borne parasites of the genus Plasmodium. According to the WHO, there were 214 million new cases and an estimated 438,000 deaths worldwide in 2015, with most deaths associated with P. falciparum transmitted by An. gambiae species complex mosquitoes in Africa [1]. Understanding mosquito biology and vector competence for parasite transmission will facilitate the development of novel intervention strategies for malaria control.

The act of feeding on blood is essential for mosquito reproduction and for malaria parasite transmission. Blood feeding, digestion, and associated physiological responses also affect the microbial community in the gut [2]. As an additional genetic repertoire, the microbiome plays essential roles in various mosquito phenotypes, including reproduction and immunity. “Omics” technologies have been robustly employed in studying metagenomic interactions in various biological systems [3]. In particular, metabolomics approaches have been used to characterize amino acids, lipids, sugars, cofactors, and other small molecules that are precursors, intermediates, and by-products of reactions and pathways directed by the genetic blueprints in the metagenomic repertoire. In mosquito studies, transcriptomic responses to blood feeding and malaria parasite infection have been well characterized [4–9]. However, comprehensive metabolomic data are limited for medically important arthropods, except for a few reports [10–14]. Metabolomics allows for the identification and quantification of a range of metabolites in a system [15]. In this study, we used a nontargeted metabolomics approach [16, 17] to examine the midgut metabolites of sugar-fed, blood-fed, and P. falciparum-infected mosquitoes. This dataset can connect transcriptomic responses to biochemical fluxes related to blood feeding and digestion and to P. falciparum infection.

2. Methodology

The study design is outlined in Figure 1. The mosquito midgut samples were prepared in two laboratories. The midgut samples from the Xu lab (NMSU) were prepared from mosquitoes fed on sugar meals and meals of uninfected mouse blood; and the midgut samples from the Luckhart lab (UCD) were prepared from mosquitoes fed on sugar, on uninfected human blood, and on P. falciparum-infected human blood. Samples of uninfected human blood and P. falciparum-infected human blood were included as well (Table 1).

An. gambiae G3 mosquitoes were used for the study. All mosquito rearing and feeding protocols were approved and were in accordance with regulatory guidelines and standards set by the Institutional Animal Care and Use Committee of New Mexico State University and the University of California, Davis.

In the Xu lab, mosquitoes were reared under standard insectary conditions of 27-28°C and 80% humidity and a 12 h : 12 h light-dark cycle. Larvae were provided with a 1 : 1 mix of Brewer’s yeast and rodent chow (Purina Laboratory Rodent Diet 5001, LabDiet, St. Louis, MO). Adult mosquitoes were provided with 10% sucrose ad libitum. NIH Swiss outbred mice were used as a blood source for egg production.

In the Luckhart lab, An. gambiae G3 were reared and maintained at 27-28°C and 80% humidity and a 12 h : 12 h light-dark cycle. First instar larvae were provided with a 1 : 2 mix of Brewer’s yeast and Sera® Micron planktonic rearing food, while second to fourth larval instars were provided with Purina® Game Fish Chow pellets. Adult mosquitoes were provided with 10% sucrose ad libitum and allowed to feed on CD1 outbred mice for egg production.

In the Luckhart lab, the procedures for culture of P. falciparum strain NF54 MCB and mosquito infection were described previously [18]. The infection rate for each replicate mosquito cohort was determined by counting P. falciparum oocysts in the midgut of a sample of mosquitoes at day 10 after infection. The protocols involving the culture and handling of P. falciparum for mosquito infection were approved and were in accordance with regulatory guidelines and standards set by the Biological Safety Administrative Advisory Committee of the University of California, Davis.

In the Xu lab, midgut samples were collected from adult mosquitoes fed on 10% sucrose and at 24 h and 48 h after feeding on uninfected mouse blood. In the Luckhart lab, midgut samples were collected from mosquitoes fed on 10% sucrose, at 24 h and 48 h after feeding on a mixture of uninfected human red blood cells (RBCs) and heat-inactivated human serum (1 : 1), and at 24 h and 48 h after feeding on P. falciparum-infected blood (1 : 1 human RBCs : serum). Each sample contained 30 midguts, triplicates were collected from three mosquito cohorts at each collection point, and a total of 30 samples were collected for metabolic profiling at Metabolon, Inc. (Durham, NC). Table 1 summarizes the sample collections.
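The sample count can be checked with a short tally. The sketch below is one reading of the design, not a copy of Table 1: it assumes that the two blood samples described in the study design were also profiled in triplicate, which is what makes the total come to 30. The group labels are descriptive placeholders rather than the identifiers used in the dataset.

```python
# Replicates per group: one sample from each of three mosquito cohorts.
REPLICATES = 3
MIDGUTS_PER_SAMPLE = 30

# Descriptive placeholder labels (not the dataset's own identifiers).
midgut_groups = [
    "Xu lab: sugar-fed",
    "Xu lab: mouse blood, 24 h",
    "Xu lab: mouse blood, 48 h",
    "Luckhart lab: sugar-fed",
    "Luckhart lab: uninfected human blood, 24 h",
    "Luckhart lab: uninfected human blood, 48 h",
    "Luckhart lab: infected human blood, 24 h",
    "Luckhart lab: infected human blood, 48 h",
]

# Assumption: the blood meals themselves were also profiled in triplicate.
blood_groups = [
    "uninfected human blood",
    "P. falciparum-infected human blood",
]

midgut_samples = len(midgut_groups) * REPLICATES
blood_samples = len(blood_groups) * REPLICATES
total_samples = midgut_samples + blood_samples
total_midguts = midgut_samples * MIDGUTS_PER_SAMPLE

print(midgut_samples, blood_samples, total_samples, total_midguts)
```

Under this reading, the total matches the 30 abundance columns of the dataset.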

The MicroLab STAR system (Hamilton Company) was used for sample preparation. For quality control purposes, recovery standards were added before the extraction process. Protein from the samples was removed using both organic and aqueous extraction methods. The extract from each sample was split into two fractions, one of which was used for gas chromatography (GC) and the other was used for liquid chromatography (LC).

For LC/MS analysis, a Waters ACQUITY UPLC and a Thermo-Finnigan LTQ-FT mass spectrometer were used. The sample extract was divided into two identical aliquots. These aliquots were dried and were then reconstituted in acidic or basic LC-compatible solvents. Each of these solvent types contained 11 or more injection standards that had fixed concentrations. The first aliquot was analysed using acidic positive ion optimized conditions. The second aliquot was analysed using basic negative ion optimized conditions. These aliquots were independently injected using separate dedicated columns. If the extract was reconstituted in acidic conditions, a gradient elution using water and methanol containing 0.1% formic acid in each liquid was used. If the extract was reconstituted in basic conditions, a gradient elution containing 6.5 mM ammonium bicarbonate and water/methanol was used. During data collection, the MS analysis alternated between MS and data-dependent MS2 scans using dynamic exclusion.

A Thermo-Finnigan Trace DSQ fast-scanning single-quadrupole mass spectrometer using electron impact ionization was used for GC/MS. Samples were redried under vacuum for at least 24 hours and were then derivatized under dried nitrogen using bis-trimethyl-silyl-trifluoroacetamide (BSTFA). The column used for GC/MS analysis was 5% phenyl, and the temperature ramp ran from 40° to 300°C over a 16-minute period.

Raw MS files were stored in a database. The data were then examined and quality control limits were applied. Identified peaks were stored in a separate database. Compounds were identified by comparison to analytical standards or recurrent unknown entities, using the compounds registered in the Laboratory Information Management System (LIMS) as the reference set. Entities classified as unknown could still be identified by their recurrent nature (using both mass spectral and chromatographic characteristics). If a compound was found in both the extraction blank and the sample, it was excluded unless its signal intensity in the sample was at least three times the intensity found in the blank.
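The blank-exclusion rule can be written as a simple predicate. The following is a minimal sketch of the rule as stated, not the vendor's implementation; the function name, the use of None for "not detected in the blank," and the example intensities are all hypothetical.

```python
def keep_compound(sample_intensity, blank_intensity, min_ratio=3.0):
    """Return True if a compound passes the extraction-blank filter."""
    # Not detected in the blank: there is nothing to exclude against.
    if blank_intensity is None or blank_intensity == 0:
        return True
    # Detected in both: keep only if the sample signal is at least
    # min_ratio times the blank signal.
    return sample_intensity >= min_ratio * blank_intensity


# Hypothetical (sample, blank) intensities in arbitrary units.
peaks = {
    "compound_a": (9000.0, 1000.0),  # 9x the blank: kept
    "compound_b": (2500.0, 1000.0),  # 2.5x the blank: excluded
    "compound_c": (4000.0, None),    # absent from the blank: kept
}

kept = [name for name, (s, b) in peaks.items() if keep_compound(s, b)]
print(kept)
```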

A total of 512 individual compounds were identified; 511 were assigned to 8 superpathways and 75 subpathways (Table 2). The dataset is presented as biochemical abundance in each sample (Dataset Item 1). The MetaboAnalyst tool suite [19] can be employed for further analysis. Some compounds were detected in some samples but not in others, which left missing values for those compounds in the samples where they were not detected. Bayesian PCA (BPCA) is designed to handle datasets with missing values [20], and we used it here to present the patterns of metabolites in the midgut samples under different conditions. The data were then log-transformed and scaled by Pareto scaling [21]. Figure 2 presents the PCA plot, which summarizes the variance within the dataset and gives an overview of how the samples group according to the conditions under which they were collected. In our study, the blood samples, uninfected human blood (NB) and P. falciparum-infected blood (IB), cluster distinctly from the mosquito midgut samples. In addition, midgut samples taken 24 h after uninfected (NBM 24 h) or infected (IBM 24 h) blood feeding are close to each other, while samples taken 48 h after blood feeding (NBM 48 h, IBM 48 h) cluster with the sugar-fed midgut samples (SM-L, SM-X). The latter is likely due to the completion of blood digestion and the return of the midgut environment to sugar-fed conditions. Midgut samples from mosquitoes fed on uninfected mouse blood are separated from the human blood-fed samples, which is likely attributable to differences between human and mouse blood. Overall, the PCA output indicates that variation is largely attributable to the biological conditions, with limited variation introduced by nonbiological factors. Figure 3 is a graphical representation of biochemical abundance (rescaled data).
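For readers who wish to reproduce the log transformation and Pareto scaling on a single metabolite, the following is a minimal sketch. Pareto scaling mean-centers a variable and divides it by the square root of its standard deviation. The base-10 logarithm, the use of the sample standard deviation, and the function name are assumptions made here for illustration; the sketch also assumes complete, strictly positive values and does not perform the BPCA step.

```python
import math
import statistics


def log_pareto_scale(values):
    """Log-transform one metabolite's abundances, then Pareto-scale them."""
    # Assumption: base-10 logarithm; values must be strictly positive.
    logged = [math.log10(v) for v in values]
    mean = statistics.mean(logged)
    sd = statistics.stdev(logged)  # sample standard deviation
    if sd == 0:
        # A constant variable carries no variance; return centered zeros.
        return [0.0 for _ in logged]
    # Pareto scaling: mean-center, then divide by sqrt(standard deviation).
    return [(x - mean) / math.sqrt(sd) for x in logged]


# Hypothetical abundances for one metabolite across three samples.
abundances = [10.0, 100.0, 1000.0]
scaled = log_pareto_scale(abundances)
print(scaled)
```

Compared with autoscaling, which divides by the standard deviation itself, Pareto scaling reduces the dominance of high-variance metabolites while keeping more of the original data structure.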

3. Dataset Description

The dataset associated with this Dataset Paper consists of 3 items which are described as follows.

Dataset Item 1 (Table). The original scale of biochemicals, pathway assignment, and abundance. The column Superpathway shows the superpathway the biochemical is associated with and the column Subpathway shows the subpathway the biochemical is associated with. The column Platform presents the platform used for the identification of the biochemical (GC/MS, LC/MS positive, or LC/MS negative); the column Retention Index, the retention index (RI) associated with the biochemical; the column Mass, the mass of species; the column CAS, the Chemical Abstracts Service identifier of the biochemical; the column PubChem, the PubChem identifier of the biochemical; the column KEGG, the KEGG identifier of the biochemical; and the column Group HMDB, the Human Metabolome Database (HMDB) identifier of the biochemical. Columns 12–41 present the biochemical abundance associated with samples. Values are normalized in terms of raw area counts.

  • Column 1: Biochemical Name
  • Column 2: Superpathway
  • Column 3: Subpathway
  •     ⋮
  • Column 39: NBM-X, 48H1
  • Column 40: NBM-X, 48H2t
  • Column 41: NBM-X, 48H3t
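The column layout described for Dataset Item 1 can be used to separate annotation fields from abundance values by position. The sketch below assumes that columns 1–11 hold annotations and that columns 12–41 hold the 30 sample abundances; it operates on a plain list standing in for one table row, and the function name is hypothetical.

```python
N_ANNOTATION_COLUMNS = 11  # columns 1-11: name, pathways, platform, identifiers
N_SAMPLE_COLUMNS = 30      # columns 12-41: one abundance value per sample


def split_row(row):
    """Split one table row into (annotations, abundances) by position."""
    expected = N_ANNOTATION_COLUMNS + N_SAMPLE_COLUMNS
    if len(row) != expected:
        raise ValueError(f"expected {expected} columns, got {len(row)}")
    return row[:N_ANNOTATION_COLUMNS], row[N_ANNOTATION_COLUMNS:]


# Stand-in row whose values are simply the 1-based column numbers.
row = list(range(1, 42))
annotations, abundances = split_row(row)
print(annotations[0], abundances[0], abundances[-1], len(abundances))
```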

Dataset Item 2 (Table). The rescaled biochemicals, pathway assignment, and abundance. The column Superpathway shows the superpathway the biochemical is associated with and the column Subpathway shows the subpathway the biochemical is associated with. The column Platform presents the platform used for the identification of the biochemical (GC/MS, LC/MS positive, or LC/MS negative); the column Retention Index, the retention index (RI) associated with the biochemical; the column Mass, the mass of species; the column CAS, the Chemical Abstracts Service identifier of the biochemical; the column PubChem, the PubChem identifier of the biochemical; the column KEGG, the KEGG identifier of the biochemical; and the column Group HMDB, the Human Metabolome Database (HMDB) identifier of the biochemical. Columns 12–41 present the biochemical abundance associated with samples as denoted in Table 1. Each biochemical in original scale is rescaled to have median equal to 1. Then, missing values are imputed with the minimum.

  • Column 1: Biochemical Name
  • Column 2: Superpathway
  • Column 3: Subpathway
  •     ⋮
  • Column 39: NBM-X, 48H1
  • Column 40: NBM-X, 48H2t
  • Column 41: NBM-X, 48H3t
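The rescaling used for Dataset Item 2 can be illustrated for a single biochemical. This is a minimal sketch under stated assumptions: missing values are represented as None, the median is taken over the observed values only, and the imputed value is the minimum of that biochemical's rescaled observed values. The function name and the example numbers are hypothetical.

```python
import statistics


def rescale_and_impute(values):
    """Rescale one biochemical to median 1, then fill gaps with the minimum."""
    observed = [v for v in values if v is not None]
    if not observed:
        # Nothing was detected in any sample; leave the row untouched.
        return list(values)
    # Assumption: abundances are positive, so the median is nonzero.
    median = statistics.median(observed)
    rescaled = [None if v is None else v / median for v in values]
    # Impute missing entries with the smallest rescaled observed value.
    floor = min(v for v in rescaled if v is not None)
    return [floor if v is None else v for v in rescaled]


# Hypothetical raw area counts across four samples, one not detected.
row = [200.0, None, 400.0, 800.0]
print(rescale_and_impute(row))  # [0.5, 0.5, 1.0, 2.0]
```

Because imputation follows rescaling, the median of a completed row is not necessarily 1 once the missing entries have been filled.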

Dataset Item 3 (Table). The amount of protein found in each sample.

  • Column 1: Sample Identifier
  • Column 2: Bradford Protein Concentration

4. Concluding Remarks

In the mosquito midgut ecosystem, metabolites can be derived from the mosquito host, from microbial organisms in the gut microbiota, and from ingested pathogens. This dataset presents a survey of the metabolomic landscapes in the An. gambiae midgut in response to different diets and to human malaria parasite infection. The dataset provides a metabolic reference that will help connect transcriptome and proteome data related to mosquito physiology and to interactions with symbiotic microbes and P. falciparum in the midgut environment. It also allows users to compare the metabolic dynamics under these conditions in the context of their own interests.

Dataset Availability

The dataset associated with this dataset paper is dedicated to the public domain using the CC0 waiver and is available at https://doi.org/10.1155/2017/8091749/dataset. In addition, this dataset is accessible at the repository Dryad.

Disclosure

The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health and the National Science Foundation. The current affiliation of Phanidhar Kukutla is as follows: Department of Anaesthesia, Critical Care and Pain Medicine, Massachusetts General Hospital, 55 Fruit Street, Boston, MA 02114, USA.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Authors’ Contributions

Jiannong Xu and Shirley Luckhart designed the study. Phanidhar Kukutla collected midgut samples in the Xu laboratory, and Elizabeth K. K. Glennon and Bo Wang collected midgut samples in the Luckhart laboratory. Cody J. Champion, Jiannong Xu, and Shirley Luckhart analysed the data and Cody J. Champion prepared data deposit. Cody J. Champion, Jiannong Xu, Shirley Luckhart, and Elizabeth K. K. Glennon prepared the manuscript. All authors read and approved the final version of the manuscript.

Acknowledgments

The research reported here was supported by the National Institute of Allergy and Infectious Diseases of the National Institutes of Health under Awards nos. R01 AI080799 and R01 AI073745 to Shirley Luckhart and SC2GM092789 and SC1AI112786 to Jiannong Xu. Cody J. Champion was supported by the National Science Foundation Graduate Research Fellowship under Grant no. 1144468.

Dataset Files

  • 8091749.item.1.xlsx


  • 8091749.item.2.xlsx
