Abstract

Haplaxius crudus Van Duzee is a pest of various economically important palms due to its ability to transmit lethal yellowing, a fatal phytoplasma infection. It is also the putative vector of lethal bronzing in Florida, another lethal phytoplasma disease causing significant economic losses. To date, no mitochondrial genomes have been sequenced for species in the family Cixiidae. In this study, the complete mitochondrial genome of H. crudus was sequenced, assembled, and annotated from PacBio Sequel II long reads using the University of Florida’s HiPerGator. The mitogenome of H. crudus is 15,848 bp long and encodes 37 mitochondrial genes (13 protein-coding genes (PCGs), 22 tRNAs, and 2 rRNAs) in addition to a putative noncoding internal control region. The nucleotide composition of the H. crudus mitogenome is asymmetric with a bias toward A/T (44.8% A, 13.4% C, 8.5% G, and 33.3% T). The PCGs possess the standard invertebrate mitochondrial start codons with few exceptions, and the gene content and order of the H. crudus mitogenome are highly similar to those of most completely sequenced insect mitochondrial genomes. Phylogenetic analysis based on the entire mitogenome shows H. crudus resolving closely to Delphacidae, the accepted sister taxon of Cixiidae. These data provide a useful resource for developing novel primer sets that could aid in phylogenetic or population genetic studies. As more full mitogenomes become available for other planthopper species, more robust phylogenies can be constructed, giving more accurate perspectives on the evolutionary relationships within this fascinating and economically important group of insects.

1. Introduction

The invention of high-throughput genome sequencing technologies has greatly altered the understanding of the biology, diversity, and relationships of insect vectors and their associated pathogens [1]. Mitochondrial loci in particular are some of the most commonly used molecular markers for phylogenetic studies [2] and for assessing population dynamics [3, 4] and are therefore an important component of next-generation sequencing. The insect mitochondrial genome is compact, generally spanning 14 to 20 kilobases (kb), with a set of encoded genes that is extremely conserved and seen in all suborders of Hemiptera [5–7] as well as in other orders, such as Coleoptera [8]. The genes typically encoded in the animal mitochondrial genome comprise 13 protein-coding genes (PCGs), 2 ribosomal RNAs (rRNAs), and 22 transfer RNAs (tRNAs), together with a noncoding control region [9]. In addition, the specific gene order within mitochondrial DNA can vary among insects and is a key feature that can provide important evidence for establishing evolutionary relationships among taxa at both high and low taxonomic levels [4, 10, 11].

To date, 5,178 complete or nearly complete insect mitochondrial genome sequences are available on GenBank (as of April 2020), and this number continues to grow as sequencing technologies become more cost effective and time effective. However, relatively few complete mitochondrial genomes of Hemipteran insects, specifically Auchenorrhyncha, have been sequenced or published in GenBank (https://www.ncbi.nlm.nih.gov/).

Hemipterans are one of the largest groups of hemimetabolous insects [12] and include three suborders: the Auchenorrhyncha, Sternorrhyncha, and Heteroptera [5, 13]. The Auchenorrhyncha and Sternorrhyncha are well suited to transmit plant pathogens because of the morphology of their piercing/sucking mouthparts and their feeding behavior [14]. Notable insect pests within Hemiptera include planthoppers, leafhoppers, aphids, and whiteflies. The planthopper family Cixiidae comprises more than 2,000 species and 150 genera [15].

The American palm cixiid, Haplaxius crudus (Figure 1), is among the most important auchenorrhynchan insect pests of palms, ranging from the subtropical United States to the tropical regions of Central and South America [16, 17]. H. crudus is a confirmed vector of the palm disease termed lethal yellowing (LY) caused by the 16SrIV-A phytoplasma on the American continent [18] and is the putative vector of lethal bronzing (LB) [19], a devastating palm disease caused by the lethal bronzing phytoplasma (16SrIV-D subgroup). Phytoplasmas are related to Gram-positive bacteria and are obligate intracellular parasites of plants that are transmitted by phloem-feeding Hemipteran insects, including leafhoppers, planthoppers, and psyllids [20, 21]. The LY and LB phytoplasmas ultimately result in death of infected palms and have resulted in the losses of millions of coconut palms (Cocos nucifera) throughout the Caribbean basin [22] as well as other palm species. Furthermore, H. crudus is widespread and abundant in the Southeastern United States.

Herein, the complete (sequenced, assembled, and annotated) mitochondrial genome of H. crudus is presented. These data are meant to serve as a resource for the development of novel markers to use in phylogenetic and population studies.

2. Materials and Methods

2.1. DNA Extraction and Sequencing

High-quality, high molecular weight genomic DNA was prepared from a mature wild-type H. crudus female collected in Davie, Florida. The specimen was morphologically identified, preserved in 100% ethanol, and stored at −80°C in the Insect Vector Ecology Laboratory, Fort Lauderdale Research and Education Center (FLREC), University of Florida. The whole-body insect tissue was homogenized with a sterile pestle in 2 ml of liquid nitrogen. After homogenization, total genomic DNA was extracted from the frozen adult using the Qiagen Gentra Puregene® Genomic DNA kit supplemented by the 10X Genomics® whole genome extraction protocol. Purity and concentration were measured using the NanoDrop™ Microvolume Spectrophotometer and Qubit® Fluorometric Quantification technologies (Thermo Fisher Scientific, https://www.thermofisher.com/us/en/home.html). A purity ratio of 1.8 and a genomic DNA concentration >30 ng/μL were required for whole genome sequencing. Once cleared for purity and concentration, the sample was sent directly for sequencing to the University of Florida Interdisciplinary Center for Biotechnology Research (UF-ICBR). The sample was sequenced using PacBio Sequel II SMRTbell® long read sequencing technology with 40x coverage (Pacific Biosciences, https://www.pacb.com/).

2.2. Genome Assembly and Annotation

Sequence assembly, alignment, and nucleotide composition calculations were conducted on the University of Florida’s supercomputer, HiPerGator 3.0. The HiPerGator cluster offers processors and nodes for memory-intensive computations, which were run through basic bash command line operations. De novo assembly of sequence reads was performed with CANU v2.0, an assembler specialized for PacBio sequences that operates in three phases: correction, trimming, and assembly [23]. Protein-coding genes were identified using the NCBI’s ORF finder for invertebrate mitochondrial genes (NCBI, https://www.ncbi.nlm.nih.gov/orffinder/), tRNAs were identified using ARWEN v1.2 [24], and rRNAs were identified using MITOS WebServer [25]. The putative control region was assumed to be present between the rrnL (rRNA-12S) and the Ile-Gln-Met tRNA cluster. Nucleotide diversity and composition analyses were also performed with HiPerGator. Postannotation alignment was performed using the MUSCLE algorithm in MEGA X [26].
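The ORF identification step can be illustrated with a minimal sketch. It is neither NCBI ORFfinder nor the pipeline used in this study; it only shows the principle of scanning all six reading frames under the invertebrate mitochondrial genetic code (NCBI translation table 5). The function names and the `min_codons` threshold are hypothetical. The sequence is treated as linear, so an ORF spanning the origin of the circular molecule, or a gene ending in an incomplete stop codon (T or TA), would be missed.

```python
# Invertebrate mitochondrial code (NCBI translation table 5):
# start codons are ATN, GTG, and TTG; only TAA and TAG are stops (TGA encodes Trp).
START_CODONS = {"ATG", "ATA", "ATT", "ATC", "GTG", "TTG"}
STOP_CODONS = {"TAA", "TAG"}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 30):
    """Scan six frames; return (strand, frame, start, end, n_codons) tuples.

    Coordinates are 0-based, end-exclusive, and relative to the scanned
    strand. n_codons counts the start codon but not the stop codon.
    Only ORFs closed by a complete stop codon are reported.
    """
    orfs = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # index of the first start codon of the open ORF
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon in START_CODONS:
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    n_codons = (i - start) // 3
                    if n_codons >= min_codons:
                        orfs.append((strand, frame, start, i + 3, n_codons))
                    start = None
    return orfs


if __name__ == "__main__":
    toy = "CC" + "ATG" + "GCA" * 5 + "TAA" + "G"  # toy sequence, not real data
    print(find_orfs(toy, min_codons=3))
```

Candidate ORFs found this way would still need to be compared against known mitochondrial proteins before being accepted as gene annotations.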

2.3. Phylogenetic Analysis

Phylogenetic analyses were performed using nine insect mitochondrial nucleotide sequences. In addition to H. crudus, four non-Hemipteran insects and four Hemipterans were selected for analysis (Table 1). The mitochondrial genome sequences and annotations of these eight species were downloaded from GenBank in Fasta file format and aligned with the H. crudus mitochondrial Fasta file using the MUSCLE algorithm in MEGA X with default settings [26]. Once aligned, the evolutionary history was inferred using the maximum likelihood (ML) method with the Tamura–Nei model [27]. The tree with the highest log likelihood (−140533.20) is shown, together with the percentage of trees in which the associated taxa clustered together. Initial trees for the heuristic search were obtained by applying the neighbor-joining and BioNJ algorithms to a matrix of pairwise distances estimated using the maximum composite likelihood (MCL) approach and then selecting the topology with the superior log likelihood value. Branch lengths are measured in the number of substitutions per site, with a total of 10 nucleotide sequences. The codon positions included were 1st, 2nd, 3rd, and noncoding. There were a total of 21,821 positions in the final dataset.

3. Results

3.1. Genome Size, Organization, and Structure

The assembled contig demonstrated that the mitochondrial genome of H. crudus is a circular DNA molecule 15,848 bp in length. The mitochondrial genome includes 37 genes: 13 PCGs, 22 tRNA genes, and 2 rRNA genes (Figure 2, Table 2). The new sequence was submitted to GenBank under the accession number MW057863. The major strand (α strand) carries most of the genes (8 PCGs and 14 tRNAs), while the remaining genes are encoded on the minor strand (β strand). The AT-rich region of the mitogenome spans positions 14,720 to 15,848 and is located between rrnL and tRNA-Ile (Figure 2). The nucleotide composition of the H. crudus mitochondrial DNA is A = 7,097 (44.8%), T = 5,279 (33.3%), G = 1,341 (8.5%), and C = 2,128 (13.4%) of 15,845 nucleotides present. The genome organization generally follows the standard order of the ancestral insect mitochondrial genome plan (Figure 3).
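The composition figures above can be recomputed from the base counts. The following sketch is an illustration and not the script run on HiPerGator. The function names are hypothetical, and the skew definitions, (A − T)/(A + T) and (G − C)/(G + C), are the ones commonly used for mitochondrial genomes and are assumed here.

```python
from collections import Counter


def composition_from_counts(a: int, c: int, g: int, t: int) -> dict:
    """Summarize base composition from raw A, C, G, T counts."""
    total = a + c + g + t
    if total == 0:
        raise ValueError("no A, C, G, or T bases to summarize")
    return {
        "total": total,
        "percent": {
            "A": 100 * a / total,
            "C": 100 * c / total,
            "G": 100 * g / total,
            "T": 100 * t / total,
        },
        "at_content": 100 * (a + t) / total,
        # Skews are undefined when the denominator is zero.
        "at_skew": (a - t) / (a + t) if (a + t) else None,
        "gc_skew": (g - c) / (g + c) if (g + c) else None,
    }


def base_composition(seq: str) -> dict:
    """Count A, C, G, T in a sequence (other symbols are ignored)."""
    counts = Counter(seq.upper())
    return composition_from_counts(
        counts["A"], counts["C"], counts["G"], counts["T"]
    )


if __name__ == "__main__":
    # Base counts as reported in the text for the H. crudus mitogenome.
    stats = composition_from_counts(a=7097, c=2128, g=1341, t=5279)
    print(stats["total"])
    print({base: round(pct, 1) for base, pct in stats["percent"].items()})
    print(round(stats["at_content"], 1))
    print(round(stats["at_skew"], 3), round(stats["gc_skew"], 3))
```

With the reported counts, the A + T content is approximately 78.1%, the AT skew is approximately 0.147, and the GC skew is approximately −0.227.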

3.2. Protein-Coding Genes

The mitochondrial DNA of H. crudus contains the full set of PCGs usually present in animal mitochondrial DNA. The PCGs are arranged along the genome according to the standard order of insects (Figure 3). The putative start codons of the PCGs are those previously known for animal mitochondrial DNA, i.e., ATG, ATT, ATA, ATC, GTG, TTG, and GTT (Table 2) [28]. The common start codon ATG could be assigned to most of the protein-coding sequences, with few exceptions. Two pairs of protein-coding genes, ATP6/ATP8 and ND4/ND4L, overlap and are translated from the same cistronic mRNAs. In addition to the control region, we observed 18 noncoding regions ranging from 1 to 1,210 bp (Figure 2, Table 2). The noncoding control region in the H. crudus mitochondrial genome extends 1,129 bp and is located between the final tRNA (tRNA-Val) and the Ile-Gln-Met tRNA cluster. The H. crudus mitochondrial genome sequence contains many unique TA-dinucleotide and TTA-trinucleotide repeats that resemble microsatellite sequences. The repeating 21 nt motif is AAAATGTCAAAAATTTGGACT31.
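Repeats such as the 21 nt motif can be located with a simple pattern search. The sketch below is an illustration and not the method used in this study. The function name and the `min_copies` parameter are hypothetical, and the example string is a toy sequence rather than the H. crudus mitogenome. The search treats the sequence as linear, so a repeat array spanning the origin of the circular molecule would be split.

```python
import re

# The 21 nt repeat motif reported in the text.
MOTIF = "AAAATGTCAAAAATTTGGACT"


def tandem_runs(seq: str, motif: str = MOTIF, min_copies: int = 2):
    """Find runs of back-to-back exact copies of a motif.

    Returns (start, end, copies) tuples with 0-based, end-exclusive
    coordinates. Only perfect, uninterrupted copies are counted.
    """
    pattern = re.compile(f"(?:{re.escape(motif)}){{{min_copies},}}")
    return [
        (m.start(), m.end(), (m.end() - m.start()) // len(motif))
        for m in pattern.finditer(seq.upper())
    ]


if __name__ == "__main__":
    toy = "GG" + MOTIF * 3 + "CC" + MOTIF  # toy sequence, not real data
    print(tandem_runs(toy))                 # runs of two or more copies
    print(tandem_runs("CTATATAG", "TA"))    # works for short motifs as well
```

The same function can be applied to the TA-dinucleotide and TTA-trinucleotide repeats by passing a different `motif`.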

3.3. Phylogenetic Analysis

The phylogenetic analysis shows that Haplaxius crudus resolved with Nilaparvata lugens (Delphacidae) with strong bootstrap support (100) (Figure 4). There was also strong support (100) for Aphis aurantii (aphids) resolving near both H. crudus and N. lugens. In general, there is strong support (100) for each clade that comprises an order of insects: the Hemiptera clade that includes H. crudus, N. lugens, A. aurantii, Dolycoris baccarum, and Magicicada tredecassini; the Coleoptera clade that includes Sitophilus oryzae and Chauliognathus opacus; the Odonata clade that includes Nannophya pygmaea; and the Diptera clade that includes Drosophila melanogaster (Figure 4). Based on the pairwise comparison, N. lugens also shows the highest level of sequence similarity to H. crudus among the analyzed taxa, differing by 28.3% (Table 3). All other taxa differ from H. crudus by at least 30.7% (Table 3).
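A pairwise percentage difference of this kind can be computed from two aligned sequences. The sketch below calculates the uncorrected p-distance, the proportion of compared sites that differ. This is an assumption made for illustration: the text does not state which distance measure underlies Table 3, and the function name and gap handling are hypothetical.

```python
def p_distance(seq_a: str, seq_b: str, skip: str = "-N?") -> float:
    """Proportion of differing sites between two aligned sequences.

    Alignment columns in which either sequence has a gap or an
    ambiguous symbol (any character in `skip`) are excluded.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x in skip or y in skip:
            continue
        compared += 1
        if x != y:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


if __name__ == "__main__":
    # Toy aligned sequences, not real data: 2 of 10 sites differ.
    print(p_distance("ACGTACGTAC", "ACGTTCGTAA"))
```

Under this measure, a difference of 28.3% corresponds to a p-distance of 0.283.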

4. Discussion

We assembled the complete mitogenome of Haplaxius crudus using PacBio Sequel II SMRTbell® sequencing technology. The H. crudus mitochondrial DNA demonstrated the typical Hemipteran gene order [10, 11], which follows the ancestral gene order of insects [10]. Gene rearrangement is not uncommon in Hemipteran families such as flat bugs (Aradidae), aphids (Aphididae), and whiteflies (Aleyrodidae) [6, 7]. Furthermore, H. crudus possesses three additional mixed product tRNAs: one within the coding region of NAD2, one within rRNA-16S, and one immediately upstream of the AT-rich control region. While the H. crudus mitochondrial genome is similar in content to those of the taxonomically related N. lugens and Laodelphax striatellus, it is unique in the presence of these additional tRNAs. Cixiidae is accepted as the sister group to Delphacidae based on morphology and sequence homology. Both N. lugens and L. striatellus have unique mitogenomes relative to other insects: both species have major inversions that place NAD6 upstream of the proline and threonine tRNAs (rather than downstream), and both have tRNA-Pro and tRNA-Thr inverted so that the order is P-T (not T-P as observed in other insects). Furthermore, N. lugens possesses two extra cysteine tRNAs downstream of NAD2, resulting in the tRNA order Cys-Cys-Cys-Trp-Tyr, whereas the standard order in insects is Trp-Cys-Tyr [5]. The presence of additional tRNAs and deviations in gene order are thus not surprising in H. crudus. The length of the H. crudus mitochondrial genome falls within the range observed for most mitochondrial genomes of insects and other arthropods [29].

The nucleotide composition of the H. crudus mitochondrial genome is AT biased, which is generally observed in insect mitochondrial genomes [5]. Hemipteran insects from the suborders Fulgoromorpha, Coleorrhyncha, and Heteroptera are typically AC skewed [30]. The control region in the H. crudus mitochondrial genome corresponds to the control region of vertebrate mitogenomes and contains the origin sites for transcription and replication [31]. This region matches the transcriptional and replicational control regions typical of insect mitochondrial genomes, also referred to as the AT-rich region. In addition to the control region, 18 noncoding regions ranging from 1 to 1,210 bp were observed in the H. crudus mitochondrial genome. This is not uncommon among insects, as the Adoxophyes mitochondrial genome has 26 noncoding intergenic regions [32]. In this insect, the control region is likely to be less variable than the coding regions of the mitochondrial genome because of its high AT content, which limits its usefulness as a diagnostic marker [33]. The overlaps observed with tRNA-Ile/tRNA-Gln are consistent with findings by Lee et al. [32] and Lessinger et al. [34], and the overlaps documented in coding regions (ATP8/ATP6 and ND4/ND4L) are consistent with overlaps in other insect species [15, 32, 34, 35]. The unique AAAATGTCAAAAATTTGGACT repeats within the H. crudus mitochondrial genome sequence resemble microsatellite sequences and have the potential to be used in population genetic studies. Phylogenetic analyses in MEGA X demonstrated that H. crudus forms a monophyletic group with Hemipterans in the families Delphacidae and Aphididae. Our ML analysis using the Tamura–Nei model supports the putative placement of H. crudus within the suborder Auchenorrhyncha. The ingroup taxa A. aurantii (MN397939.1) and N. lugens (NC_021748.1) are monophyletic with H. crudus (MT385107). Mitochondrial genomes may provide a better approach to resolving intractable phylogenetic relationships than single gene analyses [3, 33, 36].

The sequence of the mitochondrial genome of H. crudus is intended to be a resource to aid in future phylogenetic studies on cixiids and population genetic studies on H. crudus. Because it contains multiple coding, conserved, and variable regions, the full mitochondrial genome will serve as a valuable phylogenetic marker as more full mitochondrial genome sequences are generated for other species of Haplaxius, other genera in the tribe Oecleini, and more groups of cixiids in general, allowing a better understanding of the phylogenetic relationships within this large and economically important group.

Data Availability

Data are deposited in GenBank (https://www.ncbi.nlm.nih.gov/genbank/) accession no. MT385107. The data are scheduled to be released upon publication, so an annotated Fasta file is provided as a supplemental file.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors sincerely appreciate the International Society of Arboriculture (ISA) and the Goss laboratory at the University of Florida for their research support. In addition, the authors would like to thank the HiPerGator IT personnel and support staff at the University of Florida for their support and contributions. The authors thank the UF-Emerging Pathogens Institute for seed funding. This work was also funded by a cooperative agreement with USDA, HATCH Project FLA-FTL-005539, and the UF Plant Pathology Department.

Supplementary Materials

The complete annotated Fasta file for the mitochondrial genome is provided (file name: Hcr_mtgnm). The sequence had not been released by GenBank at the time of submission, so the sequence is not provided.