Helicoverpa assulta (Guenée), a moth species belonging to the family Noctuidae (Lepidoptera), is a destructive agricultural pest that infests multiple cash crops. To assess differences in gene expression profiles among tissues of H. assulta, we analyzed the transcriptomes of two tissue types (midgut and hemocytes) using the Illumina HiSeq 2000 platform, from which we obtained 52076750 and 53404200 high-quality clean reads, respectively. De novo assembly yielded 46146 and 33707 unigenes from the midgut and hemocytes, respectively. After screening, we identified 23726 unigenes differentially expressed between the midgut and hemocytes. Taking the midgut as the control, we detected 7448 and 16278 unigenes that were up- and downregulated in hemocytes, respectively. Gene Ontology functional annotation divided the differentially expressed unigenes (DEUs) into three categories (biological process, cellular component, and molecular function) and 51 branches, whereas the Kyoto Encyclopedia of Genes and Genomes metabolic pathway annotation assigned the DEUs to six categories, mapping these to 258 pathways. In addition, we detected 224918 single-nucleotide polymorphic sites. Our findings based on transcriptome sequencing, data assembly, and functional gene annotation of two different tissues in H. assulta will provide a valuable reference for further mining and study of functional genes in this species.

1. Introduction

The moth species Helicoverpa assulta, belonging to the family Noctuidae (Lepidoptera), is an important polyphagous pest of agricultural crops, causing potentially widespread damage and immeasurable economic losses. The larvae generally feed on the buds, flowers, and fruits of crop plants, and consume tender stems and leaf buds, with fruit decay being the primary source of yield reductions. H. assulta is widely distributed in numerous Asian countries, where, in addition to tobacco and pepper [1, 2], its hosts include tomato, pumpkin, cowpea, cabbage, and cauliflower. Currently, the control of H. assulta is primarily based on applying chemical insecticides, which pollute the environment and disrupt ecosystem balance to varying extents. Consequently, it is particularly desirable to identify less environmentally damaging prevention and control measures. In this regard, recent developments in high-throughput sequencing technology have enabled the application of transcriptome sequencing to the study of insects. For example, the molecular mechanisms associated with pesticide detoxification by Spodoptera frugiperda have been studied based on second-generation sequencing technology [3], and transcriptome sequencing has been used to solve the problem of sex identification in Helicoverpa armigera [4]. Similarly, transcriptome sequencing has been used to study differences in gene expression among different tissues of Agrotis ipsilon to reveal the mechanisms underlying wing development [5]. In addition, Illumina sequencing has been used to characterize differences in gene alterations, signal pathways, and gene expression patterns associated with the infestation of rice by Nilaparvata lugens and Chilo suppressalis [6]. To date, transcriptome sequencing-based studies on H. assulta have primarily focused on the following aspects: sex-related olfactory system differentiation [7], functional characterization of chemosensory genes [8], and host selection and adaptation [9].
However, the accumulated biological information resources for H. assulta remain comparatively limited. To augment these resources, we adopted a high-throughput transcriptome sequencing approach to analyze the transcriptomes of two types of tissues in H. assulta, namely, those of the midgut and hemocytes. Based on Trinity software assembly, functional database annotation, analysis of differential gene expression, and single-nucleotide polymorphism (SNP) site screening, we performed a comprehensive molecular genetic characterization of H. assulta. The information thus obtained will provide a valuable basis for further study of functional genes and differential gene expression in H. assulta and contribute to augmenting its biological information databases. Moreover, gaining new insights into genetic mechanisms underlying the responses of H. assulta may contribute to developing novel strategies for controlling and preventing this important agricultural pest.

2. Materials and Methods

2.1. Acquisition of Insect Tissues and Extraction of RNA

Midgut and hemocyte material was obtained from fourth-instar larvae of H. assulta raised in our laboratory. The larvae were placed on ice and decapitated using scissors. Midguts were extracted from the headless bodies and immediately immersed in liquid nitrogen for preservation. After the larval prolegs had been excised, hemocytes were collected into precooled Eppendorf tubes. Total RNA was extracted from the two tissue types using TRIzol reagent according to the manufacturer’s instructions. The purity and integrity of the isolated RNA were determined using an Agilent 2100 Bioanalyzer and 1% agarose gel electrophoresis, respectively, and the qualified preparations were used as templates for transcriptome sequencing.

2.2. Construction of cDNA Libraries and RNA Sequencing

To isolate poly-A mRNAs from total RNA, we used Oligo (dT) magnetic beads. The mRNA was randomly fragmented by heating at 94°C for 5 min to obtain small fragments of approximately 200 bp in size, and these mRNA fragments were used as templates for synthesizing first-strand cDNA using random hexamer primers. Subsequent synthesis of second-strand cDNA was performed using a mixture of dNTPs, DNA polymerase I, and buffer solution. The purified double-stranded cDNA thus obtained was then subjected to terminal repair, followed by A-tailing (addition of a single adenine to the 3′ ends) and the ligation of sequencing adapters. Finally, cDNA libraries for each tissue type were constructed by enriching the amplified products and were subsequently sequenced using the Illumina HiSeq™ 2000 platform.

2.3. De Novo Assembly and Functional Annotation

Prior to sequence assembly, the raw reads were cleaned using Filter software by removing low-quality reads (those with a quality value of less than 20), adaptor reads, and reads containing N (ambiguous) bases. For the resulting clean reads, we calculated Q20 values and GC and N contents, and the clean reads were then de novo assembled into unigenes using Trinity software [10, 11]. During assembly, clean reads with a certain overlap length were initially combined to generate longer fragments (contigs). Thereafter, we mapped the reads back to the contigs and used paired-end read information to identify contigs derived from the same transcript and to estimate inter-contig distances. Finally, the contigs were assembled into sequences that could be extended no further at either end, and these sequences were defined as unigenes. The unigene sequences thus obtained were functionally annotated based on BlastX searches against the Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases.
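The read-cleaning criteria described above can be illustrated with a minimal Python sketch. It is not the Filter software used in this study. The function names, the adapter sequence, the Phred+33 quality encoding, and the interpretation of "quality value of less than 20" as a mean per-read Phred score below 20 are all assumptions made for illustration only.

```python
from typing import Iterable, Iterator, List, Tuple

# Hypothetical adapter sequence, used only for illustration.
ADAPTER = "AGATCGGAAGAGC"

Read = Tuple[str, str, str]  # (read name, base sequence, quality string)


def phred_scores(quality: str, offset: int = 33) -> List[int]:
    """Decode a FASTQ quality string, assuming Phred+33 encoding."""
    return [ord(ch) - offset for ch in quality]


def is_clean_read(seq: str, quality: str, min_mean_q: int = 20,
                  adapter: str = ADAPTER) -> bool:
    """Return True if the read passes all three illustrative filters."""
    seq = seq.upper()
    if "N" in seq:            # read contains an ambiguous base
        return False
    if adapter in seq:        # read contains adapter sequence
        return False
    scores = phred_scores(quality)
    if not scores:            # empty reads are discarded
        return False
    # Assumption: "low quality" means a mean Phred score below the cutoff.
    return sum(scores) / len(scores) >= min_mean_q


def filter_reads(reads: Iterable[Read]) -> Iterator[Read]:
    """Yield only the reads that pass is_clean_read."""
    for name, seq, qual in reads:
        if is_clean_read(seq, qual):
            yield name, seq, qual


reads = [
    ("r1", "ACGTACGTAC", "IIIIIIIIII"),         # Phred 40 throughout: kept
    ("r2", "ACGTNCGTAC", "IIIIIIIIII"),         # contains N: removed
    ("r3", "ACGTACGTAC", "##########"),         # Phred 2 throughout: removed
    ("r4", "AGATCGGAAGAGCACGT", "I" * 17),      # contains adapter: removed
]
clean_names = [name for name, _, _ in filter_reads(reads)]
print(clean_names)
```

In practice, production filters usually trim adapters and apply per-base quality criteria rather than a single mean score; the sketch only shows how the three stated criteria combine.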

2.4. Identification of Differentially Expressed Unigenes (DEUs)

The expression level of each unigene was normalized using the fragments per kb per million fragments (FPKM) method [12]. Differential expression was assessed based on the false discovery rate (FDR) [13], with FDR ≤ 0.001 and |log2 ratio| ≥ 1 being used as the threshold values for determining the significance of differential gene expression. Based on GO database searches, we performed functional classification annotation and enrichment analysis of differentially expressed unigenes (DEUs) in the different tissues, with a significance value of ≤0.05 used as the threshold for DEU enrichment. In addition, based on significant enrichment of KEGG pathways, we identified the main biochemical, metabolic, and signal transduction pathways associated with these DEUs.
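The normalization and thresholding steps can be sketched in Python as follows. This is an illustrative sketch, not the pipeline used in this study: the function names are hypothetical, the FDR correction shown is the Benjamini-Hochberg procedure (assumed here as one common way to control the FDR), and the fragment counts and P values are invented solely to exercise the code. The midgut is taken as the control, consistent with the comparison reported in the Results.

```python
import math
from typing import List


def fpkm(fragments: int, length_bp: int, total_fragments: int) -> float:
    """FPKM = 10^9 * C / (N * L), where C is the number of fragments mapped
    to the unigene, N is the total number of mapped fragments, and L is the
    unigene length in bp."""
    return 1e9 * fragments / (total_fragments * length_bp)


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def classify(fpkm_midgut: float, fpkm_hemocyte: float, fdr: float,
             fdr_cutoff: float = 0.001, min_abs_log2: float = 1.0) -> str:
    """Label a unigene as up- or downregulated in hemocytes vs. midgut."""
    if fpkm_midgut <= 0 or fpkm_hemocyte <= 0:
        # Assumption: zero values must be handled upstream (e.g. a pseudocount).
        raise ValueError("FPKM values must be positive for a log2 ratio")
    log2_ratio = math.log2(fpkm_hemocyte / fpkm_midgut)
    if fdr <= fdr_cutoff and abs(log2_ratio) >= min_abs_log2:
        return "up" if log2_ratio > 0 else "down"
    return "not significant"


# Invented example: 500 fragments on a 2000-bp unigene, 50 million fragments.
example_fpkm = fpkm(500, 2000, 50_000_000)
example_fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(example_fpkm, example_fdr)
print(classify(5.0, 20.0, 0.0005))
```

Both thresholds must hold simultaneously, so a unigene with a fourfold change but an FDR of 0.01 would not be counted as a DEU under the criteria stated above.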

3. Results

3.1. Transcriptome Assembly

By sequencing the transcriptomes of two different tissues from H. assulta, we obtained 52076750 and 53404200 clean reads for midgut and hemocyte tissues, respectively. The Q20 values were 98.71% and 98.64% and the GC contents were 46.09% and 49.13% for the midgut and hemocytes, respectively, indicating that the amount and quality of the transcriptome sequencing data were high. These reads were accordingly used for subsequent transcriptome assembly (Table 1).
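For readers unfamiliar with these quality metrics, the following Python sketch shows how Q20 (the percentage of bases with a Phred quality score of at least 20) and GC content (the percentage of G and C bases) can be computed from a set of reads. The function name, the Phred+33 encoding, and the toy reads are assumptions for illustration and are not taken from the study's data or software.

```python
from typing import Iterable, Tuple


def q20_and_gc(reads: Iterable[Tuple[str, str]],
               offset: int = 33) -> Tuple[float, float]:
    """Return (Q20 percentage, GC percentage) over all bases in the reads.

    Each read is a (sequence, quality string) pair; Phred+33 is assumed.
    """
    total_bases = 0
    q20_bases = 0
    gc_bases = 0
    for seq, qual in reads:
        total_bases += len(seq)
        gc_bases += sum(1 for base in seq.upper() if base in "GC")
        q20_bases += sum(1 for ch in qual if ord(ch) - offset >= 20)
    if total_bases == 0:
        raise ValueError("no bases supplied")
    return 100 * q20_bases / total_bases, 100 * gc_bases / total_bases


# Toy input: 'I' encodes Phred 40 and '#' encodes Phred 2.
toy_reads = [("GGCC", "IIII"), ("ATAT", "II##")]
q20_pct, gc_pct = q20_and_gc(toy_reads)
print(q20_pct, gc_pct)
```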

Following assembly, we obtained 46146 midgut unigenes, among which 27717 were of lengths between 200 and 500 bp, accounting for 60.06% of the total unigenes. Moreover, 1905 of these had lengths exceeding 3000 bp, accounting for 4.13% of the total (Figure 1). By comparison, 33707 unigenes were obtained for hemocytes, among which 20827 were between 200 and 500 bp and 4318 were between 1000 and 2000 bp, accounting for 61.79% and 12.81% of the total unigenes, respectively (Figure 2). The sequence datasets described in this manuscript have been deposited in the National Center for Biotechnology Information Sequence Read Archive (https://www.ncbi.nlm.nih.gov/sra) under accession number PRJNA789178.
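The reported length-class proportions follow directly from the unigene counts given above, as the short Python check below illustrates. The helper name `percent_of_total` is hypothetical; the counts are those stated in the text.

```python
def percent_of_total(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to two decimals."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * count / total, 2)


MIDGUT_TOTAL = 46146      # midgut unigenes reported in the text
HEMOCYTE_TOTAL = 33707    # hemocyte unigenes reported in the text

shares = {
    "midgut, 200-500 bp": percent_of_total(27717, MIDGUT_TOTAL),
    "midgut, >3000 bp": percent_of_total(1905, MIDGUT_TOTAL),
    "hemocytes, 200-500 bp": percent_of_total(20827, HEMOCYTE_TOTAL),
    "hemocytes, 1000-2000 bp": percent_of_total(4318, HEMOCYTE_TOTAL),
}
for label, pct in shares.items():
    print(f"{label}: {pct}%")
```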

3.2. Analysis of Differentially Expressed Unigenes

A comparative analysis of the differential expression in the midgut and hemocyte unigenes revealed altered expression profiles for 23726 unigenes, among which 7448 were upregulated and 16278 were downregulated in hemocytes compared with those in the midgut (Figure 3). For example, the expression levels of cytochrome P450 and cathepsin in hemocytes were higher than those in the midgut. In contrast, those of carboxypeptidase inhibitor and disintegrin metalloproteinase were higher in the midgut than in hemocytes.

3.3. Gene Ontology (GO) Annotation of DEUs

GO functional annotation classified midgut and hemocyte unigenes into the three primary GO categories, namely, biological process, cellular component, and molecular function, among which the annotated proteins were found to be mainly grouped into the following subcategories: cellular process (3871), cell (2849), cell part (2848), binding (2975), and catalytic activity (2872) (Figure 4).

3.4. KEGG Pathway Analysis of Differentially Expressed Unigenes

We identified 9270 DEUs that were functionally annotated with respect to KEGG pathways based on enrichment analysis. Across the six KEGG categories (organismal systems, metabolism, human diseases, genetic information processing, environmental information processing, and cellular processes), DEUs were mapped to 258 pathways, with bile secretion, metabolic pathways, amoebiasis, RNA transport, neuroactive ligand-receptor interaction, and regulation of actin cytoskeleton being the most highly represented pathways in the respective categories (Figure 5). Among these, 1988, 1331, 829, and 1580 DEUs were annotated to metabolic, signal, immune, and infectious disease pathways, respectively. Moreover, 1541, 605, 146, and 733 unique DEUs were, respectively, annotated to these four pathways (Figure 6). The immune pathway category comprises two pathway types, namely, immune system and immune disease, to which 77.8% and 8% of DEUs were annotated, respectively (Figure 7).

3.5. Analysis of Single-Nucleotide Polymorphisms

Our analysis of SNPs revealed 224918 polymorphic sites of six types (A-G, C-T, A-C, A-T, C-G, and G-T), among which types A-G and C-T together accounted for the highest proportion (64.6%) of all markers, with the remaining four types having similar, smaller proportions. Because A-G and C-T substitutions are transitions, base transitions were more common than transversions. Comparative tissue analysis indicated a larger number of SNPs in hemocytes than in midgut tissues. For each tissue type, the proportions of the SNP types and of transitions and transversions were similar to those of the total SNPs (Table 2).
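The classification of SNPs into the six types and into transitions and transversions can be sketched in Python as follows. The function names and the example SNP list are hypothetical and serve only to illustrate the classification rule (A-G and C-T are transitions; A-C, A-T, C-G, and G-T are transversions); they do not reproduce the SNP-calling software or data of this study.

```python
from collections import Counter
from typing import Iterable, Tuple

BASES = {"A", "C", "G", "T"}
TRANSITION_TYPES = {"A-G", "C-T"}  # purine-purine and pyrimidine-pyrimidine


def snp_type(ref: str, alt: str) -> str:
    """Return the unordered substitution type, e.g. 'A-G' for A>G or G>A."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES or ref == alt:
        raise ValueError(f"not a valid single-base substitution: {ref}>{alt}")
    return "-".join(sorted((ref, alt)))


def summarize(snps: Iterable[Tuple[str, str]]) -> Tuple[Counter, int, int]:
    """Return (counts per SNP type, transition count, transversion count)."""
    types = Counter(snp_type(ref, alt) for ref, alt in snps)
    transitions = sum(n for t, n in types.items() if t in TRANSITION_TYPES)
    transversions = sum(types.values()) - transitions
    return types, transitions, transversions


# Invented example SNPs given as (reference base, alternative base).
example_snps = [("A", "G"), ("G", "A"), ("C", "T"), ("A", "C"), ("T", "A")]
type_counts, n_ts, n_tv = summarize(example_snps)
print(dict(type_counts), n_ts, n_tv)
```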

4. Discussion

Based on our transcriptome sequencing of two different tissue types (midgut and hemocytes) from larvae of the H. assulta moth, we identified a large number of genes differentially expressed between these two tissues, with differences in the expression profiles of 23726 unigenes (7448 up- and 16278 downregulated in hemocytes compared with those in the midgut) being detected. For example, genes encoding cytochrome P450 and cathepsin were upregulated, whereas carboxypeptidase inhibitor and disintegrin metalloproteinase were downregulated. In insects, the cytochrome P450 enzyme system serves as a vital metabolic enzyme system, which plays functional roles in detoxifying and metabolizing exogenous chemicals, including plant secondary metabolites and pesticides. By enhancing P450 gene expression and enzyme activity, insects can regulate their defense state, thereby bolstering resistance to toxic compounds or adverse environmental conditions [14–16]. Disintegrin metalloproteinases play roles in cell proliferation, migration, and invasion [17]. In contrast, carboxypeptidase inhibitors inhibit carboxypeptidase activity in the insect intestine [18]. Furthermore, metallocarboxypeptidases are crucial enzymes involved in food digestion and absorption in the digestive tracts of insects. They may also play important roles in insect metamorphosis, development, disease resistance, and immunity. For example, upregulated expression of five midgut metallocarboxypeptidase genes in response to pathogen infection was observed within 24 h of infecting silkworm larvae with Bombyx mori nuclear polyhedrosis virus [19, 20]. In addition, cathepsin has been established to play a significant role in regulating the fecundity and pathogen response of insects [21].

Although we could functionally annotate many of the identified DEUs with reference to the GO and KEGG databases, many genes remain unannotated. We speculate that this could be attributable to an insufficient sequence length and lack of sequence information for comparable species, thereby precluding annotation via homologous sequence alignment.

Based on our KEGG analysis of the transcriptomes of different tissues in H. assulta, we annotated 9270 DEUs to KEGG pathways (metabolic, signal, immune, and infectious disease pathways), among which the largest number were assigned to metabolic pathways, followed by infectious disease pathways, with immune pathways having the fewest annotations. Furthermore, among the genes annotated to immune pathways, 92% were associated with immune system pathways, whereas 22.2% were involved in immune disease pathways. In this regard, the hemolymph system of insects facilitates the transport of nutrients and metabolic wastes and plays functional roles in cellular immune regulation [22, 23]. Insects rely on innate immunity, comprising humoral and cellular mechanisms, in which hemocytes are mainly involved in eliminating invasive pathogens via phagocytosis, nodulation, and encapsulation.

Our findings regarding the differential expression of genes in two different tissue types in H. assulta, along with the identification of numerous SNPs, will provide a valuable scientific basis for further studies examining the growth and development, metabolism, defense, and pesticide resistance mechanisms of H. assulta. Moreover, they will contribute to the further mining and characterization of functional genes in H. assulta.

5. Conclusion

High-throughput sequencing is an effective approach for obtaining gene resources for nonmodel organisms. In this study, we used this technique to sequence the transcriptomes of the midgut and hemocytes of H. assulta larvae. We obtained the unigene sets for these two tissue types, many of which were successfully annotated with reference to the GO and KEGG databases. These data will accordingly augment the resources of insect gene databases. We also identified and characterized numerous SNP sites, which will contribute to the future development of molecular markers for H. assulta. Moreover, our findings will provide a valuable reference source for the further mining, development, and utilization of functional genes in H. assulta.

Data Availability

1. The data that support the findings of this study are openly available in the National Center for Biotechnology Information Sequence Read Archive (https://www.ncbi.nlm.nih.gov/sra) under accession number PRJNA789178. 2. The data that support the findings of this study are included within the article. 3. The data that support the findings of this study are available from the corresponding author by request. 4. Some data are not publicly available due to the information they contain that could compromise the privacy of research participants.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Authors’ Contributions

Kuiyin Li and Yubo Zhang contributed equally to this work.


This work was supported by the special fund of Guizhou Province to Support the City (Prefecture) College Education Quality Improvement Project (Qian Cai Jiao [2021] No. 78), the Science and Technology Foundation of Guizhou Province (Qiankehe foundation [2017] 1001), and Guizhou Province Educational Science and Technology Research (Guizhou Province KY Character [2016] 279, [2016] 278).