Abstract

Lucilia sericata larvae are used as an alternative treatment for recalcitrant and chronic wounds. Their excretions/secretions contain molecules that facilitate tissue debridement, disinfect, or accelerate wound healing and have therefore been recognized as a potential source of novel therapeutic compounds. Among the substances present in excretions/secretions various peptidase activities promoting the wound healing processes have been detected but the peptidases responsible for these activities remain mostly unidentified. To explore these enzymes we applied next generation sequencing to analyze the transcriptomes of different maggot tissues (salivary glands, gut, and crop) associated with the production of excretions/secretions and/or with digestion as well as the rest of the larval body. As a result we obtained more than 123.8 million paired-end reads, which were assembled de novo using Trinity and Oases assemblers, yielding 41,421 contigs with an N50 contig length of 2.22 kb and a total length of 67.79 Mb. BLASTp analysis against the MEROPS database identified 1729 contigs in 577 clusters encoding five peptidase classes (serine, cysteine, aspartic, threonine, and metallopeptidases), which were assigned to 26 clans, 48 families, and 185 peptidase species. The individual enzymes were differentially expressed among maggot tissues and included peptidase activities related to the therapeutic effects of maggot excretions/secretions.

1. Introduction

The maggots of certain flies have been used as traditional medicines for centuries [1], but modern maggot debridement therapy (MDT) was established approximately 100 years ago. MDT was then widely used for the treatment of chronic wounds until the mid-1940s, when it was supplanted by antibiotics and improved wound care [2]. MDT, using exclusively Lucilia sericata maggots, has recently undergone a renaissance, and medical maggots are now approved as an alternative approach for the treatment of many types of chronic and necrotic wounds, including diabetic ulcers [3–5], postsurgical wounds [6], and burns [7, 8]. Maggots applied to hard-to-heal wounds debride the necrotic tissue, disinfect the wound, and stimulate the healing process [9]. The beneficial effect of MDT cannot be attributed to a single molecule but rather to the synergistic action of various bioactive substances, including a large variety of proteolytic enzymes, which are present in maggot excretion/secretion products (MEP) [10].

Debridement, the removal of necrotic tissue and wound slough, is a well-documented effect of MDT [11–13]. The maggots perform physical debridement with their mandibles, but chemical debridement with enzymes is the most important component. The maggots release their digestive enzymes into the wound, where the enzymes liquefy necrotic and infected tissue before the maggots ingest it. Chambers et al. identified three classes of proteolytic enzymes (aspartic, serine, and metallopeptidases) from MEP and proposed that mainly serine peptidases are responsible for the superficial debridement activity of maggots [14]. Only two such serine peptidases have been identified and characterized thus far. Chymotrypsin 1 was identified from MEP and produced in recombinant form [15]. The recombinant enzyme was shown to degrade the eschar from venous leg ulcers in vitro [15] and to be unaffected by two endogenous inhibitors, α1-antichymotrypsin and α1-antitrypsin from wound eschar [16]. We recently produced and characterized a Jonah-like chymotrypsin, which digested three specific extracellular matrix proteins (laminin, fibronectin, and collagen IV) in vitro, and proposed that it functions in wound debridement [17].

The natural habitat of L. sericata larvae is rotting organic matter such as cadavers and excrement, but this ecological niche also favors many microorganisms so the larvae must have adequate defenses against infection. The maggots therefore protect themselves by producing many antimicrobial substances [18–22] and by digesting microbes, which are thus eliminated in the larval gut [23, 24]. Interestingly, MEPs also show activity against relevant human pathogens including antibiotic-resistant bacterial strains [25–27] and biofilms [28–31]. Recently, two molecules with antibiofilm activity have been identified from MEP. Affinity purified DNase disrupted Pseudomonas aeruginosa biofilm [32] and recombinant chymotrypsin I was active against Staphylococcus epidermidis and S. aureus biofilms [33].

Surprisingly, medical maggots also directly promote wound healing [10]. MEPs stimulate fibroblast migration [34, 35] and proliferation [36] and increase angiogenesis [37, 38]. MEPs also influence the activation of the human complement system [39], reduce proinflammatory responses [40–43], and induce fibrinolysis [44]. Recently we discovered that MEPs contain peptidases that influence blood coagulation as part of the wound healing process [45] and this activity was attributed to a Jonah-like serine peptidase. The recombinant enzyme was shown to reduce the clotting time of human plasma by substituting for the intrinsic clotting factors kallikrein, factor XI, and factor XII [17].

Although several molecules have been identified from MEP, it is still recognized as a largely unexplored source of compounds with therapeutic potential. Future studies should focus on the identification, isolation, and/or production of effector molecules and the testing of their therapeutic potential. Here, we analyzed the transcriptome of different larval tissues to systematically identify MEP peptidases. It is not clear whether MEP components are exclusively produced by salivary glands or also by other tissues, so we dissected three individual tissues (gut, crop, and salivary glands) as well as the remaining larval biomass to generate tissue-specific sequence data. The extracted mRNA was sequenced using the Illumina HiSeq2000 Genome Analyzer platform and paired-end read technology. After preprocessing, 123,856,654 paired reads remained in the panel of libraries. These were processed further to yield a final assembly of 41,421 contigs in 17,479 clusters, resulting in the identification of 1729 contigs in 577 clusters encoding five different functional classes of proteolytic enzymes.

2. Materials and Methods

2.1. Preparation of Biological Material

First-instar L. sericata maggots were obtained from BioMonde GmbH (Barsbüttel, Germany) and were cultured under sterile conditions on Columbia agar plates with “sheep blood +” (Oxoid Deutschland GmbH, Wesel, Germany) at 28°C for 48 h in the dark. The larvae were cleaned and then infected with a mixture of Pseudomonas aeruginosa (DSM 50071) and Staphylococcus aureus (DSM 2569) as previously described [21]. The larval gut, salivary glands, and crop were dissected under a binocular microscope 8 h after infection. Dissected tissues and the remaining larval body for Illumina sequencing were frozen in liquid nitrogen and stored at –80°C. Samples for qRT-PCR analysis were processed immediately as described below.

2.2. RNA Isolation and Illumina Sequencing

Total RNA was extracted from individual tissues and the rest of the larval body using the innuPREP RNA Mini Isolation Kit (Analytik Jena, Jena, Germany) following the manufacturer’s instructions. Additional RNA purification, quantification, and quality control were carried out as previously described [46]. An additional Turbo DNase treatment (Thermo Fisher Scientific, Waltham, MA, USA) was applied before the second purification step to eliminate contaminating DNA. The DNase was removed and the RNA purified using the RNeasy MinElute Clean up Kit (Qiagen, Hilden, Germany) following the manufacturer’s protocol. RNA was eluted in 20 μL Ambion RNA Storage Solution (Thermo Fisher Scientific) and poly(A)+ mRNA was prepared using the Ambion MicroPoly(A) Purist Kit according to the manufacturer’s instructions (Thermo Fisher Scientific). The integrity and quantity of the mRNA was confirmed using an Agilent 2100 Bioanalyser and RNA Nano chips (Agilent Technologies, Santa Clara, CA, USA).

Transcriptome sequencing was carried out on an Illumina HiSeq2000 Genome Analyzer platform using paired-end (2 × 100 bp) read technology for the larval tissues, with RNA fragmented to an average length of 150 nucleotides. Sequencing was carried out by Eurofins MWG Operon (Ebersberg, Germany) and resulted in totals of 29, 33, 26, and 34 million reads for the rest of body, gut, crop, and salivary glands, respectively.

2.3. Read Preprocessing

The quality of the reads was checked using the FastQC toolkit [47]. Trimmomatic [48] was used to clip adapters and trim low-quality regions (parameter ILLUMINACLIP:TruSeq3-SE-2:2:30:10 SLIDINGWINDOW:5:20 HEADCROP:15 MINLEN:50). The reads from all libraries were then pooled for assembly and digitally normalized to achieve 100-fold coverage using the Trinity “in silico normalization” tool [49].
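The quality-trimming steps named in the Trimmomatic parameter string can be illustrated with a minimal sketch. It assumes Phred quality scores are already decoded to integers, omits adapter clipping (ILLUMINACLIP) entirely, and only approximates the SLIDINGWINDOW logic; the function names are hypothetical and this is not Trimmomatic's own implementation.

```python
def sliding_window_cut(quals, window=5, min_avg=20):
    """Return the number of leading bases to keep.

    Scans from the 5' end and cuts at the start of the first window
    whose mean quality falls below min_avg (simplified SLIDINGWINDOW:5:20).
    """
    for start in range(0, len(quals) - window + 1):
        if sum(quals[start:start + window]) / window < min_avg:
            return start
    return len(quals)


def trim_read(seq, quals, window=5, min_avg=20, headcrop=15, minlen=50):
    """Apply the steps in the order given in the parameter string.

    SLIDINGWINDOW first, then HEADCROP:15, then MINLEN:50.
    Returns (seq, quals) or None if the read is discarded.
    """
    keep = sliding_window_cut(quals, window, min_avg)
    seq, quals = seq[:keep], quals[:keep]
    # HEADCROP removes a fixed number of bases from the start of the read.
    seq, quals = seq[headcrop:], quals[headcrop:]
    # MINLEN drops reads that became too short.
    if len(seq) < minlen:
        return None
    return seq, quals


if __name__ == "__main__":
    read = "A" * 100
    print(trim_read(read, [30] * 100) is not None)
    print(trim_read(read, [30] * 60 + [2] * 40))
```

In this sketch a 100 bp read of uniformly high quality loses only its first 15 bases, whereas a read whose quality collapses after base 60 falls below the 50-base minimum and is discarded.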

2.4. Assembly

The reads from all libraries were assembled de novo in two steps. One assembly was computed using the Trinity assembler [49] followed by 28 individual Velvet/Oases assemblies [50] with k-mer parameters ranging from 21 to 75. In the second step, the resulting transcript sequences were combined and high-quality sequences were extracted using the EvidentialGene pipeline [51]. Potential isoforms were detected by clustering the protein sequences from the EvidentialGene pipeline using CD-HIT [52] with 90% identity.
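The text reports 28 Velvet/Oases assemblies over k-mer values from 21 to 75 but does not state the step size. A short sketch shows that odd k-mer values in steps of 2 (an assumption, not stated in the text) would give exactly 28 assemblies; the function name is hypothetical.

```python
def kmer_series(k_min=21, k_max=75, step=2):
    """List the k-mer sizes for a multi-k assembly run.

    The step of 2 is an assumption chosen because it reproduces
    the 28 assemblies reported for the 21-75 range.
    """
    if k_min > k_max:
        raise ValueError("k_min must not exceed k_max")
    return list(range(k_min, k_max + 1, step))


if __name__ == "__main__":
    kmers = kmer_series()
    print(len(kmers), kmers[0], kmers[-1])  # prints: 28 21 75
```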

2.5. Quality Assessment

The CEGMA method [53] was used to assess the completeness of the transcriptome. The detection step of the CEGMA pipeline was replaced by a BLASTp search [54] against the CEGMA EuKaryotic Orthologous Groups (KOG) sequences, because protein sequences had already been identified in the previous step. The completeness of individual sequences was estimated by computing the “ortholog hit ratio” [55] against the D. melanogaster protein sequences.
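As a rough illustration of the ortholog hit ratio, the sketch below divides the length of the aligned region on the reference protein by the full length of that protein, so that a value of 1.0 indicates a contig covering the whole ortholog. This is one common reading of the metric; the exact computation used for the transcriptome is not given in the text, and the function name and coordinates are hypothetical.

```python
def ortholog_hit_ratio(subject_start, subject_end, subject_length):
    """Fraction of the reference (ortholog) protein covered by the alignment.

    Coordinates are 1-based and inclusive, as in tabular BLAST output.
    """
    if subject_length <= 0:
        raise ValueError("subject_length must be positive")
    aligned = abs(subject_end - subject_start) + 1
    return aligned / subject_length


if __name__ == "__main__":
    # Hypothetical hit covering residues 1-355 of a 500-residue protein.
    print(round(ortholog_hit_ratio(1, 355, 500), 2))
```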

2.6. Functional Annotation and Peptidase Identification

Putative transposable elements were identified using TransposonPSI [56]. Furthermore, all sequences with HMMER 3.0 [57] hits against Pfam domains [58] as previously described [59] were marked as potential transposable elements. All sequences in clusters with at least one putative transposable element were annotated as transposable elements. All sequences were uploaded to the SAMS web server [60] and automatically annotated using BLAST [54] and HMMER [57] searches against different databases. Next, all peptidases were identified using the EC numbers [61] from the automatic annotation of the transcriptome data and further classified using the MEROPS database [62].
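Selecting peptidases by EC number can be sketched as a simple prefix filter, because peptidases are grouped under EC 3.4 (hydrolases acting on peptide bonds). The contig names and annotation table below are hypothetical and stand in for the output of the automatic annotation; the actual selection procedure may have differed.

```python
def is_peptidase(ec_number):
    """True if an EC number belongs to subclass 3.4 (peptidases)."""
    parts = ec_number.strip().split(".")
    return len(parts) >= 2 and parts[0] == "3" and parts[1] == "4"


def peptidase_contigs(annotations):
    """Return the sorted contig IDs carrying at least one EC 3.4.x.x annotation."""
    return sorted(
        contig
        for contig, ec_numbers in annotations.items()
        if any(is_peptidase(ec) for ec in ec_numbers)
    )


if __name__ == "__main__":
    # Hypothetical annotation table: contig ID -> list of EC numbers.
    annotations = {
        "contig_a": ["3.4.21.1"],             # a serine endopeptidase entry
        "contig_b": ["2.7.11.1"],             # a kinase entry, not a peptidase
        "contig_c": ["3.4.23.5", "1.1.1.1"],  # mixed annotations
    }
    print(peptidase_contigs(annotations))
```

Comparing the first two fields separately, rather than testing for the string prefix "3.4", avoids matching unrelated subclasses such as 3.40.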

2.7. Mapping and Digital Gene Expression Analysis

Digital gene expression analysis was carried out using QSeq Software (DNAStar Inc.) to remap the Illumina reads onto the reference backbone and then counting the sequences to estimate expression levels. For read mapping, we used the following parameters: n-mer length = 40; read assignment quality options required at least 40 bases (the amount of mappable sequence as a criterion for inclusion) and at least 90% of bases matching (minimum similarity fraction, defining the degree of preciseness) within each read to be assigned to a specific contig; maximum number of hits for a read (reads matching a greater number of distinct places than this number are excluded) = 10; n-mer repeat settings were automatically determined and other settings were not changed. Biases in the sequence datasets and different transcript sizes were corrected using the RPKM algorithm (reads per kilobase of transcript per million mapped reads) to obtain correct estimates for relative expression levels. For the selected protease groups, gene expression (log2-transformed RPKM values) was visualized as heat maps using custom scripts and matplotlib [63], a 2D plotting library, with the Jet colormap [64].
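The RPKM normalization and log2 transformation can be written out as a short sketch. The RPKM formula follows the definition given in the text; the pseudocount added before the log2 transformation is an assumption made here to avoid taking the logarithm of zero, and the function names and example numbers are hypothetical.

```python
import math


def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


def log2_rpkm(value, pseudocount=1.0):
    """log2-transform an RPKM value; the pseudocount is an assumption."""
    return math.log2(value + pseudocount)


if __name__ == "__main__":
    # Hypothetical contig: 500 reads on a 2 kb transcript,
    # in a library with 25 million mapped reads.
    value = rpkm(500, 2000, 25_000_000)
    print(value, log2_rpkm(value))
```

A matrix of such values (contigs by tissues) could then be drawn with matplotlib, for example with `imshow` and `cmap="jet"`, although the custom plotting scripts themselves are not described in the text.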

2.8. Quantitative Reverse Transcription Real-Time PCR Reaction (qRT-PCR)

A subset of differentially expressed peptidases from each peptidase clan was validated by qRT-PCR. Total RNA isolation, cDNA synthesis, primer design, and qRT-PCR experiments were performed as described previously [65]. Data were analyzed in Rest 2009 (http://www.gene-quantification.de/rest-2009.html) using the ΔΔCq method [66]. Relative mRNA values of individual genes were adjusted to the sample with the highest value and normalized using the 60S acidic ribosomal protein P0 (RPLP0) and 40S ribosomal protein S3 (RPS3) genes.
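The ΔΔCq calculation with two reference genes can be sketched as follows. The sketch assumes 100% amplification efficiency (a doubling per cycle) and averages the Cq values of the reference genes; it ignores any efficiency correction that analysis software may apply, and the function name and Cq values are hypothetical.

```python
def relative_expression(cq_target, cq_refs, cq_target_cal, cq_refs_cal):
    """Relative expression by the delta-delta-Cq method.

    cq_target, cq_refs: Cq of the target gene and reference genes in the sample.
    cq_target_cal, cq_refs_cal: the same values in the calibrator sample.
    Assumes perfect doubling per cycle (efficiency = 2).
    """
    delta_sample = cq_target - sum(cq_refs) / len(cq_refs)
    delta_cal = cq_target_cal - sum(cq_refs_cal) / len(cq_refs_cal)
    return 2.0 ** -(delta_sample - delta_cal)


if __name__ == "__main__":
    # Hypothetical Cq values; the calibrator is the tissue with the
    # highest expression, so other tissues give values of at most 1.
    print(relative_expression(24.0, [18.0, 20.0], 22.0, [18.0, 20.0]))  # prints: 0.25
```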

3. Results and Discussion

3.1. Preprocessing and Assembly of Sequence Reads

After preprocessing, 123,856,654 paired reads remained in the four tissue-specific libraries (Table 1). The “in silico normalization” of the pooled reads reduced the total number to 16,236,645. The normalized reads were assembled using the Trinity assembler and 28 individual Velvet/Oases assemblies, producing a total of 1,794,145 contigs (Additional file 1; see Supplementary Material available online at http://dx.doi.org/10.1155/2016/8285428). Filtering and clustering of the contigs using the EvidentialGene pipeline and CD-HIT produced a final set of 41,421 contigs in 17,479 clusters. This final assembly covered a total of 67.79 Mb, with a mean contig length of 1.64 kb and an N50 contig length of 2.22 kb.
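For readers unfamiliar with the N50 statistic, a minimal sketch of its calculation follows: contigs are sorted from longest to shortest, and the N50 is the length of the contig at which the running total first reaches half of the assembly size. The function name and the toy contig lengths are hypothetical; the final line only rechecks the mean contig length from the totals reported above.

```python
def n50(lengths):
    """Length L such that contigs of length >= L hold at least half the assembly."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("no contigs given")


if __name__ == "__main__":
    # Toy example with five contigs (total length 20).
    print(n50([2, 3, 4, 5, 6]))  # prints: 5

    # Mean contig length from the reported totals: 67.79 Mb over 41,421 contigs.
    mean_kb = 67.79e6 / 41421 / 1000
    print(round(mean_kb, 2))  # prints: 1.64
```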

3.2. Assembly Quality Control

We found that 80–89% of the nonnormalized reads could be mapped to the final assembly (84.61% body, 80.30% crop, 88.92% gland, and 87.95% gut). CEGMA identified 235 of the 248 (94.76%) core genes with an ortholog ratio of 2.79. We then mapped 26,950 Drosophila melanogaster protein sequences to the final assembly, and 9961 (36.96%) could be aligned with a mean ortholog hit ratio of 0.71. The mean ortholog number (number of contigs mapping to the same D. melanogaster protein sequence) was 3.5.

3.3. Annotation

TransposonPSI and HMMER searches identified 470 putative transposable elements in 288 clusters. The automatic functional annotation pipeline involving BLASTp searches against different databases revealed 17,864 (43.12%) “high confidence” annotations and 15,155 (36.58%) “hypothetical proteins.” Gene ontology (GO) analysis was used to explore the functional characteristics of all contigs and assign them to three independent categories: biological processes, molecular function, and cellular components (Figure 1). In addition, a BLASTp search against the MEROPS database v9.12 [67] identified 1729 contigs in 577 clusters as peptidases. The identified peptidases represented ~4% of the total number of contigs. This result correlates with data from other organisms where peptidases represent more than 2% of all genes [68].

3.4. Peptidases

Peptidases are proteolytic enzymes that hydrolyze peptide bonds and they are found ubiquitously in all biological systems from viruses to vertebrates. Based on the key amino acid residues responsible for proteolytic activity, six different peptidase classes are recognized (aspartic, cysteine, serine, glutamic, threonine, and metallopeptidases) as well as further unclassified peptidases [69]. From the 1729 L. sericata contigs (in 577 clusters) identified as peptidases, 1655 contigs (in 557 clusters) were assigned to one of five peptidase classes (aspartic, cysteine, serine, threonine, and metallopeptidases) whereas 74 contigs (in 20 clusters) remained unassigned. As summarized in Figure 2, serine peptidases were the most prominent class (837 contigs in 270 clusters) followed by metallopeptidases (565 contigs in 202 clusters), cysteine peptidases (145 contigs in 45 clusters), threonine peptidases (51 contigs in 25 clusters), and aspartic peptidases (57 contigs in 15 clusters). The MEROPS database was used to subdivide the identified enzymes further into clans (peptidases with evolutionarily conserved tertiary structures, orders of catalytic residues, and common sequence motifs around the catalytic site), families (peptidases with similar amino acid sequences), and species (peptidases with similar properties and a unique MEROPS identity) [70, 71]. Accordingly we identified 26 clans containing 48 families and 185 peptidase species (Table 2). We found that almost half of the identified clusters represented serine peptidases in clan PA and family S1.
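The class-level counts quoted above can be checked for internal consistency with a few lines of arithmetic. All numbers are taken from the text; only the variable names are introduced here.

```python
# (contigs, clusters) per peptidase class, as reported in the text.
classes = {
    "serine": (837, 270),
    "metallo": (565, 202),
    "cysteine": (145, 45),
    "threonine": (51, 25),
    "aspartic": (57, 15),
}
unassigned = (74, 20)

assigned_contigs = sum(contigs for contigs, _ in classes.values())
assigned_clusters = sum(clusters for _, clusters in classes.values())

total_contigs = assigned_contigs + unassigned[0]
total_clusters = assigned_clusters + unassigned[1]

# Share of all peptidase clusters that are serine peptidases (whole class).
serine_share = classes["serine"][1] / total_clusters

if __name__ == "__main__":
    print(assigned_contigs, assigned_clusters)  # prints: 1655 557
    print(total_contigs, total_clusters)        # prints: 1729 577
    print(round(serine_share, 3))
```

The per-class figures sum to the 1655 assigned contigs in 557 clusters, and adding the unassigned sequences recovers the 1729 contigs in 577 clusters identified as peptidases.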

GO analysis was then carried out to assign functional categories to each of the identified peptidase clusters. We were able to assign 534 of 577 clusters to three different categories: biological process (345 clusters), molecular function (533 clusters), and cellular component (70 clusters) (Figure 3). We found that most of the peptidases (310 clusters) are involved in the biological process (level 3) category of “primary metabolic process” (Figure 3(a)). The molecular function (level 3) of most peptidases was either catalytic activity (201) or hydrolase activity (253) (Figure 3(b)) as expected given the molecular role of peptidases. Interestingly, only 70 clusters were assigned a cellular component function (Figure 3(c)).

3.5. Aspartic Peptidases

Aspartic peptidases contain an aspartic acid residue at the active site [72]. An aspartic peptidase activity was previously identified in maggot MEPs using class-specific inhibitors [14] and the corresponding gene was shown to be strongly upregulated in L. sericata larvae following an immune challenge [18]. We identified 57 contigs in 15 clusters representing aspartic peptidases, and these were further assigned to two clans, three families, and seven peptidase species (Table 3) with different tissue-specific expression profiles (Additional file 2).

3.5.1. Family A1

Preprocathepsin D-like peptidases are the largest group of aspartic peptidases. The majority of clusters included a signal peptide, propeptide, and mature enzyme containing all of the conserved catalytic and substrate-binding residues found in human lysosomal cathepsin D [69]. With the exception of cluster LST_LS5572 and two incomplete clusters (LST_LS009595 and LST_LS016491), all clusters lacked the polyproline loop (DxPxPx(G/A)P) (Figure 4). The absence of this loop is a characteristic of pepsin and digestive cathepsin D peptidases in the Brachyceran infraorder Muscomorpha and may be associated with the extracellular role of these enzymes [73]. The aspartic peptidase gene previously identified in challenged L. sericata larvae [GenBank: FG360526] is homologous to cluster LST_LS005916, which was predominantly expressed in the larval gut. Aspartic peptidases can kill bacteria in vitro in an acidic medium [74] and may also kill bacteria in the Musca domestica larval midgut (Espinoza-Fuentes, Terra 1987). Based on its localization in the gut and induction by an immune challenge, we propose a similar role for this L. sericata aspartic peptidase. However, heat map analysis (Additional file 2) revealed that the majority of A1 family aspartic peptidases are predominantly expressed in the larval gut, suggesting a role in digestion and/or the elimination of bacteria.

3.5.2. Family A22

This family of intramembrane peptidases comprises two subfamilies. The A22a subfamily is typified by presenilin, an enzyme that plays a central role in intramembrane proteolysis [75] and the pathogenesis of Alzheimer’s disease [76]. The A22b subfamily is typified by impas 1 peptidase, which is responsible for the degradation of liberated signal peptides and may play an essential role in the development of D. melanogaster larvae [77]. We identified four clusters assigned to three peptidase species encoding members of the A22b subfamily, and two of them (LST_LS005714 and LST_LS015224) were strongly upregulated in the salivary glands (Additional file 2) as previously reported in D. melanogaster [77]. We therefore propose a similar function for the L. sericata peptidase in larval development.

3.5.3. Family A28

Only one contig in one cluster was assigned to family A28. A homologous skin aspartic peptidase (SASPase) was recently identified in human skin [78] although its biological role remains unclear.

3.6. Cysteine Peptidases

Cysteine peptidases are ubiquitous and mediate diverse biological processes, including immune responses [79], extracellular matrix remodeling [80], and development and apoptosis [81]. The deregulation of cysteine peptidases is associated with human diseases such as cancer and atherosclerosis [82]. There are 65 cysteine peptidase families listed in MEROPS database v9.12 [62], 56 of which are assigned to 10 annotated clans, whereas 9 families remain unassigned. Our L. sericata transcriptome contained 145 contigs in 45 clusters belonging to 6 clans, 11 families, and 23 peptidase species (Table 4).

3.6.1. Family C1

The biggest family of cysteine peptidases in the L. sericata transcriptome is clan CA family C1, and its members were moderately abundant in all tissues (Additional file 3). Of 16 clusters, 12 were assigned to 7 peptidase species including several with known roles. These include insect 26/29 kDa peptidase, which plays a role in immunity [83, 84], vitellogenic cathepsin B, which degrades vitellogenin [85], and bleomycin hydrolase, whose natural function remains unknown although it can inactivate bleomycin [69]. Heat map analysis (Additional file 4) revealed only three papain homologs present mostly in the larval gut (LST_LS004517, LST_LS006643, and LST_LS006644), and two of them (LST_LS004517 and LST_LS006644) are identical to previously identified partial sequences of L. sericata cysteine peptidases [GenBank: FG360492, FG360504] that are upregulated in response to an immune challenge [18]. Although papain-type cysteine peptidases were previously shown to play an important digestive role in ticks [86], hemipterans [87], and beetles [88], their digestive role in dipteran species remains unclear [89]. Based on the strict localization in the L. sericata gut and induction by an immune challenge, we speculate that these enzymes are probably required for the elimination of bacteria in the larval gut [23] rather than the digestion of food, but additional experiments are needed to confirm their specific function.

3.6.2. Family C13

The C13 family of cysteine peptidases comprises two types of enzymes. The first is the asparaginyl endopeptidases, which were originally found in legumes [90] and later in schistosomes [91], mammals [92], and recently also arthropods [93]. These are acidic lysosomal enzymes that favor asparagine at the P1 position [94] and whose roles include antigen presentation [95], enzyme transactivation [96], and blood meal digestion [97]. The second is the glycosylphosphatidylinositol (GPI):protein transamidases, which are required for the removal of C-terminal peptides and the attachment of GPI anchors [98]. We identified two clusters encoding GPI:protein transamidases and two clusters remained unidentified.

3.6.3. Family C14

Caspases are intracellular endopeptidases that are highly specific for the cleavage of aspartyl bonds. With the exception of caspase 1, which is responsible for the production of interleukin-1β in monocytes [69], most caspases regulate apoptosis by taking part in a protease cascade [99]. The D. melanogaster genome encodes seven caspases. Dronc (Drosophila Nedd2-like caspase), Dredd (death-related ced-3/Nedd2-like), and Strica (serine/threonine rich caspase) possess long N-terminal domains and function as upstream or initiator enzymes, whereas Drice (Drosophila ICE), Dcp-1 (death caspase-1), and Decay (death executioner caspase related to Apopain/Yama) are downstream or effector caspases [100, 101]. Damm (death-associated molecule related to Mch2) caspase shares the features of both groups but its biological role is not fully understood [69]. The L. sericata transcriptome database contained eight clusters in seven peptidase species representing caspases, with different tissue-specific expression profiles (Additional file 4). Phylogenetic analysis (Figure 5) revealed one L. sericata homolog each for the effector caspases Dcp-1 and Decay, two homologs for Drice, one homolog each for the initiator caspases Dredd and Strica, and two homologs for Dronc. We did not find a sequence representing the D. melanogaster Damm caspase.

3.6.4. Other Cysteine Peptidase Families

Several cysteine peptidase families were more or less equally distributed among the L. sericata tissues we tested, and these are probably required for essential cellular functions. The C2 family of calcium-dependent peptidases (calpains) is formed of ubiquitous, intracellular, neutral peptidases, associated with diverse biological functions ranging from signal transduction to apoptosis [102]. Ubiquitinyl hydrolases (family C12) are intracellular enzymes that remove ubiquitin from ubiquitinylated proteins and peptides [69]. Members of family C15 are ubiquitous, intracellular peptidases that remove pyroglutamate from the N-terminus of peptides and hydrolyze biologically active peptides such as neurotensin and gonadotropin [103]. Gamma-glutamyl hydrolases (family C26) are primarily lysosomal enzymes, which are widely distributed in nature and probably required for the turnover of cellular folates [69]. Hedgehog proteins (family C46) are self-splicing, two-domain signaling proteins originally discovered in D. melanogaster [104]. They are found in most metazoan species and play multiple roles in pattern formation during development [105]. Members of family C54, which was first discovered in the budding yeast Saccharomyces cerevisiae, are necessary for autophagy [106]. Otubains (family C65) are isopeptidases involved in the removal of ubiquitin from polyubiquitin [107]. These enzymes share no homology to other deubiquitinylating enzymes but belong to the ovarian tumor family (OTU) and possess a cysteine peptidase domain [108].

3.7. Metallopeptidases

The metallopeptidases are a ubiquitous and highly diverse group of enzymes containing both endopeptidases and exopeptidases. MEROPS database v9.12 [62] lists more than 15 clans and 71 families involved in diverse biological processes such as digestion, wound healing, reproduction, and host-pathogen interactions. Although these enzymes vary widely at the sequence, structural, and even functional levels, all members require a metal ion for catalytic activity [69]. More than 30% of all peptidase clusters in our L. sericata transcriptome database (202 clusters) were found to encode metallopeptidases, which were further assigned to 9 clans, 20 families, and 53 peptidase species (Table 5). The metallopeptidases are therefore the second largest group of peptidases in the L. sericata transcriptome and the most diverse in terms of the number of families. Although the variability and abundance of metallopeptidases in L. sericata indicate their importance, their roles are not well understood and few studies have addressed specific biological activities. A metallopeptidase with exopeptidase characteristics and a pH optimum of 8 was detected in L. sericata MEPs using FITC-casein as a substrate [14] and a partial sequence encoding a clan MC, family M14 metallopeptidase [GenBank: FG360509] was identified among the upregulated genes in immune challenged larvae [18].

3.7.1. Family M1

Family M1 is one of 16 peptidase families found in the genomes of all forms of cellular life [109]. This family mostly comprises membrane-bound or cytosolic exopeptidases that remove the N-terminus of their substrates. However, the specificity of the S1 subsite varies considerably, which allows this family to be involved in many different biological processes [110]. Insect M1 peptidases are mainly expressed in the gut, where they play important intermediate roles in protein digestion [111] as well as host-pathogen interactions. Membrane-bound aminopeptidases in the gut are receptors for Bacillus thuringiensis toxins in several insect species [112–114]. Aminopeptidases have also been detected in other insect tissues, such as the fat body [115], salivary glands [116], and Malpighian tubules [116]. Although their interactions with B. thuringiensis toxins have been confirmed, their endogenous role is unclear [116]. Aminopeptidase N in the hemocoel plays an important role in the postembryonic development of the pest moth Achaea janata [117]. We identified 33 clusters encoding 8 peptidase species (Table 5) and 6 of them are predominantly expressed in the larval gut (Additional file 4).

3.7.2. Family M2

Family M2 contains angiotensin converting enzyme (ACE), the dipeptidyl peptidase that removes dipeptides from the C-terminus of angiotensin. ACE was originally identified in mammals, where it regulates vascular homeostasis [69]. The first insect ACE was found in M. domestica [118] and several ACE paralogs have been identified in every insect genome investigated thus far [119]. Insect ACE cleaves peptides with roles in development [119, 120], reproduction [121], and immunity [122, 123]. Recently, ACE was shown to be involved in aphid-plant interactions by modulating the feeding behavior and survival of aphids on plants [124]. Six ACE paralogs were identified in the D. melanogaster genome, but only Ance and Acer are active enzymes [125]. These enzymes have distinct tissue localization and substrate profiles, but their exact role is unclear. Ance is expressed mostly in the gut and around the reproductive organs, suggesting a role in processing peptides in gut muscle cells [126] and during spermatogenesis [121]. In contrast, Acer was exclusively found in developing heart cells [127]. We identified 26 clusters belonging to the M2 family, 16 of which were assigned to the Ance peptidyl-dipeptidase species (Table 5) and were predominantly expressed in the larval gut (Additional file 4). Based on this localization, we speculate that L. sericata Ance plays a similar role to its D. melanogaster ortholog. Interestingly, D. melanogaster Ance was shown to hydrolyze the two important bioactive peptides angiotensin I and bradykinin [119], which are the major substrates of mammalian ACE. It would be interesting to see whether L. sericata Ance can also cleave these substrates, which would suggest a potential endogenous role in hormonal signaling.

3.7.3. Family M3

The L. sericata transcriptome was shown to contain mitochondrial intermediate peptidase and thimet oligopeptidase, which were expressed similarly in all the tissues we sampled (Additional file 4). Both enzymes are intracellular endopeptidases. Mitochondrial intermediate peptidase processes mitochondrial protein precursors during their import into the mitochondria [128], whereas thimet oligopeptidase degrades small peptides (5–53 residues) with broad specificity and plays an important role in antigen presentation [129].

3.7.4. Family M8

Two L. sericata clusters belong to family M8, which is typified by leishmanolysin, an important virulence factor found in leishmania parasites [130]. Leishmanolysin is a membrane-bound peptidase which degrades extracellular matrix proteins, thus enabling parasite migration [131]. Furthermore, a D. melanogaster M8 metallopeptidase was found to be involved in cell migration during embryogenesis and coordinated mitotic progression [132].

3.7.5. Family M10

Family M10 comprises secreted matrix metalloendopeptidases (MMPs) that are synthesized as inactive proenzymes and become functional in the extracellular environment. MMPs are strongly conserved and have been identified in plants [133], cnidarians [134], nematodes [135], insects (Vilcinskas and Wedde 2002), and humans [136]. As their name indicates, MMPs play important roles in extracellular matrix remodeling and turnover. Aberrant MMP activity is associated with many forms of cancer, making them medically relevant [137]. Most MMPs are oncogenic, that is, higher activity promotes cancer, but some (including MMP3 and MMP8) have the opposite effect [138]. It is difficult to determine their precise individual roles because there are 24 human MMPs with overlapping expression profiles and activities, but insects could be used as a simplified model to probe their functions in more detail. Only two D. melanogaster MMPs have been described [139, 140], as well as three from the red flour beetle Tribolium castaneum [141] and one from the greater wax moth Galleria mellonella [142]. All insect MMPs play important physiological roles and some also promote tumor progression, suggesting they have similar functions to their human counterparts [143]. We identified only four clusters representing L. sericata MMPs, which were assigned to two peptidase species (Table 5). These enzymes were generally expressed at low levels but were slightly upregulated in the larval gut (Additional file 4). The role of these enzymes remains unknown and further studies are required to clarify their physiological functions and whether L. sericata MMPs contribute to the degradation of extracellular matrix proteins in human wounds.

3.7.6. Family M12

Family M12 comprises two subfamilies, namely, subfamily M12A, which is typified by astacin, and subfamily M12B, which is typified by adamalysin. Astacin is an endopeptidase, originally identified in the crayfish Astacus astacus, which is probably involved in digestion [69]. Hundreds of astacins have been identified in many different species, but no examples have yet been identified in plants or fungi [144]. In addition to digestion, astacins may also play roles in embryogenesis and extracellular protein remodeling [145]. Adamalysins are membrane-bound proteins with disintegrin and metallopeptidase domains. They have a broad substrate range and are therefore involved in many important physiological processes, such as protein shedding, development, and spermatogenesis [146]. Adamalysins are also known to facilitate cell signaling and have been implicated in carcinogenesis, making them medically relevant [147]. We identified 13 L. sericata clusters representing subfamily M12A and another 13 clusters representing subfamily M12B. Only one cluster (LST_LS007850) was mainly expressed in the larval gut, indicating a potential role in digestion, whereas the others showed diverse tissue-specific expression profiles and their roles remain unclear.

3.7.7. Family M13

Neprilysin and endothelin converting enzyme (ECE) are the two best characterized members of metallopeptidase family M13 in mammals. Neprilysin is involved in biological processes such as reproduction and the modulation of neuronal activity and blood pressure, whereas ECEs are responsible for the final step in the synthesis of endothelins, which are potent vasoconstrictors [69]. Insect family M13 metallopeptidases are membrane-bound peptidases with a broad substrate range and tissue distribution [125]. The precise biological roles of these enzymes in insects are still unclear, but they are associated with immunity to bacteria, fungi, and protozoa [122, 148], metamorphosis [149], reproduction [150], and neuropeptide metabolism [151]. We identified 34 clusters coding for M13 peptidases in L. sericata and they were predominantly expressed in the larval body following the removal of the gut, crop, and salivary glands (Additional file 4). Among the 34 clusters, 15 were further assigned to 8 peptidase species, whereas 13 remained unassigned and 6 represent nonpeptidase homologs (Table 5).

3.7.8. Family M14

Most family M14 enzymes are carboxypeptidases that remove a single amino acid residue from the C-terminus of polypeptides. Carboxypeptidases are required for digestion and are widely distributed among insects [152], but they also process bioactive peptides (carboxypeptidase E) and hydrolyze bacterial cell walls (γ-glutamyl-(L)-meso-diaminopimelate peptidase I) [69]. Recently, a partial L. sericata sequence encoding an M14 metallopeptidase was found to be upregulated by an immune challenge [18]. We identified 33 clusters representing M14 family metallopeptidases that were differentially expressed among the L. sericata tissues we tested (Additional file 4). These clusters were assigned to 9 peptidase species, whereas 14 remained unassigned and one was shown to represent a nonpeptidase homolog (Table 5). The previously identified M14 peptidase [GenBank: FG360509] was found to be homologous to cluster LST_LS004029, which is strongly expressed in the gut. The localization of this enzyme in the gut and its induction in response to an immune challenge suggest that it contributes to the elimination of ingested bacteria as previously described for L. sericata larvae [23].

3.7.9. Family M16

Family M16 can be divided into three subfamilies: M16A comprising oligopeptidases such as insulysin, nardilysin and pitrilysin, M16B which includes mitochondrial processing peptidase, and M16C which includes eupitrilysin [69]. We identified four M16A clusters and two peptidase species with pitrilysin-like characteristics. Pitrilysin is an endopeptidase originally found in Escherichia coli which is homologous to human insulin-degrading enzyme [153]. We also identified five M16B clusters and three peptidase species similar to mitochondrial processing peptidase, which cleaves the N-terminal signals of mitochondrial proteins during their import from the cytosol [69]. We also identified one M16C cluster representing one peptidase species similar to eupitrilysin.

3.7.10. Family M17

Leucyl aminopeptidase (LAP) is a cocatalytic peptidase (that is, it requires two metal ions for activity) with diverse biological roles [154]. We identified four clusters and two peptidase species similar to LAP, with the strongest expression in gut tissues (Additional file 4). LAPs were previously identified in the digestive organs of blood-feeding parasites including ticks [155], schistosomes [156], and Plasmodium spp. [157] and were found to be involved in the final stage of hemoglobin digestion. Because hemoglobin could also represent part of the L. sericata diet, we can speculate that L. sericata LAPs are similarly required for hemoglobin digestion.

3.7.11. Families M19 and M50

Families M19 and M50 each comprise strictly membrane-bound enzymes. The family M19 dipeptidases degrade extracellular glutathione or inactivate leukotriene D4, whereas the family M50 enzymes regulate gene expression by processing different transcription factors [69]. We identified one cluster coding for a family M19 nonpeptidase homolog and one representing a family M50 S2P peptidase. The latter is a strongly hydrophobic peptidase found on the endoplasmic reticulum membrane. D. melanogaster S2P (ds2p) is required to cleave the sterol regulatory element binding protein (SREBP) and thus helps to regulate lipid biosynthesis [158].

3.7.12. Families M20 and M28

Families M20 and M28 comprise divergent cocatalytic exopeptidases. Family M20 contains only carboxypeptidases, whereas family M28 includes both carboxypeptidases and aminopeptidases. We identified one cluster in family M20, which was tentatively identified as peptidase T, and one cluster tentatively identified as a homolog of D. melanogaster putative protein CG10062. Six unassigned clusters were also identified in family M28.

3.7.13. Family M24

Members of family M24 are mostly intracellular, cocatalytic exopeptidases characterized by the so-called pita-bread fold [159], which have been found in every genome sequence published thus far [109]. They are involved in many fundamental biological processes, including the removal of N-terminal methionine residues from nascent polypeptides (methionyl aminopeptidase), intracellular protein turnover, and collagen metabolism (Xaa-Pro dipeptidase). They are also involved in angiogenesis and their specific inhibitors are therefore sought as potential anticancer drugs [160]. We identified nine clusters representing methionyl aminopeptidases and one Xaa-Pro dipeptidase. Clusters LST_LS003866, LST_LS017277, and LST_LS003028 were predominantly expressed in the larval gut, whereas the other M24 family metallopeptidases were expressed at similar levels in all the tissues we investigated (Additional file 4).

3.7.14. Families M48 and M79

The members of families M48 and M79 are membrane-bound metallopeptidases involved in the release of tripeptides from Saccharomyces cerevisiae mating factor [161] and the Ras oncoprotein [162], to facilitate membrane attachment. Both families are medically relevant because of their ability to regulate the function of Ras, which is involved in many forms of cancer. We identified two clusters representing one peptidase species in family M48 and one cluster representing one peptidase species in family M79. All clusters were expressed at similar levels in all dissected tissues.

3.7.15. Family M49

Dipeptidyl-aminopeptidase III (DPP-3) is an exopeptidase that may be involved in the metabolism of angiotensin peptides and enkephalins in mammals [163]. Insect orthologs of DPP-3 were purified from the foregut membrane of the cockroach Blaberus craniifer [164] and from adult D. melanogaster [165]. Purified DPP-3 hydrolyzed an insect neuropeptide (proctolin), suggesting a role in neuropeptide signaling activity. We identified two clusters and one peptidase species related to DPP-3 expressed at similar levels in all the L. sericata tissues we tested.

3.7.16. Family M67

Family M67 metallopeptidases are responsible for the removal of ubiquitin from ubiquitinylated proteins prior to their degradation in the proteasome. We identified one cluster and one peptidase species representing family M67 expressed at similar levels in all the L. sericata tissues we tested.

3.8. Threonine Peptidases

Threonine peptidases were discovered in 1995 in archaeal proteasomes [166]. They are N-terminal nucleophile peptidases belonging to clan PB and can be divided into three families, namely, T1, T2, and T3. The T1 family comprises peptidases of the proteasome and related compound peptidases. The proteasome plays a central role in intracellular protein turnover and is a large supramolecular complex with many subunits [167]. The T2 family comprises the aspartyl glucosylaminases, which are necessary for the degradation of asparagine-linked glycoproteins [168]. The T3 family comprises the γ-glutamyltransferases, which play a key role in glutathione metabolism [69]. Among 25 L. sericata clusters identified as threonine peptidases, 21 clusters and 14 peptidase species represented family T1, whereas 4 clusters and 2 peptidase species represented family T2 (Table 6). All 25 clusters were expressed in all the L. sericata tissues we investigated and the expression levels were universally low (Additional file 5).

3.9. Serine Peptidases

Serine peptidases require a serine residue for their catalytic activity and represent one of the most abundant and functionally diverse groups of enzymes. Serine peptidases are involved in a broad range of biological processes including digestion, development, immunity, and blood coagulation [169]. MEROPS database v9.12 [62] lists 45 families in 15 clans as well as a further 7 unassigned families. We found that serine peptidases are the largest group of peptidases in the L. sericata transcriptome. We identified more than 800 contigs in 270 clusters, which were assigned to 8 clans, 12 families, and 86 peptidase species (Table 7). These clusters showed a number of distinct tissue-specific expression profiles (Additional file 6).

3.9.1. Family S1

Clan PA family S1 comprises endopeptidases containing the catalytic triad His-Asp-Ser, and this was the largest peptidase family we found in the L. sericata transcriptome. Most S1 peptidases possess an N-terminal signal peptide and are synthesized as propeptides that must be cleaved to generate the active form. S1 peptidases are usually soluble, secreted enzymes, but membrane-bound and inactive homologs have also been described [69]. Many S1 peptidases have been identified in insects, where their roles include digestion [111], immunity [170], wound responses [171], and development [172]. S1 serine peptidases from L. sericata maggots have been associated with several of the beneficial effects of MEPs including blood coagulation [45], biofilm eradication [33], and wound debridement [15]. Although serine peptidases play an important role in MDT, only a small number of complete and partial L. sericata sequences representing these enzymes have been published thus far.

We detected 230 clusters representing S1 peptidases of subfamily S1A, which is typified by chymotrypsin and trypsin. We assigned 216 clusters to 62 peptidase species, whereas 14 clusters represented nonpeptidase homologs (Table 7). Interestingly, only 21 of the 62 species have already been provisionally identified and associated with specific functions, whereas the remaining 41 putative peptidases have not been characterized. Among the identified peptidases, we detected 39 clusters in 7 peptidase species encoding trypsin-like peptidases (S01.110, S01.116, S01.117, S01.130, S01.A83, S01.A87, and S01.A91). These were mainly expressed in the larval gut (Additional file 6) and probably function as digestive enzymes as reported for other insect species [111]. We provisionally identified two chymotrypsin-like peptidase species, namely, chymotrypsin m-type 2 (S01.168) and Jonah 65Aiv (S01.B05), both of which were strongly expressed in the gut (Additional file 6) as discussed in more detail below. We also identified 8 peptidase species (S01.203, S01.413, S01.421, S01.493, S01.502, S01.507, S01.960, and S01.B27) with different tissue-specific expression profiles (Additional file 6) representing immunity-related peptidases. Melanization peptidase (S01.203), prophenoloxidase-activating peptidase (S01.413), and CLIP-domain prophenoloxidase-activating factor (S01.960) are involved in the regulation of invertebrate innate defenses [170]. Proteolytic lectin (S01.493) was first identified in Glossina spp. where it regulates interactions with trypanosome parasites [173]. TmSPE peptidase (S01.507) [174], Persephone (S01.421) [175], Grass (S01.502), and Spirit (S01.B27) [176] facilitate the activation of Toll pathway signaling, which triggers the synthesis of antimicrobial peptides in response to fungi and Gram-positive bacteria [177]. We also identified ovochymase (S01.024), which was discovered in Xenopus laevis eggs and may play a role in fertilization or early development [178], the Easter peptidase (S01.201) required for dorsoventral patterning in D. melanogaster embryos [179], the Tequila peptidase (S01.461) that mediates long-term memory formation in D. melanogaster [180], the proapoptotic DmHtrA2-type mitochondrial peptidase (S01.476) [181], and testis-specific protein 50 (S01.993), which is necessary for spermatogenesis in mammals and is upregulated in breast cancer [182].

All the previously identified L. sericata serine peptidase genes were also identified in the transcriptome dataset (Table 8). Interestingly, only four of these previously described genes could be assigned to peptidase species with a known function, whereas most were identified based on homology to putative proteins in D. melanogaster (Table 8). We also found that although many of the enzymes were detected in MEPs, the corresponding mRNA was predominantly expressed in the larval gut (Additional file 6). The same phenomenon was confirmed for the Jonah-like peptidase, where a high level of Jonah mRNA expression was observed in the gut but the native enzyme was only detected in MEPs [17]. These results indicate that peptidases in MEPs are not produced exclusively by the salivary glands but rather by a combination of the salivary glands, gut, and crop. Although further studies are required to confirm this hypothesis, we suggest that regurgitation and/or vomiting in dipteran species [183] contributes to the production of beneficial MEP molecules in L. sericata larvae.

Cluster LST_LS005873 was tentatively identified as chymotrypsin m-type 2 (S01.168) and this is identical to the previously described L. sericata chymotrypsin I [GenBank: CAS92770]. A recombinant form of this enzyme was shown to degrade wound eschar ex vivo [15] and to degrade microbial surface components recognizing adhesive matrix molecules from the slough [184]. As shown in Additional file 6, we identified seven further clusters representing the same peptidase species (S01.168). Another recombinant serine peptidase known as sericase [GenBank: AAA17384] was shown to enhance fibrinolysis [44]. Sericase was found to be identical to L. sericata trypsin-like serine peptidase, which was proposed to facilitate wound debridement [185]. We identified three clusters (LST_LS007476, LST_LS007613, and LST_LS010750) homologous to sericase, which represent one peptidase species provisionally identified as D. melanogaster putative protein CG7542 (S01.B07). Moreover, the most prominent cluster (LST_LS007476) was also found to be homologous to a previously identified serine peptidase [GenBank: FG360529], which is induced at the transcriptional level following an immune challenge [18]. Our data indicate that sericase is probably involved in several MEP functions including fibrinolysis, debridement, and immune responses.

Debrilase [GenBank: AJN88395] is a serine peptidase known to play a role in L. sericata MDT. Debrilase is homologous to cluster LST_LS015273, which along with another eight clusters represents one peptidase species provisionally identified as D. melanogaster putative protein CG17571 (S01.A85). All the clusters share the same tissue-specific expression profile with strong upregulation in the gut (Additional file 6). Recently, L. sericata MEPs were shown to reduce the clotting time of human plasma, and this phenomenon was attributed to a serine peptidase activity [45]. Recombinant Jonah-like chymotrypsin was confirmed to reduce the clotting time of human plasma and to degrade certain extracellular matrix proteins [17]. We found 10 clusters representing one peptidase species, tentatively identified as Jonah 65Aiv (S01.B05). These clusters were predominantly expressed in the gut (Additional file 6). Interestingly, cluster LST_LS015269 was found to be homologous to an L. sericata serine peptidase [GenBank: FG360505] that is upregulated in immune-challenged larvae, thus indicating a role in immunity [18].

3.9.2. Family S8

Family S8 comprises two subfamilies of enzymes. Subfamily S8A is typified by the endopeptidase subtilisin, originally identified in Bacillus subtilis [69], and also includes tripeptidyl-peptidase II, an exopeptidase involved in general intracellular protein turnover [69]. Subfamily S8B is typified by kexin (whose mammalian homolog is known as furin), which processes numerous proteins ranging from growth factors and chemokines to extracellular matrix proteins [186], and is therefore associated with diseases such as Alzheimer’s disease, atherosclerosis, and cancer [187]. We identified three clusters representing L. sericata subtilisin-like enzymes, two clusters similar to tripeptidyl-peptidase II, and two clusters representing furin-like enzymes (Table 7).

3.9.3. Family S9

The family S9 prolyl oligopeptidases are intracellular enzymes that strictly cleave substrates containing proline residues, and they are thought to process neuropeptides in humans [188]. Interestingly, a prolyl oligopeptidase was recently identified in the human parasite Schistosoma mansoni. Although the enzyme is not secreted by the parasite, it cleaves the human vasoregulatory peptides bradykinin and angiotensin I in vitro, thus potentially modulating or dysregulating homeostasis in its host [189]. We identified 9 clusters and 4 peptidase species representing L. sericata prolyl oligopeptidases. The clusters showed different tissue-specific expression profiles but three of them (LST_LS016452, LST_LS001966, and LST_LS016700) were predominantly expressed in the gut and/or the salivary glands (Additional file 6). This specific distribution in L. sericata tissues associated with the production of MEPs indicates a potential role in wound homeostasis but more detailed experiments are required to confirm this hypothesis.

3.9.4. Family S10

Family S10 comprises lysosomal carboxypeptidases with predominantly regulatory functions, although hemipteran S10 peptidases were recently shown to be involved in the digestion of food [152]. Among four family S10 clusters identified in L. sericata, two (LST_LS003337 and LST_LS015778) were strongly upregulated in the gut, whereas the others (LST_LS004162 and LST_LS016840) were expressed at similar levels in the L. sericata tissues we investigated. The clusters induced in the gut were assigned to peptidase species without a known function, whereas the other two were annotated as vitellogenic carboxypeptidase-like proteins, suggesting a role in vitellogenesis.

3.9.5. Families S14 and S41

Family S14 comprises cytosolic ATP-dependent Clp endopeptidases and their homologs. Clp peptidases together with their ATP-binding subunits create an oligomeric complex of 20–26 subunits [69] that mediates protein quality control and regulatory degradation [190]. Family S41 comprises endopeptidases that are involved in the degradation of incorrectly synthesized proteins. They possess the catalytic tetrad Ser–His–Ser–Glu, which is unusual for serine peptidases, and neither the position of the active site residues nor the residues themselves are conserved [69]. We found that the L. sericata endopeptidases of families S14 (one cluster in one peptidase species) and S41 (three clusters in two peptidase species) were expressed at similar levels in all the tissues we analyzed and are likely to be involved in the regulation of protein synthesis.

3.9.6. Family S16

The family S16 enzyme Lon is a bacterial ATP-dependent endopeptidase containing a peptidase domain and an ATP-binding domain in a single subunit. Similar enzymes are found in many other organisms [109] where they facilitate the degradation of unfolded proteins [191]. We identified three clusters assigned to two peptidase species, which were present in all L. sericata tissues.

3.9.7. Family S26

Family S26 consists of ubiquitous, membrane-bound enzymes with a catalytic dyad, which are involved in the cleavage of signal peptides, thus facilitating the secretion of proteins [69]. We identified two L. sericata family S26 clusters and two peptidase species representing signal peptidases and another cluster that remained unassigned (Table 7). Cluster LST_LS009731 was strongly expressed in the salivary glands, which are known to secrete large amounts of protein, thus indicating a role in protein secretion.

3.9.8. Family S28

Family S28 comprises the lysosomal Pro-Xaa carboxypeptidases, which are lysosome-specific exopeptidases found solely in eukaryotes, featuring an unusual selectivity for the cleavage of Pro-Xaa bonds. In humans, such enzymes inactivate angiotensin II and activate plasma kallikrein [192]. We identified three L. sericata family S28 clusters assigned to two peptidase species that were upregulated in the crop and larval body samples (Additional file 6). The precise role of these enzymes remains unclear although they may contribute to the procoagulation activity of MEPs as recently described [45].

3.9.9. Family S49

Only one cluster in one peptidase species was assigned to family S49. Family S49 comprises the signal peptidases required for intracellular protein processing and the regulation of protein export [193].

3.9.10. Family S51

Family S51 is typified by aspartyl dipeptidase, an exopeptidase originally identified in Salmonella typhimurium [194] that hydrolyzes α-aspartyl bonds. The crystal structure of the S. typhimurium aspartyl dipeptidase has been solved, revealing an unusual catalytic triad with Ser and His as predicted but Glu instead of Asp [195]. The biological role of this enzyme is not clear, but it seems to be involved in the production of nutritional amino acids [69]. Two clusters belonging to one peptidase species were identified in the L. sericata transcriptome.

3.9.11. Family S54

Family S54 is typified by Rhomboid-1, an intramembrane enzyme identified in D. melanogaster [196] that plays an important role in embryonic development by cleaving the Spitz protein and thus activating the epidermal growth factor receptor [197]. We identified three clusters in three peptidase species, which were expressed in all L. sericata tissues.

3.10. qRT-PCR Verification of Gene Expression

To experimentally verify the results from digital gene expression analysis, we performed qRT-PCR analysis of one peptidase gene from each peptidase class (aspartic, cysteine, metallo, threonine, and serine). These genes (Table 9) code for peptidases with various physiological functions and show different tissue expression profiles. All the tested genes showed expression profiles similar to those obtained by digital gene expression analysis (Figure 6).
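One simple way to quantify the agreement between the two methods is a rank correlation across tissues. The sketch below is illustrative only: it is not the analysis reported in this study, the function names are hypothetical, and the numeric values are invented placeholders rather than measured data. It computes Spearman's rank correlation between the RPKM profile and the qRT-PCR profile of a single gene, using average ranks for ties.

```python
import math


def average_ranks(values):
    """Return 1-based ranks of the values, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical expression profiles for one gene, in the tissue order
# salivary glands, gut, crop, rest of the larval body (placeholder numbers).
rpkm_profile = [3.1, 120.5, 15.2, 8.4]
qpcr_profile = [1.0, 40.2, 6.3, 2.7]
rho = spearman(rpkm_profile, qpcr_profile)
```

A coefficient close to 1 means that both methods rank the tissues in the same order for that gene. With only four tissues the coefficient is coarse and should be read as a descriptive summary, not a significance test.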

4. Conclusion

The purpose of this study was to provide an overview of the distribution of proteolytic enzymes in L. sericata, focusing on the tissue-specific expression profiles and potential functions as the basis for further more detailed studies of individual peptidases. We identified 577 clusters representing five classes of proteolytic enzymes (aspartic, cysteine, threonine, serine, and metallopeptidases) which were further assigned to 26 clans, 48 families, and 185 peptidase species with diverse tissue-specific patterns of distribution. We identified all previously described therapeutic peptidases and found that most of them were most highly expressed in the larval gut, thus indicating that the larval gut contributes to the production of beneficial enzymes found in the MEPs. Although the majority of the enzymes we identified were serine peptidases, most of them were novel putative peptidases whose function is unclear, but whose distinct tissue-specific expression profiles indicate an important role in MEP activity. Several peptidases with the most intriguing expression profiles have been prepared as synthetic genes, allowing the functional analysis of the corresponding recombinant enzymes.

Additional Points

The dataset supporting the results of this paper is available in the EBI Short Read Archive (SRA) repository under the accession number PRJEB7567. The complete study can be accessed directly at http://www.ebi.ac.uk/ena/data/view/PRJEB7567.

Competing Interests

The authors declare that they have no competing interests.

Acknowledgments

The authors acknowledge funding by the Hessen State Ministry of Higher Education, Research and the Arts including a generous grant for the LOEWE research center “Insect Biotechnology and Bioresources.” Zdeněk Franta was supported by an Alexander von Humboldt Research Fellowship for Postdoctoral Researchers. The authors would like to thank Andre Baumann for his help with qRT-PCR analysis. The authors thank Richard M. Twyman for editing of the paper.

Supplementary Materials

Supplementary Table 1 shows the number of L. sericata transcripts assembled with Trinity and Oases. K-mer sizes vary between 21 and 75 for the Oases assembly. Supplementary Figures 1 to 5 show heat maps with the relative expression levels of all L. sericata peptidases. The individual clusters encoding aspartic (Fig1), cysteine (Fig2), metallo (Fig3), threonine (Fig4), and serine (Fig5) peptidases are depicted on the left, while the corresponding clans, families, and MEROPS IDs are depicted on the right. Shown are log2-transformed RPKM values (blue represents genes expressed at lower levels, while red represents highly expressed genes). Supplementary Figure 6 shows the melting curve analysis of all genes which were tested via qRT-PCR.
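For readers who wish to reproduce values of the kind plotted in the heat maps, the transformation can be sketched as follows. RPKM here follows the standard definition (reads per kilobase of transcript per million mapped reads). The pseudocount of 1 added before taking the logarithm is an assumption made for this sketch so that clusters with zero reads stay finite; the source does not state how zero values were handled, and the function names are hypothetical.

```python
import math


def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (reads per million)
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


def log2_rpkm(read_count, transcript_length_bp, total_mapped_reads,
              pseudocount=1.0):
    """log2-transformed RPKM; the pseudocount is an assumption of this sketch."""
    value = rpkm(read_count, transcript_length_bp, total_mapped_reads)
    return math.log2(value + pseudocount)


# Hypothetical example: 500 reads mapped to a 2,000 bp contig in a library
# of 10 million mapped reads gives an RPKM of 25.
example_rpkm = rpkm(500, 2000, 10_000_000)
example_log2 = log2_rpkm(500, 2000, 10_000_000)
```

Normalizing by both contig length and library size is what allows clusters of different lengths, sequenced to different depths in each tissue library, to be compared on one color scale.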
