Abstract

Saliva of bloodsucking arthropods contains dozens or hundreds of proteins that affect their hosts' mechanisms against blood loss (hemostasis) and inflammation. Because acquisition of the hematophagous habit evolved independently in several arthropod orders and at least twice within the true bugs, there is a convergent evolutionary scenario that creates a different salivary potion for each lineage that evolved independently to hematophagy. In addition, the immune pressure posed by their hosts creates further evolutionary pressure on the genes coding for salivary proteins, including gene obsolescence, which opens the niche for coopting new genes (exaptation). In the past 10 years, several salivary transcriptomes from bloodsucking Heteroptera and one from a seed-feeding Pentatomomorpha were produced, allowing insight into the salivary potion of these organisms and the evolutionary pathway to the blood-feeding mode.

1. Introduction

The order Hemiptera (bugs) comprises hemimetabolous insects having in common tubular mouthparts specialized for sucking liquid diets. The diet of Hemiptera is varied, the majority feeding on plants by either tapping the vessels conducting sap or by lacerating and flushing tissues such as leaves or seeds. Within the suborder Heteroptera (true bugs), predatory feeding (with killing of the victim) also occurs, mostly targeting other insects but also small vertebrates, as in giant water bugs and toad bugs, as does blood or hemolymph feeding (without killing the victim) from vertebrate and invertebrate animals. The mouthparts are not only important for channeling the liquid meal but are also mechanically essential in finding the proper spot for meal suction [1].

Saliva is produced, sometimes copiously, during the probing phase (the time between mouthpart contact with the food substrate and the commencement of the meal) and throughout the meal [31, 32]. This saliva is ejected at the tip of the maxillae by the salivary channel, which is built in between the interdigitations of the two plates that form the maxillae [33]. Saliva helps probing and feeding physically by liquefying insoluble or viscous tissues or by helping to seal the feeding site in sap suckers, where the phloem is under very high pressure [34]. Saliva has a biochemical role in aiding digestion of the meal, just as we have amylase in our own saliva; most remarkably, predacious bugs inject a highly hydrolytic cocktail into their victims, which are digested while held by the predator; the predator can then suck the liquefied contents and discard the victim as an empty shell. Saliva can also work pharmacologically by preventing the hosts’ defense mechanisms against tissue loss, as occurs with the saliva of blood-feeding insects in preventing blood clotting, for example [35].

Among the Heteroptera, the blood-feeding habit evolved at least twice in the Cimicomorpha families, once in the Cimicidae (containing the bed bug) including the small sister group Polyctenidae (bat bugs), and once in the Reduviidae (kissing bugs) from possible predacious or hemolymph-sucking ancestors [1]. Within the Reduviidae, it is possible that the genus Rhodnius (tribe Rhodnini) is monophyletic, having evolved independently of the remaining triatomines (tribe Triatomini) [47–49]. The ancestral Cimicomorpha dates back to the Triassic/Jurassic border, over 250 MYA [1]; accordingly, the habit of blood or hemolymph feeding started in this group well before mammals radiated. Within these hematophagous bugs, blood is the only diet for all immature and adult stages.

To obtain blood in a fluid state, these bugs have to counteract their host’s hemostasis, the physiologic process that prevents blood loss, which includes the triad of platelet aggregation, blood clotting, and vasoconstriction. Blood-circulating platelets may be triggered to aggregate by various signals, including ADP (from broken cells and also released by activated platelets), collagen from subendothelial surfaces, thrombin (produced during blood clotting), and thromboxane A2 (TXA2, produced by activated platelets). Blood clotting may be initiated by activation of the intrinsic pathway via activation of Factor XII or by activation of the tissue factor pathway, both converging on the activation of Factor X to Xa, which activates prothrombin to thrombin, which in turn cleaves fibrinogen into fibrin, forming the blood clot. Activated platelets produce the vasoconstrictor TXA2 and also release stored serotonin and epinephrine, both powerful vasoconstrictors. A single magic bullet cannot properly destroy a redundant and complex obstacle such as this; rather, a magic potion of several antagonists is required.

Saliva of hematophagous arthropods also contains activities that interfere with the host’s immune and inflammatory system in the form of immunomodulatory substances, particularly in ticks, which stay attached to their hosts for days or weeks, in contrast to minutes of host contact by bloodsucking bugs. Saliva also contains antimicrobial compounds that might help to control bacterial growth in the meal, because ejected saliva is reingested with the blood meal during blood feeding. For more detailed reviews on host hemostasis and immunity, see Francischetti et al. [56, 57].

Salivary anticlotting compounds from bloodsucking insects have been known to occur for nearly 100 years [58], while antiplatelet activity was first detected in the 1980s [59, 60] and vasodilators have only been described since the early 1990s [61]. In blood-feeding bugs, anticlotting agents from the saliva and crop of Rhodnius were described first by Hellman and Hawkins in 1965 [62], the first antiplatelet activity was reported in 1981 [59, 63], and Rhodnius salivary vasodilator was reported in 1990 [64]. Anticomplement activities have also been found [65], as well as antihistamine, antiserotonin [66], and antithromboxane [67] activities from Rhodnius saliva. An anesthetic was found in Triatoma infestans saliva in 1999 [68]. None of these earlier reports characterized the molecular nature of the compounds; most of these characterizations have been achieved in the past 20 years during the so-called “grind and find” period of discovery. Table 1 lists the molecularly characterized salivary components of blood-feeding Hemiptera.

2. On Sialomes

In the past 10 years, a new method to unveil the salivary potion of hematophagous insects has been practiced in the form of decoding their sialotranscriptomes (from the Greek, sialo = saliva), achieved by random sequencing of 500–2,000 cDNA clones originating from polyA-enriched RNA from the salivary glands of these animals. After assembly of these sequences into contigs (which represent full or near full-length mRNA), these can be compared by bioinformatic tools such as BLAST and rpsblast [69] to other proteins in public databases (such as Swissprot, Gene Ontology [70], and GenBank [71] protein data banks, and CDD, PFAM, SMART and KOG [72], which are motif databases to be explored with the rpsblast tool) to identify closely related sequences and functional motifs. Additional searches for signal sequences indicative of secretion [73], for transmembrane helices [74], and for glycosylation sites [75] are also helpful to attempt functional classification of the protein. We are now on the eve of another revolution, with an increase of thousands of fold in the number of sequences that can be economically sequenced from these libraries, which will allow identification of the less expressed (and possibly most potent) proteins.
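The comparison step described above can be sketched in a few lines of Python. The sketch assumes that the BLAST results were saved in the standard 12-column tabular format (`-outfmt 6`), which the text does not specify; the contig names, database identifiers, scores, and the E-value cutoff are hypothetical and serve only as illustration.

```python
import csv
import io

# Hypothetical BLAST tabular output (-outfmt 6); identifiers and scores are made up.
BLAST_TSV = """\
contig_1\tsp|EXAMPLE_A\t62.5\t120\t45\t0\t1\t120\t5\t124\t1e-40\t150.0
contig_1\tsp|EXAMPLE_B\t40.0\t110\t66\t2\t3\t112\t9\t118\t1e-12\t65.0
contig_2\tsp|EXAMPLE_C\t30.0\t80\t56\t3\t1\t80\t2\t81\t0.5\t28.0
"""

# Column names of the default BLAST tabular format.
FIELDS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def best_hits(tsv_text, max_evalue=1e-5):
    """Return {contig: best database match}, keeping only significant hits.

    The best hit is the one with the highest bit score among hits whose
    E-value does not exceed max_evalue (an assumed, adjustable cutoff).
    """
    best = {}
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if not row:
            continue  # skip blank lines
        hit = dict(zip(FIELDS, row))
        evalue = float(hit["evalue"])
        bits = float(hit["bitscore"])
        if evalue > max_evalue:
            continue  # not significant under the assumed cutoff
        current = best.get(hit["qseqid"])
        if current is None or bits > current[1]:
            best[hit["qseqid"]] = (hit["sseqid"], bits)
    return {contig: subject for contig, (subject, _) in best.items()}


if __name__ == "__main__":
    for contig, subject in best_hits(BLAST_TSV).items():
        print(contig, "->", subject)
```

Contigs left without a significant match (here, the hypothetical `contig_2`) are the candidates for the lineage-specific or orphan families discussed later in this review.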

So far, 12 sialotranscriptomes—all done with less than 3,000 sequenced clones per organism—have been reported from Heteroptera, 11 of which are from blood-feeding Cimicomorpha and one from the seed-feeding Oncopeltus fasciatus. Oncopeltus belongs to the Pentatomomorpha, the closest group to Cimicomorpha [76] (Table 2). Among the Cimicomorpha sialotranscriptomes, only one derives from Cimicidae (the bed bug Cimex lectularius); the remaining are from Triatominae, encompassing four genera (Rhodnius, Triatoma, Dipetalogaster, and Panstrongylus), although some of these transcriptomes have no proteins deposited in public databases and too few expressed sequence tags (ESTs) publicly available. A few isolated protein sequences are also available from GenBank, deriving mostly from predatory bugs. The publicly available proteins are displayed together in Additional File 1, which is a hyperlinked Excel spreadsheet where the putative secreted proteins are organized in one worksheet and the putative housekeeping proteins are displayed in another worksheet.

The secreted proteins can be classified in two major groups, those belonging to ubiquitous protein families and those of unique status among the Hemiptera family, genus, or even species level (Table 3). We will proceed to describe the protein families in the order shown on Table 3.

3. Ubiquitous Protein Families

3.1. Enzymes
3.1.1. Apyrase, 5′-Nucleotidase, and NUDIX Hydrolase

Apyrases are enzymes that can hydrolyze ATP and ADP to AMP [77–79]. Initially the existence of true apyrases was doubted, because they could originate from a mixture of enzymes such as adenylate kinase and ATPases; however, their real intracellular existence in the potato was shown later [79, 80], and their function in carbohydrate anabolism and in the promotion of glycosyltransferases was only much later discovered, as indicated [81, 82]. The role of extracellular apyrases in preventing platelet aggregation was demonstrated for the first time in Rhodnius saliva [63, 83, 84] and later shown in the saliva of mosquitoes [85–87] and in the vascular endothelium [88–90]. The activity from Cimex lectularius was purified and cloned, revealing a new type of enzyme that is ubiquitous in nature [26, 91, 92]. That for T. infestans, though, was found to belong to a completely different family, that of the 5′-nucleotidase family of enzymes [12]. Interestingly, sand flies [93] express salivary apyrases of the Cimex type, while mosquito salivary apyrases belong to the 5′-nucleotidase family [87, 94], clear examples of convergent evolution.

Nudix hydrolases or bis(5′-nucleosidyl)-tetraphosphatases (EC: 3.6.1.17) are enzymes that hydrolyze nucleotides joined by their phosphate groups such as AP4A or AP5A in the case of diadenosine nucleotides, which are known agonists of platelet aggregation and inflammation [95–98]. The C. lectularius sialotranscriptome presents clear evidence of such enzymes, but the activity in salivary homogenates was never studied.

Lacking in these Heteroptera sialotranscriptomes are additional nucleotide-acting enzymes, such as endonucleases, found in mosquitoes and sand flies [99–101], and adenosine deaminase, found also in mosquitoes and some, but not all, sand flies [102–104].

3.1.2. Acetylcholinesterases

Four well-expressed and closely related isoforms of a typical acetylcholinesterase enzyme were found in the sialotranscriptome of C. lectularius [45]. A single transcript from the same family was also found in Triatoma matogrossensis. Although most acetylcholinesterases are extracellular membrane-bound enzymes by virtue of a glycophosphatidyl-inositol membrane anchor in their carboxy termini, these Cimicomorpha enzymes lack this terminal region, and thus these enzymes are secreted. The role of these enzymes in blood feeding is not yet apparent.

3.1.3. Inositol Triphosphate Phosphatases (IPPase) Including Cimex Nitrophorin

This family of proteins has been found ubiquitously in the sialomes of bloodsucking Cimicomorpha, including the well-characterized enzyme from R. prolixus [8] and the C. lectularius nitrophorin [27–29], a protein found associated with a heme moiety and a carrier and stabilizer of nitric oxide (NO), a very reactive gaseous substance that is also a potent vasodilator and platelet aggregation inhibitor. While the function of Cimex nitrophorin is without question, the function of an extracellular inositol phosphatase is puzzling, because these inositol phosphates are intracellular and not available to an extracellular enzyme. Indeed, it appears fitting that inositol polyphosphates should be hydrolyzed, because they perform a proplatelet aggregation function as well as proinflammatory and immune-enhancing roles in leukocytes [105, 106]. Perhaps the enzyme may reach the intracellular pool by some not yet understood mechanism. On the other hand, association of heme with inositol phosphatases seen in the case of Cimex nitrophorins is not at all common, being unique to these proteins; investigation of the amino acids that are associated with heme binding does not reveal similarities to other IPPases from either vertebrates or invertebrates (Ribeiro, unpublished).

The phylogram of the IPPase sequences found in Additional File 1 (Figure 1) shows the Cimex nitrophorins contained within a strong clade with 100% bootstrap support and constituted by at least three subclades representing at least three genes expressing these NO transporters, plus alleles or other genes. Cimex has two additional sequences outside the nitrophorin clade and near the IPPase clade of the remaining triatomines. It is thus interesting that both Cimex and triatomines have a common IPPase in their sialome, even though we have no idea of their function. IPPases have not been found in any other transcriptome so far done, including those of mosquitoes, sand flies, biting midges, black flies, and ticks, being thus uniquely from Cimicomorpha blood feeders.

3.1.4. Serine Proteases

Serine proteases are commonly found in the sialotranscriptomes of insects and ticks, as well as in those of Heteroptera. An unusual serine protease activity in the saliva of T. infestans has been noted before, but only a partial purification of the enzyme, named triapsin, was achieved [107]. Within the bloodsucking Heteroptera, only one Panstrongylus megistus sequence has been molecularly characterized as a fibrinolytic enzyme [30]. Additional File 1 shows such proteins from Cimicomorpha, including plant-feeding bugs such as Lygus lineolaris, Lygus hesperus, and Creontiades dilutus [108–110]. The phylogram of these enzymes (Figure 2) shows two well-defined clades, one containing most of the Lygus sequences, but also two T. matogrossensis and one T. brasiliensis sequence, within a clade of 86% bootstrap support, suggesting a common ancestral salivary serine protease for plant- and blood-feeding Cimicomorpha. The fibrinolytic enzyme of Panstrongylus shares a strongly supported clade with two other T. matogrossensis sequences, which are probable orthologs of the Panstrongylus gene. The Cimex sequence appears as an outlier to the group. Rhodnius sialotranscriptomes have not revealed proteases, and its saliva does not hydrolyze the substrates used in the characterization of the T. infestans triapsin (Ribeiro, unpublished).

3.1.5. Other Enzymes

A chitinase and a lipase were found in Oncopeltus, while T. matogrossensis displayed a salivary phospholipase and a metalloprotease. The precise role of these enzymes is unknown. Salivary metalloproteases in ticks have been associated with fibrinolytic and antiangiogenic activities [111, 112], while the Oncopeltus enzyme may be associated with digestive or antifungal functions.

3.2. Protease Inhibitor Domains
3.2.1. Kazal Domain-Containing Peptides

The Kazal domain occurs in many protease inhibitors, and its structure was first determined for the proteinase inhibitor IIA from bull seminal plasma [113]. The sialotranscriptome of members of South American Triatoma (T. infestans, T. matogrossensis, and T. brasiliensis) but not North American T. dimidiata or T. rubida, nor any other sialotranscriptome of Cimicomorpha, abounds with transcripts coding for proteins containing this domain; however, none have been so far characterized functionally. In Rhodnius, Triatoma, and Dipetalogaster, the crop antithrombin has been characterized as a protein containing two such domains [114–117], but the salivary anticlotting agents of Rhodnius and Triatoma have been shown to be different lipocalins named prolixin S and triabin [2, 21, 118]. Kazal-type peptides can function as antimicrobials by inhibiting microbial exoproteases essential for their survival [119, 120] and can also work as vasodilators, as in the case of a tabanid salivary protein named vasotab, which is suspected to modify ion channels [121]. These functions should be taken into consideration in functional assays of the recombinant Kazal peptides.

3.2.2. Serpin

The serine protease inhibitor (serpin) family is ubiquitous in nature, functioning mostly as endogenous regulators of proteolytic cascades such as inhibiting thrombin in vertebrates (plasmatic antithrombin 3) or regulating phenol oxidase activation cascades in invertebrates [122, 123]. The salivary anticlotting proteins of Aedes mosquitoes (but not those of anopheline mosquitoes) are members of this family [124, 125]. A single sequence of this family, derived from four ESTs, was found in the sialotranscriptome of C. lectularius. Its target is still unknown.

3.2.3. Pacifastin and Cystatin

Proteins containing these domains were only found in the sialotranscriptome of Oncopeltus. Pacifastins are typical serine protease inhibitors of insects and crustaceans [126], while cystatins are ubiquitous proteins typically inhibiting cysteine proteases [127]. Although a single EST was found coding for the pacifastin peptide, five well-expressed cystatins were identified in Oncopeltus. The targets of these peptides are unknown, but it was suggested that the salivary cystatins may prevent plant apoptosis induced by cysteine proteases [46, 128, 129]. Tick sialomes have revealed cystatins that were shown to inhibit inflammation and maturation of dendritic cells in their hosts [130].

3.3. Lipocalins

The term lipocalin literally means a cup of lipid, as these proteins form a barrel with a hydrophobic interior cavity that is suitable to transport lipids and other hydrophobic compounds in an aqueous milieu [131–133]. There is virtually no sequence conservation in the family, which is recognized by its typical 3D structure composed of a repeated +1 topology β-barrel. This protein family is by far the most abundant in sialotranscriptomes of triatomine bugs (see review [132]) but remarkably absent in Cimex and Oncopeltus; however, it was also abundantly recruited in tick sialomes [56], another case of convergent evolution. Additional File 1 provides 331 lipocalins, which is more than half of all putative secreted proteins listed in this work. Several of these proteins may be alleles of the same gene. The sheer size of the family in individual species is indicative of gene duplication events that might have had an impact during the evolution of blood-feeding [134–136]. Following gene duplication (by retrotransposition or, more commonly, by forming tandem repeats due to transposable element recombination), the new genes can lead to an increased transcript load in a particular organ or tissue. If this augmented expression increases fitness (e.g., helps the bug to feed), the gene will persist; otherwise, it will evolve to be a pseudogene [137]. Once genes are duplicated and fitness is increased by the duplication, these are free to evolve independently and to diverge from each other by acquisition of novel functions. Salivary genes of bloodsucking arthropods are under selection by two different processes. First, the gene can evolve in the direction of fine tuning its function in relationship to its target. For example, a bug feeding on a bird may have “ideal” anticlotting, but if ecologic changes appear and the bug shifts to another host, this anticlotting may still work but have some room for improvement (e.g., by increasing its affinity to the specific thrombin).
Second, any protein injected into the skin of a vertebrate is capable of eliciting an immune reaction, which may lead to defensive host behavior following mast cell degranulation or complement-mediated local inflammation, leading to interruption of the meal or killing of the insect. This may lead to a scenario of balanced polymorphism, with the least common epitope being the best one to have, thus multiplying the number of different alleles in a population that are selected to have the same optimal function but the least common antigenicity. Host immune pressure can also lead to gene obsolescence, creating a niche for cooption (exaptation) of new genes, including horizontal transfer [138], which may substitute for the lost function and thus may explain the appearance of novel salivary genes in related organisms [139].
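The balanced-polymorphism argument can be illustrated with a toy model of negative frequency-dependent selection. The sketch below is not taken from the source: it assumes a haploid, deterministic model in which the fitness of an allele falls linearly with its own frequency (fitness = 1 − s × frequency), standing in for the host immune response being strongest against the most common epitope. The function names, the selection coefficient `s`, and the starting frequencies are all hypothetical.

```python
def next_generation(freqs, s=0.5):
    """One round of selection under the assumed linear fitness model.

    An allele at frequency p has fitness 1 - s * p, so common epitopes
    (more often recognized by the host) are penalized.
    """
    fitness = [1.0 - s * p for p in freqs]
    mean_fitness = sum(p * w for p, w in zip(freqs, fitness))
    return [p * w / mean_fitness for p, w in zip(freqs, fitness)]


def simulate(freqs, generations=200, s=0.5):
    """Iterate selection and return the final allele frequencies."""
    for _ in range(generations):
        freqs = next_generation(freqs, s)
    return freqs


if __name__ == "__main__":
    start = [0.90, 0.07, 0.03]  # one common and two rare alleles (made up)
    final = simulate(start)
    print([round(p, 4) for p in final])
```

Under these assumptions the rare alleles increase and the common one declines until all are maintained at similar frequencies, which is the qualitative outcome described above: several alleles with the same function but different antigenicity persist in the population.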

Lipocalin functions in triatomines are multiple and linked to their unique barrel when working as kratagonists (from the Greek kratos = seize) [140], which are binders of relatively small agonists such as biogenic amines, TXA2, leukotrienes, or ADP, or carrying the heme that carries NO in Rhodnius nitrophorins, or functions linked to their side chains when they work as anticlotting agents such as triabin (for references for these functions, see Table 1). Uniquely, the protein nitrophorin 2 from R. prolixus has three functions: (i) it carries NO, (ii) it binds histamine, and (iii) it is an inhibitor of the activation of Factor X [5, 141]. Notice that contrary to their names as “lipid cups,” many of these lipocalin ligands are well charged and not hydrophobic, such as biogenic amines and ADP. The functions of the salivary lipocalins in ticks are similarly associated with their kratagonist activity toward biogenic amines or arachidonic acid derivatives, or as inhibitors of complement activation [142–148].

A phylogram of the triatomine lipocalins, although a bit overwhelming in size, presents a bird’s-eye view of the several distinct families arranged mostly in robust clades (Figure 3 and Additional File 2). Most clades have not a single member that has been analyzed functionally (marked with Roman numerals in Figure 3), including the clade containing the Triatoma protracta antigen procalin; accordingly there are eight clades that have no known function. Additional File 2 is provided for high-resolution display of the sequences, which have their NCBI accession numbers for sequence retrieval. A few details deserve some comments with respect to the phylogram. (i) The clade named Pal-Tri-Dip contains the Triatoma proteins pallidipin, triplatin, and the Dipetalogaster protein dipetalodipin, which are platelet inhibitors possibly all due to being TXA2 kratagonists as demonstrated for triplatin and dipetalodipin [25], thus indicating the conservation of this function among two different genera. (ii) Most Rhodnius lipocalins cluster in two clades, one containing all the known NO carriers, named nitrophorins (NP) and the other containing the adenosine nucleotide kratagonists named RPAI (Rhodnius platelet aggregation inhibitor). (iii) The Rhodnius biogenic amine-binding protein (BABP) somewhat surprisingly clusters with the nitrophorins, but BABP does not have a heme group and has higher affinity for serotonin and norepinephrine, constituting a good example of gene duplication and divergence of function [7]. (iv) Exceptionally, one T. matogrossensis and one T. dimidiata protein sequence group with the NP-BABP clade with 99% bootstrap support. The function of these proteins could lead to the original function of the Rhodnius nitrophorins, which are exclusive to the genus Rhodnius. Indeed the abundance of these heme proteins in Rhodnius salivary glands makes these glands distinctively bright cherry red in color, as first pointed out by Wigglesworth nearly 70 years ago [149]. Triatoma and Dipetalogaster glands are clear or of a very pale yellow color [150]. (v) Rhodnius lipocalins not belonging to the NP-BABP and RPAI clades are scattered in the phylogram, including one sequence between the procalin and triabin clades, one between the triabin and VI clades, and a group of four proteins between clades IV and V. None of these Rhodnius proteins group with strong bootstrap support to any of the Dipetalogaster- or Triatoma-containing clades. (vi) Finally, the procalin clade is very extensive and contains many robust subclades, many of which are of single species, indicating possible recent events of gene duplication or extensive polymorphism.

3.4. Odorant/Pheromone-Binding Family (OBP)

The OBP family, like the lipocalins, is specialized in carrying small hydrophobic ligands in aqueous media [151, 152]. A modified version of the odorant-binding family of proteins, named the D7 protein family, is very abundant in the sialotranscriptomes of hematophagous Nematocera [138]. A few mosquito proteins have been crystallized and functionally characterized, showing kratagonist activity toward biogenic amines, TXA2, and leukotrienes, in addition to anticlotting activity [153–156].

In hematophagous Cimicomorpha, members of the OBP family are found in Rhodnius, Triatoma, and Cimex but are particularly abundant in Cimex, with two OBP proteins having over 250 ESTs in a total of ~2,000 ESTs, suggesting the OBP family has been recruited by Cimex to function as the lipocalins in triatomines. No salivary member of this family in Cimicomorpha has been so far functionally characterized.

3.5. Antigen-5 Family

This is a ubiquitous protein family found in plants and animals, including expression in the venom glands of vespids, where it was recognized as an antigen, thus the name antigen 5 for this family. They are members of the CAP superfamily, most with unknown function [157]. In snakes and lizards, they have been associated with venom toxins [158–160]. In stable flies, one salivary antigen 5 protein binds immunoglobulins and may function as an inhibitor of the classical pathway of complement activation [161]. In horse flies, one protein has acquired a disintegrin motif and is a strong inhibitor of platelet aggregation [162–164]. All triatomine sialotranscriptomes have revealed this class of proteins, which is particularly abundant in Dipetalogaster. The function of these proteins in triatomine blood feeding is still unknown.

3.6. Lectin

Triatoma dimidiata exclusively presents two partial sequences containing a galactose-binding domain. While lectins—mainly C-type lectins—have been described in the sialotranscriptome of mosquitoes (none with known function), this is so far a unique finding in triatomine sialotranscriptomes.

3.7. Immunity-Related, Ubiquitous Families

Immunity-related proteins and peptides are commonly found in the saliva of bloodsucking arthropods and may help to control microbial growth in the ingested meal and perhaps also avoid microbial infection of the bite site. Lysozyme, while common in mosquito sialomes, is found exclusively so far in Cimex sialomes, with four quite different proteins being reported. D. maxima presents a histidine-rich peptide that could function as an antimicrobial peptide, and a defensin is reported from T. infestans. The absence of commonly found salivary antimicrobial peptides in triatomines suggests that if this salivary function is present within these organisms, it may be encoded by lineage-specific gene families, one of which (trialysin) will be reported further below.

4. Arthropod-Specific Families

Several insect-specific families are further identified, none functionally characterized, and most without domains providing a clue for their function. These include proteins with chitin-binding domains and cuticle-like homologs, which may be associated with salivary ducts rather than a function in the injected saliva. One conserved secreted insect protein family of basic peptides having ~100 amino acids after signal peptide cleavage occurs in Cimex, Oncopeltus, and Triatoma sialotranscriptomes. Homologs are found by blastp to the nonredundant (NR) protein database, including a venom protein from the wasp parasitoid Nasonia vitripennis identified in a proteomic study [165]. Exceptionally, there are also homologs to proteins from the soil bacterium Streptomyces clavuligerus, having 52% identity to the insect proteins. Similarly, the protein originally described in R. prolixus as MYS2 has homologs found in the sialotranscriptomes of T. brasiliensis and T. matogrossensis and is similar to many other insect proteins in the NR, including protein sequences deduced from the sialotranscriptome of the tsetse Glossina morsitans [166]. Three sequences, one each from C. lectularius, T. infestans, and T. matogrossensis, have 25% amino acid sequence identity but 52% similarity and little similarity to other proteins in the NR database. These sequences are grouped in Additional File 1 as the Cimex-Triatoma family. PSI-blast initiated by the T. matogrossensis sequence against the NR database initially retrieves only the two other sequences, but on first iteration it retrieves dozens of insect proteins (Additional File 3), and in the third iteration it retrieves Daphnia and tick proteins, suggesting this is an arthropod family of high divergence. Finally, the sialotranscriptome of T. matogrossensis identified four additional nonrelated proteins that have insect homologs and were not found in other reported sialotranscriptomes of Hemiptera but are similar to proteins reported from G. morsitans and from Aedes aegypti sialotranscriptomes. It is possible that these families function as antimicrobial peptides, but so far none has been characterized.
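The distinction between percent identity and percent similarity quoted above can be made concrete with a short sketch. It assumes a pre-computed pairwise alignment and a simplified grouping of chemically similar residues; BLAST instead counts "positives" from a substitution matrix, so the grouping below is only an illustrative stand-in. The example sequences are invented and are not the Cimex-Triatoma proteins.

```python
# Simplified groups of chemically similar residues. This grouping is an
# assumption for illustration; it is not the scoring scheme used by BLAST.
SIMILAR_GROUPS = ["ILVM", "FYW", "KRH", "DE", "NQ", "ST", "AG"]


def _same_group(a, b):
    """True when both residues fall in one of the assumed similarity groups."""
    return any(a in group and b in group for group in SIMILAR_GROUPS)


def identity_and_similarity(aln_a, aln_b):
    """Return (percent identity, percent similarity) for two aligned sequences.

    Both strings must have the same length, with '-' marking gaps.
    Columns containing a gap are excluded from the denominator.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aln_a.upper(), aln_b.upper())
        if a != "-" and b != "-"
    ]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    identical = sum(a == b for a, b in columns)
    similar = sum(a == b or _same_group(a, b) for a, b in columns)
    n = len(columns)
    return 100.0 * identical / n, 100.0 * similar / n


if __name__ == "__main__":
    # Invented toy alignment, for illustration only.
    ident, simil = identity_and_similarity("MKLV-TDE", "MRIVSTEQ")
    print(f"identity {ident:.1f}%, similarity {simil:.1f}%")
```

Similarity is always at least as high as identity because every identical column also counts as similar; a large gap between the two, as in the 25% versus 52% reported for this family, indicates many conservative substitutions.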

5. Hemiptera-Specific Families

5.1. Mys3/Hemolysin Family

When the R. prolixus sialotranscriptome was reported [36], an additional mysterious protein was named Mys3. Later, with additional sialotranscriptome reports, another protein family emerged, named as hemolysin-like because some members had weak similarity to bacterial proteins annotated as hemolysins. PSI-blast later revealed that these proteins all belong to a single family that is quite divergent, including a truncated protein from the sialome of Oncopeltus, suggesting a non-blood-feeding role, perhaps antimicrobial, for its members.

5.2. Triatoma-Specific Families

Sialotranscriptomes of several species of the Triatoma genus reveal several unique protein families, among which are the trialysin and short trialysin families. The trialysins are basic proteins of mature MW near 26 kDa that can be further processed to peptides that have lytic properties [17, 18] and may function as antimicrobials. Short trialysins have mature MW of ~6.1 kDa and acidic pI and are so named because they match the amino terminal region of the mature trialysins. Both forms are abundantly expressed but only found in T. infestans and T. matogrossensis, which are from southern South America, and are not found in the sialotranscriptomes of T. brasiliensis, found in northeastern Brazil, or in those of the North American T. dimidiata or T. rubida. Additional File 1 reports 19 protein sequences from Triatoma that are not similar to anything deposited in the NR database and two pairs of sequences from T. matogrossensis that only match their pair members. None has been functionally characterized. It is interesting that of these 23 sequences only one derives from T. rubida and the remaining derive from T. infestans and T. matogrossensis, although the number of clones sequenced for T. brasiliensis, T. dimidiata, and T. rubida was similar to those of T. infestans and T. matogrossensis, suggesting a greater sialome diversity in these bugs from southern South America.

5.3. Rhodnius-, Cimex-, and Oncopeltus-Specific Families

Additional File 1 presents 16 proteins from the bugs named above that have no significant matches to the NR database except in some cases for some proteins of low complexity. None of these proteins has been functionally characterized. This includes the Rhodnius MYS1 protein, one of three mysterious proteins revealed in the first bug sialotranscriptome [36]. As seen above, MYS2 and MYS3 were later found to be members of larger families. It is expected that, with a larger number of genomes and transcriptomes sequenced, MYS1, as well as the other orphan proteins in this group, will also be deorphanized.

6. Housekeeping Proteins

Many housekeeping protein sequences were also deduced, mostly from the sialotranscriptomes shown in Table 2, including many associated with energy metabolism and with protein synthesis, modification, and export, among other classes (see the worksheet named “Housekeeping” in Additional File 1). Interestingly, the sialotranscriptome of Triatoma rubida shows abundant expression of members of the cytochrome P450 family as well as of 15-hydroxyprostaglandin dehydrogenase, suggesting either that the salivary gland has active endogenous prostaglandin signaling or that prostaglandins are secreted in the saliva of these bugs. Cytochrome P450 transcripts were also detected in Rhodnius and T. matogrossensis, and the prostaglandin dehydrogenase was also found in T. infestans. Increased sequencing depth of these sialotranscriptomes may well reveal these two classes of proteins to be expressed in all triatomines.

7. Concluding Remarks

Blood-feeding Cimicomorpha have developed a sophisticated and divergent array of salivary pharmacologically active compounds that disarm their hosts’ reaction against blood loss. Even in the few transcriptomes available, which encompass members of the Reduviidae and Cimicidae, the convergent evolution scenario in the sialomes of these two families is apparent. Both have apyrase activity, but from different gene families; Cimex and Rhodnius (but not any Triatomini member) use NO as a vasodilator but co-opted completely different heme proteins to carry this unstable gas. The anticlotting compounds differ even at the tribe level within the Reduviidae. The lipocalin expansion is remarkable among the triatomines and nonexistent in Cimex. These proteins can perform many different functions, acting as binders of small agonists (kratagonists), NO carriers, or protease inhibitors. In Cimex, the expanded odorant-binding family may have taken this role, but none of its members has thus far been characterized.

Notably, the sialome of Oncopeltus, a member of the Pentatomomorpha, the infraorder most closely related to the Cimicomorpha (see http://tolweb.org/Heteroptera/10805) [76, 167], revealed virtually nothing in common with the Cimicomorpha, and the Cimicidae sialome also revealed little in common with that of the Reduviidae, as perhaps expected from the divergence of these families (see http://tolweb.org/Cimicomorpha/10817). Zooming in on the Triatominae, it will be interesting in the future to describe the sialomes of additional tribes, such as the Bolboderini, which includes bugs that feed on insect hemolymph; the Cavernicolini, which are associated with bats; and members of the genus Linshcosteus, which are found in India [168] and could be divergent members. Zooming out a little, and as indicated by Schofield and Galvão [49], facultative blood feeding is found in non-Triatominae members of the Reduviidae, including the Emesinae, Harpactorinae, Peiratinae, Physoderinae, and Reduviinae. Sialomes of these subfamilies could be more indicative of the prevalent “pre-adaptations” available as stepping stones and promoted by the blood-feeding habit. On the other hand, the Cimicidae are closely related to the bat bugs (Polyctenidae), which are a sister group, and to the Anthocoridae (flower bugs; http://tolweb.org/Cimicomorpha/10817), which feed on small insects. These non-blood-feeding close relatives may provide insights into the evolution of the Cimicidae to hematophagy.

Acknowledgments

This work was supported by the Intramural Research Program of the Division of Intramural Research, National Institute of Allergy and Infectious Diseases, National Institutes of Health. The authors thank NIAID DIR intramural editor Brenda Rae Marshall for editing the manuscript. Because JMCR and IMBF are government employees and this is a government work, the work is in the public domain in the United States. Notwithstanding any other agreements, the NIH reserves the right to provide the work to PubMedCentral for display and use by the public, and PubMedCentral may tag or modify the work consistent with its customary practices. Rights can be established outside of the United States subject to a government use license.