Abstract

Partial characterization of the immunoglobulin Cμ gene of water buffalo (Bubalus bubalis) revealed high amino acid sequence identity with Cμ of cattle (94.28%) and sheep (91.71%). However, four amino acid replacements (Met-301, Val-310, Asn-331, and Thr-432) in Cμ2, Cμ3, and Cμ4 of buffalo IgM are distinct. Unlike cattle, a codon deletion (GTG encoding valine at position 507 in cattle) and an insertion (GGC encoding glycine at position 532) occur in buffalo Cμ4. Three N-linked glycosylation (Asn-X-Thr/Ser) sites (one at position 325–327 in Cμ2; two at positions 372–374 and 394–396 in Cμ3) differentiate buffalo IgM from cattle and sheep. Similar to cattle, buffalo IgM has fewer prolines in Cμ2, which acts as the hinge, and this restricts Fab arm flexibility. Increased structural flexibility of the C1q-binding site in Cμ3 may compensate for the rigid buffalo Cμ2 domain. The secondary structure of the C1q-binding site is distinct in buffalo and cattle IgM, where a long alpha-helical structure predominates, which may be relevant to complement fixation. The conserved protein motif “Thr-Cys-Thr-Val-Ala-His” provides a protein signature of the C1q-binding region of ruminant species. The distinct structural features of the C1q-binding site of buffalo and cattle IgM seem to be of functional significance and may, therefore, be useful in designing antibody-based therapeutics.

1. Introduction

The water buffalo (Bubalus bubalis), a member of the family Bovidae domesticated approximately 5000 years ago in Asia, is raised for milk, meat, and draught purposes. Of approximately 170 million water buffaloes, most (97%) are found in Asia, but their number is growing across Africa, Australia, Europe, and South America [1]. India possesses the best of the dairy breeds (Murrah, Nili-Ravi, and Surti), which produce 72 million tonnes of milk annually, 5% of the world’s total milk output [1, 2]. Buffalo milk is rich in fat, protein, and minerals but low in cholesterol [3] and is, thus, an excellent source of good quality dairy products, especially the traditional Italian mozzarella cheese [4]. The demand for buffalo meat is high as it is relatively lean, with low fat and high mineral content compared to beef or pork. Buffaloes provide an excellent source of draught power in more than 50 countries [1, 2]. Buffaloes utilize poorly digestible feeds better than cattle and, therefore, can be maintained on low-quality fodder and crop [1]. Importantly, buffaloes are resistant to common diseases, ticks, and external parasites that commonly afflict cattle [5].

Little is known about the structural and functional features of the immune system of this economically important species. The immunoglobulin genetics of other domestic species has been extensively studied [6], as has that of humans and mice [7]. Limited sequence divergence is noted in phylogenetically close cattle [8–10], where somatic hypermutations [11, 12] and generation of exceptionally long third complementarity determining regions of heavy chains (CDR3H) [12–14] provide the required antibody diversity. Within the preponderant λ-light chain expression in cattle, a restricted Vλ1-Jλ3-Cλ3 recombination encodes most of the λ-light chain repertoire [15, 16]. Immunoglobulin heavy chain constant region genes that encode the IgM, IgD, IgG, IgA, and IgE isotypes have been analyzed in many species [6], including cattle [16–20]. The immunoglobulin gamma heavy chain gene has been mapped to buffalo chromosome 20q23-q25 by in situ hybridization [21]. Buffalo IgG, IgM, and IgA immunoglobulin isotypes have been serologically characterized [22], and two subclasses of buffalo IgG (IgG1 and IgG2) have been identified [23]. To advance genetic and structural understanding of buffalo immunoglobulins, we partially characterized the buffalo germline Cμ gene that encodes IgM, an immunoglobulin that appeared first during vertebrate evolution and is the first to be expressed on developing B-lymphocytes. The buffalo germline Cμ gene sequence from the Nili-Ravi breed shares high amino acid sequence similarity with cattle and, also, the predicted distinct structural characteristics of the C1q-binding site.

2. Materials and Methods

2.1. Genomic DNA

Genomic DNA was extracted, as described [9], from peripheral blood collected from a water buffalo of the Nili-Ravi breed kept at the dairy farm of Punjab Agricultural University, Ludhiana, India.

2.2. PCR and Sequencing

The buffalo germline Cμ gene, spanning codons 201 to 550, was PCR amplified using sense (5′-GTGTGCGAAGTCCAGCA-3′) and antisense (5′-AGACTAGTTACCGGTGGACTTGTCC-3′) primers from conserved Cμ1 and Cμ4 exon sequences, respectively [17, 24], under conditions that did not permit PCR artifacts [18]. The PCR involved a hot start at 95°C for 2 min, followed by 30 cycles of denaturation at 95°C for 1 min, annealing at 65°C for 1 min, and extension at 72°C for 1 min. The reaction contained 1.5 mM MgCl2, 0.8 μM of each primer, and 2.5 U of Taq polymerase (Perkin-Elmer, Branchburg, NJ, USA) in a 100 μL volume. The PCR product (~1.5 kb) was gel fractionated, purified using GeneClean II (Bio 101, Vista, CA, USA), and subjected to automated DNA sequencing in both directions (MOBIX, McMaster University, Hamilton, ON, Canada). The internal sequencing primer (5′-TGAGGCCTCGGTCTGCT-3′), corresponding to codons 401 to 407, was synthesized from the determined buffalo Cμ gene sequence. The buffalo Cμ codons are numbered according to [7] following Ou index [31]. The DNA sequence was analyzed using the Geneious Pro 5.6.4 program (http://www.geneious.com/), and the protein secondary structure was predicted using the original Garnier-Osguthorpe-Robson algorithm (GOR I) provided by the EMBOSS suite [30].
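As a rough check on the primer parameters given above, the following Python sketch computes primer length, GC fraction, an approximate melting temperature, and the amount of primer per reaction. The primer sequences and reaction values are taken from the text; the function names are hypothetical, and the Wallace rule (2°C per A/T, 4°C per G/C) is only a crude approximation best suited to short oligonucleotides. It is not the method used to choose the 65°C annealing temperature.

```python
# Primer sequences as reported in the text (5' to 3').
SENSE = "GTGTGCGAAGTCCAGCA"
ANTISENSE = "AGACTAGTTACCGGTGGACTTGTCC"


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the primer."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Approximate Tm in degrees C by the Wallace rule (assumption: a rough
    estimate only, not the calculation used by the authors)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def primer_pmol(conc_um: float, volume_ul: float) -> float:
    """Primer amount per reaction: micromolar x microlitres = picomoles."""
    return conc_um * volume_ul


if __name__ == "__main__":
    for name, primer in (("sense", SENSE), ("antisense", ANTISENSE)):
        print(name, len(primer), round(gc_fraction(primer), 2), wallace_tm(primer))
    print("pmol of each primer:", primer_pmol(0.8, 100))
```

With 0.8 μM primer in a 100 μL reaction, each primer is present at about 80 pmol per reaction.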

3. Results and Discussion

The nucleotide sequence and the deduced amino acid sequence of the water buffalo germline Cμ gene, spanning codons 201 to 550, are presented in Figure 1. The water buffalo Cμ gene shares high nucleotide (95.52%) and amino acid (94.28%) sequence similarity with Cμ of cattle, the closest ruminant species of the family Bovidae. Analysis of the buffalo germline Cμ gene sequence revealed that it encodes part of the Cμ1 domain (codons 201–221) and all of the Cμ2 (codons 221–333), Cμ3 (codons 334–438), and Cμ4 (codons 439–549) domains of IgM. Among the other species compared, the overall amino acid identity of water buffalo IgM was highest with sheep (91.71%), followed by pig (64.00%), rabbit (63.14%), human (61.71%), horse (60.57%), and mouse (56.28%). The high amino acid sequence similarity of buffalo IgM with cattle (94.28%) and sheep (91.71%) is expected given the close phylogenetic relationship among ruminant species. Similar to cattle and sheep, buffalo IgM has unique amino acid substitutions at 10 positions (Leu-239, Ser-246, Ile-274, Glu-279, Arg-303, Lys-319, Ser-367, Gly-370, Ala-421, and Lys-442) that are conserved in non-ruminant species (Figure 2). Buffalo IgM has four distinct amino acid replacements (Met-301, Val-310, Asn-331, and Thr-432), spread across Cμ2, Cμ3, and Cμ4, that diverge from the conserved amino acids in cattle and sheep IgM. As compared to cattle, the buffalo Cμ gene has a codon deletion at position 507 (GTG encoding valine, present in cattle) and an insertion of GGC encoding glycine at position 532 in the Cμ4 domain (Figure 1). Nucleotide deletions, insertions, and substitutions are also noted in the intron sequences between the buffalo Cμ exons.
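The identity percentages above can be illustrated with a minimal Python sketch of a pairwise percent identity calculation over two pre-aligned sequences. The function name, the gap symbol "-", and the choice of alignment length as the denominator are assumptions made for illustration; the text does not state which convention was used, and the toy sequences below are not buffalo or cattle data.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences.

    Assumptions: both inputs are already aligned (equal length, '-' for gaps),
    gap columns count as mismatches, and the denominator is the alignment length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(
        1 for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    # Toy alignment: 4 identical columns out of 6.
    print(round(percent_identity("MKV-LT", "MKVALS"), 2))
```

Under the same convention, 330 identical residues over the 350 codons from 201 to 550 would give 94.29%, which is close to the reported 94.28%; this correspondence is an inference and not a figure stated in the text.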

The conserved cysteines in buffalo IgM, essential for domain structure formation via intrachain disulfide bridges, are noted at position 202 (Cμ1 domain, which would interact with another cysteine residue within the Cμ1 domain; not investigated here), 252–313 (Cμ2 domain), 360–418 (Cμ3 domain), and 466–528 (Cμ4 domain; Figure 2). Similarly, the cysteines responsible for interchain disulfide bridges between the heavy chains of monomeric (position 330) or polymeric (position 406) IgM [31] are conserved. Like most other species, buffalo IgM has two tryptophan residues in each of the Cμ2, Cμ3, and Cμ4 domains (Figure 2). These findings are consistent with the critical role of conserved cysteine and tryptophan residues in maintaining the domain structure of immunoglobulins [32].

Buffalo IgM has three potential N-linked glycosylation (Asn-X-Thr/Ser) sites: one at position 325–327 in the Cμ2 domain and two at positions 372–374 and 394–396 in the Cμ3 domain (Figure 2). Other species have either one (cattle, sheep, pig, and human) or two (rabbit, horse, and mouse) N-linked glycosylation sites in the Cμ2 domain, while the Cμ3 domain has one (pig and mouse), two (cattle, sheep, human, and horse), or three (rabbit) such sites (Figure 2). This suggests significant variability in the number of N-linked glycosylation sites in both the Cμ2 and Cμ3 domains across species. Such differences may be of functional significance, as they could influence the functional configuration of IgM [33], especially the movement of Fab arms or the accessibility of the C1q-binding site. No N-linked glycosylation site exists in the buffalo Cμ4 domain, a characteristic shared with other mammalian species but not with lower vertebrates [25, 34, 35].
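Potential N-linked glycosylation sites of the Asn-X-Thr/Ser form can be located in a deduced protein sequence with a simple pattern scan, sketched below in Python. The function name and the toy sequence are hypothetical. Excluding proline at the X position follows the common sequon convention and is an assumption added here, since the text specifies only Asn-X-Thr/Ser. The `first_position` offset assumes contiguous residue numbering, which may not hold exactly for Ou index numbering.

```python
import re
from typing import List, Tuple

# Lookahead pattern so that overlapping sequons are all reported.
# Assumption: X is any residue except proline (common convention).
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein: str, first_position: int = 1) -> List[Tuple[int, int]]:
    """Return (start, end) positions of each Asn-X-Thr/Ser site.

    Positions are numbered from `first_position` for the first residue of
    `protein`, so a fragment starting at codon 201 can pass first_position=201.
    """
    protein = protein.upper()
    return [
        (m.start() + first_position, m.start() + first_position + 2)
        for m in SEQUON.finditer(protein)
    ]


if __name__ == "__main__":
    toy = "AANGSKNPTLNNTS"  # illustrative sequence, not buffalo IgM
    print(find_sequons(toy))
```

In the toy sequence, Asn-Pro-Thr is skipped because of the proline exclusion, while the overlapping Asn-Asn-Thr and Asn-Thr-Ser sites are both reported.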

Similar to cattle, sheep, and goat, buffalo IgM has five prolines in the Cμ2 domain, which acts as the hinge. This is the lowest number of prolines in this region, in contrast to other species such as pig and rabbit (7), human (8), and horse and mouse (9). Buffalo IgM has only six hydrophilic threonine residues in the Cμ2 domain, the lowest number in this region, unlike cattle (7), sheep (8), rabbit (9), pig and human (10), horse (12), and mouse (13). Similar to cattle (19) and sheep (20), buffalo IgM is rich in serine (19) in the Cμ2 domain, whereas other species such as mouse (9), human (12), and horse (13) have fewer serine residues in this domain. It seems that the lower number of hydrophilic threonine residues in the Cμ2 domain of ruminant species is compensated by a higher number of hydrophilic serine residues. The presence of fewer prolines in the Cμ2 domain would provide structural rigidity that may restrict the segmental flexibility of Fab arms. The high combined content of hydrophilic threonine and serine residues in Cμ2 of buffalo IgM is, however, likely to augment its ability to extend into the solvent. We earlier reported similar findings for cattle IgM, where structural constraints imposed by the restricted segmental flexibility of Fab arms are compensated by an exceptionally long CDR3H (>50 amino acids) region [13, 18]. It is possible that such a long CDR3H exists in buffalo antibodies as well.
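Residue counts of the kind compared above can be obtained by tallying selected amino acids within a domain sequence. The Python sketch below assumes the domain is supplied as a one-letter amino acid string; the function name and the toy sequence are hypothetical and do not represent the buffalo Cμ2 domain.

```python
from collections import Counter
from typing import Dict


def residue_counts(domain: str, residues: str = "PTS") -> Dict[str, int]:
    """Count selected residues (default: Pro, Thr, Ser) in a domain sequence."""
    counts = Counter(domain.upper())
    return {residue: counts.get(residue, 0) for residue in residues}


if __name__ == "__main__":
    toy_domain = "SPTSSAPT"  # illustrative sequence only
    print(residue_counts(toy_domain))
```

Applied to the Cμ2 segment of each species in an alignment, the same tally would yield the per-species proline, threonine, and serine numbers discussed above.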

The C1q-binding site of buffalo IgM, spanning positions 408–428 in the Cμ3 domain, has 12 residues conserved across species (positions 408-Glu, 409-Asp, 410-Trp, 411-Ser, 418-Cys, 419-Thr, 420-Val, 422-His, 424-Asp, 425-Leu, 426-Pro, and 428-Pro). Among the twelve conserved amino acids, three notable exceptions exist in humans (Glu replaced by Asp at position 408), mouse (Ser replaced by Asn at position 408), and goat (Trp replaced by Arg at position 409) (Figure 3). The conserved protein motif “Thr-Cys-Thr-Val-Ala-His” provides a protein signature of the C1q-binding region in ruminant species. The predicted secondary structure of the C1q-binding site reveals distinct structural features in buffalo and cattle IgM, where, unlike other species, a long alpha-helical structure is predominant (Figure 3), followed by a short turn together with a coiled structure common to all species. The C1q-binding site in cattle and buffalo IgM also lacks beta-strands altogether, unlike other species. These structural features deviate from the IgM of other ruminant species, such as sheep and goats, where turns and/or coils are evident in this region, as in other species. By contrast, the alpha-helical structure is altogether absent from the C1q-binding site of human IgM. These configurational differences in the conserved C1q-binding region of IgM across species appear to be relevant to complement fixation and activation by the classical pathway. It is possible that increased structural flexibility in the C1q-binding site compensates for the structurally rigid Cμ2 domain of buffalo and cattle IgM.

Overall, the buffalo Cμ domain shares high amino acid sequence similarity with Cμ of other ruminant species such as cattle and sheep. Buffalo IgM has fewer proline residues in the Cμ2 domain, which acts as the hinge, and this would restrict the segmental flexibility of Fab arms. The high hydrophilic threonine and serine content in the Cμ2 domain will likely enhance its ability to extend into the solvent. The predicted secondary structure of the C1q-binding site reveals distinct structural features in buffalo and cattle IgM, where, unlike other species, a long alpha-helical structure is predominant (Figure 3); this seems to be of functional significance.

Acknowledgment

This research was supported by NSERC Canada Discovery Grant to Dr. Azad K. Kaushik.