In vertebrates, the molecules encoded by major histocompatibility complex (MHC) genes play an essential role in adaptive immunity. In birds, MHC class I molecules deal with intracellular pathogens such as viruses. MHC class I diversity is shaped by local and global environmental selective pressures and by gene flow. Here, we evaluated the MHC class I gene in four species of the Turdidae family from a broad geographical area of northeast China. We isolated 77 MHC class I sequences, including 47 putatively functional sequences and 30 pseudogene sequences, from 80 individuals. Using a method based on the analysis of cloned amplicons for each species, we found between two and seven MHC I sequences per individual, indicating that more than one MHC I locus is present in all sampled species. The results revealed overall elevated genetic diversity at MHC class I and evidence of different selection patterns between the PBR and non-PBR domains. Alleles were divergent, with the number of polymorphic sites per species ranging between 58 and 70 (out of 291 sites). Moreover, transspecies alleles were evident, attributable to convergent evolution or recent speciation within the genus. Phylogenetic relationships among MHC I sequences show an intermingling of alleles that cluster within the Turdidae family rather than with other passerines. Pronounced MHC I gene diversity is essential for the persistence of species. Our study provides a valuable tool for characterizing evolutionarily relevant differences across populations of birds of high conservation concern.

1. Introduction

The major histocompatibility complex (MHC) comprises a group of genes that are among the most polymorphic described in vertebrate genomes [1]. Two MHC gene families, class I and class II, encode cell surface glycoproteins that regulate the immune response. MHC class II molecules are heterodimers consisting of an α chain and a β chain; both contribute to presenting peptides derived from extracellular pathogens such as bacteria to CD4+ T-helper cells [2]. MHC class I molecules are heterodimers made up of an α chain and a non-MHC molecule, β2 microglobulin. The α chain comprises a cytoplasmic tail, a transmembrane domain, and three extracellular domains named α1, α2, and α3 [3], which are encoded by exons 2, 3, and 4, respectively. MHC class I molecules are expressed in almost all somatic cells and trigger an adaptive immune response by presenting endogenously derived peptides, from viral proteins and from the individual’s own cells, to CD8+ cytotoxic T-cells [4]. Polymorphism is largely confined to the region encoding the antigen-binding site (ABS) of MHC class I [5]. This striking diversity is thought to be maintained by two types of selection: heterozygote advantage and frequency-dependent selection. Under heterozygote advantage, heterozygotes can recognize a broader range of antigens from multiple pathogens and therefore have higher fitness than either homozygote [6]. Under frequency-dependent selection, rare alleles confer a selective advantage because pathogens have evolved ways to evade the common immune-defense alleles in the population. Thus, changes in the pathogen community over time and between localities result in MHC variation in the host population. In general, an individual possessing a larger number of more diverse MHC alleles can recognize more pathogens [1].

Structural diversity and immune response have been explored in numerous studies, including work on genomics [7, 8], disease [9-11], and mate choice [12-14]. Assigning sequences to loci on the basis of sequence similarity at the PBR is frequently hampered by recent recombination, duplication, and/or concerted evolution, as well as by positive selection mediated by a variety of pathogens [15]. Numerous studies have therefore emphasized MHC genes as important markers for evaluating the adaptive potential and evolutionary status of threatened populations [16].

This emerging picture encourages researchers to collect data from a range of wild taxa to broaden our understanding of MHC gene evolution [17]. Despite significant efforts, protocols for locus-specific MHC genotyping in birds remain difficult to achieve and are remarkably rare [18]. MHC studies in populations of wild birds remain neglected, possibly because of complications in amplifying gene sequences from bird species that are not closely related to the systematically studied chicken [19, 20].

Habitat loss and the fragmentation of remaining habitats are predisposing factors for dramatic declines in population size [21]. The avian genus Turdus is one of the most broadly distributed passerine genera, with 65 documented extant species. The genus is listed among the wild territorial birds that are beneficial to China and have economic and research value. Birds of this genus are strongly migratory and thus experience a variety of environments. To date, there have been no studies of MHC class I genes in Turdidae species; such work is the first step towards exploring the role of pathogen-mediated selection in maintaining MHC class I diversity. Specifically, this study aims to (1) measure locus-specific variation in MHC I exon 3 across the Turdidae family, quantifying diversity and selection at MHC I genes in order to evaluate the mode of evolution by which such variation arises; (2) investigate the number of alleles possessed by each species and the general features of those alleles in terms of functional genetic diversity; and (3) use phylogenetic analyses to assess the evolutionary relationships and processes driving avian MHC I diversity among four species of the Turdidae family and other avian species.

2. Material and Methods

2.1. Study Population

The study population consisted of 80 non-sympatrically distributed individuals from four species of the genus Turdus (family Turdidae). Samples comprised two to three contour feathers and breast and liver tissue from birds that were accidentally injured or died during the autumn migratory seasons of 2017-19. They were deposited in the State Key Laboratory of Wildlife Detection Center at Northeast Forestry University and stored at 4°C. The geographical locations of the samples are presented in Figure 1.

2.2. Extraction of Genomic DNA

The region from the calamus to the rachis of each contour feather was excised, and skeletal muscle tissue was minced; the material was placed into a 1.5 ml Eppendorf tube containing TNE buffer (10 mM Tris-HCl (pH 8.0), 150 mM NaCl, 2 mM EDTA, 1% SDS). Total genomic DNA was extracted with the AxyPrep Multisource Genomic DNA Miniprep Kit (AXYGEN, China) according to the manufacturer’s instructions. The DNA concentration was measured with a Nanopore Spectrophotometer at 260 nm absorbance. Samples with a concentration above 100 ng/μl were used for further analysis.

2.3. PCR, Cloning, and Sequencing

Polymerase chain reaction (PCR) was conducted using motif-specific primers designed for the amplification of MHC class I genes in the great reed warbler. The forward primer HN36 (5′-TCCCCACAGGTCTCCACACAGT-3′) and the reverse primer HN46 (5′-ATCCCAAATTCCCACCCACCTT-3′) correspond to the exon 3 region and its flanking introns, the region coding for most of the peptide-binding site in MHC molecules (subunit α2) [22-24]. These primers were chosen because they amplify successfully in many passerine species. Amplification was performed in a reaction mixture containing 20 ng of DNA template, 0.2 μM of each primer, 25 μl of 2× EasyTaq® PCR SuperMix (+dye) (Trans, China), and deionized water to a final volume of 50 μl. Thermal cycling for MHC class I amplification began with one cycle at 94°C for 5 min, followed by 30 cycles of 94°C for 30 s, 52°C for 30 s, and 72°C for 30 s, and ended with a single extension step at 72°C for 5 min. PCR products were purified with the AxyPrep™ DNA Gel Extraction Kit in accordance with the manufacturer’s protocol. Purified PCR products were cloned using the pEASY®-T5 Zero Cloning Kit with Trans1-T1 phage-resistant chemically competent cells (Transgen Biotech). Positive clones were screened by PCR using M13 forward and reverse primers. Several colonies (20-25) per individual were selected and used as templates for directional sequencing on an automatic sequencer (ABI PRISM 3730; Invitrogen Biotechnology Co. Ltd.).

2.4. Definition of Allele

Because artifacts can be introduced through recombination of PCR products during amplification and cloning [25, 26], amplification, cloning, and sequencing were performed twice. A sequence was verified and referred to as an allele if at least three clones had the same nucleotide composition or if it was recovered in both events. Sequences that showed any deletion, insertion, or premature stop codon within the exon were identified as presumed pseudogene sequences, and the others were considered putative functional alleles (PFA) [27]. All sequences meeting our criteria have been deposited in GenBank (Accession No: MN849308-54).
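
The verification rule described above can be sketched in a few lines. This is only an illustration and not the authors’ pipeline: the function name `call_alleles`, the pooling of clone counts across the two events, and the toy sequences are all assumptions made for the example.

```python
from collections import Counter


def call_alleles(run1, run2, min_copies=3):
    """Return clone sequences accepted as alleles for one individual.

    run1, run2: lists of clone sequences (strings) from the two independent
    amplification/cloning/sequencing events.
    Assumption: copies are pooled across both events before the
    three-clone threshold is applied; the text does not specify this.
    """
    counts = Counter(run1) + Counter(run2)
    seen_in_both = set(run1) & set(run2)
    return sorted(
        seq for seq, n in counts.items()
        if n >= min_copies or seq in seen_in_both
    )


# Toy data (hypothetical): "AAA" has three copies, "CCC" occurs in both
# events, "GGG" and "TTT" are singletons and are treated as possible artifacts.
run_a = ["AAA", "AAA", "AAA", "CCC", "GGG"]
run_b = ["CCC", "TTT"]
accepted = call_alleles(run_a, run_b)
```

Singleton sequences seen in only one event are the ones most likely to be PCR chimeras or polymerase errors, which is why the sketch discards them.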

2.5. Data Analysis
2.5.1. Sequence Analysis

Chromatogram signals of all sequencing reads were examined with Chromas 2.2.6. Sequences without ambiguous signals were selected. Vector sequence was removed from the MHC class I gene sequences using SeqMan in the DNAStar 7.1 package. Sequence editing and organization were done with BioEdit [28]. Sequences were aligned first individually for each species and then together for all four sampled species using CLUSTAL X [29]. The unique alleles were named according to the nomenclature for MHC in non-human species [30]. NCBI BLAST [31] was used to confirm that the sequences showed close identity to previously published passerine MHC class I exon 3 sequences. Sequences having at least one stop codon (from a shift in the reading frame due to indels, or from a nonsense mutation) were classified as pseudogenes. Based on the translatable sequences, the minimum number of functional MHC class I loci was estimated using a conservative approach that assumes all loci in each individual of the sampled species are in a heterozygous state.
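
The two rules at the end of this paragraph (flagging sequences with in-frame stop codons and estimating the minimum number of loci) can be sketched as follows. The function names, the `frame` parameter, and the toy sequences are hypothetical, and the sketch assumes the standard genetic code stop codons; it is not the software used in the study.

```python
import math

STOP_CODONS = {"TAA", "TAG", "TGA"}


def has_premature_stop(seq, frame=0):
    """True if any complete codon in the given reading frame is a stop codon.

    Alignment gaps ("-") are removed first, so a deletion that is not a
    multiple of three shifts every downstream codon.
    """
    seq = seq.upper().replace("-", "")
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return any(codon in STOP_CODONS for codon in codons)


def min_functional_loci(alleles_per_individual):
    """Minimum locus count, assuming every locus is heterozygous.

    Each locus then contributes two alleles, so the individual with the
    most alleles sets the lower bound: ceil(max_alleles / 2).
    """
    return math.ceil(max(alleles_per_individual) / 2)


# Toy sequences (hypothetical): the first carries an in-frame TAA.
flag_a = has_premature_stop("ATGGCCTAAGGC")
flag_b = has_premature_stop("ATGGCCGGC")
loci = min_functional_loci([1, 3, 5])
```

Under this assumption an individual carrying five putative functional alleles implies at least three loci, and one carrying seven implies at least four.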

The average pairwise nucleotide distances (Kimura 2-parameter model, K2P) and the Poisson-corrected amino acid distances were calculated using MEGA 7.0. Standard errors were obtained through 1000 bootstrap replicates. The number of haplotypes (Na), the average number of nucleotide differences (K), the number of polymorphic sites (S), and nucleotide diversity (π) were measured with DnaSP 5.10 [32].
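
For readers unfamiliar with these summary statistics, the sketch below shows how a K2P distance and an uncorrected nucleotide diversity can be computed from aligned sequences. It is a simplified illustration under stated assumptions (equal-length sequences, sites with gaps or ambiguous bases skipped in the K2P calculation, no gaps in the diversity calculation); MEGA and DnaSP handle these cases with their own options, and the function names here are hypothetical.

```python
import math
from itertools import combinations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance: d = -0.5 * ln((1 - 2P - Q) * sqrt(1 - 2Q)),
    where P and Q are the proportions of transitions and transversions."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def nucleotide_diversity(seqs):
    """Average pairwise differences per site (pi = K / number of sites)."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs)
    return total / (len(pairs) * len(seqs[0]))


# Toy alignments (hypothetical).
d = k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")   # one transition in ten sites
pi = nucleotide_diversity(["AAAA", "AAAT", "AATT"])
```

The relation π = K / (number of sites) also gives a quick consistency check: K = 43.95 over 291 aligned sites corresponds to π of about 0.151, assuming the 291-site alignment applies.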

2.6. Inference of Recombination

Because recombination can influence the outcome of selection tests, we first tested for recombination. Analyses were run on the nucleotide alignment of exon 3 in the Recombination Detection Program version 4 (RDP4). Several methods, including RDP [33], GENECONV [34], Chimaera [35], MaxChi [36], BootScan [33], SiScan [37], and 3Seq [38], were used to detect recombination events. In addition, the online GARD tool, provided by the Datamonkey webserver (http://www.datamonkey.org/), was used to assess recombination signals [39].

2.7. Tests for Selection

For the selection analyses, we classified codons a priori into the peptide-binding region (PBR) and the non-peptide-binding region (non-PBR), based on inferred passerine PBR sequences [40, 41] and on sites homologous to the chicken MHC [42, 43] and human HLA [44]. Sites subject to selection in MHC class I exon 3 were identified using several methods. First, standard neutrality tests (Tajima’s D, Fu and Li’s D*, and Fu and Li’s F*) were calculated using DnaSP 5.0 [32]. Second, the ratio of nonsynonymous to synonymous substitutions (dN/dS) was calculated for the functional alleles, both across all of MHC class I exon 3 and separately for codons in the PBR and non-PBR, in MEGA 7.0 according to the Nei-Gojobori method [45] with the Jukes and Cantor correction. Standard error estimates were derived from 1000 bootstrap replicates. A Z-test of historical positive selection [46] was also calculated in MEGA 7.0. Third, maximum likelihood as implemented in codeml in PAML 4.9 was used to identify sites under positive selection, which are indicated where the ratio ω (dN/dS) is larger than 1 [47]. Two models were tested: M7 (beta) and M8 (beta and ω). To determine whether the alternative model (M8) provided a better fit than M7, we performed likelihood ratio tests, comparing twice the difference in log-likelihood (2ΔlnL) against a χ2 distribution. Positively selected sites (PSSs) in the M8 model were identified by a posterior probability (PP) of more than 95% using the Bayes empirical Bayes procedure. In addition to the aforementioned methods, positively selected sites were verified at each codon separately using several complementary approaches implemented in Datamonkey (http://www.datamonkey.org/) [48]. Specifically, we used MEME [49], FEL, SLAC [50], and FUBAR [51].
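
The likelihood ratio test between M7 and M8 can be illustrated with a short sketch. The log-likelihood values below are hypothetical placeholders, not results from this study, and the sketch assumes the commonly used two degrees of freedom for the M7 versus M8 comparison, for which the χ2 survival function has the closed form exp(-x/2).

```python
import math


def lrt_m7_vs_m8(lnl_m7, lnl_m8):
    """Likelihood ratio test of the null model M7 against M8.

    Returns (2 * delta lnL, p-value). Assumes df = 2, where the chi-square
    survival function is exactly exp(-x / 2).
    """
    statistic = max(2.0 * (lnl_m8 - lnl_m7), 0.0)
    p_value = math.exp(-statistic / 2.0)
    return statistic, p_value


# Hypothetical log-likelihoods, as would be read from two codeml runs.
stat, p = lrt_m7_vs_m8(lnl_m7=-2000.0, lnl_m8=-1990.0)
```

A small p-value indicates that allowing a class of sites with ω > 1 improves the fit, after which the Bayes empirical Bayes posterior probabilities are inspected to locate the candidate sites.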

2.8. Phylogenetic Analysis

To assess phylogenetic relationships, we constructed two phylogenies using Bayesian inference (one for the sampled species and the other combining MHC class I sequences of related passerines with those of the sampled species). The GTR + T nucleotide substitution model [52] was selected as the best fit for our data using MrModeltest [35] under the Akaike Information Criterion (AICc) [53]. The Bayesian Markov chain Monte Carlo (MCMC) analysis was run for two million generations, sampling every 1,000 generations, to ascertain when the log-likelihood reached a stationary phase. The phylogenetic tree was summarized in MrBayes v3.1.2 [54], with the first 25% of trees discarded as burn-in. FigTree was used to visualize the consensus tree. To explore the relationship between the sampled species and related avian species, we conducted a maximum likelihood (ML) analysis with MEGA 7.0 [55]. The data were analyzed with the T92 + G model. We conducted 1000 bootstrap replicates to estimate support. Values greater than 75% are indicated in the ML phylogenetic trees. The species covered are mainly from Passeridae, Acrocephalidae, Paridae, Motacillidae, Muscicapidae, Hirundinidae, Phylloscopidae, Fringillidae, Cardinalidae, and Sturnidae. To further identify allelic lineages among the sampled species and related avian species, we applied the Neighbor-Net algorithm in SplitsTree 4.14.8. Neighbor-Net networks were based on uncorrected p-distances, with 1000 bootstrap replicates to estimate nodal support. Nodal support values (>75%) are displayed.
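
The sampling scheme described for the Bayesian analysis implies a fixed number of posterior samples. The arithmetic is sketched below; the function name is hypothetical, and the exact count in practice depends on whether the software also records the starting state of the chain, which this sketch ignores.

```python
def mcmc_sampling_plan(generations, sample_every, burnin_fraction):
    """Number of sampled trees, trees discarded as burn-in, and trees kept."""
    samples = generations // sample_every
    discarded = int(samples * burnin_fraction)
    return {
        "samples": samples,
        "burnin": discarded,
        "retained": samples - discarded,
    }


# Settings stated in the text: 2,000,000 generations, one sample per 1,000
# generations, first 25% removed as burn-in.
plan = mcmc_sampling_plan(2_000_000, 1_000, 0.25)
```

With these settings roughly 2,000 trees are sampled per run, and about 1,500 contribute to the consensus tree.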

3. Results

3.1. Characterization of Alleles

We successfully and selectively amplified MHC class I exon 3 across 80 individuals from four species of the Turdidae family using the HN36 and HN46 primers. An average of 22.7 clones per individual were sequenced. Sequences varied between 459 and 579 base pairs. The multiple sequence alignment of all sampled species was 411 base pairs long. The final aligned MHC class I dataset comprised 285-291 bp (primers not included). Analysis of the gDNA alignment revealed a total of 77 distinct haplotypes/alleles, including 47 PFA. Each sequence was confirmed by BLAST search to exhibit similarity (81%-93%) with previously reported passerine MHC class I sequences. The number of PFA sequences found in a single individual ranged from one to five, indicating that one to three loci exist in three of the four species of the Turdidae family. In Turdus atrogularis, however, the number of putative functional alleles found in a single individual ranged from two to seven, indicating two to four loci. The number of individuals tested, the numbers of PFA and pseudogenes retrieved, and the estimated minimum number of functional loci are given in Table 1. Three alleles (Tuna-MHCIPFA05 = Tueu-MHCIPFA09, Tuna-MHCIPFA07 = Tueu-MHCIPFA02, and Tueu-MHCIPFA05 = Tuna-MHCIPFA015) were shared between Turdus naumanni and Turdus eunomus. Two alleles (Turu-MHCIPFA05 = Tuat-MHCIPFA02 and Turu-MHCIPFA09 = Tuat-MHCIPFA08) were also shared between individuals of Turdus ruficollis and Turdus atrogularis. Interestingly, in the population of Turdus naumanni, genotypes comprising one allele were by far the most repeated (26.67%, 8/30), followed by genotypes comprising two (16.67%, 5/30) and four alleles (13.3%, 4/30). The pattern was broadly similar in the populations of Turdus eunomus and Turdus ruficollis. In Turdus eunomus, genotypes comprising one allele (23.3%, 7/30) were the most repeated, followed by genotypes comprising three alleles (16.67%, 5/30). In Turdus ruficollis, genotypes comprising one allele (33.33%, 5/15) were repeated. Allelic repetition was absent in the population of Turdus atrogularis.

Of the 77 sequences, 30 were non-translatable because indels or stop codons changed the reading frame. These sequences were thus presumed to be pseudogenes. The number of pseudogenes identified within the four species ranged between three and five in most individuals of the study population, and six of the thirteen pseudogene sequences were identical in three individuals from the population of sampled species. We cannot rule out the possibility that some of the identified pseudogene sequences result from PCR or sequencing artifacts, as such events would more often produce nonfunctional sequences. A nucleotide deletion resulting in the loss of 3 amino acids was evident in Tuna-MHCIPS07-9, Tueu-MHCIPS01-04, and Tueu-MHCIPS08. Nucleotide deletions, frameshift mutations, and premature stop codons were detected in Turu-MHCIPS01, 03 and MHCIPS09 at amino acid 33 of exon 3. A loss of 3 amino acids at position 78 was detected in Tuat-MHCIPS05 and Tuat-MHCIPS06.

3.2. Analysis of Genetic Diversity

Overall, genetic diversity (π) within the exon 3 allele repertoire was elevated, and it was higher among individuals of Turdus atrogularis (0.151) than among those of Turdus eunomus (0.113). The average number of nucleotide differences (K) varied between 43.95 in Turdus atrogularis and 32.32 in Turdus eunomus.

3.3. Analysis of Recombination

The Recombination Detection Program not only analyzes breakpoints but also identifies parent sequences. We ran the recombination tests on the pooled putative functional alleles recovered from the four species of the Turdidae family. We found only one potential recombination event in Turdus naumanni, in Tuna-MHCIPFA06, with two recombination breakpoints at positions 148 and 253, and with Tuna-MHCIPFA02 as the major parent and Tuna-MHCIPFA011 as the minor parent. Likewise, a single recombination event was significant in Tueu-MHCIPFA07. We detected no recombination among the other alleles. However, these recombination events were significant in only two of the seven tests and were not consistent with the recombination breakpoints identified by GARD; hence, overall, recombination is unlikely to have had any prominent effect on the tests for positively selected sites (Table 2). The recombination breakpoints identified by these two programs are often inconsistent, probably because they use different computational methods.

3.4. Analysis of Selection

Considering that the evolutionary history of each domain might have been different, we tested each domain separately for evidence of positive selection. Traditional selection statistics did not reveal any statistically significant deviation from neutral expectations for Turdus eunomus (Tajima’s D: -0.87309; Fu and Li’s D* test statistic: 0.36; Fu and Li’s F* test statistic: 0.03) or Turdus atrogularis (Tajima’s D: -0.86107; Fu and Li’s D* test statistic: 0.19; Fu and Li’s F* test statistic: -0.077). Still, the overall dN/dS value was statistically significantly elevated in Turdus atrogularis (1.687), and the ratio was more pronounced at codons presumably coding for the PBR (1.994) than at codons not involved in peptide binding (0.884) (Table 3).
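
As background for the neutrality statistics reported above, the following sketch computes Tajima’s D from a set of aligned sequences using the standard formula, which compares the average number of pairwise differences with the number of segregating sites scaled by sample size. It is an illustration only: the function name and toy alignments are hypothetical, gaps are not handled, and DnaSP applies its own treatment of missing data.

```python
import math
from itertools import combinations


def tajimas_d(seqs):
    """Tajima's D for equal-length aligned sequences (no gap handling)."""
    n = len(seqs)
    if n < 4:
        raise ValueError("at least four sequences are needed")
    # S: number of segregating (polymorphic) sites
    s = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if s == 0:
        raise ValueError("no segregating sites; D is undefined")
    # k: average number of pairwise nucleotide differences
    pairs = list(combinations(seqs, 2))
    k = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs) / len(pairs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (k - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


# Toy alignments (hypothetical): only singleton variants, then only
# intermediate-frequency variants.
d_rare = tajimas_d(["TAAA", "ATAA", "AATA", "AAAT"])
d_balanced = tajimas_d(["AAAA", "AAAA", "TTTT", "TTTT"])
```

An excess of rare variants pushes D below zero, while an excess of intermediate-frequency variants, as expected under balancing selection, pushes it above zero; values close to zero are compatible with neutrality.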

Application of the likelihood models showed that model M8, which allows for positive selection, provides a better fit than the neutral model M7. The sites recognized as positively selected are given in Table 4. In total, we found 12 codons under positive selection in the sampled species, of which three sites (25%) match homologous codons found to be positively selected in other avian species and one (8.3%) matches the human peptide-binding region (Table 4).

Broadly consistent with the above findings, each alternative test for positive selection (MEME, SLAC, FEL, and FUBAR) implemented in the online adaptive evolution server Datamonkey (Weaver et al., 2018) identified numerous codons under positive selection (Figures 2 and 3).

Across all tests for positive selection, four codons (9, 29, 65, and 88) were consistently identified by all methods as being under positive selection. Among the positively selected codons, codons 42 and 59 correspond to the PBR in humans, and codons 9, 29, 64, and 88 are also homologous to PBR sites known to be positively selected among passerines in general [56] (Figure 4). The ten most frequent MHC class I alleles retrieved from the sampled species displayed 87%-91% sequence similarity to 18 sequences from five other passerine families (Acrocephalidae, Passeridae, Muscicapidae, Paridae, Passerellidae). None of the 77 alleles studied had 100% sequence similarity to other sequences published in GenBank; thus, no allele in the study population was identical to a sequence published for another species.

3.5. Phylogenetic Analysis

In the phylogenetic analysis, the sampled species formed a well-supported monophyletic clade with Erithacus rubecula, a member of the Turdidae family, in the maximum likelihood analysis. The Bayesian analysis showed that most alleles were shared between Turdus atrogularis and Turdus ruficollis. The pattern was similar between Turdus naumanni and Turdus eunomus (Figure 4). The Neighbor-Net network of putative functional and pseudogene MHC class I exon 3 sequences from the Turdidae family and other passerines indicates that the allelic distribution among them is almost congruent, with limited divergence. For instance, Tueu-MHCIPFA02 and Tuna-MHCIPFA07 formed a monophyletic clade in the phylogenetic network of exon 3. Three alleles were shared between Turdus naumanni and Turdus eunomus, and two between Turdus ruficollis and Turdus atrogularis. The clustering of sequences among species could be due to transspecies polymorphism or orthology [57].

4. Discussion

In this study, we have for the first time characterized the MHC class I gene in four species of the Turdidae family (order Passeriformes) from a wide geographical area of northeast China. Analysis of MHC class I sequences revealed a total of 77 distinct haplotypes/alleles, including 47 putative functional alleles. Passerines are a group reported to have surprising MHC diversity [58, 59]. According to our findings, the number of functional loci in an individual ranged from one to three in three of the four species, consistent with findings from other passerine species studied to date [60]. In addition, we detected a large number of presumed pseudogene sequences in the sampled population; these retain important information about the evolution of the MHC. This is not surprising, as it is consistent with the expectation of birth-and-death evolution [61]. We made a significant effort to characterize the variation in MHC class I exon 3 in our study population, and we consider it unlikely that the primers introduced bias in allelic variation among individuals. Hence, variation in the number of MHC class I alleles per individual should largely be due to variation in gene copy number among individuals, which has been confirmed in other birds [62]. A few MHC class I alleles were shared between Turdus naumanni and Turdus eunomus, as well as between individuals of Turdus ruficollis and Turdus atrogularis, indicating allelic sharing due to common ancestry or to challenge by common pathogens; such sharing is frequent in numerous avian species such as owls, ardeid birds, penguins, and passerines [63-65].

Generally, abundant genetic variation in a species indicates its capacity to adapt to environmental change. Rapidly evolving environmental pathogens would cause MHC genes to exhibit increased genetic diversity in a species [66, 67]. Collectively, in our study, we found elevated genetic diversity and significant divergence among functional sequences, whereas pseudogenes showed low genetic variation and limited divergence. Similar results have been described in other passerine species, including the common yellowthroat [68], the great reed warbler, and the great tit [69]. The allelic variation described in our study could reflect enhanced immunological defense against intracellular pathogens, since these are highly unlikely to adapt to novel, infrequent variants [15].

Recombination has been considered an important mechanism that influences allelic diversity and drives the evolution of MHC genes [70, 71]. Using the Recombination Detection Program, we found only one potential recombination event in Tuna-MHCIPFA06, with two recombination breakpoints at positions 148 and 253. Similarly, a single recombination event was significant in Tueu-MHCIPFA07. The recombination signal was also restricted to two of the seven tests; hence, our findings indicate that recombination is unlikely to have had any significant influence on the tests for positively selected sites. Although we did not find substantial recombination among the other alleles, qualitatively our results suggest a role for recombination during the evolution of MHC class I in the species studied. This is consistent with the observation that micro-recombination is frequent in MHC genes [57]. Further study of recombination will contribute to a detailed understanding of its role in the evolution of MHC genes.

Positive selection maintains alleles carrying advantageous mutations that preserve the fitness of an individual. In our study, the classical tests of selection (Tajima’s D, Fu and Li’s D*, and Fu and Li’s F*) showed no deviation from neutrality that would indicate positive or balancing selection. Given the level of variation, conventional methods for detecting selection have limited power [72]. Positively selected sites are likely to accumulate more nonsynonymous than synonymous substitutions, producing amino acid variation that results in functional modifications in proteins [73]. Our study revealed different selection patterns in functional sequences between the PBR and non-PBR regions of the MHC class I gene. In Turdus atrogularis, codons in the peptide-binding region showed a greater excess of nonsynonymous over synonymous substitutions than codons in the non-peptide-binding region, and the pattern was consistent among all species tested, which might be explained by stronger selection pressure from intracellular pathogens than from extracellular pathogens [74]. Evidence of positive selection at the PBR of the MHC, based on comparison of PBR and non-PBR codons, has been reported in the house sparrow [75] and the golden pheasant [76]. Of the 12 codons found to be under positive selection among the species tested with likelihood methods in PAML, codons 9, 29, 64, and 88 match homologous codons found to be positively selected in other passerine species.

It should be noted that pooling alleles across loci will mostly reduce the power of selection tests, so the outcomes might be conservative but will be less prone to false positives [77, 78]. Caution is therefore needed when drawing inferences about the detected MHC diversity and the possible effects of selection on individual loci. Our results suggest that the α2 domain, encoded by MHC class I exon 3, is under positive selection in all species studied. Pronounced positive selection at antigen-binding sites permits a species or population to present a larger repertoire of peptides (antigens), thus increasing its defensive ability against parasitic and pathogenic infections.

Finally, phylogenetic clustering of the MHC class I dataset of the sampled species, when pooled with other passerine species, produced a contrasting pattern. In general, the MHC class I sequences of the Turdidae family clustered together with sequences from congeneric species. We found greater sequence similarity between species than within species (trans-specific similarity), which is usually explained by transspecies polymorphism (TSP), in which alleles pass from ancestral to descendant species through incomplete lineage sorting [79]. Trans-specific similarity can, however, also be explained by convergent evolution resulting from similar environmental selective pressures. Studies indicate that TSP is a primary mechanism responsible for the clustering of alleles at avian MHC class I [80] (Figure 5).

5. Conclusion

Our study shows that species of the Turdidae family have retained significant MHC class I diversity, which supports their high conservation value and contributes to understanding the evolution of MHC class I genes. Importantly, we specifically amplified exon 3, which provides an opportunity to avoid chimera formation during the molecular characterization of hypervariable immune genes. At the same time, our study is the first to demonstrate contrasting patterns of allelic diversity and positive selection at inferred PBR and non-PBR codons, supporting the hypothesis that different mechanisms can shape the evolutionary paths of MHC class I.


MHC:Major histocompatibility complex
PBR:Peptide-binding region
CDs:Cluster of differentiation
CDRs:Complementary-determining regions
ABS:Antigen-binding site
TCR:T-cell receptors
SDS:Sodium dodecyl sulfate
EDTA:Ethylenediaminetetraacetic acid
PCR:Polymerase chain reaction
HLA:Human leukocyte antigen
MEGA:Molecular Evolutionary Genetics Analysis
GTR:General time-reversible model
PFA:Putative functional alleles
GARD:Genetic algorithm for recombination detection
RDP:Recombination detection program
PP:Posterior probability
dN:Nonsynonymous substitution
dS:Synonymous substitution
:Nucleotide distance
:Amino acid distance
SE:Standard error
PSSs:Positively selected sites
BEB:Bayes empirical Bayes
MEME:Mixed effects model of evolution
SLAC:Single likelihood ancestor counting
FEL:Fixed effect likelihood
FUBAR:Fast unconstrained Bayesian approximation
PAML:Phylogenetic analysis using maximum likelihood
TSP:Transspecies polymorphism.

Data Availability

The data of this study will be available openly to readers, and they can access the data supporting the conclusions of the study.


Disclosure

The manuscript has been presented as a preprint at https://www.researchgate.net/publication/346148804.

Conflicts of Interest

The authors declare no conflict of interest.

Authors’ Contributions

MUG, AY, and LB designed the study. MUG carried out the experiment and drafted the manuscript. LB supervised the whole study and provided recommendations for and revised the MS. YCX provided valuable suggestions for the MS. All authors contributed to and approved the current manuscript draft.


Acknowledgments

This study was supported by the Fundamental Research Funds for Central Universities (grant no. 2572018BE04). The authors are indebted to Kang Hui and Wang Dong for their technical assistance in experiments and for willingly providing guidance during in silico data analysis. The authors also thank Jacob Njaramba Ngatia and Dr. Mehboob Ahmad for their valued suggestions.