Abstract

Myostatin (MSTN) is a negative modulator of muscle mass. We characterized the horse (Equus caballus) MSTN gene and identified and analysed single nucleotide polymorphisms (SNPs) in breeds of different morphological types. Sequencing of coding, untranslated, intronic, and regulatory regions of the MSTN gene in 12 horses from 10 breeds revealed seven SNPs: two in the promoter, four in intron 1, and one in intron 2. The two promoter SNPs (GQ183900:g.26T>C and GQ183900:g.156T>C, the latter located within a conserved TATA box-like motif) were screened in 396 horses from 16 breeds. The g.26C and g.156C alleles were more frequent in heavy breeds (brachymorphic type) than in light breeds (dolichomorphic type, such as the Italian Trotter). The significant differences in allele frequencies at the promoter SNPs and the analysis of molecular variance (AMOVA) on haplotypes indicate that these polymorphisms could be associated with variability of morphological traits in horse breeds.

1. Introduction

Myostatin, encoded by the MSTN gene (previously referred to as GDF8), is a member of the transforming growth factor-β superfamily that normally acts to limit skeletal muscle mass by regulating both the number and growth of muscle fibres [1, 2]. MSTN is synthesized as a precursor that, upon proteolytic processing, gives an N-terminal latency-associated peptide, termed myostatin propeptide or LAP-fragment, and a smaller mature peptide at the C-terminus [3]. The MSTN gene, composed of three exons and two introns, has been characterized in rodents [1], humans [4], and several livestock species [3, 5–8]. Natural mutations that decrease the amount of myostatin and/or inhibit its function have been identified in a human subject [9] and in several cattle [2, 3, 10–13], sheep [14–16], and dog [17] breeds. In Belgian Blue, Piedmontese, Marchigiana, and other cattle breeds, loss-of-function mutations within the coding sequence of the MSTN gene cause increased skeletal muscle mass, most evident in the shoulders and thighs, and the resulting phenotype is known as "double-muscling" [2, 3, 10–13, 18]. These polymorphisms have, in several cases, effects on growth, reproductive performance, and carcass quality traits [3, 18, 19]. In the Whippet dog breed, a mutation in the third exon that introduces a premature stop codon causes an increased muscle mass phenotype in the homozygous state and enhanced racing performance in heterozygous dogs [17]. In two Norwegian sheep breeds, two different mutations in the MSTN coding region are associated with carcass conformation and fatness [15, 16]. In addition, in other sheep breeds and in pigs, mutations identified in noncoding regulatory regions affect the level of MSTN gene expression and/or are associated with growth, muscle mass, and other carcass traits [8, 14, 20, 21].

In the horse (Equus caballus), only a few studies have examined the MSTN gene so far. Hosoyama et al. [22] isolated and sequenced an MSTN cDNA from a Thoroughbred horse, and Caetano et al. [23] mapped this gene to equine chromosome 18. Mutations in the equine MSTN gene have been identified only recently, in the Thoroughbred breed [24].

Different horse breeds present a variety of morphological phenotypes that have been used to group breeds into a few classes. However, no system provides a robust classification in which each breed could have an unequivocal assignment. Based on size traits and build, horse breeds are categorized in draught (or heavy), light, and pony (or animals that mature at less than 148 cm high, usually used as riding school and children’s mounts) [25]. Considering skeletal structure, proportions, zoometrical indices, length, and volume of muscling, that, in turn, reflect the selective goals and uses of the horse breeds, they are categorized in brachymorphic, mesomorphic, dolichomorphic, and intermediate types (such as meso-dolichomorphic) [26]. Brachymorphic horses (corresponding to draught horses), traditionally referred to as cold-blooded horses in relation to their quiet and calm temperament, are tall in stature, heavy boned, and extremely muscular with short and thick muscles and slow twitch oxidative fibers for slow contraction. They most likely develop strength and power, and their conformation is well suited for pulling carriage, draught power and meat production. Dolichomorphic horses (corresponding to light horses) are characterized by longer bodies and long and thin muscles mainly constituted by fast twitch glycolytic fibers. They are selected for sport purposes, fast running, and high speeds. Examples of Italian breeds representative of these two extreme phenotypes are reported in Figure 1. Mesomorphic type is characterized by a lighter physical structure than brachymorphic but still powerful and compact with massive muscling. This group also includes some breeds with draft-type qualities and classified as ponies based on their withers height (such as Bardigiano and Haflinger breeds). The mesomorphic horses are usually used for pleasure and riding. 
In addition, several breeds (such as local breeds influenced by Oriental, Thoroughbred, and Iberian halfbreds and their descendants) have characteristics of both mesomorphic and dolichomorphic types (referred to as meso-dolichomorphic) [26].

Given the important pleiotropic effects of the MSTN gene, including its role in muscle mass development, polymorphisms in this gene could help to explain the morphological variability among horse breeds. Here we sequenced the MSTN gene, including regulatory regions, in several horse breeds and identified a few polymorphisms, which we then used to evaluate their potential association with different morphological types.

2. Materials and Methods

2.1. Animals and Horse Breeds Classification Based on Different Morphological Types

A total of 396 minimally related horses belonging to 16 breeds were sampled in different farms or stables. Details of the horse breeds involved in the analysis are given in Table 1.

These horse breeds were classified as brachymorphic (B), mesomorphic (M), meso-dolichomorphic (M-D), and dolichomorphic (D) (Table 1), as indicated on the homepages of their respective national breed associations, based on linear measures (height at withers, chest girth, and cannon circumference), structure, and anamorphosis index (AI = (chest girth)² × 100 / height at withers), and on bibliographic data [27–30].
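
The anamorphosis index defined above is a simple ratio, and a minimal sketch of its calculation may make the formula concrete. The function name, the measurement values, and the choice of metres as the unit are assumptions for illustration only; the text does not state the units or any breed-specific thresholds.

```python
def anamorphosis_index(chest_girth: float, withers_height: float) -> float:
    """AI = (chest girth)^2 * 100 / height at withers, as given in the text.

    Both measurements must use the same unit (metres are assumed here).
    """
    if withers_height <= 0:
        raise ValueError("height at withers must be positive")
    return chest_girth ** 2 * 100 / withers_height


# Hypothetical measurements, not data from the study.
heavy_ai = anamorphosis_index(chest_girth=2.10, withers_height=1.60)
light_ai = anamorphosis_index(chest_girth=1.80, withers_height=1.60)
print(f"heavy build AI = {heavy_ai:.1f}, light build AI = {light_ai:.1f}")
```

At equal height, a larger chest girth gives a higher index, which is why the index separates heavier from lighter builds.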

All horses were registered in the Stud Books or in the Italian Anagraphic Register constituted for local ethnic groups (Noric, Salernitano, Tolfetano, and Ventasso). The Lipizzan samples (Lipizzan Italian Stud, Monterotondo, Italy) included all six classical stallion lines: Conversano, Favory, Maestoso, Neapolitano, Pluto, and Siglavy.

2.2. PCR and Sequencing

Genomic DNA was extracted from hair roots following standard procedures. Ten primer pairs (Table 2) that amplify different MSTN regions were designed using Primer3 (http://frodo.wi.mit.edu/primer3/input.htm) software. PCR reactions were performed in a final volume of 20 µL containing 10–80 ng of equine genomic DNA, 250 mM of each dNTP, 10 pmol of each primer, 1 U of EuroTaq DNA polymerase (EuroClone Ltd., Paignton, Devon, UK) or 1 U of TaKaRa Ex Taq DNA Polymerase (TaKaRa Bio Inc., Shiga, Japan), and 1× PCR buffer with concentration specific for each primer pair (Table 2). PCR conditions were: an initial step at for 5 minutes, 35 cycles of for 30 s, specific annealing temperature for each primer pair for 30 s, for specific reaction times for different primer pair (Table 2), and a final step at for 9 minutes. Genomic DNA obtained from 12 horses of 10 breeds (1 Bardigiano, 1 Haflinger, 1 Italian Saddle, 2 Italian Trotter, 1 Noric, 2 Rapid Heavy Draft, 1 Salernitano, 2 Thoroughbred, and 1 Ventasso) constituted the sequencing panel. PCR fragments obtained from the sequencing panel with primer pairs 1–10 were purified using the QIAquick PCR Purification Kit (Qiagen, Düsseldorf, Germany) and sequenced on both strands using the BigDye Cycle Sequencing kit v.3.1 (Applied Biosystems, Foster City, CA, USA). Sequencing reactions were electrophoresed in a capillary sequencer (Applied Biosystems).

2.3. Sequence Analysis, Polymorphism Identification, and Genotyping

Sequences were aligned and processed with the BioEdit software v.7.0.5.2. Polymorphisms were identified by visual inspection of the electropherograms, and sequences were compared using the ClustalW2 program (http://www.ebi.ac.uk/Tools/clustalw2/index.html) and BLASTN (http://blast.ncbi.nlm.nih.gov/Blast.cgi). The horse MSTN promoter sequence was analysed in silico for the presence of putative transcription factor binding sites using the MatInspector (http://www.genomatix.de/) bioinformatics tool. This region of the horse MSTN gene was aligned with that of cattle (AJ310751), goat (AY827576), human (AX058992), mouse (AY204900), pig (AY8641281), and sheep (DQ530260) to identify evolutionarily conserved motifs.

PCR-RFLP protocols were designed to genotype two identified SNPs (g.26T>C and g.156T>C) in the sampled horses. To genotype the g.26T>C SNP, the amplified products of 484 bp obtained with primer pair 1 (Table 2) were digested with RsaI (recognition sequence: GT^AC). Briefly, of PCR reaction was restricted with 2.5 U of RsaI (Fermentas, Vilnius, Lithuania) at overnight and the resulting fragments (g.26T allele = 484 bp; g.26C allele = ) were resolved on 2.0% agarose gels stained with ethidium bromide. The g.156T>C SNP was genotyped by amplifying a fragment of 204 bp with primer pair 11 (Table 2), which inserted an artificial restriction site (with a mismatched reverse primer) for SspI (recognition sequence: AAT^ATT) when the g.156T allele occurred. The obtained fragments (g.156T allele = 179 bp + 25 bp; g.156C allele = 204 bp) resulting from digestion of of PCR reaction with 2.5 U of SspI (Fermentas) at overnight were electrophoresed in 3.5% agarose gels and visualized with ethidium bromide.
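
The fragment patterns described for the g.156T>C assay map directly onto genotypes, and the following minimal sketch illustrates that mapping. The function name and the size tolerance are hypothetical. As a further assumption, only the 179 bp and 204 bp bands are scored, since the 25 bp fragment may be hard to resolve on an agarose gel; this is an illustration, not the scoring procedure used in the study.

```python
def call_g156_genotype(band_sizes_bp, tolerance_bp=5):
    """Call the g.156T>C genotype from band sizes (bp) read off a gel.

    Expected SspI patterns from the text:
      g.156T allele -> 179 bp + 25 bp (artificial SspI site is cut)
      g.156C allele -> 204 bp (amplicon stays intact)
    """
    def seen(expected_bp):
        return any(abs(band - expected_bp) <= tolerance_bp for band in band_sizes_bp)

    has_t = seen(179)  # cut product, so at least one g.156T allele
    has_c = seen(204)  # uncut amplicon, so at least one g.156C allele
    if has_t and has_c:
        return "T/C"
    if has_t:
        return "T/T"
    if has_c:
        return "C/C"
    return None  # no interpretable band


print(call_g156_genotype([179, 25]))       # homozygous T
print(call_g156_genotype([204, 179, 25]))  # heterozygous
print(call_g156_genotype([204]))           # homozygous C
```

A sketch like this cannot distinguish a true heterozygote from an incompletely digested T/T sample, so digestion controls remain necessary in practice.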

2.4. Statistical Analysis

Allele and genotype frequencies, observed and expected heterozygosity, and FST were calculated using PopGene software v. 1.32 [31]. FST is a measure of population differentiation based on genotypic data. Allele frequencies among the four groups (B, M, M-D, and D) were compared using Fisher's exact test. The haplotypes of the two promoter SNPs were reconstructed using the PHASE program v. 2.0 [32]. ARLEQUIN software v. 3.1 (http://cmpg.unibe.ch/software/arlequin3) was used for the analysis of molecular variance (AMOVA), testing the effect of the morphological types on population differentiation with a model including types (four levels: B, M, M-D, and D; and two levels: B + M and M-D + D), types/breeds, individuals/breeds, and individuals.
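
To make the statistics named above more concrete, the following sketch computes expected heterozygosity, a simple Wright-style FST for one biallelic SNP, and a two-sided Fisher's exact test on a 2×2 table of allele counts. It uses only the Python standard library. The function names and all input numbers are hypothetical, and the estimators implemented in PopGene may differ in detail (for example, in sample-size corrections).

```python
from math import comb


def expected_heterozygosity(p):
    """Expected heterozygosity 2p(1 - p) for a biallelic locus."""
    return 2 * p * (1 - p)


def fst_biallelic(freqs, sizes):
    """FST = (HT - HS) / HT from per-group allele frequencies and group sizes."""
    total = sum(sizes)
    p_bar = sum(p * n for p, n in zip(freqs, sizes)) / total
    h_t = expected_heterozygosity(p_bar)
    h_s = sum(expected_heterozygosity(p) * n for p, n in zip(freqs, sizes)) / total
    return 0.0 if h_t == 0 else (h_t - h_s) / h_t


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Hypothetical allele counts (C, T) in two groups, not data from the study.
p_value = fisher_exact_2x2(30, 170, 5, 195)
fst = fst_biallelic(freqs=[30 / 200, 5 / 200], sizes=[100, 100])
print(f"Fisher p = {p_value:.2e}, FST = {fst:.3f}")
```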

3. Results and Discussion

3.1. Horse MSTN Genomic Structure and Sequence Analysis

Sequenced fragments of the horse MSTN gene were assembled into one sequence of 5724 bp (submitted to GenBank under accession number GQ183900) that was 100% identical to the sequence annotated, in the meantime, in the EquCab2 horse genome assembly derived from a Thoroughbred horse (http://www.ensembl.org/Equus_caballus/Search/, Ensembl release 52-Dec 2009). Our sequence contained 671 bp upstream from the ATG start codon (538 bp of the promoter and the entire 5′-untranslated region (UTR) of 133 bp), the three exons (except 33 bp of exon 1), the two intervening introns, and 80 bp of the 3′-UTR (Figure 2). The transcription start site of the first exon was deduced from human and bovine MSTN exon 1 sequences [4, 6]. The coding regions of exons 1, 2, and 3 of the horse MSTN gene contained 373, 374, and 381 bp, respectively. Introns 1 and 2 included 1829 bp and 2016 bp, respectively, almost the same lengths reported in cattle (1840 bp and 2033 bp, respectively) and pig (1809 bp and 1980 bp, respectively) [6, 8]. Intron 1 is a type 1 intron as it interrupts a codon between the first and second exon, whereas intron 2 is a type 0 intron as it divides the coding sequence between two codons, as in other species [6, 8]. The analysed proximal promoter region and the 5′-UTR of the horse MSTN gene exhibited a degree of identity with the corresponding regions of other species ranging from 77% (mouse) to 90% (pig). Putative consensus DNA sequences known as transcription factor binding sites, DNA-binding motifs, or cis-regulatory elements were identified in the positive strand of the horse promoter (Figure 3). Considering the general transcription factors, three different putative TATA boxes (TATA-1, TATA-2, and TATA-3) and one CCAAT box were detected.
Among muscle-specific transcription factors, four E-boxes (named E1, E2, E3, and E4 boxes, Figure 3), one putative site for myocyte specific enhancer factor 2 (MEF2 or MEB1), and consensus sequences for FoxO and SMAD family binding sites (CAAAATA and CAGACA, respectively) were identified. The alignment of MSTN promoter sequences across different species (horse, cattle, goat, human, mouse, pig, and sheep) revealed that these DNA-binding motifs, particularly those close to the TATA-1 surrounding sequence, were highly conserved across species. In particular, TATA-1 was conserved in all examined species except mouse, the second TATA sequence (TATA-2) was conserved across all seven species, and TATA-3 was conserved in all species except pig and mouse. The MEF2 site and the E-boxes were conserved in all the considered mammals, except for the E4 box in human and mouse. The E-boxes can be activated by the myogenic regulatory factors (MRFs: MyoD, Myf5, myogenin, and MRF4). MyoD upregulates MSTN transcription [33] and, at the same time, MSTN inhibits MyoD expression and activity, regulating the differentiation of myoblasts into myotubes [34]. MyoD and MRF4 play competitive roles in myogenesis and might act as molecular switches to determine myogenic differentiation and cell proliferation, respectively [35]. Additional E-boxes were identified in the analysed region (such as an E-box located near TATA-2 in pig and an additional E-box in all mammals but cattle) and in the distal region of the promoter of the other mammals (not included in Figure 3). In cattle, Spiller et al. [33] showed the importance of three functional E-boxes (E3, E4, and E6), of which E6, occupied by MyoD in vitro and in vivo, proved crucial for MSTN promoter activity. The close position of functional E-boxes suggests that they might function as a cluster to better sustain the stability of DNA-protein interactions.
Across the MSTN promoter sequences of all considered livestock species we identified the conserved position of sites matching the consensus for FoxO binding and the adjacent SMAD box, whose presence was not reported in previous works [8, 33, 36]. Recent data demonstrated that these factors appear to act through independent pathways but additively to regulate the expression of MSTN and contribute to the control of muscle cell growth and differentiation [37, 38]. In addition, FoxO transcription factors play a critical role in the development of muscle atrophy by stimulating proteolysis and by increasing myostatin expression. Putative E-boxes were identified both in intron 1 (six boxes) and in intron 2 (six boxes), and one putative E-box was located in the 3′-UTR, seven nucleotides downstream of the TGA stop codon (data not shown). The presence of E-boxes in the introns and 3′-UTR of the equine MSTN gene has not been described yet, even though their occurrence has been highlighted recently in introns of the porcine MSTN gene [8].
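
The motif searches described in this section can be approximated by simple pattern matching, sketched below. The E-box pattern uses the widely used CANNTG consensus, while the FoxO and SMAD strings are the consensus sequences quoted in the text. The toy sequence and the function name are hypothetical. MatInspector relies on weight matrices rather than exact string matches, so this is only a rough illustration, not a reimplementation of that tool.

```python
import re

# Consensus patterns scanned on the positive strand only, as in the text.
MOTIFS = {
    "E-box": "CA[ACGT]{2}TG",  # CANNTG consensus
    "FoxO": "CAAAATA",         # consensus quoted in the text
    "SMAD": "CAGACA",          # consensus quoted in the text
}


def scan_motifs(sequence):
    """Return sorted (1-based start, motif name, matched sequence) tuples."""
    seq = sequence.upper()
    hits = []
    for name, pattern in MOTIFS.items():
        # A lookahead is used so that overlapping matches are also reported.
        for match in re.finditer(f"(?=({pattern}))", seq):
            hits.append((match.start() + 1, name, match.group(1)))
    return sorted(hits)


# Toy sequence for illustration, not taken from GQ183900.
toy_promoter = "GGCAGCTGTTCAAAATACAGACAGG"
for start, name, matched in scan_motifs(toy_promoter):
    print(start, name, matched)
```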

3.2. Identification of Polymorphisms in the Horse MSTN Gene

Sequencing of the panel of horses of different morphological types revealed a total of seven single nucleotide polymorphisms (SNPs) (Figure 2). Two transitions were located in the promoter region at -646 (GQ183900:g.26T>C) and -516 (GQ183900:g.156T>C) bp upstream from the start codon. The g.26T>C SNP was within a conserved position (except in mouse) but not within an identified known functional motif, while the g.156T>C polymorphism was within a TATA box-like motif (TATA-3; YATAAA, Figure 3). Sequence alignments of the MSTN promoter regions of different species indicate that the g.26T and g.156T alleles derive from an ancestral MSTN sequence, as most closely related species present the indicated nucleotides (Figure 3 and data not shown). The other five SNPs were in intronic regions: four were localized in intron 1 and one in intron 2 (Figure 2). Three of the SNPs of intron 1 (g.1634T>G, g.2115A>G, and g.2327A>C) were also recently identified in the Thoroughbred breed [24], one of which (g.2115A>G; indicated in [24] as g.66493737C>T) has been associated with sprinting ability and racing stamina in Thoroughbred horses [24]. The remaining SNPs were not reported by others and represent new polymorphisms of the horse MSTN gene. None of these intronic SNPs resided within splice sites or within particularly conserved sequence elements. No indels or synonymous or nonsynonymous substitutions were identified.
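
The promoter coordinates and intron types reported in Sections 3.1 and 3.2 can be cross-checked with simple arithmetic, sketched below. The sketch assumes that g.1 is the first base of the deposited GQ183900 sequence, that the 671 bp upstream region spans g.1 to g.671, and that the A of the ATG is numbered +1 with no position 0. The function names are hypothetical.

```python
UPSTREAM_BP = 671                  # bases upstream of the ATG in GQ183900 (from the text)
CODING_EXON_BP = [373, 374, 381]   # coding lengths of exons 1, 2, and 3 (from the text)


def position_relative_to_atg(g_position):
    """Convert a 1-based upstream g. coordinate to a position relative to the ATG."""
    if not 1 <= g_position <= UPSTREAM_BP:
        raise ValueError("coordinate is outside the upstream region")
    return g_position - UPSTREAM_BP - 1


def intron_phases(coding_exon_lengths):
    """Phase of each intron: cumulative coding length before the intron, modulo 3."""
    phases, cumulative = [], 0
    for length in coding_exon_lengths[:-1]:
        cumulative += length
        phases.append(cumulative % 3)
    return phases


print(position_relative_to_atg(26), position_relative_to_atg(156))
print(intron_phases(CODING_EXON_BP))
print(sum(CODING_EXON_BP) // 3, "codons including the stop codon")
```

Under these assumptions the two promoter SNPs fall at -646 and -516, and the intron phases come out as 1 and 0, in agreement with the positions and the type 1 and type 0 introns described in the text.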

3.3. Analysis of Polymorphisms in Breeds with Different Morphological Types and Genetic Diversity Parameters

Allele frequencies for the two SNPs located in the promoter region (g.26T>C and g.156T>C) are shown in Table 3. The g.26T>C SNP was polymorphic in 6 out of 16 breeds, with the highest observed frequency of the g.26C allele in the Lipizzan breed (0.21). For the g.156T>C polymorphism, the mutant g.156C allele, which changes the predicted TATA-3 box-like motif, was detected in 11 out of 16 breeds and was identified in the homozygous condition in a few Bardigiano, Haflinger, Noric, Rapid Heavy Draft, and Uruguayan Creole horses.

Haplotype analyses of the two mutations showed the presence of three haplotypes: [g.26T:g.156T], [g.26T:g.156C], and [g.26C:g.156T] (Table 3). The [T:T] haplotype could be the wild type according to its presence in all breeds and higher frequency (from 0.54 to 1.00). The [T:C] haplotype was observed in 10 breeds (frequency from 0.05 to 0.40), whereas the [C:T] haplotype was identified only in 6 breeds (frequency from 0.01 to 0.21) (Table 3).
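
When only three of the four possible two-SNP haplotypes are observed, as reported above, the two sites are in complete linkage disequilibrium (|D′| = 1). The sketch below shows this calculation from haplotype frequencies. The frequencies and the function name are hypothetical and are not taken from Table 3, and this calculation is not part of the PHASE analysis described in the Methods.

```python
def linkage_disequilibrium(hap_freqs):
    """Return (D, D') for two biallelic SNPs from haplotype frequencies.

    hap_freqs maps (allele at g.26, allele at g.156) to a frequency or count.
    D is defined with respect to the C alleles at both sites.
    """
    total = sum(hap_freqs.values())
    freq = {hap: value / total for hap, value in hap_freqs.items()}
    p_c26 = sum(v for (a, _), v in freq.items() if a == "C")
    p_c156 = sum(v for (_, b), v in freq.items() if b == "C")
    d = freq.get(("C", "C"), 0.0) - p_c26 * p_c156
    if d < 0:
        d_max = min(p_c26 * p_c156, (1 - p_c26) * (1 - p_c156))
    else:
        d_max = min(p_c26 * (1 - p_c156), (1 - p_c26) * p_c156)
    d_prime = d / d_max if d_max > 0 else 0.0
    return d, d_prime


# Hypothetical frequencies for [T:T], [T:C], and [C:T]; [C:C] is absent.
d, d_prime = linkage_disequilibrium({("T", "T"): 0.7, ("T", "C"): 0.2, ("C", "T"): 0.1})
print(f"D = {d:.3f}, D' = {d_prime:.1f}")
```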

To evaluate whether the two SNPs in the promoter region could account for a proportion of the variability related to morphological types, we classified the analysed horse breeds in four groups (brachymorphic, B; mesomorphic, M; meso-dolichomorphic, M-D; and dolichomorphic, D) (see Materials and Methods). Several descriptive statistics summarizing the genetic diversity of these groups are reported in Table 4. The B group showed the highest observed and expected heterozygosity ( and , respectively), whereas the D group had the lowest values ( for both measures). For the g.26T>C SNP, differences in allele frequencies were significant between B and the other three groups ( for B versus M and B versus D; for B versus M-D). For the g.156T>C polymorphism, only the comparison between the B and M groups was not significant. In particular, differences in allele frequencies were highly significant between the B and D groups and between the M and D groups ( and , respectively. For the remaining comparisons: for B versus M-D and for M versus M-D, for M-D versus D). The overall FST value showed that the genetic differences among the groups accounted for 6.1% (3.6% for the g.26T>C SNP and 7.0% for the g.156T>C SNP) of the genetic variation. The AMOVA on haplotypes confirmed that a proportion of the total molecular variance was associated with the morphological types of the horses. Using the four morphological types, the molecular variance explained was 6.40% ( ). Grouping these four types into two groups (B + M and M-D + D) according to their morphological similarities, the proportion of explained molecular variance was 10.6% ( ). It is possible that differences in allele and haplotype frequencies among types are influenced by phylogenetic closeness rather than by any association with morphological types. This issue should be further investigated as, to our knowledge, there are no studies analyzing this question that include most of the breeds we investigated. However, Di Stasio et al.
[39] analysed genetic relationships among only three breeds included in our study and found significant genetic differentiation among Bardigiano, Haflinger, and Maremmano, suggesting that the results we obtained might not be biased by a putative common origin of the breeds. The association of the two promoter SNPs with morphological types could be due to linkage disequilibrium with alleles at other chromosome 18 loci that affect the variability of morphological traits in horses. However, based on our results it cannot be excluded that MSTN SNPs could influence morphological traits that are indirectly related to muscle mass. A few SNPs in the promoter region of the swine MSTN gene were associated with muscularity, growth, and meat quality traits [8, 20, 21]. One of them, with high frequency in the muscled Belgian Pietrain breed, was associated with MSTN expression level, suggesting that promoter polymorphisms could contribute to muscle mass in this pig breed [8]. To demonstrate the putative functional role of the identified horse MSTN promoter SNPs, expression studies in skeletal muscle of animals with different genotypes should be performed. However, it is worth pointing out that in vivo RNA expression studies in horses are very complicated, as it is quite difficult to standardize temporary and permanent environmental factors (e.g., age, sex, management, and feeding) that are major sources of variability in such experiments. For these reasons, in vitro assays might be needed to clarify whether the identified SNPs could alter MSTN gene expression. In addition, association analyses in breeds segregating for the two promoter SNPs, and for which estimated breeding values for several conformational and performance traits are available, could be useful to further evaluate the association of these polymorphic sites with phenotypic traits.

Acknowledgments

The authors thank horse breeders, Italian Horse National Breeders Associations, Associazione Italiana Allevatori (A.I.A.), Istituto Sperimentale per la Zootecnia of Rome (Italy), and Association Rural de Uruguay for providing horse samples and genealogical information. They also thank three anonymous reviewers for their comments, which helped to improve the manuscript. This study was funded by the University of Bologna RFO.