Abstract

Microsatellites abound in most organisms and have proven useful for a range of genetic and genomic studies. Once primers have been created, they can be applied to populations or taxa that have diverged from the source taxon. We use PCR amplification, in a 96-well format, to determine the presence and absence of 46 microsatellite loci in 13 cichlid species. At least one primer set amplified a product in each species tested, and some products were present in nearly all species. These results are compared to the known phylogenetic relationships among cichlids. While we do not address intraspecies variation, our results present a phylogenetic index for the success of microsatellite PCR primer product amplification, thus providing information regarding a collection of primers that are applicable to a wide range of species. The use of such a uniform primer panel would increase the potential impact of cross-species studies.

1. Introduction

Microsatellites, or short sequence repeats (SSRs), are short (2–6 bp) DNA motifs that are repeated at least three, and up to hundreds of, times consecutively [1]. SSRs abound in most organisms, and fishes are no exception, with an estimated frequency of one locus per several kb of DNA [2]. The repetition of a microsatellite motif makes misalignment of template and newly synthesized strands during DNA replication very likely, resulting in a range of alleles differing by whole numbers of repeats [3]. Such unstable mutation dynamics hamper sequence-dependent function of a locus, so with few exceptions, such as the human Huntington's locus, observed microsatellite regions are not transcribed. As primarily neutral, polymorphic loci with a signature pattern that facilitates isolation, microsatellites have proven useful for a range of genomic studies.
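The definition above (a 2–6 bp motif repeated at least three times consecutively) can be illustrated with a short sketch. The Python below is not part of the methods of this study; the function name find_ssrs, the regular expression, and the decision to skip single-base runs are assumptions made for illustration, and only perfect (uninterrupted) repeats are detected.

```python
import re

# A 2-6 bp motif (captured lazily so the shortest motif is preferred)
# followed by at least two more copies of itself, i.e. three or more
# consecutive copies in total.
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{2,}")


def find_ssrs(seq):
    """Return (start, motif, repeat_count) for each perfect repeat run.

    `start` is a 0-based position in `seq`. Runs of a single base
    (e.g. AAAAAA) also match the pattern as a repeated 2 bp motif;
    they are skipped here, which is an illustrative choice.
    """
    hits = []
    for match in SSR_PATTERN.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # homopolymer run, not a 2-6 bp motif
        repeat_count = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeat_count))
    return hits
```

For example, find_ssrs("TTACACACACGG") reports a single AC repeat of four units starting at position 2, while a sequence containing only two copies of a motif yields no hits.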

Despite broad utility, a researcher interested in applying microsatellite-based tools to linkage mapping or phylogenetic analysis faces a significant investment in time and material to isolate repeat regions and create primers that anneal with the microsatellite flanking regions (MFRs) adjacent to the repeats. Their proximity to microsatellites makes MFRs likely to be selectively neutral, to the point that their sequence can be used as a molecular clock for phylogeny studies [4]. Since MFR-derived primer pairs do not anneal to the repeat regions, the mutational instability of the repeats themselves does not interfere with these primers’ efficacy. Once MFR primers have been created, therefore, depending on genome-wide mutation dynamics, they can likely be applied to populations or taxa that have diverged from the source taxon. For example, Rico et al. [5] were able to amplify a microsatellite region with the same MFR-derived primer set in two fish species that diverged 470 Mya. However, the pattern of MFR sequence conservation was sufficiently unpredictable to require locus-by-locus confirmation. Even if, at the outset of work on one taxonomic group, MFR primers are available from previous work on a related group, some expense must still be incurred to determine which microsatellites are present and informative (i.e., variable) in the particular genomes and populations of interest [6, 7]. The current study presents an index of putative MFR-specific primer sets tested in species representing the major groups within the most speciose family of fish (the Cichlidae). This information should reduce the entrance cost for those interested in applying microsatellite-based analyses to additional cichlid species.

The cichlids of the Great Rift Lakes of Eastern Africa are especially important as a research model of evolutionary processes because their phylogenetic history has been reconstructed to reveal multiple adaptive radiations; many are recent (2 Mya) and some exceptionally recent (12,500 ya). This has resulted in extensive diversification, often exhibiting convergence of form, niche, and behavior [8]. The presence of many closely related species, often sympatric and quite subtly diverged [9], enables an appropriately subtle analysis of speciation genomics and of the genetic basis for adaptive traits. For example, jaw morphology-related genes have been studied in Malawi cichlids [10], and population structure has been addressed in the Tanganyika rock cichlid species [11] as well as in the sympatrically speciating Midas cichlids from Central America [12].

The most intensively studied cichlids are the widely farmed, multigeneric tilapia species. In one of these, Oreochromis niloticus, Lee and Kocher developed microsatellite isolation methods that were used to create several hundred sets of MFR-specific primers [13], in addition to a linkage map of those loci [14]. Albertson et al. also isolated microsatellites and created a linkage map for a hybrid of Labeotropheus fuelleborni and Metriaclima zebra, two closely related species of Mbuna or rock cichlid from Lake Malawi. In the creation of the Mbuna linkage map, 248 of the primer sets obtained from O. niloticus were also tested, and 46 were found to produce product in the Mbuna hybrid [10]. These 46 loci represent all but three of the O. niloticus linkage groups from the available genetic map [14], with as many as four loci for linkage groups 3, 10, and 17. As O. niloticus and the Mbuna species share a common ancestor with most of the Great Lakes cichlids approximately 18–30 Mya (Figure 1) [15], the 46 loci found in both species had a good chance of being present in other African cichlids. The current study creates a phylogenetic index for a wide range of cichlid species, indicating the presence and absence of PCR product using primers to these loci. Such an index is anticipated to aid cichlid researchers. The 46 primer sets were tested on a single genomic DNA sample extracted from each of 13 cichlid species: Astatotilapia burtoni, Neolamprologus brichardi, Perissodus microlepis, Protomelas similis, Metriaclima estherae, Tylochromis sp., Tropheus duboisi, Xenotilapia flavipinnis, Xenotilapia ochrogenys, Retroculus xinguensis, Cichla temensis, Astronotus sp., and Satanoperca sp. This sample covers most of the major African clades and some South American clades. No cichlids from India or Madagascar were examined. At least one primer set amplified a product in each species tested, and some PCR primer sets successfully amplified a product in nearly all test species.

2. Materials and Methods

Fin clips were collected in the field and placed immediately in ethanol. Genomic DNA was extracted from each individual using a standard proteinase K/phenol protocol. PCR was performed using the standard FastStart Taq protocol in 10 μL reactions, with 2.5  M MgCl2,  .25 mM Forward and Reverse primers, and  .5 ng template DNA, using the following program: 30 cycles; 30 s at 9 C, 30 s at 5 C, 1 min at 7 C. Alternate reaction conditions were not tested, as our goal is to identify, for use by other researchers, those primer sets that are likely to be most amenable to multiplexing and high-throughput analysis. PCR products were run on a 4% agarose gel stained with ethidium bromide. Digital gel images were captured for analysis. Band presence, relative brightness, approximate length, and the presence or absence of a doublet (two distinct bands suggesting a heterozygous state or multiple loci) were assessed by eye. Failed reactions were repeated to confirm negative results.

Clustering of species according to the pattern of successful PCR products was performed using R software v2.0.1 [16]. Dissimilarity measures were obtained with the dist function in the stats package, using Euclidean distance on product presence and absence information only. The consensus tree and bootstrap confidence values for each node were obtained with the consensus function in the MAANOVA package [17]. Confidence values were calculated as the proportion of 1000 trees, obtained by resampling with replacement, that agreed with the original tree, again using presence-absence data only.
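The two ingredients of this analysis, Euclidean distance on presence-absence profiles and bootstrap resampling of loci with replacement, can be sketched as follows. This is a simplified Python illustration, not the R dist and MAANOVA consensus workflow used in the study: instead of rebuilding a full dendrogram for each replicate, it only records how often a chosen species keeps the same nearest neighbor. The species labels, the eight-locus toy profiles, and all function names are hypothetical.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length 0/1 profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbor(profiles, species):
    """Name of the species whose profile is closest to `species`.

    Ties are broken alphabetically (an arbitrary illustrative choice).
    """
    others = [s for s in sorted(profiles) if s != species]
    return min(others, key=lambda s: euclidean(profiles[species], profiles[s]))


def bootstrap_support(profiles, species, partner, n_boot=1000, seed=0):
    """Proportion of bootstrap replicates in which `partner` remains
    the nearest neighbor of `species`.

    Each replicate resamples loci (columns) with replacement.
    """
    rng = random.Random(seed)
    n_loci = len(next(iter(profiles.values())))
    hits = 0
    for _ in range(n_boot):
        cols = [rng.randrange(n_loci) for _ in range(n_loci)]
        resampled = {s: [v[c] for c in cols] for s, v in profiles.items()}
        if nearest_neighbor(resampled, species) == partner:
            hits += 1
    return hits / n_boot


# Hypothetical toy data: 1 = product present, 0 = absent, 8 loci per species.
toy_profiles = {
    "african_1": [1, 1, 1, 1, 1, 1, 0, 1],
    "african_2": [1, 1, 1, 1, 1, 0, 0, 1],
    "s_american_1": [0, 0, 1, 0, 0, 0, 0, 0],
    "s_american_2": [0, 0, 0, 0, 1, 0, 0, 0],
}

partner = nearest_neighbor(toy_profiles, "african_1")
support = bootstrap_support(toy_profiles, "african_1", partner)
```

Because the profiles are binary, the squared Euclidean distance between two species is simply the number of loci at which one amplified and the other did not, so clustering of this kind is driven by counts of mismatched loci.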

3. Results

In total, 13 species were assayed, including 9 from Africa and 4 from South America. In general, the number and pattern of successful amplification products reflect phylogenetic relationship (Figure 1). Among the 9 African cichlid species, 2 are endemic to Lake Malawi and 7 to Lake Tanganyika. The mean number of positive amplifications per species (out of 46) was 33.5 (s.d. 3.5) for Lake Malawi and 35.67 (s.d. 4.03) for Lake Tanganyika. Tylochromis polylepis is excluded from this average calculation due to its recent immigrant status and distant relationship to other Tanganyikan cichlids [15, 18]. The 14 positive amplifications from T. polylepis were the fewest of any African species, but still more than in any of the South American cichlids. The two species from the Ectodini tribe showed the most similar pattern of microsatellite amplification products: six primer sets amplified in X. flavipinnis and not in X. ochrogenys, but there were no other differences in their amplification patterns. As expected, the more distantly related South American cichlid species showed significantly fewer successful microsatellite products. On average, 4.5 (s.d. 2.06) primer sets amplified in these species, and each species tested had a unique pattern of positive amplifications.
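As a worked illustration of how such summary values arise, the Python sketch below computes a mean and standard deviation from per-species counts. The two counts are not reported in the text; they are an inferred, illustrative pair that would reproduce the Lake Malawi values (mean 33.5, s.d. 3.5) under the assumption that the reported s.d. is the population form (dividing by n rather than n - 1).

```python
import statistics

# Illustrative counts of positive amplifications (out of 46) for two
# species. These values are an assumption, not data from the study:
# they are chosen so that the mean is 33.5 and the population s.d. is 3.5.
malawi_counts = [30, 37]

mean_count = statistics.mean(malawi_counts)    # arithmetic mean
pop_sd = statistics.pstdev(malawi_counts)      # population s.d. (divide by n)
sample_sd = statistics.stdev(malawi_counts)    # sample s.d. (divide by n - 1)
```

With only two species the choice of formula matters: for the same pair of counts the sample form is larger than the population form by a factor of the square root of 2.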

Clustering analysis resulted in a dendrogram that separated the African Great Lakes cichlids from their sister genus, Tylochromis, and from the South American species. There was insufficient statistical confidence to distinguish relationships within the Great Lakes or to capture accurately the relationships among the South American clades (Figure 1). These results agree, as far as resolution allows, with the mtDNA phylogeny studies [19, 20].

4. Discussion

The data presented here demonstrate that the previously isolated MFR primer sets should be useful for population studies in many cichlid taxa, particularly throughout the East African radiation. While the range of species used in this study cannot definitively predict which primer sets will yield informative (i.e., variable) genetic information for every cichlid species, it does provide a measure of the expected success rate for a given phylogenetic position. Furthermore, the availability of this primer set in a 96-well format will facilitate rapid screening for any species of interest.

The band brightness aspect of the data may estimate sequence divergence in these MFRs. It is possible that highly diverged loci will not amplify as efficiently, and that further divergence would prohibit amplification altogether; this should be anticipated when a distantly related (e.g., South American) species is studied. However, as Ellegren [3] made clear, mutation rates vary between loci, individuals, and taxa, due to disabled mismatch repair and proofreading, chromatin structure variation, or other mechanisms. Therefore, we cannot infer sequence similarity from band brightness, and this information provides only a rough guideline. Similarly, an allele of a given length may have arisen from either a lengthening or a shortening mutation, meaning that its exact relationship to other alleles is unclear. In addition, as with absolute mutation rates, the relative frequencies of shortening and lengthening vary within genomes and taxa. Therefore, estimating a given allele's ancestry requires considerable groundwork to describe the variation at that locus for any species of interest. For research over a fairly short scale of divergence, where novel alleles are at a minimum, this groundwork will require amplification from several individuals’ genomic DNA to estimate whether enough polymorphism exists to allow for distinction between lineages. Here (Figure 1), we report all observed variation in relative brightness (denoted by shading) of the imaged PCR products, as well as the presence or absence of a doublet (denoted by an asterisk), in order to provide all possible information regarding potential polymorphism of each locus. However, it must be noted that only a single individual was assayed in the current study and resolution was 20 bp or greater. Therefore, further work is required to describe polymorphic loci in any particular species of interest. We have not conducted such intraspecific analysis in this study because the variation and utility of the loci as genetic markers must be established for the exact species and population of interest and cannot be inferred across species boundaries.

The full set of 46 primer sets used in this study is freely available from the corresponding author, in a 96-well format and diluted to a working concentration for use in PCR with any species of interest.

Acknowledgments

The authors are grateful for the donation of tissue samples by D. Joyce (University of Hull; Malawi species) and S. Willis (University of Nebraska, Lincoln; South American species). They thank H. Machado for comments on the paper. This work has been supported by a grant to S.C.P.R. from the M. J. Murdock Charitable Trust (no.:2006253:JVZ:2/22/2007).