Abstract

Bioinformatics tools were used to develop gene-based simple sequence repeat (SSR) markers directly from sequence data. Analysis of 28,056 Mesembryanthemum expressed sequence tag (EST) sequences, amounting to approximately 17.07 Mb, identified 5,851 ESTs containing SSRs. Among these, 938 ESTs harbored more than one SSR, and 788 SSRs occurred in compound form. The most prevalent SSR motif types were mononucleotide repeats (MNRs; 44%), followed by di-nucleotide repeats (DNRs; 37%) and trinucleotide repeats (TNRs; 16%). Most TNR and longer motifs had low repeat counts, with only 51 motifs containing 10 or more repeats. BLASTX analysis assigned putative functions to 4,623 (79%) of the SSR-containing EST sequences. Of the primer sets tested, 21 were polymorphic and amplified a total of 65 alleles, with primer PMA79 yielding the maximum of six alleles. The polymorphism information content (PIC) values ranged from 0 to 0.76, with a mean of 0.47. The marker index (MI) and discriminating power (D) reached maxima of 0.66 (primer PMA63) and 0.95 (primer PMA20), respectively. A dendrogram constructed with the unweighted pair group method with arithmetic mean (UPGMA) separated the 24 Mesembryanthemum genotypes into three distinct clusters, with similarity coefficients ranging from 0.38 to 0.96. In total, 83 EST-SSR primer pairs specific to the Mesembryanthemum genus were developed in this study. These EST-SSRs will serve as valuable tools for researchers, particularly molecular breeders, enabling gene-based identification and trait selection through marker-assisted breeding.

1. Introduction

Mesembryanthemoideae (Aizoaceae) comprises a single genus, Mesembryanthemum, which consists of approximately 101 species and is indigenous to arid and semiarid regions of South Africa [1]. It is also found in the Mediterranean region, the Atlantic Islands, Saudi Arabia, South Australia, and California [2]. Mesembryanthemum plays a significant role in its native habitat by thriving in harsh, arid environments where other plants struggle to survive [3]. Several species of Mesembryanthemum have been recognized for their antioxidant properties, nutritional and medicinal importance, and ability to accumulate salt, thereby contributing to bioremediation effects [2, 4, 5]. Despite its diverse significance, certain species of Mesembryanthemum are classified as endangered or critically endangered by the International Union for Conservation of Nature (IUCN) [6]. Furthermore, molecular research, including the assessment of genetic diversity and genome mapping, has been hindered by the limited availability of codominant molecular markers such as simple sequence repeats (SSRs).

Initially identified in humans, SSRs, or microsatellites, are tandemly repeated DNA sequences with core units of 1–6 nucleotides [7, 8]. They are widely distributed throughout most plant genomes. SSR markers offer several advantages, including high variability, codominant inheritance, easy detection, a multiallelic nature, transferability between species, and amenability to PCR amplification [7, 9, 10]. However, developing SSR markers de novo typically involves labor-intensive, time-consuming, and costly procedures. EST-SSRs, which are derived from EST and cDNA sequences [11], have become the preferred alternative, given the growing availability of such sequences in public databases such as NCBI [12]. Moreover, because EST-SSR markers are located in transcribed regions of the genome, they are well suited to cross-species transfer and to tagging genes for desired traits [13, 14]. EST-derived SSR markers are also expected to be more conserved among related species than SSR markers derived from anonymous sequences [14]. For example, of 165 barley (Hordeum vulgare L.) EST-SSR markers, approximately 78% amplified successfully in wheat, 75% in rye (Secale cereale L.), and 42% in rice (Oryza sativa L.) [14].

While EST-SSR markers have been developed and validated for numerous eudicot plants, including Vicia faba [15], Vigna angularis [16], and Lens culinaris Medik [17], to the best of our knowledge, SSR markers have not yet been developed in Mesembryanthemum. Therefore, this study was conducted to generate EST-SSR markers specific to the Mesembryanthemum genus.

2. Materials and Methods

In May 2021, a total of 28,056 Mesembryanthemum EST sequences, corresponding to 17.07 Mb, were retrieved from the National Center for Biotechnology Information (NCBI) website (https://www.ncbi.nlm.nih.gov). Poly-A and poly-T tails were removed from these sequences using the TRIMEST program from EMBOSS [18]. EST-SSRs were identified with the MISA-web program developed by Beier et al. [19] (https://webblast.ipk-gatersleben.de/misa/). Mono-, di-, tri-, tetra-, penta-, and hexanucleotide tandem repeats were selected using minimum repeat counts of 10, 6, 5, 5, 5, and 5, respectively (Table 1). A total of 7,181 SSR loci were discovered across 5,851 EST sequences.

EST-SSR primers were designed with the Primer3web software. The “targets” option was used to mark the location of the SSR motif so that the selected primers flanked it. The remaining settings were left at their defaults, except for the annealing temperature (set at 60°C ± 3°C) and primer length (set at 20 bp with a range of +6, −2 bp). A BLASTX search against the NCBI database was conducted to determine the putative function of the developed SSR markers.

Of the primers designed, 28 EST-SSR primers were used to amplify genomic DNA from 24 Mesembryanthemum genotypes (Table 2). The iMEC online software [20] was used to calculate the polymorphism information content (PIC), heterozygosity index (H), discriminating power (D), marker index (MI), average heterozygosity (av. H), and resolving power (R) for each primer. In addition, a dendrogram of the 24 Mesembryanthemum genotypes was constructed in NTSYS software using the unweighted pair group method with arithmetic mean (UPGMA) [21].
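The SSR search criteria stated in this section can be illustrated with a short script. The following is only a minimal sketch and not the TRIMEST or MISA-web implementation: it detects perfect repeats with regular expressions, ignores compound SSRs, and uses a simplified tail trimmer. The function names and the `min_run` parameter are hypothetical; only the minimum repeat counts (10, 6, 5, 5, 5, 5) come from the text.

```python
import re

# Minimum repeat counts per motif length, as stated in the Methods.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def trim_poly_tails(seq, min_run=4):
    """Strip a leading poly-T run and a trailing poly-A run.

    Simplified stand-in for TRIMEST; min_run is an assumed threshold.
    """
    seq = seq.upper()
    seq = re.sub(r"^T{%d,}" % min_run, "", seq)
    seq = re.sub(r"A{%d,}$" % min_run, "", seq)
    return seq


def is_primitive(motif):
    """True unless the motif is itself a repeat of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return not any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def find_ssrs(seq):
    """Return sorted (start, end, motif, repeats) tuples; coordinates are 0-based, end-exclusive."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        pos = 0
        while True:
            match = pattern.search(seq, pos)
            if match is None:
                break
            if is_primitive(match.group(1)):
                repeats = (match.end() - match.start()) // unit_len
                hits.append((match.start(), match.end(), match.group(1), repeats))
                pos = match.end()
            else:
                # e.g. 'AA' x 6 is really a mononucleotide run; skip ahead by one base
                pos = match.start() + 1
    return sorted(hits)


if __name__ == "__main__":
    # Made-up example sequence, not taken from the data set.
    example = "CCGT" + "A" * 12 + "GTC" + "AG" * 6 + "TTC" + "AAG" * 5 + "GC"
    for start, end, motif, repeats in find_ssrs(example):
        print(start, end, motif, repeats)
```

The start and end coordinates returned by such a search correspond to what would be passed to the Primer3 “targets” option so that primers flank the repeat.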

3. Results and Discussion

We report the development of EST-SSR markers for Mesembryanthemum from publicly available ESTs. A total of 28,056 Mesembryanthemum EST sequences (approximately 17.07 Mb) were analyzed, and 7,181 SSR loci were identified in 5,851 ESTs (Table 3). Thus, approximately 20.9% (5,851 of 28,056) of the EST sequences harbored at least one SSR. The frequency of SSR occurrence was one repeat per 2.38 kb, which is somewhat higher than, but comparable to, the frequencies observed in Mentha piperita (1/3.4 kb) and pepper (1/3.8 kb) [22, 23]. Varshney et al. [14] reported that around 5% of ESTs contain SSRs when the minimum repeat length is set to 20 bp, indicating that the frequency of SSRs can vary significantly depending on the search criteria employed. Of the 5,851 SSR-containing ESTs, 938 contained more than one SSR, and 788 SSRs occurred in compound form (Table 3).
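The summary figures in this paragraph follow from simple arithmetic on the reported counts. The sketch below reproduces them, assuming that 1 Mb is taken as 1,000,000 bp; the function name is illustrative only.

```python
def ssr_summary(total_ests, ssr_ests, ssr_loci, total_bp):
    """Return the share of ESTs with an SSR and the average spacing between SSRs in kb."""
    fraction_with_ssr = ssr_ests / total_ests
    kb_per_ssr = total_bp / ssr_loci / 1000
    return fraction_with_ssr, kb_per_ssr


# Counts reported in the text; 17.07 Mb is assumed to mean 17.07e6 bp.
fraction, spacing = ssr_summary(
    total_ests=28_056, ssr_ests=5_851, ssr_loci=7_181, total_bp=17.07e6
)
print(f"ESTs with at least one SSR: {fraction:.2%}")
print(f"One SSR per {spacing:.2f} kb")
```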

The distribution and frequency of SSR motifs vary widely across plant species. In this study, mononucleotide repeats (MNRs) were the most abundant (44%), followed by di-nucleotide repeats (DNRs; 37%) and trinucleotide repeats (TNRs; 16%), as depicted in Figure 1. MNRs have been shown to be valuable in bridging gaps in linkage maps constructed using SSR markers [24].

Most trinucleotide repeats (TNRs) and longer motifs had low repeat counts, with only 51 motifs containing 10 or more repeats (Table 4). In total, 65 different EST-SSR motifs were identified (Table 1). The most prevalent SSR motifs were A/T (39.2%) for MNRs, AG/CT (32.3%) for di-nucleotide repeats (DNRs), AAG/CTT (3.6%) for trinucleotide repeats (TNRs), AAG/CTT (10.8%) and AAGG/CCTT (0.3%) for tetra-nucleotide repeats (TtNRs), AAGAG/CTCTT (0.2%) for penta-nucleotide repeats (PNRs), and AACAGC/CTGTTG (0.3%) for hexa-nucleotide repeats (HNRs) (Table 1). Similar findings have been reported previously [12, 22, 25, 26]. Because the proportion of polymorphic markers increases with repeat length, only EST-SSRs with 100 bp or more were selected for designing primer pairs. Consequently, 83 primer pairs were developed for Mesembryanthemum (Table 5). These SSR markers can be used in diversity studies, the construction of genetic linkage maps, and marker-assisted breeding. Furthermore, because EST-SSRs are highly transferable across species, they can be employed in related species for which few SSRs are available [12, 27].
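Motif labels such as AG/CT or AAG/CTT group a motif with its cyclic shifts and its reverse complement, since these all describe the same repeat read in a different frame or from the other strand. The sketch below shows one way to derive such labels; it is an illustration of that grouping convention under our own assumptions, not code from MISA-web, and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(motif):
    """Return the reverse complement of a DNA motif."""
    return motif.upper().translate(COMPLEMENT)[::-1]


def rotations(motif):
    """All cyclic shifts of a motif, e.g. 'AAG' -> ['AAG', 'AGA', 'GAA']."""
    return [motif[i:] + motif[:i] for i in range(len(motif))]


def motif_class(motif):
    """Label a motif by the alphabetically smallest rotation on each strand."""
    motif = motif.upper()
    forward = min(rotations(motif))
    reverse = min(rotations(reverse_complement(motif)))
    return "/".join(sorted((forward, reverse)))


for m in ("T", "GA", "TC", "TTC", "GCAACA"):
    print(m, "->", motif_class(m))
```

With this grouping, GA and TC both fall into the class AG/CT, and TTC falls into AAG/CTT, which matches the motif notation used in this section.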

The BLASTX searches successfully assigned putative functions to 4,623 (79%) of the identified EST-SSRs. This information is valuable for guiding the development of specific markers targeting desired genes and facilitating further exploration of gene-related information [27].

4. Validation

Twenty-eight of the newly designed EST-SSR primers (provided in Table 1) were chosen to cover all types of nucleotide repeats and were used to amplify genomic DNA extracted from 24 Mesembryanthemum genotypes. Of these, 22 primers produced amplification products, and 21 of them showed polymorphic amplification profiles, amplifying a total of 65 alleles (Table 5). The maximum number of alleles, six, was observed for the PMA79 EST-SSR primer. The polymorphism information content (PIC) values, which estimate the discriminatory power of a locus based on allele number and frequencies, ranged from 0 to 0.76, with a mean of 0.47 (Table 6). The marker index (MI), which assesses the overall efficiency of a molecular marker, varied from 0 (PMA44) to 0.66 (PMA63), with a mean of 0.41. In addition, the discriminating power (D) of the primers ranged from 0 (PMA44) to 0.95 (PMA20), with a mean of 0.67 (Table 6).
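To make the link between allele frequencies and PIC concrete, the sketch below computes the heterozygosity index H and PIC for a single locus using the widely used definition PIC = 1 − Σp_i² − ΣΣ 2p_i²p_j². This is an assumption on our part: the text does not state the exact formulas applied by iMEC, and MI, D, and R are not covered here. The allele calls are made up for illustration.

```python
from collections import Counter


def allele_frequencies(allele_calls):
    """Convert a list of observed allele labels into a list of frequencies."""
    counts = Counter(allele_calls)
    total = sum(counts.values())
    return [n / total for n in counts.values()]


def heterozygosity(freqs):
    """Heterozygosity index: H = 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2 (assumed definition)."""
    homozygous = sum(p * p for p in freqs)
    cross = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return 1.0 - homozygous - cross


# Hypothetical locus: two alleles observed equally often.
calls = ["a1"] * 12 + ["a2"] * 12
freqs = allele_frequencies(calls)
print(round(heterozygosity(freqs), 3), round(pic(freqs), 3))  # prints 0.5 0.375
```

Under this definition a monomorphic locus (a single allele at frequency 1) gives PIC = 0, which is consistent with the lower bound of 0 reported above.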

The UPGMA dendrogram (Figure 2) grouped the 24 Mesembryanthemum genotypes into three distinct clusters, reflecting patterns of relatedness in their marker profiles.

Pairwise similarity coefficients ranged from 0.38 to 0.96, with a mean of 0.67. Coefficients closer to 1.0 indicate that two genotypes share a larger proportion of marker alleles and are therefore more closely related.

This wide range indicates substantial genetic variation among the genotypes studied, while the mean of 0.67 suggests a moderate level of similarity on average. Knowledge of this diversity and of the relationships among the genotypes is useful for breeding programs, conservation efforts, and studies of the evolutionary history of the genus.
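The clustering step can be sketched in a few lines. The example below is not the NTSYS implementation: it assumes that alleles are scored as presence/absence (1/0) per genotype and that similarity is measured with the Jaccard coefficient, neither of which is stated in the text. The genotype names and band profiles are made up. UPGMA repeatedly merges the two clusters with the highest average pairwise similarity between their members.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two 0/1 band profiles of equal length."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    present = sum(1 for x, y in zip(a, b) if x or y)
    return shared / present if present else 1.0


def upgma(profiles):
    """Cluster genotypes by UPGMA on similarity.

    profiles: dict mapping genotype name -> list of 0/1 scores.
    Returns a list of (cluster_a, cluster_b, similarity) merges in order.
    """
    sim = {
        frozenset(pair): jaccard(profiles[pair[0]], profiles[pair[1]])
        for pair in combinations(profiles, 2)
    }

    def average_similarity(c1, c2):
        # Unweighted mean over all member pairs between the two clusters
        total = sum(sim[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    active = [(name,) for name in profiles]
    merges = []
    while len(active) > 1:
        c1, c2 = max(combinations(active, 2), key=lambda p: average_similarity(*p))
        merges.append((c1, c2, average_similarity(c1, c2)))
        active = [c for c in active if c not in (c1, c2)] + [c1 + c2]
    return merges


# Hypothetical band profiles for four genotypes.
profiles = {
    "G1": [1, 1, 1, 0, 0, 0],
    "G2": [1, 1, 0, 0, 0, 0],
    "G3": [0, 0, 0, 1, 1, 1],
    "G4": [0, 0, 1, 1, 1, 1],
}
for left, right, similarity in upgma(profiles):
    print(left, right, round(similarity, 3))
```

The similarity at which each merge occurs corresponds to the height of the matching node on a dendrogram drawn against a similarity scale.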

Due to their gene specificity, EST-SSRs are valuable tools for gene tagging and comparative investigations. They can be employed in the development of linkage maps and studies on diversity across related species, as demonstrated by Sahu et al. [27] and Akash and Myers [12]. The newly developed set of EST-SSRs presented in this study offers molecular breeders enhanced resources for gene-based identification and selection of traits through marker-assisted breeding.

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This study was supported by the Deanship of Academic Research, The University of Jordan.