Abstract

Monocot genomic diversity includes striking variation at many levels. This paper compares various genomic characters (e.g., range of chromosome numbers and ploidy levels, occurrence of endopolyploidy, GC content, chromosome packaging and organization, genome size) between monocots and the remaining angiosperms to discern just how distinctive monocot genomes are. One of the most notable features of monocots is their wide range and diversity of genome sizes, including the species with the largest genome so far reported in plants. This genomic character is analysed in greater detail, within a phylogenetic context. By surveying available genome size and chromosome data it is apparent that different monocot orders follow distinctive modes of genome size and chromosome evolution. Further insights into genome size evolution and dynamics were obtained using statistical modelling approaches to reconstruct the ancestral genome size at key nodes across the monocot phylogenetic tree. Such approaches reveal that while the ancestral genome size of all monocots was small (  pg), there have been several major increases and decreases during monocot evolution. In addition, notable increases in the rates of genome size evolution were found in Asparagales and Poales compared with other monocot lineages.

1. Introduction: How Distinctive Are Monocot Genomes?

Monocotyledons (monocots) comprise c. 25% of all angiosperms and are a remarkably variable group with species found growing on all continents and in all habitats. They were first distinguished from other angiosperms by the presence of a single cotyledon [1] and they have since been shown to be strongly supported as sister to the eudicots + Ceratophyllum clade ([2] plus others).

At the genomic level, monocots are remarkably diverse, with striking variation at many levels ranging from gene sequences through to the number of chromosomes per genome, the number of genomes (ploidy), and the amount of DNA per genome (genome size). Yet how different are monocot genomes from other angiosperms? In this post-genomics era of large-scale sequencing and comparative analysis, the availability of large amounts of sequence information together with increasing amounts of more traditional cytological data provides new insights into this question.

This paper reviews available data to highlight some of the similarities and differences between monocots and the remaining angiosperms that have been revealed. One of the most striking features of monocots is their wide range of genome sizes, and this genomic character is analysed in greater detail to examine the diversity and dynamics of genome-size evolution within monocots.

2. Comparisons between the Genomes of Monocots with Other Angiosperms

Surveys of the literature and online databases have revealed that many aspects of monocot genomes are generally similar to other angiosperms.

2.1. Range of Chromosome Numbers

The minimum and maximum chromosome numbers so far reported for monocots and eudicots are similar. Both groups contain species with (four monocots and two eudicots) [6, 7], and the highest number so far recorded is 600 for the monocot palm, Voanioala gerardii (Arecaceae) [8, 9], and . 640 in the eudicot stonecrop Sedum suaveolens (Crassulaceae) [10].

2.2. Occurrence of Polyploidy and Maximum Ploidy Levels

In both monocots and eudicots 70%–80% of species are estimated to be cytological polyploids, suggesting similar propensities in each group to undergo polyploidization [11, 12]. There is also molecular evidence of ancient whole genome duplications not only at the base of both monocot and eudicot lineages but also in Nuphar, a member of an early diverging angiosperm lineage (Nymphaeaceae). This suggests that most if not all angiosperms retain evidence of polyploidy in their evolutionary history [13, 14]. Nevertheless, the maximum ploidy level so far reported in monocots is estimated to be only c. 38x in Poa litorosa ( ) [15] and Voanioala gerardii [16], compared with 80x in Sedum suaveolens [10], perhaps pointing to differences between monocots and eudicots in the maximum possible number of polyploidy cycles. However, the reduction of chromosome numbers through dysploidy is a common mode of chromosome evolution in many groups, and this will obscure the signature of polyploidy over time. Thus, whether the observed differences in maximum ploidy levels reflect biologically different propensities to undergo polyploidy and/or dysploid reductions in monocots and eudicots is currently unknown.

2.3. Endopolyploidy

Endopolyploidy, the occurrence of elevated ploidy within cells of an organism arising either by endoreduplication or endomitosis [17], has been widely documented in angiosperms. However, surveys examining its occurrence in different families suggest that there is no significant difference between monocots and other angiosperms. For example, in a study of 49 species from 14 families (including three monocot families: Amaryllidaceae, Poaceae, and Liliaceae) Barow and Meister [18] showed that the most significant factor determining whether or not a species underwent endopolyploidy was the particular life strategy adopted. It was observed to occur in species as a way to accelerate growth and was noted to be more frequent in annual and biennial herbs than perennials and absent in woody species. More recently, Barow and Jovtchev [19] reviewed the occurrence of endopolyploidy across angiosperms and listed 18 families (eight monocots and 10 eudicots) with predominantly endopolyploid species and ten families (three monocots and seven eudicots) with predominantly nonendopolyploid species.

2.4. GC Content of Genome

A number of papers have reported differences in nucleotide composition between monocot and eudicot genomes. These include differences in the %GC content at both the whole genome level and for individual genes. In both cases, the range of %GC values for monocots was wider compared with eudicots [21–23]. Nevertheless, many of these studies were based on analyses of just a few species in which all the monocot examples were taken from Poaceae. In more recent large-scale analyses, which extend to other monocot orders including Acorales, Asparagales, and Zingiberales, the picture is less clear [24–26]. Although species in Poales continue to show marked differences in their GC profiles compared with eudicots, analysis of the overall genomic %GC, the GC content of genes, and the distribution of GC content within coding sequences reveals that species belonging to some monocot orders are more similar to eudicots than Poales (e.g., Acorus; Acorales, Asparagus; Asparagales and Allium; Asparagales) whereas other species have GC profiles with characteristics shared by both eudicots and Poaceae (e.g., Musa; Zingiberales). A strong divide in genomic composition in terms of GC content and organisation does not therefore seem to exist between monocots and eudicots.

3. Differences between the Genomes of Monocots and Other Angiosperms

Despite these overall similarities there are some genomic features that are distinctive in monocots, and these include the apparent greater flexibility in how DNA is organized into chromosomes and the amount of DNA comprising the genome.

3.1. Chromosome Packaging and Organization

In terms of chromosome packaging and organization, cytological investigations to date have suggested that holocentric chromosomes (i.e., those lacking a localized centromere) are more common in monocots than in the rest of the angiosperms. Although the number of times they have arisen may be similar between these two groups (i.e., three families in each), the total number of species with holocentric chromosomes is greater in monocots. For example, they have been reported to be frequent in Cyperaceae which comprises c. 3,600 species [27–30], Juncaceae (comprising c. 325 species) [31], and the genus Chionographis (Melanthiaceae) (comprising c. seven species, [32]). In contrast, in the rest of the angiosperms they have so far only been noted in the nutmeg, Myristica fragrans (Myristicaceae) [33], c. 28 species of the parasitic Cuscuta subgenus Cuscuta (Convolvulaceae) [34, 35], and Drosera (c. 80 species) (Droseraceae) [36].

Similarly, available data suggest that the packaging of DNA into a bimodal karyotype organization (i.e., karyotypes comprising two distinct sizes of chromosomes) is more common in monocots (especially in Asparagales, Alismatales and Liliaceae, [37–39]) than in the rest of the angiosperms, where bimodal karyotypes have been reported in far fewer species (e.g., Rhinanthus minor (Orobanchaceae) [40], Acantholepis orientalis (Asteraceae) [41], Onosma (Boraginaceae) [42, 43], and some Australian Drosera [44]).

Organization of the DNA at the telomeres of chromosomes also shows greater variability in monocots than in other angiosperms [45]. Whereas nearly all nonmonocot species analysed to date have been shown to contain typical Arabidopsis-like telomeric sequences at the ends of their chromosomes (i.e., ) (the exception being three genera of Solanaceae: Vestia, Cestrum, and Sessea) [46], in monocots, a large clade within Asparagales (comprising c. 6300 species) has replaced the Arabidopsis-type sequence with the human-type telomere sequence [47–49] in the majority of species examined. Species of Allium were shown to be the exception even to this with no recognisable minisatellite so far identified [50].

3.2. Genome Size Diversity

Probably one of the most distinct differences is the diversity of genome sizes encountered in monocots compared with other angiosperms. Whereas several previous studies highlighted differences in the profile of genome sizes between monocots and dicots (e.g., [21, 51], both based on analyses of 2802 species), here the analysis is considerably extended to encompass the much larger and more representative genome size data set now available (see below) together with the more robust phylogenetic framework on which to analyze the data.

3.2.1. Data Available for Analysis

The Plant DNA C-values database [52] currently contains genome size data for 4427 angiosperms including 1885 monocot species. These values were combined with a further 1861 genome size estimates for species not already listed in the database but published in the literature to give a data set comprising 6288 species (including 2527 monocots). Table 1 shows the percentage representation for each of the major groups of angiosperms at different taxonomic levels.
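The composition of the combined data set can be checked with a few lines of arithmetic. The following is a minimal sketch in Python that uses only the counts quoted in the paragraph above; the function and variable names are illustrative assumptions and are not part of the original analysis.

```python
# Counts quoted in the text (Section 3.2.1).
DATABASE_ANGIOSPERMS = 4427   # angiosperm species in the Plant DNA C-values database
LITERATURE_ADDITIONS = 1861   # further species taken from the published literature
COMBINED_MONOCOTS = 2527      # monocot species in the combined data set


def combined_total(database: int, additions: int) -> int:
    """Size of the combined data set, assuming the two sources do not overlap."""
    return database + additions


def percent(part: int, whole: int) -> float:
    """Percentage that `part` represents of `whole`."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


total = combined_total(DATABASE_ANGIOSPERMS, LITERATURE_ADDITIONS)
monocot_share = percent(COMBINED_MONOCOTS, total)

print(total)                      # size of the combined data set
print(f"{monocot_share:.1f}%")    # monocot share of the combined data set
```

Under these counts the two sources sum to the 6288 species stated in the text, and monocots make up roughly 40% of the combined sample.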

4. Genome Size Diversity across Angiosperms

Across angiosperms, genome sizes range nearly 2000-fold from a 1C-value of just 0.063 pg in Genlisea margaretae (Lentibulariaceae) [53] to over 125 pg in tetraploid Fritillaria assyriaca (Liliaceae) [54]. This makes them one of the most variable groups of eukaryotes in terms of genome size. Nevertheless, a histogram showing the distribution of different genome sizes (Figure 1) reveals that most species have very small genomes, with a mode, median, and mean genome size of just 0.6, 2.6, and 6.2 pg, respectively (N.B. the 1C value corresponds to the DNA amount in the unreplicated gametic nucleus).
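The fold range quoted above is simply the ratio of the largest to the smallest 1C-value. The sketch below illustrates the calculation; it takes 125 pg as a conservative lower bound for the largest genome (the text says "over 125 pg"), and the conversion factor of about 978 Mbp per picogram is a widely used approximation that is assumed here rather than taken from this paper. All names are illustrative.

```python
# Approximate conversion: 1 pg of DNA is about 978 Mbp.
# This factor is an assumption brought in for illustration; it is not stated in the text.
PG_TO_MBP = 978.0


def fold_range(smallest_pg: float, largest_pg: float) -> float:
    """Ratio of the largest to the smallest genome size."""
    if smallest_pg <= 0:
        raise ValueError("smallest_pg must be positive")
    return largest_pg / smallest_pg


def pg_to_mbp(picograms: float) -> float:
    """Convert a DNA amount in picograms to megabase pairs (approximate)."""
    return picograms * PG_TO_MBP


smallest = 0.063               # 1C-value of Genlisea margaretae, from the text
largest_lower_bound = 125.0    # "over 125 pg" in the text, used as a lower bound

fold = fold_range(smallest, largest_lower_bound)
print(f"{fold:.0f}-fold")
print(f"{pg_to_mbp(smallest):.1f} Mbp")
```

With the 125 pg lower bound the ratio comes out just under 2000, in line with the "nearly 2000-fold" range described above; a larger upper value would push the ratio slightly higher.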

To examine how this diversity of genome size data is distributed within a phylogenetic context and to compare monocots with the rest of the angiosperms, data were superimposed onto the summary topology of angiosperms given by Soltis et al. [2] (Figure 2). The topology combines data from the three-gene, 567-taxon data set of Soltis et al. [55, 56] modified in light of more recent data arising from the analyses of nearly complete plastid genome data sets of Jansen et al. [57] and Moore et al. [58]. As Figure 2 shows, the large diversity of genome sizes is not spread evenly across all angiosperm groups. Thus although all clades contain species with small genomes, species with very large genomes occur in isolated clades within the monocots and eudicots.

There are clear differences between monocots and eudicots, and this is seen by comparing their genome size profiles (Figure 3). Not only is the maximum DNA amount of monocots (  pg) nearly 40% bigger than the largest eudicot genome (  pg) for which we have data, but also the mean and median values are significantly larger (Figures 3(a) and 3(b)). The differences are particularly apparent if we focus in on the upper end of the range (Figures 3(c) and 3(d)). In eudicots the largest genome sizes so far reported are found in two mistletoe species (  pg in Viscum cruciatum and  pg in Viscum album, both with ) [59, 60]. These are, however, clearly outliers as they are nearly twice the size of the next largest eudicot genome in the genus Hepatica (H. nobilis var. pubescens; ;  pg) in Ranunculaceae [61]. As Figure 3(c) shows, even this is an outlier as 99.5% of all eudicots have genomes smaller than 25 pg. In contrast there are many more monocot species possessing large genomes with 10% having genomes bigger than 25 pg based on the current sample (Figure 3(d)).

5. Genome Size Diversity within Monocots

Within monocots the 637-fold range of genome sizes is not distributed evenly across orders (Figure 4(a)); instead, distinct differences in the genome size range, mean, median, and modal values (Table 2) and profiles are apparent (Figure 4(b)). All orders have species with small genomes, whereas those with larger genomes (i.e.,  pg) are phylogenetically restricted.

Across angiosperms as a whole there is no overall clear correlation between genome size and total chromosome number (2n), and chromosomes can vary in size without any change in DNA depending on the nutrient status of the plant [62–64]. Nevertheless, many studies have highlighted the potential for using chromosome data as proxies for genome size (e.g., [38, 65–68]). Thus in the survey of genome size diversity in monocot orders presented below, the data have been supplemented with the more comprehensive chromosome information that is available. Taken together, it is apparent that different monocot orders follow distinctive modes of genome size and chromosome evolution.

5.1. Acorales

This order comprises a single genus with two species, Acorus gramineus and A. calamus. Small genome sizes for both species have been reported with  pg for A. gramineus and  pg for A. calamus, although no chromosome counts were given [69]. Nevertheless, since only diploid counts have been recorded for A. gramineus so far ( ) whereas triploids ( ) and tetraploids with and 48 have been noted for A. calamus with small chromosomes (c. 1–2 μm in A. calamus [70–72]), it is suggested that the larger genome size reported for A. calamus is probably from a tetraploid cytotype.

5.2. Alismatales

Genome size data are available for 106 species in 12 of the 13 families within this order [4] and range from  pg in two species of Araceae (Spirodela polyrrhiza with and Pistia stratiotes with ) to  pg in Zamioculcas zamiifolia (although no chromosome count was reported, previous ones have all been ) [73]. As in monocots as a whole (Figure 3(b)), the distribution of genome sizes in this order is skewed towards the smaller sizes (Figure 4(b)), with only two families (Alismataceae and Araceae) possessing genomes larger than 5 pg. Polyploidy has played a role in generating these larger genomes but the predominant mechanism has been through increases in chromosome size, with some of the largest chromosomes so far reported being found in species with relatively low chromosome numbers in Alismataceae, Hydrocharitaceae, and Araceae [73–77]. Indeed, the species with the highest chromosome number and a genome size estimate is Lemna minor (Araceae) with and yet its 1C-value is just 1.5 pg [78]. Similarly, the highest chromosome number so far reported in Alismatales is in Arisaema heterophyllum (Araceae) [79]; however, its chromosomes are small (c. 1 μm), and its genome size is thus unlikely to exceed 24 pg.

5.3. Petrosaviales

Currently there are no genome size estimates available for the two genera in Petrosaviaceae, Petrosavia (three species) and Japanolirion (one species). Nevertheless, karyotype information suggests that this small family is characterized by relatively small genomes. Tamura and Takahashi [80] reported Petrosavia sakuraii to have with chromosomes ranging in size from 1.0 to 3.6 μm and Satô [81] noted a bimodal karyotype of for Japanolirion osense comprising three long and nine short pairs of chromosomes ranging from 0.4 to 3.1 μm. Satô also noted that the karyotype of J. osense was similar to Chionographis japonica (Melanthiaceae), and since the genome size of C. japonica has been estimated to be  pg (J. Pellicer, pers. comm.), this suggests that the genome size for J. osense will be of similar magnitude.

5.4. Pandanales

With just eight genome size estimates ( –1.5 pg) in four of the five families, representation in Pandanales is poor. In addition, attempts to supplement this information with cytological data are hampered because obtaining counts in some families has been reported to be extremely difficult (e.g., Cyclanthaceae [82]). This is partly due to the small (i.e., 2 μm) and, in some cases, numerous chromosomes that characterise Pandanaceae ( (Freycinetia) and 60 (Pandanus) [83, 84]), Cyclanthaceae ( ), and Velloziaceae ( –48) [85]. Nevertheless, based on available karyotype data even smaller genomes may be found in this order as one of the smallest genomes so far reported is in Xerophyta humilis (Velloziaceae) with  pg and . However, South American Xerophyta species with only slightly larger chromosomes (i.e., up to 2.5 μm) but with have been reported (e.g., Xerophyta minima) [85].

At the other end of the scale, larger genomes may be found in Stemonaceae in which chromosomes may reach 7 μm in some species, although the highest chromosome number so far reported is [86]. Triuridaceae may also contain large genomes as this family contains species with chromosomes up to 17 μm in Sciaphila dolichostyla [87] although, like Stemonaceae, chromosome numbers do not exceed .

5.5. Dioscoreales

This order, sister to Pandanales, contains three families (Dioscoreaceae, Burmanniaceae, and Nartheciaceae) and is poorly represented for genome size data. Estimates are available for just 14 species, 12 for Dioscorea, one for Tacca (both Dioscoreaceae), and one for Narthecium (Nartheciaceae). These data show a narrow range of genome sizes from  pg in Narthecium ossifragum to 6.75 pg in Dioscorea elephantipes. Although all families are characterized by possessing small to very small chromosomes (e.g., see [88–91]), high levels of polyploidy have been reported, particularly in Burmanniaceae and Dioscoreaceae in which chromosome counts of and c. 140, respectively, have been recorded [92–94]. Such karyotype information suggests that genomes larger than 6.8 pg may well be found as representation of genome size data improves. Nevertheless, since increases in ploidy are often accompanied by decreases in chromosome size (as noted in Nartheciaceae by Larsen [95] and by Sen in Burmanniaceae [89]), the upper limit of genome size in this order is unlikely to be very large.

5.6. Liliales

Circumscription of families and genera comprising Liliales has been considerably revised in recent years with ten families now recognized based on the combined analysis of five DNA regions and morphological characters [4, 96]. In contrast to other monocot orders, a histogram showing the distribution of genome sizes for 142 species from seven of these families is not strongly skewed to the left but is more evenly distributed (Figure 4(b)), and this is reflected in the highest mean 1C value of 39.26 pg for monocots (Table 2). It is here that the truly giant plant genomes are found with the record holders going to tetraploid Fritillaria assyriaca (Liliaceae, 2n = 48, 1C = 127.4 pg) and Trillium rhombifolium (Melanthiaceae, ,  pg). However, very large genomes (i.e.,  pg) [97] are not uncommon in genera belonging to subfamily Lilioideae of Liliaceae (e.g., Lilium, Cardiocrinum, Notholirion, Tulipa, and Erythronium) [38, 98, 99], tribe Parideae in Melanthiaceae (e.g., Paris, Daiswa), and Alstroemeria (Alstroemeriaceae). Although there are currently no genome size data for species in the saprophytic family Corsiaceae, probably sister to all remaining Liliales, very large genomes may also be encountered here given that the chromosomes were reported to be similar in size to those of Pogonia (Orchidaceae) [100], the genome size of which has recently been estimated to be  pg [101] (N.B. both species are reported to have ; see [100, 102]).

Cytologically, Liliales are as diverse as other monocot orders with a wide range of chromosome numbers ( –216), ploidy (up to 22x), bimodal karyotypes (e.g., Alstroemeria, Luzuriaga, Rhipogonum, Smilax, and many genera in subfamily Lilioideae, Liliaceae), and holocentric chromosomes (Chionographis, Melanthiaceae). However, it is perhaps notable that to date no species of Liliales have been reported with very small chromosomes (i.e., 1 μm) or genomes (i.e., 1.4 pg), as encountered in all other monocot orders. The smallest genome so far reported is in Chionographis japonica (Melanthiaceae,  pg, J. Pellicer, pers. comm.).

5.7. Asparagales

Around half of monocots are Asparagales (which comprise 14 families sensu [4]). The order includes five highly species-rich families (Orchidaceae, c. 25,000 species; Amaryllidaceae, c. 1,600 species; Asparagaceae, c. 2,500 species; Iridaceae, c. 1,900 species; Xanthorrhoeaceae, c. 850 species), with the remaining families containing between one and 36 species. Accompanying the species richness of the order is huge variation in chromosome number ( –228), karyotype structure (with bimodality being common in many genera), and modes of chromosome evolution [37].

From a genome size perspective, data are available for 1130 species in 12 of the 14 families and show that they too vary considerably (c. 250-fold from  pg in Trichocentrum maduroi (Orchidaceae) to  pg in hexaploid Galanthus lagodechianus (Amaryllidaceae), the largest range for any monocot order; Table 2). Nevertheless, the modal genome size is just  pg, and half of all species with data have genomes smaller than 11 pg, giving rise to the strongly skewed distribution of genome sizes (Figure 4(b)). Within the order it is clear that genome size diversity is restricted to the five species-rich families mentioned above (Figure 5), with Orchidaceae having the largest range for any family so far reported (168-fold, –55.4 pg) [101]. Genome sizes in the species-poor families do not exceed  pg. This is generally supported by chromosomal data as none of the smaller families is characterized by large chromosomes, and in Asteliaceae, where counts up to have been reported, the chromosomes are noted to be very small [103]. The only possible exception is Hypoxidaceae in which Hypoxis obtusa is reported to have [104]. Although there are currently no genome size data for any species of Hypoxis, a related species Rhodohypoxis milloides with has a genome size of 1.4 pg suggesting that genomes larger than 8 pg may occur in this family [105].

The largest chromosomes are found in Amaryllidaceae in Haemanthus (up to 24 μm in the predominantly diploid genus with ) and Lycoris (up to 28 μm in a genus where diploid chromosome numbers range from to 22 via Robertsonian translocations) [106–108].

5.8. Commelinids

5.8.1. Dasypogonaceae

This small family, comprising four genera (Dasypogon, Calectasia, Kingia, and Baxteria) and c. 8 species, is poorly known both cytologically and from a genome size perspective. Currently there is just a single genome size estimate for Dasypogon hookeri with  pg and [109].

5.8.2. Arecales

In the palm family Arecaceae (the only family of Arecales), genome size data are available for 89 species in 57 of the 183 recognized genera and representing all five subfamilies (Figure 6(a)) [110]. C-values range c. 33-fold from 0.9 pg in the diploid Phoenix canariensis (Coryphoideae) ( ) [111] to 30.0 pg in the highly polyploid Voanioala gerardii (Arecoideae) with . 600 [8]. The large C-value for V. gerardii is, however, clearly an outlier (Figure 4(b)) with the next largest genome size belonging to diploid Pinanga subintegra with  pg. This reflects cytological data showing that polyploidy is rare in palms with just four polyploid species reported to date, two tetraploids (Arenga caudata, , and Rhapis humilis, ) and two rare, monotypic genera of high ploidy, c. 12x in Jubaeopsis caffra from South Africa ( –200) and c. 38x in Voanioala gerardii from Madagascar [16]. The latter two genera belong to the same subtribe, Attaleinae, of tribe Cocoseae.

At the diploid level, genome sizes still range c. 15-fold (from 0.9 pg in Phoenix canariensis to 13.9 pg in Pinanga subintegra), and this diversity contrasts with the narrow range of chromosome numbers reported across the c. 2,500 species (i.e., , 28, 30, 32, 34, and 36). Röser [112] proposed that different chromosome numbers had evolved mainly through dysploidy due to the broadly similar DNA amounts in three related genera differing in chromosome number (Livistona, ; Johannesteijsmannia, ; Licuala, ). This is supported by an analysis of the larger data set available here. A comparison of the mean DNA amount for each chromosome number showed that they were not significantly different (data not shown). It is, however, clear that changes in genome size can occur with no alteration of chromosome number, leading to related species having significantly different sized chromosomes (Figure 6(b)). The most dramatic example of this in palms is found in Pinanga where C-values range from  pg in P. celebica to 13.9 pg in P. subintegra, although all species have [110]. This is the largest range of genome sizes for any palm genus, and the possibility that it is linked to the diversity in reproductive evolution and speciation in Pinanga has been suggested by Loo et al. [113].

5.8.3. Zingiberales

Genome size estimates are available for 71 species with at least one for each of the eight families comprising Zingiberales. The data show that this order is characterised by a narrow range of small genome sizes ( –6.0 pg). This reflects the more extensive cytological data indicating that the order is typified by karyotypes in which chromosomes are either all very small (i.e., c. 2 μm; Marantaceae, Heliconiaceae) or small (i.e., c. 2–5 μm; Cannaceae, Musaceae, Strelitziaceae, Costaceae), or in which the karyotypes contain a few larger chromosomes (6 or 7 μm; Lowiaceae, Zingiberaceae) as well as smaller ones. In addition, polyploidy is not widespread in the group as most species studied to date are cytologically diploid. It is only in a few genera of Zingiberaceae (e.g., Curcuma, Hitchenia, Hedychium, Globba, Boesenbergia) that polyploidy has played a significant evolutionary role, reaching 15-ploid in Curcuma raktakonta, with , the highest chromosome number so far reported in Zingiberales [114, 115]. As in other monocot groups, however, due to the small size of the chromosomes in C. raktakonta, its genome is not the largest for Zingiberales (  pg) [114]. Instead this is found in the diploid Zingiber officinale (  pg, ) [116]. The smallest genomes in Zingiberales are found in diploid species of Calathea (Marantaceae) and Heliconia (Heliconiaceae) with 1C-values of 0.3–0.4 pg [105, 117]. Very small genomes are also found in two tetraploid species of Maranta (M. arundinacea ,  pg and M. bicolor ,  pg) leading to the possibility that even smaller genomes may be found in diploid Maranta species such as M. arundinacea var. variegatum with and small chromosomes ( 2 μm) [118].

5.8.4. Commelinales

Within Commelinales, although genome size estimates are available for 113 species and range 56-fold, the data are highly unrepresentative with 108 values from Commelinaceae and the remaining five from Haemodoraceae (two species), Hanguanaceae (one species), and Pontederiaceae (two species) (there are currently no genome size estimates for Philydraceae, the last family in Commelinales). Genome sizes from the last three families are the lowest for the order ranging from just 0.8 pg in Xiphidium caeruleum ( ; Haemodoraceae) [119] to 1.6 pg in Hanguana malayana ( . 170; Hanguanaceae) [109]. This narrow range reflects cytological data showing Haemodoraceae, Hanguanaceae, and Pontederiaceae to be characterised by possessing small to very small chromosomes. Nevertheless, polyploidy and dysploidy are also prevalent, particularly in Pontederiaceae in which chromosome numbers range –80, and in Hanguanaceae with –c. 170 [109, 120] suggesting that larger genome sizes within these families may be uncovered.

The most species-rich family by far is Commelinaceae with c. 650 species, and genome sizes here range 17-fold ( –43.4 pg). Even here, however, the data set is unrepresentative, being dominated by estimates from just three out of the c. 40 genera (i.e., Tradescantia, 52 species; Gibasis, 17 species; Commelina, 17 species). Nevertheless, given the extensive cytological data available for Commelinaceae (reviewed in [121–123]) the upper limit may not be extended considerably as the largest chromosomes so far reported belong to Tradescantia virginiana and its North and Central American allies [124], and the largest genome size estimate available is for tetraploid T. virginiana ( ) with  pg [125]. Indeed the genomes that appear as outliers in Figure 4(b) (with –43.4 pg) are all tetraploid (where known) Tradescantia or Callisia species from N. America.

Nevertheless, it seems likely that Commelinaceae genomes smaller than  pg (for tetraploid Commelina erecta, ) will be uncovered as the smallest chromosomes so far reported are in Stanfieldiella with , Buforrestia ( ) and Cartonema with [121, 122, 126]. Not only are the chromosome numbers of these genera lower than that of C. erecta, but also the chromosomes are considerably smaller [121, 122, 127]. Very small genomes may also be found in Pollia, a genus noted to contain species with a low number ( ) of very small chromosomes by Jones and Jopling [121].

5.8.5. Poales

Genome size estimates are available for 951 out of an estimated 18,325 species in Poales (an order comprising 16 families) [4]. A summary of the range and distribution of genome sizes encountered in the twelve families with data is given in Table 3 and Figure 7.

Phylogenetically, within Poales there are some well-supported groups. Both molecular and morphological data suggest that Typhaceae and Bromeliaceae are probably sister taxa and form a clade sister to Rapateaceae and the remainder of Poales (see Figure 7) [3]. From a genome size and chromosomal perspective, data for two of these families show that they are characterised by small genomes comprising numerous very small to small chromosomes (Typhaceae , 60, Bromeliaceae 50 with occasional polyploids with and 150) [128–131]. For Rapateaceae there are no genome size data and only a few chromosome counts ( and 52) [132] with no pictures. Thus insights into their genomes are currently lacking.

The remaining families are split into two large, well-supported clades: (i) the cyperid clade (comprising Xyridaceae, Eriocaulaceae, Mayacaceae, Thurniaceae, Juncaceae, and Cyperaceae) and (ii) the graminid clade containing the restionids (Anarthriaceae, Centrolepidaceae, and Restionaceae) and core Poales (Flagellariaceae, Joinvilleaceae, Ecdeiocoleaceae, and Poaceae).

The Cyperid Clade
Within the cyperid clade there is only one genome size estimate of  pg for Mayacaceae (Mayaca cf. fluviatilis; Smarda and Bureš, pers. comm.) and only very limited chromosome counts; thus inferences about their genomes are difficult. For the remaining families both Xyridaceae and Eriocaulaceae (two families often considered to be sisters) are characterised by small but highly variable chromosome numbers ( –110) with polyploidy and dysploidy being important evolutionary mechanisms generating this diversity [133]. However, only two genome size-estimates are available (Table 3); so the full extent of genome size variation that accompanies chromosome diversity is currently unclear.
Based on molecular and morphological data, Juncaceae and Cyperaceae form a well-supported clade, most likely sister to Thurniaceae [3]. Genome size data are very sparse in Thurniaceae with currently just one genome size estimate (available for Prionium serratum of  pg; Smarda and Bureš pers. comm.) and no chromosome data. In contrast both Juncaceae and Cyperaceae have received considerable cytological attention because of the presence of holocentric chromosomes [27, 134]. Such studies have uncovered an extensive range in chromosome numbers (Cyperaceae ; Juncaceae –130). Indeed, chromosome evolution is considered to be more dynamic in Carex than in any other group of flowering plants with a series of chromosome numbers ranging from to [28, 135]. Across Cyperaceae and Juncaceae, polyploidy, agmatoploidy (increase in chromosome number through fragmentation of holocentric chromosomes), and symploidy (fusion of holocentric chromosomes) are considered to have been important in generating the diversity of chromosome numbers observed [27], with symploidy being so extensive in Rhynchospora tenuis (Cyperaceae) that its chromosome number has dropped to just [7].
Studies on genome size-evolution in taxa with holocentric chromosomes are more limited (e.g., [31, 136–138]) but available data show that the narrow ranges of genome sizes encountered in the two families are similar (see Table 3, Figure 7). In general the average chromosome size (obtained by dividing the 2C value by the chromosome number) varies considerably across the range of chromosome numbers encountered, suggesting that chromosome evolution by symploidy and agmatoploidy is often accompanied by considerable loss or addition of DNA [31, 139].
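The calculation behind this inference can be sketched as follows. If chromosome number changed by fission or fusion alone, the 2C value would stay constant and the mean chromosome size would scale inversely with chromosome number; departures from that expectation point to accompanying gain or loss of DNA. The function name and the two taxa below are hypothetical and are not taken from Table 3.

```python
def mean_chromosome_size(two_c_pg: float, chromosome_number: int) -> float:
    """Mean DNA amount per chromosome (pg): 2C value divided by 2n."""
    if chromosome_number <= 0:
        raise ValueError("chromosome number must be positive")
    return two_c_pg / chromosome_number


# Hypothetical (2C in pg, 2n) pairs, for illustration only.
examples = {
    "taxon A": (0.8, 20),
    "taxon B": (0.8, 80),  # same 2C, four times as many chromosomes
}

sizes = {name: mean_chromosome_size(c, n) for name, (c, n) in examples.items()}

# Under fission alone, taxon B's chromosomes should be a quarter the size
# of taxon A's; a different observed ratio would imply DNA gain or loss.
ratio = sizes["taxon A"] / sizes["taxon B"]
```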

The Graminid Clade
Within graminids, the restionids form a well-supported clade that is diverse in chromosome number ( –104) and size ( 1 µm in Lepidobolus (Restionaceae) to over 10 µm in Anarthria (Anarthriaceae)) [140, 141]. Currently, there is just one genome size estimate for Rhodocoma gigantea (Restionaceae) (  pg) with no chromosome data; so how typical it is for this clade is unclear [109].
Within core Poales the three families related to Poaceae are poorly characterised both cytologically and from a genome size perspective. Available data suggest that they may possess small genomes as chromosome counts of (Joinvilleaceae), 38 (Flagellariaceae), and 42 and 64–66 (Ecdeiocoleaceae) have been reported, and the chromosomes are noted to be small [141–143]. The two genome size estimates available support this (i.e., Flagellaria guineensis,  pg, (Flagellariaceae) [109] and Ecdeiocolea monostachya  pg, . 38 (Ecdeiocoleaceae) [144]).
In contrast, Poaceae, one of the most species-rich angiosperm families (c. 10,000 species), has the greatest number of genome size estimates for any family in the monocots with values for 807 species and extensive chromosome data with numbers ranging from to 266. Given this large amount of data together with extensive genomic and phylogenetic information available for grasses there have been numerous studies on the evolution of grass genomes and their sizes [145–147]. Indeed, many of the insights into the molecular basis and evolutionary dynamics of genome size-variation in angiosperms as a whole have been gained through the study of grass genomes [148–150]. These have revealed the rapid and dynamic nature of genome size-evolution in grasses [146, 151], the mechanisms involved in generating genome size-diversity [152], and the contribution that transposons and in particular retrotransposons have made to genome size-differences [153–156] and highlighted contrasting patterns of genome size and chromosome diversification in the grass subfamilies [146, 157].
Analysis of the 807 genome estimates reveals that subfamilies are characterized by different ranges of genome sizes. All subfamilies contain species with small genomes, whereas species with genomes greater than  pg are restricted largely to Pooideae and one species of Chloridoideae (Bouteloua gracilis  pg) (Figure 8(a)). Some of this variation can be attributed to polyploidy as species with genomes larger than  pg are all polyploid (4x–c. 38x), and although a count was not made for B. gracilis, previous records show it to range –77 (4x–11x) with small chromosomes (c. 0.5–2 µm) [158]. Indeed, by plotting the distribution of mean chromosome sizes for each subfamily (by dividing 2C values by chromosome number) to remove the effect of polyploidy, the largest chromosomes are found in Pooideae and Panicoideae with all the other subfamilies being characterized by much smaller chromosomes (Figure 8(b)). Once again, species with high chromosome numbers in the data set have some of the smallest chromosomes and relatively small genomes (e.g., Spartina anglica ,  pg; Cenchrus caliculatus ,  pg [159]). The largest chromosomes are found in diploid species of Secale and Psathyrostachys with (both Pooideae).
The average chromosome sizes for the two largest subfamilies (Pooideae and Panicoideae) are distinct. Although most species in Panicoideae are characterized by relatively small chromosomes (with a modal DNA amount per chromosome of 0.1 pg (Figure 8(c))), there are two modal peaks in chromosome size at 0.4 and 0.7 pg for Pooideae (Figure 8(d)), suggesting that the evolutionary processes driving chromosome and genome size-evolution are different in these subfamilies. Even within subfamilies differences in the rates of genome change are apparent. Genomic comparisons between four grass genomes suggested that the rate of genome evolution in Aegilops tauschii (Pooideae) was substantially higher than in Brachypodium distachyon (Pooideae), Sorghum bicolor (Panicoideae), and Oryza sativa (Ehrhartoideae) [160].
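As a minimal sketch of how a modal DNA amount per chromosome might be read from such data, the snippet below divides each 2C value by its chromosome number, bins the results to the nearest 0.1 pg, and reports the most frequent bin. The bin width, the function name, and the data are assumptions for illustration; they are not the values plotted in Figure 8.

```python
from collections import Counter


def modal_chromosome_size(records, bin_width=0.1):
    """Return the centre (pg) of the most frequent chromosome-size bin.

    records: iterable of (2C value in pg, chromosome number 2n) pairs.
    """
    per_chromosome = [two_c / two_n for two_c, two_n in records]
    # Assign each value to the nearest multiple of bin_width.
    counts = Counter(round(size / bin_width) for size in per_chromosome)
    index, _ = counts.most_common(1)[0]
    return index * bin_width


# Hypothetical (2C, 2n) pairs, for illustration only.
hypothetical_subfamily = [
    (1.6, 20), (2.2, 20), (4.8, 40), (0.9, 10),  # about 0.1 pg per chromosome
    (4.34, 14), (5.88, 14),                      # larger chromosomes
]

mode_pg = modal_chromosome_size(hypothetical_subfamily)
```

A distribution with two peaks, as described for Pooideae, would need the full histogram rather than a single mode; `Counter.most_common(2)` would be one way to inspect it.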

6. Evolution of Genome Size Diversity within Monocots

From the above discussion it is clear that the different orders of monocots have undergone very different patterns of both genome size and chromosome evolution, giving rise to the genomic diversity observed. In seeking to understand how such diversity in genome size evolved over the c. 110–120 million years since monocots first appeared in the fossil record [161], ideally one would aim to obtain genome size estimates from key fossil taxa. However, although various approaches have been suggested for using fossil epidermal or guard cells as proxies for genome size [162, 163], the poor and patchy fossil record for monocots has precluded such an approach so far.

An alternative line of attack is to use statistical modelling to reconstruct genome size evolution. However, although there have been several studies that have used these approaches to reconstruct the size of the ancestral genomes across the angiosperm tree, including monocots (e.g., [164, 165]), many have analysed genome size as a discrete character requiring the data to be partitioned into size classes. Since genome size varies continuously, a biologically more meaningful approach is to analyse it as a continuous character, and there are now numerous studies that have used such approaches to analyse genome size within particular plant genera and families (e.g., [98, 166, 167]). Recently we have been extending the application of these approaches to examine genome size-evolution across monocots as a whole, not only to reconstruct ancestral genome sizes at different nodes of the monocot tree but also to compare rates of genome size evolution to see whether the different genomic profiles observed in the monocot groups (Figure 4(b)) are reflected in differences in the mode and tempo of genome size-evolution. The full details of the methods and approaches used are outlined in Beaulieu et al. (in prep.), and a summary of the findings is presented here.

The two statistical modelling programs used for analysis were BayesTraits [168–170] and Brownie [171]. BayesTraits applies a generalized least squares approach to model genome size evolution. It provides insights into the mode and tempo of genome size evolution and also reconstructs ancestral genome sizes at different nodes within the phylogenetic tree. In contrast, Brownie uses maximum likelihood to analyse rates of genome size evolution across a phylogenetic tree and it can be applied to test for substantial differences in the rate of genome size-evolution between monocot clades.
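To give a feel for what reconstructing an ancestral value of a continuous character involves, the sketch below estimates the root state of a small bifurcating tree under a simple Brownian motion model, using a pruning pass that weights each descendant by the inverse of its branch length. For the root this gives the same value as the generalized least squares estimate under that model. It is not the BayesTraits or Brownie implementation; the tree, tip values, and names are hypothetical, and all branch lengths are assumed to be positive.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Node:
    branch_length: float                 # length of the branch above this node
    value: Optional[float] = None        # trait value (tips only)
    children: Tuple["Node", ...] = ()    # empty for tips, two nodes otherwise


def _prune(node: Node) -> Tuple[float, float]:
    """Return (estimate, extra variance) for the subtree rooted at node."""
    if not node.children:
        return node.value, 0.0
    left, right = node.children
    x1, extra1 = _prune(left)
    x2, extra2 = _prune(right)
    # Effective branch lengths include uncertainty inherited from below.
    v1 = left.branch_length + extra1
    v2 = right.branch_length + extra2
    estimate = (x1 / v1 + x2 / v2) / (1 / v1 + 1 / v2)
    return estimate, v1 * v2 / (v1 + v2)


def root_state(root: Node) -> float:
    """Estimate of the trait value at the root under Brownian motion."""
    return _prune(root)[0]


# Hypothetical tree ((A:1, B:1):1, C:2) with made-up trait values.
tree = Node(0.0, children=(
    Node(1.0, children=(Node(1.0, value=1.0), Node(1.0, value=3.0))),
    Node(2.0, value=5.0),
))

ancestral = root_state(tree)  # equals 23/7 for this toy tree
```

The values returned for internal nodes other than the root are only intermediate quantities here; obtaining the full estimate at every node requires a second pass or re-rooting, which is omitted.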

Using these approaches, the following picture of genome size-evolution in monocots is emerging.

6.1. The Ancestral Genome Size of Monocots

The ancestral genome size of all monocots was reconstructed as 1.85 pg (Figure 4(a)), similar to previous studies using MacClade in which the ancestral genome size was reconstructed as being “very small” (i.e.,  pg [164]). Within monocots, our analysis showed that there was a general tendency for increases in the ancestral genome size such as at the base of Arecales (  pg) and Asparagales ( pg) and a large increase at the base of Liliales (  pg). In contrast, decreases were observed within commelinids in the branch leading to Commelinales and Zingiberales (  pg) and a slight decrease at the base of Poales (  pg).

To what extent the predicted increases in ancestral genome size in Asparagales, Arecales, and Liliales reflect signatures of an ancient whole genome duplication near the base of all monocots is unclear although support for such an event is increasing based on the expanding sequence data being generated in key monocot species (see [14]). Alternatively, the larger ancestral genome sizes in these groups may reflect whole genome duplication events at or near the base of each clade. Already, an analysis of the Acorus genome has uncovered evidence of at least one round of polyploidy [13]. Whether multiple polyploid events occurred at or near the base of Liliales to contribute to an ancestral genome size (  pg) more than five times that of monocots as a whole (  pg) remains to be seen but requires sequence data from key species of Liliales that sadly are not currently available.

The predicted decreases in ancestral genome size along the branches leading to Poales and Zingiberales + Commelinales suggest that the origin of these clades may not have been accompanied by a whole genome duplication event, and this is supported by available sequence data that have failed to find evidence of polyploidy in these phylogenetic positions [172]. Nevertheless, polyploidy within these clades has clearly taken place especially within Poaceae based on both chromosomal and DNA sequence analyses. Cytologically 80% of all Poaceae are estimated to be polyploid [173], and an inferred whole genome duplication event 50–70 million years ago in Poaceae has been proposed (e.g., [174, 175]), close to the origin of Poaceae (c. 89 mya) [176]. Within Zingiberales sequence data have provided evidence for a whole genome duplication c. 60 mya in Musaceae but not in Zingiberaceae [26].

Nevertheless, it is clear that all groups of monocots analysed contain species with genomes smaller than the reconstructed ancestral genome size which highlights the propensity for genome size to decrease as well as increase.

6.2. Mode and Tempo of Genome Size Evolution in Monocots

The mode of genome size evolution was shown to be that of “scaled gradualism” meaning that genome size has evolved in a gradual rather than punctuated manner over time but with more changes in the shorter branches than the longer branches of the phylogenetic tree used for analysis. This suggests that the rate of genome size evolution slows down on the longer branches. Genome size evolution was also shown to be “slow” rather than “accelerated” suggesting that most diversity in genome size encountered in monocots was established early. This is consistent with studies in Orobanche (Orobanchaceae) [166] and Brassicaceae [177], which pointed to a slow tempo of genome size evolution implying that most of the diversity of genome sizes encountered in these two eudicot groups evolved early on in their diversification. In contrast, an accelerated tempo of genome size evolution was recently reported in a similar, but more focused study of genome size evolution in Liliaceae by Leitch et al. [98]. It is, however, too early to say to what extent these patterns reflect differences between monocots and eudicots and further studies are clearly needed.
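One common way to express this kind of scaling in comparative methods is a branch-length exponent, often written κ, under which each branch length is raised to the power κ before the model is fitted. The sketch below assumes that formulation; the source does not report the parameter values estimated, and the branch lengths and function name here are hypothetical.

```python
def kappa_transform(branch_lengths, kappa):
    """Raise each branch length to the power kappa.

    kappa = 1 leaves the tree unchanged (gradual change proportional to
    branch length); kappa = 0 makes every branch equal (change independent
    of branch length); 0 < kappa < 1 compresses long branches relative to
    short ones.
    """
    return [length ** kappa for length in branch_lengths]


# Hypothetical short and long branches (arbitrary time units).
original = [1.0, 10.0]
scaled = kappa_transform(original, kappa=0.5)

# The long branch was 10 times the short one; after scaling it is about
# 3.16 times, so less change is expected per unit length on the long branch.
ratio_before = original[1] / original[0]
ratio_after = scaled[1] / scaled[0]
```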

Results from Brownie showed that, despite the different genome size profiles of the clades analysed (Figure 4(b)), there was no evidence to suggest that clades with bigger genomes (Liliales) were evolving more rapidly than other clades. Instead all the major monocot clades were shown to be evolving at nearly the same rates with the exception of Asparagales and Poales, which were shown to be evolving at significantly higher rates than other monocot clades. Whether the elevated rates occur across the whole order or are restricted to specific families and genera within Asparagales and Poales needs to be investigated further. In recent studies of chromosome and sequence evolution in Poaceae, an accelerated rate of structural genome evolution was shown to be restricted to species in Triticeae with larger genomes when compared with relatives in other tribes with smaller genomes [151]. Indeed, there may be a link between the activity of transposable elements, their rate of turnover, and genome size-evolution since species with larger genomes have been observed to have more interchromosomal duplications than species with smaller genomes [178]. Additional work is needed to extend these studies beyond Poaceae.
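A maximum-likelihood comparison of rates between clades is often framed as a likelihood ratio test between a model with one rate for the whole tree and a model that allows a focal clade its own rate. The sketch below shows only that final comparison step, assuming one extra parameter (one degree of freedom); the log-likelihood values are invented for illustration and are not results from this study, and the function name is hypothetical.

```python
import math


def likelihood_ratio_test_1df(loglik_single_rate, loglik_two_rate):
    """Return (test statistic, p-value) for nested models differing by 1 parameter.

    The p-value uses the chi-square distribution with one degree of
    freedom, whose upper tail equals erfc(sqrt(statistic / 2)).
    """
    statistic = 2.0 * (loglik_two_rate - loglik_single_rate)
    if statistic < 0:
        raise ValueError("the richer model cannot have a lower likelihood")
    return statistic, math.erfc(math.sqrt(statistic / 2.0))


# Invented log-likelihoods, for illustration only.
statistic, p_value = likelihood_ratio_test_1df(-120.0, -115.0)
```

With these made-up numbers the statistic is 10 and the p-value is well below 0.05, which would favour the two-rate model.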

7. Future Directions

The picture emerging from current large-scale comparative sequence analyses of plants is that genomes of Poaceae are different from other monocot lineages that appear more eudicot-like [25, 26]. However, out of the nine plant genomes “completely” sequenced so far, the only monocots all belong to Poaceae (i.e., Oryza sativa, Sorghum bicolor, Zea mays, and Brachypodium distachyon). The anticipated release of complete genome sequences for some other grass genomes (e.g., Triticum aestivum) will no doubt add to the power of comparative analysis, but the need for species in other families and clades is clearly apparent if one is to really get to grips with the diversity of monocot genomes and how distinct they really are from other angiosperms.

8. Note Added in Proof

Following the acceptance of this paper the authors were made aware of a paper by Zonneveld [179] which is also published in this special issue. Zonneveld presents the first genome size estimate for the triploid hybrid Trillium × hagae ( ) with a 1C DNA amount of 132.5 pg. As this is larger than Fritillaria assyriaca (  pg), the range of genome sizes encountered in angiosperms and land plants as a whole has now increased to 2056-fold, while the range for monocots has increased to 665-fold. Zonneveld also reports new 1C-value estimates for the eudicot Viscum album (102.9 pg) and V. cruciatum (87.9 pg). Both values are higher than those reported previously for the same species. Using these values also extends the range of genome sizes encountered in eudicots to c. 1633-fold.
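The fold-ranges quoted above are simply the largest 1C value divided by the smallest. As a rough check, the sketch below works backwards from the figures in this note to the smallest 1C value each range implies; because the fold values are rounded, the results are approximate, and the function name is hypothetical.

```python
def implied_minimum_pg(largest_pg: float, fold_range: float) -> float:
    """Smallest 1C value (pg) implied by a largest value and a fold-range."""
    return largest_pg / fold_range


largest = 132.5  # 1C value reported for Trillium × hagae, in pg

angiosperm_min = implied_minimum_pg(largest, 2056)  # about 0.064 pg
monocot_min = implied_minimum_pg(largest, 665)      # about 0.2 pg
```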