Research Article

The Interaction between Base Compositional Heterogeneity and Among-Site Rate Variation in Models of Molecular Evolution

Figure 5

Simulation model and results. (a) Tree from which the sequences were simulated. The colored branches show the different composition vectors, which drove the simulated sequences toward a convergence in nucleotide composition (CNC). I ran this simulation with three different levels of bias; each level had 100 simulation replicates. (b) Histograms showing the distribution of Robinson-Foulds distances (RF distance) between the maximum likelihood tree and the true tree. Maximum likelihood trees were generated by PhyML for each of the 100 simulated replicates under each of the three parameter settings (high, low, and no CNC). The difference in performance between the models (GTR versus GTR+G) grows as the level of CNC increases (**: highly significant; *: marginally significant; NS: not significant). In the simulation with no CNC, the GTR model performs as well as the GTR+G model. (c) Boxplot showing the distribution of bias introduced by the different composition vectors under each of the three simulation parameter settings.
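The RF distance used in panel (b) can be illustrated with a minimal sketch. It assumes trees are written as nested Python tuples of taxon names and counts the bipartitions (splits) found in one unrooted tree but not the other, without normalization. The function names and the five-taxon example trees are hypothetical and are not taken from the article, which obtained its trees from PhyML; the article may also have scaled the distance differently.

```python
def leaves(tree):
    """Return the set of taxon names in a tree given as nested tuples."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def splits(tree):
    """Return the non-trivial bipartitions of the tree, treated as unrooted."""
    all_taxa = leaves(tree)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        clade = frozenset().union(*(visit(child) for child in node))
        other = all_taxa - clade
        # Keep only splits with at least two taxa on each side; storing the
        # unordered pair makes the result independent of where the root sits.
        if len(clade) > 1 and len(other) > 1:
            found.add(frozenset([clade, other]))
        return clade

    visit(tree)
    return found


def rf_distance(tree_a, tree_b):
    """Unnormalized Robinson-Foulds distance: size of the symmetric
    difference between the two trees' split sets."""
    if leaves(tree_a) != leaves(tree_b):
        raise ValueError("trees must share the same taxon set")
    return len(splits(tree_a) ^ splits(tree_b))


# Hypothetical example: a "true" tree and an estimate that misplaces taxon C.
true_tree = ((("A", "B"), "C"), ("D", "E"))
estimated_tree = ((("A", "C"), "B"), ("D", "E"))
print(rf_distance(true_tree, estimated_tree))
```

A distance of zero means the estimated topology matches the true topology exactly; larger values mean more internal branches disagree, which is what the histograms in panel (b) summarize across replicates.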