Eucalyptus globulus is grown extensively in plantations outside its native range in Australia. Concerns have been raised that the species may pose a genetic risk to native eucalypt species through hybridisation and introgression. Methods for identifying hybrids are needed to enable assessment and management of this genetic risk. This paper assesses the efficiency of a Bayesian approach for identifying hybrids between the plantation species E. globulus and E. nitens and four at-risk native eucalypts. Range-wide DNA samples of E. camaldulensis, E. cypellocarpa, E. globulus, E. nitens, E. ovata and E. viminalis, and pedigreed and putative hybrids (n = 606), were genotyped with 10 microsatellite loci. Using a two-way simulation analysis (two species in the model at a time), the accuracy of identification was 98% for first and 93% for second generation hybrids. However, the accuracy of identifying simulated backcross hybrids was lower (74%). A six-way analysis (all species in the model together) showed that as the number of species increases the accuracy of hybrid identification decreases. Despite some difficulties identifying backcrosses, the two-way Bayesian modelling approach was highly effective at identifying first generation (F1) hybrids, which, in the context of E. globulus plantations, are the primary management concern.

1. Introduction

Plants are well known for their propensity to hybridise [1, 2], and the role of hybridisation in animal systems is receiving growing attention [3, 4]. Natural hybridisation has been widely documented in plants [1], with hybrid zones often being used to investigate the mechanisms that underlie speciation [5–7]. These studies have demonstrated that barriers to hybridisation that evolve in allopatry are often incomplete, and hybridisation and introgression are still possible when species secondarily come into contact [5, 7]. Two consequences of human development have been the fragmentation of natural plant populations and the widespread movement of plant species around the world [8]. In many situations this has resulted in exotic species coming into contact with cross-compatible indigenous species, leading to human-mediated exotic hybridisation [9–11]. This exotic hybridisation and potential for subsequent introgression may threaten the genetic integrity of native species [10, 11].

Given the genetic risk posed by exotic hybridisation, methods for detecting hybrid progeny are needed to enable quantification and management of the issue [11]. In some situations first generation (F1) hybrids can be detected based on intermediate morphology, but in species with similar characteristics and in advanced generation hybrids (second (F2) and backcross (BC) generations) morphological detection is often difficult and unreliable [14]. Over the past two decades several techniques utilising molecular markers have been developed for detecting hybrids [15–18]. Early methods often depended on identifying species-specific markers that could be used to identify immigrants [19, 20]. This approach is highly effective in theory [21], but in practice identifying species-specific markers is problematic, especially in closely related taxa [14, 17].

The development of highly polymorphic microsatellite markers, combined with new Bayesian statistical approaches [22] and advances in computing power, has allowed the development of model-based techniques for hybrid detection [16, 17]. These approaches produce admixture estimates based on multilocus allele frequencies and Bayesian clustering [16, 17]. The techniques have now been widely used to identify interspecific hybridisation [23–25] and introgression [26–29]. The two most commonly used programs are STRUCTURE [17] and NEWHYBRIDS [16]. Sanz et al. [27] found that of four programs STRUCTURE and NEWHYBRIDS were the most effective at identifying hybrids, but STRUCTURE was the most accurate, correctly identifying 100% of simulated F1 and F2 hybrids and 96% of backcrosses, while NEWHYBRIDS misclassified 8 to 14% of F2 and 30 to 34% of backcrosses.

In this study we test a Bayesian modelling approach (using STRUCTURE) for identifying hybrids between the plantation species Eucalyptus globulus and five other eucalypt species: E. camaldulensis, E. cypellocarpa, E. nitens, E. ovata, and E. viminalis. Over the last two decades there has been a major expansion of the eucalypt plantation estate in Australia, which now covers around 1,000,000 ha [30]. Eucalyptus globulus is the most widely planted species and is grown mainly outside its natural range [30], raising concerns that it could pose a genetic risk to native eucalypt populations [31, 32]. Hybridisation is well documented in eucalypts [33] and is more likely to occur between closely related species [32, 33]. Hybrids have been reported based on morphology, with F1 hybrids typically being intermediate between the parental taxa [31, 33, 34]. The distinctive juvenile morphology of many species also makes identification of hybrids possible at an early age, and this characteristic has been widely used in studies investigating exotic gene flow in eucalypts [31, 35, 36].

As in other groups, identifying eucalypt hybrids can be problematic where species have similar morphology, or when there are advanced generation hybrids that resemble the backcross parent [35]. Additionally, in the context of E. globulus plantations, there are a range of native species that have similar seedling morphology to E. globulus, which could hybridise with other native species and produce hybrid seedlings resembling exotic hybrids. There are also at least 36 native eucalypt species that grow adjacent to E. globulus plantations [31], and these species have a wide range of juvenile characteristics. This diversity can make it difficult to distinguish between the intermediate morphology of a hybrid and the morphology of an unfamiliar species.

There is evidence that exotic hybridisation is occurring from E. globulus plantations [31]. Barbour et al. [31] found that 35% of E. globulus plantations are in close proximity to cross-compatible native species, and they detected low levels of hybridisation in open-pollinated native seedlots, as well as hybrids establishing in native forest beside one plantation. In order to enable identification and management of exotic hybridisation from E. globulus plantations, an approach is needed to validate and/or identify hybrids between E. globulus and at-risk species. The species selected here were chosen because they are common plantation neighbours and are known to hybridise with E. globulus. In some cases (E. cypellocarpa and E. nitens) their seedling morphology is similar enough to that of E. globulus that distinguishing their hybrids from E. globulus hybrids would be difficult. The specific aims of the study are to assemble range-wide molecular databases for each species using 10 microsatellite loci, and then, using Bayesian admixture analysis, test the ability of those marker sets to detect hybrids with E. globulus using simulated, pedigreed, and putative hybrid samples.

2. Materials and Methods

2.1. Sample Description

Collections of range-wide samples of E. camaldulensis, E. cypellocarpa, E. globulus, E. nitens, E. ovata, and E. viminalis were assembled from various sources (Table 1 and Figure 1). A range of pedigreed and putative hybrid samples were also collected for assessing and testing the ability of the modelling approach (see below) to detect hybrids. All hybrids referred to as “pedigreed” have either been validated with parentage analysis using molecular markers (to be published elsewhere) or are from controlled crossing. Samples validated with molecular markers include the following: E. camaldulensis × globulus (), E. viminalis × globulus (), and E. ovata × globulus (). Samples produced through controlled crossing from an advanced generation hybrid trial between E. globulus and E. nitens were also used. Details of the crossing approach and trial establishment can be found in Costa e Silva et al. [44], and validation of the cross types with near infrared reflectance spectroscopy is explained in O’Reilly-Wapstra et al. [45]. Samples used from this trial were 12 F1, four F2, and 12 E. globulus backcross hybrids ().

Unconfirmed putative hybrid samples were also used to test the method. Four putative E. camaldulensis × globulus hybrids were identified from morphology in open-pollinated seed from E. camaldulensis trees beside an E. globulus plantation in South Australia. Seven putative E. cypellocarpa × globulus hybrids from Victoria are as follows: one was collected from a mature native tree in mixed E. globulus/cypellocarpa forest; two were tentatively identified as “possible hybrids?” (with a high degree of uncertainty based on morphology) from beside an E. globulus plantation; and four were from a population that has been speculated to be a phantom hybrid zone [46, 47]. Four putative E. viminalis × globulus samples were collected in Tasmania, one identified on the basis of seedling morphology in open-pollinated progeny from a native E. globulus tree; the other samples were collected from mature native trees with intermediate bud and capsule morphology. Details of the six pure species are given in Supplementary Material 1 (available online at http://dx.doi.org/10.1155/2014/650202), and Figure 1 shows distribution maps for each species, the collection location of samples in this study, and the distribution of E. globulus plantations in Australia.

2.2. Molecular Methods

A total of 606 samples were genotyped, and 27 samples were repeated to enable assessment of the accuracy and repeatability of allele calling. For the samples collected in this study, total genomic DNA was extracted from the frozen leaf samples using the CTAB protocol of J. J. Doyle and J. L. Doyle [48] with the adjustments used by McKinnon et al. [49]. The quality and quantity of DNA were assessed using gel electrophoresis and comparison with Lambda HindIII molecular weight standard. Additionally, because of quarantine restrictions preventing the importation of eucalypt material from New South Wales to Tasmania, the 45 E. cypellocarpa samples collected in New South Wales were sent fresh to the Australian Genome Research Facility, South Australia, for DNA extraction and quantification. Ten microsatellite loci were used for genotyping: four (EMCRC2, EMCRC7, EMCRC8, and EMCRC11) designed by Steane et al. [50], two (EMBRA11 and EMBRA16) designed by Brondani et al. [51], and four (EMBRA23, EMBRA30, EMBRA38, and EMBRA63) designed by Brondani et al. [52]; the primer sequences for all loci can be found in their respective references. These loci have been mapped and there is no evidence of linkage between them (J. S. Freeman, personal communication). In order to allow simultaneous analysis of different loci, the forward primers were labelled at their 5′ end with the fluorescent dyes NED, 6-FAM, PET, or HEX (PerkinElmer Applied Biosystems, Foster City, CA, USA). The ten loci were multiplexed in three mixes: mix 1 included EMCRC2, EMBRA63, and EMBRA11 (using 0.2 μM of primer/reaction, forward and reverse combined); mix 2 included EMBRA10 (0.2 μM/reaction), EMCRC11 (0.4 μM/reaction), EMBRA23 (0.2 μM/reaction), and EMCRC7 (0.4 μM/reaction); and mix 3 included EMBRA30, EMBRA38, and EMCRC8 (all at 0.4 μM/reaction).
The PCRs were performed using a QIAGEN Multiplex PCR kit (Hilden, Germany) according to the manufacturer’s specifications for 5 μL reactions using approximately 5 ng of genomic DNA per reaction. Thermocycler conditions followed Bloomfield et al. [53] with annealing temperatures of 59°C for mix 1, and 58°C for mix 2 and mix 3. The PCR product was diluted 1 in 10 in H2O and then 1 μL of that dilution was dried at 50°C. Fragment separation was undertaken on an ABI3730 DNA analyser (Applied Biosystems Hitachi) by the Australian Genome Research Facility, South Australia. Allele scoring followed Bloomfield et al. [53]. The assigned genotypes of the 27 repeated samples (which were scored blindly) were compared at each locus to obtain a measure of repeatability (number of allelic errors/number of alleles compared).
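
The repeatability measure described above (number of allelic errors divided by number of alleles compared) can be illustrated with a minimal Python sketch. The function name, the dictionary layout of the genotype calls, and the decision to skip missing calls are assumptions made for illustration; they are not taken from the study.

```python
from collections import Counter

def allelic_repeatability(first_calls, repeat_calls):
    """Return (error_rate, repeatability) for two sets of genotype calls.

    Both arguments map sample -> locus -> (allele_1, allele_2), or None
    where the genotype is missing (hypothetical layout).
    """
    compared = 0
    errors = 0
    for sample, loci in first_calls.items():
        for locus, alleles in loci.items():
            repeat = repeat_calls.get(sample, {}).get(locus)
            if alleles is None or repeat is None:
                continue  # assumption: missing calls are not compared
            # Count alleles shared between the two calls, ignoring order.
            shared = sum((Counter(alleles) & Counter(repeat)).values())
            compared += 2
            errors += 2 - shared
    if compared == 0:
        raise ValueError("no genotypes available for comparison")
    error_rate = errors / compared
    return error_rate, 1.0 - error_rate
```

In this sketch a heterozygote scored as (100, 104) in one run and (104, 108) in the repeat counts as one allelic error out of two alleles compared.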

2.3. Analytical Approach
2.3.1. Assessment of Genetic Differentiation between Species

Deviation from Hardy-Weinberg equilibrium within species was assessed in GENEPOP version 4.2 [54, 55]. Pairwise FST and a standardised version of GST (with a range from 0 to 1; 47) were calculated for each species pair in GENALEX [56] using Analysis of Molecular Variance (AMOVA), which also tests the statistical significance of the comparisons. This analysis used the default parameters for the AMOVA function except that the “interpolate missing data” function was switched on and the number of permutations was increased to 9999.

All 559 pure individual genotypes were run in STRUCTURE using the admixture model (which was used in all analyses) without a priori population information. This analysis used a burn-in of 50,000 Markov chain Monte Carlo (MCMC) iterations, followed by a run of 100,000 data-generating MCMC iterations, with all other program parameters set to default. A range of K from 1 to 10 was used, with each analysis repeated 10 times. STRUCTURE HARVESTER [57] was used to calculate the mean likelihood of K [17] and the log likelihood method, ΔK [58], to determine the most appropriate number of genetic clusters (K).
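
For readers unfamiliar with the two summaries used to choose the number of genetic clusters (K), the following Python sketch shows how the mean log likelihood per K and the ΔK statistic can be computed from replicate runs. It assumes the commonly used form ΔK = |L(K+1) − 2L(K) + L(K−1)| / sd[L(K)], calculated from replicate means; the function name and input layout are hypothetical, and this is not the STRUCTURE HARVESTER source code.

```python
from statistics import mean, stdev

def delta_k(log_likelihoods):
    """Return (mean_l, delta) for replicate STRUCTURE-style runs.

    log_likelihoods maps K -> list of log likelihood values, one per
    replicate run (hypothetical layout). delta maps K -> ΔK for every K
    that has both neighbours (K - 1 and K + 1) and a non-zero standard
    deviation across replicates.
    """
    mean_l = {k: mean(values) for k, values in log_likelihoods.items()}
    delta = {}
    for k in sorted(log_likelihoods):
        if k - 1 not in mean_l or k + 1 not in mean_l:
            continue  # ΔK is undefined at the ends of the K range
        sd = stdev(log_likelihoods[k])
        if sd == 0:
            continue  # avoid division by zero when replicates are identical
        second_order = abs(mean_l[k + 1] - 2 * mean_l[k] + mean_l[k - 1])
        delta[k] = second_order / sd
    return mean_l, delta
```

With K run from 1 to 10, as here, ΔK is defined only for K = 2 to 9, which is one reason the mean likelihood curve is inspected alongside it.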

2.3.2. Calculating Detection Power with Simulated and Pedigreed Hybrids

The program HYBRIDLAB [59] was used to generate a series of simulated hybrid generations to assess the accuracy of the STRUCTURE technique. For each parental combination (i.e., E. globulus and any of the other five species (parent-2)), 300 individuals were simulated, including 50 each of the following: parent-globulus, parent-2, F1, F2, backcross to E. globulus, and backcross to parent-2. In order to check that the number of hybrid samples did not affect the accuracy of assignment, the pairs with the highest and lowest level of differentiation (E. globulus × camaldulensis and E. globulus × cypellocarpa, resp.) were also analysed using only 10 replicates of each simulated hybrid generation (keeping 50 simulated parental samples). This had very little effect on the hybrid assignments, so only the data for n = 50 are presented. In Tasmania and Gippsland native populations of E. globulus and E. ovata occur adjacent to E. nitens plantations, and distinguishing E. globulus × ovata from E. nitens × ovata juveniles based on morphology would be very difficult. Therefore a three-way simulation was run involving E. globulus × ovata (), E. nitens × ovata (), and the three pure parental populations, so as to assess the ability to detect hybrid parentage between these three species in the field.
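
The logic of simulating hybrid classes from parental allele frequencies can be sketched in Python as follows. This is an illustrative approximation under stated assumptions (unlinked loci, random mating, alleles drawn in proportion to their frequency); the function names and data layout are hypothetical and this is not the HYBRIDLAB algorithm itself.

```python
import random

def parental_gamete(freqs, rng):
    """Draw one allele per locus from a species' allele frequencies."""
    return {locus: rng.choices(list(alleles), weights=list(alleles.values()))[0]
            for locus, alleles in freqs.items()}

def individual_gamete(genotype, rng):
    """Draw one of the two alleles at each locus of an individual."""
    return {locus: rng.choice(pair) for locus, pair in genotype.items()}

def join(gamete_1, gamete_2):
    """Combine two gametes into a diploid genotype."""
    return {locus: (gamete_1[locus], gamete_2[locus]) for locus in gamete_1}

def simulate_classes(freq_a, freq_b, n=50, seed=0):
    """Simulate n F1, F2, and backcross individuals for species A and B."""
    rng = random.Random(seed)
    f1 = [join(parental_gamete(freq_a, rng), parental_gamete(freq_b, rng))
          for _ in range(n)]
    f2 = [join(individual_gamete(rng.choice(f1), rng),
               individual_gamete(rng.choice(f1), rng)) for _ in range(n)]
    bc_a = [join(individual_gamete(rng.choice(f1), rng),
                 parental_gamete(freq_a, rng)) for _ in range(n)]
    bc_b = [join(individual_gamete(rng.choice(f1), rng),
                 parental_gamete(freq_b, rng)) for _ in range(n)]
    return {"F1": f1, "F2": f2, "BC_A": bc_a, "BC_B": bc_b}
```

Here `freq_a` and `freq_b` map each locus to a dictionary of allele frequencies for the two parental species; simulated parental individuals could be generated the same way by joining two gametes from the same species.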

Each pair of parental and simulated hybrid populations, and any pedigreed hybrid samples, were then analysed in STRUCTURE using K = 2. After checking the species groups were correctly identified with no a priori information [17], the parental species were used to define the two genetic clusters (the USEPOPINFO method), and the genotype membership (q) of hybrid samples was allocated using the admixture model. This analysis was run five times per combination. The default parameters were used except that the “allele frequencies updated using individuals with POPFLAG = 1 ONLY” option was selected. A burn-in of 50,000 MCMC iterations was followed by 100,000 data generation runs. The program CLUMPP [60] was used to merge the five STRUCTURE runs, and the merged data were used for hybrid allocation. A cut-off of q ≥ 0.2 was used to identify hybrids. This cut-off means that if q ≥ 0.2 in both clusters the individual is classified as a hybrid, and if q < 0.2 in one cluster, then the individual is indistinguishable from the parental species (i.e., the cluster with q > 0.8). The proportion of simulated individuals correctly assigned as hybrids at q > 0.2 was used to estimate the hybrid detection power.
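
The two-way allocation rule can be written as a short Python sketch. The function names and the tuple layout of the membership values are hypothetical; only the 0.2 cut-off and the rule itself come from the text above.

```python
def classify_two_way(q_a, q_b, cutoff=0.2):
    """Classify one individual from its membership in clusters A and B."""
    if q_a >= cutoff and q_b >= cutoff:
        return "hybrid"
    # Below the cut-off in one cluster: indistinguishable from a parent.
    return "parent_a" if q_a > q_b else "parent_b"

def detection_power(memberships, cutoff=0.2):
    """Proportion of simulated hybrids that are classified as hybrids.

    memberships is a list of (q_a, q_b) pairs for individuals known to
    be hybrids (for example, one simulated hybrid generation).
    """
    calls = [classify_two_way(q_a, q_b, cutoff) for q_a, q_b in memberships]
    return calls.count("hybrid") / len(calls)
```

Applied to each simulated generation in turn, `detection_power` gives the kind of per-generation accuracy reported in the Results.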

2.3.3. Classification of the Putative Hybrids Using STRUCTURE

The STRUCTURE protocol above was used to classify the putative hybrid samples for comparison to the pedigreed and simulated results. Finally, to test the model when maternity is completely unknown, all pure samples, the simulated parental samples, simulated F1, and the pedigreed and putative hybrids were all run together in a six species analysis. Classification of hybrids given K = 6 is slightly more complicated. Classification as a hybrid was considered correct if the two true parental q values summed to at least 0.67 (i.e., more than two thirds of the total possible q), both true parents contributed q > 0.2, and no other species contributed more than either parent. The program DISTRUCT [61] was used to produce individual genotype membership plots for comparison of pedigreed, simulated, and putative hybrid samples.
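
The six-way classification criteria can be sketched in Python as below. The function name, the dictionary layout, and the species keys are hypothetical. The sketch also interprets "no other species contributed more than either parent" as meaning that no third species exceeds the smaller of the two parental contributions, which is an assumption about the intended rule.

```python
def correct_six_way_hybrid(q, parent_a, parent_b, min_sum=0.67, min_each=0.2):
    """Check whether a known hybrid is correctly classified at K = 6.

    q maps species name -> membership value for one individual.
    """
    q_a, q_b = q[parent_a], q[parent_b]
    others = [value for species, value in q.items()
              if species not in (parent_a, parent_b)]
    largest_other = max(others, default=0.0)
    return (q_a + q_b >= min_sum          # parents sum to at least 0.67
            and q_a > min_each            # each parent contributes > 0.2
            and q_b > min_each
            and largest_other <= min(q_a, q_b))  # assumed reading of the rule
```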

3. Results

In the 606 individuals genotyped at 10 loci we found 344 different alleles. There was 1.5% missing data, and repeatability was 93%. Eucalyptus camaldulensis had the highest average number of alleles per locus (25.2) and more private alleles than any other species (28). Eucalyptus ovata and E. nitens had the lowest genetic diversity (expected heterozygosity = 0.82 and 0.83, resp.), while E. cypellocarpa had the highest (0.91). The hybrid group had the lowest number of private alleles and the highest observed heterozygosity (full population genetic details are in Supplementary Material 2). As expected given our range-wide sampling of the species, there was significant departure from Hardy-Weinberg equilibrium (HWE) at several loci, but 24 of the 60 locus-population combinations (40%) were in HWE, and there were no clear patterns in departure between species. STRUCTURE assumes HWE, and although other authors have found the program to be robust to deviations from HWE [62], this departure from the assumptions makes the simulation analysis particularly important for determining the accuracy of our hybrid allocations. According to the AMOVA, most variation in the dataset was within species (92%) with just 8% partitioned between species. The pairwise FST estimates were low (ranging from 0.027 to 0.112) but were all highly significant (Table 2). The standardised GST showed more intermediate levels of differentiation (0.26 to 0.71) than FST, but the pattern of pairwise differentiation was similar between the two estimates. Under both measures the lowest levels of molecular differentiation were between E. cypellocarpa and E. viminalis and between E. cypellocarpa and E. globulus, while the most well-differentiated species pairs were E. ovata and E. nitens and E. camaldulensis and E. ovata (Table 2).

The ability of STRUCTURE to distinguish between the six species was tested by using no a priori species information. The mean likelihood of K indicated a plateau in the likelihood surface corresponding with K = 6 (suggesting six groups), while the ΔK method of [58] showed a major genetic split in the data at K = 4, with secondary peaks at K = 5 and 6 (Figure 2). At K = 4 E. cypellocarpa and E. globulus clustered together, as did E. ovata and E. viminalis, while, at K = 5, E. cypellocarpa clustered with E. viminalis. However, at K = 6 the clusters corresponded to the species groups, which is the most biologically meaningful result (see Supplementary Material 2). This indicates that although higher levels of structure seem to exist (at K = 4 and 5, which was also evident in the low differentiation measures between some pairs of species), the dataset does differentiate the six unique species groups at K = 6.

In the two-way STRUCTURE analysis using a priori species information the accuracy of detecting both simulated and pedigreed F1 hybrids was high, at 98% for each (Table 3). In three of the simulated and three of the pedigreed combinations 100% of F1 hybrids were detected (Table 3). Detectability of simulated F2 hybrids was slightly lower at 93% (Table 3). The overall accuracy of detecting simulated parental individuals was slightly lower again (91%), which was due mainly to difficulty in detecting simulated parents in the E. cypellocarpa × globulus and E. viminalis × globulus combinations (Table 3), which were also the least well-differentiated species pairs (Table 2). The lowest detectability in both simulated and pedigreed hybrids was in the backcross generations, where the accuracy fell to just 33.4% for detecting the only pedigreed backcross combination (E. nitens × globulus; ; Table 3; Figure 3). Figure 3 shows that although there was some variation in the simulated groups, the patterns across the different generations and species combinations are consistent with theoretical expectations. This was clear in the group means (presented in Supplementary Material 3). For example, in the E. camaldulensis × globulus combination the means of the F1 assignments were 0.53 (E. camaldulensis cluster) and 0.47 (E. globulus cluster), while the means for the backcrosses to E. camaldulensis were 0.74 and 0.26. These are very close to the theoretical allele frequencies expected for F1 and BC generations (i.e., 0.5 to 0.5 and 0.75 to 0.25, resp.), and similar theoretically consistent mean values were obtained for all combinations (Supplementary Material 3).
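
The theoretical expectations quoted above (0.5 to 0.5 and 0.75 to 0.25) follow from each offspring inheriting half of its genome from each parent. A minimal Python sketch of this arithmetic, with hypothetical function and variable names, is:

```python
def offspring_ancestry(q_mother, q_father):
    """Expected proportion of species-A ancestry in the offspring."""
    return (q_mother + q_father) / 2

pure_a, pure_b = 1.0, 0.0                      # pure parental species
f1 = offspring_ancestry(pure_a, pure_b)        # F1: A x B
f2 = offspring_ancestry(f1, f1)                # F2: F1 x F1
bc_a = offspring_ancestry(f1, pure_a)          # backcross to species A
bc_b = offspring_ancestry(f1, pure_b)          # backcross to species B
```

F1 and F2 individuals share the same expected proportion and differ only in their variance around it, which is why a mean membership near 0.5 does not by itself separate the two generations.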

Of the 15 putative hybrid samples assessed, 12 were classified as hybrids. The remaining three putative hybrids were indistinguishable from their pure parents (Table 4). The q values of all 12 samples assigned as hybrids were outside the 95% confidence intervals (CIs) of both parents, indicating they are unlikely to be misclassified parental samples. However, one E. camaldulensis × globulus sample showed stronger affinities to E. globulus, with q values of 0.335 (95% CIs = 0.108, 0.587) to E. camaldulensis and 0.665 (95% CIs = 0.413, 0.892) to E. globulus, but the 95% CIs did include the simulated F1 means (E. camaldulensis cluster mean = 0.530; E. globulus cluster mean = 0.470). The remaining three E. camaldulensis × globulus samples had approximately 0.5 assignment to each parent. The three individuals that were indistinguishable from one parent had overlapping 95% CIs with that parent.

The three-way simulation involving E. ovata × E. nitens and E. ovata × E. globulus showed that the marker set could accurately identify parental contributions when there are multiple parent-hybrid combinations in the dataset (Figure 3). Figures 3(f) and 3(g) show that there is an increase in the noise of the admixture assignments as the number of genetic clusters increases. This is more apparent in the six-way analysis (Figure 3), where for some species, in particular E. cypellocarpa and E. viminalis, the accuracy of assignment is markedly reduced (Figure 3). That said, the assignment of pedigreed samples is still relatively efficient at 83.3%. The reduction in efficiency of the six-way analysis can also be seen in the classification of the putative hybrid samples, with four samples that were classified as hybrids under the two-way analysis now being indistinguishable from one parental species (Table 4).

4. Discussion

4.1. Performance of the Approach

This study successfully developed and tested a microsatellite database and Bayesian modelling approach for a group of six species, so as to enable the detection of exotic hybrids from E. globulus (and possibly E. nitens) plantations when the maternal native species is known. This required sufficient differentiation in allele frequencies so that individuals with genotypes that are admixed between two species could be detected. Yet, the level of differentiation between species estimated from the inbreeding coefficient (FST) is low in comparison to a range of other eucalypts [63–66], particularly considering that the comparison is between taxonomically distinct species. However, because the range-wide sampling strategy used here covered geographically discrete populations of these widespread species (Figure 1), it probably captures significant within-species differentiation, which could subsequently reduce FST values between species [67–70]. Nevertheless, all pairwise FST values were highly significant and the standardised metric indicated intermediate species differentiation. The level of differentiation in the dataset was sufficient for STRUCTURE to distinguish between the species and enabled the accurate detection of hybrids. Patterns in hybrid detection were consistent with patterns of FST between pairs of taxa. This is consistent with theoretical expectations [17] and empirical modelling [71] in that species with the highest differentiation (e.g., E. camaldulensis and E. globulus) showed the highest accuracy in hybrid detection and vice versa.

For all species combinations in the two-way analysis, the accuracy and likelihood of identifying simulated and pedigreed F1 hybrids were high, and comparable with similar studies in forest trees [24, 62, 72]. Parental and F2 generations were also accurately identified, although the success in detecting backcross hybrids was lower. Several studies have had similar problems identifying backcross generations using the same approach [4, 24, 26]. For example, Lepais et al. [72] reported that 32% of simulated oak backcrosses were misclassified as pure species despite strong species differentiation and the use of a lower cut-off (q > 0.1). If q > 0.1 was used here, then the efficiency in detecting simulated E. camaldulensis × globulus backcrosses would increase from 70% to 96%; however, there would be a parallel increase in the number of pure parents incorrectly identified as hybrids, rising from 1% to 36%. This trade-off has been documented by several authors [26, 71–73], and simulation studies show that the number of markers necessary for accurate and efficient identification of backcross hybrids could be as high as 48 microsatellite loci given an FST of 0.21 [71]. Vähä and Primmer [71] tested a range of q values and found that q > 0.2 most effectively balanced efficiency and accuracy. However, detection of exotic backcross hybrids is not currently necessary in the E. globulus system. The Australian E. globulus estate is young, with most plantations nearing the end of their first 10–15 year rotation [30, 74], making it highly unlikely that mature exotic hybrids exist to produce backcrosses.
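
The trade-off between detecting more backcrosses and misclassifying more pure parents can be explored by recomputing both rates at different cut-offs. The Python sketch below assumes two-cluster membership values in which a single value per individual (membership in one cluster) is sufficient; the function name is hypothetical and any input values supplied to it are illustrative rather than data from this study.

```python
def rates_at_cutoff(hybrid_q, parent_q, cutoff):
    """Return (detection rate, false-positive rate) at a given cut-off.

    hybrid_q and parent_q are lists of membership values in one cluster
    for known hybrids and known pure parents, respectively. With two
    clusters, membership in the other cluster is 1 - q.
    """
    def called_hybrid(q):
        # An individual is called a hybrid when both clusters reach the cut-off.
        return min(q, 1.0 - q) >= cutoff

    detected = sum(called_hybrid(q) for q in hybrid_q) / len(hybrid_q)
    false_positive = sum(called_hybrid(q) for q in parent_q) / len(parent_q)
    return detected, false_positive
```

Calling this function over a grid of cut-offs (for example 0.1 and 0.2) with simulated backcross and parental values shows directly how lowering the cut-off raises both rates.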

If the detection of backcrosses does become necessary, more loci could be added to the existing set, or a different marker system could be used [26, 71]. For example, Diversity Array Technology (DArT) has recently been developed for a range of eucalypt species including E. globulus, E. camaldulensis, and E. nitens. The DArT system produces hypervariable dominant markers [75], with over 5,000 polymorphic loci currently available in eucalypts [75, 76]. Despite the lower information content per marker due to dominance, using such large marker datasets and similar methods to those used here would presumably lead to highly accurate assignments. Indeed much smaller DArT datasets (1122 markers) have been shown to outperform similar sized microsatellite (8 loci) datasets in other studies employing Bayesian clustering [77]. Alternatively, with so many markers it may be possible to identify subsets of species-specific loci or alleles that could differentiate hybrid generations with greater power than microsatellite-based systems, without the need to assay all 5,000 loci. For example, Boecklen and Howard [21] found that four or five independent species-specific markers can accurately identify first generation backcrosses. However, development of such marker systems is time consuming and expensive, and their deployment will depend on a trade-off between cost, time, and the required detection power [78]. The system developed here is effective and cost efficient (lab costs are approximately $15 AU/new sample, including DNA extraction and microsatellite assay, but not technician time) given the current requirement for identifying F1 hybrids.

In eastern and southern Tasmania as well as parts of Gippsland in Victoria, E. nitens plantations occur within the native range of both E. globulus and E. ovata. Because of the very similar juvenile morphology of E. globulus and E. nitens, this could result in situations where the parentage of hybrids detected with E. ovata would be ambiguous. Exotic hybrids between E. ovata and both plantation species are well known [31, 79] and do show similar morphology. The three-way simulation (E. globulus, E. nitens, and E. ovata) here showed the utility of the microsatellite based approach in overcoming ambiguity in this situation, and it could be an important management tool for distinguishing between exotic and natural hybridisation where E. nitens grows within the native range of E. globulus.

The six-way analysis is in some ways the ultimate test of the approach and will be particularly useful for assessing putative hybrids collected in the field where maternity is unclear. The accuracy clearly decreased as the number of species in the model increased (i.e., from two to three to six). However, despite this reduction, the six-way model could still identify over 80% of hybrids. Most other published studies assess two [4, 23, 71, 80–82], three [26, 73], or occasionally four [72] species when investigating hybrid parentage, and in reality it is unlikely that a situation will arise where all six species from this model are potential parents. An assessment of the species growing where the hybrid was collected would probably narrow down the number of potential parents. Also, most exotic plantation hybrids identified in the field are found among pure seedlings of the native species [31, 83], likely enabling the identification of a single putative maternal species. In an operational context where a putative exotic hybrid has been identified in the field, an effective approach may be to run a full six-way analysis to rule out other species contributing pollen, which can travel long distances [84], then reduce the number of species to those found in the vicinity of the putative hybrid.

4.2. Allocation of Putative Hybrids

The putative hybrid samples assessed here came from a range of situations including native forests where the generation of the putative hybrids was unknown. This is a more challenging problem than identifying hybrids around E. globulus plantations, where any hybrids can currently be assumed to be F1. However, by incorporating additional information, including demographic details at the collection site, and morphology, it is possible to improve allocation confidence and estimate the hybrid generation of the samples. Of the 15 putative hybrids, 12 were classified as hybrids and three could not be distinguished from their pure parents in the two-way analysis. Of the three indistinguishable samples, the two putative E. cypellocarpa hybrids are probably correctly classified as pure E. cypellocarpa. These two putative hybrids were collected from seedlings beside a 10-year-old plantation, ruling out the possibility that they are backcrosses. The samples were only tentatively classified as “possible” (with a low degree of certainty) based on the degree of glaucousness, which can be a variable trait [12]. The putative E. viminalis × globulus hybrid that was classified as E. globulus is less clear-cut. This sample was identified in open-pollinated E. globulus seed, collected from a native forest with no other cross-compatible eucalypts nearby. The sample showed distinctively intermediate morphology on multiple traits, consistent with known hybrids between these species [33]. The likelihood of random morphological deviations on multiple traits leading to intermediate characteristics is low, and the morphology is more easily explained by interspecific hybridisation [85, 86]. Therefore, considering the model inaccuracy when identifying backcrosses, it is possible that this sample is actually a first or perhaps later generation backcross.

Incorporating additional site and demographic information also indicates that several samples identified as hybrids might actually be advanced generation hybrids. For example, the putative E. cypellocarpa × E. globulus hybrids collected at Mallacoota come from one of the first reported examples of a phantom hybrid zone [46, 47, 87]. All trees in the population appear to be intermediate to varying degrees between E. cypellocarpa and E. globulus, but despite E. cypellocarpa occurring nearby, the nearest native E. globulus tree is 6.4 km away, hence the “phantom” hybrid zone [88]. After morphological and chemical analysis, a previous study concluded that the population is most likely of hybrid origin and represents a genetic remnant of the past distribution of E. globulus [46]. The current population is at sea level and it was hypothesised that the E. globulus source population was probably flooded when sea level rose after the last glacial maximum [46]. This situation would result in the population being made up mainly of backcrosses (to E. cypellocarpa) or hybrids. The four samples analysed from this population do appear to fit this expectation, with one sample being consistent with an F1, F2, or a backcross, and the other three being most similar to backcrosses towards E. cypellocarpa (Figure 3 and see Supplementary Material 3 for more detail and a discussion of all putative hybrid samples).

5. Conclusion

The marker set and Bayesian modelling approach implemented here accurately identified simulated and pedigreed first generation hybrids, which was the aim of the study. The system was tested with more challenging scenarios from mature native forests that possibly included advanced generation hybrids. Despite this, it was concluded that 14 of the 15 unknown samples were correctly allocated, and one somewhat ambiguous sample was possibly an advanced generation hybrid. The approach highlighted the power of using multiple lines of evidence, including morphology, the demographic setting of the native forest where the sample was collected, and molecular data, in classifying putative hybrids. The combined evaluation undertaken here has provided validation of natural advanced generation hybrids between E. globulus and E. cypellocarpa and between E. globulus and E. viminalis. It also provided confirmation of exotic hybridisation between E. globulus plantations and native E. camaldulensis. The database is now available for deployment in the detection of exotic hybrids from plantations in Australia, and in the future could be built upon to include other species and used for comparison with other hybrid systems.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.


Acknowledgments

For assistance with sample collection the authors thank Martyn Lavery. For providing DNA and/or tissue samples for DNA extraction the authors thank Simon Southerton, Dean Williams, Sandra Hetherington, Kelsey Joyce, James Marthick, Susan Foster, Jules Freeman, Dorothy Steane, and Corey Hudson. For assistance in the lab the authors thank Sascha Wise. For providing access to plantations the authors thank Hancock Victoria Plantations and Australian Bluegum Plantations. The authors thank the Australian Bureau of Agricultural and Resource Economics and Sciences, Canberra, particularly Mijo Gavran, for information and for providing GIS data. This research was funded by Forests and Wood Products Australia, the CRC for Forestry, and the Australian Research Council (DP130104220; LP0455522).

Supplementary Materials

Supplementary material 1: Describes the six study species and the sample collection for each.

Supplementary material 2: Provides additional results.

Supplementary material 3: Discusses the classification of putative hybrids in detail. It also gives the tabulated model results (including q values) summarizing the allocation of simulated, pedigreed and putative hybrid samples.
