Review Article

Importance of Genetic Diversity Assessment in Crop Plants and Its Recent Advances: An Overview of Its Analytical Perspectives

Table 1

Some basic statistical concepts applied to genomic data for genetic diversity assessment.

Concept terms Description/features Formulae/pros/cons

Band-based approaches
The easiest way to analyze and measure diversity, focusing on the presence or absence of bands in the banding pattern.
Routinely used at the individual level.
Relies entirely on marker type and polymorphism.

(1) Measuring polymorphism
Observe the total number of polymorphic bands (PB) and then calculate the percentage of polymorphic bands.
This “band informativeness” (Ib) can be represented on a scale ranging from 0 to 1 according to the formula
$Ib = 1 - (2 \times |0.5 - p|)$,
where $p$ is the portion of genotypes containing the band.
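As a minimal sketch of these two band-based measures, assuming the form Ib = 1 − 2|0.5 − p| and a hypothetical 0/1 band matrix (rows are genotypes, columns are bands), the calculation might look as follows in Python. The function names and the sample scores are illustrative only.

```python
def band_frequencies(matrix):
    """For each band (column), the portion of genotypes (rows) showing it."""
    n_genotypes = len(matrix)
    return [sum(col) / n_genotypes for col in zip(*matrix)]

def percent_polymorphic(matrix):
    """Percentage of bands present in some, but not all, genotypes."""
    freqs = band_frequencies(matrix)
    polymorphic = [p for p in freqs if 0 < p < 1]
    return 100 * len(polymorphic) / len(freqs)

def band_informativeness(p):
    """Assumed form: Ib = 1 - 2*|0.5 - p| (0 if monomorphic, 1 if p = 0.5)."""
    return 1 - 2 * abs(0.5 - p)

# Hypothetical scores: 4 genotypes x 3 bands (1 = band present, 0 = absent)
scores = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [1, 0, 0],
]
freqs = band_frequencies(scores)               # [1.0, 0.5, 0.25]
ib = [band_informativeness(p) for p in freqs]  # [0.0, 1.0, 0.5]
pb_percent = percent_polymorphic(scores)       # 2 of 3 bands are polymorphic
```

In this toy data set the first band is monomorphic and therefore uninformative, while the band carried by exactly half of the genotypes reaches the maximum informativeness of 1.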

(2) Shannon’s information index ($I$)
It is called the Shannon index of phenotypic diversity and is widely applied.
$I = -\sum_i p_i \log_2 p_i$, where $p_i$ is the frequency of the $i$th band or allele class.
These methods depend on the extraction of allelic frequencies.
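The Shannon index can be illustrated with a short Python sketch. It assumes the commonly used form I = −Σ p_i log2 p_i; the base of the logarithm and the example frequencies are assumptions, not values taken from this table.

```python
import math

def shannon_index(frequencies):
    """I = -sum(p_i * log2(p_i)); zero frequencies contribute nothing."""
    return -sum(p * math.log2(p) for p in frequencies if p > 0)

# Hypothetical frequency profiles with four classes each
equal = shannon_index([0.25, 0.25, 0.25, 0.25])   # evenly spread: 2.0
skewed = shannon_index([0.97, 0.01, 0.01, 0.01])  # one dominant class
```

The evenly spread profile gives the highest value for four classes, while the profile dominated by one class gives a much lower one, which is the sense in which the index measures diversity.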

(3) Similarity coefficients
Use similarity coefficients or dissimilarity coefficients (the complement of similarity).
The Jaccard coefficient ($J$) only takes into account the bands present in at least one of the two individuals. It is therefore unaffected by homoplasic absent bands (where the absence of the same band is due to different mutations).
The simple-matching index ($SM$) maximizes the amount of information provided by the banding patterns by considering all scored loci.
The Nei and Li index ($SD$) doubles the weight for bands present in both individuals, thus giving more attention to similarity than dissimilarity.
(i) Jaccard similarity coefficient or Jaccard index: $J = a/(a + b + c)$.
(ii) Simple matching coefficient or index: $SM = (a + d)/n$.
(iii) Sørensen-Dice index or Nei and Li index: $SD = 2a/(2a + b + c)$,
where $a$ is the number of bands (1s) shared by both individuals; $b$ is the number of positions where individual $i$ has a band, but $j$ does not; $c$ is the number of positions where individual $j$ has a band, but $i$ does not; $d$ is the number of positions where neither individual has a band (shared 0s); and $n = a + b + c + d$ is the total number of bands (0s and 1s).
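The three coefficients differ only in how they weight shared presences and shared absences, which a small Python sketch makes concrete. The counts a, b, c, and d follow the usual definitions (shared presences, bands only in the first individual, bands only in the second, shared absences); the two band profiles are hypothetical.

```python
def band_counts(x, y):
    """Return a (shared 1s), b (only x), c (only y), d (shared 0s)."""
    a = sum(1 for i, j in zip(x, y) if i == 1 and j == 1)
    b = sum(1 for i, j in zip(x, y) if i == 1 and j == 0)
    c = sum(1 for i, j in zip(x, y) if i == 0 and j == 1)
    d = sum(1 for i, j in zip(x, y) if i == 0 and j == 0)
    return a, b, c, d

def jaccard(x, y):
    """J = a / (a + b + c); shared absences are ignored.
    Undefined (division by zero) if neither individual has any band."""
    a, b, c, _ = band_counts(x, y)
    return a / (a + b + c)

def simple_matching(x, y):
    """SM = (a + d) / n; shared absences count as matches."""
    a, b, c, d = band_counts(x, y)
    return (a + d) / (a + b + c + d)

def sorensen_dice(x, y):
    """SD = 2a / (2a + b + c); shared presences get double weight."""
    a, b, c, _ = band_counts(x, y)
    return 2 * a / (2 * a + b + c)

# Hypothetical band profiles for two individuals (1 = present, 0 = absent)
ind_i = [1, 1, 0, 1, 0, 0]
ind_j = [1, 0, 0, 1, 1, 0]
# Here a = 2, b = 1, c = 1, d = 2
j = jaccard(ind_i, ind_j)           # 2 / 4
sm = simple_matching(ind_i, ind_j)  # 4 / 6
sd = sorensen_dice(ind_i, ind_j)    # 4 / 6
```

For the same pair of profiles the Jaccard value is the lowest of the three, because it neither counts shared absences nor doubles the weight of shared presences.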

(4) Allele frequency based approaches
Measure variability by describing changes in allele frequencies for a particular trait over time; they are more population oriented than band-based approaches.
These methods depend on the extraction of allelic frequencies from the data.
Accurate estimates of these frequencies strongly influence the indices calculated for further measurements of genetic diversity.

(5) Allelic diversity ($A$)
One of the easiest ways to measure genetic diversity is to quantify the number of alleles present.
Allelic diversity ($A$) is the average number of alleles per locus and is used to describe genetic diversity.
$A = n_i / n_l$,
where $n_i$ is the total number of alleles over all loci and $n_l$ is the number of loci.
It is less sensitive to sample size and rare alleles and is calculated as
ability; it provides information about the dispersal ability of the organism and the degree of isolation among populations.
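The average number of alleles per locus described in row (5) can be sketched in Python as follows. The data layout (one dictionary per individual, mapping a locus name to its two alleles) and all locus and allele names are hypothetical.

```python
def allelic_diversity(genotypes):
    """A = total number of distinct alleles over all loci / number of loci."""
    alleles_per_locus = {}
    for individual in genotypes:
        for locus, alleles in individual.items():
            # Collect every distinct allele seen at this locus
            alleles_per_locus.setdefault(locus, set()).update(alleles)
    total_alleles = sum(len(s) for s in alleles_per_locus.values())
    return total_alleles / len(alleles_per_locus)

# Hypothetical diploid genotypes at two loci
sample = [
    {"locus1": ("A1", "A2"), "locus2": ("B1", "B1")},
    {"locus1": ("A1", "A3"), "locus2": ("B1", "B2")},
    {"locus1": ("A2", "A2"), "locus2": ("B2", "B2")},
]
A = allelic_diversity(sample)  # locus1 has 3 alleles, locus2 has 2: A = 2.5
```

Because every distinct allele counts equally regardless of its frequency, this simple count rises as more individuals are sampled and rare alleles are picked up.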

(6) Effective population size ($N_e$)
It provides a measure of the rate of genetic drift, the rate of loss of genetic diversity, and the rate of increase of inbreeding within a population.
The effective size of a population is an idealized number, since many calculations depend on the genetic parameters used and on the reference generation. Thus, a single population may have many different effective sizes, which are biologically meaningful but distinct from each other.

(7) Heterozygosity ($H$)
There are two types of heterozygosity: observed ($H_O$) and expected ($H_E$).
$H_O$ is the portion of genes that are heterozygous in a population, and $H_E$ is the estimated fraction of all individuals that would be heterozygous for any randomly chosen locus.
Typically, values for $H_E$ and $H_O$ range from 0 (no heterozygosity) to nearly 1 (a large number of equally frequent alleles).
If $H_O$ and $H_E$ are similar (they do not differ significantly), mating in the population is random. If $H_O < H_E$, the population is inbreeding; if $H_O > H_E$, the population has a mating system avoiding inbreeding.
$H_E$ is calculated from the allele frequencies (for dominant markers, the frequency of the null (recessive) allele is estimated as the square root of the frequency of individuals lacking the band) as follows:
$H_E = 1 - \sum_{i=1}^{n} p_i^2$,
where $p_i$ is the frequency of the $i$th allele.
$H_O$ is calculated for each locus as the total number of heterozygotes divided by sample size.
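A minimal Python sketch of both heterozygosity measures at a single codominant locus is given below. It assumes the standard form of expected heterozygosity, 1 − Σ p_i², without any small-sample correction, and the genotype list is hypothetical.

```python
from collections import Counter

def observed_heterozygosity(genotypes):
    """Observed value: number of heterozygotes divided by sample size."""
    het = sum(1 for a1, a2 in genotypes if a1 != a2)
    return het / len(genotypes)

def expected_heterozygosity(genotypes):
    """Expected value: 1 - sum(p_i^2), using observed allele frequencies."""
    counts = Counter(allele for g in genotypes for allele in g)
    total = sum(counts.values())
    return 1 - sum((n / total) ** 2 for n in counts.values())

# Hypothetical diploid genotypes at one locus
locus = [("A", "A"), ("A", "B"), ("B", "B"), ("A", "A")]
h_o = observed_heterozygosity(locus)  # 1 heterozygote out of 4: 0.25
h_e = expected_heterozygosity(locus)  # p_A = 5/8, p_B = 3/8: 0.46875
```

In this toy sample the observed value is below the expected one, the pattern the table associates with inbreeding, although four individuals are far too few to support any real inference.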

(8) $F$-statistics
In population genetics, the most widely applied measurements besides heterozygosity are $F$-statistics, or fixation indices, which measure the amount of allelic fixation by genetic drift.
The $F$-statistics are related to heterozygosity and genetic drift. Since inbreeding increases the frequency of homozygotes, it decreases the frequency of heterozygotes and genetic diversity.
Three indices can be calculated as follows:
$F_{IS} = 1 - (H_I/H_S)$,
$F_{IT} = 1 - (H_I/H_T)$,
$F_{ST} = 1 - (H_S/H_T)$,
where $H_I$ is the average $H_O$ within each population, $H_S$ is the average $H_E$ of subpopulations assuming random mating within each population, and $H_T$ is the $H_E$ of the total population assuming random mating within subpopulations and no divergence of allele frequencies among subpopulations.
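The three fixation indices can be computed directly from three heterozygosity values, as in the Python sketch below. The argument names h_i, h_s, and h_t (average observed heterozygosity within populations, average expected heterozygosity of subpopulations, and expected heterozygosity of the total population) and the input values are hypothetical.

```python
def f_statistics(h_i, h_s, h_t):
    """Return (F_IS, F_IT, F_ST) from three heterozygosity values."""
    f_is = 1 - h_i / h_s  # heterozygote deficit within subpopulations
    f_it = 1 - h_i / h_t  # heterozygote deficit relative to the total
    f_st = 1 - h_s / h_t  # differentiation among subpopulations
    return f_is, f_it, f_st

# Hypothetical heterozygosity values
f_is, f_it, f_st = f_statistics(h_i=0.30, h_s=0.40, h_t=0.50)
# Approximately F_IS = 0.25, F_IT = 0.40, F_ST = 0.20
```

These values satisfy the standard identity (1 − F_IT) = (1 − F_IS)(1 − F_ST), which offers a quick consistency check when all three indices are derived from the same data.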