Abstract

The distribution of biodiversity at multiple sites of a region has been traditionally investigated through the additive partitioning of the regional biodiversity into the average within-site biodiversity and the biodiversity among sites. The standard additive partitioning of diversity requires the use of a measure of diversity, which is a concave function of the relative abundance of species, such as the Gini-Simpson index, for instance. Recently, it was noticed that the widely used Gini-Simpson index does not behave well when the number of species is very large. The objective of this paper is to show that the new weighted Gini-Simpson index preserves the qualities of the classic Gini-Simpson index and behaves very well when the number of species is large. The weights allow us to take into account the abundance of species, the phylogenetic distance between species, and the conservation values of species. This measure may also be generalized to pairs of species and, unlike Rao’s index, this measure proves to be a concave function of the joint distribution of the relative abundance of species, being suitable for use in the additive partitioning of biodiversity. The weighted Gini-Simpson index may be easily transformed for use in the multiplicative partitioning of biodiversity as well.

1. Introduction

Measuring biodiversity is a major, and much debated, topic in ecology and conservation biology. The simplest measure of biodiversity is the number of species from a given community, habitat, or site. Obviously, this ignores how many individuals each species has. The best known measures of biodiversity that also take into account the relative abundance of species are the Gini-Simpson index $GS$ and the Shannon entropy $H$. Both measures have been imported into biology from other fields. Thus, Gini [1] introduced his formula in statistics, in 1912. Much later, after 37 years, Simpson [2] pleaded convincingly in favour of using Gini’s formula as a measure of biodiversity. Shannon was an engineer who introduced his discrete entropy in information theory [3], in 1948, as a measure of uncertainty, inspired by Boltzmann’s continuous entropy from classical statistical mechanics [4], defined half a century earlier. Shannon’s formula was adopted by biologists about 17 years later [5–8], as a measure of specific diversity. This import of mathematical formulas has continued. Rényi, a probabilist, introduced his own entropy [9], in order to unify several generalizations of the Shannon entropy. He was a pure mathematician without any interest in applications, but later, Hill [10] claimed that by taking the exponential of Rényi’s entropy we obtain a class of suitable measures of biodiversity, called Hill’s numbers, which were praised by Jost [11–13] as being the “true” measures of biodiversity. In 1982, Rao [14], a statistician, introduced the so-called quadratic entropy $R$, which in fact has nothing to do with the proper entropy and depends not only on the relative abundance of species but also on the phylogenetic distance between species. This function has also been quickly adopted by biologists as a measure of dissimilarity between the pairs of species. In the last 20 years, a lot of other measures of diversity have been proposed.
According to Ricotta [15], there is currently a “jungle” of biological measures of diversity. However, as mentioned by S. Hoffmann and A. Hoffmann [16], there is no unique “true” measure of diversity.

Starting with MacArthur [5], MacArthur and Wilson [7], and Whittaker [8], the distribution of biodiversity at multiple sites of a region has been traditionally investigated through the partitioning of the regional or total biodiversity, called γ-diversity, into the average within-site biodiversity, called α-diversity, and the between-site biodiversity or diversity turnover, called β-diversity. All these diversities, namely, α-diversity, β-diversity, and γ-diversity, should be nonnegative numbers. Unlike α-diversity and γ-diversity, there is no consensus about how to interpret and calculate β-diversity. According to Whittaker [8], who introduced the terminology, β-diversity is the ratio between γ-diversity and α-diversity. This is the multiplicative partitioning of diversity. According to MacArthur [5], Lande [17], and, more recently, Veech et al. [18], β-diversity is the difference between γ-diversity and α-diversity. This is the additive partitioning of diversity.

Let us assume that in a certain region there are $n$ species and $m$ sites, and that $\theta_k$ is the distribution of the relative abundance of species at site $k$. Let $\lambda_k > 0$ be an arbitrary parameter assigned to site $k$, such that $\lambda_1 + \cdots + \lambda_m = 1$. These parameters may be used to make adjustments for differences (in size, altitude, etc.) between the sites. If no adjustment is made, we take these parameters to be equal, that is, $\lambda_k = 1/m$, for every $k$. If $\mu$ is a nonnegative measure of diversity, which assigns a nonnegative number to each distribution of the relative abundance of the $n$ species, then the corresponding γ-diversity is $\gamma = \mu\left(\sum_k \lambda_k \theta_k\right)$ and the α-diversity is $\alpha = \sum_k \lambda_k \mu(\theta_k)$. The β-diversity is taken to be $\beta = \gamma - \alpha$, in the additive partitioning of diversity, and $\beta = \gamma / \alpha$, in the multiplicative partitioning of diversity. In general, a measure of biodiversity ought to be nonnegative, in which case the corresponding α-diversity and γ-diversity calculated by using such a measure are also nonnegative, as they should be. From a systemic point of view, the β-diversity shows to what extent the total, or regional, diversity differs from the average diversity of the communities/habitats/sites taken together, as a system, reflecting the dissimilarity, or differentiation, between communities/habitats/sites of the region with respect to the individual species. If the measure of biodiversity is a concave function of the distribution of the relative abundance of species $\theta_k$, then the corresponding β-diversity satisfies $\beta \ge 0$, in the additive partitioning of diversity, and $\beta \ge 1$, in the multiplicative partitioning of diversity, for arbitrary parameters $\lambda_k > 0$, $\lambda_1 + \cdots + \lambda_m = 1$.
If a measure of diversity $\mu$ is not a concave function of the distribution of the relative abundance of species $\theta_k$, then the corresponding β-diversity could be negative, in the additive partitioning of biodiversity, or less than 1, in the multiplicative partitioning of biodiversity, for some parameters satisfying $\lambda_k > 0$, $\lambda_1 + \cdots + \lambda_m = 1$, which is absurd. As discussed by Jost [11, 12], if a measure of diversity $\mu$ is not a concave function of the distribution of the relative abundance of species $\theta_k$, we can still attempt the partitioning of biodiversity into α-, β-, and γ-diversity if a new kind of α-diversity may be introduced. This new type of α-diversity would be based on a different way of averaging the diversities of the individual communities/habitats/sites instead of the simple, golden mean value $\alpha = \sum_k \lambda_k \mu(\theta_k)$ from statistics, which works so well for the concave measures of diversity. However, finding such an unorthodox, nonstandard α-diversity when the measure of biodiversity $\mu$ is not concave is not easy. It is also difficult to find a mathematical interpretation for such a new kind of α-diversity. In spite of the passage of time, the most popular measures of biodiversity are still $GS$, $H$, and $R$. Both $GS$ and $H$ are concave functions of the distribution of the relative abundance of species and therefore can be used for doing the additive partitioning of biodiversity. The first two Hill’s numbers are mathematical transformations of $GS$ and $H$, namely, $Hl_1 = 1/(1 - GS)$ and $Hl_2 = \exp(H)$, and are used in the multiplicative partitioning of biodiversity. Recently, however, it was noted [13, 19] that both Shannon’s entropy and the Gini-Simpson index do not behave well when the number of species is very large.
On the other hand, when a distance between species, such as the phylogenetic distance for instance, is also taken into account along with the relative abundance of species, Rao’s index [14] is a widely used measure of dissimilarity. But, unfortunately, Rao’s index is not a concave function of the distribution of the relative abundance of species, for an arbitrary distance matrix between species. Consequently, it proves to be suitable for use in the standard additive partitioning of diversity only in some special cases, but not in general. The objective of this paper is to show that the weighted Gini-Simpson quadratic index, a generalization of the classic Gini-Simpson index of biodiversity, offers a solution to both of the drawbacks just mentioned. Unlike Shannon’s entropy and the classic Gini-Simpson index, this new weighted measure of biodiversity behaves very well even if the number of species is very large. The weights allow us to measure biodiversity when a distance between species and/or conservation values of the species are taken into account, along with the abundance of species. When the phylogenetic distance between species is taken as the weight, the corresponding weighted Gini-Simpson index, unlike Rao’s index, is a concave function of the distribution of the relative abundance of the pairs of species, being suitable for use in the additive partitioning of biodiversity. A simple algebraic transformation makes the weighted Gini-Simpson index suitable for use in the multiplicative partitioning of biodiversity as well.

In Methodology, the weighted Gini-Simpson quadratic index is defined both for individual species and for pairs of species. This new measure of biodiversity is used for calculating the average within-site biodiversity (α-diversity), the intersite biodiversity (β-diversity), and the regional or total biodiversity (γ-diversity). It is also shown that the weighted Gini-Simpson quadratic index may be easily modified, by a simple algebraic transformation, to get a measure of biodiversity suitable for use in the multiplicative partitioning of biodiversity as well. In Section 3, a numerical example is presented, which illustrates how the mathematical formalism should be applied from a practical standpoint.

2. Methodology

2.1. The Weighted Measure of Diversity with Respect to Individual Species

Let us assume that there are $n$ species in a certain community/habitat/site and let $p_i$ be the relative abundance of species $i$ (the number of individuals of species $i$ divided by the total number of individuals in that community/habitat/site). We have diversity if species $i$ is present at that location but other species are found there as well. The probability that the species $i$ is present and there are other species present as well is $p_i(1 - p_i)$. If we take all possible values of $p_i$ from the unit interval $[0, 1]$ into account, the wave function $p_i(1 - p_i)$ corresponding to the species $i$ is a nonnegative, symmetric, bell-shaped, concave function, reaching its maximum value $1/4$ at $p_i = 1/2$. If we sum up these wave functions, for all $n$ species, we obtain the classic Gini-Simpson index $GS(\theta)$ corresponding to the given distribution of the relative abundance of the species $\theta = (p_1, \ldots, p_n)$. Since 1949, this has been considered to be a very good measure of biodiversity. In order to generalize it, we may assign an amplitude $w_i \ge 0$ to the wave function of the species $i$, and the resulting new wave function $w_i p_i (1 - p_i)$ continues to be a nonnegative, symmetric, bell-shaped, concave function of $p_i$, but this time its maximum value is $w_i / 4$. Summing up these wave functions for all the species, we get the weighted Gini-Simpson index:
$$GS_w(\theta) = \sum_i w_i p_i (1 - p_i), \tag{1}$$
which depends both on the distribution of the relative abundance of species $\theta$ and on the nonnegative weights $w = (w_1, \ldots, w_n)$. The concavity of $GS_w(\theta)$ was proven in [20, 21]. The weight $w_i$ could be anything which contributes to the increase in the diversity induced by the species $i$. However, the weights may not depend on the relative abundance of species. If $w_i = n$, for each $i$, then (1) becomes the so-called Rich-Gini-Simpson index $GS_n(\theta)$, introduced in [22], which is essentially dependent on the species richness of the respective community/habitat/site.
If there are some conservation values assigned to the species, $v = (v_1, \ldots, v_n)$, which are positive numbers on a certain scale of values, and the weights are $w_i = n v_i$, the corresponding weighted Gini-Simpson index is denoted by $GS_{n,v}(\theta)$. Obviously, if $w_i = 1$, for each species $i$, then (1) is the classic Gini-Simpson index $GS(\theta)$. An upper bound for $GS_w(\theta)$, which depends only on the maximum weight and the number of species, is
$$0 \le GS_w(\theta) \le \max_i w_i \sum_i p_i (1 - p_i) \le \max_i w_i \left(1 - \frac{1}{n}\right). \tag{2}$$
Denoting by $B_1$ the bound from the right-hand side of the inequality (2), the relative weighted Gini-Simpson index for individual species satisfies $0 \le GS_w(\theta) / B_1 \le 1$. In Appendix A, another bound of $GS_w(\theta)$ is given, denoted by $B_2$, which depends on all the weights assigned to the species. If $w_i = 1$, for each species $i$, we get $\max_\theta GS(\theta) = 1 - 1/n$, and this maximum biodiversity is obtained when all species have the same relative abundance $p_i = 1/n$. The fact that the maximum value of $GS(\theta)$ is almost insensitive to the increase of the number of species, tending very slowly to 1 when $n$ increases, allowed Jost [12, 13] and Jost et al. [19] to give some examples showing that the Gini-Simpson index does not behave well when the number of species $n$ is very large. R. C. Guiasu and S. Guiasu [22] showed, however, that the Rich-Gini-Simpson index $GS_n(\theta)$ has no such problem. Indeed, if we take $w_i = n$, for each species $i$, we get from (2) $\max_\theta GS_n(\theta) = n - 1$, a value which increases appreciably with the number of species, so that the criticism of the classic Gini-Simpson index does not apply.
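The index (1) and the bound $B_1$ from (2) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `weighted_gini_simpson` and `bound_b1` are hypothetical choices, not taken from the paper, and the even communities used in the loop are made-up inputs.

```python
def weighted_gini_simpson(p, w):
    """Equation (1): sum_i w_i * p_i * (1 - p_i)."""
    if len(p) != len(w):
        raise ValueError("p and w must have the same length")
    if abs(sum(p) - 1.0) > 1e-9:
        raise ValueError("relative abundances must sum to 1")
    return sum(wi * pi * (1.0 - pi) for pi, wi in zip(p, w))


def bound_b1(w):
    """Right-hand side of (2): max_i w_i * (1 - 1/n)."""
    n = len(w)
    return max(w) * (1.0 - 1.0 / n)


# Perfectly even communities (illustrative) with n = 3 and n = 100 species.
for n in (3, 100):
    even = [1.0 / n] * n
    classic = weighted_gini_simpson(even, [1.0] * n)    # weights w_i = 1
    rich = weighted_gini_simpson(even, [float(n)] * n)  # weights w_i = n
    print(n, round(classic, 4), round(rich, 4))
```

For even communities the classic index stays below 1 whatever the value of $n$, whereas the Rich-Gini-Simpson value equals $n - 1$, which is the behaviour described above.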

2.2. The Additive Partitioning of Biodiversity with Respect to the Individual Species

Let us assume that in a certain region there are $n$ species and $m$ sites. In what follows, the subscripts $i$ and $j$ refer to species ($i, j = 1, \ldots, n$) and the subscripts $k$ and $r$ refer to sites ($k, r = 1, \ldots, m$). Let $\theta_k = (p_{1,k}, \ldots, p_{n,k})$ be the vector whose components are the relative abundances of the individual species at site $k$, such that $p_{i,k} \ge 0$ ($i = 1, \ldots, n$), $\sum_i p_{i,k} = 1$, for each $k = 1, \ldots, m$. Let $w = (w_1, \ldots, w_n)$ be nonnegative weights assigned to the species. In dealing with species diversity, a good measure of the differentiation, or dissimilarity, among the sites in a certain region has to be nonnegative and equal to zero if and only if there is no such difference. We assign a parameter $\lambda_k$ to each site $k$, such that
$$\lambda_k \ge 0 \quad (k = 1, \ldots, m), \qquad \sum_k \lambda_k = 1. \tag{3}$$
These parameters may be used to make adjustments for differences (in size, altitude, etc.) between the sites, as shown in [23]. If no adjustment is made and we focus only on the species abundance, we take these parameters to be equal, that is, $\lambda_k = 1/m$, for every $k = 1, \ldots, m$. As $GS_w(\theta)$ is a concave function of the distribution of the relative abundance $\theta$, it may be used in the additive partitioning of biodiversity. The corresponding γ-diversity, reflecting the total or regional biodiversity, the α-diversity, interpreted as the within-site diversity or the average diversity of the sites, and the β-diversity, as a measure of between-site diversity, are given by
$$\gamma = GS_w\left(\sum_k \lambda_k \theta_k\right), \qquad \alpha = \sum_k \lambda_k\, GS_w(\theta_k), \qquad \beta = \gamma - \alpha. \tag{4}$$

The β-diversity may be interpreted as a measure of dissimilarity or differentiation between the sites of the respective region with respect to the individual species. As shown in Appendix C, taking into account (4), the β-diversity has the expression
$$\beta = \sum_i w_i \sum_{k<r} \lambda_k \lambda_r \left(p_{i,k} - p_{i,r}\right)^2 < 2 \sum_i w_i \sum_{k<r} \lambda_k \lambda_r. \tag{5}$$
The upper bound in (5) is quite loose. As $\alpha$ is nonnegative and cannot exceed $\gamma$, a better upper bound for $\beta$ is the value $B_2$ from (A.1), corresponding to $n$ and the given weights $w$. Thus, the relative between-site diversity, or the dissimilarity between sites, satisfies $0 \le \beta / B_2 \le 1$, with respect to the individual species. From (5), we can see that if the species have the same abundance in each site, which means that $p_{i,k} = p_i$, for each site $k$ and each species $i$, the β-diversity is equal to zero, reflecting the fact that in such a case there is no dissimilarity between the sites.
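The agreement between the definition of $\beta$ in (4) and the closed-form expression in (5) can be checked numerically. The sketch below is illustrative only: the two-site data, the site parameters, and the weights are hypothetical, and the helper names are not from the paper.

```python
from itertools import combinations


def gs_w(p, w):
    """Weighted Gini-Simpson index, equation (1)."""
    return sum(wi * pi * (1.0 - pi) for pi, wi in zip(p, w))


def additive_partition(thetas, lambdas, w):
    """Return (gamma, alpha, beta) as defined in (4)."""
    n = len(w)
    pooled = [sum(lam * th[i] for lam, th in zip(lambdas, thetas)) for i in range(n)]
    gamma = gs_w(pooled, w)
    alpha = sum(lam * gs_w(th, w) for lam, th in zip(lambdas, thetas))
    return gamma, alpha, gamma - alpha


def beta_closed_form(thetas, lambdas, w):
    """Left-hand expression in (5): weighted squared differences over site pairs k < r."""
    total = 0.0
    for i, wi in enumerate(w):
        for k, r in combinations(range(len(thetas)), 2):
            total += wi * lambdas[k] * lambdas[r] * (thetas[k][i] - thetas[r][i]) ** 2
    return total


# Hypothetical data: two sites, three species, unequal site parameters.
thetas = [[0.2, 0.5, 0.3], [0.6, 0.1, 0.3]]
lambdas = [0.25, 0.75]
w = [2.0, 1.0, 4.0]

gamma, alpha, beta = additive_partition(thetas, lambdas, w)
print(gamma, alpha, beta, beta_closed_form(thetas, lambdas, w))
```

Both routes give the same nonnegative $\beta$ for these inputs, and replacing the second site by a copy of the first drives it to zero.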

2.3. The Multiplicative Partitioning of Biodiversity with Respect to Individual Species

Dealing with the multiplicative partitioning of diversity, Whittaker [8] suggested the use of the exponential of the Shannon entropy as a measure of biodiversity. The weighted Gini-Simpson index $GS_w(\theta)$, given by (1), which can be used in the additive partitioning of diversity induced by individual species, may also be transformed into the measure of biodiversity (R. C. Guiasu and S. Guiasu [21]):
$$\frac{1}{\sum_i w_i p_i - GS_w(\theta)} = \left(\sum_i w_i p_i^2\right)^{-1}, \tag{6}$$
which can be used in the multiplicative partitioning of diversity induced by the individual species. This measure of biodiversity may be viewed as being the weighted version of the classic Hill number of first degree from [10]. The corresponding multiplicative γ-diversity and α-diversity are
$$\gamma = \left[\sum_i w_i \left(\sum_k \lambda_k p_{i,k}\right)^2\right]^{-1}, \qquad \alpha = \left[\sum_k \lambda_k \sum_i w_i p_{i,k}^2\right]^{-1}. \tag{7}$$
Due to the convexity of the function $\sum_i w_i p_i^2$, as a function of the distribution of the relative abundance of species, the γ-diversity cannot be smaller than the α-diversity, as it should be, and, consequently, the multiplicative β-diversity satisfies the inequality $\beta = \gamma / \alpha \ge 1$. In the additive partitioning of diversity, the γ-diversity, α-diversity, and β-diversity are entities of the same kind and may be expressed in the same units. In the multiplicative partitioning of diversity, the β-diversity is simply a ratio between the total, regional biodiversity $\gamma$ and the average within-site biodiversity $\alpha$, a numerical indicator showing to what extent the regional biodiversity, as a whole, exceeds the average biodiversity of the sites of the respective region. Obviously, if the sites have the same species and the same abundance of these species, which means that $p_{i,k} = p_i$, for each species $i$, then $\beta = 1$.
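A small Python sketch of the multiplicative quantities in (6) and (7) follows. It is an illustration rather than the authors' code: the function names and the two-site data are hypothetical.

```python
def inverse_weighted_simpson(p, w):
    """Equation (6): 1 / sum_i w_i * p_i**2."""
    return 1.0 / sum(wi * pi * pi for pi, wi in zip(p, w))


def multiplicative_partition(thetas, lambdas, w):
    """Return (gamma, alpha, beta) following (7), with beta = gamma / alpha."""
    n = len(w)
    pooled = [sum(lam * th[i] for lam, th in zip(lambdas, thetas)) for i in range(n)]
    gamma = inverse_weighted_simpson(pooled, w)
    # alpha is the reciprocal of the lambda-weighted average of sum_i w_i * p_{i,k}**2
    alpha = 1.0 / sum(
        lam * sum(wi * pi * pi for pi, wi in zip(th, w))
        for lam, th in zip(lambdas, thetas)
    )
    return gamma, alpha, gamma / alpha


# Hypothetical data: two sites, three species.
thetas = [[0.2, 0.5, 0.3], [0.6, 0.1, 0.3]]
lambdas = [0.25, 0.75]
w = [2.0, 1.0, 4.0]

gamma, alpha, beta = multiplicative_partition(thetas, lambdas, w)
print(gamma, alpha, beta)
```

For these inputs the ratio is slightly above 1, and it equals 1 when both sites share the same abundance vector, in line with the inequality $\beta \ge 1$.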

2.4. The Weighted Measure of Diversity with Respect to the Pairs of Species

Let $D = [d_{ij}]$ be an $n \times n$ matrix whose entries are the distances between the pairs of $n$ species, such that $d_{ij} \ge 0$, $d_{ii} = 0$ ($i, j = 1, \ldots, n$). This could be the matrix of the phylogenetic distance between species, for instance. When Rao introduced his quadratic index, improperly called quadratic entropy [14]:
$$R_D = \sum_{i,j} d_{ij}\, p_i p_j, \tag{8}$$
he had to focus on the pairs of species instead of the individual species, using the distance between the pairs of distinct species, along with their relative abundance, in order to measure the dissimilarity between species. Rao’s indicator is very simple and may be easily interpreted as the average dissimilarity between two individuals belonging to two different species when the phylogenetic distance is taken into account. There have been numerous attempts, such as [24], for instance, at using Rao’s index $R_D$ in the additive partitioning of diversity. Unfortunately, $R_D$ is not a concave function of the distribution of the relative abundance of species $\theta = (p_1, \ldots, p_n)$ for an arbitrary distance matrix $D$. Thus, $R_D$ can be applied to the additive partitioning of biodiversity only for some special kinds of such matrices $D$ (as mentioned in [24], for instance), but not in general. On the other hand, there is no generally accepted proposal of a new kind of nonstandard α-diversity which could be defined for such a measure which is not a concave function of the distribution of the relative abundance of species. This paper shows that the generalization of the weighted Gini-Simpson index to the pairs of species provides a concave measure of diversity which could indeed be used both for the additive partitioning and the multiplicative partitioning of diversity when the phylogenetic distance between species is taken into account. Therefore, this provides a suitable replacement of Rao’s index in the partitioning of biodiversity.

Let $\Theta = [\pi_{ij}]$ be an $n \times n$ matrix where $\pi_{ij}$ is a joint probability of the pair of species $(i, j)$, in this order. As $\pi_{ji}$ is the probability of the pair $(j, i)$, in this order, the probability of the subset of species $\{i, j\}$ is $\pi_{ij} + \pi_{ji}$. We have $\pi_{ij} \ge 0$ ($i, j = 1, \ldots, n$) and $\sum_{i,j} \pi_{ij} = 1$. Let $W = [w_{ij}]$ be an $n \times n$ matrix whose entries are arbitrary nonnegative weights assigned to the pairs of species. However, these weights may not depend on the joint distribution $[\pi_{ij}]$. The weighted Gini-Simpson quadratic index of the pairs of species is
$$GS_W(\Theta) = \sum_{i,j} w_{ij}\, \pi_{ij} \left(1 - \pi_{ij}\right). \tag{9}$$
If $w_{ij} = 1$, for all pairs of species, then (9) becomes the generalization of the classic Gini-Simpson index to the pairs of species and is denoted by $GS(\Theta)$. As a function of the joint distribution $\Theta$, the weighted Gini-Simpson index $GS_W(\Theta)$ is nonnegative and concave, as shown in [20, 21]. An upper bound for $GS_W(\Theta)$ which depends only on the maximum weight and the number of species is
$$0 \le GS_W(\Theta) \le \max_{i,j} w_{ij} \sum_{i,j} \pi_{ij} \left(1 - \pi_{ij}\right) \le \max_{i,j} w_{ij} \left(1 - \frac{1}{n^2}\right). \tag{10}$$
Denoting by $B_3$ the bound from the right-hand side of the inequality (10), the relative weighted Gini-Simpson index for pairs of species satisfies $0 \le GS_W(\Theta) / B_3 \le 1$. In Appendix B, another bound for $GS_W(\Theta)$, denoted by $B_5$, is given, which depends on all the weights assigned to the pairs of species. If $w_{ij} = 1$, for each pair of species $(i, j)$, we get $\max_\Theta GS(\Theta) = 1 - 1/n^2$, and this maximum biodiversity is obtained when all pairs of species have the same probability $\pi_{ij} = 1/n^2$.
The special cases of interest are the following ones.
(a) If the species are independent, which means $\pi_{ij} = p_i p_j$, and the weights are $w_{ij} = d_{ij}$, then the weighted Gini-Simpson index (9) is denoted by
$$GS_D(\Theta) = \sum_{i,j} d_{ij}\, p_i p_j \left(1 - p_i p_j\right), \tag{11}$$
which generalizes Rao’s index $R_D$ given by (8).
(b) If there are the positive numbers $v = (v_1, \ldots, v_n)$, representing conservation values of the individual species, the species are independent, which means $\pi_{ij} = p_i p_j$, and the weights are
$$w_{ij} = \frac{n(n-1)}{2} \cdot \frac{1}{2}\left(v_i + v_j\right) d_{ij}, \tag{12}$$
where $n(n-1)/2$ is the number of distinct pairs of species $(i, j)$, such that $i < j$, or the number of the pairs of species $\{i, j\}$, and $(1/2)(v_i + v_j)$ is the average value of the pair of species $(i, j)$, then the corresponding weighted Gini-Simpson index (9) is denoted by
$$GS_{n,v,D}(\Theta) = \frac{n(n-1)}{2} \sum_{i<j} \left(v_i + v_j\right) d_{ij}\, p_i p_j \left(1 - p_i p_j\right), \tag{13}$$
which takes into account all the information available, namely, the species richness $n$, the relative abundance $\theta$ of species, the matrix $D$ of the distance between species, and the conservation values $v$ of the species. As the measure given in (13) is a nonnegative concave function of the distribution of the relative abundance of the pairs of species $\Theta$, for an arbitrary distance matrix $D$, and also depends explicitly on the species richness, the distance between species, and the conservation values of the species, all these are sufficient reasons to suggest that it could more than adequately replace the use of Rao’s index (8).
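The relation between the general index (9) with the weights (12) and the compact form (13) can be verified numerically. In the sketch below the distance matrix `d` is made up for illustration (symmetric, zero diagonal), and the function names are hypothetical; the abundance vector and conservation values are borrowed from the numerical example used later in the paper.

```python
def pair_weights(v, d):
    """Weights (12): w_ij = n(n-1)/2 * (v_i + v_j)/2 * d_ij."""
    n = len(v)
    scale = n * (n - 1) / 2.0
    return [[scale * 0.5 * (v[i] + v[j]) * d[i][j] for j in range(n)] for i in range(n)]


def gs_pairs(pi, w):
    """Equation (9): sum_{i,j} w_ij * pi_ij * (1 - pi_ij)."""
    n = len(pi)
    return sum(w[i][j] * pi[i][j] * (1.0 - pi[i][j]) for i in range(n) for j in range(n))


def gs_nvd(p, v, d):
    """Equation (13); assumes independent species and a symmetric D with zero diagonal."""
    n = len(p)
    scale = n * (n - 1) / 2.0
    return scale * sum(
        (v[i] + v[j]) * d[i][j] * p[i] * p[j] * (1.0 - p[i] * p[j])
        for i in range(n) for j in range(i + 1, n)
    )


p = [0.05, 0.60, 0.35]
v = [6.0, 3.0, 3.0]
d = [[0.0, 1.0, 2.0],   # hypothetical distances, not from the paper
     [1.0, 0.0, 1.5],
     [2.0, 1.5, 0.0]]

joint = [[p[i] * p[j] for j in range(3)] for i in range(3)]  # independence: pi_ij = p_i * p_j
general = gs_pairs(joint, pair_weights(v, d))
special = gs_nvd(p, v, d)
print(general, special)
```

The two values coincide because the diagonal weights vanish ($d_{ii} = 0$) and each unordered pair appears twice in the double sum of (9).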

2.5. The Additive Partitioning of Biodiversity with Respect to the Pairs of Species

Let us assume that in a certain region there are $n$ species and $m$ sites. Again, in what follows, the subscripts $i$ and $j$ refer to species ($i, j = 1, \ldots, n$) and the subscripts $k$ and $r$ refer to sites ($k, r = 1, \ldots, m$). Let $\Theta_k = [\pi_{ij,k}]$ be an arbitrary joint probability distribution of the pairs of species within site $k$, where $\pi_{ij,k}$ is the probability of the pair of species $(i, j)$, in this order, within site $k$, such that $\pi_{ij,k} \ge 0$, $\sum_{i,j} \pi_{ij,k} = 1$. Let $W = [w_{ij}]$ be the matrix whose entries are nonnegative weights assigned to the pairs of species. We assign a parameter $\lambda_k$ to each site $k$, satisfying (3). As $GS_W(\Theta)$, given by (9), is a concave function of the joint distribution $\Theta$ assigned to the pairs of species, it may be used in the additive partitioning of biodiversity. The corresponding γ-diversity, reflecting the total or regional biodiversity, the α-diversity, interpreted as the within-site diversity or the average diversity of the sites, and the β-diversity, as a measure of between-site diversity, with respect to the pairs of species, are given by
$$\gamma = GS_W\left(\sum_k \lambda_k \Theta_k\right), \qquad \alpha = \sum_k \lambda_k\, GS_W(\Theta_k), \qquad \beta = \gamma - \alpha. \tag{14}$$

The β-diversity may be interpreted as a measure of dissimilarity or differentiation between the sites of the respective region with respect to the pairs of species. As shown in [20, 21], taking into account (14), the β-diversity has the following expression:
$$\beta = \sum_{i,j} w_{ij} \sum_{k<r} \lambda_k \lambda_r \left(\pi_{ij,k} - \pi_{ij,r}\right)^2 < \sum_{i,j} w_{ij} \sum_{k<r} \lambda_k \lambda_r. \tag{15}$$
The upper bound in (15) is quite loose. As $\alpha$ is nonnegative and cannot exceed $\gamma$, a better upper bound for $\beta$ is the value $B_5$ from (B.4) of Appendix B, corresponding to $n$ and the given weights $W$. Thus, the relative between-site diversity, or the relative dissimilarity between sites, satisfies $0 \le \beta / B_5 \le 1$, with respect to the pairs of species.

Let $\theta_k = (p_{1,k}, \ldots, p_{n,k})$ be the vector whose components are the relative abundances of the individual species at site $k$. If the species are independent, $\pi_{ij,k} = p_{i,k}\, p_{j,k}$. Let also $D = [d_{ij}]$ be the matrix of the distances between species. Then, for the weights (12), the corresponding β-diversity (15) is
$$\beta = \frac{n(n-1)}{2} \sum_{i<j} \left(v_i + v_j\right) d_{ij} \sum_{k<r} \lambda_k \lambda_r \left(p_{i,k}\, p_{j,k} - p_{i,r}\, p_{j,r}\right)^2, \tag{16}$$
measuring the between-site diversity with respect to the distinct pairs of species when the species richness, the distance between species, and the conservation values are all taken into account along with the relative abundance of the species. If the species have the same abundance in each site, which means that $p_{i,k} = p_i$, for each site $k$ and each species $i$, the corresponding β-diversity is equal to zero, reflecting the fact that in such a case there is no dissimilarity between sites. If we divide (16) by the upper bound $B_5$ from the inequality (B.4) of Appendix B, taking the weights (12), we obtain the relative β-diversity, which has the advantage of always being a number between 0 and 1.

At the same time, let us notice that Rao’s index $R_D$ given by (8) is a linear function of the joint distribution of the relative abundance of the pairs of species $\Theta = [p_i p_j]$, and, consequently, the corresponding β-diversity induced by the pairs of species is equal to zero, for an arbitrary distance matrix $D$ and an arbitrary distribution of the relative abundance of individual species. Therefore, Rao’s index is not suitable for use in the standard additive partitioning of diversity induced by pairs of species when an arbitrary dissimilarity distance between species is taken into account. Attempts were made [24, 25] to either find particular cases of distance matrices or define new kinds of nonstandard α-diversity for which the partitioning of diversity can still be performed.
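Both shortcomings of Rao’s index in the additive partitioning can be seen on a small made-up case. The sketch below is illustrative only: the distance matrix deliberately violates the triangle inequality, the two abundance vectors are hypothetical, and the function names are not from the paper. Partitioning with respect to individual species gives a negative $\beta$, while partitioning with respect to pairs gives $\beta = 0$ by linearity.

```python
def rao(p, d):
    """Rao's index (8): sum_{i,j} d_ij * p_i * p_j."""
    n = len(p)
    return sum(d[i][j] * p[i] * p[j] for i in range(n) for j in range(n))


def rao_joint(pi, d):
    """The same index written as a linear function of a joint distribution pi_ij."""
    n = len(pi)
    return sum(d[i][j] * pi[i][j] for i in range(n) for j in range(n))


# Hypothetical two-site example; D is chosen so that d_23 > d_21 + d_13.
d = [[0.0, 1.0, 1.0],
     [1.0, 0.0, 10.0],
     [1.0, 10.0, 0.0]]
thetas = [[0.6, 0.2, 0.2], [0.2, 0.4, 0.4]]
lambdas = [0.5, 0.5]
n = 3

# Partition with respect to individual species: pool the abundance vectors.
pooled = [sum(lam * th[i] for lam, th in zip(lambdas, thetas)) for i in range(n)]
beta_species = rao(pooled, d) - sum(lam * rao(th, d) for lam, th in zip(lambdas, thetas))

# Partition with respect to pairs: pool the joint distributions p_i * p_j.
joints = [[[th[i] * th[j] for j in range(n)] for i in range(n)] for th in thetas]
pooled_joint = [[sum(lam * jt[i][j] for lam, jt in zip(lambdas, joints))
                 for j in range(n)] for i in range(n)]
beta_pairs = rao_joint(pooled_joint, d) - sum(
    lam * rao_joint(jt, d) for lam, jt in zip(lambdas, joints))

print(beta_species, beta_pairs)
```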

2.6. The Multiplicative Partitioning of Biodiversity with Respect to the Pairs of Species

The weighted Gini-Simpson quadratic index $GS_W$ given by (9), which can be used in the additive partitioning of diversity induced by pairs of species, may be transformed into the measure of diversity [21]:
$$\frac{1}{\sum_{i,j} w_{ij}\, \pi_{ij} - GS_W(\Theta)} = \left(\sum_{i,j} w_{ij}\, \pi_{ij}^2\right)^{-1}, \tag{17}$$
which can be used in the multiplicative partitioning of diversity induced by the pairs of species. This measure of biodiversity may be viewed as being the weighted version for pairs of species of the classic Hill number of first degree from [10]. Using the notations from the previous Section 2.5, the corresponding multiplicative γ-diversity, α-diversity, and β-diversity are
$$\gamma = \left[\sum_{i,j} w_{ij} \left(\sum_k \lambda_k \pi_{ij,k}\right)^2\right]^{-1}, \qquad \alpha = \left[\sum_k \lambda_k \sum_{i,j} w_{ij}\, \pi_{ij,k}^2\right]^{-1}, \qquad \beta = \frac{\gamma}{\alpha}. \tag{18}$$
Due to the convexity of the function $\sum_{i,j} w_{ij}\, \pi_{ij}^2$, as a function of the joint distribution $\Theta = [\pi_{ij}]$, the γ-diversity cannot be smaller than the α-diversity, as it should be, and, consequently, the multiplicative β-diversity satisfies the inequality $\beta = \gamma / \alpha \ge 1$. In the additive partitioning of diversity, the γ-diversity, α-diversity, and β-diversity are entities of the same kind and may be expressed in the same units. In the multiplicative partitioning of diversity, the β-diversity is simply a ratio between the total, regional biodiversity $\gamma$ and the average within-site biodiversity $\alpha$, a numerical indicator showing to what extent the regional biodiversity, as a whole, exceeds the average biodiversity of the sites of the respective region.

Let $\theta_k = (p_{1,k}, \ldots, p_{n,k})$ be the vector whose components are the relative abundances of the individual species at site $k$. If the species are independent, $\pi_{ij,k} = p_{i,k}\, p_{j,k}$. Let also $D = [d_{ij}]$ be the matrix of the distances between species. Then, for the weights (12), the corresponding β-diversity from (18) is
$$\beta = \left[\sum_k \lambda_k \sum_{i<j} \left(v_i + v_j\right) d_{ij}\, p_{i,k}^2\, p_{j,k}^2\right] \left[\sum_{i<j} \left(v_i + v_j\right) d_{ij} \left(\sum_k \lambda_k\, p_{i,k}\, p_{j,k}\right)^2\right]^{-1}, \tag{19}$$
measuring the ratio between the regional biodiversity and the average biodiversity of the sites with respect to the pairs of species. Obviously, if the sites have the same species and the same abundance of these species, which means that $p_{i,k} = p_i$, for each species $i$, then $\beta = 1$.
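A direct transcription of (19) into Python may make the ratio easier to read. This is a sketch under assumptions: the function name, the two abundance vectors, and the distance matrix are hypothetical, and $D$ is taken to be symmetric with a zero diagonal. The common factor $n(n-1)/2$ from the weights (12) cancels in the ratio, so it does not appear.

```python
def beta_pairs_multiplicative(thetas, lambdas, v, d):
    """Equation (19): multiplicative beta for independent species and weights (12)."""
    n = len(v)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    # Numerator: lambda-weighted average over sites of sum (v_i + v_j) d_ij p_i^2 p_j^2
    numerator = sum(
        lam * sum((v[i] + v[j]) * d[i][j] * th[i] ** 2 * th[j] ** 2 for i, j in pairs)
        for lam, th in zip(lambdas, thetas)
    )
    # Denominator: same weights applied to the squared pooled joint probabilities
    denominator = sum(
        (v[i] + v[j]) * d[i][j]
        * sum(lam * th[i] * th[j] for lam, th in zip(lambdas, thetas)) ** 2
        for i, j in pairs
    )
    return numerator / denominator


# Hypothetical inputs.
thetas = [[0.6, 0.2, 0.2], [0.2, 0.4, 0.4]]
lambdas = [0.5, 0.5]
v = [6.0, 3.0, 3.0]
d = [[0.0, 1.0, 2.0],
     [1.0, 0.0, 1.5],
     [2.0, 1.5, 0.0]]

beta = beta_pairs_multiplicative(thetas, lambdas, v, d)
beta_same = beta_pairs_multiplicative([thetas[0], thetas[0]], lambdas, v, d)
print(beta, beta_same)
```

The function requires at least one pair of species with a positive distance that co-occurs in some site, otherwise the denominator vanishes.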

2.7. The Weighted Shannon Entropy

The weighted Shannon entropy was introduced in [26]. If we have $n$ species such that the distribution of the relative abundance of these species is $\theta = (p_1, \ldots, p_n)$ and the nonnegative weights assigned to the species are $w = (w_1, \ldots, w_n)$, then the weighted entropy is the nonnegative, concave function $H_w(\theta) = -\sum_i w_i\, p_i \ln p_i$. Similarly, if $W = [w_{ij}]$ is a matrix of nonnegative weights and $\Theta = [\pi_{ij}]$ a joint probability distribution assigned to the pairs of species, the joint weighted entropy is $H_W(\Theta) = -\sum_{i,j} w_{ij}\, \pi_{ij} \ln \pi_{ij}$. It is possible, in principle, to remake the analysis from Sections 2.1–2.6 using the weighted Shannon entropies $H_w(\theta)$ and $H_W(\Theta)$ instead of the weighted Gini-Simpson indices $GS_w(\theta)$ and $GS_W(\Theta)$. However, the Shannon entropy is actually a measure of uncertainty and we cannot justify its use as a measure of diversity, as we did for the Gini-Simpson index at the beginning of Section 2.1. Also, since the Shannon entropy is a logarithmic function, it is much more difficult to obtain simple analytical formulas for its maximum values subject to given constraints. The weighted Gini-Simpson index is a simpler and more effective tool in measuring biodiversity.
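For comparison, the weighted entropy $H_w(\theta)$ can be computed as in the following sketch. The function name is hypothetical, and the convention $0 \ln 0 = 0$ is assumed for absent species.

```python
import math


def weighted_entropy(p, w):
    """H_w(theta) = -sum_i w_i * p_i * ln(p_i), with 0 * ln(0) taken as 0."""
    return -sum(wi * pi * math.log(pi) for pi, wi in zip(p, w) if pi > 0.0)


# With unit weights and an even community the value is ln(n).
print(weighted_entropy([1.0 / 3.0] * 3, [1.0] * 3), math.log(3))
```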

3. Discussion

It seems to be much easier to discuss the significance of the concepts introduced in Section 2 by showing a representative numerical example. Let us assume that in a certain region there are three sites ($m = 3$) and three species ($n = 3$). If $A_{ik}$ denotes the absolute abundance (number of individuals) of species $i$ within site $k$, let us assume that
$$A_{11} = 2,\ A_{21} = 24,\ A_{31} = 14; \qquad A_{12} = 32,\ A_{22} = 4,\ A_{32} = 14; \qquad A_{13} = 24,\ A_{23} = 36,\ A_{33} = 20. \tag{20}$$
The corresponding relative abundance is
$$p_{1,1} = 0.05,\ p_{2,1} = 0.60,\ p_{3,1} = 0.35; \qquad p_{1,2} = 0.64,\ p_{2,2} = 0.08,\ p_{3,2} = 0.28; \qquad p_{1,3} = 0.30,\ p_{2,3} = 0.45,\ p_{3,3} = 0.25. \tag{21}$$
Thus, in this example, $\theta_1 = (0.05, 0.60, 0.35)$, $\theta_2 = (0.64, 0.08, 0.28)$, and $\theta_3 = (0.30, 0.45, 0.25)$.
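The step from the absolute abundances (20) to the relative abundances (21) is a division of each count by its site total, as in this short sketch (the variable names are illustrative):

```python
# Rows are sites k = 1, 2, 3; columns are species i = 1, 2, 3, as in (20).
counts = [[2, 24, 14],
          [32, 4, 14],
          [24, 36, 20]]

# Relative abundance: each count divided by the total number of individuals at its site.
thetas = [[a / sum(site) for a in site] for site in counts]

for k, theta in enumerate(thetas, start=1):
    print(k, [round(p, 2) for p in theta])
```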

3.1. Biodiversity with Respect to the Individual Species

Using the Rich-Gini-Simpson index $GS_n(\theta)$, given by (1) with the weights $w_i = n$, to calculate the amount of diversity with respect to the individual species, in each site, we obtain $GS_3(\theta_1) = 1.5450$, $GS_3(\theta_2) = 1.5168$, and $GS_3(\theta_3) = 1.9350$. The maximum biodiversity in this case would be $n - 1 = 2$. We can see that the first two sites have almost the same biodiversity, both smaller than the biodiversity of the third site, which is close to the maximum value, when only the richness and the abundance of species are taken into account.

Let us assume now that the three species have the following conservation values: $v_1 = 6$, $v_2 = 3$, and $v_3 = 3$. These conservation values $v = (6, 3, 3)$ contribute to the diversity of the three sites. Taking the weights $w_i = n v_i$, we have $w_1 = 18$, $w_2 = 9$, $w_3 = 9$, that is, $w = (18, 9, 9)$. Using the weighted Gini-Simpson index $GS_w(\theta)$ given by (1), we obtain the following values of the biodiversity of each site: $GS_w(\theta_1) = 5.0625$, $GS_w(\theta_2) = 6.624$, and $GS_w(\theta_3) = 7.695$. When the species have these conservation values, the biodiversity values of the second and third sites are closer to each other and higher than the biodiversity of the first site. To interpret these numbers, we compare them with the bounds $B_1$ and $B_2$ from the inequalities (2) and (A.1), respectively. For the weights $w = (18, 9, 9)$, the loose upper bound $B_1$ for $GS_w$, which takes into account only the number of species $n = 3$ and the maximum weight $\max_i w_i = 18$, has the value 12. The upper bound $B_2$ for $GS_w$ from (A.1) in Appendix A, which takes into account the number of species $n = 3$ and all the weights $w = (18, 9, 9)$, has the value 8.1, so $B_2$ is clearly the tighter bound. With respect to $B_2$, the second and third sites have 81.78% and 95% of the maximum biodiversity for the given weights, whereas the first site has only 62.5%.

If we do not discriminate among sites with respect to size, altitude, or any other factor, then the parameters assigned to the three sites are $\lambda_1 = \lambda_2 = \lambda_3 = 1/3$. In such a case, we have
\[ \sum_k \lambda_k \theta_k = \tfrac{1}{3}(0.05, 0.60, 0.35) + \tfrac{1}{3}(0.64, 0.08, 0.28) + \tfrac{1}{3}(0.30, 0.45, 0.25) = (0.3300, 0.3767, 0.2933) = (q_1, q_2, q_3). \tag{22} \]
According to (4), the $\gamma$-diversity and $\alpha$-diversity, with respect to the single species, are
\[ \gamma = GS_w\Bigl(\sum_k \lambda_k \theta_k\Bigr) = \sum_i w_i q_i (1 - q_i) = 7.9584, \qquad \alpha = \sum_k \lambda_k GS_w(\theta_k) = \tfrac{1}{3}(5.0625 + 6.624 + 7.695) = 6.4605. \tag{23} \]
Thus, in the additive partitioning of diversity, the $\beta$-diversity is $\beta = \gamma - \alpha = 1.4979$. For the weights $w = (18, 9, 9)$ and $n = 3$, according to the formula (A.1) from Appendix A, the maximum value of $GS_w$ is $B_2 = 8.1$. Therefore, the biodiversity $\gamma$ of the entire region is 98.25% of the maximum and the average within-site biodiversity $\alpha$ is 79.76%. The value of the between-site diversity $\beta$ shows the average differentiation between sites, corresponding to a difference of 18.49% between the values of $\gamma$ and $\alpha$. We note that for identical sites the value of $\beta$ would be equal to zero, as can be seen from (5). The advantage of the additive partitioning of biodiversity is that the values of $\alpha$, $\beta$, and $\gamma$ are expressed on the same scale of values.
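The additive partitioning above can be reproduced with the following sketch. It assumes the form $GS_w(\theta) = \sum_i w_i p_i (1 - p_i)$ shown in (23) and takes $B_2 = 8.1$ as quoted in the text; the helper names `gs_w` and `mixture` are illustrative. Because the text rounds the pooled distribution to four decimals before computing $\gamma$, the unrounded results may differ from the printed ones in the last digit.

```python
def gs_w(theta, w):
    """Weighted Gini-Simpson index, sum_i w_i * p_i * (1 - p_i)."""
    return sum(wi * p * (1.0 - p) for wi, p in zip(w, theta))


def mixture(thetas, lambdas):
    """Pooled distribution q = sum_k lambda_k * theta_k, as in (22)."""
    n = len(thetas[0])
    return [sum(lam * t[i] for lam, t in zip(lambdas, thetas)) for i in range(n)]


thetas = [(0.05, 0.60, 0.35), (0.64, 0.08, 0.28), (0.30, 0.45, 0.25)]
lambdas = [1.0 / 3.0] * 3
v = (6, 3, 3)                      # conservation values
n = len(v)
w = [n * vi for vi in v]           # weights w_i = n * v_i = (18, 9, 9)

site_values = [gs_w(t, w) for t in thetas]
alpha = sum(lam * g for lam, g in zip(lambdas, site_values))  # within-site average
gamma = gs_w(mixture(thetas, lambdas), w)                     # regional diversity
beta = gamma - alpha                                          # additive partition

B2 = 8.1  # upper bound quoted in the text from (A.1)
print("sites:", [round(g, 4) for g in site_values])
print("alpha, gamma, beta:", round(alpha, 4), round(gamma, 4), round(beta, 4))
print("share of B2:", [round(100 * x / B2, 2) for x in (gamma, alpha, beta)])
```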

Carrying out the multiplicative partitioning of biodiversity for $\lambda_k = 1/3$ ($k = 1, 2, 3$) and $w = n v = (18, 9, 9)$, from (7) we get $\gamma = 0.2493$ and $\alpha = 0.1815$. Consequently, $\beta = \gamma / \alpha = 1.3736$.

3.2. Biodiversity with Respect to the Pairs of Species

Let us assume that we have the matrix of the phylogenetic distances between the three species $D = [d_{ij}]$, where $d_{12} = 3$, $d_{13} = 2$, and $d_{23} = 2$. If we assume that, within each site, the species are independent with respect to their relative abundance, then the relative abundance of the pair of species $(i, j)$, in this order, is the product of the relative abundances of the corresponding individual species, namely, $p_{i,k} p_{j,k}$, within every site $k$. Therefore, the matrices $\Theta_k = [p_{i,k} p_{j,k}]$ are
\[ \Theta_1 = \begin{pmatrix} 0.0025 & 0.0300 & 0.0175 \\ 0.0300 & 0.3600 & 0.2100 \\ 0.0175 & 0.2100 & 0.1225 \end{pmatrix}, \quad \Theta_2 = \begin{pmatrix} 0.4096 & 0.0512 & 0.1792 \\ 0.0512 & 0.0064 & 0.0224 \\ 0.1792 & 0.0224 & 0.0784 \end{pmatrix}, \quad \Theta_3 = \begin{pmatrix} 0.0900 & 0.1350 & 0.0750 \\ 0.1350 & 0.2025 & 0.1125 \\ 0.0750 & 0.1125 & 0.0625 \end{pmatrix}. \tag{24} \]
If we do not discriminate among sites with respect to size, altitude, or any other factor, then the parameters assigned to the three sites are $\lambda_1 = \lambda_2 = \lambda_3 = 1/3$. In such a case, we have
\[ \lambda_1 \Theta_1 + \lambda_2 \Theta_2 + \lambda_3 \Theta_3 = \begin{pmatrix} 0.1674 & 0.0721 & 0.0906 \\ 0.0721 & 0.1896 & 0.1150 \\ 0.0906 & 0.1150 & 0.0878 \end{pmatrix}. \tag{25} \]
Let us use Rao’s index (8) to carry out the additive partitioning of diversity with respect to the pairs of species. Successively, we obtain $R_D(\Theta_1) = 1.0900$, $R_D(\Theta_2) = 1.1136$, and $R_D(\Theta_3) = 1.5600$. The corresponding $\alpha$-diversity is $\alpha = \lambda_1 R_D(\Theta_1) + \lambda_2 R_D(\Theta_2) + \lambda_3 R_D(\Theta_3) = 1.255$, and the $\gamma$-diversity is $\gamma = R_D(\lambda_1 \Theta_1 + \lambda_2 \Theta_2 + \lambda_3 \Theta_3) = 1.255$. Consequently, the $\beta$-diversity is $\beta = \gamma - \alpha = 0$, which is not surprising because Rao’s index is a linear function of the joint distribution of the pairs of species.
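The vanishing $\beta$ can be seen numerically in a short sketch. It assumes that (8) has the linear form $R_D(\Theta) = \sum_{i,j} d_{ij} \pi_{ij}$ with $d_{ii} = 0$, which reproduces the three site values quoted above; the helper names are illustrative.

```python
def pair_distribution(theta):
    """Joint distribution of ordered pairs under independence: pi_ij = p_i * p_j."""
    return [[pi * pj for pj in theta] for pi in theta]


def rao_index(pairs, d):
    """Assumed form of (8): R_D(Theta) = sum_{i,j} d_ij * pi_ij."""
    n = len(d)
    return sum(d[i][j] * pairs[i][j] for i in range(n) for j in range(n))


def pooled(matrices, lambdas):
    """Entrywise mixture sum_k lambda_k * Theta_k, as in (25)."""
    n = len(matrices[0])
    return [[sum(lam * m[i][j] for lam, m in zip(lambdas, matrices))
             for j in range(n)] for i in range(n)]


D = [[0, 3, 2], [3, 0, 2], [2, 2, 0]]   # phylogenetic distances, zero diagonal
thetas = [(0.05, 0.60, 0.35), (0.64, 0.08, 0.28), (0.30, 0.45, 0.25)]
lambdas = [1.0 / 3.0] * 3

theta_mats = [pair_distribution(t) for t in thetas]
theta_bar = pooled(theta_mats, lambdas)

rao_sites = [rao_index(m, D) for m in theta_mats]
alpha = sum(lam * r for lam, r in zip(lambdas, rao_sites))
gamma = rao_index(theta_bar, D)
beta = gamma - alpha   # zero up to floating-point error, because R_D is linear in Theta

print([round(r, 4) for r in rao_sites], round(alpha, 4), round(gamma, 4), beta)
```

Since a linear function of the mixture equals the mixture of its values, no choice of site weights or distances can make this $\beta$ positive.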

If we use the weighted Gini-Simpson index (11) with the weights $w_{ij} = d_{ij}$, we obtain
\[ GS_D(\Theta_1) = 0.9070, \quad GS_D(\Theta_2) = 0.9674, \quad GS_D(\Theta_3) = 1.3775, \tag{26} \]
and the corresponding $\alpha$-diversity is
\[ \alpha = \lambda_1 GS_D(\Theta_1) + \lambda_2 GS_D(\Theta_2) + \lambda_3 GS_D(\Theta_3) = 1.0840, \tag{27} \]
the $\gamma$-diversity is $\gamma = GS_D(\lambda_1 \Theta_1 + \lambda_2 \Theta_2 + \lambda_3 \Theta_3) = 1.1381$, and the $\beta$-diversity is $\beta = \gamma - \alpha = 0.0541$. Calculating the upper bound $B_5$ of $GS_W$ given in the inequality (B.4) from Appendix B, for the weights $w_{ij} = d_{ij}$, which means $W = D$, we obtain $\max GS_D = 2.75$. Compared to this maximum value, $GS_D(\Theta_1)$ represents 32.98%, $GS_D(\Theta_2)$ 35.18%, $GS_D(\Theta_3)$ 50.09%, $\gamma$ 41.39%, $\alpha$ 39.42%, and $\beta$ 1.97%.
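The values in (26) and (27) can be recomputed with the sketch below, which assumes that (11) has the form $GS_D(\Theta) = \sum_{i,j} d_{ij} \pi_{ij} (1 - \pi_{ij})$, the pairwise analogue of the single-species index. The helper names are illustrative. The text evaluates $\gamma$ on the pooled matrix (25) rounded to four decimals, so the unrounded $\gamma$ and $\beta$ may differ from the printed values in the last digits.

```python
def pair_distribution(theta):
    """Joint distribution of ordered pairs under independence: pi_ij = p_i * p_j."""
    return [[pi * pj for pj in theta] for pi in theta]


def gs_pairs(pairs, w):
    """Assumed form of (11): sum_{i,j} w_ij * pi_ij * (1 - pi_ij)."""
    n = len(w)
    return sum(w[i][j] * pairs[i][j] * (1.0 - pairs[i][j])
               for i in range(n) for j in range(n))


def pooled(matrices, lambdas):
    """Entrywise mixture sum_k lambda_k * Theta_k."""
    n = len(matrices[0])
    return [[sum(lam * m[i][j] for lam, m in zip(lambdas, matrices))
             for j in range(n)] for i in range(n)]


D = [[0, 3, 2], [3, 0, 2], [2, 2, 0]]
thetas = [(0.05, 0.60, 0.35), (0.64, 0.08, 0.28), (0.30, 0.45, 0.25)]
lambdas = [1.0 / 3.0] * 3

theta_mats = [pair_distribution(t) for t in thetas]
gs_sites = [gs_pairs(m, D) for m in theta_mats]
alpha = sum(lam * g for lam, g in zip(lambdas, gs_sites))
gamma = gs_pairs(pooled(theta_mats, lambdas), D)
beta = gamma - alpha   # strictly positive here, unlike Rao's linear index

print([round(g, 4) for g in gs_sites], round(alpha, 4), round(gamma, 4), round(beta, 4))
```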

We now take into account the number of species $n = 3$, the parameters assigned to the sites $\lambda_1 = \lambda_2 = \lambda_3 = 1/3$, the phylogenetic distances between species $d_{12} = 3$, $d_{13} = 2$, $d_{23} = 2$, and the conservation values of the species $v_1 = 6$, $v_2 = 3$, $v_3 = 3$. The computation of the weighted Gini-Simpson index given by (13), with the weights $w_{ij} = (n(n-1)/2)(1/2)(v_i + v_j) d_{ij}$, gives
\[ GS_{n,v,D}(\Theta_1) = 9.2580, \quad GS_{n,v,D}(\Theta_2) = 12.6659, \quad GS_{n,v,D}(\Theta_3) = 16.7994, \tag{28} \]
and the corresponding $\alpha$-diversity is
\[ \alpha = \lambda_1 GS_{n,v,D}(\Theta_1) + \lambda_2 GS_{n,v,D}(\Theta_2) + \lambda_3 GS_{n,v,D}(\Theta_3) = 12.9078, \tag{29} \]
while the $\gamma$-diversity is $\gamma = GS_{n,v,D}(\lambda_1 \Theta_1 + \lambda_2 \Theta_2 + \lambda_3 \Theta_3) = 13.5321$, which gives the $\beta$-diversity $\beta = \gamma - \alpha = 0.6243$. Calculating the upper bound $B_5$ of $GS_{n,v,D}$ given in the inequality (B.4) from Appendix B, for the weights $w_{ij} = (n(n-1)/2)(1/2)(v_i + v_j) d_{ij}$, we obtain $\max GS_{n,v,D} = 34.2237$. Compared to this maximum value, $GS_{n,v,D}(\Theta_1)$ represents 27.05%, $GS_{n,v,D}(\Theta_2)$ 37.01%, $GS_{n,v,D}(\Theta_3)$ 49.09%, $\gamma$ 39.54%, $\alpha$ 37.72%, and $\beta$ 1.82%.
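The weight matrix and the site values in (28) and (29) can be rebuilt as follows. This is a sketch under the assumption that (13) is the pairwise index $\sum_{i,j} w_{ij} \pi_{ij} (1 - \pi_{ij})$ evaluated with the weights stated above; the function names are illustrative. As before, the printed $\gamma$ is based on a pooled matrix rounded to four decimals, so the unrounded value may differ slightly.

```python
def combined_weights(v, d):
    """w_ij = (n(n-1)/2) * (1/2) * (v_i + v_j) * d_ij, the weights stated in the text."""
    n = len(v)
    scale = n * (n - 1) / 2.0
    return [[scale * 0.5 * (v[i] + v[j]) * d[i][j] for j in range(n)] for i in range(n)]


def pair_distribution(theta):
    """Joint distribution of ordered pairs under independence: pi_ij = p_i * p_j."""
    return [[pi * pj for pj in theta] for pi in theta]


def gs_pairs(pairs, w):
    """Assumed pairwise index: sum_{i,j} w_ij * pi_ij * (1 - pi_ij)."""
    n = len(w)
    return sum(w[i][j] * pairs[i][j] * (1.0 - pairs[i][j])
               for i in range(n) for j in range(n))


D = [[0, 3, 2], [3, 0, 2], [2, 2, 0]]
v = (6, 3, 3)
thetas = [(0.05, 0.60, 0.35), (0.64, 0.08, 0.28), (0.30, 0.45, 0.25)]
lambdas = [1.0 / 3.0] * 3

W = combined_weights(v, D)   # off-diagonal entries: w_12 = 40.5, w_13 = 27, w_23 = 18
theta_mats = [pair_distribution(t) for t in thetas]
theta_bar = [[sum(lam * m[i][j] for lam, m in zip(lambdas, theta_mats))
              for j in range(3)] for i in range(3)]

gs_sites = [gs_pairs(m, W) for m in theta_mats]
alpha = sum(lam * g for lam, g in zip(lambdas, gs_sites))
gamma = gs_pairs(theta_bar, W)
beta = gamma - alpha

print([round(g, 4) for g in gs_sites], round(alpha, 4), round(gamma, 4), round(beta, 4))
```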

Carrying out the multiplicative partitioning of biodiversity for $\lambda_k = 1/3$ ($k = 1, 2, 3$) and $w_{ij} = d_{ij}$, from (18) and (19) we get $\gamma = 8.56$ and $\alpha = 5.86$. Consequently, $\beta = \gamma / \alpha = 1.46$. Carrying out the multiplicative partitioning of biodiversity for the site parameters $\lambda_k = 1/3$ ($k = 1, 2, 3$) and the weights $w_{ij} = (n(n-1)/2)(1/2)(v_i + v_j) d_{ij}$, from (18) we get $\gamma = 0.75$ and $\alpha = 0.51$. Consequently, $\beta = \gamma / \alpha = 1.47$.

4. Conclusion

Using a measure of biodiversity as a mathematical tool, the distribution of biodiversity at multiple sites of a region has traditionally been investigated by partitioning the regional biodiversity, called γ-diversity, into the average within-site biodiversity, or α-diversity, and the biodiversity among sites, or β-diversity. According to Whittaker [8], who introduced the terminology, β-diversity is the ratio between γ-diversity and α-diversity. This is the multiplicative partitioning of diversity. According to MacArthur [5], MacArthur and Wilson [7], and Lande [17], β-diversity is the difference between γ-diversity and α-diversity. This is the additive partitioning of diversity. All these diversities, namely, α-diversity, β-diversity, and γ-diversity, should be nonnegative numbers. In general, a measure of biodiversity ought to be nonnegative, in which case the corresponding α-diversity and γ-diversity, calculated by using such a measure, are nonnegative as well. The corresponding β-diversity is also nonnegative in the additive partitioning of biodiversity, or not smaller than 1 in the multiplicative partitioning of biodiversity, if the measure of biodiversity used is a concave function of the distribution of the relative abundance of species.

The best known measures of biodiversity are Shannon’s entropy and the Gini-Simpson index. Both of them measure the biodiversity taking into account only the relative abundance of species. The widely used Rao’s index measures the dissimilarity between species taking into account not only the relative abundance of species but also a distance between species, such as the phylogenetic distance. Both Shannon’s entropy and the classic Gini-Simpson index satisfy the mathematical properties (nonnegativity and concavity) that allow them to be successfully used in the additive partitioning of biodiversity. Unfortunately, as was pointed out recently [12, 13], these two measures do not give good results when the number of species is very large. On the other hand, Rao’s index of dissimilarity is not a concave function of the relative abundance of species for arbitrary distances between species and, consequently, can be used in the additive partitioning of biodiversity only for some particular distance matrices, but not in general. The main objective of this paper is to show that the weighted Gini-Simpson quadratic index $GS_D$ given by (11), which is a generalization of the classic Gini-Simpson index $GS$ to the pairs of species, is a suitable measure for use in the standard additive partitioning of biodiversity because, unlike the commonly used Rao’s index of dissimilarity $R$, it is a concave function of the relative abundance of the pairs of species. Unlike the classic Gini-Simpson index $GS$, the weighted Gini-Simpson quadratic index $GS_{n,D}$ behaves very well when the number of species is very large. The index $GS_{n,D}$ may be generalized to get the diversity measure $GS_{n,v,D}$, given by (13), which takes into account not only the number of species, the relative abundance of the pairs of species, and the matrix $D$ of the distances between species, but also a vector $v$ of values assigned to the individual species, such as conservation values. The algebraic transformations (6) and (17) of the weighted Gini-Simpson quadratic indices $GS_w$, for single species, and $GS_W$, for pairs of species, given by (1) and (9), respectively, provide two measures of biodiversity which are suitable for use in the multiplicative partitioning of biodiversity. A detailed numerical example shows how the formulas should be implemented in applications.

From a practical point of view, the new weighted Gini-Simpson measure of biodiversity $GS_{n,v,D}$ is a positive concave function of the relative abundance of the pairs of species that depends essentially on both the matrix $D$ of the distances between species and the conservation values $v$ of the species. It is proposed as a suitable and improved replacement for the well-known Rao’s index in the partitioning of biodiversity.

Appendices

A. An Upper Bound for 𝐺 𝑆 𝑤 ( 𝜃 ) for Individual Species

The weighted Gini-Simpson index $GS_w(\theta)$ is a nonnegative, concave, quadratic function of the distribution of the relative abundance of species $\theta = (p_1, \ldots, p_n)$. We can apply the standard Lagrange multipliers technique from multivariate calculus in order to maximize $GS_w(\theta)$ subject to the constraint $\sum_i p_i = 1$. When the positive weights $w = (w_1, \ldots, w_n)$ are given, the maximum value of the weighted Gini-Simpson index $GS_w(\theta)$, as a function of the weights, satisfies
\[ \max_\theta GS_w(\theta) \le \frac{1}{4} \Bigl[ \sum_i w_i - (n - 2)^2 \Bigl( \sum_i w_i^{-1} \Bigr)^{-1} \Bigr]. \tag{A.1} \]
If the bound from the right-hand side of the inequality (A.1) is denoted by $B_2$, the relative weighted biodiversity satisfies $0 \le GS_w(\theta) / B_2 \le 1$.
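As a sketch of how the bound arises, setting the derivative of $\sum_i w_i p_i (1 - p_i) - \mu (\sum_i p_i - 1)$ with respect to $p_i$ to zero gives $p_i = (1 - \mu / w_i)/2$, and the constraint then fixes $\mu = (n - 2) / \sum_i w_i^{-1}$. This derivation is filled in here and is not spelled out in the paper; the code below evaluates the index at that stationary point and compares it with the right-hand side of (A.1). The function names are illustrative.

```python
def gs_w(theta, w):
    """Weighted Gini-Simpson index, sum_i w_i * p_i * (1 - p_i)."""
    return sum(wi * p * (1.0 - p) for wi, p in zip(w, theta))


def upper_bound_b2(w):
    """Right-hand side of (A.1): (1/4) * [sum w_i - (n - 2)^2 / sum(1 / w_i)]."""
    n = len(w)
    return 0.25 * (sum(w) - (n - 2) ** 2 / sum(1.0 / wi for wi in w))


def stationary_point(w):
    """Lagrange stationary point p_i = (1 - mu / w_i) / 2 with mu = (n - 2) / sum(1 / w_i)."""
    n = len(w)
    mu = (n - 2) / sum(1.0 / wi for wi in w)
    return [0.5 * (1.0 - mu / wi) for wi in w]


w = [18.0, 9.0, 9.0]
p_star = stationary_point(w)
b2 = upper_bound_b2(w)

print("p* =", [round(p, 4) for p in p_star], "sum =", round(sum(p_star), 4))
print("B2 =", round(b2, 4), "GS_w(p*) =", round(gs_w(p_star, w), 4))
```

For these weights every component of the stationary point is positive, so the bound is attained by a genuine distribution. When some $w_i$ is smaller than $\mu$, the stationary point leaves the simplex and the right-hand side of (A.1) is an upper bound that is not attained.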

B. An Upper Bound for 𝐺 𝑆 𝑊 ( Θ ) for the Pairs of Species

The weighted Gini-Simpson index $GS_W(\Theta)$ is a nonnegative, concave, quadratic function of the joint distribution assigned to the pairs of species $\Theta = [\pi_{ij}]$. We can apply the standard Lagrange multipliers technique from multivariate calculus in order to maximize $GS_W(\Theta)$ subject to the constraint $\sum_{i,j} \pi_{ij} = 1$. When the positive weights $W = [w_{ij}]$ are given, the maximum value of the weighted Gini-Simpson index $GS_W(\Theta)$, as a function of the weights, subject to the constraint $\sum_{i,j} \pi_{ij} = 1$, satisfies
\[ \max_\Theta GS_W(\Theta) \le \frac{1}{4} \Bigl[ \sum_{i,j} w_{ij} - (n^2 - 2)^2 \Bigl( \sum_{i,j} w_{ij}^{-1} \Bigr)^{-1} \Bigr]. \tag{B.1} \]
If the bound from the right-hand side of the inequality (B.1) is denoted by $B_4$, the relative weighted biodiversity satisfies $0 \le GS_W(\Theta) / B_4 \le 1$.

Let us note that if $\pi_{ij} = \pi_{ji}$ for the distinct pairs $(i, j)$, and $w_{ij} = w_{ji}$, $w_{ii} = 0$, which happens, for instance, in the important case when $w_{ij} = d_{ij}$, or when
\[ w_{ij} = \frac{n(n-1)}{2} \cdot \frac{v_i + v_j}{2} \, d_{ij}, \tag{B.2} \]
where $v_i > 0$ is the conservation value of species $i$ and $d_{ij}$ is the distance between the distinct species $(i, j)$, then $GS_W(\Theta)$ may be written as
\[ GS_W(\Theta) = 2 \sum_{i < j} w_{ij} \pi_{ij} (1 - \pi_{ij}). \tag{B.3} \]
Maximizing $GS_W(\Theta)$, which in this case depends only on the $n(n-1)/2$ variables $\pi_{ij}$ ($i < j$), subject to the constraint $2 \sum_{i < j} \pi_{ij} = c$, where $0 < c = 1 - \sum_i \pi_{ii} \le 1$, we obtain
\[ \max_\Theta GS_W(\Theta) \le \frac{1}{2} \Bigl[ \sum_{i < j} w_{ij} - \Bigl( \frac{n(n-1)}{2} - 1 \Bigr)^2 \Bigl( \sum_{i < j} w_{ij}^{-1} \Bigr)^{-1} \Bigr]. \tag{B.4} \]
If the bound from the right-hand side of the inequality (B.4) is denoted by $B_5$, the relative weighted biodiversity satisfies
\[ 0 \le \frac{GS_W(\Theta)}{B_5} \le 1. \tag{B.5} \]

C. Concavity of the Weighted Gini-Simpson Index 𝐺 𝑆 𝑤 for Individual Species

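Using the notation from Section 2.2 and taking into account that
\[ \begin{aligned} -\lambda_k^2 p_{i,k}^2 + \lambda_k p_{i,k}^2 &= \lambda_k (1 - \lambda_k) p_{i,k}^2 = \lambda_k (\lambda_1 + \cdots + \lambda_{k-1} + \lambda_{k+1} + \cdots + \lambda_m) p_{i,k}^2 \\ &= (\lambda_1 \lambda_k + \cdots + \lambda_{k-1} \lambda_k + \lambda_k \lambda_{k+1} + \cdots + \lambda_k \lambda_m) p_{i,k}^2, \quad \text{for every } 1 \le k \le m, \end{aligned} \tag{C.1} \]
we get
\[ \begin{aligned} \beta &= GS_w\Bigl( \sum_k \lambda_k \theta_k \Bigr) - \sum_k \lambda_k GS_w(\theta_k) \\ &= \sum_i w_i \Bigl( \sum_k \lambda_k p_{i,k} \Bigr) \Bigl( 1 - \sum_k \lambda_k p_{i,k} \Bigr) - \sum_k \lambda_k \sum_i w_i p_{i,k} (1 - p_{i,k}) \\ &= \sum_i w_i \Bigl[ \sum_k \lambda_k (1 - \lambda_k) p_{i,k}^2 - \sum_{k \ne r} \lambda_k \lambda_r p_{i,k} p_{i,r} \Bigr] \\ &= \sum_i w_i \sum_{k < r} \lambda_k \lambda_r \bigl( p_{i,k}^2 - 2 p_{i,k} p_{i,r} + p_{i,r}^2 \bigr) \\ &= \sum_i w_i \sum_{k < r} \lambda_k \lambda_r (p_{i,k} - p_{i,r})^2 \ge 0. \end{aligned} \tag{C.2} \]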
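The identity (C.2) can be checked numerically on the data of Section 3. The sketch below computes $\beta$ once as $\gamma - \alpha$ and once from the closed form $\sum_i w_i \sum_{k<r} \lambda_k \lambda_r (p_{i,k} - p_{i,r})^2$; the function names `beta_direct` and `beta_pairwise` are illustrative.

```python
from itertools import combinations


def gs_w(theta, w):
    """Weighted Gini-Simpson index, sum_i w_i * p_i * (1 - p_i)."""
    return sum(wi * p * (1.0 - p) for wi, p in zip(w, theta))


def beta_direct(thetas, lambdas, w):
    """beta = GS_w(sum_k lambda_k theta_k) - sum_k lambda_k GS_w(theta_k)."""
    n = len(w)
    q = [sum(lam * t[i] for lam, t in zip(lambdas, thetas)) for i in range(n)]
    alpha = sum(lam * gs_w(t, w) for lam, t in zip(lambdas, thetas))
    return gs_w(q, w) - alpha


def beta_pairwise(thetas, lambdas, w):
    """Last line of (C.2): sum_i w_i sum_{k<r} lambda_k lambda_r (p_ik - p_ir)^2."""
    m = len(thetas)
    return sum(
        wi * sum(lambdas[k] * lambdas[r] * (thetas[k][i] - thetas[r][i]) ** 2
                 for k, r in combinations(range(m), 2))
        for i, wi in enumerate(w)
    )


thetas = [(0.05, 0.60, 0.35), (0.64, 0.08, 0.28), (0.30, 0.45, 0.25)]
lambdas = [1.0 / 3.0] * 3
w = [18.0, 9.0, 9.0]

b_direct = beta_direct(thetas, lambdas, w)
b_pairwise = beta_pairwise(thetas, lambdas, w)
b_identical = beta_pairwise([thetas[0]] * 3, lambdas, w)  # identical sites

print(round(b_direct, 4), round(b_pairwise, 4), b_identical)
```

The closed form makes the nonnegativity of $\beta$ visible term by term: each summand is a nonnegative weight times a squared difference, and it vanishes exactly when all sites with positive $\lambda_k$ share the same abundance for every species with positive weight.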

Remark 1. In the paper [21], two corrections are needed: (a) on page 799, in the first column, rows 12-13, the numerical values should be $\alpha = 0.723$, $\gamma = 0.745$, $\beta = 0.022$; (b) on page 798, (11) should be $\gamma = \sum_{i,j} d_{ij} \bigl( \sum_k \lambda_k p_{i,k} p_{j,k} \bigr) \bigl( 1 - \sum_k \lambda_k p_{i,k} p_{j,k} \bigr)$.

Acknowledgments

The authors would like to thank the editor, Professor Jean-Guy Godin, and the two referees for their detailed and very helpful comments.