Abstract
The distribution of biodiversity at multiple sites of a region has been traditionally investigated through the additive partitioning of the regional biodiversity into the average within-site biodiversity and the biodiversity among sites. The standard additive partitioning of diversity requires the use of a measure of diversity, which is a concave function of the relative abundance of species, such as the Gini-Simpson index, for instance. Recently, it was noticed that the widely used Gini-Simpson index does not behave well when the number of species is very large. The objective of this paper is to show that the new weighted Gini-Simpson index preserves the qualities of the classic Gini-Simpson index and behaves very well when the number of species is large. The weights allow us to take into account the abundance of species, the phylogenetic distance between species, and the conservation values of species. This measure may also be generalized to pairs of species and, unlike Rao’s index, this measure proves to be a concave function of the joint distribution of the relative abundance of species, being suitable for use in the additive partitioning of biodiversity. The weighted Gini-Simpson index may be easily transformed for use in the multiplicative partitioning of biodiversity as well.
1. Introduction
Measuring biodiversity is a major, and much debated, topic in ecology and conservation biology. The simplest measure of biodiversity is the number of species from a given community, habitat, or site. Obviously, this ignores how many individuals each species has. The best known measures of biodiversity that also take into account the relative abundance of species are the Gini-Simpson index and the Shannon entropy. Both measures have been imported into biology from other fields. Thus, Gini [1] introduced his formula in statistics, in 1912. Much later, after 37 years, Simpson [2] pleaded convincingly in favour of using Gini’s formula as a measure of biodiversity. Shannon was an engineer who introduced his discrete entropy in information theory [3], in 1948, as a measure of uncertainty, inspired by Boltzmann’s continuous entropy from classical statistical mechanics [4], defined half a century earlier. Shannon’s formula was adopted by biologists about 17 years later [5–8], as a measure of specific diversity. This import of mathematical formulas has continued. Rényi, a probabilist, introduced his own entropy [9], in order to unify several generalizations of the Shannon entropy. He was a pure mathematician without any interest in applications, but later, Hill [10] claimed that by taking the exponential of Rényi’s entropy we obtain a class of suitable measures of biodiversity, called Hill’s numbers, which were praised by Jost [11–13] as being the “true” measures of biodiversity. In 1982, Rao [14], a statistician, introduced the so-called quadratic entropy, which in fact has nothing to do with the proper entropy and depends not only on the relative abundance of species but also on the phylogenetic distance between species. This function has also been quickly adopted by biologists as a measure of dissimilarity between the pairs of species. In the last 20 years, a lot of other measures of diversity have been proposed.
According to Ricotta [15], there is currently a “jungle” of biological measures of diversity. However, as mentioned by S. Hoffmann and A. Hoffmann [16], there is no unique “true” measure of diversity.
Starting with MacArthur [5], MacArthur and Wilson [7], and Whittaker [8], the distribution of biodiversity at multiple sites of a region has been traditionally investigated through the partitioning of the regional or total biodiversity, called γ-diversity, into the average within-site biodiversity, called α-diversity, and the between-site biodiversity or diversity turnover, called β-diversity. All these diversities, namely, α-diversity, β-diversity, and γ-diversity, should be nonnegative numbers. Unlike α-diversity and γ-diversity, there is no consensus about how to interpret and calculate β-diversity. According to Whittaker [8], who introduced the terminology, β-diversity is the ratio between γ-diversity and α-diversity. This is the multiplicative partitioning of diversity. According to MacArthur [5], Lande [17], and, more recently, Veech et al. [18], β-diversity is the difference between γ-diversity and α-diversity. This is the additive partitioning of diversity.
Let us assume that in a certain region there are $n$ species and $m$ sites, and that $\mathbf{p}_k = (p_{1,k}, \ldots, p_{n,k})$ is the distribution of the relative abundance of species at site $k$. Let $\lambda_k$ be an arbitrary parameter assigned to site $k$, such that $\lambda_k \ge 0$ and $\sum_{k=1}^{m} \lambda_k = 1$. These parameters may be used to make adjustments for differences (in size, altitude, etc.) between the sites. If no adjustment is made, we take these parameters to be equal, that is, $\lambda_k = 1/m$, for every $k$. If μ is a nonnegative measure of diversity, which assigns a nonnegative number to each distribution of the relative abundance of the species, then the corresponding γ-diversity is $\gamma = \mu\left(\sum_{k} \lambda_k \mathbf{p}_k\right)$ and the α-diversity is $\alpha = \sum_{k} \lambda_k \mu(\mathbf{p}_k)$. The β-diversity is taken to be $\beta = \gamma - \alpha$, in the additive partitioning of diversity, and $\beta = \gamma / \alpha$, in the multiplicative partitioning of diversity. In general, a measure of biodiversity ought to be nonnegative, in which case the corresponding α-diversity and γ-diversity calculated by using such a measure are also nonnegative, as they should be. From a systemic point of view, the β-diversity shows to what extent the total, or regional, diversity differs from the average diversity of the communities/habitats/sites taken together, as a system, reflecting the dissimilarity, or differentiation, between communities/habitats/sites of the region with respect to the individual species. If the measure of biodiversity is a concave function of the distribution of the relative abundance of species, then the corresponding β-diversity satisfies $\beta \ge 0$, in the additive partitioning of diversity, and $\beta \ge 1$, in the multiplicative partitioning of diversity, for arbitrary parameters $\lambda_k$. If a measure of diversity is not a concave function of the distribution of the relative abundance of species, then the corresponding β-diversity could be negative, in the additive partitioning of biodiversity, or less than 1, in the multiplicative partitioning of biodiversity, for some parameters satisfying $\lambda_k \ge 0$, $\sum_{k} \lambda_k = 1$, which is absurd.
As discussed by Jost [11, 12], if a measure of diversity is not a concave function of the distribution of the relative abundance of species, we can still attempt the partitioning of biodiversity into α-, β-, and γ-diversity if a new kind of α-diversity may be introduced. This new type of α-diversity would be based on a different way of averaging the diversities of the individual communities/habitats/sites instead of the simple mean value from statistics, which works so well for the concave measures of diversity. However, finding such an unorthodox, nonstandard α-diversity when the measure of biodiversity is not concave is not easy. It is also difficult to find a mathematical interpretation for such a new kind of α-diversity. In spite of the passage of time, the most popular measures of biodiversity are still the species richness, the Shannon entropy $H$, and the Gini-Simpson index $GS$. Both $H$ and $GS$ are concave functions of the distribution of the relative abundance of species and therefore can be used for doing the additive partitioning of biodiversity. The first two Hill’s numbers are mathematical transformations of $H$ and $GS$, namely, $\exp(H)$ and $1/(1 - GS)$, and are used in the multiplicative partitioning of biodiversity. Recently, however, it was noted [13, 19] that neither Shannon’s entropy nor the Gini-Simpson index behaves well when the number of species is very large. On the other hand, when a distance between species, such as the phylogenetic distance for instance, is also taken into account along with the relative abundance of species, Rao’s index [14] is a widely used measure of dissimilarity. But, unfortunately, Rao’s index is not a concave function of the distribution of the relative abundance of species, for an arbitrary distance matrix between species. Consequently, it proves to be suitable for use in the standard additive partitioning of diversity only in some special cases, but not in general.
The objective of this paper is to show that the weighted Gini-Simpson quadratic index, a generalization of the classic Gini-Simpson index of biodiversity, offers a solution to both of the drawbacks just mentioned. Unlike Shannon’s entropy and the classic Gini-Simpson index, this new weighted measure of biodiversity behaves very well even if the number of species is very large. The weights allow us to measure biodiversity when a distance between species and/or conservation values of the species are taken into account, along with the abundance of species. When the phylogenetic distance between species is taken as the weight, the corresponding weighted Gini-Simpson index, unlike Rao’s index, is a concave function of the distribution of the relative abundance of the pairs of species, being suitable for use in the additive partitioning of biodiversity. A simple algebraic transformation makes the weighted Gini-Simpson index suitable for use in the multiplicative partitioning of biodiversity as well.
In Methodology, the weighted Gini-Simpson quadratic index is defined both for individual species and for pairs of species. This new measure of biodiversity is used for calculating the average within-site biodiversity (α-diversity), the intersite biodiversity (β-diversity), and the regional or total biodiversity (γ-diversity). It is also shown that the weighted Gini-Simpson quadratic index may be easily modified, by a simple algebraic transformation, to get a measure of biodiversity suitable for use in the multiplicative partitioning of biodiversity as well. In Section 3, a numerical example is presented, which illustrates how the mathematical formalism should be applied from a practical standpoint.
2. Methodology
2.1. The Weighted Measure of Diversity with Respect to Individual Species
Let us assume that there are $n$ species in a certain community/habitat/site and let $p_i$ be the relative abundance of species $i$ (the number of individuals of species $i$ divided by the total number of individuals in that community/habitat/site). We have diversity if species $i$ is present at that location but other species are found there as well. The probability that species $i$ is present and there are other species present as well is $p_i(1 - p_i)$. If we take all possible values of $p_i$ from the unit interval $[0, 1]$ into account, the wave function $p_i(1 - p_i)$ corresponding to species $i$ is a nonnegative, symmetric, bell-shaped, concave function, reaching its maximum value at $p_i = 1/2$. If we sum up these wave functions, for all species, we obtain the classic Gini-Simpson index $GS(\mathbf{p}) = \sum_{i=1}^{n} p_i(1 - p_i) = 1 - \sum_{i=1}^{n} p_i^2$ corresponding to the given distribution of the relative abundance of the species $\mathbf{p} = (p_1, \ldots, p_n)$. Since 1949, this has been considered to be a very good measure of biodiversity. In order to generalize it, we may assign an amplitude $w_i \ge 0$ to the wave function of species $i$, and the resulting new wave function $w_i p_i(1 - p_i)$ continues to be a nonnegative, symmetric, bell-shaped, concave function of $p_i$, but this time its maximum value is $w_i/4$. Summing up these wave functions for all the species, we get the weighted Gini-Simpson index:
$$GS_w(\mathbf{p}) = \sum_{i=1}^{n} w_i\, p_i (1 - p_i), \tag{1}$$
which depends both on the distribution of the relative abundance of species $\mathbf{p}$ and on the nonnegative weights $\mathbf{w} = (w_1, \ldots, w_n)$. The concavity of $GS_w$ was proven in [20, 21]. The weight $w_i$ could be anything which contributes to the increase in the diversity induced by species $i$. However, the weights must not depend on the relative abundance of species. If $w_i = n$, for each $i$, then (1) becomes the so-called Rich-Gini-Simpson index introduced in [22], which is essentially dependent on the species richness of the respective community/habitat/site. If there are some conservation values assigned to the species , which are positive numbers on a certain scale of values, and the weights are , the corresponding weighted Gini-Simpson index is denoted by .
Obviously, if $w_i = 1$, for each species $i$, then (1) is the classic Gini-Simpson index $GS(\mathbf{p})$. An upper bound for the weighted index $GS_w(\mathbf{p})$ from (1), which depends only on the maximum weight and the number of species, is
$$0 \le GS_w(\mathbf{p}) \le \left(\max_{1 \le i \le n} w_i\right)\left(1 - \frac{1}{n}\right). \tag{2}$$
Dividing $GS_w(\mathbf{p})$ by the bound from the right-hand side of the inequality (2) gives the relative weighted Gini-Simpson index for individual species. In Appendix A, another bound of $GS_w(\mathbf{p})$ is given, which depends on all the weights assigned to the species. If $w_i = 1$, for each species $i$, we get $\max GS(\mathbf{p}) = 1 - 1/n$, and this maximum biodiversity is obtained when all species have the same relative abundance $1/n$. The fact that the maximum value of $GS$ is almost insensitive to the increase of the number of species, tending very slowly to 1 when $n$ increases, allowed Jost [12, 13] and Jost et al. [19] to give some examples showing that the Gini-Simpson index does not behave well when the number of species is very large. R. C. Guiasu and S. Guiasu [22] showed, however, that the Rich-Gini-Simpson index has no such problem. Indeed, if we take $w_i = n$, for each species $i$, we get from (2) the bound $n - 1$, whose value increases markedly when the number of species increases, which makes the criticism of the classic Gini-Simpson index inapplicable.
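The index and its loose bound can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the index is taken to be the sum of $w_i p_i (1 - p_i)$ over species and the loose bound to be the largest weight times $(1 - 1/n)$, which is consistent with the values $1 - 1/n$ and $n - 1$ quoted in the text. The function names and the four-species community are hypothetical.

```python
from typing import Sequence


def weighted_gini_simpson(p: Sequence[float], w: Sequence[float]) -> float:
    """Return sum_i w_i * p_i * (1 - p_i) for abundances p and weights w."""
    if len(p) != len(w):
        raise ValueError("p and w must have the same length")
    if any(x < 0 for x in p) or abs(sum(p) - 1.0) > 1e-9:
        raise ValueError("p must be a probability distribution")
    if any(x < 0 for x in w):
        raise ValueError("weights must be nonnegative")
    return sum(wi * pi * (1.0 - pi) for pi, wi in zip(p, w))


def loose_upper_bound(w: Sequence[float]) -> float:
    """Assumed form of the loose bound: max(w) * (1 - 1/n)."""
    return max(w) * (1.0 - 1.0 / len(w))


# Hypothetical community: four equally abundant species.
n = 4
p_uniform = [1.0 / n] * n
classic = weighted_gini_simpson(p_uniform, [1.0] * n)    # weights w_i = 1
rich = weighted_gini_simpson(p_uniform, [float(n)] * n)  # weights w_i = n
print(classic, loose_upper_bound([1.0] * n))
print(rich, loose_upper_bound([float(n)] * n))
```

For equal abundances both indices reach their loose bounds, $1 - 1/n$ for unit weights and $n - 1$ for the weights $w_i = n$, which illustrates why only the latter keeps growing with the number of species.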
2.2. The Additive Partitioning of Biodiversity with Respect to the Individual Species
Let us assume that in a certain region there are $n$ species and $m$ sites. In what follows, the subscripts $i$ and $j$ refer to species and the subscripts $k$ and $l$ refer to sites ($i, j = 1, \ldots, n$; $k, l = 1, \ldots, m$). Let $\mathbf{p}_k = (p_{1,k}, \ldots, p_{n,k})$ be the vector whose components are the relative abundances of the individual species at site $k$, such that $p_{i,k} \ge 0$ and $\sum_{i} p_{i,k} = 1$, for each $k$. Let $w_1, \ldots, w_n$ be nonnegative weights assigned to the species. In dealing with species diversity, a good measure of the differentiation, or dissimilarity, among the sites in a certain region has to be nonnegative and equal to zero if and only if there is no such difference. We assign a parameter $\lambda_k$ to each site $k$, such that
$$\lambda_k \ge 0, \qquad \sum_{k=1}^{m} \lambda_k = 1. \tag{3}$$
These parameters may be used to make adjustments for differences (in size, altitude, etc.) between the sites, as shown in [23]. If no adjustment is made and we focus only on the species abundance, we take these parameters to be equal, that is, $\lambda_k = 1/m$ for every $k$. As the weighted Gini-Simpson index $GS_w$, given by (1), is a concave function of the distribution of the relative abundance, it may be used in the additive partitioning of biodiversity. The corresponding γ-diversity, reflecting the total or regional biodiversity, the α-diversity, interpreted as the within-site diversity or the average diversity of the sites, and the β-diversity, as a measure of between-site diversity, are given by
$$\gamma = GS_w\left(\sum_{k=1}^{m} \lambda_k \mathbf{p}_k\right), \qquad \alpha = \sum_{k=1}^{m} \lambda_k\, GS_w(\mathbf{p}_k), \qquad \beta = \gamma - \alpha. \tag{4}$$
The β-diversity may be interpreted as a measure of dissimilarity or differentiation between the sites of the respective region with respect to the individual species. As shown in Appendix C, taking into account (4), the β-diversity has the expression given in (5). The upper bound in (5) is quite loose. As β is nonnegative and cannot exceed γ, a better upper bound for β is the value from (A.1) for the given weights. Thus, the relative between-site diversity, or the dissimilarity between sites, is obtained by dividing β by this bound, with respect to the individual species. From (5), we can see that if the species have the same abundance in each site, which means that $p_{i,k} = p_i$, for each site $k$ and each species $i$, the β-diversity is equal to zero, reflecting the fact that in such a case there is no dissimilarity between the sites.
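The additive partitioning can be sketched as follows, assuming the index is the sum of $w_i p_i (1 - p_i)$, that γ is the index of the pooled distribution, that α is the weighted average of the site indices, and that β is their difference. The sketch also evaluates β through squared between-site differences of abundances, a standard variance identity that holds when the site parameters sum to 1; whether this is the exact form of expression (5) is an assumption. All names and data below are hypothetical.

```python
from typing import List, Sequence, Tuple


def gs_w(p: Sequence[float], w: Sequence[float]) -> float:
    """Weighted Gini-Simpson index, assumed form: sum_i w_i p_i (1 - p_i)."""
    return sum(wi * pi * (1.0 - pi) for pi, wi in zip(p, w))


def additive_partition(
    sites: List[Sequence[float]], w: Sequence[float], lam: Sequence[float]
) -> Tuple[float, float, float]:
    """Return (alpha, beta, gamma) with beta = gamma - alpha."""
    n = len(w)
    # Pooled (regional) distribution: site distributions mixed with lam.
    pooled = [sum(lk * site[i] for lk, site in zip(lam, sites)) for i in range(n)]
    gamma = gs_w(pooled, w)
    alpha = sum(lk * gs_w(site, w) for lk, site in zip(lam, sites))
    return alpha, gamma - alpha, gamma


def beta_from_site_differences(
    sites: List[Sequence[float]], w: Sequence[float], lam: Sequence[float]
) -> float:
    """beta as sum_i w_i sum_{k<l} lam_k lam_l (p_ik - p_il)^2."""
    m = len(sites)
    return sum(
        w[i] * lam[k] * lam[l] * (sites[k][i] - sites[l][i]) ** 2
        for i in range(len(w))
        for k in range(m)
        for l in range(k + 1, m)
    )


# Hypothetical region: three sites, three species, equal site parameters.
sites = [[0.7, 0.2, 0.1], [0.1, 0.6, 0.3], [0.3, 0.3, 0.4]]
w = [3.0, 3.0, 3.0]          # Rich-Gini-Simpson weights w_i = n
lam = [1.0 / 3.0] * 3
alpha, beta, gamma = additive_partition(sites, w, lam)
print(alpha, beta, gamma, beta_from_site_differences(sites, w, lam))
```

Because every term of the second formula is a nonnegative weight times a squared difference, β is nonnegative and vanishes when all sites share the same abundances.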
2.3. The Multiplicative Partitioning of Biodiversity with Respect to Individual Species
Dealing with the multiplicative partitioning of diversity, Whittaker [8] suggested the use of the exponential of the Shannon entropy as a measure of biodiversity. The weighted Gini-Simpson index, given by (1), which can be used in the additive partitioning of diversity induced by individual species, may also be transformed into the measure of biodiversity (R. C. Guiasu and S. Guiasu [21]) given by (6), which can be used in the multiplicative partitioning of diversity induced by the individual species. This measure of biodiversity may be viewed as being the weighted version of the classic Hill number of first degree from [10]. The corresponding multiplicative γ-diversity and α-diversity are given by (7). Due to the convexity of the function , as a function of the distribution of the relative abundance of species, the γ-diversity cannot be smaller than the α-diversity, as it should be, and, consequently, the multiplicative β-diversity satisfies the inequality $\beta \ge 1$. In the additive partitioning of diversity, the α-diversity, β-diversity, and γ-diversity are entities of the same kind and may be expressed in the same units. In the multiplicative partitioning of diversity, the β-diversity is simply a ratio between the total, regional biodiversity γ and the average within-site biodiversity α, a numerical indicator showing to what extent the regional biodiversity, as a whole, exceeds the average biodiversity of the sites of the respective region. Obviously, if the sites have the same species and the same abundance of these species, which means that $p_{i,k} = p_i$, for each species $i$, then $\beta = 1$.
2.4. The Weighted Measure of Diversity with Respect to the Pairs of Species
Let $D = [d_{ij}]$ be an $n \times n$ matrix whose entries are the distances between the pairs of species, such that $d_{ij} = d_{ji} \ge 0$ and $d_{ii} = 0$. This could be the matrix of the phylogenetic distance between species, for instance. When Rao introduced his quadratic index, improperly called quadratic entropy [14]:
$$R(\mathbf{p}) = \sum_{i=1}^{n} \sum_{j=1}^{n} d_{ij}\, p_i\, p_j, \tag{8}$$
he had to focus on the pairs of species instead of the individual species, using the distance between the pairs of distinct species, along with their relative abundance, in order to measure the dissimilarity between species. Rao’s indicator is very simple and may be easily interpreted as the average dissimilarity between two individuals belonging to two different species when the phylogenetic distance is taken into account. There have been numerous attempts, such as [24], for instance, at using Rao’s index in the additive partitioning of diversity. Unfortunately, Rao’s index is not a concave function of the distribution of the relative abundance of species for an arbitrary distance matrix $D$. Thus, it can be applied to the additive partitioning of biodiversity only for some special kinds of such matrices (as mentioned in [24], for instance), but not in general. On the other hand, there is no generally accepted proposal of a new kind of nonstandard α-diversity which could be defined for such a measure which is not a concave function of the distribution of the relative abundance of species. This paper shows that the generalization of the weighted Gini-Simpson index to the pairs of species provides a concave measure of diversity which could indeed be used both for the additive partitioning and the multiplicative partitioning of diversity when the phylogenetic distance between species is taken into account. Therefore, this provides a suitable replacement of Rao’s index in the partitioning of biodiversity.
Let $\Pi = [\pi_{ij}]$ be an $n \times n$ matrix where $\pi_{ij}$ is a joint probability of the pair of species $(i, j)$, in this order. As $\pi_{ij}$ is the probability of the pair $(i, j)$, in this order, the probability of the subset of species $\{i, j\}$ is $\pi_{ij} + \pi_{ji}$. We have $\pi_{ij} \ge 0$ and $\sum_{i} \sum_{j} \pi_{ij} = 1$. Let $W = [w_{ij}]$ be an $n \times n$ matrix whose entries are arbitrary nonnegative weights assigned to the pairs of species. However, these weights must not depend on the joint distribution $\Pi$. The weighted Gini-Simpson quadratic index of the pairs of species is
$$GS_W(\Pi) = \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij}\, \pi_{ij} (1 - \pi_{ij}). \tag{9}$$
If $w_{ij} = 1$, for all pairs of species, then (9) becomes the generalization of the classic Gini-Simpson index to the pairs of species and is denoted by $GS(\Pi)$. As a function of the joint distribution $\Pi$, the weighted Gini-Simpson index $GS_W(\Pi)$ is nonnegative and concave, as shown in [20, 21]. An upper bound for $GS_W(\Pi)$ which depends only on the maximum weight and the number of species is
$$0 \le GS_W(\Pi) \le \left(\max_{i,j} w_{ij}\right)\left(1 - \frac{1}{n^2}\right). \tag{10}$$
Dividing $GS_W(\Pi)$ by the bound from the right-hand side of the inequality (10) gives the relative weighted Gini-Simpson index for pairs of species. In Appendix B, another bound for $GS_W(\Pi)$ is given, which depends on all the weights. If $w_{ij} = 1$, for each pair of species $(i, j)$, we get $\max GS(\Pi) = 1 - 1/n^2$, and this maximum biodiversity is obtained when all species have the same relative abundance $1/n$. The special cases of interest are the following ones.
(a) If the species are independent, which means $\pi_{ij} = p_i p_j$, and the weights are $w_{ij} = d_{ij}$, then the weighted Gini-Simpson index (9) is denoted by
$$GS_D(\mathbf{p}) = \sum_{i=1}^{n} \sum_{j=1}^{n} d_{ij}\, p_i p_j (1 - p_i p_j), \tag{11}$$
which generalizes Rao’s index given by (8).
(b) If there are the positive numbers $v_1, \ldots, v_n$, representing conservation values of the individual species, the species are independent, which means $\pi_{ij} = p_i p_j$, and the weights are where is the number of distinct pairs of species $(i, j)$, such that $i < j$, or the number of the pairs of species $\{i, j\}$, and is the average value of the pair of species $\{i, j\}$, then the corresponding weighted Gini-Simpson index (9) is denoted by: which takes into account all the information available, namely, the species richness $n$, the relative abundance of species, the matrix of the distance between species, and the conservation values of the species.
As the measure given in (13) is a nonnegative concave function of the distribution of the relative abundance of the pairs of species, for an arbitrary distance matrix, and also depends explicitly on the species richness, the distance between species, and the conservation values of the species, all these are sufficient reasons to suggest that it could more than adequately replace the use of Rao’s index (8).
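A minimal sketch of the pairwise index next to Rao’s index may help fix ideas. It assumes that the pairwise index is the sum of $w_{ij} \pi_{ij} (1 - \pi_{ij})$ over ordered pairs, that Rao’s index is the sum of $d_{ij} p_i p_j$, and that independence means $\pi_{ij} = p_i p_j$. The function names, the abundances, and the distance matrix are hypothetical.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def independent_joint(p: Sequence[float]) -> Matrix:
    """Joint distribution of ordered pairs under independence: p_i * p_j."""
    return [[pi * pj for pj in p] for pi in p]


def rao_index(p: Sequence[float], d: Matrix) -> float:
    """Rao's quadratic index: sum_ij d_ij p_i p_j."""
    n = len(p)
    return sum(d[i][j] * p[i] * p[j] for i in range(n) for j in range(n))


def gs_pairs(joint: Matrix, w: Matrix) -> float:
    """Assumed pairwise index: sum_ij w_ij * pi_ij * (1 - pi_ij)."""
    n = len(joint)
    return sum(
        w[i][j] * joint[i][j] * (1.0 - joint[i][j])
        for i in range(n)
        for j in range(n)
    )


# Hypothetical data: three species and a symmetric distance matrix.
p = [0.5, 0.3, 0.2]
d = [[0.0, 1.0, 2.0], [1.0, 0.0, 2.0], [2.0, 2.0, 0.0]]
rao = rao_index(p, d)
gs_d = gs_pairs(independent_joint(p), d)  # distances used as weights

# Unit weights and equal abundances: the index should equal 1 - 1/n^2.
ones = [[1.0] * 3 for _ in range(3)]
gs_uniform = gs_pairs(independent_joint([1.0 / 3.0] * 3), ones)
print(rao, gs_d, gs_uniform)
```

With the distances as weights, each term of the pairwise index is the corresponding term of Rao’s index damped by the factor $1 - p_i p_j$, so the pairwise index never exceeds Rao’s index for the same data.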
2.5. The Additive Partitioning of Biodiversity with Respect to the Pairs of Species
Let us assume that in a certain region there are $n$ species and $m$ sites. Again, in what follows, the subscripts $i$ and $j$ refer to species and the subscripts $k$ and $l$ refer to sites ($i, j = 1, \ldots, n$; $k, l = 1, \ldots, m$). Let $\Pi_k = [\pi_{ij,k}]$ be an arbitrary joint probability distribution of the pairs of species within site $k$, where $\pi_{ij,k}$ is the probability of the pair of species $(i, j)$, in this order, within site $k$, such that $\pi_{ij,k} \ge 0$ and $\sum_{i} \sum_{j} \pi_{ij,k} = 1$. Let $W = [w_{ij}]$ be the matrix whose entries are nonnegative weights assigned to the pairs of species. We assign a parameter $\lambda_k$ to each site $k$, satisfying (3). As the weighted Gini-Simpson index $GS_W$, given by (9), is a concave function of the joint distribution assigned to the pairs of species, it may be used in the additive partitioning of biodiversity. The corresponding γ-diversity, reflecting the total or regional biodiversity, the α-diversity, interpreted as the within-site diversity or the average diversity of the sites, and the β-diversity, as a measure of between-site diversity, with respect to the pairs of species, are given by
$$\gamma = GS_W\left(\sum_{k=1}^{m} \lambda_k \Pi_k\right), \qquad \alpha = \sum_{k=1}^{m} \lambda_k\, GS_W(\Pi_k), \qquad \beta = \gamma - \alpha. \tag{14}$$
The β-diversity may be interpreted as a measure of dissimilarity or differentiation between the sites of the respective region with respect to the pairs of species. As shown in [20, 21], taking into account (14), the β-diversity has the expression given in (15). The upper bound in (15) is quite loose. As β is nonnegative and cannot exceed γ, a better upper bound for β is the value from (B.4) of Appendix B for the given weights. Thus, the relative between-site diversity, or the relative dissimilarity between sites, is obtained by dividing β by this bound, with respect to the pairs of species.
Let $\mathbf{p}_k = (p_{1,k}, \ldots, p_{n,k})$ be the vector whose components are the relative abundances of the individual species at site $k$. If the species are independent, $\pi_{ij,k} = p_{i,k}\, p_{j,k}$. Let also $D = [d_{ij}]$ be the matrix of the distances between species. Then, for the weights (12), the corresponding β-diversity (15), given by (16), measures the between-site diversity with respect to the distinct pairs of species when the species richness, the distance between species, and the conservation values are all taken into account along with the relative abundance of the species. If the species have the same abundance in each site, which means that $p_{i,k} = p_i$, for each site $k$ and each species $i$, the corresponding β-diversity is equal to zero, reflecting the fact that in such a case there is no dissimilarity between sites. If we divide (16) by the upper bound from the inequality (B.4) of Appendix B, taking the weights (12), we obtain the relative β-diversity, which has the advantage of always being a number between 0 and 1.
At the same time, let us notice that Rao’s index given by (8) is a linear function of the joint distribution of the relative abundance of the pairs of species, and, consequently, the corresponding β-diversity induced by the pairs of species is equal to zero, for an arbitrary distance matrix and an arbitrary distribution of the relative abundance of individual species. Therefore, Rao’s index is not suitable for use in the standard additive partitioning of diversity induced by pairs of species when an arbitrary dissimilarity distance between species is taken into account. Attempts were made [24, 25] to either find particular cases of distance matrices or define new kinds of nonstandard α-diversity for which the partitioning of diversity can still be performed.
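The contrast between the two indices can be checked numerically. The sketch below assumes that γ is computed on the pooled joint distribution (the site joint distributions mixed with the site parameters), that Rao’s index is the linear form given by the sum of $d_{ij} \pi_{ij}$, and that the pairwise weighted index is the sum of $d_{ij} \pi_{ij} (1 - \pi_{ij})$. All names and numbers are hypothetical.

```python
from typing import Callable, List, Sequence

Matrix = List[List[float]]


def joint_from_abundances(p: Sequence[float]) -> Matrix:
    """Independent species: pi_ij = p_i * p_j."""
    return [[a * b for b in p] for a in p]


def pooled_joint(joints: List[Matrix], lam: Sequence[float]) -> Matrix:
    """Regional joint distribution: sum_k lam_k * Pi_k, entry by entry."""
    n = len(joints[0])
    return [
        [sum(lk * pi[i][j] for lk, pi in zip(lam, joints)) for j in range(n)]
        for i in range(n)
    ]


def rao_joint(pi: Matrix, d: Matrix) -> float:
    """Rao's index as a linear function of the joint distribution."""
    n = len(pi)
    return sum(d[i][j] * pi[i][j] for i in range(n) for j in range(n))


def gs_joint(pi: Matrix, d: Matrix) -> float:
    """Pairwise weighted Gini-Simpson index with weights d_ij (assumed form)."""
    n = len(pi)
    return sum(d[i][j] * pi[i][j] * (1.0 - pi[i][j]) for i in range(n) for j in range(n))


def additive_beta(
    index: Callable[[Matrix, Matrix], float],
    joints: List[Matrix],
    lam: Sequence[float],
    d: Matrix,
) -> float:
    """beta = gamma - alpha for the given index."""
    gamma = index(pooled_joint(joints, lam), d)
    alpha = sum(lk * index(pi, d) for lk, pi in zip(lam, joints))
    return gamma - alpha


# Hypothetical region: two sites, three species, equal site parameters.
d = [[0.0, 1.0, 2.0], [1.0, 0.0, 2.0], [2.0, 2.0, 0.0]]
joints = [joint_from_abundances([0.7, 0.2, 0.1]), joint_from_abundances([0.1, 0.6, 0.3])]
lam = [0.5, 0.5]
beta_rao = additive_beta(rao_joint, joints, lam, d)
beta_gs = additive_beta(gs_joint, joints, lam, d)
print(beta_rao, beta_gs)
```

Up to rounding error, the β-diversity from Rao’s index is zero because a linear function of the pooled distribution equals the same weighted average of its site values, while the quadratic term of the weighted index yields a strictly positive β for these two distinct sites.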
2.6. The Multiplicative Partitioning of Biodiversity with Respect to the Pairs of Species
The weighted Gini-Simpson quadratic index given by (9), which can be used in the additive partitioning of diversity induced by pairs of species, may be transformed into the measure of diversity [21] given by (17), which can be used in the multiplicative partitioning of diversity induced by the pairs of species. This measure of biodiversity may be viewed as being the weighted version for pairs of species of the classic Hill number of first degree from [10]. Using the notations from the previous Section 2.5, the corresponding multiplicative γ-diversity, α-diversity, and β-diversity are given by (18) and (19). Due to the convexity of the function , as a function of the joint distribution, the γ-diversity cannot be smaller than the α-diversity, as it should be, and, consequently, the multiplicative β-diversity satisfies the inequality $\beta \ge 1$. In the additive partitioning of diversity, the α-diversity, β-diversity, and γ-diversity are entities of the same kind and may be expressed in the same units. In the multiplicative partitioning of diversity, the β-diversity is simply a ratio between the total, regional biodiversity γ and the average within-site biodiversity α, a numerical indicator showing to what extent the regional biodiversity, as a whole, exceeds the average biodiversity of the sites of the respective region.
Let $\mathbf{p}_k = (p_{1,k}, \ldots, p_{n,k})$ be the vector whose components are the relative abundances of the individual species at site $k$. If the species are independent, $\pi_{ij,k} = p_{i,k}\, p_{j,k}$. Let also $D = [d_{ij}]$ be the matrix of the distances between species. Then, for the weights (12), the corresponding β-diversity from (18) measures the ratio between the regional biodiversity and the average biodiversity of the sites with respect to the pairs of species. Obviously, if the sites have the same species and the same abundance of these species, which means that $p_{i,k} = p_i$, for each species $i$, then $\beta = 1$.
2.7. The Weighted Shannon Entropy
The weighted Shannon entropy was introduced in [26]. If we have $n$ species such that the distribution of the relative abundance of these species is $\mathbf{p} = (p_1, \ldots, p_n)$ and the nonnegative weights assigned to the species are $w_1, \ldots, w_n$, then the weighted entropy is the nonnegative, concave function $H_w(\mathbf{p}) = -\sum_{i=1}^{n} w_i\, p_i \log p_i$. Similarly, if $W = [w_{ij}]$ is a matrix of nonnegative weights and $\Pi = [\pi_{ij}]$ a joint probability distribution assigned to the pairs of species, the joint weighted entropy is $H_W(\Pi) = -\sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij}\, \pi_{ij} \log \pi_{ij}$. It is possible, in principle, to remake the analysis from Sections 2.1–2.6 using the weighted Shannon entropies $H_w$ and $H_W$ instead of the weighted Gini-Simpson indices for individual species and for pairs of species. However, the Shannon entropy is actually a measure of uncertainty and we cannot justify its use as a measure of diversity, as we did for the Gini-Simpson index at the beginning of Section 2.1. Also, since the Shannon entropy is a logarithmic function, it is much more difficult to obtain simple analytical formulas for its maximum values subject to given constraints. The weighted Gini-Simpson index is a simpler and more effective tool in measuring biodiversity.
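For comparison, the weighted entropies can be sketched in the same style. The sketch assumes the form in which each term $-p \log p$ is multiplied by its weight, uses the natural logarithm (the base is an assumption), and treats $0 \log 0$ as 0. The function names and the data are hypothetical.

```python
import math
from typing import List, Sequence

Matrix = List[List[float]]


def weighted_entropy(p: Sequence[float], w: Sequence[float]) -> float:
    """Assumed form: -sum_i w_i * p_i * log(p_i), skipping zero abundances."""
    return -sum(wi * pi * math.log(pi) for pi, wi in zip(p, w) if pi > 0.0)


def joint_weighted_entropy(joint: Matrix, w: Matrix) -> float:
    """Assumed form: -sum_ij w_ij * pi_ij * log(pi_ij), skipping zero entries."""
    return -sum(
        wij * pij * math.log(pij)
        for row_pi, row_w in zip(joint, w)
        for pij, wij in zip(row_pi, row_w)
        if pij > 0.0
    )


# Hypothetical example: four equally abundant species, unit weights.
p_uniform = [0.25] * 4
h_uniform = weighted_entropy(p_uniform, [1.0] * 4)

# A species that is absent contributes nothing to the entropy.
h_sparse = weighted_entropy([0.5, 0.5, 0.0], [2.0, 1.0, 5.0])
print(h_uniform, h_sparse)
```

With unit weights the function reduces to the ordinary Shannon entropy, so for four equally abundant species it returns $\log 4$ up to rounding.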
3. Discussion
It seems to be much easier to discuss the significance of the concepts introduced in Section 2 by showing a representative numerical example. Let us assume that in a certain region there are three sites ($m = 3$) and three species ($n = 3$). If denotes the absolute abundance (number of individuals) of species within site , let us assume that The corresponding relative abundance is Thus, in this example, , , and .
3.1. Biodiversity with Respect to the Individual Species
Using the Rich-Gini-Simpson index , given by (1) with the weights , to calculate the amount of diversity with respect to the individual species, in each site, we obtain , , and . The maximum biodiversity in this case would be . We can see that the first two sites have almost the same biodiversity, both a little smaller than the biodiversity of the third site which is close to the maximum value, when only the richness and the abundance of species are taken into account.
Let us assume now that the three species have the following conservation values: , , and . These conservation values contribute to the diversity of the three sites. Taking the weights , we have . Therefore, . Using the weighted Gini-Simpson index given by (1), we obtain the following values of the biodiversity of each site: , , and . When the species have these conservation values, the biodiversity of the second and third sites are closer and higher than the biodiversity of the first site. But in order to have a better understanding of these numbers, we have to compare them with the bounds and from the inequalities (2) and (A.1), respectively. For the weights , the loose upper bound for , which takes into account only the number of species and the maximum weight , has the value 12. For the much better upper bound for from (A.1), mentioned in Appendix A, which takes into account the number of species and all the weights , we get the value 8.1. Therefore, we can see that the bound is obviously better than . With respect to , the second and third sites have 81.78% and 95% of the maximum biodiversity for the given weights, whereas the first site has only 62.5%. If we do not discriminate among sites with respect to size, altitude, or any other factor, then the parameters assigned to the three sites are . In such a case, we have According to (4), the -diversity and -diversity, with respect to the single species, are Thus, in the additive partitioning of diversity, the -diversity is . For the weights and , according to the formula (A.1) from Appendix A, the maximum value of is . Therefore, the biodiversity of the entire region is 98.25% of the maximum and the average within-site biodiversity is 79.76%. The value of the between-site diversity shows the average differentiation between sites corresponding to a difference of 18.49% between the values of and . We note that for identical sites, the value of would be equal to zero, as could be seen from (5). 
The advantage of the use of the additive partitioning of biodiversity is that the values of the α-diversity, β-diversity, and γ-diversity are expressed on the same scale of values.
Doing the multiplicative partitioning of biodiversity for , and , from (7) we get and . Consequently, .
3.2. Biodiversity with Respect to the Pairs of Species
Let us assume that we have the matrix of the phylogenetic distances between the three species , where . If we assume that within each site the species are supposed to be independent from the point of view of their relative abundance, then the relative abundance of the pair of species , in this order, is the product of the relative abundance of the corresponding individual species, namely, , within every site . Therefore, the matrices are: If we do not discriminate among sites with respect to size, altitude, or any other factor, then the parameters assigned to the three sites are . In such a case, we have Let us use Rao’s index (8) for doing the additive partitioning of diversity with respect to the pairs of species. Successively, we obtain ,, and . The corresponding -diversity is , and the -diversity is . Consequently, the -diversity is , which is not surprising because Rao’s index is a linear function of the joint distribution of the pairs of species.
If we use the weighted Gini-Simpson index (11) with the weights , we obtain and the corresponding -diversity is the -diversity is , and the -diversity is . Calculating the upper bound of given in the inequality (B.4) from Appendix B, for the weights , which means , we obtain . Compared to this maximum value, represents ; ; ; ; ; .
We take now into account the number of species , the parameters assigned to the sites , the phylogenetic distances between species , and the conservation values of the species , , . The computation of the weighted Gini-Simpson index given by (13), with the weights , gives and the corresponding -diversity is while the -diversity is , which gives the -diversity: . Calculating the upper bound of given in the inequality (B.4) from Appendix B, for the weights , we obtain . Compared to this maximum value, represents ; ; ; ; ; .
Doing the multiplicative partitioning of biodiversity for , and , from (18) and (19), we get and . Consequently, . Doing the multiplicative partitioning of biodiversity for the site parameters , and the weights , from (18), we get and . Consequently, .
4. Conclusion
Using a measure of biodiversity as a mathematical tool, the distribution of biodiversity at multiple sites of a region has traditionally been investigated through the partitioning of the regional biodiversity, called γ-diversity, into the average within-site biodiversity, or α-diversity, and the biodiversity among sites, or β-diversity. According to Whittaker [8], who introduced the terminology, β-diversity is the ratio between γ-diversity and α-diversity. This is the multiplicative partitioning of diversity. According to MacArthur [5], MacArthur and Wilson [7], and Lande [17], β-diversity is the difference between γ-diversity and α-diversity. This is the additive partitioning of diversity. All these diversities, namely, α-diversity, β-diversity, and γ-diversity, should be nonnegative numbers. In general, a measure of biodiversity ought to be nonnegative, in which case the corresponding α-diversity and γ-diversity, calculated by using such a measure, are nonnegative as well. The corresponding β-diversity is also nonnegative in the additive partitioning of biodiversity, or at least 1 in the multiplicative partitioning of biodiversity, if the measure of biodiversity used is a concave function of the distribution of the relative abundance of species.
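As a minimal sketch of the two partitioning schemes described above, the following code uses the classic Gini-Simpson index, 1 minus the sum of squared relative abundances, as the concave measure. The abundances and site parameters are hypothetical, the function names are illustrative only, and the multiplicative β-diversity is computed here as the plain ratio γ/α, not through the transformations (6) and (17) used in the paper.

```python
import numpy as np


def gini_simpson(p):
    """Classic Gini-Simpson index: 1 - sum of squared relative abundances."""
    p = np.asarray(p, dtype=float)
    return float(1.0 - np.sum(p ** 2))


def partition(P, lam, index=gini_simpson):
    """Additive and multiplicative partitioning for sites given as rows of P.

    Assumes the alpha-diversity is strictly positive, so the ratio is defined.
    """
    P = np.asarray(P, dtype=float)
    lam = np.asarray(lam, dtype=float)
    alpha = float(sum(l * index(p) for l, p in zip(lam, P)))  # mean within-site
    gamma = index(lam @ P)                                    # regional
    return {
        "alpha": alpha,
        "gamma": gamma,
        "beta_additive": gamma - alpha,
        "beta_multiplicative": gamma / alpha,
    }


# Hypothetical relative abundances of 3 species at 3 equally weighted sites.
P = [[0.5, 0.3, 0.2],
     [0.2, 0.2, 0.6],
     [0.1, 0.8, 0.1]]
lam = [1.0 / 3.0] * 3

result = partition(P, lam)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

Because the index is concave, the regional value cannot fall below the weighted mean of the site values, so the additive β-diversity is nonnegative and the ratio is at least 1.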
The best known measures of biodiversity are Shannon’s entropy and the Gini-Simpson index. Both of them measure biodiversity taking into account only the relative abundance of species. The widely used Rao’s index measures the dissimilarity between species taking into account not only the relative abundance of species but also a distance between species, such as the phylogenetic distance. Both Shannon’s entropy and the classic Gini-Simpson index satisfy the mathematical properties (nonnegativity and concavity) that allow them to be used in the additive partitioning of biodiversity. Unfortunately, as was pointed out recently [12, 13], these two measures do not give good results when the number of species is very large. On the other hand, Rao’s index of dissimilarity is not a concave function of the relative abundance of species for arbitrary distances between species and, consequently, can be used in the additive partitioning of biodiversity only for some particular distance matrices, but not in general. The main objective of this paper is to show that the weighted Gini-Simpson quadratic index given by (11), which is a generalization of the classic Gini-Simpson index to the pairs of species, is a suitable measure for use in the standard additive partitioning of biodiversity because, unlike the commonly used Rao’s index of dissimilarity, it is a concave function of the relative abundance of the pairs of species. Unlike the classic Gini-Simpson index, the weighted Gini-Simpson quadratic index behaves very well when the number of species is very large. This index may be generalized to obtain the diversity measure given by (13), which takes into account not only the number of species, the relative abundance of the pairs of species, and the matrix of the distances between species, but also a vector of values assigned to the individual species, such as conservation values.
The algebraic transformations (6) and (17) of the weighted Gini-Simpson quadratic indices for single species and for pairs of species, given by (1) and (9), respectively, provide two measures of biodiversity that are suitable for use in the multiplicative partitioning of biodiversity. A detailed numerical example shows how the formulas should be implemented in applications.
From a practical point of view, the new weighted Gini-Simpson measure of biodiversity, a positive concave function of the relative abundance of the pairs of species that depends both on the matrix of the distances between species and on the conservation values of the species, is proposed as a suitable and improved replacement for the well-known Rao’s index in the partitioning of biodiversity.
Appendices
A. An Upper Bound for the Weighted Gini-Simpson Index for Individual Species
The weighted Gini-Simpson index is a nonnegative, concave, quadratic function of the distribution of the relative abundance of species . We can apply the standard Lagrange multipliers technique from multivariate calculus in order to maximize subject to the constraint . When the positive weights are given, the maximum value of the weighted Gini-Simpson index , as a function of the weights, is If the bound from the right-hand side of the inequality (A.1) is denoted by , the relative weighted biodiversity is .
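The Lagrange multiplier argument can be illustrated numerically. The sketch below assumes that the weighted Gini-Simpson index for individual species has the form of the sum of w_i p_i (1 − p_i); this formula is not restated in this appendix, so it should be read as an assumption. Under that assumption, setting the partial derivatives of the Lagrangian to zero gives p_i = (1 − μ/w_i)/2 with μ = (n − 2) / Σ(1/w_i), and the value at this stationary point is (1/4)[Σ w_i − (n − 2)² / Σ(1/w_i)]. The weights and function names are hypothetical.

```python
import numpy as np


def weighted_gs(p, w):
    """Assumed form of the weighted Gini-Simpson index: sum w_i p_i (1 - p_i)."""
    p = np.asarray(p, dtype=float)
    w = np.asarray(w, dtype=float)
    return float(np.sum(w * p * (1.0 - p)))


def stationary_point(w):
    """Stationary point of the Lagrangian under the constraint sum(p) = 1."""
    w = np.asarray(w, dtype=float)
    n = w.size
    mu = (n - 2) / np.sum(1.0 / w)       # Lagrange multiplier
    return 0.5 * (1.0 - mu / w)


def lagrange_bound(w):
    """Value of the index at the stationary point (upper bound on the simplex)."""
    w = np.asarray(w, dtype=float)
    n = w.size
    return float(0.25 * (w.sum() - (n - 2) ** 2 / np.sum(1.0 / w)))


w = np.array([1.0, 2.0, 4.0])            # hypothetical positive weights
p_star = stationary_point(w)
bound = lagrange_bound(w)

# Compare against randomly sampled distributions on the simplex.
rng = np.random.default_rng(0)
samples = rng.dirichlet(np.ones(w.size), size=1000)
largest_sampled = max(weighted_gs(p, w) for p in samples)

print("stationary point:", p_star)
print("bound:", bound, "largest sampled value:", largest_sampled)
```

For strongly unbalanced weights the stationary point may have negative components, in which case it is not a probability distribution; the value still bounds the index from above, since it is the maximum of a concave function over the whole hyperplane, but the bound is then not attained.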
B. An Upper Bound for the Weighted Gini-Simpson Index for the Pairs of Species
The weighted Gini-Simpson index is a nonnegative, concave, quadratic function of the joint distribution assigned to the pairs of species . We can apply the standard Lagrange multipliers technique from multivariate calculus in order to maximize subject to the constraint . When the positive weights are given, the maximum value of the weighted Gini-Simpson index , as a function of the weights, subject to the constraint , is If the bound from the right-hand side of the inequality (B.1) is denoted by , the relative weighted biodiversity is .
Let us note that if , for the distinct pairs , and , which happens, for instance, in the important case when , or when where is the conservation value of species and is the distance between the distinct species , then may be written as Maximizing , which in this case depends only on variables , , subject to the constraint: , where , we obtain If the bound from the right-hand side of the inequality (B.4) is denoted by , the relative weighted biodiversity is
C. Concavity of the Weighted Gini-Simpson Index for Individual Species
Using the notation from Section 2.2 and taking into account that we get
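A numerical check of the concavity property may help here. The sketch assumes the index has the form of the sum of w_i p_i (1 − p_i), which is not restated in this appendix. Under that assumption the linear terms cancel in the difference between the index of the mixture and the mixture of the indices, leaving a weighted sum of variances of the site abundances, which is nonnegative. The weights, abundances, and site parameters are hypothetical.

```python
import numpy as np


def weighted_gs(p, w):
    """Assumed form of the weighted Gini-Simpson index: sum w_i p_i (1 - p_i)."""
    return float(np.sum(w * p * (1.0 - p)))


w = np.array([1.0, 2.0, 4.0])            # hypothetical positive weights
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.2, 0.6],
              [0.1, 0.8, 0.1]])          # row k = abundances at site k
lam = np.array([0.5, 0.3, 0.2])          # site parameters, summing to 1

p_bar = lam @ P                          # mixture distribution

# Index of the mixture minus the mixture of the site indices.
gap = weighted_gs(p_bar, w) - sum(
    l * weighted_gs(p, w) for l, p in zip(lam, P)
)

# The same quantity as a weighted sum of per-species variances across sites.
variance_form = float(np.sum(w * (lam @ P ** 2 - p_bar ** 2)))

print("gap:", gap)
print("variance form:", variance_form)
```

The two printed numbers agree up to rounding, and both are nonnegative, which is the concavity inequality for this choice of sites and parameters.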
Remark 1. In the paper [21], two corrections are needed: (a) on page 799, in the first column, the rows 12-13, the numerical values should be , ; (b) on page 798, (11) should be: .
Acknowledgments
The authors would like to thank the editor, Professor Jean-Guy Godin, and the two referees for their detailed and very helpful comments.