Complexity

Volume 2017, Article ID 8715605, 10 pages

https://doi.org/10.1155/2017/8715605

## Advancing Shannon Entropy for Measuring Diversity in Systems

R. Rajaram,^{1} B. Castellani,^{2} and A. N. Wilson^{3}

^{1}Department of Mathematical Sciences, Kent State University, Kent, OH, USA

^{2}Department of Sociology, Kent State University, 3300 Lake Rd. West, Ashtabula, OH, USA

^{3}School of Social and Health Sciences, Abertay University, Dundee DD1 1HG, UK

Correspondence should be addressed to R. Rajaram; rrajaram@kent.edu

Received 31 January 2017; Revised 5 April 2017; Accepted 23 April 2017; Published 24 May 2017

Academic Editor: Enzo Pasquale Scilingo

Copyright © 2017 R. Rajaram et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

From economic inequality and species diversity to power laws and the analysis of multiple trends and trajectories, diversity within systems is a major issue for science. Part of the challenge is measuring it. Shannon entropy $H$ has been used to rethink diversity within probability distributions, based on the notion of information. However, there are two major limitations to Shannon’s approach. First, it cannot be used to compare diversity distributions that have different levels of scale. Second, it cannot be used to compare parts of diversity distributions to the whole. To address these limitations, we introduce a renormalization of probability distributions based on the notion of *case-based entropy* $C_c$ as a function of the cumulative probability $c$. Given a probability density $p(x)$, $C_c$ measures the diversity of the distribution up to a cumulative probability of $c$, by computing the length or support of an equivalent uniform distribution that has the same Shannon information as the conditional distribution of $p(x)$ up to cumulative probability $c$. We illustrate the utility of our approach by renormalizing and comparing three well-known energy distributions in physics, namely, the Maxwell-Boltzmann, Bose-Einstein, and Fermi-Dirac distributions for the energy of subatomic particles. The comparison shows that $C_c$ is a vast improvement over $H$ as it provides a scale-free comparison of these diversity distributions and also allows for a comparison between parts of these diversity distributions.

#### 1. Diversity in Systems

Statistical distributions play an important role in any branch of science that studies systems comprised of many similar or identical particles, objects, or actors, whether material or immaterial, human or nonhuman. One of the key features that determines the characteristics and range of potential behaviors of such systems is the degree and distribution of diversity, that is, the extent to which the components of the system occupy states with similar or different features.

As Page outlined in a series of inquiries [1, 2], including *The Difference* and *Diversity and Complexity*, diversity within systems is an important concern for science, be it making sense of economic inequality, expanding the trade portfolio of countries, measuring the collapse of species diversity in various ecosystems, or determining the optimal utility/robustness of a network. However, a major challenge in the literature on diversity and complexity, which Page also points out [1, 2], remains: the issue of measurement. Although statistical distributions that directly reflect the spread of key parameters (such as mass, age, wealth, or energy) provide descriptions of this diversity, it can be difficult to compare the diversity of different distributions or even the same distribution under different conditions, mostly because of differences in scales and parameters. Also, many of the measures currently available compress diversity into a single score or are not intuitive [1–4].

At the outset, motivated by examples of measuring diversity in ecology and evolutionary biology from [3, 4], we sought to address these challenges. We begin with some definitions and a review of our previous research.

First, in terms of definitions, we follow the ecological literature, defining *diversity* as the interplay of “richness” and “evenness” in a probability distribution. *Richness* refers to the number of different diversity types in a system. Examples include (a) the different levels of household income in a city, (b) the number of different species in an ecosystem, (c) the diversity of a country’s exports, (d) the distribution of different nodes in a complex network, (e) the various health trends for a particular disease across time/space, or (f) the cultural or ethnic diversity of an organization or company. In all such instances, the greater the number of diversity types (be these types discrete or continuous), the greater the degree of richness in a system. In the case of the current study, for example, *richness* was defined as the number of different energy states.

In turn, *evenness* refers to the uniformity or “equiprobability” of occurrence of such states. In terms of the above examples, *evenness* would be defined as (a) a city where household income was evenly distributed, (b) an ecosystem where the diversity of its species was equal in number, (c) a country with an even distribution of exports, (d) a complex network where all nodes had the same probability of occurrence, (e) a disease where all possible health trends were equiprobable, or (f) a company or organization where people of different cultural or ethnic backgrounds were evenly distributed. In the case of the current study, for example, *evenness* was defined as the uniformity or “equiprobability” of the occurrence of all possible energy states.

More specifically, as we will see later in the paper, we define the diversity $D$ of a probability distribution as the number of equivalent equiprobable types required to maintain the same amount of Shannon entropy $H$ (i.e., the number of Shannon-equivalent equiprobable states). Given such a definition, a system with a high degree of richness and evenness would have a higher degree of $D$, whereas a system with a low degree of richness and evenness would have a low degree of $D$. In turn, a system with high richness but low evenness (as in the case of a skewed-right system with long tail) would have a lower degree of $D$ than a system with high richness and high evenness.
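This definition can be made concrete with a short numerical sketch (our own illustration; the two distributions below are invented for the example). The number of Shannon-equivalent equiprobable types is $D = e^{H}$:

```python
import math

def shannon_entropy(p):
    """Shannon entropy H = -sum(p_i * ln p_i), using the natural log."""
    return -sum(q * math.log(q) for q in p if q > 0)

def true_diversity(p):
    """D = e^H: the number of equiprobable types with the same entropy."""
    return math.exp(shannon_entropy(p))

# Four types, perfectly even: richness 4, maximal evenness.
even = [0.25, 0.25, 0.25, 0.25]
# Four types, heavily skewed: same richness, much lower evenness.
skewed = [0.85, 0.05, 0.05, 0.05]

print(true_diversity(even))    # ~4.0: all four types count fully
print(true_diversity(skewed))  # ~1.8: fewer Shannon-equivalent types
```

Both distributions have the same richness, but the skewed one is Shannon-equivalent to fewer than two equiprobable types, reflecting its low evenness.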

##### 1.1. Purpose of the Current Study

Recently, we have introduced a novel approach to representing diversity within statistical distributions [5, 6], which overcomes such difficulties and allows the distribution of diversity in any given system (or cumulative portions thereof) to be directly compared to the distribution of diversity within any other system. In effect, it is a *renormalization* that can be applied to any probability distribution to produce a direct representation of the distribution of diversity within that distribution. Arising from our work in the area of complex systems, the approach is based on the notion of *case-based entropy*, $C_c$ [5]. This approach has two major advantages over the Shannon entropy $H$, which, as we alluded to above, is one of the most commonly used measures of diversity within probability distributions and which calculates the average amount of uncertainty (or information, depending on one’s perspective) present in a given probability distribution. First, $C_c$ can be used to compare distributions that have different levels of scale; and, second, $C_c$ can be used to compare parts of distributions to their whole.

After developing the concept and formalism for case-based entropy for discrete distributions [5], we first applied it to compare complexity across a range of complex systems [6]. In that work, we investigated a series of systems described by a variety of skewed-right probability distributions, choosing examples that are often suggested to exhibit behaviors indicative of complexity such as emergent collectivity, phase changes, or tipping points. What we found was that such systems obeyed an apparent “limiting law of restricted diversity” [6], which constrains the majority of cases in these complex systems to simpler types. In fact, for these types of distribution, the distributions of diversity were found to follow a scale-free rule, with $60\%$ or more of cases belonging to the simplest $40\%$ or less of equiprobable diversity types. This was found to be the case regardless of whether the original distribution fit a power law or was long-tailed, making it fundamentally distinct from the well-known (but often misunderstood) Pareto Principle [7].

In the following, we continue to explore the use of case-based entropy in comparing systems described by statistical distributions. However, we now go beyond our prior work in the following ways. First, we extend the formalism in order to compute case-based entropy for continuous as well as discrete distributions. Second, we broaden our focus from complexity/complex systems to diversity in *any* type of statistically distributed system. That is, we start to explore distributions of diversity for systems where richness is not a function of the degree of complexity types.

Third, the discrete indices we used had a degree of subjectivity to them; for example, how should household income be binned, and what influence does that have on the distribution of diversity? As such, we wanted to see how well $C_c$ worked for distributions where the unit of measurement was universally agreed upon.

Fourth, we had not emphasized how $C_c$ was a major advance on Shannon entropy $H$. As is well known, while $H$ has proven useful, it compresses its measurement of diversity into a single number; it is also nonintuitive; and, as we stated above, it is not scale-free and therefore cannot be used to compare the diversity of different systems; neither can it be used to compare parts of the diversity within a system to the entire system.

Hence, the purpose of the current study, as a demonstration of the utility of $C_c$, is to renormalize and compare three physically significant energy distributions in statistical physics: the energy probability density functions for systems governed by Boltzmann, Bose-Einstein, and Fermi-Dirac statistics.

#### 2. Renormalizing Probability: Case-Based Entropy and the Distribution of Diversity

The quantity *case-based entropy* $C_c$ [5] renormalizes the diversity contribution of any probability distribution $p(x)$, by computing the true diversity $D$ of an equiprobable distribution (called the *Shannon-equivalent uniform distribution*) that has the same Shannon entropy as $p(x)$. $D$ is precisely the number of equiprobable types in the case of a discrete distribution, or the length, support, or extent of the variable $x$ in the case of continuous distributions, which is required to keep the value of the Shannon entropy the same across the whole or any part of the distribution up to a cumulative probability $c$. We choose the Shannon-equivalent uniform distribution for two reasons:

(i) First, it is well known that, on a finite measure space, the uniform distribution maximizes entropy: that is, the uniform distribution has the maximal entropy among all probability distributions on a set of finite Lebesgue measure [8].

(ii) Second, a Shannon-equivalent uniform distribution will, by definition, count the number of values (or range of values) of $x$ that are required to give the same information as the original distribution if we assume that all the values (or range of values) are equally probable.

Hence, the uniform distribution renormalizes the effect of varying relative frequencies (or probabilities) of occurrence of the values of $x$ without losing information (or entropy). In other words, if all choices of the random variable $x$ are equally likely, the number of values (or the length, if it is a continuous random variable) needed for the random variable to keep the same amount of information as the given distribution is a measure of diversity. In a sense, each new value (or type) is counted as adding to the diversity, only if the new value has the same probability of occurrence as the existing values. Diversity necessarily requires the values of the random variable to be equiprobable, since a lower probability, for example, means that such values occur rarely and hence cannot be treated as equally diverse as other values with higher probabilities. Hence, by choosing an equiprobable (or uniform) distribution for normalization, we are counting the true diversity, that is, the number of equiprobable types that are required to match the same amount of Shannon information as the given distribution.

This calculation (as we have shown elsewhere [5]) can be done for parts of the distribution up to a cumulative probability of $c$. This means that a comparison of $C_c$ for a variety of distributions is actually a comparison of the variation of the fraction of diversity contributed by values of the random variable up to $c$.

Since, regardless of the scale and units of the original distribution, $c$ and $C_c$ both vary from $0$ to $1$, one can plot a curve for $C_c$ versus $c$ for multiple distributions on the same axes. $C_c$ thus provides us with a scale-free measure to compare distributions without omitting any of the entropy information, but by renormalizing the variable to one that has equiprobable values. What is more, it also allows us to compare different parts of the same distribution, or parts to wholes. That is, we can generate a $C_c$ versus $c$ curve for any part of a distribution (normalizing the probabilities to add up to $1$ in that part) and compare the curve of the part to the curve of the whole or another part to see if the functional dependence of $C_c$ on $c$ is the same or different. In essence, $C_c$ has the ability to compare distributions in a “fractal” or self-similar way.

In [5], we showed how to carry out the renormalization for discrete probability distributions, both mathematical and empirical. In this paper, as we stated in the Introduction, we make the case for how $C_c$ constitutes an advance over $H$, in terms of providing a scale-free comparison of probability distributions and also comparisons between parts of distributions. More importantly, we demonstrate how $C_c$ works for continuous distributions, by examining the Maxwell-Boltzmann, Bose-Einstein, and Fermi-Dirac distributions for the energy of subatomic particles. We begin with a more detailed review of $C_c$.

#### 3. Case-Based Entropy of a Continuous Random Variable

Our impetus for making an advance over the Shannon entropy $H$ comes from the study of diversity in evolutionary biology and ecology, where it is employed to measure the true diversity of species (types) in a given ecological system of study [3, 4, 9, 10]. As we show here, it can also be used to measure the diversity of an arbitrary probability distribution of a continuous random variable.

Given the probability density function $p(x)$ of a random variable $x$ in a measure space, the Shannon–Wiener entropy index $H$ is given by

$$H = -\int p(x)\,\ln p(x)\,dx.$$

The problem, however, with the Shannon entropy index $H$, as we identified in our abstract and Introduction, is that while being useful for studying the diversity of a single system, it cannot be used to compare the diversity across probability distributions. In other words, $H$ is not multiplicative: a doubling of the value of $H$ does not mean that the actual diversity has doubled. To address this problem, we turned to the *true diversity* measure $D$ [3, 11, 12], which gives the range of equiprobable values of $x$ that gives the same value of $H$:

$$D = e^{H}.$$

The utility of $D$ for comparing the diversity across probability distributions is that, in $D$, a doubling of the value means that the number of equiprobable ranges of values of $x$ has doubled as well. $D$ calculates the range of such equiprobable values of $x$ that will give the same value of Shannon entropy as observed in the distribution of $p(x)$. We say that two probability densities $p_1(x)$ and $p_2(x)$ are Shannon-equivalent if they have the same value of Shannon entropy. Case-based entropy is then built from the range of values of $x$ for the Shannon-equivalent uniform distribution for $p(x)$. We also note that Shannon entropy can be recomputed from $D$ by using $H = \ln D$.

In order to measure the distribution of diversity, we next need to determine the fractional contribution to overall diversity up to a cumulative probability $c$. In other words, we need to be able to compute the diversity contribution up to a certain cumulative probability $c$. To do so, we replace $H$ with $H_c$, the conditional entropy, given that only the portion of the distribution up to a cumulative probability $c$ (ending at the value $x_c$) is observed, with conditional probability of occurrence with density $p(x)/c$ up to the given cumulative probability $c$. That is,

$$c = \int_a^{x_c} p(x)\,dx, \qquad H_c = -\int_a^{x_c} \frac{p(x)}{c}\,\ln\!\left(\frac{p(x)}{c}\right)dx, \qquad D_c = e^{H_c}.$$

The value of $D_c$ for a given value of cumulative probability $c$ is the number of Shannon-equivalent equiprobable energy states (or of values of the variable on the $x$-axis in general) that are required to explain the information up to a cumulative probability of $c$ within the distribution. If $c = 1$, then $D_1 = D$ is the number of such Shannon-equivalent equiprobable energy states for the entire distribution itself.

We can then simply calculate the fractional diversity contribution or case-based entropy $C_c$ as

$$C_c = \frac{D_c}{D} = \frac{e^{H_c}}{e^{H}}.$$

It is at this point that the renormalization ($C_c$ as a function of $c$) becomes scale independent, as both axes range between values of $0$ and $1$, with the graph of $C_c$ versus $c$ passing through $(0,0)$ and $(1,1)$. Hence, irrespective of the range and scale of the original distributions, all distributions can be plotted on the same graph and their diversity contributions can be compared in a scale-free manner.

To check the validity of our formalism, we calculate $C_c$ for the simple case of a uniform distribution given by $p(x) = 1/(b-a)$ on the interval $[a,b]$. Intuitively, if we choose $x_c \in [a,b]$, then, owing to the uniformity of the distribution, we expect $C_c = c$ itself. In other words, the diversity of the part is simply equal to $x_c - a$, that is, the length of the interval $[a, x_c]$, and hence the $C_c$ versus $c$ curve will simply be the straight line with slope equal to $1$. This can be shown as follows:

$$c = \int_a^{x_c} \frac{dx}{b-a} = \frac{x_c - a}{b-a}, \qquad H_c = -\int_a^{x_c} \frac{1}{x_c - a}\,\ln\!\left(\frac{1}{x_c - a}\right)dx = \ln(x_c - a),$$

$$C_c = \frac{e^{H_c}}{e^{H}} = \frac{x_c - a}{b - a} = c.$$

With our formulation of $C_c$ complete, we turn to the energy distributions for particles governed by Boltzmann, Bose-Einstein, and Fermi-Dirac statistics.

#### 4. Results

##### 4.1. $C_c$ for the Boltzmann Distribution in One Dimension

We first illustrate our renormalization by applying it to a relatively simple case: that of an ideal gas at temperature $T$. The kinetic energies of particles in such a gas are described by the Boltzmann distribution [8]. In one dimension, this is

$$p(E) = \frac{1}{\sqrt{\pi k T E}}\, e^{-E/kT},$$

where $k$ is the Boltzmann constant and $E \geq 0$.

The entropy of $p(E)$ can be shown to be $H = \ln\!\left(kT\sqrt{\pi}/2\right) + (1-\gamma)/2$, where $\gamma$ is the Euler–Mascheroni constant, and hence the true diversity of energy in the range $[0, \infty)$ is given by

$$D = e^{H} = \frac{\sqrt{\pi}}{2}\, kT\, e^{(1-\gamma)/2}.$$
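The closed form $H = \ln(kT\sqrt{\pi}/2) + (1-\gamma)/2$ for the one-dimensional Boltzmann density can be verified by direct numerical integration. The sketch below is our own check (the value of $kT$ is arbitrary); substituting $E = kTu^2$ turns the integral against $p(E)\,dE$ into one against $(2/\sqrt{\pi})e^{-u^2}\,du$, which removes the integrable singularity at $E = 0$:

```python
import math

kT = 0.7  # arbitrary temperature scale for the check
gamma = 0.5772156649015329  # Euler-Mascheroni constant

# With E = kT*u^2: p(E) dE -> (2/sqrt(pi)) exp(-u^2) du, and
# ln p(E) = -u^2 - ln(sqrt(pi) * kT * u).
n, umax = 200000, 8.0
h = umax / n
H_numeric = 0.0
for i in range(n):
    u = (i + 0.5) * h
    w = (2.0 / math.sqrt(math.pi)) * math.exp(-u * u)
    H_numeric += -w * (-u * u - math.log(math.sqrt(math.pi) * kT * u)) * h

H_closed = math.log(kT * math.sqrt(math.pi) / 2.0) + (1.0 - gamma) / 2.0
print(H_numeric, H_closed)  # agree to several decimal places
```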

The cumulative probability from $0$ to $E$ is then given by

$$c = \int_0^{E} \frac{e^{-E'/kT}}{\sqrt{\pi k T E'}}\,dE' = \operatorname{erf}\!\left(\sqrt{\frac{E}{kT}}\right).$$

Hence, $E$ can be computed in terms of $c$ as

$$E = kT\left[\operatorname{erf}^{-1}(c)\right]^{2}.$$
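The forward map $c = \operatorname{erf}(\sqrt{E/kT})$ and its inverse $E = kT[\operatorname{erf}^{-1}(c)]^2$ are easy to compute: Python's standard library provides `math.erf`, and the inverse error function can be built from the normal quantile via $\operatorname{erf}^{-1}(x) = \Phi^{-1}((x+1)/2)/\sqrt{2}$ (a sketch of our own; the $kT$ value is an arbitrary choice):

```python
import math
from statistics import NormalDist

def cumulative_prob(E, kT):
    """c = erf(sqrt(E / kT)) for the one-dimensional Boltzmann density."""
    return math.erf(math.sqrt(E / kT))

def energy_from_c(c, kT):
    """E = kT * [erfinv(c)]^2, with erfinv built from the normal quantile."""
    erfinv = NormalDist().inv_cdf((c + 1.0) / 2.0) / math.sqrt(2.0)
    return kT * erfinv ** 2

kT = 0.025  # arbitrary scale; the round trip works for any kT > 0
E = energy_from_c(0.6, kT)
print(cumulative_prob(E, kT))  # recovers 0.6
```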

The relation $E = kT[\operatorname{erf}^{-1}(c)]^{2}$ is useful in the one-dimensional Boltzmann case to eliminate the parameter $E$ altogether and obtain an explicit relationship between $C_c$ and $c$. It is to be noted that, in most cases, $C_c$ and $c$ can only be parametrically related through $x_c$. The other quantities introduced in Section 3 can then be calculated as follows:

$$H_c = -\int_0^{E} \frac{p(E')}{c}\,\ln\!\left(\frac{p(E')}{c}\right)dE', \qquad D_c = e^{H_c}, \qquad C_c = \frac{D_c}{D}.$$

We note that, in the resulting expression for $C_c$, the temperature factor $kT$ cancels out, indicating that the distribution of diversity for an ideal gas in one dimension is independent of temperature. The resulting graph of $C_c$ as a function of $c$ is shown in Figure 1. It is worth noting in passing that $C_c$ remains well below $0.4$ for all $c \leq 0.6$, indicating that over $60\%$ of the molecules in the gas are contained within the lower $40\%$ of the *diversity* of energy probability states at all temperatures (here, *diversity* is defined as the number of equivalent equiprobable energy states required to maintain the same amount of Shannon entropy $H$). Thus, the one-dimensional Boltzmann distribution obeys an interesting phenomenon that we have identified in a wide range of skewed-right complex systems, which (as we briefly discussed in the Introduction) we call *restricted diversity* and, more technically, the 60/40 rule [6]. The independence of temperature in the $C_c$ versus $c$ curve, for the Boltzmann distribution, shows that the effect of increasing $T$ is to shift the mean of the distribution to higher energies and to increase its standard deviation, but not to change its characteristic shape. Still, what is key to our results is that the temperature independence of the $C_c$ curve for the Boltzmann distribution in one dimension validates that our renormalization preserves the fundamental features of the original distribution.
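The temperature independence can also be confirmed numerically. The sketch below is our own (the two temperatures and the grid size are arbitrary choices): it computes $C_c$ for the one-dimensional Boltzmann density via the substitution $E = kTu^2$, which removes the singularity at $E = 0$, and checks that two very different temperatures give the same value.

```python
import math
from statistics import NormalDist

def boltzmann_Cc(c, kT, n=100000, umax=8.0):
    """C_c for p(E) = exp(-E/kT)/sqrt(pi*kT*E), computed with E = kT*u^2.

    In the u variable the weight is (2/sqrt(pi))*exp(-u^2), with
    ln p = -u^2 - ln(sqrt(pi)*kT*u); the cut c = erf(u_c) fixes u_c.
    """
    u_c = NormalDist().inv_cdf((c + 1.0) / 2.0) / math.sqrt(2.0)  # erfinv(c)

    def neg_plogp(lo, hi):
        """-integral of p ln p over the u-interval [lo, hi], midpoint rule."""
        h = (hi - lo) / n
        total = 0.0
        for i in range(n):
            u = lo + (i + 0.5) * h
            w = (2.0 / math.sqrt(math.pi)) * math.exp(-u * u)
            total += -w * (-u * u - math.log(math.sqrt(math.pi) * kT * u)) * h
        return total

    H = neg_plogp(0.0, umax)
    H_c = math.log(c) + neg_plogp(0.0, u_c) / c
    return math.exp(H_c - H)

print(boltzmann_Cc(0.6, kT=1.0))    # same value at any temperature...
print(boltzmann_Cc(0.6, kT=0.025))  # ...the kT factor cancels in the ratio
```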