Abstract

We deal with a functional equation that plays an important role in random graphs and in branching processes. In branching processes, the functional equation relates offspring probabilities to population size probabilities, while in random graphs it relates degree probabilities to small component size probabilities. We present an iterative scheme that allows computing the size probabilities numerically. It is also theoretically possible to invert the iteration, although this inverse iteration is numerically unstable.

1. Introduction

Let $G$ and $H$ be two probability generating functions that are linked through the functional equation
$$H(x) = x\,G(H(x)).$$

Functions of this type occur in branching processes and in random graphs [16]. In branching processes, $G$ represents the probabilities of new offspring from a member of the population and $H$ represents the population size probabilities. In the configuration model [4] of random graphs, $G$ represents the excess degree probabilities of a vertex in small components and $H$ represents the small component size probabilities. Note that, in both cases, $H$ can be a defective generating function; that is, $H(1) < 1$.

Usually $G$ is given and $H$ has to be computed. Only in rare cases is it possible to find an explicit analytic expression of $H$. However, a numerical iteration to compute the coefficients of $H$ is possible. To the best of our knowledge, such a question has not been investigated, and it seems that the iteration we propose in this paper is new.

Interestingly enough, this iteration can be inverted; that is, from the size distribution, we can infer the degree probabilities. We also present this inverse iteration, although it has to be remarked that it is numerically unstable.

The paper is organized as follows. In Section 2, we provide the mathematical background by referring to the case of random graphs. Then, in Section 3, we present the main result, that is, the iteration to compute the size probabilities of the small components of the graph. The possibility of inverting this computation is presented in Section 4. Then, in Section 5, we point out how the same iteration can be used for a branching process. Some conclusions are presented in Section 6.

2. Mathematical Background

We first present our result by explicitly referring to random graphs in the configuration model for which the picture is more complex. In a later section, we show how to relate the iteration to branching processes. Hence, all definitions in this section and in Sections 3 and 4 are related to random graphs.

A random graph has assigned degree probabilities $p_k$, $k = 0, 1, 2, \ldots$, and $p_k$ is the probability that a randomly selected vertex has degree $k$. We recall that the degree of a vertex is the number of vertices adjacent to it. The study of random graphs through generating functions is asymptotic; that is, it assumes an infinite number of vertices. Let $G_0$ be the probability generating function of the degree distribution; that is,
$$G_0(x) = \sum_{k \ge 0} p_k\,x^k. \tag{1}$$
Let $z = G_0'(1) = \sum_{k \ge 1} k\,p_k$ be the average degree and
$$q_k = \frac{(k+1)\,p_{k+1}}{z}, \qquad k = 0, 1, 2, \ldots, \tag{2}$$
where clearly $\sum_{k \ge 0} q_k = 1$. The values $q_k$ are known as excess degree probabilities, and their generating function is
$$G_1(x) = \sum_{k \ge 0} q_k\,x^k = \frac{G_0'(x)}{z}. \tag{3}$$
Let $H_0$ and $H_1$ be two generating functions that can be expressed as power series as
$$H_0(x) = \sum_{s \ge 1} \pi_s\,x^s, \qquad H_1(x) = \sum_{s \ge 1} \rho_s\,x^s, \tag{4}$$
and they are defined by the equations
$$H_1(x) = x\,G_1(H_1(x)), \tag{5}$$
$$H_0(x) = x\,G_0(H_1(x)). \tag{6}$$
Our aim is to compute the coefficients $\pi_s$ and $\rho_s$.
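As an illustration, the following minimal Python sketch (ours, not from the paper) computes $z$, the $q_k$, and the two generating functions for a hypothetical degree distribution truncated at maximum degree 4; any nonnegative vector summing to one would do.

```python
# Hypothetical degree distribution p[k] = Pr(degree = k); illustration only.
p = [0.25, 0.30, 0.25, 0.15, 0.05]

# average degree z = G_0'(1) = sum_k k p_k
z = sum(k * pk for k, pk in enumerate(p))

# excess degree probabilities (2): q_k = (k+1) p_{k+1} / z
q = [(k + 1) * p[k + 1] / z for k in range(len(p) - 1)]
assert abs(sum(q) - 1.0) < 1e-12  # the q_k sum to one

def G0(x):
    """Degree pgf (1): G_0(x) = sum_k p_k x^k."""
    return sum(pk * x**k for k, pk in enumerate(p))

def G1(x):
    """Excess degree pgf (3): G_1(x) = sum_k q_k x^k = G_0'(x) / z."""
    return sum(qk * x**k for k, qk in enumerate(q))
```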

The motivation for the generating functions and derives from the analysis of the asymptotic properties of the random graph in the configuration model. If the graph is sufficiently dense, it exhibits the so-called giant component, that is, a connected component whose size asymptotically goes to infinity. The giant component, if present, is unique. The rest of the graph consists of an infinite number of finite trees, the so-called small components (see [3], among many possible references).

It can be shown that if $p_k$ is the probability that a randomly chosen vertex (in the whole graph) has degree $k$, then $\pi_s$ is the probability that a randomly chosen vertex belongs to a small component of size $s$, and $\rho_s$ is the probability that, after choosing a random vertex $v$ of degree at least one and then a random vertex $w$ adjacent to $v$, the vertex $w$ belongs to a small component of size $s$ after removing the edge $(v, w)$.

If the giant component is present, the conditional probability $\tilde p_k$ of choosing in the small components a vertex of degree $k$ is different from $p_k$, and similarly for the excess degree probability $q_k$. It can be shown that
$$\tilde p_k = \frac{p_k\,u^k}{G_0(u)}, \qquad \tilde q_k = q_k\,u^{k-1}, \tag{7}$$
where $u$ is the solution of $u = G_1(u)$ and $G_0(u)$ is the fraction of vertices in the small components. We can briefly justify (7) by using Bayes' formula:
$$\tilde p_k = P(B \mid A) = \frac{P(A \mid B)\,P(B)}{P(A)}, \tag{8}$$
with $A$ being the random event of choosing a vertex in a small component and $B$ being the random event of choosing a vertex of degree $k$. Clearly $P(B) = p_k$, and consequently $P(A) = \sum_k p_k\,P(A \mid B) = G_0(u)$. If the chosen vertex has degree $k$, $P(A \mid B)$ is the probability that all $k$ adjacent vertices belong to a small component once we have removed the corresponding edges, and so its value is $u^k$. This explains the left expression in (7). To justify the right expression, we need to compute the average degree in the small components by taking the derivative of $\tilde G_0(x) = \sum_k \tilde p_k x^k = G_0(ux)/G_0(u)$ (by using (7)) and computing it for $x = 1$; that is,
$$\tilde z = \frac{u\,G_0'(u)}{G_0(u)}. \tag{9}$$
From this, we immediately get the expression at the right, since $\tilde q_k = (k+1)\,\tilde p_{k+1}/\tilde z = (k+1)\,p_{k+1}\,u^k/G_0'(u) = q_k\,u^{k-1}$, where we have used $G_0'(u) = z\,G_1(u) = z\,u$.
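The quantities in (7) are easy to evaluate numerically. The sketch below (our own construction; the degree distribution is again a hypothetical example, chosen dense enough to have a giant component) finds $u$ as the smallest nonnegative solution of $u = G_1(u)$ by fixed-point iteration and then forms the conditional probabilities.

```python
# Hypothetical example distribution with a giant component (G_1'(1) > 1).
p = [0.05, 0.25, 0.30, 0.25, 0.15]
z = sum(k * pk for k, pk in enumerate(p))
q = [(k + 1) * p[k + 1] / z for k in range(len(p) - 1)]

def G0(x):
    return sum(pk * x**k for k, pk in enumerate(p))

def G1(x):
    return sum(qk * x**k for k, qk in enumerate(q))

# u <- G_1(u) starting from 0 converges to the smallest fixed point in [0, 1]
u = 0.0
for _ in range(500):
    u = G1(u)

small = G0(u)  # fraction of vertices in the small components
p_cond = [pk * u**k / small for k, pk in enumerate(p)]   # left part of (7)
q_cond = [qk * u**(k - 1) for k, qk in enumerate(q)]     # right part of (7)
print(u, small)
print(sum(p_cond), sum(q_cond))  # both sums equal 1, as stated above
```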

It turns out that using $\tilde p_k$ and $\tilde q_k$ instead of $p_k$ and $q_k$ in the definition of $H_0$ and $H_1$ has the only effect of scaling the $\pi_s$ values by the constant factor $1/G_0(u)$ and the $\rho_s$ values by the constant factor $1/u$, which correspond to the conditional probability of choosing within the small components. In particular, we have $\sum_s \pi_s = G_0(u)$ and $\sum_s \rho_s = u$ if we use $p_k$ and $q_k$ in the definition of $H_0$ and $H_1$, respectively, whereas we have $\sum_s \tilde\pi_s = 1$ and $\sum_s \tilde\rho_s = 1$ if we use $\tilde p_k$ and $\tilde q_k$.

We also define $\sigma_s$, the probability that a randomly selected small component has size $s$. Of course
$$\sigma_s = \frac{\pi_s/s}{\sum_{t \ge 1} \pi_t/t} = \frac{\bar s\,\pi_s}{s\,G_0(u)}, \tag{10}$$
with
$$\bar s = \frac{G_0(u)}{\sum_{t \ge 1} \pi_t/t} \tag{11}$$
being the average size of a small component. Here we have to discount the factor $s$ because the choice of a random vertex necessarily biases the choice toward the larger small components: a component of size $s$ is selected this way with probability proportional to $s\,\sigma_s$.

3. Computing the Coefficients of $H_0$ and $H_1$

From the recursive equation
$$H_1(x) = x\,G_1(H_1(x)) \tag{12}$$
and from $H_1(0) = 0$ (necessarily $\rho_0 = 0$), we derive
$$H_1(x) = x \sum_{k \ge 0} q_k\,H_1(x)^k, \tag{13}$$
so that the coefficient of $x^s$ in $H_1(x)$ collects contributions only from the powers $H_1(x)^k$ with $k \le s-1$. Let $h^k_s$ be the coefficient of $x^s$ in $H_1^k(x)$. Note that $h^k_s = 0$ for $s < k$ and in particular $h^k_k = \rho_1^k$. From (13),
$$\rho_s = \sum_{k=0}^{s-1} q_k\,h^k_{s-1}. \tag{14}$$
Hence, the computation of $\rho_s$ requires the coefficients $h^k_{s-1}$ for $k = 0, \ldots, s-1$. In turn, the computation of $h^k_{s-1}$ requires the terms $\rho_1, \ldots, \rho_{s-k}$, and so to compute $\rho_s$ we only need knowledge of $\rho_1, \ldots, \rho_{s-1}$.

The recursion works as follows: initially $\rho_1 = q_0$ and $h^1_1 = \rho_1$, and then, for $s = 2, 3, \ldots$, the following block is computed:
$$h^k_{s-1} = \sum_{t=1}^{s-k} \rho_t\,h^{k-1}_{s-1-t}, \qquad k = 2, \ldots, s-1,$$
$$\rho_s = \sum_{k=1}^{s-1} q_k\,h^k_{s-1},$$
$$h^1_s = \rho_s. \tag{15}$$
Note that $H_1^k(x) = H_1(x)\,H_1^{k-1}(x)$ and therefore $h^k_s = \sum_{t \ge 1} \rho_t\,h^{k-1}_{s-t}$, where $k$ is an upper index for $h$ and a power exponent for $H_1$.

We also derive from
$$H_0(x) = x\,G_0(H_1(x)) \tag{16}$$
the expression
$$H_0(x) = x \sum_{k \ge 0} p_k\,H_1(x)^k, \tag{17}$$
so that
$$\pi_s = \sum_{k=0}^{s-1} p_k\,h^k_{s-1}. \tag{18}$$
In this case, the computation is straightforward, since it involves only previously computed quantities.
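Putting (15) and (18) together, a direct Python sketch of the iteration (again ours; the function name and the array layout are illustrative) is as follows.

```python
def size_probabilities(p, q, N):
    """Compute rho[1..N] and pi[1..N] from the degree data via (15) and (18).

    p[k] and q[k] are the degree and excess degree probabilities;
    h[k][s] stores the coefficient of x^s in H_1(x)^k.
    """
    rho = [0.0] * (N + 1)
    pi = [0.0] * (N + 1)
    h = [[0.0] * (N + 1) for _ in range(N + 1)]
    h[0][0] = 1.0
    rho[1] = q[0]          # initialization: rho_1 = q_0
    h[1][1] = rho[1]
    pi[1] = p[0]           # an isolated vertex is a component of size 1
    for s in range(2, N + 1):
        for k in range(2, s):            # coefficients of H_1^k at order s-1
            h[k][s - 1] = sum(rho[t] * h[k - 1][s - 1 - t]
                              for t in range(1, s - k + 1))
        rho[s] = sum(q[k] * h[k][s - 1] for k in range(1, min(s, len(q))))
        pi[s] = sum(p[k] * h[k][s - 1] for k in range(1, min(s, len(p))))
        h[1][s] = rho[s]
    return rho, pi

# example usage: rho, pi = size_probabilities(p, q, 100)
```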

Theoretically, the generating functions $H_0$ and $H_1$ involve an infinite series, but obviously only a finite number of coefficients can be computed. Hence, the computation has to be stopped after having computed the desired number of terms $\rho_s$ and $\pi_s$. Since each term is computed only once and is not the result of successively smaller and smaller additions, truncating the computation at a certain index has no effect on the accuracy of the values we compute. In other words, if we compute just a few terms, they are computed with the same accuracy as if we had computed all coefficients.

It is clear from the definitions and the previous iteration that knowledge of the $\rho_s$ values implies knowledge of the $\pi_s$ values; that is, once we know the $\rho_s$ values, the $\pi_s$ values are also known and implied by them. It is not difficult to see that the converse is also true. By differentiating (6) and using (3) and (5), we get
$$H_0'(x) = \frac{H_0(x)}{x} + z\,H_1(x)\,H_1'(x), \tag{19}$$
and by integrating (19), we get
$$H_0(x) = \int_0^x \frac{H_0(t)}{t}\,dt + \frac{z}{2}\,H_1(x)^2; \tag{20}$$
that is,
$$\sum_{s \ge 1} \pi_s\,x^s = \sum_{s \ge 1} \frac{\pi_s}{s}\,x^s + \frac{z}{2} \sum_{s \ge 2} \Big( \sum_{t=1}^{s-1} \rho_t\,\rho_{s-t} \Big) x^s, \tag{21}$$
which leads to the following identities term by term:
$$\frac{s-1}{s}\,\pi_s = \frac{z}{2} \sum_{t=1}^{s-1} \rho_t\,\rho_{s-t}, \qquad s \ge 1. \tag{22}$$
For $s = 2$, we get in particular
$$\rho_1 = \sqrt{\frac{\pi_2}{z}}, \tag{23}$$
and, for $s > 2$, we have
$$\frac{s-1}{s}\,\pi_s = \frac{z}{2} \Big( 2\,\rho_1\,\rho_{s-1} + \sum_{t=2}^{s-2} \rho_t\,\rho_{s-t} \Big), \tag{24}$$
which allows writing
$$\rho_{s-1} = \frac{1}{2\rho_1} \Big( \frac{2(s-1)}{z\,s}\,\pi_s - \sum_{t=2}^{s-2} \rho_t\,\rho_{s-t} \Big), \tag{25}$$
so that all $\rho_s$ values can be recursively computed once we know $z$ and the $\pi_s$ values. We note that $\rho_1 = q_0 = p_1/z$ implies that $\rho_1 > 0$ if and only if $p_1 > 0$. Hence, the recursion is well defined if $p_1 > 0$, which is an almost necessary assumption if we investigate the presence of small components, since every small component other than an isolated vertex contains vertices of degree one. Hence, knowledge of the $\pi_s$ values implies knowledge of the $\rho_s$ values.

If we do not know $z$, we may still compute the $\rho_s$ values from the recursion. We first note that all the values produced by (23) and (25) depend on $z$ through the factor $1/\sqrt{z}$. Therefore, we initially guess the value $z = 1$ and compute tentative values $\hat\rho_s = \sqrt{z}\,\rho_s$. Since $\sum_s \rho_s = 1$, we find the correct value for $z$ as
$$z = \Big( \sum_s \hat\rho_s \Big)^2, \tag{26}$$
and so we have the correct values:
$$\rho_s = \frac{\hat\rho_s}{\sqrt{z}} = \frac{\hat\rho_s}{\sum_t \hat\rho_t}.$$
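The following sketch (ours) implements (23), (25), and the rescaling (26); it assumes $\pi_2 > 0$ (that is, $p_1 > 0$, as discussed above) and that the $\rho_s$ sum to one.

```python
def rho_from_pi(pi, N):
    """Recover rho[1..N-1] and z from pi[1..N] (1-indexed lists).

    Tentative values are computed with the guess z = 1; since they all
    carry the factor 1/sqrt(z), the sum rule (26) fixes z afterwards.
    Assumes pi[2] > 0 and sum_s rho_s = 1.
    """
    r = [0.0] * N
    r[1] = pi[2] ** 0.5                       # (23) with z = 1
    for s in range(3, N + 1):                 # (25) with z = 1
        conv = sum(r[t] * r[s - t] for t in range(2, s - 1))
        r[s - 1] = (2.0 * (s - 1) * pi[s] / s - conv) / (2.0 * r[1])
    scale = sum(r)       # approximates sqrt(z) once the truncated tail is small
    z = scale ** 2       # (26)
    rho = [x / scale for x in r]              # rescaled, correct values
    return rho, z
```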

4. Inferring the Degree Probabilities from the Component Size Probabilities

We may also consider the inverse problem of finding $p_k$ and $q_k$ from $\pi_s$ and $\rho_s$, that is, computing the degree distribution which gives rise to a particular small component size distribution. This problem presents interesting features. Arbitrary values of $\pi_s$ and $\rho_s$ may not be feasible; that is, there may be no degree distribution that can lead to those values.

Formally, the recursion can be easily inverted; that is, knowing the $\rho_s$ values, we can compute the $q_k$ values and the $p_k$ values. Indeed, from (14), we have
$$q_{s-1}\,h^{s-1}_{s-1} = \rho_s - \sum_{k=1}^{s-2} q_k\,h^k_{s-1}; \tag{27}$$
that is, since $h^{s-1}_{s-1} = \rho_1^{s-1}$,
$$q_{s-1} = \frac{\rho_s - \sum_{k=1}^{s-2} q_k\,h^k_{s-1}}{\rho_1^{\,s-1}}, \qquad s \ge 2. \tag{28}$$
Computing the $q_k$ values is straightforward once we know the $\rho_s$ values. From the $q_k$ values, we easily deduce the $p_k$ values, apart from the fact that $p_0$ cannot be derived from the $q_k$ values. However, $p_0 = \pi_1$, and so it is known a priori. Note also that $\sum_k q_k = 1$ implies that $\sum_k p_k = 1$.
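In code, the inverse recursion (28), followed by the recovery of the $p_k$ from the $q_k$ and $p_0 = \pi_1$, may be sketched as follows (our own formulation).

```python
def q_from_rho(rho, N):
    """Recover q[0..N-1] from rho[1..N] (1-indexed list) via (28)."""
    # h[k][s] = coefficient of x^s in H_1(x)^k, built from the known rho
    h = [[0.0] * N for _ in range(N)]
    h[0][0] = 1.0
    for k in range(1, N):
        for s in range(k, N):
            h[k][s] = sum(rho[t] * h[k - 1][s - t]
                          for t in range(1, s - k + 2))
    q = [0.0] * N
    q[0] = rho[1]
    for s in range(2, N + 1):
        acc = sum(q[k] * h[k][s - 1] for k in range(1, s - 1))
        q[s - 1] = (rho[s] - acc) / rho[1] ** (s - 1)   # (28)
    return q

def p_from_q(q, p0):
    """Recover the p_k from the q_k and p_0 = pi_1, with z fixed by sum_k p_k = 1."""
    z = (1.0 - p0) / sum(qk / (k + 1) for k, qk in enumerate(q))
    return [p0] + [z * qk / (k + 1) for k, qk in enumerate(q)]
```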

There is, however, a subtle point to be settled. Let us assume that a giant component may be present but we do not know the values $u$ and $G_0(u)$. Then, it is simpler to work with the conditional probabilities within the small components. Starting from the (conditional) probabilities $\tilde\pi_s$, we compute the $\tilde\rho_s$ values as explained in the previous section but by using the normalization $\sum_s \tilde\rho_s = 1$. This way, we actually compute $\tilde q_k$ and $\tilde p_k$. Then, from (7), we get $q_k = \tilde q_k\,u^{1-k}$ and $p_k = \tilde p_k\,G_0(u)\,u^{-k}$. The unknowns $u$ and $G_0(u)$ are computed by imposing $\sum_k q_k = 1$ and $\sum_k p_k = 1$, which is equivalent to solving $u\,\tilde G_1(1/u) = 1$ and $G_0(u)\,\tilde G_0(1/u) = 1$, with $\tilde G_0$ and $\tilde G_1$ being the generating functions defined on the $\tilde p_k$ and the $\tilde q_k$ values, respectively.

However, the inverse recursion is numerically unstable, and, unless we use exact data, it can produce absurd outcomes, like probabilities outside the range $[0, 1]$. The reason for the instability is clear from (28), where we have a difference in the numerator and a denominator $\rho_1^{s-1}$ that gets smaller and smaller as $s$ grows. As a simple exercise, suppose that we wonder which degree distribution gives rise to a size distribution of the small components of exponential type; that is,
$$\tilde\pi_s = (1-a)\,a^{s-1}, \qquad s \ge 1, \tag{29}$$
with $0 < a < 1$. Hence, we have
$$\tilde H_0(x) = \frac{(1-a)\,x}{1 - a\,x}. \tag{30}$$
We remark that these are conditional probabilities. Now we have to compute the $\tilde\rho_s$ values from the $\tilde\pi_s$ values. As explained in the previous section, we initially fix $z = 1$ and compute from (25) the tentative values $\hat\rho_s$; then $\tilde z = (\sum_s \hat\rho_s)^2$, implying the correct values $\tilde\rho_s = \hat\rho_s/\sqrt{\tilde z}$. If we carry out the computation in (28) symbolically, we get the $\tilde q_k$ values, from which the $\tilde p_k$ values follow as well. From (7), we then have $q_k = \tilde q_k\,u^{1-k}$ and $p_k = \tilde p_k\,G_0(u)\,u^{-k}$. By imposing $\sum_k q_k = 1$ and $\sum_k p_k = 1$, we get $u = 1$ and also $G_0(u) = 1$. Hence, there is no giant component in this case.

Now assume that we have experimental data from which we infer the $\tilde\pi_s$ values (such data can be generated by slightly perturbing the previous theoretical values). The previous computation then leads to inferred values $q_k$ that are no longer probabilities: not only are there negative values, but the absolute value of $q_k$ also increases with $k$, showing an amplifying effect of error propagation. Therefore, a lot of care should be exerted in order to carry out these computations on experimental data. This can be a matter of further investigation and is beyond the scope of this paper.
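The amplification is easy to reproduce. The self-contained sketch below (ours; the values $a = 1/2$, $N = 16$, the noise level $10^{-4}$, and the seed are arbitrary illustrative choices) runs the whole inverse pipeline, (23)/(25)/(26) followed by (28), on the exponential data, once clean and once perturbed; with perturbed data the tail of the reconstructed $q_k$ typically degrades, often leaving $[0, 1]$.

```python
import random

def invert(pi, N):
    """From pi[1..N] (1-indexed) to q[0..N-2] via (23)/(25)/(26) and (28)."""
    M = N - 1                          # rho is recovered for sizes 1..M
    r = [0.0] * (M + 1)
    r[1] = pi[2] ** 0.5                # (23) with the guess z = 1
    for s in range(3, N + 1):          # (25) with z = 1
        conv = sum(r[t] * r[s - t] for t in range(2, s - 1))
        r[s - 1] = (2.0 * (s - 1) * pi[s] / s - conv) / (2.0 * r[1])
    scale = sum(r)                     # ~ sqrt(z), cf. (26)
    rho = [x / scale for x in r]
    h = [[0.0] * M for _ in range(M)]  # h[k][s] = coeff. of x^s in H_1^k
    h[0][0] = 1.0
    for k in range(1, M):
        for s in range(k, M):
            h[k][s] = sum(rho[t] * h[k - 1][s - t]
                          for t in range(1, s - k + 2))
    q = [0.0] * M
    q[0] = rho[1]
    for s in range(2, M + 1):          # (28)
        acc = sum(q[k] * h[k][s - 1] for k in range(1, s - 1))
        q[s - 1] = (rho[s] - acc) / rho[1] ** (s - 1)
    return q

a, N = 0.5, 16
pi = [0.0] + [(1 - a) * a ** (s - 1) for s in range(1, N + 1)]
print(invert(pi, N)[-6:])    # clean data: small, plausible tail values

random.seed(1)
noisy = [0.0] + [x * (1 + random.uniform(-1e-4, 1e-4)) for x in pi[1:]]
print(invert(noisy, N)[-6:])  # perturbed data: the tail degrades markedly
```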

We show a second example for the inverse computation. Assume that
$$\rho_{2n+1} = \frac{C_n}{2^{2n+1}}, \qquad \rho_{2n} = 0, \qquad n = 0, 1, 2, \ldots, \tag{31}$$
where the $C_n$ are the Catalan numbers. If we carry out the computation in (28) symbolically, we get
$$q_0 = q_2 = \frac{1}{2}, \qquad q_k = 0 \text{ otherwise}, \tag{32}$$
from which $p_1 = z\,q_0 = z/2$ and $p_3 = z\,q_2/3 = z/6$, so that only $p_0$, $p_1$, and $p_3$ can be nonzero. The normalization yields $z = \tfrac{3}{2}(1 - p_0)$ and hence $p_1 = \tfrac{3}{4}(1 - p_0)$, so that $p_3 = \tfrac{1}{4}(1 - p_0)$, with $p_0 = \pi_1$ known a priori. Again, we show how perturbed data can lead to strange outcomes. If we perturb a single value, say $\rho_3 = 1/8 + \varepsilon$ for a small $\varepsilon > 0$, keeping the values unchanged for the other indices, we again obtain the same inconsistencies and the amplifying effect in the inferred $q_k$. In any case, we may note that the $q_k$ values with odd index are correctly computed as null values.
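This example is convenient for checking the implementation because everything stays rational. The exact-arithmetic sketch below (ours) feeds the Catalan values (31) into the inverse recursion (28) and recovers $q_0 = q_2 = 1/2$ and $q_k = 0$ otherwise.

```python
from fractions import Fraction
from math import comb

N = 9
rho = [Fraction(0)] * (N + 1)
for n in range((N + 1) // 2):
    C = comb(2 * n, n) // (n + 1)                    # Catalan number C_n
    rho[2 * n + 1] = Fraction(C, 2 ** (2 * n + 1))   # rho_{2n+1} = C_n / 2^{2n+1}

# h[k][s] = coefficient of x^s in H_1(x)^k, built from the known rho
h = [[Fraction(0)] * (N + 1) for _ in range(N + 1)]
h[0][0] = Fraction(1)
for k in range(1, N + 1):
    for s in range(k, N + 1):
        h[k][s] = sum((rho[t] * h[k - 1][s - t]
                       for t in range(1, s - k + 2)), Fraction(0))

q = [Fraction(0)] * N
q[0] = rho[1]
for s in range(2, N + 1):
    acc = sum((q[k] * h[k][s - 1] for k in range(1, s - 1)), Fraction(0))
    q[s - 1] = (rho[s] - acc) / rho[1] ** (s - 1)    # (28), exactly

print(q)  # q_0 = q_2 = 1/2 and all other q_k = 0
```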

5. Branching Processes

Now we define $r_k$, $k = 0, 1, 2, \ldots$, as the probability that a member of the population generates $k$ offspring. We are interested in computing the probability $w_n$ that the population will eventually consist of $n$ members in total, starting from a population consisting of one member. If $G$ and $H$ are the probability generating functions of the $r_k$ and of the $w_n$, respectively, then the following functional equation holds:
$$H(x) = x\,G(H(x)). \tag{33}$$
Hence, the same relations (13) and (14) hold, as well as the recursion (15). This time there are no coefficients $\pi_s$ to be computed and the picture is simplified. We may still view a branching process like a random graph. However, while in random graphs we pick at random any vertex within the small components, in branching processes the small components are rooted trees and we pick the roots at random. Hence, the values $w_n$ we state here for a branching process can be related to the values $\rho_s$ of random graphs.

Iteration (15) can also be carried out in exact arithmetic, thus producing results from which closed formulas can be inferred. As a simple example, suppose that $r_0 = r_1 = r_2 = 1/3$ (and $r_k = 0$ for $k > 2$). Then, by applying (15), we obtain a sequence whose first terms are
$$\frac{1}{3},\ \frac{1}{9},\ \frac{2}{27},\ \frac{4}{81},\ \frac{9}{243},\ \frac{21}{729},\ \frac{51}{2187},\ \ldots \tag{34}$$
We may guess that the denominator grows as the powers of 3, and so, if we multiply the $n$th term of (34) by $3^n$, we obtain the new sequence
$$1,\ 1,\ 2,\ 4,\ 9,\ 21,\ 51,\ \ldots \tag{35}$$
By looking at [7], we discover that these are the Motzkin numbers $M_n$, whose $n$th term is in closed form
$$M_n = \sum_{k=0}^{\lfloor n/2 \rfloor} \binom{n}{2k}\,C_k, \tag{36}$$
with $C_k$ being the $k$th Catalan number. Hence,
$$w_n = \frac{M_{n-1}}{3^n}, \tag{37}$$
and we have found another combinatorial interpretation of the Motzkin numbers besides the many listed in [7]. The reason for $n-1$ as a subscript is due to the fact that the first index of the sequence (34) is $n = 1$, whereas the Motzkin numbers in (35), as defined in (36), start from $M_0$.
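The computation is easy to reproduce in exact rational arithmetic. The sketch below (ours) runs the block (15) with $q_k$ replaced by the offspring probabilities $r_k$ and compares $w_n\,3^n$ with the closed form (36).

```python
from fractions import Fraction
from math import comb

N = 10
r = [Fraction(1, 3)] * 3                  # r_0 = r_1 = r_2 = 1/3
w = [Fraction(0)] * (N + 1)               # population size probabilities w_n
h = [[Fraction(0)] * (N + 1) for _ in range(N + 1)]
h[0][0] = Fraction(1)
w[1] = r[0]
h[1][1] = w[1]
for s in range(2, N + 1):                 # the block (15), with q = r
    for k in range(2, s):
        h[k][s - 1] = sum((w[t] * h[k - 1][s - 1 - t]
                           for t in range(1, s - k + 1)), Fraction(0))
    w[s] = sum((r[k] * h[k][s - 1] for k in range(1, min(s, 3))), Fraction(0))
    h[1][s] = w[s]

def motzkin(n):                           # closed form (36)
    return sum(comb(n, 2 * k) * comb(2 * k, k) // (k + 1)
               for k in range(n // 2 + 1))

print([int(w[n] * 3**n) for n in range(1, N + 1)])   # 1, 1, 2, 4, 9, 21, ...
print([motzkin(n - 1) for n in range(1, N + 1)])     # the same Motzkin numbers
```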

The same considerations about inferring the $r_k$ probabilities from the $w_n$ probabilities (Section 4) can be applied also to branching processes. The example with the Catalan numbers is almost trivial if we have in mind a branching process: it corresponds to a process in which each member generates either zero or two offspring, with probability $1/2$ each.

6. Conclusions

In this paper, we have presented an iterative scheme to compute the coefficients of a generating function that plays an important role in random graphs and in branching processes. The generating function is related to the population size probabilities for a branching process and to the small component size probabilities for random graphs. We also show that the iteration can be inverted; that is, for a branching process, from the population size probabilities, one can infer the offspring probabilities, but the inverse iteration is numerically unstable.

Conflicts of Interest

The author declares that there are no conflicts of interest regarding the publication of this paper.