Abstract

Boolean networks are a popular modeling framework in computational biology to capture the dynamics of molecular networks, such as gene regulatory networks. It has been observed that many published models of such networks are defined by regulatory rules driving the dynamics that have certain so-called canalizing properties. In this paper, we investigate the dynamics of a random Boolean network with such properties using analytical methods and simulations. From our simulations, we observe that Boolean networks with higher canalizing depth have generally fewer attractors, the attractors are smaller, and the basins are larger, with implications for the stability and robustness of the models. These properties are relevant to many biological applications. Moreover, our results show that, from the standpoint of the attractor structure, high canalizing depth, compared to relatively small positive canalizing depth, has a very modest impact on dynamics. Motivated by these observations, we conduct a mathematical study of the attractor structure of a random Boolean network of canalizing depth one (i.e., the smallest positive depth). For every fixed attractor length, we give an explicit formula for the limit of the expected number of attractors of that length in an n-state random Boolean network as n goes to infinity.

1. Introduction

Dynamic mathematical models are a key enabling technology in systems biology. Depending on the system to be modeled, the data and information available for model construction, and the questions to be answered, different modeling frameworks can be used. For kinetic models, systems of ordinary differential equations have a long tradition. Generally, they have the very special structure of polynomial equations representing Michaelis–Menten kinetics, even in the case of systems, such as gene regulatory networks, that are not proper biochemical reaction networks. It is this special structure that gives models desirable properties and aids in model analysis. Besides continuous models, a range of discrete models are finding increasingly frequent use, in particular Boolean network models of a broad variety of biological systems, from intracellular molecular networks to population-level compartmental models (see e.g., [15]), going back to the work of Kauffman in the 1960s [6–8]. While Boolean network models, a collection of nodes whose regulation by other nodes is described via a logical rule built from Boolean operators, are intuitive and mathematically simple to describe, their analysis is severely limited by the lack of mathematical tools and generally consists of simulation results. Any function on binary strings that takes binary values can be represented as a Boolean function, so the class of general Boolean networks is identical to the class of all such functions on binary strings of a given length, making any general analysis impossible. The search for special classes of Boolean functions that are broad enough to cover all or most rules that occur in biology, but special enough to allow for mathematical approaches, has a long history.

It was again Kauffman who proposed a class of functions [7] with properties inspired by the developmental biology concept of canalization, going back to Waddington in the 1940s [9]. There is some evidence that canalizing Boolean functions do indeed appear disproportionately often in published models and that the dynamics of Boolean network models consisting of canalizing functions have special properties, in particular a “small” number of attractors. This is important since, in the case of intracellular molecular network models, attractors correspond to the different phenotypes a cell is capable of. Here, again, the majority of available results are obtained by simulating large numbers of such networks. The main question of this paper is as follows: What do the dynamics of a random canalizing Boolean network look like? We approach this question using both computer simulations and analytical methods, with the main result of the paper being Theorem 2, which gives a provable formula for the expected number of attractors of a random Boolean network with a particular canalization property. In addition to providing important information about canalizing Boolean network models, this result can be viewed as part of a growing body of mathematical results characterizing this class of networks, which promises to be as rich as that for chemical reaction network models based on ordinary differential equations.

2. Background

The property of canalization for Boolean functions was introduced by Kauffman in [7], inspired by the concept of canalization from developmental biology [9]. A Boolean function is canalizing if there is a variable and a value of the variable such that if the variable takes the value, then the value of the function does not depend on other variables. It was shown that models defined by such functions often exhibit less chaotic and more stable behavior [10, 11]. Nested canalizing functions, obtained by applying the concept of canalization recursively, were introduced in [2]. They form a special subset of canalizing functions and have stable dynamics [11]. We note that there are other important properties shared by Boolean networks arising in modeling (for example, sparsity [7]). In this paper we focus only on canalization and its impact on the dynamics, and one of the natural future directions would be to consider several such properties simultaneously.

To cover more models arising in applications, the notion of nested canalizing function was relaxed by Layne et al. [12] by assigning to every Boolean function its canalizing depth. Noncanalizing functions have canalizing depth zero, and nested canalizing functions have the maximal possible canalizing depth, equal to the number of variables. The canalizing depth of a Boolean network is defined as the minimum of the canalizing depths of the functions defining the network. In [12], the activities and sensitivities of functions of different canalizing depths, as well as the stability and criticality of Boolean networks composed of such functions, were investigated. It has been observed that Boolean networks of higher canalizing depth tend to be more stable and less sensitive. However, increasing the canalizing depth to the maximum does not improve the stability significantly compared to moderate positive canalizing depth. These observations give a strong indication of the biological utility of canalizing functions, even those with small canalizing depth.

Attractors in Boolean network models can be interpreted as distinct cell types [13, p. 202], and their lengths can be viewed as the variety of different gene expression patterns corresponding to the cell type. Thus, understanding the attractor structure of a random Boolean network defined by functions of a fixed canalizing depth is important for assessing the biological relevance of such models. An analytic study of the attractor structure of nested canalizing Boolean networks has been carried out in [11]. For a discussion of attractors of length one (i.e., steady states), we refer to [14].

3. Our Results

The main question of this paper is as follows: What do the dynamics of a random canalizing Boolean network look like? We approach this question using both computer simulations and analytical methods.

In our computational experiments, we generate approximately 30 million random Boolean networks of all possible canalizing depths with the number of variables ranging from 4 to 20. For each of these networks, we determine the sizes of all the attractors and basins of attraction and analyze the obtained data. We discover the following:
(1) For a fixed number of variables, the sample mean of the number of attractors and the average size of an attractor decrease when the canalizing depth increases
(2) The decrease of the average size of an attractor is much greater than the decrease of the number of attractors as the canalizing depth increases
(3) Both decreases from (1) and (2) are substantial when the canalizing depth changes from zero to small canalizing depths, but a further increase of the canalizing depth does not lead to a significant decrease for either the sample means or the empirical distributions
(4) The relative decrease of the sample mean of the number of attractors and the average attractor size when the canalizing depth changes from zero to one becomes sharper when the number of variables increases

Observations (3) and (4) are consistent with the results obtained in [12] for sensitivity and stability. This provides new evidence that Boolean networks of small positive canalizing depth are almost as well-suited for modeling as those with nested canalizing functions, from the point of view of stability. Since there are many more canalizing functions of small positive canalizing depth than nested canalizing functions [15, Section 5], they provide a richer modeling toolbox.

Motivated by observation (4), we conduct a mathematical study of the attractor structure of a random Boolean network of canalizing depth one (that is, the minimal positive depth). Our main theoretical result, Theorem 2, gives, for every fixed attractor length, a formula for the limit of the expected number of attractors of that length in a random Boolean network of depth one. The same formulas are valid for a random Boolean network defined by canalizing functions (see Remark 5). In particular, our formulas show that a large random network of depth one, on average, has more attractors of small sizes than an average Boolean network (Remark 6).

Formulas similar to the ones in our proofs (e.g., in Lemma A.4) have already appeared in the study of the average number of attractors of a given length in sparse Boolean networks, e.g., [16, equation (2)] and [17, equation (6)]. The results of [16, 17] are based on describing the asymptotic behavior of these formulas in terms of N, the number of nodes in the network. In our case, the average number of attractors of a given length simply approaches a constant as N goes to infinity, but our methods allow us to find the exact value of this constant.

The source code we used for generating and analyzing data is available at https://github.com/MathTauAthogen/Canalizing-Depth-Dynamics. The raw data are available at https://github.com/MathTauAthogen/Canalizing-Depth-Dynamics/tree/master/data.

Structure of the Paper. The rest of the paper is organized as follows. Section 4 contains necessary definitions about canalizing functions and Boolean networks. Outlines of the algorithms used in our computational experiments are in Section 5. The main observations are summarized in Section 6. Our main theoretical result about attractors in a random Boolean network of canalizing depth one (Theorem 2) is presented in Section 7. Section 8 contains conclusions. The proofs are located in the Appendix.

4. Preliminaries

Definition 1. A Boolean network is a tuple F = (f_1, …, f_n) of Boolean functions in n variables. For a state (x_1(t), …, x_n(t)) at time t, we define the state at time t + 1 by x_i(t + 1) = f_i(x_1(t), …, x_n(t)) for every 1 ≤ i ≤ n.
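To make the update rule concrete, here is a minimal Python sketch of the synchronous update of Definition 1; the representation of a network as a list of callables on state tuples and the name step are illustrative choices, not the code of the linked repository.

```python
# A minimal sketch of the synchronous update of Definition 1.
# Representation (a list of callables on state tuples) and names are
# illustrative; this is not the code from the linked repository.

def step(network, state):
    """One synchronous update: every node is recomputed simultaneously
    from the current state."""
    return tuple(f(state) for f in network)

# A 3-node example: x1' = x2 AND x3, x2' = x1, x3' = NOT x1.
network = [
    lambda s: s[1] & s[2],
    lambda s: s[0],
    lambda s: 1 - s[0],
]
print(step(network, (1, 0, 1)))  # -> (0, 1, 0)
```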

Definition 2. (attractors and basins). Let F be a Boolean network.
(i) A sequence s_1, …, s_m of distinct states is called an attractor of F if F(s_j) = s_{j+1} for every 1 ≤ j < m and F(s_m) = s_1 (here F is viewed as a map on states).
(ii) An attractor is called a steady state if m = 1.
(iii) Let A be an attractor of F. The basin of A is the set of all states whose trajectory under F eventually enters A.

Definition 3. A nonconstant function f(x_1, …, x_n) is canalizing with respect to a variable x_i if there exist a canalizing value a ∈ {0, 1} and a canalized value b ∈ {0, 1} such that f(x_1, …, x_n) = b whenever x_i = a.

Example 1. Consider x_1 x_2 (the product is understood modulo 2, that is, logical AND). It is canalizing with respect to x_1 with canalizing value 0 because the product equals 0 whenever x_1 = 0, regardless of the value of x_2. Analogously, it is canalizing with respect to x_2 with canalizing value 0.
Consider x_1 + x_2 (summation is understood modulo 2, that is, logical XOR). It is not canalizing with respect to x_1 because setting x_1 = 0 yields x_2 while setting x_1 = 1 yields x_2 + 1, so no value of x_1 fixes the output. The same argument works for x_2 as well.
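The canalization property of Definition 3 can be tested directly on a truth table. The following Python sketch (function names are illustrative) checks whether a particular variable, or any variable, of a function is canalizing; applied to the two functions of this example, it reports AND as canalizing and XOR as not.

```python
from itertools import product

def is_canalizing_in(f, n, i):
    """Definition 3 check: is f (a callable on n-tuples over {0, 1})
    canalizing with respect to variable i (0-based)?  Constant functions
    are excluded, since Definition 3 requires f to be nonconstant."""
    inputs = list(product((0, 1), repeat=n))
    if len({f(x) for x in inputs}) == 1:          # constant function
        return False
    for a in (0, 1):                              # candidate canalizing value
        if len({f(x) for x in inputs if x[i] == a}) == 1:
            return True
    return False

def is_canalizing(f, n):
    """f is canalizing if it is canalizing with respect to some variable."""
    return any(is_canalizing_in(f, n, i) for i in range(n))

AND = lambda x: x[0] & x[1]   # canalizing (value 0 in either variable)
XOR = lambda x: x[0] ^ x[1]   # not canalizing
print(is_canalizing(AND, 2), is_canalizing(XOR, 2))  # True False
```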

Definition 4. f has canalizing depth [15, Definition 2.3] k if it can be expressed as
f(x_1, …, x_n) = b_1 if x_{i_1} = a_1; b_2 if x_{i_1} ≠ a_1 and x_{i_2} = a_2; …; b_k if x_{i_1} ≠ a_1, …, x_{i_{k−1}} ≠ a_{k−1} and x_{i_k} = a_k; and g otherwise,
where
(i) i_1, …, i_k are distinct integers from 1 to n
(ii) a_1, …, a_k, b_1, …, b_k ∈ {0, 1}
(iii) g is a noncanalizing function in the variables other than x_{i_1}, …, x_{i_k}

Example 2. For example, if , and is noncanalizing. Therefore, f has canalizing depth 1.

Remark 1. Since the core function g in Definition 4 is noncanalizing, every function has a single well-defined canalizing depth. In particular, a function of depth two is not considered to have depth one.

Definition 5. We say that a canalizing Boolean function is nested if f has canalizing depth n, that is, the maximal possible depth (see Definition 4). For example, is nested canalizing because , so the canalizing depth of f is 3, which is equal to the number of variables.
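A brute-force way to compute the canalizing depth of Definition 4 is to repeatedly peel off a canalizing variable, each time restricting the function to that variable's non-canalizing value, until the remaining core is noncanalizing. The Python sketch below does this greedily over truth tables; it relies on the depth being well defined (Remark 1), and the function names are illustrative rather than those of our implementation.

```python
from itertools import product

def canalizing_depth(f, n):
    """Brute-force sketch: peel off canalizing variables one at a time,
    each time restricting to the non-canalizing value, until the remaining
    core is noncanalizing (or constant).  The number of peeled variables is
    the canalizing depth, which is well defined by Remark 1."""
    table = {x: f(x) for x in product((0, 1), repeat=n)}
    depth, m = 0, n
    while m > 0 and len(set(table.values())) > 1:
        found = None
        for i in range(m):
            for a in (0, 1):
                if len({v for x, v in table.items() if x[i] == a}) == 1:
                    found = (i, a)   # x_i is canalizing with value a
                    break
            if found:
                break
        if found is None:            # core is noncanalizing: stop
            break
        i, a = found
        # restrict to x_i = 1 - a and drop the variable x_i
        table = {x[:i] + x[i + 1:]: v for x, v in table.items() if x[i] == 1 - a}
        depth, m = depth + 1, m - 1
    return depth

# x1 AND x2 AND x3 is nested canalizing: its depth equals the number of variables.
print(canalizing_depth(lambda x: x[0] & x[1] & x[2], 3))  # 3
# XOR is noncanalizing: depth 0.
print(canalizing_depth(lambda x: x[0] ^ x[1], 2))         # 0
```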

Definition 6. We say that a Boolean network has canalizing depth k if the functions f_1, …, f_n defining it are Boolean functions of canalizing depth k.

5. Simulations: Outline of the Algorithms

In our computational experiments, we generated random Boolean networks of various canalizing depths. For each network, we stored a list of pairs, each consisting of the size of an attractor of the network and the size of its basin. The generated data are available at https://github.com/MathTauAthogen/Canalizing-Depth-Dynamics/tree/master/data. To generate the data, we used two algorithms: one for generating a random Boolean network of a given canalizing depth and one for finding the sizes of attractors and their basins (Algorithm 1).

In: A Boolean network in n variables
Out: A list of pairs, where each pair consists of the size of an attractor of the network and the size of its basin
(1) (Network → Graph) Build a directed graph G with vertices corresponding to the possible states and a directed edge from every state to its image under the network.
(2) (Attractors) Perform a depth-first search [18, § 22.3] traversal on G viewed as an undirected graph to detect the unique cycle in each connected component; these cycles are the attractors.
(3) (Basins) For each cycle from Step 2, perform a depth-first search traversal on G with all the edges reversed. The DFS trees will be the basins.
(4) Return the sizes of the attractors and basins found in Steps 2 and 3.
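A compact Python sketch of Algorithm 1 is shown below. The dictionary-based state graph and the helper names are illustrative, and the basin of an attractor is taken to include the attractor itself.

```python
from itertools import product

def attractors_and_basins(network):
    """Sketch of Algorithm 1.  Builds the state graph (each of the 2^n
    states has one outgoing edge), finds the unique cycle reachable from
    every state (the attractors), and then walks the reversed edges from
    each cycle to measure its basin.  Returns a list of pairs
    (attractor size, basin size); the basin includes the attractor itself."""
    n = len(network)
    succ = {s: tuple(f(s) for f in network)
            for s in product((0, 1), repeat=n)}

    # Step 2: follow the trajectory from each state; the repeating tail is
    # the cycle (attractor) of that component.
    attractors, on_cycle = [], set()
    for start in succ:
        path, seen, s = [], {}, start
        while s not in seen and s not in on_cycle:
            seen[s] = len(path)
            path.append(s)
            s = succ[s]
        if s in seen:                       # a cycle not found before
            cycle = path[seen[s]:]
            attractors.append(cycle)
            on_cycle.update(cycle)

    # Step 3: reverse the edges and collect every state flowing into a cycle.
    pred = {s: [] for s in succ}
    for s, t in succ.items():
        pred[t].append(s)
    result = []
    for cycle in attractors:
        basin, stack = set(cycle), list(cycle)
        while stack:
            for p in pred[stack.pop()]:
                if p not in basin:
                    basin.add(p)
                    stack.append(p)
        result.append((len(cycle), len(basin)))
    return result

# Example: the 2-node "swap" network x1' = x2, x2' = x1 has two steady states
# and one attractor of length 2.
print(attractors_and_basins([lambda s: s[1], lambda s: s[0]]))
# -> [(1, 1), (2, 2), (1, 1)]
```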
5.1. Generating Random Boolean Functions of a Given Canalizing Depth

[12, Section 5] contains a sketch of an algorithm for generating random Boolean functions that have canalizing depth at least k for a given k. Here, we generate functions of canalizing depth equal to k and take a different approach from that of [12]. In order to ensure that the probability distribution of possible outputs is uniform, we use the following structure theorem due to He and Macauley [15].

Theorem 1. (see [15], Theorem 4.5). Every Boolean function can be uniquely written as
where, for every , is a noncanalizing function, and is the canalizing depth. Each appears in exactly one of , and the only restrictions on equation (8) are the following “exceptional cases”:
(E1) If and , then
(E2) If and and , then

Example 3. Consider . It can be represented as
so , , , , and . This can be verified by expanding the brackets in the original and new representations of f.
Consider . It can be represented as
so , , , , and .
Our algorithm is summarized in Algorithms 2 and 3 below. Correctness of Algorithm 2 follows from Theorem 1, and correctness of Algorithm 3 can be proved directly by induction on k.

Remark 2. The complexity of Algorithm 2 is (see Proposition B.2). Given that the size of the output is , this is nearly optimal.
We measured the runtimes of our implementation of Algorithm 2 on a laptop with a Core i5 processor (1.60 GHz) and 8 GB RAM. Generating a single function with 20 variables (the largest number we used in our simulations) takes seconds (faster for smaller canalizing depth). On a laptop, our implementation can go up to 24 variables ( minutes to generate a function) and then hits memory limits. One can go further by using a lower-level language and more careful packing. However, already a Boolean function in 40 variables would require at least 128 GB of memory.

In: Nonnegative integers k and n with k ≤ n
Out: A Boolean function f in n variables of canalizing depth k such that, for fixed k and n, all possible outputs have the same probability
(1) In the notation of Theorem 1, generate the following:
(a) random bits;
(b) a random subset X of {1, …, n} with |X| = k;
(c) a random ordered partition of X (using Algorithm 3);
(d) a random noncanalizing function in the remaining n − k variables (see Remark 3).
(2) Form a function f using the data generated in Step 1 as in Theorem 1, where the ith layer involves exactly the variables from the ith block of the partition, for every i.
(3) If f fails to satisfy condition (E1) or (E2), discard it and run the algorithm again. Otherwise, return f.

Remark 3. We generate a random noncanalizing function as follows: we repeatedly generate a random Boolean function and test it for canalization until we obtain a noncanalizing one, which we then return. Since canalizing functions are rare [15, Section 5], this algorithm is fast enough for our purposes (see Lemma B.1).
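A possible Python rendering of this rejection sampling is given below; it reuses the is_canalizing check sketched after Example 1 and is only an illustration of the idea, not the implementation analyzed in Lemma B.1.

```python
import random
from itertools import product

def random_noncanalizing(m):
    """Rejection-sampling sketch of Remark 3: draw uniformly random truth
    tables on m variables until one is noncanalizing (constant functions
    count as noncanalizing here, following the paper's convention).
    Returns the truth table as a dict from input tuples to {0, 1}."""
    inputs = list(product((0, 1), repeat=m))
    while True:
        table = {x: random.randint(0, 1) for x in inputs}
        if not is_canalizing(lambda x: table[x], m):
            return table

print(random_noncanalizing(3))  # a random noncanalizing function in 3 variables
```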

In: A finite set X with |X| = k
Out: An ordered partition of X into nonempty subsets such that, for a fixed X, all possible outputs have the same probability
(1) Compute a(0), a(1), …, a(k), where a(i) is the number of ordered partitions of a set of size i, using the recurrence a(0) = 1 and a(i) = C(i, 1) a(i − 1) + C(i, 2) a(i − 2) + ⋯ + C(i, i) a(0), where C(i, j) denotes a binomial coefficient (see [19, equation (9)]).
(2) Generate an integer N uniformly at random from {1, 2, …, a(k)}.
(3) Find the minimum integer j between 1 and k such that C(k, 1) a(k − 1) + C(k, 2) a(k − 2) + ⋯ + C(k, j) a(k − j) ≥ N.
(4) Randomly select a subset X_1 ⊆ X of size j.
(5) Generate an ordered partition of X \ X_1 recursively.
(6) Return the partition consisting of X_1 followed by the blocks produced in Step 5.
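A direct Python rendering of Algorithm 3 might look as follows; here a[i] denotes the number of ordered partitions of an i-element set, and the function name and output format (a list of sets) are illustrative.

```python
import random
from math import comb

def random_ordered_partition(X):
    """Sketch of Algorithm 3: a uniformly random ordered partition of the
    finite set X into nonempty blocks.  a[i] is the number of ordered
    partitions of an i-element set (a[0] = 1, a[i] = sum_j C(i, j) a[i-j])."""
    X = list(X)
    k = len(X)
    if k == 0:
        return []
    a = [1] * (k + 1)
    for i in range(1, k + 1):
        a[i] = sum(comb(i, j) * a[i - j] for j in range(1, i + 1))
    # Pick the size j of the first block with probability C(k, j) a[k-j] / a[k].
    N = random.randint(1, a[k])
    total, j = 0, 0
    while total < N:
        j += 1
        total += comb(k, j) * a[k - j]
    first = random.sample(X, j)
    rest = [x for x in X if x not in first]
    return [set(first)] + random_ordered_partition(rest)

print(random_ordered_partition({1, 2, 3, 4}))  # e.g. [{2}, {1, 4}, {3}]
```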

6. Simulations: Results

Notation 1. For a Boolean network , let and denote the number of attractors of and the sum of the sizes of the attractors of , respectively. We define the average size of an attractor as the ratio of the latter to the former.

6.1. Sample Means of and

For every and every , we generate random Boolean networks in n variables of canalizing depth k and compute the mean of and . Figure 1 shows how these means depend on k for (based on 50,000 samples for each k). The shape of the plots is similar for the other values of n for which we performed computations (that is, n ranging from 4 to 20). Note that although both means are decreasing, the decrease of the mean of is more substantial.

6.2. Distributions of and

Figure 2 shows the empirical distributions of and for and based on 300,000 samples for each k. From the plot, we can make the following observations:
(i) The distributions become more concentrated and the peak shifts towards zero when k increases
(ii) The distributions for nonzero canalizing depths (especially for larger depths) are much closer to each other than to the distribution for zero canalizing depth. This agrees with the plots in Figure 1.

6.3. Relative Decreases

From Figure 1, we can observe that, for both and , the sample mean decreases rapidly for small canalizing depths. In order to understand how this decrease behaves for large n, we introduce

is defined analogously. Figure 3 plots , , , and and , , , and as functions of n. From the plots, we see the following:
(i) The relative initial decrease from canalizing depth 0 to canalizing depth 1 becomes even more substantial when n increases
(ii) The relative decrease from canalizing depth 0 to canalizing depth 3 is already very close to the relative decrease from depth zero to the maximal depth (i.e., nested canalizing functions)

7. Theory: The Main Result

We will introduce the notation needed to state the main theorem. Let us fix a positive integer . For a binary string , we define the following:
(i) denotes the number of ones
(ii) denotes component-wise negation
(iii) denotes a cyclic shift to the right

For binary strings , we define

Then, we define a matrix by
where we interpret numbers as binary sequences of length .

Theorem 2. Let be the limit of the expected number of attractors of length in a random Boolean network of canalizing depth one (see Definition 6) when the number of variables n goes to infinity. Then,
where is the characteristic polynomial of the matrix introduced above. In particular, we have

Remark 4. The plots below show that the result of Theorem 2 agrees with our simulations (Figure 4).

Remark 5. As explained in Remark A.1, Theorem 2 still holds if we replace a random Boolean network of canalizing depth one with a random Boolean network defined by canalizing functions.
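For readers who want to reproduce this comparison approximately, the following Python sketch estimates the finite-n analogue of these quantities for networks defined by canalizing functions; it reuses is_canalizing and attractors_and_basins from the earlier sketches, relies on rejection sampling (so it is practical only for small n), and the function names are illustrative.

```python
import random
from itertools import product
from collections import Counter

def random_canalizing_function(n):
    """Rejection sampling of a uniformly random canalizing truth table in
    n variables; practical only for small n, since canalizing functions
    are rare (see [15, Section 5])."""
    inputs = list(product((0, 1), repeat=n))
    while True:
        table = {x: random.randint(0, 1) for x in inputs}
        if is_canalizing(lambda x: table[x], n):
            return table

def average_attractor_counts(n, samples):
    """Average number of attractors of each length over random Boolean
    networks defined by canalizing functions (the setting of Remark 5)."""
    totals = Counter()
    for _ in range(samples):
        tables = [random_canalizing_function(n) for _ in range(n)]
        network = [lambda s, t=t: t[s] for t in tables]
        for length, _basin in attractors_and_basins(network):
            totals[length] += 1
    return {length: count / samples for length, count in sorted(totals.items())}

# Small n only: a rough finite-n estimate to compare against the limits of Theorem 2.
print(average_attractor_counts(4, 500))
```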

Example 4. Let . Then, for example, we have and . In total, we have

Remark 6. Theorem 2 and Corollary A.1 imply that for every . On the other hand, a direct computation shows that the expected number of attractors of length in a random Boolean network (without any canalization requirements) is . This is consistent with our observations from Section 6.1.
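For context, the direct computation alluded to in Remark 6 is presumably the standard cycle count for a uniformly random map: a random Boolean network without restrictions induces a uniformly random map on the N = 2^n states, a fixed sequence of m distinct states forms an attractor with probability 1/N^m, and there are N(N − 1)⋯(N − m + 1) such sequences, each attractor corresponding to m of them. The expected number of attractors of length m is therefore (1/m) · N(N − 1)⋯(N − m + 1)/N^m, which tends to 1/m as n goes to infinity.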

Remark 7. A Sage script for computing the numbers is available at https://github.com/MathTauAthogen/Canalizing-Depth-Dynamics/blob/master/core/theory.sage.

8. Conclusions

We conducted computational experiments to investigate the attractor structure of Boolean networks defined by functions of varying canalizing depth. We observed that networks with higher canalizing depth tend to have fewer attractors and the sizes of the attractors decrease dramatically when the canalizing depth increases moderately. As a consequence, the basins tend to grow when the canalizing depth increases. These properties are desirable in many biological applications of Boolean networks, so our results give new indications of the biological utility of Boolean networks defined by functions of positive canalizing depth.

We proved a theoretical result, Theorem 2, which complements the above observation as follows. The theorem implies that a large random Boolean network of canalizing depth one has on average more attractors of small size than a random Boolean network of the same size, although it has fewer attractors in total. This also explains why the total size of the attractors decreases faster than the number of attractors as the canalizing depth grows.

Furthermore, we observed that all the statistics we computed are almost the same in the case of the maximal possible canalizing depth (so-called nested canalizing Boolean networks) and in the case of moderate canalizing depth. This agrees with the results of Layne et al. [12]. This observation elucidates an interesting and powerful feature of canalization: even a very moderate canalizing influence in a Boolean network has a strong constraining influence on network dynamics. It would be of interest to explore the prevalence of these features in published Boolean network models.

Finally, we provided evidence that the observed phenomena will occur for Boolean networks with larger numbers of state variables.

Appendix

A. Proofs

Notation A.1. We fix a positive integer .
(i) For every , we define a subset by
(ii) For every , let be the submatrix of with rows and columns having indices from .

Lemma A.1. For every , we have the following:
(1) is stochastic (see [20, § 8.5]) and has exactly one eigenvalue equal to 1.
(2) For every , there exists a matrix with nonnegative entries such that is stochastic and has exactly one eigenvalue equal to 1.

Proof. We will first show that is stochastic and irreducible (see [20, § 3.11]).
By definition, showing that is stochastic is equivalent to proving that, for every :
Since the shift just permutes binary strings, this sum is equal to . For a fixed β and , the number of such and is equal to . Thus,
To prove irreducibility, we observe that, if denotes a zero binary string, then and for every . Then, [20, § 3.11, Exercise 12a] implies that is irreducible.
Since is stochastic, its largest eigenvalue is equal to 1 [20, § 8.5, p.156]. Since is irreducible, [20, Theorem 8.2] implies that 1 is a simple eigenvalue.
To prove the second part of the lemma, we fix . We will show that, for every ,
Indeed, let γ be a binary string with all zeroes and a one at the ith position. Then, since , we have
Inequality (A.4) implies that there exists a matrix with nonnegative entries such that is stochastic.
Since , the same argument as in the proof of the first part of the lemma shows that is stochastic and has exactly one of the eigenvalues being equal to 1.

Corollary A.1. Let be the characteristic polynomial of . Then, for every , .

Notation A.2. Fix a positive integer n. For vectors and , we denote

Lemma A.2. Let A be a stochastic matrix with exactly one eigenvalue equal to one. We set
Let be the characteristic polynomial of A. Then, .

Proof. We recall that the Lambert W function [21] is the principal branch of the inverse of . We will use the notation from [22] so that . The function has a singularity of the square-root type at and has the following expansion around this point (see [22, p. 107]):
From this, we obtain
The main result of [23] implies that, for every complex matrix A, we have
Since is stochastic, we have , so
If we perform a substitution and use the definition of the Lambert W function, we obtain
From this, we obtain that can be rewritten as
Finding the asymptotic behavior of the Taylor coefficients of would yield an asymptotic for . We will do this using singularity analysis [24, Chapter VI] (similarly to [22, Theorem 2]). Since for (see [21, Figure 1]) and all roots of lie in the unit circle due to the stochasticity of A, is the singularity of with the smallest absolute value. Due to Lemma A.1, , where . Using (A.8), we obtain the following expansion of around :
Singularity analysis [24, Corollary VI.1] implies that
Using Stirling’s formula, we get
Using , we deduce , and this finishes the proof.

Lemma A.3. On the set of all Boolean networks with n states, consider two probability distributions:
(A) All the networks with canalizing depth one have the same probability, and all others have probability zero
(B) The probability assigned to each network is proportional to the product of the number of canalizing variables of the functions defining this network
We fix a positive integer . By and we denote the average number of attractors of length in a random Boolean network with n states with respect to distributions (A) and (B), respectively. Then,

Example A.1. We will illustrate distribution (B) by an example. Consider the following three networks with two states:
Since the canalizing depth of is zero, , the probability of with respect to (B), is zero. Since the canalizing depths of and are 2 and 1, respectively, the ratio is equal to .

Proof. Let and be the number of Boolean functions in n variables with canalizing depth exactly one and more than one, respectively. We will use the following bounds:
(1) : we look term-by-term. There are at most ways to choose the first and second canalizing variables. There are at most 4 choices for the canalizing outputs and at most 4 choices for canalizing values for these two variables. There are at most core functions, since that is the number of all possible functions, which may or may not be canalizing. Since redundant arrangements of canalizing variables are not accounted for, this is an overcount.
(2) : this is a lower bound for the number of noncanalizing core functions in variables because is an upper bound on the number of canalizing functions in variables (obtained in the same way as the bound above).
We also introduce
For X being (A) or (B) and a positive integer n, let denote the probability (it is always the same) of choosing a network from distribution X with all functions being of depth exactly one. Let be the maximal probability of choosing a network from (B) with at least one function being of depth more than one. By and we denote the total number of attractors of length in networks with all functions being of depth exactly one and with at least one function being of depth more than one, respectively.
The statement of the lemma is equivalent to the statement that
Using the notation introduced above, we can bound as
We set and . Then, (A.21) would follow from and , so we will prove these two equalities.
Since any network has at most attractors of length , . Since the total sum of the products of canalizing depths over all Boolean networks does not exceed , we have . Since , we have
(A.20) implies that for large enough n. Hence, for large enough n, we have
By similar arguments, and , so
Since for large enough n, using (A.20), we have

Remark 8. The proof of Lemma A.3 will be valid if we replace distribution (B) with any other distribution such that, for every Boolean network :
(i) If at least one of the 's is noncanalizing,
(ii) There exists a constant such that, if the canalizing depth of every is one, then
(iii) We have (using notation from the proof of Lemma A.3)
The above properties hold, for example, for the following distribution:
(C) All the networks defined by canalizing functions have the same probability, and all others have probability zero.
Using this distribution instead of (A), we see that Theorem 2 also holds for a random Boolean network defined by canalizing functions.

Lemma A.4. We will use Notation A.1 and notation from Lemma A.2. Then, for every positive integers and n, we have

Proof. We fix n. Consider a tuple of distinct elements of . For , we denote . For , let
Then, . First, we will show that
where .
To prove (A.29), we will use the fact that the functions in the network are chosen independently to decompose the left-hand side as
where we use the notation and the probability of each Boolean function being chosen is assumed to be proportional to the number of its canalizing variables. We show that, for every ,
Then, (A.29) would follow by multiplying (A.31) for all i. To prove (A.31), without loss of generality, we consider . Consider a set
with a uniform probability distribution . Observe that, for a function f with canalizing variables , , , we have
If we can show that, for every ,
then (A.31) would follow by summing up (A.34) over all k and using the law of total probability.
We consider one of the canalizing variables of f, say, . Let c be the canalizing value of , and let be the value taken by f when . Then, , and all these four cases have the same probability due to symmetry. As , it is sufficient to show that
and then sum over all .
To prove (A.35), consider any j, say . There are then 4 cases for the values of and :
(1) and is 0 or 1. With probability , we have . This is true due to symmetry, as for any which takes on the value at , we can produce another function that is equal to 0 if and if . Then, .
(2) and . Since , the probability of is zero.
(3) . Since and , the canalization property implies that with probability one.
The only case in which is when there is at least one j such that case 2 is realized. In this case, the probability in the left-hand side of (A.35) will be zero. Otherwise, each occurrence of case 1 will multiply the total probability by and each occurrence of case 3 will multiply the total probability by 1. Thus, the left-hand side of (A.35) is indeed equal to . This finishes the proof of (A.29).
To finish the proof of the lemma, we set
Summing (A.29) over all -tuples of distinct elements of , we obtain (see (A.7))
On the other hand, if is supported on some , then , where denotes the restriction of to the coordinates from . This implies that
This finishes the proof of the lemma.

Proof of Theorem 2. We fix a positive integer . In the notation of Lemma A.3, we have . Lemma A.3 implies that . We fix any , and let be the matrix given by Lemma A.1. We set . Then,
Lemma A.2 implies that is finite; thus, we have . We finish the proof of the theorem by considering the limit of (A.27) and applying Lemma A.2 to .

B. Complexity Analysis

Proposition B.1. Complexity of Algorithm 3 is .

Proof. First, we show that the complexity of a single run of the algorithm, i.e., not taking into account the recursive call, is . Since the first k rows of Pascal's triangle can be precomputed in , the complexity of step 1 is also . Similarly, the complexity of step 3 is . It remains to observe that step 2 takes and step 4 takes (indeed, selecting a subset of size j amounts to selecting and removing j indices). In total, we obtain .
The depth of the recursion is at most k. Since the complexity of each single call is , the total complexity is .

Lemma B.1. The average complexity of the algorithm for generating a function in variables which is either 1 or noncanalizing described in Remark 3 is .

Proof. [25, p. 116] implies that the proportion of canalizing functions in n variables is bounded from above by . Note that [25] considers constant functions to be canalizing, which we do not. Thus, the probability of choosing a function which is either 1 or noncanalizing is bounded from above by
This bound is less than for all values of n except 1 and 2, but we can compute directly that and . Therefore, the number of times the generation of a function needs to be repeated averages to , which does not exceed 4, so the average complexity of the whole procedure is the same as that of a single generation step.
The complexity of a single step consists of generating a random function (which is ) and checking whether it is canalizing or not. We perform this check by running linearly through the truth table for each variable, so the complexity is time. Thus, the total complexity is indeed .

Lemma B.2. There is a constant such that the probability that a function generated in steps 1 and 2 of Algorithm 2 does not satisfy one of the conditions (E1) or (E2) is bounded by c for every n.

Proof. Notice that
We will show that there is a constant such that and do not exceed c.
(i) : the probability of having (the only possible ) is just the proportion of ordered partitions with a single element at the end. We can construct all of these by picking an element and then picking an ordered partition of the remaining elements, so this creates possibilities. Thus, the probability of this occurring is . [19, equation (5)] implies that this approaches as n goes to infinity. Thus, such a constant c exists.
(ii) : the probability of ever picking is just , so we can take .

Proposition B.2. Complexity of Algorithm 2 is .

Proof. Lemma B.2 implies that the average number of reruns in step 3 is constant. Thus, the complexity of the algorithm is the same as that of a single run.
Proposition B.1 and Lemma B.1 imply that the complexity of step 1 is . Step 2 generates a truth table for the function. There are input-output pairs, and computing the function takes at most k steps, so this is . In step 3, conditions (E1) and (E2) are verified in time.
Summing everything, we obtain

Data Availability

Python/Sage code and the results of simulations used to support the findings of this study have been deposited at https://github.com/MathTauAthogen/Canalizing-Depth-Dynamics.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors are grateful to Claus Kadelka, Christian Krattenthaler, and Doron Zeilberger for helpful discussions. GP was partially supported by NSF grants CCF-1564132, CCF-1563942, DMS-1760448, DMS-1853482, and DMS-1853650 and by PSC-CUNY grants #69827-0047 and #60098-0048. RL was partially supported by grants NIH 1U01EB024501-01 and NSF CBET-1750183. EP, GP, and WQ are grateful to the New York Math Circle, where their collaboration started.