Convergence of a Belief Propagation Algorithm for Biological Networks

Choo, Sang-Mok; Kim, Young-Hee

doi:https://doi.org/10.1155/2019/9362179

Discrete Dynamics in Nature and Society

On this page

Abstract Introduction Preliminaries Results Conclusions Data Availability Conflicts of Interest Authors’ Contributions Acknowledgments References Copyright Related Articles

Research Article | Open Access

Volume 2019 | Article ID 9362179 | https://doi.org/10.1155/2019/9362179

Convergence of a Belief Propagation Algorithm for Biological Networks

Sang-Mok Choo¹and Young-Hee Kim²

Academic Editor: Maria Alessandra Ragusa

Received02 Apr 2019

Revised07 Jun 2019

Accepted24 Jun 2019

Published24 Jul 2019

Abstract

Constructing network models of biological systems is important for effective understanding and control of the biological systems. For the construction of biological networks, a stochastic approach for link weights has been recently developed by using experimental data and belief propagation on a factor graph. The link weights were variable nodes of the factor graph and determined from their marginal probability mass functions which were approximated by using an iterative scheme. However, there is no convergence analysis of the iterative scheme. In this paper, at first, we present a detailed explanation of the complicated multistep process step by step with a network of small size and artificial experimental data, and then we show a sufficient condition for the convergence of the iterative scheme. Numerical examples are given to illustrate the whole process and to verify our result.

1. Introduction

Systems of mathematical equations have been used for modeling interactions of genes or proteins in biological systems. The modeling consists of two parts: one is to construct a network of nodes and links, where nodes usually represent genes or proteins and links denote interactions of nodes. The second is to build a set of mathematical formulas for governing the network dynamics, which might be expressed as a system of differential equations or Boolean logic functions [1, 2]. Modeling of a biological network has been focused on the construction of a directed network for a given biological system and its mathematical formulas, with which biological phenomena and mechanisms have been explained and controlled [3] and, furthermore, modeling of a biological network is a useful tool for understanding and explaining the distribution of the biological communities too [4]. However, there are few researches on how to determine an optimal set of nodes and links which are key components of networks for biological phenomena.

There could be multiple networks for explaining a given biological system. So, it would be meaningful to use a probabilistic modeling approach which could yield multiple networks for the biological system. One approach was recently developed in [5, 6] for the construction of network models of a melanoma cell line (SK-MEL-133), the most serious skin cancer, in which each link weight was considered as a discrete random variable and its marginal probability mass function (PMF) was constructed by using a factor graph and a belief propagation (BP) algorithm. The marginal PMFs were used to choose multiple weights of high-probability values and then to construct multiple networks for the biological system. The initial step of the approach was to determine a prior knowledge network (PKN) for melanoma and its mathematical formula structure. Nodes were chosen for representing some drugs (drug nodes) for melanoma or proteins (measured nodes) which take important roles in melanoma. The drug nodes had outgoing links to each measured node without incoming links. Each measured node had an outgoing link to each other measured node and an incoming link from each other measured node. Each link weight in the mathematical formulas was not determined in the initial step. Next, each protein concentration was measured at steady state before and after treatment of the drugs, which produced experimental data for determining the link weights. Then, the joint PMF of link weights was defined as the Boltzmann/Gibbs form [7] for the factorization of the joint PMF. The form contained a cost function of the experimental data and the simulated values of measured nodes at steady state, where the simulated values were not constants for undetermined link weights. So, the factorization of the joint PMF directly led to the factor graph which is a bipartite graph consisting of variable nodes (link weights in the network) and factor nodes (factors in the factorization). Finally, the marginal PMFs were approximated with an iterative scheme constructed from belief (message) propagation on the factor graph.

In general, computation time could be high to directly compute marginal distributions from a joint probability distribution. So, factor graphs and BP have been applied to infer marginals in diverse problems [8–12]. It is known that a marginal obtained from applying BP becomes the true marginal for an acyclic factor graph and BP could be successfully applied to a cyclic graph [13–18]. There have been many researches on conditions for the convergence of beliefs, which can be applied for computing marginals from a Gaussian joint probability density function (PDF) [19, 20]. However, the convergence conditions cannot be applied to the convergence of beliefs (messages) in [5, 6], since the joint random variable is discrete. And the authors [5, 6] did not analyze the convergence. As far as we know, there is no framework for the inference of marginal PMFs based on both BP and perturbation data except the approach in [5, 6]. For the use of the stochastic approach [5] in the construction of networks based on BP as well as perturbation data, a sufficient condition for the convergence of messages is necessary. Under such a sufficient condition marginal PMFs can be identified and, as a result, multiple network models for the given biological system can be constructed.

In this paper, with our simple network, we present a detailed explanation of the long and intricate process of constructing a system of equations with messages from variable nodes to factor nodes, and vice versa for the approximation of marginal PMFs. Since approximate marginal PMFs are defined with the solution of the system of equations, we identify recursive relations for the solution of the system of equations and construct iterative schemes for solving the recursive relations. Finally we show a sufficient condition for the convergence of the schemes by using a Banach fixed-point theorem (see, e.g., [21, 22]). Numerical examples are given to illustrate the process in Section 4 and to verify our results.

2. Preliminaries

For a clear explanation of the convergence of the iterative schemes mentioned in the Introduction section, we make a simple network of three nodes and four links (see Figure 1(a)), which is assumed to be a PKN. Since the simple network has key components necessary for determining approximate marginal PMFs, the simple network can be extended to any network, including the biological network in [5]. So, our convergence analysis in the Results section can be applied for any network. The network has two measured nodes and one drug node . We consider two treatments as the and perturbations. Symbols in Figure 1(b) denote , where and are the concentrations of at steady state before and after the perturbation, respectively. Since each drug node has an outgoing link to each measured node and no incoming link to any node, drug node has two outgoing links to and with weights and , respectively. Each measured node has an outgoing link to each other measured node and incoming links from each other nodes, and so, measured node has one outgoing link to with weight and two incoming links from and with weights and , respectively. Similarly, measured node has one outgoing link to with weight and two incoming links from and with weights and , respectively.

(a)

(b)

(c)

(d)

Figure 1

Given information and our goal. (a) A prior knowledge network of two measured nodes and , and one drug node . The symbol denotes the link weight from to , which is a discrete random variable. (b) Experimental data. The and perturbations mean the and experiments. Symbol denotes the -ratio concentration of at steady state before and after the perturbation. (c) Differential equation model for the perturbation. Symbol denotes the relative change of at time in the perturbation. (d) Construction of approximate marginal PMFs from a joint PMF. Set denotes the domain of each marginal PMF.

As in [5], the dynamics of the given situation is modeled with the mathematical formulas as in Figure 1(c). Simulated values at steady state under the perturbation are assumed to be the formwhere . From now on, weights are considered as discrete random variables with three values (-1, 0 and 1) and their PMFs are denoted as follows: for which are approximately calculated in the Results section by using a factor graph and BP based on the factor graph. In [3], discretization over the 3 weight values led to a high rate of convergence and did not capture appropriate uncertainty in a probability distribution and then 11 weight values were used. However, our goals in this paper are to find a convergence condition for messages and to explain the complicated multistep process in [3] step by step by using a prior knowledge network of small size. So, the three weight values are enough for our goals. For the extension of our results to networks of large size and real perturbation data, some notations and mathematical expressions used in our paper might be rather cumbersome in later sections.

3. Results

In this section, we present a system of equations for each marginal PMF, iterative schemes for the solutions and a sufficient condition for the convergence of the iterative schemes.

3.1. System of Equations for Marginal PMFs

To define a joint PMF of link weights based on the experimental data we consider the sum of squared errors between simulated values and experimental values as well as the penaltywhich reflects the property that the total number of links tends not to be large. Note that due to no incoming link to drug node . Then the cost function is defined aswhere the error and penalty are weighted by positive constants and , respectively. Using (1), the cost is written asPosteriori probability (joint PMF) is defined aswhere is the constant ensuring that the sum of the probabilities over all model configurations is equal to one. Let , where and are the constants ensuring that the sums of probabilities and over all model configurations are equal to one, respectively. Substituting the cost (6) into gives which implies that and can be independently determined by using a similar approach. So, we consider to show the approximation of marginal PMF :It is not efficient to calculate the exact marginal with the enumeration in the cases where the numbers of nodes in a network or experimental perturbations become large. Therefore, the process to approximate the marginal is presented step by step by using a factor graph and BP on the factor graph.

Step 1 (introduction of a factor graph and BP on the factor graph). Using the factorization in (9), factor nodes are defined asand then marginal in (9) can be written as follows:which yields the factor graph of two variable nodes and two factor nodes as in Figure 2. Following BP on the factor graph, message from variable node to factor node is defined as where is the normalization constant ensuring that the sum of the probabilities over all model configurations is equal to one and is the message from to defined asUsing message , marginal PMF in (11) can be approximated aswhere is the normalization constant ensuring that the sum of the probabilities over all model configurations is equal to one.

(a)

(b)

Step 2 (approximation of the summation (13)). The process of the approximation used in [5] can be divided into two parts: the first is to change multiple summations into a single summation by introducing a new random variable. The next is to change the single summation into an integral.
Step 2-1. Since in (13) is a function of , all random variables in , which are in (13), can be divided into two type of random variables: one is and the other is , which becomes the linear combination of random variables in the cases where the number of nodes becomes large. Using the definitions of and in (10), we can write in (13) aswhich is a function of random variables and . Then (13) becomesLettingfor some positive integer , message in (16) becomesNote thatwhich implies that the following can be a PMF of So, message in (18) can be written as a single summationStep 2-2. Note that is a sum of random variables , which are assumed to be independent. Even though is not identically distributed, the authors [5] invoked the central limit theorem to approximate the PMF of as a Gaussian with reference to [23], where there was no explicit justification for the application of this theorem. Since sums of independent random variables converge in distribution to the standard normal as long as some condition (e.g., the Lindeberg Condition [24]) is satisfied, we think that such a condition might be implicitly assumed in [24]. So, the approximate PMF of becomes where average and variance of are defined asSymbols and denote the averages of and , respectively:and then the sum over configurations in (19) is approximated with a Gaussian integration:

Step 3 (approximation of the improper integral (26)). For the calculation of the improper integral (26), error is linearized with respect to the maximization of the fitness in (15). Note that the equalitycan be written asunder the assumption thatThen error in (15) is approximated bywhich yields Hence we have where the last equality is obtained both by the identity and the property of a PDF (the integral of a PDF over its domain is equal to 1). Therefore adopting the final approximation of in [5], approximate marginal PMF in (14) can be obtained by solving the system of equationswhere , , , and are in (23)–(25).

Remark 1. Since the approximation (34) was used in [5] and our goal in this paper is to find a convergence condition for the algorithm in [5], we also use (34) instead ofEven when using (36) instead of (34), the convergence analysis in the later subsections could be applied (we do not show it in this paper).

Remark 2. Similarly approximate marginal PMF can be obtained by replacing subscripts 1 and , related with node , in , , , and in (34), (35), and (23)–(25) with 2 and , related with node : for , :where is the normalization constant ensuring that the sum of the probabilities over all model configurations is equal to one and

3.2. Iterative Schemes for Solving the System of Equations

In this subsection, we construct sequences and by using (34) and (35). In the next subsection we present a sufficient condition for the convergence of the two sequences, which implies that the two limits satisfy (34) and (35). So, the limit of becomes value , leading to the construction of approximate marginal PMF in (14). We assume thatand initial terms of are defined aswhere is the normalization constant ensuring that the sum of the probabilities over all model configurations is equal to one. The iterations and are defined similarly to (34) and (35) and so we need to define and by using (23) and (24): and So, the iteration is defined aswhere is the normalization constant ensuring that the sum of the probabilities over all model configurations is equal to one. Similarly the iteration is defined asfor , and . Here is the normalization constant ensuring that the sum of the probabilities over all model configurations is equal to one andTherefore the desired iterative schemes consist of (42), (43), (47), (48), and (49) under the assumption (29).

Remark 3. Since defined in (48) is a PMF due to (47) and (49), we havewhich gives the positivity of the following term in (49)So, we obtain a lower bound of Therefore we obtain an upper boundwhich is used in the proof of Lemma 8.

Remark 4. Note that sequence in the recursive relation (47) contains no . Then, in the next subsection, we present a sufficient condition for the convergence of sequence without using and, as a result, the limit of is used to define the message .

Remark 5. Similarly, to define messages , we use the following sequences:where is the normalization constant ensuring that the sum of the probabilities over all model configurations is equal to one and

3.3. A Sufficient Condition for the Convergence of the Iterative Schemes

Letandwhere the subscript 1 of and represents node . The iteration in (47) is written asIt follows from (47), (57), and (58) that are functions of two independent variables: for We use Banach fixed-point theorem [21] for the convergence of the sequence (59) to prove Theorem 9, which is our main result.

Theorem 6. Let be a closed subset of for a positive integer . If a function satisfies that for a constant and all in then there exists a unique fixed point such that , which is the limit of sequence for any .

For the application of Theorem 6 to the proof of our main result we need the following lemmas.

Lemma 7. Assume that experimental data are contained in the codomain of . Assume that positive constants and satisfy inequalities for and defined in (53). Then defined in (49) and (57)–(60) becomes a function from domain to codomain .

Proof. It follows from (49) and (57)–(60) that for andWe have that for , and and for and so, we obtainSimilarly, under the same condition, we can obtainwhich are desired results.

Lemma 8. Assume that experimental data are contained in the codomain of . Assume that positive constants and satisfy inequalities for and defined in (53). Then for defined in (49) and (57)–(60) and where

Proof. Using (62) and (63), we have that for where and function is defined as follows: for Due to the mean value theorem there exists a constant c in (0,1) such thatNote thatwhereUsing , we haveUsing the given condition in this lemma and (53), we have the positive lower boundwhich gives thatSimilarly,Substituting (73), (74), (78), and (79) into (70), we haveFollowing the calculations for obtaining (80), we can obtain that for the perturbationwhere is defined in Lemma 8.
Similarly we can have that for the perturbationwhere are defined in Lemma 8. Therefore, it follows from (80)–(82) that which gives the desired result.

Using Theorem 6 and Lemmas 7 and 8, we can obtain our main result.

Theorem 9. Assume that the experimental data are contained in the codomain of . Suppose that positive constants and satisfyandfor , defined in (53) and defined in Lemma 8. Then sequence converges for any .

Proof. Let be the closed subset of andThen Lemmas 7 and 8 give that andTherefore we can apply Theorem 6 and so we obtain a unique fixed point such that , which implies the convergence of sequence for any with limit .

Remark 10. We definewhere the vector in the right side denotes limit in Theorem 9. Since in (57)–(59) converges, in (48) also converges to limit . Therefore, in (88) and satisfy the system of equations in (34) and (35), which give approximate marginal PMFs in (14) as follows:

Remark 11. When using the sequences in (54)–(56) instead of the sequences in (47)–(49), we can obtain the limits of the sequences in (54)–(56) under conditions similar to those in Theorem 9: assume that experimental data are contained in the codomain of . Let Suppose that positive constants and satisfy that andwhereWe can define the messages as in Remark 4and then obtain the approximate marginal PMFs

4. Numerical Examples

In this section, we verify our main result and show its application.

Step 1 (generation of artificial experimental data). We randomly generate 100 sets of artificial experimental data, where each set consists of the six values of as in Figure 3(a). Every randomly generated experimental data has a value contained in the open interval , the codomain of , and so the condition on the experimental data in Theorem 9 is satisfied.

(a) 100 sets of 6 experimental data

(b) Cost parameters

Figure 3

Experimental data and cost parameters. (a) Each of 100 color matrices denotes the values of 6 artificial experimental data . Values of the - and -axis in each color matrix denote the values of and , respectively, where the color in the entry denotes the value of . (b) A value on the -axis denotes an experimental data set in Figure 3(a). The data set at entry in Figure 3(a) is represented as the data set in Figure 3(b). There are few values of and greater than 0.1 and we do not show them here.

Step 2 (determination of cost parameters). We find a pair of positive cost parameters as in Figure 3(b) for each experimental data set, which satisfy the condition on in Theorem 9.

Step 3 (generation of values of initial terms in sequences). We randomly generate initial values of 24 terms for each experimental data set, which satisfy the condition on initial values in Theorem 9 (see Figure 4).

(a) 100 sets of 12 initial values

(b) 100 sets of 12 initial values

Step 4 (convergence of the sequences). We obtain the convergence of all sequences for each experimental data set in Figure 3(a), in Figure 3(b), and initial values in Figure 4, where every sequence converges within 14 iterations (see Figure 5). Due to the recursive relations in (47), (48), (49), (53), (54), (55), and (56), we can divide all sequences into four groups (, , , and ) and show their convergences as in Figure 5.

(a)

(b)

(c)

(d)

Step 5 (approximation of marginal PMFs). Substituting the limits (Figure 6(c)) of the sequences for the experimental data set (Figure 6(a)) and its corresponding cost parameters (Figure 6(a)) into (89) and (93), we obtain the approximate marginal PMFs , , , and as in Figure 6(d), where the maximum value of each PMF is marked as red.

(a)

(b)

(c)

(d)

Figure 6

Approximate marginal PMFs. (a) The experimental data and cost parameters are the data set and its corresponding values of in Figures 3(a) and 3(b), respectively. (b) 12 initial values in the left and right denote the values in the initial data set in Figures 4(a) and 4(b), respectively. (c) 12 limit values in the top and bottom are obtained for Figures 6(a) and 6(b). (d) The approximate marginal PMFs are obtained from (89) and (93). The red numbers denote the maximum values of the marginal PMFs. The first network and its mathematical model in the right denote those in Figures 1(a) and 1(c) with link weights of maximum values. Replacing in the first network with gives the second network and its mathematical model.

Step 6 (construction of multiple networks and their mathematical formulas from the approximate marginal PMFs). We can use the maximum values of four link weights to make a network and its corresponding mathematical formulas (Figure 6(d), first grey box), which could be considered as the most possible network model of the artificial biological system. In addition, since value is almost equal to value , we can also consider the network with and its mathematical formula as in the second grey box in Figure 6(d). We present two networks and their mathematical formulas in the two grey boxes to show that there could be multiple networks for explaining the artificial biological system by using the approach in [5]. Similarly, in cases where a given PKN has more nodes and discrete values of link weights, we could obtain multiple networks and their corresponding mathematical formulas, where probability is high.

5. Conclusions and Future Work

In this paper, using a simple network with artificial experimental data, we have presented a detailed explanation of the whole process for determining approximate marginal PMFs of link weights in a prior knowledge network based on BP on the factor graph. For the factorization of the joint PMF of link weights, the Boltzmann/Gibbs form has been used. And we have shown a sufficient condition for the convergence of beliefs, which leads to determining the approximate marginal PMFs. Since our simple PKN and experimental data have key components necessary for the application of BP in determining approximate marginal PMFs, our results can be extended to any PKN with perturbation data, including the PKN in [5], which is our future study.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Authors’ Contributions

All authors carried out the main results and completed the corresponding proof. All authors read and approved the final manuscript.

Acknowledgments

This work was supported by the 2018 Research Fund of University of Ulsan.

References

W. Materi and D. S. Wishart, “Computational systems biology in cancer: modeling methods and applications,” Gene Regulation and Systems Biology, vol. 1, pp. 91–110, 2007.
View at: Google Scholar
H. F. Fumiã and M. L. Martins, “Boolean Network model for cancer pathways: predicting carcinogenesis and targeted therapy outcomes,” PLoS ONE, vol. 8, no. 7, p. e69008, 2013.
View at: Publisher Site | Google Scholar
A. Barbagallo and M. A. Ragusa, “On Lagrange duality theory for dynamics vaccination games,” Ricerche di Matematica, vol. 67, no. 2, pp. 969–982, 2018.
View at: Publisher Site | Google Scholar
R. Costa and P. Pavone, “Diachronic biodiversity analysis of a metropolitan area in the Mediterranean region,” Acta Horticulturae, Article ID 1215_8, 2018.
View at: Publisher Site | Google Scholar
E. J. Molinelli, A. Korkut, W. Wang et al., “Perturbation biology: inferring signaling networks in cellular systems,” PLoS Computational Biology, vol. 9, no. 12, Article ID e1003290, 2013.
View at: Publisher Site | Google Scholar
A. Korkut, W. Wang et al., “Perturbation biology nominates upstream-downstream drug combinations in RAF inhibitor resistant melanoma cells,” Elife, vol. 4, Article ID e04640, 2015.
View at: Google Scholar
L. D. Landau, L. P. Pitaevskii, and E. M. Lifshits, Statistical Physics: Transl. from the Russian by J. B. Sykes and M. J. Kearsley, Pergamon Press, 3rd, Rev. and Enl. edition, 1980.
F. R. Kschischang, B. J. Frey, and H.-A. Loeliger, “Factor graphs and the sum-product algorithm,” IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 498–519, 2001.
View at: Publisher Site | Google Scholar | MathSciNet
H. Loeliger, “An Introduction to factor graphs,” IEEE Signal Processing Magazine, vol. 21, no. 1, pp. 28–41, 2004.
View at: Publisher Site | Google Scholar
H. Loeliger, J. Dauwels, J. Hu, S. Korl, L. Ping, and F. R. Kschischang, “The factor graph approach to model-based signal processing,” Proceedings of the IEEE, vol. 95, no. 6, pp. 1295–1322, 2007.
View at: Publisher Site | Google Scholar
C. M. Bishop, Pattern Recognition and Machine Learning, Springer, New York, NY, USA, 2006.
View at: MathSciNet
K. P. Murphy, Machine Learning: A Probabilistic Perspective, MIT press, 2012.
R. G. Gallager, “Low-density parity-check codes,” IEEE Transactions on Information Theory, vol. 8, pp. 21–28, 1962.
View at: Publisher Site | Google Scholar | MathSciNet
R. G. Gallager, Low Density Parity Check Codes, vol. 21 of Research Monograph Series, 1963.
D. J. MacKay and R. M. Neal, “Good codes based on very sparse matrices,” in Proceedings of the IMA International Conference on Cryptography and Coding, pp. 100–111, Springer, Berlin, Germany, 1995.
View at: Google Scholar
R. J. McEliece, D. J. C. MacKay, and J.-F. Cheng, “Turbo decoding as an instance of Pearl's ‘belief propagation’ algorithm,” IEEE Journal on Selected Areas in Communications, vol. 16, no. 2, pp. 140–152, 1998.
View at: Publisher Site | Google Scholar
T. Heskes, “Stable fixed points of loopy belief propagation are local minima of the bethe free energy,” in Advances in Neural Information Processing Systems, pp. 359–366, 2003.
View at: Google Scholar
J. S. Yedidia, W. T. Freeman, and Y. Weiss, “Understanding belief propagation and its generalizations,” Exploring Artificial Intelligence in the New Millennium, vol. 8, pp. 236–239, 2003.
View at: Google Scholar
Q. Su and Y. Wu, “On convergence conditions of gaussian belief propagation,” IEEE Transactions on Signal Processing, vol. 63, no. 5, pp. 1144–1155, 2015.
View at: Publisher Site | Google Scholar
Z. Zhang and M. Fu, “On convergence rate of the gaussian belief propagation algorithm for markov networks,” https://arxiv.org/abs/1903.02658.
View at: Google Scholar
M. A. Khamsi and A. K. William, An Introduction to Metric Spaces And Fixed Point Theory, vol. 53, John Wiley & Sons, 2011.
M. A. Ragusa, “Local Hölder regularity for solutions of elliptic systems,” Duke Mathematical Journal, vol. 113, no. 2, pp. 385–397, 2002.
View at: Publisher Site | Google Scholar | MathSciNet
A. Braunstein, A. Pagnani, M. Weigt, and R. Zecchina, “Inference algorithms for gene networks: a statistical mechanics analysis,” Journal of Statistical Mechanics: Theory and Experiment, vol. 2008, Article ID P12001, 2008.
View at: Google Scholar
D. R. Hunter, Notes for a Graduate-Level Course in Asymptotics for Statisticians, Penn State University, Pennsylvania, Penn, USA, 2014.

Copyright

Copyright © 2019 Sang-Mok Choo and Young-Hee Kim. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

PDF Download Citation

Download other formats

Order printed copies

Views

832

Downloads

764

Citations